Photonic transformer chip: interference is all you need

Citation: Ye Tian, Shuiying Xiang, Xingxing Guo, Yahui Zhang, Jiashang Xu, Shangxuan Shi, Haowen Zhao, Yizhi Wang, Xinran Niu, Wenzhuo Liu, Yue Hao. Photonic transformer chip: interference is all you need [J]. PhotoniX. doi: 10.1186/s43074-025-00182-7


Funds: This work was supported by the National Key Research and Development Program of China (2021YFB2801900, 2021YFB2801901, 2021YFB2801902, 2021YFB2801904); National Natural Science Foundation of China (No. 61974177); National Outstanding Youth Science Fund Project of National Natural Science Foundation of China (62022062); The Fundamental Research Funds for the Central Universities (QTZX23041).
Publication history
  • Received: 2025-04-10
  • Revised: 2025-07-19
  • Accepted: 2025-07-29
  • Published online: 2025-10-31
