Oct 7 · 6 min read

Photonic-Electronic Integrated Circuits for High-Performance Computing and AI Accelerators

Introduction

In recent decades, the demand for computational power has surged, particularly with the rapid expansion of artificial intelligence (AI). As we navigate the post-Moore's law era, the limitations of traditional electrical digital computing, including process bottlenecks and power consumption issues, are propelling the search for alternative computing paradigms. Among various emerging technologies, integrated photonics stands out as a promising solution for the next generation of high-performance computing due to the inherent advantages of light, such as low latency, high bandwidth, and unique multiplexing techniques.

Furthermore, progress in photonic integrated circuits (PICs), which integrate a rich set of optoelectronic components, positions photonic-electronic integrated circuits as a viable platform for high-performance computing and hardware AI accelerators. This tutorial article surveys recent advances in PIC-based digital and analog computing for AI and examines the principal benefits and obstacles of implementation. It also presents a comprehensive analysis of photonic AI from the perspectives of hardware implementation, accelerator architecture, and software-hardware co-design. Finally, acknowledging the existing challenges, the article outlines potential strategies for overcoming them and offers insights into the future drivers of optical computing.

Optical Digital Computing on PICs

PICs comprise a range of passive and active optical components, with various hardware implementations and circuit topologies that fulfill distinct functions. This section focuses on electro-optic (E-O) digital logic and gives a concise overview of recent progress in PIC-based digital computing, highlighting the implementation techniques and associated challenges.

1. Optical Logic Gates

In the digital domain, both inputs and outputs are binary; the resolution is set by the number of bits and is unaffected by circuit size. A range of building blocks for optical digital computing on integrated photonic platforms, such as optical switches, modulators, interconnects, and photodetectors, has been experimentally demonstrated.

Fig. 1 illustrates electro-optic (E-O) logic gates, combining passive and active optical components for logic operations. Electrical signals configure the circuit each clock cycle, with light performing logic functions. Schematics show Microring Resonator (MRR)-based AND/NAND, OR/NOR, and XOR/XNOR gates. MRRs alternate between "block/pass" and "pass/block" modes, depicted by dotted and solid lines, determining the logic outputs "0" and "1" at the through port based on a "0" electrical input.

As shown in Fig. 1.a, all E-O devices in the functional block, such as Mach-Zehnder interferometers (MZIs), microring resonators (MRRs), and microdisks, are configured simultaneously by electrical signals. As light traverses the block, the optical signals are modulated to execute logic operations according to the PIC design, and are then propagated downstream or detected by monitors to read out the results. Fig. 1.b-c show examples of E-O logic gates that use two cascaded MRRs to perform 2-input AND/NAND and OR/NOR operations. These gates leverage the transmission characteristic of add-drop MRRs, which act as optical switches to implement the logic operation and generate complementary outputs at the through and drop ports.
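The switching behavior described above can be sketched as a purely behavioral model. The snippet below is an illustration under stated assumptions, not the circuit from Fig. 1: each ring is treated as an ideal, lossless 1x2 switch, the drop-port outputs of both rings are assumed to be combined into one complementary output, and the names `mrr_switch` and `cascaded_gate` are hypothetical.

```python
from itertools import product


def mrr_switch(light_in: float, pass_through: bool) -> tuple:
    """Ideal add-drop MRR as a 1x2 switch (assumption: no loss, full extinction).

    Returns (through_power, drop_power).
    """
    return (light_in, 0.0) if pass_through else (0.0, light_in)


def cascaded_gate(a: int, b: int, pass_on_one: bool) -> tuple:
    """Two MRRs in series on one bus waveguide, driven by bits a and b.

    pass_on_one=True : a '1' lets light pass the ring -> through port = AND
    pass_on_one=False: a '1' makes the ring block     -> through port = NOR
    Returns (through_bit, combined_drop_bit); the two are complementary.
    """
    power = 1.0      # normalized optical power launched into the bus
    dropped = 0.0    # power collected from both drop ports (assumed combined)
    for bit in (a, b):
        passes = bool(bit) if pass_on_one else not bit
        power, drop = mrr_switch(power, passes)
        dropped += drop
    return int(power > 0.5), int(dropped > 0.5)


if __name__ == "__main__":
    print("A B | AND NAND | OR NOR")
    for a, b in product((0, 1), repeat=2):
        and_out, nand_out = cascaded_gate(a, b, pass_on_one=True)
        nor_out, or_out = cascaded_gate(a, b, pass_on_one=False)
        print(a, b, "|", and_out, nand_out, "|", or_out, nor_out)
```

The same two-ring topology yields either gate pair; only the mapping from electrical bit to ring state changes, which mirrors the "block/pass" and "pass/block" modes in the Fig. 1 caption.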

2. Combinational Logic and Reconfigurable PIC

In digital circuits, the output of a combinational logic unit is determined only by the current input combination and is independent of any previous state. Similarly, the implementation of optical combinational logic can begin by extracting the logical expression from the truth table and then designing the corresponding PIC based on the simplified expression.
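As a minimal sketch of the first step, the snippet below extracts a canonical sum-of-products expression from a truth table. The helper names are hypothetical, and the simplification step (for example, a Karnaugh map or Quine-McCluskey reduction) is deliberately omitted, so the result is correct but not necessarily minimal.

```python
from itertools import product


def minterms(truth_table: dict) -> list:
    """Input combinations (tuples of bits) for which the output is 1."""
    return [inputs for inputs, out in sorted(truth_table.items()) if out == 1]


def sop_expression(truth_table: dict, names: list) -> str:
    """Canonical (unsimplified) sum-of-products string for the table."""
    terms = []
    for inputs in minterms(truth_table):
        literals = [n if bit else "~" + n for n, bit in zip(names, inputs)]
        terms.append(" & ".join(literals))
    return " | ".join("(" + t + ")" for t in terms) if terms else "0"


def evaluate_sop(truth_table: dict, inputs: tuple) -> int:
    """Evaluate the sum-of-products form: 1 if any minterm matches."""
    return int(any(tuple(inputs) == m for m in minterms(truth_table)))


if __name__ == "__main__":
    xor_table = {(a, b): a ^ b for a, b in product((0, 1), repeat=2)}
    print(sop_expression(xor_table, ["A", "B"]))
```

Each product term would then map to a chain of optical switches and the sum to a combiner, which is one plausible way to read "designing the corresponding PIC" in the paragraph above.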

Fig. 2 showcases reconfigurable Photonic Integrated Circuits (PICs) for executing any combinational logic. (a) Displays a reconfigurable microring resonator (MRR) with dual modulation mechanisms: an RF signal modulates input through a p-i-n junction, and a microheater, controlled by a DC signal, adjusts resonance modes. (b)-(c) Illustrate the PIC architecture employing reconfigurable optical switches (ROS) for versatile logic expression implementation.

To address the limitation of fixed or limited logic representation in tailored optical logic gates and units, reconfigurable PICs offer a promising solution by programming the operational states of optical switches within a pre-designed framework using additional signals. As illustrated in Fig. 2.b-c, the architecture can theoretically implement arbitrary logic functions by leveraging reconfigurable optical switches.

3. Toward Fully-functional EPALU

Besides the advantages of high bandwidth, low latency, and reduced power consumption of optical logic units, unique multiplexing techniques play a pivotal role in further improving computing capacity and performance. An example is the WDM-based electronics-photonic arithmetic logic unit (EPALU) (Fig. 3.b), which performs arithmetic and bitwise operations.

Fig. 3 presents the electronics-photonic arithmetic logic unit (EPALU) architecture for advanced digital computing. (a) Depicts a 2-bit carry propagate adder (CPA) employing E-O logic. (b) Outlines the EPALU's components and data pathway. (c) Details a wavelength division multiplexing (WDM)-based, multifunctional N-bit processing unit comprising a propagate/generate (P/G) generation unit (PGU), optical carry propagation networks (OCPNs), photodetectors (PDs), electronic multiplexers (MUXU), and a sum generation unit (SGU), capable of executing various logic operations. (d) Shows the layout of a 2-bit barrel shifter built from a microdisk add-drop switch array. (e) Illustrates an electro-photonic decoder and photonic-electronic multiplexer setup, indicating the role of the electrical input signals (s_i).

A scalable electro-optic carry propagate adder (CPA) has been developed. An optimized architecture, the carry select adder (CSA), splits the N-bit operands into n blocks of m bits, each handled by an m-bit CPA, and computes both possible outcomes of each block (carry-in of 0 and carry-in of 1) simultaneously; the carry from the preceding block then selects the correct result. This WDM-based EPALU architecture enables the execution of addition, subtraction, comparison, and bitwise operations with various input combinations.
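The carry-select idea can be illustrated with a short bit-level model. This is a generic software sketch of the adder scheme, not the optical implementation: the function names are hypothetical, and the two candidate results per block are computed sequentially here, whereas hardware would evaluate them in parallel.

```python
def ripple_block(a_bits: list, b_bits: list, carry_in: int) -> tuple:
    """m-bit carry propagate adder, least significant bit first.

    Returns (sum_bits, carry_out).
    """
    sums, carry = [], carry_in
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sums, carry


def carry_select_add(x: int, y: int, n_bits: int, block_bits: int) -> tuple:
    """Add two n_bits-wide integers using blocks of block_bits bits.

    Returns (sum modulo 2**n_bits, final carry_out).
    """
    if n_bits % block_bits != 0:
        raise ValueError("n_bits must be a multiple of block_bits")
    a = [(x >> i) & 1 for i in range(n_bits)]
    b = [(y >> i) & 1 for i in range(n_bits)]
    result, carry = [], 0
    for start in range(0, n_bits, block_bits):
        blk_a = a[start:start + block_bits]
        blk_b = b[start:start + block_bits]
        # Both candidates depend only on the operands, not on the incoming
        # carry, so they can be prepared before that carry is known.
        candidate0 = ripple_block(blk_a, blk_b, 0)
        candidate1 = ripple_block(blk_a, blk_b, 1)
        # The incoming carry only selects between the two ready results.
        sums, carry = candidate1 if carry else candidate0
        result.extend(sums)
    total = sum(bit << i for i, bit in enumerate(result))
    return total, carry


if __name__ == "__main__":
    print(carry_select_add(200, 100, n_bits=8, block_bits=4))
```

The benefit is latency: the critical path becomes one m-bit CPA plus a chain of selections, rather than a carry rippling through all N bits.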

4. PIC-based Analog Computing for AI

Undoubtedly, modern AI, functioning on digital computing systems, has achieved significant progress in diverse fields and has even exceeded human performance in specific tasks. However, the digital representation can encounter challenges stemming from hardware complexity overhead and speed reduction caused by the sampling and digitization into binary streams processed by logic units. On the other hand, the human brain, operating as an analog signal "processor", demonstrates remarkable efficiency compared to the substantial energy requirements of cutting-edge AI.

Fig. 4. Schematic of an artificial neuron with simple synaptic model.

Before discussing the details of implementing optical analog computing for AI, it is helpful to provide a concise overview of artificial neural networks (ANNs) and the neuron model. The schematic of an artificial neuron with a basic synaptic model is illustrated in Fig. 4, where x, y, and w represent the inputs from the pre-synaptic neuron, post-synaptic output, and weights of the connection, respectively.
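A minimal sketch of this neuron model follows. The symbols x, w, and y come from the text; the bias term and the choice of tanh as the nonlinear activation are assumptions added for illustration, since the text does not specify them.

```python
import math


def neuron(x: list, w: list, bias: float = 0.0, activation=math.tanh) -> float:
    """Post-synaptic output y = f(sum_i w_i * x_i + bias).

    x: inputs from the pre-synaptic neurons
    w: connection weights, one per input
    bias, activation: illustrative assumptions, not taken from the source
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return activation(weighted_sum)


if __name__ == "__main__":
    y = neuron(x=[0.5, -1.0, 0.25], w=[0.8, 0.1, -0.4])
    print(y)
```

The weighted sum is the part that photonic hardware targets, because it reduces to multiplications and an accumulation that can be carried out by modulating and combining light.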

5. Programmable Modulation for Optical Analog Computing

In analog AI accelerators, both the inputs x and the weights w can be programmed with a higher resolution than the binary values used in digital computing circuits. Implementing the neuron's weighted sum on a PIC requires reconfigurable programming of the network parameters, which relies on the modulation of optical components.

Fig. 5 outlines modulation methods in PICs: (a) Demonstrates a thermo-optic Mach-Zehnder Interferometer (MZI) with microheaters for phase shifting, optimizing output signal modulation for high extinction ratios. (b) Depicts modulators utilizing free-carrier effects in injection, depletion modes, and MOSCAP-driven adjustments. (c) Shows an MRR modulator with a Sb2S3 cladding and silicon PIN heater, highlighting the transmission change between Sb2S3 phases.

A number of modulation mechanisms have been developed, among which tuning the effective refractive index n_eff of waveguides is a widely adopted approach in optical neural networks (ONNs). These mechanisms include thermal tuning, field-effect tuning, and non-volatile modulation using phase change materials, as illustrated in Fig. 5. Each modulation technique has its own advantages and tradeoffs in terms of modulation speed, efficiency, power consumption, and footprint.
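To make the link between index tuning and modulation concrete, the sketch below uses the standard relations for a phase shifter, Δφ = 2π·Δn_eff·L/λ, and for an ideal balanced MZI, whose two output powers follow cos²(Δφ/2) and sin²(Δφ/2). The thermo-optic coefficient of about 1.8e-4 per kelvin is a commonly quoted value for silicon near 1550 nm and is applied to n_eff here as a rough assumption; the function names and the 100 µm heater length are hypothetical.

```python
import math

# Assumption: approximate thermo-optic coefficient of silicon near 1550 nm,
# applied directly to the effective index for a rough estimate.
DN_DT_SILICON = 1.8e-4  # per kelvin


def phase_shift(delta_neff: float, length_m: float, wavelength_m: float) -> float:
    """Phase change (radians) from an effective-index change over a length."""
    return 2.0 * math.pi * delta_neff * length_m / wavelength_m


def mzi_outputs(delta_phi: float) -> tuple:
    """Normalized output powers of an ideal, lossless, balanced MZI.

    Which physical port follows cos^2 and which sin^2 depends on convention.
    """
    return math.cos(delta_phi / 2.0) ** 2, math.sin(delta_phi / 2.0) ** 2


def temperature_for_pi_shift(length_m: float, wavelength_m: float) -> float:
    """Temperature rise (kelvin) that yields a pi phase shift."""
    delta_neff = wavelength_m / (2.0 * length_m)
    return delta_neff / DN_DT_SILICON


if __name__ == "__main__":
    heater_length = 100e-6   # hypothetical 100 um phase shifter
    wavelength = 1.55e-6
    print("dT for pi shift (K):", temperature_for_pi_shift(heater_length, wavelength))
    for dT in (0.0, 10.0, 20.0, 40.0):
        dphi = phase_shift(DN_DT_SILICON * dT, heater_length, wavelength)
        print(dT, mzi_outputs(dphi))
```

Sweeping the temperature rise moves the power continuously between the two ports, which is what allows a weight to be programmed with more than binary resolution.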

6. Implementations of Photonic Tensor Core

As illustrated in Fig. 4, the interconnections in ANNs can be conceptualized as weights akin to synaptic coupling coefficients in biological systems. This analogy extends to representing these connections through tensor operations, thereby abstracting the complex interactions in a computationally manageable form. Building upon the aforementioned modulation mechanisms, various active devices and encoding mechanisms have been effectively utilized in photonic neurons.

Fig. 6 showcases photonic-electronic tensor core implementations: (a) Displays a 2x2 MZI, using MMIs or directional couplers for beam splitting, with an alternative Y-branch configuration for single input/output. (b) Highlights an MZI-based coherent tensor core employing Singular Value Decomposition (SVD) for weight matrix representation, simplifying to U and Σ for circuit practicality. (c) Shows an add-drop MRR's schematic and spectral response, with specific power coupling coefficients and round-trip power loss considerations. (d) Describes a weight bank architecture using MRRs as tunable filters for WDM signal modulation, allowing for wavelength-selective operations. (e) Illustrates a 4x4 MRR crossbar array for signal modulation and distribution, culminating in a photodetector array for output summation. (f) Outlines an optical tensor core enabling dynamic, full-range matrix multiplication via WDM and coherent interference.

From the viewpoint of signal properties, ONNs can be classified into coherent and incoherent systems. Within coherent ONNs, both weights and inputs can be encoded in the complex plane, allowing multiplication through (ideally lossless) interference. For instance, MZIs built from a pair of beam splitters and phase shifters are widely adopted for linear operations in coherent ONNs, as shown in Fig. 6.a-b. In contrast, incoherent ONNs encode information in the optical power domain, leveraging devices such as MRRs and photodetectors to perform multiplication and addition, as depicted in Fig. 6.c-e.
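The two encoding styles can be contrasted in a small numerical sketch. The coherent part builds a 2x2 MZI transfer matrix from two ideal 50:50 couplers and two phase shifters and checks that it is unitary. The incoherent part models a power-domain weighted sum in which each wavelength's power is split between two detector branches and subtracted; that balanced-detection scheme, the specific matrix convention, and all function names are assumptions made for illustration.

```python
import cmath
import math

# Ideal 50:50 coupler (one common convention).
_S = 1.0 / math.sqrt(2.0)
BEAM_SPLITTER = [[_S, 1j * _S], [1j * _S, _S]]


def matmul2(a: list, b: list) -> list:
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def phase_shifter(p: float) -> list:
    """Phase shift p applied to the upper arm only."""
    return [[cmath.exp(1j * p), 0j], [0j, 1 + 0j]]


def mzi(theta: float, phi: float) -> list:
    """Coherent 2x2 MZI: input phase phi, internal phase theta."""
    inner = matmul2(BEAM_SPLITTER, phase_shifter(phi))
    return matmul2(BEAM_SPLITTER, matmul2(phase_shifter(theta), inner))


def is_unitary(u: list, tol: float = 1e-9) -> bool:
    """Check that the columns of u are orthonormal."""
    for i in range(2):
        for j in range(2):
            dot = sum(u[k][i].conjugate() * u[k][j] for k in range(2))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


def incoherent_dot(powers: list, weights: list) -> float:
    """Power-domain weighted sum with weights in [-1, 1].

    Each channel's power is split into fractions (1+w)/2 and (1-w)/2, which
    reach the positive and negative detectors (assumed balanced detection).
    """
    total = 0.0
    for p, w in zip(powers, weights):
        if p < 0 or not -1.0 <= w <= 1.0:
            raise ValueError("need non-negative power and weight in [-1, 1]")
        total += p * (1.0 + w) / 2.0 - p * (1.0 - w) / 2.0
    return total


if __name__ == "__main__":
    print(is_unitary(mzi(0.3, 1.1)))
    print(incoherent_dot([1.0, 2.0, 0.5], [0.5, -0.25, 1.0]))
```

In this convention the internal phase steers light between the two outputs (theta = 0 gives the cross state and theta = pi the bar state), while the incoherent branch obtains signed weights from non-negative optical powers only through the subtraction at the detectors.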

7. Photonic AI Accelerator Architectures

Moving beyond the device and circuit level, this section analyzes recent photonic AI efforts from the perspective of AI accelerator architecture.

Fig. 7 highlights efficient Photonic Tensor Core (PTC) implementations: (a) & (b) Show space-saving ONNs using diffractive cells and metasurface structures for optical (inverse) discrete Fourier transforms. (c) Presents a multi-operand optical neuron with multi-operand MZI and MRR designs for enhanced processing capabilities. (d) Details the subspace ONNs architecture and a 4x4 butterfly-style tensor core, using unitary matrices (B and P) and a diagonal matrix (Σ) for complex transformations and projections. (e) Describes a delocalized time-integration computing system leveraging smart transceivers for cloud-edge communication, where neural network parameters are WDM-encoded, transmitted via optical fiber, and processed through photodetector-based time integration at the edge.

PIC-based ONN architectures can be categorized into feed-forward, recurrent, and spiking neural network topologies. The feed-forward ONN is the most widely studied; its optical signals propagate unidirectionally from input to output. Recurrent ONNs, on the other hand, feature feedback connections that allow the network to maintain an internal state and exhibit dynamic temporal behavior. Spiking ONNs, inspired by biological neural networks, encode information in the timing of discrete optical pulses rather than in continuous optical power.

8. Software-Hardware Co-Design for Photonic AI

Effective software-hardware co-design is crucial for realizing the full potential of photonic AI systems. This section discusses the key considerations and recent developments in this area.

Fig. 8 showcases on-chip nonlinear activation function implementations: (a) & (b) Illustrate Optical-Electrical-Optical (O-E-O) units for adaptable activation functions, enabling reconfigurability. (c) Describes a Phase Change Material (PCM)-based all-optical spiking neuron, where input spikes are modulated by PCM cells, and the MRR's PCM cell alters resonance based on the postsynaptic spike power, dictating output spike generation.

The software-hardware co-design landscape for photonic AI encompasses photonic hardware modeling, hardware-aware neural network training, and compiler and runtime systems. Accurate modeling of photonic hardware characteristics, such as loss, crosstalk, and nonlinearity, is essential for mapping neural networks efficiently onto photonic hardware. Hardware-aware training techniques, in turn, can help mitigate the impact of these hardware imperfections and improve the robustness of the trained models. Finally, compiler and runtime systems bridge the gap between the photonic hardware and the higher-level AI software, facilitating the deployment of photonic AI accelerators.
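One way to picture hardware-aware training is a forward pass that sees the weights as the hardware would realize them. The sketch below is an assumption-laden illustration rather than any specific published method: it models limited programming resolution as uniform quantization and all remaining imperfections as additive Gaussian noise on each weight, and the names `quantize` and `noisy_matvec` are hypothetical.

```python
import random


def quantize(w: float, bits: int) -> float:
    """Snap w (clipped to [-1, 1]) to one of 2**bits uniformly spaced levels."""
    steps = 2 ** bits - 1
    w = max(-1.0, min(1.0, w))
    return round((w + 1.0) / 2.0 * steps) / steps * 2.0 - 1.0


def noisy_matvec(weights: list, x: list, bits: int = 4,
                 sigma: float = 0.02, rng: random.Random = None) -> list:
    """Matrix-vector product using hardware-like weights.

    bits : assumed programming resolution of each weight
    sigma: assumed standard deviation of the per-weight noise
    """
    rng = rng or random.Random()
    outputs = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            w_hw = quantize(w, bits) + rng.gauss(0.0, sigma)
            acc += w_hw * xi
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    W = [[0.30, -0.70], [0.95, 0.10]]
    x = [1.0, 0.5]
    ideal = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    hardware = noisy_matvec(W, x, bits=4, sigma=0.02, rng=random.Random(0))
    print("ideal   :", ideal)
    print("hardware:", hardware)
```

Using such a forward pass inside the training loop exposes the loss function to the imperfections, so the optimizer can favor weights that remain accurate after deployment; a real flow would replace the noise model with one calibrated against the measured chip.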

Conclusion and Outlook

This tutorial article has provided a comprehensive overview of the recent progress in photonic-electronic integrated circuits for high-performance computing and AI acceleration. It has covered the fundamental building blocks, from optical logic gates to fully functional photonic processing units, and explored the implementation of optical analog computing for AI, including programmable modulation techniques and photonic tensor core designs.

Beyond the device and circuit level, the article has also analyzed the photonic AI accelerator architectures and the importance of software-hardware co-design. Acknowledging the existing challenges, the article has highlighted potential strategies for overcoming these issues and offered insights into the future drivers for optical computing.

As the field of photonic-electronic integrated circuits continues to evolve, the integration of photonics and electronics is poised to play a pivotal role in addressing the computational demands of the AI era and beyond. The advancements in this interdisciplinary domain hold the promise of revolutionizing high-performance computing, leading to transformative breakthroughs across a wide range of applications.

Reference

[1] S. Ning, H. Zhu, C. Feng, J. Gu, Z. Jiang, Z. Ying, J. Midkiff, S. Jain, M. H. Hlaing, D. Z. Pan, and R. T. Chen, "Photonic-Electronic Integrated Circuits for High-Performance Computing and AI Accelerator," arXiv preprint arXiv:2403.14806, Mar. 2024.