
Next-Generation High-Speed Wireline Transceivers for Artificial Intelligence and Data Center Connectivity

Introduction

The exponential growth of data consumption, driven by advances in artificial intelligence (AI) and machine learning (ML), has created an unprecedented demand for high-speed connectivity in modern data centers. As AI models continue to grow in complexity, with training runs such as BaGuaLu harnessing over 37 million processor cores, high-bandwidth, low-latency interconnects have become crucial. This tutorial explores the industry trends, emerging technologies, and design considerations for next-generation wireline transceivers capable of supporting data rates beyond 200 Gbps, which are essential for enabling the seamless flow of data in AI and data center applications.

Megatrends Driving Connectivity Demands

AI Connectivity and Scale-Out The rapid growth of AI and ML workloads has led to the deployment of massive compute clusters, consisting of hundreds to thousands of accelerators (xPUs) interconnected through high-speed links. By 2027, it is expected that approximately 50% of market revenue will be driven by AI-accelerated servers, with 20% of Ethernet data center switch ports connected to AI servers. Furthermore, 50% of these switch ports are anticipated to operate at 400 Gbps or higher, with 800 Gbps eclipsing 400 Gbps by 2025 (Figure 1).

Figure 1: Projected growth of AI connectivity and scale-out (Source: Dell'Oro Group Data Center IT Capex Forecast, Jan 2023)

Disaggregated Storage Another significant trend driving the demand for high-speed connectivity is the rise of disaggregated storage architectures. By concentrating storage into shared pools, data centers can improve efficiency and resource utilization, since capacity can be allocated to whichever compute nodes need it. This approach, however, relies on low-latency interconnects, such as PCIe and CXL, to ensure seamless communication between compute resources and the disaggregated storage.

Wireline Transceiver Trends To meet the ever-increasing bandwidth requirements, wireline transceiver data rates have been doubling approximately every five years (Figure 2). This trend is expected to continue, with 200 Gbps transceivers being widely adopted in the near future, followed by 400 Gbps and 800 Gbps transceivers in the coming years.

Figure 2: Published wireline transceivers from 2010-2023, demonstrating a data rate doubling trend every five years (Source: ISSCC Forum)

Benefits of 200G Links The adoption of 200 Gbps links offers several advantages over lower data rate links. For instance, a 51.2 Tbps 1RU (Rack Unit) switch would require 32 modules with 16 x 100 Gbps optical links each, resulting in twice the number of lasers compared to an equivalent configuration with 8 x 200 Gbps links. By reducing the laser count, 200 Gbps links can substantially decrease power consumption and cost. Furthermore, higher per-lane data rates enable the use of flatter network topologies with higher radix switches, reducing latency – a critical requirement for AI workloads.
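
The laser-count comparison above can be checked with a quick back-of-the-envelope calculation. This sketch assumes one laser per optical lane, an illustrative simplification; real modules may share or multiply lasers depending on the optical architecture.

```python
# Laser-count arithmetic for a 51.2 Tbps 1RU switch, assuming one laser
# per optical lane (an illustrative simplification).
SWITCH_GBPS = 51_200
MODULES = 32
per_module_gbps = SWITCH_GBPS // MODULES  # 1600 Gbps of optics per module

lasers = {}
for lane_rate in (100, 200):
    lanes_per_module = per_module_gbps // lane_rate
    lasers[lane_rate] = MODULES * lanes_per_module
    print(f"{lane_rate}G lanes: {lanes_per_module}/module, "
          f"{lasers[lane_rate]} lasers total")
# 100G lanes: 16/module, 512 lasers total
# 200G lanes: 8/module, 256 lasers total
```

Halving the laser count is where much of the power and cost saving of 200 Gbps lanes comes from.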

New Technologies and Considerations for 200G Links
  1. Within Transceivers To support 200 Gbps data rates, wireline transceivers must incorporate advanced digital signal processing (DSP) and powerful forward error correction (FEC). Extensive equalization, such as decision feedback equalizers (DFEs) with a large number of taps, is necessary to mitigate the intersymbol interference (ISI) caused by significant channel losses (>30 dB), while DSP techniques like roving-tap finite impulse response (FIR) equalizers can address reflections in short cable channels.

FEC plays a crucial role in ensuring reliable data transmission over lossy channels. At 200 Gbps, more powerful FEC schemes are required, which increases decoding complexity, power consumption, and latency. Techniques such as segmented FEC, where each link segment is protected by its own optimized code, and concatenated FEC, which doubly protects the optical link, are being explored to balance coding gain, power, and latency.

One significant architectural implication of adopting soft-decision FEC at 200 Gbps is that it effectively precludes analog serializer/deserializer (SerDes) architectures. Instead, the FEC must be tightly integrated with the analog front-end (AFE), favoring analog-to-digital converter (ADC) based DSP SerDes architectures.
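
The core idea of decision feedback equalization, subtracting ISI contributed by already-decided symbols, can be shown with a toy model. This is a minimal sketch, assuming a noiseless channel with a single 0.5-weight post-cursor tap and an ideal PAM4 slicer; real receivers adapt the tap weights and combine the DFE with FFE/CTLE stages, and the function names here are illustrative.

```python
# Toy decision-feedback equalizer (DFE): cancel post-cursor ISI using
# past symbol decisions, then slice to the nearest PAM4 level.
LEVELS = (-3, -1, 1, 3)

def dfe(received, post_taps):
    decisions = []
    history = [0.0] * len(post_taps)  # past decisions, newest first
    for x in received:
        y = x - sum(t * d for t, d in zip(post_taps, history))  # cancel ISI
        d = min(LEVELS, key=lambda lvl: abs(lvl - y))           # slicer
        decisions.append(d)
        history = [d] + history[:-1]
    return decisions

# Channel model: rx[n] = s[n] + 0.5 * s[n-1]  (one post-cursor of ISI)
symbols = [3, -1, 1, 1, -3, 3, -1, -3]
rx = [s + 0.5 * p for s, p in zip(symbols, [0] + symbols[:-1])]
recovered = dfe(rx, [0.5])
print(recovered == symbols)  # True: the post-cursor ISI is fully cancelled
```

Because the DFE subtracts decisions rather than the noisy input, it cancels ISI without amplifying noise, which is why it remains attractive despite its feedback-timing challenges at high baud rates.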

  2. 200G Optics On the optical front, various modulation technologies are being investigated for 200 Gbps per wavelength applications. Electro-absorption modulated lasers (EMLs) are a promising option, offering moderate swing requirements and potential for differential drive configurations. However, challenges remain in optimizing extinction ratio (ER) and chirp, especially for longer reaches. Silicon photonic (SiP) Mach-Zehnder modulators (MZMs) and micro-ring resonator modulators (MRMs) are attractive due to their potential for integration and low cost. However, achieving the required bandwidth, modulation efficiency (Vπ), and low optical loss simultaneously remains a challenge for SiP modulators at 200 Gbps. Thin-film lithium niobate (TFLN) modulators are also being explored, offering high bandwidth and low drive voltages, but with higher cost and potential integration challenges.

  3. Optical/Electrical Co-Design As data rates increase, the co-design and co-optimization of optical and electrical components become increasingly important. For instance, the packaging interconnect between the photodiode (PD) and transimpedance amplifier (TIA) in the receiver has a significant impact on the broadband frequency response. Techniques like optimizing the trace impedance and incorporating on-die T-coils can improve bandwidth and mitigate reflections. Furthermore, the optimal design parameters may vary depending on the presence and capabilities of DSP equalization. Without DSP equalization, minimizing reflections is critical, while with DSP equalization, leaving some residual reflections can be beneficial for achieving better overall performance.

  4. Co-Packaged Optics To address the challenges of chip-to-module interconnects and enable higher aggregate bandwidth, co-packaged optics (CPO) solutions are gaining traction. By integrating optical engines within the same package as the ASIC, CPO can eliminate the need for retimers, reduce power consumption, and lower latency. However, CPO also introduces challenges such as increased power density and thermal management within the package, as well as potential ecosystem constraints for innovation.

Beyond 200 Gbps: Emerging Technologies
  1. Parallelism: WDM and PSM To scale beyond 200 Gbps per wavelength, techniques like wavelength division multiplexing (WDM) and parallel single-mode (PSM) fiber architectures are being explored. WDM multiplexes multiple wavelengths onto a single fiber, raising the aggregate data rate per fiber, while PSM instead runs parallel fibers at a single wavelength, trading fiber count for simpler optics. Compact modulation technologies, low-cost and low-loss wavelength multiplexers/demultiplexers, and multi-wavelength laser sources are key enablers for practical WDM implementations.
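
The two parallelism axes compose multiplicatively. A minimal sketch of the aggregate-capacity arithmetic, with an illustrative helper name:

```python
# Aggregate capacity from parallelism: WDM stacks wavelengths on one
# fiber, PSM runs parallel fibers at one wavelength (illustrative).
def aggregate_gbps(per_lane_gbps, n_wavelengths=1, n_fibers=1):
    return per_lane_gbps * n_wavelengths * n_fibers

print(aggregate_gbps(200, n_wavelengths=4))  # 4-lambda WDM: 800 Gbps/fiber
print(aggregate_gbps(200, n_fibers=4))       # PSM4-style: 800 Gbps over 4 fibers
```

The trade-off is where the cost lands: WDM spends it on multiplexers and laser sources, PSM on fiber and connector count.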

  2. Higher-Order Modulation Formats Increasing the baud rate and adopting higher-order modulation formats, such as 6-PAM and 8-PAM, are potential pathways to achieve per-lane data rates beyond 200 Gbps. However, these approaches require significant advancements in analog bandwidth, DSP, and coding techniques.
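
The bandwidth relief from higher-order PAM can be quantified directly. This sketch treats log2(M) bits per symbol as the ideal; practical 6-PAM mappings (which encode bits across symbol pairs) and FEC overhead push the real baud rate somewhat higher.

```python
import math

# Ideal symbol rate to carry a given line rate with M-PAM, assuming
# log2(M) bits per symbol (FEC/coding overhead ignored).
def required_gbaud(line_rate_gbps, pam_order):
    return line_rate_gbps / math.log2(pam_order)

for m in (4, 6, 8):
    print(f"{m}-PAM needs {required_gbaud(200, m):.1f} GBaud for 200 Gbps")
# 4-PAM: 100.0 GBaud, 6-PAM: ~77.4 GBaud, 8-PAM: ~66.7 GBaud
```

The catch is that each added level shrinks the eye opening, so the baud-rate savings must be weighed against the SNR penalty that drives the need for stronger FEC.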

  3. Coherent Optical Communication Coherent optical communication, a proven technology in long-haul networks, is being adapted for shorter reaches within data centers. By leveraging coherent modulation formats like dual-polarization quadrature amplitude modulation (DP-QAM), coherent links can achieve four times the data rate at the same baud rate compared to intensity modulation and direct detection (IM/DD) links.
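
The 4x figure follows from counting degrees of freedom: coherent detection recovers both quadratures of the optical field on each of two polarizations, whereas IM/DD sees only intensity. A sketch of the bits-per-symbol bookkeeping, with illustrative function names:

```python
import math

# Bits per symbol: IM/DD PAM vs dual-polarization QAM. DP-QAM carries
# I and Q on each of two polarizations, so at the same baud rate a
# DP-16QAM link moves 4x the bits of a PAM4 link.
def imdd_bits(pam_order):
    return math.log2(pam_order)

def dp_qam_bits(qam_order):
    return 2 * math.log2(qam_order)  # two polarizations, complex field

baud = 100  # GBaud, illustrative
print(baud * imdd_bits(4))     # PAM4 IM/DD: 200.0 Gbps
print(baud * dp_qam_bits(16))  # DP-16QAM: 800.0 Gbps
```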

Recent developments in coherent-lite solutions, tailored for reach lengths below 10 km, have shown promising results. These solutions leverage the O-band (around 1310 nm) to reduce DSP power consumption while maintaining acceptable fiber loss for short-reach applications. Additionally, synchronous baud-rate sampling DSP architectures are being explored to further reduce power consumption and latency in coherent transceivers.


Conclusion

The relentless growth of data consumption, driven by AI and ML technologies, has created an unprecedented demand for high-speed connectivity in modern data centers. To meet these demands, the industry is actively pursuing the development of next-generation wireline transceivers capable of supporting data rates beyond 200 Gbps.

Key technologies and considerations for 200 Gbps links include advanced DSP techniques, powerful FEC schemes, co-design of optical and electrical components, and the exploration of new optical modulation formats. Furthermore, co-packaged optics and coherent optical communication are emerging as promising solutions to address the challenges of chip-to-module interconnects and enable even higher data rates within data centers.

As we look beyond 200 Gbps, techniques such as WDM, higher-order modulation formats, and coherent optical communication over shorter reaches are being actively researched. Collaboration across various disciplines, including analog and digital design, coding theory, optics, and system architecture, will be crucial in overcoming the challenges and enabling the seamless flow of data in future AI and data center applications.

Reference

[1] T. C. Carusone, “The Impact of Industry Trends on 200+Gbps Wireline R&D,” in IEEE International Solid-State Circuits Conference (ISSCC), 2024.
