
Semiconductor Technology: The Enabler of the AI Revolution

Introduction

In 1997, the IBM Deep Blue supercomputer made history by defeating world chess champion Garry Kasparov, providing a first glimpse into how high-performance computing could one day surpass human-level intelligence. Over the subsequent decades, artificial intelligence (AI) has advanced rapidly, becoming practical for many tasks like facial recognition, language translation, and recommendation systems.

Fast forward to today, and AI has reached the point of "synthesizing knowledge." Generative AI models like ChatGPT and Stable Diffusion can compose poems, create artwork, diagnose diseases, write code and reports, and even design integrated circuits rivaling human efforts. The tremendous opportunities of AI as a digital assistant for all human endeavors have been enabled by three key factors:

  • Innovations in efficient machine learning algorithms

  • Availability of massive training data for neural networks

  • Progress in energy-efficient computing through semiconductor technology advancements

While the first two factors have received significant attention, the critical role of semiconductor technology has been somewhat underappreciated, despite its ubiquity in enabling AI milestones over the past three decades.

Figure 1: Advances in semiconductor technology [top line], including new materials, advances in lithography, new types of transistors, and advanced packaging, have driven the development of more capable AI systems [bottom line].

Every major AI breakthrough, from Deep Blue's chess victory to the recent rise of powerful language models like ChatGPT, has been made possible by the leading-edge semiconductor technology of its time. As the figure shows, Deep Blue was built on 0.6- and 0.35-micrometer chip-manufacturing technology, while ChatGPT has been powered by servers using 4-nanometer transistors. Semiconductor scaling has acted as a multiplier for AI performance at every level, from software and algorithms down to architecture, circuits, and devices.
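To put that multiplier in perspective, here is a back-of-the-envelope comparison of the two nodes. Treating node names as literal feature sizes overstates the case (modern "nm" labels are marketing terms rather than measured dimensions), so the density figure below is an idealized upper bound, not a measured value:

```python
# Back-of-the-envelope scaling from Deep Blue's era to today's GPUs.
# Node names below 28 nm are marketing labels, so the ideal-scaling
# density figure is an upper bound rather than a real measurement.

deep_blue_node_nm = 600   # 0.6-micrometer process of the Deep Blue era
hopper_node_nm = 4        # "4 nm" class process used for recent GPUs

linear_factor = deep_blue_node_nm / hopper_node_nm
ideal_density_factor = linear_factor ** 2  # transistors per unit area

print(f"Linear shrink: {linear_factor:.0f}x")            # 150x
print(f"Ideal density: {ideal_density_factor:,.0f}x")    # 22,500x
```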

Relentless Growth in AI Model Sizes

The computational demands of training state-of-the-art AI models have skyrocketed in recent years. For example, training the GPT-3 language model required the equivalent of over 5,000 petaflop-days of computation and 3 terabytes of memory capacity. As generative AI applications continue advancing, the required computing power and memory grow rapidly, raising the question: how can semiconductor technology keep pace?
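To make that budget concrete, it helps to convert petaflop-days into raw operations and then into wall-clock time on a hypothetical cluster. The 5,000 petaflop-day figure comes from the text above; the sustained cluster throughput below is an assumed illustrative value, not a measurement:

```python
# Convert the GPT-3 training budget from petaflop-days into raw
# floating-point operations, then estimate wall-clock training time
# on a hypothetical cluster with an assumed sustained throughput.

PFLOP = 1e15                 # floating-point operations per petaflop
SECONDS_PER_DAY = 86_400

training_budget_pf_days = 5_000
total_flops = training_budget_pf_days * PFLOP * SECONDS_PER_DAY
print(f"Total operations: {total_flops:.2e} FLOPs")   # ~4.32e+23

sustained_pflops = 100       # assumed: cluster sustaining 100 PFLOP/s
days = total_flops / (sustained_pflops * PFLOP * SECONDS_PER_DAY)
print(f"Wall-clock time at {sustained_pflops} PFLOP/s: {days:.0f} days")  # 50
```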

From Integrated Devices to Integrated Chiplets

Historically, semiconductor scaling focused on cramming more transistors onto a single, increasingly smaller chip. Today, we've reached a new paradigm of 3D system integration, where multiple chips can be assembled into a tightly integrated, massively interconnected system.

Advanced packaging technologies like TSMC's chip-on-wafer-on-substrate (CoWoS) make it possible to combine multiple compute chips with high-bandwidth memory (HBM) chips on a large silicon interposer, far exceeding the traditional reticle limit of roughly 800 mm² for single chips.
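A rough area budget shows why the interposer must exceed the reticle: one near-reticle-limit compute die plus a handful of HBM stacks already adds up to well over 800 mm² of silicon. The die and stack footprints below are assumed illustrative values, not vendor specifications:

```python
# Rough silicon-area budget for a CoWoS-style package: one reticle-sized
# compute die plus several HBM stacks. Footprints are assumed
# illustrative values, not vendor specifications.

RETICLE_LIMIT_MM2 = 800      # approximate single-die reticle limit

compute_die_mm2 = 800        # one near-reticle-limit GPU die
hbm_stack_mm2 = 110          # assumed footprint of one HBM stack
num_hbm_stacks = 6

total_silicon_mm2 = compute_die_mm2 + num_hbm_stacks * hbm_stack_mm2
print(f"Total active silicon: {total_silicon_mm2} mm^2")                    # 1460
print(f"Reticle multiples: {total_silicon_mm2 / RETICLE_LIMIT_MM2:.1f}x")   # 1.8x
```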

NVIDIA's Ampere and Hopper GPUs, the workhorses for large-language-model training, exemplify this approach, with one massive GPU die integrated with multiple HBM cubes using CoWoS. The transition from 7-nm to 4-nm process technology enabled packing roughly 50 percent more transistors (80 billion) into the latest Hopper GPU, yet it still takes tens of thousands of these chips to train models like ChatGPT.
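The arithmetic behind that 50 percent figure checks out against public specifications: the 7-nm Ampere A100 carries about 54 billion transistors. A quick sanity check:

```python
# Sanity-check the stated generational jump: Hopper's 80 billion
# transistors versus its 7-nm predecessor. NVIDIA's A100 (Ampere)
# is publicly documented at roughly 54 billion transistors.

hopper_transistors = 80e9
ampere_transistors = 54e9

increase = hopper_transistors / ampere_transistors - 1
print(f"Generational increase: {increase:.0%}")   # ~48%, i.e. ~50% more
```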

Another key technology is stacking chips vertically using through-silicon vias (TSVs), as in HBM, where DRAM chips are stacked atop a control-logic die. Future 3D system-on-integrated-chips (SoIC) technology will enable even denser vertical integration, with up to 12 chip layers bonded using hybrid copper-copper connections.
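Hybrid bonding matters because vertical interconnect density grows with the inverse square of the bond pitch. A minimal sketch, using assumed ballpark pitches (tens of micrometers for solder microbumps versus single-digit micrometers for hybrid copper-copper bonding; the specific values are illustrative, not TSMC specifications):

```python
# Vertical interconnect density scales as 1/pitch^2. The pitches below
# are assumed ballpark figures for solder microbumps versus SoIC-style
# hybrid copper-copper bonding.

def connections_per_mm2(pitch_um: float) -> float:
    """Connections per mm^2 for a square grid at the given pitch."""
    return (1000.0 / pitch_um) ** 2

microbump_pitch_um = 36      # assumed microbump pitch
hybrid_bond_pitch_um = 9     # assumed hybrid-bonding pitch

gain = connections_per_mm2(hybrid_bond_pitch_um) / connections_per_mm2(microbump_pitch_um)
print(f"Microbump density:   {connections_per_mm2(microbump_pitch_um):,.0f} /mm^2")
print(f"Hybrid-bond density: {connections_per_mm2(hybrid_bond_pitch_um):,.0f} /mm^2")
print(f"Density gain:        {gain:.0f}x")   # (36/9)^2 = 16x
```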

AMD's MI300A processor for large AI workloads showcases both the CoWoS and SoIC approaches. It consists of nine compute chiplets (GPU and CPU, roughly 150 billion transistors in total) stacked atop four base chiplets using 3D integration, with the assembly and its HBM stacks interconnected through a silicon interposer.

As the industry moves toward multi-chiplet GPUs carrying trillions of transistors, scaling interconnect density and adopting silicon photonics will become critical. Optical GPU-to-GPU links could scale bandwidth far enough to treat hundreds of servers as a single giant processor with shared memory.

Trillion Transistor GPUs and Energy Efficiency Scaling

Along with 3D integration, new materials, EUV lithography, and circuit and architecture innovations will continue driving exponential growth in AI hardware capability. The semiconductor industry has historically tripled a metric called energy-efficient performance (EEP), which combines speed and energy efficiency, every two years. This trend is projected to continue, enabling future trillion-transistor AI accelerators.

Figure 2: Energy-efficient performance (EEP) trend. Largely thanks to advances in semiconductor technology, EEP is on track to triple every two years (EEP units are 1/femtojoule-picoseconds).
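Because the gain compounds, tripling every two years adds up quickly. A short sketch projecting the cumulative EEP multiplier implied by the trend in the text:

```python
# Project the energy-efficient-performance (EEP) trend described above:
# a 3x improvement every two years, compounded over time.

def eep_multiplier(years: float, factor: float = 3.0, period: float = 2.0) -> float:
    """Cumulative EEP gain after `years`, tripling every `period` years."""
    return factor ** (years / period)

for years in (2, 4, 6, 10):
    print(f"After {years:2d} years: {eep_multiplier(years):,.0f}x")
# After  2 years: 3x;  4 years: 9x;  6 years: 27x;  10 years: 243x
```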

Crucially, advanced packaging and a system-technology co-optimization (STCO) approach will be key enablers of this scaling. STCO allows each GPU component (chiplet) to be built on its ideal process node and then integrated into an overall energy-efficient system.

A Mead-Conway Moment for 3D ICs

Just as the Mead-Conway revolution simplified VLSI design in 1978 through computer-aided design and process abstraction, a similar advance is now needed for 3D integration. The open-source 3Dblox standard aims to provide a common hardware description language, freeing designers to work on 3D systems without grappling with low-level technology details.

Beyond the Semiconductor Tunnel

AI has led semiconductor technology to a pivotal point. For 50 years, the path was clear: shrink transistors on 2D chips. But we've reached "the end of the tunnel" and will face greater complexity ahead.

Yet looking beyond it, incredible possibilities emerge once designers are liberated from past constraints. Future integrated AI systems won't be bound by fixed chips and form factors; they will seamlessly combine an optimized number of energy-efficient transistors, specialized compute architectures, and co-designed hardware and software, all enabled by innovations like chiplets, 3D stacking, new materials, silicon photonics, and design/process abstraction.

While semiconductor development will grow more challenging, it remains pivotal to continued AI progress. As TSMC's Mark Liu states, "semiconductor technology is a key enabler for new AI capabilities and applications" [1]. The tunnel has ended, but an expansive new frontier lies ahead for computing's next AI-driven revolution.

Reference

[1] M. Liu and H.-S. P. Wong, "How We'll Reach a 1 Trillion Transistor GPU: Advances in semiconductors are feeding the AI boom," IEEE Spectrum, 28 Mar. 2024. Available: https://spectrum.ieee.org/trillion-transistor-gpu


