Integrated Photonic Encoder for Ultra-Low Power and High-Speed Image Processing

Introduction

The ability to acquire and process high-resolution image data at extremely high rates is becoming increasingly important for applications such as surveillance, microscopy, machine vision, astronomy, and remote sensing. Modern lens designs are capable of resolving greater than 10 gigapixels, while advances in camera frame-rate and hyperspectral imaging have made data acquisition rates of a Terapixel/second (10¹² pixels per second) a real possibility. However, the main obstacles to realizing such high data-rate imaging systems are power consumption and data storage requirements.

In a recent study published in Nature Communications, researchers have proposed and demonstrated a novel approach that could address this challenge by enabling high-speed image compression at orders-of-magnitude lower power than digital electronics. Their approach relies on a silicon-photonics front-end to compress raw image data, forgoing energy-intensive image conditioning and reducing data storage requirements.

The Photonic Encoding Approach

The compression scheme uses a passive disordered photonic structure to perform kernel-type random projections of the raw image data with minimal power consumption and low latency. This is achieved through an integrated photonic encoder chip that consists of the following key components, as shown in Figure 1:

Fig. 1 | Working principle of the photonic image encoder. The silicon photonics-based all-optical image encoder uses N single mode input waveguides to carry pixel information. These connect to a multimode waveguide followed by a disordered scattering region for local random transformation and image compression. The encoded output is divided into M non-overlapping spatial regions (M < N), compressing the image. Compression is performed optically, while reconstruction and conditioning are done electronically at the backend.
  1. N single-mode input waveguides, each with a dedicated modulator; together they encode a √N x √N pixel block of the input image onto the amplitudes of the light transmitted through the waveguides.

  2. A multimode waveguide region to allow the light from each single-mode waveguide to spread out along the transverse axis.

  3. A random encoding layer, consisting of randomly positioned scattering centers etched in the silicon waveguiding layer.

  4. M photodetectors to record the encoded output, where M < N, resulting in image compression.

The compression process can be described using a single transmission matrix (T) that relates the input (I) to the transmitted output (O) as O = TI. By forcing M to be less than N, the device effectively performs a single matrix multiplication to compress an N pixel block of the original image into M output pixels.
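
To make the O = TI picture concrete, the following minimal sketch (Python/NumPy) performs the same kind of kernel-type random projection in software. The block size, compression ratio, and random matrix values here are illustrative placeholders, not parameters taken from the paper:

    import numpy as np

    rng = np.random.default_rng(0)

    N = 16   # pixels per block (a 4 x 4 kernel)
    M = 4    # encoded outputs per block, i.e. 1:4 compression

    # Stand-in for the device's fixed transmission matrix: random,
    # non-negative entries drawn once (illustrative values only).
    T = rng.uniform(0.0, 1.0, size=(M, N))

    def encode_block(block):
        """Compress one sqrt(N) x sqrt(N) pixel block into M outputs: O = T I."""
        return T @ block.reshape(N)

    def encode_image(img):
        """Apply the same projection to every non-overlapping pixel block."""
        k = int(np.sqrt(N))
        h, w = img.shape
        return np.array([encode_block(img[r:r + k, c:c + k])
                         for r in range(0, h, k)
                         for c in range(0, w, k)])

    img = rng.random((64, 64))           # toy grayscale "image"
    encoded = encode_image(img)          # shape (256, 4)
    print(img.size, "->", encoded.size)  # 4096 -> 1024 values

In the device, T is fixed by the fabricated scattering region, so the multiplication happens passively as light propagates; the software version above only shows the bookkeeping.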

This analog photonic encoding process can be extremely fast, operating on N pixels in parallel at speeds limited only by the modulators and photodetectors. Additionally, the energy consumption scales linearly with N, even though the device performs M × N operations, providing significant energy savings compared to digital electronics.
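
Read as a back-of-envelope scaling argument (a simplified reading, not a derivation quoted from the paper): if each of the N inputs costs a roughly fixed optical energy E_pixel for laser light and modulation, and the passive scattering region adds nothing, then encoding one block costs about N × E_pixel while carrying out M × N multiply-accumulate operations, so

    Energy per MAC ≈ (N × E_pixel) / (M × N) = E_pixel / M.

At a fixed compression ratio, M grows with N (e.g., M = N/4), so the energy per operation falls as the kernel size grows, matching the trend shown later in Fig. 4.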

Experimental Demonstration

The researchers fabricated a prototype device on a silicon photonics platform with N=16 single-mode input waveguides and experimentally characterized its performance. Figure 2 shows the fabricated device and an example of a measured transmission matrix.

Fig. 2 | Numerical simulations and experimental characterization. (a, b) Full-wave frequency domain simulations of the optical encoder without (a) and with (b) a 32 µm multimode waveguide in front of a 15 µm scattering region. The addition of the multimode waveguide spreads the input light laterally, creating a random transmission matrix. Intensity along Y, normalized by width W, is plotted by integrating intensity along X in the red-boxed region. (c) Scanning electron micrograph of the fabricated silicon-photonics all-optical image encoder. (d) Example of a typical experimental measurement for characterizing the transmission matrix.
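
A transmission matrix like the one in panel (d) can be built up by driving one input at a time and recording all M outputs, each measurement filling in one column of T. The sketch below simulates that column-by-column calibration; the measurement function and noise level are hypothetical stand-ins for the lab setup, not the authors' procedure:

    import numpy as np

    rng = np.random.default_rng(42)
    N, M = 16, 4                       # inputs / outputs of the fabricated device

    # Hidden "true" device response; in the lab this is what the
    # calibration is trying to recover.
    T_true = rng.uniform(0.0, 1.0, size=(M, N))

    def measure_outputs(drive):
        """Hypothetical stand-in for one measurement: set the N input
        modulators to `drive` and read the M photodetectors (with noise)."""
        return T_true @ drive + rng.normal(0.0, 0.01, size=M)

    # Excite one input waveguide at a time to fill in T column by column.
    T_measured = np.zeros((M, N))
    for i in range(N):
        drive = np.zeros(N)
        drive[i] = 1.0                 # light only in input i
        T_measured[:, i] = measure_outputs(drive)

    print(np.max(np.abs(T_measured - T_true)))   # small residual set by the noise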

Using this experimentally measured transmission matrix, the researchers demonstrated image compression with a ratio of 1:4 and developed a back-end neural network capable of reconstructing the original images with an average peak signal-to-noise ratio (PSNR) of ~25 dB and structural similarity index measure (SSIM) of ~0.9, comparable to common lossy compression schemes like JPEG.
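
For readers who want to reproduce the quality metrics on their own reconstructions, PSNR and SSIM can be computed with standard tools such as scikit-image; the arrays below are placeholders, not data from the study:

    import numpy as np
    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    rng = np.random.default_rng(0)

    # Placeholders: in practice these would be the ground-truth image and the
    # neural-network reconstruction from the optically compressed measurements.
    original = rng.random((512, 512))
    reconstructed = np.clip(original + 0.02 * rng.standard_normal((512, 512)), 0.0, 1.0)

    psnr = peak_signal_noise_ratio(original, reconstructed, data_range=1.0)
    ssim = structural_similarity(original, reconstructed, data_range=1.0)
    print(f"PSNR = {psnr:.1f} dB, SSIM = {ssim:.3f}")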

Figure 3 illustrates the experimental results, including the compressed and reconstructed images, as well as a comparison of the experimentally measured transmission matrices.

Fig. 3 | Experimental demonstration of denoising and image compression using the DIV2K and Flickr2K dataset. (a, b) Experimental measurements of transmission matrices of the encoder at different times. (c) 2D plot showing the difference in magnitude of the transmission matrices' elements (~1%). (d) Histogram of the difference values, fitted with a Gaussian function used as the noise source in reconstruction algorithms. (e) Example of a compressed image (128x128 pixels) with a 1:4 compression ratio. (f) Reconstructed 512x512-pixel image from the compressed image.
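
The stability measurements in panels (a-d) feed directly into the reconstruction step: because the drift between repeated measurements of T is small (~1%) and roughly Gaussian, it can be injected as synthetic noise while training the back-end network so that the decoder tolerates it. A minimal sketch of that idea (the fitting and injection details are illustrative, not the authors' exact pipeline):

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)

    # Placeholders for two transmission matrices measured at different times.
    T_a = rng.uniform(0.0, 1.0, size=(4, 16))
    T_b = T_a + rng.normal(0.0, 0.01, size=T_a.shape)   # ~1% drift

    # Fit a Gaussian to the element-wise differences (cf. the histogram of Fig. 3d).
    mu, sigma = norm.fit((T_b - T_a).ravel())

    def noisy_encode(T, pixels):
        """Encode a pixel block with the measured drift statistics injected,
        so a reconstruction network trained on these outputs learns robustness."""
        T_noisy = T + rng.normal(mu, sigma, size=T.shape)
        return T_noisy @ pixels
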
Performance Analysis

The researchers analyzed the potential throughput and power consumption of their photonic image processing engine, finding that the technique could encode Terapixel/second data streams using less than 100 femtojoules per pixel, a more than 1000x reduction in power consumption compared with state-of-the-art electronic approaches.
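
Those two headline numbers combine into a simple back-of-envelope power figure (this arithmetic is not quoted from the paper):

    10¹² pixels/second × 100 fJ/pixel ≈ 0.1 W,

so even at Terapixel/second rates the optical encoding stage itself would draw on the order of 100 milliwatts or less.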

Figure 4 compares the energy consumption per multiply-accumulate (MAC) operation of the photonic approach with that of typical electronic processors (GPU, SoC, ASIC), showing orders-of-magnitude lower energy per operation for the photonic encoder, especially for larger kernel sizes.

Fig. 4 | Comparison of energy consumption for electronic and all-optical encoding approaches. Energy consumption (Energy/MAC) is plotted as a function of the number of input pixels (N) for electronic approaches (GPU, SoC, ASIC) and the optical photonic encoder (Tscatter = 20%). Electronic approaches show typical energy efficiency around ~1 pJ/MAC (red band). The photonic approach shows contributions from the laser (dashed red line), modulators (dashed yellow line), and detectors (dashed magenta line). The vertical gray line indicates values for an 8x8 kernel. Energy consumption is dominated by the input laser, with efficiency improving as N increases.
Advantages and Future Prospects

The proposed photonic encoding approach offers several key advantages over conventional image processing techniques:

  1. High-throughput processing: By operating on 8 x 8 pixel blocks in parallel at ~16 GHz, this approach could handle roughly 1 Terapixel/second, enabling ultra-high-resolution data acquisition systems.

  2. Ultra-low power consumption: The energy consumption scales linearly with the number of input pixels, providing orders-of-magnitude lower power consumption than digital electronics.

  3. Simultaneous compression and denoising: The back-end neural network can reconstruct the original images while performing image conditioning and denoising tasks, eliminating the need for energy-intensive front-end image conditioning.

  4. Compatibility: By positioning the accelerator after the image formation and initial optical-to-electrical conversion stage, this scheme is immediately compatible with any image acquisition system, regardless of operating wavelength, camera resolution, front-end optics, frame rate, or application.

Future work will focus on integrating high-speed modulators and detectors, increasing the kernel size to support larger pixel blocks (e.g., 8 x 8), and exploring alternative encoding techniques, such as multimode waveguides or chaotic cavities, for improved performance.

Conclusion

The integrated photonic encoder proposed in this study represents a significant advancement in low-power and high-speed image processing. By leveraging the unique capabilities of analog photonics, this approach enables ultra-high-resolution data acquisition systems with orders-of-magnitude lower power consumption than current techniques. The successful experimental demonstration and promising performance analysis pave the way for practical implementation of this technology in a wide range of imaging applications, including surveillance, microscopy, machine vision, astronomy, and remote sensing.

Reference

[1] X. Wang et al., "Integrated photonic encoder for low power and high-speed image processing," Nature Communications, vol. 15, Art. no. 4510, Jun. 2024. [Online]. Available: https://doi.org/10.1038/s41467-024-48099-2
