Photonic Extreme Learning Machines: Harnessing Programmable Waveguide Meshes

Introduction

Extreme learning machines (ELMs) are a class of machine learning algorithms based on single-hidden-layer feedforward neural networks (SLFNs). Unlike conventional neural networks trained by backpropagation, ELMs keep their randomly initialized input weights and biases fixed and train only the output weights. The result is a simpler and faster algorithm than a deep neural network, though it typically requires more hidden neurons and its performance can vary with the random draw of the input layer.
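The ELM recipe described above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the paper: the tanh activation, the uniform weight range, the 50 hidden nodes, and the XOR-like toy data are all assumptions chosen for the example.

```python
import numpy as np

def train_elm(X, T, n_hidden, rng):
    """Fit an ELM: fixed random input layer, least-squares output layer."""
    n_features = X.shape[1]
    # Input weights and biases are drawn once and never trained.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)            # hidden-node matrix (assumed tanh activation)
    beta = np.linalg.pinv(H) @ T      # the only trained parameters
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Return class scores; the predicted class is the argmax of each row."""
    return np.tanh(X @ W + b) @ beta

# Toy two-class problem that a linear model cannot solve (XOR-like quadrants).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
labels = (X[:, 0] * X[:, 1] > 0).astype(int)
T = np.eye(2)[labels]                 # one-hot targets

W, b, beta = train_elm(X, T, n_hidden=50, rng=rng)
train_acc = np.mean(predict_elm(X, W, b, beta).argmax(axis=1) == labels)
print(f"training accuracy: {train_acc:.3f}")
```

Training reduces to a single pseudoinverse, which is why an ELM is fast to fit; the price is the larger hidden layer and the dependence on the random draw noted above.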

To leverage the inherent advantages of photonic systems, such as lower power consumption and higher bandwidth, researchers have proposed photonic implementations of ELMs (PELMs). In these architectures, the random input weights are realized through free-space light propagation or integrated microresonators, while the training stage remains in the digital domain.

In a recent work, a team of researchers from the Universitat Politècnica de València and iPronics Programmable Photonics S.L. proposed a programmable PELM (PPELM) based on a hexagonal waveguide mesh [1]. Because the photonic circuit is reconfigurable, it can implement a wide range of random matrices and so adapt to different problem domains.

The Proposed Architecture

The PPELM is implemented using the Smartlight photonic processor from iPronics, which features a hexagonal mesh composed of 72 programmable unit cells (PUCs). Each PUC is software-controlled to set the coupling and the phase difference between its two input ports, providing a versatile platform for optical signal processing.
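To make the idea of a software-controlled coupler concrete, the sketch below models one PUC as an idealized, lossless balanced Mach-Zehnder interferometer: two 50:50 couplers with a phase shifter on each arm. The summary does not describe the internal structure of the Smartlight PUCs, so this model and the name puc_matrix are assumptions used only for illustration.

```python
import numpy as np

def puc_matrix(phi_upper, phi_lower):
    """2x2 field transfer matrix of an idealized tunable coupler (assumed MZI model)."""
    coupler = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)   # ideal 50:50 coupler
    phases = np.diag([np.exp(1j * phi_upper), np.exp(1j * phi_lower)])
    return coupler @ phases @ coupler

# Equal arm phases: all power crosses to the opposite port (cross state).
cross = puc_matrix(0.0, 0.0)
# A pi phase difference: all power stays on its own port (bar state).
bar = puc_matrix(np.pi, 0.0)
# A pi/2 phase difference: power is split evenly between the two ports.
half = puc_matrix(np.pi / 2, 0.0)

for name, m in [("cross", cross), ("bar", bar), ("half", half)]:
    print(name)
    print(np.round(np.abs(m) ** 2, 3))   # power transfer between ports
```

In this model the phase difference between the arms sets the coupling ratio, and the common phase sets the phase delay of the cell, which is one way a single cell can offer the two degrees of freedom mentioned above.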

Figure 1: The hexagonal programmable photonic mesh with 72 PUCs implementing a feedforward random matrix.

As shown in Figure 1, light enters through the top left port (red arrow) and is split into four paths by a programmed splitter tree. The input data, normalized to the range [-1, 1], is then encoded in the amplitude and phase of the optical field using the blue-colored PUCs. The rest of the mesh implements the random input matrix W, with the green PUCs acting as tunable couplers and the black PUCs kept in the cross state to avoid resonant structures.
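One plausible reading of the encoding step is sketched below. It assumes an equal four-way power split and a signed encoding in which the magnitude of each feature sets the field amplitude and its sign is carried as a phase of 0 or pi. The summary only states that the data is encoded in amplitude and phase, so the exact mapping and the function name encode are hypothetical.

```python
import numpy as np

def encode(x, input_power=1.0):
    """Map features in [-1, 1] to complex optical field amplitudes (assumed mapping)."""
    x = np.asarray(x, dtype=float)
    if np.any(np.abs(x) > 1.0):
        raise ValueError("features must be normalized to [-1, 1]")
    # An ideal splitter tree divides the input power equally among the paths.
    carrier = np.sqrt(input_power / x.size)
    amplitude = np.abs(x)                      # magnitude -> field amplitude
    phase = np.where(x < 0, np.pi, 0.0)        # sign -> phase of 0 or pi
    return carrier * amplitude * np.exp(1j * phase)

fields = encode([0.5, -1.0, 0.0, 0.25])
print(np.round(fields, 3))
```

With four features and unit input power, each path carries a field amplitude of at most 0.5, so the total encoded power never exceeds the power injected into the chip.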

On-chip photodetectors at outputs 1 to 10 measure the output power, which applies the squared modulus of the optical field as the nonlinear activation function of the PPELM. The resulting high-dimensional representation of the input data forms the hidden node matrix H, which is then used to train the output weights β digitally using the formula β = H† T, where H† is the Moore-Penrose pseudoinverse (generalized inverse) of H and T is the matrix of target outputs.
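The detection and training steps can be simulated numerically as follows. This is a sketch under stated assumptions: the random complex Gaussian matrix W stands in for whatever matrix the programmed mesh actually implements, the inputs and labels are random placeholders rather than a real dataset, and signed real inputs are treated directly as optical fields.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_inputs, n_outputs, n_classes = 30, 4, 10, 3

# Placeholder data: features in [-1, 1] and random class labels.
X = rng.uniform(-1.0, 1.0, size=(n_samples, n_inputs))
labels = rng.integers(0, n_classes, size=n_samples)
T = np.eye(n_classes)[labels]                      # one-hot target matrix

# Assumed stand-in for the mesh: a fixed random complex matrix W.
W = (rng.normal(size=(n_outputs, n_inputs))
     + 1j * rng.normal(size=(n_outputs, n_inputs))) / np.sqrt(2 * n_inputs)

fields_out = X.astype(complex) @ W.T               # linear propagation through the mesh
H = np.abs(fields_out) ** 2                        # photodetection: squared modulus
beta = np.linalg.pinv(H) @ T                       # beta = pinv(H) @ T, computed digitally

scores = H @ beta
print("H shape:", H.shape, "beta shape:", beta.shape)
```

The pseudoinverse gives the least-squares solution for β, so the residual H β − T is orthogonal to the columns of H. Only this last step runs in the digital domain; everything up to H is what the chip and its photodetectors would produce.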

Experimental Validation

To validate the proposed PPELM architecture, the researchers conducted experiments on two classification tasks: the Iris dataset and the Banknote dataset. Both datasets were divided into training (70%) and test (30%) sets, and multiple models were trained with different random input matrices.
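The evaluation protocol (a 70/30 split, repeated trainings with different random input matrices, median accuracy) can be sketched as below. The data here is a synthetic three-class stand-in with four features, not the Iris or Banknote dataset, and the random complex matrix again replaces the physical mesh, so the printed accuracies say nothing about the paper's results. The choice of 20 repetitions follows the caption of Figure 2; keeping one fixed split across repetitions is an assumption.

```python
import numpy as np

def split_70_30(n_samples, rng):
    """Return shuffled index arrays for a 70% training / 30% test split."""
    order = rng.permutation(n_samples)
    n_train = int(round(0.7 * n_samples))
    return order[:n_train], order[n_train:]

def accuracy(H, beta, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    return float(np.mean((H @ beta).argmax(axis=1) == labels))

rng = np.random.default_rng(2)

# Synthetic stand-in data: 3 classes, 4 features, 50 samples per class.
centers = rng.uniform(-0.6, 0.6, size=(3, 4))
labels = np.repeat(np.arange(3), 50)
X = np.clip(centers[labels] + 0.15 * rng.normal(size=(150, 4)), -1.0, 1.0)
T = np.eye(3)[labels]

train_idx, test_idx = split_70_30(len(X), rng)     # one fixed split (assumption)

train_acc, test_acc = [], []
for _ in range(20):
    # A new random input matrix for every training run.
    W = rng.normal(size=(10, 4)) + 1j * rng.normal(size=(10, 4))
    H = np.abs(X @ W.T) ** 2                       # simulated photodetector outputs
    beta = np.linalg.pinv(H[train_idx]) @ T[train_idx]
    train_acc.append(accuracy(H[train_idx], beta, labels[train_idx]))
    test_acc.append(accuracy(H[test_idx], beta, labels[test_idx]))

print("median train accuracy:", np.median(train_acc))
print("median test accuracy:", np.median(test_acc))
```

Reporting the median over many random matrices is a natural choice for an ELM, since a single unlucky draw of the input layer can otherwise dominate the result.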

For the Iris dataset, where the task is to classify iris flowers into three species based on four features, the PPELM used 10 hidden nodes (outputs 1 to 10). The results showed a median accuracy of 98.3% on the training set and 95.3% on the test set, comparable to the 97.7% and 95.6% obtained by a fully digital model.

Figure 2: a) Accuracy of the Iris model on the training and test sets for the PPELM (Train Opt. and Test Opt.) and the digital model (Train Dig. and Test Dig.) over 20 different trainings, b) Confusion matrix for one of the PPELM models on the Iris test set, c) Accuracy of the Banknote model on the training and test sets for the PPELM and digital models over 20 trainings, d) Confusion matrix for one of the PPELM models on the Banknote test set.

The Banknote dataset, which classifies banknotes as genuine or forged based on four features, used eight hidden nodes (outputs 3 to 10). The PPELM achieved a median accuracy of 92.5% on the training set and 91.8% on the test set, while the digital model scored 98.8% and 97.9%, respectively.

Conclusion

The proposed programmable photonic extreme learning machine, based on a hexagonal waveguide mesh, demonstrates the potential of photonic systems for implementing ELMs. By leveraging the reconfigurability of the waveguide mesh, the PPELM can adapt to various problem domains by implementing different random input matrices.

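The experimental results on the Iris and Banknote datasets validate the approach. On Iris, the PPELM matches the fully digital model (95.3% versus 95.6% median test accuracy); on Banknote it trails the digital model by about six percentage points (91.8% versus 97.9%). These results come from an architecture that targets the lower power consumption and higher bandwidth expected of photonic systems.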

As research in photonic machine learning continues to advance, architectures like the PPELM pave the way for efficient and versatile implementations of machine learning algorithms, potentially enabling new applications in areas where low power consumption and high throughput are critical requirements.

Reference

[1] J. R. Rausell Campo, D. Pérez López, and J. Capmany Francoy, "Photonic Extreme Learning Machines Using Hexagonal Programmable Waveguide Meshes," Photonics Research Labs, iTEAM, Universitat Politècnica de València, Valencia, Spain; iPronics Programmable Photonics S.L., Valencia, Spain, 2024, pp. 1-6. 979-8-3503-9404-7/24/$31.00 ©2024 IEEE.