As the coauthors explain in a blog post published today, prior work has shown that a type of photonic circuit known as a Mach-Zehnder interferometer (MZI) can be configured to perform a two-by-two matrix multiplication on quantities related to the phases of two light beams. (In mathematics, a matrix is a rectangular array of numbers, symbols, or expressions arranged in rows and columns.) When many of these small matrix multiplications are arranged in a triangular mesh, the result is a circuit that implements matrix-vector multiplication, a core computation in deep learning.
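For readers who want to see the idea in code, here is a minimal NumPy sketch. The `mzi` function uses one common textbook parameterization of an MZI's two-by-two unitary (conventions vary between papers), and `triangular_mesh` composes such blocks into a larger n-by-n matrix; the exact ordering in Reck-style meshes differs in detail, so treat this as an illustration rather than the researchers' implementation.

```python
import numpy as np

def mzi(theta, phi):
    """2x2 unitary realized by a Mach-Zehnder interferometer with
    internal phase shift theta and external phase shift phi
    (one common parameterization; conventions differ by paper)."""
    s, c = np.sin(theta / 2), np.cos(theta / 2)
    return np.exp(1j * theta / 2) * np.array([
        [np.exp(1j * phi) * s, c],
        [np.exp(1j * phi) * c, -s],
    ])

def embed(u2, i, n):
    """Embed a 2x2 unitary acting on adjacent optical modes (i, i+1)
    into an n x n identity matrix."""
    u = np.eye(n, dtype=complex)
    u[i:i + 2, i:i + 2] = u2
    return u

def triangular_mesh(thetas, phis, n):
    """Compose n*(n-1)/2 MZIs, triangular-mesh style, into an
    n x n matrix. Illustrative ordering only."""
    u = np.eye(n, dtype=complex)
    k = 0
    for col in range(n - 1):
        for row in range(n - 1 - col):
            u = embed(mzi(thetas[k], phis[k]), row, n) @ u
            k += 1
    return u

# Applying a mesh to a vector of optical amplitudes is exactly a
# matrix-vector multiplication, the workhorse of deep learning layers.
n = 4
rng = np.random.default_rng(42)
n_mzis = n * (n - 1) // 2
weights = triangular_mesh(rng.uniform(0, 2 * np.pi, n_mzis),
                          rng.uniform(0, 2 * np.pi, n_mzis), n)
x = rng.normal(size=n).astype(complex)  # input light amplitudes
y = weights @ x
```

One caveat: a mesh of this kind realizes only unitary matrices. In practice, arbitrary weight matrices are obtained by sandwiching a diagonal scaling layer between two meshes, following the singular value decomposition.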
The Intel team considered two architectures for building an AI system out of MZIs: GridNet and FFTNet. GridNet, predictably, arranges the MZIs in a grid, while FFTNet slots them into a butterfly-like pattern. After training the two in simulation on a benchmark deep learning task, handwritten digit recognition (MNIST), the researchers found that GridNet achieved higher accuracy than FFTNet (98% versus 95%) when simulated with double-precision floating-point arithmetic, but that FFTNet was “significantly more robust.” In fact, GridNet’s accuracy fell below 50% with the addition of artificial noise simulating manufacturing imperfections, while FFTNet’s remained nearly constant.
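To make that robustness test concrete, the snippet below continues the sketch above: it perturbs every MZI phase with small Gaussian noise, a stand-in for manufacturing variation, and measures how far the implemented matrix drifts from the ideal one. The noise level and error metric here are illustrative choices, not the ones used in the paper.

```python
# Continues the mzi / triangular_mesh sketch above. The noise level
# (sigma) and the Frobenius-norm drift metric are illustrative
# stand-ins for the paper's noise model and accuracy measurements.
n = 8
n_mzis = n * (n - 1) // 2
rng = np.random.default_rng(0)
thetas = rng.uniform(0, 2 * np.pi, n_mzis)
phis = rng.uniform(0, 2 * np.pi, n_mzis)

ideal = triangular_mesh(thetas, phis, n)
sigma = 0.05  # phase error per MZI, in radians
noisy = triangular_mesh(thetas + rng.normal(0, sigma, n_mzis),
                        phis + rng.normal(0, sigma, n_mzis), n)

drift = np.linalg.norm(noisy - ideal) / np.linalg.norm(ideal)
print(f"relative drift of the implemented matrix: {drift:.3f}")
```

Running this experiment across many noise draws, with the perturbed matrices wired into a trained classifier, is the kind of stress test that separates a brittle architecture from a robust one.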
The scientists say their research lays the groundwork for AI software training techniques that might obviate the need to fine-tune optical chips post-manufacturing, saving valuable time and labor.
“As in any manufacturing process, there are imperfections, which means that there will be small variations within and across chips, and these will affect the accuracy of computations,” wrote Intel AI products group senior director Casimir Wierzynski. “If ONNs are to become a viable piece of the AI hardware ecosystem, they will need to scale up to larger circuits and industrial manufacturing techniques … Our results suggest that choosing the right architecture in advance can greatly increase the probability that the resulting circuits will achieve their desired performance even in the face of manufacturing variations.”