MIT designs new photonic chips that could run neural networks 10 million times more efficiently than electronic chips

Neural networks are machine learning models widely used in tasks such as robotic object recognition, natural language processing, drug development, medical imaging, and driving autonomous cars. New optical neural networks, which use optical phenomena to accelerate computation, can run faster and more efficiently than their electronic counterparts.

But as both traditional and optical neural networks grow more complex, they consume enormous amounts of energy. To address this problem, researchers and major technology companies including Google, IBM, and Tesla have developed "artificial intelligence accelerators": specialized chips that increase the speed and efficiency of training and testing neural networks.

For electronic chips, including most AI accelerators, there is a theoretical lower limit on energy consumption. Recently, MIT researchers have begun developing photonic accelerators for optical neural networks. These chips are orders of magnitude more efficient, but they rely on bulky optical components that restrict their use to relatively small neural networks.

In a paper published in Physical Review X, the MIT researchers describe a new kind of photonic accelerator that uses more compact optical components and optical signal-processing techniques to drastically reduce both power consumption and chip area. This allows the chip to scale to neural networks orders of magnitude larger than its counterparts can support.

10 million times below the energy consumption limit of traditional electronic accelerators

Simulated training of neural networks on the MNIST image-classification dataset suggests that the accelerator could theoretically process neural networks with energy consumption more than 10 million times below the limit of traditional electronic accelerators, and about 1,000 times below that of existing photonic accelerators. The researchers are now developing a prototype chip to demonstrate the result experimentally.

"People are looking for a technology that can calculate beyond the limits of basic energy consumption," said Ryan Hamerly, a postdoctoral fellow in the Electronics Research Laboratory. "Photon accelerators are promising ... but our motivation is to build a (photon accelerator) that can Expand to large neural networks. "

Practical applications of these technologies include reducing energy consumption in data centers. "There is a growing demand for data centers to run large neural networks, and as that demand grows, the computation becomes harder and harder to sustain," said co-author Alexander Sludds, a graduate student in RLE, whose aim is to "meet computational demand with neural-network hardware ... to address the bottlenecks of energy consumption and latency."

Joining Sludds and Hamerly on the paper are: co-author Liane Bernstein, an RLE graduate student; Marin Soljacic, an MIT professor of physics; and Dirk Englund, an MIT associate professor of electrical engineering and computer science, RLE researcher, and head of the Quantum Photonics Laboratory.

A more compact, energy-efficient "optoelectronic" scheme

Neural networks process data through many computational layers of interconnected nodes, called "neurons," to find patterns in the data. Neurons receive input from their upstream "neighbors" and compute an output signal that is sent to neurons farther downstream. Each input is also assigned a "weight," a value based on its relative importance to all the other inputs. As the data propagates "deeper" through the layers, the network learns progressively more complex information. Finally, an output layer generates a prediction based on the computations of all the layers.

The goal of all AI accelerators is to reduce the energy needed to process and move data during a specific linear-algebra step in neural networks, called "matrix multiplication." There, neurons and weights are encoded into separate tables of rows and columns, which are then combined to calculate the outputs.
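As a concrete illustration, here is a minimal NumPy sketch of the layer computation described above; the layer sizes and the ReLU nonlinearity are illustrative choices, not details from the paper.

```python
import numpy as np

# A toy fully connected layer: upstream neuron activations are combined
# with a weight matrix via matrix multiplication, the step that AI
# accelerators try to make cheap. Sizes here are illustrative only.
rng = np.random.default_rng(0)

x = rng.normal(size=4)        # activations from 4 upstream neurons
W = rng.normal(size=(3, 4))   # one weight per (output, input) pair

z = W @ x                     # matrix multiplication: 3 weighted sums
y = np.maximum(z, 0.0)        # a common nonlinearity (ReLU), for illustration

print(y)                      # activations passed to the next layer
```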

In a traditional photonic accelerator, a pulsed laser encodes information about each neuron in a layer; the light then flows into waveguides and passes through beam splitters. The resulting optical signals are fed into a grid of square optical components called Mach-Zehnder interferometers, which are programmed to perform the matrix multiplication. Each interferometer encodes the information of one weight and uses signal-interference techniques, processing the optical signals and weight values together, to compute an output for each neuron. But there is a scaling problem: each neuron needs a waveguide, and each weight needs an interferometer. Because the number of weights scales with the square of the number of neurons, the interferometers take up a great deal of space.
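A back-of-the-envelope sketch of that scaling problem, assuming a fully connected layer with one waveguide per input neuron and one interferometer per weight; the counts are illustrative of the argument, not a specific chip layout.

```python
# Rough component counts for a fully connected N x N layer on a
# traditional interferometer-based photonic accelerator. Assumes one
# waveguide per neuron and one interferometer per weight (illustrative).
for n_neurons in (10, 100, 1000):
    n_waveguides = n_neurons        # grows linearly with neurons
    n_weights = n_neurons ** 2      # fully connected: N * N weights
    n_interferometers = n_weights   # one interferometer per weight
    print(f"{n_neurons:>5} neurons -> {n_waveguides:>5} waveguides, "
          f"{n_interferometers:>8} interferometers")
```

The quadratic growth of the interferometer count is what caps the layer size long before the linear waveguide count becomes a problem.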

"You will soon realize that the number of input neurons will never exceed 100 or so, because you can't install so many components on the chip," Hamerly said, "if your photon accelerator cannot handle more than 100 per layer Neurons, it is difficult to apply large neural networks to this structure. "

The researchers' chip relies on a more compact, energy-efficient "optoelectronic" scheme that encodes data with optical signals but performs the matrix multiplication with "balanced homodyne detection," a technique that produces a measurable electrical signal proportional to the product of the amplitudes (wave heights) of two optical signals.
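The principle can be checked numerically: mixing two fields on a 50/50 beam splitter and subtracting the two photodetector intensities cancels the squared terms and leaves a signal proportional to the product of the amplitudes. The amplitudes below are made-up values for illustration.

```python
import numpy as np

def balanced_homodyne(a1, a2):
    """Mix two fields on a 50/50 beam splitter; return the difference
    of the two detector photocurrents (proportional to intensity)."""
    out_plus = (a1 + a2) / np.sqrt(2)    # one beam-splitter output port
    out_minus = (a1 - a2) / np.sqrt(2)   # the other output port
    return abs(out_plus) ** 2 - abs(out_minus) ** 2  # detector subtraction

a_neuron, a_weight = 0.7, -0.3           # toy real-valued amplitudes
signal = balanced_homodyne(a_neuron, a_weight)

# The subtraction cancels the a1**2 and a2**2 terms, leaving 2 * a1 * a2:
assert np.isclose(signal, 2 * a_neuron * a_weight)
print(signal)   # -0.42: an electrical signal encoding the product
```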

Light pulses encoding the input and output neuron information for each layer, which are used to train the network, flow through a single channel. Separate pulses, each encoding an entire row of weights from the matrix-multiplication table, flow through separate channels. Optical signals carrying the neuron and weight data fan out to a grid of homodyne photodetectors. The photodetectors use the amplitudes of the signals to compute an output value for each neuron. Each detector feeds its electrical output into a modulator, which converts the signal back into a light pulse. That optical signal becomes the input to the next layer, and so on.
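To make the loop concrete, here is a toy end-to-end model of a single layer in that scheme, built on the balanced-homodyne product from the previous sketch; the pulse encoding, the factor-of-two rescaling, and the ReLU-style electrical nonlinearity are all illustrative assumptions, not the paper's actual signal chain.

```python
import numpy as np

def balanced_homodyne(a1, a2):
    # Difference of the beam splitter's two output intensities = 2 * a1 * a2.
    return abs((a1 + a2) / np.sqrt(2)) ** 2 - abs((a1 - a2) / np.sqrt(2)) ** 2

def layer_optoelectronic(x, W):
    """Toy model of one optoelectronic layer: each homodyne detector
    accumulates products of neuron pulses with one row of weight pulses,
    then a modulator re-encodes the electrical result as light."""
    outputs = []
    for weight_row in W:                  # one channel per weight row
        # Each detector sums homodyne products over the incoming pulses.
        current = sum(balanced_homodyne(xi, wi)
                      for xi, wi in zip(x, weight_row))
        outputs.append(current)
    z = np.array(outputs) / 2.0           # undo the factor of 2 per product
    return np.maximum(z, 0.0)             # placeholder nonlinearity

rng = np.random.default_rng(1)
x = rng.normal(size=4)                    # neuron pulses entering layer k
W = rng.normal(size=(3, 4))               # weight pulses, one row per channel
print(layer_optoelectronic(x, W))         # optical inputs to layer k + 1
```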

This design requires only one channel per input and output neuron, and only as many homodyne photodetectors as there are neurons, with none needed for the weights. Because there are always far fewer neurons than weights, this saves a great deal of space, and the chip can scale to neural networks with more than a million neurons per layer.

Finding the sweet spot

With a photonic accelerator, some noise in the signal is inevitable. The more light injected into the chip, the lower the noise and the higher the accuracy, but that becomes very inefficient. The less input light, the higher the efficiency, but the neural network's performance suffers. There is a "sweet spot," Bernstein said, that uses the minimum optical power while maintaining accuracy.
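The tradeoff can be illustrated with a toy noise model in which measurement error shrinks as the square root of the optical power, loosely mimicking shot noise; the power values and the multiplicative-noise form are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100)
W = rng.normal(size=(100, 100))
exact = W @ x                             # noise-free reference result

# Shot-noise-like model: relative error per product shrinks as
# 1/sqrt(power), so accuracy is bought at the cost of optical energy.
for power in (1.0, 100.0, 10000.0):       # arbitrary optical power units
    sigma = 1.0 / np.sqrt(power)          # noise level falls with power
    noisy = exact * (1.0 + sigma * rng.normal(size=exact.shape))
    rel_err = np.linalg.norm(noisy - exact) / np.linalg.norm(exact)
    print(f"power {power:>8.0f} -> relative error {rel_err:.4f}")
```

The sweet spot is the lowest power at which this error is still small enough not to degrade the network's classification accuracy.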

The sweet spot of an AI accelerator is measured in how many joules it takes to perform a single operation multiplying two numbers, such as one step of a matrix multiplication. Today, traditional accelerators operate at the scale of picojoules, or trillionths of a joule. Photonic accelerators operate at the scale of attojoules, a millionth of a picojoule, making them about a million times more efficient.

In simulations, the researchers found that their photonic accelerator could operate below an attojoule per operation. "There is some minimum optical power you can send before losing accuracy. The fundamental limit of our chip is much lower than that of traditional accelerators ... and lower than other photonic accelerators," Bernstein said.
