
China’s AI Analog Chip Claimed To Be 3000X Faster Than Nvidia’s A100 GPU

A new paper from Tsinghua University, China, describes the development and operation of an ultra-fast and highly efficient AI processing chip specialized in computer vision tasks. The All-analog Chip Combining Electronic and Light Computing (ACCEL), as the chip is called, leverages photonic and analog computing in a specialized architecture that’s capable of delivering over 3,000 times the performance of an Nvidia A100 at an energy consumption that’s four million times lower. Yes, it’s a specialized chip – but instead of seeing it as market fragmentation, we can see it as another step towards the future of heterogeneous computing, where semiconductors are increasingly designed to fit a specific need rather than in a “catch-all” configuration.

As published in Nature, ACCEL is quoted as hitting 4.6 peta-operations per second in vision tasks – hence the claimed 3,000x performance improvement against Nvidia’s A100 (Ampere) and its 0.312 quadrillion operations per second. According to the research paper, ACCEL can perform 74.8 quadrillion operations per second per watt of power (what the researchers call “systemic energy efficiency”) at a computing speed of 4.6 peta-operations per second. Nvidia’s A100 has since been superseded by Hopper and its 80-billion-transistor H100 super-chip, but even that looks unimpressive against these results.

Of course, speed is essential in any processing system, but accuracy is just as critical for computer vision tasks. After all, the range of applications where these systems govern our lives and civilization is wide: it stretches from wearable devices (perhaps in XR scenarios) through autonomous driving and industrial inspection to image detection and recognition systems in general, such as facial recognition. Tsinghua University’s paper says that ACCEL was experimentally tested on Fashion-MNIST, 3-class ImageNet classification, and time-lapse video recognition tasks with “competitively high” accuracy levels (85.5%, 82.0%, and 92.6%, respectively) while showing superior system robustness in low-light conditions (0.14 fJ μm⁻² per frame).

a, The workflow of traditional optoelectronic computing, including large-scale photodiode and ADC arrays. b, The workflow of ACCEL. A diffractive optical computing module processes the input image in the optical domain for feature extraction, and its output light field is used to generate photocurrents directly by the photodiode array for analog electronic computing. EAC outputs sequential pulses corresponding to multiple output nodes of the equivalent network. The binary weights in EAC are reconfigured during each pulse by SRAM by switching the connection of the photodiodes to either V+ or V− lines. The comparator outputs the pulse with the maximum voltage as the predicted result of ACCEL. c, Schematic of ACCEL with an OAC integrated directly in front of an EAC circuit for high-speed, low-energy processing of vision tasks. MZI, Mach–Zehnder interferometer; D2NN, diffractive deep neural network. (Image credit: Tsinghua University/Nature)
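To make the EAC step in that caption more concrete, here is a minimal NumPy sketch of the described signal flow: photocurrents from the photodiode array are summed with per-pulse binary weights (each photodiode notionally switched to a V+ or V− line), and a comparator picks the pulse with the highest voltage as the prediction. The array sizes, the random "weights," and the variable names are illustrative assumptions, not values or code from the paper.

```python
import numpy as np

# Toy sketch of the EAC workflow described in the figure caption.
# Everything here (sizes, random weights) is a placeholder assumption.

rng = np.random.default_rng(0)

# Output light field of the diffractive optical module, as sampled by a
# 32x32 photodiode array (values stand in for photocurrents).
photocurrents = rng.random((32, 32)).ravel()

# Binary weights stored in SRAM: one {+1, -1} pattern per output node/pulse,
# standing in for connecting each photodiode to the V+ or V- line.
num_classes = 3
binary_weights = rng.choice([+1, -1], size=(num_classes, photocurrents.size))

# Each pulse accumulates a voltage proportional to the signed photocurrent sum.
pulse_voltages = binary_weights @ photocurrents

# The comparator selects the pulse with the maximum voltage as the prediction.
predicted_class = int(np.argmax(pulse_voltages))
print(pulse_voltages, predicted_class)
```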

In the case of ACCEL, Tsinghua’s architecture operates through diffractive optical analog computing (OAC) assisted by electronic analog computing (EAC), combining scalability, nonlinearity, and flexibility in one chip – but 99% of its operations are implemented within the optical system. According to the paper, this helps overcome constraints found in other photonic vision architectures, such as Mach–Zehnder interferometers and diffractive deep neural networks (D2NNs). A numerical sketch of what such a diffractive stage looks like follows below.
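For readers curious what the diffractive optical part looks like numerically, below is a small, hedged Python sketch of a single D2NN-style layer simulated with the angular spectrum method: an input pattern propagates through free space, picks up a phase mask, propagates again, and the resulting intensity is what a photodiode array would sample. The wavelength, pixel pitch, propagation distance, and random phase mask are placeholder assumptions for illustration only; they are not taken from the ACCEL paper.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, pitch, distance):
    """Propagate a complex optical field by `distance` using the angular spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=pitch)
    FX, FY = np.meshgrid(fx, fx)
    # Free-space transfer function; evanescent components are suppressed.
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    kz = 2 * np.pi / wavelength * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * distance) * (arg > 0)
    return np.fft.ifft2(np.fft.fft2(field) * H)

rng = np.random.default_rng(1)
n, wavelength, pitch, distance = 64, 532e-9, 2e-6, 500e-6  # assumed toy parameters

# Input image treated as a coherent field amplitude for simplicity (toy input),
# and a random phase plate standing in for a trained diffractive layer.
input_image = rng.random((n, n))
phase_mask = np.exp(1j * 2 * np.pi * rng.random((n, n)))

field = angular_spectrum_propagate(input_image.astype(complex), wavelength, pitch, distance)
field = angular_spectrum_propagate(field * phase_mask, wavelength, pitch, distance)

# Intensity at the output plane: what the photodiode array would measure
# before the EAC stage takes over.
intensity = np.abs(field) ** 2
print(intensity.shape, float(intensity.sum()))
```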
