China's AI Analog Chip Claimed To Be 3.7X Faster Than Nvidia's A100 GPU in Computer Vision Tasks
A new paper from Tsinghua University, China, describes the development and operation of an ultra-fast and highly efficient AI processing chip specialized in computer vision tasks. The All-analog Chip Combining Electronic and Light Computing (ACCEL), as the chip is called, leverages photonic and analog computing in a specialized architecture that’s capable of delivering roughly 3.7 times the performance of an Nvidia A100 in an image classification workload. Yes, it’s a specialized chip for vision tasks. But instead of seeing it as market fragmentation, we can see it as another step towards the future of heterogeneous computing, where semiconductors are increasingly designed to fit a specific need rather than in a “catch-all” configuration.
As noted in the paper published in Nature, the simulated ACCEL processor hits 4,600 tera-operations per second (TOPS) in vision tasks. This works out to roughly a 3.7X performance advantage over Nvidia’s A100 (Ampere), which is listed at a peak of 1,248 TOPS in INT8 workloads (with sparsity). According to the research paper, ACCEL has a systemic energy efficiency of 74.8 peta-operations per second per watt. Nvidia’s A100 has since been superseded by Hopper and its 80-billion-transistor H100 GPU, but even that looks unimpressive against these results.
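For readers who want to check the headline number, the speedup is simply the ratio of the two throughput figures quoted above. The short Python sketch below reproduces that arithmetic; the constant and function names are illustrative only, and the comparison assumes both figures count operations in a comparable way, which is a simplification when setting an analog photonic chip against a digital GPU.

```python
# Throughput figures as quoted in the article, in tera-operations per second.
ACCEL_TOPS = 4_600              # simulated ACCEL throughput in vision tasks
A100_INT8_SPARSE_TOPS = 1_248   # Nvidia A100 peak INT8 throughput with sparsity


def speedup(new_tops: float, baseline_tops: float) -> float:
    """Return how many times faster `new_tops` is than `baseline_tops`."""
    return new_tops / baseline_tops


ratio = speedup(ACCEL_TOPS, A100_INT8_SPARSE_TOPS)
print(f"ACCEL vs. A100: {ratio:.2f}x")  # prints "ACCEL vs. A100: 3.69x"
```

The unrounded ratio is about 3.69, which is where the rounded 3.7X figure comes from.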
Of course, speed is essential in any processing system, but for computer vision tasks accuracy matters just as much. After all, these systems are used in a wide range of applications that increasingly govern our lives: from wearable devices (perhaps in XR scenarios) through autonomous driving and industrial inspection to image detection and recognition systems in general, such as facial recognition. Tsinghua University’s paper says that ACCEL was experimentally tested on Fashion-MNIST, 3-class ImageNet classification, and time-lapse video recognition tasks with “competitively high” accuracy levels (85.5%, 82.0%, and 92.6%, respectively) while showing superior system robustness in low-light conditions (0.14 fJ μm⁻² per frame).