Huawei Ascend AI 910D processor designed to take on Nvidia’s Blackwell and Rubin GPUs

Huawei’s next-generation HiSilicon Ascend 910D AI processor is expected to offer better performance than Nvidia’s H100, reports Reuters. On a chip-versus-chip basis, the new processor will be slower than Nvidia’s Blackwell B200 and Blackwell Ultra B300 GPUs, never mind the next-generation Rubin GPUs slated to launch next year. However, Huawei’s approach of building pods with hundreds of processors should allow the Ascend 910D to compete against pods based on Nvidia’s current Blackwell and upcoming Rubin GPUs.

Huawei is preparing to start tests of its most advanced artificial intelligence processor, the Ascend 910D, with the performance goal of surpassing Nvidia’s H100 and offering a domestic alternative amid U.S. export restrictions. According to sources, Huawei has approached several local companies to assess whether the new Ascend 910D chip meets performance and deployment requirements. Initial samples are expected by late May.

Separately, Huawei plans to start large-scale shipments of its dual-chiplet Ascend 910C AI processors to Chinese customers (and probably full systems based on the chips) as early as next month. The majority of these processors were reportedly produced by TSMC for a third-party company. It remains to be seen whether the Ascend 910D will be made by China-based SMIC, or whether Huawei will once again find a way to circumvent U.S. sanctions, nearly five years after the U.S. government restricted its access to leading-edge semiconductor production capabilities.

Reaching Nvidia H100 performance levels won’t be easy for Huawei. The company’s latest dual-chiplet Ascend 910C offers around 780 BF16 TFLOPS of performance, whereas Nvidia’s H100 can deliver around 2,000 BF16 TFLOPS with sparsity (roughly half that for dense math). To achieve H100 performance levels, Huawei will have to redesign the internal architecture of the Ascend 910D and possibly increase the number of compute chiplets.
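To put the size of that gap in perspective, the short sketch below divides the two peak BF16 figures quoted above, taken as-is. The constant and function names are illustrative, and peak TFLOPS say little about sustained throughput in real workloads.

```python
# Peak BF16 throughput per processor, as quoted in the article (TFLOPS).
ASCEND_910C_BF16_TFLOPS = 780
H100_BF16_TFLOPS = 2000


def throughput_gap(target_tflops: float, current_tflops: float) -> float:
    """Return how many times faster the target chip is than the current one."""
    return target_tflops / current_tflops


gap = throughput_gap(H100_BF16_TFLOPS, ASCEND_910C_BF16_TFLOPS)
print(f"H100 / Ascend 910C peak BF16 ratio: {gap:.2f}x")
```

On these quoted numbers, the Ascend 910D would need roughly two and a half times the peak BF16 throughput of the 910C to match the H100.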

To stay competitive in the AI industry next year, Huawei will have to achieve performance comparable to that of AI clusters developed in the U.S. This year, the company introduced its CloudMatrix 384 system with 384 Ascend 910C processors. It can reportedly beat Nvidia’s GB200 NVL72 in certain workloads, but at the cost of significantly higher power consumption due to dramatically lower performance-per-watt. It also has over five times as many ‘AI processors’ as an NVL72 rack. Whether the interconnect can scale well to the required number of processors remains to be seen.
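The processor-count comparison can be checked with simple arithmetic. The sketch below uses only the figures quoted in this article (384 Ascend 910C processors at about 780 BF16 TFLOPS each, against the 72 GPUs of an NVL72 rack). The aggregate is a theoretical peak that assumes perfect linear scaling, which is exactly the open interconnect question; the variable names are illustrative.

```python
# Figures quoted in the article.
CLOUDMATRIX_PROCESSORS = 384      # Ascend 910C processors per CloudMatrix 384
NVL72_GPUS = 72                   # GPUs per GB200 NVL72 rack
ASCEND_910C_BF16_TFLOPS = 780     # peak BF16 per Ascend 910C

# How many more processors Huawei's system uses than an NVL72 rack.
processor_ratio = CLOUDMATRIX_PROCESSORS / NVL72_GPUS

# Theoretical peak of the whole system, assuming perfect scaling (an idealization).
aggregate_pflops = CLOUDMATRIX_PROCESSORS * ASCEND_910C_BF16_TFLOPS / 1000

print(f"Processor count ratio: {processor_ratio:.2f}x")
print(f"CloudMatrix 384 peak BF16: {aggregate_pflops:.1f} PFLOPS")
```

The ratio comes out a little above five, and the system peaks at roughly 300 BF16 PFLOPS on paper.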

Without access to leading-edge process technologies, it will become significantly more difficult for Huawei to remain competitive next year. Nvidia is on track to introduce its GPUs for AI and HPC codenamed Rubin in 2026. Rubin GPUs are set to be made on TSMC’s N3 (or a more advanced) fabrication process, and they should offer even higher performance-per-watt than the current-generation Blackwell GPUs.

Rubin GPUs are slated to offer around 8,300 TFLOPS of FP8 training performance, and presumably half that for BF16, which is about twice the performance of the B200. Huawei’s Ascend 910D and next-generation CloudMatrix systems with 384 such processors could theoretically offer competitive AI performance at the rack level. However, it remains to be seen what performance benefits Huawei’s Ascend 910D and Nvidia’s Rubin GPUs will actually offer compared to existing offerings. It should also be noted that Nvidia will barely be able to sell its high-performance Rubin GPUs in China, so in that market Huawei won’t really have a direct competitor.
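As a rough illustration of the rack-level argument, the sketch below estimates the per-chip BF16 throughput a 384-processor pod would need to match a Rubin rack on peak compute. The 72-GPU Rubin rack size is an assumption borrowed from the NVL72 layout, not a figure from this article, and the BF16 number is the presumed half of the quoted FP8 figure. Function and variable names are illustrative.

```python
RUBIN_FP8_TFLOPS = 8300                  # quoted FP8 training figure
RUBIN_BF16_TFLOPS = RUBIN_FP8_TFLOPS / 2  # presumed, per the article

ASSUMED_RUBIN_GPUS_PER_RACK = 72         # ASSUMPTION: same GPU count as NVL72
POD_PROCESSORS = 384                     # CloudMatrix-style pod size


def required_per_chip_tflops(rival_chips: int,
                             rival_tflops_per_chip: float,
                             own_chips: int) -> float:
    """Per-chip peak needed for own_chips to equal the rival rack's peak."""
    return rival_chips * rival_tflops_per_chip / own_chips


needed = required_per_chip_tflops(ASSUMED_RUBIN_GPUS_PER_RACK,
                                  RUBIN_BF16_TFLOPS,
                                  POD_PROCESSORS)
print(f"Presumed Rubin BF16 per GPU: {RUBIN_BF16_TFLOPS:.0f} TFLOPS")
print(f"Needed per chip in a 384-processor pod: {needed:.1f} TFLOPS")
```

Under these assumptions the required per-chip figure lands in the high 700s of TFLOPS. This is peak arithmetic only and ignores interconnect scaling, memory capacity, and power.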

Regardless of performance or efficiency, Huawei’s Ascend 910D processors will likely become China’s workhorses for AI training in the coming years. Given the strategic importance of AI, the power consumption of the Ascend 910D (or any other domestic AI processor) will not be a limiting factor, as deploying more units could offset the efficiency advantage of AI processors from Nvidia (or AMD, Intel, Broadcom, etc.). The main limiting factor for China will be its ability to produce enough processors, either domestically or overseas using proxy companies.
