Nvidia’s Jensen Huang admits AI chip design flaw was ‘100% Nvidia’s fault’ — TSMC not to blame, now-fixed Blackwell chips are in production

October 24, 2024

Nvidia’s yield-killing design flaw in its Blackwell GPU was fixed months ago, and a refined version of the B100/B200 processors is about to enter mass production. Jensen Huang, Nvidia’s CEO, admitted this week that the flaw was entirely caused by Nvidia and said that the company’s production partner TSMC helped fix it in a timely manner, according to Reuters.

“We had a design flaw in Blackwell, it was functional, but the design flaw caused the yield to be low,” Huang said. “It was 100% Nvidia’s fault.”

When the first reports about the design flaw emerged, some media outlets reported that TSMC was to blame and suggested this might be causing strain between Nvidia and its foundry partner. According to Huang, this was not the case: Nvidia’s own miscalculations caused the problem. Huang also dismissed reports of tensions between the two companies as “fake news.”

Nvidia’s Blackwell B100 and B200 GPUs link their two chiplets using TSMC’s CoWoS-L packaging technology, which relies on an RDL interposer equipped with local silicon interconnect (LSI) bridges to enable data transfer rates of about 10 TB/s. The placement of these bridges is critical. However, a reported mismatch in the thermal expansion properties of the GPU chiplets, LSI bridges, RDL interposer, and motherboard substrate caused the system to warp and fail, and Nvidia reportedly had to modify the top metal layers and bumps of the GPU silicon to improve production yields. While the company did not disclose specific details about the fix, it did mention that new masks were required.

Yield-killing problems and major functionality issues (errata) are not unheard of in the semiconductor world. Typically, companies fix them by modifying a metal layer (or two) and calling it a new stepping. Case in point: Intel’s Sapphire Rapids reportedly had 500 bugs, and the company released around a dozen steppings to fix them all (five were base respins). Every new stepping takes around three months to complete (including identifying the problem, fixing it, and producing a new version of the chip), so the speed at which Nvidia and TSMC fixed the Blackwell GPU is pretty impressive.

The now-fixed Blackwell GPUs for AI and supercomputers will enter mass production in late October and should start shipping in early 2025 (which will still fall within Nvidia’s fiscal year 2025).

That said, Nvidia disclosed earlier this year that, in order to meet demand for its Blackwell GPUs among major cloud service providers such as AWS, Google, and Microsoft, it will still have to ship some of the initial low-yield Blackwell processors in 2024. It’s unclear how many Blackwell GPUs will be shipped to data centers in 2024.
