Although Fujitsu’s Fugaku supercomputer no longer tops the TOP500 list of the world’s fastest machines, it remains a very capable system, and the versatility of the A64FX processor allows it to be used for a variety of workloads, including AI. This week Fujitsu released Fugaku-LLM, a large language model with advanced Japanese language processing capabilities that is designed for both research and commercial applications.
Fugaku-LLM was trained on 380 billion tokens using 13,824 nodes of the Fugaku supercomputer, which is based on the A64FX processor. The A64FX supports FP64, FP32, FP16, and INT8 modes, making it suitable for both AI and conventional supercomputing applications. The training of Fugaku-LLM naturally took advantage of distributed parallel training techniques optimized for the supercomputer’s architecture and its Tofu Interconnect D.
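Fujitsu has not disclosed its exact training stack here, but the general pattern of distributed data-parallel training, where each node processes a distinct shard of the data and gradients are averaged across nodes after every step, can be sketched with PyTorch’s DistributedDataParallel. The model, dataset, and hyperparameters below are hypothetical placeholders, not Fugaku-LLM’s actual configuration:

```python
# Illustrative sketch of distributed data-parallel training.
# NOTE: the model, dataset, and hyperparameters are hypothetical
# placeholders; Fugaku-LLM's real stack is not described in this article.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler


def main():
    # One process per device; rank and world size come from the launcher
    # (e.g. torchrun). On CPU-only hardware, use the "gloo" backend.
    dist.init_process_group(backend="gloo")
    rank = dist.get_rank()

    # Placeholder model standing in for a large language model.
    model = torch.nn.Sequential(
        torch.nn.Embedding(1000, 64),
        torch.nn.Flatten(),
        torch.nn.Linear(64 * 16, 1000),
    )
    # DDP all-reduces gradients across ranks during the backward pass.
    ddp_model = DDP(model)

    # Synthetic token data standing in for a pretraining corpus.
    tokens = torch.randint(0, 1000, (4096, 16))
    labels = torch.randint(0, 1000, (4096,))
    dataset = TensorDataset(tokens, labels)
    # DistributedSampler shards the data so each rank sees a distinct slice.
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    optimizer = torch.optim.AdamW(ddp_model.parameters(), lr=1e-4)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for batch_tokens, batch_labels in loader:
            optimizer.zero_grad()
            loss = loss_fn(ddp_model(batch_tokens), batch_labels)
            loss.backward()  # gradient all-reduce overlaps with backward
            optimizer.step()
        if rank == 0:
            print(f"epoch {epoch}: loss {loss.item():.4f}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Launched with something like `torchrun --nproc_per_node=4 train.py`, each process trains on its own shard while staying synchronized, which is the same basic principle, scaled up enormously, behind training across thousands of Fugaku nodes.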