India’s First Vision-Language Foundational Model for Documents

June 5, 2025

BharatGen is a Government-supported initiative for developing India-centric Multimodal Large Language Models. A team representing BharatGen from the International Institute of Information Technology, Hyderabad (IIIT-H) and the Indian Institute of Technology, Bombay (IIT-B) has launched Patram-7B-Instruct, India’s first vision-language foundational model built from scratch for complex document understanding tasks.

Patram is part of the BharatGen suite of Multimodal Large Language Models being created with funding from the Department of Science and Technology (DST). Patram-7B-Instruct is a 7-billion-parameter vision-language AI model trained on a large and diverse collection of Indian documents. It is designed to analyze and understand scanned or photographed documents and respond to natural-language instructions. The model is now freely available as an open-source release on Hugging Face and MeitY IndiaAI’s AIKosh platform.

Despite its compact size, Patram outperforms several larger international models, including DeepSeek-VL-2, on key benchmarks such as DocVQA and VisualMRC. It also shows strong results on Patram-Bench, a custom benchmark reflecting real-world Indian document scenarios.

Patram was officially unveiled on June 2, 2025, by Shri Jitendra Singh, Hon’ble Minister of State for Science and Technology, at the BharatGen National Summit in New Delhi, in the presence of Prof. Abhay Karandikar, DST Secretary; Shri Kris Gopalakrishnan, Chair of the MGB-NMICPS; Shri Abhishek Singh, Additional Secretary, MeitY; and other dignitaries. Prof. P. J. Narayanan, Director of IIIT Hyderabad, also attended.

Prof. P. J. Narayanan, Director, IIIT Hyderabad, said, “Patram marks a significant step as India designs state-of-the-art foundational models. With this launch, we integrate language available in all forms: as text, as speech, and as images. This can power multimodal applications with integrated vision-language intelligence.”

Patram was developed in just five months by a team based at IIIT Hyderabad, consisting of engineers (alumni) and student interns, with support from IIIT-H and TiH-IoT, IIT Bombay. The team was led by Dr. Ravi Kiran Sarvadevabhatla, Associate Professor at IIIT-Hyderabad, and Dr. Ganesh Ramakrishnan, Professor at IIT-Bombay.

Dr. Ravi Kiran Sarvadevabhatla, Associate Professor at IIIT-Hyderabad and lead researcher on the project, said, “With Patram, we’ve built a model that understands the unique structure and diversity of Indian documents. This is just the beginning of what India can achieve in vision-language AI.”

Alongside Patram, DocBodh, a generative AI suite for Indic document intelligence, was also launched. DocBodh is designed for use across sectors like governance, education, law, and business.

This initiative reinforces India’s commitment to building open, inclusive, and cutting-edge AI infrastructure that aligns with national goals such as Digital India and Atmanirbhar Bharat.

About IIIT-Hyderabad

The International Institute of Information Technology, Hyderabad (IIIT-H) is an autonomous research university founded in 1998 that focuses on the core areas of Information Technology, such as Computer Science, Electronics and Communications, and their applications in other domains through inter-disciplinary research with great social impact. Some of its research domains include Visual Information Technologies, Human Language Technologies, Data Engineering, VLSI and Embedded Systems, Computer Architecture, Wireless Communications, Algorithms and Information Security, Robotics, Building Science, Earthquake Engineering, Computational Natural Sciences and Bioinformatics, IT in Agriculture and e-Governance.
