Maharashtra

BharatGen

  • 26 Nov 2025

Why in News? 

BharatGen, India’s first sovereign, multilingual, and multimodal AI-driven Large Language Model (LLM), was highlighted by the Union Minister of Science and Technology during his visit to IIT Bombay.

Key Points 

  • About BharatGen: 
    • BharatGen is India’s first government-supported sovereign AI stack, developed to support over 22 Indian languages across text, speech, and document-vision modalities. 
    • It has been developed by a consortium led by IIT Bombay, in collaboration with IIT Madras, IIT Kanpur, IIT Hyderabad, IIT Mandi, IIT Kharagpur, IIIT Hyderabad, IIIT Delhi, and IIM Indore. 
    • The project is funded with ₹235 crore under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS), along with an additional ₹1,058 crore from the Ministry of Electronics and Information Technology (MeitY) under the IndiaAI Mission, bringing the total public investment to ₹1,293 crore.
    • It is designed to advance India’s digital sovereignty, ensuring that AI development is rooted in Indian languages, cultural contexts, and national priorities.

Core AI Models Under BharatGen 

| Model Name | Modality | Key Specifications |
|---|---|---|
| Param-1 | Text (LLM) | Foundational text model trained on 7.5 trillion tokens, with one-third of the dataset comprising Indian linguistic data. |
| Shrutam | Speech (ASR) | Automatic Speech Recognition (ASR) model designed to handle complex Indian linguistic diversity, including multiple dialects. |
| Sooktam | Speech (TTS) | Text-to-Speech (TTS) model providing speech synthesis in nine Indic languages. |
| Patram | Document Vision | India’s first document-vision model trained to interpret complex India-specific document formats such as identity records and legal documents. |

  • About Bharat Data Sagar: 
    • Bharat Data Sagar is one of India’s largest sovereign data initiatives, aimed at ensuring complete national ownership of AI-relevant datasets. 
    • It focuses on India-centric data curation, covering diverse languages, dialects, cultural nuances, and region-specific contexts. 
    • It ensures data sovereignty, accuracy, and long-term national regulation of India’s AI ecosystem.