A challenging road ahead for India's DeepSeek moment in AI development

Even though DeepSeek has shown how to build a low-cost foundational AI model, India still has many hurdles to cross


Shivani Shinde
Even as the world was marvelling at DeepSeek, another China-based tech firm, Alibaba, announced a new version of its Qwen 2.5 AI model, which surpassed the performance of DeepSeek's V3 on several key benchmarks.
 
The back-to-back announcements of advanced AI models from Chinese companies have raised a question in India too: what is stopping the country from developing its own foundational AI model?
 
This was partly answered by IT Minister Ashwini Vaishnaw in a recent press conference. He stated that India would have multiple sovereign foundational AI models developed and ready for deployment over the next eight to ten months. The Ministry of Electronics and Information Technology has been working with experts in large language models (LLMs) and small language models (SLMs) over the last 18 months to develop the framework for India’s foundational AI model, Vaishnaw said. He also highlighted the issue of compute capacity, a critical requirement for building foundational models.
 
 
“We have nearly 15,000 high-end GPUs. Just to give you some context, DeepSeek was trained on 2,000 GPUs while ChatGPT version 4 was trained on 25,000 GPUs. This (procurement of 18,693 GPUs) gives us a very robust compute facility, which will give a boost for creating AI applications, models, distillation and training processes and creating new algorithms,” Vaishnaw said.
 
While the announcement was met with optimism from industry experts and academia, they pointed out that challenges persist. Compute power is just one aspect of building foundational models; other crucial factors must also be addressed.
 
Many experts noted that DeepSeek challenges the assumption that foundational models can only be built with massive capital investment. However, it does not entirely eliminate financial constraints.
 
“The model was published by DeepSeek under an MIT license, which permits free reuse and modification. This is one of the few models with a highly permissive license, making it particularly valuable for open-source enthusiasts and developers. Unlike restrictive licenses, the MIT license allows users to build better models on top of DeepSeek’s foundation, fostering innovation and collaboration,” explains Y Kiran Chandra, founder of Swecha and head of Viswam AI, a collaborative centre of excellence (CoE) set up by Swecha and IIIT Hyderabad.
 
Viswam AI is a centre of excellence dedicated to developing AI solutions tailored to the unique needs of the Global South.
 
Chandra, however, points out that the model cannot be considered fully open source because the training data has not been made available.
 
“For a project to be truly open-source, all four critical components (algorithms, datasets, model weights, and source code) must be accessible for public review and use,” he added.
 
That said, DeepSeek’s release of an “open-weight” model is still a significant milestone. By making the model weights available, the company has enabled researchers and developers to study, adapt, and build upon it.
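For a sense of what “open weight” means in practice, here is a minimal sketch of pulling published weights and running them locally with the Hugging Face transformers library. The repository ID and generation settings are illustrative assumptions, not details from the article:

```python
# Minimal sketch: loading openly published model weights for local inference.
# The repo ID below is an assumption; substitute whichever open-weight
# checkpoint you are studying.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1"  # assumed Hugging Face repo name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # let the library choose a suitable precision
    device_map="auto",       # spread layers across available GPUs/CPU
    trust_remote_code=True,  # the repo may ship custom modelling code
)

prompt = "Explain why open model weights matter for researchers."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the weights themselves, not just an API, are in hand, researchers can inspect, fine-tune, or distil the model locally, none of which a closed, API-only release allows.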
 
“This approach demonstrates that high-quality AI models can be developed without relying solely on the brute-force, resource-intensive methods often pursued by organisations in the Global North. Already, the community has begun innovating with the model, for instance, some have extracted the reasoning component and integrated it with other models like LLaMA, showcasing its versatility and potential for hybrid applications,” said Chandra.

DeepSeek’s achievement has raised questions for India: Is India too reliant on the Western world? Can it create a foundational model unique to India? Will such a model be able to solve bias and hallucination issues? And more.
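The "extract the reasoning and graft it onto another model" pattern Chandra describes is typically done by distillation: a large teacher model writes out step-by-step answers, and a smaller open model, such as a LLaMA variant, is then fine-tuned on those traces. Below is a rough sketch of the data-generation half of that process, with hypothetical model IDs, prompts, and file names:

```python
# Hypothetical sketch of the first half of reasoning distillation: a large
# "teacher" model produces step-by-step answers, which are saved as a
# supervised fine-tuning dataset for a smaller "student" (e.g. a LLaMA variant).
import json
from transformers import pipeline

# Assumed model ID; any instruction-tuned reasoning model would do.
teacher = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1",
    device_map="auto",
    trust_remote_code=True,
)

questions = [
    "A train travels 120 km in 2 hours. What is its average speed?",
    "If 3 pens cost Rs 45, how much do 7 pens cost?",
]

records = []
for q in questions:
    # Ask the teacher to show its working; the trace is the training signal.
    out = teacher(
        f"Question: {q}\nThink step by step, then give the answer.",
        max_new_tokens=256,
    )[0]["generated_text"]
    records.append({"prompt": q, "completion": out})

# The student model is later fine-tuned on this JSONL file.
with open("reasoning_traces.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")
```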
 
Kunal Walia, partner at Dalberg Advisors, says that fostering an environment for innovation within these limitations and, most importantly, developing high-quality, India-specific datasets will be critical to success.
 
“A stronger focus on research and development is essential, along with collaboration between public and private sectors to mobilise resources for building and training these models. This development serves as a wake-up call, and India will naturally move in this direction,” said Walia.
 
Professor Balaraman Ravindran, who heads both the Centre for Responsible AI and the Wadhwani School of Data Science and AI at IIT Madras, believes India needs a stronger AI research ecosystem with better incentives for fundamental research. While China has built a thriving AI landscape with deep investments, India must focus on industry-academia collaboration and increased VC funding to bridge the gap.
 
“We do have high-quality research happening in India, but in very few universities, perhaps confined to a few IITs, IISc, and IIITs. In China, they probably have 100 institutes of that level. The amount of investment that some of the top universities in China get for fundamental research is way higher than we can imagine… I think Tsinghua and Peking Universities get more money for research than the entire Indian academia,” added Ravindran.
 
Perhaps the government of India is realising this. Education and literacy have received a total outlay of Rs 78,572 crore for FY25-26, the highest ever for the Department of School Education and Literacy. In addition, the government announced the setting up of an AI Centre of Excellence with an investment of Rs 500 crore.

Experts also point out that the biggest challenge in creating a foundational model for Indian use cases is the availability of digital-first datasets. Chandra believes that India can create its own datasets easily through community sourcing. Swecha has created India’s first Telugu-language SLM, Chandamama Kathalu, followed by the Gonthuka project, which collected 1.5 million voice samples from 45,000 contributors.
 
Chandra stresses that India needs a culturally rich LLM: language is a compressed form of expression and is intrinsic to cultural nuance. “To tackle that as a community, we have been able to collect close to 50 million tokens. We believe that it is possible to build a large language model with cultural nuances and datasets with 200 million tokens. We trained 30,000 students last summer to get data from nearby villages with cultural nuances,” added Chandra.
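For context on those figures, a “token” is the sub-word unit a model’s tokenizer produces, and corpus size is measured by encoding the collected text, roughly as below. The tokenizer here is an illustrative stand-in, not the one Viswam AI uses:

```python
# Rough sketch of measuring a corpus in tokens. The GPT-2 tokenizer is an
# illustrative choice; a Telugu corpus would use a tokenizer trained on
# Telugu text for realistic counts.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

corpus = [
    "A collected folk story ...",
    "A transcribed village interview ...",
]  # community-sourced documents

total_tokens = sum(len(tokenizer.encode(doc)) for doc in corpus)
print(f"corpus size: {total_tokens} tokens")  # target scale: ~200 million
```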


First Published: Feb 02 2025 | 11:03 PM IST