The Indian Institute of Technology Madras (IIT Madras) announced earlier this week that it has launched an institute dedicated specifically to the development of language technology for Indian languages.
The Nilekani Centre at AI4Bharat officially launched July 28 after receiving a ₹360 million (roughly $4.5 million) grant from Nilekani Philanthropies. The institute aims to advance language technology for the local languages spoken throughout India, which have historically received less attention in artificial intelligence (AI) and language technology development than languages such as English or Chinese.
“The Digital India Bhashini mission has been launched with the goal of all services and information being available to citizens in their own language with ‘collaborative AI’ at the core of the design,” said Nandan Nilekani, the Indian entrepreneur for whom the center is named. “AI4Bharat will further contribute to and accelerate the Indic language AI work as a public good and is fully aligned with the goals of the Bhashini mission.”
India is home to more than 400 languages, making it one of the most linguistically diverse countries in the world. Even so, languages native to the country are relatively underrepresented on the web and in tech. For example, despite having the fourth-largest population of native speakers in the world, Hindi is the 35th most-used language on the internet. Because researchers have less data to work with, Indian languages have not benefited as much from recent improvements in language technology.
The Nilekani Centre at AI4Bharat could change that. Anoop Kunchukuttan, a researcher at Microsoft, said that while developing AI technology for Indian languages is very expensive, organizations like Nilekani Philanthropies have shown strong support for “efforts to build open-source AI.” By creating datasets and pre-trained models, the center’s leadership hopes to enable start-ups and other private-sector companies to build on the research and advance language technology for Indian languages.
“Given the rich diversity of languages in India coupled with a rapidly expanding digital world, it is important to make significant advances in language technology to benefit the common man,” said Mitesh Khapra, a professor of computer science and engineering at IIT-Madras. “While language technology has significantly improved for English and a few languages, Indian languages are lagging behind. The focus of the Centre would be to bridge this gap.”