India speaks over 100 languages. Microsoft wants AI to bridge its linguistic gaps

January 28, 2024 ndowd

Depending on how you count, India has at least 120 languages, and another 1,300 “mother tongues,” an Indian term that refers to local dialects. The country’s government recognizes 22 languages but primarily operates in just two: Hindi, mostly spoken in India’s north, and English. That excludes tens of thousands of Indians who speak neither.

Enter Microsoft’s AI for Good initiative—the tech giant’s umbrella program that tries to use AI to solve problems in health, environmental protection, and human development. The U.S. company has used India to pilot several novel uses of the new technology, such as an app that uses AI to tell farmers the best time to sow seeds or a model that uses satellite images to forecast how a natural disaster might hurt a vulnerable population.

But Microsoft and its AI researchers are particularly interested in navigating India’s linguistic challenges, hoping it might unlock breakthroughs elsewhere. “India’s complexity makes it a test bed for multilingual settings everywhere,” says Ahmed Mazhari, Asia president for Microsoft. “If you can solve and build for India, then you can solve and build for the world.”

Small languages and large language models

The Jugalbandi chatbot, which Microsoft debuted in May 2023, is one of AI for Good’s flagship projects. The chatbot is aimed at rural farmers, specifically those living in areas where India’s more widely spoken languages are not used, who want to learn about or access public services, such as applying for a scholarship.

Jugalbandi uses a large language model, developed with local research lab AI4Bharat, to parse a query, uncover the relevant information, then generate an easy-to-understand answer in the user’s local tongue. (Currently, Jugalbandi can translate 10 of India’s 22 official languages.)

(Fortune earlier featured Microsoft’s work with AI and Jugalbandi on its 2023 “Change the World” list.)

Another Microsoft initiative called VeLLM, or “Universal Empowerment with Large Language Models,” aims to improve how GPT, the OpenAI-developed model that underpins ChatGPT, works in less-popular languages. Most of today’s large language models work best in a handful of major global languages, primarily English and Chinese, because so much of the available data is in those two languages. It’s harder to train AI on so-called low-resource languages, where data is scarce or non-existent.

VeLLM is the foundation for other experiments with AI, like Shiksha, a generative AI bot that helps teachers create new curricula in non-English languages quickly, freeing up more time to be spent on teaching.

‘Participatory’ design

Microsoft engineers like Kalika Bali, principal researcher for Microsoft Research India, are wary of cutesy technology solutions that don’t reflect how rural Indians live their lives.

Technologists have long tried to use the South Asian country as a testing ground to prove that digital technologies—cheap laptops, affordable internet, and smartphone apps—can improve quality of life in rural India.

Yet not every initiative was a success, Bali notes dryly. She remembers one project in which designers from a development organization tried to create a game to help women farmers in India access important information.

“The women gave that person such a disdainful look,” she said. “They said ‘Do you think we have time for playing games?’”

Instead, Bali says she and her team pursue a “participatory” design process. “We spend a lot of time with the communities that we are working for, trying to have them say what they want out of a technology, or how they want to solve a problem,” she says.

Not just social good

Microsoft, of course, isn’t just interested in AI for its potential for social good. The U.S. tech giant is developing its own AI products, hosted on its Azure cloud computing system. It’s also a key backer of ChatGPT developer OpenAI. The hype around AI has helped lift Microsoft’s stock by 65% over the past year, pushing its market value to $3 trillion, making it the U.S.’s most valuable company.

Mazhari sees a lot of opportunity for Microsoft in Asia, where there is “an incredible pace of change and transformation across industries and geographies.” He points to examples of Asian companies turning to Microsoft’s generative AI services. Lazada, the Southeast Asian e-commerce platform owned by Alibaba, for instance, used Microsoft tools to create the first e-commerce chatbot in Southeast Asia.

Still, even if Microsoft’s experiments in India don’t do much for the company’s bottom line directly, they provide important lessons for the company going forward.

“Our partnerships under AI for Good and other pilot initiatives enable us to pick up early signals for advancing AI security and safety,” Mazhari says. Those lessons are then used to develop “policies for much-needed guardrails” on the new technology.

Bali knows that you can’t separate her work from Microsoft’s overall business interest in AI.

“These are early forays in terms of how to make people who do not have access to technology get on the technology wagon,” she says. “Then they will become, hopefully, future technology users who would, amongst other things, also use Microsoft products.”

