On July 6, Meta announced a major milestone in its No Language Left Behind (NLLB) initiative: NLLB-200, an artificial intelligence model that can translate content to and from 200 different languages.
The announcement comes just a few months after the company said it would begin developing a machine translation (MT) tool capable of translating the majority of the world’s 7,000 or so languages. The company shared details on NLLB-200’s development and performance in a 190-page paper and also launched a demo site where users can test NLLB-200 themselves.
“As we build for the metaverse, integrating real-time AR/VR text translation in hundreds of languages is a priority,” the company writes. “Our aim is to set a new standard of inclusion — where someday everyone can have access to virtual-world content, devices and experiences, with the ability to communicate with anyone, in any language in the metaverse.”
In its first announcement of the NLLB initiative, the company noted that it was particularly interested in making MT more accessible for speakers of low-resource languages. Low-resource languages, which make up the vast majority of the world’s languages, receive less attention in mainstream research than high-resource languages like English and Spanish.
Meta notes that, as high-resource languages pervade the internet, it could soon become inaccessible to speakers of low-resource languages unless efforts are made to improve accessibility. Of the 200 languages supported by NLLB-200, roughly three-quarters are considered low-resource.
Though NLLB-200 was only announced this week, it’s already being put to use. Meta stated that it has partnered with the Wikimedia Foundation to give Wikipedia editors access to the technology so they can quickly translate Wikipedia articles into low-resource languages that have little presence on the site.
“Translation is one of the most exciting areas in artificial intelligence because of its impact on people’s everyday lives,” reads a July 6 blog post announcing the initiative’s 200-language milestone. “NLLB is about much more than just giving people better access to content on the web. It will make it easier for people to contribute and share information across languages.”