Facebook’s parent company Meta announced today that it is working on a long-term effort to build language and machine translation (MT) tools that will include most of the world’s languages. The ambitious aim is to create real-time translation technologies that include everyone in the world by teaching AI to translate hundreds of spoken and written languages.
The two new projects announced are:
No Language Left Behind: building an advanced AI model that can learn from languages with fewer examples to train from, ranging from Asturian to Luganda to Urdu.
Universal Speech Translator: designing a new approach to translating speech in one language into another in real time, without the need for a standard writing system.
Today’s MT systems still rely heavily on learning from large amounts of textual data, so they do not work well for low-resource languages, for which little training data exists. By advancing and open-sourcing work in corpus creation, multilingual modeling, and evaluation, Meta hopes to give other researchers the opportunity to build on its work.
According to Meta, the AI translation systems currently available are not yet able to serve the thousands of languages in existence. To provide real-time speech-to-speech translation and serve everyone, MT researchers worldwide will need to overcome three important challenges:
- Data scarcity: acquiring more training data in more languages
- Data leveraging: finding new ways to leverage data already available
- Modeling scalability: growing models to serve many more languages
MT systems typically rely on learning from large data sets to be able to provide high-quality translations. There are many languages around the world for which we do not have that kind of data. Developing systems outside of the handful of languages that dominate the web will therefore require finding new ways to acquire and use training examples from languages with sparse web presences.
For primarily oral languages without a standard writing system, the challenge is even greater, because most speech MT systems use text as an intermediate step: they first transcribe the source speech into text, translate that text into the target language, and finally convert the result back into speech. That is why Meta is developing speech-based approaches.
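The cascaded design described above can be sketched as three chained stages. The functions below are toy lookup tables standing in for real ASR, MT, and TTS models; the point is only the shape of the pipeline, in which text is the intermediate representation at both internal boundaries, which is why a language with no writing system breaks it.

```python
def asr(audio_tokens):
    """Stage 1: speech recognition — transcribe source audio into text (toy lookup)."""
    lexicon = {"HOLA": "hola", "MUNDO": "mundo"}
    return [lexicon[t] for t in audio_tokens]

def translate(words):
    """Stage 2: text-to-text machine translation (toy lookup)."""
    bilingual = {"hola": "hello", "mundo": "world"}
    return [bilingual[w] for w in words]

def tts(words):
    """Stage 3: speech synthesis in the target language (toy: uppercase = audio)."""
    return [w.upper() for w in words]

def cascaded_speech_translation(audio_tokens):
    # Text flows between every stage; an unwritten source or target
    # language leaves stage 1 or stage 3 with nothing to produce/consume.
    return tts(translate(asr(audio_tokens)))

print(cascaded_speech_translation(["HOLA", "MUNDO"]))  # ['HELLO', 'WORLD']
```

A direct speech-to-speech model would replace all three stages with one function mapping source audio to target audio, with no text in between.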
Additionally, most current MT systems are bilingual, meaning that there is a separate model for each language pair. It is Meta’s ambition to scale that to hundreds of languages simultaneously.
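The difference between the bilingual and multilingual setups can be made concrete with a small sketch. The names and tag format below are illustrative assumptions (not Meta's API): the bilingual setup keeps one model per direction, which grows quadratically with the number of languages, while many-to-many multilingual systems commonly use a single model and signal the target language with a prepended tag token.

```python
# Bilingual setup: one model object per (source, target) direction.
# For n languages this needs on the order of n*(n-1) separate models.
bilingual_models = {
    ("de", "en"): lambda s: f"en({s})",   # stand-in for a trained de->en model
    ("fr", "en"): lambda s: f"en({s})",   # stand-in for a trained fr->en model
}

def translate_bilingual(text, src, tgt):
    return bilingual_models[(src, tgt)](text)

# Multilingual setup: a single shared model. The desired output language is
# encoded in the input itself as a tag token such as "<en>".
def multilingual_model(tagged_text):
    tag, _, body = tagged_text.partition(" ")
    return f"{tag.strip('<>')}({body})"   # stand-in for one shared many-to-many model

def translate_multilingual(text, tgt):
    return multilingual_model(f"<{tgt}> {text}")
```

Scaling to hundreds of languages with the first design means maintaining tens of thousands of models; the second keeps one model and only adds tags.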
One of the biggest challenges with real-time speech-to-speech MT models is latency – the lag that occurs while words are being converted. Even professional simultaneous interpreters lag several seconds behind the original speech. The German sentence “Ich möchte gerne in alle Sprachen übersetzen” and its French equivalent, “J’aimerais traduire dans toutes les langues,” both mean “I would like to translate into all languages.” But translating the German sentence into English in real time is harder for an MT engine than translating the French one, because French and English share a similar word order, while German places the verb (“übersetzen”) at the end of the sentence, forcing the system to wait before it can produce the English verb.
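The lag described above is often studied in simultaneous MT research through the wait-k policy: read k source tokens before emitting anything, then alternate one output token per input token. The sketch below (my illustration, not Meta's system) only models the read/write schedule; the output is an identity stand-in for actual translation.

```python
def wait_k_schedule(source_tokens, k):
    """Return a list of (action, token) pairs for a wait-k policy:
    'read' consumes one source token, 'write' emits one output token."""
    schedule = []
    written = 0
    for i, tok in enumerate(source_tokens, start=1):
        schedule.append(("read", tok))
        if i >= k:  # after the initial k-token wait, write one token per read
            schedule.append(("write", source_tokens[written]))
            written += 1
    # flush the remaining output once the source is exhausted
    while written < len(source_tokens):
        schedule.append(("write", source_tokens[written]))
        written += 1
    return schedule

# With k=2, output starts after only two source tokens have been heard:
print(wait_k_schedule(["a", "b", "c"], k=2))
```

For a verb-final sentence like the German example, no small fixed k suffices: a faithful English rendering cannot begin its verb until “übersetzen” arrives at the very end, which is exactly the latency problem the article describes.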
Finally, while scaling to more and more languages, Meta is also developing new ways of evaluating the work produced by MT models. With No Language Left Behind and Universal Speech Translator, Meta is aiming to make it possible for billions of people to communicate in the Metaverse in their native or preferred languages.