On Nov. 10, Facebook won a prestigious award at the Sixth Annual Conference on Machine Translation (WMT) for its multilingual model, which could make the future development of machine translation (MT) models much simpler and more accessible.
The company explained the development in more detail in a recent blog post. The model differs from other widely used MT models in that, rather than relying on bilingual models that each translate between a single pair of languages, Facebook’s multilingual model uses one model to translate among many languages. The company believes this is a more scalable approach to developing MT, and one that can improve the prospects of MT development for languages with fewer resources and less training data.
“Most MT systems today use groups of bilingual models, which typically require extensive labeled examples for each language pair and task,” reads the blog post from Facebook’s AI team. “Unfortunately, this approach fails many languages with scarce training data (e.g., Icelandic and Hausa). Its high complexity also makes it impractical to scale to practical applications on Facebook, where billions of people post in hundreds of languages every day.”
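The complexity argument above can be made concrete with some back-of-the-envelope arithmetic (a hypothetical illustration, not Facebook's code): supporting N languages with bilingual models requires a separate model for every ordered language pair, whereas the multilingual approach needs only one model in total.

```python
def bilingual_models_needed(n_languages: int) -> int:
    """Number of separate bilingual models for n languages.

    One model per ordered pair (e.g., en->fr and fr->en are
    distinct systems), so the count grows quadratically.
    """
    return n_languages * (n_languages - 1)

# A single multilingual model covers every direction at once.
print(bilingual_models_needed(10))   # 90 separate bilingual models
print(bilingual_models_needed(100))  # 9,900 models for 100 languages
```

The quadratic growth is why Facebook describes the bilingual approach as impractical at the scale of "hundreds of languages": each additional language adds hundreds of new model pairs, each needing its own labeled training data.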
According to the company, this is the first time a multilingual model has significantly outperformed bilingual models on translation tasks across multiple language pairs. The company created two multilingual models: one that translates from English into any of the other languages, and one that translates from any of those languages into English.
“(This) brings us one step closer to building a universal translator that connects people in all languages around the world, regardless of how much translation data exists,” the company wrote.
Traditionally, multilingual models have not been widely used because they can be difficult and inefficient to implement. However, Facebook notes that recent developments in artificial intelligence have made it possible to streamline that process.
“MT as a field has had impressive advances in bridging barriers, but most have centered on a handful of widely spoken languages,” the company wrote. “Low-resource translation remains a ‘last mile’ problem for MT and the biggest open challenge for the subfield today.”