The team behind Microsoft’s Azure AI platform recently announced that the company’s machine translation (MT) system, Microsoft Translator, can now translate text across more than 100 languages.
On Oct. 11, the company announced that it had added 12 new languages and dialects to the service, including Kyrgyz, Mongolian (in both the Cyrillic and traditional scripts), Bashkir, and Tibetan. The additions bring the number of languages available on the platform to more than 100, making it one of the few MT systems to reach this milestone.
“One hundred languages is a good milestone for us to achieve our ambition for everyone to be able to communicate regardless of the language they speak,” said Xuedong Huang, a technical fellow at Microsoft serving as Azure AI’s chief technology officer, in a statement.
Google Translate, one of Microsoft’s primary competitors in MT, crossed the 100-language threshold in 2016 and currently supports 109 languages. Microsoft claims that the languages its MT system now supports make text accessible to more than 5 billion people worldwide.
Microsoft attributes the milestone to its multilingual artificial intelligence model, called Z-code. Z-code groups together languages that belong to the same language family, allowing the models for related languages to learn from one another. This also reduces the amount of data needed to train a high-quality system for any given language.
For example, the company noted in its announcement that this method was used to improve translation quality for Romanian: training the Romanian model alongside other Romance languages such as Italian, French, and Spanish significantly improved the Romanian translations produced by Microsoft Translator.
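To make the family-grouping idea concrete, here is a minimal Python sketch of how training examples from related languages might be pooled so that a language with little data shares a pool with its better-resourced relatives. This is an illustration only, not Microsoft's Z-code implementation: the `LANGUAGE_FAMILY` table, the `pool_by_family` function, and the placeholder corpora are all hypothetical names and values invented for the example.

```python
from collections import defaultdict

# Hypothetical language-to-family table (illustrative, not Microsoft's grouping).
LANGUAGE_FAMILY = {
    "Romanian": "Romance",
    "Italian": "Romance",
    "French": "Romance",
    "Spanish": "Romance",
    "Kyrgyz": "Turkic",
    "Bashkir": "Turkic",
}


def pool_by_family(corpora):
    """Merge per-language training examples into one pool per language family."""
    pools = defaultdict(list)
    for language, examples in corpora.items():
        family = LANGUAGE_FAMILY[language]
        # Tag each example with its language so a shared model could tell them apart.
        pools[family].extend((language, text) for text in examples)
    return dict(pools)


# Placeholder corpora: the strings stand in for real training sentences.
toy_corpora = {
    "Romanian": ["ro-1"],
    "Italian": ["it-1", "it-2"],
    "French": ["fr-1", "fr-2"],
    "Spanish": ["es-1", "es-2"],
    "Kyrgyz": ["ky-1"],
}

pools = pool_by_family(toy_corpora)
print(len(pools["Romance"]))  # 7 examples in the shared Romance pool
```

In this toy setup, Romanian contributes a single example but would be trained alongside six examples from its sister languages, which is the intuition behind needing less data per language.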
“We can leverage the commonality (between languages from the same family) and use that shared transfer learning capability to improve the whole language family,” Huang said. “This is bringing people closer together. This is the capability already in production because of our XYZ-code vision.”