Beginning with ten languages, SYSTRAN will use TED content to develop neural machine translation models for technical content in a variety of fields.
AI-based translation technology company SYSTRAN recently announced a new partnership with TED to build specialized neural translation models based on high-quality translations of TED Talks. These models are designed to meet the sophisticated translation needs of multinational companies, educational institutions, government agencies, and other organizations by enabling accurate and fluent translations of learning, scientific, business, and technical content in ten languages.
A nonprofit organization whose slogan is “Ideas Worth Spreading,” TED has committed to global language access as one of its core foundations. Organizations in 150 countries participate in the TEDx initiative, which allows groups to apply for licenses to organize conferences made up of local participants, ranging from professors to scientists to writers.
Along with TEDx, the organization runs a major translation initiative for its online resources, with a team of more than 35,000 human translators who have produced almost 175,000 translations and captions in 115 languages. The data from this large cache of language resources will likely enable SYSTRAN to expand its neural translation models to even more languages.
“SYSTRAN is TED’s first-ever authorized partner in bringing together TED content and machine learning to develop a commercial product,” said Alex Hofmann, Director, Global Distribution & Licensing at TED. “The fact that our inaugural collaboration in the AI space is focused on neural machine translation models built from translations of TED Talks in multiple languages feels natural, and [the models] are now available on a licensed basis to help enterprises and organizations meet their most sophisticated translation needs.”
The proprietary models are developed by SYSTRAN, pairing TED’s unique multilingual data and SYSTRAN’s AI expertise, and are an early step in advancing data usage in wider applications. TED requires a license for authorized use of its data for commercial AI and machine learning purposes, and SYSTRAN is the first to obtain such a license. In accordance with SYSTRAN’s core principles of security and data privacy, TED fully preserves its intellectual property and ownership of its data as well as the specialized models. The TED-owned models are available on the SYSTRAN Marketplace, a catalog of specialized models for specific domains such as legal, finance, health, education, science/technology and many more.
“This strategic partnership is about taking our shared goals of connecting people and cultures and facilitating multilingual engagement globally,” said John Paul Barraza, CIO of SYSTRAN. “The human-created translations generated by the TED Translator community are of the highest quality, enabling SYSTRAN to build accurate and fluent translation models for use across a plethora of business and professional applications.”
SYSTRAN conducted double-blind human evaluations on the TED models it built, and the results show improvements in accuracy and fluency over baseline state-of-the-art generic models. The human evaluations also revealed unexpected results, with 41% of the models scoring higher than the human reference translations.
“The current global situation is showing us how interconnected the different countries and populations worldwide are. Companies are imagining a world with far fewer boundaries, starting with the way we communicate,” said Jean Senellart, SYSTRAN CEO. “Introducing models to the SYSTRAN Marketplace is an incredible opportunity and will respond to real needs in the translation of educational, business, scientific, and technical materials.”