TAUS provides test datasets for Intento’s State of MT 2021 Report

Amsterdam — TAUS collaborates with Intento for the Intento 2021 State of Machine Translation Report by providing high-quality, domain-specific language datasets to be used in their MT tests.

With this report, Intento, the leading AI integration platform, gives those working in and around the MT landscape an in-depth analysis of the current vendors and the best strategies for successfully leveraging their offerings. The report is based on tests conducted with datasets provided by TAUS, the language data network, which offers the largest industry-shared repository of data, deep know-how in language engineering, and a network of Human Language Project workers around the globe. You can download the report here.


The report aims to provide an expert view of the constantly changing MT landscape and to save internationally facing businesses both human and financial capital. The 2021 edition delivers an extensive evaluation to help you choose the best-fit MT engines for your language pair and industry sector. It evaluates the performance of MT engines across 7 industries (Education, Finance, Healthcare, Hospitality, Legal, Entertainment, and General) and 13 key language pairs, using the latest data on 24 commercial MT engines, including Alibaba eCommerce and General, Amazon, Apptek, Baidu, DeepL, Elia, Globalese, Google, GTCom, IBM Watson, Microsoft, ModernMT, Naver, Kawamura / NICT, Pangeanic, PROMT, Rozetta, Systran, Tilde, Tencent, Yandex, Youdao, and XL8.

“Working with MT is like living on an erupting volcano. We had 16,000 language pairs available from 34 MT providers just a year ago, and today it’s about 100,000 from 46. We don’t have datasets to evaluate them all, but by working with TAUS we get a look into 13 language pairs and 7 domains,” says Konstantin Savenkov, Intento CEO. “The level of quality we see from stock models in 2021 is unprecedented. However, real-world business applications demand even more, and simply knowing the best stock model is not enough to succeed with MT. Make sure you have domain adaptation, glossaries, tone of voice control, and other tools on your belt.”

Savenkov continues, “This year, together with TAUS, we had a particular focus on using high-quality domain-specific data. It took more time to prepare, but the results should be relevant to a wider audience and applicable to more use cases than before. One key highlight we see from this year is the emergence of new semantic similarity metrics, such as COMET.”

“The availability of high-quality, domain-specific language data has become ever so significant as AI-enabled automatic translation becomes more and more common. We believe the findings will provide guidance on which MT engines are best suitable for the users’ requirements and, above all, demonstrate the value of high-quality, domain-specific data in increasing the quality of the final output,” says Jaap van der Meer, TAUS Director.

About TAUS

TAUS was founded in 2005 as a think tank with a mission to automate and innovate translation. Ideas transformed into actions. TAUS has become the one-stop language data shop, established through deep knowledge of the language industry, globally sourced community talent, and in-house NLP expertise. We create and enhance language data for the training of better, human-informed AI services.

Our mission today is to empower global enterprises and their service and technology providers with data solutions that help them to communicate in all languages, faster, better, and more efficiently. For more information, visit https://www.taus.net/.

About Intento

Intento, the leading AI integration platform, helps global companies utilize the best-fit cognitive AI services and automate content creation (text synthesis), content transformation (between text, speech, and image), and content localization (machine translation), enabling enterprises to translate 20x more content on their existing budgets.

This year, Intento was recognized as a 2021 Cool Vendor in Conversational and Natural Language Technologies by Gartner for its success in enhancing the supply chain of the global translation business. The Intento AI Hub gives global corporations direct, easy access to a multitude of MT engines (such as Amazon, Google AutoML, or Microsoft Cognitive Services) and seamlessly connects them with all of their business systems.

Launched in 2017, Intento offers its patented, ISO-27001 and ISO-9001 certified platform to global companies across all industries, augmenting their Localization, Content Management, Customer Support, and Marketing Operations with AI. For more information, visit https://inten.to.

MultiLingual Staff
MultiLingual creates go-to news and resources for language industry professionals.
