sponsored content

– Supported by Translated –

ModernMT Outperforms Generic MTs and GenAI

Disheng Qiu, VP of Engineering at Translated

An independent study, led by Achim Ruopp of Polyglot Technology, evaluated Translated’s adaptive machine translation (MT) solution against leading public MT systems using publicly available data sets and algorithms based on commonly used metrics. ModernMT performed best across the board. GPT-4 was also tested and couldn’t match it. Here, Translated’s VP of Engineering Disheng Qiu discusses the reasoning behind the study and what makes it unique.

Why Compare Adaptive With Generic MT?

“Selecting an MT service often involves private testing, which is time-consuming and requires in-depth knowledge of the MT systems being tested, and reliance on third-party analysis, which focuses mainly on comparing leading generic (or static), publicly available MT systems. Such analysis doesn’t take into account ModernMT’s unique ability to instantly adapt to specific content without any training.

This is why we asked Polyglot Technology to conduct an independent study that evaluates and compares out-of-the-box MT systems. The results further demonstrate that ModernMT’s adaptive model, with access to a small translation memory but without additional training, provides an unparalleled level of accuracy and context awareness right out of the box, which static models simply can’t match without additional effort.”

How Is This Study Unique?

“Companies can easily replicate the study using the available MT solutions with their own content. The study used publicly available evaluation and comparison scripts to translate a public dataset from Autodesk from US English into German, Italian, Spanish, Brazilian Portuguese, and Simplified Chinese. It explores a typical example where the generic baseline needs to adapt immediately to enterprise domain content to be useful. It also focuses on understanding each MT system’s ability to handle different languages, contexts, and specialized terminology, providing a direct comparison of these tools in typical translation workflows.

Polyglot Technology’s research employed commonly used quality measurement metrics (COMET, TER, and SacreBLEU) and tested the main public MT systems (Amazon Translate, DeepL Translator, Google Translate, and Microsoft Translator) against multiple “no-effort” ModernMT models (static, adaptive, adaptive with access to an Autodesk TM of 10,000 segments).”
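For readers who want a feel for how such a comparison works before running the real tooling, here is a minimal sketch in Python. It is not the study's script and not an implementation of COMET, TER, or SacreBLEU: it computes a simplified word-level edit rate (edits per reference word, lower is better), whereas real TER also allows block shifts. The function names, the system names, and the example sentences are all invented placeholders.

```python
def word_edit_distance(hyp_words, ref_words):
    """Word-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(ref_words) + 1))
    for i, h in enumerate(hyp_words, start=1):
        curr = [i]
        for j, r in enumerate(ref_words, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simple_edit_rate(hypotheses, references):
    """Corpus-level edits per reference word. Simplified: no shifts as in real TER."""
    edits = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_words, ref_words = hyp.split(), ref.split()
        edits += word_edit_distance(hyp_words, ref_words)
        ref_len += len(ref_words)
    return edits / ref_len if ref_len else 0.0


def rank_systems(outputs, references):
    """Return (system, score) pairs sorted from best (lowest) to worst."""
    scores = {name: simple_edit_rate(hyps, references)
              for name, hyps in outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Invented placeholder data, not taken from the study or the Autodesk dataset.
    references = ["Klicken Sie auf Speichern", "Datei öffnen"]
    outputs = {
        "system_a": ["Klicken Sie auf Speichern", "Datei öffnen"],
        "system_b": ["Klicken Sie Speichern", "Öffnen Datei"],
    }
    for name, score in rank_systems(outputs, references):
        print(f"{name}: {score:.2f}")
```

To replicate the study's approach on your own content, you would replace the placeholder data with each MT system's output for the same source segments, and replace the simplified score with the established metric implementations.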

What About LLMs?

“Translated also ran OpenAI’s GPT-4, the state-of-the-art large language model (LLM), through the same evaluation tests and quality assessment and found that GPT-4 consistently performed worse than any of the other leading neural MT services tested. In our experience, LLMs perform best when they are given a complete document and clear context. That isn’t what happens in many MT use cases, like sending individual user interface components for translation.

LLMs also require sophisticated fine-tuning and prompt-driven modifications to even attempt to address enterprise domain optimization. Nevertheless, we expect LLMs to play an important role in the evolution of MT.”

About Disheng Qiu

Disheng has extensive expertise in computer science, especially big data and AI. His leadership style, which emphasizes innovation and collaboration, has led his diverse teams of data scientists, AI engineers, and software developers to pioneer advanced AI-driven translation solutions. Before joining Translated, he earned a Ph.D. in Computer Science with a focus on machine learning, crowdsourcing, and web data integration – a foundation that fueled his co-founding of Wanderio, a multimodal travel search and booking platform.
