The Evolution of Machine Translation: A Brief History and What’s Coming Next

Machine translation (MT) systems are applications or online services that automatically translate text between their supported languages. Before MT was adopted, translation was largely a manual process. Today, translators often work with MT tools to become significantly more productive. These technologies have transformed the translation and localization industries, delivering increased productivity, reduced costs, improved consistency and scalability, and the ability to handle domain-specific terminology. Although the concepts behind MT and the interfaces for using it are relatively simple, the underlying science is complex and brings together several leading-edge technologies.

Previous and Current MT Approaches

Over time, there has been an evolution in approaches to MT, including:

  • Rule-based MT (RBMT): relies on dictionaries and grammar rules for each language
  • Statistical MT (SMT): relies on statistical analysis of bilingual text corpora
  • Neural MT (NMT): relies on neural networks to model entire sentences and predict the likelihood of word sequences

Many of the current state-of-the-art translation applications are based on NMT, which improves on the earlier SMT-based approaches. NMT uses far more dimensions than SMT to represent the tokens (such as words, morphemes, and punctuation) of the source and target text.
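
To make the idea of representing tokens with "dimensions" more concrete, here is a minimal sketch in Python. It assumes each token is stored as a short list of numbers (a vector) and compares tokens by the angle between their vectors. The four-dimensional values and the names EMBEDDINGS, cosine_similarity, and nearest_token are invented for illustration; production NMT systems learn their vectors from data and use far more dimensions.

```python
import math

# Hypothetical, hand-picked 4-dimensional vectors (illustration only).
# Real NMT systems learn these values during training.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0, 0.3],
    "chat": [0.85, 0.15, 0.05, 0.25],  # French for "cat"
    "car": [0.1, 0.9, 0.2, 0.0],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_token(token, candidates):
    """Pick the candidate whose vector points most nearly the same way."""
    return max(
        candidates,
        key=lambda c: cosine_similarity(EMBEDDINGS[token], EMBEDDINGS[c]),
    )


if __name__ == "__main__":
    print(nearest_token("cat", ["chat", "car"]))
```

In this toy setup, "cat" sits much closer to "chat" than to "car". Each added dimension gives a model more room to separate tokens by meaning, which is the intuition behind the richer representation described above.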

The Future of MT: Generative AI

Now, a new MT approach has taken root: generative artificial intelligence (GenAI). GenAI relies on large language models (LLMs), which are deep-learning AI models that consume and train on massive datasets, allowing them to excel in language processing tasks such as translation. After these models have completed their learning processes, they generate statistically probable outputs when prompted. The models create new combinations of text that mimic natural language based on their training data.
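
The phrase "statistically probable outputs" can be illustrated with a deliberately tiny stand-in. The sketch below is not an LLM: it is a bigram model that counts which word follows which in a three-sentence toy corpus, then extends a prompt by always choosing the most frequent follower. The corpus and the names train_bigram_model and generate are assumptions made for this example; real LLMs use deep neural networks trained on vastly larger datasets.

```python
from collections import Counter, defaultdict

# Toy training data, invented for this example.
TOY_CORPUS = [
    "machine translation saves time",
    "machine translation saves money",
    "machine translation saves time daily",
]


def train_bigram_model(corpus):
    """Count, for every word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    return counts


def generate(model, prompt, max_new_tokens=5):
    """Extend the prompt by repeatedly appending the most frequent follower."""
    tokens = prompt.lower().split()
    if not tokens:
        return ""
    for _ in range(max_new_tokens):
        followers = model.get(tokens[-1])
        if not followers:
            break  # the last word never had a follower in the training data
        tokens.append(followers.most_common(1)[0][0])
    return " ".join(tokens)


if __name__ == "__main__":
    model = train_bigram_model(TOY_CORPUS)
    print(generate(model, "machine", max_new_tokens=3))
```

The generated text is a new combination assembled from patterns in the training data, which is the same principle, at a vastly smaller scale, as the behavior described above.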

The development of LLMs has been a gradual process. The first LLMs were relatively small and could only perform simple language tasks. However, with the advances in deep neural networks, larger and more powerful LLMs were created. The 2020 release of the Generative Pre-trained Transformer 3 (GPT-3) model marked a significant milestone in the development of LLMs. GPT-3 demonstrated the ability to generate coherent and natural-sounding text. GPT-3 and subsequent models have been trained on datasets in multiple languages and can therefore generate output in those languages. 

NMT vs. LLMs

LLMs have the potential to outperform NMT in cost, speed, and translation quality, while enabling the development of natural language processing features in multilingual applications. However, NMT and LLMs each have their own strengths and weaknesses. For some translation tasks, NMT will be the most appropriate technology; for others, LLMs will make more sense.

There are similarities between NMT and LLMs:

  • Both are pretrained on bilingual or multilingual corpora.
  • Both can be trained, or fine-tuned, to perform better for specific tasks.

However, there are also important differences:

  • It’s easier and cheaper to fine-tune NMT for specific domains, such as healthcare.
  • LLMs, in general, produce more natural-sounding text, while NMT produces more accurate text.
  • NMT typically processes text segment by segment, while LLMs can work on entire documents at once. As a result, LLMs can take advantage of document-level context.
  • It can be easier to integrate existing glossaries and term bases with NMT than with LLMs.
  • Currently, NMT is faster than LLMs, although newer LLMs are faster than their predecessors. Speed can be a significant concern when processing large volumes of text.
  • Currently, translating with LLMs is more expensive than translating with NMT. This is especially true for low-resource languages.
  • NMT can be optimized for language variants. LLMs might have trouble differentiating between, and producing text for, language variants such as European Portuguese and Brazilian Portuguese.
  • NMT is optimized specifically for translation, while LLMs can be used for various language processing tasks. For example, an LLM could be used to create a business email in Japanese.
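
Two of the differences listed above, segment-level versus document-level processing and glossary integration, can be sketched in a few lines of Python. The example is a rough illustration under stated assumptions: split_into_segments is a naive sentence splitter standing in for the segmentation step of a translation pipeline, and build_document_prompt assembles one possible prompt that passes the whole document, a glossary, and a target locale to an LLM. Both function names and the prompt wording are hypothetical; no vendor's API or recommended prompt format is implied, and no model is actually called.

```python
import re


def split_into_segments(document):
    """Naively split text into sentence-like segments (illustration only)."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [part for part in parts if part]


def build_document_prompt(document, glossary, target_locale):
    """Assemble a single prompt containing the whole document and glossary.

    The wording is an assumption for this sketch, not a recommended format.
    """
    lines = [
        f"Translate the following document into {target_locale}.",
        "Use these required term translations:",
    ]
    for source_term, target_term in sorted(glossary.items()):
        lines.append(f"- {source_term} => {target_term}")
    lines.append("Document:")
    lines.append(document.strip())
    return "\n".join(lines)


if __name__ == "__main__":
    text = "Open the file. Save it before you close the window."
    glossary = {"file": "arquivo", "window": "janela"}

    # Segment-level view: each sentence would be translated in isolation.
    for segment in split_into_segments(text):
        print(segment)

    # Document-level view: one request that carries the full context.
    print(build_document_prompt(text, glossary, "pt-BR"))
```

In the segment-level view, the second sentence arrives without the first, so the referent of "it" is lost. The document-level prompt keeps both sentences together and states the locale and terminology explicitly, although whether a given model honors those instructions is something to verify in testing.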

Impact on the Language Industry

Like previous evolutions of MT, GenAI makes the translation process faster and cheaper than purely human translation. However, unlike previous evolutions, GenAI can also transform other aspects of the translation process such as source review, terminology extraction, context extraction, and assessment of translation quality.

While LLMs weren’t trained specifically for translation, their broad applicability to natural language tasks means that they perform well on translation tasks, especially for languages that are well represented in their training data. LLMs can produce natural-sounding translations of source text, even using idiomatic expressions in the target text. These qualities give LLMs the potential to be transformative technologies for translation in the years to come.

Bruno Lewin
Bruno Lewin is a technical program manager at Centific who supports globalization, AI, and data at Microsoft. He previously worked in localization, engineering, compliance, and finance roles in Poland, France, Ireland, and the United States. Outside work, he is active in non-profits focused on education.
Joe O'Brien
Joe O'Brien is an analytics program manager at Microsoft and has many years of experience as a Japanese-English translator and localizer. He is also a linguaphile with a special interest in East Asian languages and language preservation.
John Wilcock
John Wilcock is a program manager at Microsoft with a background in i18n and l10n. He leads the effort to update the Globalization Essentials content on Microsoft Learn. John also contributes to the Properties & Algorithms Group, a Working Group of the Unicode Technical Committee.

MultiLingual Media LLC