Humans still beat machines when it comes to literary translation

You wouldn’t use Google Translate to churn out an English-language (or any language, for that matter) version of a novel like Gabriel García Márquez’s Cien años de soledad — or would you?

The answer to that question is likely still a resounding “no.” Although researchers have been fascinated with potential applications for machine translation (MT) in the field of literary translation, “any serious challenge to human literary translators [from machines] is still a long way off,” as the European Council of Literary Translators’ Associations put it in a 2020 report.

That said, researchers are still exploring how MT can be applied to literary works. A recent study from researchers at the University of Massachusetts at Amherst set out to show why MT usually falls flat compared to human literary translation.

“As literary MT is understudied (especially at a document level), it is unclear how state-of-the-art MT systems perform … and what systematic errors they make,” the researchers wrote in a paper that was recently published as a preprint and is freely available on arXiv.

To shed light on some of the problems with literary MT, the researchers collected a corpus of non-English literary works that met the following criteria:

  • In the public domain in their source country by 2022
  • Multiple human translations published in English
  • Published in an electronic format
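
For readers who think in code, the three criteria amount to a simple filter over candidate works. The sketch below is illustrative only: the `Work` record, its field names, and the sample entries are hypothetical and are not taken from the paper or the dataset itself. It also assumes that "multiple" means at least two English translations.

```python
from dataclasses import dataclass


@dataclass
class Work:
    # Hypothetical record; the field names are assumptions for illustration.
    title: str
    source_language: str
    public_domain_year: int       # year the work entered the public domain in its source country
    english_translations: int     # number of published human translations into English
    has_electronic_edition: bool


def meets_criteria(work: Work, cutoff_year: int = 2022) -> bool:
    """Return True if a work satisfies all of the selection criteria listed above."""
    return (
        work.source_language != "English"
        and work.public_domain_year <= cutoff_year
        and work.english_translations >= 2
        and work.has_electronic_edition
    )


# Made-up sample entries, not real data.
candidates = [
    Work("Example Novel A", "Russian", 1950, 3, True),
    Work("Example Novel B", "Spanish", 2040, 2, True),   # not yet in the public domain
    Work("Example Novel C", "French", 1920, 1, True),    # only one English translation
]

selected = [w.title for w in candidates if meets_criteria(w)]
print(selected)
```

Only the first made-up entry passes, since the other two each fail one criterion.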

The dataset that the researchers compiled, named Par3, includes at least two human translations of every source paragraph. To test how well MT handles literary text, the researchers used Google Translate to create English versions of the source paragraphs and presented them side by side with the human translations to two groups of readers: professional literary translators and monolingual English writers.

Perhaps unsurprisingly, both groups overwhelmingly preferred the human translations, choosing them over the machine-translated version 84% of the time. The raters also shared insights that the researchers believe could be used to improve MT’s potential for literary applications. Based on their feedback, the researchers identified five areas in which MT can be improved. Nearly half of the problems flagged in the MT output stemmed from “over-literal” translation. While these instances may not have been outright errors, they often disrupted the flow of the paragraph, making the text feel awkward to read.

Additionally, lack of context caused about 20% of the issues reported in the MT paragraphs. Other issues resulted from poor word choice, over- or under-precision, and so-called “catastrophic” errors that “completely invalidate the translation” (misgendering a character, for instance). The researchers also used these insights to create a GPT-3-based automatic post-editing model to adjust the machine-translated output, and the post-edited versions were rated more favorably than the unedited versions produced by Google Translate.

“Overall, our work uncovers new challenges to progress in literary MT, and we hope that the public release of Par3 will encourage researchers to tackle them,” the researchers conclude.


Andrew Warner
Andrew Warner is a writer from Sacramento. He received his B.A. in linguistics and English from UCLA and is currently working toward an M.A. in applied linguistics at Columbia University. His writing has been published in Language Magazine, Sactown Magazine, and The Takeout.

MultiLingual Media LLC