Study Explores Areas of Improvement in Machine Translation

As machine translation (MT) has become more prevalent in day-to-day life, so too, it seems, have comical and sometimes harmful translation blunders. A team of researchers has presented a paper on the difficulty that Google Translate and other MT systems have in accurately translating contronyms (words with multiple meanings that can be opposite to each other depending on context) and on how to use the technology so that such mistakes are minimized.

“The recent raft of high-profile gaffes involving neural machine translation (NMT) technology has brought to light the unreliability and brittleness of this fledgling technology,” the paper, which was recently presented at the Africa Natural Language Processing (NLP) Workshop, reads. “These revelations have worryingly coincided with two other developments: The rise of back-translated text being increasingly used to augment training data in so termed low-resource NLP scenarios and the emergence of ‘AI-enhanced legaltech’ as a panacea that promises ‘disruptive democratization’ of access to legal services.”

The paper, entitled “Did they direct the violence or admonish it? A cautionary tale on contronymy, androcentrism and back-translation foibles,” plays on one particular example of contronymy-induced confusion in its title. The word “enjoin” is a common legal term that can mean both “to order or direct” and “to condemn.” When the sentence “The court enjoined the violence” (using the latter sense of the word) is translated into Kannada on Google Translate, the verb used in the translation suggests that the court ordered the violence. The same was true for 88 of the 109 languages on the platform that were tested in the study.

The researchers warned of a range of other potential errors when contronyms are translated using MT. In all, the paper included 15 examples of different contronyms that resulted in similarly erroneous translations. They also stressed the importance of further research on training NMT systems and suggested that Google give users access to a confidence score for each translation. Tech companies like Facebook have often had to apologize for mistakes in translation, such as an offensive mistranslation of Chinese President Xi Jinping’s name. The researchers believe such errors could have been avoided with more thorough quality control and training.

The research team also discussed MT errors arising from male-centric translations, another issue for which Google Translate has recently come under scrutiny. According to a report from Reuters, Google has defended its translation services, noting that MT is meant not as a substitute for professional human translation but as a complement to it.

Andrew Warner
Andrew Warner is a writer from Sacramento. He received his B.A. in linguistics and English from UCLA and is currently working toward an M.A. in applied linguistics at Columbia University. His writing has been published in Language Magazine, Sactown Magazine, and The Takeout.

