Machine translation and slang: A new approach

Slang can be one of the most difficult parts of a text for human translators to render adequately in another language.

It’s even harder for machine translation (MT) systems.

Humans can often intuitively interpret slang and other informal language, at least in their native tongue, although it still presents difficulties: many people use slang forms without knowing what they mean. For MT, pinpointing the intended meaning of such terms is harder still. A team of researchers at the University of Toronto has introduced a new framework for interpreting slang in natural language processing (NLP).

“Slang is a predominant form of informal language making flexible and extended use of words that is notoriously hard for NLP systems to interpret,” the researchers write. “Existing approaches to slang interpretation tend to rely on context but ignore semantic extensions common in slang word usage.”

In their recently published paper, the researchers note that interpreting the meaning of a slang term before translating a phrase or sentence literally can yield more accurate MT output. They begin with the English slang form “steamed,” meaning “angry” or “upset” in certain contexts. Using Google Translate to translate the sentence “I got really steamed when my car broke down” into Spanish yields “Me puse realmente al vapor cuando mi auto se descompuso.”

A back-translation of the Spanish form into English shows that the system did not take into account the actual meaning of the slang form “steamed.” It used a phrase that refers to water vapor (i.e., actual steam) rather than to anger.

The researchers note that, to handle slang, MT systems typically take a context-based approach: the program uses the context around a given word to determine a likely meaning, paraphrases the input, and translates the paraphrase. They argue that a semantically informed approach, one that attempts to connect a word’s conventional meaning with its slang meaning, may be a more effective means of interpreting and translating certain slang forms.
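To make the contrast concrete, here is a minimal toy sketch in Python of the two strategies. It is not the researchers’ framework: the candidate meanings, the scores, the equal weighting, and the function names `interpret` and `paraphrase` are all invented for illustration. It only shows how a context-only ranking and a ranking that also weighs the link to a word’s conventional meaning could pick different paraphrases before translation.

```python
# Toy illustration only: every candidate and score below is made up.

# Hypothetical fit of each candidate in
# "I got really ___ when my car broke down".
CONTEXT_FIT = {
    "tired": 0.60,
    "angry": 0.55,
    "cooked with vapor": 0.10,
}

# Hypothetical closeness of each candidate to the conventional
# sense of "steamed" (heat, pressure).
SEMANTIC_LINK = {
    "tired": 0.20,
    "angry": 0.70,
    "cooked with vapor": 0.90,
}


def interpret(candidates, context_fit, semantic_link=None, weight=0.5):
    """Return the highest-scoring candidate meaning.

    With semantic_link=None only the context score is used; otherwise
    the two scores are blended with the given weight.
    """
    def score(candidate):
        if semantic_link is None:
            return context_fit[candidate]
        return ((1 - weight) * context_fit[candidate]
                + weight * semantic_link[candidate])

    return max(candidates, key=score)


def paraphrase(sentence, slang, meaning):
    """Replace the slang word with its interpreted meaning."""
    return sentence.replace(slang, meaning)


sentence = "I got really steamed when my car broke down"
candidates = list(CONTEXT_FIT)

context_only = interpret(candidates, CONTEXT_FIT)
combined = interpret(candidates, CONTEXT_FIT, SEMANTIC_LINK)

print(paraphrase(sentence, "steamed", context_only))
print(paraphrase(sentence, "steamed", combined))
```

With these invented numbers, the context-only ranking picks “tired,” while the combined ranking picks “angry.” The resulting paraphrase, not the original slang, is what would then be passed to an MT system.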

“The flexible nature of slang is a hallmark of informal language, and to our knowledge we have presented the first principled framework for automated slang interpretation that takes into account both contextual information and knowledge about semantic extensions of slang usage,” the researchers conclude.

Andrew Warner
Andrew Warner is a writer from Sacramento. He received his B.A. in linguistics and English from UCLA and is currently working toward an M.A. in applied linguistics at Columbia University. His writing has been published in Language Magazine, Sactown Magazine, and The Takeout.


MultiLingual Media LLC