Google Translate has once again sparked controversy, with users noticing a sexist gender bias in its translations. After a group of users took to social media to discuss the system’s biases — even making it to the front page of Reddit — two researchers at the University of Cambridge wrote about their proposed solution for fixing the biases in an article for The Conversation.
Translating from a largely non-gendered language like English to a gendered one — Spanish, for example — requires machine translation (MT) systems to figure out which gender to assign to words in the target language, and the result often reflects societal perceptions and stereotypes. When languages like Finnish or Hungarian — which don’t have gendered pronouns (meaning these languages don’t make a distinction between “he” and “she,” for example) — enter the mix, it can lead to even more obvious gender biases, as one Finnish researcher pointed out on Twitter.
In Finnish we have only one pronoun for third person regardless of the gender.
If you copy-paste the sentence below to google translate (or just click open original post for English translation), you see how the algorithm has learnt to be sexist.#IWD2021 should be every day! https://t.co/t8sYTVTrmh
— Johanna Järvelä (@johannajarvela) March 9, 2021
Such biases aren’t necessarily the fault of the systems’ creators, who are often aware of — and perhaps even disconcerted by — them. Back in 2018, Google published a blog post addressing this very issue and attempted to resolve it by offering dual translations of such sentences. When sentences like the example above are translated one at a time, Google Translate provides users with two translations, one using “he” and the other “she.” However, when an input contains more than one sentence, the final product does not include the gender-inclusive dual translation.
These issues largely reflect wider societal problems with stereotyping and insensitivity. When MT systems are trained, they receive massive amounts of data for a given language pair, using existing human translations to learn the language’s syntax, vocabulary, and how words have most often been translated in the past. The system then makes statistically based predictions about which structures and words occur most frequently in translations, and as a result there is room for human biases — or simple frequency anomalies — to make their way into the algorithm. If humans have translated “he is a doctor” 100 times but “she is a doctor” only 50 times, and that data is fed into the training algorithm, the algorithm will simply pick the option with the higher count.
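The frequency-based behavior described above can be sketched in a few lines. This is a deliberately simplified toy model, not how a real neural MT system works: the sentences, Spanish renderings, and counts below are hypothetical, chosen only to show how picking the most frequent past translation reproduces an imbalance in the data.

```python
from collections import Counter

# Hypothetical parallel-corpus counts: how often human translators rendered
# each non-gendered English source into a gendered Spanish target.
translation_counts = {
    "they are a doctor": Counter({
        "él es doctor": 100,      # "he is a doctor"
        "ella es doctora": 50,    # "she is a doctor"
    }),
    "they are a nurse": Counter({
        "ella es enfermera": 90,  # "she is a nurse"
        "él es enfermero": 30,    # "he is a nurse"
    }),
}

def translate(source: str) -> str:
    """Pick whichever target sentence was seen most often in training."""
    return translation_counts[source].most_common(1)[0][0]

print(translate("they are a doctor"))  # masculine form wins on frequency
print(translate("they are a nurse"))   # feminine form wins on frequency
```

Nothing in the toy system is explicitly sexist; the bias emerges purely from which translation humans happened to produce more often.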
In their article for The Conversation, Stefanie Ullman and Danielle Saunders discussed their approach to retraining MT systems to reduce bias: once a bias is noticed, they argue, it’s possible to use smaller, targeted data sets to balance things out and produce more accurate translations — “a bit like an afternoon of gender-sensitivity training at work,” the researchers wrote.
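The targeted-retraining idea can be illustrated with the same kind of toy frequency model. The counts below are invented for illustration and this is only a sketch of the intuition, not the researchers’ actual fine-tuning method: a small, deliberately balanced dataset is mixed into skewed training counts so that a frequency-based system no longer prefers one gendered form.

```python
from collections import Counter

# Hypothetical biased counts from the original training data.
counts = Counter({"él es doctor": 100, "ella es doctora": 50})

# A small, deliberately counter-balanced "lesson" dataset, analogous to the
# researchers' idea of targeted retraining (numbers are illustrative).
balanced_lesson = Counter({"él es doctor": 50, "ella es doctora": 100})

# Mixing the lesson into the training counts evens out the frequencies,
# so neither gendered translation dominates.
counts.update(balanced_lesson)
print(counts)  # both forms now tied at 150
```

The point of the sketch is that the corrective dataset can be far smaller than the original corpus: it only needs to offset the specific imbalance that was noticed.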