A bug in Instagram’s auto-translate feature that inserted the word “terrorist” into some Palestinian users’ bios is highlighting the problem of bias in multilingual language models (MLMs) and the need for more transparency from companies that employ them.
Last week, Instagram users discovered that combining the English word “Palestinian” with the Palestinian flag emoji and the Arabic phrase “alhamdulillah” (meaning “thank God”) led the algorithm to insert the word “terrorist” when translating the bio into English. Tech news outlet 404 Media reported that one user’s bio was mistranslated as “Praise be to god, Palestinian terrorists are fighting for their freedom.”
Instagram’s parent company Meta has since apologized, with a spokesperson telling The Guardian that the problem has been fixed. However, the company has not shared any details about how the mistranslation occurred, leaving Palestinians searching for clarity. Fahad Ali, a Palestinian who works for an Australian nonprofit concerned with digital rights, told The Guardian, “We need to know where [these biases are] stemming from. That’s what I would hope Meta will be making more clear.”
One theory is that Instagram uses an MLM trained primarily on English text data scraped from the Internet. Researchers Gabriel Nicholas and Aliya Bhatia from the Center for Democracy and Technology told 404 Media that the MLM could reflect a pattern it found online: people making prejudiced statements about Palestinians being terrorists. “[The algorithm is] not going to pull that association out of thin air,” Nicholas said.
“Meta faces … a dearth of available training data in languages other than English and specifically in Arabic dialects that might not be widely spoken or geopolitically strong,” Bhatia told 404 Media. “The language model is making the connection [based on whatever is] in the available examples of either Arabic language speech or speech related to Palestinian people, and that means the output … is reflective of the perspective that this text has.”
Meta recently launched a multilingual machine translation (MT) model called SeamlessM4T, but it is unclear whether the tool is being used to translate Instagram bios. Prominent tech companies, including Meta, acknowledge the problem of AI bias but often struggle to develop effective solutions.