MARKETING

Promoting inclusion
How automatic QA checks can help you detect inappropriate language

Supported by MultiLingual magazine and the Process Innovation Challenge


Sara Basile is product director at XTM International. With 10 years of experience in the localization industry at global enterprises, Sara owns the XTM strategic product vision. She is an advocate for innovation and smart technology and has built a customer-centric approach to product management at XTM.

The rise in demand for inclusive language has made it a key consideration for companies. They need to ensure it is used externally to engage their customers, but also internally to attract and retain top talent. As brands are increasingly committed to adopting inclusive language, the data shows it’s here to stay — and growing fast.

A Harris Poll commissioned by Google Cloud this year found that 82% of shoppers prefer a brand’s values to align with their own. Deloitte’s Global Human Capital Trends research revealed two years ago that 79% of organizations see fostering a sense of belonging as a top human resources issue, a goal that cannot be achieved using language that excludes any of their employees. Adopting inclusive language is becoming a key part of our set of values, and companies that ignore this will inevitably be left behind.

Across the board, top brands like Apple, United Airlines, and MasterCard are not only changing their language to be more inclusive but also introducing new products to address this growing demand. Salesforce has had a dedicated task force working on inclusivity for years, creating organic, scalable, and repeatable processes across the organization and empowering people who have been historically marginalized through discriminatory behaviors and inappropriate language.

Gender-neutral language has recently gained great importance, especially in retail, a sector that has historically targeted a binary audience with products manufactured for a specific gender. Even in this sector, gender-biased slogans and formulations have already been shown to damage a brand’s image, as demonstrated by the criticism directed at English supermarket chain Morrisons for selling T-shirts with “sexist slogans” a couple of years ago.

Ensuring that inclusive language is used in localized content can be a challenge for the localization industry because it adds a layer of complexity to the quality-assurance process. Just think about the sheer volume of content that needs to be reviewed for appropriateness for all possible target audiences. This is particularly relevant in light of the recent rise of automatic content-generation technology. There are solutions in which large language models such as GPT-3 are integrated seamlessly into marketing or copywriting workflows, which makes it almost impossible to distinguish human-written content from computer-generated content. While human copy is generally reviewed for correctness and appropriateness, automatically generated content is not. And since these models have learned to create text from vast amounts of internet content, they may have picked up bad writing habits and non-inclusive language patterns.

The process of transitioning to a more appropriate and inclusive language is not immediate — there are many terms and expressions that have been used for years without being considered inappropriate that are still used subconsciously. Think about how long it’s taking some brands to finally get rid of any mention of a “blacklist.” Or think about Romance languages being so gender-rich in their grammar, and how difficult it can be to avoid any use of gender-specific word endings.

In the academic field, there isn’t yet enough research into automatic detection of inappropriate, non-inclusive language. There are so-called “profanity filters” in social media moderation or live chats, but no solution that can help translators, reviewers, and copywriters catch inappropriate language on the fly.

Our XTM AI Lab is going to enable brands and their translation providers to overcome this challenge by adding automatic detection of inappropriate or discriminatory language directly within XTM’s CAT tool, XTM Workbench.

With this solution, XTM aims to meet two main requirements: high recall and high precision. Recall is the percentage of actual errors that the automatic reviewer successfully detects. Precision, on the other hand, is the percentage of alarms raised by the automatic reviewer that correspond to actual errors. The automatic reviewer is expected to miss no errors in the text, so its recall should be at or near 100%. At the same time, precision is also expected to be high: low precision causes many false alarms, which generate unnecessary work for users and undermine their trust in the automatic review mechanism. Recall and precision are always hard to balance. It is relatively easy to design a mechanism that catches all the errors but also raises many false alarms (high recall, low precision), or a very cautious reviewer that catches only evident profanity or obscenity but misses more subtle violations of language appropriateness (low recall, high precision).
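To make the two measures concrete, here is a minimal Python sketch of how recall and precision could be computed for an automatic reviewer. It is an illustration only, not XTM's implementation: the function name `precision_recall` and the segment IDs are hypothetical, and the data is invented for the example.

```python
def precision_recall(flagged: set[int], actual: set[int]) -> tuple[float, float]:
    """Return (precision, recall) given flagged segment IDs and true error IDs."""
    true_positives = len(flagged & actual)
    # Precision: share of raised alarms that are real errors.
    precision = true_positives / len(flagged) if flagged else 1.0
    # Recall: share of real errors that were caught.
    recall = true_positives / len(actual) if actual else 1.0
    return precision, recall


# Hypothetical data: segments 2, 5, and 9 truly contain inappropriate language.
actual_errors = {2, 5, 9}

# An over-eager reviewer flags every segment: all errors caught, many false alarms.
eager = precision_recall(set(range(1, 11)), actual_errors)

# A very cautious reviewer flags only the most evident case.
cautious = precision_recall({5}, actual_errors)

print(f"eager:    precision={eager[0]:.2f}, recall={eager[1]:.2f}")
print(f"cautious: precision={cautious[0]:.2f}, recall={cautious[1]:.2f}")
```

The over-eager reviewer reaches full recall at the cost of precision, while the cautious one does the opposite, which is the trade-off described above.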

The truly innovative part is the technology powering these QA checks: a combination of a rule-based algorithm (high precision) and a machine-learning neural network (high recall). Together, they combine reliability and customizability with the ability to learn and improve over time. The rule-based algorithm draws on resources compiled from publicly available guides for inclusive language published by national governments, NGOs, and universities (not the internet), while the machine-learning method adds scalability and the ability to improve performance as languages evolve.
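As a rough illustration of how a rule-based check and a learned classifier might be combined, consider the Python sketch below. Everything in it is an assumption made for the example: the `RULES` term list, the `review` function, the 0.5 threshold, and especially `toy_score`, which is a stand-in for a trained model and not a real one. It does not describe how XTM Workbench actually works.

```python
import re
from typing import Callable

# Hypothetical rule list: flagged term -> suggested alternative.
RULES = {"blacklist": "blocklist", "chairman": "chairperson"}


def rule_flags(text: str) -> list[tuple[str, str]]:
    """Return (term, suggestion) pairs for each listed term found as a whole word."""
    found = []
    for term, suggestion in RULES.items():
        if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
            found.append((term, suggestion))
    return found


def review(text: str, score: Callable[[str], float], threshold: float = 0.5) -> dict:
    """Flag a segment if either the rules or the classifier raise an alarm."""
    hits = rule_flags(text)
    model_score = score(text)
    return {
        "rule_hits": hits,
        "model_score": model_score,
        "flagged": bool(hits) or model_score >= threshold,
    }


def toy_score(text: str) -> float:
    """Stand-in for a trained classifier (assumption); returns an offensiveness score."""
    return 0.9 if "idiot" in text.lower() else 0.1


for segment in (
    "Add the domain to the blacklist.",
    "Only an idiot would buy this.",
    "Thank you for your order.",
):
    print(segment, "->", review(segment, toy_score))
```

In this sketch the rules contribute precise, explainable hits with a suggested replacement, while the classifier score covers wording the rules do not list.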

The neural network has been trained on a dataset of over 9 million tweets annotated for “offensiveness” (the SOLID dataset, short for Semi-Supervised Offensive Language Identification Dataset). Machine learning on this kind of data makes it possible to catch new, obscure, or ambiguous offensive language not defined in the rule-based algorithm. XTM has achieved 91% accuracy on the test set and is continuing to improve on that figure.

Additionally, XTM is investigating how its ILVS technology can be used with the neural network to detect offensive language in other languages. The SOLID dataset is only in English, but ILVS enables XTM to align large language models, potentially extending offensive language detection to the other languages supported by ILVS (more than 57 languages at this point).

With these checks seamlessly integrated into existing translation workflows, linguists and copywriters will no longer have to interrupt their work to reference offline style guides or dictionaries. Appropriate and inclusive words and phrases are instantly presented to the translator in their work environment, which saves significant time and cost and improves quality. Inappropriate language checks can serve multiple scenarios, from source-language correction (in the so-called pre-processing workflow in XTM Cloud) to automatic detection of discriminatory MT output, which saves a considerable amount of post-editing effort.

This innovation is currently in development and will be presented to a limited group of beta clients in early 2023. This feature is intended to initially work with the seven most used languages in XTM Cloud: English, German, French, Spanish, Italian, Portuguese, and Simplified Chinese.

At a later point, we intend for this feature to be customizable, which would enable users to define their own custom glossaries of inappropriate words and phrases according to their internal brand guidelines or territories.

Ensuring all global content is more inclusive is fast becoming key for companies. We believe being able to automatically detect inappropriate and non-inclusive language will be a basic-need technology for enterprises in the near future. At XTM, we are doing our bit to enable this.

MultiLingual Media LLC