Can your bot chat in Chinese, Czech or Chuvash?

By Arle Lommel July 6, 2018

Chatbots are rapidly becoming ubiquitous in marketing and support. The potential for brands to interact with customers using natural language — and perhaps a bit of personality — without needing an army of paid human agents is driving major investment from enterprises. Tech giants from IBM to Facebook and Weibo to Microsoft have started a virtual race to dominate this field. However, what is missing in this picture so far is serious attention to multilingual needs. Amazon, Apple, Google and Microsoft have all released digital assistants (Alexa, Siri, Google Assistant and Cortana respectively) in multiple languages, but these are the exception, and other enterprises struggle to move past the language of their home market.

As part of a detailed examination of how enterprises provide multilingual customer support, Common Sense Advisory (CSA Research) interviewed market leaders in the creation of chatbots, also known as intelligent online agents, about how they provide multilingual capabilities. The results revealed that few companies have found ways to deal with language in a systematic fashion. As a consequence, results are sporadic, methods are ad hoc and failures common. One chatbot implementer at a large tech vendor — one that develops a commonly used open chatbot framework — stated that the company’s team expects a 30% success rate from its own efforts.

Why are problems so common? At one level, they are the sort that accompany any new technology: developers are so focused on solving the basic problems in their primary corporate languages that they do not have the development resources to consider other locales. As a result, the first generations of tools frequently fail when they encounter linguistic needs that their creators did not anticipate. Some problems, such as the frequent use of string concatenation in today’s products, have well-developed solutions honed over the past 30 years, and future versions will undoubtedly implement these internationalization best practices.
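To illustrate the concatenation problem, here is a minimal Python sketch. It assumes a hypothetical message catalog called TEMPLATES and a helper called render; neither comes from any particular chatbot framework, and the translations are illustrative only. Instead of gluing sentence fragments together in English word order, each locale gets one full-sentence template with named placeholders, so translators can reorder the parts freely.

```python
# Fragile approach (English word order baked into the code):
#   str(count) + " flights from " + origin + " to " + destination

# Hypothetical catalog: one complete sentence per locale, with named
# placeholders that each translation can put in its own order.
TEMPLATES = {
    "en": "{count} flights from {origin} to {destination}",
    "de": "{count} Flüge von {origin} nach {destination}",
    "ja": "{origin}発{destination}行きのフライトが{count}件",
}


def render(locale, **values):
    """Fill the locale's template, falling back to English if it is missing."""
    template = TEMPLATES.get(locale, TEMPLATES["en"])
    return template.format(**values)


if __name__ == "__main__":
    print(render("de", count=3, origin="Prag", destination="Peking"))
    print(render("ja", count=3, origin="プラハ", destination="北京"))
```

This sketch deliberately ignores plural forms and grammatical case, which real products usually handle with a dedicated message-formatting library rather than plain string templates.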

However, not all of the issues are so simple: chatbots pose challenges fundamentally different from what is seen with traditional content. The shift to conversational structures and the need to embrace “messy” terminology are among these.

Conversational content

CSA Research has documented the ongoing shift in enterprises to treating content as a conversation that exists over time and across multiple channels and that can take multiple paths (see the CSA Research report “The Winds of Content Are Changing”). Chatbots represent one particularly extreme example of this shift. Because they exist as a dialogue between a company and its customers or prospects, they have to respect social conventions in a way that simple support documents or web forms do not need to worry about.

For example, consider a chatbot that helps customers find and purchase airline flights. In a typical search-oriented application, the user would provide certain details — such as dates, departure and destination cities, and number of passengers — and receive a list of options to select from. Only at the time of payment would the application require personal information. By contrast, a chatbot developed in the United States might ask “What is your name?” early on and then use the customer’s given name regularly in conversation. However, if this behavior is simply translated for another, more formal market, it could be seen as too intrusive, insufficiently respectful, or even just “too American.” Developers told CSA Research that anticipating and accounting for such cultural differences are difficult tasks because development teams seldom have the expertise to know when they need to adjust behavior.

As a result, developers find that they need to do something more akin to transcreation than translation. They cannot simply use the same structures with a different set of user interface strings as they would with most software. Instead, they use the source chatbot as an inspiration for an independent one in the target language. This approach requires them to carefully document the paths that the bots take to accomplish their goals and then work closely with in-country staff or language service providers (LSPs) to modify them as needed. When they do not take this approach, they can end up with results that fall short of expectations and leave customers dissatisfied.

When your customers’ terms aren’t your terms

Another major source of difficulty that developers reported is that the terms their customers use do not match the ones they use internally. Perhaps the development team has followed best practice and implemented and enforced the use of a termbase. It has carefully scrubbed all non-approved terms, trained the chatbot on translation memory data and declared it ready to go. But the first thing the chatbot encounters is someone using terms it does not know. For example, if a customer of an internet service provider starts a chat because their “internet box” does not work, it doesn’t matter that the company calls it a “cable modem” if the customer doesn’t know that term. In fact, a major reason customers require service in the first place is that they do not understand the official terms that companies use.

This problem is compounded tremendously when enterprises start building multilingual chatbots. It is hard enough to anticipate nonstandard terms in one language, let alone in others that development staff do not speak. If developers have access to logs of chats involving human agents, they can mine this invaluable resource to discover which questions required clarification. However, when such resources are not available, for example where chatbots are being implemented to reduce the need for over-the-phone support lines, the task can be almost impossible. Here, LSPs can help their clients identify likely terms and monitor logs of chats that required escalation to human agents to understand what went wrong.
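As a rough sketch of what mining escalated chats might look like, the Python below counts words that appear in escalated conversations but not in the vocabulary the bot already handles. The log format (dictionaries with "text" and "escalated" keys) and the function name unknown_terms are assumptions made for illustration, not a description of any vendor's tooling.

```python
import re
from collections import Counter


def unknown_terms(chats, known_terms, top_n=5):
    """Count words from escalated chats that are not in the known vocabulary.

    `chats` is assumed to be a list of dicts with "text" and "escalated" keys.
    """
    known = {term.casefold() for term in known_terms}
    counts = Counter()
    for chat in chats:
        if not chat["escalated"]:
            continue  # only chats handed to a human agent are of interest
        for word in re.findall(r"\w+", chat["text"].casefold()):
            if word not in known:
                counts[word] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    logs = [
        {"text": "My internet box is broken", "escalated": True},
        {"text": "internet box dead", "escalated": True},
        {"text": "modem ok", "escalated": False},
    ]
    vocabulary = ["my", "is", "broken", "dead", "internet", "modem"]
    print(unknown_terms(logs, vocabulary))
```

Splitting on word characters is a simplification that suits languages written with spaces. Languages such as Chinese or Japanese would need a proper word segmenter, which is one more place where in-country staff or an LSP can help.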

To deal with the challenge of nonstandard terminology in conversational content, enterprises need to embrace a broader view of terminology management. Instead of dictating internal terms in a top-down fashion, they need to discover how others use terms to describe their products and services and interact with their content. This raises the imperative for adopting proper terminology management systems and moving away from simple glossaries that cannot capture the richness of language in use. It also necessitates linking termbases to chatbots so the bots can use them to resolve unknown terms.
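One way to picture a termbase that a chatbot can consult is a concept-oriented structure in which each concept carries both the official term and the variants customers actually use, per locale. The Python sketch below is a hypothetical illustration built on the “internet box” example; the TERMBASE layout, the German entries and the function names are all assumptions, not a real terminology management system's format.

```python
# Hypothetical concept-oriented termbase: each concept lists the official
# (preferred) term and customer-observed variants for every locale.
TERMBASE = {
    "cable_modem": {
        "preferred": {"en": "cable modem", "de": "Kabelmodem"},
        "variants": {"en": ["internet box", "wifi box"], "de": ["Internetbox"]},
    },
}


def build_index(termbase):
    """Map (locale, lowercased term) pairs to concept IDs for fast lookup."""
    index = {}
    for concept_id, entry in termbase.items():
        for locale, term in entry["preferred"].items():
            index[(locale, term.casefold())] = concept_id
        for locale, terms in entry.get("variants", {}).items():
            for term in terms:
                index[(locale, term.casefold())] = concept_id
    return index


def official_term(termbase, index, locale, phrase):
    """Return the official term for a customer phrase, or None if unknown."""
    concept_id = index.get((locale, phrase.strip().casefold()))
    if concept_id is None:
        return None
    return termbase[concept_id]["preferred"].get(locale)


if __name__ == "__main__":
    index = build_index(TERMBASE)
    print(official_term(TERMBASE, index, "en", "Internet box"))
```

Phrases that resolve to None are candidates for review by terminologists, who can decide whether to add them to the termbase as new variants.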

Success is possible

The developers CSA Research interviewed said that even though building multilingual chatbots is difficult and best practices have yet to emerge, developers and LSPs can do it with careful planning and thought. Understanding the issues that enterprises and their localization partners will face in this exciting field can help them avoid mistakes and increase the likelihood that the results will meet expectations. It is key to incorporate linguists into the development phase as early as possible so that they can ensure that specifications are realistic and that plans account for the needs of individual markets. Teams working on these projects also need to consider legal and regulatory issues, a topic where LSPs can offer real value based on experience.