Focus

When internationalization isn’t enough

Arle Lommel

Arle Lommel is a senior analyst for Common Sense Advisory (CSA Research).

In June 2019, Politico reported on the travails of Epic, a leading US maker of digital health software, in adapting its product to the requirements of the Danish health system. The company discovered that translating the user interface was far from enough. Although some of the problems Danish healthcare workers faced were the result of subpar translation, others were more fundamental. For example, many of Epic’s functions revolved around customer billing, but in the Danish single-payer system, these features were irrelevant. Similarly, the system had hard-coded roles for nurses and doctors based on US legal requirements, which prevented Danish staff from doing their jobs. One physician described the results as “indescribable, total chaos,” and Politico reports that the same physician found that “Epic might work in the United States […] but its design was so hard-coded in US medical culture that it couldn’t be disentangled.” After the Danish government spent more than $500 million over three years, the system is still not fully functional.

Figure 1: Localization separates core functions from data and content.

Why should such a system struggle? After all, the art of internationalization has been clear for the better part of three decades: you write a culturally and linguistically neutral code base, to which you apply language as a “skin” that can be swapped at will. In this model, the goal is to remove all assumptions from the underlying logic of a program and create something that can be deployed anywhere from Baghdad to Wuhan. Although software often fell short of the ideal — particularly when dealing with “difficult” writing systems — the methods and approaches are well understood.

In the internationalization paradigm, as shown in Figure 1, developers build a code base of core technical and business functions that is ruthlessly scrubbed of all linguistic and cultural references. Strings, colors, icons, dates, times and anything else that varies from country to country must be “externalized”: placed in separate files where it is easy to change them as needed. In some cases, data contains culturally specific information, as with addresses, where no single model can capture all of the variants. Developers then build complex structures to represent the variety of forms in the world and store abstract pointers to the data in the core product.
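String externalization can be sketched as follows. This is a minimal illustration, not any real product’s resource format; the locale codes, keys and the function name `t` are hypothetical.

```typescript
// Illustrative sketch: UI strings live in per-locale resource maps,
// not in program logic. Locale codes and message keys are hypothetical.
type Messages = Record<string, string>;

const bundles: Record<string, Messages> = {
  en: { greeting: "Welcome", save: "Save" },
  da: { greeting: "Velkommen", save: "Gem" },
};

// Core code references abstract keys; the active locale supplies the text,
// falling back to English, and finally to the key itself.
function t(locale: string, key: string): string {
  return bundles[locale]?.[key] ?? bundles["en"][key] ?? key;
}
```

Here `t("da", "greeting")` returns the Danish string, while a locale with no bundle of its own quietly falls back to English — the core logic never contains a single translatable word.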

On top of this internationalized core sat multiple localized user interfaces. These determined what fields and data were displayed in the case of complex structures such as addresses. The results might vary along both linguistic and geographical lines. For example, a German user interface in Germany might use euros for currency and spell words with the character ß, but in Switzerland it would use Swiss francs and ss instead of ß. Nevertheless, despite such variation, the assumption was that one code base could serve the entire world with a few adaptations here and there.
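The German/Swiss example can be sketched with the standard ECMAScript Intl API, which draws its regional rules from locale data rather than program logic; the function name `formatPrice` is hypothetical.

```typescript
// Illustrative sketch: one function, locale-driven output. de-DE and de-CH
// share a language but differ in currency symbol and digit separators.
function formatPrice(locale: string, currency: string, amount: number): string {
  return new Intl.NumberFormat(locale, { style: "currency", currency }).format(amount);
}
```

Calling `formatPrice("de-DE", "EUR", 1234.5)` and `formatPrice("de-CH", "CHF", 1234.5)` yields two visibly different strings from the same code path, which is exactly the "skin" model: the variation lives in locale data, not in the function.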

Figure 2: Complex applications require locale-specific functions.

So why did Epic struggle? Indications are that the company did internationalize its code properly. Even if it did not, the problems the Danish healthcare system found were not ones internationalization could solve. Internationalization works well when software functions are themselves truly neutral with respect to culture — such as a tool for making technical drawings or designing a database — but it breaks down as software becomes more complex and interacts with culture, custom, and language in deeper ways. For some cases, such as supporting a new writing system, it may be enough to add features that apply only to particular languages — but in other cases no single code base can support all requirements.

This was the situation in which Epic found itself. Its basic functions could not be written in a culturally neutral manner. Such problems become almost inevitable as software becomes more and more complex and ties together more and more knowledge and functions. Even something as simple as a function to send a package must deal with the realities of hundreds of different postal systems, not to mention customs, tariffs, taxes and regulations. When faced with a vast number of variant requirements, both known and unknown, seemingly reasonable and innocent assumptions can lead to unintended consequences. Hard-coded database structures, processes, sequences and requirements can complicate translation and adaptation. In some cases, comparable functions may be just different enough that no amount of translation will make them work.

Add in the current trends to build functions based on machine learning, and the problem becomes more severe. Few organizations have training data for anything beyond their source language, usually English or Chinese. Although it is tempting to slap machine translation on top of English training data, this approach runs the risk of replicating American approaches to complex tasks that do not apply to other regions. The alternative, data manufacture, is expensive and can lead to poor outcomes if enterprises do not anticipate local needs accurately. In other cases, lack of data can lead to product failure, such as in the case of facial recognition functions that do not recognize certain ethnicities because they were trained on data sets representative of specific countries.

So what can organizations do when their software must interact with complex business or cultural requirements that vary from country to country? The key is to capture the variations in functions and processes from the beginning, to internationalize what can be internationalized, and to modularize your code base to swap in different modules when needed. As Figure 2 shows, much of an application may consist of locale-specific versions of code.

Critically, the local versions of modules should themselves be internationalized. For example, a Germany-specific tax module should not assume all implementers will speak German. Increasing geographic mobility means that expatriate operators may need to see the locally relevant information in English, French, Chinese or any other language. Even though the functionality may be specific to Germany, the localization layer needs to support any language. You may not localize each version into multiple languages, but it needs to be localizable. However, the degree of internationalization may vary. For example, although you might translate the Germany module, you would be unlikely to need to use different calendars or non-euro currency, so you might not internationalize those aspects.
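A Germany-specific module of this kind might look like the sketch below. The `TaxModule` interface and its members are hypothetical names; the VAT rates shown (19% standard, 7% reduced) are Germany’s actual rates, but the module is illustrative, not a real billing implementation. Note that the tax logic is fixed while the labels remain localizable:

```typescript
// Illustrative sketch: a Germany-specific tax module. The rates are fixed
// by German law, but the display strings stay localizable so an expatriate
// operator can still work in English, French or another language.
interface TaxModule {
  country: string;
  vatRate(category: "standard" | "reduced"): number;
  label(locale: string): string; // UI text in the operator's language
}

const germanyTax: TaxModule = {
  country: "DE",
  // Germany-only logic: 19% standard VAT, 7% reduced VAT.
  vatRate: (category) => (category === "standard" ? 0.19 : 0.07),
  // Localizable surface: same function, any operator language.
  label: (locale) =>
    ({ de: "Mehrwertsteuer", en: "Value-added tax", fr: "TVA" } as Record<string, string>)[
      locale
    ] ?? "VAT",
};
```

The module never needs non-euro currencies or alternative calendars, so those aspects go unimplemented — but every string an operator sees passes through the `label` lookup, keeping the Germany-only logic fully localizable.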

One of the challenges of creating such complex products and determining what each module must support is that core software development teams seldom consult with localization teams or local implementers. As a result, they may set certain decisions in stone before they understand the implications. At that point, proper localization may require them to rip and replace significant portions of code and then reengineer solutions for their markets, even if they have properly internationalized the code.

The solution in such cases is to bring together a team to define requirements early on and determine which components can be localized as-is, which ones need to exist in multiple versions, and what training data may be required. This team needs to include experts in regulatory and legal compliance, local business practices, data requirements, local languages, finance and any special subject areas of concern. These individuals do not need to be involved at every step but should sign off on initial plans and any substantial changes. Involve them again in early-stage reviews and be sure to include experts not just for current markets and languages, but also for any you expect to add in the foreseeable future.

This approach does not mean you need to build everything all at once, but rather that your development should plan for international growth and requirements. For example, even if you will not initially need multiple tax modules, building in the ability to switch them — rather than hard-coding them — will make it easier to adapt your product in the future. In such cases, you can start by identifying every point where your software interacts with laws, customs or business practices that may vary, and modularize your code around those points. This lower-cost approach also makes it easier for you to deal with any changes that may occur in your home market without needing to reengineer your products, and so may save money in the long run, even for your home market.
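Building in the ability to switch modules can be as simple as a registry behind a common interface, so that core code never hard-codes one market’s rules. All names here are hypothetical, and the flat multipliers stand in for real tax logic:

```typescript
// Illustrative sketch: country-specific tax modules registered behind a
// common interface. Core billing code only knows the interface.
type TaxFn = (net: number) => number;

const taxModules = new Map<string, TaxFn>();

function registerTaxModule(country: string, fn: TaxFn): void {
  taxModules.set(country, fn);
}

function grossPrice(country: string, net: number): number {
  const fn = taxModules.get(country);
  if (!fn) throw new Error(`No tax module registered for ${country}`);
  return fn(net);
}

// Launching in Germany today (19% VAT as a flat multiplier)...
registerTaxModule("DE", (net) => net * 1.19);
// ...and entering Denmark later (25% VAT) is a registration, not a rewrite.
registerTaxModule("DK", (net) => net * 1.25);
```

Adding a market later means writing and registering one new module; the core code, and every existing module, stays untouched — which is the difference between adaptation and reengineering.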

You can achieve success in building complex software applications that rely on algorithms or data if you think beyond the simple internationalization paradigm that has dominated software development for the past 30 years. Treating international markets seriously will make your products more relevant and useful and increase brand loyalty. The up-front costs may seem high, but they will be small compared to the costs of market failure or reengineering the core of a product after launch. This is a case where an ounce of planning and prevention can save pounds of trouble later on.