A case for dedicated games localization tools

By Rolf Klischewski May 14, 2015

As opposed to other narrative media such as films or books, many games do not have a single fixed ending. In fact, they do not have a singular plot or story in the classic sense. Instead their focus is on creating total immersion, the illusion of each individual player having a unique gameplay experience. That is not quite the same as every reader of, say, a novel getting a story tailor-made for them, but the comparison is not all that far-fetched.

Any current role-playing game (RPG) will feature a more or less predefined story set in a more or less predefined world. We have a setting (Mordor, for instance) and a set of rules (Elves live in woods, Hobbits in caves with round doors). Within those boundaries, players may experience various storylines and make their own decisions. There’ll be some do-gooders and some psychos. There will be repercussions and consequences, even dead ends and permanent death.

Now, to create that enormous and complex illusion of total freedom of choice and infinite variety, the game has to cover many eventualities, take care of a vast number of possibilities. You want to pickpocket Gandalf? Sure. Feel like swearing fealty to Sauron? Done. Have a craving for setting Saruman’s beard on fire? Why not? All of those choices and decisions will become unique elements of an individual player’s gameplay experience. So they have to be in the game’s code. Imagine a book that keeps changing depending on what you want to happen next. It’s interactive storytelling, and there’s nothing quite like it outside of games.

On a technical level, the only way to make this work is the use of variables or placeholders. This often starts with a player entering his or her name right at the beginning of their journey, but it certainly doesn’t stop there. Many games are all about items and so-called quests, which in turn are all about getting items, doing errands and odd jobs and solving puzzles. Most game programmers will try to tackle the problem of all that diversity with variables. It’s hard to blame them, really, because coding something in the vein of “You find a %ITEM” and filling a database with entries such as sword, bow and armadillo is much easier than actually writing out several thousand instances of the same sentence with the item being the only difference.

This is where the problems start for most languages. Actually, it doesn’t even work in English, to be perfectly honest, as seen in Figure 1. The text engine (the part of the game’s programming code dealing with text) usually has no way of knowing exactly what it shows on the screen. Enabling a game to tell the difference between nouns starting with a vowel or consonant (and drawing the right conclusions from that in each and every context) is something most programmers would consider to be beyond their call of duty in their own language, let alone in foreign languages.
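To make the English case concrete, here is a minimal sketch in Python of what a simple text engine does with “You find a %ITEM”, next to a version that picks the article together with the noun. The function names are made up for illustration, and the vowel check is a deliberately crude assumption: it looks at spelling only, so it would still get words such as “hour” or “unicorn” wrong.

```python
def naive_find_message(item: str) -> str:
    """What a simple text engine does: paste the item name into a fixed template."""
    return "You find a %ITEM.".replace("%ITEM", item)


def indefinite_article(noun: str) -> str:
    """Crude spelling-based guess (assumption): 'an' before a, e, i, o, u."""
    return "an" if noun[:1].lower() in {"a", "e", "i", "o", "u"} else "a"


def article_aware_find_message(item: str) -> str:
    """Choose the article together with the noun instead of hard-coding it."""
    return f"You find {indefinite_article(item)} {item}."


for item in ("sword", "bow", "armadillo"):
    print(naive_find_message(item), "|", article_aware_find_message(item))
```

The naive version produces “You find a armadillo.” And this only patches one rule of one language; grammatical gender, case and plural forms in other languages would need far more than a vowel check.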

As if working with a complex narrative text with a lot of embedded code weren’t enough of a challenge, the price we have to pay for a (seemingly) infinite variety of choices and stories is a huge number of words. A word count of 300,000 is not all that uncommon for an RPG, and due to a number of factors, translation needs to be done by several translators working simultaneously on different parts or chapters.

Sticking with those 300,000 words, let’s say we have five translators taking care of 60,000 words (about 200 pages) each. Apart from some menu texts and messages (errors, status, feedback and so on) most of their work will consist of narrative texts such as dialogues, background stories and quest texts. Since all of them have to start translating at the same time and face the same deadline, only one of them is going to have a chance to get into the game’s story right from the start. Everybody else is going to deal with events that take place in the middle and at the end of the game. They won’t know how certain characters developed, who betrayed whom, what happened when and so forth. Granted, they might have access to the game’s design document or even the game itself, but even so, most of the time there’s going to be a lack of context.

Let’s take another look at Sid Meier’s Civilization IV.

Luckily, we can ignore the fact that the German text on the right is too long and cut off. Much more interesting is the fact that Power (in the sense of “political power”) was translated as Elektrizität (“electricity” or “electric power”), probably because the translator in question simply had no chance of knowing about this particular context. Of course, why and how this mistake made it through quality control and into the release version of the game is another question.

Even if you have a design document or a version of the game, your deadline often won’t leave you enough time to actually play the game or immerse yourself in its story and structure. Most likely you’ll be dealing with a single column of cells filled with various kinds of texts. Frequently you won’t know what Siegbert did to Ethel on Woodberry Pines, why King Grisbald invaded the Northern Plains or whether Castle Shivertomb is a Burg or Schloss in German (it’s complicated). You won’t know whether Princess Clarigold and Brom the Bold shared a kiss by that pond, which, in German, would mean you’d have to change the form of address they’re using (it’s even more complicated). And, of course, we’ll end up with five different styles of writing.

Let’s take a look at how we’re currently dealing with these challenges of translating what you may call technical prose.

Current localization tools

What are we working with? Ironically, Microsoft Excel still seems to be something like the gold standard of games localization tools (Figure 3). Many developers design their game’s text engine to export files as comma-separated values (CSV), which in turn many localization vendors tend to open and edit in Excel.

Usually we have one language per column, often many different languages on a single sheet. Sometimes we even have context information in a separate column, but that depends entirely on the developers’ text production workflow. On their side, texts may be created and edited and annotated in an entirely different format, Excel being nothing more and nothing less than a file exchange tool. Some developers will enter a wealth of information in their context columns; others won’t even have anything to add to their plain source texts. In the above example, the source text is in column B, the translation in column C and some comments in column D.
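As a rough sketch of what such an export holds, the following Python snippet reads a small CSV with the layout described above (source in column B, translation in column C, comments in column D) and keeps each comment attached to its string. The string ID in column A, the column headers and the two sample rows are assumptions made up for illustration; real exports differ from developer to developer.

```python
import csv
import io

# Hypothetical export: column A = string ID, B = source, C = translation, D = comment.
SAMPLE_CSV = '''ID,Source,Translation,Comment
ui_power,Power,,"Political power of a civilization, not electricity"
msg_find,You find a %ITEM.,,%ITEM is replaced by an item name at runtime
'''


def load_segments(csv_text: str) -> list[dict]:
    """Read every row and keep the comment attached to its source text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "id": row["ID"],
            "source": row["Source"],
            "translation": row["Translation"],
            "context": row["Comment"],
        }
        for row in reader
    ]


segments = load_segments(SAMPLE_CSV)
for seg in segments:
    print(f'{seg["id"]}: {seg["source"]!r} (context: {seg["context"]})')
```

The point is simply that the context travels with the segment; whether a translator ever sees it depends on what the translation tool does with that fourth column.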

Now let’s take a look at how Trados Studio 2014 handles this rather basic file (Figure 4). First of all, we have to hide all columns except for B.

As you can see, the context information is gone. We’d have to refer to the original Excel file, open it and Alt-Tab whenever we’d like to have more context. So even if context information is technically available, it stays outside of Trados Studio.

Before we can open our file in memoQ 2014, we have to copy the source text into the target column. As opposed to Trados Studio, memoQ lets you choose the Excel cells you want to translate. The translation then overwrites the source text.

As you can see, it is possible to include context information in memoQ 2014, which is a huge step forward in the right direction. The feature itself is still quite rudimentary and not nearly as flexible as it should be for games localization. For instance, the context information needs to be stored within the same Excel sheet and it has to be in the same line.

Trados Studio and memoQ make our work a lot easier by adding the benefits of translation memories and glossaries (or TermBases), and while memoQ offers a simple option for adding context information, the use of external data sources is still not supported.

To make things worse, neither Trados Studio nor memoQ has an option to actually define variables within a project. So, for example, we can’t tell either tool to handle {0} as a placeholder for numbers or characters. We can’t just simulate how the game at hand would replace those variables with, say, item names, even though those very items’ names are usually part of the same translation project. And as we’ve seen in Figures 1 and 2, having more context information and the proper values for our variables might very well make all the difference between a good and a bad localization.

The ideal tool

So, what might a translation tool look like that’s better suited to the needs of our trade? As our previous examples have shown, any dedicated games localization software will have to feature both context information and support for variables.

Introducing the as-yet fictitious Games Localization System, as seen in Figure 6.

Granted, this particular system uses a more Wordfast-like approach, but for our purposes that doesn’t make much of a difference. What does make a difference is a number of little helpful windows on the right. The GLS (as I’ll call it from now on) uses a translation memory and a glossary, just like its real-world cousins. But on top of that we now have some handy context information.

So, when we ask ourselves what a “Disruptor” looks like in this particular context, we have a small illustration answering that question, and we’re a lot less likely to translate it as a Klingon gun, for example. Now, where does that information come from? Well, during the development of a game dozens and dozens of illustrations and character models are made, both for the game itself and for promotional purposes. We tap into that existing resource and make it available to translators. All it takes, apart from having access to those images in any standard format, is linking those graphics to texts. When we’re talking about in-game items, those are quite often stored in separate arrays or databases. So linking them with images shouldn’t be much of a problem, and it’s something tech-savvy games linguists could do themselves.
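In its simplest form, that link is nothing more than a lookup on a shared ID. The sketch below assumes the game keeps item names and artwork in two tables keyed by the same string ID; the IDs, file paths and the SegmentContext class are hypothetical and only show the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SegmentContext:
    text: str
    image_path: Optional[str]  # None when no artwork is linked to the string


# Hypothetical data: string IDs mapped to source text and to existing artwork.
ITEM_STRINGS = {"unit_disruptor": "Disruptor", "unit_scout": "Scout"}
ITEM_IMAGES = {"unit_disruptor": "art/units/disruptor.png"}


def context_for(string_id: str) -> SegmentContext:
    """Join the string table and the image table on their shared ID."""
    return SegmentContext(
        text=ITEM_STRINGS[string_id],
        image_path=ITEM_IMAGES.get(string_id),
    )


print(context_for("unit_disruptor"))
print(context_for("unit_scout"))
```

A string without artwork simply comes back with no image, so the tool can still show the text and leave the preview window empty.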

Alternatively we could have other kinds of context information on the screen, for example the in-game characteristics of units such as our little Disruptor.

As for variables, we could use a similar approach.

Imagine that we’ve defined the variable {1} to have three possible values: Bronze Medal, Silver Medal and Gold Medal. As a bonus, we have a little preview of what such a medal might look like. Again, this is just another case of linking existing databases and using existing assets to assist translators within one and the same localization environment.
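A preview of that kind could be as simple as the following Python sketch. It assumes a project-level table listing every value the game may insert for a placeholder; the VARIABLES table, the preview function and the sample sentence are made up for illustration and are not part of any existing tool.

```python
# Hypothetical project-level definition: every value the game may insert for {1}.
VARIABLES = {"{1}": ["Bronze Medal", "Silver Medal", "Gold Medal"]}


def preview(segment: str, variables: dict[str, list[str]]) -> list[str]:
    """Return the segment once per possible value of each placeholder it contains."""
    results = [segment]
    for placeholder, values in variables.items():
        if placeholder in segment:
            results = [
                text.replace(placeholder, value)
                for text in results
                for value in values
            ]
    return results


for line in preview("You have earned the {1}!", VARIABLES):
    print(line)
```

Run against a translated segment instead of the source, the same preview would let a translator check every medal name against the surrounding sentence before the text ever reaches the game.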

Prerequisites for such a system to come into existence and work are data in an accessible format, cooperative developers, linguistic expertise and, of course, development funding. Any localization-aware developer will have realized by now that the old mantra of “localization doesn’t sell (more) copies” has been disproved by too many games to count. Consequently, they’ll have a vested interest in turning localization from a cost center into a profit center, and they’re much more likely to create their data and text assets with localization in mind.

Ultimately, everybody wins. Developers will stop seeing localization as a necessary evil and a nuisance. They will find out that creating localization-ready games will save them valuable time (and money) and help them generate revenue in foreign-language markets. Translators will have a much easier time making sense of what they have to work with, which will translate into, well, better translations. And gamers will be able to enjoy games independently of the language they choose to play in.