How to Localize a Game with Procedurally Generated Text

By Mikhail Alekseev, Daria Batrova, and Danil Belousov

We recently worked on a few projects featuring procedurally generated text. Since the process remains mysterious to many people, we put together this guide. What is procedural generation, what types exist, and how can you avoid the pitfalls hidden in text generation?

How games use procedural generation

Procedural generation is used in the industry to create large amounts of content in a game. Some examples are The Binding of Isaac (2D levels, random placement of monsters and loot), the Civilization series (the world map), the Borderlands series (weapons and upgrades), Star Dynasties (narrative), and Rogue Legacy (random levels, items).

Narrowing it down: text generation

So what does this have to do with localization? It’s true that a randomly generated level probably won’t drastically affect the localization of a game, but it becomes relevant when you start using procedural generation of the game text itself. We’ll cover two ways of generating in-game text. They differ in complexity, scale, and what they make possible.

The first is a procedurally generated narrative, which is probably the most difficult thing to localize. The second is the generation of composed lines for things you can encounter in a game, such as items of different rarity or quality, varying attributes, and so on.

Procedural Text Generation

Procedural generation allows for dynamic and varied narratives, dialogue, quests, and other textual elements in games. By defining a set of narrative structures, rules, and variables, developers can generate unique quests or storylines for each playthrough, offering players a more personalized and diverse experience. This approach allows developers to create a wide variety of content in the game, effectively multiplying the available narrative options by combining different parts.

While procedural generation provides a multitude of variations and scenarios in game narratives, it also presents challenges. Combinations of generated texts can occasionally read as awkward or unnatural, so a comprehensive pre-production stage is needed to ensure correct grammar and narrative coherence. Localization must also be prioritized so that all generated texts are properly adapted to the target languages and integrate seamlessly into the game.

Grammar rules play a vital role in localizing regular texts and procedurally generated content alike. Gender, singular and plural forms, word order, case systems, and agreement rules all influence how generated texts are adapted and translated. For instance, in languages like Spanish, French, or German, translators must maintain gender agreement between nouns, articles, adjectives, and pronouns. Sentence structures differ across languages, so phrases or clauses may need to be rearranged to preserve proper grammar and meaning. Languages with case systems add further complexity, because the generated texts must use the appropriate case forms and declensions. Taking these language-specific considerations into account is crucial for a linguistically precise localization of procedurally generated texts in games.

When setting up rules and variables for generated texts, it is crucial to consider the peculiarities of grammar rules in the target languages, especially for localization purposes. Since developers may not know all the rules in different languages, the system’s rules and variables for narrative generation should be flexible. A fixed set of rules might not work effectively for all languages, necessitating customization for each language’s specific grammar rules.

We localized Star Dynasties into German in 2021. With this game, we worked with the lead linguist and the developers to partially reinvent the variable system for German localization. Here is the list of key considerations for adjustable rules with examples from this game:

Gender differentiation: In some languages, gender affects the spelling and structure of phrases. Developers should consider implementing rules to differentiate between male and female strings based on the speaker’s gender in the game. Note that gender-neutral translations may not always be possible due to language constraints.

This is an example of how this feature was customized when localizing Star Dynasties.

Grammatical tenses: Languages may have variations in grammatical tenses, affecting how the third-person singular rule works or pronoun conjugations. Developers should create a system that allows for different rules for male and female lines to account for such variations.

Example:

Pronoun conjugation: Pronouns in different languages may require specific conjugation rules based on gender, number, and case. Developers should ensure that their system incorporates these rules accurately to generate appropriate pronoun forms.

Here is also one more example to illustrate how the variables transfer not only the correct pronoun conjugation but also the correct gender of the character.

These functions can cover all gender variables. For example, it is possible to say, “{Character} cannot have any more children with {his(Character)} spouse anyway,” replacing the gender-neutral “their” with the possessive pronoun that corresponds to {Character}. However, German, like many other languages (such as French), has no gender-neutral word for spouse. It would therefore make localization into other languages much easier to turn this sentence into, “{Character} cannot have any more children with {his(Character)} {spouse(Character)} anyway,” which allows for German possessive pronouns like his/her/its (German: sein/e or ihr/e) and gendered words like “spouse” (German: Ehemann or Ehefrau, i.e. husband or wife).
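To make the idea concrete, here is a minimal Python sketch of how {his(Character)} and {spouse(Character)} style functions might be resolved per language. It is not the Star Dynasties implementation: the Character class, the FUNCTIONS and TEMPLATES tables, the render function, and the German sentence are all assumptions made for illustration. The German possessive takes its stem from the character's gender and its dative ending from the spouse's gender, which is why a per-language function is needed.

```python
import re
from dataclasses import dataclass


@dataclass
class Character:
    name: str
    gender: str         # "male" or "female"
    spouse_gender: str  # "male" or "female"


def de_possessive(c):
    # Stem follows the owner; dative ending (after "mit") follows the spouse noun.
    stem = "sein" if c.gender == "male" else "ihr"
    ending = "em" if c.spouse_gender == "male" else "er"
    return stem + ending


# Hypothetical per-language function tables supplied with each translation.
FUNCTIONS = {
    "en": {
        "his": lambda c: "his" if c.gender == "male" else "her",
        "spouse": lambda c: "spouse",
    },
    "de": {
        "his": de_possessive,
        "spouse": lambda c: "Ehemann" if c.spouse_gender == "male" else "Ehefrau",
    },
}

# Illustrative strings only; the German line is not taken from the game.
TEMPLATES = {
    "en": "{Character} cannot have any more children with "
          "{his(Character)} {spouse(Character)} anyway.",
    "de": "{Character} kann mit {his(Character)} {spouse(Character)} "
          "ohnehin keine weiteren Kinder bekommen.",
}

# Matches either {func(Variable)} or a plain {Variable}.
TOKEN = re.compile(r"\{(?:(\w+)\((\w+)\)|(\w+))\}")


def render(lang, variables):
    def substitute(match):
        func, arg, plain = match.groups()
        if plain is not None:
            return variables[plain].name
        return FUNCTIONS[lang][func](variables[arg])

    return TOKEN.sub(substitute, TEMPLATES[lang])


anna = Character("Anna", gender="female", spouse_gender="male")
print(render("en", {"Character": anna}))
print(render("de", {"Character": anna}))
```

Because each language owns its function table, translators can request a new function for one language without touching the others.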

Singular and plural nouns grammar: The grammar rules for singular and plural nouns can vary across languages. Developers should consider these distinctions and design their system to generate grammatically correct forms for both singular and plural nouns.

German does not have a simple pluralization form like adding an “s.” As you can see from the following examples, there are multiple different ways to pluralize words: duke/dukes are Herzog/Herzöge, aunt/aunts are Tante/Tanten, uncle/uncles are Onkel/Onkels and mother/mothers are Mutter/Mütter. It is therefore difficult yet necessary to consider how the target language pluralizes nouns and their corresponding pronouns and articles, and to develop functions in the target language that allow for as many variations as possible. However, there will be instances where this is impossible, which requires translators and localizers to find natural workarounds without functions in the target. To use the example from above, this may be a possible workaround.
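For the cases where a plural function is feasible, one option is a per-language lookup table rather than a single suffix rule. The Python sketch below uses the German forms listed above; the PLURALS table, the noun_form function, and the naive English fallback are assumptions for illustration, not part of any real game.

```python
# Hypothetical plural table filled in by the translators, one entry per noun.
PLURALS = {
    "de": {
        "Herzog": "Herzöge",
        "Tante": "Tanten",
        "Onkel": "Onkels",
        "Mutter": "Mütter",
    },
}


def noun_form(lang, noun, count):
    """Return the singular or plural form of `noun` for the given count."""
    if count == 1:
        return noun
    table = PLURALS.get(lang, {})
    if noun in table:
        return table[noun]
    if lang == "en":
        # Naive English rule, good enough for duke/aunt/uncle/mother.
        return noun + "s"
    # Fail loudly so that a missing plural is caught before release.
    raise KeyError(f"no plural form for {noun!r} in {lang!r}")


print(noun_form("de", "Herzog", 2))  # Herzöge
print(noun_form("en", "duke", 2))    # dukes
```

Raising an error for a missing entry, instead of guessing a suffix, turns an untranslated plural into something that testing will catch.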

Noun cases: Another crucial aspect to consider in language-specific rule sets is the concept of noun cases. Different languages employ noun cases to indicate the role or function of nouns within a sentence. Developers should account for these variations and ensure that their procedural text generation system can generate accurate noun case forms based on the grammatical rules of the target language.

The last point doesn’t have any specific example from Star Dynasties as the game doesn’t have such a function, but this point may be relevant for your game and the languages you would like to localize it into.
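If your game does need such a function, a case-aware placeholder could look like the following Python sketch. Everything here is hypothetical: the {noun:case} placeholder syntax, the CASE_FORMS table, and the fill function are assumptions for illustration, and the German sentences are not taken from any game.

```python
import re

# Hypothetical declension table: article plus noun for each German case.
CASE_FORMS = {
    "Herzog": {
        "nom": "der Herzog",
        "gen": "des Herzogs",
        "dat": "dem Herzog",
        "acc": "den Herzog",
    },
    "Tante": {
        "nom": "die Tante",
        "gen": "der Tante",
        "dat": "der Tante",
        "acc": "die Tante",
    },
}

# Assumed placeholder syntax: {noun:dat}, {noun:acc}, and so on.
PLACEHOLDER = re.compile(r"\{noun:(\w+)\}")


def fill(template, noun):
    """Replace every {noun:case} placeholder with the declined form."""
    return PLACEHOLDER.sub(lambda m: CASE_FORMS[noun][m.group(1)], template)


print(fill("Du hilfst {noun:dat}.", "Herzog"))  # Du hilfst dem Herzog.
print(fill("Du siehst {noun:acc}.", "Herzog"))  # Du siehst den Herzog.
```

The translator decides which case each sentence needs, and the table supplies the matching form for whichever noun the game inserts.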

Customizing the rule set for each target language ensures optimal performance. However, there are potential pitfalls associated with this approach. Here is a list of pros and cons for customized rule sets:

In addition, it is crucial to provide means for linguists to test how each line appears in the game. While localizing a project, it’s not always possible to accurately grasp the context solely from the information provided in the localization file. A valuable tool in this regard is the Dynamic Text Tester, which was generously created and provided to Allcorrect during the localization of Star Dynasties. Linguists can simply copy and paste the line they wish to check, and the tester generates variations of the line based on the functions used in the game. This tool proved invaluable in ensuring that the localized text aligned perfectly with the intended context and functionality within the game.
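We cannot reproduce the Dynamic Text Tester here, but the general idea of expanding one line into all of its variations can be sketched in a few lines of Python. The FORMS table, the CALL pattern, and the variations function are assumptions for illustration and do not reflect how the actual tool works.

```python
import re
from itertools import product

# Hypothetical word forms per function and gender.
FORMS = {
    "his": {"male": "his", "female": "her"},
    "he": {"male": "he", "female": "she"},
}

# Matches function calls such as {his(Ruler)}; plain {Ruler} is left alone.
CALL = re.compile(r"\{(\w+)\((\w+)\)\}")


def variations(line):
    """Return (assignment, rendered line) for every gender combination."""
    variables = sorted({arg for _, arg in CALL.findall(line)})
    results = []
    for genders in product(("male", "female"), repeat=len(variables)):
        assignment = dict(zip(variables, genders))
        rendered = CALL.sub(
            lambda m: FORMS[m.group(1)][assignment[m.group(2)]], line
        )
        results.append((assignment, rendered))
    return results


for assignment, text in variations(
    "{Ruler} praised {his(Ruler)} heir, and {he(Ruler)} smiled."
):
    print(assignment, text)
```

A linguist pasting a translated line into a tool like this sees every form the game could produce, not just the one that happens to come up in a playthrough.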

Composed Lines

Now let’s jump to something a little different. In Rogue Legacy 2, some of the player’s equipment is randomly generated. A material (an adjective such as Leather, Gilded, or Obsidian) is concatenated with a type of equipment (a noun such as Weapon, Cape, or Helm). This works perfectly in English in all situations without extra work, but in other languages, adjectives and nouns can have different genders, and in some languages the order is reversed.

You need to concoct a “formula” to tell the game how it should glue these lines together, and know that the formula will differ depending on language.
Here’s a snippet of how it works in Rogue Legacy 2.

[Figure panels: MATERIAL, EQUIPMENT, FORMATTER/FORMULA]

(EN) Leather Cape turns into (IT) Mantello di cuoio.

Chinese is pretty straightforward: We reversed the order and removed the whitespace (since the Chinese language, just like Japanese, doesn’t use whitespace) in the formula. In Portuguese, we needed to reverse the order and insert the “de” preposition in the formula. Italian was trickier because the preposition could be different, so we put the preposition into the translation of adjectives.

With such an approach, composed lines should not be used outside of this concatenation. The translation and the formulas are tailored to work in this specific case, and ideally, they shouldn’t be used anywhere else in the game. If you reused the “Leather” string somewhere else in the game, it might not be correct in Italian. In English, it can be either a noun or adjective, and it works fine, but in the example, our Italian translation for “Leather” is more like “made of leather,” and the first letter is in lowercase, so it would only work in the composed line together with the formula.
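The formula approach can be sketched in Python as follows. Only the Italian result, Mantello di cuoio, comes from the example above; the Portuguese strings, the string IDs, and the table layout are assumptions for illustration and are not the actual Rogue Legacy 2 data.

```python
# Hypothetical string tables keyed by language and string ID.
MATERIALS = {
    "en": {"leather": "Leather"},
    "pt": {"leather": "couro"},
    # Italian keeps the preposition inside the material translation,
    # because the preposition can differ from one material to another.
    "it": {"leather": "di cuoio"},
}

EQUIPMENT = {
    "en": {"cape": "Cape"},
    "pt": {"cape": "Capa"},
    "it": {"cape": "Mantello"},
}

# One formula per language tells the game how to glue the parts together.
FORMULAS = {
    "en": "{material} {equipment}",
    "pt": "{equipment} de {material}",
    "it": "{equipment} {material}",
}


def compose(lang, material_id, equipment_id):
    """Build an item name from its material and equipment strings."""
    return FORMULAS[lang].format(
        material=MATERIALS[lang][material_id],
        equipment=EQUIPMENT[lang][equipment_id],
    )


for lang in ("en", "pt", "it"):
    print(lang, compose(lang, "leather", "cape"))
```

Note that the Italian material string only makes sense inside this formula, which is exactly why such strings should not be reused elsewhere in the game.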

HOW TO PREPARE

ROGUE LEGACY 2 GENDERED STRINGS

The main character could be either male or female. We didn’t have any custom tags for the gender, but we could provide two translations for specific lines (and the game would automatically pick either the male or female version based on the character’s gender).

Advantage: Without a custom tag system, this needs the least amount of dev time.

Disadvantage: You need a bigger localization budget.
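A lookup of this kind needs very little code, which is where the savings in dev time come from. The Python sketch below is an assumption about how such a system might look, not the Rogue Legacy 2 implementation; the string keys, the "any" fallback, and the Spanish sample lines are invented for illustration.

```python
# Hypothetical string table: lines that depend on the hero's gender carry
# two translations, and all other lines carry a single "any" version.
STRINGS = {
    "greeting_hero": {
        "male": "Bienvenido, héroe.",
        "female": "Bienvenida, heroína.",
    },
    "game_over": {"any": "Fin de la partida."},
}


def get_line(key, gender):
    """Pick the gendered variant if one exists, else the shared one."""
    variants = STRINGS[key]
    if gender in variants:
        return variants[gender]
    return variants["any"]


print(get_line("greeting_hero", "female"))  # Bienvenida, heroína.
print(get_line("game_over", "female"))      # Fin de la partida.
```

The cost moves to the translation side: every gender-dependent line is translated twice.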

Pre-production

It can’t be stressed enough that pre-production is crucial in the localization of generated texts. Many problems related to generated texts could be avoided if enough time was spent preparing for the localization process. Here are some things that would help a lot:

Documents explaining how different functions work. Such documents would be the best reference for all linguists working on the project, helping them decide how exactly they want to translate a specific line based on how it could be used in functions.
Documents that explain how different parts of sentences are connected in the case of composed lines. It is absolutely necessary for linguists to understand what options could be connected to a specific line to translate it correctly.
A list of all functions present in the lockit. It would be a great start to do pre-translation analysis of all functions present in a game so we can find a general logic that could be used throughout the project.
A bestiary (or a similar encyclopedia). An overview of all characters, enemies, NPCs, and locations would be an amazing reference to ensure that every team member understands every character.
A tool for the linguistic team to test how localized lines will look in the game.

We already mentioned pre-translation analysis, and it plays a very important role in projects with any type of generation. During that analysis, many potential issues can be found and resolved. For example, analyzing “if” functions allows us to introduce genders to nouns in languages that require it.
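A first pass at this analysis can be automated. The Python sketch below counts how often each function appears in a lockit, assuming functions are written as {name(Variable)}; that syntax, the function_inventory name, and the sample lines are assumptions for illustration, so adjust the pattern to your own format.

```python
import re
from collections import Counter

# Captures the function name in calls such as {his(Character)}.
CALL = re.compile(r"\{(\w+)\(")


def function_inventory(lines):
    """Count how many times each function is used across all lines."""
    counts = Counter()
    for line in lines:
        counts.update(CALL.findall(line))
    return counts


# Invented sample lines standing in for a real lockit export.
sample = [
    "{Character} lost {his(Character)} title.",
    "{he(Character)} thanked {his(Rival)} {spouse(Rival)}.",
]

for name, count in function_inventory(sample).most_common():
    print(name, count)
```

The resulting list shows linguists which functions exist and which ones are used most, so they know where to focus their questions to the developers.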

Last but not least, selecting one linguist per language to communicate with developers directly helps a lot. Direct communication makes it much easier to solve language-specific problems, such as functions unsuitable for language grammar.

LQA

Linguistic quality assurance (LQA) is crucial to ensure that generated narratives work as intended. Thorough testing before the game’s release is essential to avoid unexpected surprises. Without it, the game experience may feel unpredictable, like opening a box of chocolates without knowing what you’ll get. To optimize the testing process in terms of budget, time, and effort, a per-function testing approach is typically recommended.

Since these games offer non-linear gameplay, where different players can encounter varied content based on their choices, thoroughly checking all game content can be time-consuming and costly. Thus, adopting a per-function testing strategy can be more efficient. This approach involves testing a set of functions that share the same logic and usage in the game. Once all variations are tested within that set, the tester can move on to the next set of functions, ensuring comprehensive coverage of all possible combinations in the game while minimizing the time spent on iterations.

By employing per-function testing, developers can streamline the testing process, verify all potential variations, and achieve higher confidence in the game’s generated content.
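One way to prepare a per-function test plan is to group lines by the set of functions they use, so that a tester can work through one group at a time. The Python sketch below assumes the {name(Variable)} function syntax; the build_test_plan function and the sample line IDs are invented for illustration.

```python
import re
from collections import defaultdict

# Captures the function name in calls such as {his(Character)}.
CALL = re.compile(r"\{(\w+)\(")


def build_test_plan(lockit):
    """Group line IDs by the sorted tuple of functions each line uses."""
    groups = defaultdict(list)
    for line_id, text in lockit.items():
        used = tuple(sorted(set(CALL.findall(text))))
        if used:  # lines without functions need no per-function testing
            groups[used].append(line_id)
    return dict(groups)


# Invented sample data standing in for a real lockit.
lockit = {
    "evt_01": "{Character} lost {his(Character)} title.",
    "evt_02": "{Rival} kept {his(Rival)} throne.",
    "evt_03": "{he(Character)} thanked {his(Rival)} {spouse(Rival)}.",
    "evt_04": "The council meets.",
}

for functions, line_ids in build_test_plan(lockit).items():
    print(functions, line_ids)
```

Each group shares the same logic, so once every variation has been checked for a group, the tester can move on to the next one.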

Closing words

We hope this article helped you understand what procedural generation is and shed light on how to avoid some pitfalls of this technique. Projects with procedural generation are often great: they have high replayability, and they let you create very rich and dynamic worlds that keep players entertained for multiple playthroughs. Yes, this feature comes at the price of a more complicated localization requiring more time, budget, and developer involvement, but we think it’s worth it in the end. Last but not least, if you have any questions left, please be sure to reach out. We would be glad to help!

We would like to thank the Iceberg Interactive and Cellar Door Games teams for the opportunity to work on Star Dynasties and Rogue Legacy 2 localizations. We would also like to thank the translation teams for making everything happen, and Anna Augustin, the lead linguist on Star Dynasties, for linguistic guidance.

Mikhail Alekseev is an Allcorrect account manager focused on fusing language and video games together.
Daria Batrova is an Allcorrect account manager passionate about tackling intricate projects with enthusiasm and expertise.
Danil Belousov is an Allcorrect account manager who enjoys working on large games and solving complex issues.
