Making language quality work for startups

In today’s world, it is taken for granted that products and services must be technically flawless. In just a few simple clicks, users must be able to achieve their goals, whether that is to perform a task, find information or buy other products and services. These technological solutions — the hidden back-ends — are made visible by language. How products and services speak to their consumers is one of the key contributors to successful market penetration.

Translation service buyers are therefore understandably anxious about language quality. Every company wants to deliver high-quality products and services, lead its market and hold that position for as long as possible. That requires the right language, tone and register to express (technical) concepts, as well as the right cultural connotations to attract buyers. It is about the impression the language makes. If good language quality can open doors to a local market, bad grammar and spelling mistakes can close them: potential customers are unlikely to trust a company that is careless with its language to deliver quality goods and services.

Defining quality

Consider the use of language in the online retail experience. Sam Walton, the founder of Walmart, famously said, “The secret of successful retailing is to give your customers what they want.” But do we always know what the most critical language elements are for product or service buyers?

Language elements can attract user attention and persuade a potential buyer to finally click the “buy” button. They can produce a positive emotional user experience to make the user “like” a product or “share” a service page. They can help users accomplish activities quickly and easily.

But unless we know what is important to our buyers, we cannot assess whether our customers are getting what they want. This exposes us to one of the biggest risks: without defined success criteria, language quality remains subjective.

We can only manage what we can measure, and we can only measure what we can benchmark. Therefore, quality must be seen as conformance to requirements, specifically requirements that reflect what the customer wants.

Thankfully, there are core localization business features that help reduce subjectivity and introduce quality measurement.

End-to-end quality management

Managing language quality end-to-end is the best business plan. Yet this is not as simple as “translate this source content into that target language.”

Quality localization of a product or service demands an in-depth understanding of the subject matter, source language and culture as well as the translator’s ability to adapt and transfer the original user experience into a localized, market-adapted version for new users. Just a few simple and straightforward questions could uncover challenges that would affect the creation and delivery of a successful product.

• Is the source content ready for localization?

• Does the translation best reflect the source content?

• How understandable is the translated content?

• Does the translation meet its communication goals?

Thus, quality localization consists of three interrelated elements: source quality, translation quality and the emotional experience of the user.

Source quality

Translators, language service providers and translation service buyers agree that translation is more difficult if there are errors in the source material. Yet the role of language itself is still underestimated.

English as a source language can be a huge challenge, for example. Its grammar, vocabulary, style and tone are all hurdles on the way to the right translation. Sentence fragments, short word strings and emotionally rich content always challenge translators, especially if there is not enough context available. How would you handle interjections such as “oh snap,” “oops,” “whoops,” “yikes” and “fiddlesticks” if they all appear in one file?

Often, source writers use connotations typical of their home culture and forget that their cultural concepts might be irrelevant or unknown in other parts of the world. Holiday marketing text that says “The stockings are hung up but are they filled? Fill up your stockings!” makes a lot of sense in countries where a Christmas stocking is hung on Christmas Eve for Santa Claus to fill with gifts. But how do you translate that for other markets? Consider, too, the nontextual elements (images, emojis and symbols) that are often embedded in source text.

Preparing for internationalization must become a default for source content writers. The cultural aspects of source content are extremely relevant to the source quality.

Translation quality metrics

To meet the needs and expectations of our customers, we have to define translation quality specifications. There are several parameters to consider.

• Language locale can be of critical importance in targeting a specific market. While sharing many commonalities, French is different in Canada, Paris and Côte d’Ivoire, for example.

• Field of expertise means finding the best-suited translators for the given job, whether it is legal, finance or cloud computing.

• Intended use (purpose) helps define language tone and register and influences how to approach the translation task.

• Target audience similarly shapes what language register to use. A translation intended for teenagers will differ from one created for experts.

• Content type — for example, promotional material, software strings, technical documentation or eLearning content — specifies the nature of the text to be translated and indicates what skills are needed.

• Output format for streaming media, mobile device or website, for example, clarifies the specific demands of the content in terms of length limitations, use of condensed language and more.

• Style and tone can guide translators on brand-specific references to be followed.

• Evaluation approach explains how the translated content will be assessed. Will it be based on reported and counted issues or will it be assessed as a whole based on specific criteria?
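The parameters above can be captured as one structured record per translation job, so that gaps surface before work starts. The following is a minimal sketch only: the class name, field names and sample values are illustrative assumptions, not an industry schema.

```python
from dataclasses import dataclass
from enum import Enum


class EvaluationApproach(Enum):
    """The two assessment styles named in the list above."""
    ERROR_COUNT = "reported and counted issues"
    HOLISTIC = "assessed as a whole against specific criteria"


@dataclass(frozen=True)
class TranslationSpec:
    """One record per translation job. All field names are hypothetical."""
    locale: str               # e.g. "fr-CA" rather than just "fr"
    field_of_expertise: str   # legal, finance, cloud computing, ...
    intended_use: str
    target_audience: str
    content_type: str
    output_format: str
    style_guide: str
    evaluation: EvaluationApproach

    def missing_fields(self) -> list[str]:
        """Return the names of text fields that were left blank."""
        return [name for name, value in vars(self).items()
                if isinstance(value, str) and not value.strip()]


# Hypothetical job: every parameter is set except the style guide.
spec = TranslationSpec(
    locale="fr-CA",
    field_of_expertise="legal",
    intended_use="customer-facing terms of service",
    target_audience="general public",
    content_type="technical documentation",
    output_format="website",
    style_guide="",
    evaluation=EvaluationApproach.ERROR_COUNT,
)
print(spec.missing_fields())  # ['style_guide']
```

A record like this makes it easy to send a job back to the buyer for clarification instead of leaving the translator to guess.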

Quality levels

Our defined values for the individual parameters will help us determine what final output quality we need. Implicitly, every translation service buyer wants the best possible quality. However, most buyers have a broad portfolio of content to translate. Detailed communication with the buyer might reveal that different content has different purposes and, thus, different definitions of “the best possible quality.”

The localization industry generally recognizes three levels of quality. The nomenclature has not yet been codified, so several different designations may refer to the same quality level, and there is no industry-wide definition of what exactly constitutes each level.

Fully automatic useful translation (FAUT) or basic quality is the typical output quality of machine-translated content. Here, usability is the central aspect — the translated (target) content must be intelligible and convey the information given in the source material.

Good enough, standard or value quality is applied to target content in which the source is accurately conveyed but terminology, grammar, language fluency and naturalness do not have to be flawless.

Human translation, premium or high quality involves individual quality parameters that typically underline and fine-tune how “top quality” expectations are met. The focus can be on creativity or absolute accuracy; often it is the combination of both.

Quality dimensions

While parameters help define the required quality level, dimensions are the core elements that specify what matters and how much.

Issue. Was any problem detected while assessing the quality of the source or target content? Issues are not errors; only errors have a negative impact on the quality score. Issues are potential challenges in the content that the translator should be made aware of, for example, different formulations that have the same meaning.

Error. Unlike an issue, an error is an actual, verified problem with the text. An issue is flagged as an error only if it can be documented that it produces wrong or improper output (factually, geopolitically, legally and so on), fails to meet standards (grammar, typography) or violates rules set for the given task (terminology, length limitations, style guidelines).

Error typology. This is the system of issue and error categories and subcategories that describe specific problem areas: accuracy, terminology, fluency, style, locale convention, design and verity. Simple quality metrics can operate just using the master categories. It is recommended, however, to use subcategories (in full or partially) to better address root causes and drive improvements.

There are numerous error typology nomenclatures. Many translation service buyers have their own proprietary categorization systems. However, we recommend using the harmonized MQM-DQF error typology model (Figure 1). It is an outcome of the alignment efforts between the TAUS-based DQF and the QT21 project, funded by the European Union and managed by the German Research Center for Artificial Intelligence (DFKI).
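To illustrate the difference between working with master categories only and working with subcategories, here is a small sketch that tallies the same reported errors both ways. The master category names come from the text; the "category/subcategory" label format and the subcategory names used in the sample data are assumptions for illustration.

```python
from collections import Counter

# Master categories as listed in the text.
MASTER_CATEGORIES = {
    "accuracy", "terminology", "fluency", "style",
    "locale convention", "design", "verity",
}


def rollup(labels):
    """Count errors per master category from 'category/subcategory' labels."""
    counts = Counter()
    for label in labels:
        master = label.split("/", 1)[0].strip().lower()
        if master not in MASTER_CATEGORIES:
            raise ValueError(f"unknown master category in label: {label!r}")
        counts[master] += 1
    return counts


# Hypothetical review output; subcategory names are illustrative.
reported = [
    "accuracy/omission",
    "accuracy/mistranslation",
    "fluency/grammar",
    "fluency/spelling",
    "fluency/grammar",
    "terminology",          # a label without a subcategory is also accepted
]

by_master = rollup(reported)   # the simple view
by_sub = Counter(reported)     # the granular view

print(by_master["fluency"])        # 3
print(by_sub["fluency/grammar"])   # 2
```

The master view shows that fluency is the weakest area, while the granular view points to grammar as the likely root cause.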

Error weight. This is a numerical indication (in error points) of the importance of a particular error type in the overall quality assessment. While accuracy and correct, consistent terminology are of critical importance for legal content, factors such as fluency, naturalness and readability are crucial for creative types of content. In a legal translation, for example, errors in the accuracy and terminology categories might carry a bigger weight (for example, 1) than errors in fluency (for example, 0.5). In a marketing translation, it could be the opposite.
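As a worked illustration of weighting, the sketch below converts the same error counts into error points under two content profiles. The legal weights (1 and 0.5) follow the example in the text; the marketing weights are an assumed mirror image of them, and the error counts are invented.

```python
# Error weights per content type. The legal values come from the text;
# the marketing values are an assumption (the same weights reversed).
ERROR_WEIGHTS = {
    "legal": {"accuracy": 1.0, "terminology": 1.0, "fluency": 0.5},
    "marketing": {"accuracy": 0.5, "terminology": 0.5, "fluency": 1.0},
}


def error_points(error_counts, content_type):
    """Sum the error counts, each multiplied by its category weight."""
    weights = ERROR_WEIGHTS[content_type]
    return sum(weights[category] * count
               for category, count in error_counts.items())


# Hypothetical review result: errors counted per master category.
counts = {"accuracy": 3, "terminology": 1, "fluency": 2}

legal_points = error_points(counts, "legal")          # 3*1 + 1*1 + 2*0.5
marketing_points = error_points(counts, "marketing")  # 3*0.5 + 1*0.5 + 2*1
print(legal_points, marketing_points)  # 5.0 4.0
```

The same six errors cost more in the legal profile because most of them fall into the categories that matter most for legal content.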

Quality threshold. The number of error points (not the number of errors) that is permissible within a defined word count sets the passing bar, which is expressed as a number (quality score). For instance, if we set 90 as the passing bar on a scale of 0 to 100, anything below 90 is graded as a fail (Figure 2). A critical error would fail the assessed sample automatically. The final quality score of the reviewed sample is calculated from the reported error points and then compared against the quality threshold.
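The text does not prescribe a scoring formula, so the following sketch assumes one simple option: start from 100 and subtract the error points per 100 words. The function names, the formula and the sample figures are illustrative assumptions; only the passing bar of 90 and the automatic fail on a critical error come from the text.

```python
def quality_score(error_points, word_count):
    """Score on a 0-100 scale: 100 minus error points per 100 words.

    This formula is an assumption; real programs may normalize differently.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return max(0.0, 100.0 - 100.0 * error_points / word_count)


def passes(error_points, word_count, threshold=90.0, has_critical_error=False):
    """A critical error fails the sample outright; otherwise compare to the bar."""
    if has_critical_error:
        return False
    return quality_score(error_points, word_count) >= threshold


# Hypothetical 200-word samples.
print(quality_score(5.0, 200))    # 97.5
print(passes(5.0, 200))           # True
print(quality_score(30.0, 200))   # 85.0
print(passes(30.0, 200))          # False
print(passes(5.0, 200, has_critical_error=True))  # False
```

Because the score is normalized by word count, samples of different lengths can be compared against the same passing bar.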

Dynamic quality metrics

Historically, we evaluated translation quality statically. We captured, recorded and measured all errors without regard to the content.

This has changed in recent years. Today, we recognize that there is no single set of quality criteria that should apply in all cases. Context is king.

“Customers don’t measure you on how hard you tried,” said Apple co-founder Steve Jobs. “They measure you on what you deliver.” Quality must be sensitive, therefore, to purpose and situation and allow for differentiation with regard to the content itself. And the broad variety of aforementioned factors should determine how quality assessments and measurements are designed, whether approached simply (using main error categories and weighted per content) or granularly.

These dynamic quality assessment models are an excellent fit for translation service buyers with a broad portfolio of content types.
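One way to picture a dynamic model is as a lookup from content type to an assessment profile that fixes the quality level, the passing bar and the level of detail. The sketch below is hypothetical throughout: the profile fields, content types and threshold values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssessmentProfile:
    quality_level: str   # "basic", "standard" or "premium"
    threshold: float     # passing bar on a 0-100 scale
    granular: bool       # True: use subcategories; False: master categories only


# Hypothetical profiles; a real program would agree these with the buyer.
PROFILES = {
    "legal": AssessmentProfile("premium", 95.0, True),
    "marketing": AssessmentProfile("premium", 90.0, True),
    "user forum": AssessmentProfile("basic", 70.0, False),
}
DEFAULT_PROFILE = AssessmentProfile("standard", 85.0, False)


def profile_for(content_type):
    """Pick the assessment profile for a content type, with a fallback."""
    return PROFILES.get(content_type.strip().lower(), DEFAULT_PROFILE)


print(profile_for("Legal").threshold)            # 95.0
print(profile_for("user forum").granular)        # False
print(profile_for("press release").quality_level)  # standard
```

Keeping the profiles in one table means a buyer with many content types can adjust expectations per type without changing how reviews are carried out.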

User experience

The ISO 9241-210 standard defines user experience as the “perceptions and responses that result from the use or anticipated use of a product, system, or service.” It is not about counting errors. Rather, it is the outcome of the interaction between the internal state of the user and the product. Users want efficiency and need products to be as functional and intuitive as they can be. Accurate, clear, straightforward language is what helps. Language must also be attractive, using puns, idioms and other elements to capture attention. The target content must not only convey the message, it must deliver it in an emotionally and culturally relevant way.

As users, we know that our internal state depends on a wide range of factors: our expectations, our needs, our immediate mood and even the amount of caffeine we’ve consumed. While, as language service buyers, we cannot control all of these factors, knowing exactly who our customers are sets us up better for success. Today, thanks to big data, we can understand user behavior and expectations like never before. The information that we collect can be used for various adjustments and improvements to product functionality but also to product language.

Language-focused inputs (whether gathered via A/B testing, user forums, social media or specific feedback from local offices and marketers) can be structured to allow for comparisons with data collected during translation/language quality assessment. Language changes requested by local offices should be tracked and analyzed to lend crucial insight to the translation effort and, in turn, to improve the user experience.

The habit of excellence

“We are what we repeatedly do,” said the historian and philosopher Will Durant. “Excellence, then, is not an act, but a habit.”

Taking Durant’s perspective, successfully creating and managing language quality does not have to be complicated. The toughest aspect of a habit is its diligent and systematic execution. Breaking the quality management process into single habits can bring us all — language service providers, buyers and users — experiences that we can count on.

The critical habits of translation quality excellence are:

• Know your customers and understand their success criteria.

• Clarify and set expectations upfront and communicate these to every stakeholder in the workflow chain.

• Define quality measurements, parameters and dimensions to prioritize what matters to you.

• Execute the quality plan.

• Measure, analyze and benchmark success.