Quantifying and measuring linguistic quality

By Jason Arnsparger September 20, 2013

Even with all the tools, technology and automation available today, we still rely on people to perform translation. And as we know all too well, people make mistakes. Furthermore, people interpret language differently.

As with any human process, translation can be subjective. Some aspects of translation are objective: “yes” translated as “no” is a clear mistranslation, for example. However, translation quality is often judged on subjective criteria such as style, terminology or how the core concepts have been articulated in another language.

Quality is a word that is thrown around rather loosely, in many different contexts. It has become a buzzword of sorts. What one person would consider quality, another might not. How can you objectively determine if something is, in fact, quality? Furthermore, how can you quantify the level of quality? Asking ten different people would likely get ten different definitions. Let’s start with a few. The Project Management Institute, in A Guide to the Project Management Body of Knowledge (4th Edition), defines it as “The degree to which a product meets the specified requirements.” The Six Sigma Handbook calls it the “Number of defects per million opportunities.” The International Organization for Standardization says it is “Determined by comparing a set of inherent characteristics with a set of requirements. If those inherent characteristics meet all requirements, high or excellent quality is achieved. If those characteristics do not meet all requirements, a low or poor level of quality is achieved.”

All of these definitions are from highly regarded organizations in the quality management field. There are two common attributes. First, there is a scale of degrees of quality. Second, there is a set of defined specifications to provide a point of reference on the scale.

A translation that accurately conveys the intended meaning of the source may still not necessarily meet the criteria of high quality. Due to the subjective nature of the particular needs of the end user or customer, a quality translation is more than just maintaining the meaning. Sure, in many instances quality can be black or white, but quite often there is a lot of gray area when talking about linguistic quality.

Quality is in the eye of the beholder. A linguist cannot know the individual preferences of everyone in the target audience. If quality is measured in part by preference, then the translator needs to know what quality really entails for each individual in any given instance. This is where specifications become so critical.

Defining and measuring specifications

From project to project, the needs and expectations of the customer may change. This is why defined specifications become so critical. For example, does the translation need to be concise? Does it need to use specific, preferred terminology (red blood cells or erythrocytes)? Does it need to portray a certain tone? The list of specifications can go on and on.

Additional considerations of context need to be outlined at the start of an individual project as well. Is the translation for a lawsuit? Then it needs to be fairly literal. Is it for a marketing campaign? Then the translator may be doing creative rewriting or transcreation. If the translation is done in the setting of a regulated industry, then it needs to meet regulatory standards. One aspect of a quality translation is that it meets the purpose for which it will be used. So to ensure quality, the customer needs to provide the linguist a detailed description of the target audience and setting in which the target content will be used. The bottom line is, a qualified translator can and should deliver exceptional quality translations, but without specifications the translation still might not be good enough to meet the customer’s expectations. Therefore, the expectations must be formulated into specifications, which will in turn be the measuring stick for linguistic quality.

Once the expectations are well established and formulated into specifications, an error can be objectively identified. Counting errors leads to a measurement method. There are several ways to measure the quality of a product by counting the number of errors. Take Six Sigma, for example. The Six Sigma quality level is 3.4 errors per million opportunities, which corresponds to a 99.99966% yield. The Four Sigma quality level is 6,210 errors per million opportunities, which corresponds to a 99.38% yield. These methods are useful because they make quality very black and white: the translation either meets the Six Sigma quality level or it doesn’t. We can use a tool such as Six Sigma to verify and quantify quality once we have clearly defined what counts as an error in the translation process.
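The yield figures quoted above follow directly from the error rates. As a minimal sketch (the function names are illustrative and not taken from any Six Sigma toolkit), the conversion can be written as:

```python
def dpmo(errors: int, opportunities: int) -> float:
    """Defects per million opportunities observed in a sample."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return errors * 1_000_000 / opportunities


def yield_percent(defects_per_million: float) -> float:
    """Percentage of opportunities that are free of defects."""
    return (1 - defects_per_million / 1_000_000) * 100


# Error rates commonly quoted for the Six Sigma and Four Sigma levels.
print(round(yield_percent(3.4), 5))   # 99.99966
print(round(yield_percent(6210), 3))  # 99.379

# Hypothetical sample: 12 errors found in 3,000 opportunities.
print(dpmo(12, 3000))                 # 4000.0
```

The hypothetical sample works out to 4,000 errors per million, better than the Four Sigma level of 6,210 but far from the Six Sigma level.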

Measured quality means objectively assessing the quality level of the translated product for its conformance to defined requirements and specifications. As requirements vary from project to project, company to company, and even supplier to supplier, the assessment needs to be customized to meet each individual set of requirements.

Without specifications against which to measure quality, subjectivity can tend to derail translation projects, leaving them at the mercy of individuals’ preferences. This has a significant impact on an organization’s bottom line.

Five whys

The “five whys” is an iterative question-asking technique used to explore the cause-and-effect relationships underlying a particular problem. The primary goal of the technique is to determine the root cause of a defect or problem. Along those lines, there are five main arguments for the use of measured quality: proactive process, saving cost, reducing timelines, regulatory compliance and driving business.

Let’s consider the proactive process. Quality assessment tends to be overly reactive. Most organizations determine linguistic quality late in the localization process through in-country review, depending on individuals that may or may not be qualified to make the ultimate decision on the quality of translated content. That approach may seem sufficient to most organizations, even if it puts an undue amount of pressure on those validating linguistic quality.

In any process there is a series of checks and balances: execution, inspection and acceptance. It is tricky to find the right number of checks and balances. Some are necessary, but if there are too many, there could be diminishing returns. Diminishing returns occur when there is a “pass the buck” mentality: one resource depends on the next in line to catch errors. Such a process of checks and balances typically does result in quality translations, but the process is inefficient. Excessive quality control steps can paralyze a project and add questionable value.

There is an important distinction between quality control (QC) and quality assurance (QA). QC is essentially inspection. QC simply validates tasks. By nature, QC is reactive, while QA is more proactive. QA is the active pursuit of quality, engrained in the overall process and in the mind-set of an organization. QA likely includes a subset of QC steps within the process, but QA expands to the actual design and execution of a process. Part of QA is the ongoing measurement of quality. The ultimate goal is that no errors or non-conformances are found during QC, or inspection. Eventually QA reduces dependence on QC, driving efficiency and leaning out the process.

As far as the cost of good quality is concerned, there are costs for prevention. There are costs for inspection. There are costs for detection, costs for rework and there are human resource costs. Ask your language service provider how much you are paying for QC, but also consider the cost of poor quality. Poor quality can compromise the reputation of the organization from the customer’s perspective, which in turn could result in lost revenue. In extreme cases like a recall, the cost could be huge. You should also consider your internal QA staff. Could those resources be working on other things if you knew incoming translation quality was exceptional — if it could be quantified?

Measured quality can have a direct impact on the bottom line. Put into the context of Six Sigma, this impact can be dramatic. Using the previous example, a Four Sigma quality level means there are approximately 6,210 errors per million transactions or opportunities. In the context of translation, let’s say that’s 6,210 out of one million words. A Six Sigma quality level, however, would mean approximately 3.4 errors per million words. What is the cost of preventing, inspecting, detecting and fixing 6,210 errors versus three or four errors? Remember that seemingly minute difference between 99.38% quality and 99.99966% quality? When put into the context of translation, the impact to the bottom line is not minute at all.
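To make the comparison concrete, the sketch below scales both error rates to a project of a given size, treating each word as one opportunity for error, as in the example above. The 250,000-word project size is a hypothetical figure chosen for illustration.

```python
def expected_errors(word_count: int, errors_per_million: float) -> float:
    """Expected number of errors if every word is one opportunity."""
    return word_count * errors_per_million / 1_000_000


project_words = 250_000  # hypothetical project size

at_four_sigma = expected_errors(project_words, 6210)
at_six_sigma = expected_errors(project_words, 3.4)

print(at_four_sigma)                # 1552.5
print(round(at_six_sigma, 2))       # 0.85
print(round(at_four_sigma / at_six_sigma))  # ratio of roughly 1,800 to 1
```

Under these assumptions, the same project would contain more than 1,500 errors to find and fix at the Four Sigma level, and fewer than one at the Six Sigma level.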

Saving time is another consideration. Wouldn’t it be great to receive a translation from your service provider that didn’t need to go to in-country review or any other internal verification, but just went straight on the shelf for distribution? Think about the time saved and how quickly you could get localized products to market. Measured quality is the gateway to such a lofty goal.

In the translation industry, there is constant pressure to reduce cycle times to get products to market faster. Analyze how much of the time incurred during a translation project is on actual execution of work versus QC (inspection, detection and correction of errors). How much time could be saved on a schedule if QC steps were reduced and quality levels were known and quantified along the way? Add up the time associated with current QC steps and estimate how much sooner products could get to the market if those tasks could be scaled down. This alone could justify the investment in the pursuit of measured quality.

The need to meticulously meet global and regional regulatory requirements is a major factor in the costs and extensive timelines for many organizations, especially those translating medical device and pharmaceutical content. Mistakes in translations that compromise regulatory compliance put an organization at high risk. Risks can range from minor non-conformances during audits, requiring resubmission of product labeling for approval, to major non-conformances resulting in product recalls and even affecting patient/user safety. Language service providers with intimate knowledge of the regulatory landscape fully understand the risks involved and the need for accurate and consistent translations. Organizations should align regulatory compliance requirements with the quality specifications they provide to the language service provider. By compiling all requirements into the specifications document, the language service provider can ensure compliance in tandem with other quality requirements. Deploying measured quality can help regulated companies reduce risk and increase confidence that their multilingual content is compliant with global and regional regulations.

Lastly, there is driving business. Now more than ever, maximizing revenue and getting to market quickly are critical business goals. It is commonplace to have to “do more with less.” Not only can measured quality be used for quantifiable, continuous improvement of translations, it truly can help drive business by helping to reduce costs and timelines. It can help organizations free up critical resources so they can focus on other areas that directly impact the business. Unlike the typical QC-driven process, measured quality provides organizations with a predictable, repeatable outcome with their localized content, allowing the business to focus on critical goals.

Linguistic quality measurement systems

A handful of linguistic quality measurement systems have emerged in the last 10 to 15 years. The goal of these measurement systems is to quantify translation quality. These systems have many commonalities: categories of error types are defined (mistranslations, omissions, additions and more) and severity of the errors is defined.

The goal is to find a way to objectively measure the translation quality. Once an evaluation of a translation has been completed, the errors are aggregated, resulting in a quality score. Here are a few quality measurement standards that are used to measure linguistic quality.

SAE J2450: Originally released in 2001 by the Society of Automotive Engineers (SAE), J2450 is a statistical method used to classify translator errors for automotive translations. It has since been expanded to other industries. It is intended as a consistent standard against which the quality of translation can be objectively measured.

The standard has seven error categories, such as misspelling and omission, with two subcategories for severity (minor and major). A numerical weight is assigned to each of these error categories and subcategories. SAE does not dictate how the weighting should be done. Each company implementing J2450 must determine acceptable versus unacceptable translations, based on the ultimate purpose of the translated content. It does not include a way of measuring style errors, and so is not well suited for marketing materials, for instance. The focus on nonstylistic error classification does, however, make J2450 a viable system for measuring the quality of technical content.
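A weighted error score of this kind can be sketched in a few lines. The example below is an illustration only: it covers three of the seven categories, the weights are placeholder values of the sort a company might choose, and the acceptance threshold is hypothetical.

```python
# Placeholder weights per (category, severity); a company would set its own.
WEIGHTS = {
    ("wrong_term", "major"): 5, ("wrong_term", "minor"): 2,
    ("omission", "major"): 4, ("omission", "minor"): 2,
    ("misspelling", "major"): 3, ("misspelling", "minor"): 1,
}

# Hypothetical acceptance threshold: weighted points per source word.
MAX_SCORE = 0.005


def weighted_score(errors, word_count):
    """Sum the weights of all recorded errors and normalize by word count."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return sum(WEIGHTS[error] for error in errors) / word_count


# Errors recorded by an evaluator for a 2,000-word sample.
errors = [
    ("omission", "major"),
    ("misspelling", "minor"),
    ("misspelling", "minor"),
    ("wrong_term", "minor"),
]

score = weighted_score(errors, 2000)  # 8 weighted points over 2,000 words
print(score <= MAX_SCORE)             # True
```

Normalizing by word count lets scores from samples of different lengths be compared against the same threshold.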

LISA QA model: This model was developed and released by the now-defunct Localization Industry Standards Association (LISA). The principles of the quality standard are still widely used in the industry. Error categories are jointly defined by the language service provider and the customer. The categories also include subcategories of severity (minor, major and critical). Custom approaches for software, online help and documentation are included in the standard as well.

Canadian Language Quality Measurement System: The Canadian government’s Translation Bureau released the Canadian Language Quality Measurement System, which it uses to assess translation quality. Translation and language errors are captured, with distinctions made between major and minor errors.

When this system was initially released, errors from a 400-word passage were tallied, leading to a quality rating (quality level A, or superior, had zero major errors and no more than six minor errors). This has since been modified to zero tolerance of errors for final delivery of translations (from Translation Quality Assessment: An Argumentation-Centred Approach by Malcolm Williams).

METRiQ: Originally released in 2009 by ForeignExchange Translations, METRiQ is a statistical method used to classify linguistic errors for medical translations. It is intended as a consistent standard against which the quality of translation can be objectively measured, both during and at completion of a translation project. The system has four error categories, each with its own sub-categories. There are three categories of severity (critical, major and minor).

Errors are counted for a 3,000-word sample of text and entered into the METRiQ online system. Numerical weightings are assigned to each of these error categories and subcategories. These weightings are modified based on the point in the translation process at which the text is being evaluated (after initial translation, after edit, prior to final delivery) and depending on what type of project it is. If a first-pass translation is being evaluated, more minor errors are accepted than in an evaluation completed prior to final delivery. A medical marketing brochure would have higher weighting of stylistic errors than a technical manual. These weightings are set by the linguistic quality team, and can be customized per customer.
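The idea of weightings that shift with process stage and project type can be sketched as follows. This is not the METRiQ implementation: the category names, weights and adjustment factors are all hypothetical and serve only to show the mechanism.

```python
# Hypothetical base weight for each severity level.
SEVERITY_WEIGHT = {"critical": 10.0, "major": 5.0, "minor": 1.0}

# Hypothetical factors: minor errors count for less early in the process.
STAGE_MINOR_FACTOR = {
    "initial_translation": 0.25,
    "after_edit": 0.5,
    "final_delivery": 1.0,
}

# Hypothetical factors: style errors count for more in marketing content.
STYLE_FACTOR = {"marketing_brochure": 2.0, "technical_manual": 0.5}


def penalty(errors, stage, project_type):
    """Total weighted penalty for a list of (category, severity) errors."""
    total = 0.0
    for category, severity in errors:
        weight = SEVERITY_WEIGHT[severity]
        if severity == "minor":
            weight *= STAGE_MINOR_FACTOR[stage]
        if category == "style":
            weight *= STYLE_FACTOR[project_type]
        total += weight
    return total


errors = [("style", "minor"), ("accuracy", "major"), ("style", "minor")]

print(penalty(errors, "initial_translation", "technical_manual"))  # 5.25
print(penalty(errors, "final_delivery", "marketing_brochure"))     # 9.0
```

The same three errors are penalized far more heavily in a marketing brochure about to be delivered than in the first pass of a technical manual.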

Implementing measured quality

Creating effective and efficient linguistic quality management systems that integrate quality standards and automated QA tools may seem overwhelming. That alone is why many organizations often fail to implement such a process. How then can you approach creation of a quality measurement system? By taking a systematic, methodical approach and using the tools we’ve covered, it is an attainable goal.

Step one is to set your corporate quality objectives. This should be done at an enterprise level. The corporate quality mission should cascade down to programs, departments, projects and employees. Everyone in the organization has a stake in quality assurance. Aligning measured quality to the company’s overall quality objectives will help ensure buy-in from stakeholders. The quality objectives should be customer focused, measurable and aligned with the company vision. The linguistic strategy can then be defined to support these goals.

Step two is to define quality. As demonstrated earlier, there are varying definitions of quality. That is why it is so important for each organization to clearly articulate what quality means to them and formulate that definition into project requirements, specifications and acceptance criteria. This helps to minimize the subjectivity of quality assessments. To define quality, start by answering what quality means to you and your end users. Identify consistent themes, integrate regulatory and legal requirements, and formulate them into a single, consistent quality definition.

Companies should also partner with their language service provider (LSP) to ensure alignment between their own definition and procedures and the LSP’s. Confirm that the LSP is benchmarking to the same standards that you are. Ideally there should be no disagreement as to whether quality requirements have been met.

Step three involves analyzing your current quality levels. Rather than just assessing one recent project, it’s recommended to take a more comprehensive approach. Sample current or recently delivered projects against the new quality definition. Capture information about any non-conformances and issues with final project deliveries. Collect feedback from in-country reviewers on current and historical quality, and how it has trended over time. Some of this feedback might be subjective; however, consistent subjective feedback from many individuals could indicate a more systemic problem, which should not be discounted. Even subjective information can be developed into objective measurements.

You might also assess the quality of source content and support materials. If there are quality issues in the source language content, it increases the risk that those issues are perpetuated in the translations. High-quality source content facilitates the delivery of high-quality translations.

Assess the quality of linguistic assets such as translation glossaries, locale-specific style guides and translation memories. It is common for linguistic assets to degrade over time, especially before a measured quality system is in place. Linguistic assets are living, breathing entities and should be assessed and maintained accordingly. Document which tools are being used now and what measurements are currently performed.

The end result should be your baseline quality metrics, which will be used to measure improvements once the quality measurement system is in place.
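One simple form such a baseline could take is an error rate pooled across the sampled projects. The sketch below assumes each sample is recorded as an error count and a word count; the figures are hypothetical.

```python
def baseline_errors_per_1000(samples):
    """Pooled error rate across samples of (error_count, word_count)."""
    total_errors = sum(errors for errors, _ in samples)
    total_words = sum(words for _, words in samples)
    if total_words <= 0:
        raise ValueError("samples must contain at least one word")
    return total_errors * 1000 / total_words


# Hypothetical samples from three recently delivered projects.
samples = [(9, 3000), (5, 2000), (4, 1000)]

print(baseline_errors_per_1000(samples))  # 3.0
```

Pooling the counts weights each project by its size. Averaging the per-project rates instead would give small samples the same influence as large ones.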

Step four is creating or updating your linguistic support materials. If source language and translation glossaries and style guides are not already in place, create them. They are key inputs to the content development and localization life cycle. If these materials already exist, use the results from the quality assessment to update and revalidate them. Determine what improvements are needed to the source language content to drive efficiency in the translation process. Determine what modifications are needed to the specifications, style guides, translation memory, translation glossaries and reference materials to make them more effective.

Step five is to deploy the measured quality system. This step requires buy-in from all relevant team members to succeed. Ensure everyone shares the same understanding of the current state, as determined in the analysis stage. The stakeholders should all be aligned to the overall quality objectives. Only then can the team form an effective strategy.

Determine what measurement systems/standards will be used. Consider using an out-of-the-box standard, such as J2450, LISA QA or the others discussed previously. If none of the existing standards suit your specific needs, create a custom measurement system and use it consistently. You should clearly define:

Error types or categories

Error severities and what constitutes a given severity

What weightings will be applied to different error categories — may vary based on the type of project or content being translated

Resource requirements for those performing evaluations

When in the process evaluations should take place

Quality thresholds and acceptance criteria — may vary based on the type of project or content being translated
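The items in this list can be captured in a single specification object so that every evaluation applies the same rules. The following is a minimal sketch under assumed names and values; the class, its fields and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QualitySpec:
    """Hypothetical container for the items a custom system should define."""
    category_weights: dict[str, float]   # error types and their weightings
    severity_weights: dict[str, float]   # error severities
    evaluator_requirements: str          # who may perform evaluations
    evaluation_stages: tuple[str, ...]   # when evaluations take place
    max_penalty_per_1000_words: float    # quality threshold
    max_critical_errors: int = 0         # acceptance criterion

    def penalty_per_1000_words(self, errors, word_count):
        """Weighted penalty for (category, severity) errors, per 1,000 words."""
        if word_count <= 0:
            raise ValueError("word_count must be positive")
        raw = sum(
            self.category_weights[category] * self.severity_weights[severity]
            for category, severity in errors
        )
        return raw * 1000 / word_count

    def accepts(self, errors, word_count):
        """True if the sample meets both acceptance criteria."""
        criticals = sum(1 for _, severity in errors if severity == "critical")
        if criticals > self.max_critical_errors:
            return False
        score = self.penalty_per_1000_words(errors, word_count)
        return score <= self.max_penalty_per_1000_words


spec = QualitySpec(
    category_weights={"mistranslation": 2.0, "omission": 1.5,
                      "addition": 1.0, "terminology": 1.0},
    severity_weights={"critical": 10.0, "major": 5.0, "minor": 1.0},
    evaluator_requirements="qualified linguist with subject-matter experience",
    evaluation_stages=("after_edit", "final_delivery"),
    max_penalty_per_1000_words=5.0,
)

sample = [("omission", "minor"), ("terminology", "major")]
print(spec.penalty_per_1000_words(sample, 2000))         # 3.25
print(spec.accepts(sample, 2000))                        # True
print(spec.accepts([("mistranslation", "critical")], 2000))  # False
```

A separate specification could be created for each project or content type, since weightings and thresholds may differ between them.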

In addition, QA tools can be leveraged to automate the process. There are many effective tools in the industry that can be deployed out-of-the-box or customized to meet specific requirements. Partner with language service providers to understand how these tools can help. Remember, however, that the tools are only as effective as the underlying process and the people evaluating the work.

Managing the results

Once the measured quality system is deployed, what do you do with the data? This again is for each organization to decide. Diligently perform root cause analyses to truly understand the underlying issues. Use approaches such as the “five whys” to ensure the actual root cause is uncovered, not just the symptoms. Determine the actions that need to be taken based on the results. Are there issues with linguists? Are there issues with source content, support materials or linguistic assets? Is the process not optimal?

Set ongoing targets for quality metrics. Be realistic in your targets. Clearly a target of zero defects is the ultimate goal, but it might be hard to measure improvements if the target is zero defects and you consistently fall short.
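One way to keep targets realistic is to step down from the measured baseline instead of jumping straight to zero. The sketch below assumes a baseline expressed as errors per 1,000 words and a fixed percentage reduction per review period; both figures are hypothetical.

```python
def staged_targets(baseline, reduction_per_period, periods):
    """Targets that shrink by a fixed fraction of the previous target."""
    if not 0 < reduction_per_period < 1:
        raise ValueError("reduction_per_period must be between 0 and 1")
    return [
        baseline * (1 - reduction_per_period) ** n
        for n in range(1, periods + 1)
    ]


def met_target(measured, target):
    """True if the measured error rate is at or below the target."""
    return measured <= target


# Hypothetical baseline of 8 errors per 1,000 words, cut by 25% each period.
targets = staged_targets(baseline=8.0, reduction_per_period=0.25, periods=3)
print(targets)                      # [6.0, 4.5, 3.375]
print(met_target(5.2, targets[0]))  # True
print(met_target(5.2, targets[1]))  # False
```

Each period then has a target that can be met or missed, which makes progress visible even while zero defects remains the long-term goal.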

Adjust the tools or measurement system as necessary. Organizational dynamics, project types, products and deliverables may change over time. Ensure the quality system is flexible for a changing environment.

There are several misperceptions about implementing measured quality. First, that it is too expensive. Second, that it slows down the process and causes delays. Third, that to increase quality, you need to add process steps. What is actually needed is a paradigm shift that can change the way localization projects are managed. Despite misperceptions, measured quality, as a tool, can be used to increase quality and drive efficiency. In fact, the cost-time-quality trinity can coexist through smart implementation of a measured quality program. Measured quality allows for focused, surgical improvements rather than siege-style QC processes.

Defining and quantifying translation quality is the most effective answer to the quality challenges and inefficiencies common in the industry. It can help companies go from assuming quality is good to knowing quality is good.