Shaping Google’s voice in 60+ languages

By Svein Hermansen, March 3, 2014

As one of the most widely recognized brands on the web, Google is known for many things. One of them is scale. Everything Google does happens at huge scale, and it is a core principle of the company that everything we do needs to be scalable. Think about Gmail, with 9 gigabytes (and counting) of free e-mail storage for anyone who wants it. Or YouTube, with 100 hours of video uploaded every minute. Or, of course, Google Search, with 3 billion search queries a day.

There are a few other characteristics we share as a company. Automation enables that scalability; “launching and iterating” means we can get products out fast and then polish them based on user reception — and finally we want to do things that are reflective of Google’s values.

For localization at Google, this is all context that helps us fit into the rest of the company and meet Google’s needs. The way we localize needs to share these characteristics too. We need to work at the scale and pace of the rest of the company, and we have to do the products justice.

That can be intimidating given the sheer number and variety of products. We try to remember that Google’s mission as a company has never changed. It’s still to “organize the world’s information and make it universally accessible and useful.” We keep targeting the “universal” part in new ways, such as Project Loon, which delivers internet access to remote parts of the world by balloon. But how useful is access to all of Google’s technology if you can’t understand the interface?

That’s of course where localization comes in. More than three in four of Google’s users live outside the United States. Three in four internet users have a native language other than English. As language professionals, we wield the power of language to convey values. And our mandate as Google’s language specialists is to use the power of our respective languages to convey Google’s values.

Here’s a story about how much localization can matter on a micro-level: one of the most recent additions to the over 160 languages that Google Search is localized into is Myanmar. Bringing Google to Myanmar was a volunteer project bringing together Google and the Burmese community worldwide, and it met a host of challenges. But interestingly for us in localization, of all the obstacles, what was called out as one of the trickiest problems was how to translate the “I’m Feeling Lucky” button on the Google Search page. The whole community of volunteers got very involved, and after heated discussions and loads of suggestions — from “With hope” to “A blind chicken stumbles onto a pot of rice” — they settled for something that back-translates approximately into “Abracadabra.” This lends an air of fun and mystique and the expectation that something wonderful might happen when you use it — very much in line with the button’s intention.

It’s not every day that we get to make decisions like that. But this is a good example of the mindset we try to have about localization in general and about being truly local, globally. We’re responsible for getting Google products to our markets in good linguistic shape and in a form that’s appropriate. However, we also represent our markets back to Google centrally, and flag our needs in terms of language and beyond. This can range from pushing for support of Chinese morphology to suggesting product names that work internationally.

Shaping Google’s voice

Three things are essential to our effort to establish a voice across the globe. The first is how the Google voice is defined — and it is defined, for people who write the source text. This matters for translators too. The second is how we at Google think about the language specialist role — and the language owner role, which is a very direct way in which we meet our need to not just shape the voice, but do it at scale. As for the third, we have some concrete ways in which we try to overcome the subjectivity that anything related to style or “readability” is often said to have. It’s a common objection: “Style? Oh, it’s all a matter of taste.” We don’t think it is.

One illustration of why style isn’t just a matter of taste is the fact that there are concrete principles guiding how anyone who writes for Google — copywriters, user experience writers, tech writers — should go about making Google sound like Google. As language specialists, we make sure the principles that govern the source writing are used in translations — where applicable, and that’s certainly not everywhere. But we need to use the power of our own languages to convey the same values, because they don’t change.

One principle is that “it’s OK to wear a suit, but don’t sound like one,” meaning that we want to avoid business-speak in our marketing texts. You know, things like “circling back on action items in order to be more proactive on key deliverables.” However, not every language has the same traditions of business-speak. For example, in Norwegian there is no tradition of business- or management-speak. If you want to sound like a management consultant in Norwegian, you use English words. So it’s not relevant to tell translators that they shouldn’t “sound like a suit.” What we do have a tradition for in Norwegian is bureaucratic lingo — you know, a kind of light-legalese, local-government-form sort of language. A typical feature of this language in Norwegian is that it’s extremely noun-heavy, almost to the point where you won’t find concrete, active verbs. For unknown reasons, this language has become common in IT translation.

So it used to be that in Norwegian you would “perform a search”; “perform an analysis of”; and “perform a review of.” There are several problems with that, like using unnecessary words — three or four where one would do — and abstract, weak verbs that are meaningless on their own, such as perform, instead of concrete and meaningful ones like search, analyze and review. These are valid concerns in themselves for anyone who wants to write well, especially in a user interface where the focus is really on clarity. Imagine three buttons next to each other all starting with perform. Apart from that, the important thing is that this doesn’t sound like Google.

So the voice principles can translate into something slightly different and specific to a language or language family. Concerns about bureaucratic language are equally valid for Danish, as well as for German. One of the things Google’s German language specialists are very clear about is that they want to avoid so-called Beamtendeutsch or “bureaucracy German” in Google localization.

Even if not all of the principles are directly relevant or can even be translated, together they form a pretty coherent picture. As a whole they can be distilled to a few maxims that are really language-independent and which many of us use as a basis for our style guides:

We want to be user-friendly and helpful.

We want the language to “get out of the way” as much as possible and let our products speak for themselves.

We want to speak to users on their level and never talk down to people.

We want to insert some fun into our products.

The fun is usually for a reason. It can be to put a human face on technology when it goes haywire, like the “Aw, Snap!” message in a crashed Chrome tab (Figure 1), or it could be used to inform users about something that might be tedious without sounding paternalistic, such as Chrome’s reminder to “Be wary of people standing behind you!” when you’re using incognito mode (Figure 2). It might also be used to call attention to something truly important, like the heading for the license agreement for the Google Toolbar, which said “Please read this — it’s not the usual yada yada.” That heading led to unprecedented click-through rates for the policy document it linked to. And when Gmail changed its privacy policy some time later and didn’t use a vivid heading like this, the click-through rate was back down where it’s commonly found. So these things matter. The exact way to accomplish them through humor and other style effects will vary between different languages, but these principles are so integral to Google’s image that they really have to be taken into account.

Creating standards

So Google does have a voice, based on principles that are transferable. Another crucial element in how we shape the Google voice is how we think about the roles of language specialist and language owner. As much as we’re specialists, our role is really about quality management. Our core responsibility is the Google voice in a language, broken down into the standards that make up that voice. A “standard” in this case can be anything from a term in a product glossary to the broader decision that we won’t accept translations that make us sound like local government. Our mandate really boils down to those standards: defining them, together with our local stakeholders such as marketing; communicating them to everyone involved in localizing Google; making sure all our translations adhere to them; and finally iterating on them as needed, because they’re not static. User habits change, general terminology changes. Where we think it’s appropriate, we change with it.

The main reason the language specialist role is so focused on standards is one of those Google characteristics mentioned at the beginning, namely scale. We processed 350 million words of translation last year. Imagine a language specialist trying to review even just the top-priority translations. For every two hours she spent brushing up a marketing website for an important product launch, there would be literally a hundred other products she neglected. At least 10 to 15 of those are equally important and should therefore get the same priority. Couple that with the speed at which we work, and it’s very likely that a few of them also had important launches that week. The point is, over time we’ve realized that the only way we can really meet our mandate to establish the Google voice across all products, all the time, is by creating standards and then continuously working to implement, refine and update them. When we talk about standards and about quality management, for us the voice is an essential aspect.

But the number of products is only one dimension of scale. The other is the number of languages, and that’s where the language-owner part of the role comes in. There are 18 of us language specialists at the moment, but we cover 60+ languages. The key to that scalability is our vendor model. We work with many great people who are employed by our vendors and who serve as language leads for the languages where we don’t have in-house language specialists. In addition to having a native-speaker lead, each of those languages is “adopted” by a Google language specialist. For instance, our Thai language specialist is the language owner for other Southeast Asian languages. As language owners we’re responsible for quality in these languages, although of course we delegate the day-to-day linguistic work to the vendor lead linguists who actually speak them. But beyond delegating, we also guide. As language owners we have relationships with the regional marketing teams and pave the way for the linguistic discussions that need to happen. And as fellow language professionals we steer the style decisions the lead linguists make for their language, so that all 60+ languages convey the same values, each in a way that’s appropriate for the language and market.

Removing subjectivity

Finally, an essential part of how we try to shape the Google voice is how we try to counter the argument that style is a matter of taste: by creating guidelines that take as much subjectivity as possible out of the equation. Again, this comes down to standards. Our most important documentation for the style we want in each language is the style guide. This is our “translation” of the voice principles into instructions that first of all are relevant for each language and culture — like cautioning against bureaucratic language instead of business-speak — and, second, that make for clear and helpful guidelines that translators and reviewers can follow consistently. To a certain extent that’s simple language analysis. Why does bureaucratic language sound bad? It’s full of nouns and unnecessary words. It uses passive verbs. It uses unnecessarily long sentences. And so on. So each of these becomes something to avoid.

The point is that these are style questions, but they’re quantifiable and it’s possible to create guidelines around them. To tie this together with the previous two points, that’s what each of us does as language specialists: we define standards based on certain general principles like user-friendliness, which leads to clear guidelines.
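As a rough illustration of how such style questions can be turned into checkable rules, here is a minimal Python sketch. It is not Google’s tooling. The function names, the 25-word threshold and the “perform a ...” pattern are all assumptions made up for this example, loosely based on the noun-heavy “perform a search” phrasing described earlier.

```python
import re

# Hypothetical threshold and pattern, chosen only for illustration.
MAX_WORDS_PER_SENTENCE = 25
WEAK_VERB_PATTERN = re.compile(
    r"\bperform(?:s|ed|ing)?\s+(?:a|an|the)\s+(\w+)", re.IGNORECASE
)


def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def check_style(text):
    """Return a list of (rule, detail) findings for the given text."""
    findings = []
    for sentence in split_sentences(text):
        word_count = len(sentence.split())
        # Rule 1: flag sentences longer than the assumed limit.
        if word_count > MAX_WORDS_PER_SENTENCE:
            findings.append(("long-sentence", f"{word_count} words"))
        # Rule 2: flag "perform a/an/the <noun>" where a concrete verb would do.
        for match in WEAK_VERB_PATTERN.finditer(sentence):
            findings.append(("weak-verb", match.group(0)))
    return findings
```

Calling `check_style("Perform a search. Search the web.")` flags only the first sentence. A real style guide would need language-specific rules, and a reviewer would still have to judge each finding in context.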

Since we have this focus on style and on natural, idiomatic language, and we believe that we can create objective guidelines around them, it follows naturally that we also want to make this part of our quality benchmarks. That’s why we recently introduced Readability as a weighted error category in our translation quality evaluation. This sometimes seems to scare translators, since there appears to be an assumption that it gives reviewers a free pass to reject translations they happen not to like personally, with no further explanation. That’s not what we want, so we’ve been working hard to make sure the guidelines are as clear as possible and reviewers don’t make arbitrary corrections.
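To make the idea of a weighted error category concrete, here is a minimal Python sketch of one common way such a score can be computed: weighted error points per 1,000 reviewed words. The article does not describe Google’s actual categories, weights or formula, so every category name other than Readability, every weight and the formula itself are assumptions for illustration only.

```python
# All weights, and all category names except "readability", are hypothetical.
HYPOTHETICAL_WEIGHTS = {
    "accuracy": 3.0,
    "terminology": 2.0,
    "grammar": 1.0,
    "readability": 2.0,
}


def weighted_error_score(error_counts, word_count, weights=HYPOTHETICAL_WEIGHTS):
    """Return weighted error points per 1,000 reviewed words (lower is better)."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    # Each error contributes its category weight to the total.
    points = sum(
        weights[category] * count for category, count in error_counts.items()
    )
    return points * 1000 / word_count
```

With these made-up weights, a 1,000-word sample with one grammar error and two readability errors scores 5.0. Giving readability a nonzero weight is what makes style count in the benchmark alongside the other categories.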

We have a couple of very good reasons for introducing readability as a weighted error category in the first place. The first really reflects well on our vendors, in that the other error categories we track are looking great. But the flip-side of that is that there can be a certain discrepancy between the quality as measured by those metrics and the perceived quality by the end user or by the client — in this case us, or marketing and product teams. So the fact that all the metrics look great is double-edged. On the one hand, it means our vendors are great at living up to the expectations we set for them. On the other hand, it means we might not have captured the full range of our expectations in our measurement. It’s like we’ve been drawing ourselves a bath where we’ve optimized the water for all kinds of metrics reflecting its chemical composition — only to find out as we get in that it’s freezing. Of course you’d expect water to have a certain set of properties at room temperature, but beyond that, what you probably care most about if you’re having a bath is the temperature. Similarly, we expect professional translations to have the properties of correct grammar and spelling as per the rules of each language, but we need that right temperature, the style, and we’ve finally built a thermometer. To get back to the risk of individual reviewers starting to apply their own preferences — their own preferred temperature, as it were — our position is that if we define the principles of the Google voice in each language through standards then what should result is clear guidelines that leave as little room as possible for arbitrary corrections.

What it means for translators

So what does all this mean if you’re a translator working on Google material? First of all, it means that voice and style need to be taken into account. They’re not optional. And the style is probably different from that of traditional IT translation, which is only logical since Google as a company is different from traditional IT companies and we want to convey different values. IT translation in many languages appears to be plagued by a traditional acceptance of very literal translation, as if it were technical translation. However, modern, consumer-facing IT translation is anything but technical. This means that as a translator you need to “get” the style and to some extent identify with it. It’s like any discourse: to be able to write convincingly you need to be familiar with it and even enjoy it.

This leads to another obvious-sounding but crucial point. You can’t overestimate the need to really, fully understand the source. All translators know this in theory, of course. But real life is often different. A deep understanding is especially important because our discourse is centered around clarity and user-friendliness, and that means talking to all our users in a language they will understand. To be able to do that, as a translator you need to know exactly what you’re dealing with and how people commonly talk about it. It’s not enough to understand the source as an isolated text, because it isn’t one. You need to be familiar with the landscape it describes. You won’t get very far simply following the glossary if concepts like purchase funnel or long tail mean nothing to you. These are terms of the trade and are not specific to Google. We’ve seen “long-tail advertisers” translated literally into “advertisers with long tails,” which, if anything, sounds like cryptozoology.

Finally, one thing you’re able to do if you know the landscape and develop a healthy allergy to word-for-word translation is to be critical toward the source, as well as the glossaries and the style guide. There’s nothing better than being set straight by a translator who really knows her way around web analytics and webmaster tools, or maybe uses AdSense on her own blog, and who can show with usage examples that Google is using a term which at best makes people in this field snicker, or at worst makes no sense. This happens all the time, and we expect it. We don’t expect people to follow our guidelines blindly if they don’t make sense to them. Our German colleagues sum this up nicely in their standards: Style guide is queen, context is king and common sense is ace.

So Google has a reputation as a demanding client. We probably are. Many say that we require transcreation, but that’s the wrong term for what is simply true localization. True localization allows a translator to employ creativity, curiosity and talent. That’s an opportunity to help millions of people and do something meaningful, rather than just typing 300 or 400 words per hour. Maybe, like some of us, you came into this industry wondering where some of these conventions came from and why they were so far removed from real life and idiomatic language. Why products that normal people use every day speak to them in stilted and unidiomatic “IT language” when they could just use “language.” Why the language hasn’t evolved since the internet first started to get translated, when it was really still a world apart, even though it’s now part of everyday life and indistinguishable from it. Why we can’t just use our languages the way they were meant to be used instead of squeezing them into templates dictated by quirks of the English language. Chances are that if you did wonder all this, you’ll find our guidelines a relief.

In any case, they’re an integral part of what makes Google Google. The same values that underpin our mission — giving the world easy access to useful information — also govern the way we want and need to communicate. As language professionals, that’s our expertise. That’s the value we add. It’s up to us to realize that style requirements don’t need to be arbitrary or subjective if they’re based on standards that follow from certain basic principles. That’s how we help bring Google’s magic to the world and give people everywhere products that are not only useful, but that take them seriously wherever they are, and that now and again may put a smile on their faces.