The dawn of a new linguistic epoch

Jay Marciano

Writer Jay Marciano has served as Lionbridge’s director of machine translation since 2010. He is also on the board of the Association for Machine Translation in the Americas (AMTA).

Time is a funny thing. It’s hard to believe five years have already elapsed since the promising early results of applying deep learning and artificial neural networks to machine translation (MT). Yet it’s equally hard to believe it’s only been that long — that a radically new method of approaching one of computer science’s oldest challenges, automated translation, could so quickly and completely usurp statistical machine translation.

Our company has been doing post-editing at scale since 2001, so we now have hundreds of millions of words that allow us to track trends in MT quality and in the number of changes that professional translators have to make to MT output to create quality translations. The data confirms what readers likely already know: carefully trained neural machine translation (NMT) produces draft translations that require substantially less editing.

That’s a very simple and straightforward statement, but one with potentially massive implications for our industry — and that’s just the first concrete and most obvious example of applying artificial intelligence (AI) to language service providers’ offerings.

Let’s pause there for a brief word on AI, the buzziest of all buzzwords. Whenever AI is mentioned, it’s important to establish what the term actually means. AI systems fall into three categories:

- Systems designed by people and trained with people-created data that perform a task assumed to require intelligence.

- Systems designed by people that can iteratively improve their own task performance without human intervention.

- Artificial general intelligence systems that were initially designed by people but now have both the ability to independently improve performance and autonomous decision-making abilities – such as changing tasks or self-improvement.

NMT, like most AI products in use today, falls squarely into the first category. This means that, in terms of analyzing mountains of data and applying those learnings to a task, AI is closer to the children’s book character Mary Anne, a steam shovel that “could dig as much in a day as a hundred men could dig in a week,” than to the humanity-attacking machines from bad science fiction.

That doesn’t mean, however, that NMT and current AI aren’t tremendous achievements. They are, and they bring with them challenges and opportunities we’re just beginning to come to terms with. But responding to change is nothing new in an industry as old as ours. Language service providers have witnessed and adapted to every major technological advance since translators moved from etching characters into clay tablets to writing them with ink on parchment. What is new is the rapidity of change that we’re currently witnessing. Think of the changes that a translator approaching retirement has seen over the course of that career, witnessing the digitization of nearly every aspect of the job. She might have started translating on a typewriter, then moved into a word processing program, then to specifically designed translation environments, having to go through the massive rethinking and payment changes that came with translation memory. Her stack of dictionaries may have been replaced by computer aids like spellcheck or other online resources. By the time our industry figures something out, it can seem like the technological landscape has already moved on. And today we see MT producing output that is quite often shockingly good.

Change is inevitable, but how we confront it is within each individual’s power. When I try to forecast what’s going to happen in our industry regarding NMT and AI, I find it instructive to look to history for similar events to see if there are lessons to be learned. Three very different historical scenarios come to mind, each illustrating ways in which technological change has had radical implications for those involved.

1. Typesetting

Upon its establishment in 1852, the National Typographical Union — later the International Typographical Union — became the first trade union in the United States. It was founded to protect and promote the tens of thousands of workers who manually set type for newspapers, magazines and book publishers. In its heyday, the union had more than 120,000 highly skilled members — people who had the education, skill and dexterity necessary to read a manuscript and render it in reverse with stunning speed and accuracy. But mechanization and automation radically reduced the need. The union disbanded in 1986, just about the time anyone with a laser printer could produce camera-ready copy. Today in the age of ebooks and online newspapers, the occupation no longer exists outside traditionalists who practice typesetting as an art.

2. Acting

When the first feature-length movies were produced more than 100 years ago, before filmmakers had the technological ability to add synchronized soundtracks, any actor who wanted to be successful had to adopt exaggerated pantomime to convey emotion. A scant twenty years later, life for a generation of silent-era actors changed when 1927 saw the launch of talkies, movies that had sound. Speech coaches were suddenly in great demand as enunciation, a skill silent-movie stars had never needed, became critical. Even stage actors of the day, who had never forsaken their oratory skills, had to modify their diction, because the vocal projection required in a theater was completely unnecessary on a sound stage. Technology-driven market changes demanded that actors expand their skillset by working on enunciation and the expressiveness of their voices. Among those silent-film stars who were able to adapt to voice acting are some of the biggest names of Hollywood: Greta Garbo, Joan Crawford and Carole Lombard. But many others, like Mary Pickford, Douglas Fairbanks and Charlie Chaplin, never became comfortable with voice acting and their careers suffered for it.

While it’s easy to focus on actors who were left behind, adding dialog and sound created far more jobs than it cost, including jobs that didn’t exist before, such as sound technician, boom operator, sound effects specialist and sound mixer. And of course the screenwriter’s job changed radically.

To give you an idea of how quickly talking pictures became the norm, consider this: at the first Academy Awards in 1929, the films nominated for Outstanding Picture were all silent. But the next year, every single nominated film had sound.

3. Field hockey

A German friend of mine has been playing field hockey competitively since the late 1970s. Over the course of his involvement, international field hockey has gone from being played on natural grass to artificial turf. On artificial turf, a field hockey ball rolls much faster and farther than on natural grass. It can be struck more cleanly and is subject to fewer unexpected hops and direction changes. The result is that today’s game is much faster and more precise, with an emphasis on long passes and terrifyingly hard shots. Winning on artificial turf requires modified skills (particularly the ability to trap speedy passes), new offensive and defensive strategies, greater levels of fitness and even changes in equipment, from shoes and sticks to considerably more padding for the goalkeeper.

So yes, the translation and localization game is changing quickly and it is up to each of us — whether we work independently or as part of a corporation — to ensure that we have the proper skills, equipment and strategies to play competitively and successfully.

Here are some of the changes to our game that I consider likely, presuming no further dramatic leap in the sophistication of machine translation systems:

- Translation memories will be used for 100% matches and for MT training material, but their importance for partial matches will diminish.

- Translation of consistently authored informational content will become a review process of substantially accurate and fluent machine-generated translations by professional translators with subject area expertise.

- Unsupervised MT (MT output that has not undergone human review) will be used much more broadly and confidently as a completely adequate solution for lower-risk content.

- Increasing volumes of MT output that has been carefully post-edited will lead to continuing incremental improvements in MT quality.

- Significant improvements in efficiency will lead to a rethinking of pricing away from per-word and toward pricing based on business value and the level of necessary human interaction.

- Big data analytics will allow new ways to analyze how well a translation performs for the customer, meaning effective resolution of a business need will be the measure of translation quality, which in turn will allow a more meaningful tracking of the ROI of a translation.

- Rethinking of pricing and clearer ROI models will lead to substantial increases in translation volume.

Some of these changes are already taking place, while others will require a considerable amount of time. In an industry as fragmented as ours, we can also be sure that these changes will not happen uniformly. As cyberpunk novelist (and hero of mine) William Gibson said in an interview in The Economist in 2001, “The future is already here, it’s just not evenly distributed.”

The effects of technological advances on the language services industry don’t start and end with NMT. The next epoch is already upon us. Our industry tells a much broader story and the continuation of that story represents massive opportunity. For millennia, we have been translating texts and interpreting speech so that they can be understood by people who speak different languages. For decades, we have been localizing products so that they are accessible to users around the globe. And now, in the age of AI, there is a vast and growing array of smart products that need linguistic intelligence, so that they can understand our speech and so that our wants and desires are accessible to them. Not unlike the introduction of speech to motion pictures, adding linguistic functionality to smart products has created new opportunities in the creation, curation and testing of the linguistic data sets from which these systems learn. This is a whole series of language services that didn’t exist a few short years ago because the demand for them didn’t yet exist. Even as we work to create ever more effective methods of translating texts, interpreting speech and localizing products, the next quantum leap is here. We are no longer just localizing products so that we can understand their interface but so they can understand ours. That’s a big part of what AI is and it’s a growing part of what the language services industry is.

So while our industry seems to be in a head-spinning state of change, it’s critical to remember that we have been reacting to change for decades and decades. What hasn’t changed — and what should never change — is our focus on fulfilling the multilingual needs of our customers. Those needs will continue to change as the world around them changes.