Language technologies used in media translation

Angela Starkmann

Angela Starkmann started her career as a subtitle translator before she moved on to the regular localization industry, working as linguist, project manager and trainer. After the recent boom of streaming services she went back into her old line of work. Today, she advises translation companies and serves as AV consultant for memoQ translation technologies while at the same time moonlighting as a translator to understand what is going on in the industry.


Something grand is happening in the language industry. A new market has exploded, with new players, enormous growth, and many language combinations and subject matters. This industry already existed on a smaller scale before it was dragged to everybody’s attention with the arrival of the big streaming services. I am talking about audiovisual (AV) translation.

Netflix, for example, is changing the way they present languages to their audiences. This is very exciting. More and more original productions are emerging in languages other than English. Polish, Turkish and Arabic feature films and series are being translated into English and many other local languages. International productions with several spoken languages reflect an international lifestyle most of us linguists know from our personal living environment.

Apart from original productions for all the streaming services, a crazy number of specialized films from remote countries are available. Bollywood films, Korean teenage love stories or Japanese anime are ignored by most, but enthusiastically watched by their own particular audiences.

This reflects the phenomenon journalist Chris Anderson called the long tail over 15 years ago. With the emergence of the internet, he said in an October 2004 Wired article that “The future of entertainment is in the millions of niche markets at the shallow end of the bitstream.”

And this is exactly what we see today: a steadily growing need for translation in many languages, mainly to and from English (as the pivot language). AV translation has its own community of professionals, technologies and processes, and exchanges relatively little information with the traditional localization industry.

Translation tech and AV

The use of translation technology is not standard within the AV industry. Now is the time to change this. Many professionals, certainly in subtitling, think it is impossible to use translation technologies. Others don’t use them because they don’t know they exist, or because they have only a very general understanding of the available technologies and assume these would not help them in their work. I hope that my contribution will help to show that although AV translations are different from other translated texts, they can still be translated successfully with translation memory (TM) and machine translation (MT) technologies.

There is often a romantic image of the profession of a subtitler. “You get paid to watch movies” was one of the misleading claims you would sometimes hear. The best subtitle is one that the reader hardly even notices, and it is read at the same time the speaker is saying the words being translated.

AV translation is indeed very different from any other form of translation. Subtitling, for example, means the transfer from spoken into written language, in real time. In our professional work, we are only rarely confronted with texts that are subject to such a dynamic reception process.

Due to its particular nature, subtitling (along with dubbing) is bound to strict formatting rules. Reading speed is important. It is calculated in characters per second or words per minute (the latter might lead to funny subtitles in my native German, with its very long words and little space).
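As a rough illustration of the characters-per-second calculation, here is a minimal Python sketch. The function names, the choice to count spaces but not line breaks, and the limit of 17 characters per second are assumptions made for this example. Actual limits and counting rules come from the client’s style guide.

```python
# Assumed limit for illustration only; real limits are set by the client's style guide.
MAX_CPS = 17.0


def reading_speed_cps(text, start_s, end_s):
    """Return characters per second for one subtitle.

    Spaces and punctuation are counted; line breaks are not (an assumption).
    """
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("subtitle must end after it starts")
    visible = text.replace("\n", "")
    return len(visible) / duration


def is_too_fast(text, start_s, end_s, max_cps=MAX_CPS):
    """True if the subtitle exceeds the assumed reading-speed limit."""
    return reading_speed_cps(text, start_s, end_s) > max_cps


if __name__ == "__main__":
    # A two-line subtitle shown for two seconds.
    print(reading_speed_cps("Trust me, buddy,\nyou wanna take this call.", 10.0, 12.0))
```

A check like this can flag translations that have grown too long for their display time, which is where the translator then has to shorten or rephrase.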

Often, the word order in longer phrases needs to be changed over the course of several subtitles, in order to make the subtitles readable for their audience.

Subtitle translation in eight colors

The analysis in Table 1 shows the exceptional leverage translation technologies can bring for AV translation. Our example comes from the first subtitles of the first nine episodes of an Asian soap opera translated for Netflix. Each episode runs about 45 minutes and contains approximately 600 subtitles, or somewhere between 2,500 and 3,500 words.

In our example, every column is an episode and every individual box is a subtitle. Thus we see about 50 subtitles per episode, for a total of 450 subtitles. The source template has been created in English, while the source language is an Asian language. Working with pivot languages is common; subtitles are rarely created by the actual translator of a program. Most AV projects are translated into several languages, and thus the time-consuming template-creation process needs to be done only once.

I have color-coded every single box in order to demonstrate the possible leverage of the use of translation technologies (translation memory and machine translation) for similar subtitle projects.

The main criterion here was the structure of the subtitle text in the source language. I am interested in how simple or complicated the source material is, as this offers a good indication of the possibilities for reuse with TM and MT.

In order to estimate the usability of TM and MT technologies for subtitles, I have found the following categories to be useful:

Standard text

May not be changed by translator. This standard text is required by the client and recurring between projects. There is a standard translation available for all target languages, or the translation becomes the standard translation over the course of the translation.

Example: “A NETFLIX ORIGINAL SERIES,” “Previously on,” “Subtitle translation by.”

Proper names and product names

As long as terms are fixed, consistency is ensured. Term decisions later in the process can be made quickly and easily. Subtitles marked in green contain only names.

Example: Mom, Dad, the Doctor, Mr. and Mrs. Smith.

Very simple phrases and dialogues of four words or less

Automated MT pre-translation is usually very good but needs to be checked. Reading speed issues are usually minimal due to the short total length of the subtitle.

Example: “Yes.” “Good-bye.” “Okay.” “Wait!” “What is it, Susie?” “How is it?”

Table 1: The first subtitles for the first nine episodes of an Asian soap opera translated for Netflix, color-coded to show what could be leveraged using MT or TM.

Longer sentences

Human translators can use MT pre-translation, but they’ll need to adapt the word order and check formality and reading speed. Line breaks are indicated by [/], subtitle breaks by [//].

Example: There’s a satellite/call for you. // Trust me, buddy,/ you wanna take this call. (Titanic)

Longer sentences and sentences spread out over several subtitles

Decisions like formality need to be made by a human translator. Pre-translation is useable, but not final. Massive changes of word order might be required for a grammatically correct translation in the target language.

Example: Should this have/remained unseen // at the bottom of the/ocean for eternity… // …when we can see/and enjoy it now? (Titanic)

Repetitions or recap from previous episodes

Human translators are assisted by TM, as previous translations are offered within the system. Without a TM, this is currently a task done manually by translators and technical staff, and therefore rather time-consuming. Consistency can be achieved much more easily, even if reading speed issues might give different results between individual episodes.


Specialized terminology

Terminology and search functionalities within the translation environment improve consistency and overall quality. MT offers remarkable results for generally established terms.

Example: Medical (medical drama), culinary terms (cooking shows), military (history series), sports

“Triage is one of the most important tools a doctor has.” (Grey’s Anatomy)

Difficult or ambiguous source material

Translation needs to be witty and the original might be ambiguous. Human translators need to be very cautious and look out for mistakes with MT. Mistakes can distort meaning. This is particularly critical when colloquial language is translated using MT.

Example: Anything with particular or idiomatic language and style, such as comedy or reality shows.
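Some of these categories can be approximated mechanically from the source text alone. The following Python sketch is only a rough heuristic under stated assumptions: the standard phrases are taken from the examples above, the name list is hypothetical, and the rule for spotting sentences that run over several subtitles (a leading ellipsis or missing sentence-final punctuation) is my own simplification. Recaps, specialized terminology and ambiguous material are not covered, since they need a TM, a term base or human judgment.

```python
import re

# Standard phrases from the examples above; a real list would come from the client.
STANDARD_TEXT = {"A NETFLIX ORIGINAL SERIES", "Previously on", "Subtitle translation by"}
# Hypothetical name list for illustration.
KNOWN_NAMES = {"Mom", "Dad", "Susie"}

# A word is a run of letters, optionally joined by apostrophes or hyphens.
WORD = re.compile(r"[^\W\d_]+(?:[’'-][^\W\d_]+)*")


def classify(subtitle):
    """Assign one subtitle to a rough structural category."""
    text = " ".join(subtitle.split())  # merge the lines of the subtitle
    if text in STANDARD_TEXT:
        return "standard text"
    words = WORD.findall(text)
    if words and all(w in KNOWN_NAMES for w in words):
        return "names"
    if len(words) <= 4:
        return "short phrase"
    # Assumed cue: the sentence starts or continues in a neighboring subtitle.
    if text.startswith(("…", "...")) or not text.endswith((".", "!", "?", '"', "”")):
        return "spread over several subtitles"
    return "longer sentence"


if __name__ == "__main__":
    for sub in ["Yes.", "What is it, Susie?", "Should this have\nremained unseen"]:
        print(repr(sub), "->", classify(sub))
```

Counting the results per episode gives a first impression of how much of a project falls into the simple categories before any translation has started.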

Lots of leverage: Look for green and blue

If you look at our example again, you will see that everything in green and blue will offer the best leverage when TM and MT are used. It might come as a surprise how many simple words and phrases (in green and blue) are actually part of our sample translation. But remember: Subtitles are a written representation of spoken texts, and these texts are mostly words between people. And as people often speak in a rather simple way with each other, this is what we see in the dialogues we are going to translate for subtitling. This is where we should definitely consider using translation technologies for the best translation support.

This screenshot shows the original analysis of our nine-episode project before the start of the translation. Without a single translated word — with an empty TM — it already gives me 10% repetitions. We can expect that this will still improve after the first translated episodes.
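To make the idea of repetitions with an empty TM concrete, here is a minimal Python sketch. It assumes that a repetition is a segment whose text exactly matches an earlier segment after lowercasing and whitespace normalization, and that the rate is weighted by word count. Translation tools use their own segmentation and matching rules, so their figures will differ.

```python
def normalize(segment):
    """Lowercase and collapse whitespace so trivial differences do not hide a repetition."""
    return " ".join(segment.lower().split())


def repetition_rate(segments):
    """Share of source words that sit in segments already seen earlier in the project."""
    seen = set()
    total_words = 0
    repeated_words = 0
    for segment in segments:
        key = normalize(segment)
        n_words = len(key.split())
        total_words += n_words
        if key in seen:
            repeated_words += n_words
        else:
            seen.add(key)
    return repeated_words / total_words if total_words else 0.0


if __name__ == "__main__":
    subtitles = ["Yes.", "Wait!", "Yes.", "What is it, Susie?", "yes."]
    print(f"{repetition_rate(subtitles):.0%} repetitions")
```

Because only the source text is compared with itself, a figure like this is available before a single word has been translated.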

All that can be subtitled

Not all subtitled material consists of soap operas, and therefore might not be equally suited for reuse. It is important to understand that there might be different usage of language technology features that are particularly useful for the different projects.

During my career I often joked about how many different projects I get to work on. I would always use translation technologies whatever the source material was, as there could always be several reasons why working with them would be better than without.

As a subtitle translator, I’m no different. I’ve edited classic Hollywood feature films, but much more often translated late night shows, home improvement programs, esoteric workshops to improve the spectators’ lives and prepare for the arrival of aliens, sports documentaries, and many, many Asian soap series and anime. All this audiovisual content is very diverse, and it can be challenging but also wonderfully interesting to deal with all the different topics within one professional career.

Using translation technologies will always be useful, for all different sorts of projects. But have a good look at the source material first. This can help you to get the best leverage for this project.

Think of the last subtitled program you have watched. Not all texts in moving images are complicated and complex, and sometimes film dialogues can be simple or simply banal. Let’s have a look at a couple of typical programs, and see how translation technologies could facilitate the audiovisual translation:

The sitcom: 45 minutes of fun and laughter. Often situational comedy. Recurring names, dialogues and phrases, sometimes over the course of several seasons.

Challenge for subtitle translator: Be funny and engaging. Consistency between episodes or seasons. Everyday dialogues can (and should) always be translated the same way.

The thriller: Complicated stuff the local audience needs to understand as clearly as possible. Police and military terms need to be translated consistently. Simple language and many repetitions in action scenes.

Challenge for subtitle translator: Terminology needs to be taken into consideration. Flashbacks of scenes from the past must be translated absolutely the same as when they first occurred.

The sports program: Very particular sports terminology with no opportunity for a more detailed explanation. Often very fast, spontaneous speakers. Many repetitions and repeated phrases.

Challenge for subtitle translator: Very little available space. Abbreviations are needed everywhere. Units of measurement need to be converted to the local standard. There are many proper names (athletes, cities, competitions) that need to be localized appropriately and consistently.

The documentary: Can cover many different topics. Often a structured speaker at a moderate, even pace. Few repetitions.

Challenge for subtitle translator: Can be difficult and very specific. Neutral style. Machine translation helps with unknown terms. Abbreviation and simplification of language structures may be the main challenge.

The romantic comedy: Often all about dialogue between the characters. Subtitles might show personal development and drive the plot. It helps to understand the individual characters in the course of time.

Challenge for subtitle translator: Colloquial language (such as terms of endearment) needs to be reflected in the translation. Playfulness and humor are important for the special tone and charm.

The toughest nut: Reality shows with messy speakers and colloquial language. Many exclamations, maybe swearing or catch phrases. Local or colloquial language and very particular subject matters.

Challenge for the subtitle translator: Shortening (truncating) the translation. Characters speak very quickly. Finding suitable translations for flashy or funny phrases. Linguistic ambiguity is a great challenge for MT, but TM helps to stay consistent.

These are just a few examples of different programs that might need to be translated in today’s vast range of media. These jobs present us with linguistic challenges that we solve with our professional skills: our sense of language, humor, an extensive general knowledge and the great curiosity without which this work would be impossible. Our contribution as creative human translators with linguistic finesse will always be important.

Translation technologies can support our work, and we should definitely make use of this possibility. We should use these technologies to be more productive, work faster and leave the boring bits to the computer. Language technology can help us be more consistent, to type less and to automate everything that can be automated, so that we get a high-quality product as quickly as possible in this market under strong pressure.

Learning from related business cases

Translation technology is practically unknown in AV translation today. This has probably been cemented by the fact that AV translation professionals consider themselves creative people, who would not and should not work with computer support.

This is not a new phenomenon. Other translation areas were also very resistant when translation technology was first introduced to them. Béatrice Compagnon, a technology expert and experienced games localization specialist, recalls:

“When I started as a gaming translator many many moons ago, the standard of the industry was to send Excel sheets to the translators. They could use a CAT tool but nothing was connected or shared. Now imagine, translating an MMO can mean a volume from about 1.5 million words for the core game, representing roughly three months of work for a team of six (five translators, one proofreader).

The concept of working online with files shared in real time and a common glossary, as well as working directly in XML files, was mind-blowing.

But when this new standard was implemented, the change in speed and quality as well as in consistency was impressive. And yes, translators may have been reluctant at first but once trained in the new ways, not even one ever looked back.

I feel like the AV industry is at the same crucial turning point, where technology is needed and necessary but its implementation means a huge change in the usual workflows and changes are always met with resistance.”

Postscript: Ode to the translator

I am a translator myself. I know and love languages. And yet I am promoting the use of translation technologies for one of the last creative domains of our language industry in spite of the fact that many fellow linguists would prefer this industry to stay just the way it is.

I accept the fact that computer-supported translation might lead to more generic translations, at least for the rather bland film material we sometimes work with. We are not giving our talents away, but rather trying to tackle the challenges of our work with all the resources available: our own personal resources, but also our toolbox. AV translators are artisans rather than artists, working on the marble of language with a hammer and chisel and, if needed, an electric drill and belt sander. We use our brains, our hands and all the tools available to support our work in order to earn our living.

This requires a new generation of linguists with many talents, including in media translation. Men and women who love languages, and master them like nobody else. Versatile linguists who understand the cultural content they are dealing with every day, and are able to mediate between nations, cultures and settings. But also high-tech professionals willing and able to work in our fast-paced, ever-changing environment with all the possible means. Professionals who also understand “the machine” and are perfectly able to use it, in conjunction with their own linguistic skills, for the creation of new, creative pieces of art, in the AV translation business.

I have seen the AV business change once before. This was in the late 1990s, when digital subtitle technologies started to push into the market, competition increased, and subtitle prices fell by almost 50%. I subsequently abandoned the industry and started working in technical localization, only to return many years later.

The film industry has come a long way from the initial development of moving pictures at the end of the 19th century to today’s digitally enhanced movies. We understand that technology can be embraced as a part of the artistic process in film-making, and most cineastes would never argue that films like James Cameron’s Avatar and the Wachowskis’ The Matrix (among others), with their extensive use of new motion capture filming techniques, are any less works of art because of all the technology used in them. They are still the work of creative professionals, and we appreciate them for exactly that reason. I expect the same for media translation. The translated subtitle or dubbed film is a part of the cinematic work of skilled professionals, whether the translation has been supported by a computer or not.