Seven tasks for audiovisual training

Continuous delivery is one of the Holy Grails of the modern audiovisual localization industry. The tide of content is rising, swelled daily by new video and game offerings from all over the world. Producers are thinking globally as social networks and the internet enable them to launch multicountry projects at lower cost than ever.

New technologies and metadata applications will enable most social network users to create their own content in a visually engaging and professional manner, and even to create inexpensive TV channels of their own based on their social network communities. As more and more people become empowered users, demand for the translation of newly generated audiovisual content will grow exponentially.

To improve quality control for its shows, audiovisual giant Netflix said it took a “Hollywood meets Silicon Valley” approach. Still, there is more Hollywood human wizardry than Silicon Valley tech in the large-scale continuous delivery of audiovisual productions and audiovisual translations. It is based on teamwork. Teams include translators, editors, subtitlers, voice artists and dubbing directors. All of them become coherent elements of one collective authorship in a foreign language release. Translators do not stand alone as people who merely “translate” words, sentences and other verbal expressions. They are not just mediators; they create new productions for target languages — especially in the case of lip sync dubbing.

In terms of monetization, the key issue is continuous quality of audiovisual translation that will match the continuous quality of productions. As the Netflix tech blog pointed out, “it’s nearly impossible for Netflix to maintain a standard…to ensure constant quality at a reliability and scale we need to support our constant international growth.”

This may be a sign that vendors are not paying translators enough to get acceptable quality. Netflix works directly with translators in some cases. Though this model requires extensive project management work from Netflix staff, it should in principle let the company control the quality of the translations it commissions.

Does compiling a database of 10,000 translators mean that a language service provider, or even Netflix, has a team ready to take on an audiovisual localization project? It is only the beginning, of course.

Major audiovisual localization and gaming companies have rigorous translator selection procedures in place; there is anecdotal evidence that currently only 1-2% of all applicants are qualified enough to perform localization work. This is largely how Netflix had to operate while trying to drive improvement and quality in the last few years before the launch of the production platform HERMES.

At the same time, having a trained and coherent team ensures a community or a company a quick rise to commercial success. In a recent article in The New York Times, Glenn Kenny quotes Colin Decker, chief operating officer of Crunchyroll, describing the history of the anime site, which now has more than 20 million viewers. “Official subtitling was a little more assembly line,” Decker said. “Cranked out. And fans said accuracy matters. The medium drew them in to strive to understand the foreign culture, which raised the bar on the quality of the translation.”

But at a certain point the drive to create localization teams meets its match. There is tension between Hollywood and Silicon Valley in the audiovisual localization equation. There is still a popular misconception among tech-savvy users that at some point in the future machine translation (MT) and automation will greatly reduce the need for human translation of audiovisual materials.

However, the demands of viewer-oriented transcreation and the ever-changing combinations of visuals, music and environment that accompany words cannot easily be tailored to MT. Audiovisual projects can’t be broken down into small pieces that are easy to automate, because creativity is a must at almost every localization stage. Excessively strict standards and creativity do not go together well. Attempts to describe and regulate everything related to subtitle translation alone would result in Yellow Pages-sized style guides. It is also important that, no matter how large, an audiovisual production is a coherent whole from its inception, and must be localized as an entity and not as a bunch of “do-it-yourself” sets that a gamer will fit together after the fact.

The backbone of any system of continuous delivery of audiovisual materials is not just some software platform. It is the ability of a localizing team to expand rapidly and work together as a collective author at any given moment. It is the key responsibility of an audiovisual localization services vendor to ensure the integrity of this collective authorship. Expanding, working together and ensuring continuous delivery of quality materials are tasks that require the same inseparable combination of corporate development platforms. There are four cornerstones of any system of continuous delivery in audiovisual localization.

1. A common production platform: online/offline subtitling software, studio standards, game translation testing protocols.

2. A common fiscal and logistical management platform: tracking of deliverables, deadlines, billing.

3. A common talent and output evaluation platform: mostly for the uniformity of talent selection — audiovisual translation proficiency tests, game world knowledge tests, voice sampling.

4. A common education and training platform: universities, corporate trainings/coaches, online and offline translator training.

There are plenty of solutions for entries 1-3 in various formats. The most successful are Sfera (operated by Deluxe), OOONA, DotSub and HERMES. But the issue of a common education and training platform is often overlooked. 

Such a system should deal with fluctuations in demand for audiovisual services by ensuring a supply of new skilled personnel who share the same knowledge background, and by retraining personnel for other tasks, such as audio description, in lean periods. Another important point is that if training and motivation are properly organized, a team shares the same corporate ideology and is tightly knit. This enables a company to get through short and sometimes medium-term periods of tight cash flow during channel or project launches, or during periods of exponential growth in a customer’s activities in a certain market.

Flexible training systems successfully handle topic fluctuations. Many content providers offer channel bundles of specialized programs on niche topics, with, say, cooking, kitesurfing, car racing, horse breeding, angling and UFO studies packaged together. The composition of these bundles varies greatly and is a major workflow challenge unless there is a system that allows a vendor to retrain personnel.

Standards of delivery also fluctuate, so there is a need for constant training and retraining in this respect as well. Practical strategies for teaching audiovisual translation can be broken into seven specific educational tasks, an approach that has worked for the RuFilms School of Audiovisual Translation in Russia. Currently the school has partnerships with 11 Russian and two foreign universities as well as three major players in the international localization market.

Task 1: Deal with audiovisual environments

Translators deal with audiovisual linguistic and semantic environments. These environments are pseudo-oral (a key feature of audiovisual speech as defined by various researchers worldwide); multimodal (combining visual and verbal sign systems); and centered on emotions, not just hard facts and meanings. Communication in them is oriented toward the target audience, constrained by extralinguistic factors, and structured around coherent plots and gaming experiences.

The features of these environments required cooperation with several Russian universities and a course curriculum that bundled together disciplines such as film studies, cognitive psychology and linguistics, with the emphasis on the parameters of pseudo-orality in phonetics/spelling, grammar, syntax and vocabulary.

Task 2: Ensure flexibility

The courses are taught in both online and offline formats. The only difference is the intensity of the practical training, as some of the tools used for it (such as adjusted respeaking and voice recognition) can be applied offline only. Course duration can range from a week to a year. Courses are modified to fit the need to translate various genres and subjects.

Formats are designed to fit various customer tasks and various types of trainee groups. The most effective format so far has been a three-month course, although for some tasks one-week courses proved to be quite successful (such as teaching how to translate songs and rhymes).

Task 3: Mix solid theory and practice

The comprehensive theoretical curriculum is complemented by several innovative tools that make practice engaging and entertaining and help to introduce elements of gamification. These tools include interlingual respeaking, with certain protocol adjustments to suit the needs of mastering audiovisual translation in general and constrained translation in particular; virtual reality demonstrations of the videos to be translated; glossary management and plot tracking software (Grammatica 4.0); and more down-to-earth professional subtitling programs (EZTitles 5).

Task 4: Make training motivating and interesting

The process of training an audiovisual translator is intensive and painstaking, so we thought the transition into the profession would be easier if the process were made more motivating and engaging. Motivation is achieved through gamification built on instant output evaluation, which relies on a proprietary set of parameters and on respeaking tools. We also thought that virtual reality demonstrations of certain pieces to be translated could increase students’ immersion in the training process. Trainees find it entertaining, but the actual impact is yet to be measured.

Task 5: Build team spirit

Our approach aims to raise awareness of audiovisual translation among translator and customer communities. This is accomplished, among other things, by promoting specialists in the field at various industry events. Team spirit forms the foundation for a “special forces” kind of psychological framework: team members are in the trenches together, doing elite-level work.

Task 6: Provide opportunity for quick output

This task is addressed by consistently applying slightly modified respeaking technologies to the audiovisual translation training process. Respeaking is real-time subtitling in a mode that is very similar to simultaneous translation. The respeaker re-utters everything that is being said in the same language as the speaker, or translates it into another language (so-called “translation respeaking”). The respeaker’s speech is processed by voice recognition software that transforms it into written subtitles. Our approach also requires the translator to comply with the constraints of audiovisual translation.

When we say this is “real time” subtitling, we mean that within six to ten seconds after a sentence is uttered by the speaker, it can be seen as a subtitle on the screen. Punctuation is added by the respeaker during the process, and the professional respeaker also adjusts what is heard to create an easily perceived audiovisual discourse. The output can be evaluated instantly by both coach and trainee and then discussed.
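As a rough illustration of how such instant evaluation might check the "real time" requirement, here is a minimal Python sketch. It assumes a training session logs, for each subtitle, when the sentence was uttered and when the subtitle appeared. The `Cue` record, its field names and the `late_cues` helper are hypothetical and do not describe any particular respeaking tool; only the ten-second ceiling comes from the window mentioned above.

```python
from dataclasses import dataclass

# Upper end of the six-to-ten-second window cited in the text.
MAX_DELAY_S = 10.0


@dataclass
class Cue:
    """One respoken subtitle (hypothetical log record)."""
    text: str
    uttered_at: float    # seconds into the session when the sentence was spoken
    displayed_at: float  # seconds into the session when the subtitle appeared


def delay(cue: Cue) -> float:
    """Time between the utterance and the subtitle appearing on screen."""
    return cue.displayed_at - cue.uttered_at


def late_cues(cues: list[Cue], max_delay: float = MAX_DELAY_S) -> list[Cue]:
    """Return the cues that appeared later than the allowed delay."""
    return [cue for cue in cues if delay(cue) > max_delay]


session = [
    Cue("Good evening.", uttered_at=0.0, displayed_at=7.5),
    Cue("Here is the news.", uttered_at=12.0, displayed_at=24.0),
]
for cue in late_cues(session):
    print(f"late by {delay(cue) - MAX_DELAY_S:.1f}s: {cue.text}")
```

A coach and trainee could review the flagged cues together right after the exercise.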

The respeaker may encounter a number of challenges. Regulation of voice volume and pitch, good pronunciation and short pauses are needed to ensure good human-machine interaction. Mistakes can occur due to limitations of the software, especially in the case of homonyms, homophones, unknown words and so on. In interlingual respeaking, there is also the transfer between different languages and cultures to consider.

Task 7: Develop a system for constructive feedback

Our integrated approach is based on a set of quality evaluation parameters: pseudo-orality metrics, degree of constrained translation, dynamic equivalence metrics, plot coherence metrics, target audience metrics, and linguistic and structural correctness. This set is introduced at the earliest phases of training and helps students keep the subjective components that are usually present in any evaluation of transcreative translation to a minimum.
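To make the idea of a shared rubric concrete, the following Python sketch shows one way a feedback sheet built on these parameters could be scored. The parameter names follow the list above. The 0-5 scale, the equal weighting and the function names are assumptions made purely for illustration, since the actual scoring scheme is proprietary and not described here.

```python
# Parameter names follow the article; the 0-5 scale and the equal
# weighting are illustrative assumptions, not the school's actual scheme.
PARAMETERS = (
    "pseudo_orality",
    "constrained_translation",
    "dynamic_equivalence",
    "plot_coherence",
    "target_audience",
    "linguistic_structural_correctness",
)
MAX_SCORE = 5


def overall_score(scores: dict[str, float]) -> float:
    """Average the per-parameter scores after checking they are complete."""
    missing = [name for name in PARAMETERS if name not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    for name in PARAMETERS:
        if not 0 <= scores[name] <= MAX_SCORE:
            raise ValueError(f"{name} must be between 0 and {MAX_SCORE}")
    return sum(scores[name] for name in PARAMETERS) / len(PARAMETERS)


def weakest_parameter(scores: dict[str, float]) -> str:
    """Name the lowest-scoring parameter, i.e. where feedback should focus."""
    return min(PARAMETERS, key=lambda name: scores[name])


sheet = {name: 4 for name in PARAMETERS}
sheet["plot_coherence"] = 1
print(overall_score(sheet), weakest_parameter(sheet))
```

Scoring every parameter separately, rather than giving one overall impression, is what keeps the discussion between coach and trainee anchored to named criteria.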