Preparing image text for CAT tools

In translation projects with images and graphics that include text, good planning can mean the difference between an effective approach and days of back-and-forth. Here we will look at some tips for preparing image text, from the bare basics to a few advanced pointers. This should be of value to translators and project managers learning to work with graphically complex documents, and to graphic operators learning to prepare those documents for translation.

Translation projects have at least two sides. One is purely linguistic. The other is overall file management and desktop publishing (DTP), which, especially in more complex jobs such as magazines and manuals, can be a challenge that must be handled with the right mix of experience, technology and communication skills. More specifically, everyone involved (not only the managers) should have a general knowledge of the limits and abilities of the others on the team. In our graphic design company, we’ve used many systems to interact with linguists and editors while serving the translation industry for over a decade. For example, Kilgray’s ambitious and comprehensive memoQ makes it possible to set up everything about a project, including personnel, term bases and translation memories. The tool offers an interface between those two sides of the project: the words and everything around them, including images. The following recommendations come from our experience in multilingual DTP and are meant as a practical guide on how to deal with text that is embedded in images. In our experience, this is important to know not only for DTP specialists but also for translators, editors and project managers.

Let’s state upfront what the key to everything is: a well-planned workflow and preflight for each project. Preflight is the part of the overall process where the documents are prepared for translation. This means editing the source files to fix segmentation and looking for linked or embedded images that include text that must be translated. When we find images, charts, forms or any other elements with text that is not editable within the working source file, we need to make some well-informed decisions.

Computer-assisted translation (CAT) tools follow a basic input, translation, output structure. As with most software, they handle several kinds of files: they import text from a variety of file types and export it again, sometimes in a different file type. Each translation tool supports its own set of file types and handles them in its own way. As a rule, though, a CAT tool cannot process as editable text any strings of characters that sit inside files linked to or embedded in a document. For example, a PDF chart that is pasted or linked in an InDesign IDML document will not be editable or translatable once the document is imported into a CAT tool, even though both formats come from Adobe. Only the text set in InDesign text boxes will be. This “invisible” text in the images can sometimes amount to a vast number of words, in some cases perhaps more than half of all the text in a document.
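One way to gauge how much “invisible” text a document may hold is to list the files it links to before import. The following is a minimal sketch in Python, not a definitive tool. It assumes that an IDML file is a ZIP package whose spread files (under Spreads/ and MasterSpreads/) record each placed graphic in a Link element with a LinkResourceURI attribute. The function name list_linked_assets is our own invention for illustration, and graphics that were pasted rather than placed may not carry such a record.

```python
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import unquote


def list_linked_assets(idml_file):
    """Return sorted, de-duplicated link targets found in an IDML package.

    idml_file may be a path or a binary file-like object.
    """
    links = set()
    with zipfile.ZipFile(idml_file) as package:
        for name in package.namelist():
            # Assumption: placed graphics are described in the spread files.
            in_spreads = name.startswith(("Spreads/", "MasterSpreads/"))
            if not (in_spreads and name.endswith(".xml")):
                continue
            root = ET.fromstring(package.read(name))
            # Assumption: each placed file has a <Link LinkResourceURI="...">.
            for element in root.iter("Link"):
                uri = element.get("LinkResourceURI")
                if uri:
                    links.add(unquote(uri))  # decode %20 and similar escapes
    return sorted(links)
```

Each entry in the returned list is a file to open and inspect for text during preflight.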

So, say our main document has been converted to an exchange format (for example, our InDesign INDD file has been saved as IDML) and is ready to import into our CAT tool. The tool will not “see” the text in our charts and graphics, because they were made with other applications, such as other Adobe tools. This means we cannot translate the words in our project’s images, or even count them for an estimate. What we need now is to extract all that text and convert it into a format that our CAT tool can read, and there are different ways to do it.

The key idea in planning and preflight is to analyze each project in detail and choose the approach that gives the best quality in the shortest time with the least effort.


Typical options for text extraction

We can group the possible approaches under four options: exporting, extracting “out,” transcribing and extracting “in.” First, and ideally, you can save and export images in an editable format. This should be the first thing you try: can you convert the image as it is, with the text in it, into a format that’s among the import options of the CAT tool? For example, graphics made in Adobe Illustrator (AI or EPS files) can be exported in the SVG format, and then imported normally for translation in memoQ. As with all formats, check the segmentation carefully in preflight, and check that captions or other text in the image have not been vectorized (converted to curves). When SVG or other direct exchange formats work, they work like a charm, and all of your charts and graphics will come back from translation with their sizes and styles intact, almost ready to go. However, you will end up with as many files as there are images, which can be a lot. Some of these exchange formats can also be tricky or unstable because of obscure code subtleties, so export and translate one sample image first and see how it works.
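The check for vectorized text can be partly automated. The sketch below, offered as a rough aid rather than a finished tool, relies on the fact that live text in an SVG file sits in text elements, while text converted to curves survives only as path outlines. It counts both, plus an approximate number of words. The function name summarize_svg_text is hypothetical, and the word count is only an estimate, since export tools may split a single word across several tspan elements.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def summarize_svg_text(svg_source):
    """Count live text elements, their words, and outline paths in an SVG string."""
    root = ET.fromstring(svg_source)
    text_elements = list(root.iter(SVG_NS + "text"))
    words = 0
    for element in text_elements:
        # itertext() also collects the text inside nested <tspan> elements.
        words += len(" ".join(element.itertext()).split())
    paths = sum(1 for _ in root.iter(SVG_NS + "path"))
    return {"text_elements": len(text_elements), "words": words, "paths": paths}
```

An image that visibly contains captions but reports zero text elements and many paths has probably had its text converted to curves, and will need one of the other approaches.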

Extracting image text “out” into a Word file is perhaps the most traditional, foolproof technique. Open the image file and copy or retype all the text in it into a Microsoft Word file. Make sure to fix the segments. Also make the file “format rich” by using size, bold text and color, just enough to roughly mimic the source layout, so that it is easier for the DTP specialist to copy and paste the text into the translated image. One good thing about this technique is that you can extract all the text of several images across several documents into one single Word document. This is ideal when you have many images that repeat the same text or that need some basic formatting in the extraction document. However, this solution also creates extra documents in the project, and ultimately it relies on time-consuming copying and pasting, which makes it a fallback technique.

Transcribing is a process specific to memoQ. This industrial-level platform, recently upgraded by Kilgray, lets you import the images in a project, make them available to a DTP specialist, process them and export them back, all within the system. You can retype all the text in text entry windows inside the system and then prepare an image localization pack, a zipped package with all the images of a project. This integrates nearly every step, including the transcription, into the CAT system. It is ideal for posters, advertisements and images with short and simple text. Long term, with a team used to it, this can be a powerful approach. However, depending on who does the extraction, the DTP specialist may have to go back and forth between the translation and DTP systems, which can be slower overall in projects with many or complex images, such as technical manuals. The learning curve is also relatively steep.

Finally, you can extract image text “in,” that is, into the main document. This is a “lazy” approach that we started using, and with time it turned out to be the most efficient for certain jobs. Let’s say we have a magazine in INDD format with a number of graphics and charts that include text. In this technique, we open the image or graphics files and copy the text from the graphics into normal text boxes, which we create inside the main INDD document, usually on the same page and next to the image. Or we simply retype it. The idea is to move the text from the linked image files into boxes in the main document they are linked to. This approach is simple and fast: you get the main text and the extracted text together in one single file. It works best if the same person does the extraction and the formatting.

This sums up the approaches we know for preparing text from images for translation. Choosing the right technique for the technology, the characteristics of each project and the human resources available can save many hours of work and keep the most demanding clients happy, as specialized experience will allow you to solve many problems before they show themselves.