translate5: A new approach to translation review

Client review is one of those perennial issues in the translation world that scores of project managers scratch their heads about, seldom coming up with a truly elegant solution.

This is not for lack of trying, but client-side resources frequently do not have expertise in translation environments, nor can this be reasonably expected of them. How, then, can this crucial element in the quality cycle be realized? What is needed is a tool that enables client interaction at the project level directly in the bilingual format where it is most efficient for service providers, yet is intuitive enough that nonexpert users will be willing and able to quickly and successfully adopt it.

It is precisely this interaction point that the creators of translate5 are seeking to reach. translate5 is an open-source project whose initial development was sponsored by three German organizations (beo, [itl] and DFKI), and its website is currently available for testing with a range of sample material. Additional sponsors are being actively sought to move the project forward. Much of the project description should be available in English by the time this article appears, and the source code is available for download from GitHub. translate5 can be expected to gain more prominence on the tools landscape in the coming years, as it has been chosen as the front-end user interaction medium for QTLaunchpad, an initiative funded by the European Commission that seeks to improve translation quality throughout Europe. Because the application is open-source, organizations can tailor it specifically to their own needs as opposed to remaining beholden to a tool manufacturer’s vision of what such a tool should offer.

In short, translate5 is a browser-based review environment that supports a segment-by-segment view of work-in-progress translation projects. It offers a palette of sorting plus cascading filter options as well as ease of navigation through the document set. Clicking on a document in the navigation pane on the left causes the application to immediately scroll to the first segment in that document. There are two fundamental views, editing mode and ergonomic mode, and it is easy to toggle between them. Editing mode (Figure 1) shows substantially more information columns by default than ergonomic mode does; in ergonomic mode, only the columns that are most relevant to editing are displayed by default. Additional columns, for example the comment column, can be switched on and off (Figure 2) as desired.

Position in the document set is shown in a hierarchical display on the left-hand side of the window, and the far right-hand side features a metadata area that houses review and rating sections. These two side sections can be collapsed at the click of an icon to provide more onscreen viewing space for the editable content. In between the two side panels is a cell matrix in which the source and target texts appear. The source and target cells each have a respective companion cell in which edited content may be entered. There are a number of other columns that can be displayed or not, depending on the individual user’s preferences and needs. These columns are used to record and display information at any given stage in the editing process; a particular display configuration may be appropriate at one stage, but not at others.

Additionally, the metadata sections serve as repositories for categorization of error types on a segment or subsegment basis. For example, the reviewer can highlight a segment or subsegment in an editing cell, and then choose the error description from the provided pick lists, as seen in Figure 3.

translate5 has been architected with flexibility in mind. The various metadata pick list items are freely configurable as part of the individual project creation so that each project can have its own dedicated rating system. Providing users with predefined pick lists not only speeds editors’ work, it forces consistency in notation of error types. This becomes important later when the error results are aggregated for reporting. Additional flexibility is provided by allowance of error overlapping. For example, a phrase might be marked for grammar and a word within that phrase marked separately for terminology usage. The segment display also presents tags in two views: consolidated and expanded (Figure 4). Toggling between the two views is done by clicking a button in the menu bar.
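To make the idea of overlapping error marks concrete, here is a minimal Python sketch. It is not translate5's actual data model; the ErrorMark class, the marks_at helper and the sample pick list are all hypothetical names chosen for illustration. Each mark covers a character range of a segment and carries one category from the project's pick list, so a word can sit inside a phrase-level mark and still carry its own.

```python
from dataclasses import dataclass

# Hypothetical per-project pick list; in the article this is configured at project creation.
PICK_LIST = {"grammar", "terminology", "style"}


@dataclass(frozen=True)
class ErrorMark:
    start: int     # index of the first character covered (inclusive)
    end: int       # index one past the last character covered (exclusive)
    category: str  # must be an entry of the pick list


def make_mark(text, fragment, category):
    """Mark the first occurrence of fragment in text with a pick-list category."""
    if category not in PICK_LIST:
        raise ValueError(f"unknown category: {category}")
    start = text.index(fragment)  # raises ValueError if the fragment is absent
    return ErrorMark(start, start + len(fragment), category)


def marks_at(marks, position):
    """Return the sorted categories of all marks that cover a character position."""
    return sorted(m.category for m in marks if m.start <= position < m.end)


text = "Press the knob of starting firmly"
marks = [
    make_mark(text, "the knob of starting", "grammar"),  # phrase-level mark
    make_mark(text, "knob", "terminology"),              # word inside that phrase
]

on_word = marks_at(marks, text.index("knob"))        # covered by both marks
on_phrase = marks_at(marks, text.index("starting"))  # covered by the phrase mark only
```

Restricting the category to a fixed set is what enforces consistent notation; free-text labels would make later aggregation unreliable.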

translate5 offers editors the capability to work on both target and source segments, although the project manager must decide at import time whether or not source text should be editable. In practice, this provides the opportunity for organizations to close a gap in the quality feedback loop that is often present in translation project flow. Translators make excellent editors of source text because they diligently read and seek to understand the concepts or instructions being expressed. It is not at all uncommon for translators to trip over ambiguity or incorrect statements, and then to escalate queries up through the project management infrastructure. Frequently, linguistic solutions to such queries are implemented in the target language without the source content being updated accordingly. Giving editors access to source segments as well as target segments allows translation units to be more easily cleaned, resulting in more robust translation memory.

The value proposition of translate5 transcends editing of source and target segments. Real value is added by functionality that aggregates the metadata, providing it in report format. This adds a layer of quantitative analysis to the editing process. Decision-makers in the linguistic supply chain can objectively isolate recurring error types and then initiate more precisely targeted remedial action in response. Noteworthy again, as can be seen in Figure 5, is that the aggregation functionality pertains not only to the target language, but to the source content as well.
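The kind of aggregation described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not translate5's reporting code: the aggregate function and the sample records are hypothetical, and each record is assumed to be a pair of the field that was marked ("source" or "target") and the error category chosen from the pick list.

```python
from collections import Counter


def aggregate(records):
    """Count error categories separately for source and target content.

    records: iterable of (field, category) pairs, field being "source" or "target".
    """
    report = {"source": Counter(), "target": Counter()}
    for field, category in records:
        report[field][category] += 1
    return report


# Hypothetical review results collected across a project.
records = [
    ("target", "terminology"),
    ("target", "grammar"),
    ("target", "terminology"),
    ("source", "ambiguity"),
]

report = aggregate(records)
most_common_target = report["target"].most_common(1)[0]
```

A decision-maker would read the counts per category to see which error types recur, on the source side as well as the target side.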

This coverage of translate5 captures the application in the very early stages of what will hopefully be a long and useful product life cycle. For this reason, and because translate5 is not positioned as a commercial product but rather as an ongoing open-source initiative intended to serve the language industry at large, criticism is perhaps less important than contribution to a feature wish list. To begin with, the current functionality is based exclusively on the SDLXLIFF file format, which is known to not entirely conform to published industry standards. While it cannot be denied that the decision to begin with this file format as opposed to a more generic flavor of XLIFF is surely pragmatic, the decision could be perceived as lacking a certain element of idealism. So, first and foremost on the wish list could be expansion of the range of supported file formats. Such expansion will, of course, be driven by funding which, in turn, will be driven by demand. But perhaps some of the tool providers who vocally support the concept of common standards could feel themselves compelled to step up to the plate through provision of funds or resources.

Also conspicuously missing in this early version is a user-friendly capability to export or print reports of the quality management statistics, although they may be exported from the project overview in XML format. Some expansion of this capability would be nice to see. From a product philosophy standpoint, the developers have striven to keep all functionality that is not strictly related to the editing process out of the editing interface. They stress the point that reporting of statistics is generally a task that occurs after editing is finished. But one could potentially argue that users might want to have the immediate gratification of printing out a report directly from the chart display, as is a common practice with spreadsheet programs. Not having the functionality here seems to be somewhat counterintuitive. Supporting the other side of the discussion, though, is the fact that the data can be extracted through an API call, so if translate5 is embedded as a component into a larger quality management system, printing functionality might be better situated there in any case.

An additional concern might be whether the content display paradigm will find adoption among certain critical segments of the intended user base. Professional linguists are, through regular exposure to translation environments, familiar with the side-by-side, segment-based display of content. Yet one cannot help but wonder whether “outsider” users such as marketing executives or regional sales professionals will find such a display constraining to their perception of context and thus be resistant to adoption. The fact that translate5 is browser-based certainly helps remove some obstacles to adoption. Reviewers need not install or learn complex or expensive applications. Nevertheless, if untrained users cannot immediately feel comfortable accomplishing their assigned tasks, the whole concept might fall flat.

If we may indulge in dreaming a bit about an ideal world, an optimal solution would be to have translate5 attached to some sort of document rendering application, thus offering users the capability of proofing in the context of the final layout while the segment-based view runs in the background. Then, if they have a change to make, clicking on a segment in the layout would bring the segment-by-segment, quality management and editing view into the foreground. For now, though, this will have to remain, as the Germans like to say, Zukunftsmusik (future music). In the meantime, translate5 is already incorporated in production for several large German customers of beo and [itl], and adoption has reportedly been positive.

Counterbalancing the aforementioned reservations is a robust architecture under the hood. The application is designed for performance over the web. Various programming techniques, such as background loading of contiguous blocks of segments while any given block is being viewed, facilitate a smooth scrolling experience. That being said, the developers do recommend optimizing the user experience by employing certain browsers known for processing JavaScript code faster than others.
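The background loading technique mentioned above can be illustrated with a short Python sketch. It shows the general idea only and makes no claim about how translate5 implements it in the browser; the BlockCache class and the fake_fetch function are hypothetical. Whenever a block of segments is requested, the neighboring blocks are scheduled on worker threads so that they are likely to be ready by the time the user scrolls to them.

```python
from concurrent.futures import ThreadPoolExecutor


class BlockCache:
    """Serve blocks of segments and prefetch the adjacent blocks in the background."""

    def __init__(self, fetch_block, block_count, workers=2):
        self._fetch = fetch_block    # callable: block index -> list of segments
        self._count = block_count
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._futures = {}           # block index -> Future, one fetch per block

    def _schedule(self, index):
        # Ignore indexes outside the document and blocks that are already scheduled.
        if 0 <= index < self._count and index not in self._futures:
            self._futures[index] = self._pool.submit(self._fetch, index)

    def get(self, index):
        if not 0 <= index < self._count:
            raise IndexError(index)
        self._schedule(index)
        self._schedule(index - 1)    # block above the viewport
        self._schedule(index + 1)    # block below the viewport
        return self._futures[index].result()  # waits only for the requested block

    def close(self):
        self._pool.shutdown(wait=True)


calls = []


def fake_fetch(index):
    """Stand-in for a server request; returns three segments per block."""
    calls.append(index)
    return [f"segment {index * 3 + k}" for k in range(3)]


cache = BlockCache(fake_fetch, block_count=10)
current = cache.get(2)
cache.close()
```

The call to get(2) blocks only until block 2 has arrived, while blocks 1 and 3 load in parallel; a later get(3) would find its data already requested.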

It does not matter how many individual files are in a project. In the editing area of the user interface, translate5 presents all of the segments in the project in one flowing stream as if they originated from one large file. The folder structure of large imported file sets is represented in the tree display in the left-hand user interface panel. An editor can freely navigate between folders and files represented in the tree display, and the segment display jumps to the currently marked file. If the order of files and folders does not suit the user’s purposes, it may be altered through simple drag and drop within the tree display.
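One simple way to model this single flowing stream is to concatenate the segments of all files in display order and to remember where each file begins. The Python sketch below assumes exactly that; the build_stream function, the file paths and the sample segments are hypothetical and not taken from translate5.

```python
def build_stream(files):
    """Flatten files into one segment stream.

    files: list of (path, segments) pairs in display order.
    Returns the stream and a mapping from path to the index of its first segment.
    """
    stream = []
    first_index = {}
    for path, segments in files:
        first_index[path] = len(stream)  # position where this file starts
        stream.extend(segments)
    return stream, first_index


# Hypothetical project with files from two folders.
project = [
    ("manual/intro.sdlxliff", ["Welcome", "Safety notes"]),
    ("manual/setup.sdlxliff", ["Unpack the device", "Connect the cable", "Switch on"]),
    ("ui/strings.sdlxliff", ["OK", "Cancel"]),
]

stream, first_index = build_stream(project)
jump_to = first_index["ui/strings.sdlxliff"]  # where the display would scroll to
```

In this model, reordering files in the tree amounts to reordering the input list and rebuilding the stream and its index.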

translate5 also offers terminology import via the TBX format. Imported terms are highlighted in the segment display. The technology used for this is the XliffTermTagger, the code for which is also available as open source. In addition to identifying exact term matches, the XliffTermTagger supports stemming, fuzzy matching and capitalization differences. The term display in the metadata panel also differentiates between preferred term usage, allowed terms and forbidden terms.
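As a rough illustration of what matching beyond exact hits can look like, the following Python sketch tags terms while ignoring capitalization and tolerating small spelling differences. It does not reproduce the XliffTermTagger's algorithm, which is not described here: the tag_terms function, the term list and the 0.85 similarity threshold are assumptions, fuzzy matching is approximated with the standard library's difflib, and stemming is left out entirely.

```python
import re
from difflib import SequenceMatcher

# Hypothetical term list with a usage status per term, as might be imported from TBX.
TERMS = {
    "hard disk": "preferred",
    "hard drive": "allowed",
    "winchester": "forbidden",
}


def tag_terms(text, terms, threshold=0.85):
    """Return (found text, term, status, match kind) for each term hit in text."""
    words = re.findall(r"\w+", text.lower())  # lowercasing ignores capitalization
    hits = []
    for term, status in terms.items():
        n = len(term.split())
        for i in range(len(words) - n + 1):
            candidate = " ".join(words[i:i + n])
            if candidate == term:
                kind = "exact"
            elif SequenceMatcher(None, candidate, term).ratio() >= threshold:
                kind = "fuzzy"  # close enough, e.g. a plural form
            else:
                continue
            hits.append((candidate, term, status, kind))
    return hits


hits = tag_terms("Replace the Hard Disks, not the Winchester.", TERMS)
```

Here "Hard Disks" is reported as a fuzzy hit for the preferred term and "Winchester" as an exact hit for a forbidden one, which is the kind of distinction a reviewer would want surfaced in the metadata panel.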

Editing efficiency for repetitions is generously supported by a dedicated repetition editor that displays both the source and target segments plus information about whether the source segment is repeated, the target segment is repeated or both. translate5 can also identify and treat repetitions in cases where the target segment is repeated, but not the source. Insertion of repetitions can be executed automatically or manually.
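The three cases named above (source repeated, target repeated, or both) can be told apart by counting how often each source and each target text occurs. The Python sketch below is a minimal illustration of that idea; the classify_repetitions function and the sample segments are hypothetical, and exact string equality is assumed as the repetition criterion.

```python
from collections import Counter


def classify_repetitions(segments):
    """Label each (source, target) pair as 'both', 'source', 'target' or 'none'."""
    source_counts = Counter(source for source, _ in segments)
    target_counts = Counter(target for _, target in segments)
    labels = []
    for source, target in segments:
        source_repeated = source_counts[source] > 1
        target_repeated = target_counts[target] > 1
        if source_repeated and target_repeated:
            labels.append("both")
        elif source_repeated:
            labels.append("source")
        elif target_repeated:
            labels.append("target")
        else:
            labels.append("none")
    return labels


# Hypothetical English to German segments.
segments = [
    ("Press OK.", "OK drücken."),
    ("Press OK.", "OK drücken."),
    ("Click OK.", "Auf OK klicken."),
    ("Select OK.", "Auf OK klicken."),
    ("Close the lid.", "Deckel schließen."),
    ("Close the lid.", "Den Deckel schließen."),
]

labels = classify_repetitions(segments)
```

The third and fourth segments show the case where only the target is repeated, and the last two show a repeated source with diverging translations, which is typically what a reviewer wants flagged.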

Also worthy of note is that metadata about changes made by editors in the translate5 environment may be transported back into the Trados Studio change history, thus preserving the overall integrity of document evolution in the core translation environment. Also, if translators or editors use a pivot language in between the original source and target language, for example going from Russian to Japanese via English, the pivot language segments may be displayed in a column of their own to aid the editing process.

translate5 is an application based on a foundation of open-source idealism. It serves a very real, everyday need in the translation process without requiring language service providers to invest in expensive software tools that present a proprietary product philosophy. Interested parties with coding expertise may freely download the source code and add their own features or tune translate5 more exactly to their individual requirements. The creators of the tool are hopeful that other sponsors will buy into the value proposition that translate5 represents and come forward to support this initiative with financial contributions or development resources.