Community Lives: Interoperability based on Linport

It’s not uncommon these days to have projects involving translation into dozens of languages. And this is the age of containers, orchestration and microservice architectures. We could not deliver projects of this scale without all kinds of enabling technologies, but how do we ensure consistency?

Who among us doesn’t suffer headaches with software incompatibilities? The challenge of ensuring interoperability is well-known. Broadly, interoperability allows for the open exchange of information with standardized control of entire projects and their component parts. The openness that interoperability facilitates allows data to be exchanged between different applications and file formats without restrictions. Recently my work with Translation Commons has brought me to a much fuller appreciation of what interoperability involves.

A cornerstone of Translation Commons is the Technology Think Tank (TTT), whose main task is to investigate, evaluate and review issues concerning the use of IT in our industry. Emerging standards and quantum leaps in the sophistication of the tools we use are easing interoperability issues thanks to the sustained efforts of remarkable people.

Alan Melby, professor emeritus of linguistics and computer linguistics at Brigham Young University, has been a pioneer in his field and his contributions are second to none. His research interests and activities are numerous, but he has consistently worked on both human and machine translation technologies and standards. He likes to say that his research focuses on translation, training and testing, but he is equally at home discussing quality, linguistics or philosophy of language.

It all started when he was 15 years old. Tired of his mother always asking him to turn off the lights when he left a room, Melby began building a sensor that would count people entering and leaving a room and automatically switch off the lights when the room was empty. He went on to win first the school science fair, then the state fair, and got to compete at the national level. Soon after, Melby spent a summer studying in France. A defining moment for him came years later when he returned home and read an article that mentioned machine translation (MT). With his passions for French and technology combined, he went on to BYU and, while still a student, helped start an MT project at the Translation Sciences Institute. As a professor at BYU and an American Translators Association (ATA) certified translator, Melby got involved in standards when the company behind a terminology system that many were depending on went out of business and he realized the need for permanent solutions. That was the beginning of TermBase eXchange (TBX).

Melby’s prominence resulted in his 2011 contribution to the Language Interoperability Portfolio Project, better known as Linport. Linport is a collaborative project developing an open, vendor-independent format that can be used by many different translation tools to package translation materials. Linport makes use of containers. In general computing terms, a container packages an application’s code, configuration and dependencies into an isolated, easy-to-use building block. Linport applies the same packaging idea to translation materials, using a combination of two types of container: portfolios and packages. The former “contains” all the elements in a multilingual project; the latter caters for the needs of a single bilingual task within that project.
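The relationship between a portfolio and its packages can be pictured as a small data model. What follows is only an illustrative sketch in Python: the class names, the fields and the `add_task` helper are assumptions made for this example and are not taken from the Linport or TIPP specifications.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Package:
    """One bilingual task (hypothetical fields, not the TIPP schema)."""
    task: str
    source_lang: str
    target_lang: str
    files: List[str] = field(default_factory=list)


@dataclass
class Portfolio:
    """A whole multilingual project (hypothetical fields)."""
    name: str
    source_lang: str
    packages: List[Package] = field(default_factory=list)

    def add_task(self, task: str, target_lang: str, files: List[str]) -> Package:
        # Each task for one language pair becomes its own package.
        pkg = Package(task, self.source_lang, target_lang, list(files))
        self.packages.append(pkg)
        return pkg

    def target_languages(self) -> List[str]:
        # A portfolio spans every target language found in its packages.
        return sorted({p.target_lang for p in self.packages})


portfolio = Portfolio("user-manual", "en")
portfolio.add_task("translate", "fr", ["manual.xlf"])
portfolio.add_task("translate", "de", ["manual.xlf"])
portfolio.add_task("review", "fr", ["manual.xlf"])
print(portfolio.target_languages())
```

In this sketch one portfolio ends up holding three packages across two target languages, which mirrors the idea that a single project fans out into many bilingual tasks.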

Linport emerged from collaboration between The Container Project, the Multilingual Electronic Dossier (MED) project at the European Commission’s Directorate-General for Translation (DGT) and Interoperability Now!, the group behind TIPP. However, this collaborative effort was brought to a halt when the DGT considered requiring MED to be implemented in all translator tools responding to the next call for tender, and Linport was forced into abeyance. But thanks to the efforts of Melby and others, this was not the end of the story, because the requirements that brought Linport about in the first place remained unfulfilled. The first of these requirements may be summarized as simplifying and standardizing the translation process. The second was the provision of an open standard that all users could implement and support. Quite simply, openness facilitates sharing. If we are to meet the complex language needs of a global community, then we need the technology to support that. With the DGT’s backing on hold, Linport was down, but far from out.

Reviving Linport

Openness is not just a feature of the language community; it is now a feature of global culture, and this is prominently seen in technology with the ubiquity of open source licenses. Graphics, applications and entire network topologies are now open source. The web could not exist without it. Nor, for that matter, could technology like MT. In 1995, Professor Mikel Forcada found his place in the language community when he was asked to teach a course on language and computing at the University of Alicante. Intrigued by then-current MT applications, he reverse-engineered them and built his own application, now known to us as Apertium. Leading the thriving Apertium community has brought him into contact with a rich mix of talents and given him first-hand experience of creating an easy and open platform that makes users’ results reproducible when offering machine translation services. Developing plug-ins for other tools, such as the computer-aided translation (CAT) tool OmegaT, has deepened his commitment to adhering to standards while ensuring that the software works seamlessly and interoperably. Forcada is fully committed to open source development, evangelizing on the benefit of avoiding vendor lock-in.

Following a recent reversal of position by the DGT, which now encourages further work on Linport, Melby suggested a partnership between Translation Commons and LTAC to revive the project. As a consequence, a separate initiative was started with the goal of renewing the effort to bring Linport to the language community.

Others joined the initiative separately. At LocWorld in Montreal last year, Andrzej Zydroń, chief technical officer of XTM International, delivered an intriguing presentation on standards. My curiosity was piqued.

Among his considerable accomplishments, Zydroń was involved with Melby at the 2011 inception of Linport. He too was disappointed that the project ended up in mothballs, but kept faith that, as an ideal solution to continuing needs in our industry, it would bounce back. I was, therefore, doubly delighted when I had the opportunity to let him know about the partnership with LTAC.

One aspect of Zydroń’s work illustrates some of the key issues that make interoperability a tough nut to crack. When he tells us “standards don’t always mean interoperability,” that seemingly innocuous phrase homes in directly on the workings of our increasingly complex localization supply chain and implies the need for translation management systems and CAT tools to function transparently. This means dispensing with the dreaded implementation issues and arcane working methods that have given us all headaches as we try to manage projects in today’s pressure-cooker business world. Let’s remind ourselves that Linport specifies two kinds of containers: a package container for describing one bilingual task and a portfolio container for describing an entire translation project, which can include many languages, many tasks, many transactions and thus many packages. The Linport project adopted the TIPP format by Interoperability Now! as its package format, and that format is currently implemented by tools such as memoQ and XTM. The TIPP format allows a highly dynamic set of tools and components to coexist happily in the workflow of an entire project. In other words, interoperability is enabled.

A fourth member of the Linport revival team is Marc Mittag, whom many LocWorld attendees will know for bringing yoga principles to the world of business. He too has forged a career developing software for the language industry and is well known for his passionate advocacy of open source. In particular, he has been involved in integrating Translate5, OpenTM2, Moses and Okapi to form a community-based free software translation system on the web. This experience has given Mittag insight into the issues of application interoperability. Through his company MittagQI, his work in creating a web-centric development environment has given him a well-grounded knowledge of language community tasks, one that supports theory with practical solutions. Helping developers and end users to be on the same page allows a much more efficient and successful use of resources to facilitate multilingual communication.

Mittag is a driving force in bringing TIPP to its present state of existence. TIPP is first and foremost a standard way to represent certain types of data for exchange between tools. Other types of data involved in a translation supply chain may also need standardized representations, as well as well-defined rules for how the data should be interpreted and handled.

The Interoperability Now! working group has developed TIPP to work best in conjunction with other standardized formats like XLIFF, the XML-based format that was created to standardize the way content is passed between tools and is now a common format for CAT tools. TIPP is now being updated by the newly created Linport working group. The group has also joined the new initiative TAPICC, which will consider adopting TIPP 2.0 as the payload container and object model for a payload application programming interface, followed by Linport 2.0 in 2018.
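To give a feel for the kind of content XLIFF carries between tools, here is a minimal sketch that reads source and target segments from a small XLIFF 1.2 document using Python’s standard library. The sample text and the `read_units` helper are invented for illustration; files produced by real CAT tools contain far more metadata, and this is not how any particular tool reads them.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the XLIFF 1.2 standard.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Invented sample: two segments, the second not yet translated.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="hello.txt" source-language="en" target-language="fr"
        datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Hello</source>
        <target>Bonjour</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Goodbye</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_units(xliff_text):
    """Return (id, source, target) tuples; target is None if missing."""
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.findall(".//x:trans-unit", XLIFF_NS):
        source = unit.findtext("x:source", namespaces=XLIFF_NS)
        target = unit.findtext("x:target", namespaces=XLIFF_NS)
        units.append((unit.get("id"), source, target))
    return units


units = read_units(SAMPLE)
for unit_id, source, target in units:
    print(unit_id, source, "->", target or "(untranslated)")
```

Because every tool can rely on the same element names and namespace, a segment translated in one application can be picked up by another without conversion, which is the point of pairing a package format with XLIFF.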

Looking ahead

While proprietary software developers can claim to provide translation and localization tools, these do not allow for universal interoperability and tie users into their own set of practices. Given the growing maturity and sophistication of our industry’s supply chain, it seems counterproductive not to lead the way with a means of providing tools openly and interoperably. What remains is for the general language technology sector to adopt both TIPP 2.0 and Linport 2.0. Certainly open source projects such as Translate5 and Apertium, as well as existing proprietary supporters such as XTM and memoQ, will be quick to adopt them.

Computers are logical machines, but we keep asking them to do tasks outside their set logic. Language is a case in point. If there is logic in the way humans use language, it is of a quite different nature to the binary logic in our computers. However, we have some extraordinarily gifted members in the language industry and we are succeeding in assisting business and numerous cultural interests to communicate on their own terms.