iADAATPA: A single access point to MT and CAT connectors

iADAATPA may sound as unpronounceable as it looks at first sight; there is no question about it. However, the aims of this European Commission-backed project could not be clearer: to build a platform that will act as a single connection point to many technologies and machine translation (MT) vendors. iADAATPA started as a deployment effort for European public administrations involving three countries (Spain, Ireland and Latvia) and four MT developers (Pangeanic and Prompsit from Spain, KantanMT from Ireland and Tilde from Latvia).

These four language technology leaders joined forces in a joint proposal during late 2016 together with consulting company Everis and Dublin City University (DCU). They involved national authorities and public administrations that volunteered to be early adopters of MT for public administrations in the European Union (EU). Everis and DCU each brought a different and fresh approach to machine translation deployment. As a global consulting company that is part of the NTT DATA group, Everis provided the knowledge and understanding of public administration needs, while DCU could build on extensive experience in quality estimation, engine ranking and automatic language detection, so that the most appropriate engine from a translation vendor could be chosen for a particular job. This idea could potentially be the inception of a marketplace where MT vendors and users could meet and quickly plug in to the right service, language combination and domain. They could even switch vendors dynamically. There would be no need to build connector after connector for every computer-aided translation tool or every MT provider. A single connection point to iADAATPA would suffice. And this connection would be very secure, meeting EU security standards if required. European authorities liked the idea.

SESIAD from Spain was quick to understand and join the effort. SESIAD is the State Secretariat for the Information Society and Digital Agenda in Spain. The National Language Plan has a budget of €90M/$100M for language technologies. One of the agency’s mandates is to organize language assets and promote language technologies in the Spanish language and co-official languages in Spain such as Basque, Catalan and Galician. Other public administrations soon followed. By the time this article is published, the translation departments working for the Parliament of Lithuania and the Irish Parliament will have joined the initiative. Their translation teams will benefit from custom engines built for purpose as part of the project, which will need to beat the output of the EU’s eTranslation service.

The benefits of iADAATPA

If you are a public administration in the EU or any associate country, the benefits of plugging into iADAATPA are obvious. You can enjoy a very secure and reliable connection to free machine translation via the connection to CEF’s eTranslation service. Promoting the use of machine translation in public administrations is one of the project’s objectives, and it was central to the proposal.

However, there are many cases in which eTranslation will not suffice because it is very specialized in certain matters (mostly legal matters, its training data coming from the European institutions it serves). Or perhaps the user wants to run a private, on-premise version of the platform. In that case, the user may want to have its own engines developed by a preferred vendor. This has already been envisaged in the architecture, so a version of iADAATPA will be able to run privately and independently, with a single supplier or many suppliers, and from the user’s own infrastructure. This can be the case with intelligence services, with private companies where privacy is paramount, such as banks and financial institutions, or with company headquarters and international organizations where no outside calls can be made due to the sensitive nature of the documentation they handle.

As an EU project, the license for a version of iADAATPA will be open source, as well as its “Access Points” (plug-ins) for synchronous translation (such as on-the-fly translation of web pages and requests from computer assisted translation tools) and asynchronous translation (full document translation). iADAATPA cannot be sold because it is an EU project and open sourcing the results is a precondition.

The fact that the platform and its major components will be open source may spur the creation and adoption of fully managed MT platforms, just as the release of Moses kick-started a pro-MT adoption movement. Thus, developers will be able to deploy different instances of the platform and charge for that work (as happened with Moses, for instance). iADAATPA itself is specifically developed for public administrations to be linked to eTranslation.

The marketplace for MT services

One of the main attractions, as noted above, is that the license for a version of iADAATPA will be open source. Consequently, it will facilitate the European Digital Single Market. The central objective of the iADAATPA proposal is to lower language barriers to fully implementing multilingual language services across member states. This can be achieved by deploying the iADAATPA platform, which connects MT providers to content publishers. A central version of iADAATPA is the first step toward the creation of a marketplace for MT suppliers to provide in-domain, high-quality multilingual services, but this is not restricted to European companies. Because of its “single connector” philosophy, any MT vendor can connect to it and offer its engines for specific domains and “hard language combinations” where Western developers do not excel.

For example, many assume that the majority of MT engines have English as one of the languages in the combination, but there is a huge demand for translation (and MT) for combinations such as German-French. Cross-border trade makes Russian-Chinese and Russian-Japanese popular combinations. Any combination of Spanish, French and Arabic is common for translation companies in those markets, as is Spanish-Portuguese in Latin America. And let’s not forget the huge language needs of the Indian subcontinent and Southeast Asia. There are six official languages in the United Nations, and English figures in only five of the fifteen possible pairings. Many UN agencies have daily translation work from Chinese, Russian or Arabic into French or Spanish, for example, without pivoting through English.

In short, it will be possible for a niche Japanese MT developer to plug into the platform and offer, for example, Japanese-Chinese-Korean-English MT services. By adhering to the e-SENS AS4 profile standard, other MT providers can also plug in their specialized in-domain engines. They will be registered and deployed within the iADAATPA platform.
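
As a rough illustration of the single-connector idea, the sketch below models a registry in which vendors list their engines by language pair and domain, and a client queries one access point instead of one connector per vendor. Every name in it (Engine, EngineRegistry, the vendor identifiers and domain labels) is hypothetical and not taken from the iADAATPA code base. The real platform exchanges messages over e-SENS AS4, which this sketch does not attempt to model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    """A hypothetical MT engine listing; the fields are illustrative only."""
    vendor: str
    source: str   # source language code, e.g. "ja"
    target: str   # target language code, e.g. "zh"
    domain: str   # e.g. "general", "patents"


class EngineRegistry:
    """Toy single access point: vendors register engines, clients look them up."""

    def __init__(self):
        self._engines = []

    def register(self, engine):
        self._engines.append(engine)

    def find(self, source, target, domain=None):
        """Return engines for a language pair, optionally filtered by domain."""
        return [
            e for e in self._engines
            if e.source == source and e.target == target
            and (domain is None or e.domain == domain)
        ]


# Hypothetical vendors registering engines with the platform.
registry = EngineRegistry()
registry.register(Engine("vendor-a", "ja", "zh", "general"))
registry.register(Engine("vendor-a", "ja", "ko", "general"))
registry.register(Engine("vendor-b", "ja", "zh", "patents"))

# A client asks the single access point for Japanese-Chinese engines.
matches = registry.find("ja", "zh")
print([e.vendor for e in matches])
```

A client that needs a specialized engine would pass a domain as well, for instance registry.find("ja", "zh", "patents"), and would not need to know which vendor sits behind the result.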

Lastly, as the access to all engines and vendors happens through a single access point, this will reduce the complexity and costs of integrating high-quality MT engines. For public administrations in the EU and associate member states, this will reduce costs and open new opportunities to become more multilingual societies and provide more information to European citizens. For the many users, it means the possibility of running iADAATPA instances and broadening the usage of automated translation services.

How will iADAATPA work?

iADAATPA will include a “domain adaptation” routing service that chooses the most appropriate available engine for a given language pair request, as well as a control panel/dashboard with full statistics.

iADAATPA will utilize existing secure solutions such as Domibus to guarantee a standardized message exchange protocol for interoperable, secure and reliable data exchange. Conformance tests with e-SENS AS4 profiles will guarantee an open technical specification for the secure and payload-agnostic exchange of translation data using web services between the use cases that are part of the project and future implementations. Thus, by being listed as a conformant solution, iADAATPA’s architecture ensures the ability to deploy existing and future automated translation technologies quickly and securely. Partners can exploit this as a commercial service for custom implementations at different levels of public bodies and administrations (or commercially at large). It will also facilitate language accessibility by offering customized, onsite or securely transmitted high-quality automated translation that can be plugged into public services, administrations, security or health bodies and so on, or by offering free-to-use MT through the existing eTranslation service.

To get things started, iADAATPA includes the provision of domain-specific customized MT engines to complement the right of European institutions to use eTranslation for free. The partners’ domain-specific engines will be made available to the national authorities involved in the project with the aim of providing more accurate automated translations for particular domains or language pairs.

“Domain detection” is a key element of the platform. This is an automatic routing service that enables translation requests to be routed to the most appropriate available MT engine for their domain and language pair. This way, users experience the best possible MT service to match their specific needs, with user-defined granularity: at peak times a user might want to use several vendors for a single document. But most importantly, domain detection can recognize that a request for proposal document is made up of sections detailing the financial proposal of the tender and sections that are heavily civil engineering. No single engine will cover such specialized areas, and in these cases iADAATPA will switch engines automatically at paragraph, page or line level if required.
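
To make the paragraph-level switching concrete, here is a minimal sketch of how a tender document might be split and each paragraph assigned to an engine. It assumes a naive keyword count as the domain detector, which is only a stand-in: the article does not describe the detection method the platform actually uses. The keyword lists, domain labels and engine identifiers are all hypothetical.

```python
import re

# Hypothetical keyword lists; a real detector would likely be a trained classifier.
DOMAIN_KEYWORDS = {
    "finance": {"budget", "payment", "invoice", "price", "cost"},
    "civil_engineering": {"concrete", "bridge", "foundation", "beam", "load"},
}

# Hypothetical engine identifiers per domain, with a general-purpose fallback.
ENGINES = {
    "finance": "vendor-a/finance",
    "civil_engineering": "vendor-b/civil",
    "general": "etranslation/general",
}


def detect_domain(text):
    """Pick the domain whose keywords occur most often; 'general' if none occur."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {
        domain: sum(word in keywords for word in words)
        for domain, keywords in DOMAIN_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


def route_document(document):
    """Split on blank lines and assign an engine to each paragraph."""
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    return [(ENGINES[detect_domain(p)], p) for p in paragraphs]


tender = (
    "The total budget covers each payment milestone and the final invoice.\n\n"
    "The bridge foundation uses reinforced concrete under each beam.\n\n"
    "Please reply by the end of the month."
)

for engine, paragraph in route_document(tender):
    print(engine, "<-", paragraph[:40])
```

Routing at page or line level would only change how route_document splits the text. The detection and engine lookup steps stay the same.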