Evolution of cloud-based translation memory

Cloud-based sharing of translation memories (TMs) has spread at a much slower pace than we expected when we first learned about this technology, partly because of lackluster adoption by freelancers. A previously unpublished survey answered by 1,302 participants was conducted in February through Proz.com, one of the major internet portals for professional translators, to document the topic from the linguist’s perspective. Language service providers (LSPs) will need to make an effort to address freelancers’ concerns if we want to keep working with the best translators on a cloud-based setup.

The mid-1980s saw the first software whose main function was to create a database (or TM) fed with the work of human translators. TMs have been shared since the very conception of computer-aided translation (CAT) tools, and all company-level editions of this kind of software incorporated the ability to share databases over a local area network.

Sharing TMs over the internet was the next logical step in the development of CAT tools, and at the beginning of this century the first solutions connecting linguists through the internet entered the market. ForeignDesk from Lionbridge was one of the first solutions to approach this type of collaborative work. It was not based on a centralized TM; instead, a repository of projects on each linguist’s computer connected to the rest of the team. Other pioneers were Telelingua with T-Remote, an add-on that was able to connect, for example, a Trados TM over the internet, and Logoport, a web service based on the software as a service (SaaS) model that connected team members to a central TM hosted on Logoport servers.

Today, ForeignDesk is an open source solution, thanks to the generosity of Lionbridge. T-Remote never had a real impact on the market and the company stopped its development in 2005. Logoport was acquired in 2005 by the developer of ForeignDesk.

Idiom WorldServer, which became freeware after SDL’s acquisition of Idiom in 2008, had a modern and singular approach to online TM sharing. It incorporated two ways of sharing a TM. The first was to connect through a desktop application to a central TM, the singularity being that the central TM was not fed in real time. Translators received 100% and fuzzy segments in a local TM and were able to run concordance searches in the central TM. Team members could update the central TM from time to time. This approach was designed to overcome the most important factor affecting the adoption of online TM sharing: infrastructure. Real-time TM reading and writing over the internet requires stable and powerful connections, and Idiom’s approach minimized that dependency, which made it practical when working with low-quality connections. The second way, for real-time sharing of TMs, was a web-based interface, which we would now call a cloud-based solution.
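The deferred-update idea can be illustrated with a minimal Python sketch. It is not WorldServer’s actual implementation: the class names `CentralTM` and `LocalTM`, the 75% fuzzy threshold and the `difflib`-based similarity score are all assumptions made for illustration. The point it shows is that the translator downloads the relevant matches once, works against a local copy, and sends confirmed segments to the server in occasional batches rather than on every keystroke.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Rough 0-100 score; real CAT tools use their own match formulas."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


class CentralTM:
    """Hypothetical server-side TM mapping source segments to targets."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})

    def matches_for(self, sources, threshold=75):
        """Entries that are a 100% or fuzzy match for any project segment."""
        return {
            src: tgt
            for src, tgt in self.entries.items()
            if any(similarity(src, s) >= threshold for s in sources)
        }

    def concordance(self, term):
        """Entries whose source text contains the term (a live lookup)."""
        return {s: t for s, t in self.entries.items() if term.lower() in s.lower()}

    def merge(self, new_entries):
        self.entries.update(new_entries)


class LocalTM:
    """Hypothetical translator-side TM: seeded once, synced in batches."""

    def __init__(self, central, project_sources):
        self.central = central
        self.entries = central.matches_for(project_sources)  # single download
        self.pending = {}  # confirmed locally, not yet sent to the server

    def confirm(self, source, target):
        self.entries[source] = target
        self.pending[source] = target

    def sync(self):
        """Send all pending segments in one request; return how many."""
        sent = len(self.pending)
        self.central.merge(self.pending)
        self.pending = {}
        return sent
```

In this sketch only `concordance` and `sync` touch the server during translation, so a slow connection delays an occasional lookup or upload instead of every segment.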

While ten years ago we had only four or five solutions using online TMs, today we can find more than 40. We now classify CAT tools into two categories: desktop-based or cloud-based. For greater flexibility, some developers, such as Wordfast or Kilgray (memoQ), offer both approaches.

One of the most important changes of the last ten years is the proliferation of this type of solution, with a clear trend toward cloud-based solutions. Proliferation means competition, and competition means lower and more flexible pricing. Today a group of translators can offer this technology to its clients, while in the past only big corporations had the infrastructure and money to do so. The level of development still varies widely between solutions, and this is reflected in the price, but even high-end solutions are more affordable today than they used to be.

OmegaT, probably the open source solution with the greatest impact on the CAT tool industry, now offers the possibility of sharing an online TM. Even omnipresent internet giant Google has entered the game with Google Translator Toolkit, where the “fee” is Google’s right to use your translations to power its statistical machine translation solution. Many of the big LSPs also have their own proprietary systems.

Despite all of this, this technology has not yet fulfilled its full potential, and the barriers are still the same: internet infrastructure and the inherent hurdles of teamwork with a freelance base.

In a recent report commissioned by the Internet Corporation for Assigned Names and Numbers, The Boston Consulting Group analyzes the main factors that hinder the full realization of e-business. They call them e-frictions.

Infrastructure accounts for 50% of e-frictions, broken down into access, speed, price, traffic and architecture. It is the e-friction we need to consider when implementing workflows that involve online TMs. Poor infrastructure has been one of the main reasons why this technology is not more widespread in our industry, and better infrastructure is the reason why it has grown in importance in recent years, with many players appearing on the market. In our particular case, infrastructure was what prevented us from investing in this technology in 2005: while working with an online TM on a translation project for another LSP, we realized that we had to spend a considerable amount of unpaid time waiting for responses from the TM.

The same is still true today if we want to work with a team located in countries such as Morocco, Pakistan or Nigeria. The above-mentioned e-friction index ranks 65 countries according to their infrastructure. The top country in this ranking is Sweden, with an e-friction score of 14, and the last is Nigeria, with 85 points. This type of ranking can help project managers identify areas of collaboration where connection speed and stability are not a source of risk for their projects.


Survey results and analysis

Many highly qualified translators are reluctant to work under a model that shifts the balance of power to the LSPs. LSPs need to understand that if they want to work with the best-qualified translators, they will need to address freelance translators’ concerns. These concerns, from a linguist’s point of view, are still the same as those named by Garry Levitt over a decade ago in a 2003 MultiLingual article.

Proz.com’s survey on the subject reached the same conclusions. Out of a total of 1,302 freelance translators who responded, 854 (65.6%) were full-time translators, 270 (20.8%) part-time translators and the rest were people taking translation work as a parallel activity. A full presentation of the collected data is available at www.abroadlink.com/onlineTMsurvey.pdf.

From the collected data, we have an indicator of the penetration rate of online TM sharing in our industry. According to the survey results, less than 6% of the translators regularly work on projects involving TM sharing (Figure 1).

If we delve deeper into the data, we can see that senior translators do not participate in this type of project as often as junior translators. Among full-time freelancers with more than ten years in the market, 48% answered that they were willing or eager to work with this technology and 11% actually work often or regularly with it. In comparison, 57.63% of full-time translators with one to three years of experience are willing to do so, and 16.87% already work often or regularly on such projects. See Figure 2.

One of the main objectives of conducting this survey was to give the freelance translator community a voice with regard to this technology. Question 7 presented the major identified issues from the translators’ perspective and asked them to classify them in order of importance. The indicated issues were the following:

Getting low-quality fuzzy matches from other translators working on the project that I will need to fix or that will appear as done by me

Payment calculation

Higher processing time that lowers my translation productivity

Uncertainty of how long the project will take

Not being able to keep a TM of my own translations

The order of the issues was randomized to avoid conditioning the answers. Respondents were forced to choose a different value for each issue. Translators were asked to rate these drawbacks of working with online TM solutions from 1 (most important) to 5 (least important). See Figure 3.
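As an illustration of this forced-ranking format, the Python sketch below checks that a response assigns each of the five issues a distinct value from 1 to 5 and then averages the ranks per issue. The short issue labels and the function names are hypothetical, and the use of the mean rank as the summary is an assumption: the survey data does not state how Figure 3 aggregates the answers.

```python
from statistics import mean

# Short labels for the five issues listed above (wording abbreviated here).
ISSUES = [
    "low-quality fuzzy matches",
    "payment calculation",
    "higher processing time",
    "uncertain project duration",
    "no TM of my own translations",
]


def is_valid_ranking(response: dict) -> bool:
    """True if every issue gets a distinct rank from 1 to len(ISSUES)."""
    return (
        set(response) == set(ISSUES)
        and sorted(response.values()) == list(range(1, len(ISSUES) + 1))
    )


def mean_ranks(responses: list) -> dict:
    """Average rank per issue over valid responses; lower = more important.

    Assumes at least one valid response is present.
    """
    valid = [r for r in responses if is_valid_ranking(r)]
    return {issue: mean(r[issue] for r in valid) for issue in ISSUES}
```

A response that reuses a value, for example ranking two issues as 1, fails `is_valid_ranking` and is left out of the averages.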

LSPs and software development companies should take action to respond to these concerns. Regarding ownership, this is already solved in most desktop-based applications, where translators can keep their own TM locally (for example, in SDL Studio or memoQ). The issue mostly affects cloud-based-only interfaces. In any case, it can easily be solved technically if the parties agree to do so.

But the most important issue according to freelance translators is still the quality of fuzzy matches received from other linguists working on the project. As a matter of fact, this is an issue even when freelancers accept discounts for fuzzies in a TM sent to them to work locally. Working with online TM sharing makes this problem bigger as the TM is fed in real time. If translators are being paid according to fuzzies and new words calculated as they translate, it is important that all segments introduced in the TM by linguists are final, so that when the other colleagues working on the same project find a fuzzy they can work with it. On the other hand, if translators translate a segment and then “sleep on it” before confirming the segment and introducing it into the TM, they may be preventing others from benefiting from their work, losing possible fuzzies and creating inconsistencies.
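The interplay between confirmation and fuzzy-based payment can be sketched in Python as follows. Everything here is an assumption for illustration: the `SharedTM` class, the `difflib`-based score and the pay bands in `PAY_BANDS` do not come from the survey or from any particular tool. The sketch shows that a segment saved as a draft gives colleagues no match, and that once it is confirmed, a similar segment moves from the full rate to a discounted fuzzy rate.

```python
from difflib import SequenceMatcher

# Hypothetical pay bands: (minimum match %, share of the full word rate paid).
PAY_BANDS = [(100, 0.25), (75, 0.60), (0, 1.00)]


def match_score(a: str, b: str) -> int:
    """Rough similarity percentage; commercial tools use their own formulas."""
    return int(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


class SharedTM:
    """Hypothetical online TM in which only confirmed segments are shared."""

    def __init__(self):
        self.confirmed = {}  # source -> target, visible to the whole team
        self.drafts = {}  # source -> target, visible only to the author

    def save_draft(self, source, target):
        self.drafts[source] = target

    def confirm(self, source):
        """Publish a draft to the team; raises KeyError if no draft exists."""
        self.confirmed[source] = self.drafts.pop(source)

    def best_match(self, source):
        """Best (score, target) among confirmed segments, or (0, None)."""
        best = (0, None)
        for src, tgt in self.confirmed.items():
            score = match_score(source, src)
            if score > best[0]:
                best = (score, tgt)
        return best


def weighted_words(source: str, score: int) -> float:
    """Payable words for a segment under the hypothetical bands above."""
    words = len(source.split())
    for minimum, share in PAY_BANDS:
        if score >= minimum:
            return words * share
    return float(words)
```

Under these assumptions, a four-word segment is payable as four words while a colleague’s similar segment is still a draft, and as fewer words as soon as that segment is confirmed, which is why both late confirmation and low-quality confirmed segments affect everyone on the project.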

The rest of the aforementioned issues need the attention of LSPs, as their good handling of these matters will ensure successful projects and will guarantee talent retention. For example, LSPs should ensure good management of company IT resources and be aware of freelancers’ internet connections to avoid time-consuming delays that annoy the end users of the system, lowering their productivity in a way that can affect the final delivery date of the project.

With regard to the software companies offering this kind of solution, the survey confirms SDL Trados server solutions as the most used, with 24.06% of respondents reporting having participated in a project using this technology. MemoQ Server is the second most used solution with 15.09%, and XTM Cloud the third with 10.60%. Figure 4 shows the most popular commercial software.

Like other technical solutions providing faster deliveries with a higher guarantee of quality, online TM sharing is here to stay. Competition in the software development arena and the SaaS model will improve these types of solutions and make them more affordable, enabling smaller players to compete for high-volume projects. LSPs and freelance translators will keep improving their capacities, adapting to new challenges in order to satisfy the needs of their clients.