Jaap van der Meer

Owner and founder of TAUS, formed in 2005. Born in The Hague, 1954. Attended the University of Amsterdam 1973-79. Degree in Literature and Linguistics. Frequently bikes 70 km to work.

A Visionary’s Perspective of the Future of the Language Industry
Jaap van der Meer: We [humans] are the only ones who can add nuance and a cultural aspect to language.

The whole industry is now a one-on-one service; we have one customer that needs something translated, and the translator and the LSPs are working for that one particular need. But with dashboards that track collective needs, we’re moving toward a shared model, and we can create a one-to-many economy — a more social economy.

Translators — or linguists, or cultural adapters, or whatever new titles we invent — will be able to see the need for a specific terminology, data, nation, language, and go to a platform and provide data. They’ll be able to sell it multiple times because there will be many enterprises, platforms, and application builders that will need that kind of data to polish and customize their applications.

That’s the infrastructure and the ecosystem we need to build for the future generation. We don’t need project management, vendor management, quality management. Collaborative platforms can take over those things and disintermediate this whole industry.

JvdM: I want computers to do something smart with language, and even though I was much more passionate about literature and the cultural side of language [as a student], I became deeply interested and fascinated with using computers to do repetitive work, like looking up terminology and translating the same sentence with small variations again and again. So at INK — this was before Trados — we started developing translation memory (TM) and terminology lookup software.

How do you feel about what your recent article stirred up?
JvdM: People are locked up in their here and now, and they don’t see what’s really happening in the world.

I’m just explaining how economics works. If there’s a shortage of resources and people who can do the work, and there’s technology that can do it, then jobs change. This started years ago.

Let’s be clear: I’m not talking about novels or poetry. My wife is a literary translator, and I have the greatest admiration for translating Tolstoy or Baudelaire. That’s the real art, you know? We humans have to outsmart the machines, which means we shouldn’t become slaves to them and do the stupid work of correcting their output. I think it’s a relief that we can start doing more intellectually challenging work.

The TAUS vision
JvdM: With each new generation the definition of cultural and language experts evolves. They can work in a corporate setting or for themselves. That’s the vision behind the TAUS data marketplace, and it’s already happening. A Syrian doctor who fled his home country uploaded a large amount of medical translation data he had been collecting, and gained significant income from a single transaction.

I remain very interested in the potential for computers to help the world communicate better — that was actually our original mission statement. So when the first big paradigm shift in machine translation occurred, when we went from a rule-based to a statistical system, I immediately thought: it all comes down to data.

I compared it to the Human Genome Project — a massive academic and political collaboration that helped solve a fundamental problem in human evolution and led to multiple breakthroughs in medicine — and thought: We can do the same with language. If we collect all data in all languages, we can really break the language barrier and help the world communicate better.

So in 2008 we started building the TAUS Data Cloud, where everybody could upload translation memories, and we created a reciprocal model — no money transfer, but if you upload your data, you earn credits to download other people’s data. It was a fantastic synergistic model. And quite early on we collected a huge amount of human translation data.

Everyone benefited from the early aggregation of data and improved their engines, and we proved the point that data works, but we realized the reciprocal model was not sustainable. The same people who own the data were not going to train the data. It is very expensive to set up your own MT development team and host all the infrastructure. So we had to become more commercial; we went from being a visionary think tank to being part of the action. It was a big transition, and it wasn’t easy. Our members became data customers, and the membership fees were turned into credits for acquiring data, NLP services, or training courses in data management.