Profiling giraffes and reindeer

By Jim Compton October 29, 2018

At the 2018 Globalization and Localization Association (GALA) conference in Boston, I was part of the “Rise of Big Data in Localization” panel. It was an interactive session intended to spark discussions about how big data, machine learning and AI trends are shaping the globalization and localization industry.

It’s a broad subject that I encounter a lot at industry conferences, often in the form of the “Will AI replace humans?” debate. Our GALA panel wanted to take a slightly different approach. We wanted to ask the audience some provocative, open-ended, yet ultimately personal questions about their take on big data and AI.

Those interested in the subject tend to fall into a few different categories. There are those who are suspicious and fearful of AI; those who directly develop the technology; and those who are optimistic about its potential and keen to maximize its practical use, even if they’re not entirely sure what that looks like. They’re seeking those “killer applications” of AI that are both relevant to their business and obtainable.

As part of the third group, I have an interest in learning as much as I can from the practical experience of others. And I believe that there’s a lot that the globalization and localization industry can gain by sharing ideas, challenges and experiences. The question is: how can we use AI to solve practical problems and improve how we work?

Here I’ll pose to you the same questions that I asked the folks at GALA Boston, as well as share some of my own ideas. My hope is that we can routinely have these kinds of discussions about AI, and that through such “shop talk” we can standardize its use and maximize its benefit throughout our industry.

Incidentally, making good use of AI has always been part of the globalization and localization industry’s culture. Even excluding machine translation, which has been used on projects since the 1960s, we’ve been using translation memory, concordance searching, terminology recognition, optical character recognition (OCR), quality assurance checking and other applications of AI for some time.

Continuing to take positive advantage of AI technology is our prerogative.

Provocative questions

I posed two questions to the GALA audience: one that I wrote myself, and another that I stole from a vintage IBM advertisement for a punch card calculator.

The first was, “What are the killer applications of artificial intelligence for us?” (with “us” being deliberately open to interpretation). The second was, “What would you do with 150 extra engineers?” I find the latter question to be timely, both in general and in the context of the globalization and localization industry — especially if you expand the question to include translators, project managers, copywriters, solution architects, account managers and other roles that depend on finite supplies of human expertise. You could alternatively ask, “What would you do if human resource limitations didn’t exist?”

I wasn’t around in 1951, but I can imagine reading this ad (Figure 1) and being inspired to brainstorm about the future. I like this type of “what if” question because it forces you to decouple the task of deciding what you’d like to accomplish from the task of figuring out how to make it happen. The latter shouldn’t overly influence the former. This is especially true when talking about applying computer technology to problem solving — it can execute in microseconds what it takes humans minutes to do. Through the application of technology, ideas can and routinely do go from fantasy to reality.

In 1951, IBM delivered pure computational firepower with their room-sized punch-card machines. Today, artificial intelligence offers more complex and nuanced capabilities that go beyond number crunching. With AI, we can reasonably expect that not only engineers, but all globalization and localization roles could potentially experience a 150-fold growth in efficiency.

So, 67 years later, it’s time for us to ask ourselves this question again.

The globalization and localization industry has a profoundly important mandate: to remove language barriers that impede world progress. We’ve made impressive strides since 1951, but with the continued shrinking of the world, the rise of digital communication and other global megatrends, it’s become a harder problem to solve.

Can advances in AI capability help us solve these big challenges? Obviously, machine translation (MT) will play a huge role, but what else?

Someone in the audience asked me, “What do you think the killer applications for AI in our industry are?” So, I’ll share my thoughts here.

What would Jim do with 150 extra engineers?

I don’t take the question too literally; I see it as a metaphor for the automation of complex human tasks.

I start my brainstorming by drawing inspiration from my own professional experience — especially pain suffered, or instances where I’ve been face-to-face with situations that seemed hard to resolve. When do I remember recognizing something wasteful, or having to accept a compromise on a solution because of some human-based limitation?

Here are a few real-world examples of painful experiences from my last twenty-plus years at various globalization and localization companies:

1. Valuable transactional data lives in a set of folders that gets archived after the completion of a project, never to be seen again. When similar projects are executed later, the same lessons have to be learned again and again.

2. A client gets frustrated because a price estimate that was quoted to them for a project seems inconsistent with the price that was previously quoted for a similar effort. Looking at the details of the quote, it’s discovered that significant parts of the effort have been estimated with a “finger in the wind” approach.

3. An account is operating smoothly, and the client is happy. In large part this is thanks to the efforts of a long-time veteran of the projects who has a nuanced understanding of how they should operate. But when that person quits the company, errors start popping up and the client stops being happy.

4. A large team is assembled in preparation for an expected upcoming big project, but the client changes its plans and these resources aren’t needed after all. So, they’re either redeployed elsewhere or let go. Three months later, the client again changes its plans and the project is on again, this time as an urgent priority.

5. An opportunity presents itself as a unique client challenge, and a pursuit team of solutions architects and subject matter experts convenes and develops the ideal solution over a period of weeks. Later, through a casual conversation, it’s discovered that another pursuit team had developed a different solution for a very similar opportunity a few months prior. Some of the elements of that solution are actually better.

You may have had similar experiences. What I see them having in common is that they could all benefit from applied intelligence — from the ability to better recognize a pattern, understand cause and effect, predict a future quantity, identify something — but for which “throwing experts” at the situation would be expensive and likely create its own problems.

How might AI be able to help with these situations?

“Alexa, play Animal Game”

My kids love to play a game on our Amazon Echo called “Animal Game.” Alexa has you think about a specific animal, asks you a series of mostly yes or no questions about it, and then guesses which one you’re thinking about. It goes like this: “Does it eat leaves? Can it climb trees? Does it live underwater? Is your animal a giraffe?”

It’s a version of the old Twenty Questions parlor game that any two people can play. Computer-based variants of the game exist in the form of dedicated websites such as 20Q and Akinator, or in highly specialized “profiling” quizzes on social media that can tell you what Star Wars character or 80s metal band you most resemble.

I find that this approach, asking a series of quantifiable questions in order to identify something mysterious by name, has real-world relevance. If you know the name of something, you are better empowered to deal with it. For example, by identifying that the spider you found in your garage was Eratigena atrica, you know that it isn’t dangerous. By knowing that your itchy eyes are the result of allergic conjunctivitis, you can take an antihistamine and skip the topical antibiotics. In combination with access to collected information about it, knowing something’s name is quite powerful.

For several of my experience examples, but especially for scenario #5, I think having some kind of Animal Game-like system in place would have helped immensely.

I can imagine an alternative timeline for that scenario, in which the pursuit team first stops and asks, “Have we seen this sort of situation before? And if so, what do we call it?” Using the system, they would have identified that the scenario was in fact not unique. It already had a name, and was associated with some existing best practices and technologies.

“Folks, it looks like we’re dealing with a giraffe here. It typically lives between 15 and 20 years. Every day, it only needs a half-hour of sleep, but eats up to 75 pounds of food. We’ll need to place its meals high up. We have a platform that we constructed for just this purpose.”

From the standpoint of solutions architecture, being able to differentiate between a “giraffe scenario” and a “reindeer scenario” is a prerequisite to being able to effectively serve either one, allowing us to make use of our animal-specific assets, including our know-how.

That’s a capability I’ll call opportunity profiling, and it’s one of the things that I’d do with 150 extra engineers. How would I start?

Finding an AI solution

Navigating the rich landscape of AI technology isn’t that easy if you aren’t immersed in it. Soon there will be Animal Games to help you identify the best AI solution for your use case, but in the meantime, I will share with you the process that I used when trying to create a working example of this profiling capability.

The world of AI options is about as diverse as the world of technology options. There are dedicated commercial products, commercial products that contain AI features, SaaS systems, API-based cognitive microservices, open source toolkits, and of course you can always try to home-grow a capability. The maturity, costs and level of involvement required to operationalize these options cover a wide spectrum.

There’s not just one kind of AI either. Capabilities can be subdivided both by domain (self-driving cars, video games, natural language processing, image recognition, virtual assistants, industrial process management) and by AI approach and worldview, for which there are almost as many variations as there are genres of music.

Deep learning (a branch of AI), for example, finds correlations between inputs and outputs through a process of layered statistical analysis. It’s capable of making predictions based on naturally existing relationships between things that would be difficult to model using traditional regression techniques, or that involve characteristics that are hidden yet statistically relevant. It represents a world of extraordinary possibility, but requires powerful computers and a framework for capturing and processing big data. This possibly includes dedicated data centers.

Not long ago, deep learning was the exclusive domain of AI researchers, but open source tools like Google’s TensorFlow have lowered the barrier to entry. In combination with leasable cloud computing platforms, an organization can set up a deep learning capability with less capital investment than ever before. Of course, there is still a barrier to entry: organizations need the expertise to operationalize the technology, and experienced AI engineers are in short supply. They also need lots of relevant data. Companies that manage their data like an asset, and have a great deal of it, have a competitive advantage over those that don’t.

For my opportunity profiling problem, deep learning would be overkill. Instead, I worked with my developer colleague Alexander Sádovský to research how other Animal Game-type systems worked, and we landed on the technique of the “decision tree classifier.”

Using the general approach, Alex set up a working example in just a couple of hours.

How it works

You can see the code at https://goo.gl/XZfeJh.

Here’s how it works in a nutshell: you have a set of data that includes a certain number of “classes” (discrete things that we have names for, such as giraffes and reindeer), and a list of core characteristics about each of those classes that collectively make them unique (is an herbivore, has antlers, has a long neck, is found in Africa and so on). There should be at least enough characteristics such that no two classes share the exact same set.

The point of asking the questions is essentially to narrow down possible classes. Each question should reduce the current set of possibilities by roughly one-half — a process called “binary searching.”
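To make the narrowing-down idea concrete, here is a minimal sketch in Python. It is not the code Alex wrote; the animal data, the characteristic names and the helper functions `best_question` and `identify` are all hypothetical, chosen only to illustrate picking the question that splits the remaining candidates most evenly.

```python
# Hypothetical data set: each class maps to its yes/no characteristics.
ANIMALS = {
    "giraffe":  {"is an herbivore": True,  "has antlers": False,
                 "has a long neck": True,  "is found in Africa": True},
    "reindeer": {"is an herbivore": True,  "has antlers": True,
                 "has a long neck": False, "is found in Africa": False},
    "lion":     {"is an herbivore": False, "has antlers": False,
                 "has a long neck": False, "is found in Africa": True},
    "wolf":     {"is an herbivore": False, "has antlers": False,
                 "has a long neck": False, "is found in Africa": False},
}


def best_question(candidates, data):
    """Return the characteristic whose yes/no split is closest to half and half."""
    traits = sorted(next(iter(data.values())))

    def yes_count(trait):
        return sum(data[name][trait] for name in candidates)

    # Ignore characteristics that do not separate the remaining candidates.
    useful = [t for t in traits if 0 < yes_count(t) < len(candidates)]
    if not useful:
        return None
    return min(useful, key=lambda t: abs(2 * yes_count(t) - len(candidates)))


def identify(answer, data=ANIMALS):
    """Ask questions until one class remains; `answer` maps a trait to True/False."""
    candidates = set(data)
    asked = []
    while len(candidates) > 1:
        trait = best_question(candidates, data)
        if trait is None:
            break  # the remaining classes cannot be told apart
        asked.append(trait)
        reply = answer(trait)
        candidates = {name for name in candidates if data[name][trait] == reply}
    return sorted(candidates), asked


# Simulate a player who is thinking of a giraffe.
matches, asked = identify(lambda trait: ANIMALS["giraffe"][trait])
print(matches, asked)  # ['giraffe'] ['is an herbivore', 'has a long neck']
```

With four classes, two well-chosen questions are enough. A real decision tree classifier would typically learn which questions to ask from training data rather than from a hand-written table like this one.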

Because of the power of exponents, it doesn’t take that many questions to reduce the possibilities down to a single item from a huge set. Through twenty questions, you can classify a single candidate from a set of over a million possibilities (2^20). If you include non-yes-or-no questions, you can address an even bigger set.

With opportunity profiling, there aren’t a million different classes. There are maybe twenty different archetype situations that we run into, meaning our system should be able to get there in five questions.
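The arithmetic behind those numbers can be checked in a few lines of Python. This is only an illustrative sketch; the function name `questions_needed` is my own, and it assumes every question is a yes/no question that splits the remaining candidates perfectly in half, which real questions rarely do.

```python
def questions_needed(num_classes):
    """Smallest number of perfectly halving yes/no questions that isolates one class.

    Equivalent to the ceiling of log2(num_classes), computed with integers
    to avoid floating-point rounding.
    """
    if num_classes < 1:
        raise ValueError("need at least one class")
    return (num_classes - 1).bit_length()


print(2 ** 20)                    # 1048576 possibilities covered by twenty questions
print(questions_needed(2 ** 20))  # 20
print(questions_needed(20))       # 5, because 2**4 = 16 < 20 <= 32 = 2**5
```

In practice, uneven splits mean a few extra questions may be needed, so five is a best case for twenty archetypes.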

(If you search for “decision tree” you can also find tutorials online.)

Now what?

My plan now is to run a limited pilot with our solution architects group to determine if this approach is generally useful and worth developing further. We’re going to first build a data set using a record of historical projects and their characteristics, then see how well a junior architect (or someone with no expertise) using the system can identify the class of an existing opportunity, compared to an expert.

If that works, we can point the system to our existing repositories of solutions intelligence to ascertain how best to serve these opportunity profiles: what the ideal team looks like, what’s an ideal configuration of tools and processes, and what risks and pitfalls the operations team can expect.

I will want to know: What effect will this process have on our key performance indicators created during the solution design process? Can we bring better solutions to the table more quickly and cheaply? How accurate are our predictions against their operational reality?

If the process brings measurable value, I’ll want to keep improving the algorithm. In the world of business problems, opportunities are constantly evolving. New classes of opportunity are born, and existing classes evolve. I can envision the system being improved to include an interface that allows users to add new classes and characteristics, and to bring some supervised machine learning into the mix, allowing experts to provide feedback on the quality of the results.

Conclusions

The globalization and localization industry finds itself in a fascinating place right now. We’re at a confluence of technological trends, including advances in artificial intelligence, that collectively open up a world of possibility. Simultaneously, as the world becomes more digitally connected and communicative, the problem space has become more demanding.

Like other technological trends, advancements in AI will exert different kinds of disruptive influence on many industries, including ours.

One effect will be increased access to AI technology. We’re already seeing this in the trend of AI companies like Google, Microsoft, Amazon and IBM competing to bring cloud-based cognitive services platforms to businesses at a cost of less than a tenth of a cent per transaction. A phenomenon like this should have a “playing-field leveling” effect on industries.

At the same time, AI will magnify the differences in capabilities between organizations. The most advanced AI techniques thrive on having lots of data, and those that deliberately collect and manage data like a business asset will have a significant advantage over those that don’t.

The ability to operationalize AI will be another differentiating factor, and this is where I believe that the globalization and localization industry has a reason to get excited. Many of the decades-old practices we take for granted today are examples of applied AI; we can rightfully say that we’ve been using AI “before it was cool.” Translation memory and machine translation are two prime examples of AI in action in our industry. So now that AI is becoming more powerful and accessible, what else can we do with it?

I like approaching this question from different angles. Wearing my “big blue sky thinking hat,” I can imagine what the world would look like if we weren’t at all bound by natural laws. I find tremendous value in that exercise, but I also find value in a more pragmatic approach, looking at our existing world through a lens of process improvement and incremental innovation.

Once we’ve generated some ideas, it is important to get our hands dirty with AI and start trying out solutions. It may get messy, but that’s okay. Innovation is an imperfect and iterative endeavor, but the potential gains from applying AI to our industry’s real-world challenges are so great that we must engage.