Tag: machine translation


MIT CSAIL and Reviving Lost Languages

AI, Technology

Can the evolution of language inform machine translation models for extinct languages? Researchers at CSAIL think so. Jean-Francois Champollion did too.

If not for ancient Greek and Coptic – a descendant of ancient Egyptian – the decades-long effort to crack the Rosetta Stone could have turned to centuries. For dead languages with few or no existing descendants, the task would appear impossible. Machine translation could help.

A project at MIT has been evolving throughout the past decade, as researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have sought to develop a system that can automatically decipher lost languages, even with scarce resources and an absence of related languages.

The team made headway in 2010 when Regina Barzilay, a professor at MIT, alongside Benjamin Snyder and Kevin Knight, developed an effective automatic translation method from the dead language Ugaritic into Hebrew. However, the team’s more recent study considered this breakthrough relatively limited, since both languages derive from the same proto-Semitic origin. Furthermore, the researchers found the approach too customized to work at scale.

To build off their initial findings, Barzilay and Jiaming Luo, a PhD student at MIT, have proposed a model that accounts for several linguistic constraints, particularly “patterns in language change documented in historical linguistics.”

One grounding principle here is that most human languages evolve in predictable ways. This would account for linguistic patterns where descendant languages rarely make drastic changes to sounds. A plosive “t” sound in a parent language could feasibly change to a “d” sound, but would very seldom evolve into a fricative like an “h” or “s” sound.
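One way to picture this constraint is as a substitution cost that grows with phonetic distance. The following is only an illustrative sketch and not the CSAIL model: the simplified feature table, the cost weights, and the function names `substitution_cost` and `rank_changes` are all assumptions made up for this example.

```python
# Toy illustration: a sound change costs more the more the two sounds differ.
# The feature values and weights are simplified assumptions, not study data.
FEATURES = {
    # sound: (manner, place, voiced)
    "t": ("plosive", "alveolar", False),
    "d": ("plosive", "alveolar", True),
    "s": ("fricative", "alveolar", False),
    "h": ("fricative", "glottal", False),
}


def substitution_cost(a: str, b: str) -> float:
    """Return a hypothetical cost for sound `a` changing into sound `b`."""
    if a == b:
        return 0.0
    manner_a, place_a, voiced_a = FEATURES[a]
    manner_b, place_b, voiced_b = FEATURES[b]
    cost = 0.0
    if voiced_a != voiced_b:
        cost += 1.0  # voicing changes are treated as cheap
    if place_a != place_b:
        cost += 2.0  # a change in place of articulation costs more
    if manner_a != manner_b:
        cost += 3.0  # plosive to fricative is treated as the most drastic
    return cost


def rank_changes(source: str) -> list[tuple[str, float]]:
    """List candidate target sounds for `source`, cheapest change first."""
    others = [s for s in FEATURES if s != source]
    return sorted(
        ((s, substitution_cost(source, s)) for s in others),
        key=lambda pair: pair[1],
    )


if __name__ == "__main__":
    for sound, cost in rank_changes("t"):
        print(f"t -> {sound}: cost {cost}")
```

Under these made-up weights, “t” to “d” is the cheapest change and “t” to “h” the most expensive, which mirrors the pattern described above. A decipherment model could use costs of this kind to prefer plausible sound correspondences over implausible ones.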

Along with these constraints, another notable detail here is history. As the algorithm deciphers patterns in sounds and syntax, it will also pull from encyclopedic data to fill in some of the blanks.

“For instance, we may identify all the references to people or locations in the document which can then be further investigated in light of the known historical evidence,” Barzilay told MIT News. “These methods of ‘entity recognition’ are commonly used in various text processing applications today and are highly accurate, but the key research question is whether the task is feasible without any training data in the ancient language.”

While imperfect, these methods have so far made progress. The team found the algorithm could identify language families, and one instance corroborated earlier findings that Basque – a language spoken in a region of northern Spain and southwestern France – appeared too distinct to assume any linguistic relation.

The team hopes eventually to develop a method of automatically identifying the semantic meaning of words with or without a linguistic relation. Like the linguists who cracked the Rosetta Stone, CSAIL researchers could be on the verge of a paradigm shift.


MultiLingual creates go-to news and resources for language industry professionals.




Hungarian Translator Keeping Expats Informed


Even for those learning a new language, consuming news content requires a high level of understanding. One Hungarian translator has volunteered efforts to deliver timely pandemic information to Hungary’s immigrant communities.

As people around the world await new pieces of information about COVID-19, many expatriates have struggled to stay current with local decisions and mandates. One Hungarian translator, however, has made a concerted effort throughout the crisis to disseminate the most recent information to immigrant communities in Hungary.

Setting up a Facebook group for his international network, István Fülöp, the founder of TrM Translations, began posting unofficial highlights of what officials had said regarding the pandemic, attaching links for additional information. The group started small, but word spread, and hundreds tuned in multiple times per day to catch up with important updates.

A month into his voluntary effort, Fülöp began trying out machine translation (MT), asking users to comment on its effectiveness. Some users pointed out that translating from and into languages like French and Italian could produce fairly effective MT, but languages like Hungarian, with its rules and exceptions to the rules, created a bigger challenge.

Despite the low-quality results Hungarian has yielded in Fülöp’s MT efforts, users have noticed that the MT has made steady improvements, depending on the nature of the text. Overall, for the purposes of this group, Fülöp’s MT helped many expats deal with the constant inflow of new information.

Still, Fülöp noted, “A clear risk of using machine translation is that while the sentences generated are meaningful, sometimes the information they convey is the exact opposite of what the original text says, or there might be significant omissions.”

“Working on this news gisting service helped me keep my mind off things by keeping me busy,” said Fülöp. “It helped me stay up-to-date without giving me time to dwell on the bad news. The group helped me streamline our new service of human-edited machine translation, a budget solution to translate larger volumes of text quickly with reasonable quality while keeping our clients’ costs low.”






Top Lesser-Known Google Translate Tricks


Even people who have never heard of machine translation have often heard of Google Translate — because it makes communication easier. Sure, it makes mistakes sometimes. If you write an entry in a specific dialect, the app will probably translate it wrong. Some of the mistakes are downright ridiculous. 

But let’s be honest: the app works well most of the time, and it can even work for learning a new language. So here are some Google Translate tricks to try.

Use Offline Google Translation

Most people know how to use the service when they are connected to the internet. But what if you’re traveling and you’re not connected 24/7? The Google Translate app gives you an option for offline use. This requires some brief preparation, during which you will need an internet connection. 

  • In the settings menu, you’ll see the “Offline translation” option. Tap on it.
  • You’ll see a list of languages that are available for this option. Download the software for the language that you need. Don’t download them all, since the app would consume too much space on your device. Each language package weighs 35+ MB. 
  • Wait for the language package to download. 

Let the Google Translate App Replace a Dictionary

You understand most of the text you read or the things you hear, but maybe a single word or phrase confuses you. Instead of using a separate dictionary app, you can rely on a translation. Google’s app will give you a translation, and an explanation of the word as well. It’s just what a dictionary app would do.

Translate Highlighted Text from an Image

This is another cool trick that you can use when traveling. You can snap a photo, and the Google Translate app will translate the text on it. That’s useful when you want to understand menus, street signs, or any other text without typing it. 

  • In the app’s home screen, you’ll notice a camera icon on the left side. 
  • The app will ask for language settings. Set the source language on the left, and the output on the right. 
  • The app will prompt you to take a Google Translate picture. Snap it!
  • You’ll need to highlight the area with text that you want to be translated. To see what it means, press the blue arrow. That’s how you use the Google Translate camera feature.

Explore Recipes from Around the World

This one gets a little creative, but are you trying to improve your kitchen skills? Here’s something that will motivate you: focus on a foreign food culture each month. This month, you can learn about Persian cuisine and cook some of its recipes. Next month, you’ll explore Greek cuisine. Then you’ll proceed with France, Italy, Honduras… be creative! Google Translate could help you find authentic recipes. Instead of searching for recipes in your native language, explore them in their native language. It’s how you’ll get to the source of the food culture. 

The Internet is full of food blogs in all languages. Translate the pages and you’ll have the original recipes at your fingertips. 

Use Google Translate in Conversation Mode

When you’re trying to make conversation with someone who speaks a foreign language, you can use the Google Translate audio feature. As participants in the conversation speak, the app will detect their voices through the microphone, and it will give you a translation in the chosen language: 

  • The home screen of your app gives you a “conversation mode” feature with a microphone icon. Tap on it. 
  • The app is ready for listening. Just hold the device’s microphone close when you speak. Then, let the app speak in the chosen language. Switch the languages when your partner responds.  

Use Google Translate for Research

When you engage in deep research, it’s best to look into the direct sources of information. Let’s say you’re discussing World War II in a research paper. Wouldn’t it be great to access original German documents? Resources in your native language may mention this information only briefly. If you want to deepen your research, use Google Translate on news articles, research studies, and academic texts in a language that’s relevant to your topic.

Use Google Translate for TikTok and Other Video Voiceovers

Want to add something to your TikTok videos? Google Translate has a speech function, albeit a robotic one. Use the app or the desktop tool to translate the text you need. Then, click/tap on the audio icon to hear the speech. This is a cool trick for making videos of speaking pets, for example. The robotic voiceover makes them unique and fun.

Create a Personal Dictionary through the App

Did you know that you can save your favorite words and phrases? Maybe you find yourself repeatedly translating a particular word. You can save it in your personal dictionary. Maybe you found a cool phrase you want to remember. Save it, too:

  • You’ll notice a star symbol next to the translation that you get. Tap on it, and you’ll immediately add it to your Saved list. 
  • To view the entries you saved, look for the Saved button at the bottom of the device’s screen. 



James Dorian is a technical copywriter. He is a tech geek who enjoys reading and writing on technology, business, and ways to become a real pro in our modern world of innovations. You can also check out one of his articles on how to use TikTok.




Reaching a global audience to maximize your startup’s potential

Globalization, Language in Business, Localization Basics


The Global Policy Forum reported as far back as the year 2000 that the pace of globalization — the process by which organizations start operating or influencing internationally — was quickening. Technological advances have been key to this change of pace. Globalization is not without its drawbacks, but many leading economists and business analysts believe it is better than the alternative. Indeed, Deloitte reports that after the global financial crisis in 2008, leaders around the world pledged to avoid protectionist measures to boost growth and speed up the global financial recovery.

The global environment we now live in poses both challenges and opportunities for new businesses. Startups today have a wider audience at their fingertips than ever before. A vast international customer base awaits those with the vision and courage to reach out to it. Technology can help with this, and the next issue of MultiLingual, on startups, will cover this when it goes live in a few days.

But the human element is still essential. Let’s look at language as an example of this.

Microsoft has just announced its latest machine translation (MT) success: achieving parity with the quality of human translation for the Chinese-English language pairing on 2,000 sentences in a test environment. However, there is still an incredibly long way to go before MT can rival human translation services. As such, startups that want to promote their products globally still rely on professional human translators.

Adaptation and localization services are also essential. An image that is perfectly acceptable in one country can cause sufficient offense for arrest warrants to be issued in another. Any business with global aspirations therefore needs to use specialist local knowledge when globalizing its brand. Doing so does take time, but the rewards can be well worth the effort.

Our company, for example, recently launched 11 new websites targeting clients in various new countries as part of its globalization strategy. The French site is targeted to customers in France, Belgium, Canada and other French-speaking countries. Meanwhile, the German site is aimed at German-speaking territories, such as Germany, Switzerland and Austria.

The choice of languages for the new sites was the result of extensive research. Supply and demand were the cornerstones of the research. The demand front covered the number of speakers of the languages being considered, local business activity, size of potential customer base and search engine statistics (anchors, keyword volumes and more). On the supply side, we investigated competition and concurrency in the relevant markets, availability of local translation and localization experts, cost of advertising, cost of pay per click/SEO and similar parameters.

For companies just starting out, global dominance may seem a tall order. However, the right product can have almost boundless appeal. Have you heard of Slack? If you haven’t, you’re behind the curve. Founded less than a decade ago, the business messaging system is now available in more than 100 countries around the world. Meanwhile, TV network Netflix, founded in 1997, is available in all but four countries (China, Crimea, North Korea and Syria).

Not every startup will want to go global. However, even the smallest of ideas can go a long way in the global environment in which we live. You might dream of simply running a local coffee shop, but that’s how Starbucks started too. The world’s largest coffee company, it now operates in 62 countries.

Whatever your business niche, it’s likely that there’s money to be made by turning globalization to your advantage. A carefully devised strategy, based on appropriate research, is the starting point. Identifying target countries and languages through a measured approach will ensure that time and money are both used efficiently when it comes to international expansion plans.

If you have a great product or service, the world really can be your oyster.


Louise Taylor manages content for the Tomedes Translators blog. She has worked in the language and translation industry for many years.




Linguistic prejudice, race and machine translation



There are two basic approaches to grammar: the kind that says “this is what the rule book has said since 1858” and the kind that says “language evolves, and this is how it’s actually being used in the current world to communicate these specific concepts and grammatical differences.” The way pockets of minority speakers use language has always fascinated me, although when I was young it would make me cringe. As a teenager, I thought it was extremely strange, for example, that the black-cap Mennonite community that I sometimes mingled with used word constructions I’d never heard of in real life; they greeted me with “welcome here” instead of “hi;” their pronunciation of “school” sounded more like “skewel.” It sounded super-archaic to me, like in eschewing modern forms of dress they’d also decided — subconsciously or consciously — to eschew modern linguistic constructions.

During grad school, one of my linguistics professors delved into the linguistic nuances of African-American Vernacular English (AAVE). He told us that there was a grammatical dialectal difference between “this coffee cold” and “this coffee be cold” and the difference did not exist in standard American English. “This coffee cold” was a remark about a temporal state; “this coffee be cold” was a remark about a known, habitual quality of this specific genre of coffee. Similar to “le café a été froid” and “le café était froid,” I imagine, or perhaps more accurately, “this coffee is cold” and “this coffee is usually cold.”

AAVE drops certain sounds (but not others) in spoken language; there’s a regularity to the practice. Because there are rules, this is no more “incorrect” in English than when it happens in French or in certain dialects of Spanish; Cuban Spanish, for example, may also drop sounds with a practiced regularity. AAVE drops “to be” verbs in some instances, but then, so do standard Hebrew and Russian. Standard English drops the verb in phrases such as “every man an island unto himself.” White dialectal English drops it in phrases such as “this floor needs swept.”

In short, contrary to the opinions of grumpy white grammar nazis, AAVE isn’t “wrong,” it just does its own thing, having adapted the way language always adapts. And this is important because for a certain portion of the population, these grammatical differences become a reason to mistrust African-Americans, to dismiss them as “uneducated” or “lazy” because they sound different. To treat them with less innate respect.


A study put out in June of this year, for example, concluded that police use less respectful language with black members of the community than with white members of the community, even controlling for heavy-crime areas and reasons for the police stop. The study could find no explanation for the difference, in fact, other than the race of the people being spoken to by police.

Now, I find it hard to believe that the majority of police are linguistically reacting purely to the skin color of the person in front of them. What seems more likely to me, as a linguist, is that they react linguistically to linguistic difference (real or perceived). When a person speaks a non-standard dialect, or is assumed to speak a non-standard dialect, that person is usually placed in a more suspect category. If their speech itself is not “correct,” what else is not correct about them?

I consider myself open-minded on such matters, but I am by no means immune to this. I noticed as I was recently watching an interview with Seattle Seahawks-turned-Oakland Raiders player Marshawn Lynch that I couldn’t stop the subconscious commentary in the back of my head on his pronunciation of ask as “axe,” or the myriad ways he sounded like a stereotypical black man. His way of speaking sounded incorrect to my brain; the unintentional emotional result ranged between slight irritation and amusement. Neither is a particularly respectful reaction. The guy standing next to me, on the other hand, remarked “I love that he’s himself, and he isn’t dumbing himself down for the media. He sounds so black. He’s such a badass.”

This guy had grown up siding with his black friends against stereotypical white jock bullies and Klansmen in the south, so his firsthand experience with African-American dialects was way more intimate than mine. More friendly, more familiar. His subconscious was trained differently than mine.

And I thought, you know, he’s totally right. It’s pretty badass that this guy is refusing to change who he is, refusing to give up his linguistic heritage, in the pursuit of fame or being more palatable to the money machine of corporate America.

I posit that, given my own reaction, white Americans are less likely to believe a man committed a crime if he sounds like them; if he speaks with the cadence and vocabulary of a white man. This is, of course, a difficult theory to prove in a double-blind study, but it bears out anecdotally. As this study shows, it is true that many people are implicitly biased against accents unlike their own and certain accents in particular, whether or not they realize it. It is also true, for example, that all-white juries are 16% more likely to convict a black defendant than a white defendant.

It seems likely that linguistics plays a role, and it certainly has on a trial-by-trial basis. After Trayvon Martin was fatally shot in Sanford, Florida, by George Zimmerman, Martin’s friend Rachel Jeantel, who had been present, testified against Zimmerman. Jeantel spoke non-standard English. Her speech patterns were widely mocked on social media, while her testimony was ignored by jurors. A prize-winning linguistics write-up put it this way: “one of the six jurors (B37) said, in a TV interview with CNN’s Anderson Cooper after the trial (July 15, 2013), that she found Jeantel both ‘hard to understand’ and ‘not credible’. In the end, despite her centrality to the case, ‘no one mentioned Jeantel in [16+ hour] jury deliberations. Her testimony played no role whatsoever in their decision’ (Juror Maddy, as reported in Bloom 2014:148).” In a sense, “Jeantel’s dialect was found guilty as a prelude to and contributing element in Zimmerman’s acquittal.”

Accent and dialect influence how you’re perceived. I once conducted my own experiment on accent: during my first semester of grad school, I was employed taking phone surveys about Charmin Ultra toilet paper. This was extremely boring, so I ran my own secondary experiment in the background: I would alternate calls in an Irish accent, in a standard American accent, and in a Southern accent. I was curious if accent played any role in people’s willingness to take a survey about toilet paper; the majority of people hung up on any accent, but maybe there was a competitive edge I could use to complete more surveys, and thus to earn more money per hour.

I was calling non-Southern white Americans, by the sound of it; the call’s geography was random numbers pulled from somewhere like northern Arizona or Wyoming. I kept track of completed surveys in each accent. After doing this enough times, a pattern started to emerge: people slightly preferred talking to a woman who spoke in a soft-and-subtle Irish accent, followed by standard, crisp American English; Southern American English was a distant last. Few people seemed to take Southern Accent Girl seriously enough to complete a toilet paper survey with her voice on the other end of the line.

Southern accents are often associated with being “uneducated” or “dumb,” even to listeners as young as five years old, so this was not a huge surprise. And what American, on the other hand, doesn’t love the Irish?

And lest this be considered an American phenomenon, British studies have found that speaking with a Birmingham accent (like Ozzy Osbourne) makes listeners assume that you are less intelligent compared to standard British English or a Yorkshire accent.

Humans make judgments about dialect and language, often without realizing it. However, machines only do this where their data is prejudiced in some way. Data-driven linguistic models collect data without innate prejudice of their own, studying how humans use language and deriving rules from this. Although this has certainly resulted in subtle and non-subtle human prejudice being codified into machine learning, it also presents an opportunity to create programs that may correct for human prejudices. Data-driven models often work best when the linguistic field is narrowed, actually, because human language is so broad. Because of this, I wonder if there will be, or could be, machine translation settings in the future that take into account the dialect of English being spoken; certainly this is a question that speech-to-text MT applications have to take into account. And I wonder if this, somehow, could be used to “translate” dialects in places like the courtroom, for the benefit of everyone.


Katie Botkin, Editor-in-Chief at MultiLingual, has a background in linguistics and journalism. She began publishing "multilingual" newsletters at the age of 15, and went on to invest her college and post-graduate career in language learning, teaching and writing. She has extensive experience with niche American microcultures across the political spectrum.




Today’s new machine translation releases

Translation Technology

Our January 2018 machine translation (MT) issue has just gone online, prominently featuring a host of new case studies on neural MT. It happens to coincide with the TAUS release of their MT ebook Nunc Est Tempus (“now is the time” in Latin), which also went online today and likewise considers the emerging developments in neural MT, albeit in a different way.

The ebook is a response to neural MT developments

The ebook, written by Jaap van der Meer and Andrew Joscelyne, looks at the history of the translation industry and proposes that now is the time to redesign workflows in the industry. It features interviews with MT company founders and experts, such as Smith Yewell, CEO of Welocalize; Eric Liu, general manager of Alibaba Language; and Chris Wendt, group program manager of machine translation at Microsoft. The primary message of the ebook is that translation “can now be redefined as intelligent global content delivery,” according to TAUS.

The ebook can be downloaded in PDF format; it reads like a 72-page TAUS whitepaper. MultiLingual readers can purchase it for 20 euros, for a limited time, with the code CHRISTMASR-ML.


Katie Botkin, Editor-in-Chief at MultiLingual, has a background in linguistics and journalism. She began publishing "multilingual" newsletters at the age of 15, and went on to invest her college and post-graduate career in language learning, teaching and writing. She has extensive experience with niche American microcultures across the political spectrum.




Google Translate’s deep AI upgrade represents the future of machine translation

Translation Technology

Artificial intelligence may seem like science fiction, but it’s technically existed for decades. In 1951, students at the University of Manchester created a program for the Ferranti Mark I computer that allowed it to defeat amateurs in checkers and chess. That may not seem too impressive anymore, but it spurred a period of major innovation that continues to this day. Now, that technology can be applied to how we conduct website translations.

Google recently upgraded the AI of Google Translate, making it potentially much more effective than past web translation services. To understand how this new breakthrough works, it’s necessary to get some background on the basic classifications of AI.

Narrow AI

While a computer that can play chess may have been impressive in 1951, now there are plenty of similar (and more sophisticated) programs you can download straight to your phone. These early developments were examples of narrow AI, in which a programmer “teaches” a computer to perform basic, rule-based functions and tasks.

This type of AI can learn how to play checkers, but it can never learn to research the history of checkers. Basically, it can’t develop its own natural curiosity, and wouldn’t know how to apply said knowledge if it could.

Machine learning

Machine learning AI became more prominent in the 1990s. Rather than playing a game with constant rules, machine learning AI represents a shift towards programs that can actually “learn” on their own.

Essentially, the machines leverage specialized algorithms and refer to substantial amounts of data to acquire knowledge. For example, when you ask Siri a question, it sorts through data and breaks it down into subsets, arriving at what is most likely the correct answer. Siri doesn’t technically learn how to research on its own, nor does it retain knowledge or act independently in the same way a human does. What it can do, however, is adapt to learning situations that exist outside of the basic rules it was programmed to follow. Compared to narrow AI, which can only do the one particular thing it’s assigned, machine learning AI isn’t nearly as restricted.

Deep learning

Deep learning AI has been on the rise in the past decade. Structurally, the algorithms are based on the human brain. Although tech visionaries understood the value of this approach, until the right hardware and technologies were available, it was impossible to design such a complex system.

Unlike machine learning AI, which can mimic some form of actual thought, deduction, or reasoning, deep learning is the first type of AI which can use knowledge of past behavior and apply it to new problems outside of its programming. For example, a deep learning program from Google was able, after being exposed to 10 million images, to recognize specific objects (like cats) twice as accurately as previous image recognition programs.

In all likelihood, general AI (the kind featured in sci-fi movies with independent, thinking robots) will be a part of everyday reality in the near future. This phenomenon will likely have a major impact on translating language.

Most online translation services work by dividing a sentence or phrase into smaller parts, referring to dictionaries to identify the equivalent words, and relying on post-processing to adjust the sentence structure according to the language’s specific grammatical rules. Anyone who has used one of these tools before knows the results are far from perfect.
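The three steps described above (split, look up, post-process) can be sketched in a few lines. This is a toy illustration only, not how any real translation service is built: the tiny English-to-Spanish dictionary, the part-of-speech tags, and the single reordering rule are all assumptions invented for this example.

```python
# Toy word-for-word translator with one post-processing rule.
# The dictionary entries and the reordering rule are illustrative assumptions.
DICTIONARY = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
    "runs": ("corre", "VERB"),
}


def lookup(words):
    """Step 2: replace each word with its dictionary equivalent and a tag."""
    # Unknown words are passed through unchanged with an 'UNK' tag.
    return [DICTIONARY.get(w, (w, "UNK")) for w in words]


def reorder(tagged):
    """Step 3: move an adjective after the noun it precedes (Spanish order)."""
    result = list(tagged)
    i = 0
    while i < len(result) - 1:
        if result[i][1] == "ADJ" and result[i + 1][1] == "NOUN":
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return result


def translate(sentence: str) -> str:
    """Run the three steps: split, dictionary lookup, post-processing."""
    words = sentence.lower().split()  # Step 1: divide into smaller parts
    tagged = reorder(lookup(words))
    return " ".join(word for word, _tag in tagged)


if __name__ == "__main__":
    print(translate("The red car runs"))  # el coche rojo corre
```

Even this small sketch hints at why such output is brittle: any word missing from the dictionary passes through untranslated, and every grammatical difference between the two languages needs its own hand-written rule.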

That’s why Google Translate is shifting to a new method. Previously, the service worked by using hundreds of narrow AI programs to translate text. Now that Google has begun implementing deep learning AI in its translation service, the program will actually be able to learn from past experience. This allows it, in theory, to avoid the kinds of errors that are commonplace in online machine translations.

The upgrade also allows it to bridge the gap between language problems it may not have encountered before. Perhaps it translates a Japanese text into English, then a Korean text to English. Early results indicate it will then translate Japanese to Korean with decent-to-remarkable accuracy.

Researchers call this “zero-shot translation,” and what makes it a breakthrough is that Google neural machine translation (the current moniker for the new translation program) basically “learned” how to achieve it independently. In fact, Google experts only have theories regarding how it accomplished this feat; they’re not entirely sure.

While this absolutely marks a major leap forward in online translation services, it doesn’t mean human translators will be replaced anytime soon. Effective translation requires not only an understanding of the nuances of language, but also an understanding of a given culture and how cultural attitudes impact the effectiveness of language. Until a machine can do that, it’s not quite human just yet.

Sirena Rubinoff is the content manager at Morningside Translations. She earned her B.A. and Master’s Degree from the Medill School of Journalism at Northwestern. After completing her graduate degree, she won an international fellowship as a Rotary Cultural Ambassador to Jerusalem. She covers topics related to software and website localization, global business solutions, and the translation industry as a whole.


Related News:


Making the most of machine translation

Localization Basics

Machine translation (MT) is ever-present in the translation industry. The technology is being used to shorten project timelines for language service providers (LSPs) and reduce costs for clients as they localize content around the globe. At this point it has become obvious, however, that MT cannot be used as a sole means of translation due to issues with its accuracy. Applications such as Google Translate have proven to be a successful way to translate documents or segments of text in general terms, but they are not able to capture the specific vernacular traits that human translators are capable of deciphering.


MT is a relatively new technology, with origins in the early 20th century, that was not considered a fully realized tool until advancements in the field in the 1980s and 1990s. Georges Artsrouni and Petr Smirnov-Troyanskii worked individually on ways to create MT in the 1930s, with Smirnov-Troyanskii laying the groundwork for what was needed in an MT system. Namely, Smirnov-Troyanskii suggested the system would require an editor familiar with the source language to convert words to base forms before sending them to a machine to turn them into equivalent forms in the target language. After this, another editor familiar with the target language would edit the machine translations.

Attempts to create a successful MT system continued into the 1950s and 1960s with the advent of computers. In 1954, IBM partnered with Georgetown University on an MT demonstration that was widely circulated in the media and considered quite a feat at the time. Although advancements continued in the field, the thought of translating languages digitally seemed less and less like a possibility and more like a flawed experiment.

Hopes were diminished to the point that the United States government created an advisory committee on the technology, ALPAC, which released an unflattering report on MT’s progress, saying research wasn’t progressing as it should. The report put a stop to MT research until important developments resurfaced in the 1980s.

MT research and development boomed in the 1990s and 2000s as we moved into the digital age. Neural machine translation (NMT) is making waves in the industry today.


MT serves a number of functions for companies within different verticals, including legal, life sciences, manufacturing, information technology, finance and consumer products. Businesses attempting to localize content have had success using MT to reduce costs and more efficiently convey global messages.

MT has many features that help to expedite translation projects, including:

  • Language Identification. A function that can quickly go through a large number of documents and determine which languages the text in these documents is written in.
  • Keyword Search. This allows users to search for certain terms that come up often in documents.
  • Optical Character Recognition. OCR is a technology that allows for the recognition of printed characters using digital technology.
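To give a feel for the first feature, here is a minimal sketch of language identification that scores a text against short lists of very common words. The word lists, file names and sample sentences are hypothetical, and production tools rely on far larger statistical models, so treat this as an illustration of the idea only.

```python
# Tiny stopword lists (hypothetical samples, far smaller than a real system would use).
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to"},
    "spanish": {"el", "la", "y", "es", "de"},
    "german": {"der", "die", "und", "ist", "nicht"},
}

def identify_language(text: str) -> str:
    words = set(text.lower().split())
    # Score each language by how many of its stopwords appear in the text.
    scores = {lang: len(words & stops) for lang, stops in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    # Report "unknown" when no stopword matched at all.
    return best if scores[best] > 0 else "unknown"

# Sort a small batch of documents by detected language.
documents = {
    "contract.txt": "the terms of the agreement",
    "carta.txt": "la casa es de madera",
}
for name, text in documents.items():
    print(name, identify_language(text))
```

Run over thousands of files, a function like this is what lets a team separate the documents that need translation from those that do not.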

Right now, MT allows users to get the gist of a translation without having to use a human translator to do so. There are instances in which the new technology can be a boon for a business, but, at the same time, its limitations outweigh its abilities if used the wrong way.

Very recently, Google and Microsoft started using NMT in their translation systems. NMT is a relatively new technology that is on pace to replace phrase-based machine translation (PBMT). It employs neural networks to process entire sentences or ideas at once, while PBMT can only decipher words, or segments of a sentence, at a time.

When MT works for business

Before using solo MT — without any linguistic editing — for business, users should ask themselves a handful of questions to see if the solution will be a successful one. The decision of whether to use MT should rely heavily on the following:

  • Is quality an important part of this project?
  • Is a quick turnaround time necessary?
  • Will this be distributed externally?
  • Is cost-effectiveness a priority?

Right away, companies should identify what kind of project they need to complete when deciding whether to use MT. If quality is of utmost importance, it will be impossible to use solo MT to receive accurate translations without the help of a human translator.

At the same time, if a law firm needs to quickly identify a large number of documents and find out which need to be translated for trial, using MT as a tool can be very helpful. A strong language identification tool can act as a time-saving and cost-effective feature for assignments that require going through a large amount of foreign language text before translation.

This feature can also be helpful for those facing tough deadlines, as MT can go through text much faster than a human translator to sort what needs translation and what doesn’t. But if a translation project is going to be distributed externally or to clients, it is imperative that one uses a human translator or a combination of machine translation and post-editing.

If cost is an issue, companies can leverage certain MT tools to mitigate their expenses. For language identification or in order to basically understand foreign language documents, MT is far cheaper than the use of a certified LSP.
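The four questions can be read as a simple decision rule. The sketch below is one possible reading of the guidance in this section, not an industry standard: the function name, its parameters and the exact tie-breaking between human translation and post-edited MT are assumptions made for illustration.

```python
def recommend_workflow(quality_critical: bool, external: bool,
                       fast_turnaround: bool, cost_sensitive: bool) -> str:
    """Suggest a translation workflow from the four checklist answers."""
    # Quality-critical or externally distributed content needs a human in the loop.
    if quality_critical or external:
        # Assumption: time or budget pressure tips the choice toward
        # machine translation followed by human post-editing.
        if fast_turnaround or cost_sensitive:
            return "MT plus human post-editing"
        return "human translation"
    # Internal, gist-level work (sorting, language identification) can use solo MT.
    return "solo MT"

# A law firm sorting internal documents on a deadline.
print(recommend_workflow(False, False, True, True))   # solo MT
# Client-facing material with no special time pressure.
print(recommend_workflow(True, True, False, False))   # human translation
```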

When MT doesn’t work

The biggest setback of MT is its inability to pick up on linguistic nuance and metaphors as humans can. This truism was proven by AlSukhni, Al-Kabi and Alsmadi in their study on using Google and Bing Translate to translate passages from the Quran.

For projects that necessitate quality, MT cannot be used as the only translation tool for businesses. Projects that should not rely only on MT include:

  • Those in highly regulated fields such as medical devices
  • Projects that will be distributed for external use
  • Projects that require the translation of nuanced, complicated texts

In Ryoko, Hirono, MJ and Takahiro’s article examining the usability of MT among nurses in Japan, it was discovered that respondents found MT was “not useful enough” when deciphering medical texts in a foreign language. The study also emphasized the fact that a stronger knowledge of technical terms in other languages made for a better experience using MT.

Know your audience

MT is an extremely valuable asset to businesses that know how to use it properly. MT tools used in a secure environment can be strong assets for those in any vertical. Maybe most importantly, businesses interested in MT need to know how they want to use the technology and what the scope of their project is to employ the resource effectively.

MT should never be used as the only resource for translating documents unless they will stay inside of a company. Even still, companies should always be aware of the fact that mistakes are likely to occur when only using MT.




A former newspaper reporter and native Minnesotan, Jake Schild is a staff writer in the marketing department at ULG.




Common Sense Advice about Machine Translation and Content

Translation Technology

You’d need to be living on the moon if you still don’t get it about how data quality impacts machine translation quality (actually, every kind of translation). But, what does this fact really mean when communicating with content creators?

Writers, and information developers generally, have to contend with all sorts of “guidance” about how they must create content to make it easily “translatable”. I am against that sort of positioning.

Content creators need and want guidance on how to make their content usable, not translatable. There is no conflict between making content readable in English and making it easily translatable, and vice-versa. There is a conflict between telling content creators to make their content translatable and not accounting for content style, source user experience, and especially the motivations and goals of the content creators themselves.

Well, I have been reading the Microsoft Manual of Style (4th Edition), recently published, and I am delighted to see there is a section called “Machine Translation Syntax”.

Microsoft Manual of Style 4th Edition. Sensible stuff about machine translation. Did I mention that I got a new bag from Acrolinx?

Here is what that section says:

“The style of the source language has significant impact on the quality of the translation and how well the translated content can be understood.”

The style of the source language. Brilliant appeal to the audience! What follows is a baloney-free set of 10 guidelines for content creators. Each guideline appears to be an eminently sensible content creation principle worth respecting, regardless of the type of translation technology being used, or even if the content is not explicitly destined for translation at the time of creation.

You can read the 10 guidelines on the Microsoft Press blog.

Well done Microsoft, again (no, I am not looking for a job). Let’s see more of this kind of thing from everyone!

I’ll do a review of my new Acrolinx bag when time allows.


Ultan Ó Broin (@localization) is an independent UX consultant. With three decades of UX and L10n experience and outreach, he specializes in helping people ensure their global digital transformation makes sense culturally and also reflects how users behave locally.

Any views expressed are his own. Especially the ones you agree with.




Google Translate is Finished. Again.

Language in the News, Translation Technology

We’ve heard this before. This time it’s somewhat truer. Google Translate itself (http://translate.google.com/) isn’t finished, but the API allowing third-party developers to use Google Translate as a service is. Google Translate in its own right will continue.

Google has deprecated the API because of excessive abuse (presumably from people using it to manipulate search results through mass translation of web content). The reaction from developers has been pretty hostile (see the comments). The translation industry, on the other hand, has stayed smugly silent, save for a few posts about the API’s demise and how it might affect existing professional tools, the language industry, and so on. How sad. Nobody really wins in this, I think.

Personally, I feel this move is a big loss to the world’s information-sharing efforts. Google Translate API is widely used by web and mobile app developers, and it is really playing a role in translating that explosion of community content that we hear about.

On top of all that, a bigger question remains: What developer, operating in the globalization space or otherwise, will trust using these (or indeed other) APIs in their development efforts again? Will existing adopters now have to back out Google Translate in favor of another API solution by the end of the year?

The Google Translate service isn’t all that bad for the free translation of non-domain-specific content and general use when your life doesn’t depend on it, but your purchase or vacation might. My position was that Google Translate, offered as a service directly or through websites and mobile apps, isn’t an alternative to paid translation but an alternative to no translation at all. And that is what many will now get for a while: No translation at all.

I guess Bing Translator and other solutions will win out in the API space now. However, I am sure that we have not heard the last from Google in the automated information translation–as a service–space…though you may have to pay something for it…


Ultan Ó Broin (@localization) is an independent UX consultant. With three decades of UX and L10n experience and outreach, he specializes in helping people ensure their global digital transformation makes sense culturally and also reflects how users behave locally.

Any views expressed are his own. Especially the ones you agree with.

