Focus
Dmitry Ulanov has been involved in localization since 1998 and risen through the ranks from a project manager to CTO. Now he is focused on automation of operations and also supervising the R&D department. He holds an MS degree and is an accredited internal auditor for ISO 9001 QMS.

Dmitry Ulanov

There are many areas in the translation business that may benefit from implementation of AI and particularly from machine learning (ML). Since AI is not completely predictable, implementation of any kind of AI also involves new risks. The introduction of AI may produce an unexpected impact or a positive breakthrough.
There are different areas to which an ML technique could be applied, and different learning approaches — supervised or unsupervised learning, reinforcement learning, transfer learning and so on. Searching for the best combinations is an interesting journey.




- Consider who is managing this project (and therefore who receives a suggestion). This allows ML output to be customized on a personal/departmental basis, so that data collected across different project managers can still be reused while “wrong” suggestions are eliminated.
- Reduce the weighting of records in the training/validation data set as they become obsolete. Thus, more recent data take precedence over older data, and data older than five years are not considered at all.
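The second adjustment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the linear decay, the `completed_on` field and the function names are hypothetical, since the article says only that weights shrink with age and that records older than five years are dropped.

```python
from datetime import date

# Assumption: five years approximated as 5 * 365 days.
MAX_AGE_DAYS = 5 * 365


def recency_weight(completed_on: date, today: date) -> float:
    """Return 1.0 for a record completed today, falling linearly to 0.0 at the cutoff."""
    age_days = max((today - completed_on).days, 0)
    if age_days >= MAX_AGE_DAYS:
        return 0.0  # older than five years: not considered at all
    return 1.0 - age_days / MAX_AGE_DAYS


def weighted_records(records, today):
    """Pair each record with its weight and drop records whose weight is zero."""
    weighted = [(r, recency_weight(r["completed_on"], today)) for r in records]
    return [(r, w) for r, w in weighted if w > 0.0]
```

The resulting weights could then be passed as per-sample weights to whatever training routine is in use; an exponential decay would serve equally well if older data should fade faster.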
The results of using this improved architecture have not yet been announced, but we already have a list of adjustments for the next iteration. Instead of a complete regular retraining of the model on the data set over the past five years, we will try to switch to reinforcement learning on the data that will come after the initial training of the network.
Apart from that, we can just compare a new project with completed past projects considering such characteristics as client/account, language pair(s), domain, service level agreement and a linguistic snapshot of text to be translated (hello, natural language processing). And we simply reuse resources from matching past projects where appropriate.
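A minimal sketch of such a comparison, in Python, might look as follows. The field names, the equal weighting of the five characteristics, the 0.6 threshold and the use of a plain set of key terms as the "linguistic snapshot" are all assumptions made for illustration, not a description of a production system.

```python
def jaccard(a: set, b: set) -> float:
    """Share of common elements between two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(new: dict, past: dict) -> float:
    """Average of five per-characteristic scores, each between 0 and 1."""
    score = 0.0
    for key in ("client", "domain", "sla"):  # exact-match characteristics
        score += 1.0 if new[key] == past[key] else 0.0
    score += jaccard(set(new["language_pairs"]), set(past["language_pairs"]))
    score += jaccard(set(new["terms"]), set(past["terms"]))  # crude linguistic snapshot
    return score / 5.0


def best_matches(new: dict, past_projects: list, threshold: float = 0.6) -> list:
    """Return (score, project) pairs at or above the threshold, best first."""
    scored = [(similarity(new, p), p) for p in past_projects]
    scored = [sp for sp in scored if sp[0] >= threshold]
    return sorted(scored, key=lambda sp: sp[0], reverse=True)
```

Resources (vendors, glossaries, translation memories) attached to the top-ranked past projects are then the natural candidates for reuse.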
So, the way to detect erroneous data could be either an extra analysis during a scheduled data processing task or a separate scheduled data analysis task. Both are supposed to identify cases that do not match a common pattern (via regression model). Automatic correction of data that seem incorrect would be too risky. Therefore, it is enough to report a potential error to the responsible employee who can verify and fix it if needed.
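As a rough illustration of the reporting approach, the Python sketch below fits a straight line to past records and flags those that sit far from it. The choice of fields (`words` and `hours`), the single-variable linear model and the two-standard-deviation threshold are assumptions; a real task would likely use more features.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def flag_suspicious(records, k=2.0):
    """Return records whose residual exceeds k standard deviations of all residuals."""
    xs = [r["words"] for r in records]
    ys = [r["hours"] for r in records]
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    spread = pstdev(residuals)
    if spread == 0:
        return []  # every record lies exactly on the line
    # Nothing is corrected automatically; the caller only reports these records.
    return [r for r, res in zip(records, residuals) if abs(res) > k * spread]
```

The flagged records would be sent to the responsible employee for verification rather than changed in place.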

There is another option that would work for a quick preliminary estimation of the deadline: training the network on the data of previously completed projects, taking into account the volume, language pairs, domain and other characteristics of the project. The estimate given by such a network will be approximate, but such an algorithm is much simpler to implement.
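To show the shape of such an estimator without the machinery of a trained network, the Python sketch below swaps in a much simpler baseline: the average throughput of comparable past projects. It is explicitly not a neural network, and the field names and the fallback to the full history are assumptions.

```python
from math import ceil
from statistics import mean


def estimate_days(new: dict, past_projects: list):
    """Estimate working days from the mean words-per-day of comparable past projects."""
    comparable = [
        p for p in past_projects
        if p["language_pair"] == new["language_pair"] and p["domain"] == new["domain"]
    ]
    if not comparable:
        comparable = past_projects  # assumption: fall back to the whole history
    if not comparable:
        return None  # no history at all, so no estimate
    words_per_day = mean(p["words"] / p["days"] for p in comparable)
    return ceil(new["words"] / words_per_day)
```

A trained model would replace the body of `estimate_days` while keeping the same inputs, which makes a baseline like this a convenient yardstick for judging whether the network actually helps.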
Here to help us are natural language processing techniques such as information extraction and named entity recognition. Running them on a project description received from a client in unstructured format, we could get all those project properties (such as volume, language pair, account, project name, deadline and many more) extracted as separate values. Then we just fill in a structured instruction template with them to get a draft of the vendor instruction. And if you do not have a corresponding project created in your project management system yet, it may be a good idea to create a project based on the extracted project properties.
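The extract-then-fill flow can be illustrated with a deliberately simplified Python sketch. Regular expressions stand in here for a real named entity recognition model, and the patterns, property names and template wording are all hypothetical.

```python
import re
from string import Template

# Assumption: simple patterns in place of a trained information-extraction model.
PATTERNS = {
    "volume": re.compile(r"(\d[\d,]*)\s*words", re.IGNORECASE),
    "language_pair": re.compile(r"from\s+(\w+)\s+(?:in)?to\s+(\w+)", re.IGNORECASE),
    "deadline": re.compile(r"(?:by|due)\s+(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
}

INSTRUCTION = Template(
    "Please translate $volume words from $source into $target. Deliver by $deadline."
)


def extract_properties(text: str) -> dict:
    """Pull volume, language pair and deadline out of a free-form description."""
    props = {}
    m = PATTERNS["volume"].search(text)
    if m:
        props["volume"] = m.group(1).replace(",", "")
    m = PATTERNS["language_pair"].search(text)
    if m:
        props["source"], props["target"] = m.group(1), m.group(2)
    m = PATTERNS["deadline"].search(text)
    if m:
        props["deadline"] = m.group(1)
    return props


def draft_instruction(text: str) -> str:
    """Fill the template; properties not found stay as visible $placeholders."""
    return INSTRUCTION.safe_substitute(extract_properties(text))
```

Because `safe_substitute` leaves unresolved placeholders in the draft, a project manager can see at a glance which properties the extraction step missed. The same dictionary of properties could feed project creation in the project management system.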
To deal with this, a company may consider either building its own ML microservices with Python/R or using the cognitive services provided by IT giants such as Google, Amazon and Microsoft, or by integrators like Intento (Figure 3).
Although the results of adopting ML could be amazing, they are accompanied by risks. So it makes sense to implement ML solutions in an operational environment, but to keep a human eye on what they do until you can trust the automation and properly evaluate the possible risks.