| WHITE PAPER |
Software Code Internationalization
Self-Healing with AI
In preparing products for global markets, developers need to internationalize software code. Because developers are not always versed in every potential internationalization issue, such issues are often overlooked. Static code analysis tools are widely used in the industry to get the job done, but they have their own limitations.
Currently available static code analysis tools have both technical and business limitations. Technical limitations of current tools include:
- Being largely rule-based
- Significant number of false positives in the results
- No way to fix issues without human intervention
Business limitations of these tools include:
- Cumbersome to use
- High cost of ownership
- Inconsistent quality
NetApp’s software code internationalization self-healing with AI is a method that learns from the history of the source code to identify and categorize issues based on past mistakes and to generate suitable fixes. It applies artificial intelligence (AI) through a combination of classification learning models and pattern recognition models.
This approach breaks the source code into snippets and then tokenizes each snippet to identify all the elements present in it. A trained model determines whether a given piece of code contains an internationalization violation. If a snippet does contain an issue, the system further identifies the type of issue, such as “hardcoding,” “concatenation,” or “date/time formats.”
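The tokenization step described above can be sketched as follows. This is a minimal illustration, not NetApp’s implementation: the token pattern, the `tokenize` helper, and the `has_string_literal` stand-in (which a trained classifier would replace) are all hypothetical.

```python
import re

# Hypothetical token pattern: string literals, identifiers, single symbols.
TOKEN_RE = re.compile(r'"[^"]*"|\'[^\']*\'|\w+|[^\s\w]')

def tokenize(snippet):
    """Split a code snippet into tokens (strings, identifiers, symbols)."""
    return TOKEN_RE.findall(snippet)

def has_string_literal(tokens):
    """Heuristic stand-in for the trained model: flags string literals,
    a common signal for the 'hardcoding' violation class."""
    return any(t.startswith(('"', "'")) for t in tokens)

line = 'label.setText("Submit order");'
tokens = tokenize(line)   # the hard-coded "Submit order" appears as one token
flagged = has_string_literal(tokens)
```

In the real system, the tokens would be fed to the trained classifier rather than to a hand-written heuristic; the point of tokenizing first is that the model sees discrete elements (identifiers, literals, operators) instead of raw text.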
The user receives a highly accurate report containing very few false positives or false negatives.
This approach identifies issues automatically and generates fixes for them; hence, it is called a self-healing system.
The proposed method uses modules that are divided into backend and frontend.
Backend modules save the data extracted from the source code and host the ML models, which are published as application programming interfaces (APIs) to be consumed by the frontend systems.
1. Classification of code lines: training and testing [BEMOD1]: This module takes the source code being scanned and divides it according to its logical completeness (e.g., line-wise, function-wise). It then converts the code into vectors, labels each snippet with an error type and recommended fix, and passes it to the BEMOD2 prediction module.
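The vectorization and labeling that BEMOD1 performs could look like the sketch below. This is an illustrative assumption, not the paper’s actual model: the feature set (string-literal count, concatenation operators, a locale-sensitive date API) and the label vocabulary are invented for the example.

```python
def features(tokens):
    """Turn a token list into a small numeric feature vector.
    The three hypothetical features track common i18n violation signals."""
    return [
        sum(t.startswith(('"', "'")) for t in tokens),  # string literals
        tokens.count('+'),                              # concatenation operators
        int('SimpleDateFormat' in tokens),              # locale-sensitive date API
    ]

# A labeled training row as BEMOD1 might emit it:
# (feature vector, error type, recommended fix).
snippet_tokens = ['label', '.', 'setText', '(', '"Submit"', ')', ';']
row = (features(snippet_tokens), "hardcoding",
       "externalize the string to a resource bundle")
```

A production system would use a richer embedding (e.g., bag-of-words over the code base’s vocabulary or learned token embeddings), but the shape of the output is the same: one vector plus one label per snippet.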
2. Code line prediction [BEMOD2]: This module takes a snippet of code from BEMOD1 and identifies the type of violation it contains, based on the model trained for classification. It then generates the fix based on pattern recognition.
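Pattern-based fix generation can be sketched as a table of violation classes mapped to rewrite rules. Everything here is hypothetical: the `FIX_PATTERNS` table, the key-naming scheme, and the `bundle.getString` resource-bundle call are assumptions standing in for whatever fix templates the trained system learns.

```python
import re

# Hypothetical fix patterns keyed by the violation class BEMOD2 predicts.
FIX_PATTERNS = {
    # Replace a hard-coded literal with a lookup through a resource-bundle
    # helper; the key is derived mechanically from the literal's text.
    "hardcoding": (
        re.compile(r'"([^"]*)"'),
        lambda m: f'bundle.getString("key_{m.group(1).lower().replace(" ", "_")}")',
    ),
}

def generate_fix(snippet, violation):
    """Apply the rewrite rule for the predicted violation class."""
    pattern, repl = FIX_PATTERNS[violation]
    return pattern.sub(repl, snippet)

fixed = generate_fix('label.setText("Submit order");', "hardcoding")
```

In the described system, the patterns are learned from past fixes in the code history rather than written by hand; the mechanics of applying them to a flagged snippet are the same.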
3. Saving predictions for further model learning [BEMOD3]: This module saves all predictions made by BEMOD2 to the database for continued model learning.
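A minimal sketch of the persistence step, using an in-memory SQLite database as a stand-in for the backend store. The schema (snippet, violation, suggested fix, user verdict) is an assumption; the `user_verdict` column illustrates how the interactive report’s feedback could flow back into training.

```python
import sqlite3

# In-memory store standing in for the backend database BEMOD3 writes to.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE predictions (
    snippet TEXT, violation TEXT, suggested_fix TEXT, user_verdict TEXT)""")

def save_prediction(snippet, violation, fix, verdict=None):
    """Persist one BEMOD2 prediction; user_verdict is filled in later
    from the interactive report and feeds the next training cycle."""
    conn.execute("INSERT INTO predictions VALUES (?, ?, ?, ?)",
                 (snippet, violation, fix, verdict))
    conn.commit()

save_prediction('label.setText("Submit");', "hardcoding",
                'label.setText(bundle.getString("key_submit"));')
rows = conn.execute("SELECT violation FROM predictions").fetchall()
```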
Frontend modules interface between the users and the trained ML model(s) and their scanning capability.
1. Scanner [FEMOD1]: The Scanner scans the code, breaks it into logical snippets, tokenizes it, and sends it to BEMOD2 for prediction.
2. Interactive Reports [FEMOD2]: Display and confirm violations: These reports display the results of code scanning and the recommended fixes based on the training. They allow the user to mark any reported issue as a false positive or any missed issue as a false negative. The final report is used to apply the fixes automatically.
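The confirm-then-fix workflow above can be modeled with a small data structure. The `Finding` record and `final_report` filter are illustrative assumptions; the point is that only findings the user has not dismissed survive into the automatic-fix step.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    snippet: str
    violation: str
    fix: str
    false_positive: bool = False  # toggled by the user in FEMOD2

def final_report(findings):
    """Only confirmed findings survive into the auto-fix step."""
    return [f for f in findings if not f.false_positive]

findings = [
    Finding('label.setText("Save");', "hardcoding",
            'label.setText(bundle.getString("key_save"));'),
    Finding('log.debug("enter");', "hardcoding", "",
            false_positive=True),  # user dismissed: debug logs need no i18n
]
confirmed = final_report(findings)
```

Because the user’s verdicts are saved alongside the predictions (BEMOD3), each dismissed false positive also becomes a labeled counterexample for the next training cycle.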
3. Plugin [FEMOD3]: These modules show the same results inside an Integrated Development Environment (IDE), displaying the data on live code by highlighting it in appropriate colors.
Advantages over Previous Solutions
This technology minimizes internationalization effort, as most of the work is done by the self-healing system, which is ready to operate after initial training without the need for rule-based scanning.
This solution is highly accurate because the system derives its patterns from real data, and bug fixing is automated as well.
For more information, write to firstname.lastname@example.org