sponsored content

Uber’s Generative AI System for Mobile Testing Cuts Costs and Improves Quality

Supported by LocWorld

Like many software companies, Uber spends a large chunk of time and money on mobile testing, including maintaining test scripts that don’t scale well across diverse cities and languages.

“The substantial maintenance costs of these tests significantly hinder their adaptability and reusability,” Uber engineers wrote in a company blog post. “[This] makes it really difficult for us to ensure Uber operates with high quality globally.”

Not only that, but bugs related to internationalization, such as unlocalized or truncated text, account for over half of Uber’s bug backlog, even though flagging them does not require a localization expert.

These concerns led Uber to develop a generative artificial intelligence (GenAI) tool for detecting and reporting bugs in its mobile app. Called DragonCrawl, the system boasts “the intuition of a human.” It earned Carolina Freire, Uber’s Senior Localization Experience Program Manager, second place in LocWorld 52’s Process Innovation Challenge (PIC) in October 2024.

“DragonCrawl detects bugs in a matter of seconds and raises the ticket for you,” Freire said in her PIC presentation. “It explains the issue, categorizes the bug, and provides a screenshot and all the labels that you need to track and resolve the bug.”

The code-free and goal-oriented GenAI solution needs to be trained only once in order to run in many locales and languages. It is also highly efficient for Uber — up to a thousand times smaller than some off-the-shelf large language models (LLMs).

So far, DragonCrawl has blocked many high-priority bugs from impacting Uber’s customers, all while saving developers thousands of hours and significantly reducing test maintenance costs. Uber’s blog post sums it up this way: “Scaling mobile testing and ensuring quality across many languages and cities went from humanly impossible to possible with the help of DragonCrawl.”
