OpenAI has unveiled GPT-5, describing it as its “smartest, fastest, most useful model yet” and positioning it as a significant upgrade over previous versions GPT-3 and GPT-4, which will be deprecated. The company says GPT-5 is already in use by clients such as Amgen and BBVA for tasks ranging from coding to healthcare.
Despite the marketing push, published benchmarks suggest that GPT-5 offers only marginal gains, or even slight declines, in multilingual performance compared to some of OpenAI’s existing models.
Benchmarks Tell a Mixed Story
In the absence of a language-industry-specific demo, OpenAI’s release pointed to results from its GPT-5 System Card, which evaluated the model’s multilingual understanding using the MMLU benchmark. While MMLU is not designed as a translation-specific test, it is commonly used as a proxy for a model’s ability to understand and generate accurate text in different languages.
According to OpenAI, the MMLU test set was translated into 13 languages by professional human translators to assess knowledge and problem-solving capabilities. The system card notes that GPT-5’s language understanding is “generally on par” with existing models.
The published results show that GPT-5-main slightly underperformed OpenAI’s o3-high model in every language tested. The GPT-5-thinking variant fared somewhat better: it was on par in Brazilian Portuguese, slightly worse in Arabic, French, German, Italian, and Spanish, and marginally better in Bengali, Chinese, Hindi, Indonesian, Japanese, Korean, Swahili, and Yoruba.
Expanded Features, Not Multilingual Breakthrough
While the multilingual gains appear modest, OpenAI has introduced new features that could be relevant for some workflows. These include expanded context windows and a new “minimal” reasoning setting, which researcher Michelle Pokrass said allows users to trade reasoning depth for faster response times without switching between models.
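For readers who work with the API, the trade-off described above might look roughly like the sketch below. It only builds the request parameters and does not call any service. The helper name build_request, the model identifier "gpt-5", the "medium" fallback, and the shape of the reasoning field are assumptions for illustration and are not taken from OpenAI’s announcement; check the current API reference before relying on them.

```python
from typing import Any


def build_request(prompt: str, fast: bool) -> dict[str, Any]:
    """Build request parameters for a single prompt.

    Hypothetical helper: the model name and the layout of the
    "reasoning" field are assumptions, not confirmed by the article.
    """
    # "minimal" trades reasoning depth for a faster response;
    # "medium" is used here as the assumed deeper alternative.
    effort = "minimal" if fast else "medium"
    return {
        "model": "gpt-5",
        "input": prompt,
        "reasoning": {"effort": effort},
    }


if __name__ == "__main__":
    request = build_request("Translate 'Good morning' into German.", fast=True)
    print(request)
    # Sending the request would require the official client and an API key,
    # for example: OpenAI().responses.create(**request)
```

The point of the sketch is that the same model name is used in both cases; only the effort value changes, which matches the claim that users do not need to switch models to get faster responses.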
The company also highlighted improvements in voice capabilities—such as more natural speech, video integration, and smoother language switching across turns—though it confirmed that Voice mode is still powered by GPT-4o, not GPT-5.
Health interactions are another stated focus area, with OpenAI suggesting that accuracy gains could support healthcare AI interpreting and other clinical applications.
Adoption and Industry Outlook
OpenAI expects early adopters to drive “industry leadership” in AI applications powered by GPT-5, with benefits including faster decision-making and more effective collaboration. The model is now being rolled out to all ChatGPT Plus, Pro, Team, and Free users globally on web, mobile, and desktop.
For the language industry, however, the takeaway from OpenAI’s own benchmarks is clear: GPT-5 may offer new features and interface improvements, but its multilingual understanding remains largely in line with existing models, with no clear leap forward.

