The Biggest AI Model Releases of 2024 and Their Impacts on the Language Industry

December 3, 2024

OpenAI, Meta, Google, and Anthropic introduced major artificial intelligence (AI) model updates this year, which primarily focused on three key trends:

higher-performing multi-modal large language models (LLMs), combining video, audio, and text;
the growth of AI agents; and
the rise of small language models (SLMs) that are cheaper to run and more task-specific.

OpenAI

Sora: Sora was the first major text-to-video model launch capable of generating realistic videos up to a minute long from textual descriptions. (February)
GPT-4o: GPT-4omni is a model with voice and computer-vision capabilities, transforming ChatGPT into a virtual assistant with functionalities like real-time language translation/interpreting and tone modulation. (May)
o1 Model: The o1 model focuses on reasoning-heavy tasks by leveraging a technique called chain-of-thought (CoT) reasoning. (September)

OpenAI’s Sora was an impressive launch but never became publicly available, leaving many questions about its effectiveness. GPT-4omni, meanwhile, changed the game for me — I now have full conversations with GPT as I drive in the car or tackle random questions on the go. Its on-demand speech translation is also handy enough in a pinch.

Google

Gemini 1.5 Series: These models offer faster output speeds and lower latency that allows for more affordable application development. However, Google’s Gemini series faced significant backlash when its image generation feature produced historically inaccurate and offensive depictions, causing it to be pulled off the market for a period of time. (February)
AlphaChip AI: Google DeepMind announced AlphaChip, an AI-driven method for designing electronic chip layouts, marking a significant advancement in AI-assisted hardware design. (September)

Remember when Google’s Gemini was called Bard? That change happened this year, thankfully. Google struggled to regain credibility after the backlash over flaws in its image generation, such as depicting historical figures incorrectly (such as George Washington as African American). The controversy brought to the surface the influence model creators actually have that can significantly affect output, with diversity, equity, and inclusion (DEI) policies or internal biases leading to distorted results.

Anthropic

Claude 3.5: Anthropic’s most popular launch was Claude 3.5 Sonnet, an advanced model that excels at understanding nuanced instructions and generating sophisticated analyses. It features a Prompt Playground for developers to efficiently create and refine prompts for AI application development. (June)
Computer Use Capability: Anthropic launched a “computer use” feature, enabling AI to essentially control your computer and perform tasks similar to human computer interactions (such as moving cursors, typing, and browsing the internet). This feature has been adopted by companies like Asana, Canva, and DoorDash. (October)

The word on the street from some of the underdogs I’ve spoken to is that Claude 3.5 Sonnet is becoming more favored in AI translation pipelines. The computer-use feature launch signaled the next era of AI capabilities.

The Biggest AI Model Releases of 2024 and Their Impacts on the Language Industry

OpenAI

Meta

Google

Anthropic

RELATED ARTICLES

Intento’s State of MT 2023 Report Evaluates 37 MT Engines and 5 LLMs

The Week in Review: August 11, 2023

Hera AI – The Future of Automated Post-Editing, LQA and QE

ChatGPT’s o3‑pro and Real-Time Voice Translation Mark Major Leap in Multilingual AI

Google unveils generative AI tools at developer conference

Weekly Newsletter, Subscribe to stay updated!

Login or Register