The Biggest AI Model Releases of 2024 and Their Impacts on the Language Industry

OpenAI, Meta, Google, and Anthropic introduced major artificial intelligence (AI) model updates this year, which primarily focused on three key trends:

  • higher-performing multi-modal large language models (LLMs), combining video, audio, and text;
  • the growth of AI agents; and
  • the rise of small language models (SLMs) that are cheaper to run and more task-specific. 

OpenAI

  • Sora: Sora was the first major text-to-video model, capable of generating realistic videos up to a minute long from text descriptions. (February)
  • GPT-4o: GPT-4o (the “o” stands for “omni”) is a model with voice and computer-vision capabilities, transforming ChatGPT into a virtual assistant with functionalities like real-time language translation/interpreting and tone modulation. (May)
  • o1 Model: The o1 model focuses on reasoning-heavy tasks by leveraging a technique called chain-of-thought (CoT) reasoning. (September)

OpenAI’s Sora was an impressive launch but was unavailable to the public for most of the year, leaving many questions about its effectiveness. GPT-4o, meanwhile, changed the game for me: I now have full conversations with ChatGPT as I drive or tackle random questions on the go. Its on-demand speech translation is also handy in a pinch.

Meta

  • Llama 3 Series: Meta released the open-sourced Llama 3 series, including versions 3.1 and 3.2. Llama 3.2 introduced vision-capable LLMs and lightweight text-only models designed for edge and mobile devices. (July and September)
  • Meta Movie Gen: Meta unveiled an AI video-generating tool capable of creating videos up to 16 seconds long from text prompts, reportedly outperforming competitors like OpenAI’s Sora and Google’s Veo. (October)

Llama is a favorite among model builders because of its cost savings. However, I’m concerned about data privacy: if our Facebook data has been used to train these models, what are the implications of then releasing them as open source?

Google

  • Gemini 1.5 Series: These models offer faster output speeds and lower latency, which allow for more affordable application development. However, Google’s Gemini series faced significant backlash when its image generation feature produced historically inaccurate and offensive depictions, causing the feature to be paused for a period of time. (February)
  • AlphaChip AI: Google DeepMind announced AlphaChip, an AI-driven method for designing electronic chip layouts, marking a significant advancement in AI-assisted hardware design. (September)

Remember when Google’s Gemini was called Bard? That change happened this year, thankfully. Google struggled to regain credibility after the backlash over flaws in its image generation, which depicted historical figures incorrectly (for example, George Washington as African American). The controversy brought to the surface how much influence model creators have over output, with diversity, equity, and inclusion (DEI) policies or internal biases potentially leading to distorted results.

Anthropic

  • Claude 3.5: Anthropic’s most popular launch was Claude 3.5 Sonnet, an advanced model that excels at understanding nuanced instructions and generating sophisticated analyses. It features a Prompt Playground for developers to efficiently create and refine prompts for AI application development. (June)
  • Computer Use Capability: Anthropic launched a “computer use” feature, enabling AI to essentially control your computer and perform tasks the way a human would (such as moving the cursor, typing, and browsing the internet). This feature has been adopted by companies like Asana, Canva, and DoorDash. (October)

The word on the street from some of the underdogs I’ve spoken to is that Claude 3.5 Sonnet is becoming more favored in AI translation pipelines. The computer-use feature launch signaled the next era of AI capabilities.

Veronica Hylak
Veronica Hylak is a 1x founder, award-winning AI product innovator, and host of the vlog “Hey AI”. With 10 years of experience working with Fortune 500 companies, the US government, and startups, her current focus is on go-to-market strategy, regulation, and AI ethics.

MultiLingual Media LLC