In just 58 years, since the Massachusetts Institute of Technology (MIT) created the first artificial intelligence (AI) chatbot in 1966, the world has seen a series of monumental technological breakthroughs: first personal computers, then email, the internet, and smartphones. Each step forward arrived a few years after the last, unfolding at a quick yet manageable pace that allowed society time to adjust accordingly.
But 2024 has felt different, marked by a relentless acceleration driven by AI. Some advancements feel like they happened ages ago, yet they’re only months old. We’re no longer in a steady progression; it’s now a wave of rapid advancements that arrive almost daily. Feeling overwhelmed by the endless AI noise, I started a video blog, or vlog, called The AI Almanac. The project quickly resonated with thousands of people who felt the same way.
Each episode focused only on the most critical new AI advancements. With the year coming to a close, I set out to choose the top AI models of 2024. Then it hit me: The AI landscape is no longer about isolated product releases, but rather larger changes across many companies that are fundamentally reshaping how we interact with the world.
I now believe it’s better to evaluate the most important overarching trends that have emerged in the AI space. In this article, I highlight the eight biggest takeaways we can glean from the many updates that happened this year.
Artificial general intelligence (AGI), the idea that AI models could reason like humans, was effectively put to rest. After suspicions grew over the course of the year, the final nail in the coffin came when Apple released a research paper stating that large language models (LLMs) are excellent at recognizing patterns but show no signs of actual logical reasoning. Even “chain-of-thought” (CoT) techniques designed to simulate reasoning (used in the OpenAI o1 models) have been exposed as merely imitating learned patterns rather than engaging in an actual thought process. The revelation aligns with the growing consensus that the tech industry’s next hurdle lies in building models that move beyond reproducing trained patterns to achieve genuine logical reasoning.
While many people are suddenly emerging from the shadows claiming that they have known this all along, others still hold out hope — even OpenAI seems to still believe AGI is achievable. Personally, I’m open to the idea that they might know something we don’t.
Regardless, this year marked a major turning point for AI as the hype finally began to cool. Investors who once poured money into speculative projects are now demanding real results, shifting from flashy AI toys to practical applications that prove genuine value.
In MultiLingual magazine’s December 2023 issue, I predicted that generative AI (GenAI) would change how we find information online. That quickly became reality in 2024. While Perplexity.ai was already around, Google’s overhaul of its legacy algorithms and launch of AI summaries, alongside Bing and OpenAI’s SearchGPT, pushed generative search into the mainstream.
Generative engines synthesize data from multiple sources using LLMs to provide immediate, natural language responses. This reduces the need to visit individual sites or dig through links, as with traditional search engines. This shift has spawned a new search optimization method called generative engine optimization (GEO), which focuses on aligning content with generative patterns rather than on traditional search engine optimization (SEO) tactics, like keyword stuffing and backlinks, that can reduce visibility in generative environments. For example, keyword-heavy content, once central to SEO, can lower visibility in generative engines by up to 10%.
The rise of generative search engines has raised anti-competitive concerns among government bodies across the globe. Generative engines often use content without consent, redirect traffic away from original sources, and cause significant drops in site visits. To prevent these effects, publishers must opt out of all search indexing (including Google). This means companies are essentially forced to allow AI access or risk being excluded from the digital marketplace.
With technology fatigue on the rise, the shift to generative engines feels like a relief for some people — though many others aren’t even aware it’s happened. For me, I’m happy I can now snap a picture of a random mushroom in the forest to ask Google what it is, or get a smoothie recommendation without digging through endless links. Personally, I’m not going back.
But this development has many companies worried. Websites could become obsolete if users no longer need to visit them. I’m constantly pulled into many board conversations where SEO teams are under growing pressure to figure out how to handle optimizing their content for generative environments as traffic continues to drop.
GenAI became embedded in nearly every social media platform, laptop, and mobile phone this year, weaving itself into the daily lives of users worldwide. Platforms like Facebook, Instagram, WhatsApp, and even LinkedIn introduced AI features that leverage vast amounts of user data, bringing data privacy to the front of the conversation after years of neglect. In particular, Meta’s approach to training on user data hit roadblocks in regions like Europe and Brazil, signaling a pushback against tech giants’ extensive data usage. Meanwhile, “consent” has become almost meaningless to many people as users navigate hidden policies that are impossible to find or turn off.
What many people don’t realize is that companies like Meta operate far beyond traditional stateless models; their AI systems are deep contextual engines designed to analyze decades of user behavior, tailoring responses based on highly personalized insights. This goes far beyond surface-level personalization — it’s an attempt to predict and shape user interactions at an unprecedented scale to execute almost everything.
Apple took a notably different approach with the launch of Apple Intelligence, which features the most advanced privacy-protective AI infrastructure available to everyday users. Apple’s Private Cloud Compute infrastructure ensures that data remains encrypted and inaccessible even to Apple, gives every user a private cloud instance, and sets a new benchmark for privacy protection. Unlike the data-driven AI models of Meta and LinkedIn, Apple’s approach prioritizes user security over data exploitation, signaling a crucial shift as the demand for privacy in AI grows.
Unfortunately, Apple Intelligence was the year’s most overlooked AI launch. While not as flashy as competitors, Apple’s approach set a new privacy standard, standing out as one of the few truly protecting everyday users. As a data privacy advocate, I feel reassured using Apple devices — a trust I don’t extend to others.
Many people believe Apple is falling behind. I believe that they know that “slow and steady wins the race.” Apple, I’m keeping my eye on you.
In 2024, the language industry saw significant advancements, a natural evolution given that LLMs were originally designed with translation in mind. More companies focused on developing task-specific, proprietary models to improve precision and quality assurance (QA). DeepL also entered the interpreting space with its launch of DeepL Voice, showcasing its potential to break down language barriers in real time and foster communication in virtual settings like Microsoft Teams.
One standout release sparked deeper questions about the evolving role of human translators: Translated’s launch of Lara, the industry’s first native LLM developed with NVIDIA. Lara set a new accuracy benchmark, claiming only 2.5 errors per thousand words — 50 percent fewer than the industry standard of five errors with professional human translators.
While some fear these advancements will edge out human translators, the idea of humans as essential copilots in the AI-driven translation process is surging. Powerfully articulated by Marina Pantcheva of RWS, this idea resonates across the industry: “Thirty years ago, linguists were at the center of the translation process. As technology and automation advanced, they were pushed to the periphery, ending up somewhere ‘in the loop.’ Now, it’s time for linguists to take their seat in the cockpit, from where they guide the development of linguistic AI.”
The video dubbing industry saw significant advancements with the integration of AI to enhance content accessibility and localization. Meta, YouTube, and TikTok all launched AI-driven dubbing features to help creators reach broader audiences by providing multilingual content. TikTok’s Symphony AI Dubbing tool automatically detects the original language in a video, then transcribes the speech, translates it, and produces a dubbed version in the selected languages, enabling creators and brands to produce content that resonates across cultures. Similarly, HeyGen launched an AI-powered video translation feature that clones the user’s facial expressions, natural speaking voice, and style. This allows seamless delivery in multiple languages and makes content more accessible to a global audience.
While some fear AI will replace human translators, others believe it will lead to an unprecedented increase in translated content, keeping professionals engaged. This perspective aligns with RWS Trados’s “translate everything” slogan and was echoed by Gabriel Fairman of Bureau Works at the American Translators Association (ATA) conference this year. As social media platforms enable creators to reach more audiences and companies find translation more accessible, this trend seems likely.
While I can agree AI will boost translation volumes in commercial sectors, I do not believe this surge will extend by default to government or specialized fields where reaching broader audiences doesn’t directly drive sales.
In 2024, countless new AI-powered tools emerged to transform traditionally human-driven arts like film, music, and visual media. AI video generators (such as OpenAI’s Sora and Runway’s Gen-2) produced high-quality, realistic clips that simulate human emotions and animations, creating visuals that look as if they were straight out of a movie. Their potential quickly attracted attention in the film industry as an opportunity to reduce costs and achieve complex visual effects without large budgets. On the music side, AI platforms like Suno and Udio launched to allow users to generate music from scratch, replicating popular artists’ styles and even inventing new genres.
While these tools expanded creative possibilities, they also raised significant questions about the legality of their outputs. Major record labels (like Universal, Sony, and Warner) filed their first lawsuits against AI music generators for using copyrighted material (or artist likeness) in outputs, while Hollywood unions went on strike to secure protections against AI-generated scripts and digital likenesses. Visual artists and authors also filed lawsuits over the unauthorized use of their work in AI training datasets.
We all knew this was coming. As a former Sony Music employee who worked in copyright compliance, I weirdly see both sides of the argument. While my opinion may not be entirely popular, the truth is we are touching an area that copyright law has never considered. Copyright law mainly addresses public distribution, not private use within companies, and that leaves a gap. AI companies argue that using copyrighted works privately falls outside traditional copyright rules. Similarly, an artist’s likeness (such as voice and style) was never traditionally considered copyrightable, but laws (such as Tennessee’s ELVIS Act) are starting to change that.
The bottom line is AI model creators are currently operating in a legal gray area, similar to Spotify’s early days before copyright law adapted to streaming. I predict this will lead to a lose-lose situation, with new copyright provisions being established moving forward rather than significant repercussions retroactively.
In January, OpenAI quietly removed its military-use ban from its terms of service and confirmed collaboration with the United States (US) Pentagon, sparking global concerns about private companies’ collaboration on military AI applications. Multiple summits throughout the year continued to address military AI use; over 90 nations convened in Seoul to establish guidelines for responsible use in conflict, especially as AI-enabled drones gained prominence in the Russia-Ukraine war. Many organizations like the North Atlantic Treaty Organization (NATO) also updated their AI guidelines.
The European Union (EU) passed the AI Act, categorizing AI applications into risk tiers and enforcing strict rules for high-risk AI systems, while the Council of Europe introduced an international treaty to protect human rights against AI. Microsoft also banned US law enforcement from using Azure AI for facial recognition, citing risks of bias in high-stakes environments.
Six years ago, I worked with AI on US government drone ships, so this isn’t exactly new. But with GenAI as the buzzword, even governments are centering non-stop discussions around it. I still find it ironic that model makers emphasize “putting humans first” while essentially saying, “Our priority is humanity’s safety, but let’s teach the robots Sun Tzu just in case.”
As millions of people interact with AI models regularly, the environmental impact of those models is quickly increasing.
Companies like OpenAI and Google are realizing they can’t keep up with AI’s skyrocketing energy demands without serious infrastructure changes. These companies heavily invested this year in nuclear power, highlighting a push in the tech industry for sustainable energy. Balancing nuclear power with safer renewables like wind and solar — and ensuring government oversight — will be essential to avoid environmental and safety risks.
AI has reshaped our world in profound ways this year, sending ripples we’re only beginning to grasp. From the surge in data privacy concerns to groundbreaking strides in creative fields, translation, and everyday tech, the rapid transformation has exposed both immense potential and significant risks that society is racing to keep up with. As we look to the future, it’s about not only advancing AI, but also ensuring it aligns with values that protect and empower. The road ahead will demand careful balance, responsibility, and an unwavering commitment to ethical innovation that truly serves humanity.
Veronica Hylak is co-CEO of Metalinguist, an award-winning AI product innovator, and host of The AI Almanac vlog. With 10 years of experience working with Fortune 500 companies, the US government, and startups, she has led many high-impact projects and loves to build things that solve problems.