Voiseed, a new Milan-based startup, aims to shake up the audio localization industry with a deep-learning virtual voice engine that allows content creators to easily and affordably create unique, expressive voice content.
Localization key to success
In today’s global world, there is huge and growing consumer demand for high-quality localization – the translation and transformation of content into consumers’ own languages and cultures. And while English may be the default language in the tech industry, the consumer reality is quite different. For example, of the top five countries/markets for game revenue (China, the US, Japan, South Korea, and Germany), English is the dominant language in just one. In fact, of the total game revenue from the top ten, only about 35% comes from English-speaking countries/markets.
As Voiseed’s Co-founder and CEO, Andrea Ballista knows these kinds of numbers show that for content creators to succeed in increasingly competitive international markets, they can no longer rely on poor translations and robotic, artificial voices. They must translate and transform the words and speech into an equally satisfying and useful experience for users around the world.
“The voice is the primary medium in human interaction, capable of conveying a wide range of emotions that go far beyond the simple meaning of the words being spoken.”
Faster, faster, faster…
Yet, while technology has increased the efficiency of almost every step in the content creation process, consumers now expect new content and investors expect more profits within an ever-shrinking time frame. New releases, frequent updates, and promotional tie-ins are the new norm, and as if creating them in one language wasn’t difficult enough, it now needs to be done in multiple languages simultaneously and instantly.
Looking for a better way
According to Ballista, while TTS (text-to-speech) systems already exist in the market, the voices they create are just not expressive, controllable, or easy to use. Voiseed, he says, is starting from use cases and creating the best tech to solve them.
His goal has always been to create an audio environment that seamlessly supports and enhances the user experience. A computer music studies graduate with almost 30 years in the industry, Ballista co-founded Binari Sonori in 1994 and became audio director at Keywords Studios in 2015. He is well aware of the challenges and costs involved in creating the perfect soundscape.
“We have been working on multiple projects with more than 100 actors for the original production. If you want to localize them for five or more languages, you might need up to 500 additional international actors, making it extremely complicated, slow, and expensive.”
Ballista also knows the potential for a game-changing technology like Voiseed’s virtual voice engine. He estimates the latent market in the industry to be in the billions of dollars, with Voiseed’s serviceable obtainable market share being $100M of a $4B serviceable available market.
New funding brings realization even closer
Recently, Voiseed’s ground-breaking technological innovation was recognized by the European Innovation Council (EIC) Accelerator program of the European Commission. From an extremely competitive field of over 4,000 start-ups and SMEs in fields including healthcare, digital technologies, energy, biotechnology, and space, Voiseed was selected as one of 65 companies to receive more than 3 million EUR in grant funding and equity investment.
“The EIC Accelerator is a unique European funding instrument of the European Innovation Council. It supports the development of top-class innovations through crowding-in private investors and offers a portfolio of services to support their scaling-up. With the European Innovation Council, we aim to bring Europe to the forefront of innovation and new technologies, by investing in new solutions for the health, environmental and societal challenges we are facing.”
-Mariya Gabriel, Commissioner for Innovation, Research, Culture, Education, and Youth
Harnessing the power of AI
On top of the tech, Voiseed has built a Virtual Studio platform where users can upload the text and audio of the original dialogue and its translation in all the relevant languages. Using unique deep learning algorithms and multi-character and multilingual voice emotional capabilities, the system creates natural and emotional character voices in all the target languages, transferring the expression of the source audio.
In addition to crafting unique voice textures, users will be able to control the emotional profile of the voice performance, edit the target text in real-time, and transfer the style and emotion from any reference audio file into any language.
A new solution is on the horizon
With years of experience in international voice production and localization in the entertainment market, AI research, software design, and global business management, Voiseed’s full-time core development team plans to start offering game dubbing in eight languages in 2022, adding video dubbing in sixteen languages in 2023, and full language market services in thirty-two languages by 2024.