Millions of people around the world experience permanent voice loss due to injuries or diseases like cancer and amyotrophic lateral sclerosis (ALS). The inability to speak not only severely limits communication, but can also negatively affect a person’s sense of identity and mental health.
Many people with speech impairments utilize adaptive alternative communication devices such as speech synthesizers. But these electronic voices often sound robotic and lack features matching the person’s identity, such as correct age, accent, and pitch.
Fabio Minazzi, Director of Audiovisual Services at Translated, hopes to change that with an initiative called “Voice for Purpose: A Multilingual Voice Donation Ecosystem Empowering Individuals With Vocal Disabilities to Communicate Worldwide.” The cloud-based and artificial intelligence (AI)-enabled technology won Minazzi the top prize at LocWorld 52’s Process Innovation Challenge in October 2024.
Using donated recordings of diverse people speaking in English, French, Italian, German, or Spanish, Translated has developed expressive AI voices that render in real time on any device. The innovation includes an online platform for volunteers to donate their voice and patients to select their desired voice.
Minazzi spoke with MultiLingual about what makes the project truly innovative, as well as his goals for expanding its impact.
At Translated, we like challenges, and we have a mission: enable everyone to understand and be understood in their own language. So when Italian actor Pino Insegno approached us with the challenge of using AI for the purpose of helping the voiceless, we recognized a new opportunity to accomplish our mission.
We already had many years of operating AI-based solutions for written communication under our belt. As a natural evolution, we had also developed high-quality voice algorithms for dubbing, with very satisfactory results. We studied the state of the art in speech synthesis for ALS patients and realized our infrastructure for running real-time interactions with linguists across the world could be a key element for delivering an innovative service. Basically, we started the “Voice for Purpose” project because we felt our experience and infrastructure could be used to help people.
Voice donation is a new concept, and so is receiving a donated voice. Therefore, the first challenge was presenting those two new concepts in a simple way for people who land on the portal — so that they would feel confident in becoming part of the project, whether as a donor or a user.
Another challenge was making the platform open to everyone and still protecting it from inappropriate usage. Each step was designed and tested carefully, balancing simplicity and completeness, informality and security. Balancing the technical aspects with the user experience occupied much of our thoughts.
For example, we considered requiring the sample text that a donor records to strictly adhere to the script we provided. But then we realized that most donors are ordinary people without voice training, and they might make mistakes; if we had penalized deviations from the script, we would have frustrated them and lost their precious contributions. We carefully measured the impact of those deviations on the final result and decided that we would only ask the donor to retry if the sample was silent for any reason.
Now that we have collected the voices of more than 5,000 people of different ages, nationalities, and levels of education, we can say that keeping the platform simple was the right choice.
There are several areas that make us proud. From the social point of view, we created one more opportunity for people to be generous — they can donate blood, cells, biological tissues, and now also voice. We have cases of people who recorded their voice for self-donation because they know they will lose their voice, which is also a new concept introduced by “Voice for Purpose.” These individuals are happy to make their voices available to other people, too.
From the medical point of view, we’ve discovered that donated voices have a measurable advantage compared with the standard, impersonal voices that are often used. Age, cadence, accent, and timbre can more easily be matched with a donated voice.
From the technical point of view, we are proud of having introduced real-time voice delivery for medical applications over the cloud and at scale, making the power of cloud computing available to any patient in the world. The algorithms we implemented are in fact too complex to be executed on many medical devices, which have not been designed with high-quality speech synthesis in mind.
We’re expanding both on research and engagement. On the research side, we’ve just been awarded a grant for 200,000 computing hours to expand to more languages and to research new input methods. For some pathologies, the lack of voice is the main impairment; in other cases, like ALS, typing text on a keyboard is a daunting or impossible effort, which makes composing sentences a major barrier to communication. If we want everyone to be understood in their language, we need to lower that barrier.
From the engagement point of view — now that the platform is up and running — we are building relationships with clinical centers around the world to spread the word, reach more patients, and help more people.
The localization industry is made up of individuals who spend their lives connecting people through language. We conceived “Voice for Purpose” as an open project, so we would welcome other players in the industry to join us and support it. We want “Voice for Purpose” to become a global platform that represents the best values that inspire our industry. It would be great to see a pool of localization companies back the “Voice for Purpose” project in the coming months. If that interests you, get in touch!