Voice conversion is a new technology that is capable of drastically changing the way people with speech disabilities function in their everyday lives.
The Respeecher team is constantly looking for new ways to advance its technology. On our journey, we have met many extraordinary people. One of them is a young scientist named Konrad Zieliński, a Ph.D. student at the University of Warsaw who lost his voice due to a laryngectomy. His need for novel solutions for people in similar situations was the inspiration for the topic of his thesis, the main focus of his company, Uhura Bionics, and the beginning of our fruitful collaboration.
A laryngectomy is the surgical removal of the larynx (or voice box). It is usually done to treat severe or advanced-stage cases of laryngeal cancer. As a result of the surgery, patients lose their ability to speak and have to rely on voice-assistance technologies such as an electrolarynx or tracheoesophageal voice prosthesis (TEP).
However, there are certain challenges regarding these two ways of “fixing” voice disabilities.
Today, patients must use various noise-reduction techniques and filtering systems to improve speech intelligibility with an electrolarynx. At present, the technology responsible for improving electrolaryngeal speech intelligibility still has a long way to go. And while TEP speech is more natural than electrolarynx speech, it is still much less natural sounding than normal laryngeal speech.
Both of these methods seek to restore a patient's ability to speak. The only issue is the low quality of voice that these technologies produce. Communication difficulties affect patients in their jobs, personal relationships, and social gatherings.
Konrad is looking for different ways to restore patients’ original voices after their larynx has been removed, and that’s why he decided to try Respeecher.
We interviewed Konrad to learn more about his research, his experience using Respeecher, and his plans for future solutions for voice disorders.
Respeecher: Can you tell us a little bit about the work you are doing right now?
Konrad: I am a Ph.D. student at the University of Warsaw, where I work in the Human Interactivity and Language Lab (HILL). Right now I am working with a grant that allows me to evaluate various bionic systems for laryngectomees.
I was recently published at two prestigious conferences on human-computer interaction and speech technology: ACM CHI 2022 in New Orleans, US and INTERSPEECH in Incheon, Korea.
Currently, I am developing my own company, Uhura Bionics, with technical co-founder Marek Grzelec. We are participating in the EIT Health Patient Innovation Bootcamp and distributing voice-amplifying speakers for people with voice disorders. We are also working on a novel electronic larynx that generates more elegant speech with a more natural sound.
We are natural partners for Respeecher and we feel like we have a lot to do together.
R: What is the eventual goal you are trying to accomplish with Respeecher? Is it real-time voice conversion?
K: Yes. We are striving to achieve voice conversion in 50 milliseconds, so that the delay is imperceptible to the human ear. However, it is a very ambitious goal, and there are still many technical challenges ahead of us.
R: With Respeecher's technology at its current level, how are you going to use it for your recent projects?
K: I consider two scenarios.
One is as an assistive technology for content production, so that people with voice disabilities can produce different types of audio and video content, such as lectures, voiceovers, advertisements, and more.
Another scenario is related to one of the biggest issues people with voice disabilities have: communicating with someone directly or by phone. Devices designed to aid communication (such as an electrolarynx or TEP) usually make speech sound robotic and unrecognizable, especially when talking to someone by phone.
R: Do you expect this technology to become available to a wide range of people affected by voice disabilities?
K: When the technology is more developed, I would say in 5-10 years, then of course, it will become more available for a larger number of patients with voice problems. The combination of the devices we are developing right now and the Respeecher technology will make the lives of people suffering from speech problems much easier.
R: What are your thoughts after trying Respeecher?
K: Respeecher created software that allows me to change my electrolarynx voice to sound more human-like. They even utilized recordings from before my laryngectomy surgery (4 years ago) to build a dedicated voice conversion system that resembles my old voice! I can’t describe how excited I am about the current results and our future work together!
In order to help laryngectomy patients achieve a higher quality of life, Respeecher is exploring the use of its technology on samples of electrolaryngeal and tracheoesophageal speech. The solution will offer real-time intelligible voice replacement to improve support specifically for individuals who have undergone a laryngectomy.
Respeecher’s voice-changing technology transforms the sound of electrolaryngeal and TEP speech into clearer, more articulated, and more intelligible audio. In particular, the technology dampens the mechanical hum of the electrolarynx and TEP voices while accentuating the natural tonal inflections. This makes it much easier to communicate, both for the patient and their interlocutors.
The technology could be deployed using a modified phone. Patients can communicate live through a speaker or use the technology for phone calls and other electronic voice communications.
Today, Respeecher’s voice cloning algorithms deliver critical benefits when employed as an assistive technology to those who have lost their natural ability to speak.
Konrad provided samples of his voice for Respeecher to create his voice model and started testing Voice Marketplace for both real-time and offline voice conversion. The results of the collaboration provide an example of how laryngectomy patients can use Voice Marketplace independently.
With the help of voice cloning, as the partnership with Konrad showed, patients are able to communicate in a more natural manner as well as produce different types of audio and video content, such as lectures, voiceovers, advertisements, and more.