Our unique technology can change your voice to that of another person (e.g., a celebrity) while preserving all the subtle detail of how you say what you say.

About the system

We leverage recent breakthroughs in deep learning, a branch of machine learning, that allow artificial neural networks to produce high-quality synthetic speech. Until now, these techniques have mainly been used for text-to-speech conversion. Because text contains very little prosodic information, text-to-speech output tends to sound rather monotone. By converting speech to speech directly, we circumvent this problem and carry the intonation of the input speech over to the output.

Why the world needs voice conversion

Movie dubbing and ADR

With our technology, movies can be dubbed in the voices of the original actors, which makes a strong impression on audiences. Relatedly, production companies often need to record short bits of dialog after principal photography (automated dialogue replacement, ADR); with our technology, they can synthesize these lines when the actor is no longer easily available. Finally, our technology could reinvigorate porn parody films with famous voices.

Famous voices

Any actor can speak with a famous voice. Talent can fall ill or even die, but their voice can live on: audiobooks narrated in the author’s voice, or tribute concerts for singers who have passed away.

Call centers

An entire call center can speak with a single pleasant voice, whether that of a celebrity or of the business owner. Alternatively, call centers can switch between voices, matching them to individual customers.

Entertainment and VR

Entertainment such as karaoke, highly immersive new VR games, and traditional online games will need voices that our technology is poised to provide.

Speech problems

People with speech problems can get a personalized synthetic voice.

Samples

These samples were generated by our current prototype, trained on the CMU Arctic dataset. The trained system takes a recording of a source speaker (“Source” column) and produces a result (“Target, converted” column) that sounds as if it were spoken by the target person (“Target, true” column). The true target samples are given here only for comparison; the system uses only the source voice for conversion. Neither the source nor the target samples below were seen by the model during training.

[Audio samples, in three columns: Source | Target, true | Target, converted]

Team

Dmytro Bielievtsov

CTO

Former CTO at IBDI. Publications in Phys Rev and CCE. Expertise in machine learning, dynamical systems, and distributed computing. Experience building, guiding, and managing research and software development teams.

Grant Reaber

Head of Research

PhD candidate in math at Carnegie Mellon. PhD in philosophy from NIP Aberdeen. Accepted into the NextAI incubator (for a previous speech tech project). Expertise in applied math and deep learning.

Oleksandr Khapilin

Voice conversion padawan

BSc degree in systems engineering from Kiev Polytechnic Institute.

Oleksandr Serdiuk

CEO

Former CEO at IBDI. Successfully built several tech-centered companies from scratch. Developed sales processes and built tech and business teams.

Contact us

We are always glad to connect with talented engineers, marketing and sales professionals, investors, and, of course, customers.

Please feel free to write to us using the form below.

Your Name:*
E-mail:*
Phone:
Subject:*
Message:*