by Orysia Khimiak – Aug 4, 2022 7:39:19 AM • 8 min

Respeecher Mates: Bogdan Belyaev on the Intersection of AI & Music, Luke Skywalker's AI Voice and the Invasion of a Hometown


Respeecher Mates:

This is an episode of small talks with Respeecher team members. These are the human stories behind our Emmy Award-winning technology, and they reflect our commitment to AI ethics. Behind every great product stand incredible people, and we want to introduce you to ours. We won't share all the secrets, but these stories will give you a glimpse into Respeecher's cozy office life in Kyiv and our daily routine.

We talked with Bogdan Belyaev, Sound Engineer at Respeecher, about the intersection of AI and music, how it felt for a Star Wars fan to work on Luke Skywalker's voice, and the invasion of his hometown.

How would you explain to a child what you do? 

I would explain it using the example of a factory. You make metal out of raw materials, and then metal parts using a machine. I do the same with sound. We receive raw audio recordings (from the customer). We process them and check whether the output suits us. After that, we use our generative AI models as the "machine" and make sound conversions, our "factory parts," as it were.

When sound meets AI

- AI and music/sound is a unique intersection. How did you get there? What is your background?  

I started as a musician. I wrote and still write music: modern classical (neoclassical), a combination of piano and synthesizer, and scores for soundtracks. Some of them were used in short films, documentaries, and computer games.

- So those clients came to you with a request to create music for particular movie episodes?  

Yes. I was also approached and asked to create scores, sometimes within 24 hours, a task I managed to accomplish.  

Do you ever lack the inspiration to make music? Have you ever not felt the emotion of a movie episode and been unable to create a sound background that would match?

Inspiration comes during the process. The most important thing is to start.

Was there life before music? Has work always been closely intertwined with music?

Friends often joke that no matter how many times I change jobs, I somehow end up with a sound-related one. Of course, there were first jobs. I have worked as a cashier and a lifeguard, and I shoveled snow for money. At the same time, though, I wrote music and became a sound engineer at an animation studio that produced animations for YouTube.

There were times when I had to work part-time, because sound work is not in high demand in Ukraine. It was mostly freelance work, but most of the jobs were related to music and sound production. Later I started editing audio courses for a platform that was similar to Spotify, but for educational courses, involving synthetic media.

Photo by Galyna Balabanova

So how did you get into Respeecher? The work of a sound engineer and working in a technology startup definitely have many different aspects. How did your paths cross?

While I was working in sound production, I became interested in programming, which I found fascinating. I wanted to create musical mobile applications, primarily for myself, so I became interested in Swift. I wasn't trying to make money. At the time, it was just a hobby.

Everything came together when Vetal (Head of Delivery at Respeecher) wrote to me on rabota.ua ;)

He said he was looking for a person with a profile like mine. No one had ever written to me on rabota.ua before. So I kinda "froze" and didn't write back for three days. (laughs)

When you work in sound production, it's not common for companies to hunt you. Usually, you are the one who hunts them, looking for opportunities yourself and taking the initiative. Not always, but that's usually how it works.

And here it turns out that they found me. But when the startup concept was described to me, I thought it was an excellent idea, especially considering AI ethics.

Creating sound for games is also basically tech, but the clients usually told me what kind of sound was needed, and I wasn't involved in the technological part of the process. But I was always interested in trying.

Because sound is very mathematical, just like music.

Artificial intelligence and music - the emotionality of sound - how did you understand how it works?

At first, it was based on the principle of "monkey see, monkey do." I was guided by intuition. I think the fact that I had worked a lot with sound helped me.

Wait, what is your education? Do you have a background in music?

No, I am a philologist by education, and I also have a degree as a paramedic.

That's quite a resume for a 27-year-old. How did you manage that?

Well, my first degree was in emergency medical care, but I realized that this profession was not for me. I then studied Romano-Germanic philology at the university in my hometown. These skills and my background in music, philology, and Swift led me to Respeecher.

What happened next?

Vetal started talking about the AI voice cloning technology, and I immediately fell in love with the concept. The speech synthesis technology and the solution itself were really cool and matched what I was doing and what I was interested in as closely as possible. I was ready to work with them in test mode for even half a year, just to work a little with web-based technology, but they told me that this wasn't necessary and that I suited them. They hired me.

Star Wars

About working on voices for Star Wars

Which Respeecher project is a benchmark for you and the one you are really proud of?

Luke Skywalker's voice, of course. 

Tell me everything. Were you a Star Wars fan before doing the voice of Skywalker?

Absolutely. I was crazy about it! I even remember the day when I fell in love with this movie. A friend invited me to see the third part of Star Wars at the cinema. At the time, I was a fan of Lord of the Rings and other epic high-fantasy films, so I was a bit skeptical about Star Wars initially.

I even had a disc with every episode of Star Wars, Lord of the Rings, and The Matrix. It was my favorite artifact. It is a pity that it remains in my hometown, which is now under Russian occupation.

I remember how the team was deciding who would lead this project, and I just wanted to touch the Star Wars galaxy at least somehow, to be related to this project in any way. I asked my colleague to let me make at least a dataset if he was given the project, but to my surprise, he told me to take the whole project. To say that I was happy is an understatement.

And what did the work on the project look like?

The project was not as easy as we thought. The quality of the first audio recordings was not very good, and there were problems with the data because it was recorded in the 1980s. The first Star Wars had limited funding, so the sound quality was so-so. Apparently, it was not bad for the 1980s, but by 2022 standards, the quality was different, to put it mildly. At the same time, Star Wars was something like a spaghetti western, and no one expected that the movie would cause such a stir. That's probably the reason why they didn't invest so much in sound quality back then.

We worked on the project for nine months.

Wow! Almost like having a baby. So all this time was spent analyzing the data and the voice of Mark Hamill (the actor who played Luke Skywalker)?

No, we did several iterations with the voice cloning software. They didn't like my first conversions, which we overdubbed so that the sound had the highest possible quality. 

Photo by Galyna Balabanova

About a big experiment that brought desired results

Were you afraid to get feedback from the sharks of the Hollywood sound industry?

At first, I was very nervous and sent them everything with trembling hands. We had limited data, and the quality of the recordings of Mark Hamill's voice from the 1980s was poor. To reproduce a voice with the help of artificial intelligence, a certain amount of audio with this voice must first be analyzed in order to train the model. The model we have now works perfectly, but back then it needed work. At that time, we did not have deblurring, a voice filter.

How much recording of the audio track with the voice did you need to get it right?

Ideally, from 45 minutes to two hours, but we had only 15 minutes of cuts. Therefore, we had to reproduce a better version of what was there from the data we provided.

We brainstormed a lot about how to achieve the desired result with the amount of data we had. We talked with Dmytro Bielievtsov (CTO and Co-founder), and after the last conversation I went out on the balcony, and a crazy idea came to my mind: I wanted to rework all the audio and do something that is forbidden in the world of sound engineering. I took all the datasets with Skywalker's voice and aggressively reworked them with AI voice cloning. I would even say not aggressively, but unreasonably: we applied compression with unreasonable settings. We didn't do aggressive post-processing. I took all our datasets and highly compressed the source voices. This way I brightened the f0 of the source and helped the model work better for the conversion. After this experiment, everything worked out, and the voice dynamics of the target got even better.

About the beginning of the war, Russian occupation and longing for home

What happened next?

This work flowed smoothly into work on another project. At a certain point, I was brought in to speak directly with the Lucasfilm team.

On February 24, when the full-scale invasion of Ukraine began, I sent them data just as there was an air raid alert. Fun fact: after that, about 80% of the subsequent conversion iterations I sent were accompanied by an air raid alert. But I understood that if I didn't send them right then, I might never send them.

Did you work on Star Wars from Kyiv from a studio, or was it enough just to have a computer?

No. I worked from my hometown. I had a sound studio there and started to retool it a bit. That's why I worked from there on the voices for Obi-Wan Kenobi and The Book of Boba Fett, leveraging synthetic media.

We have been in a frontline region since 2014 (Bogdan's hometown is near Mariupol), so we know how it feels to be "near war". We saw the first hints of renewed aggression back in October.

*Mariupol is a Ukrainian city with an area of 166 km², which is larger than San Francisco or Paris. Since the beginning of the full-scale war, the russians have destroyed 90% of it. Now it's under russian occupation, just like Bogdan's hometown. For safety reasons, we will not name his city because his family still lives there.

Source: BBC 

We left for western Ukraine during the last week of February. We wanted to believe the war would not start. But I am glad that it was possible to save my possessions and evacuate our dog and cats. 

 

 

Orysia Khimiak
PR and Comms Manager
For the past 9 years, Orysia has been engaged in global PR for early-stage and AI startups, in particular Reface, Allset, and now Respeecher. Her clients have been featured in WSJ, Forbes, Mashable, The Verge, TechCrunch, and the Financial Times. For over a year, Orysia has been teaching a PR Basics course at Projector. During the war, she became more actively involved as a fixer and worked with the BBC, The Guardian, and The Times.