by Alex Serdiuk – Jul 29, 2021 5:45:39 AM • 8 min

What Is Synthetic Film Dubbing: AI Deepfake Technology Explained

Many of us remember the good old days of dubbing. The on-screen actor's lips form words in one language while the audience hears dialog in another, so the mouth movements never quite look natural.

In addition to being unnatural, classical dubbing had one more drawback. The need to adjust the localized text to the actor's facial expressions often meant changing the meaning of the dialog itself.

In general, this resulted in a less pleasant experience for the viewer. Subtitles have their own drawback: the original performance stays authentic, but viewers spend much of their time reading text at the bottom of the screen, so they cannot enjoy it the way a native audience does.

This is where AI-powered synthetic dubbing, built on voice synthesis and deepfake video, comes to the rescue, removing the need for traditional voice casting and the challenges associated with lip-syncing.

The traditional dubbing process and its drawbacks

The traditional dubbing process is pretty straightforward, albeit challenging to execute.

First, the producer finds a dubbing studio that works in the target language.
The producer sends the original video material and the text of every dialog to the studio.
The studio starts searching for voice actors (they are often the same people who voice dozens of films in their countries annually).
Then the complex dubbing process begins. Voice actors work in the studio, reading the dialogs to match what is happening on the screen, taking into account the expressions of the original actors.
The audio directors then mix the new audio track with the video. And voila, the movie is ready to be submitted to local cinemas.

This process has several significant disadvantages, both in terms of viewing experience and production.

The costs for traditional dubbing are incredibly high. The exact cost is difficult to estimate, but a reasonable ballpark is $100,000 to $150,000 per language for a film.
Dubbing is not fast. Although voice acting takes less time than creating the original content, a proper dub can take months to complete.
Dubbing overshadows the original acting. We already mentioned this at the beginning of this post.
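
To put the per-language estimate in perspective, here is a minimal back-of-the-envelope sketch in Python. The function name and the ten-language release are hypothetical and used only for illustration; the only figures taken from the text are the $100,000 to $150,000 range per language.

```python
def dubbing_cost_range(num_languages: int,
                       low_per_language: int = 100_000,
                       high_per_language: int = 150_000) -> tuple[int, int]:
    """Return the (low, high) total cost in dollars of dubbing one film.

    The defaults are the rough per-language estimate quoted above,
    not actual studio pricing.
    """
    if num_languages < 0:
        raise ValueError("num_languages must be non-negative")
    return num_languages * low_per_language, num_languages * high_per_language


# Hypothetical release dubbed into 10 languages.
low, high = dubbing_cost_range(10)
print(f"${low:,} to ${high:,}")
```

Under these assumptions, a ten-language release lands somewhere between $1,000,000 and $1,500,000 for dubbing alone.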

AI/deepfake dubbing technology helps to reduce nearly every difficulty of the traditional approach. At the same time, and this is important, it does not introduce new complications, particularly when guided by principles of ethical voice cloning.

Here's how deepfake dubbing changes the game for good

If you've never heard of deepfakes, you can start learning by reading What Are Deepfakes: Synthetic Media Explained. In short, deepfake technology uses machine learning to modify video and audio content.

The most stunning example of automated dubbing and localization is TrueSync, the synthetic film dubbing technology announced by Flawless AI. It replaces an actor's lips with deepfake-generated lips that match the dubbed dialog.

Lip sync has traditionally been the most conspicuous flaw associated with dubbing. It may look fine amongst European languages, but with American films dubbed in Chinese (for example), almost every lip movement is completely out of sync.

The mouth movements of an English speaker are almost impossible to adapt to Asian languages, which is exactly the challenge that AI dubbing seeks to address. With this in mind, check out what Flawless AI has done:

Hope you enjoyed Jack Nicholson speaking French and German just as much as we did.

So how does AI dubbing work?

In short, a neural network learns the characteristic features of an actor's face from the original footage.

The same network then analyzes the same features in people speaking a different language. Thus, when the foreign language dub is ready, the network can edit the original actor's face to perfectly lip-sync with the foreign dialog.

So far, this technology has only one drawback. To some especially attentive viewers, the modified scenes don't look quite as natural as the original performances. But keep in mind that the technology is still young and continues to improve.

Voice cloning technology, meanwhile, introduces an entirely new set of tools. Respeecher, the leading innovator in AI voice generator technology, allows movie producers and content creators to make anyone sound as if they were someone else.

Check out this demo we recently recorded for folks interested in the technology just like yourself:

In our How Voice Cloning Makes Dubbing and Localization Easier article, we outlined the impact of voice cloning software on the dubbing market.

It's astonishing how much this technology democratizes the dubbing market and lowers the threshold for entry into foreign markets for small studios. Check the link above to learn more about the advancements in voice synthesis.

By combining both technologies, producers can achieve outstanding dubbing quality. Voice cloning carries the original actor's voice into another language, and the modified facial animation makes the actor's lips match the new dialog.

Plus, the AI dub is produced to give the impression that the actor actually speaks Chinese or Japanese, for example. This means that viewers cannot tell whether the actor is speaking their native language or not.

AI dubbing concerns that you needn’t worry about

The first and greatest concern is the ethical use of the technology. When it is possible to simulate the voice of the US President, the technology must be taken seriously. At Respeecher, we prioritize ethical voice cloning.

We only work with voices whose original speakers have given their written permission, and only to the extent specified by that permission.

We also watermark the voice generated by our system. Watermarks are not noticeable to the listener, but they easily allow a professional to detect synthesized speech.
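
To make the idea of an inaudible but detectable watermark concrete, here is a toy sketch of one classic approach, spread-spectrum watermarking, in plain Python. This is an illustrative assumption and not a description of Respeecher's actual watermarking method, which the text does not specify. All names (`pn_sequence`, `embed_watermark`, `detect_watermark`, `make_tone`), the key, and the `strength` value are hypothetical. The sketch adds flat low-level noise for simplicity; a real system would typically shape the watermark so that it stays inaudible and survives compression.

```python
import math
import random


def pn_sequence(key: int, length: int) -> list[float]:
    """Pseudo-random +1/-1 sequence that only a holder of the key can regenerate."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed_watermark(samples: list[float], key: int,
                    strength: float = 0.02) -> list[float]:
    """Add a low-amplitude keyed noise pattern to the audio samples."""
    pn = pn_sequence(key, len(samples))
    return [s + strength * p for s, p in zip(samples, pn)]


def detect_watermark(samples: list[float], key: int,
                     strength: float = 0.02) -> bool:
    """Correlate the audio with the keyed pattern.

    For watermarked audio the correlation is close to `strength`;
    for unmarked audio (or the wrong key) it is close to zero.
    """
    if not samples:
        return False
    pn = pn_sequence(key, len(samples))
    score = sum(s * p for s, p in zip(samples, pn)) / len(samples)
    return score > strength / 2


def make_tone(num_samples: int = 48_000, freq_hz: float = 220.0,
              sample_rate: int = 16_000, amplitude: float = 0.3) -> list[float]:
    """Stand-in for synthesized speech: a plain sine tone (hypothetical test signal)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]


clean = make_tone()
marked = embed_watermark(clean, key=1234)
print("marked audio, right key:", detect_watermark(marked, key=1234))
print("clean audio, right key:", detect_watermark(clean, key=1234))
```

The design choice worth noting is that detection needs only the key, not the original recording: the detector regenerates the same pattern and checks whether the audio correlates with it.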

In addition, we are constantly working on algorithms that can differentiate between synthetic and original speech, even if it is not watermarked. As with any advanced technology, society will adapt to it over time, maximizing the benefit and minimizing the harm, guided by principles of AI ethics. It's simply a question of time.

Another concern is the quality of the synthesized speech and modified picture. Is it really good enough to make the modified content indistinguishable from the original? Well, judge for yourself. We have provided you with some examples above. The main argument in its favor is the interest of large Hollywood studios. In recent years, AI and deepfake technology has been used in blockbusters, including Star Wars, The Irishman, and others.

Last but not least is the question of cost. Do these new technologies cost a fortune?

Well, here's a quote from Director Scott Mann, co-founder of Flawless: "Our research suggests this will differ on a case-by-case basis. But when Hollywood is likely to pay $60 million to remake Another Round in English, I can safely say this would cost nearly $60 million less."

Well, it can't be said better. The production price is reduced by an order of magnitude.

What about small content producers? We recently launched the Voice Marketplace where any creator can license the original voice of an actor and use it for voice acting and localization at an affordable price.

Conclusion

We hope we've managed to convince you that synthetic film dubbing is worth your attention at the very least. If you want to get more information about our services or get a consultation about using our AI voice generator technology for your project, let us know right now. We will gladly tell you more and help you navigate the price and use cases.

FAQ

Notable international film festivals include Cannes, Sundance, and Venice, where AI-powered filmmaking tools and synthetic media are becoming integral to film production, showcasing AI-generated content and deepfake technology.

AI voice cloning is revolutionizing film production, enabling voice synthesis for dubbing, localization, and even creating digital doubles. It enhances the efficiency of AI-driven CGI while ensuring more natural-sounding performances across languages.

Synthetic media plays a significant role in film festivals, showcasing groundbreaking uses of AI voice synthesis, deepfake technology in movies, and AI-powered filmmaking tools to create AI-generated content that challenges traditional storytelling.

Ethical concerns about synthetic media in films include issues related to AI voice cloning, the potential for deepfake technology misuse, and questions about digital immortality. Ensuring ethical AI applications is critical to prevent misrepresentation and privacy violations.

Film festivals increasingly incorporate panels and discussions on AI in filmmaking, focusing on AI voice synthesis, synthetic media, and the ethical implications of deepfake technology, AI tools for scriptwriting, and AI-enhanced post-production techniques.

AI improves dubbing and localization by using AI voice cloning to create more natural, language-appropriate dialogues. It helps overcome traditional dubbing challenges, such as lip-syncing, and speeds up the localization process, reducing costs and time.

Synthetic media improves accessibility by offering AI-generated voice synthesis for audio descriptions and creating localized versions of films with accurate dubbing and voice cloning. It ensures films are accessible to a wider audience, including those with hearing and visual impairments.

Ethical AI is crucial in the film industry to prevent the misuse of AI voice cloning, ensure privacy protection, and promote responsible use of deepfake technology. Ethical standards ensure that synthetic media enhances creativity without infringing on rights or misrepresenting individuals.

Glossary

International film festivals

International film festivals showcase global cinema and innovations like AI voice cloning, synthetic media in film, and AI-powered filmmaking tools, spotlighting deepfake technology, digital doubles, and AI-generated content. They explore ethical AI applications and advancements in AI-driven CGI and voice synthesis.

AI voice cloning in filmmaking

AI voice cloning in filmmaking uses synthetic media and AI-powered tools to recreate voices, enabling deepfake technology, voice synthesis, and digital immortality in films.

Synthetic media ethics

Synthetic media ethics focuses on ethical AI applications like AI voice cloning, deepfake technology, and AI-generated content, ensuring responsible use in filmmaking and digital immortality.

Dubbing and localization tools

Dubbing and localization tools use AI voice cloning, synthetic media in film, and voice synthesis to adapt content for global audiences with AI-driven CGI and ethical considerations.

Deepfake applications in film

Deepfake applications in film use AI-powered filmmaking tools and synthetic media to create digital doubles, AI-generated content, and AI-driven CGI, raising ethical considerations.

Digital doubles for movies

Digital doubles for movies utilize AI-powered tools and deepfake technology to create synthetic media, enabling AI-generated content and enhancing AI-driven CGI advancements.

Alex Serdiuk

CEO and Co-founder

Alex founded Respeecher with Dmytro Bielievtsov and Grant Reaber in 2018. Since then, the team has been focused on high-fidelity voice cloning. Alex is in charge of Business Development and Strategy. Respeecher technology is already applied in feature films and TV projects, video games, animation studios, localization, media agencies, healthcare, and other areas.