Blog | Respeecher

VTubers: The Rise of Synthetic Media in Entertainment

Written by Rustem Vilenkin | Feb 10, 2021 2:42:30 PM

With millions of followers and millions in venture investments, virtual video bloggers like Kizuna AI and Gawr Gura, along with Tokyo-based VTuber studio Cover, are paving the way for a trend that has already taken over Asia and is moving westward. Let's take a look at how virtual YouTuber technology works and what underlies its popularity.

VTubers: who are they?

The easiest way to understand what it’s like to be a VTuber is to go and watch a couple of VTuber videos. Like this one with an anime girl, or… well, another anime hero. In short, VTubers are regular YouTube content creators or bloggers who appear as animated personas instead of showing their real-life identities, using AI character generators and AI-generated video techniques to produce their content.

Some of the most popular VTubers, created by the upd8 VTuber studio and by independent vloggers, include Kizuna AI, Laki, Nobuhime, Neu, Zombiko and The Omega Sisters. You can easily find them with a YouTube search.

Does it always have to be anime? Actually, no. But the emergence of VTubers is directly tied to Japan, where anime was and remains the most popular cartoon genre.

How did virtual YouTubing get started?

In 2011, Ami Yamato, a Japanese YouTuber, used a 3D animated avatar in her vlog. You can still watch the video that gave birth to the new YouTube trend. Most viewers were genuinely interested in learning how she created the avatar and what VTuber software she used to do it. Although the video was not a huge success and received relatively few views, a new industry was launched.

Five years later, Kizuna AI, a Japanese virtual idol and video blogger, launched her first channel, marking a pivotal moment in the use of AI character generator technology in entertainment. She is still widely considered to be the world's first virtual video blogger, thanks to the massive popularity of her channel compared to Yamato’s vlog. Since then, Kizuna’s channels have racked up more than 4 million subscribers and over 250 million views.

Presumably, Kizuna’s videos are created using free software known as MikuMikuDance, along with other techniques for capturing facial expressions, voices, and actions. The content of the videos is similar to that of other live YouTubers, including hosted discussions, Q&A, live play, etc. The primary language used in the videos is Japanese, but the fan community is constantly working on adding subtitles in multiple languages.

Today, as usually happens, businesses are trying to ride the rising trend of synthetic media. Some of the most popular VTubers don't run their shows on their own. Agencies like Hololive own VTuber shows such as Watson Amelia and Gawr Gura. These vlogs feature typical anime-girl-style personas, which differ in character and background.

Venture capital investors are also following the trend. In 2018, Activ8 (the production studio behind Kizuna AI) raised $5.4 million from Gumi, one of the largest Japanese gaming companies and VC funds. In 2020, Tokyo-based startup Cover raised $6.6 million in its first round of investment. The startup focuses on providing management and production services for virtual YouTubers.

How do VTubers work? The technology that fuels the hype

In this blog post, we will not examine the streaming component of the process. Once you have a digital avatar, distributing it to a video blog is no different from streaming any other content, so if you understand how YouTube or Twitch streaming works, you already know this part. But how do VTubers create these virtual avatars?

In short, before you can go live as a VTuber, there are two issues you’ll need to contend with:

  1. Creating and animating the avatar itself

  2. Integrating the required motion capture software

1. Animated avatars

If you’re a digital artist, using tools such as VRoid and Live2D to create anime-style characters is probably second nature. These tools are popular because the artist doesn’t have to bother with the rigging process: they provide a set of pre-rigged face models out of the box.

The only thing the content creator has to do is sketch the character’s face and then overlay it onto a model. With multiple sets of premade character appearances, you can customize the avatar's dress and overall appearance.

If you are not satisfied with the standard models, you can always create your own from scratch. Unfortunately, this is a very resource-intensive process. You will need access to more traditional programs for working with 3D models such as Maya and Blender. And what’s more, you probably won't be able to use them unless you're a professional 3D modeler.

2. Motion capture

Traditionally, motion capture is a costly process. To implement it seamlessly, you'll need equipment most commonly used in large Hollywood productions, not home-based YouTube studios.

This includes a VR headset, VR gloves, and a motion-capture bodysuit. Of course, this is not cheap gear. And yet, there are several less expensive alternatives that may not provide the same quality but won’t break the bank either.

Research presented at venues such as CVPR, SIGGRAPH, and AWE has brought camera-based body tracking within reach of VTubers. Couple that with software like Wakaru and Hitogata and you can even get facial capture results using a regular 2D webcam.
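To make the webcam-based approach a little more concrete, here is a minimal sketch of how 2D facial landmarks could drive a single avatar parameter. It does not reflect how Wakaru or Hitogata work internally. The landmark inputs, the `MAX_OPEN_RATIO` calibration constant, and the smoothing factor are all assumptions made for illustration, and the landmarks themselves are presumed to come from whatever face tracker you use.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed calibration: lip gap / face height at which the avatar's
# mouth is treated as fully open. A real setup would tune this per user.
MAX_OPEN_RATIO = 0.2


@dataclass
class Point:
    """A 2D landmark position in image pixels (hypothetical tracker output)."""
    x: float
    y: float


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two landmarks."""
    return ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5


def mouth_open_ratio(upper_lip: Point, lower_lip: Point,
                     forehead: Point, chin: Point) -> float:
    """Return a 0..1 'mouth open' value that is independent of face size.

    Dividing the lip gap by the face height keeps the value stable when
    the streamer moves closer to or further from the webcam.
    """
    face_height = distance(forehead, chin)
    if face_height == 0:
        return 0.0
    raw = distance(upper_lip, lower_lip) / face_height
    return min(raw / MAX_OPEN_RATIO, 1.0)


class Smoother:
    """Exponential smoothing to hide frame-to-frame tracking jitter."""

    def __init__(self, alpha: float = 0.5) -> None:
        self.alpha = alpha
        self.value: Optional[float] = None

    def update(self, sample: float) -> float:
        if self.value is None:
            self.value = sample
        else:
            self.value = self.alpha * sample + (1 - self.alpha) * self.value
        return self.value
```

The same normalize-then-smooth pattern could be repeated for other parameters such as eye openness or head tilt. A lower `alpha` gives a steadier but laggier avatar.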

When your animated avatar model is ready and the motion capture is set up, connect both to a real-time rendering engine. This engine renders the video stream in real time, so it can be streamed to YouTube or any other video service. Unreal Engine is the best known and is used in many popular game titles, including BioShock, Mortal Kombat, and others.
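The flow described here (motion capture feeds the avatar, the avatar is rendered, the frame goes out to the stream) can be sketched as a simple loop. This is an illustrative outline only, not Unreal Engine code: `Avatar`, `render`, `run_pipeline`, and the parameter names are hypothetical, and `render` returns a text description where a real engine would produce an image.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable

# One motion-capture sample: parameter name -> value (assumed format).
Pose = Dict[str, float]


@dataclass
class Avatar:
    """A stand-in for a rigged model exposing a few animatable parameters."""
    parameters: Pose = field(
        default_factory=lambda: {"head_yaw": 0.0, "mouth_open": 0.0}
    )

    def apply(self, pose: Pose) -> None:
        # Only update parameters the rig actually has; ignore the rest.
        for name, value in pose.items():
            if name in self.parameters:
                self.parameters[name] = value


def render(avatar: Avatar) -> str:
    """Placeholder renderer: describes the frame instead of drawing it."""
    return ", ".join(
        f"{name}={value:.2f}" for name, value in sorted(avatar.parameters.items())
    )


def run_pipeline(poses: Iterable[Pose], avatar: Avatar,
                 send_frame: Callable[[str], None]) -> int:
    """Apply each captured pose, render it, and hand the frame to the streamer.

    Returns the number of frames sent.
    """
    frames = 0
    for pose in poses:
        avatar.apply(pose)
        send_frame(render(avatar))
        frames += 1
    return frames
```

In a real setup, `poses` would be a live feed from the capture software and `send_frame` would pass images to streaming software. Here any iterable and any callable will do, which keeps the three stages easy to swap out.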

So why have VTubers become so popular?

The simplest explanation is that they are popular for the same reason that movies or video games are popular. You can follow the life of your favorite virtual character, which, unlike in games or movies, is truly alive and interacts with its audience.

One of the most important aspects of VTubers' success is the secret of their true identities. Who is actually hiding behind the animated character? What do they do when they’re not VTubing? These are the questions that keep the audience interested in both the virtual characters and the content creators behind them.

Given the enormous popularity of VTubers in Asia, it is only a matter of time before their westward expansion begins. And with the passion they’ve inspired in children and youth, these YouTube content creators are set to dominate internet trends by the end of 2021.

Future use cases for VTuber technology and synthetic media 

Virtual video bloggers are just the beginning of what this new technology is capable of. Today, the gap between how much content businesses and media companies want to produce and what they can afford to publish is huge.

Of course, we are not saying that these media giants will begin to "disguise" their leaders as comic book heroes (although, who knows?). But who said that the same AI video generation technology that can make you into an anime character cannot turn you into, for example, Tom Cruise?

We are entering a new frontier where virtual identities will be created in order to universalize content production. Actors, TV presenters, singers, and even public speakers will have avatars that are independent commodities in the media market. 

With technologies like AI dubbing (the one Respeecher specializes in), it will be possible to produce local identities for content distribution and easily dub them using any regional language.

This expands access to content in local markets while significantly reducing production costs. Where a studio used to have to hire local actors or TV presenters and re-dub the content, it is now possible to cut both time and expense using these high-tech speech generators.

Respeecher has already helped content creators cut down on intensive video production costs. These processes include dubbing and localization, voice cloning for virtual (or simply de-aged) actors, and providing the ability to speak clearly for those who need it most. We see technological disruptions across the media and business landscape, from creating automatically customized video advertising to virtual video assistants.

Businesses aren't the only ones that are implementing new technologies. With gender-swap filters for Snapchat or FaceApp that once made us all look older, we see a growing interest in this kind of technology among mass consumers.

People want to speak in different languages, look different, and cosplay their favorite characters. It looks like VTubers are only the beginning.