ByteDance Volcano Voice Synthesis

Voice synthesis (converting text into speech) has many excellent open-source solutions such as GPT-SoVITS and ChatTTS, as well as free services like edge-tts. There are also commercial services, such as ByteDance's Volcano Engine voice synthesis. If cost is the priority, open-source and free tools are the best choice; if quality is the priority, commercial services are more suitable. As large models advance, prices keep falling, which makes commercial APIs a good option for dubbing.

Support for ByteDance's Volcano Engine voice synthesis service was added in version 2.88. It supports dubbing in 8 languages: Chinese, English, Japanese, Portuguese, Spanish, Thai, Vietnamese, and Indonesian. For Chinese, it also supports dialects such as Northeastern and Sichuan. The free trial includes 20,000 requests, roughly enough to synthesize 10 hours of speech.

Supported Chinese Voice Styles

Only some Chinese voice styles are shown here. For the other 7 languages, check: https://www.volcengine.com/docs/6561/97465

Many Chinese voice styles are supported, including several dialects and voices that are popular for movie commentary on Douyin, such as Xiao Shuai and Xiao Mei.

| Voice Name | voice_type |
| --- | --- |
| Can Can 2.0 | BV700_V2_streaming |
| Yang Yang | BV705_streaming |
| Sunny Youth | BV123_streaming |
| Anti-Roll Youth | BV120_streaming |
| General Son-in-Law | BV119_streaming |
| Ancient Elegant Lady | BV115_streaming |
| Dominant Uncle | BV107_streaming |
| Simple Youth | BV100_streaming |
| Gentle Lady | BV104_streaming |
| Cheerful Youth | BV004_streaming |
| Sweet Young Lady | BV113_streaming |
| Elegant Youth | BV102_streaming |
| Sweet Xiao Yuan | BV405_streaming |
| Friendly Female Voice | BV007_streaming |
| Intellectual Female Voice | BV009_streaming |
| Cheng Cheng | BV419_streaming |
| Tong Tong | BV415_streaming |
| Friendly Male Voice | BV008_streaming |
| Dubbed Film Male Voice | BV408_streaming |
| Lazy Little Sheep | BV426_streaming |
| Fresh Literary Female Voice | BV428_streaming |
| Inspirational Female Voice | BV403_streaming |
| Wise Elder | BV158_streaming |
| Loving Grandma | BV157_streaming |
| Rap Guy | BR001_streaming |
| Energetic Male Narrator | BV410_streaming |
| Movie Narrator Xiao Shuai | BV411_streaming |
| Xiao Shuai - Multi-Emotion | BV437_streaming |
| Movie Narrator Xiao Mei | BV412_streaming |
| Playboy Youth | BV159_streaming |
| Live Stream Queen | BV418_streaming |
| Calm Male Narrator | BV142_streaming |
| Free-spirited Youth | BV143_streaming |
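
If you call the service from your own scripts, the voice_type string is the value that identifies a voice. The following is a minimal sketch of a name-to-voice_type lookup using a few rows from the table; the dictionary name `CHINESE_VOICES` and the helper `voice_type_for` are hypothetical and are not part of the video translation software.

```python
# A few (voice name -> voice_type) pairs copied from the table.
# CHINESE_VOICES and voice_type_for are illustrative names only.
CHINESE_VOICES = {
    "Can Can 2.0": "BV700_V2_streaming",
    "Yang Yang": "BV705_streaming",
    "Movie Narrator Xiao Shuai": "BV411_streaming",
    "Movie Narrator Xiao Mei": "BV412_streaming",
    "Rap Guy": "BR001_streaming",
}


def voice_type_for(name: str) -> str:
    """Return the voice_type for a voice name, ignoring case and outer spaces."""
    wanted = name.strip().lower()
    for display_name, voice_type in CHINESE_VOICES.items():
        if display_name.lower() == wanted:
            return voice_type
    raise KeyError(f"unknown voice name: {name!r}")
```

For example, `voice_type_for("Movie Narrator Xiao Mei")` returns the string listed for that voice, and an unlisted name raises `KeyError` rather than silently falling back to another voice.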

How to Activate

  1. First, register, log in, and complete real-name verification at:

https://console.volcengine.com/

  2. After entering the console, open the Speech Technology page.

Alternatively, click this link to go directly: https://console.volcengine.com/speech/app

Then create an application. Fill in any name and description you like, but be sure to select "Voice Synthesis Service" before confirming.

  3. Next, go to the voice synthesis page to activate the free trial.

Navigate to: https://console.volcengine.com/speech/service/8

At the top, select the application you just created and click "Trial" to activate.

  4. Copy the three parameters so you can fill them into the video translation software.

First is the cluster id. Copy the value shown under cluster id.

Second is the App id. Scroll down on the same page to find it.

Third is the Access Token, located to the right of App id. Copy it.

  5. Fill these into the video translation software. Open Menu - TTS Settings - ByteDance Volcano Voice Synthesis, enter the three values, run the test to confirm they work, and save.

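To show how the three parameters fit together outside the software, here is a minimal Python sketch of a request to the Volcano Engine HTTP synthesis API. It is not the software's own implementation. The endpoint URL, the `Bearer;` header format, and the JSON field names reflect the public API layout as far as I know it and should be treated as assumptions to verify against the official documentation; `build_tts_request` and `synthesize` are hypothetical helper names.

```python
import base64
import json
import urllib.request
import uuid

# Assumed endpoint of the HTTP synthesis API; confirm it in the official docs.
TTS_URL = "https://openspeech.bytedance.com/api/v1/tts"


def build_tts_request(appid, access_token, cluster, text,
                      voice_type="BV700_V2_streaming"):
    """Build headers and JSON body from the App id, Access Token and cluster id."""
    headers = {
        # Assumed header format: "Bearer;" followed by the Access Token.
        "Authorization": f"Bearer;{access_token}",
        "Content-Type": "application/json",
    }
    body = {
        "app": {"appid": appid, "token": access_token, "cluster": cluster},
        "user": {"uid": "demo-user"},  # arbitrary caller identifier
        "audio": {"voice_type": voice_type, "encoding": "mp3"},
        "request": {
            "reqid": str(uuid.uuid4()),  # must be unique per request
            "text": text,
            "text_type": "plain",
            "operation": "query",
        },
    }
    return headers, body


def synthesize(appid, access_token, cluster, text, voice_type, out_path):
    """Send one request and write the returned audio to out_path."""
    headers, body = build_tts_request(appid, access_token, cluster, text, voice_type)
    req = urllib.request.Request(
        TTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        reply = json.load(resp)
    # Assumption: on success the audio arrives base64-encoded in "data".
    if "data" not in reply:
        raise RuntimeError(f"synthesis failed: {reply}")
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(reply["data"]))
```

A call such as `synthesize("your-app-id", "your-access-token", "your-cluster-id", "你好", "BV700_V2_streaming", "hello.mp3")` would then count against the quota of the application you created. Keeping the request construction in its own function makes it possible to inspect the payload without sending anything.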

Using It in Video Translation Software

Once the settings are saved and the test passes, select the target language in the software, then choose ByteDance Volcano Voice Synthesis as the dubbing channel. You can click to preview each voice style.

Choose a voice style you are satisfied with and start the dubbing process.

Important Notes

If you activate the official (non-trial) version, only the "General Male" and "General Female" voices are available by default. Other voice styles must be purchased and activated separately in the Volcano Engine console.