pyVideoTrans is a powerful open-source video translation tool that converts a video's audio and subtitles from one language to another. Whether you are a content creator, educator, or language learner, pyVideoTrans provides a one-stop solution for breaking down language barriers.
Core Features at a Glance
- Fully Automatic Video Translation: Intelligently recognizes speech in videos, generates source language subtitles, translates them into the target language, dubs the audio, and finally synthesizes the new audio and subtitles into the original video, all in one go.
- Speech Recognition and Transcription: Accurately transcribes human speech from video or audio files into SRT subtitle files with timestamps in batches.
- SRT Subtitle File Translation: Supports batch translation of SRT subtitle files, preserving the original timecodes and formatting, and provides a variety of bilingual subtitle styles.
- Text-to-Speech (TTS): Utilizes various advanced TTS channels to generate high-quality, natural-sounding voiceovers for your text or SRT subtitle files.
- Practical Toolkit: Built-in auxiliary tools such as video/audio/subtitle merging and vocal/background sound separation to meet your various refined needs in video processing.
How It Works
Before you begin, be sure to understand the core workings of this software:
pyVideoTrans works by recognizing and processing the human speech in the video's audio track. Whether the picture already contains burned-in subtitles (hard subtitles) makes no difference to it.
- Can process: Any video containing human speech, whether it has embedded subtitles or not.
- Cannot process: Videos with only background music and hard subtitles, but without any human speech. This software also cannot directly extract hard subtitles from the video screen.
Download and Installation
1.1 Windows Users (Pre-packaged Version)
We provide a ready-to-use pre-packaged version for Windows 10/11 users, eliminating the need for cumbersome configuration.
Click here to download the pre-packaged Windows version, unzip and use
Unzipping Precautions
Incorrect unzipping is the most common cause of startup failures. Please follow these rules strictly:
- Avoid Paths That Require Administrator Privileges: Do not unzip to system folders such as C:/Program Files, C:/Windows, or Desktop.
- Path Must Be Pure English: The unzipping path cannot contain any Chinese characters, spaces, or special symbols.
- Recommended Practice: Create a new folder named with only English letters or numbers (e.g., D:/videotrans) on a non-system drive like D or E, and then unzip the package into this folder.
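The unzip rules above can be checked mechanically before extracting. The following is a minimal sketch, not part of pyVideoTrans: the function name check_unzip_path, the forbidden-folder set, and the allowed character set are assumptions made for illustration.

```python
import re
from pathlib import PureWindowsPath
from typing import List

# Folders named in the guide, plus "Program Files (x86)" as an assumed extra.
FORBIDDEN_FOLDERS = {"program files", "program files (x86)", "windows", "desktop"}

# Assumed "safe" folder name: ASCII letters, digits, underscore, hyphen, dot.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_.-]+$")


def check_unzip_path(path: str) -> List[str]:
    """Return a list of problems; an empty list means the path looks safe."""
    parsed = PureWindowsPath(path)
    if not parsed.parts:
        return ["path is empty"]
    problems = []
    for name in parsed.parts:
        if name == parsed.anchor:
            continue  # skip the drive root such as 'D:\\'
        if name.lower() in FORBIDDEN_FOLDERS:
            problems.append(f"'{name}' is a system or protected folder")
        elif not SAFE_NAME.match(name):
            problems.append(
                f"'{name}' contains spaces, non-English characters, or special symbols"
            )
    return problems


if __name__ == "__main__":
    for candidate in ("D:/videotrans", "C:/Program Files/videotrans", "D:/my videos"):
        print(candidate, "->", check_unzip_path(candidate) or "OK")
```

An empty result only means the path passes these three rules; it does not guarantee the software will start.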
Starting the Software
After unzipping, open the folder, find the sp.exe file, and double-click it to run.
On first startup the software has to load additional modules, which may take tens of seconds. Please be patient.
1.2 macOS / Linux Users (Source Code Deployment)
On macOS and Linux, the software must be deployed from source code.
- Source Code Repository Address: https://github.com/jianchang512/pyvideotrans
- Detailed Deployment Tutorial:
Software Interface and Core Functions
After the software starts, you will see the following main interface.
- Left Function Area: Switch between the main function modules of the software, such as Custom Video Translation, Audio and Video to Subtitles, etc.
- Top Menu Bar: Perform global configuration.
  - Translation Settings: Configure the API keys and related parameters for each translation channel (e.g., OpenAI, Azure).
  - TTS Settings: Configure the API keys and related parameters for each voiceover channel (e.g., OpenAI TTS, Azure TTS).
  - Speech Recognition Settings: Configure the API keys and parameters for the speech recognition channel (e.g., OpenAI API, Alibaba ASR).
  - Tools/Options: Contains various advanced options and auxiliary tools, such as subtitle format adjustment, video merging, and vocal separation.
  - Help/About: View software version information, documentation, and community links.
- Right Workspace: The specific operation area for the current function module.
Quick Start - Video Translation Full Process
This is the core function of the software. We will guide you step by step through a complete video translation task. The Custom Video Translation module opens by default.
Step 1: Select Video and Output Settings
- Select Video to Process: Click the button to select one or more video files (hold Ctrl to select multiple).
- Folder: Check this option to batch process all videos within the entire folder.
- Save to..: Set the output directory for the translated video. The default is the _video_out folder in the original video directory.
- Clean Generated: Check this option if you need to reprocess the same video (instead of using the cache).
- Save Video Only: If checked, only the final MP4 video will be kept after processing, and intermediate files such as subtitles and audio will be automatically deleted.
- Move Subtitle Position: If the original video has hard subtitles, checking this option will attempt to place the new subtitles in a different position to avoid overlap.
- Shutdown After Completion: Automatically shuts down the computer after all tasks are processed, which suits large, long-running jobs.
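As a rough illustration of the Folder and Save to.. options, the sketch below collects the videos in a folder and computes the default output location described above. The helper names and the extension list are assumptions for this example; the software's own list of accepted formats is not stated here.

```python
from pathlib import Path
from typing import List

# Illustrative extension list only; not taken from pyVideoTrans.
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".mov", ".avi"}


def default_output_dir(video: Path) -> Path:
    """Default output location: a _video_out folder next to the source video."""
    return video.parent / "_video_out"


def collect_videos(folder: Path) -> List[Path]:
    """List the video files directly inside a folder, sorted by name."""
    return sorted(
        entry
        for entry in folder.iterdir()
        if entry.is_file() and entry.suffix.lower() in VIDEO_EXTENSIONS
    )


if __name__ == "__main__":
    for video in collect_videos(Path(".")):
        print(video, "->", default_output_dir(video))
```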
Step 2: Configure Translation and Voiceover
- Translation Channel: Select the engine used to translate subtitles.
  - Free: Google (Free) (requires proxy), Microsoft Translate (no proxy required).
  - High Quality (API Key Required): OpenAI, Gemini, DeepL, etc. Set the API Key in the corresponding location in the top menu bar.
- Source Language: You must accurately select the language spoken by the people in the original video.
- Target Language: The language you want to translate into.
- Glossary: Check this option to use a preset glossary for translation to ensure the accuracy of professional vocabulary.
- Network Proxy: If using a channel that requires a proxy (such as Google, OpenAI), fill in your proxy address and port here (e.g., http://127.0.0.1:10808).
- Voiceover Channel: Select the engine used to generate voiceovers. Edge-TTS is the default option; it is free and gives excellent results.
- Voiceover Role: You must select the target language first so that the corresponding voices (male/female, etc.) can be loaded and selected.
- Listen to Voiceover: Click to preview how the current role sounds.
- Voiceover Speed/Volume/Pitch: Adjust as needed. The numbers represent the percentage increase or decrease relative to the default.
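To make the percentage convention for speed and volume concrete, here is a small sketch. It assumes the open-source edge-tts Python package, whose Communicate class accepts rate and volume as signed percentage strings such as "+20%". Whether pyVideoTrans calls the package in exactly this way is not confirmed by this guide, and the voice name in the usage comment is only an example.

```python
def percent_to_factor(percent: int) -> float:
    """Convert a signed percentage into a multiplier: +20 -> 1.2x, -10 -> 0.9x."""
    return 1.0 + percent / 100.0


def signed_percent(percent: int) -> str:
    """Format a signed percentage as a string such as '+20%', '-10%', or '+0%'."""
    return f"{percent:+d}%"


async def synthesize(text: str, voice: str, out_path: str,
                     rate: int = 0, volume: int = 0) -> None:
    """Generate one voiceover file with edge-tts (assumed to be installed)."""
    import edge_tts  # third-party package, imported lazily

    communicate = edge_tts.Communicate(
        text,
        voice,
        rate=signed_percent(rate),
        volume=signed_percent(volume),
    )
    await communicate.save(out_path)


# Example call (needs network access and the edge-tts package):
#   import asyncio
#   asyncio.run(synthesize("Hello", "en-US-AriaNeural", "hello.mp3", rate=10))
```

A value of 0 leaves the default unchanged, so +10 for speed means the voiceover plays about 1.1 times as fast.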
Step 3: Configure Speech Recognition
This is a crucial step in converting video speech into text subtitles, directly affecting the quality of all subsequent processes.
- Speech Recognition: It is recommended to use the default faster-whisper(local), which is free, runs locally, and provides excellent results.
- Select Model: The larger the model, the more accurate the recognition, but the slower the speed and the more resources consumed.
  - Entry-Level: tiny / medium
  - Recommended: large-v3-turbo (excellent results and fast speed; highly recommended with an NVIDIA graphics card and CUDA acceleration).
- Speech Segmentation Mode: It is recommended to use the default Overall Recognition.
- LLM Re-segmentation: If checked, a large language model will be used to intelligently segment and punctuate the recognized text, significantly improving subtitle readability.
- Noise Reduction: If checked, the audio will be denoised to improve speech recognition accuracy in noisy environments.
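For readers curious what local recognition looks like outside the GUI, the sketch below uses the open-source faster-whisper package directly and formats its segments as SRT. It is an illustration under assumptions, not the software's internal code: the helper names are invented, the compute types are a common choice rather than a documented default, and the large-v3-turbo model name requires a recent faster-whisper release.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Turn (start, end, text) tuples into SRT text with numbered cues."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)


def transcribe(path: str, language: str,
               model_size: str = "large-v3-turbo", use_cuda: bool = False) -> str:
    """Transcribe one file with faster-whisper (assumed installed) and return SRT."""
    from faster_whisper import WhisperModel  # third-party package, imported lazily

    model = WhisperModel(
        model_size,
        device="cuda" if use_cuda else "cpu",
        compute_type="float16" if use_cuda else "int8",
    )
    segments, _info = model.transcribe(path, language=language)
    return to_srt((seg.start, seg.end, seg.text) for seg in segments)


# Example call (downloads the model on first use):
#   print(transcribe("talk.mp4", language="en", model_size="tiny"))
```

Passing the source language explicitly mirrors the guide's advice to select it accurately rather than rely on detection.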
Step 4: Set Synchronization and Subtitles
Since different languages have different speech speeds, the duration of the translated voiceover may not match the original video. This section allows for adjustments.
- Synchronization Alignment:
  - Voiceover Acceleration: When the voiceover is longer than the video, accelerate the voiceover to match the video duration (commonly used).
  - Video Slowdown: When the voiceover is longer than the video, slow down the video to match the voiceover duration.
  - Video Extension: When the voiceover is longer than the video, add still frames at the end of the video to match the voiceover duration.
- Subtitle Embedding:
  - Do Not Embed Subtitles: Only replace the audio, without adding any subtitles.
  - Embed Hard Subtitles: Permanently "burn" the subtitles into the picture, which cannot be turned off.
  - Embed Soft Subtitles: Package the subtitles as an independent track into the video, which can be turned on or off by the player.
  - (Dual): Embed bilingual subtitles with both the source and target languages.
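The three synchronization options above come down to simple arithmetic on two durations. The sketch below shows that arithmetic only; the function name, mode strings, and returned keys are hypothetical, and the software's real implementation may work per subtitle segment and apply limits not described here.

```python
def plan_sync(clip_seconds: float, dub_seconds: float, mode: str) -> dict:
    """Decide how to reconcile a voiceover that runs longer than its video.

    mode is one of 'voiceover_acceleration', 'video_slowdown', 'video_extension'.
    """
    if clip_seconds <= 0 or dub_seconds <= 0:
        raise ValueError("durations must be positive")
    plan = {"audio_speed": 1.0, "video_speed": 1.0, "freeze_seconds": 0.0}
    if dub_seconds <= clip_seconds:
        return plan  # the voiceover already fits, nothing to adjust
    ratio = dub_seconds / clip_seconds
    if mode == "voiceover_acceleration":
        plan["audio_speed"] = ratio          # play the audio faster
    elif mode == "video_slowdown":
        plan["video_speed"] = 1.0 / ratio    # play the video slower
    elif mode == "video_extension":
        plan["freeze_seconds"] = dub_seconds - clip_seconds  # hold the last frame
    else:
        raise ValueError(f"unknown mode: {mode}")
    return plan


if __name__ == "__main__":
    for mode in ("voiceover_acceleration", "video_slowdown", "video_extension"):
        print(mode, plan_sync(10.0, 12.0, mode))
```

For a 10-second clip with a 12-second voiceover, acceleration plays the audio at 1.2x, slowdown plays the video at about 0.83x, and extension adds 2 seconds of still frames.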
Step 5: Process Background Sound
- Keep Original Background Sound: If checked, the software will attempt to separate the human voice and background sound of the original video and keep the background sound in the final video. Note: This feature will significantly increase processing time, but can greatly improve the quality of the finished product.
- Add Additional Background Audio: You can also select your own audio file as new background music.
- Background Volume: Adjust the volume of the background sound. Less than 1 reduces the volume, greater than 1 increases the volume.
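The Background Volume value behaves like a gain multiplier. A minimal sketch of that idea follows, working on plain lists of float samples in the range -1.0 to 1.0; the function name and the hard-clipping step are assumptions for illustration, and real audio mixing would normally be done by an audio library or FFmpeg.

```python
from typing import List, Sequence


def mix_with_background(voice: Sequence[float], background: Sequence[float],
                        background_volume: float = 1.0) -> List[float]:
    """Mix a voice track with a background track scaled by background_volume.

    Both tracks are mono float samples at the same sample rate. The shorter
    track is padded with silence and the result is clipped to [-1.0, 1.0].
    """
    length = max(len(voice), len(background))
    mixed = []
    for i in range(length):
        v = voice[i] if i < len(voice) else 0.0
        b = background[i] if i < len(background) else 0.0
        sample = v + b * background_volume
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


if __name__ == "__main__":
    # 0.5 halves the background level, 2.0 doubles it.
    print(mix_with_background([0.5, 0.5], [0.5, 0.5], background_volume=0.5))
```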
Step 6: Start Execution
- CUDA Acceleration: If you have an NVIDIA graphics card and have correctly installed the CUDA environment, be sure to check this option. It can increase the speed of speech recognition by several times or even dozens of times.
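Before enabling this option, it can help to confirm that the system sees an NVIDIA GPU at all. The sketch below is a heuristic only: it looks for the nvidia-smi tool that ships with the NVIDIA driver and asks it to list GPUs. A positive result does not prove that the CUDA libraries required for speech recognition are installed, and the function name is invented for this example.

```python
import shutil
import subprocess


def nvidia_gpu_visible() -> bool:
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # driver tools not found
    try:
        # 'nvidia-smi -L' prints one line per GPU, e.g. 'GPU 0: ...'
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("NVIDIA GPU visible:", nvidia_gpu_visible())
```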
After all settings are complete, click the "Start" button.
The software will start working. If only one video is being processed, it will pause after subtitle generation and translation, giving you the opportunity to proofread and modify the subtitles in the text box on the right. Click execute again to continue after confirming that everything is correct.
Step 7: View Results
After the task is completed, click the progress bar area at the bottom to open the output folder. You will see the final MP4 file and the SRT subtitles, voiceover files, and other materials generated during the process.
Explore Other Practical Features
In addition to the core video translation, pyVideoTrans also provides several independent powerful functions.
4.1 Audio and Video to Subtitles/Voice Transcription/Speech Recognition
Batch transcribe video or audio files into SRT subtitles. Simply drag in the files, set the original language and recognition model, and start. Supports advanced features such as LLM Re-segmentation and Noise Reduction.
4.2 Batch Translate SRT Subtitles
If you already have SRT subtitle files, this function can help you quickly translate them into other languages while keeping the timeline unchanged. It also supports selecting multiple output formats such as Single Language Subtitles, Target Language Above (Dual), and Target Language Below (Dual).
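The idea of translating only the text while leaving every timecode untouched can be sketched in a few lines. This is an illustration, not the software's code: the function names and layout strings are hypothetical, and the translate argument stands in for whichever translation channel is used.

```python
from typing import Callable, List, Tuple

Cue = Tuple[str, str, str]  # (index line, timecode line, text)


def parse_srt(content: str) -> List[Cue]:
    """Split SRT text into cues, keeping index and timecode lines verbatim."""
    cues = []
    for block in content.replace("\r\n", "\n").strip().split("\n\n"):
        lines = block.split("\n")
        if len(lines) >= 2 and "-->" in lines[1]:
            cues.append((lines[0].strip(), lines[1].strip(), "\n".join(lines[2:])))
    return cues


def translate_srt(content: str, translate: Callable[[str], str],
                  layout: str = "single") -> str:
    """Translate cue text only.

    layout: 'single' (translation only), 'target_above', or 'target_below'.
    """
    out = []
    for index, timecode, text in parse_srt(content):
        translated = translate(text)
        if layout == "target_above":
            body = f"{translated}\n{text}"
        elif layout == "target_below":
            body = f"{text}\n{translated}"
        else:
            body = translated
        out.append(f"{index}\n{timecode}\n{body}\n")
    return "\n".join(out)


if __name__ == "__main__":
    sample = "1\n00:00:00,000 --> 00:00:01,000\nhello\n"
    # str.upper is a stand-in for a real translation function.
    print(translate_srt(sample, str.upper, layout="target_below"))
```

Because the timecode line is copied through unchanged, the translated file stays aligned with the original video.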
4.3 Batch Dubbing Subtitles
Convert your SRT files or plain text into voiceover files (such as WAV or MP3) in batches through the selected TTS engine. Supports fine-tuning of speech speed, volume, and pitch.
4.4 Audio and Video Subtitle Merging
This is a useful post-processing tool. When you have separate video, voiceover, and subtitle files, you can use it to merge the three into a final video file. It also supports customizing subtitle styles.
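A merge of this kind is commonly done with FFmpeg. The sketch below only builds the command line, for either burned-in (hard) or switchable (soft) subtitles; it is an assumption about a typical FFmpeg invocation, not the command pyVideoTrans actually runs. The function name is hypothetical, and subtitle paths containing colons or other special characters need extra escaping for the subtitles filter.

```python
from typing import List


def build_merge_command(video: str, audio: str, subtitles: str, output: str,
                        hard_subtitles: bool = True) -> List[str]:
    """Build an FFmpeg command that replaces the audio and adds subtitles."""
    cmd = ["ffmpeg", "-y", "-i", video, "-i", audio]
    if hard_subtitles:
        # Burn subtitles into the picture; the video must be re-encoded.
        cmd += ["-map", "0:v", "-map", "1:a",
                "-vf", f"subtitles={subtitles}",
                "-c:a", "aac", output]
    else:
        # Add subtitles as a separate track; the video stream is copied as-is.
        cmd += ["-i", subtitles,
                "-map", "0:v", "-map", "1:a", "-map", "2:s",
                "-c:v", "copy", "-c:a", "aac", "-c:s", "mov_text", output]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_merge_command("in.mp4", "dub.wav", "sub.srt", "out.mp4")))
    # To run it for real: subprocess.run(cmd, check=True), with FFmpeg installed.
```

The soft-subtitle variant uses the mov_text codec because that is the subtitle format MP4 containers accept.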
Chapter 5: Function Overview and Support List
The power of pyVideoTrans lies in its high scalability and support for multiple services.
Speech Recognition (STT) Support:
- Local Offline: faster-whisper, openai-whisper
- Online API: OpenAI SpeechToText, GoogleSpeech, Alibaba FunASR, Doubao Model, and custom API.
Subtitle Translation Support:
- Microsoft Translate, Google Translate, Baidu Translate, Tencent Translate, DeepL, DeepLX, ByteDance Volcano
- Large Language Model: ChatGPT, AzureAI, Gemini, other OpenAI-compatible AI large models, and local large models
- Offline Translation: OTT
Speech Synthesis (TTS) Support:
- Microsoft Edge TTS, Google TTS, Azure AI TTS, OpenAI TTS, Elevenlabs
- Voice Cloning/Local: GPT-SoVITS, clone-voice, ChatTTS, Fish TTS, CosyVoice, F5-TTS, KokoroTTS
- Custom TTS Server API
Supported Languages:
- Simplified and Traditional Chinese, English, Korean, Japanese, Russian, French, German, Italian, Spanish, Portuguese, Vietnamese, Thai, Arabic, Turkish, Hungarian, Hindi, Ukrainian, Kazakh, Indonesian, Malay, Czech, Polish, Dutch, Swedish, Filipino, Finnish, Persian, etc., and supports automatic detection.
Thank you for choosing pyVideoTrans. We hope this software will be a powerful assistant in helping you cross language barriers!