
Advanced Settings Options Explained

In the top menu, go to Tools/Options, then Advanced Options, to customize parameters for finer control, as shown in the figure below.

[Figure: image-20240804220459698]

Click the title text to the left of any setting to open a detailed explanation.

Interface Language: Set the software interface language. A restart is required after modification. The default follows the operating system. zh represents Chinese, and en represents English.

Pause Countdown: When processing a single video translation, a pause occurs for a certain period after the subtitles are recognized and after the subtitles are translated. You can set the number of seconds to pause here.

Background Volume Multiplier: The background audio volume is multiplied by this value. For example, if you enter 0.8, the volume is reduced to 80% of the original.

Loop Background Sound: Whether to repeat the background audio when it is shorter than the video. true = loop, false = do not loop.

302.ai Translation Model List: The names of the models 302.ai uses for translation, separated by half-width (English) commas.

302.ai TTS Model List: The names of the models 302.ai uses for voice-over, separated by half-width (English) commas.

ChatGPT Model List: The available ChatGPT models, separated by half-width (English) commas.

Gemini Model List: The available Gemini models, separated by half-width (English) commas.

Azure Model List: The available Azure models, separated by half-width (English) commas.

Local LLM Model List: The available local LLM models, separated by half-width (English) commas.
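As an illustration of why the separator matters, here is a minimal Python sketch of how a comma-separated model list could be split. The function name parse_model_list and the sample model names are assumptions made for this example, not the software's actual code.

```python
def parse_model_list(raw: str) -> list[str]:
    """Split a comma-separated setting value into clean model names."""
    # Only the half-width comma "," is treated as a separator here.
    # A full-width comma would stay inside the name, so two models
    # would be read as one (assumed behaviour for illustration).
    return [name.strip() for name in raw.split(",") if name.strip()]


if __name__ == "__main__":
    # Extra spaces and empty entries are dropped.
    print(parse_model_list("model-a, model-b,,model-c "))
    # A full-width comma does not split the value.
    print(parse_model_list("model-a，model-b"))
```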

ByteDance Volcano Inference Endpoint: Fill in the name of the inference endpoint created in ByteDance Volcano Ark. See https://pyvideotrans.com/zijiehuoshan for how to create it.

Video Transcoding Loss Control: Controls how much quality is lost during video transcoding (the crf value). 0 = lowest loss, 51 = highest loss. The default is 13.

NVIDIA Use qp Instead of crf: Whether to use qp instead of crf to control video quality loss on NVIDIA graphics cards. true = yes, false = no.

Output Video Quality Control: Used to control the output video quality and size. Faster speed means lower quality.

Custom ffmpeg Command Parameters: Extra ffmpeg parameters, which are inserted at the second-to-last position of the command. For example: -bf 7 -b_ref_mode middle

264 or 265 Video Encoding: Enter 264 to encode with libx264, or 265 to encode with libx265. 264 has better compatibility; 265 compresses more efficiently and gives higher definition.
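To show how the encoding settings above could fit together, here is a hedged Python sketch that assembles an ffmpeg argument list. The function build_video_args and the file names are hypothetical. It assumes the output path is the last argument, so the custom parameters land just before it, and it does not model how the software selects an NVIDIA encoder.

```python
import shlex


def build_video_args(src, dst, crf=13, codec="264", use_qp=False, extra=""):
    """Assemble a hypothetical ffmpeg argument list from the settings."""
    # 264 -> libx264, 265 -> libx265, as described for the encoding setting.
    encoder = {"264": "libx264", "265": "libx265"}[codec]
    # Use -qp instead of -crf when the qp option is switched on.
    quality_flag = "-qp" if use_qp else "-crf"
    args = ["ffmpeg", "-y", "-i", src, "-c:v", encoder, quality_flag, str(crf)]
    # Custom parameters go right before the output path (assumed to be
    # what "second-to-last position" means).
    args += shlex.split(extra)
    args.append(dst)
    return args


if __name__ == "__main__":
    # The extra string is the example from the setting description and is
    # used only to show placement; whether an encoder accepts it is not checked.
    print(build_video_args("in.mp4", "out.mp4", codec="265",
                           extra="-bf 7 -b_ref_mode middle"))
```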

Audio Maximum Acceleration Multiple: The maximum factor by which the voice-over may be sped up so that its duration aligns with the original duration. Enter a number between 1 and 100. The default is 3, meaning the voice-over is sped up to at most 3 times its original speed.

Video Slow Motion Multiple: A number greater than 1 sets the maximum factor by which the video may be slowed down, which extends the video to align with the voice-over and subtitles. 0 or 1 means no video slow motion.
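The arithmetic behind these two limits can be sketched as follows. This is only one plausible way to combine them (speed up the audio first, then slow the video for whatever remains); the software's real strategy may differ, and plan_alignment is a hypothetical name.

```python
def plan_alignment(slot_s, dub_s, max_audio_speed=3.0, max_video_slow=1.0):
    """Return (audio_speed, video_slow, overrun_s) for one subtitle.

    slot_s: original subtitle duration in seconds (must be > 0).
    dub_s:  voice-over duration in seconds.
    """
    # Speed the audio up only as much as needed, capped at the maximum.
    audio_speed = min(max(dub_s / slot_s, 1.0), max_audio_speed)
    dub_after = dub_s / audio_speed
    # 0 or 1 disables slow motion, so the cap is never below 1.
    slow_cap = max(max_video_slow, 1.0)
    video_slow = min(max(dub_after / slot_s, 1.0), slow_cap)
    # Whatever still does not fit after both adjustments.
    overrun = max(dub_after - slot_s * video_slow, 0.0)
    return audio_speed, video_slow, overrun


if __name__ == "__main__":
    # An 8s voice-over for a 2s subtitle: audio is capped at 3x, and the
    # video is slowed to cover the rest.
    print(plan_alignment(2.0, 8.0, max_audio_speed=3.0, max_video_slow=2.0))
```

With slow motion disabled, the same call leaves an overrun that neither limit can absorb, which is why the two settings are usually tuned together.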

Remove Voice-over Trailing Silence: Whether to remove the silent space at the end of the voice-over. true = remove, false = do not remove.

Remove Subtitle Duration Greater Than Voice-over Duration: Whether to remove the leftover silence when the original subtitle duration is greater than the voice-over duration. For example, if the original duration is 5s and the voice-over is 3s, this decides whether the remaining 2s of silence is removed. true = remove, false = do not remove.

Remove Silence Length Between 2 Subtitles: The length of silence, in ms, to remove between two adjacent subtitles. For example, with a value of 100, if the interval between two subtitles is greater than 100ms, 100ms of it is removed. -1 = remove the interval completely.

Force Modify Subtitle Timeline: true = force the subtitle timeline to match the audio, false = keep the original subtitle timeline. Keeping the original timeline may leave the subtitles and audio out of sync.
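The rule for the silence between two subtitles can be written as a small Python sketch. The function name trimmed_gap is hypothetical, and the handling of gaps at or below the threshold (left unchanged) is an assumption based on the wording of the setting.

```python
def trimmed_gap(gap_ms: int, remove_ms: int) -> int:
    """Return the gap between two subtitles after silence removal."""
    if remove_ms == -1:
        # -1 removes the gap completely.
        return 0
    if gap_ms > remove_ms:
        # Gap is longer than the setting: cut that many ms out of it.
        return gap_ms - remove_ms
    # Gap is not longer than the setting: assumed to stay as it is.
    return gap_ms


if __name__ == "__main__":
    for gap in (80, 250):
        print(gap, "->", trimmed_gap(gap, 100))
```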

Enable VAD: Whether to enable voice activity detection (VAD) in the overall recognition mode of faster-whisper. true = enable, false = disable. Enabled by default.

Minimum Silent Segment: The minimum length of a silent segment, in ms. The default is 250ms.

Maximum Sentence Duration Seconds: The maximum duration of a single sentence, in seconds. The default is 6s.

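VAD Threshold: The threshold VAD uses to decide whether a stretch of audio counts as speech.

VAD Pad Value: The padding VAD adds around each detected speech segment.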
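For readers who call faster-whisper directly, the VAD settings above plausibly correspond to the keyword arguments of its transcribe method. The sketch below only builds that argument dictionary. The mapping from each setting to a faster-whisper parameter name is an assumption, build_vad_kwargs is a hypothetical helper, and the threshold and pad values in the usage line are placeholders rather than the software's defaults.

```python
def build_vad_kwargs(enable_vad, min_silence_ms, max_sentence_s,
                     threshold, pad_ms):
    """Build keyword arguments for faster-whisper's transcribe call."""
    if not enable_vad:
        return {"vad_filter": False}
    return {
        "vad_filter": True,
        "vad_parameters": {
            # Assumed mapping: Minimum Silent Segment
            "min_silence_duration_ms": min_silence_ms,
            # Assumed mapping: Maximum Sentence Duration Seconds
            "max_speech_duration_s": max_sentence_s,
            # Assumed mapping: VAD Threshold
            "threshold": threshold,
            # Assumed mapping: VAD Pad Value
            "speech_pad_ms": pad_ms,
        },
    }


if __name__ == "__main__":
    # 250 and 6 are the documented defaults; 0.5 and 400 are placeholders.
    print(build_vad_kwargs(True, 250, 6, 0.5, 400))
```

The resulting dictionary could then be passed as model.transcribe(audio_path, **kwargs) on a faster-whisper WhisperModel.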

Equal Division Silent Segment: The silent segment length used in equal division mode. The default is 10s.

Equal Division Segment Duration: The duration of each segment in equal division mode, in seconds.

faster and openai Model List: The model names available in faster mode and openai mode, separated by half-width (English) commas.

CUDA Data Type: The CUDA data type used in faster mode. int8 = lower resource consumption, faster, lower precision; float32 = higher resource consumption, slower, higher precision; int8_float16 = device auto-select.

Whisper Model Prompt: The prompt sent to the whisper model.

faster-whisper cpu Process: In faster mode, the number of CPU processes used during subtitle recognition.

faster-whisper Worker Process: In faster mode, the number of concurrent worker processes used during subtitle recognition.

Subtitle Recognition Accuracy Control 1: An accuracy adjustment for subtitle recognition, from 1 to 5. 1 = lowest memory consumption, 5 = highest memory consumption.

Subtitle Recognition Accuracy Control 2: A second accuracy adjustment for subtitle recognition, from 1 to 5. 1 = lowest memory consumption, 5 = highest memory consumption.

faster-whisper Temperature Control: 0 = uses fewer GPU resources with slightly worse results, 1 = uses more GPU resources with better results.

Context Awareness: true = uses more GPU resources with better results, false = uses fewer GPU resources with slightly worse results.

Hard Subtitle Font Pixel: The font size of hard subtitles, in pixels.

Hard Subtitle Font Name: The name of the font used for hard subtitles.

Hard Subtitle Text Color: The color of the font. Note that the 6 characters after &H are read in pairs as a BGR color, that is, 2 hex digits of blue / 2 of green / 2 of red, which is the reverse of the common RGB order.

Hard Subtitle Text Border Color: The color of the font border. Note that the 6 characters after &H are read in pairs as a BGR color, that is, 2 hex digits of blue / 2 of green / 2 of red, which is the reverse of the common RGB order.

Hard Subtitle Move Up Distance: Subtitles are located at the bottom of the video by default. Enter a value greater than 0 to move them up by that distance. The value should not be greater than (video height - 20); at least 20 of height must be reserved for displaying the subtitles, otherwise they will not be visible.
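Because the reversed byte order is easy to get wrong, here is a small Python sketch that converts a familiar RGB hex color into the &H form described above. The helper name rgb_to_ass is an assumption made for this example.

```python
def rgb_to_ass(rgb_hex: str) -> str:
    """Convert an RGB hex color such as '#FF8000' to '&HBBGGRR' order."""
    digits = rgb_hex.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected 6 hex digits, e.g. FF8000")
    int(digits, 16)  # raises ValueError if any character is not hex
    red, green, blue = digits[0:2], digits[2:4], digits[4:6]
    # Subtitle colors put blue first and red last.
    return "&H" + (blue + green + red).upper()


if __name__ == "__main__":
    print(rgb_to_ass("#FF8000"))  # orange in RGB becomes &H0080FF
```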

faster/openai-whisper Re-punctuate After Recognition: If selected, nltk will be used to re-punctuate after recognition.

Number of Characters Per Line for Chinese, Japanese and Korean: The maximum number of characters per line for Chinese, Japanese and Korean hard subtitles. Longer lines wrap. The default is 20 characters. This value is also used as the basis for re-punctuation.

Number of Characters Per Line for Other Languages: The maximum number of characters per line for hard subtitles in other languages. Longer lines wrap. The default is 54 characters. This value is also used as the basis for re-punctuation.

Subtitle Traditional to Simplified: Force the recognized Traditional Chinese subtitles to be converted to Simplified Chinese.
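The two line-length limits can be illustrated with a short Python sketch. It assumes that Chinese, Japanese and Korean text is cut by character count and that other languages wrap at word boundaries; the software's actual wrapping logic may be different, and wrap_subtitle is a hypothetical name.

```python
import textwrap


def wrap_subtitle(text, cjk, cjk_len=20, other_len=54):
    """Split one subtitle into display lines no longer than the limit."""
    if cjk:
        # No spaces between words, so cut every cjk_len characters.
        return [text[i:i + cjk_len] for i in range(0, len(text), cjk_len)]
    # Other languages: break at spaces, keeping lines within other_len.
    return textwrap.wrap(text, width=other_len)


if __name__ == "__main__":
    sample = "This sentence is long enough that it will not fit on one subtitle line."
    for line in wrap_subtitle(sample, cjk=False):
        print(line)
```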

Number of Subtitles Translated Simultaneously: The number of subtitle entries translated simultaneously, the default is 15.

Number of Translation Error Retries: The number of retries when a translation error occurs, the default is 2.

Pause Time After Translation: The pause, in seconds, after each translation request, used to limit the request frequency.
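The three translation settings above (batch size, retries and pause) can be pictured as one loop. The Python sketch below is illustrative only: translate_in_batches is a hypothetical helper, translate stands for any function that takes a list of lines and returns their translations, and treating 2 retries as up to 3 attempts in total is an assumption.

```python
import time


def translate_in_batches(lines, translate, batch_size=15, retries=2,
                         pause_s=0.0):
    """Translate subtitle lines in batches, retrying failed batches."""
    results = []
    for start in range(0, len(lines), batch_size):
        batch = lines[start:start + batch_size]
        for attempt in range(retries + 1):
            try:
                results.extend(translate(batch))
                break
            except Exception:
                # Give up once the allowed retries are used up.
                if attempt == retries:
                    raise
        # Pause after each batch to limit the request frequency.
        time.sleep(pause_s)
    return results


if __name__ == "__main__":
    demo = [f"line {i}" for i in range(32)]
    # A stand-in translator that just upper-cases each line.
    out = translate_in_batches(demo, lambda b: [s.upper() for s in b])
    print(len(out), out[0])
```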

Number of Subtitles Dubbed Simultaneously: The number of subtitle entries dubbed simultaneously.

AzureTTS Batch Lines: The number of lines sent to AzureTTS for voice-over at one time. The default is 150.

ChatTTS Voice Tone Value: The voice tone value used by ChatTTS.