Two Key Factors Determining Translation Quality:
- The accuracy of the recognized text.
- The quality of the translation of that text.
The recognized text is the input to translation, so inaccurate recognition caps how good the translation can be; improving the final result means addressing both aspects.
I: Improving Text Recognition Accuracy:
1. Use the large-v3 Model.
From the base, small, and medium models up to large-v3, recognition accuracy improves progressively, but so does resource consumption. If your computer has a high-performance NVIDIA graphics card with at least 8 GB of video memory and the CUDA and cuDNN environments are properly configured, try the large-v3 model; it can significantly improve the accuracy of recognized text and subtitles.
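
To make this concrete, here is a minimal sketch of loading large-v3 on the GPU, assuming the faster-whisper library (the software may use a different Whisper backend internally; the file name is illustrative):

```python
# Minimal sketch: load Whisper large-v3 with faster-whisper (assumed library).
from faster_whisper import WhisperModel

# "cuda" requires a working CUDA/cuDNN setup and roughly 8 GB+ of VRAM;
# otherwise fall back to CPU with int8 quantization (slower but works).
try:
    model = WhisperModel("large-v3", device="cuda", compute_type="float16")
except Exception:
    model = WhisperModel("large-v3", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.wav", beam_size=5)
segments = list(segments)  # the result is a generator; materialize it for reuse
for seg in segments:
    print(f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text}")
```
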
2. Separate Background Sound from Video.
If the video contains a lot of background music or noise, it will interfere with recognition. Try selecting "Keep Background Sound": the software separates the background audio before recognition and transcribes only the human voice, which noticeably improves accuracy.

Of course, you can also use a third-party separation tool, or the "Separate Voice/Background" function on the left side of the software, to split the video's audio into a human-voice track and a background track.
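
If you go the third-party route, a sketch of the same idea with common command-line tools looks like this, assuming ffmpeg and Demucs are installed and on PATH (file names and output layout are illustrative):

```python
# Sketch: extract the audio with ffmpeg, then split vocals from accompaniment
# with Demucs two-stem separation (both tools assumed installed).
import subprocess

# 1) Pull the audio track out of the video.
subprocess.run(
    ["ffmpeg", "-y", "-i", "input.mp4", "-vn", "-ac", "2", "audio.wav"],
    check=True,
)

# 2) Two-stem separation: writes vocals.wav and no_vocals.wav under ./separated/.
subprocess.run(
    ["demucs", "--two-stems=vocals", "-o", "separated", "audio.wav"],
    check=True,
)
```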

Then, use the "Audio/Video to Text" function to perform subtitle recognition on the human voice alone and obtain text subtitles.

Then, under "Text Subtitle Translation," translate the subtitles into the target language.

Finally, in "Standard Function Mode," import the subtitles, add background music, and embed the dubbing and subtitles into the video. Although the steps are slightly more complicated, the translation effect can be significantly improved.

3. Manually Modify and Adjust
After subtitle recognition and translation are completed, the full text is displayed in the subtitle area on the right side of the software. Click the "Pause" button to pause, then modify and adjust the text manually. No matter how accurate machine recognition and translation become, they cannot replace human proofreading.
II: Improving Text Subtitle Translation Quality
The best translation quality comes from ChatGPT/DeepL/Azure. All three require paid accounts, and none of them supports payment from within China. In addition, ChatGPT and Azure require a proxy, which raises the barrier to entry.
If you meet these conditions, have a paid account, and know how to configure a proxy, you can use these three translation channels to improve translation quality (for ChatGPT, many relay/API proxy services are available within China).
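
As an illustration, here is a minimal sketch of translating a subtitle line through ChatGPT with the official openai Python SDK. The relay base_url and the model name are assumptions; substitute whatever your paid account or relay service provides.

```python
# Sketch: ChatGPT-based subtitle translation via the openai SDK (v1+).
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",                              # your paid API key
    base_url="https://your-relay.example.com/v1",  # relay/proxy endpoint, if needed
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system",
         "content": "Translate the user's subtitle text into English. Return only the translation."},
        {"role": "user", "content": "这是一个字幕示例"},
    ],
)
print(resp.choices[0].message.content)
```
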
The next best options are Google/Gemini/Microsoft, which are all free. Google and Gemini require configuring a proxy, while Microsoft does not.
However, note that Gemini enforces stricter safety restrictions; if your video's dialogue is flagged as inappropriate, Gemini may refuse to translate it.
Next, you can choose Baidu Translate and Tencent Translate, which require applying for free keys and app IDs from their respective websites. Tencent offers a higher free quota, while Baidu's free quota is very low.
In summary, if conditions are met, ChatGPT/DeepL are the preferred choices, followed by Google, then Microsoft, and finally Tencent Translate and Baidu Translate.
Of course, you can also use DeepLx to get DeepL for free, but it is unstable and prone to IP bans.
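
If you do run DeepLx, calling it is a simple HTTP request. A sketch, assuming a DeepLx instance running locally on its default port; check your DeepLx version for the exact endpoint and field names:

```python
# Sketch: query a locally running DeepLx server (address and fields assumed).
import requests

resp = requests.post(
    "http://127.0.0.1:1188/translate",
    json={"text": "这是一个字幕示例", "source_lang": "ZH", "target_lang": "EN"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json().get("data"))  # translated text; may fail if the IP gets banned
```
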
Similarly, after translation finishes, a pause button appears. Click it, and you can manually review and correct the translated text in the subtitle area on the right.