Local Large Language Models & How to Use Domestic AIs Compatible with OpenAI ChatGPT API
In video translation and dubbing software, AI large language models can serve as efficient translation channels, significantly improving translation quality by considering the context.
Currently, most domestic AI interfaces are compatible with OpenAI technology, allowing users to operate directly within OpenAI ChatGPT or local large language models. You can also deploy and use ollama locally.
Moonshot AI Usage
- Menu bar -- Translation Settings -- OpenAI ChatGPT API Settings interface
- Enter
https://api.moonshot.cn/v1
in the API interface address. - Fill in the
API Key
obtained from the Moonshot open platform in the SK field. You can obtain it from this website: https://platform.moonshot.cn/console/api-keys - Enter
moonshot-v1-8k,moonshot-v1-32k,moonshot-v1-128k
in the model text box area. - Then, select the model you want to use in the model selection and keep it after testing without any problems.
Deepseek AI Usage
- Menu bar -- Translation Settings -- OpenAI ChatGPT API Settings interface
- Enter
https://api.deepseek.com/v1
in the API interface address. - Fill in the
API Key
obtained from the Moonshot open platform in the SK field. You can obtain it from this website: https://platform.deepseek.com/api_keys - Enter
deepseek-chat
in the model text box area. - Then, select
deepseek-chat
in the model selection and keep it after testing without any problems.
Zhipu AI BigModel Usage
- Menu bar -- Translation Settings -- OpenAI ChatGPT API Settings interface
- Enter
https://open.bigmodel.cn/api/paas/v4/
in the API interface address. - Fill in the
API Key
obtained from the Moonshot open platform in the SK field. You can obtain it from this website: https://www.bigmodel.cn/usercenter/apikeys - Enter
glm-4-plus,glm-4-0520,glm-4 ,glm-4-air,glm-4-airx,glm-4-long , glm-4-flashx ,glm-4-flash
in the model text box area. - Then, select the model you want to use in the model selection. You can select the free model
glm-4-flash
and keep it after testing without any problems.
Baichuan AI Usage
- Menu bar -- Translation Settings -- OpenAI ChatGPT API Settings interface
- Enter
https://api.baichuan-ai.com/v1
in the API interface address. - Fill in the
API Key
obtained from the Moonshot open platform in the SK field. You can obtain it from this website: https://platform.baichuan-ai.com/console/apikey - Enter
Baichuan4,Baichuan3-Turbo,Baichuan3-Turbo-128k,Baichuan2-Turbo
in the model text box area. - Then, select the model you want to use in the model selection and keep it after testing without any problems.
01.AI (Lingyiwanwu)
Website: https://lingyiwanwu.com
API KEY Acquisition Address: https://platform.lingyiwanwu.com/apikeys
API URL: https://api.lingyiwanwu.com/v1
Available Models: yi-lightning
Alibaba Bailian
Alibaba Bailian is an AI model marketplace that provides all Alibaba-related models and other manufacturer models, including Deepseek-r1.
Official Website: https://bailian.console.aliyun.com
API KEY (SK) Acquisition Address: https://bailian.console.aliyun.com/?apiKey=1#/api-key
API URL: https://dashscope.aliyuncs.com/compatible-mode/v1
Available Models: Many, see details at https://bailian.console.aliyun.com/#/model-market
Silicon Flow
Another AI marketplace similar to Alibaba Bailian, providing mainstream domestic models, including deepseek-r1.
Official Website: https://siliconflow.cn
API KEY (SK) Acquisition Address: https://cloud.siliconflow.cn/account/ak
API URL: https://api.siliconflow.cn/v1
Available Models: Many, see details at https://cloud.siliconflow.cn/models?types=chat
Note: Silicon Flow provides the Qwen/Qwen2.5-7B-Instruct
free model, which can be used directly without any cost.
ByteDance Volcano Engine Ark
An AI marketplace similar to Alibaba Bailian, in addition to the Doubao series models, there are also some third-party models, including deepseek-r1.
Official Website: https://www.volcengine.com/product/ark
API KEY (SK) Acquisition Address: https://console.volcengine.com/ark/region:ark+cn-beijing/apiKey
API URL: https://ark.cn-beijing.volces.com/api/v3
MODELS: Many, see details at https://console.volcengine.com/ark/region:ark+cn-beijing/model?vendor=Bytedance&view=LIST_VIEW
Note: ByteDance Volcano Engine Ark has a somewhat strange compatibility with the OpenAI SDK. You cannot directly fill in the model name. You need to create an inference endpoint in the Volcano Engine Ark console in advance, select the model to use in the inference endpoint, and then fill in the inference endpoint ID in the place where the model is needed, i.e., in the software. If you find it troublesome, you can ignore it, as it has no other advantages besides a slightly lower price. See how to create an inference endpoint: https://www.volcengine.com/docs/82379/1099522
Precautions:
Most AI translation channels may limit the number of requests per minute. If an error message indicates that the request frequency is exceeded, you can click "Translation Channel↓" on the main interface of the software, and change the pause seconds to 10 in the pop-up window, that is, wait 10 seconds after each translation before initiating the next translation request, up to 6 times per minute, to prevent the frequency from being exceeded.
If the selected model is not intelligent enough, especially if the locally deployed model is limited by hardware resources and is usually small, it may not be able to accurately return translations that meet the required format according to the instructions, and there may be too many blank lines in the translation results. In this case, you can try using a larger model, or open Menu -- Tools/Options -- Advanced Options -- Send complete subtitles content when using AI translation, and uncheck it.
Use ollama to locally deploy the Tongyi Qianwen large language model
If you have some hands-on skills, you can also deploy a large language model locally and then use the model for translation. Take Tongyi Qianwen as an example to introduce the deployment and usage methods.
1. Download the exe and run it successfully
Open the website https://ollama.com/download
Click to download. After the download is complete, double-click to open the installation interface, click Install
to complete.
After completion, a black or blue window will automatically pop up. Enter the three words ollama run qwen
and press Enter. The Tongyi Qianwen model will be downloaded automatically.
Wait for the model to finish downloading. No proxy is required and the speed is quite fast.
After the model is automatically downloaded, it will run directly. When the progress reaches 100% and the "Success" character is displayed, it means that the model has been run successfully. This also means that the installation and deployment of the Tongyi Qianwen large language model has been fully completed and you can use it happily. Isn't it super simple?
The default interface address is http://localhost:11434
If the window is closed, how to open it again? It is also very simple. Open the computer's start menu, find "Command Prompt" or "Windows PowerShell" (or directly enter
Win key + q key
to search for cmd), click to open, and enterollama run qwen
to complete.
2. Use directly in the console command window
As shown in the figure, when this interface is displayed, you can actually enter text directly in the window to start using it.
3. Of course, this interface may not be very friendly, so let's get a friendly UI
Open the website https://chatboxai.app/zh and click Download
After downloading, double-click and wait for the interface window to open automatically.
Click "Start Settings", in the pop-up floating layer, click the top model, select "Ollama" in the AI model provider, fill in the address http://localhost:11434
in the API domain name, select Qwen:latest
in the model drop-down menu, and then save it.
The usage interface displayed after saving, use your imagination and use it at will.
4. Fill in the API in the video translation and dubbing software
Open Menu -- Settings -- Compatible with OpenAI and local large language models, add a model
,qwen
in the middle text box, as shown below after adding, and then select the modelFill in
http://localhost:11434/v1
in the API URL, and fill in the SK arbitrarily, such as 1234Test whether it is successful, save it if it is successful, and go to use it
5. What other models can be used
In addition to Tongyi Qianwen, there are many other models that can be used, and the usage method is as simple, just 3 words ollama run model name
Open this address https://ollama.com/library You can see all the model names. Copy the name you want to use, and then execute ollama run model name
.
Remember how to open the command window? Click the start menu and find Command Prompt
or Windows PowerShell
For example, I want to install the openchat
model
Open Command Prompt
, enter ollama run openchat
, press Enter and wait until Success is displayed.
Precautions:
Most AI translation channels may limit the number of requests per minute. If an error message indicates that the request frequency is exceeded, you can click "Translation Channel↓" on the main interface of the software, and change the pause seconds to 10 in the pop-up window, that is, wait 10 seconds after each translation before initiating the next translation request, up to 6 times per minute, to prevent the frequency from being exceeded.
If the selected model is not intelligent enough, especially if the locally deployed model is limited by hardware resources and is usually small, it may not be able to accurately return translations that meet the required format according to the instructions, and there may be too many blank lines in the translation results. In this case, you can try using a larger model, or open Menu -- Tools/Options -- Advanced Options -- Send complete subtitles content when using AI translation, and uncheck it.