Using Hugging Face Models in Software A Detailed Guide | pyVideoTrans-Open Source Video Translation Tool -pyvideotrans.com github.com/jianchang512/pyvideotrans

Using Hugging Face Models in Software: A Detailed Guide

Hugging Face (huggingface.co) is a popular machine learning model hub that hosts a large collection of speech recognition models. When the built-in Faster-Whisper models do not support a less common language well enough, or you need a model fine-tuned for a specific purpose, Hugging Face is a good place to look for one.

This feature is available in software version 3.71 and above, and it only supports models that have been converted to the ctranslate2 format.

Step 1: Confirm Model Compatibility

Before using a Hugging Face model, confirm that it has been converted with ctranslate2. Models that have not been converted cannot be used in the software.

Here are several ways to check:

1. Explicit Indication on the Page

If the model page explicitly states "Converted from ctranslate2" or similar wording, the model is compatible.

In the example screenshot, the model page explicitly states that it was converted using ctranslate2, so the model is usable.

2. Check Code References

Even if the page does not say so explicitly, check whether the example code on the model page contains a line beginning with from faster_whisper. Models documented this way are usually compatible as well.

3. Check the config.json File Structure

If neither of the above methods gives a clear answer, open the Files and versions tab on the model page, then find and click the config.json file in the file list to view its contents.

If the config.json file begins with an alignment_heads field and also contains fields such as lang_ids further down, the model is generally compatible.
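
The config.json check can also be done programmatically once the file's contents are available. The following is a minimal sketch, not part of the software: the function name is hypothetical, the sample JSON strings are made up for illustration, and treating both fields as required is an assumption based on the description above.

```python
import json

# Field names taken from this guide. Treating both as required is an assumption.
EXPECTED_FIELDS = ("alignment_heads", "lang_ids")


def looks_like_ctranslate2_config(config_text: str) -> bool:
    """Return True if the config.json text contains the fields named in this guide."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError:
        return False
    return isinstance(config, dict) and all(field in config for field in EXPECTED_FIELDS)


# Made-up sample data for illustration, not copied from a real model.
converted_sample = '{"alignment_heads": [[5, 3]], "lang_ids": [50259, 50260]}'
unconverted_sample = '{"architectures": ["WhisperForConditionalGeneration"], "d_model": 1280}'

print(looks_like_ctranslate2_config(converted_sample))
print(looks_like_ctranslate2_config(unconverted_sample))
```

A positive result only means the file has the expected shape. As noted above, such models are generally compatible, not guaranteed to be.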

Step 2: Obtain and Configure the Model ID

Once you have confirmed that the model is compatible, you can add it to the software.

1. Obtain the Model ID

The model ID consists of two parts separated by /: username/model_name. For example: zh-plus/faster-whisper-large-v2-japanese-5k-steps.
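
To make the username/model_name format concrete, here is a small sketch that splits a model ID into its two parts and rejects anything that does not match. The function name is hypothetical and the validation is deliberately minimal. It checks only the overall shape, not whether the model exists.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'username/model_name' into its two parts, or raise ValueError."""
    parts = model_id.strip().split("/")
    # Exactly two non-empty parts are expected.
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected username/model_name, got: {model_id!r}")
    return parts[0], parts[1]


username, model_name = split_model_id("zh-plus/faster-whisper-large-v2-japanese-5k-steps")
print(username)
print(model_name)
```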

On the model details page, click the copy button next to the model name to copy the model ID directly.

2. Add the Model ID to the Software

Open the software, click Menu -> Tools -> Advanced Options.
In the Faster and OpenAI Model List text box, append an English (half-width) comma , after the existing content, then paste the copied model ID.
Click Save to apply the changes.
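
The comma rule in the steps above can be illustrated with a short sketch. This is not the software's own code: the function name is hypothetical, and normalizing full-width commas and skipping duplicates are assumptions about what a tidy list should look like.

```python
def append_model_id(current_list: str, model_id: str) -> str:
    """Append model_id to a comma-separated model list using English commas."""
    # Replace full-width commas with English commas, then drop empty entries
    # (for example, those caused by a trailing comma).
    entries = [
        entry.strip()
        for entry in current_list.replace("\uff0c", ",").split(",")
        if entry.strip()
    ]
    model_id = model_id.strip()
    if model_id not in entries:
        entries.append(model_id)
    return ",".join(entries)


print(append_model_id("tiny,base,", "zh-plus/faster-whisper-large-v2-japanese-5k-steps"))
```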


Step 3: Use and Automatically Download the Model

Return to the main software interface.
In the Speech Recognition dropdown list, select faster-whisper (local).
In the Model dropdown list on the right, select the model ID you just added. The software then downloads the model automatically. When the interface language is Chinese, it downloads from the mirror site https://hf-mirror.com, so no VPN is required (see Important Notes below).

Important Notes

1. Model Availability Limitations

The software can only download publicly available models from Hugging Face. It cannot download private models, or models that require you to agree to terms (such as accepting a license agreement) before downloading.

2. Use of Domestic Mirror Site

When the software interface language is set to Chinese, the software automatically downloads models from the https://hf-mirror.com mirror site, which is reachable from mainland China without a VPN.
If the software interface is in English, it still attempts to download from the official Hugging Face website, which may require a VPN.
You can click Menu -> Tools -> Advanced Options -> Interface Language, enter zh and save, then restart the software to change the interface to Chinese.
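
The language-dependent choice of download site can be sketched as follows. How the software selects the endpoint internally is not shown in this guide, so the function below is a hypothetical illustration. The one external fact it relies on is that the huggingface_hub library reads the HF_ENDPOINT environment variable, which must be set before that library is imported.

```python
import os

MIRROR_ENDPOINT = "https://hf-mirror.com"
OFFICIAL_ENDPOINT = "https://huggingface.co"


def pick_hf_endpoint(ui_language: str) -> str:
    """Return the mirror for a Chinese interface, the official site otherwise."""
    # Assumption: any language code starting with "zh" counts as Chinese.
    if ui_language.strip().lower().startswith("zh"):
        return MIRROR_ENDPOINT
    return OFFICIAL_ENDPOINT


# Set the endpoint before importing huggingface_hub so the library picks it up.
os.environ["HF_ENDPOINT"] = pick_hf_endpoint("zh")
print(os.environ["HF_ENDPOINT"])
```

If you run your own scripts outside the software, setting HF_ENDPOINT in this way is one option for downloading through the mirror without changing any interface language.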