FunASR Chinese Recognition

FunASR is an open-source speech recognition model suite from Alibaba that performs better than the Whisper series on Chinese speech. The video translation software has supported it through HTTP calls to the zh_recogn and SenseVoice projects: deploy the corresponding zh_recogn or SenseVoice integration package, start it, and enter its API address in the video translation software.

Many users still found this process confusing, so starting with v2.97 the feature is built directly into the video translation software. You no longer need to deploy and start the zh_recogn and SenseVoice projects separately. Simply select FunASR Chinese Recognition in the software to use it.

Select FunASR Chinese in Speech Recognition

After selecting FunASR Chinese Recognition in the speech recognition settings, you can choose between the paraformer-zh model and the SenseVoiceSmall model. paraformer-zh is recommended, as it performs better and runs faster than SenseVoiceSmall.
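For readers who want to try the same two models outside the software, the following is a minimal sketch using the FunASR Python package. It is not how the video translation software calls FunASR internally. The mapping table, the helper names `resolve_model_id` and `transcribe`, and the ModelScope identifier `iic/SenseVoiceSmall` are assumptions made for illustration.

```python
# Assumption: the two model names shown in the software map to these
# FunASR/ModelScope identifiers. The mapping is illustrative only.
MODEL_IDS = {
    "paraformer-zh": "paraformer-zh",
    "SenseVoiceSmall": "iic/SenseVoiceSmall",
}


def resolve_model_id(name="paraformer-zh"):
    """Return the identifier to load, raising ValueError on unknown names."""
    try:
        return MODEL_IDS[name]
    except KeyError:
        raise ValueError(f"unknown model: {name!r}") from None


def transcribe(audio_path, name="paraformer-zh"):
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so the helpers above work without funasr installed.
    from funasr import AutoModel

    # The model files are downloaded on first use if not already cached.
    model = AutoModel(model=resolve_model_id(name))
    result = model.generate(input=audio_path)
    return result[0]["text"]
```

With the `funasr` package installed, `transcribe("sample.wav")` would return the recognized text for a hypothetical file `sample.wav`. Passing `name="SenseVoiceSmall"` switches to the other model.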

First-Time Use: Download FunASR Chinese Recognition Models Online

To keep the software package small, the FunASR models are not bundled with it. The first time you use the feature, the models are downloaded automatically from modelscope.cn and saved in the hub folder under the models directory in the software folder. Depending on your network conditions, this may take anywhere from a few minutes to tens of minutes. As long as no red error message appears, please wait patiently for the download to complete.
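If you are unsure whether the download has started or finished, you can inspect the hub folder directly. The sketch below only lists the top-level subfolders of models/hub under a given software folder. The function name `list_downloaded_entries` and the example path are hypothetical, and the layout inside the hub folder is not documented here, so the code makes no assumption about it.

```python
from pathlib import Path


def list_downloaded_entries(software_dir):
    """Return the sorted names of subfolders in <software_dir>/models/hub.

    An empty list means the hub folder does not exist yet or holds no
    subfolders, i.e. nothing appears to have been downloaded so far.
    """
    hub = Path(software_dir) / "models" / "hub"
    if not hub.is_dir():
        return []
    # Only directories are reported; stray files are ignored.
    return sorted(p.name for p in hub.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Hypothetical install location; replace with your own software folder.
    print(list_downloaded_entries("."))
```

An empty result after the first run suggests the download did not start, in which case check the software for red error messages.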