Using the deepgram.com Speech Recognition API
Support for the Deepgram.com speech recognition API was added in v2.92. Deepgram is an overseas AI service that offers a $200 credit upon registration, which should last a good while.
- Open the website https://deepgram.com/, register, and log in to the console at https://console.deepgram.com/
- After logging in, click the big green "Create API Key" button in the console.
After clicking, a window like the one below will pop up.
Enter a few English letters in the first text box, then click "" at the bottom. The secret key (SK) will then be displayed; be sure to copy it, as shown below.
- Open the Menu -- Speech Recognition Settings -- Deepgram window
API Key: Paste the key copied in the previous step into the API Key field.
Silence Duration: The default of 200 (i.e. 200 ms) is usually fine. If the speech in the video is fast, reduce it to around 150. If the speech is slow and contains a lot of silence, increase it to 500 or 800.
- Note: Deepgram does not support Chinese well. Whether you use the subtitles returned directly by Deepgram or re-segment sentences from the word-level timestamps, punctuation is missing, so the subtitle segmentation is unsatisfactory. To improve this, the Ali Chinese punctuation recovery model can be used to re-segment sentences: select "Chinese Re-segmentation" in the software interface.
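
To make the Silence Duration setting and the idea of re-segmenting from word-level timestamps more concrete, here is a minimal sketch of how a silence threshold could split a list of timed words into subtitle segments. It is an illustration only, not the software's actual implementation: the `Word` class, the `split_on_silence` function, and the sample timings are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One recognized word with its timing (hypothetical structure)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def split_on_silence(words: List[Word], silence_ms: int = 200) -> List[List[Word]]:
    """Group words into segments.

    A new segment starts whenever the pause between the end of the
    previous word and the start of the next one is at least silence_ms.
    """
    segments: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        if current and (word.start - current[-1].end) * 1000 >= silence_ms:
            segments.append(current)
            current = []
        current.append(word)
    if current:
        segments.append(current)
    return segments


# Made-up sample: pauses of 50 ms, 300 ms, 50 ms and 400 ms between words.
sample = [
    Word("hello", 0.0, 0.4),
    Word("world", 0.45, 0.9),
    Word("how", 1.2, 1.4),
    Word("are", 1.45, 1.6),
    Word("you", 2.0, 2.3),
]

for threshold in (200, 500):
    lines = [" ".join(w.text for w in seg) for seg in split_on_silence(sample, threshold)]
    print(threshold, lines)
```

With this sample, a threshold of 200 splits at both longer pauses and gives three segments, while a threshold of 500 keeps everything in one segment. This mirrors the advice above: a lower value suits fast speech with short pauses, and a higher value avoids over-splitting slow speech.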