sherpa-onnx

Introduction:

sherpa-onnx is a speech recognition and speech synthesis project from next-generation Kaldi. It uses onnxruntime for inference and supports a range of speech features, including speech to text (ASR), text to speech (TTS), speaker identification, speaker verification, spoken language identification, keyword detection, and more. It runs on many platforms and operating systems, including embedded systems, Android, iOS, Raspberry Pi, RISC-V, and servers.

Target Users:

sherpa-onnx is aimed at developers and researchers, especially those who need speech recognition or speech synthesis on several different platforms. It provides APIs for C++, C, Python, Go, C#, Java, Kotlin, JavaScript, and Swift, so developers from different backgrounds can use it.

Usage Scenario Examples:

Use sherpa-onnx for real-time speech to text on Android devices.
Use sherpa-onnx for batch speech recognition tasks on a server.
Use sherpa-onnx for keyword detection on embedded systems.

Features:

Supports streaming and non-streaming speech recognition (ASR).
Supports text to speech (TTS).
Supports speaker identification.
Supports speaker verification.
Supports spoken language identification.
Supports audio tagging and keyword detection.
Supports multiple platforms and operating systems.
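
The difference between the two recognition modes can be illustrated with a short Python sketch. In non-streaming mode the whole utterance is decoded at once; in streaming mode audio arrives in small chunks and a partial transcript is available after each one. The sketch assumes the sherpa_onnx Python package and an already constructed sherpa_onnx.OnlineRecognizer backed by a streaming model. The names chunks and stream_decode are hypothetical helpers, and the 0.1-second chunk size is an arbitrary choice.

```python
def chunks(samples, sample_rate, seconds=0.1):
    """Yield consecutive slices of `samples`, each about `seconds` long."""
    size = max(1, int(sample_rate * seconds))
    for start in range(0, len(samples), size):
        yield samples[start:start + size]


def stream_decode(recognizer, samples, sample_rate):
    """Feed audio chunk by chunk and yield the partial transcript each time.

    Assumption: `recognizer` behaves like sherpa_onnx.OnlineRecognizer and
    `samples` is a 1-D float32 sequence with values in [-1, 1].
    """
    stream = recognizer.create_stream()
    for chunk in chunks(samples, sample_rate):
        stream.accept_waveform(sample_rate, chunk)
        # Decode for as long as enough audio is buffered for another step.
        while recognizer.is_ready(stream):
            recognizer.decode_stream(stream)
        yield recognizer.get_result(stream)
```

In a real application the chunks would come from a microphone callback rather than from a pre-loaded array, but the feed, decode, and read-result loop stays the same.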

Steps for Use:

1. Clone or download the sherpa-onnx project locally.
2. Select the API and platform that match the functionality you need.
3. Configure the environment and dependencies as described in the documentation.
4. Load a pre-trained model and test it.
5. Adjust parameters as required to optimize performance.
6. Integrate it into your application to add speech recognition or speech synthesis.
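
Steps 4 and 5 might look like the following in Python. This is a minimal sketch rather than the project's official example: it assumes the sherpa_onnx Python package and NumPy are installed and that a non-streaming transducer model has already been downloaded. The helper names read_wave and transcribe are hypothetical, and the model file paths are supplied by the caller.

```python
import sys
import wave
from array import array


def read_wave(path):
    """Read a 16-bit mono WAV file; return (sample_rate, floats in [-1, 1))."""
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1 or f.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        sample_rate = f.getframerate()
        pcm = array("h")
        pcm.frombytes(f.readframes(f.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return sample_rate, [s / 32768.0 for s in pcm]


def transcribe(wav_path, encoder, decoder, joiner, tokens, num_threads=2):
    """Decode one WAV file with a non-streaming transducer model."""
    # Imported lazily so read_wave works without these packages installed.
    import numpy as np
    import sherpa_onnx

    recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(
        encoder=encoder,
        decoder=decoder,
        joiner=joiner,
        tokens=tokens,
        num_threads=num_threads,  # step 5: one parameter worth tuning
        decoding_method="greedy_search",
    )
    sample_rate, samples = read_wave(wav_path)
    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

Calling transcribe with the paths of the downloaded encoder, decoder, joiner, and tokens files returns the recognized text for a single file; a batch job on a server could call it in a loop while reusing one recognizer.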

Tags: speech recognition, speech synthesis