sherpa-onnx
Community
Offline speech AI: ASR, TTS, VAD, diarization
System Documentation
What problem does it solve?
Run deterministic, private, low-latency speech processing locally, with no internet access. The skill provides ready-made guidance for ASR, TTS, VAD, speaker diarization, speaker identification and verification, speech enhancement, audio tagging, keyword spotting, and source separation using ONNX models on the sherpa-onnx runtime.
Core Features & Use Cases
- Streaming & Non‑streaming ASR: real-time microphone transcription via OnlineRecognizer and batch/file transcription via OfflineRecognizer.
- TTS Engines: generate speech locally with Kokoro, Piper, Matcha, VITS, or KittenTTS for multi‑speaker and multi‑language needs.
- VAD, Diarization & Speaker Tasks: voice activity detection, segmentation, embedding extraction, identification and verification workflows for meetings and call analytics.
- Enhancement & Tagging: denoise or separate sources, classify audio content, and detect keywords on-device for privacy-sensitive or edge applications.
- Use Case Example: transcribe a meeting audio file on an offline workstation, separate speakers using a pyannote segmentation model plus speaker embeddings, and export timestamped captions and per-speaker transcripts.
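The last step of the use case, turning diarization output and timestamped ASR tokens into per-speaker turns, could look roughly like the sketch below. It is plain Python and does not call sherpa-onnx. The `Segment` type, the helper names, and the input shapes (a token list with one start time in seconds per token) are assumptions made for illustration, not part of the sherpa-onnx API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One diarization segment (hypothetical shape): who spoke from start to end, in seconds."""
    start: float
    end: float
    speaker: str


def speaker_at(t, segments):
    """Return the speaker whose segment contains time t, else the nearest segment's speaker."""
    if not segments:
        return "unknown"
    for seg in segments:
        if seg.start <= t < seg.end:
            return seg.speaker
    # t falls in a gap between segments: pick the segment with the closest boundary.
    nearest = min(segments, key=lambda s: min(abs(t - s.start), abs(t - s.end)))
    return nearest.speaker


def group_by_speaker(tokens, timestamps, segments):
    """Merge consecutive tokens from the same speaker into (start, speaker, text) turns."""
    turns = []
    for token, t in zip(tokens, timestamps):
        speaker = speaker_at(t, segments)
        if turns and turns[-1][1] == speaker:
            start, _, text = turns[-1]
            turns[-1] = (start, speaker, text + token)
        else:
            turns.append((t, speaker, token))
    return turns


def format_timestamp(seconds):
    """Format seconds as an SRT-style timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"
```

Each turn can then be written out as a caption line, for example `f"[{format_timestamp(start)}] {speaker}: {text.strip()}"`. Assigning by token start time is the simplest possible rule; overlapping speech would need a more careful policy.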
Quick Start
Download a SenseVoice ONNX model, install sherpa-onnx, and transcribe meeting.wav locally with OfflineRecognizer.from_sense_voice to produce a timestamped transcript.
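A minimal sketch of that flow in Python follows, assuming the `sherpa-onnx` and `numpy` packages are installed and a SenseVoice model has already been downloaded. The model and token file paths are placeholders, the WAV helper is hypothetical and handles only 16-bit mono PCM, and the `timestamps` and `tokens` result fields should be checked against the installed sherpa-onnx version.

```python
import sys
import wave
from array import array


def read_wave_mono16(path):
    """Read a 16-bit mono PCM WAV file; return (sample_rate, samples scaled to [-1, 1))."""
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1 or f.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        sample_rate = f.getframerate()
        pcm = array("h", f.readframes(f.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return sample_rate, [s / 32768.0 for s in pcm]


def transcribe(wav_path, model_path, tokens_path):
    """Decode one file offline; return (text, [(start_seconds, token), ...])."""
    # Imported lazily so the WAV helper above works without these packages.
    import numpy as np
    import sherpa_onnx

    recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(
        model=model_path,    # placeholder path to the SenseVoice .onnx file
        tokens=tokens_path,  # placeholder path to the matching tokens.txt
        num_threads=2,
        use_itn=True,        # inverse text normalization (punctuation, numbers)
    )
    sample_rate, samples = read_wave_mono16(wav_path)
    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    result = stream.result
    return result.text, list(zip(result.timestamps, result.tokens))


# Example call (paths are placeholders):
# text, timed_tokens = transcribe("meeting.wav", "model.int8.onnx", "tokens.txt")
```

The third-party imports sit inside `transcribe` so that the file-reading helper can be exercised on its own; in a real script they would normally go at the top of the module.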
Dependency Matrix
Required Modules
None required
Components
💻 Claude Code Installation
Recommended: let Claude install the skill automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: sherpa-onnx
Download link: https://github.com/jayll1303/AIEKit/archive/main.zip#sherpa-onnx
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.