speech-to-text

Official

Turn audio into accurate, timestamped text.

Author: elevenlabs
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill converts spoken content from audio or video into searchable, editable transcripts, enabling subtitles, meeting minutes, and data extraction.

Core Features & Use Cases

  • High-accuracy transcription: supports 90+ languages, word-level timestamps, and speaker diarization to distinguish multiple voices.
  • Versatile workflows: supports batch transcription with language hints and real-time streaming for live captions and subtitles.
  • Practical scenarios: transcribe meetings, interviews, podcasts, or lectures and generate searchable transcripts with speaker labels.
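As a rough illustration of how word-level timestamps and speaker diarization combine into a labeled transcript, the sketch below merges consecutive words from the same speaker into turns. The Word and Turn types, their field names, and the speaker labels are hypothetical stand-ins, not the Skill's actual output schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One transcribed word (hypothetical shape, not the real schema)."""
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    speaker: str  # diarization label, e.g. "speaker_0"


@dataclass
class Turn:
    """A run of consecutive words spoken by one speaker."""
    speaker: str
    start: float
    end: float
    text: str


def group_by_speaker(words: List[Word]) -> List[Turn]:
    """Merge adjacent words that share a speaker label into turns."""
    turns: List[Turn] = []
    for word in words:
        if turns and turns[-1].speaker == word.speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1].end = word.end
            turns[-1].text += " " + word.text
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append(Turn(word.speaker, word.start, word.end, word.text))
    return turns


def format_turn(turn: Turn) -> str:
    """Render a turn as one searchable transcript line."""
    return f"[{turn.start:06.2f}-{turn.end:06.2f}] {turn.speaker}: {turn.text}"
```

Each formatted line carries a time range and a speaker label, which is the shape meeting minutes and subtitle files are usually built from.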

Quick Start

Use the speech-to-text skill to transcribe an audio file with model_id "scribe_v2" and optionally enable timestamps and diarization for richer output.
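A minimal sketch of what such a request might look like in Python. Only the model_id value "scribe_v2" comes from the text above; the option names (diarize, timestamps_granularity, language_code) and the injected convert callable are assumptions for illustration, so check them against the Skill's references before relying on them.

```python
from typing import Any, Callable, Dict, Optional


def build_options(
    model_id: str = "scribe_v2",
    timestamps: bool = True,
    diarize: bool = True,
    language_hint: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble transcription options (option names are assumptions)."""
    options: Dict[str, Any] = {
        "model_id": model_id,
        "diarize": diarize,
        # Assumed convention: "word" for word-level timestamps, "none" to disable.
        "timestamps_granularity": "word" if timestamps else "none",
    }
    if language_hint is not None:
        # Only send a language hint when the caller supplies one.
        options["language_code"] = language_hint
    return options


def transcribe(path: str, convert: Callable[..., Any], **kwargs: Any) -> Any:
    """Open an audio file and pass it to a caller-supplied convert function.

    `convert` is a hypothetical stand-in for whatever client call performs
    the transcription; it is injected so this sketch has no SDK dependency.
    """
    with open(path, "rb") as audio:
        return convert(file=audio, **build_options(**kwargs))
```

Injecting the convert callable keeps the option-building logic testable on its own and leaves the choice of client to the caller.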

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: speech-to-text
Download link: https://github.com/elevenlabs/skills/archive/main.zip#speech-to-text

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
