oral-history-tools
Community
Transcribe, diarize, and export OHMS
Education & Research
#subtitles #transcription #anonymization #speaker diarization #digital humanities #oral history #ohms xml
Author: xjtulyc
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill turns oral history audio into timestamped, speaker-attributed transcripts and archival-ready OHMS XML, while helping you anonymize sensitive information before sharing or deposit.
Core Features & Use Cases
- Whisper transcription with timestamps: Produces word-level start/end times for long-form interviews to enable precise alignment.
- Speaker diarization (pyannote.audio 3.x): Assigns speaker labels to audio segments and merges them with the word timestamps so that each word is attributed to a speaker.
- OHMS XML cuepoint export: Generates OHMS-compatible XML cuepoints that synchronize transcript snippets with the media timeline.
- Anonymization: Redacts common PII patterns (emails, phone numbers, SSNs) and can map named speakers to generic labels.
- Subtitle output: Exports utterances to SRT with optional speaker prefixes for publication workflows.
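The merge of diarization segments into word-level speaker attribution can be sketched as an overlap match. This is a minimal illustration, not the Skill's actual implementation: the `Word` and `Turn` classes and the `attribute_speakers` function are hypothetical stand-ins for whatever structures the transcription and diarization steps return, and the "largest time overlap wins" rule is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A transcribed word with start/end times in seconds (hypothetical structure)."""
    text: str
    start: float
    end: float


@dataclass
class Turn:
    """A diarized speaker segment in seconds (hypothetical structure)."""
    speaker: str
    start: float
    end: float


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def attribute_speakers(words, turns, default="UNKNOWN"):
    """Label each word with the speaker whose turn overlaps it the most.

    Words that overlap no turn keep the default label.
    """
    labeled = []
    for word in words:
        best_speaker, best_overlap = default, 0.0
        for turn in turns:
            shared = overlap(word.start, word.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((word.text, best_speaker))
    return labeled


# Illustrative data only; real values would come from the audio.
words = [
    Word("Tell", 0.0, 0.4),
    Word("me", 0.4, 0.6),
    Word("Well", 1.9, 2.3),
    Word("um", 6.0, 6.2),
]
turns = [Turn("SPEAKER_00", 0.0, 1.0), Turn("SPEAKER_01", 1.8, 5.0)]
labeled = attribute_speakers(words, turns)
```

The nested loop is quadratic in the worst case; for long interviews a sorted sweep over both lists would be the natural refinement, but the matching rule stays the same.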
Use case: You have a multi-speaker interview recording and need an OHMS XML package plus subtitles, with interviewee identity anonymized for public access.
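For the use case above, the anonymize-then-subtitle step might look roughly like the following. It is a sketch under stated assumptions: the regular expressions are simplified, US-centric patterns chosen for illustration, and `redact`, `generic_labels`, `srt_timestamp`, and `to_srt` are hypothetical helper names, not functions exposed by the Skill.

```python
import re

# Simplified, illustrative PII patterns (assumption: US-style phone numbers and SSNs).
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(
        r"(?<!\d)(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
    ),
}


def redact(text):
    """Replace each matched PII pattern with a bracketed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def generic_labels(speakers):
    """Map named speakers to 'Speaker 1', 'Speaker 2', ... in order of first appearance."""
    mapping = {}
    for name in speakers:
        if name not in mapping:
            mapping[name] = f"Speaker {len(mapping) + 1}"
    return mapping


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(utterances, mapping):
    """Render (speaker, start, end, text) tuples as SRT cues with speaker prefixes."""
    cues = []
    for index, (speaker, start, end, text) in enumerate(utterances, 1):
        label = mapping.get(speaker, speaker)
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{label}: {redact(text)}\n"
        )
    return "\n".join(cues)


# Illustrative data only.
utterances = [
    ("Jane Doe", 0.0, 2.5, "Reach me at jane.doe@example.org."),
    ("Interviewer", 2.5, 4.0, "Thank you."),
]
mapping = generic_labels(name for name, *_ in utterances)
srt_text = to_srt(utterances, mapping)
```

Pattern-based redaction only catches formats it was written for; names, addresses, and other contextual identifiers in the transcript text would still need manual review before public release.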
Quick Start
Use the oral-history-tools skill to process the attached audio file and produce an OHMS XML file with diarized, anonymized transcripts.
Dependency Matrix
Required Modules
openai-whisper, pyannote.audio, pandas, numpy, torch, lxml
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: oral-history-tools Download link: https://github.com/xjtulyc/awesome-rosetta-skills/archive/main.zip#oral-history-tools Please download this .zip file, extract it, and install it in the .claude/skills/ directory.