evaluate-skill-quality
Community
Systematically evaluate and raise skill quality.
Author: notwillk
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Systematic evaluation of skill outputs across varied prompts and baselines helps ensure reliability, detect edge cases, and drive focused improvements.
Core Features & Use Cases
- Provides a structured evaluation framework (prompts, eval workspace, and baselines) to measure skill quality over time.
- Enables automated grading guidance, human feedback, and iterative SKILL.md improvements based on results.
- Supports design of reproducible evals, performance analysis, and benchmarking workflows to track progress.
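As a rough illustration of the framework described in the list above, an eval case and an automated grading step might look like the following. This is a minimal sketch under stated assumptions: the `EvalCase` schema, the `must_contain` field, and the `grade` function are hypothetical names chosen for this example and are not taken from the skill's own scripts.

```python
from dataclasses import dataclass, field


@dataclass
class EvalCase:
    """One prompt plus the checks its output should satisfy (hypothetical schema)."""
    name: str
    prompt: str
    must_contain: list[str] = field(default_factory=list)


def grade(case: EvalCase, output: str) -> dict:
    """Return per-check results and an overall pass flag for one skill output."""
    lowered = output.lower()
    # Each check is a case-insensitive substring match; real graders may be richer.
    checks = {needle: needle.lower() in lowered for needle in case.must_contain}
    return {"case": case.name, "checks": checks, "passed": all(checks.values())}
```

Substring checks are only one possible grading signal. They are cheap and reproducible, which makes them a reasonable first pass before human feedback is collected. Note that a case with no checks passes trivially under this sketch.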
Quick Start
Run an initial evaluation that compares the current skill version against a baseline, collecting timing, token-usage, and grading data to guide refinement.
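The comparison described above could be summarized along these lines. This is a sketch, assuming each run is recorded as a dict with `seconds`, `tokens`, and `passed` keys; that record shape and the `summarize` and `compare` helpers are illustrative assumptions, not the skill's actual output format or API.

```python
from statistics import mean


def summarize(runs):
    """Aggregate run records into mean timing, mean tokens, and pass rate."""
    return {
        "mean_seconds": mean(r["seconds"] for r in runs),
        "mean_tokens": mean(r["tokens"] for r in runs),
        "pass_rate": sum(1 for r in runs if r["passed"]) / len(runs),
    }


def compare(current, baseline):
    """Return current-minus-baseline deltas for each summary metric."""
    cur, base = summarize(current), summarize(baseline)
    return {key: cur[key] - base[key] for key in cur}


# Hypothetical run records, made up purely for illustration.
baseline_runs = [
    {"seconds": 2.0, "tokens": 100, "passed": False},
    {"seconds": 4.0, "tokens": 300, "passed": True},
]
current_runs = [
    {"seconds": 1.0, "tokens": 150, "passed": True},
    {"seconds": 3.0, "tokens": 250, "passed": True},
]
deltas = compare(current_runs, baseline_runs)
```

A negative `mean_seconds` or `mean_tokens` delta means the current version is cheaper than the baseline, and a positive `pass_rate` delta means it is graded as passing more often.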
Dependency Matrix
Required Modules
pyyaml>=6.0
Components
scripts, references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: evaluate-skill-quality
Download link: https://github.com/notwillk/skills/archive/main.zip#evaluate-skill-quality
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.