evaluate-skill-quality


Systematically evaluate and raise skill quality.

Author: notwillk
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Systematic evaluation of skill outputs across varied prompts and baselines helps ensure reliability, detect edge cases, and drive focused improvements.

Core Features & Use Cases

  • Provides a structured evaluation framework (prompts, eval workspace, and baselines) to measure skill quality over time.
  • Enables automated grading guidance, human feedback, and iterative SKILL.md improvements based on results.
  • Supports design of reproducible evals, performance analysis, and benchmarking workflows to track progress.
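As an illustration of the automated grading idea above, here is a minimal sketch of one eval case and a grader. The `EvalCase` class, its `must_contain` field, and the substring check are hypothetical choices for this example; the skill's own scripts may define and grade cases differently.

```python
from dataclasses import dataclass, field


@dataclass
class EvalCase:
    """One prompt plus the phrases a good answer is expected to mention (hypothetical schema)."""
    case_id: str
    prompt: str
    must_contain: list[str] = field(default_factory=list)


def grade(case: EvalCase, output: str) -> dict:
    """Case-insensitive substring check; reports which expected phrases are missing."""
    lowered = output.lower()
    missing = [phrase for phrase in case.must_contain if phrase.lower() not in lowered]
    return {"case_id": case.case_id, "passed": not missing, "missing": missing}


case = EvalCase(
    case_id="install-steps",
    prompt="Explain how to install the skill.",
    must_contain=[".claude/skills", "SKILL.md"],
)
report = grade(case, "Copy the folder into .claude/skills and keep SKILL.md at its root.")
```

A substring check is deliberately crude. It works as a cheap first pass, and cases it cannot judge can be routed to human feedback.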

Quick Start

Run an initial evaluation comparing the current skill version against a baseline to collect timing, tokens, and grading data for refinement.
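The comparison described above could be summarized roughly as follows. This is a sketch under stated assumptions: the `RunResult` record, the `"skill"` and `"baseline"` variant labels, and the pass/fail grading field are invented for illustration and are not the skill's actual data format.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunResult:
    """One graded run of one prompt (hypothetical record layout)."""
    prompt_id: str
    variant: str   # "skill" or "baseline" (assumed labels)
    seconds: float
    tokens: int
    passed: bool


def summarize(results: list[RunResult], variant: str) -> dict:
    """Average timing and tokens, plus pass rate, for one variant."""
    rows = [r for r in results if r.variant == variant]
    if not rows:
        raise ValueError(f"no results for variant {variant!r}")
    return {
        "runs": len(rows),
        "mean_seconds": mean(r.seconds for r in rows),
        "mean_tokens": mean(r.tokens for r in rows),
        "pass_rate": sum(r.passed for r in rows) / len(rows),
    }


def compare(results: list[RunResult]) -> dict:
    """Skill minus baseline for each metric; positive means the skill value is higher."""
    skill = summarize(results, "skill")
    base = summarize(results, "baseline")
    return {key: skill[key] - base[key] for key in ("mean_seconds", "mean_tokens", "pass_rate")}


results = [
    RunResult("p1", "skill", 2.0, 100, True),
    RunResult("p2", "skill", 4.0, 300, True),
    RunResult("p1", "baseline", 1.0, 100, True),
    RunResult("p2", "baseline", 3.0, 100, False),
]
delta = compare(results)
```

Reporting deltas rather than raw numbers keeps the trade-off visible: in this made-up data the skill passes more cases but costs more time and tokens.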

Dependency Matrix

Required Modules

pyyaml>=6.0

Components

  • scripts
  • references

💻 Claude Code Installation

Recommended: let Claude install the skill automatically by copying and pasting the text below into Claude Code.

Please help me install this Skill:
Name: evaluate-skill-quality
Download link: https://github.com/notwillk/skills/archive/main.zip#evaluate-skill-quality

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
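For a manual install, the extraction step might look like the sketch below. It assumes the downloaded archive contains a folder named after the skill with a SKILL.md file directly inside it; that layout is a guess, not something this page confirms, and `install_skill` is a hypothetical helper rather than part of the skill.

```python
import zipfile
from pathlib import Path, PurePosixPath


def install_skill(archive: Path, skill_name: str, skills_dir: Path) -> Path:
    """Copy one skill folder out of a zip archive into skills_dir/skill_name."""
    target = skills_dir / skill_name
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        # Assumed layout: <anything>/<skill_name>/SKILL.md marks the skill root.
        marker = next(
            (
                n for n in names
                if PurePosixPath(n).name == "SKILL.md"
                and PurePosixPath(n).parent.name == skill_name
            ),
            None,
        )
        if marker is None:
            raise FileNotFoundError(f"{skill_name}/SKILL.md not found in {archive}")
        prefix = str(PurePosixPath(marker).parent) + "/"
        for name in names:
            if not name.startswith(prefix) or name.endswith("/"):
                continue  # skip other folders and bare directory entries
            rel = PurePosixPath(name[len(prefix):])
            if ".." in rel.parts:
                continue  # refuse entries that would escape the target folder
            out = target.joinpath(*rel.parts)
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_bytes(zf.read(name))
    return target
```

Example call, with the archive already downloaded to the current directory: `install_skill(Path("main.zip"), "evaluate-skill-quality", Path(".claude/skills"))`.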