skill-eval-runner

Community

Automates binary skill evaluations for prompts.

Author: danmestas
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This tool automates binary pass/fail evaluations of skill descriptions by simulating the trigger decision locally. It lets you quickly verify whether a given prompt would invoke a skill, without relying on an external LLM as judge.
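As an illustration of what a simulated trigger decision could look like, here is a minimal sketch. It is not the skill's actual implementation: the keyword-overlap heuristic, the 0.5 threshold, the stopword list, and the names extract_description, would_trigger, and run_eval are all assumptions made for this example. It only assumes that a SKILL.md starts with a `---` frontmatter block containing a single-line `description:` field.

```python
import re

# Hypothetical stopword list; the real skill's heuristics are not documented here.
STOPWORDS = {"a", "an", "the", "to", "of", "for", "and", "or", "in", "on",
             "with", "from", "my", "this", "that", "is", "it"}


def extract_description(skill_md: str) -> str:
    """Return the single-line `description:` value from leading frontmatter, or ""."""
    match = re.match(r"^---\s*\n(.*?)\n---", skill_md, re.DOTALL)
    if not match:
        return ""
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "description":
            return value.strip().strip("\"'")
    return ""


def keywords(text: str) -> set[str]:
    """Lowercase the text and keep alphanumeric words that are not stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def would_trigger(description: str, prompt: str, threshold: float = 0.5) -> bool:
    """Assumed heuristic: trigger when enough prompt keywords appear in the description."""
    prompt_words = keywords(prompt)
    if not prompt_words:
        return False
    overlap = len(prompt_words & keywords(description)) / len(prompt_words)
    return overlap >= threshold


def run_eval(skill_md: str, cases: list[tuple[str, bool]]) -> list[tuple[str, str]]:
    """Compare the simulated decision with the expected one; report PASS or FAIL per prompt."""
    description = extract_description(skill_md)
    return [
        (prompt, "PASS" if would_trigger(description, prompt) == expected else "FAIL")
        for prompt, expected in cases
    ]
```

Each case pairs a prompt with the expected decision (True if the skill should trigger), so the result is strictly binary: the simulated decision either matches the expectation or it does not.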

Core Features & Use Cases

  • Binary evals: run strict pass/fail checks for skill triggers based on the frontmatter description.
  • Auto-retest on edits: re-evaluate triggers whenever a skills/*/SKILL.md file changes, so results always reflect the current description.
  • Local reasoning: uses built-in heuristics to judge description-trigger alignment and report clear pass/fail outcomes.
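The auto-retest feature depends on knowing which SKILL.md files have changed. One way this could be done, shown here purely as a sketch, is to hash every file matching skills/*/SKILL.md and compare the result with an earlier snapshot. The functions snapshot and changed_skills are hypothetical names for this example, and content hashing is an assumed approach; the skill may detect edits differently.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each skills/*/SKILL.md path (relative to root) to the SHA-256 of its bytes."""
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.glob("skills/*/SKILL.md"))
    }


def changed_skills(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return the directory names of skills whose SKILL.md was added or modified."""
    return sorted(
        Path(rel).parent.name
        for rel, digest in after.items()
        if before.get(rel) != digest
    )
```

Taking a snapshot before and after an editing session and passing both to changed_skills yields the skill names whose trigger tests would need to be re-run.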

Quick Start

Invoke /eval skill <name> to run the binary trigger tests for that skill.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: skill-eval-runner
Download link: https://github.com/danmestas/wardrobe/archive/main.zip#skill-eval-runner

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.