waxa-eval
Community
Orchestrate iterative skill evaluations with waxa.
Software Engineering
#evaluation #iteration #convergence #ledger #skill-eval #empirical-prompt-tuning #waxa-eval
Author: mizchi
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This skill codifies empirical evaluation loops for skill prompts, drawn from real iteration runs. It serves as the operating manual for the waxa CLI, guiding how to author scenarios, choose graders, interpret unclear-points, and manage a ledger to judge convergence.
Core Features & Use Cases
- Four-stage iteration pattern (structural fix, grader breadth, surface-form coverage, residual unclear)
- Explicit invocation rules: only run when the user asks for evaluation
- Scenario authoring under evals/ with templates and per-task scenarios
- Ledger-based convergence tracking and extraction of general fix rules
- Integration with empirical-prompt-tuning methodology and the waxa tooling
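The ledger-based convergence tracking listed above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `LedgerEntry` fields, the `has_converged` helper, and the rule "converged when the last `window` iterations surface no new unclear-points and the pass rate does not drop" are assumptions made for illustration, not the waxa CLI's actual ledger format or convergence criterion.

```python
from dataclasses import dataclass


@dataclass
class LedgerEntry:
    """One row of a hypothetical evaluation ledger (not waxa's real schema)."""
    iteration: int
    pass_rate: float          # fraction of scenarios the graders accepted
    new_unclear_points: int   # unclear-points first seen in this iteration


def has_converged(ledger: list[LedgerEntry], window: int = 2) -> bool:
    """Assumed rule: the last `window` iterations surfaced no new
    unclear-points and the pass rate never dropped across them."""
    if len(ledger) < window:
        return False
    recent = ledger[-window:]
    no_new_unclear = all(e.new_unclear_points == 0 for e in recent)
    non_decreasing = all(
        a.pass_rate <= b.pass_rate for a, b in zip(recent, recent[1:])
    )
    return no_new_unclear and non_decreasing


# Illustrative data only; these numbers do not come from a real run.
ledger = [
    LedgerEntry(1, 0.50, 4),
    LedgerEntry(2, 0.75, 1),
    LedgerEntry(3, 0.90, 0),
    LedgerEntry(4, 0.90, 0),
]
print(has_converged(ledger))
```

A fixed window is one simple stopping rule among many; a real ledger would likely record which fix was applied in each iteration so that general fix rules can be extracted afterwards.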
Quick Start
Scaffold the eval skeleton inside the skill directory and run an iteration pass with the provided eval.yaml to start the loop.
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: waxa-eval
Download link: https://github.com/mizchi/skills/archive/main.zip#waxa-eval
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.