waxa-eval

Community

Orchestrate iterative skill evaluations with waxa.

Author: mizchi
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Skill prompts need empirical evaluation loops, and this skill codifies them from real iteration runs. It acts as the operating manual for the waxa CLI, guiding how to author scenarios, choose graders, interpret unclear-points, and manage a ledger to judge convergence.

Core Features & Use Cases

  • Four-stage iteration pattern (structural fix, grader breadth, surface-form coverage, residual unclear)
  • Explicit invocation rules: only run when the user asks for evaluation
  • Scenario authoring under evals/ with templates and per-task scenarios
  • Ledger-based convergence tracking and extraction of general fix rules
  • Integration with empirical-prompt-tuning methodology and the waxa tooling
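The listing does not describe the ledger format, so the following is only a minimal sketch of what ledger-based convergence tracking could look like. The `LedgerEntry` fields, the `has_converged` helper, the two-iteration window, and the sample numbers are all hypothetical and are not taken from waxa itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    # Hypothetical fields; the real waxa ledger may record different data.
    iteration: int
    passed: int          # scenarios that passed in this iteration
    total: int           # scenarios that were run in this iteration
    unclear_points: int  # unclear-points reported in this iteration


def has_converged(ledger: list[LedgerEntry], window: int = 2) -> bool:
    """True when the last `window` iterations all passed with no unclear-points."""
    if window < 1 or len(ledger) < window:
        return False
    recent = ledger[-window:]
    return all(e.passed == e.total and e.unclear_points == 0 for e in recent)


# Made-up sample data, for illustration only.
ledger = [
    LedgerEntry(iteration=1, passed=3, total=5, unclear_points=4),
    LedgerEntry(iteration=2, passed=5, total=5, unclear_points=1),
    LedgerEntry(iteration=3, passed=5, total=5, unclear_points=0),
    LedgerEntry(iteration=4, passed=5, total=5, unclear_points=0),
]

print(has_converged(ledger))      # True
print(has_converged(ledger[:3]))  # False
```

Requiring more than one clean iteration before declaring convergence is a design assumption here: a single clean pass can be luck, while a short run of clean passes is a stronger signal that the loop can stop.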

Quick Start

Scaffold the eval skeleton inside the skill directory and run an iteration pass with the provided eval.yaml to start the loop.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: waxa-eval
Download link: https://github.com/mizchi/skills/archive/main.zip#waxa-eval

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
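For a manual install, the same steps (download, extract, copy into .claude/skills/) can be sketched in Python using only the standard library. This is a sketch under assumptions: the layout of the archive is not stated here, so the hypothetical helper `find_skill_dir` simply looks for the shallowest directory named waxa-eval inside the extracted tree. The function names are illustrative, not part of any tool.

```python
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path

ARCHIVE_URL = "https://github.com/mizchi/skills/archive/main.zip"
SKILL_NAME = "waxa-eval"


def find_skill_dir(root: Path, name: str) -> Path:
    """Return the shallowest directory called `name` under `root`.

    Assumption: the skill lives in a directory named after the skill
    somewhere inside the extracted archive.
    """
    matches = [p for p in root.rglob(name) if p.is_dir()]
    if not matches:
        raise FileNotFoundError(f"no directory named {name!r} under {root}")
    return min(matches, key=lambda p: len(p.parts))


def install_from_tree(extracted: Path, dest_root: Path, name: str = SKILL_NAME) -> Path:
    """Copy the skill directory from an extracted archive into dest_root/name."""
    source = find_skill_dir(extracted, name)
    target = dest_root / name
    dest_root.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok lets a re-install overwrite an earlier copy (Python 3.8+).
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


def install(dest_root: Path = Path(".claude/skills")) -> Path:
    """Download the archive, extract it to a temp dir, and install the skill."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "main.zip"
        urllib.request.urlretrieve(ARCHIVE_URL, archive)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(tmp)
        return install_from_tree(Path(tmp), dest_root)
```

Calling `install()` from the project root would place the skill under .claude/skills/waxa-eval, assuming the archive contains a directory with that name.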
