agent-test-harness
Community
Statistically validate skill triggers.
Software Engineering
#quality-assurance #read-only #skill-testing #statistical-significance #trigger-measurement #prompt-triggers #cross-trigger
Author: FrogAi
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This skill provides a structured method to measure whether skills and subagents trigger as expected on representative prompts, enabling objective validation of skill engagement.
Core Features & Use Cases
- Live frontmatter discovery: reads each skill's current name and description during evaluation.
- Multi-run per prompt: executes each test prompt at least 3 times, so each trigger rate is estimated from repeated runs rather than a single run.
- Structured reporting: outputs per-prompt and per-skill trigger data, including cross-trigger conflicts and tuning opportunities.
- Read-only evaluation: does not modify skill definitions; focuses on measurement and reporting.
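As an illustration of the structured report described above, the following is a minimal sketch of how run records might be aggregated into per-prompt trigger rates and cross-trigger counts. The record shape, the skill names (`test-writer`, `code-reviewer`), and the `summarize` function are hypothetical and are not taken from the skill itself.

```python
from collections import defaultdict

# Hypothetical run records: (prompt, expected skill, skills that actually triggered).
# The record shape and skill names are assumptions for illustration only.
RUNS = [
    ("write unit tests for parser", "test-writer", {"test-writer"}),
    ("write unit tests for parser", "test-writer", {"test-writer", "code-reviewer"}),
    ("write unit tests for parser", "test-writer", set()),
    ("review this pull request", "code-reviewer", {"code-reviewer"}),
    ("review this pull request", "code-reviewer", {"code-reviewer"}),
    ("review this pull request", "code-reviewer", {"code-reviewer"}),
]


def summarize(runs):
    """Return per-prompt trigger rates and cross-trigger counts."""
    total = defaultdict(int)   # number of runs per prompt
    hits = defaultdict(int)    # runs in which the expected skill triggered
    cross = defaultdict(int)   # (prompt, unexpected skill) -> number of runs
    for prompt, expected, triggered in runs:
        total[prompt] += 1
        if expected in triggered:
            hits[prompt] += 1
        # Any skill other than the expected one counts as a cross-trigger.
        for skill in triggered - {expected}:
            cross[(prompt, skill)] += 1
    rates = {p: hits[p] / total[p] for p in total}
    return rates, dict(cross)


rates, cross = summarize(RUNS)
for prompt, rate in rates.items():
    print(f"{prompt!r}: expected skill triggered in {rate:.0%} of runs")
for (prompt, skill), count in cross.items():
    print(f"cross-trigger: {skill!r} fired {count}x on {prompt!r}")
```

The sketch only reads and counts records, which mirrors the read-only design: nothing in it writes back to a skill definition.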
Quick Start
Run a file of test prompts to measure the trigger rate of each skill across multiple prompts.
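The listing does not document the prompts-file format or the command used to run it, so the following is only a sketch of the multi-run loop under stated assumptions. The JSON layout, the field names `prompt` and `expected_skill`, and the `fake_run` stand-in for a real agent invocation are all hypothetical.

```python
import json

# Hypothetical prompts-file content; the real file format is not documented here.
PROMPTS_JSON = """
[
  {"prompt": "write unit tests for parser", "expected_skill": "test-writer"},
  {"prompt": "review this pull request", "expected_skill": "code-reviewer"}
]
"""

RUNS_PER_PROMPT = 3  # the documented minimum number of runs per prompt


def fake_run(prompt, run_index):
    """Stand-in for a real agent invocation; returns the skill names that triggered."""
    if "unit tests" in prompt:
        # Simulate a flaky trigger that misses on the last run.
        return {"test-writer"} if run_index < 2 else set()
    return {"code-reviewer"}


def measure(cases, run, runs_per_prompt=RUNS_PER_PROMPT):
    """Run every prompt several times and return {prompt: trigger rate}."""
    results = {}
    for case in cases:
        hits = sum(
            case["expected_skill"] in run(case["prompt"], i)
            for i in range(runs_per_prompt)
        )
        results[case["prompt"]] = hits / runs_per_prompt
    return results


cases = json.loads(PROMPTS_JSON)
trigger_rates = measure(cases, fake_run)
for prompt, rate in trigger_rates.items():
    print(f"{prompt}: {rate:.2f}")
```

Passing the runner in as a function keeps the measurement loop independent of how the agent is actually invoked, so a real invocation could replace `fake_run` without changing `measure`.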
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: let Claude install the skill automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill: Name: agent-test-harness Download link: https://github.com/FrogAi/Xenopus/archive/main.zip#agent-test-harness Please download this .zip file, extract it, and install it in the .claude/skills/ directory.