cli-agent-evaluate-batch
Official
Batch-evaluate a CLI against AI failure modes.
Author: cli-agent-spec
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It helps catch CLI behavior that causes AI agents to hang silently, corrupt output, or misinterpret results, by running a comprehensive, resumable evaluation across many CLI Agent Spec failure modes.
Core Features & Use Cases
- Batch evaluation in one run: Tests a CLI tool against multiple §N failure modes, scoped by severity, by part, or by an explicit list of §N identifiers.
- Resumable findings and trace: Loads prior environment and evaluation artifacts, skips fully completed checks, and saves progress incrementally after each failure mode.
- Scorecard-ready results: Produces a per-failure-mode output plus a final scorecard table summarizing critical/high/medium outcomes.
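The resume-and-scorecard behavior described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the findings file layout, the function names (load_findings, evaluate_batch, scorecard), the run_check callback, and the "pass"/"fail" result values are all assumptions made for the example.

```python
import json
from pathlib import Path

# Severity levels named in the scorecard description above.
SEVERITIES = ("critical", "high", "medium")


def load_findings(path: Path) -> dict:
    """Load findings from a prior run if the file exists, else start empty."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def evaluate_batch(modes, run_check, path: Path) -> dict:
    """Run each failure-mode check at most once, saving after every mode.

    modes: list of (mode_id, severity) pairs, e.g. ("§1", "critical").
    run_check: hypothetical callable(mode_id) returning "pass" or "fail".
    """
    findings = load_findings(path)
    for mode_id, severity in modes:
        if mode_id in findings:
            continue  # already completed in an earlier run, so skip it
        findings[mode_id] = {"severity": severity, "result": run_check(mode_id)}
        # Save incrementally so an interrupted run can resume from here.
        path.write_text(
            json.dumps(findings, ensure_ascii=False, indent=2), encoding="utf-8"
        )
    return findings


def scorecard(findings: dict) -> dict:
    """Count pass/fail outcomes per severity level."""
    card = {s: {"pass": 0, "fail": 0} for s in SEVERITIES}
    for entry in findings.values():
        card[entry["severity"]][entry["result"]] += 1
    return card
```

Writing the findings file after every failure mode, rather than once at the end, is what lets a second invocation pick up where an interrupted one stopped. The real skill also tracks environment and trace artifacts, which this sketch omits.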
Quick Start
Run the skill to evaluate your target CLI across all selected §N failure modes, then review the generated findings, trace, issues, and final scorecard for remediation.
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: cli-agent-evaluate-batch
Download link: https://github.com/cli-agent-spec/cli-agent-spec/archive/main.zip#cli-agent-evaluate-batch
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.