cli-agent-evaluate-batch

Official

Batch-evaluate a CLI against AI failure modes.

Author: cli-agent-spec
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

It helps catch the ways a CLI can cause AI agents to hang silently, corrupt output, or misinterpret the tool's behavior. It does this by running a comprehensive, resumable evaluation of the CLI against many CLI Agent Spec failure modes.

Core Features & Use Cases

  • Batch evaluation in one run: Tests a CLI tool against multiple §N failure modes, with the scope selected by severity, by part, or by an explicit list of §N identifiers.
  • Resumable findings and trace: Loads prior environment and evaluation artifacts, skips fully completed checks, and saves progress incrementally after each failure mode.
  • Scorecard-ready results: Produces a per-failure-mode output plus a final scorecard table summarizing critical/high/medium outcomes.
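
The resumable behavior described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the JSON findings file, the `status` field, and the names `load_findings`, `evaluate_batch`, and `check` are all assumptions made for the example.

```python
import json
from pathlib import Path


def load_findings(path):
    """Return previously saved findings, or an empty dict on a first run."""
    if path.exists():
        return json.loads(path.read_text())
    return {}


def evaluate_batch(modes, check, findings_path):
    """Run `check` for each failure mode, skipping modes already completed.

    `modes` is a list of failure-mode identifiers (for example "§1").
    `check` is a hypothetical callable that evaluates one failure mode.
    """
    findings_path = Path(findings_path)
    findings = load_findings(findings_path)
    for mode in modes:
        # Skip checks that a previous run fully completed.
        if findings.get(mode, {}).get("status") == "complete":
            continue
        result = check(mode)
        findings[mode] = {"status": "complete", "result": result}
        # Save after every failure mode so an interrupted run can resume.
        findings_path.write_text(json.dumps(findings, indent=2))
    return findings
```

Writing the file after each failure mode, rather than once at the end, is what lets a second invocation pick up where an interrupted one stopped.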

Quick Start

Run the skill to evaluate your target CLI across all selected §N failure modes, then review the generated findings, trace, issues, and final scorecard for remediation.
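
As a rough idea of how per-failure-mode results might roll up into a final scorecard, the sketch below tallies outcomes by severity. The record layout (`severity` and `outcome` keys, with `pass`/`fail` outcomes) and the function names are hypothetical; the skill's real output format may differ.

```python
from collections import Counter

# Severity levels named in the scorecard description.
SEVERITIES = ("critical", "high", "medium")


def build_scorecard(results):
    """Count outcomes per severity from a list of result records."""
    counts = {sev: Counter() for sev in SEVERITIES}
    for record in results:
        sev = record["severity"]
        if sev in counts:  # ignore severities outside the scorecard
            counts[sev][record["outcome"]] += 1
    return counts


def format_scorecard(counts):
    """Render the counts as a small fixed-width text table."""
    lines = [f"{'severity':<9} {'pass':>4}  {'fail':>4}"]
    for sev in SEVERITIES:
        row = counts[sev]
        lines.append(f"{sev:<9} {row['pass']:>4}  {row['fail']:>4}")
    return "\n".join(lines)
```

Calling `format_scorecard(build_scorecard(results))` on a list of such records yields one row per severity, which makes it easy to see at a glance where remediation should start.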

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: Let Claude install the skill automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: cli-agent-evaluate-batch
Download link: https://github.com/cli-agent-spec/cli-agent-spec/archive/main.zip#cli-agent-evaluate-batch

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.