agent-comparison
Community
Compare agent variants with evidence-driven benchmarks.
Author: notque
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill provides a structured approach to compare agent variants through controlled benchmarks, quantifying quality and total session token cost to guide production decisions.
Core Features & Use Cases
- Systematically compare agent variants through controlled benchmarks
- Measure total session token cost (prompt, reasoning, tools, retries)
- Generate evidence-backed reports, highlighting production-impacting differences and bugs
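As a rough illustration of what "total session token cost" might mean in practice, the sketch below sums the four categories named above for one session. The class and field names are hypothetical and are not taken from the Skill itself; they only assume that each category can be counted separately.

```python
from dataclasses import dataclass


@dataclass
class SessionTokens:
    """Token counts for one benchmark session (hypothetical field names)."""
    prompt: int = 0
    reasoning: int = 0
    tools: int = 0
    retries: int = 0

    def total(self) -> int:
        # Total session cost is the sum of every category, not just the prompt.
        return self.prompt + self.reasoning + self.tools + self.retries


def cost_ratio(a: SessionTokens, b: SessionTokens) -> float:
    """Return how many times more tokens variant `a` used than variant `b`."""
    if b.total() == 0:
        raise ValueError("baseline session has no recorded tokens")
    return a.total() / b.total()
```

Comparing totals rather than prompt size alone matters because a variant with a shorter prompt can still cost more once retries and tool calls are counted.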
Quick Start
Run the four phases in order (Prepare, Benchmark, Grade, and Report) using identical prompts to evaluate both agent variants.
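The four phases can be pictured as a small pipeline. The sketch below is an assumption about their general shape, not the Skill's actual implementation: `Agent`, `Grader`, and `run_comparison` are hypothetical names, and a real benchmark would call the agent variants rather than plain functions.

```python
from statistics import mean
from typing import Callable, Dict, List

Agent = Callable[[str], str]           # hypothetical: prompt in, answer out
Grader = Callable[[str, str], float]   # hypothetical: (prompt, answer) -> score


def run_comparison(
    variants: Dict[str, Agent],
    prompts: List[str],
    grade: Grader,
) -> Dict[str, float]:
    # Prepare: freeze the prompt set so every variant sees identical input.
    frozen = tuple(prompts)

    # Benchmark: run each variant on each prompt.
    outputs = {name: [agent(p) for p in frozen] for name, agent in variants.items()}

    # Grade: score every answer with the same grader.
    scores = {
        name: [grade(p, answer) for p, answer in zip(frozen, answers)]
        for name, answers in outputs.items()
    }

    # Report: summarize each variant by its mean score.
    return {name: mean(values) for name, values in scores.items()}
```

Keeping the prompt set and the grader fixed across variants is what makes the resulting scores comparable.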
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill: Name: agent-comparison Download link: https://github.com/notque/claude-code-toolkit/archive/main.zip#agent-comparison Please download this .zip file, extract it, and install it in the .claude/skills/ directory.