agent-evaluation
Community
Evaluate AI agents with multi-dimensional rubrics.
Author: abdullah1854
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It evaluates agent outputs against multi-dimensional rubrics, providing scalable, objective assessments that go beyond any single metric.
Core Features & Use Cases
- Rubrics-Based Scoring: Factual accuracy, completeness, citation quality, sources, and tool efficiency.
- LLM-as-Judge Support: Scales evaluation across large test sets.
- Continuous Improvement: Stores evaluation history for trend analysis.
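As a rough illustration of the rubric scoring and history tracking listed above, here is a minimal Python sketch. It is not the skill's actual implementation: the dimension names are taken from the feature list, while the weights, the 0-to-1 score scale, and the names `RUBRIC_WEIGHTS`, `weighted_score`, and `EvaluationHistory` are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from statistics import mean

# Dimension names mirror the feature list; the weights are illustrative
# assumptions, not values defined by the skill.
RUBRIC_WEIGHTS = {
    "factual_accuracy": 0.30,
    "completeness": 0.25,
    "citation_quality": 0.15,
    "sources": 0.15,
    "tool_efficiency": 0.15,
}


def weighted_score(scores: dict[str, float],
                   weights: dict[str, float] = RUBRIC_WEIGHTS) -> float:
    """Combine per-dimension scores in [0, 1] into one weighted average."""
    missing = sorted(set(weights) - set(scores))
    if missing:
        raise ValueError(f"missing rubric dimensions: {missing}")
    for name in weights:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {scores[name]}")
    total_weight = sum(weights.values())
    return sum(weights[n] * scores[n] for n in weights) / total_weight


@dataclass
class EvaluationHistory:
    """Hypothetical in-memory store of overall scores, oldest first."""

    runs: list[float] = field(default_factory=list)

    def record(self, scores: dict[str, float]) -> float:
        """Score one evaluation, append it to the history, and return it."""
        overall = weighted_score(scores)
        self.runs.append(overall)
        return overall

    def trend(self, window: int = 3) -> float:
        """Mean of the last `window` runs minus the mean of the `window` before.

        Returns 0.0 when there is not enough history to compare two windows.
        """
        if window < 1 or len(self.runs) < 2 * window:
            return 0.0
        recent = self.runs[-window:]
        previous = self.runs[-2 * window:-window]
        return mean(recent) - mean(previous)


# Example usage with made-up scores for a single agent output.
history = EvaluationHistory()
overall = history.record({
    "factual_accuracy": 0.9,
    "completeness": 0.8,
    "citation_quality": 1.0,
    "sources": 0.7,
    "tool_efficiency": 0.6,
})
print(f"overall score: {overall:.3f}")
```

In an LLM-as-judge setup, the per-dimension scores would come from a judge model rather than being written by hand; the aggregation and trend steps would stay the same. A positive `trend` value suggests recent runs score higher than the preceding ones.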
Quick Start
Evaluate an agent's output by specifying one of the following aspects: rubrics, methodology, testset, continuous, or pitfalls.
Dependency Matrix
Required Modules
None required
Components
scripts
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: agent-evaluation
Download link: https://github.com/abdullah1854/ClaudeSuperSkills/archive/main.zip#agent-evaluation
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.