Name: agent-evals
Availability: InStock
Author: imsanghaar

System Documentation

What problem does it solve?

This Skill provides a structured approach to evaluating the reasoning quality of AI agents, moving beyond simple pass/fail tests to nuanced performance measurement.

Core Features & Use Cases

Systematic Quality Checks: Build robust evaluation frameworks for any AI agent.
Error Analysis: Identify patterns in agent failures to focus improvement efforts.
Regression Protection: Ensure new changes don't degrade agent performance.
Use Case: You've updated your customer support agent and want to ensure it still handles common queries accurately and doesn't introduce new errors. Activate this skill to run your evaluation suite and confirm performance.

Quick Start

Use the agent-evals skill to design a dataset for testing agent reasoning quality.

Please help me install this Skill: Name: agent-evals Download link: https://github.com/imsanghaar/agentfactory/archive/main.zip#agent-evals Please download this .zip file, extract it, and install it in the .claude/skills/ directory.

agent-evals

System Documentation

What problem does it solve?

Core Features & Use Cases

Quick Start

Dependency Matrix

Required Modules

Components

💻 Claude Code Installation

Agent Skills Search Helper