auto-eval

Community

Offline evaluation of AI automation agents.

Author: easingthemes
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Evaluates AI automation agents offline by running tests against pre-captured fixtures, so agent quality can be verified without invoking ADO or LLM APIs.

Core Features & Use Cases

  • Offline evaluation against pre-captured fixtures to ensure prompts and rule changes do not regress agent behavior.
  • Supports multiple agents (dor, pr-review, pr-answer) and an optional tier-2 mode for real LLM checks.
  • Provides end-to-end QA workflow guidance from argument parsing to result interpretation.
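To make the fixture-based workflow concrete, here is a minimal Python sketch of what argument parsing, offline evaluation, and result interpretation could look like. It is not the skill's actual implementation: the fixture fields (`name`, `input`, `expected_contains`), the `--tier2` and `--fixtures` flags, the directory layout, and the `run_agent` callable are all assumptions made for illustration. Only the agent names come from the feature list above.

```python
import argparse
import json
from pathlib import Path

# Agent names taken from the feature list; everything else here is hypothetical.
AGENTS = ("dor", "pr-review", "pr-answer")


def parse_args(argv):
    """Parse a hypothetical command line: an agent name plus optional flags."""
    parser = argparse.ArgumentParser(prog="auto-eval-sketch")
    parser.add_argument("agent", choices=AGENTS)
    # Assumed flag name for the optional tier-2 mode (real LLM checks).
    parser.add_argument("--tier2", action="store_true")
    # Assumed location of the pre-captured fixtures.
    parser.add_argument("--fixtures", type=Path, default=Path("fixtures"))
    return parser.parse_args(argv)


def load_fixtures(directory, agent):
    """Load every JSON fixture for one agent (assumed layout: <dir>/<agent>/*.json)."""
    paths = sorted((directory / agent).glob("*.json"))
    return [json.loads(p.read_text(encoding="utf-8")) for p in paths]


def evaluate_fixture(fixture, run_agent):
    """Run the agent on a captured input and check for the expected substrings."""
    actual = run_agent(fixture["input"])
    missing = [s for s in fixture["expected_contains"] if s not in actual]
    return {"name": fixture["name"], "passed": not missing, "missing": missing}


def evaluate_all(fixtures, run_agent):
    """Evaluate all fixtures and summarize pass/fail counts."""
    results = [evaluate_fixture(f, run_agent) for f in fixtures]
    passed = sum(1 for r in results if r["passed"])
    return {"passed": passed, "failed": len(results) - passed, "results": results}
```

In offline mode, `run_agent` would be a deterministic function that works only from the captured input, so no network call is made. A failed fixture lists the expected substrings that were missing, which points to the prompt or rule change that caused the regression.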

Quick Start

Run the evaluation framework against pre-captured fixtures to verify agent quality offline.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: auto-eval
Download link: https://github.com/easingthemes/dx-aem-flow/archive/main.zip#auto-eval

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.