Name: eval-review
Author: opendatahub-io

System Documentation

What problem does it solve?

Facilitates thorough, human-in-the-loop review of evaluation results by surfacing judge outcomes, qualitative feedback, and actionable improvement opportunities for evaluation SKILLs.

Core Features & Use Cases

Analyze per-case judge scores and outputs to identify where automated checks align with or miss human expectations.
Collect and synthesize user feedback and transcripts to detect recurring issues across runs, guiding SKILL.md improvements.
Propose targeted SKILL.md changes and evaluation workflow adjustments to drive faster, iterative improvements.
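As a rough illustration of the first feature, the sketch below compares automated judge outcomes with human verdicts per case and lists the cases where they disagree. It is a minimal sketch, not the harness's actual data model: the `CaseResult` class, its field names, and the `summarize_agreement` helper are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """One evaluated case. Field names are hypothetical, not the harness's schema."""
    case_id: str
    judge_passed: bool   # outcome from the automated judge
    human_passed: bool   # verdict recorded by the human reviewer


def summarize_agreement(results):
    """Return the agreement rate and the ids of cases where judge and human disagree."""
    if not results:
        return 0.0, []
    disagreements = [r.case_id for r in results if r.judge_passed != r.human_passed]
    rate = 1 - len(disagreements) / len(results)
    return rate, disagreements


# Made-up sample data for illustration only.
results = [
    CaseResult("case-001", judge_passed=True, human_passed=True),
    CaseResult("case-002", judge_passed=True, human_passed=False),
    CaseResult("case-003", judge_passed=False, human_passed=False),
    CaseResult("case-004", judge_passed=False, human_passed=True),
]
rate, disagreements = summarize_agreement(results)
print(f"agreement: {rate:.0%}; review first: {disagreements}")
```

Cases where the judge passed but the human did not (or the reverse) are usually the most useful ones to read first, since they point at checks that miss human expectations.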

Quick Start

Review a run by passing --run-id, then follow the prompts to record human feedback and generate improvement recommendations.

Please help me install this Skill:
Name: eval-review
Download link: https://github.com/opendatahub-io/agent-eval-harness/archive/main.zip#eval-review
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
