summarize-eval-results

Community

Summarize AI eval results.

Author: seiggy
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill automates analyzing and summarizing the output of AI agent evaluation tests, so you can see how the agent performed and where it failed without reading the raw results by hand.

Core Features & Use Cases

  • Automated Reporting: Parses complex JSON evaluation results into a human-readable markdown report.
  • Key Metrics Visualization: Presents a table of each scenario's performance across the evaluated metrics.
  • Failure Analysis: Details specific failures, including user prompts, tool call chains, and agent responses.
  • Use Case: After running a suite of agent tests, use this Skill to quickly generate a summary report that highlights which scenarios passed, failed, and why, allowing for rapid iteration on agent prompts.
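
The features above can be illustrated with a minimal sketch. It is not the Skill's actual script: the JSON layout (a "scenarios" list with "name", "passed", "metrics", "user_prompt", "tool_calls", and "agent_response" fields), the sample data, and the summarize function are all hypothetical, chosen only to show the general idea of turning evaluation JSON into a markdown table plus failure details.

```python
import json

# Hypothetical input shape and sample data; the Skill's real JSON schema
# is not documented on this page.
SAMPLE = """
{
  "scenarios": [
    {"name": "turn-on-lights", "passed": true,
     "metrics": {"tool_accuracy": 1.0, "relevance": 0.9},
     "user_prompt": "Turn on the kitchen lights",
     "tool_calls": ["find_light", "set_light_state"],
     "agent_response": "The kitchen lights are on."},
    {"name": "play-music", "passed": false,
     "metrics": {"tool_accuracy": 0.0, "relevance": 0.4},
     "user_prompt": "Play some jazz",
     "tool_calls": ["find_light"],
     "agent_response": "I could not find that light."}
  ]
}
"""


def summarize(results: dict) -> str:
    """Render a markdown report: a metrics table, then details per failure."""
    scenarios = results["scenarios"]
    # Collect every metric name so all rows share the same columns.
    metric_names = sorted({m for s in scenarios for m in s["metrics"]})

    header = ["Scenario", "Result"] + metric_names
    lines = ["# Evaluation Summary", ""]
    lines.append("| " + " | ".join(header) + " |")
    lines.append("| " + " | ".join("---" for _ in header) + " |")
    for s in scenarios:
        result = "PASS" if s["passed"] else "FAIL"
        # Missing metrics are shown as n/a rather than raising an error.
        cells = [
            f"{s['metrics'][m]:.2f}" if m in s["metrics"] else "n/a"
            for m in metric_names
        ]
        lines.append("| " + " | ".join([s["name"], result] + cells) + " |")

    failures = [s for s in scenarios if not s["passed"]]
    if failures:
        lines += ["", "## Failures"]
    for s in failures:
        lines += [
            "",
            f"### {s['name']}",
            f"- User prompt: {s['user_prompt']}",
            f"- Tool call chain: {' -> '.join(s['tool_calls'])}",
            f"- Agent response: {s['agent_response']}",
        ]
    return "\n".join(lines)


report = summarize(json.loads(SAMPLE))
print(report)
```

In this sketch only failed scenarios get a detail section, which keeps the report short when most of a test suite passes.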

Quick Start

Run the summarize-eval-results Skill on the latest evaluation reports.

Dependency Matrix

Required Modules

None required

Components

scripts

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: summarize-eval-results
Download link: https://github.com/seiggy/lucia-dotnet/archive/main.zip#summarize-eval-results

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.