summarize-eval-results
CommunitySummarize AI eval results.
Authorseiggy
Version1.0.0
Installs0
System Documentation
What problem does it solve?
This Skill automates the tedious process of analyzing and summarizing the output from AI agent evaluation tests, making it easier to understand performance and identify issues.
Core Features & Use Cases
- Automated Reporting: Parses complex JSON evaluation results into a human-readable markdown report.
- Key Metrics Visualization: Presents a clear table of scenario performance across various metrics.
- Failure Analysis: Details specific failures, including user prompts, tool call chains, and agent responses.
- Use Case: After running a suite of agent tests, use this Skill to quickly generate a summary report that highlights which scenarios passed, failed, and why, allowing for rapid iteration on agent prompts.
Quick Start
Run the summarize eval results skill on the latest evaluation reports.
Dependency Matrix
Required Modules
None requiredComponents
scripts
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: summarize-eval-results Download link: https://github.com/seiggy/lucia-dotnet/archive/main.zip#summarize-eval-results Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
Agent Skills Search Helper
Install a tiny helper to your Agent, search and equip skill from 471,000+ vetted skills library on demand.