eval-audit

Name: eval-audit
Author: marchatton

Community

Audit LLM evals for trust and impact.

Software Engineering #quality assurance #debugging #audit #llm #metrics #evaluation

Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

This Skill identifies critical flaws in your LLM evaluation pipeline, so you can tell whether your metrics are trustworthy and whether your AI product is genuinely improving.

Core Features & Use Cases

Diagnostic Checks: Systematically inspects six areas of your eval pipeline: error analysis, evaluator design, judge validation, human review, labeled data, and pipeline hygiene.
Prioritized Findings: Delivers a report of identified problems, ordered by their impact on your product's success.
Concrete Next Steps: Provides actionable recommendations, often referencing other skills, to fix identified issues.
Use Case: You've inherited an LLM evaluation system and are unsure if the reported metrics accurately reflect performance. This Skill will audit the existing setup and provide a clear roadmap for improving its reliability and trustworthiness.
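
To make the shape of a prioritized report concrete, here is a minimal sketch in Python. It is not the Skill's actual implementation or output format: the Finding class, the 1-to-5 impact scale, the prioritize and render_report helpers, and the sample findings are all hypothetical, chosen only to illustrate findings grouped by the six audit areas and ordered by impact.

```python
from dataclasses import dataclass

# The six audit areas named in the feature list.
AREAS = (
    "error analysis",
    "evaluator design",
    "judge validation",
    "human review",
    "labeled data",
    "pipeline hygiene",
)


@dataclass
class Finding:
    """One audit finding (hypothetical structure, for illustration only)."""
    area: str
    problem: str
    impact: int      # assumed scale: 1 (low) to 5 (high)
    next_step: str

    def __post_init__(self):
        # Reject findings that do not belong to one of the six areas.
        if self.area not in AREAS:
            raise ValueError(f"unknown area: {self.area!r}")


def prioritize(findings):
    """Return findings ordered from highest to lowest impact."""
    return sorted(findings, key=lambda f: f.impact, reverse=True)


def render_report(findings):
    """Render a ranked, one-line-per-finding plain-text report."""
    lines = []
    for rank, f in enumerate(prioritize(findings), start=1):
        lines.append(f"{rank}. [{f.area}] {f.problem} -> {f.next_step}")
    return "\n".join(lines)


# Made-up sample findings, not real audit results.
findings = [
    Finding("pipeline hygiene", "Eval set is never refreshed", 2,
            "Schedule periodic re-sampling"),
    Finding("judge validation", "LLM judge never compared with human labels", 5,
            "Measure judge agreement on a labeled sample"),
    Finding("error analysis", "No systematic review of failing traces", 4,
            "Read and categorize a sample of traces"),
]

report = render_report(findings)
print(report)
```

Sorting on a single numeric impact score keeps the ordering rule easy to inspect; a real audit would likely weigh impact with more judgment than one integer can capture.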

Quick Start

Audit the current LLM evaluation pipeline for potential issues and provide a prioritized list of findings.
