LLM-as-Judge Skill

LLM-based evaluation of agent outputs.

Author: reaatech
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Automatically assesses the quality of agent outputs against configurable criteria, so that judgments stay consistent and less manual review is needed.

Core Features & Use Cases

  • LLM-based evaluation of responses across criteria such as relevance, coherence, helpfulness, and factual accuracy.
  • Provides tools such as judge_output, batch_judge, and get_judge_config that return scores, feedback, and configuration data.
  • Use Case: benchmark and compare different agent responses in customer support, tutoring, or informational assistants to identify strengths and gaps.
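The comparison use case above can be sketched as a small aggregation step. This is a minimal illustration, not the skill's actual implementation: the result shape ({"agent": ..., "scores": {...}}), the 1-5 scale, and the helper names summarize and weakest_criterion are all assumptions made for the example. Only the four criteria names come from the feature list.

```python
from statistics import mean

# Criteria named in the feature list; the snake_case keys are an assumption.
CRITERIA = ("relevance", "coherence", "helpfulness", "factual_accuracy")


def summarize(results):
    """Average each criterion per agent.

    `results` is a list of dicts with a hypothetical shape:
    {"agent": str, "scores": {criterion: number}}.
    """
    by_agent = {}
    for result in results:
        by_agent.setdefault(result["agent"], []).append(result["scores"])
    return {
        agent: {c: round(mean([run[c] for run in runs]), 2) for c in CRITERIA}
        for agent, runs in by_agent.items()
    }


def weakest_criterion(summary, agent):
    """Return the criterion with the lowest average score for one agent."""
    scores = summary[agent]
    return min(scores, key=scores.get)


# Placeholder scores for illustration only; these are not real judge results.
example_results = [
    {"agent": "agent_a", "scores": {"relevance": 4, "coherence": 3,
                                    "helpfulness": 4, "factual_accuracy": 5}},
    {"agent": "agent_a", "scores": {"relevance": 5, "coherence": 4,
                                    "helpfulness": 4, "factual_accuracy": 4}},
    {"agent": "agent_b", "scores": {"relevance": 3, "coherence": 5,
                                    "helpfulness": 2, "factual_accuracy": 4}},
]

summary = summarize(example_results)
```

Averaging per criterion, rather than collapsing everything into one number, keeps the "strengths and gaps" visible: two agents with the same overall mean can still differ sharply on a single criterion.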

Quick Start

Provide a prompt and agent response along with evaluation criteria to receive a structured quality score and actionable feedback.
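As a rough sketch of that flow, the snippet below assembles the three inputs into a request and validates a structured result. The tool name judge_output comes from the feature list; everything else is an assumption for illustration, including the argument names, the JSON result shape ("scores" and "feedback" keys), the 1-5 scale, and the helper names build_judge_request and parse_judge_result. Check get_judge_config or the source repository for the real schema.

```python
import json


def build_judge_request(prompt, response, criteria):
    """Bundle the inputs for a judge_output call (argument names are assumed)."""
    if not criteria:
        raise ValueError("at least one criterion is required")
    return {
        "tool": "judge_output",
        "arguments": {
            "prompt": prompt,
            "response": response,
            "criteria": list(criteria),
        },
    }


def parse_judge_result(raw, criteria, scale=(1, 5)):
    """Validate a JSON result and compute an unweighted overall score.

    Assumes a hypothetical result shape:
    {"scores": {criterion: number}, "feedback": str}.
    """
    data = json.loads(raw)
    low, high = scale
    scores = data["scores"]
    for criterion in criteria:
        if criterion not in scores:
            raise ValueError(f"missing score for {criterion!r}")
        if not low <= scores[criterion] <= high:
            raise ValueError(f"score for {criterion!r} is outside {low}-{high}")
    overall = sum(scores[c] for c in criteria) / len(criteria)
    return {
        "scores": {c: scores[c] for c in criteria},
        "overall": overall,
        "feedback": data.get("feedback", ""),
    }


criteria = ["relevance", "coherence"]
request = build_judge_request(
    prompt="How do I reset my password?",
    response="Open Settings, choose Security, then select Reset password.",
    criteria=criteria,
)

# Placeholder result standing in for whatever the judge actually returns.
raw_result = json.dumps({
    "scores": {"relevance": 4, "coherence": 5},
    "feedback": "Clear steps; could mention the confirmation email.",
})
parsed = parse_judge_result(raw_result, criteria)
```

Validating the scores before using them guards against a judge model that omits a criterion or drifts off the expected scale, which is a common failure mode for LLM-generated structured output.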

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: LLM-as-Judge Skill
Download link: https://github.com/reaatech/agents-md-kit/archive/main.zip#llm-as-judge-skill

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.