LLM-as-Judge Skill
Community
LLM-based evaluation of agent outputs.
Author: reaatech
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Automatically assess the quality of agent outputs using configurable criteria to ensure consistent judgments and reduce manual review effort.
Core Features & Use Cases
- LLM-based evaluation of responses across criteria such as relevance, coherence, helpfulness, and factual accuracy.
- Provides tools such as judge_output, batch_judge, and get_judge_config, which return scores, feedback, and configuration data.
- Use Case: benchmark and compare different agent responses in customer support, tutoring, or informational assistants to identify strengths and gaps.
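To illustrate the benchmarking use case, here is a minimal sketch of how per-criterion scores might be rolled up to compare agents. The `Judgment` class, the `overall` and `compare` helpers, the 1-5 score scale, and the equal default weights are all assumptions made for this example; the skill's actual output schema is not documented here and may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean

# Criteria named in the feature list above.
CRITERIA = ("relevance", "coherence", "helpfulness", "factual_accuracy")


@dataclass
class Judgment:
    """Hypothetical container for one judged response."""
    agent: str
    scores: dict[str, float]  # criterion -> score on an assumed 1-5 scale


def overall(j: Judgment, weights: dict[str, float] | None = None) -> float:
    """Weighted mean of the per-criterion scores (equal weights by default)."""
    weights = weights or {c: 1.0 for c in CRITERIA}
    total = sum(weights[c] for c in CRITERIA)
    return sum(j.scores[c] * weights[c] for c in CRITERIA) / total


def compare(judgments: list[Judgment],
            weights: dict[str, float] | None = None) -> dict[str, dict]:
    """Return each agent's mean overall score and its weakest criterion."""
    by_agent: dict[str, list[Judgment]] = {}
    for j in judgments:
        by_agent.setdefault(j.agent, []).append(j)

    report = {}
    for agent, js in by_agent.items():
        per_criterion = {c: mean(j.scores[c] for j in js) for c in CRITERIA}
        report[agent] = {
            "overall": mean(overall(j, weights) for j in js),
            "weakest": min(per_criterion, key=per_criterion.get),
        }
    return report
```

Reporting the weakest criterion alongside the overall score is one way to surface the "gaps" mentioned above, since an agent with a good average can still be consistently poor on a single criterion.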
Quick Start
Provide a prompt and agent response along with evaluation criteria to receive a structured quality score and actionable feedback.
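The flow described above could look roughly like the sketch below: assemble a judging prompt from the three inputs, then validate the judge's structured reply. The function names `build_judge_prompt` and `parse_judgment`, the JSON reply shape, and the 1-5 scale are hypothetical and chosen only for illustration; they are not the skill's documented interface, and the call to the judging model itself is left out.

```python
import json


def build_judge_prompt(prompt: str, response: str, criteria: list[str]) -> str:
    """Assemble the text sent to the judging model (format is an assumption)."""
    criteria_list = "\n".join(f"- {c}" for c in criteria)
    return (
        "Rate the agent response on each criterion from 1 to 5.\n"
        f"Criteria:\n{criteria_list}\n\n"
        f"User prompt:\n{prompt}\n\n"
        f"Agent response:\n{response}\n\n"
        'Reply with JSON: {"scores": {<criterion>: <1-5>}, "feedback": "<text>"}'
    )


def parse_judgment(raw: str, criteria: list[str]) -> dict:
    """Validate the judge's JSON reply and return scores plus feedback."""
    data = json.loads(raw)
    scores = data.get("scores", {})
    missing = [c for c in criteria if c not in scores]
    if missing:
        raise ValueError(f"judge reply is missing criteria: {missing}")
    for c in criteria:
        s = scores[c]
        if not isinstance(s, (int, float)) or not 1 <= s <= 5:
            raise ValueError(f"score for {c!r} is out of range: {s!r}")
    return {
        "scores": {c: float(scores[c]) for c in criteria},
        "feedback": str(data.get("feedback", "")),
    }


# Usage with a canned reply standing in for the judging model's output.
criteria = ["relevance", "coherence"]
text = build_judge_prompt("How do I reset my password?",
                          "Click 'Forgot password' on the login page.",
                          criteria)
reply = '{"scores": {"relevance": 5, "coherence": 4}, "feedback": "Clear and direct."}'
result = parse_judgment(reply, criteria)
```

Validating the reply before using it matters because a judging model can omit a criterion or return an out-of-range value, and silently accepting that would undermine the consistency the skill is meant to provide.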
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: LLM-as-Judge Skill
Download link: https://github.com/reaatech/agents-md-kit/archive/main.zip#llm-as-judge-skill
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.