relevance-evals
Official
Automate agent evaluations and testing.
Author: RelevanceAI
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Manages the end-to-end evaluation process for AI agents, enabling structured test design, execution, and results analysis within Relevance AI.
Core Features & Use Cases
- Test-set and test-case management for end-to-end agent evals
- Two evaluation modes: generate_and_score and score_only
- Batch-based result tracking with per-run rule judgments and summaries
- Guidelines for designing reliable, observable rules and scenarios
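To make the two evaluation modes concrete, here is a minimal local sketch. It does not use the Relevance AI API: `Rule`, `EvalCase`, `evaluate_case`, and `echo_agent` are hypothetical names invented for illustration. Only the mode names `generate_and_score` and `score_only` come from the skill description. The sketch assumes that `generate_and_score` runs the agent to produce an output before judging it, and that `score_only` judges an output that already exists.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    """Hypothetical rule: a name plus an observable pass/fail check."""
    name: str
    check: Callable[[str], bool]


@dataclass
class EvalCase:
    """Hypothetical test case; existing_output is only used by score_only."""
    prompt: str
    existing_output: Optional[str] = None


def evaluate_case(
    case: EvalCase,
    rules: List[Rule],
    mode: str,
    agent: Optional[Callable[[str], str]] = None,
) -> Dict[str, bool]:
    """Return one judgment per rule for a single case."""
    if mode == "generate_and_score":
        # Assumption: this mode produces a fresh agent output, then scores it.
        if agent is None:
            raise ValueError("generate_and_score needs an agent callable")
        output = agent(case.prompt)
    elif mode == "score_only":
        # Assumption: this mode scores an output that was recorded earlier.
        if case.existing_output is None:
            raise ValueError("score_only needs an existing output")
        output = case.existing_output
    else:
        raise ValueError(f"unknown mode: {mode}")
    return {rule.name: rule.check(output) for rule in rules}


def echo_agent(prompt: str) -> str:
    """Stand-in agent used only for this example."""
    return f"Refund policy: {prompt}"


rules = [
    Rule("mentions_refund", lambda out: "refund" in out.lower()),
    Rule("under_200_chars", lambda out: len(out) < 200),
]

generated = evaluate_case(
    EvalCase("30 days"), rules, "generate_and_score", agent=echo_agent
)
scored = evaluate_case(
    EvalCase("30 days", existing_output="We cannot help."), rules, "score_only"
)
print(generated)
print(scored)
```

Both rules check something directly visible in the output text, which is the sense of "observable" this sketch takes from the guidelines bullet above.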
Quick Start
Create a new test set, add test cases, run an evaluation batch, and review the results.
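The four Quick Start steps can be sketched locally as follows. This is an illustration under stated assumptions, not the skill's actual interface: `EvalSet`, `add_case`, `run_batch`, and `stub_agent` are hypothetical names, and the single substring rule stands in for whatever rules a real test set would define.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EvalSet:
    """Hypothetical in-memory test set."""
    name: str
    cases: List[Dict[str, str]] = field(default_factory=list)

    def add_case(self, prompt: str, must_contain: str) -> None:
        # Each case pairs a prompt with one substring the output must contain.
        self.cases.append({"prompt": prompt, "must_contain": must_contain})


def run_batch(eval_set: EvalSet, agent: Callable[[str], str]) -> dict:
    """Run every case once and collect per-run judgments plus a summary."""
    runs = []
    for case in eval_set.cases:
        output = agent(case["prompt"])
        passed = case["must_contain"].lower() in output.lower()
        runs.append({"prompt": case["prompt"], "output": output, "passed": passed})
    passed_count = sum(run["passed"] for run in runs)
    return {
        "test_set": eval_set.name,
        "runs": runs,
        "summary": {
            "total": len(runs),
            "passed": passed_count,
            "failed": len(runs) - passed_count,
        },
    }


def stub_agent(prompt: str) -> str:
    """Stand-in agent that only knows the opening hours."""
    return "Our office opens at 9am." if "open" in prompt else "I am not sure."


# 1. Create a test set. 2. Add test cases.
support = EvalSet("support-smoke")
support.add_case("When do you open?", "9am")
support.add_case("What is the refund window?", "30 days")

# 3. Run an evaluation batch. 4. Review the results.
batch = run_batch(support, stub_agent)
print(batch["summary"])
for run in batch["runs"]:
    print(run["passed"], run["prompt"])
```

Keeping the per-run records next to the summary lets a failed case be traced back to the exact output that caused it.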
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: let Claude install the skill automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: relevance-evals
Download link: https://github.com/RelevanceAI/agent-skills/archive/main.zip#relevance-evals
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.