eval-faq

Official

Practical answers to eval methodology questions.

Author: microsoft
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Answers AI agent evaluation questions with practical, opinionated guidance grounded primarily in Microsoft's agent evaluation ecosystem (MS Learn, Eval Scenario Library, Triage & Improvement Playbook, Eval Guidance Kit) supplemented by select industry sources.

Core Features & Use Cases

  • Provides authoritative, cited guidance for eval-method selection, dataset design, non-determinism handling, tool-call evaluation, and red-teaming.
  • Synthesizes framework references from MS Learn and the Triage Playbook to support planning across the Define (Stage 1), Set Baseline & Iterate, Systematic Expansion, and Operationalize stages.
  • Use cases include planning evals, interpreting results, and triaging failures with root-cause analysis.

Quick Start

Ask a question using /eval-faq <your question> to receive actionable guidance.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: eval-faq
Download link: https://github.com/microsoft/eval-guide/archive/main.zip#eval-faq

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
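
For a manual install, the download, extract, and copy steps described above could be scripted roughly as follows. This is a minimal sketch, not an official installer: the function names `download_archive` and `install_skill_from_zip` are hypothetical, and it assumes the archive contains a directory named `eval-faq` with a `SKILL.md` file inside it. The actual layout of the repository is not confirmed by this page.

```python
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path

# URL from the listing above. The "#eval-faq" fragment is omitted here because
# URL fragments are not sent to the server.
ARCHIVE_URL = "https://github.com/microsoft/eval-guide/archive/main.zip"


def download_archive(url: str, dest: Path) -> Path:
    """Download the archive at url to dest and return dest."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


def install_skill_from_zip(zip_path: Path, skill_name: str, skills_dir: Path) -> Path:
    """Extract zip_path and copy the skill folder into skills_dir.

    Assumption (not confirmed by the listing): the skill lives in a directory
    named skill_name that contains a SKILL.md file, somewhere in the archive.
    """
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)

        # Find every directory with the skill's name that holds a SKILL.md.
        candidates = [
            p for p in Path(tmp).rglob(skill_name)
            if p.is_dir() and (p / "SKILL.md").is_file()
        ]
        if not candidates:
            raise FileNotFoundError(
                f"no '{skill_name}' skill folder found in {zip_path}"
            )

        # If several match, take the one closest to the archive root.
        source = min(candidates, key=lambda p: len(p.parts))
        target = skills_dir / skill_name
        skills_dir.mkdir(parents=True, exist_ok=True)
        shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

One way to call it from a project root would be `install_skill_from_zip(download_archive(ARCHIVE_URL, Path("main.zip")), "eval-faq", Path(".claude/skills"))`. Requiring `SKILL.md` keeps the sketch from copying an unrelated folder that happens to share the skill's name.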
