eval-faq
Official
Practical answers to eval methodology questions.
Category: Education & Research
Tags: #methodology #evaluation #triage #eval #tool-invocation #non-determinism #scenario-library
Author: microsoft
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Answers AI agent evaluation questions with practical, opinionated guidance grounded primarily in Microsoft's agent evaluation ecosystem (MS Learn, Eval Scenario Library, Triage & Improvement Playbook, Eval Guidance Kit) supplemented by select industry sources.
Core Features & Use Cases
- Provides authoritative, cited guidance for eval-method selection, dataset design, non-determinism handling, tool-call evaluation, and red-teaming.
- Synthesizes framework references from MS Learn and the Triage Playbook to support planning across the framework's stages: Define (Stage 1), Set Baseline & Iterate, Systematic Expansion, and Operationalize.
- Use cases include planning evals, interpreting results, and triaging failures with root-cause analysis.
Quick Start
Ask a question with /eval-faq <your question> to receive actionable guidance.
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: let Claude install the Skill automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: eval-faq
Download link: https://github.com/microsoft/eval-guide/archive/main.zip#eval-faq
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
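For readers who prefer to install by hand, the extract-and-install step can be sketched in Python as follows. This is a minimal sketch, not the official installer: the helper name install_skill is hypothetical, and it assumes the archive contains a folder named after the skill (eval-faq) somewhere in its tree, which the listing does not confirm. It expects the .zip to have been downloaded already.

```python
import shutil
import zipfile
from pathlib import Path, PurePosixPath


def install_skill(archive: Path, skill_name: str, skills_dir: Path) -> Path:
    """Copy the folder called `skill_name` out of `archive` into `skills_dir`.

    Assumption: the skill's files sit under a directory named `skill_name`
    inside the archive (for example <repo>-main/<skill_name>/...).
    """
    target = skills_dir / skill_name
    copied = 0
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            parts = PurePosixPath(name).parts
            if skill_name not in parts:
                continue  # file belongs to another part of the repository
            rel = parts[parts.index(skill_name) + 1:]
            if not rel or name.endswith("/"):
                continue  # directory entry, nothing to copy
            if ".." in rel:
                raise ValueError(f"unsafe path in archive: {name}")
            dest = target.joinpath(*rel)
            dest.parent.mkdir(parents=True, exist_ok=True)
            with zf.open(name) as src, open(dest, "wb") as dst:
                shutil.copyfileobj(src, dst)
            copied += 1
    if copied == 0:
        raise FileNotFoundError(f"no '{skill_name}' folder found in {archive}")
    return target
```

With the archive saved locally as main.zip, a call such as install_skill(Path("main.zip"), "eval-faq", Path(".claude/skills")) would place the files under .claude/skills/eval-faq/. If the repository lays the skill out differently, adjust the folder matching accordingly.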