gaia-playbook

Official

A GAIA benchmark playbook for debugging failure modes.

Author: Darwin-Agent
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Provides guidance for the GAIA benchmark: diagnosing trajectory gaps, mapping failure modes to actionable interventions, and streamlining tool-spec authoring to improve performance.

Core Features & Use Cases

  • Maps GAIA failure modes (A-H) to concrete tooling patterns and harness configurations.
  • Provides templates and best practices for drafting TOOL_SPEC.md and coordinating with reference materials.
  • Supports teams in planning, experimentation, and score improvement across multi-hop GAIA tasks.

Quick Start

Identify your GAIA trajectory gaps, then draft a TOOL_SPEC.md that addresses the most critical missing capability.
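
As a rough illustration of this step, the sketch below tallies failure-mode letters across failed trajectories and writes a skeleton TOOL_SPEC.md for the most frequent one. This page does not define what modes A-H mean or what the playbook's TOOL_SPEC.md template looks like, so the sample labels, the `rank_failure_modes` and `draft_tool_spec` helpers, and the skeleton headings are all hypothetical.

```python
from collections import Counter

# Assumption: failure modes are treated as opaque letters A-H, because
# this page names the range but does not describe the individual modes.
FAILURE_MODES = frozenset("ABCDEFGH")


def rank_failure_modes(labels):
    """Return (mode, count) pairs, most frequent first; ties broken by letter."""
    counts = Counter(label for label in labels if label in FAILURE_MODES)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def draft_tool_spec(mode, count, total):
    """Render a bare TOOL_SPEC.md skeleton (hypothetical layout, not the playbook's template)."""
    return "\n".join([
        "# TOOL_SPEC",
        "",
        f"Target failure mode: {mode} ({count} of {total} failed trajectories)",
        "",
        "## Missing capability",
        "TODO",
        "",
        "## Proposed tool",
        "TODO",
    ])


# Hypothetical annotations: one letter per failed trajectory.
labels = ["C", "A", "C", "F", "C", "A"]
ranked = rank_failure_modes(labels)
top_mode, top_count = ranked[0]
spec = draft_tool_spec(top_mode, top_count, len(labels))
print(spec)
```

Ranking by raw frequency is only one way to pick the "most critical" gap; weighting by task difficulty or by expected score gain may suit some teams better.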

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: gaia-playbook
Download link: https://github.com/Darwin-Agent/HarnessX/archive/main.zip#gaia-playbook

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
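
For a manual install, the extraction step might look like the sketch below. It assumes the archive has already been downloaded and that the skill lives in a directory named gaia-playbook somewhere inside it; this page does not confirm the archive's internal layout, and `extract_skill` is a hypothetical helper, not part of any Claude Code tooling.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_skill(zip_path, skill_name, skills_dir):
    """Copy the first directory named `skill_name` found in the archive
    into `skills_dir/skill_name`. Returns the number of files written."""
    dest_root = Path(skills_dir) / skill_name
    prefix = None  # archive path of the first matching directory
    written = 0
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            parts = PurePosixPath(info.filename).parts
            # The skill name must appear as a directory, not as the file itself.
            if skill_name not in parts[:-1]:
                continue
            idx = parts.index(skill_name)
            if prefix is None:
                prefix = parts[: idx + 1]
            if parts[: idx + 1] != prefix:
                continue  # a different directory with the same name
            rel = parts[idx + 1:]
            if ".." in rel:
                continue  # skip entries that try to escape the destination
            target = dest_root.joinpath(*rel)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(info))
            written += 1
    return written
```

With the archive saved locally as main.zip, calling `extract_skill("main.zip", "gaia-playbook", ".claude/skills")` would place the files under .claude/skills/gaia-playbook/. A return value of 0 means no matching directory was found, in which case the layout assumption above does not hold.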
