harness:health

Community

Auto-diagnose and auto-fix dataset health

Author: raphaelchristi
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Dataset quality problems undermine reliable evaluation of LLM agents. This skill identifies and repairs them by checking size, difficulty distribution, dead examples, coverage, and splits, so that evolution runs and evaluations produce meaningful results.
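To make the checks concrete, here is a minimal sketch of a health diagnostic over an in-memory list of examples. It is not the skill's actual dataset_health.py. The example schema, the thresholds, the scoring weights, and the names Issue and diagnose are all assumptions made for illustration, and "dead" is assumed here to mean an example that every run already passes. The coverage check is left out for brevity.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Issue:
    check: str      # which check raised the issue
    severity: str   # "critical" or "warning"
    message: str


def diagnose(examples, min_size=20, min_hard_fraction=0.2):
    """Return (score, issues) for a list of example dicts.

    Assumed (hypothetical) schema per example:
      {"id": str, "difficulty": "easy" | "medium" | "hard",
       "split": "train" | "held_out" | None, "pass_rate": float}
    """
    issues = []

    # Size: too few examples makes any evaluation noisy.
    if len(examples) < min_size:
        issues.append(Issue("size", "critical",
                            f"{len(examples)} examples, want at least {min_size}"))

    # Difficulty distribution: require a minimum share of hard examples.
    difficulties = Counter(e.get("difficulty") for e in examples)
    if examples and difficulties["hard"] / len(examples) < min_hard_fraction:
        issues.append(Issue("difficulty", "warning",
                            "too few hard examples"))

    # Dead examples (assumption): always passed, so they carry no signal.
    dead = [e["id"] for e in examples if e.get("pass_rate") == 1.0]
    if dead:
        issues.append(Issue("dead_examples", "warning",
                            f"{len(dead)} examples always pass"))

    # Splits: every example should belong to train or held_out.
    unsplit = [e["id"] for e in examples
               if e.get("split") not in ("train", "held_out")]
    if unsplit:
        issues.append(Issue("splits", "critical",
                            f"{len(unsplit)} examples have no split"))

    # Hypothetical scoring: start at 100, subtract per issue, floor at 0.
    penalty = sum(30 if i.severity == "critical" else 10 for i in issues)
    return max(0, 100 - penalty), issues
```

Calling diagnose on a dataset returns a score plus the list of issues with severities, which mirrors the shape of the report described in this listing.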

Core Features & Use Cases

  • Health Diagnostic: Runs a dataset health check that reports a health score, example counts, and a list of issues with severities.
  • Automated Corrections: Applies corrections such as creating train/held_out splits, retiring dead examples, and invoking test-generation to rebalance or harden the dataset.
  • Integration & Reporting: Uses the langsmith Client and dataset_health.py to update examples and prints a final health summary and warnings for any unresolved critical issues.
  • Use Case: Run this before /harness:evolve to ensure the evaluation dataset is balanced, challenging, and free of dead or mis-split examples.
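The automated corrections above can be sketched in the same spirit. The code below is an illustration under stated assumptions, not the skill's implementation: it works on plain dicts instead of calling the langsmith Client, the names assign_split and apply_fixes are hypothetical, the 30% held_out fraction and the "retired" label are made up for the example, and a dead example is again assumed to be one with a pass rate of 1.0.

```python
import hashlib


def assign_split(example_id, held_out_fraction=0.3):
    """Deterministically map an example id to "train" or "held_out".

    Hashing the id (instead of random sampling) keeps an example in the
    same split across repeated runs.
    """
    digest = hashlib.sha256(example_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return "held_out" if bucket < held_out_fraction * 2**64 else "train"


def apply_fixes(examples):
    """Return corrected copies of the examples; the input is not mutated."""
    fixed = []
    for example in examples:
        example = dict(example)  # shallow copy so the caller's data is untouched
        if example.get("pass_rate") == 1.0:
            # Assumption: dead examples are retired rather than deleted.
            example["split"] = "retired"
        elif example.get("split") not in ("train", "held_out"):
            example["split"] = assign_split(example["id"])
        fixed.append(example)
    return fixed
```

A hash-based split is one reasonable design choice here because re-running the fix never moves an example from train to held_out, which would otherwise leak held-out data into later evolution runs.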

Quick Start

Run the harness:health check to analyze dataset quality, auto-apply suggested fixes, and print the final health report.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: harness:health
Download link: https://github.com/raphaelchristi/harness-evolver/archive/main.zip#harness-health

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.