harness:health
Community
Auto-diagnose and auto-fix dataset health
Author: raphaelchristi
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Identifies and repairs dataset quality problems that undermine reliable evaluation of LLM agents. It checks dataset size, difficulty distribution, dead examples, coverage, and train/held_out splits, so that evolution runs and evaluations produce meaningful results.
Core Features & Use Cases
- Health Diagnostic: Runs a dataset health check that reports a health score, example counts, and a list of issues with severities.
- Automated Corrections: Applies corrections such as creating train/held_out splits, retiring dead examples, and invoking test-generation to rebalance or harden the dataset.
- Integration & Reporting: Uses the langsmith Client and dataset_health.py to update examples, then prints a final health summary along with warnings for any critical issues that remain unresolved.
- Use Case: Run this before /harness:evolve to ensure the evaluation dataset is balanced, challenging, and free of dead or mis-split examples.
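The diagnose-then-fix flow described in the features above can be illustrated with a minimal, self-contained sketch. It does not reproduce dataset_health.py or call the langsmith Client: every name here (`Example`, `Report`, `diagnose`, `assign_split`, `auto_fix`), the scoring weights, and the thresholds are hypothetical. It also assumes that a "dead" example is one that every run already passes, which the listing does not define.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


# Hypothetical data model; the real skill stores examples in a LangSmith dataset.
@dataclass
class Example:
    id: str
    difficulty: str               # "easy", "medium", or "hard"
    pass_rate: float              # fraction of recent runs that passed
    split: Optional[str] = None   # "train", "held_out", or None if unassigned
    retired: bool = False


@dataclass
class Report:
    score: int                    # 0-100, higher is healthier
    total: int                    # number of active (non-retired) examples
    issues: List[Tuple[str, str]] = field(default_factory=list)  # (severity, message)


def diagnose(examples, min_size=20, dead_threshold=1.0):
    """Return a health report with a score, a count, and a list of issues."""
    active = [e for e in examples if not e.retired]
    issues = []
    if len(active) < min_size:
        issues.append(("critical", f"only {len(active)} active examples (< {min_size})"))
    # Assumption: an example is "dead" when it always passes and so carries no signal.
    dead = [e for e in active if e.pass_rate >= dead_threshold]
    if dead:
        issues.append(("warning", f"{len(dead)} dead examples"))
    if any(e.split is None for e in active):
        issues.append(("critical", "some examples have no train/held_out split"))
    if active and not any(e.difficulty == "hard" for e in active):
        issues.append(("warning", "no hard examples"))
    # Arbitrary illustrative weights: 30 points per critical issue, 10 per warning.
    penalty = sum(30 if severity == "critical" else 10 for severity, _ in issues)
    return Report(score=max(0, 100 - penalty), total=len(active), issues=issues)


def assign_split(example_id, held_out_fraction=0.2):
    """Deterministically map an example id to a split by hashing it."""
    bucket = hashlib.sha256(example_id.encode("utf-8")).digest()[0] / 256
    return "held_out" if bucket < held_out_fraction else "train"


def auto_fix(examples, dead_threshold=1.0):
    """Retire dead examples and give every remaining example a split."""
    for e in examples:
        if e.retired:
            continue
        if e.pass_rate >= dead_threshold:
            e.retired = True
        elif e.split is None:
            e.split = assign_split(e.id)
    return examples
```

Calling `diagnose`, then `auto_fix`, then `diagnose` again mirrors the check, correct, and report cycle. Hashing the example id keeps split assignment stable across runs, so an example never drifts between train and held_out. Issues that no automatic fix covers, such as a missing difficulty tier, stay in the second report as unresolved.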
Quick Start
Run the harness:health check to analyze dataset quality, auto-apply suggested fixes, and print the final health report.
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: harness:health
Download link: https://github.com/raphaelchristi/harness-evolver/archive/main.zip#harness-health
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.