harness:health

Name: harness:health
Author: raphaelchristi

Community

Auto-diagnose and auto-fix dataset health

Data & Analytics #evaluation #test-generation #data-quality #langsmith #auto-correct #dataset-health


Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

Identifies and repairs dataset quality problems that undermine reliable evaluation of LLM agents. It checks dataset size, difficulty distribution, dead examples, coverage, and splits so that evolution runs and evaluations produce meaningful results.

Core Features & Use Cases

Health Diagnostic: Runs a dataset health check that reports a health score, example counts, and a list of issues with severities.
Automated Corrections: Applies corrections such as creating train/held_out splits, retiring dead examples, and invoking test-generation to rebalance or harden the dataset.
Integration & Reporting: Uses the langsmith Client and dataset_health.py to update examples, then prints a final health summary along with warnings for any critical issues that remain unresolved.
Use Case: Run this before /harness:evolve to ensure the evaluation dataset is balanced, challenging, and free of dead or mis-split examples.
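
The diagnose-then-fix flow described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the contents of dataset_health.py: the names Issue, HealthReport, diagnose, and assign_split, the severity labels, the scoring weights, the min_size threshold, and the "dead" flag on each example are all assumptions made for the example. The listing does not say how an example is judged dead, so the sketch takes a precomputed flag.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Issue:
    severity: str  # hypothetical labels: "critical" or "warning"
    message: str


@dataclass
class HealthReport:
    example_count: int
    issues: list = field(default_factory=list)

    @property
    def score(self) -> int:
        # Hypothetical scoring: start at 100 and subtract a penalty per issue.
        penalty = sum(30 if i.severity == "critical" else 10 for i in self.issues)
        return max(0, 100 - penalty)


def diagnose(examples, min_size=20):
    """Check size, split presence, and dead examples on a list of dicts."""
    report = HealthReport(example_count=len(examples))
    if len(examples) < min_size:
        report.issues.append(
            Issue("critical", f"only {len(examples)} examples (< {min_size})")
        )
    if not any(e.get("split") == "held_out" for e in examples):
        report.issues.append(Issue("critical", "no held_out split"))
    dead = [e for e in examples if e.get("dead")]  # assumed precomputed flag
    if dead:
        report.issues.append(Issue("warning", f"{len(dead)} dead examples"))
    return report


def assign_split(example_id: str, held_out_fraction: float = 0.2) -> str:
    """Deterministically map an example id to "train" or "held_out"."""
    digest = hashlib.sha256(example_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return "held_out" if bucket < held_out_fraction else "train"


if __name__ == "__main__":
    examples = [{"id": f"ex-{i}", "dead": i == 0} for i in range(5)]
    report = diagnose(examples)
    print(f"health score: {report.score}, examples: {report.example_count}")
    for issue in report.issues:
        print(f"[{issue.severity}] {issue.message}")
```

Hashing the example id keeps the split assignment stable across runs, so an example does not drift between train and held_out when the check is repeated. In the actual skill the split would presumably be written back through the langsmith Client; that step is left out here so the sketch runs without credentials.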

Quick Start

Run the harness:health check to analyze dataset quality, auto-apply suggested fixes, and print the final health report.

harness:health
