eval-mlflow
Official
MLflow-backed eval: log, sync, feedback.
Author: opendatahub-io
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Connects the evaluation harness to MLflow: it logs run results, syncs evaluation datasets, and exchanges feedback with MLflow traces, so experiments and their feedback can be tracked end to end.
Core Features & Use Cases
- MLflow experiment tracking: log parameters, metrics, and artifacts from evaluation runs to MLflow for consistent analytics.
- Dataset synchronization: push evaluation cases to MLflow datasets and maintain cross-run traceability.
- Feedback integration: attach judge and human feedback to traces and pull annotations back into the evaluation pipeline for optimization.
- Use case: a data science team validates model performance across many runs and compares results in a single MLflow dashboard.
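The experiment-tracking feature above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual scripts: `summarize_results`, `log_eval_run`, the case fields (`id`, `score`, `passed`), and the experiment name are hypothetical. Only the `mlflow` calls (`set_experiment`, `start_run`, `log_params`, `log_metrics`, `log_dict`) are standard MLflow tracking APIs.

```python
from statistics import mean


def summarize_results(cases):
    """Reduce per-case results to aggregate metrics (hypothetical case schema)."""
    scores = [c["score"] for c in cases]
    passed = sum(1 for c in cases if c["passed"])
    return {
        "mean_score": mean(scores),
        "pass_rate": passed / len(cases),
        "num_cases": float(len(cases)),
    }


def log_eval_run(cases, params, experiment="eval-harness-demo"):
    """Log one evaluation run to MLflow; requires the mlflow package."""
    import mlflow  # imported lazily so summarize_results works without it

    mlflow.set_experiment(experiment)
    with mlflow.start_run() as run:
        mlflow.log_params(params)                     # run configuration
        mlflow.log_metrics(summarize_results(cases))  # aggregate scores
        mlflow.log_dict({"cases": cases}, "eval_cases.json")  # raw cases as an artifact
        return run.info.run_id


# Illustrative data only, not real evaluation results.
example_cases = [
    {"id": "case-1", "score": 1.0, "passed": True},
    {"id": "case-2", "score": 0.5, "passed": False},
]
metrics = summarize_results(example_cases)
```

Keeping the aggregation separate from the MLflow calls means the metrics can be checked locally before anything is sent to a tracking server.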
Quick Start
Install the skill and configure MLflow tracking, then run the evaluation workflow to start logging results and syncing data.
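Configuring MLflow tracking usually means pointing MLflow at a tracking server. The sketch below shows one way this might look; the skill's own configuration mechanism is not described here. `MLFLOW_TRACKING_URI` is the standard MLflow environment variable, while `resolve_tracking_uri`, `configure_tracking`, and the local `file:./mlruns` fallback are assumptions for illustration.

```python
import os


def resolve_tracking_uri(env=None, default="file:./mlruns"):
    """Pick the tracking URI from the environment, else a local fallback."""
    env = os.environ if env is None else env
    uri = env.get("MLFLOW_TRACKING_URI", "").strip()
    return uri or default


def configure_tracking(env=None):
    """Apply the resolved URI to MLflow; requires the mlflow package."""
    import mlflow  # imported lazily so the resolver works without it

    uri = resolve_tracking_uri(env)
    mlflow.set_tracking_uri(uri)
    return uri
```

Calling `configure_tracking()` once before the evaluation workflow starts would direct subsequent runs to the chosen server.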
Dependency Matrix
Required Modules
mlflow, pyyaml
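Before running the workflow, it may help to confirm that the required modules are importable. A minimal sketch, assuming a hypothetical helper `missing_modules`; note that the `pyyaml` distribution is imported as `yaml`:

```python
from importlib.util import find_spec

# Distribution name -> import name (pyyaml is imported as "yaml").
REQUIRED = {"mlflow": "mlflow", "pyyaml": "yaml"}


def missing_modules(required=REQUIRED):
    """Return the distribution names whose import module cannot be found."""
    return sorted(dist for dist, mod in required.items() if find_spec(mod) is None)
```

An empty list from `missing_modules()` indicates both dependencies are installed in the current environment.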
Components
scripts
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill: Name: eval-mlflow Download link: https://github.com/opendatahub-io/agent-eval-harness/archive/main.zip#eval-mlflow Please download this .zip file, extract it, and install it in the .claude/skills/ directory.