dw-skill-eval-verify
Community
Verify evaluation quality and regression health.
Data & Analytics
#automation #calibration #quality-assurance #regression-testing #artifact-generation #evaluation-metrics
Author: xurik
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It helps teams ensure their skill evaluation workflows produce reliable, comparable results by validating quality metrics, performing regression checks after changes, and surfacing actionable insights.
Core Features & Use Cases
- Automated quality checks across evaluation dimensions (distinguishability, difficulty distribution, judge consistency, calibration).
- Baseline regression workflow to compare current scores against the latest baseline and generate regression reports.
- Guided, automated reporting and artifact generation to support audits and release processes.
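The baseline regression workflow listed above can be pictured with a minimal Python sketch. It is an illustration only, not the skill's actual implementation: the function names (`find_regressions`, `format_report`), the score format (a mapping from case id to a numeric score), and the `tolerance` threshold are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Regression:
    """One evaluation case whose score dropped relative to the baseline."""
    case_id: str
    baseline: float
    current: float

    @property
    def delta(self) -> float:
        # Negative values mean the current score is lower than the baseline.
        return self.current - self.baseline


def find_regressions(baseline, current, tolerance=0.05):
    """Return cases whose score dropped by more than `tolerance` (hypothetical threshold)."""
    regressions = []
    for case_id, base_score in baseline.items():
        if case_id not in current:
            continue  # cases missing from the current run are ignored in this sketch
        cur_score = current[case_id]
        if base_score - cur_score > tolerance:
            regressions.append(Regression(case_id, base_score, cur_score))
    # Largest drops first.
    return sorted(regressions, key=lambda r: r.delta)


def format_report(regressions):
    """Render a short plain-text regression report."""
    if not regressions:
        return "No regressions detected."
    lines = [f"{len(regressions)} regression(s) detected:"]
    for r in regressions:
        lines.append(
            f"- {r.case_id}: {r.baseline:.2f} -> {r.current:.2f} ({r.delta:+.2f})"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    baseline_scores = {"a": 0.9, "b": 0.8, "c": 0.7}
    current_scores = {"a": 0.6, "b": 0.79, "c": 0.75}
    print(format_report(find_regressions(baseline_scores, current_scores)))
```

A tolerance keeps small run-to-run noise from being reported as a regression; the right value depends on how stable the underlying evaluation is.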
Quick Start
Run a verification pass with --quality to validate fresh evaluations, or --regression to compare against the latest baseline.
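To give a sense of what a quality pass might check, the sketch below computes one possible reading of "judge consistency": the fraction of items on which repeated judge runs all return the same verdict. The skill's own metric definitions are not documented here, so the function names, the input shape, and the `min_consistency` threshold are assumptions for illustration.

```python
def judge_consistency(runs):
    """Fraction of items on which every judge run gave the same verdict.

    `runs` maps an item id to the list of verdicts from repeated judge runs
    (an assumed input shape, not the skill's real data format).
    """
    if not runs:
        raise ValueError("no items to check")
    agreed = sum(1 for verdicts in runs.values() if len(set(verdicts)) == 1)
    return agreed / len(runs)


def passes_quality_gate(runs, min_consistency=0.8):
    """True when consistency meets the (hypothetical) minimum threshold."""
    return judge_consistency(runs) >= min_consistency


if __name__ == "__main__":
    sample_runs = {
        "q1": ["pass", "pass", "pass"],
        "q2": ["pass", "fail", "pass"],
        "q3": ["fail", "fail", "fail"],
        "q4": ["pass", "pass", "pass"],
    }
    print(judge_consistency(sample_runs))
    print(passes_quality_gate(sample_runs))
```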
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: let Claude install the skill automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: dw-skill-eval-verify
Download link: https://github.com/xurik/dataworks-skill-evaluator/archive/main.zip#dw-skill-eval-verify
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.