diyu-eval-harness

Community

Governance-focused AI behavior evaluation harness.

Author: andyan77
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill provides a structured framework for evaluating and scoring AI behavior along three axes (governance, capability, and regression) across eight registered objects, so that evaluations are objective and the resulting decisions are traceable.

Core Features & Use Cases

  • Governance evaluation: verify presence and integrity of governance artifacts and ensure alignment with upstream references.
  • Capability evaluation: assess declared capabilities against actual behavior for active objects.
  • Regression evaluation: compare current results against a baseline to detect degradation or improvement.
  • Scorecard generation: compute a 6-dimension, 10-point scorecard and emit both human-readable reports and machine-readable artifacts.
  • Evidence and auditing: produce Markdown reports and YAML scorecards suitable for governance reviews and archives.
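The regression and scorecard features above can be illustrated with a minimal sketch. It is not the harness's actual implementation: the dimension names (`d1` to `d6`), the `Scorecard` class, the `compare_to_baseline` function, and the `tolerance` parameter are all hypothetical, and it assumes each of the six dimensions is scored on a 0 to 10 scale, which the listing does not spell out.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical placeholder names: the listing only says there are six dimensions.
DIMENSIONS = ("d1", "d2", "d3", "d4", "d5", "d6")
MAX_SCORE = 10.0  # Assumption: each dimension is scored from 0 to 10.


@dataclass(frozen=True)
class Scorecard:
    """Scores for one registered object, one value per dimension."""
    object_id: str
    scores: Dict[str, float]

    def __post_init__(self) -> None:
        # Reject scorecards with missing or unexpected dimensions.
        if set(self.scores) != set(DIMENSIONS):
            raise ValueError(f"expected exactly these dimensions: {DIMENSIONS}")
        # Reject scores outside the assumed 0..10 range.
        for name, value in self.scores.items():
            if not 0.0 <= value <= MAX_SCORE:
                raise ValueError(f"{name}={value} is outside 0..{MAX_SCORE}")


def compare_to_baseline(
    current: Scorecard, baseline: Scorecard, tolerance: float = 0.5
) -> Dict[str, str]:
    """Label each dimension as improved, degraded, or unchanged.

    A change counts only if it exceeds `tolerance` in absolute value.
    """
    verdicts: Dict[str, str] = {}
    for name in DIMENSIONS:
        delta = current.scores[name] - baseline.scores[name]
        if delta > tolerance:
            verdicts[name] = "improved"
        elif delta < -tolerance:
            verdicts[name] = "degraded"
        else:
            verdicts[name] = "unchanged"
    return verdicts


# Usage: a baseline of 7.0 everywhere, then one gain and one loss.
baseline = Scorecard("object-a", dict.fromkeys(DIMENSIONS, 7.0))
current = Scorecard("object-a", {**baseline.scores, "d1": 9.0, "d2": 5.0})
result = compare_to_baseline(current, baseline)
```

The tolerance band is a design choice in this sketch, meant to keep small score fluctuations from being reported as regressions. Whether the real harness applies any such threshold is not stated in the listing.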

Quick Start

Invoke the evaluation harness to run governance, capability, and regression checks across all registered objects and generate the 6-dimension scorecard outputs.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: diyu-eval-harness
Download link: https://github.com/andyan77/diyu-agent/archive/main.zip#diyu-eval-harness

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
