design-test-rubric

Community

Build repeatable, rigorous evaluation rubrics.

Author: smartmarbles
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

design-test-rubric provides a structured blueprint for writing rigorous evaluation rubrics for PROBE-like AI agent systems, so that scores stay consistent and comparable across runs.

Core Features & Use Cases

  • Eight-category rubric with weights summing to 100, tailored to observed failure modes and verification needs.
  • Comprehensive severity taxonomy (critical/major/minor) with explicit sub-score rules and a hard cap on critical violations.
  • Fixed violation log schema, run-tagging conventions, and a reusable scorecard template for all rubric revisions.
  • Clear versioning and changelog workflow for iterative rubric improvements.
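The features above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the category names, the per-severity deduction fractions, the cap value of 50, and the `Violation` fields are all assumptions made for the example. The listing only states that there are eight categories weighted to 100, three severity levels, a hard cap on critical violations, and a fixed violation log schema.

```python
from dataclasses import dataclass

# Hypothetical category names and weights; the only stated requirement is
# eight categories whose weights sum to 100.
WEIGHTS = {
    "correctness": 20,
    "verification": 15,
    "safety": 15,
    "completeness": 15,
    "efficiency": 10,
    "traceability": 10,
    "robustness": 10,
    "communication": 5,
}

# Assumed fraction of a category's weight lost per logged violation.
DEDUCTION = {"critical": 1.0, "major": 0.5, "minor": 0.25}

# Assumed ceiling on the total score once any critical violation is logged.
CRITICAL_CAP = 50


@dataclass(frozen=True)
class Violation:
    """One row of the violation log (field names are illustrative)."""
    run_tag: str
    category: str
    severity: str
    note: str


def validate_weights(weights):
    """Reject rubrics that do not have eight categories summing to 100."""
    if len(weights) != 8:
        raise ValueError(f"expected 8 categories, got {len(weights)}")
    if sum(weights.values()) != 100:
        raise ValueError(f"weights sum to {sum(weights.values())}, not 100")


def score(weights, violations):
    """Return (per-category sub-scores, total) for one run."""
    validate_weights(weights)
    lost = {name: 0.0 for name in weights}
    for v in violations:
        lost[v.category] += DEDUCTION[v.severity] * weights[v.category]
    # A category's sub-score never drops below zero.
    sub_scores = {name: max(0.0, weights[name] - lost[name]) for name in weights}
    total = sum(sub_scores.values())
    if any(v.severity == "critical" for v in violations):
        total = min(total, CRITICAL_CAP)
    return sub_scores, total
```

Keeping the weights, deduction rules, and cap as plain data makes each rubric revision a small, reviewable change, and the same `score` function can be rerun on older violation logs to compare runs.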

Quick Start

Write a starter rubric in four steps: list eight categories with weights that sum to 100, define the severity rules, specify the violation log fields, and lock in the scorecard template. Then bump the minor version and add a changelog entry.
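The last step, bumping the minor version and recording the change, might look like the sketch below. It assumes a MAJOR.MINOR.PATCH version string (consistent with the 1.0.0 shown on this page) and a one-line changelog format; both helper functions and the entry layout are hypothetical, not part of the skill.

```python
from datetime import date


def bump_minor(version: str) -> str:
    """Increment the minor component and reset the patch component."""
    major, minor, _patch = (int(part) for part in version.split("."))
    return f"{major}.{minor + 1}.0"


def changelog_entry(version: str, summary: str, day: date) -> str:
    """Format one changelog line (the layout is an assumption)."""
    return f"{version} ({day.isoformat()}): {summary}"


if __name__ == "__main__":
    new_version = bump_minor("1.0.0")
    print(changelog_entry(new_version, "Initial eight-category rubric", date.today()))
```

Resetting the patch number on a minor bump follows common semantic-versioning practice; adjust it if the rubric's own versioning rules differ.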

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: let Claude install the skill automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: design-test-rubric
Download link: https://github.com/smartmarbles/helm/archive/main.zip#design-test-rubric

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.