design-test-rubric
Community
Build repeatable, rigorous evaluation rubrics.
Product & Management
#versioning #verification #rubric #quality-assurance #probe #model-evaluation #scorecard
Author: smartmarbles
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
design-test-rubric provides a structured blueprint for crafting rigorous evaluation rubrics for PROBE-like AI agent systems, so that scores stay consistent and comparable across runs.
Core Features & Use Cases
- Eight-category rubric with weights summing to 100, tailored to observed failure modes and verification needs.
- Comprehensive severity taxonomy (critical/major/minor) with explicit sub-score rules and a hard cap on critical violations.
- Fixed violation log schema, run-tagging conventions, and a reusable scorecard template for all rubric revisions.
- Clear versioning and changelog workflow for iterative rubric improvements.
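As a rough illustration of how the weighting, severity, and hard-cap rules above could fit together, here is a minimal Python sketch. The category names, the per-severity deductions, the cap value of 50, and the `Violation` fields are all assumptions made for this example; the listing does not specify them, and the skill's actual rules may differ.

```python
from dataclasses import dataclass

# Hypothetical category names and weights. The listing only states that there
# are eight categories whose weights sum to 100.
WEIGHTS = {
    "category_1": 20,
    "category_2": 15,
    "category_3": 15,
    "category_4": 10,
    "category_5": 10,
    "category_6": 10,
    "category_7": 10,
    "category_8": 10,
}

# Assumed sub-score rule: fraction of a category's weight lost per violation.
PENALTY = {"critical": 1.0, "major": 0.5, "minor": 0.1}

# Assumed hard cap: any critical violation limits the total to this value.
CRITICAL_CAP = 50.0


@dataclass
class Violation:
    """One violation log entry (field names are illustrative only)."""
    category: str
    severity: str  # "critical", "major", or "minor"
    note: str = ""


def validate_weights(weights):
    """Reject rubrics that do not have eight categories summing to 100."""
    if len(weights) != 8:
        raise ValueError(f"expected 8 categories, got {len(weights)}")
    total = sum(weights.values())
    if total != 100:
        raise ValueError(f"weights must sum to 100, got {total}")


def score(weights, violations):
    """Return a 0-100 score after deductions and the critical-violation cap."""
    validate_weights(weights)
    lost = {name: 0.0 for name in weights}
    for v in violations:
        lost[v.category] += PENALTY[v.severity]
    # A category cannot lose more than its full weight.
    total = sum(w * max(0.0, 1.0 - lost[name]) for name, w in weights.items())
    if any(v.severity == "critical" for v in violations):
        total = min(total, CRITICAL_CAP)
    return total
```

Calling `score(WEIGHTS, [])` on a clean run yields the full 100 points, while a single critical violation in any category pulls the total down to the assumed cap regardless of the other sub-scores.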
Quick Start
Write a starter rubric: list eight categories with weights, define the severity rules, specify the violation log fields, and lock in the scorecard template. Then bump the minor version and add a changelog entry.
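The final step, bumping the minor version and recording a changelog entry, might look like the sketch below. It assumes a MAJOR.MINOR.PATCH version string like the 1.0.0 shown on this page; the helper names and the changelog line format are hypothetical and not taken from the skill.

```python
def bump_minor(version):
    """Increment the minor component and reset patch, e.g. for a rubric revision."""
    major, minor, _patch = (int(part) for part in version.split("."))
    return f"{major}.{minor + 1}.0"


def changelog_entry(version, date, summary):
    """Format one changelog line (this layout is an assumption)."""
    return f"{version} ({date}): {summary}"


new_version = bump_minor("1.0.0")
entry = changelog_entry(new_version, "2025-01-01", "Added starter rubric.")
```

The date and summary here are placeholder values for illustration.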
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: design-test-rubric
Download link: https://github.com/smartmarbles/helm/archive/main.zip#design-test-rubric
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.