karpathy-metric-pre
Community
Red-team optimization metrics before they fail.
Data & Analytics
#optimization #metrics #evaluation #pre-mortem #red teaming #countermeasures #proxy divergence
Author: drewid74
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It helps you stress-test an optimization metric to uncover how it can be gamed, contaminated, or diverge from the real business outcome it is supposed to represent.
Core Features & Use Cases
- Adversarial metric pre-mortem: Enumerates concrete ways an optimizer can inflate the metric while delivering less real value (e.g., gaming evaluation hooks or measurement edge cases).
- Defense design with countermetrics: Proposes secondary metrics, holdout scenarios, and a disappearance test to detect brittle improvements.
- Evaluation diversity plan: Produces a single actionable document that lays out an adversarial, periodic test regimen to run across optimization cycles.
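To make the countermetric and holdout idea concrete, here is a minimal sketch in Python. It is not the skill's own implementation: the `CycleResult` fields, the `divergence_flags` function, the flag names, and the thresholds are all hypothetical, and the sketch assumes that higher is better for every metric. It flags an optimization cycle in which the primary metric improved while a holdout evaluation did not, or while a countermetric got worse.

```python
from dataclasses import dataclass


@dataclass
class CycleResult:
    """Scores before and after one optimization cycle (hypothetical layout).

    Assumption: higher is better for all three metrics.
    """
    primary_before: float
    primary_after: float
    holdout_before: float   # same metric, scored on data the optimizer never saw
    holdout_after: float
    counter_before: float   # secondary metric that gaming would tend to hurt
    counter_after: float


def divergence_flags(r: CycleResult, min_gain: float = 0.0,
                     tolerance: float = 0.0) -> list[str]:
    """Return warning flags when a primary-metric gain looks brittle."""
    flags = []
    primary_gain = r.primary_after - r.primary_before
    holdout_gain = r.holdout_after - r.holdout_before
    counter_drop = r.counter_before - r.counter_after

    # Only inspect cycles that claim an improvement on the primary metric.
    if primary_gain > min_gain:
        if holdout_gain <= tolerance:
            flags.append("holdout_not_improving")
        if counter_drop > tolerance:
            flags.append("countermetric_degraded")
    return flags


# Illustrative numbers only, not measured results.
suspicious = CycleResult(0.70, 0.85, 0.68, 0.67, 0.90, 0.80)
healthy = CycleResult(0.70, 0.85, 0.68, 0.80, 0.90, 0.91)
print(divergence_flags(suspicious))
print(divergence_flags(healthy))
```

A check like this only detects divergence after the fact; the enumeration of gaming paths described above still has to come from the pre-mortem itself.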
Quick Start
Ask the AI to run a metric gaming pre-mortem by providing the primary metric definition, intended business outcome, what the optimization agent can edit, and how the evaluation is currently performed.
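The four inputs listed above can be collected in a small structure before writing the request. The following is a sketch only: `PreMortemRequest`, its field names, and the prompt wording are hypothetical and are not defined by the skill, and the example values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PreMortemRequest:
    """The four inputs the Quick Start asks for (hypothetical field names)."""
    primary_metric: str      # definition of the metric being optimized
    business_outcome: str    # the real outcome the metric stands in for
    editable_surface: str    # what the optimization agent is allowed to edit
    evaluation_method: str   # how the evaluation is currently performed

    def to_prompt(self) -> str:
        """Render the inputs as plain text to paste into the AI assistant."""
        return "\n".join([
            "Run a metric gaming pre-mortem.",
            f"Primary metric: {self.primary_metric}",
            f"Intended business outcome: {self.business_outcome}",
            f"What the optimization agent can edit: {self.editable_surface}",
            f"Current evaluation: {self.evaluation_method}",
        ])


# Invented example values, for illustration only.
request = PreMortemRequest(
    primary_metric="pass rate on the unit test suite",
    business_outcome="fewer defects reaching production",
    editable_surface="source files and test files",
    evaluation_method="a single CI run per change",
)
print(request.to_prompt())
```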
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: karpathy-metric-pre
Download link: https://github.com/drewid74/ai_skills/archive/main.zip#karpathy-metric-pre
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.