diff-eval-local
Community
Deterministic agent code diff evaluation.
Author: wzh4464
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Evaluate agent-generated code against a ground-truth diff and a handwritten file list to enable reproducible scoring and analysis.
Core Features & Use Cases
- Deterministic file-coverage analysis comparing the handwritten (HW) file list against generated changes and repo modifications.
- GT-diff integration: reads ground-truth patches from the base_repo task to anchor evaluation.
- Function-level context extraction and metadata-driven prompt resolution to guide execution and scoring.
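The file-coverage idea above can be illustrated with a minimal sketch. This is not the skill's actual implementation: the function names `changed_files` and `file_coverage`, the result keys, and the choice to read paths from `diff --git` headers are all assumptions made for illustration. The sketch sorts every output list so that repeated runs on the same inputs give identical reports.

```python
import re

# Matches the header line git emits for each file in a unified diff.
# Assumption: paths do not contain the substring " b/".
_DIFF_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$")


def changed_files(diff_text):
    """Return the set of post-change paths named in a git-style diff."""
    files = set()
    for line in diff_text.splitlines():
        match = _DIFF_HEADER.match(line)
        if match:
            files.add(match.group(2))
    return files


def file_coverage(hw_files, generated_files):
    """Compare a handwritten (HW) file list with the files a diff touches.

    Hypothetical report shape: sorted lists keep the output stable
    across runs, which is what makes the comparison reproducible.
    """
    hw = set(hw_files)
    gen = set(generated_files)
    covered = sorted(hw & gen)   # HW files the generated change touched
    missed = sorted(hw - gen)    # HW files the generated change skipped
    extra = sorted(gen - hw)     # touched files outside the HW list
    coverage = len(covered) / len(hw) if hw else 0.0
    return {
        "covered": covered,
        "missed": missed,
        "extra": extra,
        "coverage": coverage,
    }
```

The same `changed_files` helper could be applied to the ground-truth diff and to the agent's diff, so both are reduced to comparable path sets before scoring.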
Quick Start
Invoke /diff-eval-local with the experiment repository path, the corresponding ground-truth diff, and the handwritten file list to produce a deterministic evaluation report.
Dependency Matrix
Required Modules: None required
Components: scripts, references
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: diff-eval-local
Download link: https://github.com/wzh4464/claude-skills/archive/main.zip#diff-eval-local
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.