diff-eval-local


Deterministic agent code diff evaluation.

Author: wzh4464
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Evaluates agent-generated code changes against a ground-truth diff and a handwritten file list, so that scoring and analysis are reproducible.

Core Features & Use Cases

  • Deterministic file-coverage analysis: compares the handwritten (HW) file list against the generated changes and the repository modifications.
  • GT-diff integration: reads ground-truth patches from the base_repo task to anchor evaluation.
  • Function-level context extraction and metadata-driven prompt resolution to guide execution and scoring.
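
The file-coverage idea in the first bullet can be illustrated with a minimal sketch. This is not the skill's actual implementation: the function names `changed_files` and `file_coverage`, the git-style diff input, and the report fields are all assumptions made for illustration. It extracts the files touched by a unified diff, compares them with a handwritten file list, and sorts every result so repeated runs yield identical output.

```python
import re


def changed_files(diff_text: str) -> set[str]:
    """Collect the paths touched by a git-style unified diff (hypothetical helper)."""
    paths = set()
    for line in diff_text.splitlines():
        # Git marks each file with a header like "diff --git a/<old> b/<new>".
        match = re.match(r"^diff --git a/(.+?) b/(.+)$", line)
        if match:
            paths.add(match.group(2))
    return paths


def file_coverage(handwritten: list[str], generated: set[str]) -> dict:
    """Compare a handwritten file list with generated changes (hypothetical helper)."""
    hw = set(handwritten)
    covered = sorted(hw & generated)   # handwritten files the agent also changed
    missed = sorted(hw - generated)    # handwritten files the agent did not touch
    extra = sorted(generated - hw)     # files changed only by the agent
    ratio = len(covered) / len(hw) if hw else 0.0
    return {"covered": covered, "missed": missed, "extra": extra, "coverage": ratio}


# Illustrative input only; not taken from a real experiment repository.
sample_diff = """\
diff --git a/src/a.py b/src/a.py
--- a/src/a.py
+++ b/src/a.py
@@ -1 +1 @@
-x = 1
+x = 2
diff --git a/src/b.py b/src/b.py
--- a/src/b.py
+++ b/src/b.py
@@ -1 +1 @@
-y = 1
+y = 2
"""

report = file_coverage(["src/a.py", "src/c.py"], changed_files(sample_diff))
```

Sorting the set operations before reporting is what keeps the output stable across runs, since Python sets have no guaranteed iteration order.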

Quick Start

Invoke /diff-eval-local with the path to your experiment repository, the corresponding ground-truth diff, and the handwritten file list to produce a deterministic evaluation report.

Dependency Matrix

Required Modules

None required

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: diff-eval-local
Download link: https://github.com/wzh4464/claude-skills/archive/main.zip#diff-eval-local

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
