Name: activation-patching
Availability: InStock
Author: ndif-team

System Documentation

What problem does it solve?

Activation patching provides a causal intervention framework to identify which model components (layers, heads, or positions) drive a specific behavior by swapping activations and observing changes in outputs.

Core Features & Use Cases

Causal intervention across transformer components (layers, attention heads, token positions) to locate components critical for a behavior.
Three-run patching workflow: run clean, run corrupted, patch activations and measure impact to quantify causality.
Interpretability outputs and guidance for diagnosing circuits or computational stages in neural networks.

Quick Start

Run a clean prompt and a corrupted prompt, patch activations layer-by-layer, and measure the effect on outputs.

Please help me install this Skill: Name: activation-patching Download link: https://github.com/ndif-team/skills/archive/main.zip#activation-patching Please download this .zip file, extract it, and install it in the .claude/skills/ directory.

activation-patching

System Documentation

What problem does it solve?

Core Features & Use Cases

Quick Start

Dependency Matrix

Required Modules

Components

💻 Claude Code Installation

Agent Skills Search Helper