mechanistic-interventions

Name: mechanistic-interventions
Author: concordance-co

Official

Plan and validate causal interventions.

Data & Analytics #interpretation #interventions #experimental-design #causal-analysis #activation-patching #read-vs-write #site-selection

Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

Helps you move from benchmark-driven analysis to causal, mechanism-oriented interventions. It covers activation patching, interchange, control design, read-vs-write distinctions, and intervention-specific success criteria, and it points to follow-up work on attention and routing.
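
For readers new to activation patching, here is a minimal sketch of the idea on a toy model. Everything in it is an assumption made for illustration: the three-layer `LAYERS` stack, the `forward` helper, and the donor and target inputs are hypothetical and are not part of this skill. The sketch runs a donor input, caches its activations, then re-runs the target input while overwriting one unit at one layer with the donor's value.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]


# Hypothetical toy "model": three layers applied in sequence.
def layer0(x: Vector) -> Vector:
    return [x[0] + x[1], x[0] - x[1]]


def layer1(h: Vector) -> Vector:
    return [max(0.0, v) for v in h]  # ReLU


def layer2(h: Vector) -> Vector:
    return [h[0] - h[1]]


LAYERS = [layer0, layer1, layer2]


def forward(
    x: Vector, patch: Optional[Tuple[int, int, float]] = None
) -> Tuple[float, List[Vector]]:
    """Run the toy model and return (output, activations after each layer).

    patch = (layer, unit, value) overwrites one unit of one layer's output.
    """
    cache: List[Vector] = []
    h = list(x)
    for i, layer in enumerate(LAYERS):
        h = layer(h)
        if patch is not None and patch[0] == i:
            h[patch[1]] = patch[2]
        cache.append(list(h))
    return h[0], cache


donor_input = [1.0, 2.0]
target_input = [2.0, -1.0]

donor_out, donor_cache = forward(donor_input)
baseline_out, _ = forward(target_input)

# Patch one unit at a time from the donor run into the target run.
effects: Dict[Tuple[int, int], float] = {}
for layer in (0, 1):
    for unit in (0, 1):
        value = donor_cache[layer][unit]
        effects[(layer, unit)] = forward(target_input, patch=(layer, unit, value))[0]

for site, out in sorted(effects.items()):
    print(site, "baseline", baseline_out, "-> patched", out)
```

In this toy, patching unit 1 moves the output toward the donor's sign while patching unit 0 only moves it to zero, which is the kind of per-site contrast a real patching sweep looks for.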

Core Features & Use Cases

Intervention framing: define exact behavior to change, success criteria, malformed-output criteria, and intended direction of change.
Site choice from the computation story: select plausible sites based on localization cues and timing, not ease of patching.
Paired design and controls: prefer matched donor-target pairs, single-layer first tests, same-label controls, and purposeful control strategies.
Interpretation discipline: track intended-direction flips, reverse-direction flips, malformed outputs, and same-label instability; separate causal evidence from broad destabilization.
Follow-on mechanism work: point to likely next steps such as attention follow-up, routing/MoE follow-up, narrower span decomposition, and read-vs-write comparisons.
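
The interpretation categories above can be tallied with a small helper. This is a sketch under stated assumptions, not the skill's own implementation: `PatchedRun`, `classify`, and `summarize` are hypothetical names, and each run is assumed to record a signed metric (for example a logit difference) oriented so that a positive value means donor-side behavior. The example values are illustrative only.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PatchedRun:
    baseline: float            # signed metric before patching (> 0 means donor-side behavior)
    patched: Optional[float]   # same metric after patching; None if the output was unparseable
    same_label_control: bool   # True when donor and target share a label


def classify(run: PatchedRun) -> str:
    """Assign one run to exactly one outcome category."""
    if run.patched is None or math.isnan(run.patched):
        return "malformed"
    flipped = (run.baseline > 0) != (run.patched > 0)
    if not flipped:
        return "unchanged"
    if run.same_label_control:
        # A same-label donor should not change the outcome, so a flip is instability.
        return "same_label_instability"
    return "intended_flip" if run.patched > 0 else "reverse_flip"


def summarize(runs: Iterable[PatchedRun]) -> Counter:
    return Counter(classify(r) for r in runs)


# Illustrative values only, not measured results.
runs = [
    PatchedRun(baseline=-1.2, patched=0.8, same_label_control=False),
    PatchedRun(baseline=-0.5, patched=-0.4, same_label_control=False),
    PatchedRun(baseline=0.9, patched=-0.3, same_label_control=False),
    PatchedRun(baseline=1.1, patched=-0.2, same_label_control=True),
    PatchedRun(baseline=-0.7, patched=None, same_label_control=False),
]
summary = summarize(runs)
print(dict(summary))
```

Comparing the `intended_flip` count against `reverse_flip`, `malformed`, and `same_label_instability` is one way to separate a directional causal effect from broad destabilization.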

Quick Start

Draft an intervention plan detailing the target behavior, success criteria, plausible intervention sites, and a paired-control design.
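
One way to draft such a plan is as a small structured record that can be checked for gaps before any patching is run. The sketch below is an assumption for illustration: the `Site` and `InterventionPlan` classes, their field names, and the example values are hypothetical and do not come from the skill itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Site:
    layer: int
    component: str   # hypothetical naming, e.g. "residual" or "attn_out"
    rationale: str   # why the computation story points at this site


@dataclass
class InterventionPlan:
    target_behavior: str
    intended_direction: str
    success_criteria: List[str]
    malformed_output_criteria: List[str]
    sites: List[Site]
    controls: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """List the gaps that should be closed before running the intervention."""
        issues: List[str] = []
        if not self.target_behavior.strip():
            issues.append("target behavior is not defined")
        if not self.success_criteria:
            issues.append("no success criteria")
        if not self.malformed_output_criteria:
            issues.append("no malformed-output criteria")
        if not self.sites:
            issues.append("no candidate sites")
        for site in self.sites:
            if not site.rationale.strip():
                issues.append(f"layer {site.layer} {site.component}: no rationale")
        if "same-label" not in self.controls:
            issues.append("no same-label control")
        return issues


# Illustrative draft with two deliberate gaps.
plan = InterventionPlan(
    target_behavior="answer flips from label A to label B on matched prompts",
    intended_direction="A -> B",
    success_criteria=["intended-direction flips exceed same-label instability"],
    malformed_output_criteria=["output is neither A nor B"],
    sites=[Site(layer=12, component="residual", rationale="")],
    controls=[],
)
for issue in plan.problems():
    print("TODO:", issue)
```

Requiring a rationale per site is meant to reflect the point that sites should follow from the computation story rather than from ease of patching.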
