mechanistic-interventions
Official
Plan and validate causal interventions.
Data & Analytics
#interpretation #interventions #experimental-design #causal-analysis #activation-patching #read-vs-write #site-selection
Author: concordance-co
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Move from benchmark-driven analysis to causal, mechanism-oriented interventions. Covers activation patching, interchange interventions, control design, read-vs-write distinctions, and intervention-specific success criteria, and points to follow-up work on attention and routing.
Core Features & Use Cases
- Intervention framing: define the exact behavior to change, the success criteria, the malformed-output criteria, and the intended direction of change.
- Site choice from the computation story: select plausible sites based on localization cues and timing, not ease of patching.
- Paired design and controls: prefer matched donor-target pairs, single-layer first tests, same-label controls, and purposeful control strategies.
- Interpretation discipline: track intended-direction flips, reverse-direction flips, malformed outputs, and same-label instability; separate causal evidence from broad destabilization.
- Follow-on mechanism work: point to likely next steps such as attention follow-up, routing/MoE follow-up, narrower span decomposition, and read-vs-write comparisons.
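The paired design and the outcome buckets listed above can be sketched on a toy model. This is a minimal illustration, not the skill's own implementation: the two-layer list "model", the names `run`, `decode`, `classify`, and `patch_pairs`, and the exact bucket definitions (in particular, how a reverse-direction flip is detected) are all assumptions made for the example.

```python
from collections import Counter

# Toy two-layer "model" on plain lists; a stand-in for real activations.
def layer0(x):
    return [x[0] + x[1], x[0] - x[1]]

def layer1(h):
    return [2.0 * h[0], 2.0 * h[1]]

LAYERS = [layer0, layer1]

def run(x, patch_layer=None, patch_value=None):
    """Run the model, optionally overwriting one layer's output (the patch)."""
    h, acts = x, []
    for i, layer in enumerate(LAYERS):
        h = layer(h)
        if i == patch_layer:
            h = list(patch_value)  # write the donor activation at this site
        acts.append(h)
    return h, acts

def decode(out):
    """Map the output to a label; a tie is treated as malformed (None)."""
    if out[0] == out[1]:
        return None
    return "A" if out[0] > out[1] else "B"

def classify(target_label, donor_label, patched_label, intended):
    """Bucket one patched run. `intended` is a (from_label, to_label) pair."""
    if patched_label is None:
        return "malformed"
    if donor_label == target_label:  # same-label control pair
        if patched_label == target_label:
            return "control_stable"
        return "same_label_instability"
    if patched_label == target_label:
        return "no_change"
    if (target_label, patched_label) == intended:
        return "intended_flip"
    if (patched_label, target_label) == intended:
        return "reverse_flip"
    return "other_flip"

def patch_pairs(pairs, layer, intended):
    """Patch each donor's activation into its target at one layer and tally."""
    tally = Counter()
    for donor_x, target_x in pairs:
        donor_out, donor_acts = run(donor_x)
        target_out, _ = run(target_x)
        patched_out, _ = run(
            target_x, patch_layer=layer, patch_value=donor_acts[layer]
        )
        bucket = classify(
            decode(target_out), decode(donor_out), decode(patched_out), intended
        )
        tally[bucket] += 1
    return tally

pairs = [
    ([1.0, -2.0], [1.0, 2.0]),  # matched pair: donor decodes to B, target to A
    ([3.0, 1.0], [1.0, 2.0]),   # same-label control: donor and target both A
]
tally = patch_pairs(pairs, layer=0, intended=("A", "B"))
print(dict(tally))
```

In this toy setup the matched pair flips in the intended direction and the same-label control stays stable. On a real model, a high count of malformed outputs or same-label instability alongside the intended flips would suggest broad destabilization rather than a clean causal effect.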
Quick Start
Draft an intervention plan detailing the target behavior, success criteria, plausible intervention sites, and a paired-control design.
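One way to make such a plan checkable is to write it down as a small data structure. The sketch below is hypothetical: the `Site` and `InterventionPlan` classes, their field names, the `"same-label"` control tag, and the example values are assumptions for illustration and are not defined by the skill.

```python
from dataclasses import dataclass, field

@dataclass
class Site:
    layer: int
    component: str   # illustrative label such as "resid" or "attn_out"
    rationale: str   # localization or timing cue that motivates this site

@dataclass
class InterventionPlan:
    target_behavior: str
    intended_direction: str
    success_criteria: str
    malformed_criteria: str
    sites: list = field(default_factory=list)
    controls: list = field(default_factory=list)

    def problems(self):
        """Return a list of gaps in the plan; an empty list means none found."""
        issues = []
        for name in ("target_behavior", "intended_direction",
                     "success_criteria", "malformed_criteria"):
            if not getattr(self, name).strip():
                issues.append(f"missing {name}")
        if not self.sites:
            issues.append("no candidate sites")
        for site in self.sites:
            if not site.rationale.strip():
                issues.append(
                    f"site layer {site.layer} ({site.component}) has no rationale"
                )
        if "same-label" not in self.controls:
            issues.append("no same-label control")
        return issues

# Placeholder values only; a real plan would name a concrete behavior.
plan = InterventionPlan(
    target_behavior="answer flips from label A to label B",
    intended_direction="A -> B",
    success_criteria="patched output decodes to B on matched pairs",
    malformed_criteria="output decodes to neither A nor B",
    sites=[Site(layer=0, component="resid",
                rationale="earliest layer where A and B runs diverge")],
    controls=["same-label"],
)
draft = InterventionPlan("", "A -> B", "x", "y")

print(plan.problems())
print(draft.problems())
```

Requiring a rationale per site reflects the point that sites should come from the computation story rather than from ease of patching; the check itself only verifies that a rationale was written, not that it is sound.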
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill: Name: mechanistic-interventions Download link: https://github.com/concordance-co/xenon/archive/main.zip#mechanistic-interventions Please download this .zip file, extract it, and install it in the .claude/skills/ directory.