transformer-lens-interpretability
Official
Inspect and manipulate transformer internals.
Education & Research · #deep learning #ai research #mechanistic interpretability #activation patching #transformerlens #circuit analysis
Author: Orchestra-Research
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill provides tools and guidance for analyzing the internals of transformer models (attention patterns, neuron activations, and circuits), so researchers can understand how a model processes information and arrives at its outputs.
Core Features & Use Cases
- Mechanistic Interpretability: Reverse-engineer algorithms, study attention patterns, and analyze circuits within transformer models.
- Activation Patching: Perform causal tracing experiments to identify the impact of specific activations on model outputs.
- Use Case: A researcher wants to understand why a language model makes a specific prediction. They can use this Skill to isolate and analyze the attention heads and neuron activations responsible for that prediction.
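To make the activation patching idea concrete, here is a minimal sketch in plain Python. It is an illustration only: the two-layer network, its weights, and the names `forward`, `logit_diff`, and `patching_effects` are hypothetical and are not part of TransformerLens or of this Skill's scripts. The procedure is the usual one: run the model on a clean input and record its hidden activations, run it on a corrupted input, then overwrite one activation at a time with its clean value and measure how much of the clean behavior comes back.

```python
# Toy activation patching on a hand-written two-layer network.
# Everything here is hypothetical and for illustration; it does not use TransformerLens.

W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # input (2) -> hidden (3)
W2 = [[2.0, 0.0, 0.5], [0.0, 1.0, 0.5]]     # hidden (3) -> logits (2)


def forward(x, patch=None):
    """Run the toy model; `patch` maps hidden-unit index -> value to overwrite."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    if patch:
        for idx, value in patch.items():
            hidden[idx] = value              # the "patching" step
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in W2]
    return logits, hidden


def logit_diff(logits):
    """Metric: how strongly the model prefers output 0 over output 1."""
    return logits[0] - logits[1]


def patching_effects(clean_x, corrupted_x):
    """Fraction of the clean-vs-corrupted gap recovered by patching each hidden unit."""
    clean_logits, clean_hidden = forward(clean_x)
    corrupted_logits, _ = forward(corrupted_x)
    clean_score = logit_diff(clean_logits)
    corrupted_score = logit_diff(corrupted_logits)
    effects = []
    for idx, clean_value in enumerate(clean_hidden):
        patched_logits, _ = forward(corrupted_x, patch={idx: clean_value})
        recovered = logit_diff(patched_logits) - corrupted_score
        effects.append(recovered / (clean_score - corrupted_score))
    return effects


effects = patching_effects(clean_x=[1.0, 0.0], corrupted_x=[0.0, 1.0])
for unit, effect in enumerate(effects):
    print(f"hidden unit {unit}: recovered fraction = {effect:.3f}")
```

In this toy setup, hidden unit 0 recovers the largest share of the clean behavior and unit 2 recovers none, because unit 2 takes the same value on both inputs. With a real model, the same loop would typically run over layers, positions, or attention heads, using cached activations from the clean run and forward hooks to overwrite activations during the corrupted run.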
Quick Start
Use the transformer-lens-interpretability skill to perform activation patching experiments on a GPT-2 model.
Dependency Matrix
Required Modules
transformer-lens, torch
Components
scripts, references
💻 Claude Code Installation
Recommended: Let Claude install the Skill automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: transformer-lens-interpretability
Download link: https://github.com/Orchestra-Research/AI-Research-SKILLs/archive/main.zip#transformer-lens-interpretability
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.