perf-activation-recompute
Official
Optimize GPU memory with selective activation recompute.
Author: NVIDIA-NeMo
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
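It reduces GPU memory usage during training by selectively recomputing activations in the backward pass instead of storing them from the forward pass. This lets you train larger models or use larger batch sizes without running out of memory.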
Core Features & Use Cases
- Memory Optimization: Recomputes specific modules like attention and layernorm to save memory.
- Training Efficiency: Helps investigate and fix out-of-memory issues caused by memory fragmentation or large model sizes.
- Use Case: When training a large transformer model that exceeds available GPU memory, apply recompute strategies to fit the model into hardware constraints.
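To get a feel for how much memory the use case above might recover, here is a rough back-of-the-envelope sketch. It relies on the commonly cited per-layer activation estimate for a standard transformer layer, `s*b*h*(34 + 5*a*s/h)` bytes, and on the assumption that recomputing the core attention removes the `5*a*s/h` term. It further assumes 16-bit activations, no tensor or sequence parallelism, and a non-fused attention implementation. The function name and parameters are illustrative only and are not part of this skill; layernorm recompute is not modeled.

```python
def activation_bytes_per_layer(seq_len, micro_batch, hidden, heads,
                               recompute_core_attention=False):
    """Rough per-layer activation memory in bytes (illustrative estimate only).

    Assumes 16-bit activations and no tensor/sequence parallelism.
    """
    # Terms that scale linearly with sequence length (linear layers,
    # layernorm inputs, dropout masks, and similar).
    linear_term = 34 * seq_len * micro_batch * hidden
    if recompute_core_attention:
        # Assumption: the core attention activations are recomputed in the
        # backward pass, so the quadratic-in-sequence term is not stored.
        return linear_term
    # Attention scores, softmax output, and attention dropout:
    # 5 * a * s / h * (s * b * h) simplifies to 5 * a * s^2 * b.
    attention_term = 5 * heads * seq_len * seq_len * micro_batch
    return linear_term + attention_term


# Example shape: sequence 2048, micro-batch 1, hidden 12288, 96 heads.
full = activation_bytes_per_layer(2048, 1, 12288, 96)
selective = activation_bytes_per_layer(2048, 1, 12288, 96,
                                       recompute_core_attention=True)
saved_fraction = (full - selective) / full
print(f"full: {full / 2**30:.2f} GiB per layer")
print(f"selective: {selective / 2**30:.2f} GiB per layer")
print(f"saved: {saved_fraction:.1%}")
```

For this example shape the attention term dominates, so the estimate suggests that recomputing only the attention could remove roughly 70% of per-layer activation memory. Treat the number as a planning aid and confirm it with real peak-memory measurements.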
Quick Start
Configure your training setup to enable selective recompute on the attention modules, then run your training script and check that peak memory usage drops.
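As a minimal sketch of what that configuration step might look like, the snippet below uses a hypothetical stand-in dataclass rather than the real training config. The field names `recompute_granularity` and `recompute_modules`, and the module names `core_attn` and `layernorm`, are assumptions modeled on Megatron-style option names; verify them against the configuration class in your installed version before relying on them.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed module names; the real set of supported names may differ.
KNOWN_MODULES = {"core_attn", "layernorm"}


@dataclass
class RecomputeSettings:
    """Hypothetical stand-in for the recompute fields of a model config."""
    recompute_granularity: Optional[str] = None  # None, "selective", or "full"
    recompute_modules: List[str] = field(default_factory=list)


def enable_selective_recompute(settings, modules=("core_attn",)):
    """Return a copy of `settings` with selective recompute enabled."""
    unknown = [m for m in modules if m not in KNOWN_MODULES]
    if unknown:
        raise ValueError(f"unknown recompute modules: {unknown}")
    return RecomputeSettings(
        recompute_granularity="selective",
        recompute_modules=list(modules),
    )


# Start with attention only, then add layernorm if memory is still tight.
attention_only = enable_selective_recompute(RecomputeSettings())
attention_and_norm = enable_selective_recompute(
    RecomputeSettings(), modules=("core_attn", "layernorm")
)
print(attention_only)
print(attention_and_norm)
```

Starting with attention alone and widening the module list only when needed keeps the extra recompute cost as small as possible.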
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: perf-activation-recompute
Download link: https://github.com/NVIDIA-NeMo/Megatron-Bridge/archive/main.zip#perf-activation-recompute
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.