perf-activation-recompute

Official

Optimize GPU memory with selective activation recompute.

Author: NVIDIA-NeMo
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

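It reduces peak GPU memory usage during training by recomputing selected activations in the backward pass instead of storing them. This trades some extra compute for memory, so larger models or batch sizes can train without running out of memory.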

Core Features & Use Cases

  • Memory Optimization: Recomputes the activations of selected modules, such as attention and layernorm, during the backward pass instead of keeping them in memory.
  • Training Efficiency: Helps diagnose and fix out-of-memory (OOM) errors caused by memory fragmentation or large model sizes.
  • Use Case: When a large transformer model exceeds the available GPU memory during training, apply a recompute strategy so that the model fits within the hardware's memory limit.
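
To get a feel for how much memory recomputing attention can save, the following is a back-of-the-envelope sketch rather than a measurement. It assumes the commonly cited per-layer estimate for a standard transformer in 16-bit precision without tensor or sequence parallelism: roughly 34·s·b·h bytes for everything outside the attention score path, plus 5·a·s²·b bytes for the attention scores, softmax, and dropout. The function name and the example sizes are illustrative and do not come from this skill.

```python
def activation_bytes_per_layer(seq_len, micro_batch, hidden, heads,
                               selective_recompute=False):
    """Rough activation memory (bytes) for one transformer layer.

    Assumes 16-bit activations and no tensor/sequence parallelism.
    With selective recompute, the attention-score term is not stored
    because it is recomputed during the backward pass.
    """
    # Term that is stored either way (linear layers, layernorm, MLP, ...).
    base = 34 * seq_len * micro_batch * hidden
    # Term that grows quadratically with sequence length (attention scores).
    attention = 5 * heads * seq_len * seq_len * micro_batch
    return base if selective_recompute else base + attention


# Illustrative sizes only (a GPT-3-like layer); adjust to your model.
full = activation_bytes_per_layer(2048, 1, 12288, 96)
selective = activation_bytes_per_layer(2048, 1, 12288, 96,
                                       selective_recompute=True)
saved_fraction = 1 - selective / full

print(f"full:      {full / 2**30:.2f} GiB per layer")
print(f"selective: {selective / 2**30:.2f} GiB per layer")
print(f"saved:     {saved_fraction:.0%}")
```

Because the attention term scales with the square of the sequence length, the estimated saving grows for longer sequences. Actual savings depend on the framework, the kernels in use (fused attention kernels already avoid storing the score matrix), and the parallelism settings.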

Quick Start

Enable selective recompute for the attention modules in your training configuration, then run your training script and check that peak memory usage drops.
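
The listing does not show the configuration itself. As a rough sketch only, the settings could be gathered and sanity-checked as below before being passed to a training setup. `RecomputeSettings`, its field names, and the module names in `KNOWN_MODULES` are hypothetical stand-ins, not the Megatron-Bridge API; check the source repository for the option names your version actually accepts.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical module names for illustration; the set your framework
# accepts may differ.
KNOWN_MODULES = {"core_attn", "layernorm", "mlp"}


@dataclass
class RecomputeSettings:
    """Hypothetical container for activation-recompute options."""
    granularity: str = "selective"  # "selective" or "full"
    modules: List[str] = field(default_factory=lambda: ["core_attn"])

    def validate(self) -> "RecomputeSettings":
        # Reject unknown granularity values early.
        if self.granularity not in ("selective", "full"):
            raise ValueError(f"unknown granularity: {self.granularity!r}")
        # Selective recompute needs at least one known module to target.
        unknown = set(self.modules) - KNOWN_MODULES
        if unknown:
            raise ValueError(f"unknown modules: {sorted(unknown)}")
        if self.granularity == "selective" and not self.modules:
            raise ValueError("selective recompute needs at least one module")
        return self


# Start with attention only, then add modules if memory is still too tight.
settings = RecomputeSettings().validate()
print(settings)
```

Starting with attention alone and widening the module list only when needed keeps the extra recompute cost as small as possible.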

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: perf-activation-recompute
Download link: https://github.com/NVIDIA-NeMo/Megatron-Bridge/archive/main.zip#perf-activation-recompute

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.