nemo-mbridge-perf-activation-recompute

Community

Reduce GPU memory usage with selective recompute.

Author: sayalinvidia
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Activation recompute in Megatron Bridge spends extra compute to save GPU memory: intermediate activations are discarded during the forward pass and recomputed during the backward pass. This lets training fit under tighter memory budgets.
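The discard-and-recompute idea can be illustrated with a toy scalar network in plain Python. This is only a conceptual sketch, not Megatron Bridge code: the function names and the tanh "layers" are made up for illustration. It shows that a backward pass fed by recomputed activations yields the same gradient as one fed by stored activations, while only the original input has to be kept.

```python
import math

def forward_store_all(x, weights):
    """Forward pass that keeps every layer input (the usual, memory-hungry way)."""
    saved = []
    for w in weights:
        saved.append(x)              # activation kept for the backward pass
        x = math.tanh(w * x)
    return x, saved

def backward_from_saved(saved, weights, grad_out=1.0):
    """Backward pass over stored layer inputs; returns d(output)/d(input)."""
    grad = grad_out
    for x, w in zip(reversed(saved), reversed(weights)):
        y = math.tanh(w * x)
        grad *= (1.0 - y * y) * w    # chain rule through tanh(w * x)
    return grad

def forward_discard(x, weights):
    """Forward pass that keeps only the original input."""
    x0 = x
    for w in weights:
        x = math.tanh(w * x)         # intermediate values are not saved
    return x, x0

def backward_with_recompute(x0, weights, grad_out=1.0):
    """Recompute the activations from the saved input, then run backward."""
    _, recomputed = forward_store_all(x0, weights)   # the extra forward pass
    return backward_from_saved(recomputed, weights, grad_out)

weights = [0.5, -1.2, 0.8]
out_a, saved = forward_store_all(0.3, weights)
out_b, x0 = forward_discard(0.3, weights)
grad_a = backward_from_saved(saved, weights)
grad_b = backward_with_recompute(x0, weights)
```

Here the stored path keeps one value per layer, while the recompute path keeps a single value and pays for one additional forward pass during backward.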

Core Features & Use Cases

  • Supports selective recompute, which saves memory by recomputing only specific submodules (e.g., core_attn, layernorm) at moderate compute cost.
  • Supports full-layer recompute to maximize memory savings under extreme memory pressure, with guidance on when to apply it.
  • Provides compatibility guidance for TE-scoped (Transformer Engine) CUDA graphs and related constraints when tuning memory for large models.

Quick Start

Configure selective recompute first (e.g., core_attn; optionally add layernorm), then escalate to full recompute with recompute_num_layers and recompute_method if memory pressure persists.
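The escalation path above can be sketched as a sequence of settings. The dataclass below is a hypothetical stand-in, not the Megatron Bridge API: recompute_num_layers, recompute_method, core_attn and layernorm come from this page, while the recompute_granularity and recompute_modules field names and the "selective", "full" and "uniform" values are assumptions modeled on Megatron-style configs and should be checked against the installed version.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class RecomputeSettings:
    """Hypothetical container mirroring recompute-related config fields."""
    recompute_granularity: Optional[str] = None   # assumed values: "selective" or "full"
    recompute_modules: List[str] = field(default_factory=list)
    recompute_method: Optional[str] = None        # assumed value: "uniform"
    recompute_num_layers: Optional[int] = None

def selective(modules=("core_attn",)):
    """Selective recompute: only the named submodules are recomputed."""
    return RecomputeSettings(
        recompute_granularity="selective",
        recompute_modules=list(modules),
    )

def full(num_layers, method="uniform"):
    """Full-layer recompute: whole transformer layers are recomputed."""
    return RecomputeSettings(
        recompute_granularity="full",
        recompute_method=method,
        recompute_num_layers=num_layers,
    )

# Escalation order suggested by the Quick Start text.
step1 = selective()                              # core_attn only
step2 = selective(("core_attn", "layernorm"))    # optionally add layernorm
step3 = full(num_layers=1)                       # only if memory pressure persists
```

Starting with the cheapest setting and moving down the list only when an out-of-memory error persists keeps the added compute as small as possible.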

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: nemo-mbridge-perf-activation-recompute
Download link: https://github.com/sayalinvidia/sayali-skills-test/archive/main.zip#nemo-mbridge-perf-activation-recompute

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.