Name: memory-tuning
Author: NVIDIA

System Documentation

What problem does it solve?

GPU memory fragmentation and high peak memory usage during Megatron training often cause out-of-memory (OOM) errors or reduced throughput. This memory-tuning guide collects fixes for stabilizing the training of large models.

Core Features & Use Cases

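Expandable segments: reduce fragmentation by letting the CUDA caching allocator grow its memory segments instead of allocating fixed-size blocks.
Activation recompute: selectively recompute activations during the backward pass instead of storing them, which lowers peak memory.
CPU offloading constraints: guidance on when offloading is compatible with a given parallelism configuration.
Parallelism tuning: guidance on tensor-parallel (TP), pipeline-parallel (PP), and data-parallel (DP) trade-offs for fitting large-scale training within a memory budget.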
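The TP/PP/DP trade-off can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the skill itself: the function names are hypothetical, it relies on the usual relation world size = TP x PP x DP, and it assumes model state is split evenly across TP and PP ranks at a caller-supplied number of bytes per parameter (the default of 16 is a common rule of thumb for mixed-precision Adam, not a measured value). Activation memory is not modeled.

```python
def data_parallel_size(world_size: int, tp: int, pp: int) -> int:
    """Return the DP size implied by world_size = TP * PP * DP."""
    if world_size < 1 or tp < 1 or pp < 1:
        raise ValueError("world_size, tp and pp must be positive integers")
    model_parallel = tp * pp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by TP*PP={model_parallel}"
        )
    return world_size // model_parallel


def model_state_gib_per_gpu(
    num_params: int, tp: int, pp: int, bytes_per_param: float = 16.0
) -> float:
    """Rough per-GPU model-state estimate in GiB.

    Assumption: weights, gradients and optimizer state are sharded evenly
    across TP * PP ranks. Activations and allocator overhead are ignored.
    """
    if num_params < 0 or tp < 1 or pp < 1:
        raise ValueError("num_params must be >= 0 and tp, pp must be >= 1")
    return num_params * bytes_per_param / (tp * pp) / 2**30
```

Raising TP or PP shrinks the per-GPU estimate but also shrinks the DP size for a fixed number of GPUs, which is the trade-off the bullet above refers to.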

Quick Start

Set the PYTORCH_CUDA_ALLOC_CONF environment variable to expandable_segments:True before launching Megatron training.
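If the training job is started from a Python launcher rather than a shell, the same setting can be applied programmatically. This is a minimal sketch, assuming the variable uses PyTorch's comma-separated option:value format. The helper name enable_expandable_segments is hypothetical and is not part of Megatron or PyTorch. It keeps any other allocator options that are already set and must run before the process makes its first CUDA allocation (setting it before importing torch is the safest choice).

```python
import os


def enable_expandable_segments(env=None):
    """Add expandable_segments:True to PYTORCH_CUDA_ALLOC_CONF.

    Hypothetical helper: `env` defaults to os.environ; a plain dict can be
    passed to build the environment for a subprocess instead.
    """
    env = os.environ if env is None else env
    current = env.get("PYTORCH_CUDA_ALLOC_CONF", "")
    # Keep unrelated options, drop any earlier expandable_segments entry.
    options = [
        opt.strip()
        for opt in current.split(",")
        if opt.strip() and not opt.strip().startswith("expandable_segments:")
    ]
    options.append("expandable_segments:True")
    env["PYTORCH_CUDA_ALLOC_CONF"] = ",".join(options)
    return env["PYTORCH_CUDA_ALLOC_CONF"]
```

Call enable_expandable_segments() at the very top of the launcher script, or pass the resulting dict as the env argument when spawning the training process.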

To install this Skill (Name: memory-tuning), download https://github.com/NVIDIA/skills/archive/main.zip#memory-tuning, extract the .zip file, and install it in the .claude/skills/ directory.
