cpu-offloading

Official

Enable CPU offloading for Megatron Bridge.

Author: NVIDIA
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Reduces GPU memory pressure by moving transformer activations and optimizer states to CPU memory, enabling larger models and more aggressive memory strategies in Megatron Bridge.

Core Features & Use Cases

  • Activation offloading: offloads activations layer by layer, with control over how many transformer layers are offloaded, constraints under pipeline parallelism (PP), and optional weight offload.
  • Optimizer offloading: offloads a configurable fraction of the Adam states to CPU via HybridDeviceOptimizer, with optional overlap of the GPU-to-CPU and CPU-to-GPU transfers.
  • Use cases: training or running inference on large models with limited GPU memory, and workloads that can trade speed for memory.
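To make the effect of fractional optimizer offload concrete, the following is a rough back-of-the-envelope sketch rather than anything taken from Megatron Bridge itself. It assumes 12 bytes of optimizer state per parameter (an fp32 master weight plus two fp32 Adam moments), assumes the offloaded share scales linearly with the fraction, and ignores any sharding of optimizer state across data-parallel ranks. The function name `optimizer_state_gib` is hypothetical.

```python
from typing import Tuple

# Assumption: fp32 master weight + fp32 momentum + fp32 variance per parameter.
BYTES_PER_PARAM_ADAM = 12
GIB = 1024 ** 3


def optimizer_state_gib(num_params: int, offload_fraction: float) -> Tuple[float, float]:
    """Estimate (gpu_gib, cpu_gib) of Adam state for a given offload fraction.

    Hypothetical helper for illustration; it does not call Megatron Bridge.
    """
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0.0 and 1.0")
    total_bytes = num_params * BYTES_PER_PARAM_ADAM
    cpu_bytes = total_bytes * offload_fraction  # share moved to host memory
    gpu_bytes = total_bytes - cpu_bytes         # share that stays on the GPU
    return gpu_bytes / GIB, cpu_bytes / GIB


if __name__ == "__main__":
    # Example: a model with 7e9 parameters at a few offload fractions.
    for fraction in (0.0, 0.5, 1.0):
        gpu, cpu = optimizer_state_gib(7_000_000_000, fraction)
        print(f"fraction={fraction:.1f}  gpu={gpu:.1f} GiB  cpu={cpu:.1f} GiB")
```

Real savings depend on precision settings and on how the optimizer state is already sharded, so treat these numbers only as an upper-level estimate when choosing a starting fraction.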

Quick Start

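Enable optimizer CPU offload with an offload fraction of 0.5 as a starting point. This moves roughly half of the optimizer state to CPU memory; adjust the fraction afterwards to balance GPU memory savings against training speed.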
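A minimal sketch of what such a quick-start setting might look like as configuration overrides. The key names (`optimizer_cpu_offload`, `optimizer_offload_fraction`, `overlap_cpu_optimizer_d2h_h2d`) are assumptions modeled on Megatron Core's optimizer settings, and the `optimizer.` prefix and the helper name are hypothetical; check them against the Megatron Bridge version you have installed before relying on them.

```python
from typing import Any, Dict


def optimizer_offload_overrides(fraction: float = 0.5, overlap: bool = True) -> Dict[str, Any]:
    """Build a dict of optimizer offload settings.

    Hypothetical helper: the key names are assumed, not confirmed by this skill's docs.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    return {
        "optimizer_cpu_offload": True,             # assumed name: turn offload on
        "optimizer_offload_fraction": fraction,    # assumed name: share of Adam state on CPU
        "overlap_cpu_optimizer_d2h_h2d": overlap,  # assumed name: overlap the two transfer directions
    }


if __name__ == "__main__":
    # Print the settings in a "key=value" override style (prefix is hypothetical).
    for key, value in optimizer_offload_overrides(0.5).items():
        print(f"optimizer.{key}={value}")
```

Starting at 0.5 and moving toward 1.0 only if memory is still tight keeps the amount of CPU-side optimizer work, and therefore the slowdown, as small as the memory budget allows.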

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install the skill automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: cpu-offloading
Download link: https://github.com/NVIDIA/skills/archive/main.zip#cpu-offloading

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
