cpu-offloading
Official
Enable CPU offloading for Megatron Bridge.
Software Engineering
#memory-management #gpu-memory #cpu-offloading #megatron-bridge #optimizer-states #hybrid-device-optimizer #pp-constraint
Author: NVIDIA
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Reduces GPU memory pressure by moving transformer activations and optimizer states to CPU memory, enabling larger models and more aggressive memory strategies in Megatron Bridge.
Core Features & Use Cases
- Activation offloading: offloads activations to CPU at the granularity of individual transformer layers, with control over how many layers are offloaded, optional weight offload, and constraints on pipeline parallelism (PP).
- Optimizer offloading: offloads a configurable fraction of the Adam states to CPU via HybridDeviceOptimizer, with overlapped transfers between GPU and CPU.
- Use cases: training or inference of large models on limited GPU memory, and workloads that trade some speed for lower memory use.
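The activation-offloading options listed above can be pictured as a small settings object with a validation step. The sketch below is illustrative only: `ActivationOffloadSettings` and all of its field names are hypothetical, not the Megatron Bridge API, and the specific PP rule (offloading only when the pipeline-parallel size is 1) is an assumption about what the PP constraint means.

```python
from dataclasses import dataclass


@dataclass
class ActivationOffloadSettings:
    """Hypothetical settings container; names are illustrative, not a real API."""

    num_layers: int                  # transformer layers in the model
    offload_num_layers: int          # how many of those layers offload to CPU
    pipeline_parallel_size: int = 1  # PP degree of the run
    offload_activations: bool = True
    offload_weights: bool = False    # optional weight offload

    def validate(self) -> None:
        # Assumed PP constraint: offloading is only allowed without pipeline parallelism.
        if self.pipeline_parallel_size > 1:
            raise ValueError("offloading assumes pipeline_parallel_size == 1")
        # At least one layer stays fully on the GPU in this sketch.
        if not 0 <= self.offload_num_layers < self.num_layers:
            raise ValueError("offload_num_layers must be in [0, num_layers)")
        # Requesting offloaded layers with nothing selected to offload is a mistake.
        if self.offload_num_layers > 0 and not (
            self.offload_activations or self.offload_weights
        ):
            raise ValueError("nothing selected to offload")


settings = ActivationOffloadSettings(num_layers=32, offload_num_layers=8)
settings.validate()  # passes: PP size is 1 and 8 < 32
```

Checking the configuration up front means a bad combination fails at startup, before any GPU memory has been allocated.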
Quick Start
Start by enabling optimizer CPU offload with an offload fraction of 0.5. This moves half of the optimizer states to CPU memory, reducing GPU memory usage while limiting the performance cost.
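To get a rough sense of what a 0.5 fraction buys, the following back-of-the-envelope sketch estimates how many bytes of Adam state remain on the GPU. It assumes two fp32 states per parameter (Adam's first and second moments) and ignores master weights, gradients, and sharding across ranks. The function name `adam_state_bytes_on_gpu` is hypothetical and is not part of Megatron Bridge.

```python
def adam_state_bytes_on_gpu(
    num_params: int,
    offload_fraction: float,
    bytes_per_state: int = 4,   # assumption: fp32 optimizer states
    states_per_param: int = 2,  # Adam keeps first and second moments
) -> int:
    """Estimate bytes of Adam state left on the GPU after fractional offload."""
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    total_bytes = num_params * bytes_per_state * states_per_param
    return int(total_bytes * (1.0 - offload_fraction))


# A model with 1 billion parameters holds 8 GB of Adam state under these assumptions.
full = adam_state_bytes_on_gpu(1_000_000_000, 0.0)
half = adam_state_bytes_on_gpu(1_000_000_000, 0.5)
print(full / 1e9, half / 1e9)  # 8.0 4.0
```

Under these assumptions, a 0.5 fraction frees about 4 GB of GPU memory per billion parameters. The actual saving depends on the optimizer state precision and on how the states are distributed across ranks.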
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: let Claude install the skill automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: cpu-offloading
Download link: https://github.com/NVIDIA/skills/archive/main.zip#cpu-offloading
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.