perf-parallelism-strategies

Official

Optimize model parallelism for best performance.

Author: NVIDIA-NeMo
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill explains how to choose and combine parallelism strategies in Megatron Bridge to improve training efficiency and scalability.

Core Features & Use Cases

  • Parallelism Strategy Selection: Guides users in selecting appropriate data, tensor, pipeline, expert, and sequence parallelism based on model size, hardware topology, and sequence length.
  • Profiling and Sizing Advice: Provides heuristic guidelines and minimum GPU counts for different parallelism configurations, helping users plan resources efficiently.
  • Technical Resource: Explains the technical constraints, performance considerations, and implementation details necessary for advanced parallelism setup.
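The sizing advice can be illustrated with a little arithmetic. The sketch below is not taken from the Skill or from Megatron Bridge; it assumes the common layout in which every model replica occupies tensor-parallel size times pipeline-parallel size GPUs, and the remaining factor of the world size becomes the data-parallel size. The function name `data_parallel_size` is hypothetical, and the sketch ignores context and expert parallelism.

```python
def data_parallel_size(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return the data-parallel replica count implied by the other sizes.

    Assumption: one model replica spans tensor_parallel * pipeline_parallel GPUs,
    so that product is also the minimum GPU count for the configuration.
    """
    gpus_per_replica = tensor_parallel * pipeline_parallel
    if world_size < gpus_per_replica:
        raise ValueError(
            f"need at least {gpus_per_replica} GPUs, got {world_size}"
        )
    if world_size % gpus_per_replica != 0:
        raise ValueError(
            f"world size {world_size} is not divisible by {gpus_per_replica}"
        )
    return world_size // gpus_per_replica


# 64 GPUs, tensor parallel 8, pipeline parallel 2: each replica uses 16 GPUs.
print(data_parallel_size(world_size=64, tensor_parallel=8, pipeline_parallel=2))  # 4
```

A check like this is cheap to run before requesting a cluster allocation, since a world size that does not divide evenly cannot be laid out under the assumed scheme.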

Quick Start

In the model provider, set tensor_model_parallel_size, pipeline_model_parallel_size, and sequence_parallel to match your hardware topology and model size.
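As a rough illustration of those three settings, the sketch below checks them for consistency and copies them onto a provider object. Only the three field names come from this page. The `ParallelismSettings` class and `apply_settings` helper are hypothetical, and the sketch assumes the model provider exposes the settings as plain attributes; a `SimpleNamespace` stands in for a real provider. The one rule it enforces is that sequence parallelism is only meaningful when the tensor-parallel size is greater than 1.

```python
from dataclasses import dataclass, fields
from types import SimpleNamespace


@dataclass
class ParallelismSettings:
    """Hypothetical container; not a Megatron Bridge class."""

    tensor_model_parallel_size: int = 1
    pipeline_model_parallel_size: int = 1
    sequence_parallel: bool = False

    def validate(self) -> None:
        if self.tensor_model_parallel_size < 1 or self.pipeline_model_parallel_size < 1:
            raise ValueError("parallel sizes must be at least 1")
        # Sequence parallelism splits work across the tensor-parallel group,
        # so it has no effect without tensor parallelism.
        if self.sequence_parallel and self.tensor_model_parallel_size == 1:
            raise ValueError("sequence_parallel needs tensor_model_parallel_size > 1")


def apply_settings(provider, settings: ParallelismSettings):
    """Validate the settings, then copy each field onto the provider (assumed attribute-based)."""
    settings.validate()
    for field in fields(settings):
        setattr(provider, field.name, getattr(settings, field.name))
    return provider


# Stand-in for a real model provider object.
provider = apply_settings(
    SimpleNamespace(),
    ParallelismSettings(
        tensor_model_parallel_size=8,
        pipeline_model_parallel_size=2,
        sequence_parallel=True,
    ),
)
print(provider.tensor_model_parallel_size, provider.pipeline_model_parallel_size)  # 8 2
```

Validating before assignment keeps an inconsistent combination from reaching the training launch, where the failure would be slower to surface.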

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install the Skill automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: perf-parallelism-strategies
Download link: https://github.com/NVIDIA-NeMo/Megatron-Bridge/archive/main.zip#perf-parallelism-strategies

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
