nemo-mbridge-perf-hierarchical-context-parallel
Community
Enable hierarchical CP for Megatron-Bridge.
Category: Software Engineering
Tags: #distributed-training #Megatron-Bridge #hierarchical-context-parallel #transformer-engine #context-parallel
Author: sayalinvidia
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This skill scales context parallelism in Megatron-Bridge to larger models and datasets by enabling hierarchical context parallelism (HCP), which arranges the context-parallel ranks into per-level groups.
Core Features & Use Cases
- Enable per-level context-parallel grouping for a2a+p2p communication in Megatron-Bridge.
- Enforce configuration constraints (product of hierarchical_context_parallel_sizes equals context_parallel_size; seq_length divisible by 2 * context_parallel_size; Transformer Engine version >= 1.12.0).
- Verify implementation through logs showing HIERARCHICAL_CONTEXT_PARALLEL_GROUPS during initialization and by running manual smoke tests.
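The three constraints listed above can be checked before launching a job. The following is a minimal sketch in plain Python, not part of Megatron-Bridge: the function names `parse_version` and `check_hcp_config` are hypothetical, and the Transformer Engine version is passed in as a string rather than read from the installed package.

```python
import re
from math import prod

MIN_TE_VERSION = (1, 12, 0)  # minimum Transformer Engine version named in the constraints


def parse_version(version):
    """Turn a version string such as '1.12.0' or '1.13.0.dev0' into an int tuple."""
    numbers = re.findall(r"\d+", version.split("+")[0])[:3]
    return tuple(int(n) for n in numbers)


def check_hcp_config(context_parallel_size, hierarchical_context_parallel_sizes,
                     seq_length, te_version):
    """Return a list of human-readable constraint violations (empty if valid).

    Hypothetical helper: it only mirrors the three constraints in the list above.
    """
    errors = []
    # Constraint 1: the per-level sizes must multiply to the total CP size.
    if prod(hierarchical_context_parallel_sizes) != context_parallel_size:
        errors.append(
            f"product of {hierarchical_context_parallel_sizes} "
            f"!= context_parallel_size ({context_parallel_size})"
        )
    # Constraint 2: the sequence length must be divisible by 2 * CP size.
    if seq_length % (2 * context_parallel_size) != 0:
        errors.append(
            f"seq_length ({seq_length}) is not divisible by "
            f"2 * context_parallel_size ({2 * context_parallel_size})"
        )
    # Constraint 3: Transformer Engine must be at least 1.12.0.
    if parse_version(te_version) < MIN_TE_VERSION:
        errors.append(f"Transformer Engine {te_version} is older than 1.12.0")
    return errors
```

Calling `check_hcp_config(4, [2, 2], 8192, "1.12.0")` returns an empty list, while changing the sizes to `[2, 3]` reports the product mismatch.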
Quick Start
Set cfg.model.context_parallel_size=4, cfg.model.cp_comm_type="a2a+p2p", and cfg.model.hierarchical_context_parallel_sizes=[2, 2], then confirm that the initialization logs show HIERARCHICAL_CONTEXT_PARALLEL_GROUPS.
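As a rough illustration of the settings above, the sketch below uses a `SimpleNamespace` as a stand-in for the real `cfg` object, since how a Megatron-Bridge config is constructed depends on the recipe in use and is not shown here. The helper `log_shows_hcp_groups` is hypothetical and only searches captured log text for the marker string named in the quick start.

```python
from math import prod
from types import SimpleNamespace

# Stand-in for a Megatron-Bridge config object (assumption: the real cfg
# exposes the same attribute names under cfg.model).
cfg = SimpleNamespace(model=SimpleNamespace())

cfg.model.context_parallel_size = 4
cfg.model.cp_comm_type = "a2a+p2p"
cfg.model.hierarchical_context_parallel_sizes = [2, 2]

# Sanity check: the per-level sizes must multiply to the total CP size.
assert prod(cfg.model.hierarchical_context_parallel_sizes) == cfg.model.context_parallel_size


def log_shows_hcp_groups(log_text):
    """Hypothetical helper: True if the initialization log mentions the HCP groups."""
    return "HIERARCHICAL_CONTEXT_PARALLEL_GROUPS" in log_text
```

After a run, the captured initialization log can be passed to `log_shows_hcp_groups` as a quick smoke check. A missing marker suggests the hierarchical groups were not created.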
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: nemo-mbridge-perf-hierarchical-context-parallel Download link: https://github.com/sayalinvidia/sayali-skills-test/archive/main.zip#nemo-mbridge-perf-hierarchical-context-parallel Please download this .zip file, extract it, and install it in the .claude/skills/ directory.