megatron-lm

Name: megatron-lm
Author: tylertitsworth

Community

Scale transformer training with Megatron-LM.

Software Engineering #distributed-training #mixed-precision #megatron-lm #megatron #transformer-training #gpu-parallelism #huggingface-conversion


Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

Megatron-LM trains transformer models with very large parameter counts by splitting the model and the data across the GPUs of a cluster.

Core Features & Use Cases

Parallelism configuration for tensor, pipeline, context, and expert (MoE) parallelism.
Data loading, tokenization, and HuggingFace integration for large-scale training.
Checkpoint management, plus Megatron Bridge conversion to and from HuggingFace format.
Use Cases: training large transformer models, MoE training, and checkpoint conversion workflows.
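
The parallelism sizes are not independent: the tensor, pipeline, and context parallel sizes together must divide the total GPU count, and whatever is left over becomes the data-parallel size. The following is a minimal sketch of that arithmetic, not part of this skill or of Megatron-LM itself; the function name `data_parallel_size` is hypothetical, and expert parallelism is left out because how it is carved out of the remaining ranks can differ between Megatron-LM versions.

```python
def data_parallel_size(world_size: int, tp: int, pp: int, cp: int = 1) -> int:
    """Return the data-parallel size implied by the other parallel sizes.

    Hypothetical helper for illustration only.
    world_size: total number of GPUs in the job
    tp, pp, cp: tensor, pipeline, and context parallel sizes
    """
    if min(world_size, tp, pp, cp) < 1:
        raise ValueError("all sizes must be positive integers")

    # GPUs consumed by one model replica.
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tp*pp*cp={model_parallel}"
        )
    return world_size // model_parallel


if __name__ == "__main__":
    # Example: 8 nodes with 8 GPUs each, tp=4, pp=2, cp=2.
    print(data_parallel_size(world_size=64, tp=4, pp=2, cp=2))  # prints 4
```

Checking this before launching catches a mismatched configuration early, instead of at process-group initialization on the cluster.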

Quick Start

Choose tensor, pipeline, and context parallel sizes that fit your multi-node GPU cluster, then run the Megatron-LM training script to start the training job.
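
As a rough illustration of the launch step, the sketch below assembles a `torchrun` command for one node. It assumes the `pretrain_gpt.py` entry point and the `--tensor-model-parallel-size`, `--pipeline-model-parallel-size`, and `--context-parallel-size` flags from the Megatron-LM repository; check them against the version you have installed. The helper name `build_launch_command` and the example address are hypothetical, and a real run also needs model, data, and optimizer arguments that are omitted here.

```python
import shlex


def build_launch_command(
    nnodes: int,
    gpus_per_node: int,
    node_rank: int,
    master_addr: str,
    tp: int,
    pp: int,
    cp: int = 1,
    master_port: int = 29500,
    script: str = "pretrain_gpt.py",  # assumed Megatron-LM entry point
) -> list[str]:
    """Build the argument list for launching one node of a training job.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        "torchrun",
        f"--nproc_per_node={gpus_per_node}",
        f"--nnodes={nnodes}",
        f"--node_rank={node_rank}",
        f"--master_addr={master_addr}",
        f"--master_port={master_port}",
        script,
        # Parallelism flags; model, data, and optimizer flags are omitted.
        "--tensor-model-parallel-size", str(tp),
        "--pipeline-model-parallel-size", str(pp),
        "--context-parallel-size", str(cp),
    ]


if __name__ == "__main__":
    # Example values only: 2 nodes with 8 GPUs each, launched from node 0.
    cmd = build_launch_command(
        nnodes=2, gpus_per_node=8, node_rank=0,
        master_addr="10.0.0.1", tp=4, pp=2, cp=2,
    )
    print(shlex.join(cmd))
```

The same command is run on every node with a different `node_rank`. Keeping it as a list rather than a single string makes it safe to pass directly to `subprocess.run`.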
