capture-nsys-profile

Official

Capture Nsight Systems traces for MoE training.

Author: mlc-ai
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Capture Nsight Systems traces to diagnose performance bottlenecks in MoE training runs using PithTrain, enabling targeted optimization across parallelism configurations.

Core Features & Use Cases

  • Generates per-node .nsys-rep traces for profiling across pipeline, expert, and context parallelism settings.
  • Automatically sizes the global batch and runs warmup plus profiled steps from a released checkpoint to produce representative traces.
  • Integrates with existing checkpoint workflows to produce actionable performance data for kernel timelines and all-to-all overhead analysis.
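The batch sizing and step planning described above can be sketched as follows. This is a minimal illustration and not PithTrain's actual logic: the function name `plan_profile_run`, its parameters, the default step counts, and the sizing rule itself are all assumptions. In particular, it assumes that expert parallelism shares ranks with the data-parallel dimension, so only pipeline and context parallelism reduce the number of data-parallel replicas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfilePlan:
    """Hypothetical container for a profiling run's batch size and step counts."""
    global_batch_size: int
    warmup_steps: int
    profiled_steps: int

    @property
    def total_steps(self) -> int:
        # Warmup steps run first; only the steps after them are meant to be profiled.
        return self.warmup_steps + self.profiled_steps


def plan_profile_run(world_size, pp, cp, micro_batch_size,
                     microbatches_per_step, warmup_steps=3, profiled_steps=2):
    """Derive a global batch size from the parallelism layout (assumed rule)."""
    model_parallel = pp * cp
    if world_size % model_parallel != 0:
        raise ValueError("world_size must be divisible by pp * cp")
    # Assumption: every rank not consumed by pipeline or context parallelism
    # acts as a data-parallel replica.
    dp = world_size // model_parallel
    global_batch = dp * micro_batch_size * microbatches_per_step
    return ProfilePlan(global_batch, warmup_steps, profiled_steps)


plan = plan_profile_run(world_size=16, pp=2, cp=2,
                        micro_batch_size=1, microbatches_per_step=8)
print(plan.global_batch_size, plan.total_steps)
```

Keeping the warmup and profiled counts separate makes it easy to exclude one-time costs, such as allocator growth or kernel autotuning, from the part of the run that is analyzed.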

Quick Start

Run the profile-capture flow with your target model and parallelism configuration to create a .nsys-rep trace under workspace/capture-nsys-profile.
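The skill's own scripts are not shown here, so the following is only a rough sketch of how a per-node capture command might be assembled. The helper `build_nsys_command`, the `node<rank>` file naming, and the training command are hypothetical; the `nsys profile` options used (`--trace`, `--output`, `--force-overwrite`) are standard Nsight Systems CLI flags, and nsys appends the `.nsys-rep` extension to the output name.

```python
from pathlib import PurePosixPath


def build_nsys_command(train_cmd, node_rank,
                       out_dir="workspace/capture-nsys-profile"):
    """Wrap a training command in `nsys profile` (hypothetical helper).

    The per-node output name is an assumption made for illustration.
    """
    output = PurePosixPath(out_dir) / f"node{node_rank}"
    return [
        "nsys", "profile",
        "--trace=cuda,nvtx",          # CUDA API/kernel activity plus NVTX ranges
        "--force-overwrite=true",     # replace a report left by an earlier run
        f"--output={output}",         # nsys writes <output>.nsys-rep
        *train_cmd,
    ]


# Placeholder training command; substitute the real launch line for your setup.
cmd = build_nsys_command(["python", "train.py"], node_rank=0)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` on each node, which yields one report per node for side-by-side comparison in the Nsight Systems GUI.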

Dependency Matrix

Required Modules

nsight-systems, pithtrain

Components

scripts

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: capture-nsys-profile
Download link: https://github.com/mlc-ai/pith-train/archive/main.zip#capture-nsys-profile

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
