capture-nsys-profile
Official
Capture Nsight Systems traces for MoE training.
Author: mlc-ai
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Capture Nsight Systems traces to diagnose performance bottlenecks in MoE training runs using PithTrain, enabling targeted optimization across parallelism configurations.
Core Features & Use Cases
- Generates per-node .nsys-rep traces for profiling across pipeline, expert, and context parallelism settings.
- Automatically sizes the global batch and runs warmup and profiled steps from a released checkpoint to produce representative traces.
- Integrates with existing checkpoint workflows to produce actionable performance data for kernel timelines and all-to-all overhead analysis.
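The warmup-then-profile pattern in the list above can be sketched as follows. This is a minimal illustration, not PithTrain's actual implementation: `run_with_capture_window`, `step_fn`, `start_capture`, and `stop_capture` are hypothetical names introduced here. In a real run the two capture hooks would plausibly be `torch.cuda.profiler.start()` and `torch.cuda.profiler.stop()`, paired with `nsys profile --capture-range=cudaProfilerApi`, but that pairing is an assumption about the setup.

```python
from typing import Callable, List


def run_with_capture_window(
    step_fn: Callable[[int], None],
    warmup_steps: int,
    profiled_steps: int,
    start_capture: Callable[[], None],
    stop_capture: Callable[[], None],
) -> None:
    """Run warmup steps outside the capture, then bracket the profiled steps.

    All arguments are hypothetical hooks for illustration only.
    """
    if warmup_steps < 0 or profiled_steps <= 0:
        raise ValueError("warmup_steps must be >= 0 and profiled_steps > 0")

    for step in range(warmup_steps + profiled_steps):
        # Open the capture window right before the first profiled step,
        # so one-time startup costs stay out of the trace.
        if step == warmup_steps:
            start_capture()
        step_fn(step)

    # Close the window after the last profiled step.
    stop_capture()


# Usage: record the order of events with plain callbacks.
events: List[str] = []
run_with_capture_window(
    step_fn=lambda s: events.append(f"step{s}"),
    warmup_steps=2,
    profiled_steps=3,
    start_capture=lambda: events.append("start"),
    stop_capture=lambda: events.append("stop"),
)
print(events)
```

Keeping warmup outside the window means the trace reflects steady-state step time rather than one-time startup work.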
Quick Start
Run the profile-capture flow with your target model and parallelism configuration to create an .nsys-rep file under workspace/capture-nsys-profile.
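As a rough sketch of what such a capture might look like at the command level, the snippet below assembles an `nsys profile` invocation that writes one report per node. The flags `--trace`, `--output`, and `--force-overwrite` are standard `nsys` options, and `%h` in the output name is expanded by `nsys` to the hostname. The helper `build_nsys_command`, the report name `trace`, and the `train.py` entry point are placeholders, not the skill's real scripts.

```python
import shlex
from pathlib import PurePosixPath
from typing import List, Sequence


def build_nsys_command(
    train_cmd: Sequence[str],
    out_dir: str = "workspace/capture-nsys-profile",
    name: str = "trace",  # hypothetical report name
) -> List[str]:
    """Wrap a training command in `nsys profile`, one report per host."""
    # nsys replaces %h with the hostname, giving each node its own file.
    output = PurePosixPath(out_dir) / f"{name}-%h"
    return [
        "nsys",
        "profile",
        "--trace=cuda,nvtx",       # CUDA API/kernels plus NVTX ranges
        "--force-overwrite=true",  # replace a report left by an earlier run
        f"--output={output}",
        *train_cmd,
    ]


# `python train.py` stands in for the real launch command, which is
# not specified here.
cmd = build_nsys_command(["python", "train.py"])
print(shlex.join(cmd))
```

Building the command as a list rather than a string avoids quoting mistakes when it is later passed to `subprocess.run`.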
Dependency Matrix
Required Modules
nsight-systems, pithtrain
Components
scripts
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: capture-nsys-profile
Download link: https://github.com/mlc-ai/pith-train/archive/main.zip#capture-nsys-profile
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.