ncu-report-skill

Official

Profile CUDA kernels with Nsight Compute (B200)

Author: mit-han-lab
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Profile CUDA kernels with Nsight Compute on NVIDIA Blackwell B200 to identify bottlenecks, reason about root causes, and generate a structured optimization plan.

Core Features & Use Cases

  • End-to-end profiling workflow: harness creation, report collection, and final reporting.
  • Python-based analysis: extract metrics, compare runs, and map signals to fixes.
  • Six analysis dimensions and a diagnosis playbook to guide actionable changes.
  • Final optimization report with prioritized recommendations and evidence.
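The "extract metrics, compare runs" step can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the helper names relative_change and load_metrics are hypothetical, the numbers in the usage example are made up, and load_metrics assumes the ncu_report module that ships with Nsight Compute exposes load_report, range_by_idx, action_by_idx, metric_by_name, and as_double as in its Python Report Interface.

```python
def relative_change(baseline, candidate):
    """Return {metric: (candidate - baseline) / baseline} for metrics in both runs."""
    changes = {}
    for name in sorted(baseline.keys() & candidate.keys()):
        base = baseline[name]
        if base == 0:
            continue  # skip metrics where a relative change is undefined
        changes[name] = (candidate[name] - base) / base
    return changes


def load_metrics(report_path, metric_names, kernel_index=0):
    """Hypothetical helper: read selected metrics for one kernel from a .ncu-rep file.

    Assumption: ncu_report is importable (it ships with Nsight Compute, not PyPI)
    and the report holds the kernel of interest in its first range.
    """
    import ncu_report  # imported lazily so the comparison helper works without it

    context = ncu_report.load_report(report_path)
    action = context.range_by_idx(0).action_by_idx(kernel_index)
    values = {}
    for name in metric_names:
        metric = action.metric_by_name(name)
        if metric is not None:  # metric may be absent from the collected set
            values[name] = metric.as_double()
    return values


# Illustrative values only (not measured data).
baseline = {"gpu__time_duration.sum": 200.0}
candidate = {"gpu__time_duration.sum": 150.0}
changes = relative_change(baseline, candidate)
```

In this sketch a negative change in gpu__time_duration.sum means the candidate run was faster. Metrics present in only one run are ignored rather than guessed.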

Quick Start

Create a new run directory under profile/<run_name>/, compile a standalone harness with nvcc -lineinfo so that profiler results can be correlated with source lines, and profile the kernel with Nsight Compute (ncu).
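The three Quick Start steps might look like the sketch below, which only assembles the command lines and leaves running them to the caller. The file names (harness.cu, harness, report), the -arch=sm_100 target for B200, and the choice of --set full are assumptions for illustration; the skill itself may use different names and options.

```python
from pathlib import Path


def build_commands(run_name, source="harness.cu", kernel_name=None, arch="sm_100"):
    """Return (run_dir, nvcc_cmd, ncu_cmd) for one profiling run.

    Assumptions: source file name, output names, and arch are illustrative.
    """
    run_dir = Path("profile") / run_name
    harness = (run_dir / "harness").as_posix()
    report = (run_dir / "report").as_posix()  # ncu appends .ncu-rep

    # -lineinfo keeps source line information for source-level correlation.
    nvcc_cmd = ["nvcc", "-O3", "-lineinfo", f"-arch={arch}", source, "-o", harness]

    # --set full collects the full section set; --import-source embeds sources.
    ncu_cmd = ["ncu", "--set", "full", "--import-source", "yes",
               "--force-overwrite", "-o", report]
    if kernel_name:
        ncu_cmd += ["--kernel-name", kernel_name]  # profile only this kernel
    ncu_cmd.append(harness)
    return run_dir, nvcc_cmd, ncu_cmd


run_dir, nvcc_cmd, ncu_cmd = build_commands("baseline", kernel_name="my_kernel")
```

To execute the steps, one could call run_dir.mkdir(parents=True, exist_ok=True) and then pass each list to subprocess.run(cmd, check=True) on a machine with the CUDA toolkit and Nsight Compute installed.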

Dependency Matrix

Required Modules

ncu_report

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: ncu-report-skill
Download link: https://github.com/mit-han-lab/ncu-report-skill/archive/main.zip#ncu-report-skill

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
