Avoid Warp Divergence
Official
Diagnose and reduce CUDA warp divergence.
Software Engineering
#cuda #kernel-optimization #ballot #gpu-kernels #warp-divergence #predication #loop-peeling
Author: tensormux
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It helps identify and reduce warp divergence in CUDA kernels by distinguishing avoidable from unavoidable divergence and by evaluating the performance implications before any restructuring.
Core Features & Use Cases
- Classify divergence type (geometry-based, data-dependent, unavoidable boundary) and estimate warp-level impact.
- Apply restructuring strategies such as loop peeling, data reorganization, and predication-aware paths, and decide when to launch specialized kernels.
- Plan, prioritize, and validate changes with profiling metrics and safety checks.
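The classification step in the list above can be illustrated with a small host-side model. This is a minimal sketch and not part of the skill itself: the function `divergent_warp_fraction` and the three example predicates are hypothetical names, and a warp is assumed to be 32 consecutive thread indices. On the device, the same per-warp check can be made by comparing the result of `__ballot_sync` with the active mask.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Assumption: a warp is 32 consecutive thread indices.
constexpr std::size_t kWarpSize = 32;

// Returns the fraction of warps whose threads disagree on `takes_branch`.
// A warp counts as divergent when at least one of its threads takes the
// branch and at least one does not.
double divergent_warp_fraction(
    std::size_t num_threads,
    const std::function<bool(std::size_t)>& takes_branch) {
    if (num_threads == 0) return 0.0;
    const std::size_t num_warps = (num_threads + kWarpSize - 1) / kWarpSize;
    std::size_t divergent = 0;
    for (std::size_t w = 0; w < num_warps; ++w) {
        const std::size_t begin = w * kWarpSize;
        const std::size_t end = std::min(begin + kWarpSize, num_threads);
        std::size_t taken = 0;
        for (std::size_t t = begin; t < end; ++t) {
            if (takes_branch(t)) ++taken;
        }
        if (taken != 0 && taken != end - begin) ++divergent;
    }
    return static_cast<double>(divergent) / static_cast<double>(num_warps);
}

// Geometry-based and avoidable: neighbouring threads take different paths.
bool even_thread(std::size_t t) { return t % 2 == 0; }

// Same amount of work, but the branch is aligned to warp boundaries.
bool even_warp(std::size_t t) { return (t / kWarpSize) % 2 == 0; }

// Boundary check for a hypothetical array of 1000 elements.
bool below_1000(std::size_t t) { return t < 1000; }
```

With 1024 threads, `even_thread` makes every warp divergent (fraction 1.0), `even_warp` makes none divergent (fraction 0.0), and `below_1000` affects only the last of the 32 warps (fraction 1/32). The contrast suggests why a warp-aligned branch is usually worth restructuring toward, while a single boundary warp is often not worth a specialized kernel.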
Quick Start
Profile a CUDA kernel to identify divergent regions and choose a restructuring strategy before optimization.
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: Avoid Warp Divergence
Download link: https://github.com/tensormux/kernel-skills/archive/main.zip#avoid-warp-divergence
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.