cuda-c-optimization
Community
A practical guide to improving CUDA-C performance and stability
Software Engineering #debugging #cuda #numerical-stability #gpu-optimization #kernel-optimization #cuda-c #memory-coalescing
Author: xchang1121
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
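CUDA C kernels often suffer from poor performance, weak numerical stability, and difficult debugging. This Skill provides a structured, actionable guide that helps you optimize CUDA-C code, raise throughput, and keep results numerically reliable across common workloads.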
Core Features & Use Cases
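- Block size selection strategy: recommended thread block sizes and grid configurations for different kernel types (element-wise, reduction, matrix multiplication, image processing).
- Memory access optimization: use coalesced, aligned accesses and avoid shared-memory bank conflicts to improve bandwidth utilization.
- Compute optimization: reduce branch divergence, use fast math functions, and minimize global atomic operations.
- Occupancy and debugging: check grid/block configurations and keep an easy-to-maintain debugging checklist.
- Numerical stability techniques: reliable reductions, safe division and square roots, and avoiding overflow/underflow.
- Use cases: compute-intensive CUDA-C tasks that need performance tuning, memory access optimization, and numerical stability.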
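Several of the features above (coalesced access, avoiding bank conflicts, avoiding global atomics, reliable reductions) can be illustrated together in one reduction kernel. The following is only a minimal sketch and is not taken from the Skill itself: the names `block_sum_kahan`, `device_sum`, and `kBlock`, the block size of 256, and the default of 64 blocks are all assumptions made for illustration. Each thread accumulates with Kahan compensated summation over a grid-stride loop, each block reduces in shared memory, and per-block partial sums are finished on the host in double precision instead of using global atomics.

```cuda
#include <cuda_runtime.h>
#include <limits>
#include <vector>

// Assumed block size; must be a power of two for the tree reduction below.
constexpr int kBlock = 256;

// Hypothetical kernel: writes one partial sum per block into partial[blockIdx.x].
// Must be launched with blockDim.x == kBlock.
__global__ void block_sum_kahan(const float* __restrict__ in,
                                float* __restrict__ partial, int n)
{
    __shared__ float smem[kBlock];

    // Grid-stride loop: neighbouring threads read neighbouring addresses,
    // so the global loads coalesce.
    float sum = 0.0f, comp = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        float y = in[i] - comp;   // apply the running compensation
        float t = sum + y;
        comp = (t - sum) - y;     // recover the rounding error of this add
        sum = t;
    }
    smem[threadIdx.x] = sum;
    __syncthreads();

    // Tree reduction with sequential addressing, which avoids bank conflicts.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) smem[threadIdx.x] += smem[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) partial[blockIdx.x] = smem[0];
}

// Hypothetical host helper: returns the sum, or NaN if a CUDA call failed.
double device_sum(const std::vector<float>& h_in, int blocks = 64)
{
    const double kFail = std::numeric_limits<double>::quiet_NaN();
    const int n = static_cast<int>(h_in.size());
    if (n == 0) return 0.0;

    float *d_in = nullptr, *d_partial = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return kFail;
    if (cudaMalloc(&d_partial, blocks * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return kFail;
    }
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    block_sum_kahan<<<blocks, kBlock>>>(d_in, d_partial, n);

    std::vector<float> h_partial(blocks);
    cudaError_t err = cudaGetLastError();  // catches launch errors
    if (err == cudaSuccess)                // blocking copy also waits for the kernel
        err = cudaMemcpy(h_partial.data(), d_partial, blocks * sizeof(float),
                         cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_partial);
    if (err != cudaSuccess) return kFail;

    // Finish in double on the host: no global atomics, deterministic order.
    double total = 0.0;
    for (float p : h_partial) total += p;
    return total;
}
```

Calling `device_sum(values)` on a `std::vector<float>` returns the total as a `double`. Writing per-block partials trades a small host-side loop for a deterministic summation order; a single `atomicAdd` per block would be a reasonable alternative when the result must stay on the device.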
Quick Start
Follow the guide to profile a kernel, adjust block sizes, improve memory access, and validate numerical stability in your CUDA C code.
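As a rough illustration of the "adjust block sizes" and "validate numerical stability" steps, the sketch below asks the CUDA runtime for an occupancy-based block size through `cudaOccupancyMaxPotentialBlockSize` and applies it to an element-wise kernel that guards its square root and division. The kernel `safe_normalize`, the helper `run_safe_normalize`, the `eps` default, and the fallback block size of 256 are assumptions for this example, not part of the Skill.

```cuda
#include <cuda_runtime.h>
#include <cmath>
#include <vector>

// Hypothetical element-wise kernel: out[i] = a[i] / (sqrt(max(b[i], 0)) + eps).
// Clamping keeps sqrtf away from negative inputs; eps keeps the divisor nonzero.
__global__ void safe_normalize(const float* __restrict__ a,
                               const float* __restrict__ b,
                               float* __restrict__ out, int n, float eps)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float denom = sqrtf(fmaxf(b[i], 0.0f)) + eps;
        out[i] = a[i] / denom;
    }
}

// Hypothetical host helper: returns the result vector (all zeros on failure).
std::vector<float> run_safe_normalize(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      float eps = 1e-6f)
{
    const int n = static_cast<int>(a.size());
    std::vector<float> out(n, 0.0f);
    if (n == 0 || b.size() != a.size()) return out;

    // Let the runtime suggest a block size that maximizes occupancy for this
    // kernel; fall back to an assumed default of 256 if the query fails.
    int minGrid = 0, block = 0;
    if (cudaOccupancyMaxPotentialBlockSize(&minGrid, &block, safe_normalize,
                                           0, 0) != cudaSuccess || block <= 0)
        block = 256;
    const int grid = (n + block - 1) / block;  // one thread per element

    const size_t bytes = n * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_a, bytes) != cudaSuccess ||
        cudaMalloc(&d_b, bytes) != cudaSuccess ||
        cudaMalloc(&d_out, bytes) != cudaSuccess) {
        cudaFree(d_a); cudaFree(d_b); cudaFree(d_out);  // cudaFree(nullptr) is a no-op
        return out;
    }
    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    safe_normalize<<<grid, block>>>(d_a, d_b, d_out, n, eps);

    if (cudaGetLastError() == cudaSuccess)
        cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_out);
    return out;
}
```

The occupancy query is only a starting point; profiling the kernel with a few nearby block sizes is still the way to confirm the choice. Feeding zero or negative values in `b` is a quick check that the output stays finite.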
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: cuda-c-optimization Download link: https://github.com/xchang1121/AutoResearch-CC-hook/archive/main.zip#cuda-c-optimization Please download this .zip file, extract it, and install it in the .claude/skills/ directory.