croq-dsl-cuda
Official
Streamline CUDA kernel tuning with automated build and profiling.
Author: LancerLab
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill simplifies CUDA kernel tuning by providing standardized build and run scripts, environment validation, and performance profiling tools.
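The environment validation step mentioned above might look roughly like the following. This is a minimal sketch, not the Skill's actual script: it assumes validation means checking that the `nvcc` compiler and the `ncu` profiler are on `PATH`, and the function names `check_tools` and `missing_tools` are hypothetical.

```python
import shutil


def check_tools(tools=("nvcc", "ncu"), which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    The default tool list is an assumption; the Skill may check more than this.
    """
    return {name: which(name) for name in tools}


def missing_tools(found):
    """Return the sorted names of tools that could not be located."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    missing = missing_tools(check_tools())
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("CUDA toolchain found.")
```

Passing `which` as a parameter keeps the lookup replaceable, so the check can be exercised on a machine without a GPU toolchain.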
Core Features & Use Cases
- Build Automation: Generate optimized CUDA kernel binaries with custom build scripts tailored for different shapes and models.
- Performance Profiling: Utilize NVIDIA Nsight Compute (ncu) to collect detailed performance metrics on GPU kernels.
- Use Case: A developer optimizing a deep learning operation can automate the compilation and profiling process to identify bottlenecks and improve throughput.
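As an illustration of the build and profiling steps listed above, the sketch below assembles an `nvcc` compile command and an `ncu` profiling command as argument lists. It is an assumption about how such scripts could be structured, not the Skill's own generated scripts; the file names, the `sm_80` architecture, and the helper names `nvcc_command` and `ncu_command` are all hypothetical.

```python
import shlex


def nvcc_command(source, output, arch="sm_80", extra_flags=()):
    """Build an nvcc invocation; the arch default is an assumed target GPU."""
    return ["nvcc", "-O3", f"-arch={arch}", *extra_flags, "-o", output, source]


def ncu_command(binary, report, args=(), metric_set="full"):
    """Build an ncu invocation that collects a metric set into a report file."""
    return ["ncu", "--set", metric_set, "-o", report, binary, *args]


if __name__ == "__main__":
    # Hypothetical kernel source and problem shape passed as program arguments.
    build = nvcc_command("matmul.cu", "matmul")
    profile = ncu_command("./matmul", "matmul_report", args=("1024", "1024", "1024"))
    print(shlex.join(build))
    print(shlex.join(profile))
```

Each list can be handed to `subprocess.run(cmd, check=True)` on a machine with the CUDA toolkit installed. Keeping the commands as lists rather than shell strings avoids quoting problems when shapes or paths contain spaces.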
Quick Start
Use the croq-dsl-cuda skill to generate build and run scripts for your CUDA kernels and perform performance profiling with ncu.
Dependency Matrix
Required Modules: None required
Components: scripts, references
💻 Claude Code Installation
Recommended: let Claude install the Skill automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: croq-dsl-cuda
Download link: https://github.com/LancerLab/croqtile-tuner/archive/main.zip#croq-dsl-cuda
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.