croq-dsl-cuda

Official

Streamline CUDA kernel tuning with automated build and profiling.

Author: LancerLab
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill simplifies the process of tuning CUDA kernels by providing standardized build and run scripts, environment validation, and performance profiling tools.
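As an illustration of what environment validation can look like, here is a minimal sketch that checks whether the two command-line tools this page mentions or implies (`nvcc` for building, `ncu` for profiling) are on the PATH. The function name `check_cuda_tools` is hypothetical and is not taken from the skill's own scripts.

```python
import shutil


def check_cuda_tools(tools=("nvcc", "ncu")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    Hypothetical helper for illustration; the skill's real validation
    script may check more (driver version, GPU visibility, and so on).
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in check_cuda_tools().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```

A check like this is cheap to run before every tuning session, so a missing toolkit shows up as a clear message rather than as a failed build.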

Core Features & Use Cases

  • Build Automation: Generate optimized CUDA kernel binaries with custom build scripts tailored for different shapes and models.
  • Performance Profiling: Utilize NVIDIA Nsight Compute (ncu) to collect detailed performance metrics on GPU kernels.
  • Use Case: A developer optimizing a deep learning operation can automate the compilation and profiling process to identify bottlenecks and improve throughput.
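The build-then-profile loop described above can be sketched roughly as follows. This is not the skill's generated script: the file names, the `sm_80` architecture, the kernel name, and the shape macros `M`, `N`, `K` are all assumptions chosen for illustration. The functions only assemble command lines, so they can be inspected without a GPU.

```python
import shlex


def nvcc_command(source, output, arch="sm_80", defines=None):
    """Build an nvcc command line for one kernel variant.

    `defines` maps macro names to values, e.g. problem-shape constants
    (hypothetical; the real build scripts may pass shapes differently).
    """
    cmd = ["nvcc", "-O3", f"-arch={arch}", "-o", output]
    for name, value in (defines or {}).items():
        cmd.append(f"-D{name}={value}")
    cmd.append(source)
    return cmd


def ncu_command(binary, report, kernel=None):
    """Build an Nsight Compute command that collects the full metric set."""
    cmd = ["ncu", "--set", "full", "-o", report]
    if kernel:
        # Restrict profiling to kernels with this name.
        cmd += ["--kernel-name", kernel]
    cmd.append(binary)
    return cmd


if __name__ == "__main__":
    build = nvcc_command("gemm.cu", "gemm", defines={"M": 1024, "N": 1024, "K": 512})
    profile = ncu_command("./gemm", "gemm_report", kernel="gemm_kernel")
    print(shlex.join(build))
    print(shlex.join(profile))
```

To actually run the commands, each list can be passed to `subprocess.run(cmd, check=True)`. Keeping command construction separate from execution makes it easy to log or diff the exact flags used for each shape.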

Quick Start

Use the croq-dsl-cuda skill to generate build and run scripts for your CUDA kernels and perform performance profiling with ncu.

Dependency Matrix

Required Modules

None required

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: croq-dsl-cuda
Download link: https://github.com/LancerLab/croqtile-tuner/archive/main.zip#croq-dsl-cuda

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
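For a manual install, the same steps can be sketched in Python. The layout of the archive is an assumption here: the sketch looks for any path that contains a directory named after the skill and copies what is under it, since this page does not say where the skill sits inside the repository. `extract_skill` is a hypothetical helper name.

```python
import io
import zipfile
from pathlib import Path, PurePosixPath


def extract_skill(zip_bytes, skill_name, dest_root):
    """Copy files under any `skill_name` directory in the zip to dest_root/skill_name.

    Returns the number of files written. Assumes the skill lives in a
    directory named exactly `skill_name` somewhere inside the archive.
    """
    dest = Path(dest_root) / skill_name
    written = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            if info.is_dir() or skill_name not in parts:
                continue
            relative = parts[parts.index(skill_name) + 1:]
            # Skip empty paths and anything trying to escape the destination.
            if not relative or ".." in relative:
                continue
            target = dest.joinpath(*relative)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(info))
            written += 1
    return written


# Example usage (performs a network download, so it is left commented out):
# from urllib.request import urlopen
# url = "https://github.com/LancerLab/croqtile-tuner/archive/main.zip"
# data = urlopen(url).read()
# extract_skill(data, "croq-dsl-cuda", ".claude/skills")
```

If the function returns 0, the archive does not contain a directory with that name and the layout assumption needs to be checked against the repository.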
