cuda-agent-team
Community
Speed up CUDA kernel tuning with parallel agents
Software Engineering
#multi-agent #performance tuning #cuda #parallel search #kernel optimization #gpu pinning #merge protocol
Author: Romaosir
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Single-agent CUDA kernel optimization tends to progress slowly and then plateau. This skill runs multiple independent optimization directions concurrently, then coordinates safe synchronization and merges so that better kernels are found faster.
Core Features & Use Cases
- Parallel multi-agent optimization: Orchestrates two or more worker agents that each run the single-agent optimization loop in isolated working directories.
- GPU-pinned, contention-safe execution: Assigns and pins each worker to its own GPU (via CUDA_VISIBLE_DEVICES) to keep speedup measurements comparable.
- Round-based orchestration, sync, and merge: Tracks per-worker perf logs, decides when to adopt a winning baseline, and merges compatible wins via a defined merge protocol with correctness checks.
- Use Case: If your kernel tuning has flattened for multiple iterations, you can split the effort into orthogonal hypotheses (e.g., memory restructuring vs. compute restructuring) and explore them in parallel across GPUs.
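The pinning and round-level adoption described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the skill's actual implementation: `WorkerPlan`, `plan_workers`, `worker_env`, `pick_round_winner`, the `workers/` directory layout, and the 2% `min_gain` threshold are all hypothetical names and values chosen for the example. Only the `CUDA_VISIBLE_DEVICES` variable comes from the description.

```python
import os
from dataclasses import dataclass


@dataclass
class WorkerPlan:
    name: str      # hypothetical label for the optimization direction
    gpu_id: int    # GPU index this worker is pinned to
    workdir: str   # isolated working directory for this worker


def plan_workers(directions, gpu_ids, root="workers"):
    """Assign each optimization direction its own GPU and working directory."""
    if len(gpu_ids) < len(directions):
        raise ValueError("need at least one GPU per worker to avoid contention")
    return [
        WorkerPlan(name=d, gpu_id=g, workdir=os.path.join(root, d))
        for d, g in zip(directions, gpu_ids)
    ]


def worker_env(plan, base_env=None):
    """Return a copy of the environment in which only the worker's GPU is visible."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(plan.gpu_id)
    return env


def pick_round_winner(speedups, current_best, min_gain=0.02):
    """Return the worker whose speedup beats the current baseline by at least
    min_gain (relative), or None if no worker clears the bar this round.

    The threshold is an assumed policy for this sketch, not the skill's rule.
    """
    name, best = max(speedups.items(), key=lambda kv: kv[1])
    return name if best >= current_best * (1.0 + min_gain) else None
```

A worker process could then be launched with something like `subprocess.run(cmd, env=worker_env(plan), cwd=plan.workdir)`, where `cmd` stands in for whatever command runs the single-agent loop. Keeping the winner check separate from the launch logic makes it easy to add correctness checks before any baseline is adopted.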
Quick Start
Ask the system to run the CUDA agent team for parallel kernel optimization when you have at least two GPUs available and want two distinct optimization directions explored concurrently.
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: cuda-agent-team
Download link: https://github.com/Romaosir/IF_Romao_kernel_optimize/archive/main.zip#cuda-agent-team
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.