Choose Tile Size and Work Partitioning

Official

Tune tile sizes and work partitioning for kernels.

Author: tensormux
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Guides AI agents in selecting tile sizes and work partitioning for CUDA or Triton kernels, balancing shared memory use, register pressure, occupancy, and problem shape.
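To make the trade-off concrete, here is a minimal sketch of the usual first-order occupancy estimate: the number of blocks resident on one SM is capped by whichever of threads, registers, or shared memory runs out first. The function names and every default limit below are illustrative assumptions, not values taken from this skill or from any specific GPU; substitute the limits of the target device. The model also ignores register and shared memory allocation granularity, so treat its output as an upper bound.

```python
def resident_blocks_per_sm(threads_per_block, regs_per_thread, smem_per_block,
                           max_threads_per_sm=2048, regs_per_sm=65536,
                           smem_per_sm=100 * 1024, max_blocks_per_sm=32):
    """Estimate how many blocks fit on one SM (all limits are assumed placeholders)."""
    by_threads = max_threads_per_sm // threads_per_block
    by_regs = regs_per_sm // (regs_per_thread * threads_per_block)
    # A block that uses no shared memory is not limited by it.
    by_smem = smem_per_sm // smem_per_block if smem_per_block else max_blocks_per_sm
    return min(by_threads, by_regs, by_smem, max_blocks_per_sm)


def occupancy(threads_per_block, regs_per_thread, smem_per_block,
              max_threads_per_sm=2048, **limits):
    """Fraction of the SM's thread slots kept busy by resident blocks."""
    blocks = resident_blocks_per_sm(threads_per_block, regs_per_thread,
                                    smem_per_block,
                                    max_threads_per_sm=max_threads_per_sm,
                                    **limits)
    return blocks * threads_per_block / max_threads_per_sm


# 256 threads, 64 registers per thread, 32 KiB of shared memory per block:
# shared memory is the binding limit under the assumed defaults.
blocks = resident_blocks_per_sm(256, 64, 32 * 1024)
occ = occupancy(256, 64, 32 * 1024)
```

In this example the thread limit allows 8 blocks and the register limit allows 4, but shared memory allows only 3, which is why shrinking a tile can raise occupancy even when it reduces data reuse.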

Core Features & Use Cases

  • Structured decision framework for evaluating shared memory (smem) budgets, register pressure, and occupancy in tiled GEMM-like, attention, or elementwise kernels.
  • Guided configuration outputs including BLOCK_M, BLOCK_N, BLOCK_K, and grid strategy tailored to problem shapes (static or dynamic) and hardware limits.
  • Use Case: When designing a new tiled kernel with irregular shapes, this skill helps pick tile sizes that maximize occupancy and throughput while avoiding shared memory overruns.
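The shared memory check behind such a decision can be sketched as follows. This is a hypothetical illustration, not the skill's actual procedure: it assumes a GEMM that stages an A tile (BLOCK_M x BLOCK_K) and a B tile (BLOCK_K x BLOCK_N) in shared memory, a 48 KiB per-block budget passed as a parameter, and a simple made-up score that favors data reuse and penalizes padding on irregular shapes. Register pressure and occupancy are deliberately not modeled here, so a real choice would often be more conservative than this score suggests.

```python
from itertools import product
from math import ceil


def smem_bytes(block_m, block_n, block_k, dtype_bytes=2, num_stages=2):
    """Bytes of shared memory for the staged A and B tiles of one block."""
    return (block_m * block_k + block_k * block_n) * dtype_bytes * num_stages


def pick_tile(m, n, smem_budget=48 * 1024, dtype_bytes=2, num_stages=2):
    """Return a (BLOCK_M, BLOCK_N, BLOCK_K) that fits the budget, or None."""
    best, best_score = None, -1.0
    candidates = product((32, 64, 128, 256), (32, 64, 128, 256), (16, 32, 64))
    for bm, bn, bk in candidates:
        if smem_bytes(bm, bn, bk, dtype_bytes, num_stages) > smem_budget:
            continue  # would overrun shared memory
        # Share of the launched tile area that covers real output elements.
        padded = ceil(m / bm) * bm * ceil(n / bn) * bn
        utilization = (m * n) / padded
        # Outputs produced per element loaded along K (illustrative heuristic).
        reuse = bm * bn / (bm + bn)
        score = utilization * reuse
        if score > best_score:
            best, best_score = (bm, bn, bk), score
    return best


tile = pick_tile(4096, 4096)
```

For an irregular shape such as `pick_tile(100, 4096)`, the utilization term steers the choice away from tall tiles that would mostly compute padding.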

Quick Start

Provide problem dimensions and hardware constraints to receive recommended tile sizes and launch configuration.
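As an illustration of what a launch configuration amounts to for a 2D tiled kernel, the sketch below derives the grid from the problem shape and tile sizes by ceiling division. The function name and the example numbers are assumptions made for this example; the skill's actual input and output format is not specified here.

```python
def launch_grid(m, n, block_m, block_n):
    """Number of blocks along M and N, one block per output tile."""
    grid_m = (m + block_m - 1) // block_m  # ceiling division
    grid_n = (n + block_n - 1) // block_n
    return grid_m, grid_n


# A 1000 x 768 output with 128 x 64 tiles: the last row of tiles along M
# is partial, so the kernel must mask out-of-bounds rows.
grid = launch_grid(1000, 768, block_m=128, block_n=64)
```

When a dimension is not a multiple of its tile size, the edge blocks need bounds checks or masked loads and stores.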

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: Choose Tile Size and Work Partitioning
Download link: https://github.com/tensormux/kernel-skills/archive/main.zip#choose-tile-size-and-work-partitioning

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.