tilelang-api-best-practices

Official

Optimize Ascend NPU kernels with TileLang API

Author: tile-ai
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Provides concise, structured best-practice guidance for using the TileLang Ascend API to write correct, efficient, and debuggable Ascend NPU kernels, reducing common mistakes in memory placement, data movement, synchronization, and scheduling.

Core Features & Use Cases

  • API Index: Organized reference for kernel definition, memory allocation primitives, data copy semantics, compute primitives (GEMM, MMA, reductions), and tile-level vector operations.
  • Scheduling & Synchronization: Patterns and examples for T.Pipelined, T.Persistent, barrier and cross-core sync usage to enable pipelining and multi-core coordination.
  • Debugging & Performance: Guidance on device-side printf, dump tensor, msProf profiling, and pass_config tuning for memory planning and auto-sync.
  • Use Case: Implement a high-performance GEMM or attention kernel by following the kernel-memory, compute, and schedule-sync references: allocate shared and fragment buffers, prefetch with pipelining, apply T.gemm_v0 or T.mma, then validate with device dumps and profiling.
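
The prefetch-with-pipelining idea mentioned above can be pictured with a small host-side model. This is only a conceptual sketch in plain Python: it does not call T.Pipelined or any other TileLang API, and the function name pipelined_schedule and its num_stages parameter are hypothetical names chosen for illustration. It lists the order in which tile loads and tile computes would be issued when loads run a fixed number of tiles ahead of compute.

```python
def pipelined_schedule(num_tiles, num_stages=2):
    """Model the issue order of a software pipeline over `num_tiles` tiles.

    Hypothetical helper for illustration only; not part of TileLang.
    Loads run (num_stages - 1) tiles ahead of compute.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")

    events = []
    lookahead = min(num_stages - 1, num_tiles)

    # Prologue: prefetch the first tiles before any compute starts.
    for k in range(lookahead):
        events.append(("load", k))

    # Steady state and epilogue: issue the next load, then compute tile k.
    # On real hardware the two would overlap; here they are simply listed.
    for k in range(num_tiles):
        nxt = k + lookahead
        if nxt < num_tiles:
            events.append(("load", nxt))
        events.append(("compute", k))

    return events


if __name__ == "__main__":
    for event in pipelined_schedule(num_tiles=3, num_stages=2):
        print(event)
```

In this model every compute step is preceded by the load of its own tile, which is the ordering that barriers and synchronization primitives are meant to guarantee on the device.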

Quick Start

Implement a GEMM kernel by defining a prim_func, allocating shared and fragment buffers, using copy to move blocks into L1/L0, invoking gemm operations with correct init semantics, and applying pipelined or persistent scheduling for performance.
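
As a way to reason about those steps before writing device code, the following is a minimal reference model in plain Python, not TileLang code. It assumes non-empty row-major list-of-lists matrices; the function name tiled_gemm and the block_M, block_N, block_K parameters are hypothetical. It mirrors the structure described above: slice tiles out of the inputs (standing in for copy), accumulate into a reused accumulator buffer (standing in for a fragment buffer), and reset that accumulator on the first K step (standing in for the gemm init semantics).

```python
def tiled_gemm(A, B, block_M=2, block_N=2, block_K=2):
    """Pure-Python model of a block-tiled GEMM, C = A @ B.

    Hypothetical helper for illustration only; assumes M, N, K >= 1.
    """
    M, K = len(A), len(A[0])
    N = len(B[0])
    C = [[0.0] * N for _ in range(M)]

    # One accumulator buffer reused across output tiles, so it holds
    # stale values unless it is re-initialized for each tile.
    acc = [[0.0] * block_N for _ in range(block_M)]

    for m0 in range(0, M, block_M):
        for n0 in range(0, N, block_N):
            rows = min(block_M, M - m0)
            cols = min(block_N, N - n0)
            for k0 in range(0, K, block_K):
                depth = min(block_K, K - k0)
                # "copy": stage the current tiles of A and B.
                a_tile = [row[k0:k0 + depth] for row in A[m0:m0 + rows]]
                b_tile = [row[n0:n0 + cols] for row in B[k0:k0 + depth]]
                # Init semantics: clear the accumulator on the first K step only.
                if k0 == 0:
                    for i in range(rows):
                        for j in range(cols):
                            acc[i][j] = 0.0
                # Tile-level multiply-accumulate.
                for i in range(rows):
                    for j in range(cols):
                        for k in range(depth):
                            acc[i][j] += a_tile[i][k] * b_tile[k][j]
            # Write the finished tile back to the output matrix.
            for i in range(rows):
                C[m0 + i][n0:n0 + cols] = acc[i][:cols]
    return C


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    B = [[1, 0], [0, 1], [2, 2]]
    print(tiled_gemm(A, B))
```

Dropping the reset on the first K step makes each output tile inherit the previous tile's sums, which is the kind of init mistake a model like this helps catch when compared against device dumps.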

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: tilelang-api-best-practices
Download link: https://github.com/tile-ai/tilelang-ascend/archive/main.zip#tilelang-api-best-practices

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
