triton-ascend-reduce
Community
Optimize multi-axis reductions on Ascend Triton.
Author: xchang1121
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Reduction operators, and composite operators that contain reductions, are often slow when written naively. This skill covers optimizing them in Triton for better performance on Ascend, including multi-axis reductions and normalization scenarios.
Core Features & Use Cases
- Supports non-final-dimension reductions with correct multi-dimensional indexing to avoid costly reshapes.
- Describes two-stage reduction workflows for complex operators like normalization (layernorm, rmsnorm, groupnorm, batchnorm) and statistical computations (variance, std).
- Provides practical guidance for implementing efficient Triton kernels on Ascend hardware, including axis-aware tiling and atomic reductions.
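The indexing idea behind a non-final-dimension reduction can be sketched in plain Python. This is an illustration only, not Triton or Ascend API code: the function name `reduce_middle_axis`, the `(M, K, N)` layout, and the `block_n` tile size are all assumptions made for the example. It sums over the middle axis of a row-major buffer by computing offsets from strides, so no reshape or transpose is needed.

```python
# Plain-Python emulation of a middle-axis sum over a row-major (M, K, N) buffer.
# All names here are hypothetical; a real kernel would express the same offset
# arithmetic with per-program tiles instead of Python loops.

def reduce_middle_axis(flat, M, K, N, block_n=4):
    """Sum over axis 1 of an (M, K, N) tensor stored contiguously in `flat`."""
    stride_m, stride_k, stride_n = K * N, N, 1
    out = [0.0] * (M * N)
    for m in range(M):
        # Tile the kept (last) axis; each tile plays the role of one program.
        for n0 in range(0, N, block_n):
            n_idx = list(range(n0, min(n0 + block_n, N)))  # bounds handling
            acc = [0.0] * len(n_idx)
            # Walk the reduced axis by stepping stride_k through the buffer.
            for k in range(K):
                base = m * stride_m + k * stride_k
                for i, n in enumerate(n_idx):
                    acc[i] += flat[base + n * stride_n]
            for i, n in enumerate(n_idx):
                out[m * N + n] = acc[i]
    return out
```

The inner reads along `n` are contiguous (`stride_n` is 1), which is the property an axis-aware tiling would try to preserve when the reduced axis is not the last one.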
Quick Start
Start by identifying a reduction over a non-final axis in your operator, then implement it as a two-stage Triton kernel on Ascend and benchmark it against the existing version to validate the performance gain.
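As a rough reference for checking such a kernel, here is one possible reading of a two-stage reduction, written in plain Python rather than Triton. It assumes stage one produces per-tile partial sums and stage two combines them into a mean and variance; the helper names (`partial_stats`, `combine_stats`, `layernorm_row`), the tile size, and `eps` are hypothetical and not taken from the skill itself.

```python
import math

# Stage 1: each tile yields (sum, sum of squares, element count).
def partial_stats(row, tile):
    partials = []
    for start in range(0, len(row), tile):
        chunk = row[start:start + tile]
        partials.append((sum(chunk), sum(v * v for v in chunk), len(chunk)))
    return partials

# Stage 2: combine the per-tile partials into a mean and a (biased) variance.
def combine_stats(partials):
    total = sum(p[0] for p in partials)
    total_sq = sum(p[1] for p in partials)
    count = sum(p[2] for p in partials)
    mean = total / count
    # E[x^2] - E[x]^2 can go slightly negative from rounding, so clamp at 0.
    var = max(total_sq / count - mean * mean, 0.0)
    return mean, var

# Use the combined statistics to normalize one row (no scale or bias).
def layernorm_row(row, tile=4, eps=1e-5):
    mean, var = combine_stats(partial_stats(row, tile))
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in row]
```

The sum-of-squares form keeps stage one to a single pass over the data, but it is less numerically stable than computing the mean first and then the squared deviations, so it is worth comparing both against a trusted reference when validating results.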
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: triton-ascend-reduce Download link: https://github.com/xchang1121/AutoResearch-CC-hook/archive/main.zip#triton-ascend-reduce Please download this .zip file, extract it, and install it in the .claude/skills/ directory.