Name: triton-ascend-attention
Author: xchang1121

System Documentation

What problem does it solve?

Optimizes Transformer-style attention workloads on Ascend by applying Triton-based kernel optimizations, including QKV tiling, online softmax, and masking strategies.

Core Features & Use Cases

QKV tiling and block-wise computation to reduce memory footprint and latency in attention operations.
Online softmax and masking techniques for efficient causal and masked attention on large sequences.
Flash Attention tiling strategies to boost throughput for multi-head attention on Ascend hardware.
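
The block-wise computation and online softmax listed above can be illustrated with a small reference sketch. This is not the Skill's Triton kernel and does not use any Ascend API; it is plain Python written only to show the arithmetic. The function names (reference_attention, tiled_attention) and the block parameter are hypothetical, and the causal path assumes self-attention with equally many queries and keys.

```python
import math

def reference_attention(q, k, v, causal=False):
    """Plain attention: softmax(q k^T / sqrt(d)) v, one query row at a time."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        n_keys = i + 1 if causal else len(k)  # causal: only keys 0..i are visible
        scores = [scale * sum(a * b for a, b in zip(qi, k[j])) for j in range(n_keys)]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j][c] for j, w in enumerate(weights)) / total
                    for c in range(len(v[0]))])
    return out

def tiled_attention(q, k, v, block=2, causal=False):
    """Same result, but keys/values are visited one block at a time while a
    running max and running sum are kept per query (online softmax)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    dv = len(v[0])
    out = []
    for i, qi in enumerate(q):
        run_max = -math.inf   # largest score seen so far
        run_sum = 0.0         # sum of exp(score - run_max) seen so far
        acc = [0.0] * dv      # unnormalised weighted sum of value rows
        for start in range(0, len(k), block):
            if causal and start > i:
                break         # the whole block lies in the future, skip it
            stop = min(start + block, len(k))
            if causal:
                stop = min(stop, i + 1)  # mask the future part of this block
            scores = [scale * sum(a * b for a, b in zip(qi, k[j]))
                      for j in range(start, stop)]
            new_max = max(run_max, max(scores))
            # Rescale earlier partial results to the new max (0.0 on the first block).
            correction = math.exp(run_max - new_max)
            run_sum *= correction
            acc = [a * correction for a in acc]
            for j, s in zip(range(start, stop), scores):
                w = math.exp(s - new_max)
                run_sum += w
                for c in range(dv):
                    acc[c] += w * v[j][c]
            run_max = new_max
        out.append([a / run_sum for a in acc])
    return out

# Toy inputs: 4 tokens, head dimension 2.
q = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.4, 0.2]]
k = [[0.2, 0.1], [-0.3, 0.4], [0.5, 0.5], [0.1, -0.2]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 0.5]]
tiled = tiled_attention(q, k, v, block=2, causal=True)
plain = reference_attention(q, k, v, causal=True)
```

The tiled version never holds a full row of scores: each block only updates the running maximum, the running sum, and the accumulator. That property is what lets a tiled kernel keep its working set in fast on-chip memory.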

Quick Start

Run the Triton-Ascend optimized attention workflow on Ascend hardware for Transformer-based models.

Please help me install this Skill:
Name: triton-ascend-attention
Download link: https://github.com/xchang1121/AutoResearch-CC-hook/archive/main.zip#triton-ascend-attention
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
