Fast Attention Path (PyTorch SDPA + optional FlashAttention-2)
Community: Fast, configurable SDPA-based attention routing
Author: sovr610
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Routes PyTorch scaled dot-product attention (SDPA) to a chosen set of backends (Flash, Efficient, cuDNN, Math) instead of leaving the choice entirely to PyTorch's default dispatch.
Core Features & Use Cases
- Route attention through multiple backends via a simple, composable BackendConfig.
- Inspect and verify backend capabilities with runtime probes to guide backend selection.
- Apply in transformer workloads, with optional FlashAttention-2 integration, to maximize throughput on CUDA GPUs.
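The runtime probes mentioned above could look roughly like the following. This is a minimal sketch, not the skill's own probe code: `probe_backends` and `_enabled` are hypothetical names, and the sketch assumes that checking PyTorch's global SDPA enable flags plus the presence of the `flash_attn` package is a good enough first signal.

```python
import importlib.util


def _enabled(module, name):
    # Call a zero-argument flag getter if this PyTorch build has it.
    fn = getattr(module, name, None)
    return bool(fn()) if callable(fn) else False


def probe_backends():
    """Hypothetical probe: report which SDPA backends look usable here."""
    report = {
        "torch": False, "cuda": False, "flash_attn_pkg": False,
        "flash": False, "efficient": False, "cudnn": False, "math": False,
    }
    # Optional FlashAttention-2 package (importable as `flash_attn`).
    report["flash_attn_pkg"] = importlib.util.find_spec("flash_attn") is not None
    if importlib.util.find_spec("torch") is None:
        return report  # nothing else can be probed without PyTorch

    import torch

    report["torch"] = True
    report["cuda"] = bool(torch.cuda.is_available())
    flags = torch.backends.cuda
    # The fused kernels are only considered when a CUDA device is visible.
    report["flash"] = report["cuda"] and _enabled(flags, "flash_sdp_enabled")
    report["efficient"] = report["cuda"] and _enabled(flags, "mem_efficient_sdp_enabled")
    report["cudnn"] = report["cuda"] and _enabled(flags, "cudnn_sdp_enabled")
    report["math"] = _enabled(flags, "math_sdp_enabled")
    return report


if __name__ == "__main__":
    for name, ok in probe_backends().items():
        print(f"{name:>15}: {ok}")
```

These flags only say whether a backend is enabled globally. A particular kernel can still reject a given dtype, head dimension, or mask, so a probe like this guides selection rather than guaranteeing it.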
Quick Start
Instantiate a BackendConfig with your preferred policy and pass it to sdpa_attention to route attention through the chosen backend.
Dependency Matrix
Required Modules
torch, pytest
Components
scripts, references, assets
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: Fast Attention Path (PyTorch SDPA + optional FlashAttention-2)
Download link: https://github.com/sovr610/refffiy/archive/main.zip#fast-attention-path-pytorch-sdpa-optional-flashattention-2
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.