Name: xformers
Author: jstzwj

System Documentation

What problem does it solve?

This Skill provides optimized building blocks for Transformer models, reducing development time and improving performance across training and inference tasks.

Core Features & Use Cases

Memory-Efficient Attention: Enables fast, exact attention computations suitable for large-scale models.
Structured Sparse Operations: Implements 2:4 sparsity, supporting faster training and inference with reduced memory footprint.
Research and Deployment: Supplies custom CUDA and Triton kernels, along with model-parallel layers, for cutting-edge Transformer research, including heterogeneous batching and inference acceleration.
Example Scenario: Use this Skill to replace standard attention with a memory-efficient version in a language model, reducing GPU memory usage and speeding up training.
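
To make the 2:4 sparsity pattern listed above concrete: in every contiguous group of four values, at most two are non-zero. The sketch below is a plain-Python illustration of that pattern only. It does not use xformers' kernels, the helper name prune_2_4 is hypothetical, and keeping the two largest-magnitude values per group is an assumed (though common) selection rule.

```python
def prune_2_4(row):
    """Zero the two smallest-magnitude values in each group of four.

    Illustration of the 2:4 pattern only; not the xformers implementation.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Indices of the two largest-magnitude entries in this group
        # (assumed selection rule).
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


weights = [0.9, -0.1, 0.05, -1.2, 0.3, 0.2, -0.7, 0.0]
print(prune_2_4(weights))
# → [0.9, 0.0, 0.0, -1.2, 0.3, 0.0, -0.7, 0.0]
```

Because exactly half of each group is zero at known positions, hardware with structured-sparsity support can skip those entries, which is where the speed and memory savings come from.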

Quick Start

Use the xformers skill to replace the standard attention with the memory-efficient attention function in your transformer code.
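
As a rough idea of what that replacement can look like, here is a minimal sketch that calls xformers.ops.memory_efficient_attention when xformers and a CUDA device are available, and otherwise falls back to PyTorch's built-in scaled_dot_product_attention. The wrapper name attention, the tensor sizes, and the fallback strategy are assumptions made for illustration; they are not part of the Skill itself.

```python
import torch
import torch.nn.functional as F

try:
    import xformers.ops as xops  # optional dependency
except ImportError:
    xops = None


def attention(q, k, v, causal=False):
    """Attention over tensors shaped [batch, seq_len, heads, head_dim].

    Hypothetical wrapper: uses xformers when possible, PyTorch otherwise.
    """
    if xops is not None and q.is_cuda:
        bias = xops.LowerTriangularMask() if causal else None
        return xops.memory_efficient_attention(q, k, v, attn_bias=bias)
    # Fallback: PyTorch expects [batch, heads, seq_len, head_dim].
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2),
        is_causal=causal,
    )
    return out.transpose(1, 2)


device = "cuda" if torch.cuda.is_available() else "cpu"
# Example sizes (assumed): batch=2, seq_len=16, heads=4, head_dim=32.
q, k, v = (torch.randn(2, 16, 4, 32, device=device) for _ in range(3))
out = attention(q, k, v, causal=True)
print(out.shape)
```

Both paths compute exact attention, so the output keeps the input layout and existing model code around the call should not need to change.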

Please help me install this Skill:
Name: xformers
Download link: https://github.com/jstzwj/ai-infra-plugins/archive/main.zip#xformers
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
