Name: triton-cuda-elementwise
Author: xchang1121

System Documentation

What problem does it solve?

This skill generates Triton kernels for element-wise computations on CUDA GPUs. It targets vectorized operations such as activation functions and broadcasted arithmetic.
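To make the idea of a vectorized element-wise kernel concrete, here is a minimal pure-Python emulation of the blocked, masked indexing pattern that Triton element-wise kernels commonly follow (one program per block, a range of lane offsets, and a bounds mask). It is a CPU sketch for illustration only, not the code this skill generates; the names `cdiv`, `relu_program`, `relu`, and the value of `BLOCK_SIZE` are assumptions chosen for the example.

```python
BLOCK_SIZE = 4  # illustrative only; real GPU kernels usually use a larger power of two


def cdiv(a, b):
    """Ceiling division: number of blocks needed to cover `a` elements."""
    return (a + b - 1) // b


def relu_program(pid, x, out, n_elements, block_size=BLOCK_SIZE):
    """Emulate one program instance, which handles one contiguous block."""
    start = pid * block_size
    for lane in range(block_size):
        offset = start + lane
        # Bounds mask: lanes past the end of the tensor do nothing.
        if offset < n_elements:
            out[offset] = max(x[offset], 0.0)


def relu(x):
    """Launch one emulated program per block, like a 1D kernel grid."""
    n = len(x)
    out = [0.0] * n
    for pid in range(cdiv(n, BLOCK_SIZE)):
        relu_program(pid, x, out, n)
    return out


result = relu([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
```

In a real Triton kernel the inner loop over lanes disappears: the whole block of offsets is processed as one vector, and the mask is passed to the load and store operations instead of being checked lane by lane.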

Core Features & Use Cases

Triton-based kernel generation for elementwise operations (add, mul, relu, sigmoid, tanh, gelu, exp, log, div, sub, sqrt, pow).
Memory layout optimizations to ensure coalesced, contiguous access and minimized stride overhead.
Use Cases: accelerates activation functions and element-wise math across large tensors in CUDA workflows, including broadcasting scenarios.
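The broadcasting and stride points above can be illustrated with a short pure-Python sketch. It assumes row-major (contiguous) inputs stored as flat lists and gives every broadcast dimension a stride of 0, so the same input element is reused without copying. The helpers `broadcast_strides`, `input_offset`, and `add_broadcast` are hypothetical names for this example and are not part of the skill.

```python
def broadcast_strides(shape, out_shape):
    """Element strides of a row-major `shape` viewed as `out_shape`.

    Dimensions of size 1 (or missing leading dimensions) get stride 0.
    """
    padded = (1,) * (len(out_shape) - len(shape)) + tuple(shape)
    strides = []
    step = 1
    for size, out_size in zip(reversed(padded), reversed(out_shape)):
        if size == 1:
            strides.append(0)       # broadcast dimension: always reuse element 0
        elif size == out_size:
            strides.append(step)    # ordinary contiguous dimension
        else:
            raise ValueError("shapes are not broadcast-compatible")
        step *= size
    return tuple(reversed(strides))


def input_offset(linear, out_shape, strides):
    """Map a linear output index to an offset into the flat input."""
    offset = 0
    for size, stride in zip(reversed(out_shape), reversed(strides)):
        offset += (linear % size) * stride
        linear //= size
    return offset


def add_broadcast(a, a_shape, b, b_shape, out_shape):
    """Element-wise add of two flat, row-major inputs with broadcasting."""
    a_strides = broadcast_strides(a_shape, out_shape)
    b_strides = broadcast_strides(b_shape, out_shape)
    n = 1
    for size in out_shape:
        n *= size
    return [
        a[input_offset(i, out_shape, a_strides)]
        + b[input_offset(i, out_shape, b_strides)]
        for i in range(n)
    ]


a = [1, 2, 3, 4, 5, 6]  # shape (2, 3)
row_sum = add_broadcast(a, (2, 3), [10, 20, 30], (3,), (2, 3))
col_sum = add_broadcast(a, (2, 3), [100, 200], (2, 1), (2, 3))
```

When neither operand is broadcast, both sets of strides reduce to plain contiguous strides, which is the case where adjacent lanes read adjacent addresses and memory access is coalesced.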

Quick Start

Provide an input tensor and call the elementwise kernel generator to produce a fast Triton-based CUDA kernel.

Please help me install this Skill:
Name: triton-cuda-elementwise
Download link: https://github.com/xchang1121/AutoResearch-CC-hook/archive/main.zip#triton-cuda-elementwise
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.