gemm-perf

Community

Assess GPU matrix performance with quick, accurate tests.

Author: dongg622
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill lets hardware developers and field application engineers (FAEs) measure the matrix computation performance of GPU accelerators efficiently.

Core Features & Use Cases

  • Hardware Performance Evaluation: Conducts detailed tests of GEMM operations across multiple data types, such as FP32, FP16, BF16, and INT8.
  • Benchmarking & Comparison: Reports throughput in TFLOPS (floating-point types) or TOPS (integer types such as INT8), allowing performance comparison before deployment or during hardware validation.
  • Use Case: An FAE tests the matrix computation capacity of a new AI accelerator card to confirm it meets specifications before deployment in AI workflows.
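
As a rough illustration of how a TFLOPS or TOPS figure is usually derived from a timed GEMM run, here is a minimal Python sketch. It is not taken from the gemm_perf scripts: the function names and the sample timing are hypothetical, and it assumes the common convention of counting 2·M·N·K operations for C = A × B (one multiply and one add per inner-product term).

```python
def gemm_op_count(m: int, n: int, k: int) -> int:
    """Operations for C[m, n] = A[m, k] @ B[k, n] under the 2*M*N*K convention."""
    return 2 * m * n * k


def tera_ops_per_second(m: int, n: int, k: int, seconds: float, iterations: int = 1) -> float:
    """Convert a measured wall-clock time into tera-operations per second.

    The result reads as TFLOPS for FP32/FP16/BF16 and as TOPS for INT8.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return gemm_op_count(m, n, k) * iterations / seconds / 1e12


# Hypothetical measurement: a 4096 x 4096 x 4096 GEMM repeated 100 times in 0.5 s.
rate = tera_ops_per_second(4096, 4096, 4096, seconds=0.5, iterations=100)
print(f"{rate:.2f} tera-ops/s")
```

Comparing the computed figure against the card's datasheet peak for the same data type gives a quick utilization estimate; the actual tool may count operations or average timings differently.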

Quick Start

Run the gemm_perf tool directly on your system with the default parameters to measure your device's matrix computation speed.

Dependency Matrix

Required Modules

None required

Components

scripts

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: gemm-perf
Download link: https://github.com/dongg622/china-ai-chip-skill/archive/main.zip#gemm-perf

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.