gemm-perf
Community
Assess GPU matrix performance with quick, accurate tests.
Author: dongg622
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill enables hardware developers and field application engineers (FAEs) to measure the matrix computation performance of GPU accelerators efficiently.
Core Features & Use Cases
- Hardware Performance Evaluation: Conducts detailed tests of GEMM operations across multiple data types, such as FP32, FP16, BF16, and INT8.
- Benchmarking & Comparison: Outputs precise TFLOPS (floating-point types) or TOPS (integer types such as INT8) metrics, allowing performance comparison before deployment or during hardware validation.
- Use Case: An FAE tests the matrix computation capacity of a new AI accelerator card to ensure it meets specifications for deployment in AI workflows.
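As a rough illustration of how a GEMM timing turns into a TFLOPS or TOPS figure, the sketch below uses the standard operation count for a matrix multiply, 2 x M x N x K (one multiply and one add per inner-product term). This is not the gemm-perf implementation, which is not shown on this page; the function names `gemm_ops`, `throughput_tera`, and `time_best_of` are hypothetical and exist only for this example.

```python
import time


def gemm_ops(m, n, k):
    """Operation count for C[m x n] = A[m x k] @ B[k x n].

    Each of the m * n outputs needs k multiplies and k adds,
    so the conventional count is 2 * m * n * k.
    """
    return 2 * m * n * k


def throughput_tera(m, n, k, seconds):
    """Tera-operations per second for one GEMM of the given shape.

    Read the result as TFLOPS for FP32/FP16/BF16 and as TOPS for INT8.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return gemm_ops(m, n, k) / seconds / 1e12


def time_best_of(fn, repeats=5, warmup=1):
    """Return the shortest wall-clock time of `fn` over `repeats` runs.

    Hypothetical helper: warm-up runs are discarded so one-time setup
    costs do not skew the result. On a real GPU, `fn` would also have
    to wait for the device to finish before the timer stops.
    """
    for _ in range(warmup):
        fn()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best
```

For example, a 4096 x 4096 x 4096 GEMM that completes in 10 ms works out to `throughput_tera(4096, 4096, 4096, 0.01)`, roughly 13.7 tera-operations per second. Taking the best of several runs is one common choice; averaging is an equally reasonable alternative when sustained rather than peak throughput is of interest.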
Quick Start
Run the gemm_perf tool directly on your system with the default parameters to measure your device's matrix computation speed.
Dependency Matrix
Required Modules
None required
Components
scripts
💻 Claude Code Installation
Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: gemm-perf
Download link: https://github.com/dongg622/china-ai-chip-skill/archive/main.zip#gemm-perf
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.