igie
Community
High-performance GPU inference framework supporting multiple models.
Author: dongg622
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill streamlines deploying and optimizing AI inference workloads on Iluvatar GPUs, covering model conversion, quantization, and deployment.
Core Features & Use Cases
- Model Import and Conversion: Supports importing models from ONNX, PyTorch, TensorFlow, and others, simplifying the deployment pipeline.
- Quantization and Optimization: Supports INT8 quantization and FP16 reduced-precision inference for faster execution and lower memory usage, applicable to vision and NLP models.
- Dynamic and Static Deployment: Implements static shape compilation and dynamic shape support for flexible inference scenarios.
- Use Case: Deploy a ResNet model in FP16 precision on Iluvatar GPU, achieving accelerated inference for real-time applications.
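To make the memory and accuracy trade-off behind INT8 quantization concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a generic textbook scheme shown for illustration only; the source does not say which calibration or quantization method IGIE uses, and the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Illustrative only; not IGIE's actual quantizer.
    """
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The rounding error per element is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each value is stored in 8 bits instead of 32, which is where the memory saving comes from; the cost is a rounding error of at most `scale / 2` per element in this scheme.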
Quick Start
Load an ONNX model, import it with relay.frontend, build for Iluvatar hardware, and run inference using TVM's graph executor.
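The steps above can be sketched with standard TVM Relay calls. This is a minimal sketch under stated assumptions, not the Skill's own script: the helper names `make_shape_dict` and `run_onnx_model`, the file name `resnet50.onnx`, and the input name `input` are hypothetical, and the Iluvatar target and device objects are left as parameters because the source does not name them.

```python
def make_shape_dict(input_name, input_shape):
    """Map the model's input tensor name to a fixed (static) shape."""
    return {input_name: tuple(int(d) for d in input_shape)}


def run_onnx_model(model_path, input_name, input_shape, target, device):
    """Import an ONNX model with Relay, build it, and run one inference.

    `target` and `device` are supplied by the caller. On an IGIE build of
    TVM these would be the Iluvatar target and device (names not given in
    the source); on upstream TVM, "llvm" and tvm.cpu(0) work as a stand-in.
    """
    # Third-party imports stay local so the helper above is usable without TVM.
    import numpy as np
    import onnx
    import tvm
    from tvm import relay
    from tvm.contrib import graph_executor

    # 1. Load the ONNX model and import it into Relay with a static shape.
    onnx_model = onnx.load(model_path)
    shape_dict = make_shape_dict(input_name, input_shape)
    mod, params = relay.frontend.from_onnx(onnx_model, shape=shape_dict)

    # 2. Build for the chosen hardware target.
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)

    # 3. Run inference with TVM's graph executor on random input data.
    module = graph_executor.GraphModule(lib["default"](device))
    data = np.random.uniform(size=shape_dict[input_name]).astype("float32")
    module.set_input(input_name, data)
    module.run()
    return module.get_output(0).numpy()


# Example call (hypothetical file and input names, CPU stand-in target):
#   import tvm
#   out = run_onnx_model("resnet50.onnx", "input", (1, 3, 224, 224),
#                        tvm.target.Target("llvm"), tvm.cpu(0))
```

Passing the target and device in as arguments keeps the import and build steps identical across hardware, so only the call site changes when moving from a CPU test run to an Iluvatar GPU.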
Dependency Matrix
Required Modules
tvm, onnx, relay
Components
scripts, references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: igie
Download link: https://github.com/dongg622/china-ai-chip-skill/archive/main.zip#igie
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.