igie

Community

High-performance inference framework for Iluvatar GPUs, with support for models from multiple source frameworks.

Author: dongg622
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill streamlines the deployment and optimization of AI inference workloads on Iluvatar GPUs, enabling efficient model conversion, quantization, and deployment.

Core Features & Use Cases

  • Model Import and Conversion: Supports importing models from ONNX, PyTorch, TensorFlow, and others, simplifying the deployment pipeline.
  • Quantization and Optimization: Facilitates INT8 and FP16 quantization for faster inference and reduced memory usage, applicable to vision and NLP models.
  • Dynamic and Static Deployment: Implements static shape compilation and dynamic shape support for flexible inference scenarios.
  • Use Case: Deploy a ResNet model in FP16 precision on an Iluvatar GPU to accelerate inference for real-time applications.
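
To make the INT8 quantization feature more concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It only illustrates the general arithmetic (a single scale derived from the largest absolute value) and is not taken from igie itself; the function names `quantize_int8` and `dequantize_int8` are hypothetical, and the calibration scheme igie actually uses may differ.

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization (illustrative only).

    Returns the quantized integers and the scale needed to map them back.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric range [-127, 127].
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Map INT8 values back to approximate floating-point values."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [-0.8, 0.1, 0.33, 0.5]
    q, scale = quantize_int8(weights)
    print("quantized:", q)
    print("restored: ", dequantize_int8(q, scale))
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`). That rounding error is the accuracy cost paid for the smaller memory footprint.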

Quick Start

Load an ONNX model, import it with relay.frontend, build for Iluvatar hardware, and run inference using TVM's graph executor.
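
The steps above can be sketched with the upstream TVM Python API. This is a hedged outline rather than igie's own script. The helper names `make_shape_dict` and `run_onnx` are hypothetical, and the `target` and `device_name` defaults (`"llvm"` and `"cpu"`) are stand-ins from stock TVM; on an Iluvatar system, replace them with the target and device names given in the igie documentation.

```python
def make_shape_dict(input_name, input_shape):
    """Static input shapes keyed by input name, as relay.frontend.from_onnx expects."""
    return {input_name: tuple(int(d) for d in input_shape)}


def run_onnx(model_path, input_name, input_data, target="llvm", device_name="cpu"):
    """Import an ONNX model, compile it, and run one inference.

    `input_data` is a NumPy array. `target` and `device_name` are assumptions:
    the defaults are stock TVM values, not Iluvatar-specific ones.
    """
    # Imported lazily so make_shape_dict stays usable without TVM installed.
    import onnx
    import tvm
    from tvm import relay
    from tvm.contrib import graph_executor

    # 1. Load the ONNX model and import it into Relay with a static shape.
    onnx_model = onnx.load(model_path)
    shape_dict = make_shape_dict(input_name, input_data.shape)
    mod, params = relay.frontend.from_onnx(onnx_model, shape=shape_dict)

    # 2. Build for the chosen target.
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)

    # 3. Run inference with the graph executor.
    dev = tvm.device(device_name, 0)
    module = graph_executor.GraphModule(lib["default"](dev))
    module.set_input(input_name, input_data)
    module.run()
    return module.get_output(0).numpy()
```

A call might look like `run_onnx("resnet50.onnx", "input", data)`, where the file name and the input name `"input"` are placeholders that depend on the exported model.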

Dependency Matrix

Required Modules

tvm, onnx, relay

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: igie
Download link: https://github.com/dongg622/china-ai-chip-skill/archive/main.zip#igie

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.