vllm-model-selection
Community
Smart model selection for vLLM deployments.
Software Engineering #deployment #quantization #model-selection #vllm #sglang #inference-engine #gpu-constraints
Author: HugoAlmeidaMoreira
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Evaluates and selects the best-fitting LLM for a given hardware setup and deployment target (vLLM or SGLang), weighing constraints such as VRAM, quantization, and architecture compatibility to speed up go/no-go decisions.
Core Features & Use Cases
- Gather hardware constraints (GPU count, VRAM, interconnect) and map them to viable model options.
- Query model metadata from Hugging Face to extract hidden_size, layers, architectures, and quantization options.
- Compare on-disk size, estimated VRAM, and compatibility across vLLM and SGLang, then produce recommendations and deployment-ready guidance.
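As a rough illustration of the metadata step, the sketch below pulls the fields named above out of a model's `config.json` once it has been loaded into a dict. The helper name `summarize_config` and the sample values are hypothetical; the keys `hidden_size`, `num_hidden_layers`, `architectures`, and `quantization_config` follow the layout commonly seen in Hugging Face configs, but not every model exposes all of them, so missing keys are tolerated.

```python
import json


def summarize_config(config: dict) -> dict:
    """Extract the fields relevant to model selection from a config dict.

    Missing keys yield None (or an empty list) instead of raising,
    because config layouts differ between model families.
    """
    quant = config.get("quantization_config") or {}
    return {
        "architectures": config.get("architectures", []),
        "hidden_size": config.get("hidden_size"),
        "layers": config.get("num_hidden_layers"),
        "quant_method": quant.get("quant_method"),
    }


# Hypothetical config.json content, used only for illustration.
sample_config = json.loads("""
{
  "architectures": ["LlamaForCausalLM"],
  "hidden_size": 4096,
  "num_hidden_layers": 32,
  "quantization_config": {"quant_method": "awq", "bits": 4}
}
""")

summary = summarize_config(sample_config)
print(summary)
```

In practice the dict would come from the model repository's `config.json` rather than an inline string; the parsing logic stays the same.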
Quick Start
Provide your GPU VRAM, number of GPUs, and target deployment (vLLM or SGLang) to receive a recommended model.
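To make the inputs concrete, here is a minimal sketch of how a recommendation could be derived from them. It is not the skill's actual logic: the `Candidate` class, the `recommend` function, the candidate names, the bytes-per-parameter table, and the 0.8 headroom factor (space left for KV cache and activations) are all assumptions chosen for illustration. Only weight memory is estimated, and the deployment target is ignored here because engine compatibility would need a separate check.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed approximate storage cost per parameter for each weight format.
BYTES_PER_PARAM = {"fp16": 2.0, "bf16": 2.0, "fp8": 1.0, "int8": 1.0, "int4": 0.5}


@dataclass
class Candidate:
    name: str              # hypothetical model identifier
    params_billion: float  # parameter count in billions
    quant: str             # key into BYTES_PER_PARAM


def weight_gib(c: Candidate) -> float:
    """Estimate the memory needed for the weights alone, in GiB."""
    return c.params_billion * 1e9 * BYTES_PER_PARAM[c.quant] / 2**30


def recommend(candidates, vram_gib_per_gpu: float, num_gpus: int,
              headroom: float = 0.8) -> Optional[Candidate]:
    """Return the largest candidate whose weights fit in the VRAM budget.

    The budget is total VRAM scaled by `headroom`; returns None if nothing fits.
    """
    budget = vram_gib_per_gpu * num_gpus * headroom
    viable = [c for c in candidates if weight_gib(c) <= budget]
    return max(viable, key=lambda c: c.params_billion, default=None)


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("example-7b-fp16", 7, "fp16"),
    Candidate("example-70b-int4", 70, "int4"),
    Candidate("example-70b-fp16", 70, "fp16"),
]

pick_two_gpus = recommend(candidates, vram_gib_per_gpu=24, num_gpus=2)
pick_one_gpu = recommend(candidates, vram_gib_per_gpu=24, num_gpus=1)
pick_small_gpu = recommend(candidates, vram_gib_per_gpu=8, num_gpus=1)
```

Under these assumptions, two 24 GiB GPUs leave room for the 4-bit 70B candidate, a single 24 GiB GPU falls back to the 7B fp16 candidate, and a single 8 GiB GPU yields no recommendation.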
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: vllm-model-selection
Download link: https://github.com/HugoAlmeidaMoreira/zeus-agent/archive/main.zip#vllm-model-selection
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.