gem-llm-deploy-vllm
Community
Safely manage GEM-LLM vLLM servers
Software Engineering
#deployment automation #health check #gpu memory #vllm #model serving #port conflict #openai-compatible endpoints
Author: saintgo7
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill resolves operational friction when starting, stopping, and verifying GEM-LLM vLLM model servers, including common issues like port conflicts and GPU memory (OOM) failures.
Core Features & Use Cases
- Two-model vLLM lifecycle control: Starts, stops, and performs health checks for the main Gemma-based server and an auxiliary model server.
- Single-node GPU-aware configuration: Generates vLLM launch settings that respect the single-node constraint (tensor-parallel sizing and safe launch parameters) for the 8xB200 environment.
- Operations-oriented troubleshooting: Helps diagnose endpoint health via /v1/models, inspects GPU status with nvidia-smi, and guides responses to port collisions and CUDA OOM scenarios using the expected log location.
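The port-collision check and the single-node launch settings described above can be sketched roughly as follows. This is an illustrative sketch, not the Skill's actual implementation: the helper names `port_in_use` and `build_vllm_command`, the default memory fraction of 0.85, and the cap of 8 GPUs (taken from the 8xB200 environment) are assumptions, and any model name or port you pass in is a placeholder. Only the `vllm serve` flags `--port`, `--tensor-parallel-size`, and `--gpu-memory-utilization` come from vLLM itself.

```python
import socket
from typing import List


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def build_vllm_command(
    model: str,
    port: int,
    tensor_parallel_size: int,
    gpu_memory_utilization: float = 0.85,  # assumed default, not from the Skill
    max_gpus: int = 8,                     # assumption: single node with 8 GPUs
) -> List[str]:
    """Build an argument list for `vllm serve` that stays within one node."""
    if not 1 <= tensor_parallel_size <= max_gpus:
        raise ValueError("tensor_parallel_size must fit on the single node")
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
    return [
        "vllm", "serve", model,
        "--port", str(port),
        "--tensor-parallel-size", str(tensor_parallel_size),
        "--gpu-memory-utilization", str(gpu_memory_utilization),
    ]
```

A caller would typically run `port_in_use(port)` before launching and refuse to start (or stop the old process first) when it returns True. Lowering `gpu_memory_utilization` is one common response to CUDA OOM at startup, since it reduces the fraction of GPU memory vLLM reserves.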
Quick Start
Run the skill installer with gem-llm-deploy-vllm to start or restart the vLLM main and auxiliary servers, then verify readiness by checking both /v1/models endpoints.
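The readiness check on /v1/models could look something like the sketch below. It assumes the servers return the usual OpenAI-compatible response shape (a JSON object whose `data` list holds entries with an `id` field). The function names `parse_model_ids` and `check_server` are hypothetical, and the base URLs in the usage comment are placeholders, since the page does not state which ports the main and auxiliary servers use.

```python
import json
import urllib.error
import urllib.request
from typing import List


def parse_model_ids(body: str) -> List[str]:
    """Extract model ids from an OpenAI-compatible /v1/models response body."""
    payload = json.loads(body)
    return [entry["id"] for entry in payload.get("data", [])]


def check_server(base_url: str, timeout: float = 5.0) -> List[str]:
    """Return the model ids a server reports, or [] if it is not ready."""
    url = base_url.rstrip("/") + "/v1/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_model_ids(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError, KeyError):
        # Connection refused, timeout, HTTP error, or malformed JSON:
        # treat all of these as "not ready yet".
        return []


# Hypothetical usage; replace the URLs with the real main/auxiliary endpoints:
# for name, url in {"main": "http://localhost:8001", "aux": "http://localhost:8002"}.items():
#     print(name, "ready" if check_server(url) else "not ready")
```

An empty list means the server is not ready, and the sketch does not try to say why. Model loading can take a while, so polling with a delay is usually more useful than a single check.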
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: gem-llm-deploy-vllm
Download link: https://github.com/saintgo7/claude-skills/archive/main.zip#gem-llm-deploy-vllm
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.