blue-green-deployment-pattern
Community
Cut over LLM serving safely with blue/green.
Software Engineering
#vllm #llm serving #blue green deployment #rollback runbook #flashinfer cache #gpu memory fencing #fastapi gateway
Author: saintgo7
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill reduces the risk of downtime during LLM serving upgrades. It uses a blue/green cutover: start an isolated green instance alongside the current (blue) one, verify it, switch traffic to it, and roll back quickly if anything looks wrong.
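The flow described above can be sketched as a small orchestration function. This is a minimal illustration, not the Skill's actual scripts: `run_cutover` and every hook name (`start_green`, `green_passes_checks`, `approved`, `switch_traffic`, `still_healthy`, `stop_green`) are hypothetical and would be wired to real process management and gateway logic in a deployment.

```python
from typing import Callable


def run_cutover(
    start_green: Callable[[], None],
    green_passes_checks: Callable[[], bool],
    approved: Callable[[], bool],
    switch_traffic: Callable[[str], None],
    still_healthy: Callable[[], bool],
    stop_green: Callable[[], None],
) -> str:
    """Run one blue/green cutover and return the color left serving traffic.

    All hooks are hypothetical placeholders supplied by the caller.
    """
    # 1. Bring up green next to blue; blue keeps serving the whole time.
    start_green()

    # 2. Verify green (health and smoke checks) before touching traffic.
    if not green_passes_checks():
        stop_green()
        return "blue"

    # 3. Approval gate: a human (or policy) must confirm the switch.
    if not approved():
        stop_green()
        return "blue"

    # 4. Switch traffic to green.
    switch_traffic("green")

    # 5. Roll back quickly if anything looks wrong after the switch.
    if not still_healthy():
        switch_traffic("blue")
        stop_green()
        return "blue"

    return "green"
```

Passing the steps in as callables keeps the decision logic testable without GPUs or a running server; blue is never stopped inside this function, so rollback only has to point traffic back at it.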
Core Features & Use Cases
- Zero- or low-downtime cutover workflow: bring up a green vLLM instance on a new port with isolated dependencies, run health and smoke checks against it, then switch it to full production settings.
- Safety guardrails for real deployments: isolated venv, flashinfer cache handling, interactive approval gate, GPU memory fencing (smoke vs full), and a rollback runbook.
- Use Case: Upgrading vLLM (e.g., 0.19 → 0.20) while keeping the current serving endpoint stable, even when dependency changes and flashinfer/JIT cache issues would otherwise break the upgrade.
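Two of the guardrails listed above, GPU memory fencing and the health check, might look roughly like the sketch below. It assumes the green instance is launched with `vllm serve` from its own virtual environment and probed over HTTP at vLLM's `/health` endpoint. The memory fractions, the `vllm_bin` path, and the helper names are illustrative assumptions, not values taken from the Skill.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List

# Assumed fractions: smoke runs leave GPU memory for the live blue instance.
SMOKE_GPU_FRACTION = 0.3
FULL_GPU_FRACTION = 0.9


def green_command(model: str, port: int, phase: str,
                  vllm_bin: str = "vllm") -> List[str]:
    """Build the launch command for green; phase is 'smoke' or 'full'.

    vllm_bin would normally point into the isolated venv
    (for example a hypothetical /opt/green-venv/bin/vllm).
    """
    fractions = {"smoke": SMOKE_GPU_FRACTION, "full": FULL_GPU_FRACTION}
    if phase not in fractions:
        raise ValueError(f"unknown phase: {phase!r}")
    return [
        vllm_bin, "serve", model,
        "--port", str(port),
        "--gpu-memory-utilization", str(fractions[phase]),
    ]


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(probe: Callable[[], bool], attempts: int = 30,
                       delay_s: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll probe until it succeeds or attempts run out."""
    for _ in range(attempts):
        if probe():
            return True
        sleep(delay_s)
    return False
```

A caller could start the smoke phase with `subprocess.Popen(green_command(model, 8001, "smoke"))` and then wait on `wait_until_healthy(lambda: http_ok("http://127.0.0.1:8001/health"))`, where port 8001 is only an example of "a new port".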
Quick Start
From the repository, run install.sh with the skill name, blue-green-deployment-pattern, to run the standard cutover flow for this pattern.
Dependency Matrix
Required Modules
None required
Components
scripts
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: blue-green-deployment-pattern
Download link: https://github.com/saintgo7/claude-skills/archive/main.zip#blue-green-deployment-pattern
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.