Name: gke-inference-gateway
Author: Riku-KANO

System Documentation

What problem does it solve?

This skill consolidates expert knowledge on GKE Inference Gateway: the correct API groups (inference.networking.k8s.io), the role of Body-Based Routing (BBR), per-pool Endpoint Pickers (EPP), and safe migration paths from the deprecated InferenceModel resource to its current replacements.

Core Features & Use Cases

API group guidance: explains the two API groups (stable v1 and alpha v1alpha2) and when to use each.
BBR & HTTPRoute-based routing: shows how BBR dispatches each request to a pool based on the request body, and how to map models to HTTPRoute header rules.
Operational patterns: how to deploy multi-model stacks, per-pool EPP, health checks, timeouts, and troubleshooting steps.
Worked example: uses request sequences from a real Gemma + vLLM deployment to show the end-to-end flow.
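
The body-to-pool dispatch described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the gateway's implementation: it assumes BBR copies the `model` field of the JSON request body into a header (the header name `X-Gateway-Model-Name` is an assumption here), and that HTTPRoute rules then match that header to a pool. The model and pool names are hypothetical.

```python
import json
from typing import Dict, Optional

# Assumed header name that BBR sets from the request body.
MODEL_HEADER = "X-Gateway-Model-Name"

# Hypothetical stand-in for HTTPRoute header-match rules:
# exact header value -> backend pool name.
ROUTE_RULES: Dict[str, str] = {
    "gemma-small": "gemma-small-pool",
    "gemma-large": "gemma-large-pool",
}


def bbr_extract_headers(body: bytes) -> Dict[str, str]:
    """Mimic BBR: read the JSON body and surface its model name as a header."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Body is not valid JSON (or not valid UTF-8): add no header.
        return {}
    if not isinstance(payload, dict):
        return {}
    model = payload.get("model")
    if isinstance(model, str) and model:
        return {MODEL_HEADER: model}
    return {}


def select_pool(headers: Dict[str, str]) -> Optional[str]:
    """Mimic an HTTPRoute exact header match; None means no rule matched."""
    return ROUTE_RULES.get(headers.get(MODEL_HEADER, ""))


if __name__ == "__main__":
    request_body = b'{"model": "gemma-small", "prompt": "hello"}'
    headers = bbr_extract_headers(request_body)
    print(headers, "->", select_pool(headers))
```

Keeping extraction and matching as separate steps mirrors the split in the text: BBR only turns body content into a header, and the routing decision itself stays in ordinary header rules.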

Quick Start

Follow this skill to configure or debug GKE Inference Gateway using BBR, HTTPRoute header-based routing, and per-pool EPP.

Please help me install this Skill:
Name: gke-inference-gateway
Download link: https://github.com/Riku-KANO/gemma4-gke-demo/archive/main.zip#gke-inference-gateway
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
