vllm-benchmarking

Name: vllm-benchmarking
Author: air-gapped

Official

Benchmark vLLM deployments with repeatable tests.

Software Engineering #benchmarking #latency #throughput #benchmarks #slo #vllm #air-gapped


Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

This skill benchmarks production vLLM deployments to measure latency, throughput, and SLO compliance under realistic load. It covers air-gapped environments and several benchmark scenarios (health checks, change comparisons, and SLO validation) across multiple subcommands. It supports structured JSON output, warmup controls, percentile metrics, goodput budgets, and repeatable rate sweeps for reproducible performance analysis.
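To make "percentile metrics" and "goodput budgets" concrete, here is a minimal sketch of how both could be computed from per-request measurements. The field names (ttft_ms, tpot_ms), the budget values, and the nearest-rank percentile method are illustrative assumptions, not the skill's or vLLM's actual implementation.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in (0, 100])."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def goodput(requests, budgets_ms, duration_s):
    """Requests per second that stayed within every latency budget."""
    good = sum(
        1
        for req in requests
        if all(req[metric] <= limit for metric, limit in budgets_ms.items())
    )
    return good / duration_s


# Hypothetical per-request measurements (time to first token, time per output token).
requests = [
    {"ttft_ms": 120.0, "tpot_ms": 18.0},
    {"ttft_ms": 340.0, "tpot_ms": 22.0},
    {"ttft_ms": 95.0, "tpot_ms": 61.0},
    {"ttft_ms": 610.0, "tpot_ms": 20.0},
]
ttft = [req["ttft_ms"] for req in requests]
budgets = {"ttft_ms": 500.0, "tpot_ms": 50.0}  # assumed SLO budgets

p50_ttft = percentile(ttft, 50)
p99_ttft = percentile(ttft, 99)
good_rps = goodput(requests, budgets, duration_s=2.0)
print(p50_ttft, p99_ttft, good_rps)
```

Goodput counts only requests that meet every budget at once, so a run with high raw throughput can still report low goodput when tail latency is poor.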

Core Features & Use Cases

Provides guidance and tooling for benchmarking vLLM deployments across serve, sweep, startup, latency, and throughput subcommands.
Demonstrates health-check, A/B change comparison, and SLO-constrained measurements with repeatable, auditable workflows.
Documents air-gapped patterns (mirrors, ModelScope substitutions, offline caches) and a complete dataset catalog for production-like workloads.
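An A/B change comparison can be sketched as a percent-change report over two sets of benchmark results. The metric names below (p99_ttft_ms, request_throughput) and the in-memory dictionaries are assumptions for illustration; in practice the values would come from the structured JSON output of two runs, and the field names should be checked against the actual result files.

```python
def percent_change(baseline, candidate):
    """Relative change of candidate versus baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline * 100.0


def compare_runs(baseline, candidate, metrics):
    """Percent change for each named metric present in both result dicts."""
    return {
        name: percent_change(baseline[name], candidate[name])
        for name in metrics
    }


# Hypothetical results from two runs with identical load settings.
baseline = {"p99_ttft_ms": 400.0, "request_throughput": 10.0}
candidate = {"p99_ttft_ms": 300.0, "request_throughput": 12.5}

report = compare_runs(baseline, candidate, ["p99_ttft_ms", "request_throughput"])
for name, delta in report.items():
    print(f"{name}: {delta:+.1f}%")
```

For latency metrics a negative change is an improvement, while for throughput a positive change is. The comparison is only meaningful when both runs use the same dataset, request rate, and warmup settings.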

Quick Start

Run a health-check benchmark against a running vLLM deployment to generate baseline latency and throughput numbers.
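As a rough illustration, a small health-check run could be assembled as a vllm bench serve command line. This is a sketch, not the skill's own recipe: the model name, URL, prompt count, and result filename are placeholders, and the flags are the ones commonly documented for vLLM's serving benchmark, so they should be verified with vllm bench serve --help on the installed version.

```python
import shlex


def health_check_command(model, base_url, num_prompts=20,
                         result_file="health-check.json"):
    """Build (but do not run) a short benchmark command against a live server."""
    return [
        "vllm", "bench", "serve",
        "--model", model,
        "--base-url", base_url,
        "--dataset-name", "random",        # synthetic prompts, no dataset download
        "--random-input-len", "256",
        "--random-output-len", "64",
        "--num-prompts", str(num_prompts),
        "--save-result",                   # write structured JSON results
        "--result-filename", result_file,
    ]


# Placeholder model and endpoint; substitute the deployment under test.
cmd = health_check_command("my-org/my-model", "http://localhost:8000")
print(shlex.join(cmd))
```

Building the command as a list keeps quoting unambiguous and makes it easy to pass to subprocess.run once the flags have been checked. The synthetic random dataset is chosen here because it avoids fetching data, which suits an air-gapped host.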
