gem-llm-load-test

Community

Validate GEM-LLM throughput and p99 latency.

Author: saintgo7
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill reduces uncertainty about GEM-LLM performance by running repeatable load tests that measure real throughput, latency (including p99), and quota/rate-limit behavior under concurrent users.

Core Features & Use Cases

  • Concurrent load generation (Locust + asyncio bench): Runs both a single-key Locust workload and a multi-key, multi-user asyncio benchmark to measure realistic behavior.
  • Quota and rate-limit scenario validation: Confirms that the 60 RPM rate limit and the daily token limits are enforced, using controlled concurrency and per-user key setup.
  • Production-oriented metrics capture: Produces latency percentiles (p50/p95/p99/max), success/failure classifications (HTTP 401/429/500), and token throughput (tok/s), and saves the results to reports for analysis.
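The metrics listed above can be derived from raw per-request samples. The following is a minimal sketch, not the Skill's actual code: the sample shape `(status_code, latency_seconds, completion_tokens)`, the function names `percentile` and `summarize`, and the choice of the nearest-rank percentile method are all assumptions made for illustration.

```python
import math
from collections import Counter


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending-sorted list (pct is 0-100)."""
    if not sorted_values:
        raise ValueError("no samples")
    # Multiply before dividing so whole-number ranks stay exact in float math.
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(samples, wall_seconds):
    """Summarize (status_code, latency_seconds, completion_tokens) samples.

    Hypothetical sample shape; latency percentiles cover HTTP 200 responses only.
    """
    ok_latencies = sorted(lat for status, lat, _ in samples if status == 200)
    ok_tokens = sum(tok for status, _, tok in samples if status == 200)
    return {
        "p50": percentile(ok_latencies, 50),
        "p95": percentile(ok_latencies, 95),
        "p99": percentile(ok_latencies, 99),
        "max": ok_latencies[-1],
        # Counts per status code, e.g. 200 / 401 / 429 / 500.
        "status_counts": dict(Counter(status for status, _, _ in samples)),
        "tok_per_s": ok_tokens / wall_seconds,
    }
```

Restricting the percentiles to successful responses keeps fast 429 rejections from making latency look better than it is; whether the Skill's reports do the same is not stated in the source.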

Use Case

Validate whether a new deployment can sustain a target concurrency (e.g., ~50 concurrent requests), keep p99 latency under a threshold (e.g., <5s), and meet throughput requirements (e.g., >30 req/s), while ensuring that quota/rate-limit responses behave correctly.
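A pass/fail gate over these example targets could look like the sketch below. The `Targets` class and `evaluate` function are hypothetical names, and the default values simply mirror the example figures in the use case (50 concurrent requests, p99 under 5 s, more than 30 req/s); a real deployment would set its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Targets:
    """Example thresholds taken from the use case text; adjust per deployment."""
    min_concurrency: int = 50
    max_p99_seconds: float = 5.0
    min_req_per_s: float = 30.0


def evaluate(concurrency, p99_seconds, req_per_s, targets=Targets()):
    """Return (overall_pass, per_check_results) for one load-test run."""
    checks = {
        "concurrency": concurrency >= targets.min_concurrency,
        "p99_latency": p99_seconds < targets.max_p99_seconds,   # strictly under
        "throughput": req_per_s > targets.min_req_per_s,        # strictly over
    }
    return all(checks.values()), checks


if __name__ == "__main__":
    passed, detail = evaluate(concurrency=50, p99_seconds=4.2, req_per_s=31.5)
    print("PASS" if passed else "FAIL", detail)
```

Returning the per-check dictionary alongside the overall verdict makes it clear which target a failing run missed.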

Quick Start

Install the skill by running install.sh for gem-llm-load-test, then run either the Locust single-key test or the multi-user-bench script to generate load and save metrics reports.
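For orientation, the general shape of a multi-key, multi-user asyncio benchmark is sketched below. This is not the multi-user-bench script itself: `fake_request` is a stand-in that sleeps instead of calling GEM-LLM, and `user_loop`, `run_bench`, and the key strings are hypothetical names chosen for this example.

```python
import asyncio
import random
import time


async def fake_request(api_key, rng):
    """Stand-in for one HTTP call (assumption); returns (status, latency_seconds)."""
    start = time.perf_counter()
    await asyncio.sleep(rng.uniform(0.001, 0.005))  # simulated server time
    return 200, time.perf_counter() - start


async def user_loop(api_key, n_requests, results, seed):
    """One simulated user sends its requests sequentially with its own key."""
    rng = random.Random(seed)
    for _ in range(n_requests):
        status, latency = await fake_request(api_key, rng)
        results.append((api_key, status, latency))


async def run_bench(api_keys, requests_per_user):
    """Run one user loop per key concurrently; return (results, wall_seconds)."""
    results = []
    start = time.perf_counter()
    await asyncio.gather(
        *(user_loop(key, requests_per_user, results, seed)
          for seed, key in enumerate(api_keys))
    )
    return results, time.perf_counter() - start


if __name__ == "__main__":
    rows, wall = asyncio.run(run_bench(["key-a", "key-b", "key-c"], 5))
    print(f"{len(rows)} requests in {wall:.3f}s ({len(rows) / wall:.1f} req/s)")
```

Swapping `fake_request` for a real HTTP client call is the only change needed to point this shape at a live endpoint; giving each user its own key is what lets per-key limits be exercised separately from aggregate throughput.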

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: gem-llm-load-test
Download link: https://github.com/saintgo7/claude-skills/archive/main.zip#gem-llm-load-test

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
