quota-rate-limit-pattern

Community

Protect gateways with 3-layer quota enforcement.

Author: saintgo7
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill keeps an API gateway from throttling traffic incorrectly. It treats burst rate limits, concurrent in-flight requests, and daily token quotas as three separate concerns, each enforced by its own mechanism, so the gateway can reliably prevent overload and return actionable 429 errors.

Core Features & Use Cases

  • 3-layer rate limiting: Combines slowapi requests-per-minute (RPM) limits, asyncio.Semaphore concurrency admission, and DB-backed daily token checks, with each layer covering a different failure mode.
  • Reason-code 429 responses: Returns HTTP 429 with a structured error payload whose code is rpm_limit, concurrent_limit, or daily_token_limit, so clients can tell which layer rejected them and retry appropriately.
  • Operational visibility: Integrates a Prometheus counter that labels rejections by reason, enabling fast debugging of which layer is blocking traffic.
  • Distributed-environment guidance: Documents limits of worker-local enforcement and the required Redis/admission-controller alternatives when moving beyond single-worker deployments.
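
The concurrency layer and the reason-code 429 response described above can be sketched as follows. This is a minimal, worker-local illustration rather than the Skill's actual output: the names admit_and_run, rejection, and MAX_CONCURRENT_PER_USER are hypothetical, the "rate_limit_error" type string is an assumption about the OpenAI-style error shape, and a plain collections.Counter stands in for a Prometheus counter labelled by reason.

```python
import asyncio
from collections import Counter, defaultdict

MAX_CONCURRENT_PER_USER = 2  # assumed limit, for illustration only

# Worker-local state: one semaphore per user, created on first use.
_slots = defaultdict(lambda: asyncio.Semaphore(MAX_CONCURRENT_PER_USER))

# Stand-in for a Prometheus counter labelled by rejection reason.
REJECTIONS = Counter()


def rejection(code, message):
    """Build a 429 response whose error.code names the layer that rejected."""
    REJECTIONS[code] += 1
    body = {"error": {"message": message, "type": "rate_limit_error", "code": code}}
    return 429, body


async def admit_and_run(user_id, work):
    """Run the `work` coroutine function if the user has a free slot, else reject."""
    sem = _slots[user_id]
    # Reject immediately instead of queueing when every slot is taken.
    if sem.locked():
        return rejection("concurrent_limit", "Too many in-flight requests.")
    await sem.acquire()  # does not suspend: a slot was free in the check above
    try:
        return 200, {"result": await work()}
    finally:
        sem.release()
```

Because the semaphores live in process memory, this only bounds concurrency within a single worker. With several workers the effective limit multiplies, which is the situation the distributed-environment guidance above addresses.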

Quick Start

Ask your AI to generate a gateway quota module that enforces slowapi RPM, per-user asyncio concurrency slots, and a DB daily token SUM check, returning OpenAI-compatible 429 error responses with code set to the exact rejected layer.
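
For orientation, the daily token SUM check that the prompt asks for might look roughly like this. It is a sketch under stated assumptions, not the generated module: sqlite3 stands in for whatever database the gateway uses, and the token_usage table, its columns, the DAILY_TOKEN_LIMIT value, and the "rate_limit_error" type string are all hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

DAILY_TOKEN_LIMIT = 1000  # assumed quota, for illustration only


def init_schema(conn):
    # Hypothetical usage table; created_at is stored as UTC epoch seconds.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS token_usage ("
        "user_id TEXT NOT NULL, tokens INTEGER NOT NULL, created_at INTEGER NOT NULL)"
    )


def tokens_used_today(conn, user_id, now=None):
    """Sum the tokens a user has consumed since 00:00 UTC of the current day."""
    now = now or datetime.now(timezone.utc)
    day_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    row = conn.execute(
        "SELECT COALESCE(SUM(tokens), 0) FROM token_usage "
        "WHERE user_id = ? AND created_at >= ?",
        (user_id, int(day_start.timestamp())),
    ).fetchone()
    return row[0]


def check_daily_quota(conn, user_id, now=None):
    """Return a (429, body) rejection if the quota is spent, else None."""
    used = tokens_used_today(conn, user_id, now)
    if used >= DAILY_TOKEN_LIMIT:
        body = {
            "error": {
                "message": f"Daily token quota exhausted ({used}/{DAILY_TOKEN_LIMIT}).",
                "type": "rate_limit_error",
                "code": "daily_token_limit",
            }
        }
        return 429, body
    return None
```

Resetting the window at midnight UTC is one design choice among several; a rolling 24-hour window would only change how day_start is computed.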

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: quota-rate-limit-pattern
Download link: https://github.com/saintgo7/claude-skills/archive/main.zip#quota-rate-limit-pattern

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
