model-serving-security
Community
Secure model-serving endpoints from DoS and SSRF.
System Documentation
What problem does it solve?
Model serving endpoints are expensive to operate. A single unthrottled client can exhaust GPU capacity for all users by submitting high-token-count requests. Streaming responses introduce new timing and partial-response leakage channels. Model-generated outputs that include URLs can cause the serving layer to make outbound requests — a classic SSRF vector in a non-obvious location.
Core Features & Use Cases
- Parameter bounding: Enforce server-side caps on max_tokens, n, logprobs, and streaming to prevent resource exhaustion.
- Multi-layer rate limiting: Apply per-key, per-user, per-IP, and per-organization limits backed by shared state to deter abuse.
- SSRF protections: Validate and restrict URLs derived from model outputs before outbound requests; disable URL following by default unless explicitly enabled.
- Monitoring & circuit breaking: Track latency, token usage, and error rate in real time, and trip a circuit breaker when they degrade so the endpoint stays available.
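The first two features can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the skill's own implementation: the `LIMITS` values, the `bound_params` helper, and the `MultiKeyLimiter` class are hypothetical names chosen for this example, and the limiter keeps its state in a per-process dictionary, whereas a real deployment would need the shared state the list above calls for.

```python
import time

# Hypothetical (min, max) bounds per parameter; tune to your hardware and budget.
LIMITS = {"max_tokens": (1, 1024), "n": (1, 1), "logprobs": (0, 5)}


def bound_params(request):
    """Return a copy of the request with each bounded field clamped server-side."""
    bounded = dict(request)
    for name, (low, high) in LIMITS.items():
        value = request.get(name)
        if value is None:
            # A missing max_tokens must not mean "unlimited": default to the cap.
            if name == "max_tokens":
                bounded[name] = high
            continue
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer")
        bounded[name] = max(low, min(value, high))
    return bounded


class MultiKeyLimiter:
    """Token buckets keyed by (dimension, identity); a request must pass all of them."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        # In-memory only (assumption for this sketch); swap for a shared store.
        self.buckets = {}  # (dimension, identity) -> (tokens, last_update)

    def _available(self, key, now):
        tokens, last = self.buckets.get(key, (self.capacity, now))
        return min(self.capacity, tokens + (now - last) * self.refill_per_sec)

    def allow(self, identities, cost=1.0):
        """identities maps a dimension name to a value, e.g. {"key": ..., "ip": ...}."""
        now = self.clock()
        levels = {key: self._available(key, now) for key in identities.items()}
        if any(level < cost for level in levels.values()):
            return False  # reject without charging any bucket
        for key, level in levels.items():
            self.buckets[key] = (level - cost, now)
        return True
```

A request handler would call `bound_params` on the incoming body first, then `allow({"key": api_key, "user": user_id, "ip": client_ip, "org": org_id})` before scheduling any GPU work. Passing the number of requested tokens as `cost` instead of `1.0` is one way to make the limit track actual resource use rather than request count.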
Quick Start
Implement parameter caps, multi-dimensional rate limiting, and SSRF-safe URL handling on your model-serving endpoints today.
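For the SSRF-safe URL handling step, one possible starting point is a check that runs on every URL taken from model output before the serving layer fetches it. The sketch below is illustrative only: `ALLOWED_HOSTS`, `resolve_host`, and `is_safe_url` are hypothetical names, and the allowlist entry is a placeholder. An empty allowlist corresponds to URL following being disabled.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}
# Hypothetical allowlist; leave empty to disable URL following entirely.
ALLOWED_HOSTS = {"docs.example.com"}


def _is_public(ip_text):
    """True only for addresses outside private, loopback, and other special ranges."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_multicast or ip.is_reserved or ip.is_unspecified)


def resolve_host(host):
    """Resolve a hostname to the set of IP address strings it maps to."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def is_safe_url(url, resolver=resolve_host):
    """Allow a URL only if scheme, host, and every resolved address pass the checks."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not host:
        return False
    if parts.username or parts.password:
        return False  # credentials embedded in the URL are a common bypass trick
    if host not in ALLOWED_HOSTS:
        return False
    try:
        addresses = resolver(host)
    except OSError:
        return False
    return bool(addresses) and all(_is_public(a) for a in addresses)
```

A check like this does not by itself stop DNS rebinding, because the name can resolve differently between validation and the actual fetch. Connecting to the validated IP address directly, and refusing to follow redirects without re-running the check, closes most of that gap.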
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: model-serving-security
Download link: https://github.com/maruakshay/mii-ai-security/archive/main.zip#model-serving-security
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.