model-serving-security

Community

Secure model-serving endpoints against denial-of-service (DoS) and server-side request forgery (SSRF).

Author: maruakshay
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Model-serving endpoints are expensive to operate. A single unthrottled client can exhaust GPU capacity for all users by submitting high-token-count requests. Streaming responses add timing and partial-response leakage channels. If the serving layer fetches URLs that appear in model-generated output, it makes outbound requests on the model's behalf, which is a classic SSRF vector in a non-obvious location.

Core Features & Use Cases

  • Parameter bounding: Enforce server-side caps on max_tokens, n, and logprobs, and restrict streaming, to prevent resource exhaustion.
  • Multi-layer rate limiting: Apply per-key, per-user, per-IP, and per-organization limits backed by shared state to deter abuse.
  • SSRF protections: Validate and restrict URLs derived from model outputs before making outbound requests; keep URL following disabled unless it is explicitly enabled.
  • Monitoring and circuit breaking: Monitor latency, token usage, and error rate in real time, and trip a circuit breaker when they degrade to maintain availability.
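The SSRF protection above could be sketched as follows. This is a minimal illustration in Python, not the skill's own implementation: the names is_safe_url, is_public_ip, ALLOWED_SCHEMES, and ALLOWED_HOSTS are hypothetical, and the allowlist entry api.example.com is a placeholder. It assumes an HTTPS-only policy with an explicit host allowlist and rejects any host that resolves to a non-public address.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Assumed policy for this sketch: HTTPS only, explicit host allowlist.
ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"api.example.com"}  # hypothetical placeholder


def is_public_ip(address: str) -> bool:
    """Return True only for globally routable, non-multicast addresses."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False
    return ip.is_global and not ip.is_multicast


def is_safe_url(url: str, resolve=socket.getaddrinfo) -> bool:
    """Check a model-derived URL before any outbound request is made."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    host = parts.hostname
    if parts.scheme not in ALLOWED_SCHEMES or host is None:
        return False
    if parts.username or parts.password:
        return False  # embedded credentials are a common obfuscation trick
    if host not in ALLOWED_HOSTS:
        return False
    try:
        infos = resolve(host, port or 443, type=socket.SOCK_STREAM)
    except OSError:
        return False
    # Every resolved address must be public; one private record is enough to reject.
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_public_ip(a) for a in addresses)
```

A check like this only covers the moment of validation. The HTTP client that later performs the request should also have redirects disabled and should ideally connect to the address that was validated, since a second DNS lookup may return a different answer.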

Quick Start

Implement parameter caps, multi-dimensional rate limiting, and SSRF-safe URL handling on your model-serving endpoints today.
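As a starting point for the first two items, the sketch below shows one way to clamp request parameters and apply a token-bucket limit across several identity dimensions. It is an illustration under stated assumptions: the cap values, the default of 256 tokens, and the names bound_params, TokenBucket, and MultiLimiter are all hypothetical, and the in-process dictionary stands in for the shared state store that a multi-instance deployment would need.

```python
import time

# Hypothetical limits for this sketch; tune them to your hardware and budget.
MAX_TOKENS_CAP = 1024
MAX_N = 4
MAX_LOGPROBS = 5
ALLOW_STREAMING = False


def bound_params(request: dict) -> dict:
    """Return a copy of the request with server-side caps applied."""
    bounded = dict(request)
    bounded["max_tokens"] = max(1, min(int(request.get("max_tokens", 256)), MAX_TOKENS_CAP))
    bounded["n"] = max(1, min(int(request.get("n", 1)), MAX_N))
    bounded["logprobs"] = max(0, min(int(request.get("logprobs") or 0), MAX_LOGPROBS))
    bounded["stream"] = bool(request.get("stream")) and ALLOW_STREAMING
    return bounded


class TokenBucket:
    """Token bucket with burst size `capacity`, refilled at `refill_per_sec`."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_per_sec = float(refill_per_sec)
        self.clock = clock
        self.tokens = self.capacity
        self.updated = clock()

    def refill(self):
        now = self.clock()
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now


class MultiLimiter:
    """Keeps one bucket per (dimension, identity), e.g. ("ip", "1.2.3.4")."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self._buckets = {}  # in-memory stand-in for a shared store

    def allow(self, identities: dict, cost: float = 1.0) -> bool:
        buckets = []
        for key in identities.items():
            if key not in self._buckets:
                self._buckets[key] = TokenBucket(self.capacity, self.refill_per_sec, self.clock)
            buckets.append(self._buckets[key])
        for bucket in buckets:
            bucket.refill()
        # Check every dimension first so a rejection charges no bucket.
        if any(bucket.tokens < cost for bucket in buckets):
            return False
        for bucket in buckets:
            bucket.tokens -= cost
        return True
```

A caller might run bound_params on the parsed request and then call allow with something like {"key": api_key, "user": user_id, "ip": client_ip, "org": org_id}, using the bounded max_tokens as the cost if limits are meant to track token usage. The sketch is not thread-safe, and bound_params raises on non-numeric input, which the endpoint should treat as a rejected request.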

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: model-serving-security
Download link: https://github.com/maruakshay/mii-ai-security/archive/main.zip#model-serving-security

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
