vllm-tool-calling
Community
Stop tool-call leaks in vLLM deployments
Software Engineering
#tool calling #streaming #vllm #smoke testing #parser fallback #model regression #OpenAI schema
Author: saintgo7
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
vLLM tool calling can silently fail or leak tool-call markup into user-visible content, breaking automated function execution in production.
Core Features & Use Cases
- 3-stage defense (server + model + client fallback): Prevents failures when any single layer regresses, including stream boundary issues.
- Parser-to-model mapping: Ensures the selected vLLM `--tool-call-parser` matches the model’s real tokenizer/chat-template output format.
- Client-side promotion of leaked patterns: Detects Hermes/Qwen3 XML and bare-JSON cases, then promotes them into OpenAI-standard `tool_calls` while stripping the leaked markup from `content`.
- Smoke test guidance: Validates both non-stream and stream behavior and checks that leaks do not appear in `content`.
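The client-side fallback described above can be sketched roughly as follows. This is a minimal illustration and not the skill's actual code: the function names `promote_leaked_tool_calls` and `_to_openai_call` are hypothetical, the `<tool_call>{...}</tool_call>` wrapper is assumed to be the Hermes/Qwen3-style markup, and the sketch only handles a fully assembled (non-stream) assistant message. For streaming, the deltas would need to be concatenated before this check is applied.

```python
import json
import re
import uuid

# Assumed Hermes/Qwen3-style markup: <tool_call>{...JSON...}</tool_call>
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def _to_openai_call(payload):
    """Build an OpenAI-style tool call dict, or return None if payload does not look like one."""
    if not isinstance(payload, dict) or not isinstance(payload.get("name"), str):
        return None
    args = payload.get("arguments", {})
    # OpenAI-style responses carry arguments as a JSON string.
    arguments = args if isinstance(args, str) else json.dumps(args)
    return {
        "id": "call_" + uuid.uuid4().hex[:24],  # locally generated placeholder id
        "type": "function",
        "function": {"name": payload["name"], "arguments": arguments},
    }


def promote_leaked_tool_calls(message):
    """Move tool-call markup that leaked into `content` into `tool_calls`."""
    if message.get("tool_calls"):
        return message  # the server-side parser already did its job
    content = message.get("content") or ""
    calls = []

    def _replace(match):
        try:
            call = _to_openai_call(json.loads(match.group(1)))
        except json.JSONDecodeError:
            call = None
        if call is None:
            return match.group(0)  # leave unparseable markup untouched
        calls.append(call)
        return ""  # strip the leaked markup from content

    remaining = TOOL_CALL_RE.sub(_replace, content).strip()

    # Bare-JSON case: the whole content is one {"name": ..., "arguments": ...} object.
    if not calls and remaining.startswith("{"):
        try:
            payload = json.loads(remaining)
        except json.JSONDecodeError:
            payload = None
        if isinstance(payload, dict) and "arguments" in payload:
            call = _to_openai_call(payload)
            if call is not None:
                calls.append(call)
                remaining = ""

    if not calls:
        return message
    return {**message, "content": remaining or None, "tool_calls": calls}
```

A smoke test could run every response through the same function and then assert that `<tool_call>` no longer appears in `content`. The bare-JSON branch deliberately requires both `name` and `arguments` keys, which reduces the chance of promoting ordinary JSON output that the user actually asked for.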
Quick Start
Install and activate this skill by running: ./install.sh vllm-tool-calling
Dependency Matrix
Required Modules
None required
Components
references, assets
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: vllm-tool-calling
Download link: https://github.com/saintgo7/claude-skills/archive/main.zip#vllm-tool-calling
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.