sglang-qwen3-next-optimization
Category: Community
PR-diff guide for Qwen3-Next optimization.
Author: BBuf
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
A PR-diff-guided workflow for auditing, documenting, and reasoning about Qwen3-Next family optimizations in SGLang, covering base Qwen3-Next, MTP, Coder-Next, and related runtimes. The aim is that each improvement can be reproduced and its justification traced back to the PR that introduced it.
Core Features & Use Cases
- Structured PR-diff dossier for Qwen3-Next optimization, NEXTN, Eagle3, and MTP compatibility.
- Guidance for GDN/Mamba state, FP8/NVFP4 loading, CPU offload, and kernel fusion strategies.
- Validation planning, risk assessment, and lane definitions to track changes across PRs.
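The lane-based tracking mentioned above could be modeled roughly as follows. This is a minimal sketch, not the skill's actual format: the `PREntry` class, its field names, the lane labels, and the PR numbers used with it are all hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PREntry:
    """One pull request in the dossier (every field here is illustrative)."""
    number: int            # placeholder PR number, not a real SGLang PR
    title: str
    lane: str              # hypothetical lane label, e.g. "mtp" or "fp8-loading"
    risk: str = "unknown"  # free-form risk note
    validated: bool = False


def group_by_lane(entries):
    """Return {lane: [PREntry, ...]} with each lane sorted by PR number."""
    lanes = defaultdict(list)
    for entry in entries:
        lanes[entry.lane].append(entry)
    return {lane: sorted(items, key=lambda e: e.number)
            for lane, items in lanes.items()}


def pending_validation(entries):
    """List the PR numbers that still lack a validation result."""
    return [e.number for e in entries if not e.validated]
```

Grouping by lane keeps related changes (for example, everything touching MTP) readable in PR order, while `pending_validation` gives a quick view of what the validation plan still has to cover.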
Quick Start
Follow this playbook to audit and reproduce Qwen3-Next optimizations from the base codebase using the associated PR history.
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: sglang-qwen3-next-optimization Download link: https://github.com/BBuf/AI-Infra-Auto-Driven-SKILLS/archive/main.zip#sglang-qwen3-next-optimization Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
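For a manual install, the same steps can be sketched in Python. Note that the `#sglang-qwen3-next-optimization` fragment in the link is not sent to the server, so the download is the whole repository archive and the skill folder has to be picked out afterwards. The helper name `extract_skill` is hypothetical, and the sketch assumes the archive contains a directory named exactly after the skill; the repository's real layout is not stated here.

```python
import io
import zipfile
from pathlib import Path, PurePosixPath


def extract_skill(zip_bytes, skill_name, dest_root):
    """Copy every file under a folder named `skill_name` in the archive
    into dest_root/skill_name and return the number of files written.

    Assumption: the skill lives in a directory whose name equals skill_name.
    """
    dest = Path(dest_root) / skill_name
    written = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            if info.is_dir() or skill_name not in parts:
                continue
            # Keep only the path below the skill folder.
            rel = parts[parts.index(skill_name) + 1:]
            if not rel or ".." in rel:
                continue  # skip odd entries and refuse path traversal
            target = dest.joinpath(*rel)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(info))
            written += 1
    return written
```

To use it, fetch the archive bytes (for example with `urllib.request.urlopen(url).read()` on the download link above) and call `extract_skill(data, "sglang-qwen3-next-optimization", ".claude/skills")`. A return value of 0 means no matching folder was found, in which case the archive should be inspected by hand.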