sglang-qwen3-next-optimization

Community

PR-diff guide for Qwen3-Next optimization.

Author: BBuf
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

A PR-diff-guided workflow for auditing, documenting, and reasoning about Qwen3-Next family optimizations in SGLang. It covers base Qwen3-Next, MTP, Coder-Next, and related runtimes, with the aim of making improvements reproducible and their justification clear.

Core Features & Use Cases

  • Structured PR-diff dossier for Qwen3-Next optimization, NEXTN, Eagle3, and MTP compatibility.
  • Guidance for GDN/Mamba state, FP8/NVFP4 loading, CPU offload, and kernel fusion strategies.
  • Validation planning, risk assessment, and lane definitions to track changes across PRs.

Quick Start

Follow this playbook to audit and reproduce Qwen3-Next optimizations from the base codebase using the associated PR history.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install the Skill automatically by copying the text below and pasting it into Claude Code.

Please help me install this Skill:
Name: sglang-qwen3-next-optimization
Download link: https://github.com/BBuf/AI-Infra-Auto-Driven-SKILLS/archive/main.zip#sglang-qwen3-next-optimization

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
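
For a manual install, the extraction step described above can be sketched in Python. This is a minimal sketch, not the official installer: the function name `install_skill_from_zip` is hypothetical, and it assumes the archive has already been downloaded and that the Skill lives in a folder named after the Skill somewhere inside the zip (the exact layout of the repository archive is not stated on this page).

```python
import zipfile
from pathlib import Path, PurePosixPath


def install_skill_from_zip(zip_path, skill_name, skills_dir):
    """Copy the folder named `skill_name` out of a zip into `skills_dir`.

    Hypothetical helper: assumes the Skill is stored in a folder whose
    name equals `skill_name` at any depth inside the archive.
    Returns the list of files written.
    """
    target_root = Path(skills_dir) / skill_name
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            parts = PurePosixPath(info.filename).parts
            # Skip entries outside the Skill folder and unsafe ".." paths.
            if skill_name not in parts or ".." in parts:
                continue
            # Keep only the path below the Skill folder.
            relative = parts[parts.index(skill_name) + 1:]
            if not relative:
                continue
            out_path = target_root.joinpath(*relative)
            out_path.parent.mkdir(parents=True, exist_ok=True)
            out_path.write_bytes(archive.read(info))
            written.append(out_path)
    return written
```

With the archive saved locally as `main.zip` (an assumed file name), a call such as `install_skill_from_zip("main.zip", "sglang-qwen3-next-optimization", ".claude/skills")` would place the Skill's files under `.claude/skills/sglang-qwen3-next-optimization/`.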