sglang-glm46-glm47-optimization
Community
Optimization playbook for GLM-4.6/4.7 in SGLang.
Author: BBuf
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
A PR-backed optimization manual, tracking current main, for GLM-4.6, GLM-4.6V-adjacent text paths, GLM-4.7, and GLM-4.7-Flash in SGLang. Use it when Codex needs to recover, extend, or audit GLM shared-expert fusion, dual-stream MoE GEMM overlap, the GLM-4.7 tool parser, NVFP4/MTP, GLM4-MoE-Lite/Flash loading, or AMD/NPU validation.
Core Features & Use Cases
- PR-based optimization dossiers and diff auditing for GLM-4.6/4.7 lanes.
- Guidance for GLM-4.7-Flash/Lite loading, MTP/draft quant config, and hardware backend validation.
- Debugging, extending, and production-validating shared-expert fusion and MoE pathways across GLM-4.6/4.7.
Quick Start
Consult the PR history, run through the audit steps, and validate GLM-4.6/4.7 optimizations against runtime tests.
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: sglang-glm46-glm47-optimization
Download link: https://github.com/BBuf/AI-Infra-Auto-Driven-SKILLS/archive/main.zip#sglang-glm46-glm47-optimization
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.