sglang-glm46-glm47-optimization

Community

Optimization playbook for GLM-4.6/4.7 in SGLang.

Author: BBuf
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

A PR-backed optimization manual, aligned with current main, for GLM-4.6, GLM-4.6V-adjacent text paths, GLM-4.7, and GLM-4.7-Flash in SGLang. Use it when Codex needs to recover, extend, or audit any of the following: GLM shared-expert fusion, dual-stream MoE GEMM overlap, the GLM-4.7 tool parser, NVFP4/MTP, GLM4-MoE-Lite/Flash loading, or AMD/NPU validation.

Core Features & Use Cases

  • PR-based optimization dossiers and diff auditing for GLM-4.6/4.7 lanes.
  • Guidance for GLM-4.7-Flash/Lite loading, MTP/draft quant config, and hardware backend validation.
  • Debugging, extension, and production validation of shared-expert fusion and MoE pathways across GLM-4.6/4.7.

Quick Start

Consult the PR history, run through the audit steps, and validate GLM-4.6/4.7 optimizations against runtime tests.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: sglang-glm46-glm47-optimization
Download link: https://github.com/BBuf/AI-Infra-Auto-Driven-SKILLS/archive/main.zip#sglang-glm46-glm47-optimization

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
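
For a manual install, the download-extract-install steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not the listing's official installer: it assumes the archive contains a folder named exactly after the skill (its location inside the repository is not stated here, so the code searches for it), and that .claude/skills/ is relative to the current project directory. The function names install_skill_from_zip and download_and_install are hypothetical helpers introduced for illustration.

```python
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path, PurePosixPath

SKILL_NAME = "sglang-glm46-glm47-optimization"
DOWNLOAD_URL = (
    "https://github.com/BBuf/AI-Infra-Auto-Driven-SKILLS/archive/main.zip"
    "#sglang-glm46-glm47-optimization"
)


def install_skill_from_zip(zip_path, skills_dir, skill_name=SKILL_NAME):
    """Copy the folder named `skill_name` out of the archive into skills_dir.

    Assumption: the skill lives in a directory whose name equals skill_name,
    at any depth inside the archive.
    """
    target = Path(skills_dir) / skill_name
    copied = 0
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            parts = PurePosixPath(info.filename).parts
            if info.is_dir() or skill_name not in parts:
                continue
            # Keep only the path below the skill folder.
            rel = parts[parts.index(skill_name) + 1:]
            if not rel or ".." in rel:
                continue  # skip empty or unsafe paths
            out = target.joinpath(*rel)
            out.parent.mkdir(parents=True, exist_ok=True)
            with zf.open(info) as src, open(out, "wb") as dst:
                shutil.copyfileobj(src, dst)
            copied += 1
    if copied == 0:
        raise FileNotFoundError(f"no '{skill_name}' folder found in {zip_path}")
    return target


def download_and_install(url=DOWNLOAD_URL, skills_dir=Path(".claude/skills")):
    """Download the archive (needs network access) and install the skill."""
    clean_url = url.split("#", 1)[0]  # the #fragment is not part of the request
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "skill.zip"
        urllib.request.urlretrieve(clean_url, archive)
        return install_skill_from_zip(archive, skills_dir)
```

Calling download_and_install() from the project root would place the files under .claude/skills/sglang-glm46-glm47-optimization/. If the archive is already on disk, install_skill_from_zip can be called directly with its path.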
