sglang-ernie45-optimization

Community

PR-backed optimization for Ernie4.5 in SGLang.

Author: BBuf
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

A PR-backed optimization manual for Ernie4.5 and Ernie4.5-VL in SGLang. Use it when Codex needs to audit, debug, extend, or document the SGLang Ernie4.5 multimodal runtime, especially the initial VL landing, the fused Triton rotary path, and the later cos/sin cache rewrite for Ernie4.5-VL.

Core Features & Use Cases

  • PR-backed diff-driven guidance for optimizing Ernie4.5/Ernie4.5-VL in the SGLang runtime, including VL bring-up and rotary embedding improvements.
  • Tracks production-ready changes across files such as srt/models/ernie45_vl.py, rotary_embedding.py, and model_config.py, with references to diff history.
  • Provides a repeatable evidence baseline using references/pr-history.md and diff-based PR cards to support audits and future work.
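For readers unfamiliar with the term, the idea behind a rotary cos/sin cache can be illustrated with a minimal sketch. This is a generic one-dimensional rotary embedding written in plain Python, not the SGLang or Ernie4.5-VL implementation; the function names, the `base` default, and the half-split layout are assumptions chosen for illustration only.

```python
import math


def build_cos_sin_cache(max_positions, rotary_dim, base=10000.0):
    """Precompute cos/sin tables with one row per position.

    Hypothetical helper: each row holds rotary_dim // 2 values, one per
    frequency, so lookups at run time avoid recomputing trig functions.
    """
    half = rotary_dim // 2
    inv_freq = [base ** (-2.0 * i / rotary_dim) for i in range(half)]
    cos = [[math.cos(p * f) for f in inv_freq] for p in range(max_positions)]
    sin = [[math.sin(p * f) for f in inv_freq] for p in range(max_positions)]
    return cos, sin


def apply_rotary(vec, position, cos, sin):
    """Rotate one head vector using the cached rows for `position`.

    Assumes the vector is split into two halves that are rotated pairwise
    (element i of the first half with element i of the second half).
    """
    half = len(vec) // 2
    c, s = cos[position], sin[position]
    x1, x2 = vec[:half], vec[half:]
    out1 = [a * ci - b * si for a, b, ci, si in zip(x1, x2, c, s)]
    out2 = [b * ci + a * si for a, b, ci, si in zip(x1, x2, c, s)]
    return out1 + out2
```

Because each pair of elements undergoes a plain 2D rotation, the vector norm is unchanged and position 0 leaves the vector as it was. The multimodal variant discussed in the PR history may index positions differently, so treat this only as background.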

Quick Start

Review the PR-diff dossiers for Ernie4.5-VL and apply the recommended changes to the SGLang runtime.

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: sglang-ernie45-optimization
Download link: https://github.com/BBuf/AI-Infra-Auto-Driven-SKILLS/archive/main.zip#sglang-ernie45-optimization

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
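
For a manual install, the extraction step can be sketched roughly as follows. This assumes the archive has already been downloaded to a local path and that the skill lives in a folder named after the skill somewhere inside the archive; the function name `install_skill_from_zip` and that layout are assumptions, not something the listing confirms.

```python
import zipfile
from pathlib import Path, PurePosixPath


def install_skill_from_zip(zip_path, skill_name, skills_dir=".claude/skills"):
    """Copy the folder named `skill_name` out of an archive into skills_dir.

    Hypothetical helper: returns the number of files written.
    """
    dest_root = Path(skills_dir) / skill_name
    written = 0
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            # Skip directory entries and anything outside the skill folder.
            if info.is_dir() or skill_name not in parts:
                continue
            # Keep only the path below the skill folder.
            relative = parts[parts.index(skill_name) + 1:]
            if not relative or ".." in relative:
                continue
            target = dest_root.joinpath(*relative)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(info))
            written += 1
    return written
```

A call such as `install_skill_from_zip("main.zip", "sglang-ernie45-optimization")` would then place the skill's files under `.claude/skills/sglang-ernie45-optimization/`, provided the assumed archive layout holds.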
