ad-accuracy-debug
Community
Pinpoint AutoDeploy accuracy regressions
Software Engineering
#tensorrt-llm, #autodeploy, #accuracy debugging, #fp8 quantization, #sharding correctness, #evaluation harness, #kernel investigation
Author: yo-steven
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill helps you debug an AutoDeploy TensorRT-LLM model whose evaluation accuracy drops sharply versus a known-good reference when the root cause is not yet known.
Core Features & Use Cases
- Phase-based accuracy triage: validate the eval harness, reproduce on a small sample, and classify the error pattern before deep investigation.
- Targeted isolation knobs: simplify configuration (e.g., TP/world_size, transforms, multi-streaming) and switch compile backends (CUDA graphs vs torch-simple) to narrow the failing component.
- Root-cause investigation paths: cover common categories like quantization/FP8 or wrapper assumptions, and sharding issues that only appear at world_size > 1.
- Actionable outputs: produce an identified root cause, a minimal reproducer approach, and concrete next steps for a code/config fix.
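The isolation step above can be sketched as a one-knob-at-a-time loop: apply each simplification to the failing configuration, rerun the eval, and record which changes bring accuracy back near the reference. This is a minimal sketch, not the Skill's actual implementation. The config keys (`world_size`, `compile_backend`, `multi_stream`), the `run_eval` callable, and the 0.02 tolerance are all hypothetical placeholders; `fake_eval` is a stub with made-up numbers standing in for a real evaluation run.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, object]


def isolate_failing_knob(
    base: Config,
    simplifications: List[Tuple[str, Config]],
    run_eval: Callable[[Config], float],
    reference_acc: float,
    tolerance: float = 0.02,  # assumed threshold, tune per eval task
) -> List[str]:
    """Return the names of simplifications that restore accuracy
    to within `tolerance` of the reference."""
    restored = []
    for name, override in simplifications:
        cfg = {**base, **override}  # change exactly one knob per run
        acc = run_eval(cfg)
        if reference_acc - acc <= tolerance:
            restored.append(name)
    return restored


# Stub evaluator with invented numbers: accuracy only drops when sharded.
def fake_eval(cfg: Config) -> float:
    return 0.41 if cfg["world_size"] > 1 else 0.71


# Hypothetical config keys; map these onto your real AutoDeploy settings.
base = {"world_size": 4, "compile_backend": "cuda-graph", "multi_stream": True}
candidates = [
    ("world_size=1", {"world_size": 1}),
    ("torch-simple backend", {"compile_backend": "torch-simple"}),
    ("multi-stream off", {"multi_stream": False}),
]
restored = isolate_failing_knob(base, candidates, fake_eval, reference_acc=0.72)
print(restored)
```

With the stub above, only the `world_size=1` variant recovers accuracy, which would point the investigation toward sharding rather than the compile backend or multi-streaming.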
Quick Start
Use the ad-accuracy-debug Skill to compare an AutoDeploy model run against a PyTorch backend reference on the same eval task, then run a 50–100 sample diagnostic that reproduces the evaluator’s exact prompt formatting.
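A small-sample diagnostic of this kind might look like the sketch below: feed the same evaluator-formatted prompts to both backends and bucket each output by error pattern. Everything here is an assumption for illustration. `reference_generate` and `candidate_generate` are hypothetical callables wrapping the PyTorch reference and the AutoDeploy run, and the bucket names and the repetition heuristic are not taken from the Skill itself.

```python
from collections import Counter
from typing import Callable, Iterable


def classify(ref: str, out: str) -> str:
    """Bucket one candidate output against the reference output."""
    if out.strip() == ref.strip():
        return "match"
    if not out.strip():
        return "empty"
    tokens = out.split()
    # Assumed heuristic: long output built from very few distinct tokens.
    if len(tokens) >= 8 and len(set(tokens)) <= len(tokens) // 4:
        return "degenerate"
    return "divergent"


def diagnose(
    prompts: Iterable[str],
    reference_generate: Callable[[str], str],
    candidate_generate: Callable[[str], str],
) -> Counter:
    """Count error patterns over a sample of prompts.

    `prompts` should be the exact strings the evaluator sends,
    including any chat template or few-shot prefix.
    """
    counts: Counter = Counter()
    for prompt in prompts:
        ref = reference_generate(prompt)
        out = candidate_generate(prompt)
        counts[classify(ref, out)] += 1
    return counts


# Stub backends with invented outputs, standing in for real model calls.
ref_outputs = {"p1": "4", "p2": "Paris", "p3": "blue"}
cand_outputs = {"p1": "4", "p2": "", "p3": "the the the the the the the the"}
report = diagnose(ref_outputs, ref_outputs.get, cand_outputs.get)
print(report)
```

In practice the sample would hold 50 to 100 prompts. A report dominated by `empty` or `degenerate` buckets suggests a different class of bug than one dominated by `divergent`, where outputs are fluent but wrong.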
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: ad-accuracy-debug
Download link: https://github.com/yo-steven/skills-exploration-20260522/archive/main.zip#ad-accuracy-debug
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.