plan-mode
Community
End-to-end plan for AEGIS model improvements.
Author: zapabob
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Plan-mode provides a structured, repeatable plan for improving AEGIS model performance on ARC-Challenge and GSM8K and for designing GRPO-based rewards, coordinating the evaluation loop for each.
Core Features & Use Cases
- Plan-mode orchestrates ARC-Challenge improvements (robust answer extraction, timeout optimization, and prompt consistency), GSM8K sanity checks (data contamination, multi-seed stability, zero-shot assessment), and GRPO reward multi-objective design, plus AEGIS v2.5 integration.
- Use cases include running end-to-end improvement pipelines, ABC test automation, and cross-task generalization experiments to raise robustness and benchmark scores.
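As an illustration of what "robust answer extraction" for ARC-Challenge might involve, here is a minimal sketch that pulls a multiple-choice letter out of free-form model output. The function name `extract_choice` and both regex patterns are assumptions made for this example; the skill's actual extraction rules are not documented on this page.

```python
import re
from typing import Optional

# Hypothetical patterns, tried in order of specificity.
_PATTERNS = [
    # "The answer is (B)." / "Answer: C" (letter must be uppercase and stand alone)
    re.compile(r"(?i:answer)\s*(?:(?i:is)\s*:?|:)\s*\(?([A-E])\)?(?![A-Za-z])"),
    # A line that contains nothing but a choice letter, e.g. "B" or "(B)."
    re.compile(r"^\s*\(?([A-E])\)?\.?\s*$", re.MULTILINE),
]


def extract_choice(text: str) -> Optional[str]:
    """Return the last choice letter found in the text, or None if there is none."""
    for pattern in _PATTERNS:
        matches = pattern.findall(text)
        if matches:
            # Prefer the final match: models often revise an answer mid-response.
            return matches[-1]
    return None
```

Returning `None` instead of guessing keeps unparseable outputs visible as a separate failure category, so they are not silently scored as wrong answers.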
Quick Start
Initialize AEGISImprovementPlan and run ARC improvements, GSM8K sanity checks, GRPO optimization, and v2.5 integration steps in sequence.
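The sequence above can be sketched as follows. Only the class name `AEGISImprovementPlan` comes from this page; the class body here is a stand-in, and every method name (`run_arc_improvements`, `run_gsm8k_sanity_checks`, `run_grpo_optimization`, `run_v25_integration`, `run_all`) is hypothetical. Check the skill's source for the real interface.

```python
from typing import List


class AEGISImprovementPlan:
    """Stand-in for the skill's class; all method names are assumptions."""

    def __init__(self) -> None:
        self.completed: List[str] = []

    def _record(self, step: str) -> str:
        # A real step would launch an evaluation loop; this stub only logs it.
        self.completed.append(step)
        return step

    def run_arc_improvements(self) -> str:
        return self._record("arc_improvements")

    def run_gsm8k_sanity_checks(self) -> str:
        return self._record("gsm8k_sanity_checks")

    def run_grpo_optimization(self) -> str:
        return self._record("grpo_optimization")

    def run_v25_integration(self) -> str:
        return self._record("v25_integration")

    def run_all(self) -> List[str]:
        # Run the four stages in the order the Quick Start describes.
        for step in (
            self.run_arc_improvements,
            self.run_gsm8k_sanity_checks,
            self.run_grpo_optimization,
            self.run_v25_integration,
        ):
            step()
        return list(self.completed)


plan = AEGISImprovementPlan()
order = plan.run_all()
```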
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: plan-mode Download link: https://github.com/zapabob/SO8T/archive/main.zip#plan-mode Please download this .zip file, extract it, and install it in the .claude/skills/ directory.