plan-mode

Community

End-to-end plan for AEGIS model improvements.

Author: zapabob
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Plan-mode provides a structured, repeatable plan for improving AEGIS model performance on the ARC-Challenge and GSM8K benchmarks and for designing GRPO-based rewards, coordinating the evaluation loops for each.

Core Features & Use Cases

  • Plan-mode orchestrates four workstreams: ARC-Challenge improvements (robust answer extraction, timeout optimization, and prompt consistency); GSM8K sanity checks (data contamination, multi-seed stability, and zero-shot assessment); multi-objective GRPO reward design; and AEGIS v2.5 integration.
  • Use cases include running end-to-end improvement pipelines, automating ABC tests, and running cross-task generalization experiments to raise robustness and benchmark scores.
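
Two of the checks named above, robust answer extraction for multiple-choice benchmarks such as ARC-Challenge and multi-seed stability, can be illustrated with a minimal sketch. This is not the skill's own code: the function names `extract_choice` and `seed_stability`, the regular expressions, and the A-D answer format are all assumptions made for illustration.

```python
import re
import statistics

# Hypothetical patterns; the skill's actual extraction rules are not documented here.
# Only uppercase letters are accepted so the article "a" is not read as choice A.
_PATTERNS = [
    # "The answer is (B)." / "Answer: C"
    re.compile(r"(?i:answer)\s*(?:(?i:is)|:)?\s*\(?([A-D])\)?(?![A-Za-z])"),
    # A line that contains nothing but a choice letter, e.g. "B" or "(C)"
    re.compile(r"^\s*\(?([A-D])\)?[.:)]?\s*$", re.MULTILINE),
]


def extract_choice(text):
    """Return the chosen letter A-D, or None if no pattern matches."""
    for pattern in _PATTERNS:
        matches = pattern.findall(text)
        if matches:
            return matches[-1]  # prefer the last answer the model stated
    return None


def seed_stability(accuracies):
    """Summarize per-seed accuracies as mean, sample standard deviation, and range."""
    stdev = statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0
    return {
        "mean": statistics.mean(accuracies),
        "stdev": stdev,
        "spread": max(accuracies) - min(accuracies),
    }
```

Returning `None` instead of guessing keeps unparseable outputs visible, so they can be counted separately from wrong answers. A large `spread` across seeds suggests that a score difference between two runs may be noise rather than a real improvement.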

Quick Start

Initialize AEGISImprovementPlan, then run the ARC improvements, GSM8K sanity checks, GRPO optimization, and v2.5 integration steps in sequence.
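
The sequence above might look roughly like the sketch below. The skill's real interface is not shown on this page, so the class here is a stand-in that only shares the name `AEGISImprovementPlan`; every method name (`run_arc_improvements`, `run_gsm8k_sanity_checks`, `run_grpo_optimization`, `run_v25_integration`, `run_all`) is hypothetical, and each step only records that it ran.

```python
from typing import List


class AEGISImprovementPlan:
    """Stand-in for illustration only; not the skill's real class."""

    def __init__(self) -> None:
        self.log: List[str] = []  # names of the steps that have run, in order

    def _record(self, step: str) -> str:
        self.log.append(step)
        return step

    # Hypothetical step methods. A real implementation would run evaluations here.
    def run_arc_improvements(self) -> str:
        return self._record("arc_improvements")

    def run_gsm8k_sanity_checks(self) -> str:
        return self._record("gsm8k_sanity_checks")

    def run_grpo_optimization(self) -> str:
        return self._record("grpo_optimization")

    def run_v25_integration(self) -> str:
        return self._record("v25_integration")

    def run_all(self) -> List[str]:
        """Run the four steps in the order given in the Quick Start."""
        steps = [
            self.run_arc_improvements,
            self.run_gsm8k_sanity_checks,
            self.run_grpo_optimization,
            self.run_v25_integration,
        ]
        for step in steps:
            step()
        return self.log


if __name__ == "__main__":
    plan = AEGISImprovementPlan()
    print(plan.run_all())
```

Keeping the steps in an ordered list makes the sequence explicit and lets a single step be rerun on its own while debugging.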

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: plan-mode
Download link: https://github.com/zapabob/SO8T/archive/main.zip#plan-mode

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.