megaplan-bakeoff

Community

Run fair multi-profile LLM bake-offs with megaplan.

Author: peteromallet
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

It helps you compare different megaplan profile mixes on the same task without overspending or trusting misleading outputs. The result is a fair “winner” chosen by blind, rubric-driven assessment.

Core Features & Use Cases

  • Multi-profile concurrent bake-offs: Run the same idea across N profiles to test which mix delivers better quality per cost.
  • Smoke testing and launch hygiene: Validate routing/model behavior in doc-mode first to catch failures cheaply before code-mode runs.
  • Blind assessment workflow: Enforce sub-agent blinding, rubric scoring, and style quotes so evaluation is consistent and not profile-aware.
  • Pre-merge validation gate: Detect empty diffs and other misdirections before selecting or merging results into main.
  • Reporting patterns for decision-making: Produce comparison tables and cost-adjusted conclusions that summarize trade-offs and production readiness.
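The blind assessment and cost-adjusted reporting ideas above can be illustrated with a minimal Python sketch. It is not the skill's actual implementation: the `Result` record, the `blind` and `rank` helpers, the "Candidate A/B" labels, and the score-per-dollar metric are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Result:
    profile: str     # real profile name, hidden from the assessor (hypothetical field)
    cost_usd: float  # total spend for this run (hypothetical field)
    output: str      # artifact the run produced


def blind(results, seed=0):
    """Shuffle results and give each a neutral label.

    Returns (blinded, key): `blinded` maps label -> output and is safe to
    hand to an assessor; `key` maps label -> Result and stays private
    until scoring is finished.
    """
    rng = random.Random(seed)
    shuffled = list(results)
    rng.shuffle(shuffled)
    labels = [f"Candidate {chr(ord('A') + i)}" for i in range(len(shuffled))]
    blinded = {label: r.output for label, r in zip(labels, shuffled)}
    key = dict(zip(labels, shuffled))
    return blinded, key


def rank(scores, key):
    """Unblind rubric scores and sort by score per dollar, highest first."""
    rows = []
    for label, score in scores.items():
        r = key[label]
        per_dollar = score / r.cost_usd if r.cost_usd > 0 else float("inf")
        rows.append((r.profile, score, r.cost_usd, per_dollar))
    return sorted(rows, key=lambda row: row[3], reverse=True)
```

In this sketch the assessor only ever sees the keys and values of `blinded`; the mapping back to profile names is applied after all rubric scores are recorded, so scoring cannot be influenced by which profile produced an output.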

Quick Start

Tell the megaplan bakeoff runner to execute a light-robustness, blind-scored bake-off for your task idea across your chosen profiles, then pick and merge the winner.
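Before merging a winner, the pre-merge validation gate listed under Core Features checks for empty diffs. A rough sketch of such a check follows. It assumes the diff text comes from `git diff` in unified format, and the function names `changed_lines` and `passes_gate` are hypothetical, not part of megaplan.

```python
def changed_lines(diff_text):
    """Count added and removed lines inside the hunks of `git diff` output.

    File headers ("--- a/..." and "+++ b/...") appear before the first
    "@@" hunk marker of each file, so they are never counted.
    """
    count = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False  # a new file section starts with headers
        elif line.startswith("@@"):
            in_hunk = True   # hunk body follows
        elif in_hunk and line.startswith(("+", "-")):
            count += 1
    return count


def passes_gate(diff_text):
    """Reject candidates whose diff changes nothing."""
    return changed_lines(diff_text) > 0
```

A candidate that produced a polished summary but no actual changes would fail this gate, which is the kind of misleading output the bake-off is meant to catch before a merge.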

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: megaplan-bakeoff
Download link: https://github.com/peteromallet/arnold/archive/main.zip#megaplan-bakeoff

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
