Name: pop-benchmark-runner
Author: jrc1883

System Documentation

What problem does it solve?

The Benchmark Runner skill automates the end-to-end process of running side-by-side benchmarks that compare PopKit-enabled Claude Code against a baseline, delivering objective measurements and reports.

Core Features & Use Cases

Orchestrates paired trials (WITH PopKit vs BASELINE) across separate workspaces to enable real-time comparison.
Collects detailed recordings, runs statistical analysis (t-tests, Cohen's d, confidence intervals), and generates comprehensive reports.
Produces markdown and HTML reports to share insights with stakeholders, CI pipelines, or team dashboards.
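The statistical step named above (t-test, Cohen's d, confidence interval) can be sketched for paired trials as follows. This is a minimal illustration and not the skill's actual implementation: the function name `paired_summary`, its parameters, and the choice of a paired t-statistic with the paired-samples effect size (d_z) are all assumptions. The caller supplies the critical t value from a table for the desired confidence level and n - 1 degrees of freedom.

```python
import math
import statistics


def paired_summary(with_popkit, baseline, t_crit):
    """Summarize paired measurements (hypothetical helper, not PopKit's API).

    with_popkit, baseline: equal-length sequences of one metric per task,
        e.g. completion time, measured under each condition.
    t_crit: two-sided critical t value for n - 1 degrees of freedom,
        looked up by the caller for the desired confidence level.
    """
    if len(with_popkit) != len(baseline):
        raise ValueError("paired samples must have the same length")
    if len(baseline) < 2:
        raise ValueError("need at least two pairs")

    # Work on per-pair differences: WITH PopKit minus BASELINE.
    diffs = [w - b for w, b in zip(with_popkit, baseline)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    std_err = sd_diff / math.sqrt(n)
    t_stat = mean_diff / std_err       # paired t-statistic
    cohens_dz = mean_diff / sd_diff    # Cohen's d for paired samples (d_z)
    margin = t_crit * std_err          # half-width of the confidence interval

    return {
        "n": n,
        "mean_diff": mean_diff,
        "t_stat": t_stat,
        "cohens_dz": cohens_dz,
        "ci": (mean_diff - margin, mean_diff + margin),
    }
```

A confidence interval for the mean difference that excludes zero points in the same direction as a significant paired t-test at the matching level. The effect size indicates how large the difference is relative to its variability across tasks.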

Quick Start

Run a benchmark with the command /popkit-ops:benchmark run <task-id>, replacing <task-id> with the ID of the task to benchmark. The run compares PopKit-enabled Claude Code against the baseline.

Please help me install this Skill:
Name: pop-benchmark-runner
Download link: https://github.com/jrc1883/popkit-ai/archive/main.zip#pop-benchmark-runner
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
