router-benchmark-runner

Community

Benchmark agentic tool-calls across models.

Author: crycriM
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Runs agentic tool-call benchmarks sequentially across every model on the m5-router. It handles warmup after model swaps, output buffering, and missing model files so that evaluation runs are reliable.
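As a rough illustration of the warmup-with-retry idea, here is a minimal sketch. It does not reproduce the skill's own script: `warm_up`, its parameters, and the `probe` callable are hypothetical names, and the caller is assumed to supply whatever readiness check suits the router (for example, a small test request).

```python
import time
from typing import Callable


def warm_up(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it reports ready or the attempts run out.

    `probe` is a hypothetical readiness check supplied by the caller.
    Connection-style errors (OSError and subclasses) count as "not ready yet".
    """
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return True
        except OSError:
            pass  # model is probably still loading after the swap
        if attempt < attempts:
            sleep(delay_s * attempt)  # linear backoff between attempts
    return False
```

A benchmark loop could call `warm_up` right after each model swap and skip the model, or abort, when it returns `False`.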

Core Features & Use Cases

  • Sequential benchmarking across models on the m5-router.
  • Automatic warmup after each controlled model swap, with retries until the model responds.
  • Unbuffered output: logs are redirected to bench_run.log so progress is visible in real time.
  • Pre-run validation of the model paths listed in router-preset.ini, so missing files are caught before the run starts.
  • Skipping of already-completed models on rerun, based on results metadata, to save time.
  • End-to-end workflow: restart router, clear stale results, run benches, monitor progress, and produce bench_summary.json.
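The pre-run validation and skip-on-rerun features above can be sketched as follows. This is an illustration under stated assumptions, not the skill's actual code: the layout of router-preset.ini (one section per model with a `model` key holding a file path) and the results metadata format (one JSON file per model with `model` and `status` fields) are both guesses, and the function names are hypothetical.

```python
import configparser
import json
from pathlib import Path


def missing_model_files(preset_path, key="model"):
    """Return {section: path} for preset entries whose file does not exist.

    Assumption: each INI section describes one model and stores its file
    path under `key`. The real router-preset.ini layout may differ.
    """
    parser = configparser.ConfigParser(interpolation=None)
    with open(preset_path, encoding="utf-8") as fh:
        parser.read_file(fh)
    missing = {}
    for section in parser.sections():
        raw = parser.get(section, key, fallback=None)
        if raw is not None and not Path(raw).expanduser().is_file():
            missing[section] = raw
    return missing


def completed_models(results_dir):
    """Collect model names whose result file is marked complete.

    Assumption: results are stored as *.json files containing
    {"model": ..., "status": "complete"}; this schema is hypothetical.
    """
    done = set()
    for path in Path(results_dir).glob("*.json"):
        try:
            meta = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or partial file: treat as not completed
        if isinstance(meta, dict) and meta.get("status") == "complete" and "model" in meta:
            done.add(meta["model"])
    return done
```

Running both checks before the first benchmark lets a run fail fast on a missing file and avoids repeating models that already finished.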

Quick Start

Restart the router, clear stale results, run the bench script, monitor bench_run.log, and review bench_summary.json when the run finishes.
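The "run the bench script, monitor the log" step might look like the sketch below, which sends a child process's output straight to a log file instead of through a pipe. The helper name `run_logged` is hypothetical, and the command for the bench script is left to the caller because the source does not name it.

```python
import os
import subprocess


def run_logged(cmd, log_path="bench_run.log"):
    """Run `cmd`, appending its stdout and stderr to `log_path`.

    Writing directly to a file avoids holding output in a parent-side pipe.
    PYTHONUNBUFFERED only has an effect when the child is a Python program;
    other programs may need their own unbuffered option.
    """
    env = dict(os.environ)
    env["PYTHONUNBUFFERED"] = "1"
    with open(log_path, "ab") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT, env=env)
    return result.returncode
```

While the run is in progress, the log can be followed from another terminal with `tail -f bench_run.log`.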

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: router-benchmark-runner
Download link: https://github.com/crycriM/hermes-skills/archive/main.zip#router-benchmark-runner

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
