evals-live-run
Official
Run repeatable live-device skill evals & debugging
Author: clawperator
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
The evals-live-run skill is a single entry point for repeatable skill-proving runs on live devices. It consolidates the harnessed evaluation flows and the retained-log workflows in one place.
Core Features & Use Cases
- Runs Solax orchestrated-cold-start evals against real devices for repeatable results.
- Wraps the Pack A Samsung android-version benchmarks and other eval scenarios, with explicit device targeting.
- Enables debugging of retained logs and selective replay of newest eval batches.
Quick Start
Run the Solax cold-start eval on a real device using the included helper scripts to kick off an end-to-end evaluation.
Dependency Matrix
Required Modules
None required
Components
scripts, references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: evals-live-run
Download link: https://github.com/clawperator/clawperator/archive/main.zip#evals-live-run
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
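For a manual install, the download, extract, and copy steps can be sketched in Python. This is a minimal sketch, not the official installer. It assumes that the archive contains a directory named evals-live-run somewhere in its tree, since the location inside the repository is not stated here. The function names `find_skill_dir`, `install_skill`, and `download_and_install` are hypothetical helpers introduced only for illustration.

```python
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the install prompt, without the "#evals-live-run" fragment.
ARCHIVE_URL = "https://github.com/clawperator/clawperator/archive/main.zip"
SKILL_NAME = "evals-live-run"


def find_skill_dir(extracted_root: Path, skill_name: str) -> Path:
    """Return the shallowest directory named skill_name under extracted_root.

    Assumption: the skill lives in a directory with the same name as the skill.
    """
    matches = sorted(p for p in extracted_root.rglob(skill_name) if p.is_dir())
    if not matches:
        raise FileNotFoundError(f"no directory named {skill_name!r} in archive")
    return min(matches, key=lambda p: len(p.parts))


def install_skill(extracted_root: Path, skills_dir: Path,
                  skill_name: str = SKILL_NAME) -> Path:
    """Copy the skill directory into skills_dir and return the installed path."""
    source = find_skill_dir(extracted_root, skill_name)
    target = skills_dir / skill_name
    skills_dir.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok lets a re-install overwrite files from an earlier copy.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


def download_and_install(skills_dir: Path = Path(".claude/skills")) -> Path:
    """Download the archive, extract it to a temp dir, and install the skill."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "main.zip"
        urllib.request.urlretrieve(ARCHIVE_URL, archive)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(tmp)
        return install_skill(Path(tmp), skills_dir)
```

Calling `download_and_install()` from a project root would place the skill under .claude/skills/evals-live-run, provided the directory-name assumption holds. The download is kept separate from `install_skill` so that the copy step can also be run against an archive that was extracted by hand.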