robotics-experiment-evaluation

Community

Turn robotics claims into rigorous metrics

Author: yuewangg
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This skill helps you design and present evaluation experiments so that the claims in your robotics or embodied-AI paper are supported by measurable, fair, and reproducible results.

Core Features & Use Cases

  • Claim-to-experiment mapping: translate each claimed module or contribution into at least one direct experiment or diagnostic rather than relying on overall system performance.
  • Metric protocol coverage: choose appropriate metric families for VLN/navigation, SLAM/state estimation, and control/RL, including absolute and relative pose error (APE/RPE), Success weighted by Path Length (SPL), success and collision rates, and runtime, alongside ablations.
  • Fairness and statistical reporting: ensure consistent splits, modality parity, budget parity, correct handling of failures/crashes/timeouts, and clear guidance on seeds, confidence intervals, and significance wording.
  • Paper-ready reporting rules: produce guidance for LaTeX-ready tables and figures, with explicit measurement scope, episode counts, and simulation-vs-hardware labeling.
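
As an illustration of the metric and fairness bullets above, here is a minimal Python sketch of SPL and success rate in which crashes and timeouts count as failures rather than being dropped, plus a percentile bootstrap confidence interval over per-seed scores. It is not taken from the skill itself: the `Episode` record, its field names, and the function names are hypothetical, and the bootstrap settings (2000 resamples, 95% interval) are assumed defaults.

```python
import random
from dataclasses import dataclass
from statistics import mean


@dataclass
class Episode:
    """Hypothetical per-episode record; field names are assumptions."""
    success: bool         # must be False for crashes and timeouts as well
    shortest_path: float  # shortest start-to-goal path length, in metres
    agent_path: float     # length of the path the agent actually took


def spl(episodes):
    """Success weighted by Path Length, averaged over ALL episodes.

    Failed episodes contribute 0 and stay in the denominator, so a
    method cannot improve its score by crashing on hard episodes.
    """
    if not episodes:
        raise ValueError("need at least one episode")
    scores = []
    for ep in episodes:
        if ep.shortest_path <= 0:
            raise ValueError("shortest_path must be positive")
        if not ep.success:
            scores.append(0.0)
            continue
        scores.append(ep.shortest_path / max(ep.agent_path, ep.shortest_path))
    return mean(scores)


def success_rate(episodes):
    """Fraction of episodes that succeeded, failures included."""
    if not episodes:
        raise ValueError("need at least one episode")
    return mean(1.0 if ep.success else 0.0 for ep in episodes)


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-seed scores."""
    if not values:
        raise ValueError("need at least one value")
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(values)
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    episodes = [
        Episode(True, 10.0, 10.0),   # optimal path
        Episode(True, 10.0, 20.0),   # success, but twice the optimal length
        Episode(False, 10.0, 5.0),   # timeout
        Episode(False, 10.0, 0.0),   # crash at start
    ]
    print("SPL:", spl(episodes))
    print("Success rate:", success_rate(episodes))
    print("95% CI over seeds:", bootstrap_ci([0.40, 0.50, 0.60]))
```

With only a handful of seeds a bootstrap interval is coarse, so it is worth reporting the number of seeds and episodes next to any interval.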

Quick Start

To generate a complete experiment-and-metrics plan for a robotics paper, provide your paper’s claims, target tasks (e.g., VLN, SLAM, or control), and available hardware or simulator setup. Then ask the skill for an evaluation design that includes baselines, metrics, ablations, run counts, and reporting rules.
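
One way to sanity-check such a plan is to confirm that every claim is tied to at least one experiment. The sketch below is only an illustration under assumed names: the `Experiment` record, its fields, and `uncovered_claims` are hypothetical and do not describe the skill's actual output format.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    """Hypothetical plan entry; field names are assumptions."""
    name: str
    claims: list[str]   # claims this experiment directly tests
    metrics: list[str]  # e.g. ["SPL", "success rate"]
    seeds: int = 1      # number of independent runs


def uncovered_claims(claims, experiments):
    """Return the claims that no experiment in the plan tests directly."""
    covered = {claim for exp in experiments for claim in exp.claims}
    return [claim for claim in claims if claim not in covered]


if __name__ == "__main__":
    claims = ["planner improves path efficiency", "mapper reduces drift"]
    plan = [
        Experiment(
            name="planner ablation",
            claims=["planner improves path efficiency"],
            metrics=["SPL", "success rate"],
            seeds=5,
        ),
    ]
    # Any claim listed here still needs a direct experiment or diagnostic.
    print(uncovered_claims(claims, plan))
```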

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: robotics-experiment-evaluation
Download link: https://github.com/yuewangg/agent-research-skills/archive/main.zip#robotics-experiment-evaluation

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.