experiment-monitor
Community
Monitor experiments and auto-update status.
Author: Shiien
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill automates monitoring of active experiments by reading logs in logs/experiments/*.yaml, verifying process health, and surfacing up-to-date status to reduce manual chasing of stalled runs.
Core Features & Use Cases
- Load active experiments from logs/experiments/*.yaml and monitor their status.
- Check process liveness locally or remotely, tail recent logs, and extract key metrics (loss, accuracy, epoch, step).
- Detect errors such as OOM, NaN, or tracebacks; update status to running/completed/failed; optionally notify.
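The log-analysis steps in the list above can be sketched as follows. This is a minimal illustration, not the Skill's actual implementation: the regular expressions, the `key=value` log format, and the function names (`extract_metrics`, `detect_errors`, `infer_status`) are all assumptions made for the example, and the process-liveness result is passed in as a plain boolean rather than checked here.

```python
import re

# Hypothetical error signatures; the Skill's real detection rules are not documented here.
ERROR_PATTERNS = {
    "oom": re.compile(r"out of memory", re.IGNORECASE),
    "nan": re.compile(r"\bnan\b", re.IGNORECASE),
    "traceback": re.compile(r"Traceback \(most recent call last\)"),
}

# Assumes metrics are logged as "key=value" or "key: value" with numeric values.
METRIC_RE = re.compile(
    r"\b(loss|accuracy|epoch|step)\s*[=:]\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def extract_metrics(lines):
    """Return the most recent value seen for each metric key."""
    metrics = {}
    for line in lines:
        for key, value in METRIC_RE.findall(line):
            metrics[key] = float(value)  # later lines overwrite earlier ones
    return metrics


def detect_errors(lines):
    """Return the sorted names of all error signatures found in the lines."""
    return sorted(
        name
        for name, pattern in ERROR_PATTERNS.items()
        if any(pattern.search(line) for line in lines)
    )


def infer_status(lines, process_alive):
    """Map log contents and process liveness to running/completed/failed."""
    if detect_errors(lines):
        return "failed"
    return "running" if process_alive else "completed"
```

Given the tail of a log as a list of strings, `extract_metrics` yields the latest loss, accuracy, epoch, and step, while `infer_status` treats any detected error as a failure and otherwise relies on whether the process is still alive. A finished process with no errors is assumed to have completed, which is a simplification.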
Quick Start
Monitor all active experiments and report their latest status at a glance.
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install the Skill automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: experiment-monitor
Download link: https://github.com/Shiien/Self-Evolved-Research-Framework/archive/main.zip#experiment-monitor
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.