diagnose-failure
CommunityIdentify failed runs and suggest fixes.
Authorqualit527
Version1.0.0
Installs0
System Documentation
What problem does it solve?
Quickly identify why a training run stalled or failed and provide targeted, non-invasive guidance to fix the issue without applying changes automatically.
Core Features & Use Cases
- Root-cause analysis: Examines logs and configurations to identify the most probable failure mode.
- Actionable recommendations: Proposes concrete fixes or patches to recover runs.
- Non-destructive guidance: Delivers steps for manual review and execution by the user.
Quick Start
Provide the run_dir or round_dir of the failed run to the skill and request a diagnosis.
Dependency Matrix
Required Modules
None requiredComponents
Standard package💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: diagnose-failure Download link: https://github.com/qualit527/qec-ai-decoder/archive/main.zip#diagnose-failure Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
Agent Skills Search Helper
Install a tiny helper to your Agent, search and equip skill from 510,000+ vetted skills library on demand.