Self-Healing Server
Community
Keep your infrastructure running, automatically
Category: Software Engineering
Tags: #sre #auto-remediation #disk cleanup #incident reporting #infrastructure monitoring #docker recovery #ssl renewal
Author: TravisLeeeeee
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Prevent small infrastructure failures from turning into outages by monitoring server health and applying safe, bounded auto-remediation actions when common problems occur.
Core Features & Use Cases
- Proactive Health Monitoring: Watches CPU, RAM, disk, network, and process counts to detect early warning conditions.
- Auto-Remediation with Guardrails: Restarts crashed containers with exponential backoff, performs disk cleanup while preserving recent logs, and handles hung/zombie processes with escalation after repeated failures.
- Operational Incident Reporting: Produces remediation reports that include before/after metrics and maintains an incident log with root-cause analysis.
- Example Use Case: When a Docker container exits due to OOM kills, the skill attempts controlled restarts, records the incident details, reports before/after health metrics, and alerts for human review if the failure persists.
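The restart behavior described above (controlled restarts with exponential backoff, then escalation to a human) can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names `backoff_delays` and `restart_with_backoff`, the default delays, and the shape of the result dictionary are all assumptions made for this example. The `restart` callable is a placeholder; in practice it might wrap a command such as `docker restart <container>` and return whether the container came back healthy.

```python
import time
from typing import Callable, List


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   max_delay: float = 60.0, attempts: int = 5) -> List[float]:
    """Return the wait in seconds before each restart attempt.

    Hypothetical defaults: start at 1s, double each time, cap at 60s.
    """
    return [min(base * factor ** i, max_delay) for i in range(attempts)]


def restart_with_backoff(restart: Callable[[], bool],
                         delays: List[float],
                         sleep: Callable[[float], None] = time.sleep) -> dict:
    """Wait, then call `restart`, until it succeeds or the delays run out.

    `restart` is a caller-supplied placeholder that returns True on recovery.
    If every attempt fails, the result asks for human review (escalate=True).
    """
    waited: List[float] = []
    for attempt, delay in enumerate(delays, start=1):
        sleep(delay)            # back off before each attempt
        waited.append(delay)
        if restart():
            return {"recovered": True, "attempts": attempt,
                    "waited": waited, "escalate": False}
    return {"recovered": False, "attempts": len(delays),
            "waited": waited, "escalate": True}


if __name__ == "__main__":
    # Simulated container that only comes back on the third restart.
    outcomes = iter([False, False, True])
    result = restart_with_backoff(lambda: next(outcomes),
                                  backoff_delays(attempts=4),
                                  sleep=lambda s: None)  # skip real waiting in the demo
    print(result)
```

Capping the delay and bounding the number of attempts keeps the remediation from looping forever on a container that cannot recover, which is the point at which the failure is handed to a person.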
Quick Start
Copy the self-healing-server folder into your OpenClaw workspace by running:
cp -r self-healing-server/ ~/.openclaw/workspace/
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: Self-Healing Server
Download link: https://github.com/TravisLeeeeee/awesome-openclaw-personas/archive/main.zip#self-healing-server
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.