md-raid-deadlock-recovery
CommunityRecover RAID5/6 deadlocks safely, fast.
Software Engineering#linux#mdadm#cloudflare tunnel#production incident#container restart#raid5#deadlock recovery
Authorsaintgo7
Version1.0.0
Installs0
System Documentation
What problem does it solve?
This Skill prevents and resolves Linux mdadm RAID5/6 stripe-cache deadlocks that freeze mdcheck and system IO, where disks look healthy (SMART PASSED) but fsync/COMMIT and container writes hang until the system is rebooted.
Core Features & Use Cases
- Diagnose the deadlock pattern: Detect high load with iowait, many blocked (D-state) processes, frozen sync_action, near-zero sync speed, and kernel stack evidence consistent with
raid5_get_active_stripelock contention. - Make reboot safe and repeatable: Disable RAID auto rechecks (mdcheck/mdadm autocheck), skip root fsck when appropriate, ensure SSH backup access via Cloudflare Tunnel, and verify container restart policies so services come back.
- Verify recovery immediately after reboot: Re-check blocked process counts, confirm RAID sync_action returns to idle, validate
/datawriteability, and confirm critical service reachability.
Quick Start
Run the diagnose script with your md device name to confirm the deadlock pattern, then follow the pre-reboot hardening script before rebooting.
Dependency Matrix
Required Modules
None requiredComponents
scripts
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: md-raid-deadlock-recovery Download link: https://github.com/saintgo7/claude-skills/archive/main.zip#md-raid-deadlock-recovery Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
Agent Skills Search Helper
Install a tiny helper to your Agent, search and equip skill from 471,000+ vetted skills library on demand.