triage-alerts
Community
Diagnose alert failures and fix faster.
Software Engineering
#root cause analysis #kubernetes #observability #alertmanager #gatus #alert triage #incident remediation
Author: david-driscoll
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill helps you rapidly triage operational incidents by correlating live uptime failures from Gatus with active alerting signals from Alertmanager, then translating those patterns into actionable root-cause hypotheses and remediation steps.
Core Features & Use Cases
- Live failure discovery: Pulls current failing endpoint statuses from Gatus and active alerts from Alertmanager.
- Root-cause patterning: Categorizes issues into likely causes (e.g., ingress/auth failures, DNS/webhook mismatches, rollout stalls, crash loops, secret sync problems, HelmRelease rollback loops).
- Cluster-specific verification and remediation: Recommends commands that target the correct cluster's kubeconfig, both to inspect Helm/Kustomize health and pod/event state and to apply targeted fixes.
- Common remediation playbooks: Includes concrete actions such as breaking stuck HelmRelease conditions, forcing ExternalSecret resync, triggering Flux reconciles, and restarting/rescheduling workloads.
- Verification loop: Guides you through polling Gatus until the number of failing endpoints drops to zero.
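The discovery and patterning features above can be sketched roughly as follows. This is a minimal illustration, not the Skill's actual implementation. It assumes the Gatus statuses endpoint (`/api/v1/endpoints/statuses`) returns a JSON list whose entries carry a `results` array with the newest result last, and that Alertmanager's v2 API (`/api/v2/alerts`) is reachable. The base URLs, the `CATEGORY_KEYWORDS` table, and the function names are hypothetical.

```python
import json
import urllib.request
from typing import Any

# Hypothetical keyword rules; the Skill's real matching logic is not shown in the source.
CATEGORY_KEYWORDS = {
    "crash loop": ("crashloop",),
    "rollout stall": ("rollout", "replicasmismatch"),
    "secret sync": ("externalsecret", "secret"),
    "HelmRelease rollback loop": ("helmrelease",),
    "ingress/auth": ("ingress", "auth", "401", "403"),
    "DNS/webhook": ("dns", "webhook"),
}


def fetch_json(url: str) -> Any:
    """GET a URL and decode the JSON body (no auth; add headers if your setup needs them)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def failing_endpoints(statuses: list[dict]) -> list[str]:
    """Return the key (or name) of every endpoint whose latest result failed.

    Assumes each entry has a "results" list ordered oldest to newest.
    Endpoints with no results yet are skipped.
    """
    failing = []
    for endpoint in statuses:
        results = endpoint.get("results") or []
        if results and not results[-1].get("success", False):
            failing.append(endpoint.get("key") or endpoint.get("name", "unknown"))
    return failing


def categorize(alert: dict) -> str:
    """Map an Alertmanager alert to the first category whose keyword appears in it."""
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    text = " ".join([*labels.values(), *annotations.values()]).lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "uncategorized"


def triage(gatus_url: str, alertmanager_url: str) -> dict:
    """Fetch both sources and return failing endpoints plus alerts grouped by category."""
    statuses = fetch_json(f"{gatus_url}/api/v1/endpoints/statuses")
    alerts = fetch_json(
        f"{alertmanager_url}/api/v2/alerts?active=true&silenced=false&inhibited=false"
    )
    grouped: dict[str, list[str]] = {}
    for alert in alerts:
        name = alert.get("labels", {}).get("alertname", "unknown")
        grouped.setdefault(categorize(alert), []).append(name)
    return {"failing": failing_endpoints(statuses), "alerts": grouped}
```

Calling `triage("https://gatus.example.internal", "https://alertmanager.example.internal")` (placeholder hosts) would return a dictionary you can scan for overlap between failing endpoints and alert categories. Keyword matching is deliberately crude here; treat each category as a hypothesis to verify against the cluster, not a diagnosis.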
Quick Start
Run the triage process to fetch Gatus failing endpoints and Alertmanager active alerts, then follow the remediation steps until the failing count reaches 0.
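One way to picture the "until the failing count reaches 0" step is a bounded polling loop. The sketch below is an assumption about how such a loop might look, not the Skill's own code: `wait_until_recovered`, its parameters, and the default interval are all hypothetical. The failing count is supplied by a callable so the loop stays independent of how Gatus is queried.

```python
import time
from typing import Callable


def wait_until_recovered(
    count_failing: Callable[[], int],
    interval_s: float = 30.0,
    max_polls: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until count_failing() returns 0 or max_polls is exhausted.

    Returns True once the count reaches zero, False if it never does.
    The sleep function is injectable so the loop can be tested without waiting.
    """
    for attempt in range(max_polls):
        if count_failing() == 0:
            return True
        # Skip the pause after the final attempt; there is nothing left to wait for.
        if attempt < max_polls - 1:
            sleep(interval_s)
    return False
```

In practice `count_failing` would wrap whatever call retrieves the current Gatus failures and returns their count. Capping the number of polls keeps a remediation that did not work from looping forever; a `False` result is the signal to go back to the root-cause hypotheses.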
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: triage-alerts Download link: https://github.com/david-driscoll/home-operations/archive/main.zip#triage-alerts Please download this .zip file, extract it, and install it in the .claude/skills/ directory.