principle-resiliency
Community
Keep services useful during partial failures.
Category: Software Engineering
Tags: health checks, fault tolerance, graceful degradation, resiliency, bulkheads, cascading failure, failure domains
Author: lugassawan
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Resiliency guidance helps prevent systems from failing completely when dependencies, networks, or components degrade, by designing for partial failure and controlled degradation instead of cascading outages.
Core Features & Use Cases
- Failure domain clarity: Identify what fails together to avoid “unnamed blast radii” and to reason about cascading failure paths.
- Blast-radius control with bulkheads: Isolate resource pools per dependency (connection pools, semaphores, thread pools, queues) to prevent one slow dependency from starving others.
- Correct failure handling and fallback strategy: Choose fail-fast vs fail-soft appropriately, design graceful degradation modes, and ensure health checks (liveness vs readiness) don’t create restart storms or probe storms.
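The bulkhead and fail-fast versus fail-soft ideas above can be sketched in a few lines. This is a minimal illustration, not the skill's own implementation: the `Bulkhead` class, the `BulkheadFull` exception, and the `fallback` parameter are hypothetical names introduced here, and the concurrency limit is an arbitrary example value.

```python
import threading


class BulkheadFull(Exception):
    """Raised when a dependency's bulkhead has no free slots (hypothetical name)."""


class Bulkhead:
    """Caps concurrent calls to one dependency so it cannot starve the others."""

    def __init__(self, name, max_concurrent):
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn, *args, fallback=None, **kwargs):
        # Fail fast: never queue behind a saturated dependency.
        if not self._slots.acquire(blocking=False):
            if fallback is not None:
                # Fail soft: serve a degraded answer instead of an error.
                return fallback()
            raise BulkheadFull(self.name)
        try:
            # Errors raised by fn itself propagate; the fallback only
            # covers the "no free slot" case in this sketch.
            return fn(*args, **kwargs)
        finally:
            self._slots.release()


# One pool per dependency: a slow recommendations service can exhaust
# its own slots without touching the payments pool.
recommendations = Bulkhead("recommendations", max_concurrent=1)
payments = Bulkhead("payments", max_concurrent=1)
```

An optional feature such as recommendations would usually pass a `fallback` (for example a cached or empty result), while a call that must not silently degrade, such as a payment, would omit it and let `BulkheadFull` surface to the caller.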
Quick Start
Ask for a resiliency review of your service design focused on failure domains, bulkheads, fail-fast versus fail-soft decisions, graceful degradation options, and health-check behavior under upstream outages.
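One issue such a review commonly looks for is a liveness check that depends on upstream services, which can turn an upstream outage into a restart storm. The sketch below shows the separation under stated assumptions: `HealthState`, `critical`, and `dependency_up` are hypothetical names, and how the flags get updated (background probes, circuit-breaker state, and so on) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class HealthState:
    """Tracks process health separately from dependency health (illustrative)."""

    process_ok: bool = True
    # Dependencies without which the service cannot serve any useful traffic.
    critical: frozenset = frozenset()
    # Last known reachability per dependency name.
    dependency_up: dict = field(default_factory=dict)

    def liveness(self) -> bool:
        # Liveness answers "should this process be restarted?".
        # It ignores upstreams: restarting does not fix a remote outage.
        return self.process_ok

    def readiness(self) -> bool:
        # Readiness answers "should traffic be routed here right now?".
        # Only critical dependencies gate it; optional ones degrade gracefully.
        return self.process_ok and all(
            self.dependency_up.get(name, False) for name in self.critical
        )


state = HealthState(
    critical=frozenset({"database"}),
    dependency_up={"database": False, "recommendations": False},
)
```

With the database down, `state.liveness()` stays true while `state.readiness()` is false, so an orchestrator would stop routing traffic without restarting the process. Once the database recovers, readiness returns even if the optional recommendations service is still down.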
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: principle-resiliency
Download link: https://github.com/lugassawan/swe-workbench/archive/main.zip#principle-resiliency
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.