reliability-observability-gate
Community
Make changes safe, observable, and recoverable
Software Engineering
Tags: observability, reliability, alerting, runbook, capacity planning, incident readiness, SLI SLO
Author: machenjie
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
A reliability and observability gate blocks a production change from shipping until it has a measured SLI/SLO impact assessment, bounded failure modes, evidence-backed incident readiness, and a testable recovery plan.
Core Features & Use Cases
- SLI/SLO impact and error-budget discipline: Ensures every user-facing path has an SLI and that error budgets are defined and respected before release.
- Performance, capacity, and cost guardrails: Verifies latency budgets, concurrency limits, saturation signals, and cost/capacity exposure are explicitly captured.
- Telemetry and alerting correctness: Requires structured logs with trace context propagation, bounded metric label cardinality, and multi-window multi-burn-rate alerting.
- Resilience controls validation: Checks circuit breakers, rate limits, timeouts, retries, fallbacks, dead-letter queue (DLQ) and queue-depth monitoring, and tested recovery/rollback criteria.
- Use Case: Before deploying a new feature that changes a critical API endpoint and adds async background processing, define SLI/SLO targets, resilience controls, and evidence-based recovery steps to reduce incident risk.
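The multi-window multi-burn-rate alerting named in the feature list can be illustrated with a minimal Python sketch. It is not taken from this skill's reference files: the names `BurnRateRule`, `burn_rate`, and `firing_rules` are hypothetical, and the thresholds (14.4 over 1h/5m, 6 over 6h/30m) are commonly cited starting points for a 30-day SLO window, assumed here rather than prescribed by the skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurnRateRule:
    """One alert rule: both windows must burn budget faster than the threshold."""
    long_window: str
    short_window: str
    threshold: float


# Assumed defaults for a 30-day SLO window; tune them for your own service.
DEFAULT_RULES = (
    BurnRateRule(long_window="1h", short_window="5m", threshold=14.4),
    BurnRateRule(long_window="6h", short_window="30m", threshold=6.0),
)


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how many times faster than 'exactly on budget' errors are occurring."""
    budget = 1.0 - slo_target  # allowed error ratio, e.g. 0.001 for a 99.9% SLO
    if budget <= 0:
        raise ValueError("SLO target must be below 1.0")
    return error_ratio / budget


def firing_rules(error_ratios, slo_target, rules=DEFAULT_RULES):
    """Return the rules whose long AND short windows both exceed the threshold.

    error_ratios maps a window name (e.g. "1h") to the observed error ratio
    over that window. Requiring the short window as well lets the alert
    stop firing soon after the problem stops.
    """
    fired = []
    for rule in rules:
        long_burn = burn_rate(error_ratios[rule.long_window], slo_target)
        short_burn = burn_rate(error_ratios[rule.short_window], slo_target)
        if long_burn >= rule.threshold and short_burn >= rule.threshold:
            fired.append(rule)
    return fired
```

With a 99.9% SLO, an error ratio of 2% over both the last hour and the last five minutes is a burn rate of about 20, so the 1h/5m rule fires; if the five-minute ratio has already dropped back to near zero, no rule fires even though the hourly ratio is still elevated.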
Quick Start
Use this skill to review a proposed production change and produce a complete reliability-and-observability plan that includes SLI/SLO assessment, telemetry requirements, alerting design, capacity/cost guardrails, and tested recovery and rollback criteria.
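As a rough illustration of what "complete" could mean for such a plan, the sketch below checks that each of the five parts listed above is present. The `GatePlan` class, its field names, and the pass/fail rule are hypothetical; the skill's actual output format is not described on this page.

```python
from dataclasses import dataclass, fields


@dataclass
class GatePlan:
    """Sections a reliability-and-observability plan should carry (hypothetical names)."""
    sli_slo_assessment: str = ""
    telemetry_requirements: str = ""
    alerting_design: str = ""
    capacity_cost_guardrails: str = ""
    recovery_and_rollback_criteria: str = ""


def missing_sections(plan: GatePlan) -> list[str]:
    """Return the names of sections that are empty or whitespace-only."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


def gate_passes(plan: GatePlan) -> bool:
    """The gate passes only when every section has some content."""
    return not missing_sections(plan)
```

A plan that fills in only the SLI/SLO assessment and the alerting design would fail the gate, and `missing_sections` would name the three sections still to be written.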
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: reliability-observability-gate
Download link: https://github.com/machenjie/rd-skills/archive/main.zip#reliability-observability-gate
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.