dataops-disaster-recovery-review

Community

Set RTO/RPO and rehearse resilient data recovery

Author: ivanshamaev
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill helps data teams design and validate disaster recovery (DR) plans by translating business availability goals into concrete backup, replication, restore, and failover procedures for data platforms.

Core Features & Use Cases

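  • Define RTO/RPO by platform component: Sets a recovery time objective (RTO, the maximum acceptable time to restore service) and a recovery point objective (RPO, the maximum acceptable window of lost data) for each system, such as the Airflow metadata DB, data lake, Kafka, Trino catalog, and Kubernetes workloads.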
  • Operationalize DR with tested mechanisms: Covers practical approaches for Airflow DB backup/restore (pg_dump/pg_restore), Kafka cross-region replication (MirrorMaker 2), S3 data lake replication, Kubernetes backup/restore (Velero), and runbooks that include post-restore data reconciliation.
  • Run DR “game day” and verify readiness: Includes guidance to execute periodic DR tests, validate backup integrity through restores, measure actual recovery time, and catch anti-patterns that create false confidence.
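
To make the first and third points concrete, here is a minimal Python sketch of how per-component RTO/RPO targets might be recorded and then compared against what a game-day drill actually measured. It is an illustration only, not part of the Skill: the class names, the evaluate function, the component names, and every target and measured value are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryTarget:
    """Hypothetical recovery objectives for one platform component."""
    component: str
    rto: timedelta  # maximum acceptable time to restore service
    rpo: timedelta  # maximum acceptable window of lost data


@dataclass(frozen=True)
class DrillResult:
    """What a game-day restore measured for one component."""
    component: str
    recovery_time: timedelta  # time from simulated failure to restored service
    data_loss: timedelta      # gap between the failure and the newest recovered data


def evaluate(targets, results):
    """Return a list of findings; an empty list means every target was met."""
    measured = {r.component: r for r in results}
    findings = []
    for t in targets:
        r = measured.get(t.component)
        if r is None:
            # An untested component has no evidence that it can be recovered.
            findings.append(f"{t.component}: not exercised in this drill")
            continue
        if r.recovery_time > t.rto:
            findings.append(f"{t.component}: RTO missed ({r.recovery_time} > {t.rto})")
        if r.data_loss > t.rpo:
            findings.append(f"{t.component}: RPO missed ({r.data_loss} > {t.rpo})")
    return findings


# Illustrative targets and drill measurements (all values are made up).
targets = [
    RecoveryTarget("airflow-metadata-db", rto=timedelta(hours=1), rpo=timedelta(minutes=15)),
    RecoveryTarget("kafka", rto=timedelta(minutes=30), rpo=timedelta(minutes=5)),
    RecoveryTarget("data-lake", rto=timedelta(hours=4), rpo=timedelta(hours=1)),
]
results = [
    DrillResult("airflow-metadata-db", recovery_time=timedelta(minutes=90), data_loss=timedelta(minutes=10)),
    DrillResult("kafka", recovery_time=timedelta(minutes=20), data_loss=timedelta(minutes=2)),
]

findings = evaluate(targets, results)
for finding in findings:
    print(finding)
```

Reporting a component that was never exercised as a finding, rather than silently passing it, reflects the point above about anti-patterns that create false confidence: a backup that has never been restored says nothing about the actual recovery time.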

Quick Start

Ask the AI to produce an RTO/RPO-driven DR runbook for your platform by reviewing Airflow, Kafka, the data lake, Trino, and Kubernetes against your target recovery timelines.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: dataops-disaster-recovery-review
Download link: https://github.com/ivanshamaev/de-agent-skills/archive/main.zip#dataops-disaster-recovery-review

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
