airflow-starrocks-etl-best-practices

Community

Idempotent Airflow→StarRocks ETL runs

Author: ivanshamaev
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill prevents duplicate or corrupted data when you repeatedly run Airflow ETL jobs that load into StarRocks, especially during retries and backfills.

Core Features & Use Cases

  • Idempotent DAG loading patterns: partition replacement via INSERT OVERWRITE, deterministic Broker Load labels, and retry-safe execution.
  • Deduplication and upsert handling: stable dedup logic before insert and safe merge/upsert behavior for primary-key tables.
  • Production reliability and observability: task retries with exponential backoff, SLA miss callbacks, freshness checks, and lineage tagging for audit trails.
  • Operational safety for backfills: dependency ordering, concurrency controls, catchup safety checklist, and dynamic partition management with duplicate prevention.
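As a rough illustration of the first bullet, the sketch below builds a deterministic Broker Load label and a partition-scoped INSERT OVERWRITE statement. It is not taken from the Skill itself: the helper names `broker_load_label` and `insert_overwrite_sql`, the `event_date` column, and the table, partition, and staging names are all hypothetical. It assumes that StarRocks rejects a label already used by a successful load in the same database, which is what makes a retry with the same label safe.

```python
import re
from datetime import date


def broker_load_label(dag_id: str, task_id: str, logical_date: date) -> str:
    """Build a label that depends only on the DAG, task, and data interval.

    Retries and re-runs of the same interval produce the same label, so a
    load that already succeeded is rejected instead of being applied twice.
    """
    raw = f"{dag_id}_{task_id}_{logical_date:%Y%m%d}"
    # Keep only letters, digits, and underscores (a conservative character
    # set chosen for this sketch).
    return re.sub(r"[^A-Za-z0-9_]", "_", raw)


def insert_overwrite_sql(table: str, partition: str, staging: str,
                         logical_date: date) -> str:
    """Render an INSERT OVERWRITE that replaces exactly one partition.

    The event_date filter column is a hypothetical name for illustration.
    """
    return (
        f"INSERT OVERWRITE {table} PARTITION ({partition}) "
        f"SELECT * FROM {staging} "
        f"WHERE event_date = '{logical_date:%Y-%m-%d}'"
    )


if __name__ == "__main__":
    run_date = date(2024, 1, 15)
    print(broker_load_label("orders-etl", "load.orders", run_date))
    print(insert_overwrite_sql("dw.orders", "p20240115",
                               "staging.orders", run_date))
```

Because both helpers take the logical date rather than the wall-clock time, a backfill for a past interval regenerates the same label and the same target partition as the original run.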

Quick Start

Ask the agent to generate an Airflow DAG for StarRocks that implements deterministic Broker Load labels, uses INSERT OVERWRITE for partition replacement, adds retries with exponential backoff, includes an SLA miss callback, and performs freshness validation after each run.
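For orientation, a generated DAG might configure retries and freshness validation along these lines. This is a sketch under assumptions, not the Skill's actual output: the specific values, the `on_sla_miss` callback body, and the `is_fresh` helper are hypothetical. The dictionary keys are Airflow operator arguments, and the callback uses the five-argument shape that Airflow 2.x expects for `sla_miss_callback`; SLA support differs in other Airflow versions, so check the version you run.

```python
from datetime import datetime, timedelta, timezone

# Arguments of this shape are typically passed as DAG(default_args=...).
DEFAULT_ARGS = {
    "retries": 3,
    "retry_delay": timedelta(minutes=1),
    # Lengthen the wait between successive retries.
    "retry_exponential_backoff": True,
    # Upper bound on the wait between retries.
    "max_retry_delay": timedelta(minutes=30),
    # Airflow 2.x task-level SLA; the value is illustrative.
    "sla": timedelta(hours=1),
}


def on_sla_miss(dag, task_list, blocking_task_list, slas, blocking_tis):
    """Callback in the shape Airflow 2.x uses for DAG(sla_miss_callback=...)."""
    print(f"SLA missed in DAG {dag.dag_id}: {task_list}")


def is_fresh(latest_loaded_at: datetime, now: datetime,
             max_lag: timedelta) -> bool:
    """Return True when the newest loaded row is within the allowed lag."""
    return now - latest_loaded_at <= max_lag


if __name__ == "__main__":
    now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
    print(is_fresh(now - timedelta(minutes=20), now, timedelta(hours=1)))
    print(is_fresh(now - timedelta(hours=3), now, timedelta(hours=1)))
```

In a real DAG, `latest_loaded_at` would come from a query against the target StarRocks table (for example, the maximum load timestamp), and a failed check would fail the task so the run is flagged.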

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: airflow-starrocks-etl-best-practices
Download link: https://github.com/ivanshamaev/de-agent-skills/archive/main.zip#airflow-starrocks-etl-best-practices

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
