airflow-starrocks-backfill
Community
Backfill StarRocks safely with Airflow
Software Engineering
#partitioning, #airflow, #idempotency, #backfill, #starrocks, #broker load, #insert overwrite
Author: ivanshamaev
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It helps you reprocess and reload historical StarRocks partitions reliably without duplicates, race conditions, or data loss when upstream data or transformation logic has changed.
Core Features & Use Cases
- Idempotent partition backfill: Uses deterministic Broker Load labels per (table, date) so reruns can safely skip already-finished loads.
- Atomic partition replacement: Recomputes partitions with StarRocks INSERT OVERWRITE semantics (and ensures correct preconditions like partition existence).
- Airflow-driven orchestration: Provides two DAG patterns (catchup-based and programmatic date-range) with safety guardrails like max_active_runs=1.
- Operational safety: Includes partition pre-creation, FINISHED/CANCELLED polling, and clear anti-patterns (e.g., append-based reloads).
- Progress tracking & observability: Suggests a backfill tracking table and example queries to monitor durations and row counts.
- Concurrency control for speed: Shows parallel backfill with a configurable worker limit to avoid overwhelming StarRocks BE.
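The idempotency and atomic-replacement ideas in the list above can be sketched in plain Python. This is a minimal illustration, not the skill's actual code: the function names, the `backfill_` label prefix, and the `pYYYYMMDD` partition naming are all assumptions made for the example, and only the FINISHED and CANCELLED states named above are treated as terminal.

```python
from datetime import date
from typing import Optional


def broker_load_label(table: str, day: date) -> str:
    """Build a deterministic label: the same (table, day) always gives the same string.

    Hypothetical naming scheme; non-alphanumeric characters become underscores.
    """
    safe_table = "".join(c if c.isalnum() else "_" for c in table)
    return f"backfill_{safe_table}_{day:%Y%m%d}"


def next_action(label_state: Optional[str]) -> str:
    """Decide what a rerun should do, given the last known state of the label.

    None means no load with this label was found.
    """
    if label_state == "FINISHED":
        return "skip"      # already loaded, rerun is a no-op
    if label_state is None or label_state == "CANCELLED":
        return "submit"    # never ran, or failed: (re)submit the load
    return "wait"          # any other state: still in flight, keep polling


def overwrite_partition_sql(table: str, day: date, select_sql: str) -> str:
    """Render an INSERT OVERWRITE for one daily partition.

    Assumes partitions are named pYYYYMMDD and already exist.
    """
    partition = f"p{day:%Y%m%d}"
    return f"INSERT OVERWRITE {table} PARTITION ({partition}) {select_sql}"
```

Because the label depends only on the table and the date, a retried task produces the same label as the first attempt, so a rerun can look up its state and skip work that already finished instead of appending the same rows a second time.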
Quick Start
Run the backfill DAG in Airflow with max_active_runs=1 for a defined historical date range, ensuring partitions exist first and using deterministic Broker Load labels to make reruns safe and idempotent.
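Outside of Airflow, the same flow (enumerate the date range, make sure every partition exists, then load with a bounded number of workers) might look like the sketch below. `ensure_partition` and `load_day` are hypothetical callables supplied by the caller, and `max_workers=1` is chosen here only to mirror the serial behaviour of `max_active_runs=1`.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta
from typing import Callable, Dict, Iterator


def date_range(start: date, end: date) -> Iterator[date]:
    """Yield every day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def run_backfill(
    start: date,
    end: date,
    ensure_partition: Callable[[date], None],
    load_day: Callable[[date], object],
    max_workers: int = 1,
) -> Dict[date, object]:
    """Pre-create all partitions, then load each day with bounded parallelism."""
    days = list(date_range(start, end))

    # Partitions are created up front, serially, before any load starts.
    for day in days:
        ensure_partition(day)

    # The pool size caps how many loads hit the cluster at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(load_day, days))

    return dict(zip(days, results))
```

Raising `max_workers` trades safety margin for speed; a small value keeps concurrent loads from overwhelming the StarRocks BE nodes.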
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: airflow-starrocks-backfill Download link: https://github.com/ivanshamaev/de-agent-skills/archive/main.zip#airflow-starrocks-backfill Please download this .zip file, extract it, and install it in the .claude/skills/ directory.