airflow-starrocks-pipeline
Community
Orchestrate StarRocks loads with Airflow
Software Engineering
#orchestration #airflow #starrocks #broker load #routine load #stream load #etl pipelines
Author: ivanshamaev
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It helps you reliably orchestrate StarRocks ingestion workflows from Airflow, including async Broker Load polling and Routine Load lifecycle management, so ETL pipelines don’t get stuck or load with stale metadata.
Core Features & Use Cases
- Broker Load orchestration (S3 batch ingestion): Create partitions, trigger Broker Load jobs with a deterministic label, and poll SHOW LOAD ... until FINISHED.
- Stream Load micro-batch pattern: Send NDJSON payloads via HTTP with a computed label and validate StarRocks response status.
- Routine Load lifecycle control: Pause for schema changes, apply DDL, and resume while handling job state transitions safely.
- Post-load optimization: Run ANALYZE TABLE for partitions to refresh statistics and improve query planning.
- Airflow integration details: Use MySqlHook for MySQL-compatible DDL/DML and Http/requests for Stream Load; pass labels via XCom and template partition-aware DAG parameters.
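As a rough illustration of the "deterministic label" and "validate response status" ideas in the list above, here is a minimal Python sketch. The helper names (build_load_label, check_stream_load_response) are hypothetical and are not part of the skill itself. The label format and the character filter are assumptions chosen to be conservative, and the status strings are the ones StarRocks is generally documented to return for Stream Load; verify them against your StarRocks version.

```python
import re


def build_load_label(dag_id: str, task_id: str, logical_date: str) -> str:
    """Return the same label for every retry of the same task run.

    Reusing one label per (DAG, task, date) lets StarRocks reject a
    duplicate submission instead of loading the same data twice.
    """
    raw = f"{dag_id}__{task_id}__{logical_date}"
    # Assumption: keep only letters, digits, and underscores to stay
    # within a conservative subset of allowed label characters.
    return re.sub(r"[^A-Za-z0-9_]", "_", raw)


def check_stream_load_response(body: dict) -> dict:
    """Raise if a parsed Stream Load JSON response does not indicate success."""
    status = body.get("Status")
    # "Publish Timeout" is treated as success here: the transaction was
    # committed, but the data may become visible with a delay.
    if status in ("Success", "Publish Timeout"):
        return body
    # A retry that reuses a label whose earlier job already finished
    # is treated as an idempotent success.
    if status == "Label Already Exists" and body.get("ExistingJobStatus") == "FINISHED":
        return body
    raise RuntimeError(
        f"Stream Load failed: status={status!r}, message={body.get('Message')!r}"
    )
```

In an Airflow task, the label would typically be built from templated values such as the DAG id and the run's date, returned from the task so it lands in XCom, and then reused by the downstream task that checks the load.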
Quick Start
Ask the agent to generate an Airflow DAG that performs partition-aware Broker Load into a StarRocks table, polls until completion, then runs ANALYZE TABLE ... PARTITION (...) and basic row-count validation for the same date.
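The "polls until completion" step in that request could look roughly like the sketch below. It is not the DAG the agent generates, only an outline of the polling logic under stated assumptions: wait_for_load and fetch_state are hypothetical names, the timeout and interval defaults are arbitrary, and FINISHED and CANCELLED are assumed to be the terminal Broker Load states reported by SHOW LOAD.

```python
import time
from typing import Callable, Optional


def wait_for_load(
    fetch_state: Callable[[], Optional[str]],
    timeout_s: float = 3600.0,
    poll_interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll a load job's state until it finishes, is cancelled, or times out.

    fetch_state is any callable returning the job's current state string
    (or None if the job is not visible yet). It is injected so the loop
    can be tested without a database connection.
    """
    deadline = clock() + timeout_s
    while True:
        state = fetch_state()
        if state == "FINISHED":
            return state
        if state == "CANCELLED":
            raise RuntimeError("Load job was cancelled")
        # Any other state (for example PENDING or LOADING) means keep waiting.
        if clock() >= deadline:
            raise TimeoutError(f"Load job still in state {state!r} after {timeout_s}s")
        sleep(poll_interval_s)
```

Inside an Airflow task, fetch_state would typically wrap a MySqlHook query that runs SHOW LOAD filtered by the job's label and reads the state column from the result. Raising on CANCELLED or on timeout fails the task, so the downstream ANALYZE TABLE and row-count checks do not run against a partial load.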
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: airflow-starrocks-pipeline
Download link: https://github.com/ivanshamaev/de-agent-skills/archive/main.zip#airflow-starrocks-pipeline
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.