databricks-spark-structured-streaming

Community

Master Spark Structured Streaming in production.

Author: datasciencemonkey
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Production Spark Structured Streaming pipelines are often complex and brittle, which makes reliability and performance hard to guarantee. This guide provides a structured approach to building robust streaming pipelines, applying stateful processing, and tuning throughput, latency, and fault tolerance.

Core Features & Use Cases

  • Patterns for Kafka ingestion, stream-to-Delta writes, stream-stream joins, and windowed analytics.
  • Production best practices covering watermarking, state store tuning, triggers, and monitoring for real-world workloads.
  • Real-world use cases including real-time dashboards, event-driven ETL, and streaming analytics at scale.

Quick Start

Create a minimal Spark Structured Streaming job that reads from Kafka, applies a watermark, and writes to Delta.
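
The job described above could look roughly like the following PySpark sketch. It is an illustration under stated assumptions, not the skill's own code: the function names, the 10-minute watermark, the 5-minute window, the 1-minute trigger, and every path, topic, and broker address are hypothetical placeholders. It also assumes the Kafka source and Delta Lake are available on the cluster, as they are on Databricks.

```python
from typing import Dict


def kafka_source_options(bootstrap_servers: str, topic: str) -> Dict[str, str]:
    """Build options for Spark's Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "latest",
    }


def start_kafka_to_delta(spark, bootstrap_servers, topic, table_path, checkpoint_path):
    """Start a Kafka -> watermark -> windowed count -> Delta streaming query.

    All thresholds below are illustrative assumptions; tune them per workload.
    """
    # Imported lazily so the options helper works without Spark installed.
    from pyspark.sql import functions as F

    raw = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
    )

    # The Kafka source exposes binary key/value columns plus a record timestamp.
    events = raw.select(
        F.col("key").cast("string").alias("key"),
        F.col("value").cast("string").alias("value"),
        F.col("timestamp").alias("event_time"),
    ).withWatermark("event_time", "10 minutes")  # tolerate 10 minutes of lateness

    # Windowed count per key; the watermark lets Spark drop old window state.
    counts = events.groupBy(F.window("event_time", "5 minutes"), "key").count()

    return (
        counts.writeStream.format("delta")
        .outputMode("append")  # a window is emitted once the watermark passes it
        .option("checkpointLocation", checkpoint_path)
        .trigger(processingTime="1 minute")
        .start(table_path)
    )
```

A call such as `start_kafka_to_delta(spark, "broker:9092", "events", "/tmp/delta/events", "/tmp/checkpoints/events")` would return a `StreamingQuery`. The checkpoint location is what allows the query to resume from its last committed offsets after a restart, so it should be a durable path that is unique to this query.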

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: databricks-spark-structured-streaming
Download link: https://github.com/datasciencemonkey/coding-agents-databricks-apps/archive/main.zip#databricks-spark-structured-streaming

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
