spark-python-data-source

Community

Create Spark data sources with Python.

Author: datasciencemonkey
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Build custom Spark data sources for external systems, enabling Spark to read from and write to databases, APIs, and message queues that lack native connectors.

Core Features & Use Cases

  • Flat, Python-based Spark DataSource pattern built from explicit DataSource, DataSourceReader, and DataSourceWriter classes, plus their streaming variants.
  • Supports batch and streaming connectors to external systems, with minimal dependencies and clear, readable implementation patterns.
  • Use case: Create a connector to a REST API or a database, enabling Spark to ingest live data and push updates.
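As an illustration of the read side of this pattern, here is a minimal sketch, assuming the `pyspark.sql.datasource` module that ships with recent PySpark releases (Apache Spark 4.0 or later, as far as I know). The names `CounterReader`, `CounterDataSource`, the `counter` format name, and the `rows` option are hypothetical and not taken from the skill itself; the reader generates synthetic rows where a real connector would call the external system.

```python
try:
    from pyspark.sql.datasource import DataSource, DataSourceReader
except ImportError:
    # Fallback so the sketch can be read and exercised without PySpark installed.
    DataSource = DataSourceReader = object


class CounterReader(DataSourceReader):
    """Hypothetical reader: yields synthetic rows in place of an external call."""

    def __init__(self, options):
        # Spark passes option values as strings, so convert explicitly.
        self.rows = int(options.get("rows", "3"))

    def read(self, partition):
        # Each yielded tuple must match the schema declared by the data source.
        for i in range(self.rows):
            yield (i, f"row-{i}")


class CounterDataSource(DataSource):
    """Hypothetical data source that wires the reader into Spark."""

    @classmethod
    def name(cls):
        # Short name used with spark.read.format(...).
        return "counter"

    def schema(self):
        # Schema given as a DDL string.
        return "id INT, label STRING"

    def reader(self, schema):
        # self.options is populated by Spark from .option(...) calls.
        return CounterReader(self.options)
```

With a running session named `spark`, such a class would typically be registered with `spark.dataSource.register(CounterDataSource)` and then read with `spark.read.format("counter").option("rows", "5").load()`.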

Quick Start

Create a Python-based Spark data source following the provided skeleton to connect Spark to an external system.
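The skeleton itself is not reproduced on this page, so the following is only a rough sketch of what the write side of such a connector might look like, again assuming the `pyspark.sql.datasource` module from recent PySpark releases. `ExampleWriter`, `ExampleSink`, `BatchCommit`, the `endpoint` and `batch_size` options, and the `send_batch` helper are all hypothetical; `send_batch` stands in for whatever client call the external system needs.

```python
from dataclasses import dataclass

try:
    from pyspark.sql.datasource import (
        DataSource,
        DataSourceWriter,
        WriterCommitMessage,
    )
except ImportError:
    # Fallback so the sketch can be read and exercised without PySpark installed.
    DataSource = DataSourceWriter = WriterCommitMessage = object


@dataclass
class BatchCommit(WriterCommitMessage):
    """Hypothetical commit message reporting how many rows a task sent."""
    count: int


def send_batch(endpoint, batch):
    """Hypothetical stand-in for the real client call; returns rows 'sent'."""
    return len(batch)


class ExampleWriter(DataSourceWriter):
    def __init__(self, options):
        # Option values arrive as strings.
        self.endpoint = options.get("endpoint", "")
        self.batch_size = int(options.get("batch_size", "2"))

    def write(self, iterator):
        # Runs once per partition; buffer rows and flush in batches.
        sent, batch = 0, []
        for row in iterator:
            batch.append(row)
            if len(batch) >= self.batch_size:
                sent += send_batch(self.endpoint, batch)
                batch = []
        if batch:
            # Flush the final partial batch.
            sent += send_batch(self.endpoint, batch)
        return BatchCommit(count=sent)

    def commit(self, messages):
        # Called on the driver after all tasks succeed; nothing to finalize here.
        pass

    def abort(self, messages):
        # Called on the driver if any task fails; a real sink might roll back.
        pass


class ExampleSink(DataSource):
    @classmethod
    def name(cls):
        return "example_sink"

    def writer(self, schema, overwrite):
        return ExampleWriter(self.options)
```

After registering the class with `spark.dataSource.register(ExampleSink)`, a DataFrame could be written with `df.write.format("example_sink").mode("append").save()`. Batching inside `write` is one design choice among several; it keeps the number of calls to the external system small without holding a whole partition in memory.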

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: let Claude install the skill automatically by copying the text below and pasting it into Claude Code.

Please help me install this Skill:
Name: spark-python-data-source
Download link: https://github.com/datasciencemonkey/coding-agents-databricks-apps/archive/main.zip#spark-python-data-source

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
