datafusion-python

Official

Use DataFusion Python bindings for dataframes.

Author: apache
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

The DataFusion Python bindings let Python developers run DataFusion queries through either the DataFrame API or SQL. Queries execute in-process on a query engine built on Apache Arrow, so Python workloads get fast analytics without a separate server.

Core Features & Use Cases

  • Data loading from Parquet, CSV, and JSON sources into a SessionContext.
  • DataFrame-based query construction with lazy evaluation and a rich expression API (Expr, col, lit, functions as F).
  • SQL-to-DataFrame mappings and interoperability with pandas, polars, and other in-memory data frames.
  • Idiomatic patterns and common pitfalls guidance for practical analytics tasks.

Quick Start

Create a SessionContext, load data, and run a SQL or DataFrame query.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install the Skill automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: datafusion-python
Download link: https://github.com/apache/datafusion-python/archive/main.zip#datafusion-python

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
