s3-parquet-sampling

Community

Sample large Parquet data on S3 with caching.

Author: arm2arm
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Large Parquet datasets stored on S3 are expensive to load in full for exploration or visualization. This Skill samples the data and caches the reduced result locally, which lowers memory use and avoids repeated downloads during analysis.

Core Features & Use Cases

  • Sample and reduce Parquet data on S3 to sizes suitable for interactive exploration.
  • Cache the reduced dataset locally as Parquet to avoid repeated downloads.
  • Use Dask for scalable processing and hvPlot/Datashader for scalable visualizations in data-heavy workflows.
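The sample-then-cache pattern described above can be sketched roughly as follows. This is a minimal illustration, not the Skill's actual implementation: the function names `cache_path_for` and `load_sample`, the `.cache` directory, and the bucket URI in the final comment are all hypothetical, and the sketch assumes `dask[dataframe]`, `s3fs`, `pyarrow`, and `pandas` are installed when `load_sample` is called.

```python
import hashlib
from pathlib import Path


def cache_path_for(s3_uri, fraction, seed, columns=None, cache_dir=".cache"):
    """Derive a deterministic local file name from the sampling parameters.

    Hypothetical helper: the same URI, fraction, seed, and column selection
    always map to the same cache file, so a repeated request can reuse it.
    """
    cols = ",".join(columns) if columns else "*"
    key = f"{s3_uri}|{fraction}|{seed}|{cols}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()[:16]
    return Path(cache_dir) / f"sample-{digest}.parquet"


def load_sample(s3_uri, fraction=0.01, seed=42, columns=None, cache_dir=".cache"):
    """Return a pandas DataFrame sample, reading the local cache when present."""
    # Heavy imports are deferred so cache_path_for works without them.
    import pandas as pd

    target = cache_path_for(s3_uri, fraction, seed, columns, cache_dir)
    if target.exists():
        # Cache hit: no S3 traffic at all.
        return pd.read_parquet(target)

    import dask.dataframe as dd  # reading s3:// URIs also requires s3fs

    # Lazily open the dataset, optionally restricted to a few columns.
    ddf = dd.read_parquet(s3_uri, columns=columns)
    # Draw a random fraction of rows, then materialize it as pandas.
    sample = ddf.sample(frac=fraction, random_state=seed).compute()

    # Persist the reduced dataset as Parquet for later sessions.
    target.parent.mkdir(parents=True, exist_ok=True)
    sample.to_parquet(target, index=False)
    return sample


# Example call (hypothetical bucket; needs credentials and network access):
# df = load_sample("s3://my-bucket/catalog/", fraction=0.01, columns=["ra", "dec"])
```

Keying the cache file on the sampling parameters means that changing the fraction, seed, or column selection produces a new file instead of silently reusing a stale sample.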

Quick Start

Run the s3-parquet-sampling workflow to sample a large dataset on S3 and cache the result locally.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: s3-parquet-sampling
Download link: https://github.com/arm2arm/AstroAgentAssistant/archive/main.zip#s3-parquet-sampling

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
