s3-parquet-sampling
Community
Sample large Parquet data on S3 with caching.
Author: arm2arm
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Large Parquet datasets stored on S3 are expensive to load fully for exploration or visualization. This Skill provides sampling and local caching to substantially reduce memory use and accelerate analysis.
Core Features & Use Cases
- Sample and reduce Parquet data on S3 to sizes suitable for interactive exploration.
- Cache the reduced dataset locally as Parquet to avoid repeated downloads.
- Use Dask for scalable processing and hvPlot/Datashader for scalable visualizations in data-heavy workflows.
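The sample-then-cache pattern described above can be sketched roughly as follows. This is a minimal illustration, not the Skill's actual implementation: the function names `cache_path_for` and `load_sample`, the `cache` directory, the default 1% fraction, and the example bucket URI are all hypothetical. It assumes `dask[dataframe]`, `pyarrow`, and `s3fs` are installed and that S3 credentials are available from the environment.

```python
import hashlib
from pathlib import Path


def cache_path_for(s3_uri: str, frac: float, cache_dir: str = "cache") -> Path:
    """Derive a stable local cache location from the source URI and sample fraction."""
    # Hypothetical naming scheme: same URI + fraction always maps to the same path.
    key = hashlib.sha256(f"{s3_uri}|{frac}".encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"sample_{key}.parquet"


def load_sample(s3_uri: str, frac: float = 0.01, cache_dir: str = "cache", seed: int = 42):
    """Return a Dask DataFrame with a random sample, reusing the local cache if present."""
    # Imported lazily so the path helper above works without Dask installed.
    import dask.dataframe as dd

    target = cache_path_for(s3_uri, frac, cache_dir)
    if target.exists():
        # Cache hit: read the reduced dataset from local disk, no S3 traffic.
        return dd.read_parquet(str(target))

    # Cache miss: lazily open the full dataset on S3 (needs s3fs for s3:// URIs).
    ddf = dd.read_parquet(s3_uri)
    # Row-wise random sample; frac is the approximate fraction of rows kept.
    sample = ddf.sample(frac=frac, random_state=seed)

    # Dask writes a directory of Parquet part files at the target path.
    target.parent.mkdir(parents=True, exist_ok=True)
    sample.to_parquet(str(target), write_index=False)
    return dd.read_parquet(str(target))


if __name__ == "__main__":
    # Placeholder URI; replace with a real dataset before running.
    df = load_sample("s3://example-bucket/large-dataset.parquet", frac=0.01)
    print(df.head())
```

Keying the cache on both the URI and the fraction means a different sample size gets its own cache entry instead of silently reusing a stale one. The returned Dask DataFrame can then be handed to hvPlot (for example after `import hvplot.dask`) for plotting, with Datashader rasterization where the sample is still large.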
Quick Start
Run the s3-parquet-sampling workflow to sample a large dataset on S3 and cache the result locally.
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: s3-parquet-sampling
Download link: https://github.com/arm2arm/AstroAgentAssistant/archive/main.zip#s3-parquet-sampling
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.