trino-file-layout-optimization
Community
Fix slow Trino scans with smarter file layout
Author: ivanshamaev
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill helps you speed up Trino queries that are slowed by inefficient Iceberg file layout, especially when many small files inflate split counts or when file-level pruning is ineffective.
Core Features & Use Cases
- Select the optimal table file format (Parquet vs ORC) for Trino+Iceberg workloads, defaulting to Parquet and using ORC when migrating from Hive or when ORC-specific stripe statistics are required.
- Tune file sizing and row group behavior by adjusting iceberg.target-max-file-size and Parquet row group settings to improve compression and pruning effectiveness.
- Increase data skipping and reduce wasted reads using sorted_by (min/max skipping) and Bloom filter indexes for equality-heavy predicates.
- Detect and remediate small-file pathologies with targeted Iceberg OPTIMIZE compaction strategies, plus split/parallelism tuning and partition granularity guidance.
- Use health-check SQL to quantify small files, oversized files, and overall table health from Iceberg metadata tables.
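As a rough illustration of the health-check and compaction features above, the sketch below classifies data files by size and, when small files dominate, builds a Trino compaction statement. It is not taken from the Skill itself: the function names, the 32 MB / 1024 MB cutoffs, the 30% ratio, and the table name lake.sales.orders are all hypothetical. The input is assumed to be the file_size_in_bytes values read from the table's "$files" metadata table.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass
class LayoutHealth:
    total_files: int
    small_files: int
    oversized_files: int
    small_file_ratio: float


def summarize_file_sizes(sizes_bytes, small_mb=32, oversized_mb=1024):
    """Classify data files by size; the cutoffs are illustrative assumptions."""
    total = len(sizes_bytes)
    small = sum(1 for s in sizes_bytes if s < small_mb * MB)
    oversized = sum(1 for s in sizes_bytes if s > oversized_mb * MB)
    ratio = small / total if total else 0.0
    return LayoutHealth(total, small, oversized, ratio)


def compaction_statement(table, health, max_small_ratio=0.3, threshold_mb=128):
    """Return a Trino optimize statement when small files dominate, else None."""
    if health.small_file_ratio <= max_small_ratio:
        return None
    return (
        f"ALTER TABLE {table} "
        f"EXECUTE optimize(file_size_threshold => '{threshold_mb}MB')"
    )


# Hypothetical sizes, as they might come from the file_size_in_bytes column
# of the table's "$files" metadata table.
sizes = [4 * MB, 8 * MB, 16 * MB, 256 * MB, 2048 * MB]
health = summarize_file_sizes(sizes)
print(health)
print(compaction_statement("lake.sales.orders", health))
```

With these sample sizes, three of the five files fall under the small-file cutoff, so the sketch emits an optimize statement. In practice the cutoffs should be chosen relative to the configured target file size rather than fixed values.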
Quick Start
Ask the AI to propose an Iceberg layout plan for your Trino workload and generate the specific CREATE TABLE/property changes and OPTIMIZE commands to reduce small files and improve pruning.
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: trino-file-layout-optimization
Download link: https://github.com/ivanshamaev/de-agent-skills/archive/main.zip#trino-file-layout-optimization
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.