hugging-face-datasets
Official
Manage Hugging Face datasets with SQL.
System Documentation
What problem does it solve?
Hugging Face datasets often require manual setup for repo creation, configuration, and data processing. This Skill provides an end-to-end workflow to initialize, configure, and edit datasets, plus SQL-based discovery, transformation, and export capabilities.
Core Features & Use Cases
- Dataset lifecycle management: initialize repos, configure system prompts, and manage content with templates.
- SQL-based querying and transformation: query HF datasets using DuckDB, describe schemas, sample data, join datasets, and export to Parquet/JSONL.
- HF Hub integration: push results to new datasets, manage access, and organize multi-split workflows.
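To make the SQL features above concrete, here is a minimal sketch of the kind of DuckDB statements such a workflow could issue. It is not taken from the Skill's scripts: the helper names (hf_path, sample_query, export_query, run) and the data/*.parquet file layout are assumptions for illustration. It relies only on DuckDB's hf:// path scheme and its COPY ... TO statement.

```python
def hf_path(repo_id: str, file_glob: str) -> str:
    """Build a DuckDB-readable hf:// URL for files in a Hub dataset repo."""
    return f"hf://datasets/{repo_id}/{file_glob}"


def sample_query(repo_id: str, file_glob: str, limit: int = 5) -> str:
    """SQL that previews the first rows of the matching dataset files."""
    return f"SELECT * FROM '{hf_path(repo_id, file_glob)}' LIMIT {int(limit)}"


def export_query(select_sql: str, out_file: str) -> str:
    """Wrap a SELECT in COPY ... TO so DuckDB writes the result as Parquet."""
    return f"COPY ({select_sql}) TO '{out_file}' (FORMAT PARQUET)"


def run(sql: str):
    """Execute SQL with DuckDB (requires the duckdb package and network access)."""
    import duckdb  # imported lazily so the builders above work without it

    return duckdb.sql(sql)


# Hypothetical file layout; adjust the glob to match the actual repo.
preview = sample_query("your-username/your-dataset", "data/*.parquet", limit=3)
export = export_query(preview, "sample.parquet")
print(preview)
print(export)
```

Passing either string to run() would execute it against the Hub. The builders are kept separate so the SQL can be inspected before anything is downloaded.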
Quick Start
Run these commands in order:
- Create a new dataset: uv run scripts/dataset_manager.py init
- Bootstrap it with chat templates: uv run scripts/dataset_manager.py quick_setup --template chat --repo_id "your-username/your-dataset"
- Inspect the result: uv run scripts/sql_manager.py query --dataset "your-username/your-dataset" --sql "SELECT * FROM data"
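The Quick Start steps can also be chained from a small driver script. The sketch below is an assumption about how one might automate them, not part of the Skill. The command lines are copied from the Quick Start, while run_quick_start and its dry_run flag are hypothetical. By default it only prints the commands, so it is safe to run outside the Skill's directory.

```python
import shlex
import subprocess

REPO_ID = "your-username/your-dataset"  # placeholder from the Quick Start

# The three Quick Start commands as argv lists (no shell quoting needed).
QUICK_START = [
    ["uv", "run", "scripts/dataset_manager.py", "init"],
    ["uv", "run", "scripts/dataset_manager.py", "quick_setup",
     "--template", "chat", "--repo_id", REPO_ID],
    ["uv", "run", "scripts/sql_manager.py", "query",
     "--dataset", REPO_ID, "--sql", "SELECT * FROM data"],
]


def run_quick_start(dry_run: bool = True) -> list[str]:
    """Print each command; execute it only when dry_run is False."""
    rendered = []
    for argv in QUICK_START:
        line = shlex.join(argv)  # shell-safe rendering for display
        rendered.append(line)
        print(line)
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failure
    return rendered


lines = run_quick_start()  # dry run: prints the commands without executing
```

Calling run_quick_start(dry_run=False) from the Skill's root directory would execute the steps in sequence and raise on the first command that exits with a non-zero status.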
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: hugging-face-datasets
Download link: https://github.com/huggingface/skills/archive/main.zip#hugging-face-datasets
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.