public-dataset-exploration
Official
Find raw public data and turn it into seeds.
System Documentation
What problem does it solve?
This Skill helps you locate and shortlist raw public datasets in a new domain when you don’t yet have training-ready data available.
Core Features & Use Cases
- Market discovery across platforms: Search Kaggle, Hugging Face, and GitHub for domain-relevant raw or semi-structured datasets suitable for conversion.
- Training-readiness filtering: Identify “relevant but not training-ready” sources by checking whether the data can produce forecasting questions or document-style Q&A rather than already being instruction-tuned or synthetic.
- Seed creation workflow planning: Convert downloaded files into samples using the SDK’s conversion utilities and assemble an input dataset for downstream pipelines.
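The training-readiness filter described above can be sketched as a simple keyword heuristic. This is an illustration under stated assumptions, not the Skill's actual logic: the `Candidate` record, the marker word lists, and the `shortlist` function are all hypothetical, and the metadata is supplied by hand rather than fetched from Kaggle, Hugging Face, or GitHub.

```python
import re
from dataclasses import dataclass, field

# Hypothetical marker lists (assumptions, not taken from the Skill); tune per domain.
TRAINING_READY_MARKERS = {"instruction", "instruct", "synthetic", "sft", "alpaca"}
RAW_MARKERS = {"raw", "log", "logs", "stats", "records", "archive", "results", "csv"}


@dataclass
class Candidate:
    """Hand-entered metadata for one dataset found during a search."""
    name: str
    source: str  # e.g. "kaggle", "huggingface", "github"
    description: str
    tags: list[str] = field(default_factory=list)


def tokens(c: Candidate) -> set[str]:
    """Lowercased word tokens from the name, description, and tags."""
    text = " ".join([c.name, c.description, *c.tags]).lower()
    return set(re.findall(r"[a-z0-9]+", text))


def is_training_ready(c: Candidate) -> bool:
    """True if the metadata mentions instruction-tuned or synthetic data."""
    return bool(tokens(c) & TRAINING_READY_MARKERS)


def score(c: Candidate, domain_terms: list[str]) -> tuple[int, int]:
    """Return (domain-term hits, raw-data marker hits) for ranking."""
    toks = tokens(c)
    domain_hits = len(toks & {t.lower() for t in domain_terms})
    raw_hits = len(toks & RAW_MARKERS)
    return domain_hits, raw_hits


def shortlist(candidates: list[Candidate], domain_terms: list[str], limit: int = 3) -> list[Candidate]:
    """Keep domain-relevant candidates that are not training-ready, best first."""
    kept = [
        c for c in candidates
        if not is_training_ready(c) and score(c, domain_terms)[0] > 0
    ]
    return sorted(kept, key=lambda c: score(c, domain_terms), reverse=True)[:limit]


candidates = [
    Candidate("football-match-stats", "kaggle",
              "Raw match results and event logs for European football", ["csv"]),
    Candidate("sports-instruct-qa", "huggingface",
              "Synthetic instruction dataset about sports", ["instruction", "synthetic"]),
    Candidate("weather-archive", "github", "Hourly weather records"),
]
picked = shortlist(candidates, ["football", "match", "sports"])
print([c.name for c in picked])
```

Matching on whole word tokens rather than substrings avoids false hits such as "log" inside "catalog". A real filter would likely need to inspect the data files themselves, since metadata alone can be misleading.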
Use case: You're starting a sports forecasting project and have a domain focus, but no documents or labels. This Skill guides you to find a usable raw dataset (e.g., event logs or match stats), convert it into samples, and package it as seeds for further labeling and training.
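To make the "convert it into samples and package it as seeds" step concrete, here is a minimal sketch using only the Python standard library. It deliberately does not use the SDK's conversion utilities, whose API is not shown in this listing. The function names, the seed record shape (`id`, `text`, `metadata`), and the sample match data are all assumptions made up for illustration.

```python
import csv
import io
import json


def rows_to_seeds(csv_text: str, text_fields: list[str], source: str) -> list[dict]:
    """Turn each CSV row into one seed record with a flat text rendering.

    The record shape here is hypothetical, not the SDK's sample format.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    seeds = []
    for i, row in enumerate(reader):
        # Skip empty cells so missing values do not pollute the seed text.
        parts = [f"{name}: {row[name]}" for name in text_fields if row.get(name)]
        if not parts:
            continue
        seeds.append({
            "id": f"{source}-{i}",
            "text": "; ".join(parts),
            "metadata": {"source": source},
        })
    return seeds


def to_jsonl(seeds: list[dict]) -> str:
    """Serialize seeds as JSON Lines, one record per line."""
    return "\n".join(json.dumps(s, sort_keys=True) for s in seeds)


# Made-up match data standing in for a downloaded raw file.
raw = (
    "date,home,away,score\n"
    "2024-05-01,Lions,Tigers,2-1\n"
    "2024-05-02,Bears,Wolves,\n"
)
seeds = rows_to_seeds(raw, ["date", "home", "away", "score"], source="demo")
print(to_jsonl(seeds))
```

Keeping the source name in `metadata` makes it possible to trace each seed back to the dataset it came from once several sources are combined.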
Quick Start
Ask the Skill to explore public datasets for your domain and recommend 1–3 candidates that are relevant but not already training-ready.
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: public-dataset-exploration
Download link: https://github.com/lightning-rod-labs/lightningrod-python-sdk/archive/main.zip#public-dataset-exploration
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.