public-dataset-exploration

Official

Find raw public data and turn it into seeds.

Author: lightning-rod-labs
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill helps you locate and shortlist raw public datasets in a new domain when you don’t yet have training-ready data available.

Core Features & Use Cases

  • Market discovery across platforms: Search Kaggle, Hugging Face, and GitHub for domain-relevant raw or semi-structured datasets suitable for conversion.
  • Training-readiness filtering: Identify “relevant but not training-ready” sources by checking whether the data can produce forecasting questions or document-style Q&A rather than already being instruction-tuned or synthetic.
  • Seed creation workflow planning: Convert downloaded files into samples using the SDK’s conversion utilities and assemble an input dataset for downstream pipelines.
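The filtering step above can be sketched as a simple keyword heuristic. This is a minimal illustration under stated assumptions, not the Skill's actual logic: the `Candidate` record, the `TRAINING_READY_HINTS` list, and the `shortlist` function are all hypothetical names, and the candidate metadata is assumed to have been gathered already from Kaggle, Hugging Face, or GitHub.

```python
from dataclasses import dataclass

# Hypothetical hint words suggesting a dataset is already training-ready.
# Adjust them for your own domain; they are not taken from the Skill.
TRAINING_READY_HINTS = ("instruction", "synthetic", "fine-tuned")


@dataclass
class Candidate:
    name: str
    source: str        # e.g. "kaggle", "huggingface", "github"
    description: str


def is_training_ready(candidate: Candidate) -> bool:
    """Return True if the name or description contains a training-ready hint."""
    text = f"{candidate.name} {candidate.description}".lower()
    return any(hint in text for hint in TRAINING_READY_HINTS)


def shortlist(candidates, domain_terms, limit=3):
    """Keep relevant, not-yet-training-ready candidates, best match first."""
    scored = []
    for candidate in candidates:
        if is_training_ready(candidate):
            continue
        text = f"{candidate.name} {candidate.description}".lower()
        # Relevance = number of domain terms found in the metadata.
        score = sum(term.lower() in text for term in domain_terms)
        if score > 0:
            scored.append((score, candidate))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:limit]]
```

A real search would likely weigh licence, size, and file format as well; the keyword score here only stands in for the "relevant but not training-ready" check.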

Use case: You’re starting a sports forecasting project with a clear domain focus but no documents or labels. This Skill guides you to find a usable raw dataset (e.g., event logs or match stats), convert it into samples, and package those samples as seeds for further labeling and training.
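To make the "convert it into samples" step concrete, here is a rough sketch that uses only the Python standard library. It does not show the SDK's own conversion utilities, which are not documented on this page. The seed fields (`id`, `text`, `source`), the function names, and the sample CSV row are illustrative assumptions.

```python
import csv
import io
import json


def rows_to_seeds(csv_text: str, source: str) -> list[dict]:
    """Turn each CSV row into one seed record with a flat text field."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seeds = []
    for index, row in enumerate(reader):
        # Flatten the row into "column: value" pairs so it reads as a document.
        text = "; ".join(f"{key}: {value}" for key, value in row.items())
        seeds.append({"id": f"{source}-{index}", "text": text, "source": source})
    return seeds


def to_jsonl(seeds: list[dict]) -> str:
    """Serialize seeds as JSON Lines, one record per line."""
    return "\n".join(json.dumps(seed) for seed in seeds)


# Made-up match-stats row, for illustration only.
sample_csv = "date,home,away,home_score,away_score\n2024-01-05,Lions,Hawks,3,1\n"
seeds = rows_to_seeds(sample_csv, source="match-stats")
print(to_jsonl(seeds))
```

Keeping one seed per row is the simplest choice; grouping rows by match or by season may suit forecasting questions better.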

Quick Start

Ask the Skill to explore public datasets for your domain and recommend 1–3 candidates that are relevant but not already training-ready.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: public-dataset-exploration
Download link: https://github.com/lightning-rod-labs/lightningrod-python-sdk/archive/main.zip#public-dataset-exploration

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
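If you would rather do the extraction by hand, the step could look roughly like the sketch below. It assumes the archive contains a directory named after the Skill, which the URL fragment suggests but this page does not confirm. `extract_skill` is a hypothetical helper, not part of the SDK.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_skill(zip_path, skill_name, skills_root=".claude/skills"):
    """Copy files under any `skill_name` directory in the zip to skills_root/skill_name."""
    dest = Path(skills_root) / skill_name
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            # Skip directory entries and anything outside the skill directory.
            if info.is_dir() or skill_name not in parts:
                continue
            relative = parts[parts.index(skill_name) + 1:]
            # Skip entries with no path below the skill directory, or unsafe paths.
            if not relative or ".." in relative:
                continue
            target = dest.joinpath(*relative)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.read(info))
            extracted.append(target)
    return extracted
```

After downloading the archive from the link above (for example with `urllib.request.urlretrieve`), a call such as `extract_skill("main.zip", "public-dataset-exploration")` would place the files under `.claude/skills/public-dataset-exploration/`, provided the directory-name assumption holds.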