custom-dataset-seeds

Official

Turn your files into training-ready seeds

Author: lightning-rod-labs
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

It converts local files and user-provided datasets (such as PDFs, CSVs, and plain text) into Lightning Rod “seeds”, so you can build labeled forecasting datasets quickly and without manual preprocessing.

Core Features & Use Cases

  • File-to-samples ingestion: Chunk documents or map CSV columns into model-ready samples with optional embedded labels and metadata.
  • FileSet-based workflows: Upload large or metadata-rich corpora as a FileSet for scalable transformation and temporal/metadata filtering.
  • Flexible context and labeling strategies: Generate seeds only; build context and labels from whole documents (non-RAG) under chronological constraints; or build them with RAG, using vector retrieval with payload and temporal filters.
  • Fitness + chunking guidance: Provide practical checks (volume, date coverage, text quality, label availability) and recommended chunking parameters to improve results.
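
The fitness checks named in the last bullet (volume, date coverage, text quality, label availability) can be illustrated with a small sketch. This is not the skill's own implementation: the function name `fitness_report`, the sample fields `text`, `date`, and `label`, and the `min_chars` threshold are all hypothetical, chosen only to show the kind of summary such a check might produce.

```python
from datetime import date


def fitness_report(samples, min_chars=200):
    """Summarize volume, date coverage, text quality, and label availability.

    `samples` is a list of dicts with hypothetical keys "text", "date", "label".
    None of these names come from the Lightning Rod SDK.
    """
    n = len(samples)
    # Date coverage: earliest and latest dates among samples that carry one.
    dates = [s["date"] for s in samples if s.get("date") is not None]
    # Label availability: samples that already carry an embedded label.
    labeled = sum(1 for s in samples if s.get("label") is not None)
    # Text quality proxy (an assumption): flag samples shorter than min_chars.
    short = sum(1 for s in samples if len(s.get("text", "").strip()) < min_chars)
    return {
        "volume": n,
        "date_min": min(dates) if dates else None,
        "date_max": max(dates) if dates else None,
        "labeled_fraction": labeled / n if n else 0.0,
        "short_text_fraction": short / n if n else 0.0,
    }


samples = [
    {"text": "x" * 500, "date": date(2024, 1, 5), "label": 1},
    {"text": "too short", "date": date(2024, 3, 1), "label": None},
]
report = fitness_report(samples)
```

A low `labeled_fraction` or a narrow date range in a report like this would suggest the corpus needs more data or a different labeling strategy before seeds are generated.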

Quick Start

To convert your PDFs into seeds and run a transforms pipeline with a cap on the number of generated questions, ask: “Ingest data/*.pdf as samples with chunk_size=1000 and chunk_overlap=100, create an input_dataset from those samples, then run lr.transforms.run(pipeline, input_dataset=input_dataset, max_questions=10).”
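
To make the `chunk_size=1000` and `chunk_overlap=100` parameters in the prompt above concrete, here is a minimal sliding-window sketch. It assumes sizes are measured in characters; the SDK may count tokens or split on other boundaries, and `chunk_text` is a hypothetical helper, not a Lightning Rod function.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=100):
    """Split text into overlapping character windows (illustrative only)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= overlap < chunk_size")

    # Each new chunk starts this many characters after the previous one,
    # so consecutive chunks share chunk_overlap characters.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = chunk_text("a" * 2500)
lengths = [len(c) for c in chunks]
```

With these settings each chunk begins 900 characters after the previous one, so text that straddles a chunk boundary still appears intact in one of the two neighboring chunks.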

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: custom-dataset-seeds
Download link: https://github.com/lightning-rod-labs/lightningrod-python-sdk/archive/main.zip#custom-dataset-seeds

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
