custom-dataset-seeds
Official
Turn your files into training-ready seeds
Author: lightning-rod-labs
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It converts local files and user-provided datasets (like PDFs, CSVs, and text) into Lightning Rod “seeds” so you can quickly build labeled forecasting datasets without manual preprocessing.
Core Features & Use Cases
- File-to-samples ingestion: Chunk documents or map CSV columns into model-ready samples with optional embedded labels and metadata.
- FileSet-based workflows: Upload large or metadata-rich corpora as a FileSet for scalable transformation and temporal/metadata filtering.
- Flexible context and labeling strategies: Generate seeds only, whole-document (non-RAG) context and labels under chronological constraints, or RAG context and labels using vector retrieval with payload and temporal filters.
- Fitness + chunking guidance: Provide practical checks (volume, date coverage, text quality, label availability) and recommended chunking parameters to improve results.
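The CSV-mapping and fitness-check ideas above can be illustrated with a small standalone sketch. It does not use the Lightning Rod SDK: the sample dictionary shape, the function names `rows_to_samples` and `fitness_report`, and the column names are all hypothetical, chosen only to show what mapping columns to samples and checking volume, label availability, and date coverage might look like.

```python
import csv
import io
from datetime import date


def rows_to_samples(csv_text, text_col, label_col=None, date_col=None):
    """Map CSV rows to plain sample dicts (hypothetical shape, not the SDK's seed format)."""
    samples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        text = (row.get(text_col) or "").strip()
        if not text:
            continue  # skip rows with no usable text
        sample = {"text": text, "metadata": {}}
        label = (row.get(label_col) or "").strip() if label_col else ""
        if label:
            sample["label"] = label  # optional embedded label
        raw_date = (row.get(date_col) or "").strip() if date_col else ""
        if raw_date:
            # Assumes ISO dates (YYYY-MM-DD) in the date column.
            sample["metadata"]["date"] = date.fromisoformat(raw_date)
        samples.append(sample)
    return samples


def fitness_report(samples):
    """Summarize volume, label availability, and date coverage."""
    dates = [s["metadata"]["date"] for s in samples if "date" in s["metadata"]]
    labeled = sum(1 for s in samples if "label" in s)
    return {
        "n_samples": len(samples),
        "labeled_fraction": labeled / len(samples) if samples else 0.0,
        "date_range": (min(dates), max(dates)) if dates else None,
    }


csv_text = (
    "headline,outcome,published\n"
    "Rate cut expected,yes,2024-01-05\n"
    "Earnings miss,,2024-03-10\n"
    ",no,2024-02-01\n"
)
samples = rows_to_samples(csv_text, "headline", "outcome", "published")
report = fitness_report(samples)
```

A report like this makes it easy to spot a corpus that is too small, mostly unlabeled, or concentrated in a narrow date window before any seeds are generated.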
Quick Start
To convert your PDFs into seeds and run a transforms pipeline with a cap on generated questions, ask: “Ingest data/*.pdf as samples with chunk_size=1000 and chunk_overlap=100, create an input_dataset from those samples, then run lr.transforms.run(pipeline, input_dataset=input_dataset, max_questions=10).”
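To make the `chunk_size` and `chunk_overlap` parameters in the prompt above concrete, here is a minimal sketch of character-based chunking with overlap. It is an illustration only: the listing does not say how the skill or the SDK splits text (characters, tokens, or sentences), and `chunk_text` is a hypothetical helper, not part of the Lightning Rod API.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=100):
    """Split text into character windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Synthetic 2,500-character document for demonstration.
document = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(document, chunk_size=1000, chunk_overlap=100)
```

With these values each window advances 900 characters, so consecutive chunks share 100 characters. The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks.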
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: custom-dataset-seeds
Download link: https://github.com/lightning-rod-labs/lightningrod-python-sdk/archive/main.zip#custom-dataset-seeds
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.