curate-trajectories
Official
Clean, provable trajectory datasets for safe training.
Author: understudylabs
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Trajectories pile up as loose per-task JSON files across runs, and once they are used for training, distillation, or RL they can leak examples from the frozen developer holdout set. This skill provides a provenance-tracked, contamination-safe workflow: import trajectories, tag splits, make hash-stamped selections, and emit a decontaminated pool that preserves holdout integrity.
Core Features & Use Cases
- Index + attach provenance: Build a local index (one record per trajectory) at .understudy/curate-trajectories/index.jsonl with full provenance fields and corpus/hash metadata.
- Tag splits from capture-evidence: Map trajectories to train/dev/holdout/none using the frozen splits.json and annotate index records for guarded pools.
- Query as a hash-stamped selection: Express subsets by provenance filters, resolve to a named selection, compute a selection hash, and emit a manifest for downstream audits.
- Contamination check & emission: Cross-check against holdout/dev id sets, produce a contamination report, and hard-block guarded pools unless overridden.
- Emit decontaminated pool: Produce train-safe/distill-safe pools along with the selection hash, splits_sha256, corpus hash, and row counts for auditability.
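The selection, contamination-check, and emission steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the skill's actual implementation: the record fields (`id`, `split`, `model`), the function names, and the manifest keys are all hypothetical, and the hash is simply a SHA-256 over the sorted trajectory ids.

```python
import hashlib
import json


def selection_hash(trajectory_ids):
    """Order-independent SHA-256 over the selected ids (hypothetical scheme)."""
    canonical = json.dumps(sorted(set(trajectory_ids)), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def select(index_records, **filters):
    """Keep records whose fields equal every given provenance filter."""
    return [
        r for r in index_records
        if all(r.get(key) == value for key, value in filters.items())
    ]


def contamination_report(selected, guarded_ids):
    """List selected ids that also appear in the guarded (holdout/dev) id set."""
    hits = sorted(r["id"] for r in selected if r["id"] in guarded_ids)
    return {"checked": len(selected), "contaminated_ids": hits, "clean": not hits}


def emit_pool(selected, guarded_ids, allow_override=False):
    """Return a manifest, or refuse if guarded ids are present and no override is given."""
    report = contamination_report(selected, guarded_ids)
    if not report["clean"] and not allow_override:
        raise ValueError(
            f"blocked: {len(report['contaminated_ids'])} guarded ids in selection"
        )
    ids = [r["id"] for r in selected]
    return {
        "selection_hash": selection_hash(ids),
        "row_count": len(ids),
        "contamination": report,
    }


# Toy index records; the field names are assumptions for illustration only.
index = [
    {"id": "t1", "split": "train", "model": "m-a"},
    {"id": "t2", "split": "holdout", "model": "m-a"},
    {"id": "t3", "split": "train", "model": "m-b"},
]
guarded = {r["id"] for r in index if r["split"] in ("holdout", "dev")}

train_selection = select(index, split="train")
manifest = emit_pool(train_selection, guarded)
print(json.dumps(manifest, indent=2))
```

Hashing the sorted id list means the same subset always yields the same selection hash regardless of query order, which is what makes a manifest usable in a later audit. A selection that touches a guarded id (for example `select(index, model="m-a")` here) raises instead of silently emitting a pool.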
Quick Start
Index your trajectories and generate a train-safe, decontaminated pool using the built-in workflow.
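As a rough picture of what the indexing step involves, the sketch below writes one JSONL record per trajectory file and tags each with a split looked up in splits.json. It is a sketch under assumptions rather than the skill's own code: the `task_id` key, the record field names, and the shape of splits.json (a mapping from split name to a list of ids) are hypothetical. Only the index path and the `splits_sha256` name come from the description above.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_file(path):
    """Hex SHA-256 of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_index(trajectory_dir, splits_path, index_path):
    """Write one index record per trajectory JSON file; return the record count."""
    # Assumed splits.json shape: {"train": [...], "dev": [...], "holdout": [...]}
    splits = json.loads(Path(splits_path).read_text())
    split_of = {tid: name for name, ids in splits.items() for tid in ids}
    splits_sha = sha256_file(splits_path)

    index_path = Path(index_path)
    index_path.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with index_path.open("w") as out:
        for traj_file in sorted(Path(trajectory_dir).glob("*.json")):
            traj = json.loads(traj_file.read_text())
            record = {
                "id": traj["task_id"],  # hypothetical key name
                "source_path": str(traj_file),
                "content_sha256": sha256_file(traj_file),
                # Ids missing from splits.json are tagged "none".
                "split": split_of.get(traj["task_id"], "none"),
                "splits_sha256": splits_sha,
            }
            out.write(json.dumps(record, sort_keys=True) + "\n")
            count += 1
    return count


# Demo in a throwaway directory so nothing in a real project is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    runs = root / "runs"
    runs.mkdir()
    (runs / "a.json").write_text(json.dumps({"task_id": "task-1", "steps": []}))
    (runs / "b.json").write_text(json.dumps({"task_id": "task-2", "steps": []}))
    splits_file = root / "splits.json"
    splits_file.write_text(json.dumps({"train": ["task-1"], "dev": [], "holdout": []}))

    index_file = root / ".understudy" / "curate-trajectories" / "index.jsonl"
    n_records = build_index(runs, splits_file, index_file)
    records = [json.loads(line) for line in index_file.read_text().splitlines()]

print(n_records, [r["split"] for r in records])
```

Stamping every record with the hash of the splits file it was tagged against makes it possible to detect later that a pool was built from a stale or modified splits.json.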
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: curate-trajectories
Download link: https://github.com/understudylabs/understudy-agent-tools/archive/main.zip#curate-trajectories
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.