dataset-curation

Community

Curate Cortex agent evaluation datasets.

Author: randoneering
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Create and manage evaluation datasets for Cortex Agents to enable robust testing and benchmarking of agent behavior.

Core Features & Use Cases

  • Workflow design: Define dataset schemas, source questions, and expected answers to ensure consistent evaluation formats.
  • Format standardization: Produce datasets in the Snowflake Agent Evaluations format, including ground_truth structures and tool invocations.
  • Versioned delivery: Maintain dataset versions (v1, v2, ...) and re-register each new version with a clear change log so that evaluation runs stay reproducible.
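As a rough illustration of the three features above, the following Python sketch builds one evaluation row with a nested ground_truth structure and derives a versioned dataset name. The field names (`question`, `expected_answer`, `tool_invocations`, `tool_name`), the `EvalRecord` class, and the `versioned_name` helper are assumptions made for this example; they are not taken from the Snowflake Agent Evaluations format, so check the official schema before relying on them.

```python
import json
from dataclasses import dataclass, field


@dataclass
class EvalRecord:
    """One evaluation row. All field names are illustrative assumptions."""
    question: str
    expected_answer: str
    expected_tools: list = field(default_factory=list)

    def to_row(self) -> dict:
        # Pack the expected answer and expected tool invocations into one
        # ground_truth object (hypothetical layout, not the official schema).
        ground_truth = {
            "expected_answer": self.expected_answer,
            "tool_invocations": [{"tool_name": t} for t in self.expected_tools],
        }
        # Serialize ground_truth as JSON so every row has the same two columns.
        return {"question": self.question, "ground_truth": json.dumps(ground_truth)}


def versioned_name(base: str, version: int) -> str:
    """Build a dataset name such as 'sales_eval_v2' (naming scheme is assumed)."""
    if version < 1:
        raise ValueError("version must be >= 1")
    return f"{base}_v{version}"


record = EvalRecord("What was Q1 revenue?", "Q1 revenue was 1.2M USD.", ["sales_analyst"])
row = record.to_row()
name = versioned_name("sales_eval", 2)
```

Keeping the version in the dataset name, rather than overwriting a single dataset, is one simple way to let older evaluation runs point at the exact data they used.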

Quick Start

Create a new evaluation dataset by outlining the source questions, expected answers, and the steps to register it for evaluation.
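Before registering an outlined dataset, it may help to check the draft for gaps. The sketch below is a minimal pre-registration check, assuming each draft row is a dict with hypothetical `question` and `expected_answer` keys; the `validate_dataset` helper is invented for this example, and the registration step itself is not shown because the source does not describe it.

```python
def validate_dataset(rows):
    """Return a list of problems; an empty list means the draft looks consistent."""
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        question = (row.get("question") or "").strip()
        answer = (row.get("expected_answer") or "").strip()
        if not question:
            problems.append(f"row {i}: missing question")
        if not answer:
            problems.append(f"row {i}: missing expected answer")
        # Compare questions case-insensitively to catch near-verbatim repeats.
        key = question.lower()
        if question and key in seen:
            problems.append(f"row {i}: duplicate question")
        seen.add(key)
    return problems


draft = [
    {"question": "What was Q1 revenue?", "expected_answer": "1.2M USD"},
    {"question": "what was q1 revenue?", "expected_answer": "1.2M USD"},
    {"question": "Top region by sales?", "expected_answer": ""},
]
issues = validate_dataset(draft)
```

Here `issues` flags the repeated question in row 1 and the empty expected answer in row 2, both of which would otherwise skew evaluation results.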

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: dataset-curation
Download link: https://github.com/randoneering/nix-flake-mirror/archive/main.zip#dataset-curation

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
