generate-synthetic-data
Community
Generate diverse LLM test data.
Software Engineering, #prompt engineering, #llm evaluation, #synthetic data, #data pipelines, #test data generation, #data augmentation
Author: marchatton
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill addresses the challenge of creating comprehensive and diverse test datasets for LLM pipelines, especially when real user data is scarce or specific failure scenarios need to be tested.
Core Features & Use Cases
- Dimension-based Tuple Generation: Defines axes of variation (dimensions) relevant to potential LLM failures.
- Iterative Tuple Refinement: Incorporates user feedback over successive rounds so that the generated tuples reflect realistic scenarios.
- LLM-assisted Query Generation: Converts refined tuples into natural language queries for pipeline testing.
- Use Case: Bootstrapping an evaluation dataset for a customer support chatbot by defining dimensions like 'user intent', 'customer sentiment', and 'product type', then generating varied queries to test the bot's responses.
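The three features above can be illustrated with a minimal Python sketch. It is not the Skill's actual implementation: the dimension values, the function names (`generate_tuples`, `refine`, `build_query_prompt`), and the prompt wording are all hypothetical, chosen only to mirror the customer support use case. The sketch stops at building the prompt text; sending it to an LLM is left out because the Skill's documentation does not specify a client.

```python
import itertools
import random

# Hypothetical dimensions for a customer support chatbot (values are illustrative).
DIMENSIONS = {
    "user_intent": ["refund request", "order status", "technical issue"],
    "customer_sentiment": ["calm", "frustrated"],
    "product_type": ["laptop", "headphones"],
}


def generate_tuples(dimensions, sample_size=None, seed=0):
    """Return every combination of dimension values, optionally a random sample."""
    names = list(dimensions)
    combos = [
        dict(zip(names, values))
        for values in itertools.product(*dimensions.values())
    ]
    if sample_size is not None and sample_size < len(combos):
        # Seeded generator so the same sample is drawn on every run.
        combos = random.Random(seed).sample(combos, sample_size)
    return combos


def refine(tuples, is_realistic):
    """Keep only tuples a reviewer (here, a callback) accepts as realistic."""
    return [t for t in tuples if is_realistic(t)]


def build_query_prompt(tuple_):
    """Turn one tuple into a prompt asking an LLM to write a matching user query."""
    traits = "; ".join(
        f"{name.replace('_', ' ')}: {value}" for name, value in tuple_.items()
    )
    return (
        "Write one realistic message a user might send to a support chatbot. "
        f"The message should match these traits: {traits}."
    )


if __name__ == "__main__":
    all_tuples = generate_tuples(DIMENSIONS)
    # Stand-in for user feedback: assume calm refund requests are of little interest.
    kept = refine(
        all_tuples,
        lambda t: not (
            t["user_intent"] == "refund request"
            and t["customer_sentiment"] == "calm"
        ),
    )
    for t in kept[:3]:
        print(build_query_prompt(t))
```

In practice the `is_realistic` callback would be replaced by actual review of the tuples, and each prompt would be passed to whichever LLM the pipeline already uses.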
Quick Start
Define the dimensions relevant to your application, then generate synthetic data tuples from them.
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: generate-synthetic-data
Download link: https://github.com/marchatton/agent-skills/archive/main.zip#generate-synthetic-data
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.