messydata
Official
Generate realistic messy data for tests.
Author: sodadata
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
MessyData generates synthetic but realistic dirty data for testing data pipelines, data quality tooling, and ML workflows, so you do not have to write custom data generators.
Core Features & Use Cases
- Declarative YAML config: define datasets, distributions, and anomalies without writing procedural code.
- Date-aware generation, with both a CLI and a Python API for end-to-end data generation workflows.
- Use cases include validating pipelines, stress-testing anomaly detection, and simulating real-world data quality issues.
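
To make the idea of "realistic dirty data" concrete, here is a minimal standard-library sketch of anomaly injection: it builds clean rows, then introduces missing values, outliers, and duplicate rows at configurable rates. This is not MessyData's API or config format; the function names (`make_clean_rows`, `inject_anomalies`), the column names, and the rate parameters are all hypothetical and chosen only for illustration.

```python
import random


def make_clean_rows(n, seed=0):
    """Build n clean rows with a seeded RNG so runs are reproducible."""
    rng = random.Random(seed)
    return [
        {
            "id": i,
            "amount": round(rng.uniform(10, 100), 2),
            "country": rng.choice(["US", "DE", "JP"]),
        }
        for i in range(n)
    ]


def inject_anomalies(rows, null_rate=0.1, outlier_rate=0.05, dup_rate=0.05, seed=0):
    """Return a messy copy of rows (hypothetical helper, not part of MessyData).

    Each row independently may get a missing country, an inflated amount,
    and/or an exact duplicate appended right after it.
    """
    rng = random.Random(seed)
    messy = []
    for row in rows:
        row = dict(row)  # copy so the clean input is left untouched
        if rng.random() < null_rate:
            row["country"] = None  # missing value
        if rng.random() < outlier_rate:
            row["amount"] = row["amount"] * 1000  # extreme outlier
        messy.append(row)
        if rng.random() < dup_rate:
            messy.append(dict(row))  # exact duplicate row
    return messy


clean = make_clean_rows(1000, seed=42)
messy = inject_anomalies(clean, seed=42)
```

Feeding `messy` instead of `clean` into a pipeline or a data quality check is the kind of test this sort of tool targets; seeding the generator keeps failures reproducible.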
Quick Start
Create a MessyData YAML config, then run the validate command to check it before generating data.
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: messydata
Download link: https://github.com/sodadata/messydata/archive/main.zip#messydata
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.