datahub-catalog
Community
Ingest and connect data lineage in DataHub.
Data & Analytics
#lineage #datahub #kubernetes helm #metadata ingestion #finegrainedlineage #dbt artifacts #graphql search
Author: ivanshamaev
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill sets up DataHub metadata ingestion so that you can centralize dataset metadata, capture lineage, and make data engineering assets searchable and discoverable.
Core Features & Use Cases
- Deploy and operate DataHub: Use the Docker Compose quickstart or a Kubernetes Helm deployment, including the Kafka, Elasticsearch, and MySQL components.
- Ingest metadata from many sources: Generate and run ingestion recipes for PostgreSQL, Hive, Spark, dbt, Airflow, Kafka, and S3, including stateful ingestion and optional profiling.
- Emit metadata programmatically with the Python SDK: Use DatahubRestEmitter and MetadataChangeProposalWrapper (MCP) to publish entity aspects such as schema, ownership, tags, glossary terms, and lineage.
- Support lineage at dataset and column granularity: Attach table-level upstream lineage and FineGrainedLineage field-to-field mappings.
- Discover assets: Use the DataHub UI, REST search APIs, or GraphQL lineage traversal to explore downstream impact and upstream dependencies.
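To make the lineage bullet concrete, here is a minimal sketch of what table-level and column-level lineage look like as data. It uses only the standard library; the helper names (`dataset_urn`, `field_urn`, `upstream_lineage`) and the example tables are hypothetical and chosen for illustration. The URN layout and the `upstreamLineage` field names follow DataHub's documented format as far as I know, but the dict is a plain illustration of the aspect's shape, not something produced by the SDK.

```python
def dataset_urn(platform, name, env="PROD"):
    """Build a dataset URN string (hypothetical helper, not the SDK function)."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def field_urn(dataset, field_path):
    """Build a schemaField URN that points at one column of a dataset."""
    return f"urn:li:schemaField:({dataset},{field_path})"


def upstream_lineage(upstream, downstream, column_map):
    """Sketch the shape of an upstreamLineage aspect as a plain dict.

    column_map maps an upstream column name to the downstream column it feeds.
    """
    return {
        # Table-level lineage: the downstream dataset is derived from `upstream`.
        "upstreams": [{"dataset": upstream, "type": "TRANSFORMED"}],
        # Column-level lineage: one field-to-field entry per mapped column.
        "fineGrainedLineages": [
            {
                "upstreamType": "FIELD_SET",
                "upstreams": [field_urn(upstream, src)],
                "downstreamType": "FIELD",
                "downstreams": [field_urn(downstream, dst)],
            }
            for src, dst in column_map.items()
        ],
    }


if __name__ == "__main__":
    # Example table names are assumptions for illustration only.
    raw = dataset_urn("postgres", "shop.public.orders")
    mart = dataset_urn("postgres", "shop.analytics.fct_orders")
    aspect = upstream_lineage(raw, mart, {"id": "order_id", "total": "amount"})
    print(aspect["fineGrainedLineages"][0])
```

In a real setup, the SDK's own URN builders and aspect classes would replace these helpers, and the resulting aspect would be wrapped in a MetadataChangeProposalWrapper and sent with DatahubRestEmitter.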
Quick Start
Activate this skill, then ask your agent to ingest metadata from a dbt project. The agent prepares a dbt ingestion recipe and runs datahub ingest against your DataHub GMS endpoint.
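As a rough sketch of what such a recipe might contain, the snippet below assembles a dbt ingestion recipe as a Python dict and renders it as JSON, which is also valid YAML. The function names, file paths, target platform, and GMS URL are assumptions for illustration; the `manifest_path`, `catalog_path`, and `target_platform` keys and the `datahub-rest` sink follow the dbt source configuration as I understand it, so check them against the DataHub version you run.

```python
import json


def build_dbt_recipe(manifest_path, catalog_path, target_platform, gms_server):
    """Return a DataHub ingestion recipe for the dbt source as a plain dict."""
    return {
        "source": {
            "type": "dbt",
            "config": {
                # dbt artifacts produced by `dbt docs generate`
                "manifest_path": manifest_path,
                "catalog_path": catalog_path,
                # Warehouse that the dbt models are materialized in
                "target_platform": target_platform,
            },
        },
        "sink": {
            # Send metadata to the DataHub GMS REST endpoint
            "type": "datahub-rest",
            "config": {"server": gms_server},
        },
    }


def render_recipe(recipe):
    """Serialize the recipe; JSON output can be saved to a .yml recipe file."""
    return json.dumps(recipe, indent=2)


if __name__ == "__main__":
    # All values below are placeholder assumptions, not project defaults.
    recipe = build_dbt_recipe(
        manifest_path="./target/manifest.json",
        catalog_path="./target/catalog.json",
        target_platform="postgres",
        gms_server="http://localhost:8080",
    )
    print(render_recipe(recipe))
```

Saving the printed output to a file such as `dbt_recipe.dhub.yml` (a hypothetical name) and running `datahub ingest -c dbt_recipe.dhub.yml` would then start the ingestion, assuming the DataHub CLI with the dbt plugin is installed.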
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: datahub-catalog
Download link: https://github.com/ivanshamaev/de-agent-skills/archive/main.zip#datahub-catalog
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.