mlflow-data-pipelines
Community
Track ETL and models with MLflow
Software Engineering
#experiment tracking, #mlflow, #model registry, #etl observability, #airflow integration, #spark autologging, #pyfunc serving
Author: ivanshamaev
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Using MLflow for data engineering makes it clear where data pipelines and models succeed or fail, because run metadata, metrics, artifacts, and model lifecycle management are centralized in one place.
Core Features & Use Cases
- Production-grade tracking server setup: Configure an MLflow tracking server with a PostgreSQL backend and S3 artifact storage (including concurrency-oriented options like PgBouncer).
- End-to-end run observability for data engineering: Log pipeline metadata such as input/output row counts, processing time, data quality metrics, and lineage tags for each pipeline stage.
- Model lifecycle management and deployment: Register models in the MLflow Model Registry, promote versions via aliases, and serve or run batch scoring using REST APIs and pyfunc.
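As an illustration of the run observability described in the list above, here is a minimal sketch of logging one pipeline stage as an MLflow run. The helper names (`stage_metrics`, `log_stage`), the metric keys, and the tag keys are hypothetical and not taken from the skill itself; only `mlflow.start_run`, `mlflow.set_tags`, and `mlflow.log_metrics` are actual MLflow calls.

```python
def stage_metrics(input_rows, output_rows, null_rows, seconds):
    """Derive numeric metrics for one pipeline stage (pure function, no MLflow needed).

    The metric names are illustrative assumptions, not an MLflow convention.
    """
    return {
        "input_rows": float(input_rows),
        "output_rows": float(output_rows),
        "rows_dropped": float(input_rows - output_rows),
        # Simple data quality metric: share of output rows containing nulls.
        "null_rate": (null_rows / output_rows) if output_rows else 0.0,
        "processing_seconds": float(seconds),
    }


def log_stage(stage, source, target, input_rows, output_rows, null_rows, seconds):
    """Log one stage as an MLflow run with lineage tags (requires the mlflow package)."""
    import mlflow  # imported lazily so stage_metrics stays usable without MLflow

    metrics = stage_metrics(input_rows, output_rows, null_rows, seconds)
    with mlflow.start_run(run_name=stage):
        # Hypothetical tag keys used here to record lineage per stage.
        mlflow.set_tags({
            "pipeline.stage": stage,
            "lineage.source": source,
            "lineage.target": target,
        })
        mlflow.log_metrics(metrics)
    return metrics
```

Keeping the metric calculation separate from the logging call makes the numbers easy to unit test, and the same dictionary can be reused by other reporting code.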
Quick Start
Load the MLflow tracking skill to set up a tracking server and instrument your ETL and model training runs so each stage logs row counts, DQ metrics, and deployable model artifacts.
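For orientation, the tracking server setup mentioned above might look roughly like the following sketch, which assembles an `mlflow server` command with a PostgreSQL backend store and an S3 artifact root. The host name, credentials, and bucket are placeholders, and `build_server_command` is a hypothetical helper; the flags themselves (`--backend-store-uri`, `--default-artifact-root`, `--host`, `--port`) belong to the MLflow CLI.

```python
import shlex


def build_server_command(db_uri, artifact_root, host="0.0.0.0", port=5000):
    """Return the argument list for launching an MLflow tracking server."""
    return [
        "mlflow", "server",
        "--backend-store-uri", db_uri,           # run metadata, e.g. PostgreSQL
        "--default-artifact-root", artifact_root,  # artifacts, e.g. an S3 bucket
        "--host", host,
        "--port", str(port),
    ]


# Placeholder values: the URI assumes the connection goes through PgBouncer
# on its default port 6432; replace the host, password, and bucket.
cmd = build_server_command(
    "postgresql://mlflow:CHANGE_ME@pgbouncer.internal:6432/mlflow",
    "s3://example-mlflow-artifacts/",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` on a machine where MLflow and the PostgreSQL and S3 client libraries are installed.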
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: mlflow-data-pipelines
Download link: https://github.com/ivanshamaev/de-agent-skills/archive/main.zip#mlflow-data-pipelines
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.