grafana-prometheus-monitoring
Community
Provision Grafana+Prometheus from git.
Software Engineering
#monitoring, #infrastructure as code, #grafana, #prometheus, #gpu metrics, #slo alerts, #ai inference monitoring
Author: drewid74
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill deploys and operates reliable monitoring with Grafana and Prometheus from version-controlled configuration. It replaces brittle, manual UI changes, which waste tokens, drift over time, and hide real incidents.
Core Features & Use Cases
- Provisioning-first observability: idempotent Grafana provisioning (datasources and dashboards) with allowUiUpdates: false, plus deployment via Grafana API, Grizzly, or Terraform.
- Prometheus metrics + recording rules: scrape infrastructure/AI/GPU targets using file-based service discovery, then normalize and precompute SLIs/metrics using recording rules.
- Operational alerting with guardrails: multi-window, multi-burn-rate (MWMBR) SLO alerts, Alertmanager routing (critical→PagerDuty, warning→Slack), runbook URL enforcement, and flapping/cardinality controls.
- AI + GPU coverage: GPU metrics via DCGM Exporter/ROCm SMI exporter, inference metrics via LiteLLM/vLLM/Ollama endpoints, training metrics via Pushgateway, and vector DB monitoring via native metrics.
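To make the burn-rate alerting in the list above concrete, here is a minimal Python sketch of the multi-window, multi-burn-rate check. It is an illustration only, not the Skill's actual rules: the window pairs and thresholds (14.4, 6, 1) follow the commonly cited Google SRE Workbook convention and are assumptions here, as are the names `BurnWindow`, `burn_rate`, and `firing_severities`. In a real deployment this logic would normally live in Prometheus alerting rules built on recording rules, not in Python.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BurnWindow:
    """One long/short window pair with its burn-rate threshold (hypothetical type)."""
    long_window: str
    short_window: str
    threshold: float
    severity: str


# Assumed values, borrowed from the Google SRE Workbook convention.
# The Skill itself may use different windows or thresholds.
WINDOWS = (
    BurnWindow("1h", "5m", 14.4, "critical"),
    BurnWindow("6h", "30m", 6.0, "critical"),
    BurnWindow("3d", "6h", 1.0, "warning"),
)


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how fast the error budget is consumed (1.0 means exactly on budget)."""
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def firing_severities(error_ratios: dict[str, float], slo_target: float) -> list[str]:
    """Return the severity of every window pair whose long AND short burn rates exceed the threshold."""
    fired = []
    for w in WINDOWS:
        long_burn = burn_rate(error_ratios[w.long_window], slo_target)
        short_burn = burn_rate(error_ratios[w.short_window], slo_target)
        # Requiring both windows keeps alerts from firing on stale or momentary spikes.
        if long_burn > w.threshold and short_burn > w.threshold:
            fired.append(w.severity)
    return fired
```

Requiring the short window to agree with the long window is what limits flapping: the alert clears soon after the error rate recovers, even though the long window still remembers the incident.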
Quick Start
Use the grafana-prometheus-monitoring skill to generate a complete monitoring plan and Grafana provisioning configuration for your Docker Compose stack so dashboards, datasources, scrape targets, and SLO-based alerts are ready without manual UI steps.
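As a rough illustration of what "scrape targets ready without manual UI steps" can look like, the sketch below writes a Prometheus file-based service discovery file in the JSON form Prometheus accepts (a list of objects with `targets` and `labels`). The hostnames, ports, label values, and the helper names `build_file_sd` and `write_file_sd` are hypothetical and not taken from the Skill.

```python
import json
from pathlib import Path


def build_file_sd(groups):
    """Return file_sd entries: a list of {"targets": [...], "labels": {...}} dicts."""
    return [
        {"targets": sorted(g["targets"]), "labels": dict(g["labels"])}
        for g in groups
    ]


def write_file_sd(path: Path, groups) -> None:
    """Write the target file via a temp file and rename, so readers never see a partial file."""
    entries = build_file_sd(groups)
    tmp = path.with_suffix(path.suffix + ".tmp")
    # Sorted keys and sorted targets keep the output stable, so git diffs stay small.
    tmp.write_text(json.dumps(entries, indent=2, sort_keys=True) + "\n")
    tmp.replace(path)


# Hypothetical targets for a Docker Compose stack; adjust hosts and ports to your services.
EXAMPLE_GROUPS = [
    {"targets": ["dcgm-exporter:9400"], "labels": {"job": "gpu", "env": "dev"}},
    {"targets": ["vllm:8000"], "labels": {"job": "inference", "env": "dev"}},
]

if __name__ == "__main__":
    write_file_sd(Path("targets.json"), EXAMPLE_GROUPS)
```

A scrape job in prometheus.yml would then point at the generated file through `file_sd_configs`, so adding a target becomes a reviewed commit instead of a UI change.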
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: grafana-prometheus-monitoring
Download link: https://github.com/drewid74/ai_skills/archive/main.zip#grafana-prometheus-monitoring
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.