aiops-observability-copilot
Community
Explain alerts and generate queries fast
Author: ivanshamaev
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
It reduces time-to-understanding during incidents by turning infrastructure questions into actionable PromQL/LogQL, and by converting firing alerts into plain-English explanations with contextual incident enrichment.
Core Features & Use Cases
- Natural language to PromQL/LogQL translation: Converts user questions into Prometheus and Grafana Loki queries for metrics and log exploration.
- Plain-English alert explanation: Explains what fired, why it matters, the most likely causes, and the immediate next action to take.
- Context-rich incident summaries: Enriches on-call narratives using recent deployments, related firing alerts, and runbook links.
- Noisy log reduction via error clustering: Normalizes and clusters similar error log lines to surface unique failure patterns.
- Grafana dashboard auto-generation: Produces a Grafana dashboard JSON based on service topology and desired metrics.
- Weekly observability health digests: Summarizes alert trends and SLO risk into a short improvement-focused report.
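The error-clustering feature above can be illustrated with a minimal sketch. The skill's actual normalization rules are not documented here, so the patterns and the names `normalize` and `cluster_errors` below are assumptions chosen for illustration: variable parts of each line (UUIDs, IP addresses, hex values, numbers) are replaced with placeholders, and lines that become identical are counted together.

```python
import re
from collections import Counter

# Hypothetical normalization rules (not taken from the skill itself).
# Order matters: specific patterns run before the generic number rule.
_RULES = [
    (re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b", re.I), "<uuid>"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"\b0x[0-9a-f]+\b", re.I), "<hex>"),
    (re.compile(r"\d+"), "<num>"),
]


def normalize(line):
    """Replace variable tokens with placeholders and collapse whitespace."""
    for pattern, token in _RULES:
        line = pattern.sub(token, line)
    return " ".join(line.split())


def cluster_errors(lines):
    """Group log lines by normalized form; return (pattern, count), most frequent first."""
    counts = Counter(normalize(line) for line in lines if line.strip())
    return counts.most_common()


if __name__ == "__main__":
    sample = [
        "timeout connecting to 10.0.0.1:5432 after 30s",
        "timeout connecting to 10.0.0.2:5432 after 45s",
        "user 42 not found",
    ]
    for pattern, count in cluster_errors(sample):
        print(count, pattern)
```

Exact-match grouping after normalization is the simplest form of clustering; a real implementation might also use fuzzy or token-based similarity to merge near-duplicates.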
Quick Start
Ask the AI to translate your question into PromQL or LogQL, then request an alert explanation with incident context for the firing alert using your current metric value and alert labels.
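As a rough illustration of the second step, the sketch below assembles the text of an alert-explanation request from an alert name, its labels, and the current metric value. The skill does not document a required input format, so the function `build_alert_request`, its parameters, and the example alert are all hypothetical.

```python
def build_alert_request(alert_name, labels, current_value):
    """Format a plain-text request for an alert explanation.

    Hypothetical helper: the wording and layout are assumptions,
    not a format the skill is known to require.
    """
    # Sort labels so the same alert always produces the same text.
    label_text = ", ".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    lines = [
        f"Explain the firing alert {alert_name}.",
        f"Labels: {label_text}",
        f"Current metric value: {current_value}",
        "Include incident context, the most likely causes, and the immediate next action.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example alert name and labels are made up for illustration.
    print(build_alert_request(
        "HighErrorRate",
        {"service": "checkout", "severity": "page"},
        0.07,
    ))
```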
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: let Claude install the Skill automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill: Name: aiops-observability-copilot Download link: https://github.com/ivanshamaev/de-agent-skills/archive/main.zip#aiops-observability-copilot Please download this .zip file, extract it, and install it in the .claude/skills/ directory.