aiops-observability-copilot

Community

Explain alerts and generate queries fast

Author: ivanshamaev
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

It reduces time-to-understanding during incidents by turning infrastructure questions into actionable PromQL/LogQL, and by converting firing alerts into plain-English explanations with contextual incident enrichment.

Core Features & Use Cases

  • Natural language to PromQL/LogQL translation: Converts user questions into Prometheus and Grafana Loki queries for metrics and log exploration.
  • Plain-English alert explanation: Explains what fired, why it matters, the most likely causes, and the immediate next action to take.
  • Context-rich incident summaries: Enriches on-call narratives using recent deployments, related firing alerts, and runbook links.
  • Noisy log reduction via error clustering: Normalizes and clusters similar error log lines to surface unique failure patterns.
  • Grafana dashboard auto-generation: Produces a Grafana dashboard JSON based on service topology and desired metrics.
  • Weekly observability health digests: Summarizes alert trends and SLO risk into a short improvement-focused report.
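
To make the error-clustering feature concrete, here is a minimal Python sketch of one common way to do it: mask the variable parts of each log line (IDs, addresses, numbers) and count the resulting templates. The listing does not document how the skill does this internally, so the regular expressions, placeholder names, and the functions `normalize` and `cluster_errors` are illustrative assumptions, not the skill's actual implementation.

```python
import re
from collections import Counter

# Hypothetical masking rules (an assumption for illustration; the skill's
# real normalization rules are not documented). Order matters: specific
# patterns run before the generic digit rule.
_PATTERNS = [
    (re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b", re.I), "<uuid>"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"\b0x[0-9a-f]+\b", re.I), "<hex>"),
    (re.compile(r"\d+"), "<n>"),
]


def normalize(line: str) -> str:
    """Replace variable tokens with placeholders so similar lines collapse."""
    for pattern, placeholder in _PATTERNS:
        line = pattern.sub(placeholder, line)
    return line.strip()


def cluster_errors(lines):
    """Return (template, count) pairs, most frequent template first."""
    counts = Counter(normalize(line) for line in lines)
    return counts.most_common()


if __name__ == "__main__":
    sample = [
        "timeout after 30s calling 10.0.0.1",
        "timeout after 45s calling 10.0.0.2",
        "user 42 not found",
    ]
    for template, count in cluster_errors(sample):
        print(count, template)
```

Exact-match grouping on masked templates is the simplest form of clustering. It will not merge lines that differ in wording, which may or may not match what the skill does.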

Quick Start

Ask the AI to translate your question into PromQL or LogQL. Then ask it to explain the firing alert with incident context, supplying the current metric value and the alert labels.
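
As a rough illustration of the inputs the Quick Start mentions, the Python sketch below assembles a plain-text request from a question, an alert name, its labels, and the current metric value. The function name `build_alert_prompt`, the wording of the request, and the example alert `HighErrorRate` are all hypothetical; the skill accepts natural language and does not require this format.

```python
def build_alert_prompt(alert_name, labels, current_value, question=None):
    """Assemble a plain-text request for the skill (hypothetical format)."""
    # Sort labels so the generated text is stable between runs.
    label_text = ", ".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    parts = []
    if question:
        parts.append(f"Translate this question into PromQL or LogQL: {question}")
    parts.append(
        f"Explain the firing alert {alert_name} with incident context. "
        f"Current value: {current_value}. Labels: {label_text}."
    )
    return "\n".join(parts)


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(
        build_alert_prompt(
            "HighErrorRate",
            {"service": "checkout", "severity": "page"},
            0.07,
            question="What is the 5xx error rate for checkout over the last 5 minutes?",
        )
    )
```

The resulting text can be pasted into the agent as-is or adapted freely.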

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: aiops-observability-copilot
Download link: https://github.com/ivanshamaev/de-agent-skills/archive/main.zip#aiops-observability-copilot

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.