observability-apm-expert
Official
End-to-end observability for reliable services
Author: curiositech
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Distributed systems often lack coherent telemetry, which leaves teams blind to cross-service performance issues, missing error traces, and buried in noisy alerts. This Skill provides structured guidance for designing tracing, metrics, and logging so that incidents are detectable and actionable.
Core Features & Use Cases
- Sampling Strategy & Trace Retention: Guidance on retaining 100% of error traces, applying tail-based sampling to high-cardinality services, and setting policies for slow-trace capture.
- Backend Selection & Integration: Recommendations for self-hosted Grafana stack (Tempo, Mimir, Loki) or SaaS vendors (Datadog, Honeycomb) plus OTLP collector configuration and fallbacks.
- Alerting and SLOs: SLO definition, error-budget calculation, burn-rate thresholds, and runbook linkage to reduce alert fatigue and speed response.
- Incident Investigation Playbooks: Stepwise trace-first triage, correlation with infrastructure metrics, and remediation actions for common root causes like DB pool exhaustion.
- Instrumentation Guidance: Practical advice on OpenTelemetry SDK usage, context propagation, log-to-trace correlation, and business-metric instrumentation with bounded cardinality.
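The sampling policy named in the first bullet (keep every error trace, keep slow traces, sample the rest) can be sketched as a tail-based decision made after a trace completes. This is a minimal illustration in plain Python, not the Skill's own implementation and not an OpenTelemetry API. `TraceSummary`, `keep_trace`, the 1000 ms slow threshold, and the 5% baseline ratio are all hypothetical names and values chosen for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class TraceSummary:
    """Hypothetical summary of a finished trace (not an OpenTelemetry type)."""
    trace_id: str
    has_error: bool
    duration_ms: float


def keep_trace(
    trace: TraceSummary,
    slow_threshold_ms: float = 1000.0,  # assumed value, tune per service
    baseline_ratio: float = 0.05,       # assumed value, tune per service
) -> bool:
    """Tail-based decision: runs once the whole trace is known."""
    # Policy 1: retain 100% of traces that contain an error.
    if trace.has_error:
        return True
    # Policy 2: retain traces slower than the threshold.
    if trace.duration_ms >= slow_threshold_ms:
        return True
    # Policy 3: deterministic baseline sample of everything else.
    # Hashing the trace ID maps it to a stable bucket in [0, 1), so
    # every collector instance reaches the same decision for a trace.
    digest = hashlib.sha256(trace.trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < baseline_ratio
```

Hashing the trace ID rather than drawing a random number keeps the decision reproducible, which matters when more than one process evaluates the same trace.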
Quick Start
Describe your service topology and ask for a recommended OpenTelemetry sampling strategy, backend choice, and the SLO/alert configuration to apply.
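As an illustration of the SLO and alert arithmetic such a request covers, the sketch below computes an error budget and a burn rate, then applies a two-window paging check. It is a hedged example rather than output of the Skill. The function names are hypothetical, and the default threshold of 14.4 is an assumption borrowed from common multi-window practice, where it corresponds to spending 2% of a 30-day budget in one hour.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail, e.g. about 0.001 for a 99.9% SLO."""
    return 1.0 - slo_target


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the budget is being spent; 1.0 means exactly on budget."""
    budget = error_budget(slo_target)
    if budget <= 0.0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def should_page(
    short_window_error_ratio: float,
    long_window_error_ratio: float,
    slo_target: float,
    threshold: float = 14.4,  # assumed default, see lead-in
) -> bool:
    """Page only when both windows burn fast, which filters brief spikes."""
    short_burn = burn_rate(short_window_error_ratio, slo_target)
    long_burn = burn_rate(long_window_error_ratio, slo_target)
    return short_burn >= threshold and long_burn >= threshold
```

Requiring both windows to exceed the threshold is one way to reduce alert fatigue: a short spike alone does not page, and an alert stops firing soon after the error ratio recovers.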
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill: Name: observability-apm-expert Download link: https://github.com/curiositech/port-daddy/archive/main.zip#observability-apm-expert Please download this .zip file, extract it, and install it in the .claude/skills/ directory.