Monitoring
Prometheus & Grafana for Service Owners
We pair chart literacy with annotation etiquette so incidents stay collaborative instead of adversarial.
Inside the syllabus
- Dashboard critique using real anonymised metrics
- Recording rule vs. query cost exercise
- Annotation workshop tied to incident retros
- Alert fatigue reduction worksheet
- Template for “definition of done” on monitoring tickets
- Guest slot from a product manager on readable SLOs
Outcomes you can evidence
- Co-author a dashboard that passes mentor review
- Draft three alert descriptions devoid of jargon pile-ons
- Facilitate a metrics review without dominating the room
Lead mentor
Noah Ibrahim
Pairs this offering with the OpenTelemetry track for progressive depth.
FAQ
Light PromQL only; no application instrumentation in this module.
We discuss portable patterns even while using Grafana Cloud.
Senior SREs seeking advanced Prometheus operator topics should choose another track.
Recent participant notes
Annotation etiquette section stopped our Slack threads from turning accusatory.
Liked the PM guest slot—grounded the charts in roadmap language.