Reliable Insights – Robust Perception

April 23, 2018

Why can count(x > 5) not return 0?

When using the count aggregation operator you may have noticed that it sometimes returns nothing rather than 0. Why is this?

Published by Brian Brazil in Posts

Tags: prometheus, promql
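
As a rough illustration of the count(x > 5) question above, here is a minimal Python sketch of the behaviour. It is a toy model, not how Prometheus is implemented: the dict-based "instant vector", the helper names filter_gt and count, and the sample series are all assumptions made up for this example. The point it tries to show is that a filter such as x > 5 drops non-matching series entirely, and an aggregation over no series has no group to emit a value for.

```python
# Toy model of a PromQL instant vector: a dict mapping a series' label set
# (a tuple of (name, value) pairs) to its current sample value.
# The series and values below are invented for illustration.
x = {
    (("instance", "a"),): 3.0,
    (("instance", "b"),): 4.0,
}

def filter_gt(vector, threshold):
    """Loosely mimics `x > threshold`: keep only series above the threshold."""
    return {labels: v for labels, v in vector.items() if v > threshold}

def count(vector):
    """Loosely mimics `count(...)` without a by-clause.

    One output entry is created per group that has at least one input
    series. With no input series, no group is ever created, so the
    result is empty rather than a single entry with value 0.
    """
    groups = {}
    for _labels in vector:
        groups[()] = groups.get((), 0) + 1
    return groups

some = count(filter_gt(x, 3.5))  # one series matches
none = count(filter_gt(x, 5))    # no series match

print(some)  # {(): 1}
print(none)  # {}
```

In PromQL itself, a commonly used workaround when a 0 is wanted is count(x > 5) or vector(0), which falls back to a label-less 0 when the left-hand side is empty.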

April 16, 2018

When to Alert with Prometheus

Alerting is an art. You must alert just enough to be aware of all problems arising in the monitored system, while at the same time not drowning out the signal with excess noise. In this blog post we'll explain some of the best practices to use when alerting with Prometheus.

Published by Conor Broderick in Posts

Tags: alerting, best practices, prometheus

April 9, 2018

Using Geohashes with the Worldmap Panel and Prometheus

Wouldn't it be nice to have arbitrary locations on the Worldmap panel?

Published by Brian Brazil in Posts

Tags: grafana, prometheus

April 2, 2018

Using the Worldmap Panel with Prometheus

The Worldmap Panel for Grafana lets you display metrics on a map.

Published by Brian Brazil in Posts

Tags: grafana, prometheus

March 26, 2018

Why not send graphs with alerts?

You may have noticed that notifications from the Alertmanager are text. Wouldn't it be nice if Prometheus sent graphs along?

Published by Brian Brazil in Posts

Tags: best practices, design, prometheus

March 19, 2018

Alerting on crash loops with Prometheus

If your applications are restarting regularly, whether due to segfaults or OOMs, it'd be nice to know.

Published by Brian Brazil in Posts

Tags: alerting, prometheus, promql

March 12, 2018

New Features in Prometheus 2.2.0

Prometheus 2.2.0 is now out, following on from 2.1.0 back in January with several fixes and improvements.

Published by Brian Brazil in Posts

Tags: prometheus, releases

March 5, 2018

Using sample_limit to avoid overload

Worried that your application metrics might suddenly explode in cardinality? sample_limit can save you.

Published by Brian Brazil in Posts

Tags: prometheus, reliability

February 26, 2018

Dude, where’s my exporter?

So you have just discovered Prometheus and want to try it out, or use it to replace your old monitoring system, but you have run into a part of your stack that you cannot instrument with a client library and for which there is no officially supported exporter. What do you do?

Published by Conor Broderick in Posts

Tags: best practices, exporters, prometheus

February 19, 2018

Common pitfalls when using the Pushgateway

Ephemeral jobs are often not around long enough to have their metrics scraped by Prometheus. To remedy this, the Pushgateway was developed: such jobs push their metrics to it, and it caches them so that Prometheus can scrape them long after the original jobs have gone away. This blog post discusses some of the common pitfalls users fall into when adding the Pushgateway to their monitoring stack.

Published by Conor Broderick in Posts

Tags: best practices, prometheus, pushgateway