Reliable Insights

A blog on monitoring, scale and operational Sanity

August 28, 2017

Avoid irate() in alerts

While the irate() function is useful for granular graphs, it is not suitable for alerting.

Read more

August 21, 2017

Relabelling can discard targets, timeseries and alerts

When relabelling is suggested for selecting targets with service discovery, there's sometimes a misunderstanding that relabelling can only change labels. That's not the case.

Read more

August 14, 2017

Which kind of push? Events or metrics?

Continuing in our exploration of the ongoing epic saga of push vs. pull where the very future of humanity is at stake, let's look at two general classes of push that are often conflated.

Read more

August 7, 2017

Setting a Prometheus Counter

We're often asked how to call set() on a Counter. So how do you do that?

Read more

July 31, 2017

Configuring Blackbox exporter timeouts

Wondering how the cool kids are configuring their Blackbox probe timeouts these days?

Read more

July 24, 2017

Existential issues with metrics

The Prometheus instrumentation best practices say to "Avoid missing metrics". Let's look at why, and how to deal with it.

Read more

July 17, 2017

High Availability Prometheus Alerting and Notification

Prometheus is architected for reliable alerting. How do you set it up?

Read more

July 10, 2017

How to unit test Prometheus instrumentation

If you've determined a metric should be tested, how do you go about that?

Read more

July 3, 2017

When should you unit test instrumentation?

Should you unit test every bit of instrumentation you add? Not always.

Read more

June 26, 2017

Exposing Dropwizard metrics to Prometheus

If you've an existing instrumentation library in use, it's not always practical to immediately switch to a Prometheus instrumentation library. There are a multitude of integrations available to aid your transition.

Read more
