Reliable Insights

A blog on monitoring, scale and operational Sanity

July 24, 2017

Existential issues with metrics

The Prometheus instrumentation best practices say to "Avoid missing metrics". Let's look at why, and how to deal with it.

July 17, 2017

High Availability Prometheus Alerting and Notification

Prometheus is architected for reliable alerting. How do you set it up?

July 10, 2017

How to unit test Prometheus instrumentation

If you've determined a metric should be tested, how do you go about that?

July 3, 2017

When should you unit test instrumentation?

Should you unit test every bit of instrumentation you add? Not always.

June 26, 2017

Exposing Dropwizard metrics to Prometheus

If you already have an instrumentation library in use, it's not always practical to switch to a Prometheus instrumentation library immediately. A multitude of integrations is available to aid your transition.

June 19, 2017

Blackbox Exporter Benchmarking

Have you ever wondered how many CPU seconds it takes to probe an instance 100, 1,000, or 10,000 times via TCP or HTTP?

June 12, 2017

New Features in Prometheus 1.7.0

After 1.6.0 back in April, Prometheus 1.7.0 is now out. Let's look at what has changed.

June 5, 2017

Extracting labels from legacy metric names

When metrics come from another system, they often don't have labels. metric_relabel_configs offers one way around that.

May 29, 2017

What’s in a __name__?

You may have noticed that most PromQL functions and operators remove the metric name in their result. Let's look at why.

May 22, 2017

Push needs Service Discovery

It's often claimed that an advantage of push-based monitoring systems is that, compared to pull-based systems like Prometheus, they don't need service discovery. This isn't true, and I'm going to explain why.
