Reliable Insights – Page 16 – Robust Perception

February 12, 2018

Alerting on gauges in Prometheus 2.0

One of the major changes introduced in Prometheus 2.0 was staleness handling. Previously, for instant vectors, Prometheus would return a point from up to 5 minutes in the past, which caused a number of issues.

Published by Conor Broderick in Posts

Tags: alerting, prometheus, promql

February 5, 2018

What percentage of time is my service down for?

Have you ever wondered what percentage of time a given service or application spends up or down?

Published by Conor Broderick in Posts

Tags: blackbox_exporter, prometheus, promql

January 29, 2018

Adding Basic Auth to Prometheus with Apache

Having previously discussed why the Prometheus project does not support SSL and user authentication out of the box, and detailed how to add basic authentication with Nginx, we will now demonstrate how to do the same with Apache.

Published by Conor Broderick in Posts

Tags: apache, auth, prometheus, security

January 22, 2018

New Features in Prometheus 2.1.0

Prometheus 2.1.0 is now out, following on from 2.0.0 last month with several fixes and improvements.

Published by Brian Brazil in Posts

Tags: prometheus, releases

January 15, 2018

Instrumenting a Ruby on Rails Application with Prometheus

In this blogpost we'll run you through a quick 'hello world' example of instrumenting a Rails application with the Prometheus Ruby client.

Published by Conor Broderick in Posts

Tags: client, instrumentation, prometheus, ruby

January 8, 2018

Measuring the performance impact of Meltdown/Spectre with Prometheus

The world of infosec is alarmed right now over the recent security vulnerabilities disclosed by Google on Wednesday that affect Intel, AMD, and ARM chips.
The now infamous Meltdown and Spectre bugs allow for the reading of sensitive information from a system's memory, including passwords and private keys.

Thankfully, fixes are being swiftly rolled out to patch these issues. However, they come at a performance cost, which we will use Prometheus to explore in this blogpost.

Published by Conor Broderick in Posts

Tags: golang, loadtesting, prometheus, security

January 1, 2018

Rule groups for hierarchical aggregation

Prometheus 2.0 brought with it rule groups, making hierarchical aggregation easier than ever.

Published by Brian Brazil in Posts

Tags: prometheus, promql

December 25, 2017

Keep It Simple scrape_interval-id

How many scrape intervals should you have in a Prometheus?

Published by Brian Brazil in Posts

Tags: best practices, prometheus

December 18, 2017

What’s the difference between group_interval, group_wait, and repeat_interval?

In this blogpost we outline the key differences between three commonly confused alerting configuration options: group_interval, group_wait, and repeat_interval.

Published by Conor Broderick in Posts

Tags: alerting, alertmanager, prometheus

December 11, 2017

Why are Prometheus histograms cumulative?

Have you ever wondered why the buckets in histograms are not just counters of events that fall into each bucket?

Published by Brian Brazil in Posts

Tags: design, prometheus, promql, relabelling