A blog on monitoring, scale and operational Sanity
As I mentioned in a previous post, I made some memory-related improvements to Prometheus that'll be in the 1.5 release. Let's look at how I came across unneeded memory allocations and ultimately improved the code.
There are various ways Prometheus federation can be used. To ensure your monitoring is scalable and reliable, let's look at how best to use it.
It can be a little confusing to figure out Prometheus memory usage. Let's break part of it down.
A common question around Prometheus client libraries is how much RAM they'll use on a busy process. There tends to be disbelief when we say it's the same as an inactive server. Let's look deeper.
The blackbox_exporter allows for a variety of network checks to be performed, with many common modules available out of the box.
We previously looked at finding your biggest metrics, but that involves an expensive query. A new feature in Prometheus 1.3 offers another approach.
If you try to run max_over_time(rate(my_counter_total[5m])[1h]) or predict_linear(rate(my_counter_total[5m])[1d], 3600) in Prometheus, it won't work. How can you combine these functions?
The Alertmanager has integrations with a variety of popular notification mechanisms. Let's see how easy it is to hook it into OpsGenie.