A blog on monitoring, scale and operational Sanity
There are various ways Prometheus federation can be used. To ensure your monitoring is scalable and reliable, let's look at how best to use it.
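For example, a global Prometheus can pull pre-aggregated series from per-datacenter servers via the /federate endpoint. A minimal sketch (the job-level match pattern and target names are illustrative):

```yaml
scrape_configs:
  - job_name: 'federate'
    honor_labels: true           # keep the labels from the source Prometheus
    metrics_path: '/federate'
    params:
      'match[]':
        - '{__name__=~"job:.*"}' # only pull aggregated, job-level series
    static_configs:
      - targets:
          - 'prometheus-dc1:9090'
          - 'prometheus-dc2:9090'
```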
One of the advantages of a pull-based monitoring system such as Prometheus is that you can tell whether the target is healthy as part of the scrape. How do you do that, though?
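The short answer involves the up metric: Prometheus sets up to 1 for every successful scrape and 0 for a failed one, so an alerting rule can key off it directly. A minimal sketch (the alert name and the 5m duration are illustrative):

```yaml
groups:
  - name: availability
    rules:
      - alert: InstanceDown
        expr: up == 0   # 1 on a successful scrape, 0 on failure
        for: 5m         # avoid paging on a single missed scrape
        annotations:
          summary: '{{ $labels.instance }} of {{ $labels.job }} is down'
```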
The Prometheus instrumentation guidelines say to use seconds, and the timing functions in client libraries follow this. Why?
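For instance, the Python client's timing helpers observe durations in seconds, so the name carries the base unit and no conversion is needed anywhere. A small sketch (the metric and function names are illustrative):

```python
import time
from prometheus_client import Summary

# Base units: the unit is in the name, and observations are in seconds
REQUEST_TIME = Summary('request_processing_seconds',
                       'Time spent processing a request')

@REQUEST_TIME.time()   # records the elapsed time of each call, in seconds
def process_request():
    time.sleep(0.1)    # stand-in for real work
```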
It can seem like a good idea to use recording rules to make the content of a time series more explicit, particularly for those not used to labels. However, this usually leads to confusing names and loses the benefits of labels.
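As an illustration of the trap, compare baking a label value into a rule's name with keeping the label and aggregating (the metric and label names here are hypothetical):

```yaml
groups:
  - name: example
    rules:
      # Anti-pattern: one rule per label value, with the value baked into the name
      - record: api_http_get_requests:rate5m
        expr: rate(http_requests_total{method="GET"}[5m])
      # Better: one rule that keeps the method label, so you can still filter and aggregate
      - record: job_method:http_requests:rate5m
        expr: sum by (job, method) (rate(http_requests_total[5m]))
```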
How you choose to name metrics is important. If everyone chose different schemes it'd lead to confusion and irritation, and prevent us from sharing and reusing each other's work. I'd like to share some guidelines to help keep things sane for everyone.
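As a taste of those guidelines: snake_case, one unit per metric, the unit as a suffix in base units, and _total on counters. Some illustrative good and bad names:

```
http_request_duration_seconds   # good: base unit (seconds) as the suffix
node_memory_usage_bytes         # good: clear prefix and unit
http_requests_total             # good: a counter, so it ends in _total

http_request_duration_millis    # avoid: non-base unit
memory_usage                    # avoid: no unit at all
total_http_requests             # avoid: _total belongs at the end
```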
Another common question we get about Prometheus is why we don't have a single per-machine agent that handles all the collection, and instead have one exporter per application. Doesn't that make it harder to manage?
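In practice each exporter runs beside the application it exposes and is scraped as its own target, so a broken exporter doesn't take out the machine's other metrics. A sketch of the scrape config (the hostname is illustrative; 9100 and 9101 are the node and HAProxy exporters' default ports):

```yaml
scrape_configs:
  - job_name: 'node'      # machine-level metrics from node_exporter
    static_configs:
      - targets: ['web1.example.com:9100']
  - job_name: 'haproxy'   # a separate exporter just for HAProxy, on the same host
    static_configs:
      - targets: ['web1.example.com:9101']
```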
How should you choose the labels to put on your Prometheus monitoring targets? Let's take a look.
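As a starting point, labels attached to a target in the scrape config end up on every time series it produces, so they should capture things you'll want to filter or aggregate by. A sketch (the env and team labels are illustrative):

```yaml
scrape_configs:
  - job_name: 'api'
    static_configs:
      - targets: ['api1.example.com:8080', 'api2.example.com:8080']
        labels:
          env: production   # every series from these targets gets these labels
          team: payments
```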
When designing a monitoring system and the datastore that goes with it, it can be tempting to go straight for a clustered, highly consistent approach. But is that the best choice?
Since starting Robust Perception a year ago I've given 20+ tech talks at various meetups and conferences across the world. I'd like to share some tips around the practicalities of speaking that I've learned along the way.
There's a common misunderstanding when dealing with Prometheus counters: how to apply aggregation and other operations when using rate() and other counter-only functions.
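The rule of thumb is to take the rate first and aggregate afterwards, because rate() must see each counter's raw samples to handle resets. A sketch, assuming a counter named http_requests_total:

```
# Right: rate() per series first, then sum, so counter resets are handled correctly
sum by (job) (rate(http_requests_total[5m]))

# Wrong: summing first hides individual counter resets from rate()
# (this order even needs a subquery to parse, which is a hint it's backwards)
rate(sum by (job) (http_requests_total)[5m:])
```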