Reliable Insights

A blog on monitoring, scale and operational Sanity

September 10, 2018

Using the textfile collector from a shell script

The node exporter's textfile collector is handy for monitoring machine-level cronjobs. How would you go about that?

Published by Brian Brazil in Posts

Tags: node exporter, prometheus, textfile collector

September 3, 2018

Deleting time series from Prometheus

If a misconfiguration leads to unwanted time series, it'd be good to know how to remove them.

Published by Brian Brazil in Posts

Tags: prometheus

August 27, 2018

Dealing with “too many open files”

While not a problem specific to Prometheus, being affected by the open files ulimit is something you're likely to run into at some point.

Published by Brian Brazil in Posts

Tags: prometheus, reliability

August 20, 2018

Using the Java client with Gradle

While the Java client library uses pom.xml and Maven, there's nothing stopping you from using other tools such as Gradle.

Published by Brian Brazil in Posts

Tags: gradle, java, prometheus

August 13, 2018

Why predeclare metrics?

The standard way to use metrics in Prometheus is to declare them at file level, before using them. Why?

Published by Brian Brazil in Posts

Tags: best practices, design, prometheus, python

Aggregating across batch job runs with push_time_seconds

For counting how many times a thing has happened you can use a counter and rate(), but that doesn't work across batch jobs.

Published by Brian Brazil in Posts

Tags: prometheus, promql, pushgateway

Prometheus: Up and Running is out

After many months of work, Prometheus: Up & Running is now available for purchase!

Published by Brian Brazil in Posts

Tags: prometheus, releases

Absent Alerting for Scraped Metrics

In the previous post we looked at how to handle all the targets for a job disappearing. What if you wanted to alert on specific metrics from one target disappearing?

Published by Brian Brazil in Posts

Tags: alerting, prometheus, promql

Absent Alerting for Jobs

Alerting on numbers being too big or small is easy with Prometheus. But what if the numbers go missing?

Published by Brian Brazil in Posts

Tags: alerting, prometheus, promql

ICMP Pings with the Blackbox exporter

The Blackbox exporter can perform ICMP probes. Let's see how.

Published by Brian Brazil in Posts

Tags: blackbox_exporter, prometheus, relabelling
