Have you ever wondered what percentage of time a given service or application spends up or down?
In this blog post we'll demonstrate how to use the Blackbox exporter with Prometheus to measure exactly that.
As a simple, contrived example, we'll run both the Blackbox exporter and the Node exporter, and configure Prometheus to have the Blackbox exporter issue a simple HTTP probe against the Node exporter and then scrape the result.
    global:
      scrape_interval: 5s
      evaluation_interval: 5s

    scrape_configs:
      - job_name: 'node'
        metrics_path: /probe
        params:
          module: [http_2xx]  # Look for a HTTP 200 response.
        static_configs:
          - targets:
            - :9100
        relabel_configs:
          - source_labels: [__address__]
            target_label: __param_target
          - source_labels: [__param_target]
            target_label: instance
          - target_label: __address__
            replacement: 127.0.0.1:9115  # The blackbox exporter's real hostname:port.
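To make the relabelling a little more concrete, here is a minimal Python sketch of the URL that Prometheus ends up requesting once the three relabel rules have run: the original target is moved into the `target` URL parameter, and the scrape address is swapped for the Blackbox exporter's. The helper `probe_url` is a hypothetical name used only for illustration, and the exact percent-encoding Prometheus applies may differ slightly.

```python
from urllib.parse import urlencode


def probe_url(exporter_address: str, module: str, target: str) -> str:
    """Build the probe URL that results from the relabel rules.

    exporter_address: value written into __address__ (the Blackbox exporter).
    module:           value of the `module` param from the scrape config.
    target:           original __address__, copied into __param_target.
    """
    query = urlencode({"module": module, "target": target})
    return f"http://{exporter_address}/probe?{query}"


# Values taken from the configuration above.
url = probe_url("127.0.0.1:9115", "http_2xx", ":9100")
print(url)
```

Opening a URL of this shape in a browser or with curl is a quick way to check what the Blackbox exporter reports for a target before Prometheus is involved at all.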
The Blackbox exporter's probe_success metric simply reports 1 or 0, depending on whether the probed target answered our probe with a successful HTTP response. Using the query function avg_over_time(), we can average that metric over a given time period, which gives the fraction of probes that succeeded.
The examples below show the result of this query function when looking at probe_success over a period of 15 minutes. We multiply by 100 to get a percentage.
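The arithmetic behind this is just the mean of a series of ones and zeros. The sketch below is an illustration only, not how Prometheus is implemented: it assumes the 5 second scrape interval from the configuration, an evenly sampled 15 minute window, and a hypothetical outage of 3 minutes. The function name `availability_percent` is made up for this example.

```python
def availability_percent(samples):
    """Mean of 0/1 probe results, expressed as a percentage."""
    if not samples:
        raise ValueError("no samples in the window")
    return 100.0 * sum(samples) / len(samples)


SCRAPE_INTERVAL_S = 5    # scrape_interval from the configuration
WINDOW_S = 15 * 60       # the [15m] range in the query
OUTAGE_S = 3 * 60        # assumed downtime, for illustration only

total = WINDOW_S // SCRAPE_INTERVAL_S   # number of samples in the window
down = OUTAGE_S // SCRAPE_INTERVAL_S    # samples where the probe failed

# 1 for every successful probe, 0 for every failed one.
samples = [1] * (total - down) + [0] * down
percent_up = availability_percent(samples)
print(percent_up)
```

In a real setup the number of samples that fall inside the window can be off by one, so the value Prometheus reports may differ very slightly from this idealised figure.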
To bring the percentage down to 80%, I killed the Node exporter for a few minutes.
(The full query used is avg_over_time(probe_success{job="node"}[15m]) * 100.)
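If you would rather fetch this number from a script than from the expression browser, the same query can be sent to the Prometheus HTTP API endpoint `/api/v1/query`. The following is a minimal sketch using only the Python standard library. It assumes Prometheus is listening on `http://localhost:9090` (the default port, but an assumption about this setup), and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PROMETHEUS_URL = "http://localhost:9090"  # assumption: default local Prometheus
QUERY = 'avg_over_time(probe_success{job="node"}[15m]) * 100'


def instant_query_url(base_url, query):
    """URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': query})}"


def extract_values(payload):
    """Map each series' instance label to its value from an API response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    return {
        series["metric"].get("instance", ""): float(series["value"][1])
        for series in payload["data"]["result"]
    }


def fetch_availability(base_url=PROMETHEUS_URL, query=QUERY):
    """Run the query against a live server and return percentages per instance."""
    with urlopen(instant_query_url(base_url, query), timeout=10) as response:
        return extract_values(json.load(response))
```

Calling `fetch_availability()` against a running server should return a dictionary keyed by the probed instance, with the percentage of successful probes over the last 15 minutes as the value.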
Interested in gaining more operational insights with Prometheus? Contact us.