best practices – Page 7 – Robust Perception | Prometheus Monitoring Experts

October 8, 2015

Monitoring: Not Just For Outages

It's common to think of monitoring only as something that alerts you when things are going wrong. At Robust Perception we believe in Inclusive Monitoring, where all aspects of systems are monitored and available to provide insight and drive decisions.

Published by Brian Brazil in Posts

Tags: best practices, estimation, inclusive monitoring, scaling

September 28, 2015

Healthchecking is Not Transitive

Systems such as Consul perform healthchecking of local services and expose this information to other machines within the cluster. Does this mean that the service will work when you try to talk to it?

Published by Brian Brazil in Posts

Tags: best practices, consul, healthchecking, reliability, rpc

September 14, 2015

Do you know your peak-to-mean ratio?

Traffic from users to your servers isn't a steady stream; it waxes and wanes over the day and week. The peak-to-mean ratio is your primary tool for avoiding the outages or unnecessary costs that this variation can cause.

Published by Brian Brazil in Posts

Tags: best practices, capacity, estimation, provisioning, reliability
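
As a rough illustration of the peak-to-mean ratio named in the excerpt above, the following Python sketch divides the highest observed traffic sample by the average of all samples. The function name `peak_to_mean` and the sample numbers are hypothetical and are not taken from the post.

```python
def peak_to_mean(samples):
    """Return the peak-to-mean ratio of a series of traffic samples."""
    if not samples:
        raise ValueError("need at least one sample")
    mean = sum(samples) / len(samples)
    if mean == 0:
        raise ValueError("mean traffic is zero; the ratio is undefined")
    return max(samples) / mean


# Hypothetical hourly averages (requests per second) over one day.
hourly_qps = [20, 15, 10, 10, 15, 30, 60, 90, 120, 140, 150, 160,
              170, 165, 160, 150, 140, 150, 180, 200, 160, 100, 60, 35]

ratio = peak_to_mean(hourly_qps)
print(f"peak-to-mean ratio: {ratio:.2f}")
```

Capacity sized only for the mean would fall short at the daily peak by about this factor, while capacity held at peak size all day sits partly idle at quieter hours.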

August 23, 2015

There are 100,000 Seconds in a Day

Just after you've launched is not the best time to find out that you can't handle the load you predicted, or that running costs are much higher than you'd like. By estimating the operational parameters of your system as you design it, you can gain confidence that the system will work as you expect.

Published by Brian Brazil in Posts

Tags: best practices, design, estimation, scaling

August 12, 2015

The Three Types of Cache

Caches are a common feature of distributed systems, often added to improve performance. There are three main types of cache, and knowing about them will help you design robust systems.

Published by Brian Brazil in Posts

Tags: availability, best practices, cache, capacity, latency, reliability
