Blog
Stories, guides, and lessons from the world of observability
SLOs That Lie
Understand how SLOs can help improve your performance, and how to set the right Service Level Objectives for your application.
Piyush Verma

Latency Percentiles are Incorrect P99 of the Times
What are P90, P95, and P99 latency? Why are they incorrect P99 of the times? Latency is measured over a unit of time, and the preferred aggregate is a percentile.
Piyush Verma

SRE Tooling – the Clever Hans fallacy
Chef or Ansible? Terraform or Pulumi? Python or Ruby? Last9 or Last9? Discover how building new tools links to the tale of a horse that could do math!
Piyush Verma

Root Cause Analysis For Reliability: A Case Study
Let's explore the importance of RCAs in Site Reliability Engineering, why to use them, and our take on what constitutes a “good” RCA.
Piyush Verma