Components in Designing Effective SLOs
A primer on how to design and implement effective Service Level Objectives (SLOs)

Strace – A Hidden Superpower
Like any OS, Linux isn’t immune to hiccups, especially when running closed-source apps where you can’t inspect the code for deeper insights.

Saturation SLO: What It Is and Why You Should Consider It
What is Saturation and why should you think about it as an SLO? Saturation can be understood as the load on your network and server resources.

Sleep Friendly Alerting
We've all been woken up with that dreaded Slack notification at ungodly hours only to realise that the alert was all smoke and no fire. The perfect recipe for dread and alert fatigue.

Services; not Server
Gone are the days of yore when we named our servers Etsy, Betsy, and Momo, fed them fish, and cleaned their poop.

Systems Observability
Observability is not just about being able to ask questions of your systems. It's also about getting those answers in minutes and not hours.

AWS security groups: canned answers and exploratory questions
While using a Terraform lifecycle rule, what do you do when you get a canned response from a security group?

If it ain't broke...
A Terraform lifecycle rule in the right place can help prevent a deadlock. But the same lifecycle rule in the wrong place?

mv aws-security-group shoot-foot
How can you run into unplanned downtime while making a seemingly harmless change like renaming an AWS security group through Terraform?