Deep Dives
Explore our deep-dive blogs for an in-depth look at various observability and reliability topics! We break down complex ideas and share valuable insights to help you understand observability and related concepts better.
Think Data Warehouse, NOT Database.
The software monitoring world is broken because of a TSDB. We deserve a TSDW
Aniket Rao
The most important aspect of software monitoring
The single most important thing to get better at in your software monitoring journey
Aniket Rao
What needs to change in software monitoring?
A wishlist of things that need to change in the world of software monitoring
Aniket Rao
How We Cut Monitoring Costs and Deprecated Thanos at Replit
Winning Replit over by taming High Cardinality data and deprecating Thanos
Prathamesh Sonpatki
Back to the Future: The R-C-A of alerting
Dissecting the RCA of Alerting - Reliability, Correlations, Actionability
Aditya Godbole
Launching Alert Studio
Modern monitoring systems depend heavily on ‘Alerting’ to reduce the Mean Time to Detect (MTTD) faulty systems. But, alerting hasn’t evolved to meet the demands of modern architectures. We’re changing that with Alert Studio.
Aditya Godbole
Everything in software monitoring is dead, apparently
Chasing shiny new toys, as always ;)
Aniket Rao
Software Monitoring — Stuck in the 00s
A short history of software monitoring, from the 00s. What has changed? Why are things so arcane?
Piyush Verma
A checklist to choose a monitoring system
A detailed checklist of points you should consider before choosing a monitoring system
Prathamesh Sonpatki
Why your monitoring costs are high
If you want to bring down your monitoring costs, you need to shake up a decision paralysis in engineering
Aniket Rao
The unresolved cost of High Cardinality
Fulfill all your food delivery orders this December 31st by taming High Cardinality data with Levitate 😉
Prathamesh Sonpatki
Why you need a Time Series Data Warehouse
What is a Time Series Data Warehouse? How does it help in your monitoring journey? How does it differ from a Time Series Database? That and more
Rishi Agrawal
Building Logs to Metrics pipelines with Vector
How to build a pipeline that converts logs to metrics and ships them to long-term Prometheus storage such as Levitate.
Aniket Rao
This arctic winter — time to repay your tech debt
We're in a peak tech winter. What should engineering teams focus on when product velocity dwindles?
Ajey Gore
A case for Observability outside engineering teams
Observability is being built by engineers for engineers. In reality, o11y is for all.
Aniket Rao
Understanding the Rasmussen model for failures
What does the Rasmussen model teach us about Site Reliability Engineering?
Nishant Modak
1979, a nuclear accident and SRE
Deep diving into the 'Normal accident' theory by Charles Perrow, and what it means for SREs
Aniket Rao
OpenTelemetry for dummies: ELI5
What is OpenTelemetry? Why is it important? Do SREs need to adopt OTel? An Explain It Like I'm 5.
Mohan Dutt Parashar
What Site Reliability Engineering Needs: A Swarm of Bees
If all companies are software companies, all companies need better Observability to understand how performant their software is
Aniket Rao
Take back control of your Monitoring
Take back control of your Monitoring with Levitate - a managed time series data warehouse
Nishant Modak
Observability is a practice, not a job
Engineering organizations that ship fast have Observability as part of their core DNA.
Aniket Rao
High Cardinality for Dummies: ELI5
High Cardinality woes are far & frequent in today's modern cloud-native environment. What does it mean, & why is it such a pressing problem?
Mohan Dutt Parashar