All articles on Deep dives ⏤ Last9

Think Data Warehouse, NOT Database.

The software monitoring world is broken because of a TSDB. We deserve a TSDW

Aniket Rao

Jul 18, 2024

The most important aspect of software monitoring

The single most important thing to get better at in your software monitoring journey

Aniket Rao

Jul 5, 2024

What needs to change in software monitoring?

A wishlist of things that need to change in the world of software monitoring

Aniket Rao

Jun 13, 2024

How We Cut Monitoring Costs and Deprecated Thanos at Replit

Winning Replit over by taming High Cardinality data and deprecating Thanos

Prathamesh Sonpatki

Jun 7, 2024

Back to the Future: The R-C-A of alerting

Dissecting the RCA of Alerting - Reliability, Correlations, Actionability

Aditya Godbole

Apr 29, 2024

Launching Alert Studio

Modern monitoring systems depend heavily on ‘Alerting’ to reduce the Mean Time to Detect (MTTD) faulty systems. But, alerting hasn’t evolved to meet the demands of modern architectures. We’re changing that with Alert Studio.

Aditya Godbole

Apr 24, 2024

Everything in software monitoring is dead, apparently

Chasing shiny new toys, as always ;)

Aniket Rao

Mar 19, 2024

Software Monitoring — Stuck in the 00s

A short history of software monitoring, from the 00s. What has changed? Why are things so arcane?

Piyush Verma

Mar 8, 2024

A checklist to choose a monitoring system

A detailed checklist of points you should consider before choosing a monitoring system

Prathamesh Sonpatki

Feb 20, 2024

Why your monitoring costs are high

If you want to bring down your monitoring costs, you need to shake up a decision paralysis in engineering

Aniket Rao

Jan 4, 2024

Deliver all your orders this December 31st 😉

The unresolved cost of High Cardinality

Fulfill all your food delivery orders this December 31st by taming High Cardinality data with Last9 😉

Prathamesh Sonpatki

Dec 15, 2023

A Time Series Data Warehouse vs A Time Series Database

Why you need a Time Series Data Warehouse

What is a Time Series Data Warehouse? How does it help in your monitoring journey? How does it differ from a Time Series Database? That and more

Rishi Agrawal

Dec 7, 2023

Building Logs to Metrics pipelines with Vector

How to build a pipeline to convert logs to metrics and ship them to long-term Prometheus storage like Last9.

Aniket Rao

Nov 24, 2023

Repaying your tech debt during the tech arctic winter

This arctic winter — time to repay your tech debt

We're in a peak tech winter. What should engineering teams focus on when product velocity dwindles?

Ajey Gore

Sep 5, 2023

A case for Observability outside engineering teams

Observability is being built by engineers for engineers. In reality, o11y is for all.

Aniket Rao

Aug 23, 2023

Understanding the Rasmussen model for failures

What does the Rasmussen model teach us about Site Reliability Engineering?

Nishant Modak

Aug 18, 2023

1979, a nuclear accident and SRE

Deep diving into the 'Normal accident' theory by Charles Perrow, and what it means for SREs

Aniket Rao

Jul 31, 2023

What Site Reliability Engineering needs — A swarm of rogue bees

What Site Reliability Engineering Needs: A Swarm of Bees

If all companies are software companies, all companies need better Observability to understand how performant their software is

Aniket Rao

Jul 11, 2023

Take back control of your Monitoring

Take back control of your Monitoring with Last9 - a managed time series data warehouse

Nishant Modak

Jun 30, 2023

Observability is a practice, not a job

Engineering organizations that ship fast have Observability as part of their core DNA.

Aniket Rao

May 30, 2023

Who should define Reliability — Engineering, or Product?

Whoever owns Reliability should define its parameters. But who owns the Reliability of a Product? Engineering? Product Management? Or the Customer success team?

Piyush Verma

May 11, 2023

What do self-driving cars tell us about Site Reliability Engineering?

From Robocars to Reliability: SRE with self-driving cars, and mapping where the Observability space stands in comparison with self-driving cars

Mohan Dutt Parashar

May 9, 2023

OSS vs Paid vs Managed OSS — Picking what works for your Observability journey

Observability—OSS vs Paid vs Managed OSS

The Reliability industry needs a managed answer, free of vendor lock-in, to spiraling costs, high cardinality and the toil of managing a TSDB

Satyajeet Jadhav

May 3, 2023

The neglected tech arctic winter — Internal SaaS expenses

The current tech winter reveals a hard truth: spending on internal tools for tech infrastructure is bloated—and this isn't just a passing cycle.

Nishant Modak

Mar 30, 2023

What does "Cricket scale" mean for a Site Reliability Engineer?

Understanding “Cricket Scale”

How does a DevOps/Site Reliability Engineer plan for "Cricket scale"? How do you warm up systems about to witness 30+ million concurrent users?

Aniket Rao

Mar 23, 2023

Reliability Engineering for Dummies: ELI5

Explaining Reliability Engineering to a 5-year-old.

Mohan Dutt Parashar

Mar 9, 2023

When should I start thinking of observability?

How does one scale metrics maturity in a cloud-native world — A guide on observability tooling as your engineering org scales.

Piyush Verma

Jan 17, 2023

The importance of structured communication in the world of SRE

How you communicate helps build your 9s. In the world of Site Reliability Engineering, this is crucial. How do you do it?

Saurabh Hirani

Dec 27, 2022

The difference between DevOps, SRE, and Platform Engineering

In reliability engineering, three concepts keep getting talked about - DevOps, SRE and Platform Engineering. How do they differ?

Prathamesh Sonpatki

Dec 20, 2022

How to improve Prometheus remote write performance at scale

A deep dive into improving the performance of Prometheus Remote Write at scale, based on real-life experiences

Saurabh Hirani

Dec 8, 2022

India vs Pakistan: SRE and the Shannon Limit

How does one ‘detect change’ in a complex infrastructure, so you don’t lose out on critical revenues — A short SRE story

Satyajeet Jadhav

Nov 29, 2022

Why MTTR should be a ‘business’ metric

A key challenge is aligning engineering health metrics with business goals. How can the business measure engineering, and how can engineering show its value?

Sidu Ponnappa

Oct 13, 2022

Observability - That Last 9

TL;DR: A stitch in time saves 9. A discussion on the key blocks of observability.

Akash Saxena

Oct 6, 2022

How we won Dukaan over

5 meetings. 1 month. Subhash and his team’s velocity in decision-making, their bias for moving fast, and their radical candor are a breath of fresh air in the Indian startup ecosystem.

Aniket Rao

Sep 21, 2022

Getting the big picture with Log Analysis

How to get the most out of your logs!

Jayesh Bapu Ahire

Feb 6, 2022

Microservices - Tracking Dependencies

A quick primer on microservices architecture and the importance of tracking dependencies

Akshay Chugh

Jayesh Bapu Ahire

Feb 1, 2022

Infrastructure-As-Code-As-Software

Explore how Infrastructure-as-Code-as-Software combines coding practices with automation to streamline infrastructure management and enhance scalability.

Piyush Verma

Nov 15, 2020