All articles on Last9

From GPU Silicon to Business Metrics: The 8 Layers of GPU Observability

GPU observability isn't one thing - it's eight connected layers from silicon to cost. See why correlation across layers is what cuts debugging from 2 hours to 2 minutes, and why most teams instrument only one or two.

Shekhar

Apr 21, 2026

The GPU Metrics That Actually Matter

Most teams monitor three GPU metrics - utilization, temperature, memory. There are 50+ that matter, and the ones you skip cause your worst outages. A vendor-neutral guide across NVIDIA, AMD, and Intel Gaudi.

Shekhar

Apr 20, 2026

Last9 Named a Gartner® Cool Vendor in AI for SRE and Observability

Gartner recognizes Last9 in their latest Cool Vendor report for unified telemetry and agentic SDK—moving teams from reactive monitoring to proactive ops.

Nishant Modak

Oct 15, 2025

From Cloud Native to AI Native: Why Your Observability Stack Needs to Speak Agent

Your production telemetry now speaks agent: ask questions in Slack, debug in VS Code, optimize in real-time. Same data, conversational interface.

Nishant Modak

Aug 21, 2025

Your Apps Are Green. Your Infrastructure Is Dying.

Infra problems hide behind green dashboards. Discover Infrastructure monitors K8s and hosts from the same telemetry—unified visibility, AI-powered debugging.

Nishant Modak

Aug 20, 2025

Your APIs Are Green. Your Background Jobs Are Dying.

Background jobs fail silently while your APIs look healthy. Discover Jobs gives async operations the same deep visibility as APIs—automatic detection, operation-level debugging.

Nishant Modak

Aug 19, 2025

The Service Discovery Problem Every Developer Knows (But Pretends Doesn't Exist)

New services deploy faster than you can track them. Discover Services auto-discovers your entire architecture from traces—convention over configuration. No manual catalogs.

Nishant Modak

Aug 18, 2025

Use Telegraf Without the Prometheus Complexity

Collect metrics with Telegraf without running Prometheus. No scraping, no TSDB tuning, just clean, push-based telemetry to your backend.

Anjali Udasi

Jul 24, 2025

Ship Confluent Cloud Observability in Minutes

Push metrics into Last9 and start tracking Kafka lag, retries, and throughput in real-time.

Anjali Udasi

Jul 22, 2025

Stream AWS Metrics to Grafana with Last9 in 10 minutes

Visualize AWS metrics like Lambda, API Gateway, and RDS in Grafana using Last9. No agents, no code, set it up in under 10 minutes.

Faiz Shaikh

Jul 18, 2025

Query and Analyze Logs Visually, Without Writing LogQL

Visually build, parse, and analyze logs across services, no LogQL required. Get structured insights faster with Query Builder.

Anjali Udasi

Jul 17, 2025

Build Log Automation with Last9's Query API

Here's how you can build automated log analysis workflows with Last9's Query Logs API.

Prathamesh Sonpatki

Jul 16, 2025

Enable Kong Gateway Tracing in 5 Minutes

Instrument Kong with OpenTelemetry for end-to-end API visibility, no code changes required.

Anjali Udasi

Jul 16, 2025

Last9 MCP Server: Fix Production Issues in Your Local Environment

Ask your agent to bring production context to your local environment, debug issues, and fix them. Sit back and vibe monitor.

Nishant Modak

Mar 28, 2025

Last9’s Single Pane for High Cardinality Observability

Last9’s Telemetry Warehouse now supports Logs and Traces, offering a unified view for high cardinality observability to simplify monitoring and troubleshooting.

Sahil Khan

Nov 12, 2024

Unwiring High Cardinality - SRE Day 2023

A report from SRE Day 2023, where Piyush Verma, CTO of Last9, gave a talk on Unwiring High Cardinality.

Last9

Sep 17, 2023

What Site Reliability Engineering Needs: A Swarm of Bees

If all companies are software companies, all companies need better observability to understand how performant their software is.

Aniket Rao

Jul 11, 2023

Take back control of your Monitoring

Take back control of your Monitoring with Last9 - a managed time series data warehouse

Nishant Modak

Jun 30, 2023

SRECon APAC 2023 Recap

Recap of SRECon APAC 2023 in Singapore

Aniket Rao

Jun 19, 2023

QCon New York 2023 Recap

Recap of QCon New York 2023 Conference

Prathamesh Sonpatki

Jun 19, 2023

SRE vs Platform Engineering

What's the difference between SREs and Platform Engineers? How do they differ in their daily tasks?

Last9

May 26, 2023

SRE vs DevOps: Definition, Key Differences, and Similarities

What's the difference between SREs and DevOps professionals? How do they differ in their daily tasks?

Nishant Modak

May 18, 2023

Understanding “Cricket Scale”

How does a DevOps/Site Reliability Engineer plan for "Cricket scale"? How do you warm up systems that are about to witness 30+ million concurrent users?

Aniket Rao

Mar 23, 2023

What is MTBI?

Everything you need to know about Mean Time Between Incidents (MTBI) and how it can help Site Reliability Engineers

Last9

Mar 20, 2023

Reliability Engineering for Dummies: ELI5

Explaining Reliability Engineering to a 5-year-old.

Mohan Dutt Parashar

Mar 9, 2023

SLA vs SLO vs SLI - What's the difference

SLAs, SLOs, and SLIs—what’s the difference? For DevOps folks, understanding these nuances is key. Here's a quick guide to each term.

Last9

Mar 7, 2023

Introducing Levitate: Uplift Your Metrics Management

Managing time series databases is hard. We've evolved to services, yet monitoring lags. Our solution powers critical workloads at a lower cost.

Nishant Modak

Jan 11, 2023

Self-managed Prometheus vs Managed Prometheus

What are the differences between self-managed Prometheus and managed Prometheus? How do you choose what works for you?

Last9

Jan 4, 2023

India vs Pakistan: SRE and the Shannon Limit

How does one ‘detect change’ in a complex infrastructure, so you don’t lose out on critical revenues — A short SRE story

Satyajeet Jadhav

Nov 29, 2022

Battling Alert Fatigue

What alert fatigue is, and techniques to reduce it.

Last9

Nov 22, 2022

Kubernetes Monitoring with Prometheus and Grafana

A guide to help you implement Prometheus and Grafana in your Kubernetes cluster

Last9

Nov 4, 2022

Why MTTR should be a ‘business’ metric

A key challenge is aligning engineering health metrics with business goals. How can business measure engineering, and engineering show its value?

Sidu Ponnappa

Oct 13, 2022

Observability - That Last 9

TL;DR: A stitch in time saves 9. A discussion on the key blocks of observability.

Akash Saxena

Oct 6, 2022

How we won Dukaan over

5 meetings. 1 month. Subhash and his team’s velocity on decision-making, moving fast, and radical candor are a breath of fresh air in the Indian startup ecosystem.

Aniket Rao

Sep 21, 2022

Sample vs Metrics vs Cardinality

When dealing with Time Series databases, I always got confused with Sample vs Metrics vs Cardinality. Here’s an explanation as I have understood it.

Piyush Verma

Aug 22, 2022

Last9 completes SOC II Type 2 Certification

The comprehensive audit validates Last9 as a trusted SRE partner; a crucial process to work with highly regulated industries.

Abhi Puranam

Jul 28, 2022

We’ve raised an $11M Series A led by Sequoia Capital India!

Exciting news! We've secured an $11M Series A funding round led by Sequoia Capital India to fuel our growth and innovation at Last9!

Nishant Modak

Apr 20, 2022

Best Practices for Postmortems: A guide

The ins and outs of conducting an effective postmortem. Ready templates and examples from leading organizations around the world!

Prathamesh Sonpatki

Mar 1, 2022

Choosing Effective SLIs

Practical advice to choose an effective SLI.

Akshay Chugh

Feb 25, 2022

Running a Database on EC2 is Slowing It Down

Learn everything about the advantages of EC2, its use cases, and how to optimize EC2 further.

Jayesh Bapu Ahire

Akshay Chugh

Feb 20, 2022

Deployment Readiness Checklists

A ready-to-use, comprehensive checklist of the steps and activities involved in deploying your application.

Prathamesh Sonpatki

Feb 19, 2022

Doing SRE the Right Way!

A well-thought-out approach to SRE, which will help site reliability engineers and software engineers develop and maintain a useful, consistent, and effective SRE strategy for their products!

Piyush Verma

Feb 11, 2022