
Mar 27th, ‘25 / 3 min read

SRECon Americas 2025 Recap Day 2

Highlights from SRECon Americas 2025 Day 2: key takeaways, SRE challenges, and lessons from industry leaders.


Did you see our Day 1 coverage from SRECon? No worries if you missed it - you can still catch all the highlights here.

SRECon Americas 2025 Recap Day 1 | Last9
Key takeaways from Day 1 at SRECon Americas 2025—insights, challenges, and what’s shaping the future of site reliability engineering.

Let's jump into the standout moments from Day 2 of SRECon 2025!

Highlights from Day 2

Here’s a quick recap of the sessions that sparked conversations on Day 2:

The Perverse Incentives of Reliability

Nobody thanks you for the disasters that never happened! In this talk, Katie Wilde (Senior Director at Snyk) tackles the thankless world of reliability engineering, where your best work goes unnoticed while feature pressure keeps mounting.

Drawing from her leadership experience at Ambassador Labs and Buffer, Katie shares how to overcome these backward incentives by tapping into engineers' intrinsic motivation - because preventing digital dumpster fires can be rewarding (and occasionally hilarious).

She offers approaches, tactics, and lessons learned that can transform reliability practices by connecting with the inherent pride, joy, and humor found in incident prevention.

What Do SRE ICs Do? How to Build SRE Skillsets

Site Reliability Engineers wear many hats: load testing, infrastructure maintenance, setting SLOs, incident command, writing post-mortems, system design, automation building, and more. But virtually no one starts with this complete skill set.

In this session, Beth Adele Long (Principal at Adaptive Capacity Labs and founding member of the Resilience in Software Foundation) and Fred Hebert (Staff SRE at Honeycomb.io and published technical author) explore how to develop these diverse abilities.

They tackle the fundamental question: Is it better to become an SRE generalist or to specialize in specific areas?

A must-attend for anyone looking to navigate the complex but rewarding path of Site Reliability Engineering!

Beyond Sequential: A Recipe for Async Pipeline Observability and Alerting

Microservices observability gets extra tricky when dealing with asynchronous systems. In this session, Jash Mistry and Gabriela Medvetska from eBay's SRE team serve up a complete "cookbook" for creating effective Service Level Objectives for async pipelines.

They break down the essential ingredients: identifying the right metrics, instrumenting your app with Prometheus, crafting informative dashboards, and setting up alerts that actually matter. The focus is on practical techniques that monitor what customers actually experience, not just what your servers report.
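To make the "what customers actually experience" idea a bit more concrete, here is a minimal sketch of one possible SLI for an async pipeline: the share of messages that finish end to end within a time budget. It is not taken from the talk. The `Event` type, the `freshness_sli` function, and the 60-second threshold are all hypothetical names and values chosen for illustration, and it uses plain Python rather than a Prometheus client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """One message moving through the pipeline (hypothetical shape)."""
    enqueued_at: float              # seconds; when the producer accepted it
    completed_at: Optional[float]   # seconds; None while still in flight


def freshness_sli(events, threshold_s, now):
    """Fraction of events that completed end to end within threshold_s.

    In-flight events are ignored until they exceed the threshold,
    after which they count as bad. Returns 1.0 when nothing is eligible.
    """
    good = 0
    total = 0
    for event in events:
        if event.completed_at is None:
            # Still in flight: only count it once it is already too late.
            if now - event.enqueued_at > threshold_s:
                total += 1
            continue
        total += 1
        if event.completed_at - event.enqueued_at <= threshold_s:
            good += 1
    return good / total if total else 1.0


if __name__ == "__main__":
    sample = [
        Event(0.0, 20.0),    # finished in 20s: good
        Event(0.0, 90.0),    # finished in 90s: bad
        Event(10.0, None),   # stuck for 90s: bad
        Event(95.0, None),   # in flight for 5s: not counted yet
    ]
    print(freshness_sli(sample, threshold_s=60.0, now=100.0))
```

Counting stuck messages as bad is a deliberate choice here: in an async system, a message that never completes would otherwise never show up in a latency metric at all.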

Just don't attend on an empty stomach—all this talk of cooking might make you hungry!

OpenTelemetry Semantic Conventions and How to Avoid Broken Observability

OpenTelemetry's Semantic Conventions bring much-needed standardization to telemetry data—defining consistent meaning for spans, metrics, and attributes across your entire observability ecosystem. This standardization helps your data flow smoothly between systems and improves the quality of your insights.

In this session, Dinesh Gurumurthy (Staff Engineer and leader of Datadog's OpenTelemetry team) and Laurent Querel (Senior Director and Distinguished Engineer at F5) explain how they collaborated with the wider community to develop the Schema Processor—a solution that handles semantic convention changes without causing painful outages.
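For a feel of the kind of change involved, here is a small sketch of translating attribute names from an older convention to a newer one. The three renames are real ones from the stabilized OpenTelemetry HTTP semantic conventions; everything else, including the `RENAMES` table and the `upgrade_attributes` function, is a hypothetical illustration and not the Schema Processor's actual API or implementation.

```python
# Old attribute name -> new attribute name (from the stabilized
# OpenTelemetry HTTP semantic conventions).
RENAMES = {
    "http.method": "http.request.method",
    "http.status_code": "http.response.status_code",
    "http.url": "url.full",
}


def upgrade_attributes(attrs, renames=RENAMES):
    """Return a copy of attrs with old attribute names mapped to new ones.

    If a span already carries both the old and the new name, the value
    stored under the new name wins and the old entry is dropped.
    """
    upgraded = {}
    for key, value in attrs.items():
        new_key = renames.get(key, key)
        if new_key != key and new_key in attrs:
            # The new name is already present; skip the stale duplicate.
            continue
        upgraded[new_key] = value
    return upgraded


if __name__ == "__main__":
    old_span = {
        "http.method": "GET",
        "http.status_code": 200,
        "service.name": "checkout",
    }
    print(upgrade_attributes(old_span))
```

Dashboards and alerts that still query `http.method` would silently go empty after an upgrade like this, which is the sort of breakage a translation step between producers and backends is meant to prevent.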

A must-attend for anyone navigating the balance between standardization and flexibility in their observability pipeline!


And, if you're searching for an observability solution that's gentler on your budget while maintaining top-tier performance, Last9 deserves your attention.

Our managed platform enables high-cardinality monitoring at scale and has earned the trust of industry leaders including Disney+ Hotstar, CleverTap, and Replit.

As a comprehensive telemetry platform, Last9 has successfully monitored more than half of the 20 largest live-streaming events in history.

It seamlessly integrates with both OpenTelemetry and Prometheus, bringing together metrics, logs, and traces in one unified system that optimizes performance, cost-efficiency, and real-time insights through correlated monitoring and alerting capabilities.


If you didn't get your hands on our awesome swag yesterday, come find us to claim yours! 😎

Last9 merch

I'm already counting down to Day 3 of SRECon Americas 2025 - it's going to be fun!

Authors
Prathamesh Sonpatki


Prathamesh works as an evangelist at Last9, runs SRE stories - where SRE and DevOps folks share their stories, and maintains o11y.wiki - a glossary of all terms related to observability.
