If you missed my KubeCon 2024 Day 1 Recap, you can catch up here! I’ve shared my experience and highlighted some of my favorite talks from Day 1, including insights from Observability Day and more.
The day kicked off with EmpowerUs, where everyone got a chance to connect, collaborate, and share their stories. It was a great way to start things off together!
The talk highlighted Kubernetes’ journey from a playground for early adopters to a critical technology for major industries. As cloud-native matures, it's now supporting everything from AI to edge computing.
SUSE, a longtime open-source leader, remains deeply committed. CEO DP van Leeuwen shared his passion for open, community standards, especially as innovations in AI and edge continue to grow.
In this session, the speakers broke down Prometheus Remote Write—a handy protocol for sending metrics from Prometheus (or other sources) to remote storage like Thanos and Cortex.
Prometheus maintainers Bartek and Callum, who helped write the RW2.0 spec, introduced the latest version. It’s packed with new features, cuts egress costs by up to 60%, and still keeps the simple, stateless design that everyone loves.
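To get a feel for how a write protocol can shrink payloads, here is a toy sketch of string interning: repeated label names and values are stored once in a shared table, and each series carries only integer references. My understanding is that Remote Write 2.0 uses a symbols table along these lines, but this is not the wire format. The function names and sample labels are hypothetical, and real messages are protobuf-encoded and compressed.

```python
def intern_series(series_list):
    """Replace repeated label strings with integer references into a shared table."""
    symbols = [""]  # this sketch reserves index 0 for the empty string
    index = {"": 0}
    encoded = []
    for labels in series_list:
        refs = []
        # Sort labels so the reference order is deterministic.
        for name, value in sorted(labels.items()):
            for text in (name, value):
                if text not in index:
                    index[text] = len(symbols)
                    symbols.append(text)
                refs.append(index[text])
        encoded.append(refs)
    return symbols, encoded


def decode_series(symbols, encoded):
    """Rebuild the label dicts from the symbol table and the reference lists."""
    return [
        {symbols[refs[i]]: symbols[refs[i + 1]] for i in range(0, len(refs), 2)}
        for refs in encoded
    ]


# Hypothetical sample data: two series that differ only in the instance label.
series = [
    {"__name__": "http_requests_total", "job": "api", "instance": "10.0.0.1:9090"},
    {"__name__": "http_requests_total", "job": "api", "instance": "10.0.0.2:9090"},
]
symbols, encoded = intern_series(series)
```

The second series adds only one new string to the table; everything else is a reference to a string already sent. With thousands of series sharing the same label names, that is where the savings would come from.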
The Network Nook came alive during lunch with some engaging table topic discussions! Each day’s topics align with the keynote themes, and today’s focus was Platform Engineering & AI Platforms. It was refreshing to hear different perspectives and learn from others’ insights.
In this talk, Ashok and Liudmila discussed the rise of Large Language Models (LLMs) on Kubernetes and the challenges of running them without proper observability.
They shared how to use OpenTelemetry for monitoring LLM client and server performance, based on efforts from the Kubernetes and OpenTelemetry communities.
Do check out their talk once the recording is out.
In this interactive "Choose Your Own Adventure" talk, Whitney and Viktor introduced an app running in a secure Kubernetes environment, serving users but unaware of its own performance. The app faces challenges like scaling issues and deployment failures and needs CNCF tools for metrics, traces, and progressive delivery to uncover what’s going wrong.
Throughout the session, Whitney and Viktor presented choices the app must make to add observability and resolve issues.
The fun part? The direction of the app’s journey is determined by the audience’s votes, as they work to help the app get its performance on track before time runs out.
Kruthika and Charlie from Apple covered why observability matters for understanding application performance in distributed systems.
They explained how correlating metrics, traces, and logs enhances their value and demonstrated how to achieve this using the OpenTelemetry SDK and Collector, with results displayed in Grafana.
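The core idea behind correlation is that every signal carries the same identifier, so you can jump from a log line to the trace that produced it. Here is a minimal sketch of that idea using only the Python standard library. It is not the OpenTelemetry SDK and not what the speakers showed; the logger name, messages, and `handle_request` helper are hypothetical.

```python
import contextvars
import io
import logging
import secrets

# Holds the ID of the trace the current code path belongs to.
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the active trace ID onto every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


def handle_request(logger):
    """Simulate one traced request and return its trace ID."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    token = current_trace_id.set(trace_id)
    try:
        logger.info("checkout started")
        logger.info("checkout finished")
    finally:
        # Restore the previous value so IDs never leak across requests.
        current_trace_id.reset(token)
    return trace_id


# Capture output in memory so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("trace_id=%(trace_id)s %(message)s"))
handler.addFilter(TraceIdFilter())

logger = logging.getLogger("correlation_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

trace_id = handle_request(logger)
log_lines = stream.getvalue().splitlines()
```

Both log lines end up tagged with the same trace ID, which is what lets a backend link them to the matching trace. In a real setup, the SDK would supply the ID from the active span instead of generating it by hand.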
This was the official OpenTelemetry session at KubeCon, where the focus was on the project's evolution. OpenTelemetry began with distributed traces and metrics, but its vision has always been to offer a comprehensive view by capturing signals from infrastructure, services, and beyond.
The session covered what’s coming next, including new signals and sources, with exciting insights into how OpenTelemetry is expanding its capabilities. A must-listen for anyone working with OpenTelemetry!
This talk was all about improving user experience by turning Kubernetes performance metrics into actionable insights.
Nadia and Antonio showed how to identify the key performance factors that affect users, gather metrics from live clusters with tools like Prometheus, Grafana, kube-burner, and custom instrumentation, and use those insights to make real improvements.
They also talked about finding bottlenecks and optimizing services. Perfect for anyone working on Kubernetes performance—whether you’re a developer, admin, SRE, or DevOps engineer.
The talk covered all the exciting details about Jaeger v2, including its native OpenTelemetry integration and updates to service performance monitoring. Pavol and Jonah also shared insights into the project’s plans and how you can get involved through LFX and Google Summer of Code mentorship programs.
Chris and Alolita gave us an inside look at what the Observability TAG has been up to in 2024. They dove into the challenges of tracking AI workloads on GPUs and NPUs and shared some cool new trends and solutions to help manage data, boost efficiency, and keep costs in check in the AI Cloud.
Thank you for your love for Last9 stickers and t-shirts! We were almost out of stock.
Have you grabbed yours yet? We’re pretty sure they’ll totally match your work vibes. Come find us and snag yours!
Already looking forward to Day 3 of KubeCon + CloudNativeCon 2024!
Prathamesh works as an evangelist at Last9, runs SRE stories, where SRE and DevOps folks share their stories, and maintains o11y.wiki, a glossary of all terms related to observability.