This post is from guest author Sebastian Vietz. The story was originally published on Medium.
Over the last week, I have been “reporting” rather actively on the happenings at SREcon. It was both a Twitter experiment for me and, more importantly, a means to share how I experienced the conference with folks who may not have had the opportunity or time to attend it. Plus, I have a natural tendency to share.
SREcon is a multi-day conference organized by the USENIX Association, a nonprofit organization that supports advanced computing system communities and furthers the reach of innovative research. It’s one of the most popular conferences hosted by USENIX and is focused on site reliability, distributed systems, and systems engineering at scale.
Each SREcon is jam-packed with informative, insightful, and technical sessions. In 2021, SREcon was hosted virtually due to the ongoing pandemic.
Let’s keep this short and sweet.
Interesting takeaways, ideas, topics, and concepts to explore further.
- Adaptive Capacity — as defined by Dr. David Woods
- Sociotechnical Engineering — The Gray Matter between systems and the people that care about them
- Incident Management — A critical business process that is really hard to do well
- Lots of the things we do as SRE practitioners are tough — Our thanks go to Lorin Hochstein for saying it out loud on everyone's behalf.
- Don’t worry so much about buzzwords or shiny objects — what truly matters and connects us is our eagerness “to make things better.”
- Limited perspectives of complex systems and the fallacies of our mental models
- The Wheel of Expertise — Establish shared understanding by Matt Davis
- Be prepared to be surprised — the antidote: practice improvising
All of the above comes from my favorite talks during the conference.
If you were there, share them widely.
If you could not attend, every talk is worth your time.
Get your hands on those decks and video recordings once they become available.
My favorite talks, in no particular order (don’t mind my qualifiers):
- Most surprising — What does “high priority” mean? by Daniel Magliola @dmagliola
- Most relatable — Confessions of an SRE Manager by Andrew Hatch @Hatchman76
- Most enjoyable — Why this stuff is hard by Lorin Hochstein @norootcause
- Most mind-blowing presentation ever — The Endgame of SRE by Amy Tobey @MissAmyTobey
- Most unorthodox — Human Observability of Incident Response by Matt Davis
- Most colorful — Watering the roots of resilience: Learning from failure with decision trees by Kelly Shortridge @swagitda_
- Most thought-provoking — The revolution will not be Terraformed: SRE the anarchist style by Austin Parker @austinlparker
- The “Deepest” — Far from the shallows: The value of deeper incident analysis by Courtney Nash @courtneynash
- Most referenced — An organizational response to incidents: Designing for smooth coordination in high tempo, large scale software incident response by Laura Maguire @LauraMDMaguire
- Most practical — Not all minutes are equal: The secret behind SLO adoption failure by Michael Goins and Troy Koss @therealtroykoss
- Most critical — Hell is other platforms by Alex Hidalgo @ahidalgosre and Andrew Clay Shafer @littleidea