PANEL DISCUSSION + Live Q&A
Panel: Real-World Production Readiness
What does it mean for an app to truly be ready for production? Join Ines Sombra (Senior Director of Engineering at Fastly), Kolton Andrus (CEO of Gremlin), and Laura Nolan (Seeking SRE Contributor) as we discuss production readiness. Topics we’ll dive into include the “practice” of SRE, how to think about data when it comes to production readiness, and strategies for resiliency, such as chaos engineering, gamedays, and, of course, production readiness reviews.
Speaker

Kolton Andrus
Founder and CEO of @GremlinInc
Kolton is the founder of Gremlin Inc - helping companies build more robust services. He was a Chaos Engineer at Netflix, focused on the resilience of the Edge services. He designed and built FIT: Netflix’s failure injection service. Prior he improved the performance and reliability of the...
Speaker

Laura Nolan
Senior Staff Engineer @Slack, Contributor to Seeking SRE, & SRECon Steering Committee
Laura Nolan's background is in Site Reliability Engineering, software engineering, distributed systems, and computer science. She wrote the 'Managing Critical State' chapter in the O'Reilly 'Site Reliability Engineering' book, as well as contributing to the more recent...
Speaker

Ines Sombra
Senior Director of Engineering @Fastly
Ines Sombra is a Senior Director of Engineering at Fastly, where she spends her time helping the Web go faster. Ines holds an M.S. in Computology with an emphasis on Cheesy 80’s Rock Ballads. She has a fondness for steak, fernet, and running after a toddler who won't stay put....
From the same track
Production Readiness: Fighting Fires or Building Better Systems?
Wednesday Nov 10 / 11:10AM EST
In 2018 Tanya Reilly gave a talk called ‘The History of Fire Escapes’ in which she argues that we need to ‘focus on better software, not better incident response’. When I was recently asked how much time SREs should spend firefighting, that talk came to mind. The ideal...

Laura Nolan
Senior Staff Engineer @Slack, Contributor to Seeking SRE, & SRECon Steering Committee
Prod Lessons - Deployment Validation and Graceful Degradation
Wednesday Nov 10 / 12:10PM EST
Key to Site Reliability Engineering is building frameworks and “guardrails” that enable the product to be developed safely. If patterns can be identified in outages and bugs, preventing those problems systematically gives SRE unparalleled leverage to improve stability. During...

Anika Mukherji
Software Engineer @Pinterest
Incidents, PRRs, and Psychological Safety
Wednesday Nov 10 / 01:10PM EST
A Production Readiness Review is a process that identifies the reliability needs of a service based on its specific details. Few organizations have the benefit of starting with a robust PRR process. In most instances, the PRR process came about because of a production incident. The fact is there...

Nora Jones
Founder and CEO @jeli_io