
Session + Live Q&A

Building Trust & Confidence with Security Chaos Engineering

Complex adaptive systems are dynamic, self-evolving, non-linear, emergent, and most of all unpredictable. Delivering secure and reliable software will continue to become exponentially more difficult unless we start approaching this new problem frontier differently.    

Security Chaos Engineering (SCE) is an emerging discipline used to proactively build confidence in the security of complex systems through continuous security experimentation. This approach has helped organizations derive a more realistic understanding of their security practices and, as a result, reduce the likelihood that security blind spots erode trust and system safety.

In this session, Aaron Rinehart, the pioneer behind Security Chaos Engineering and O’Reilly co-author on the topic, will share his experience and show how you can begin practicing security-focused Chaos Engineering to build trust and confidence by proactively identifying and navigating security unknowns.

Main Takeaways

1 Find out about Security Chaos Engineering, what it is and how it can help.

2 Learn how to proactively identify and fix security issues.

What is the work that you're doing today?

I'm currently CTO and co-founder of Verica, which I co-founded with Casey Rosenthal, the creator of Chaos Engineering at Netflix. I am known as the person behind security-based Chaos Engineering, with an O'Reilly book on the topic. I'm actually writing a second O'Reilly book right now. But the content of the talk is work I've been doing for the last five years, since I wrote the first open source tool in the space, and the whole evolution afterward.

What are your goals for the talk and what do you want the audience to walk away with from your talk?

It's a little bit about security. The problem space has changed, and complexity is eroding our abilities. Engineers want to be effective. But as an engineer, I don't have a good conceptual understanding of how my system is working post-deployment, an accurate mental model of it. If we don't know how our systems are really functioning, it's highly likely that our security is no better. So thinking differently about the problem, understanding the complexity of the system, is a new problem that we need to focus on. And there's a new technique that we can use to proactively verify that security works the way it's supposed to. It's very similar to Chaos Engineering, which deals with things like retry logic, circuit breaker patterns, and failover. Security technologies are the same way: you put all those things in place for the event of some condition. Well, the problem is that when you design that logic, it never really gets exercised until it happens, until you need it. And the rest of the system has been changing a lot since you wrote that logic, so a lot of times we don't recognize that it no longer works until we need it. And that's not a good time to find that out. So just as Chaos Engineering proactively exercises availability, we're doing the same thing with security: we're proactively ensuring that the things we need to have in place to protect us are actually still working and are effective.

Can you give us a preview of one of the techniques that you're using?

What it is, is a proactive exercise, similar to Chaos Engineering, that we apply to our security engineering use cases. For example, in the talk I use the issue of misconfigured ports. That is something that should have been designed for long before the code existed, and the problem manifests in the code, of course. Security Chaos Engineering is its own technique: it's not attack simulation, it's not purple teaming, it's not red teaming. If you really want to narrow it down, it's fault injection, introducing failure conditions. So we're trying to introduce the failure conditions that we expect the system to already handle effectively. We're not introducing something we think the system is not going to be able to handle, because that way you wouldn't learn anything. The whole point is to derive better context about how effective our security still is.

Security is a context-dependent discipline. So what do I mean by that? As an engineer, my job is to deliver business value, to sell the product, the software, to a customer. You get the business requirements and constantly change the software to try to deliver that to your customer. Well, I need flexibility to do that. I'm not sure what permissions or what ports or whatever access I need open to make sure I can do what I'm being asked to do, because these are new things. That flexibility changes the system. At the same time, you must know what you're trying to secure. You need to understand the context of the object, the user, or whatever it is you're trying to secure to understand what needs securing. By default, that ties security to a stable understanding of a system. But as I described before, the system is always changing, because that's what engineers do: they're constantly delivering value to the customer. That's their job. The problem is that the system changes so much that we start to get this drift.

With Security Chaos Engineering, we're proactively introducing the conditions that the security was originally designed for, to ensure that after the system has evolved it still actually functions under those original conditions. The real goal is to find out before an adversary can take advantage of the fact that we didn't realize we can't detect that, we can't prevent that, that it no longer works. A lot of times what we're finding out is that it doesn't work today because of some kind of error: the security technology can't phone home to get a manifest, or a blocked port is blocking an update, or whatever. But finding that out only when you need it is too late. By that point in time, an adversary could be taking advantage of it.
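
The misconfigured-port example can be sketched as a small fault-injection experiment. This is only an illustrative sketch under stated assumptions, not the tooling described in the talk: `SecurityGroup`, `detect_unauthorized_ports`, `run_port_experiment`, and port 2222 are hypothetical names and values, and the firewall is modeled in memory rather than in a real environment. The experiment opens a port that the detection control was originally designed to catch, checks whether the control still catches it, and then rolls the change back.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityGroup:
    """Hypothetical in-memory stand-in for a firewall or security group."""
    name: str
    open_ports: set = field(default_factory=set)


def detect_unauthorized_ports(group, allowed_ports):
    """Control under test: report ports that are open but not approved."""
    return sorted(group.open_ports - set(allowed_ports))


def run_port_experiment(group, allowed_ports, injected_port, detector):
    """Inject one misconfigured port, observe the detector, then roll back."""
    if injected_port in allowed_ports:
        raise ValueError("injected port must be outside the allowed set")
    already_open = injected_port in group.open_ports
    group.open_ports.add(injected_port)  # inject the failure condition
    try:
        findings = detector(group, allowed_ports)
        detected = injected_port in findings
    finally:
        if not already_open:
            group.open_ports.discard(injected_port)  # always restore state
    return {"injected_port": injected_port, "detected": detected}


if __name__ == "__main__":
    group = SecurityGroup("web-tier", {443})
    # Hypothesis: the detector still flags a port outside the approved set.
    print(run_port_experiment(group, {443}, 2222, detect_unauthorized_ports))
    # A detector that has silently drifted reports nothing for the same fault.
    print(run_port_experiment(group, {443}, 2222, lambda g, allowed: []))
```

The first run reports `detected: True` and the second `detected: False`; the second case is the kind of drift the experiment is meant to surface before an adversary does. Rolling back in a `finally` block keeps the injected fault from outliving the experiment even if the detector raises.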


Aaron Rinehart


Aaron has been expanding the possibilities of chaos engineering in its application to other safety-critical portions of the IT domain, notably cybersecurity. He began pioneering the application of security in chaos engineering during his tenure as the Chief Security Architect at the largest...



Tuesday Nov 2 / 01:10PM EDT (40 minutes)


Security: Establishing & Maintaining Customer Trust


Security, Chaos Engineering, Authentication



From the same track

Session + Live Q&A Security

Authorization at Netflix Scale

Tuesday Nov 2 / 12:10PM EDT

How do you centralize authorization in the critical path of a multi-million RPS online service?  How does centralizing authorization enable product flexibility?   How do you make such a system fault-tolerant?  We will answer these questions and more in this session. At...

Travis Nelson

Senior Software Engineer @Netflix

Session + Live Q&A Security

"Trust me, I'm an insider" - Diving into Zero Trust Security

Tuesday Nov 2 / 02:10PM EDT

In 2020, hackers made about 4.2 billion dollars, mostly from phishing scams. The current state of network security depends heavily on the assumption that if a client has a set of “good” credentials, they can be trusted with access to all or at least some confidential...

Sindhuja Rao

Network Security Engineer @Cisco

Deepank Dixit

Technical Consulting Engineer @Cisco


Perspectives on Trust in Security & Privacy

Tuesday Nov 2 / 03:10PM EDT

Continuing the track's theme of trust, the security panel discusses how we can balance adjustments to our security posture against our user experience. What is the right balance between security and usability? How do we build systems that scale, that give the right amount of security and control to...

Clint Gibler

Head of Security Research @r2cdev

Stephanie Olsen

Customer Trust, Abuse & Fraud @Netflix

Cassie Clark

Security Awareness Lead Engineer @brexHQ

Ellen Nadeau

Privacy Analysis Engineer @Cruise
