Break Things on Purpose

In this episode, we speak with Kolton Andrus, the CEO and co-founder of Gremlin. Topics include: the role of a Call Leader in incidents, using Chaos Engineering as runtime validation, FIT and application-level fault injection, Jesse Robbins and early experiments at Amazon, on-call training, Lineage-Driven Fault Injection (LDFI), the value of looking at real traffic instead of synthetic transactions, and the challenges people face when starting to do Chaos Engineering.

Show Notes

Links:

Episode transcript: https://docs.google.com/document/d/12fUhzpbCfwUJQQi5PDPuE32KraWwDFnpd6k19eesEUQ/edit?usp=sharing

Our music is by Komiku. For more of Komiku’s music, visit loyaltyfreakmusic.com.

What is Break Things on Purpose?

A podcast about site reliability engineering (SRE); Chaos Engineering; and the people, processes, and tools used to build resilient systems. Sponsored by Gremlin. Find us on Twitter at @BTOPpod.