In this episode, we speak with Kolton Andrus, the CEO and co-founder of Gremlin. Topics include the role of a Call Leader in incidents, using Chaos Engineering as runtime validation, FIT and application-level fault injection, Jesse Robbins and early experiments at Amazon, on-call training, Lineage-Driven Fault Injection (LDFI), the value of looking at real traffic instead of synthetic transactions, and the challenges people face when starting to do Chaos Engineering.
A podcast about site reliability engineering (SRE), Chaos Engineering, and the people, processes, and tools used to build resilient systems. Sponsored by Gremlin. Find us on Twitter at @BTOPpod.