Adam Hawkins presents the theory and practices behind software delivery excellence. Topics include DevOps, lean, software architecture, continuous delivery, and interviews with industry leaders.
[00:00:00] Hello and welcome. I'm your host, Adam Hawkins. In each episode I present a small batch of theory and practices behind building a high-velocity software organization. Topics include DevOps, lean, software architecture, continuous delivery, and conversations with industry leaders. Now let's begin today's episode.
[00:00:26] I'm taking Small Batches back to its roots for the next series of episodes. I've written coffee-cup-sized servings of software delivery education based on Steven Spear's 2009 book, The High-Velocity Edge. The High-Velocity Edge offers a simple hypothesis: organizations with higher learning velocity beat the competition. Today's organizations are complex adaptive systems that fail in curious and unexpected ways. This is evident in the story of Mrs. Grant. Mrs. Grant was recovering from cardiac surgery in the hospital. The day nurse discovered Mrs. Grant having a seizure at 8:15 AM. A code was called and blood work rushed to the lab.
[00:01:16] Blood work revealed an undetectably low serum glucose level. Mrs. Grant's brain was sputtering out like a car on an empty tank. Attempts to intervene intravenously failed, and her condition worsened until she went into a coma. She died weeks later when she was taken off life support. What happened to Mrs. Grant? The hospital conducted an immediate investigation. The investigation showed a nurse responded to an alarm at 6:45 AM. The nurse diagnosed a potentially fatal blood clot. The nurse injected a dose of the anticoagulant heparin to break up the blood clot. The nurse left the room while Mrs. Grant was resting peacefully.
[00:02:00] How did she go from resting peacefully to on life support in under two hours? Well, a minor detail and a frantic moment culminated to kill Mrs. Grant. Heparin could have saved Mrs. Grant. Instead, she was given a fatal dose of insulin simply because the nurse grabbed the wrong vial. The investigation found that the vials of insulin and heparin look the same.
Of course they're labeled differently, but the type is small and the specific details are even smaller. We've all seen a nutrition information label or the list of ingredients on a can of soup. Now shrink that down ten times. To make things worse, the vials were also placed close enough to each other that one could be chosen by mistake. That's exactly what happened.
Sure, the nurse may have correctly identified heparin under normal conditions. But combine that with someone in a rush, responding to a critical alarm in a dark room, at the end of an overnight shift? Doubtful. I can put myself in the nurse's shoes because I've done the same thing. I've pushed code to production that was seemingly correct.
Then went about my business. Later, things failed catastrophically. Here's an example of a time that I killed Mrs. Grant. I had just joined a new company. The standard onboarding process was to ship a small change to production. I committed my code change, then completed the code review process. Engineers much more familiar with the codebase approved my change.
All systems were go, so I deployed my change to production. Immediately afterwards, user records started deleting from the database. Um, what? How could I have done that? I just changed this small thing over here. More experienced engineers stepped in and diagnosed the problem. My change had an unexpected side effect that was not caught by the existing test suite or
[00:03:59] code review. Some code path modified an instance variable, which triggered a code path that deleted data. Unfortunately, my code was executed on every request, so every request deleted data from the system. I was killing Mrs. Grant every time a user visited our website. Who's to blame for killing Mrs. Grant? And who's to blame for all that deleted data?
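To make the failure mode concrete, here is a minimal sketch of that kind of bug. All of the names (`UserStore`, `RequestHandler`, the `stale` flag) are invented for illustration; this is not the actual code from the story. The point is how an innocent-looking assignment to an instance variable can trip a destructive code path that runs on every request.

```python
# Hypothetical sketch of a hidden side effect, not the actual production code.

class UserStore:
    def __init__(self, users):
        self.users = dict(users)
        self.stale = False  # flag that a cleanup routine watches

    def purge_stale(self):
        # Destructive path: wipes all records whenever the store is marked stale.
        if self.stale:
            self.users.clear()


class RequestHandler:
    def __init__(self, store):
        self.store = store

    def handle(self, user_id):
        # The "small change": recording the last-seen user looks harmless...
        self.last_seen = user_id
        # ...but it also flips a shared flag as a side effect.
        self.store.stale = True
        # Routine cleanup runs later in the same request and deletes everything.
        self.store.purge_stale()
        return self.store.users.get(user_id)
```

A unit test covering only `handle`'s return value would pass on an empty store, which is exactly why neither the test suite nor code review caught the equivalent bug in the story.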
[00:04:22] The nurse did what they were expected to do: give the patient medicine known to prevent blood clots. I did what was expected of me: test my change to the best of my ability and have it reviewed by those more knowledgeable than myself. So whose fault is it then? Is it the pharmacy that arranged the vials?
Is it the code reviewers who didn't identify the fault? That's possible. But what if they, too, were just doing what was expected of them? In that case, the process itself is the culprit. If the process is the culprit, then wouldn't people like Mrs. Grant be killed all the time, or more bad changes go out to production?
Spear offers some research on the topic. Research showed that five to ten patients are injured for every patient killed by an error in medication administration, like the one that killed Mrs. Grant. Then for every injury, there are five to ten close calls, like identifying the wrong medication before administering
[00:05:22] it. For every close call, there are five to ten slipups or mistakes. If the same is true about software, then five to ten people may have also written code with unintended side effects. Then another five to ten people would have realized their code had unintended side effects before deploying to production.
Then another five to ten would have committed code with unintended side effects, some worse than others, detectable or not. The truth is that there were hundreds, if not thousands, of opportunities to save Mrs. Grant. All it takes is one person to say, "Hey, mixing up these vials could kill someone.
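The cascade compounds quickly. A back-of-the-envelope calculation, assuming the five-to-ten ratio holds at every level of Spear's pyramid, shows why the opportunities to intervene number in the hundreds or thousands:

```python
# Spear's error pyramid: each level occurs five to ten times
# more often than the level above it.
low, high = 5, 10

deaths = 1
injuries = (deaths * low, deaths * high)                 # 5-10 injuries per death
close_calls = (injuries[0] * low, injuries[1] * high)    # 25-100 close calls
slipups = (close_calls[0] * low, close_calls[1] * high)  # 125-1000 slip-ups

print(close_calls, slipups)  # (25, 100) (125, 1000)
```

Summing the levels, a single death sits atop roughly 155 to 1,110 earlier signals, which is where "hundreds, if not thousands" comes from.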
We need to do something about it." Or, "Wow, this code is doing something unexpected. We need to make our system more deterministic or add a test for this." Sadly, that's not what happened. These cascading failure signals indicated the process was inadequate and should be modified. Instead, the failure signals were treated as obstacles to overcome, as normal noise in the daily process.
[00:06:27] Each person got their job done, but they did not increase the next person's chance of success. All these factors culminated in a Lemony Snicket-style series of unfortunate events. So-called high-velocity organizations are different. They strive to continuously reduce errors by surfacing failures early, through four capabilities:
[00:06:50]
1. System design and operation
2. Problem solving and improvement
3. Knowledge sharing
4. Developing high-velocity skills in others

We'll cover these in future episodes. You've just finished another episode of Small Batches, the podcast on building a high-performance software delivery organization. For more information, and to subscribe to this podcast, go to smallbatches.fm. I hope to have you back again for the next episode. Until then, happy shipping.
[00:07:25] Like the sound of Small Batches? This episode was produced by Podsworth Media. That's podsworth.com.