Screaming in the Cloud with Corey Quinn features conversations with domain experts in the world of Cloud Computing. Topics discussed include AWS, GCP, Azure, Oracle Cloud, and the "why" behind how businesses are coming to think about the Cloud.
Dylan: We need to understand all the aspects that drive engineering. We need to give some sort of level of visibility and the ability to sort of tweak those because you never know what's going to be the bottleneck or what's going to be making that difference to the bottom line of your business so that you can do better.
Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest on today's promoted episode is Dylan Etkin, who today is the CEO and co-founder of Sleuth, but previously, in the mists of antiquity, was the first architect behind Jira. So, ladies and gentlemen, we found the bastard. Dylan, thank you for joining me.
Dylan: It's me. I've come out of hiding for a short minute to, uh, take your quotes and barbs.
Corey: Exactly, to atone for your sins, presumably.
Dylan: But happy to be here. Yeah, excited to talk to your audience of Screaming in the Cloud.
Sponsor: Running Engineering is a demanding job. It's not just about delivering software. It's about delivering software predictably, without compromising on stability. And it's about delivering software that drives business value. Learn how Sleuth makes the job of running Engineering easier at Sleuth.io
Corey: Sure. So let's start at the very basic beginning here.
What exactly is Sleuth? I get the sense, by the name alone, that I'm expected to track down the answer to that question, but I find it's easier just to go to the primary source.
Dylan: Absolutely. But you're right, the name has, uh, held us well. So, you know, we're an engineering intelligence platform.
That's pretty vague. At the end of the day, we help make sense of engineering at all different levels. So we provide insights to executives, to engineering managers, and to developers about how things are going, you know, where you're spending your time. We're really concerned about making it less opaque so that you can understand how you're aligning your business and your engineering efforts, understand, you know, where your bottlenecks are coming from, and then potentially provide you the ability, the automations, and other things to change outcomes. Because at the end of the day, it's really about doing things a little better every time.
Corey: I have to ask, my history as a terrible employee is, I guess, shading my initial visceral reaction to that. Because I still remember one of the worst bosses I ever had walking past me once, saw that I was on Reddit, and then leaned in to say, is this really the best use of your time? Well, at the time I was on r/sysadmin trying to track down the answer to the actual thing I was working on, so yeah, it very much was, but it left a weird taste in my mouth.
I've also seen the terrible approaches of, oh, we're going to judge all of our engineers based upon lines of code that they wind up putting out, or numbers of bugs fixed, and it leads down a path to disaster. I think the consensus that I've always heard from engineers is that the only thing that they agree on around how to measure engineers is that whatever you happen to be doing, you're doing it wrong.
Pick a different metric instead. Clearly you have put some work into this. How do you evaluate it?
Dylan: You're absolutely right. And I mean, I come from an engineering background as well, and I'm like, I know what I'm doing. Trust me. Like I'm doing a good job. I think you're right. Like lines of code, there's been so many false starts in industry around measuring developers and developer productivity that it's left this very, very wary group of people.
And let's be honest, like, devs are pretty prone to being skeptical. And you know, there are some measures that I think can be pretty toxic. The way that I've thought about all of this is not about measuring individuals, but measuring teams. Because what I've also found to be true is that every developer team that I've either run or been a part of has always had this deep desire to do things better all the time. You know, like, the best engineers that I've ever worked with, they care about their craft. They care about, you know, the next time iterating and doing it a little bit better. So when you frame measurement in terms of team productivity, or giving yourself the leg up or the answers to be able to do it better that next time, even if it's like 2 percent better, all of a sudden it's a different attitude altogether.
They just want to know that you're not stack ranking people on some dumbass metric that is arguably very easy to get wrong, but they do love the idea of, hey, as a team, we agree on a certain set of things and we're going to try and get better.
Corey: That is probably one of the only metrics I can think of that adequately captures some of the absolute finest engineers I've ever worked with.
You take a look at their own personal output and productivity, and it wasn't terrific by most traditional measures, because they were spending all of their time in conversation with other engineers, coming to consensus, solving complicated problems. And that does not reflect in any traditional metric you're going to get by counting lines of code or timing how long someone's bathroom breaks are by badge swipes.
I find that that kind of person, who makes the entire team better by their presence, often gets short shrift whenever you start evaluating, okay, who here is not pulling their weight, via metrics from 10,000 feet that have no idea what the ground truth is.
Dylan: I mean, we've had companies come to us very explicitly and say, I want to use your tool for stack ranking.
We're going to have RIFs. We're going to have whatever. I just want to understand who's at the top and who's at the bottom. And my argument has always been, like, that's not us. You should go find somebody else. Some of our competitors do that. And if that's what you're after, go for it. But I would very much maintain that there are other ways of knowing if people are good or bad in your organization.
What we're trying to focus on is how good is your team, and how much better could they be? You know, and where are the areas where you can sort of dive in and make a real difference? So much of what we do, it's not about the individual engineers.
It's about engineering being misaligned with the business, right? And you're like, yeah, they produced really well, they made a great thing. That was the wrong thing, you know, and the business didn't even realize that that's what they were focusing on, or, you know, it was a few degrees off the mark and therefore didn't land the way that it was supposed to.
That's not something that you're going to blame an engineer for.
Corey: When you take a look at the negative approach of stack ranking people based on idiocy, I can see the utility, or perceived utility, to the company. Great. But now you're talking about the actual healthy way of doing this in a team-based environment, and making sure that things are getting done and teams are optimizing.
What is the actual value to the business? Because it's an easy but unfortunate sell to, we'll help you fire the underperformers when that's the crappy perspective you take. You're not taking that perspective. So what is the value proposition?
Dylan: The easiest one to understand is just like allocation. I mean, if you just want to talk money, right?
Which often, like, you know, if you're selling to an organization, they want to understand the ROI. They want to understand, like, hey, when I put your tool in, how much money am I going to save? In the most extreme example, there was a CTO that I was talking to, and you know, when he showed up in his organization, he did, like, some investigation and some deep dives.
And maybe a month and a half later, he realized that 15 percent of his engineering organization was working on something that he could track back to no business value, right? Like, they didn't have a plan for what they were going to do with that thing in the end. And, you know, it was pretty easy for him to take the principle of focus and say, great, you guys go off and figure out if that's actually important or not, but you don't have the convincing argument that that's important for the business today.
We're not going to do that. We're going to take that 15 percent and focus them on the things that I do understand have business value. And you can see that, like, wow, they just regained 15 percent of their workforce, right? Monetarily speaking, that could be millions of dollars. Having that alignment, and understanding at a glance whether you have that alignment or not and how much it's actually costing the organization?
That can be huge.
Corey: I get the sense that this is targeted at a specific size of organization, where if you're a two-person developer team, I question whether you're going to get a whole lot of utility from this. If you're a 10,000-person engineering team, is the utility there? And what, I guess, is the continuum, or the exact size and shape you're looking at here?
Dylan: You know, what we tend to say is, like, dev teams of 50 and up. We have definitely seen some overachievers, you know, maybe teams of 20 or 30, who can get a lot of use. And I guess the other thing to bring up is that it's not just executive focus. There's a lot of ways, like, if we were to talk about the DORA metrics from that group inside of Google, you know, you can utilize those to understand bottlenecks around how you're shipping.
And so if you're a team of 20 or 30, understanding your change lead time and your failure rate and your mean time to recovery and those sorts of things, you know, you can make a lot of efficiency gains with that sort of information at that size as well. But absolutely, when you're talking about misalignment, and just understanding, like, are things on track or off track, and by how much, and, you know, why are we even doing them?
That ends up being like larger organizations.
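For readers who want a concrete sense of the DORA metrics Dylan mentions, here is a minimal sketch of how they can be computed from deploy records. The field names (`deployed_at`, `first_commit_at`, `failed`, `recovered_at`) are illustrative assumptions, not Sleuth's actual data model.

```python
from datetime import datetime

# Hypothetical deploy records; real data would come from deploy tracking.
deploys = [
    {"deployed_at": datetime(2024, 5, 1, 10), "first_commit_at": datetime(2024, 4, 30, 9),
     "failed": False, "recovered_at": None},
    {"deployed_at": datetime(2024, 5, 2, 15), "first_commit_at": datetime(2024, 5, 2, 8),
     "failed": True, "recovered_at": datetime(2024, 5, 2, 16)},
    {"deployed_at": datetime(2024, 5, 3, 11), "first_commit_at": datetime(2024, 5, 1, 14),
     "failed": False, "recovered_at": None},
]

def dora_metrics(deploys, window_days=7):
    """Compute the four DORA metrics over a batch of deploy records."""
    n = len(deploys)
    # Deployment frequency: deploys per day over the observation window.
    frequency = n / window_days
    # Change lead time: average hours from first commit to deploy.
    lead_times = [(d["deployed_at"] - d["first_commit_at"]).total_seconds() / 3600
                  for d in deploys]
    change_lead_time = sum(lead_times) / n
    # Change failure rate: fraction of deploys that caused a failure.
    failures = [d for d in deploys if d["failed"]]
    change_failure_rate = len(failures) / n
    # Mean time to recovery: average hours from failed deploy to recovery.
    recoveries = [(d["recovered_at"] - d["deployed_at"]).total_seconds() / 3600
                  for d in failures if d["recovered_at"]]
    mttr = sum(recoveries) / len(recoveries) if recoveries else 0.0
    return {"deploy_frequency_per_day": frequency,
            "change_lead_time_hours": change_lead_time,
            "change_failure_rate": change_failure_rate,
            "mttr_hours": mttr}
```

The real difficulty, as Dylan notes later, is not this arithmetic but reliably producing the underlying records across teams with different workflows.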
Corey: When you identify things that are on track versus off track, is this, at this point, are you viewing it through the lens of surfacing what you're discovering to decision makers? Are you proposing specific courses of action as a part of what you're doing?
Because I can see a good faith argument in either direction on that.
Dylan: Yeah, I mean, you know, people don't want metrics just for metrics. Um, I like to talk about it as, like, really a third of the problem: surfacing hard metrics so that you have some data to understand what's going on. It's not easy, but it's also not the solution.
The second part of that problem, like the second leg of the stool, is interpretation, a shared interpretation. There could be a set of data, and you and I could look at that set of data and have very varying interpretations of what it actually means. But for us to make a quality decision and to change an outcome, it's very important for us to be aligned on what that actually means.
And we might be wrong, right? But as long as we're aligned, and we're saying we're taking it on faith that we're moving forward and that's what it means, then that's sort of like phase two. And then phase three is kind of the outcomes: what are we going to do about it? How are we going to tie the data through to an outcome that we're going to try and shift and change?
Corey: It's an interesting way to aim this. When you wind up presenting these things, are you talking primarily to line engineering managers? Are you talking to directors, VPs, SVPs of engineering, CTOs in the clouds or orbit somewhere?
Dylan: We have moved to taking a kind of top-down approach. And so we are really leading with the value prop at that sort of exec level.
So maybe, like, CTO or VP and that sort of thing. Now, what is also true is that you can't solve this problem without addressing every level of engineering. So there has to be tooling for engineering managers. There has to be tooling for individual engineers. The only way that you can get believable data is to take it from the work that people are doing day in and day out.
So that means like tying into systems like GitHub and Jira and PagerDuty, Datadog, and the place where people are doing their actual work and then distilling that kind of upwards and like reasoning about it and saying like, where are we at? And then zooming it up a level further. And then maybe like, you know, breaking it into allocations.
So you're like, how much time are we spending keeping the lights on versus new feature work? And what was our plan for that? You know, it all starts at the bottom level, where people are doing the actual work, but then you have to address each one of those layers. But, uh, what we have discovered is that if you don't have top-down cover or an initiative, things can go wrong really quickly.
If you're just starting in the middle and trying to work your way up, your CTO can show up and just say, I don't believe in that. I don't care. Like, we're able to ship software. It's fine. Don't worry about that. And then you're not going to actually make a material difference.
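The allocation roll-up Dylan describes (keep-the-lights-on versus new feature work) can be sketched as a simple aggregation. The categories and point values below are hypothetical; real inputs would be distilled from systems like Jira or GitHub, as he describes.

```python
from collections import Counter

# Hypothetical tagged work items; labels and points are illustrative only.
work_items = [
    {"id": "ENG-1", "category": "feature", "points": 5},
    {"id": "ENG-2", "category": "keep_the_lights_on", "points": 3},
    {"id": "ENG-3", "category": "feature", "points": 8},
    {"id": "ENG-4", "category": "support", "points": 4},
]

def allocation(work_items):
    """Percentage of effort per category, weighted by story points."""
    totals = Counter()
    for item in work_items:
        totals[item["category"]] += item["points"]
    grand = sum(totals.values())
    # Express each category as a share of total effort.
    return {cat: round(100 * pts / grand, 1) for cat, pts in totals.items()}
```

Rolled up per team and then per department, a table like this is what lets an executive see, at a glance, whether 15 percent of the organization is working on something with no traceable business value.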
Sponsor: Working on the wrong things is worse than not working at all. Can you trace how each project maps to a business goal? If not, you may want to rethink the timing or scope of that work. Traceability is the key to business alignment. Learn how Sleuth helps leaders like you map engineering priorities to business goals at Sleuth.io
Corey: What's the, I guess, day to day experience of using it?
Because when I've seen initiatives at the various places I've been embedded over the years, a lot of them tend to focus around, great, we're going to take an entire day every two weeks to sit here and talk and do surveys and answer these questions. And I've seen other approaches to this that are completely hands-off, to the point where engineers have no idea it's actually happening.
Neither one of those feels like a great approach. How do you do it?
Dylan: No, that's a great question. Thank you for asking it, actually. Because that's kind of how we think about differentiating, honestly, in our space. Here's what I think is wrong. Just having a dashboard, right? Like a dashboard with a bunch of metrics.
I'm going to tell you the future of what that looks like. What that looks like is that somebody loves that dashboard and they are super into it. That data is their baby, and they think it's the bee's knees, and they don't think they even need to communicate to other people about what that dashboard means.
But like we just talked about, interpretation is a huge part of everything. So that dashboard can mean something very different to other people that drive by and see it. It also doesn't drive any cadence of, like you say, how are we meant to use this thing? A huge thing about what we focus on is not just providing metrics, but how do we use them?
And the key insight that we had is every organization that we've interacted with, or been a part of ourselves, has these sort of ritual, cadence-based things. So think of, like, SREs with their operations reviews, or think of sprint reviews for developers, or daily stand-ups. Or, um, you know, it's not uncommon to have, like, a CTO engineering review, right?
Where you're either talking to the CEO or the CTO, and you're talking about allocations of incoming headcount, how much you're spending on what, how you're doing operationally as, you know, an engineering department. These things exist, but what they also tend to be is very data-poor.
They're often, like, sentiment-based, or you're working off of a level of data that is not well curated. So you're just looking at the Jira board and trying to make some intuitive leaps as to what it means.
Corey: Yeah, there's no, uh, there's no box that's actually labeled this on a dashboard, but it may as well be vibes.
Dylan: And there is definitely a time for vibes too, right? I mean, DevEx is a big part of this too. But you want to be able to be really flexible and say, I'm going to frame this meeting, this ritual that we have, in a certain set of structured data. You know, some of that might be sentiment coming in from a DevEx survey.
Some of that might be DORA metrics that are going to tell you what your change lead time looks like. Some of that might be a KPI coming from Datadog, right? Where you're saying, like, this together is an important set of information. One of my favorite ones to look at is support escalations.
As CEO, I want to understand how supportable our product is when we're onboarding new customers. Are they getting stuck in their onboarding because we have some sort of technical glitch that's getting in their way? To me, that is a business-critical issue that I want to have surfaced. And when I'm reviewing engineering with my CTO, I want to understand that we're not unable to execute on the things that we want to execute because we're dealing with too many support escalations at any given time.
Corey: I can probably guess the answer to this one, just based upon the fact that you have some serious companies on your logo wall, and they would not tolerate one way this gets answered. But a challenge I've seen historically with a lot of tools that try to play in this space, or methodologies, or, let's just call it what it is, religious cults that spring up around these things, is that they all mandate that, okay, you're going to completely change, top to bottom, the way that you work in engineering.
That grand plan of requiring people to get religion to derive benefit from what you do does not work. So I have to assume you meet people where they are, rather than step one is: we start on Monday, it's a brand new day, let's begin.
Dylan: One thousand percent. Yeah. What you said is right. You do this long enough, or you're in engineering long enough, and you realize there's no religion around any of that stuff.
If you're shipping software, even if you're doing it poorly, good for you. There's some reason for it, right? And so we have to work with every different way that people can work. One of the things that you'll run into in the space that we're in is people being like, oh, I just want to roll my own. You know, I'm going to create a Datadog dashboard, and this should be easy.
And we always see those people again, right? They show up six to nine months later, because they realized, hey, guess what? We're doing rebases, and this other team is doing force pushes, and this other team works off of a branch, and this other team works off of the mainline, and only that. And there's a million variants to how you can work, and, uh, we have to just support all of those.
That's the magic behind the screen, right?
Corey: One thing that I'm taken aback by, in one of the most pleasant ways I can imagine, is that we are, um, 15 minutes or so into this conversation, and I've looked at your website, and in neither of those has the term AI been brought up once. This is amazing. How did you get your investors to let you do that?
It's the right answer; this is not a computer problem, this is a people problem. But how did you get away with it?
Dylan: Okay, number one, yeah, every time we have a board meeting, they bring it up. That is real. And number two, we probably should be leaning into it a little bit more, because I do think there is a place for AI.
We actually embed it pretty heavily into our product, and maybe this is controversial, I don't know if it is, I don't think it is, but I think AI is a great starting place. And I think humans are where it needs to go. So, like, when we talked about that three-legged stool of data and interpretation, um, large sets of data are hard.
AI is pretty good at them, it turns out. So, like, I can take a thousand pull requests that happened in the last week, and I can get some stats about those that we're going to pre-calculate. I can feed that into an LLM. I can tell the LLM what is good and what is bad. And then I can ask the LLM to tell me some stuff that's important about that.
Uh, and it turns out that it's really good at doing that. And then the other thing that it's really good at doing is taking that summary, plus other summaries, and smooshing them together to get a higher-level summary. So, you know, in our tool, because again, we're sort of doing this section-based view and we're trying to talk to these different levels of audience, we will start the summarization with an AI-generated version of that stuff.
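The summarize-then-smoosh pattern Dylan describes is essentially a map-reduce over text. Here is a rough sketch; `llm_summarize` is a stub standing in for a real LLM API call, and none of this reflects Sleuth's actual pipeline.

```python
def llm_summarize(texts, level):
    # Hypothetical stand-in for an LLM call: a real implementation would
    # prompt a model with the texts plus guidance on what good and bad look like.
    return f"level-{level} summary of {len(texts)} items"

def rollup_summarize(items, batch_size=100):
    """Summarize items in batches, then summarize the summaries, until one remains."""
    level = 0
    while len(items) > 1:
        # Map step: summarize each batch of up to batch_size items.
        items = [llm_summarize(items[i:i + batch_size], level)
                 for i in range(0, len(items), batch_size)]
        level += 1
    return items[0]

# e.g. 1,000 pull-request stat blobs roll up into a single top-level summary.
summary = rollup_summarize([f"pr-{i} stats" for i in range(1000)])
```

Each level of the roll-up maps naturally onto one of the audiences Dylan mentions: batch summaries for engineering managers, the top-level summary for executives.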
But that's the starting place; a human should be looking at that, and we provide, you know, workflow to allow humans to go back and forth, assign somebody to it, and comment back and forth. Because again, it's that shared understanding. But I think if you're starting from zero, that's really difficult.
But if you're starting from something where you're like, oh, that's a good insight, I understand that you're saying that we did this thing weird, but it's because Joe, who usually does that really quickly, is on vacation and he just had a baby, you know, that's the why behind it. And a human is going to know that; the AI is not going to know that.
But I think the mixture of the two can be really, really powerful.
Corey: It's refreshing to hear a story about this that doesn't result in, and that's why our product will replace a bunch of people, either at the engineering level or the engineering management level. For some reason, no one ever pitches we can replace your executives with an AI system.
I mean, honestly, I've met a few that I think we could replace with a dippy bird toy, but that's a separate problem. It's a holistic approach, but it does make the selling proposition a little bit harder, in the sense of: what is the actual outcome that we get by implementing Sleuth? I'm starting to get an impression around it, but it feels like it's not one of those necessarily intuitive leaps.
Dylan: I mean, I think the real driver behind it is going to be if you have a motivated executive team that needs to understand how they're doing on alignment, right? Now, there's a ton of downstream benefits that we could go into, but again, it's like, if you want to get a tax credit, I know Australia does, like, an R&D credit for engineering, right?
But you have to have an auditable sense of who's working on R&D versus keeping the lights on or doing support and those sorts of things. You can use allocations from Sleuth to generate that information. Or if you just want to understand, hey, we have, like, a thousand people, how we're utilizing them, and who's working on what, and you don't want to spend forever trying to put that together. Or you want to communicate to your board around how engineering is going, and you want to get off of the typical narrative, which is: I have a great gut feel, I'm really good at hiring people who have really great gut feels, and they've all sort of told us that it is X, right?
But you want to be data driven around where things are going. That tends to be the big value prop.
Corey: There's a lot to be said for, I think, being able to address the multiple stakeholders that you have to speak to simultaneously, without things getting lost in translation or dropping a ball as you go about it.
Is this, I guess, almost the DORA principles come to life in the form of a tool?
Dylan: To a certain degree, yes. You know, we started out doing a thing called deploy tracking. Um, this goes way back to me running the Bitbucket team and just being like, deploys don't get enough love, but that's when the rubber hits the road for customers and code.
Right. And, you know, I think we learned very quickly that the power there is to be able to glean really, really accurate DORA metrics. But the other thing that I think we've learned is DORA is going to tell you how you're doing as a team and where you might be able to improve. It's not going to tell you whether things are on time or not, it's not going to tell you if you're spending time on the right thing, and it's not going to help you with the interpretation of what the anomalies are trying to tell you.
DORA is definitely important, but it's not the whole story. I don't think you're going to make a hugely material difference in your engineering organization with just DORA. You can use it, but I think much past the engineering management level, it starts to break down.
Corey: While we're on, I guess, the smaller-team approach here, I have to ask: why should the engineering team, the rank and file, for lack of a better term, care about the metrics?
Dylan: I think it goes back to the original thing that we were talking about, which is ideally, you're working with people that are motivated to get better all the time. You know, that care about their craft. You know, one thing that I talk to my engineering team about all the time is just like this idea of like, put yourself in the place of the user.
You know, I had a conversation with one of my engineers just the other day, like, hey, you've got to use this thing every day, right? Even though, like, you don't have a reason to do it, you're going to put your pretend hat on. You worked with product enough to know the why behind this thing, and you're building this thing.
And then you just, every day, use it and say, how does it feel? You know what I mean? Are there places where we could do better? Is there a way for us to do this thing? And the motivation behind all of that is to just do your craft better. Metrics can be a great driver for that.
You can change your team culture. If you have a culture of, like, running too fast and breaking things, you can slow yourself down and break a little less. You can set guardrails around how you're doing, and those guardrails can be metrics-driven; often they can be automated and automation-driven.
But you know, the motivation is as a team enjoying doing your work every day more and more and more.
Corey: I can feel the rightness of it. I can definitely see the value to it. It feels almost like you're sliding into the DevEx space on some level, or is that not a fair characterization?
Dylan: It's part of it.
I mean, like, the DevEx space, this whole thing, you can't really untangle it, right? Because it's about: do you have the right tooling in place? Do you have the right measures? Do you have the right guardrails? Do you have the right cultural attitude? Are people happy? You know what I mean? And maybe they're unhappy just because you have a terrible office, right? And they're like, I love everybody on my team, I love the way we work, but that smell coming from the pet store down below is just too much. The truth of the matter is, we're humans, and different factors affect all the things. The engineer in me, when we first started doing Sleuth, really wanted there to be a number, and the number was going to drive the answer.
Right. And I was like, I don't ever want to connect to a calendar because I don't care if you're in too many meetings or whatever and this and that. And I have changed my attitude quite a bit. We need to understand all the aspects that drive engineering. We need to give some sort of level of visibility and the ability to sort of tweak those because you never know what's going to be the bottleneck or what's going to be making that difference to the bottom line of your business so that you can do better.
Corey: One of the failure modes of a lot of tools that display dashboards, metrics, and the rest, and this has always irked me, has been when they jump to making value judgments without anything approaching enough context to do it. An easy example would be the AWS bill. I live in that space. Oh, the bill was higher this month.
Frowny face, or it's in red. Well, okay, maybe you're having massive sales. This could all be incredibly positive. Conversely, it could just be that someone left a bunch of things running when they shouldn't have. You don't have the context to know the answer to: is this positive, negative, or neutral?
All you can do is present the data and not try to do people's thinking for them, because when you start to get that value judgment wrong, even by a little bit, it starts to rankle people subconsciously.
Dylan: No, I believe that. But I also think, and I understand the motivation, I think the dashboard is the wrong place to do it.
But what people really seem to want is for you to be able to suggest a course of action. And I'll go back to what I said before, which is: they don't want the data, they want the answer. They don't just want to see the value of what the bill was. They want some sort of suggestion of what to do next.
If it was too high, could I potentially save myself some money in this area? Like, where should I look first? You know what I mean? That's why they have that whole, what is that cost... uh, I forgot what they call it in AWS. They have a million different products, but they do have that cost analyzer, right, where they're like, potentially, if you switch to spot instances here, you could save $3,000.
Corey: Trusted Advisor or Compute Optimizer, or technically some aspects of the Cost Optimization Hub now. They, well, they have a service? Please. They have three, at least. That's the AWS way. So what is the interface that people have when they're working with Sleuth? Because you've expressed a reasonably fair disdain for the idea of dashboards, which I'm not opposed to.
So I have to imagine it's not just a bunch of dashboards here for displaying to people, or is it?
Dylan: I mean, they look a little bit like dashboards, but I would say they almost present a little more like a slide deck than dashboards, right? Because the intent is that you're going to use them in some sort of setting where you're working through it with others.
Interestingly enough, you know, when we did interviews and stuff with CTOs, almost every one of them had one of those exec alignment meetings that we were talking about. And to a man, they were using Google Slides to present that data, and nobody liked it. Nobody really enjoyed that experience, but that was how they presented things.
And I think some of the reasons around that are that you want to understand that there's relevance in the data, and you want to understand the timeframe around that relevance. So, you know, we kind of treat these things almost like pull requests, right? Like, uh, to make a dev analogy, where you're doing a review, right?
It's not just a dead document that may or may not have relevance. It's something that's living and breathing, but it's only going to be around for so long, like the relevance, right? When you've gone through it, you want to know that this thing was relevant for this time period, and that's how we were thinking of things at that time.
But then, you know, when things move on to the next one, maybe in the next month, when you go through it again, you have a connection to that older stuff. So you can bring things back and surface those things in a different incarnation of that next thing, because time has changed, and the data has changed, and your interpretation might have changed.
And so it looks a little bit like a dashboard, presents a little bit more like slides, but it is very temporally based, and so you know what is relevant and what isn't.
Corey: I regret having left traditional engineering roles before this existed. This seems like something that would at least have the potential, if wielded properly by upper management, to make a material difference in the quality of work life.
Which, of course, does translate directly to better engineering outcomes. For those of your people listening for whom it's not too late, who haven't transcended into, like, podcast-nonsense world like I have, where's the best place for them to go to learn more?
Dylan: Our website is the right place. Go to sleuth.io, and there's, like, lots of buttons to either give it a try or contact us, you know, or reach out to us at support@sleuth.io.
And, uh, we're very good with customer service, and we will get in touch with you right away, because we want to talk about what your challenges are, what you're facing, and how we might be able to help.
Corey: And we'll, of course, put a link to that in the show notes. Thank you so much for taking the time to speak with me.
I really appreciate you doing it.
Dylan: Thanks, Corey. It's been fun.
Corey: Dylan Etkin, CEO and co-founder of Sleuth. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice. Whereas if you hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment, being sure to include your personal favorite Jerusalem.