Tune in as the pair shares insights from their experience running web applications at scale and offers advice, tips, and tools for startups and businesses looking to optimize their infrastructure.
Join the discussion as they explore the importance of getting real about the costs of the cloud for small businesses.
Links and Resources:
Do you have a question for Jason and David? Leave us a voicemail at 708-628-7850.
What is Rework?
A podcast by 37signals about the better way to work and run your business. Hosted by Kimberly Rhodes, the Rework podcast features the co-founders of 37signals (the makers of Basecamp and Hey), Jason Fried and David Heinemeier Hansson sharing their unique perspective on business and entrepreneurship.
Welcome to Rework, a podcast by 37signals about the better way to work and run your business. I'm your host Kimberly Rhodes. Back in October of 2022, 37signals co-founder and CTO, David Heinemeier Hansson, let the world know that the company would be transitioning off of cloud services in a post titled "Why We're Leaving the Cloud." It's been just four short months since we talked about it here on the podcast and things have been happening, so we thought it was time to share some of that progress. I'm joined by David along with Eron Nicholson, 37signals' Director of Operations, who joined us last time to talk about this topic. So, um, I have to be honest, I read all of your check-ins about this. I don't fully understand it, but I know that this is very exciting. Tell me what's going on. It's been only four months, but I feel like a lot has happened.
It's actually crazy when you say it's only been four months. To me it feels like it's been like nine months already, but maybe it's because we've been working so intently on this, and not even for the full four months. Really, the start of the year is where we kicked it into a different gear. We had originally a plan to get off the cloud by having someone help us with, uh, some of the software and some of the guidance on it. And that, uh, deal kind of just went sour a bit. Um, we found out we were not the kind of customer they usually sell to. They were selling to banks and military and governments and whatever. And when it came time to look at the pricing, the pricing was as you'd expect for something that sells to banks and military and whatever. Totally outta line for what we were looking for.
Um, but it was a really good experience because I think it really just anchored this notion, why are we getting outta the cloud? We're getting outta the cloud in part because we want our independence. We want to be able to run our own stuff, everything it takes more or less, in order to serve our customers and to deliver these apps that we have. And we should not put crucial infrastructure in the hands of others in that way ever again. The cloud was that, and it has advantages. It's, it's not that, but the downsides are just too much for us. So at the beginning of the year we kind of finally hit the cold wall on that and we were left with like, okay, then what do we do? And we had a bit of a sort of a think on how to change this uh, direction and um, around the same time I'd been looking into Docker in general.
I mean Docker is this thing to, to package up applications into a neat little box that you can just put anywhere. And it's been around forever. I mean I think almost a decade or something like that by now. And container technology has spread very far. But I have happily lived in a realm of ignorance more or less around that because we already had a good thing going. We had set these things up and I hadn't really bothered with it in a long time. And then just before we hit that wall, I had spent several weeks just going deep on that for other reasons. And that's when I thought like, what do we actually need? Do we need that much? Do we need this big honking system? It's something called Kubernetes that we were gonna try to run ourselves. Kubernetes is what all the major cloud providers are running.
This is what the Amazons and the Googles and whatever of the world, they run that and it makes total sense for them. It's actually really nice that the industry has come together around Kubernetes for that level. But we're not that level. We're not even close to that level. We barely exist in the same universe as that level. And this is something we harp upon all the time, in every other aspect of the business. Should we use whatever technology Facebook uses because Facebook uses it? No, very often we should use the opposite of that because the technology that's right for a company of 50,000 is not the technology that's commonly right for a company of 80. And this was exactly the same thing here. And thankfully Eron and his team had quite some trepidation about this plan we originally had. We're just gonna run Kubernetes ourselves and, but there's like, uh, there's a lot of moving parts and we gotta do the upgrades thing and like the whole thing is gonna fall apart if we mess it up, right?
And that's the thing about complexity. It gives you just these, this feedback that like, okay, if we get in too deep, I don't know if we can get back to shore. We might be stuck out there and then we're really in the shit. And we've had that experience once before. I think this was what, three or four years ago, we were on Google Cloud at the time and Google Cloud kind of just fell apart for a couple of days or a week or whatever. They had all sorts of network infrastructure issues and there was nothing we could do. In that moment, there was nothing we could do, which is about the worst feeling in the world when you run a service like this. I mean, you may mess things up yourself and I'm sure we will do that along the way, but at least we will mess it up ourselves and at least we will have agency to set it right.
When you depend on someone else like that, you just can't. And we flirted with getting into that situation ourselves, running Kubernetes ourselves. It was just not the right move. So long story short, we decided to go with a much simpler solution. Basically just Docker, basic Docker, and we instrument that now through a new piece of, uh, technology that we built called MRSK, which is sort of an approach that we've honed for 15 years, I think. Um, there's another tool called Capistrano that we've been using to put Basecamp onto our servers, because Basecamp never moved to the cloud. It's basically that, but for the modern container world. We've pulled home two apps already. We got another one coming next week. We are all in, full throttle, to make this cloud move happen as quickly as possible. And sort of the line in the sand we've drawn is we want to be fully out by the end of the summer.
Oh my gosh. Okay. I feel like, I'll have to go back cause I feel like when we first recorded the podcast we were like, this is probably gonna take years. With an S at the end. And since then we've escalated.
Yes, there was definitely an escalation and part of it was realizing once we've made the decision that we were just gonna use basic Docker and I sort of personally just got really into it and have now spent several months deep immersed in it realizing, do you know what, this is not trivial. It's not trivial at all to move this back, but it's also not kind of mystery box magic science stuff. And once we'd done a couple of these and we've already done a couple of these moves, you could see the path. Like it wasn't hazy anymore. It was quite hazy when we talked last time. We're like, we're probably gonna go with this vendor and we're gonna try to run Kubernetes ourselves. We've never done that. Like, lot of moving unknown parts and like who knows how long that's gonna take. But once we had this kind of clear path built on simple tools that we understood and have used for years using techniques that we've used for years, it became a lot less scary and a lot less uncertain.
And then the other thing happened. Um, Eron reminded me how much money we were spending on the cloud per week. What's really interesting is that that was the trigger for me. I knew we were spending millions every year, but when you boiled that number down to, it costs us $38,000 per week in the cloud right now, I couldn't get that number outta my head. I was like, wait, what? So if we do this two weeks slower than otherwise, that's almost 80 grand. If we do this a month slower? Holy shit, that's a lot of money. Eron and team have been amazing at working on this. Like I was just like, okay, I can't think of anything else until that number's zero or if not zero then not in the same ballpark. Like there's, I mean for us, I mean we're a relatively large company and we have a lot of customers. $38,000 a week is just an enormous amount of money. And there's very few other levers that I can move or help move in the business that'll produce $38,000 a week just like that. I know how hard it is to get more customers. I know how hard we work on that side and amazing work going on there, but like producing like $38,000 a week in your business, like that's a lot.
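[Editor's sketch] The cost-of-delay arithmetic David describes can be written out in a few lines. This is only an illustration under stated assumptions: the $38,000 weekly figure is the one quoted above, the bill is assumed to stay flat for the whole delay, a "month" is approximated as four weeks, and the name cost_of_delay is made up for this example.

```python
# Weekly cloud bill quoted in the episode (dollars per week).
WEEKLY_CLOUD_COST = 38_000


def cost_of_delay(weeks_late: float, weekly_cost: float = WEEKLY_CLOUD_COST) -> float:
    """Extra cloud spend if the exit finishes `weeks_late` weeks later.

    Assumes the weekly bill stays constant over the delay (a simplification).
    """
    return weeks_late * weekly_cost


if __name__ == "__main__":
    # Two weeks is the "almost 80 grand" case; four weeks approximates a month.
    for weeks in (1, 2, 4):
        print(f"{weeks} week(s) slower: ${cost_of_delay(weeks):,.0f}")
```

Two weeks of delay comes to $76,000, which matches the "almost 80 grand" in the conversation.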
Yeah. So talk me through, um, both of you, David and Eron, kind of where we started. I know you guys were very selective on which products you were gonna move first, like a particular order. Tell us about that and then also a little bit more about the cost savings.
Yeah, so we have, uh, we've moved a very small app called Tadalist which was I think created even before Basecamp. Uh, I'm not a hundred percent sure on that, but...
Right after. 2005.
Right after? Yeah, so very simple app that is just to-do lists, uh, still has people that use it, amazingly to this day. They, they're not paying customers, but we've heard from a few people, uh, who still use it. And that is always the canary that we use when we move things. So that was the first app that went to the cloud, uh, back in the day and it's the first one to come back after that. We've moved an app called Writeboard, which is used by some other apps. It's used by the original Basecamp Classic and Backpack. It's integrated into them, but it isn't a real standalone app. And so we've, we've moved those two successfully so far. We hit a few bumps in the road on both of them that we were able to, to smooth out pretty quickly, but they both went, they both went really well and they set a path forward for moving the rest.
And, and that's the thing that I think is really interesting about it for us doing it this way is that they're now moved and they're more or less independent and stood up, and they set a template that we can use to move the rest of the stuff. Each of them will be independent moves on their own that can proceed apace and can proceed, uh, with some parallelism, versus what we were going to do before with Kubernetes, which was going to be, you know, a, a continuing move into this centrally managed cluster, uh, which had a lot of unknowns for us. And Kubernetes was definitely the right choice when we moved to the cloud. And it's, I think, the right choice if you wanna run your apps with someone else running your clusters. But I still remain very skeptical that it was ever gonna be the right choice for us to run by ourselves with a company and a team of our size. And the thing that made me realize the path that we really needed to take was thinking about going back in time to when we first started this journey to the cloud, which was in 2016. If we could go back in time right then, would the end result have been us running Kubernetes by ourselves? And I think the answer is a hundred percent no. There's no way.
I think what's so neat about this setup where we have this long history of apps, we have been in business for 20 years. We have a lot of what other people call, somewhat disparagingly, legacy apps. I think of legacy apps as trophy apps, right? Like if you think of a race team, they have like a trophy case of like, oh, in, in 1972 this is what they won. This is what Tadalist feels like to me. It's like a trophy that's there from 2005. That is 18 years ago we put this thing into the world. And do you know what? We're still running it. So that's just amazing. But then it also, as Eron says, gives us this canary, which I like to call the criticality ladder. We start at the absolute lowest criticality. Tadalist is the app that's 18 years old, like literally has not been developed in like 18 years or maybe 17 years.
We shut it off for new signups maybe 12 years ago or something like that. There's still an amazing, I think a thousand people a week or something interact with their list. Maybe it's a little less now, but that was what it was last. But they don't pay us. So I feel like the contract is a little different. Like if Tadalist, which is mostly for your personal to-do lists, um, has 15 minutes of an outage or half an hour or even two hours, I don't feel like we're violating a deep trust here. So we can start with that. And as, um, Eron says, we had a couple of, uh, small hiccups, nothing major, but a couple of small hiccups that took a couple minutes to resolve and we're learning from that. But that's okay. If we'd done that on Hey, and we're like, all right, it's gonna be offline for 20 minutes.
You can't get to your flight information on the way to your gate. Uh, that's gonna be a problem, right? Hey is super high criticality. Hey is the highest criticality application we run. Interestingly enough, even though Basecamp actually makes us more money, I feel personally that Hey has the highest criticality because if you are in a given situation, you can't get to your email and you must get to your email. That's high stakes. So that's why we, we tackle Hey after we've done several of these other lower criticality apps and I think that's been, uh, working out really well.
Okay, so Tadalist has come home, Writeboard has come home. What's next on the list?
Next up is going to be Backpack, which is another one of these, these, uh, legacy apps that exist in a suite along with Basecamp Classic and Campfire and Highrise. So Backpack will be hopefully next week. That is our, our goal. Although if, you know, if things happen and we're not ready, we'll push it. But uh, that is hopefully the next one. Um, and then we're going to start working on Campfire, which will likely be the next one to go, and, and Basecamp Classic and Hey are the ones that we're gonna work on over the next few months.
My question, Eron, for you, is about staffing and your team, cuz clearly you guys are like all hands on deck with this move. How has that worked out? I mean, you haven't hired new people to make this happen, work hasn't just stopped on the other things that they were doing. How are you managing all this?
We haven't hired directly, but we did hire at the end of last year, we brought on three new people and it's actually very convenient timing because they, they all just hit their three-month point and it takes more than that. It's, it's gonna take another probably three to six months for them to be fully on board on the team. Our team has far and away the longest onboarding process in the company, but they're at the point now where they can contribute and they will be contributing to this project going forward. So that, that's super helpful. Um, the rest of the team, it's really about refocusing and putting aside some work that maybe doesn't necessarily need to be done, but there's actually another interesting component to this that I've realized over the past couple weeks especially, is that there's a huge amount of work that my team does around operating our services in AWS that is going to go away and it's a pretty major drain on our time.
We have to deal with incidents and we'll have to deal with that at home as well. But, uh, the bigger thing is, is wrangling the cloud spending. We write these cloud spending reports every month or two and they take significant time and not only writing the reports, but doing the work behind the scenes to make sure that, that our spend is as little as it can be takes significant resources. Uh, the other thing that that takes a lot of resources is AWS mandates upgrades all the time to a lot of their services, which isn't a bad thing. A lot of these are security driven and, and everything like that, but we're on this AWS upgrade treadmill a lot of the time where we have to constantly do maintenances to upgrade their Kubernetes clusters, their database clusters and all of that. So it'll be nice to be a little more streamlined and not have to work on that stuff all the time.
And I think that is really the second part of the key reason of why we're coming home. One thing is that the cloud cost was just ridiculous. Once you sort of do all the math, as we've now done, and actually reduce it down, I had a post about how we were gonna save $7 million over the next five years, doing the napkin math on how much we're gonna spend at home and how much we're gonna spend in the cloud. And you go like, wow, like a million here and a million there. That adds up to real money. So that was the one part of it that was probably the impetus of it, but almost as important was the realization that, as Eron says, we've been in the cloud, sort of proper, since '16. I mean we've used S3 much longer than that, but proper in the cloud since '16.
It never delivered on its fundamental promise that it was gonna be significantly less work to operate the kind of apps that we operate in the cloud than it is to operate them on our own hardware. And now I've written a lot about this recently, so I get to interact with a lot of people who, who still seem to believe this, who still seem to believe that if you're in the cloud, there's no work, there's like all the operation work, it just disappears. And like I'm only half watching the team as they go. Eron is in it every day. As he says, that's just not true. There's all this work you have to do all the time on someone else's schedule to keep up with the upgrades, and it's all work that has to happen. I'm, we're not saying like it's bad, it's not bad that you upgrade your services, you have to do that anyway.
It's just that it's not that different or at least it's not significantly less work. We can't do what we want to do with our apps with like half the operations team just because we're in the cloud. It does not work like that at all. And then the equation really falls apart when you first go, well, it's way more expensive. Wait, wait, wait, $7 million over five years more expensive and there's no cost savings in terms of the team who have to run it. Now, I will caveat this, as I always do: that is not true if you have a hugely spiky service. If you are a shop that like twice a year gets a hundred times as much traffic as you do the rest of the year, cloud is amazing. That's Amazon's original reason for conjuring AWS into existence, that twice a year they get way more traffic than they do the rest of the year, which means they have all this unused capacity they should resell.
Wonderful, great idea, real breakthrough. And then there's the, the low end, right? Like we don't even have enough computing needs to fill a single server or we can use these fully managed, um, systems that don't require that much maintenance and the cost doesn't matter. Like we're talking about a few thousand dollars here, whatever. It's just not a big deal. So those two ends still exist. Cloud is still great for that, although you gotta do it sort of smart. You, you go in early like a new startup, you're like, oh, then we use all these services, then boom, you wake up one morning and you're spending the big bucks and you're just locked in. You cannot get out because you've gone serverless this and serverless that and blah blah proprietary services. So caution on that. But for us, for the median size, for the thousands of software companies, SaaS companies in particular who are in this middle range where they're spending real money, big money on the cloud, yet not being able to get fewer people to, to run it, man, the equation just looks so outta whack when you run the numbers.
I'll say one piece of this that makes it perhaps easier for us than other folks who might be in our sort of situation is that we never stopped doing it in the data center. We have never moved fully to the cloud. Basecamp 3, our flagship, Basecamp 4, our flagship, never went to the cloud. So we've never stopped and lost the expertise or the muscle memory to run things in the data center. And that doesn't exist for a lot of organizations. And I think that's, that's one thing that is interesting going forward is, uh, you know, what, what organizations are still going to know how to do this outside of the context of the cloud. And so I think that's one thing that's interesting with MRSK, is that it's a vastly simpler way to deploy things on your own hardware. And if we can show that to people and you know, perhaps build back some of that expertise, then we might have a more compelling way for people to leave the cloud.
I think this is, this is so huge. And it's so fascinating when you've been in the business so long that I remember, and now I sound like a really old person, right? Like I remember when the cloud first arrived, and I remember when that was the new thing that no one knew how to do and everyone ran their own stuff and it was like, eh, I don't know. I mean that's a whole different skillset, blah, blah, right? And now we've just like done a whole generational churn where there are people today working at companies who've never touched, virtually or otherwise, their own hardware. They've always gone through these abstractions and there's something real lost there. And I really want, I mean I feel like I'm on a mission to break down the misconception that owning your own hardware is this weirdly exotic thing that's actually super duper hard and only wizards of old are able to do that.
That is just not true. Not only can you do this, there's so many vendors out there who are willing to help you. I mean, one misconception when we said we're coming home from the cloud was like, so you're gonna build your own data center. It's like, what? No, we're not getting into construction. We're not gonna figure out how to run plumbing in a large building to, I don't even, what? Like how are we even having this discussion? That's not what we do. We have eight racks across two data centers. Like that's something you pay someone to do, which is how it was the whole time. But it's almost like this lost knowledge, like we're ancient Egypt here. And then like how did they build those pyramids? Like, hey, we're here, we're still alive. Like, it's not like multiple generations ago. The context is still there and there are companies to help, like we've uh, used Deft for 14 years or something like that to help us manage the hardware that is, we never, we don't receive the hardware.
Eron just placed this mammoth order for a gazillion, um, CPU cores. It's gonna arrive and Eron's not gonna touch it. Like there are people on site at the data centers that we pay to unwrap the boxes, to put the machines into the racks, connect power and connect, um, connectivity to them. And then we see, boom, an IP appears online, which is not that different from the cloud. Like, so the experiences are actually not that different. There's some differences and you do have to know some of this stuff, especially around networking and topologies and da da da. But you can get help with a lot of it. And the likes of, of Dell, who we've worked with really closely on this, has just been amazing. I mean, figuring out exactly what CPU should we buy, what do you see other customers do? There's a bunch of stuff there that's really broadly distributed and is not exotic at all. Like that's just what they do. Deft, this is what they do. They help companies like ours make the hardware work. Dell, they build these things, they've been building them for a long time. So this notion that like cloud is now the only thing that you can do, that's the one we need to blow to smithereens.
Yeah, I think there's an even lower, uh, expertise way of doing it as well, which is you can very well get from Deft or from Digital Ocean or from Hetzner, a block of compute that you can use on your own. And you never have to buy the hardware. You can lease the hardware until you get to such a point where it makes sense for you to buy, you know, two or four servers and have a quarter of a rack or something like that. You can, you can get in on the very low end without a ton of expertise in these things. And then as you need it, you build it as you go.
This is one of the reasons, by the way, I just did this demo on MRSK, which is the new tool we use to deploy these things. And I used, uh, Hetzner, which is a German company. Um, that's awesome. And I very rarely hear about them in the conversation about cloud. Whenever people talk about cloud, they talk about Amazon, they talk about Google, maybe they talk about Microsoft, um, and a couple of other large ones. There's a bunch of others who will sell you or lease you some hardware and make it available in like cloud-familiar terms. Digital Ocean is another one, as Eron mentions, in the, in the U.S., but Hetzner in Europe I think is, is really an awesome one. There's another one called, uh, OVH, which is a French company. It's also awesome.
Some of these have a really interesting model where they will allow you to use the cloud stuff where you just rent some computing and connect it straight to, hey, I just need one box I'll buy, or two boxes or three boxes, some redundancy, a couple of boxes that I own. It sits in the data center right next to all this cloud capacity that I can use. That's not for us. I mean we have high enough needs that we need like a ton of servers, we don't really need that. But there are a lot of options here. I really just wanna break up this myopia that exists at the moment. Where 80% of people when they say cloud, they actually just mean AWS. And again, Jeff Bezos owns a slice of this company. I think what he's built with AWS is amazing. It's turned the industry upside down.
There's a ton of good stuff that's come from this, so don't hear us going like, we're getting outta the cloud. It's like the cloud thing's the worst thing ever. It doesn't work for anyone anytime. Total nonsense. It has wonderful uses. And then there's also just a very large slice in the middle of the spectrum where it's like, yeah, it doesn't make sense, it doesn't add up, the math isn't there and you should not be afraid of admitting that. Run your own math. Talk to someone like Dell, talk to someone like Deft, like, oh, what would it take? We're currently spending like a million dollars on AWS a year. Like could we spend 200 grand instead? That'd be nice, right? Especially right now, especially in this environment. Everyone is getting the squeeze, the economy is, uh, wobbling, uh, startups are having trouble raising new funds. Before you go laying off people, make sure your damn servers are, uh, straight. Make sure you're not, uh, frittering away all that, uh, payroll on a bunch of rental servers.
Okay, David, since you brought up your YouTube video that you did on MRSK, I'll link to it in the show notes, but I wanna pull up a comment that somebody wrote on that post to get your thoughts on it. Someone said, I think your move off the cloud is likely dubious. You could have probably saved a ton of money by cost optimizing. We talked about this a little bit in the last podcast episode. I don't know if you saw that comment, I wanted you to respond to it.
Yeah, I've seen that comment about 5,000 times as we've gone through this. I've posted a bunch of these articles on LinkedIn as well and you see it there as well. And people have these misconceptions like, oh, oh really? You can optimize your cloud spend? What do you think we've been doing for the last, like, years? As Eron talked about, a lot of the cloud work is actually keeping costs in control, because costs can spiral out of control in about five minutes. If you do not keep a careful watch on the spend, it can just go bananas. You can forget that you started this wild server and it's just running at hundreds of dollars a day. I hear these stories all the time where, um, especially for smaller companies, one of the reasons they want to exit the cloud is that they don't want that variability.
They don't want suddenly to wake up like, oh, here's an extra $50,000 bill. That does not happen if you run on your own hardware. If you run on your own hardware, you may run out of capacity. Your service might stutter a bit, but you're not gonna be hit by a crippling bill. So we have already optimized our spend out the wazoo. We have bought these long things called reserved instances that run for a whole year where you reserve your capacity. We have private pricing for S3, we have enterprise agreements, we have leverage in terms of Jeff Bezos being an owner, we have everything on there. I would actually go as far as to say we are probably in the top 5%, if not top 1% most optimized cloud bills. That is why it's only $3.2 million and not eight.
Yeah. And we lived this journey. We went down this whole path, right? When we first went to the cloud, it was about making it work and then making it fast. And we made it really expensive in the process of doing that. It was, it was, I don't remember exactly how much, but it was probably double what the bill is now when we first went, and then we looked and we were like, oh, holy crap, we gotta do something about this. And, and we optimized it over months and years and have been using every single lever at our disposal within AWS. We, we talk to AWS, they have experts who, who help you do this. Although they make it a little difficult, I think. But yeah, we, we have optimized it literally as much as our team possibly can. And we track it, like I mentioned earlier, on a month to month basis to make sure that we're not missing anything, and this is what we're left with.
Yeah, I knew that was gonna be the answer and I knew it'd get you fired up. So I wanted to wrap with that. Rework is a podcast from 37signals. You can find show notes and transcripts on our website at 37signals.com/podcast. And as always, if you have a specific question for Jason, David, or Eron about a better way to work and run your business or anything cloud related, leave us a voicemail at 708-628-7850 and we just might answer your question on an upcoming show.