Leaving the Cloud
with David Heinemeier Hansson and Eron Nicholson
Cloud services have been used by tech companies for many years, but they’re not the only way. Hear why 37signals is making the decision to go in another direction.
David Heinemeier Hansson, co-founder of 37signals, and Eron Nicholson, Director of Operations, discuss why 37signals is making the move away from the cloud.
Show Notes
- David’s piece, Why We’re Leaving the Cloud
- 00:59 - 37signals’ history with on-premises and cloud storage
- 08:26 - How cloud solutions don’t necessarily reduce operations team costs
- 10:58 - What types of companies are the best fit for cloud solutions
- 14:14 - 37signals’ costs for cloud solutions and potential savings with on-premises options
- 15:25 - Advantages of working with on-premise storage companies that are similar in size to 37signals
- 20:08 - What the transition might look like, including timing
- 26:02 - Advice for medium-sized companies that might be thinking about making the switch
Transcript
Kimberly (00:00): Welcome to REWORK, a podcast by 37signals about the better way to work and run your business. I’m Kimberly Rhodes, and in case you can’t tell already, we’re changing things up a bit. Not only am I new around here, but today we’re switching gears from discussing concepts from the REWORK book and diving into something a little more timely. David Heinemeier Hansson, co-founder of 37signals, recently wrote a post titled Why We’re Leaving the Cloud, which created quite a stir, so we thought we’d talk about it. Today I’m joined by David along with Eron Nicholson, 37signals’ director of operations. Hey guys, thanks for being here.
David (00:36): Thanks for hosting.
Kimberly (00:37): Blew up. David, you wrote this post, and it is now all over the Twitterverse. Before we dive into it, I wanna make sure that, for the non-tech people like myself, we’re all on the same page. We’re talking about moving from cloud services like AWS and Google Cloud to hosting our software on our own servers. Is that correct?
David (00:57): That’s exactly right. So we have for many years had one foot in the camp of our own hardware, which is often called on-prem or on-premises, where you own your own machines. You don’t necessarily own your own data center and all the other stuff that goes with it. You rent that stuff, but the hardware itself is often something you own. And then the other camp, which is the cloud that everyone knows about: AWS with Amazon, GCP with Google. And we have been running HEY almost, well, exclusively in the cloud. And we’ve been running some of Basecamp on-prem with our own hardware and some of Basecamp in the cloud over time. And now I feel like we have enough experience with both sides of the fence to really know what we’re talking about, to objectively look at what’s worth it when, and we’re ready to make some decisions on the basis of that.
(01:52): And the funny thing is, I wrote this post up basically just sort of, oh yeah, this was a thing we’ve been talking about for a long time. Internally, we’ve set some guidelines internally of when we wanted to get off big tech, for example, that this plays into. And then we just had a wonderful meetup in Amsterdam where Eron and I sat down with the whole operations team and we were talking about basically how to do it. So I thought I’d just write this up and didn’t really
(02:38): Perhaps they keep failing to realize some of the simplification benefits that they’ve been promised in the cloud. And maybe now they’re starting to ask questions about, is this actually right for us? And I wanted to open that discussion by simply proudly stating, hey, look, this is what we’re doing for our size of business, which is a medium-sized business of some 80-plus employees with a 10-ish person operations team and over a hundred thousand customers. This doesn’t actually make sense for us. In a lot of our cases, we have this pretty predictable base load that doesn’t fluctuate wildly. We don’t have a Black Friday sales event, for example, that spikes everything and that we have to be ready for. And at the same time, we also have quite a lot of competence at the company for running services in this way. So when those things are true, and you have the very long time horizon that we have, which is also a critical part, right?
(03:40): 37signals has been in business for over 20 years, and we’ve been profitable since the start. We can depreciate a server over 3, 5, 7 years. Maybe Eron will have some stats on that later. We have some machines in our data center that are really old, and really old sounds like, well, that’s a bad thing. No, not really. It means that that machine was paid off many, many years ago and is now continuing to service our business. We’re not sort of putting additional load on the environment by adding or changing machines all the time. And most importantly, for this specific push, we’re not renting these machines. We have bought them. There are no payments due every week or month on the rent of the machine itself. Now, we of course still rent power, or pay for power. We pay for bandwidth, we pay for some services at the data center where we rent our cabinets, but the core hardware, we’ve bought it.
(04:41): So this is the big contrast. And as I called out in that post, where I just zoomed in on one aspect for one service, if you just take HEY, our email service, and you just look at the database and the search infrastructure that we need, Eron actually corrected me earlier today on our internal Basecamp: it’s not half a million, it’s $600,000 a year to purchase those two services. Now, Eron and I have actually been working on buying new hardware recently, looking into how much hardware you can get for what amount of money. And it is astounding how much machinery you can buy today at extremely affordable rates. And this is something I think a lot of people who were swept up in the cloud marketing sweep haven’t realized: things are not the same as they were even three years ago, or four years ago, or five years ago.
(05:35): You might have made a decision on cloud five years ago, made a bunch of calculations: oh, cloud looks pretty cheap, we can’t actually outperform that. We did some of those same calculations, especially around hosting and especially around storage. And now you arrive today and you look at a pricing sheet like the one Eron showed me a while back. We were just buying 12 terabytes of NVMe storage for some of these servers, a quite large amount of super-duper, mega-fast storage. Seven or eight years ago, we tried to buy something similar, and it was like tens of thousands of dollars of exotic equipment that had to be bought from specialist vendors, right? Like really enterprisey, esoteric stuff. Today we bought 12 terabytes for $3,000.
Kimberly (06:24): Oh gosh.
David (06:25): You’re like, it’s just a completely different scale. And the same has been true in the evolution of CPU power and so forth: there is actually momentum now again in hardware development. And if you look at the prices you pay to rent your servers from Amazon, right, or storage from Amazon, they have not been going down by many orders of magnitude over the same time horizon. And of course they haven’t, right? You look at the public accounts, as I point to: how much money is Amazon, if we just take the largest vendor, pulling in? How much profit are they able to pull out of this? And it’s astounding figures. That profit does not come out of thin air. It comes from the rent payments that companies like ours are making to them every month. And often in even more extreme cases than us. I mean, we have a pretty large cloud budget, and Eron’s gonna have some of the more specific numbers, but I think we’re at about $3 million a year in cloud spend.
(07:21): That $3 million is a highly optimized, monthly scrutinized amount. That figure could easily be double or even triple if we didn’t do all the accounting tricks and reserved instances and long-term deals and haggling this and haggling that, that we’ve had to do to get the prices down to that level, right? A lot of companies that I’ve heard from, they haven’t done any of that stuff. Yeah. Or they’re continuously surprised. Oh, someone left a server on, like, it’s just been churning and it’s been costing us hundreds if not thousands of dollars a day. So in many ways, we are sort of the most optimized version of what’s possible on cloud and really have poured a lot into it. We have high expertise to do it. And yet still, you look at that and you compare it to what we have on-prem, and it’s just in different universes. If you have, as we do, a long time horizon, three years, four years, five years, seven years, maybe, that you can depreciate things over, we’re talking just astronomically different figures.
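The depreciation argument in this stretch of the conversation can be sketched as a small calculation. The $600,000 a year is the figure David cites for HEY's database and search; the purchase price and running cost below are purely hypothetical placeholders, not numbers from the episode, and straight-line depreciation is an assumed simplification.

```python
def annual_cost_owned(purchase_price, years, yearly_running_cost):
    """Straight-line depreciation plus yearly running costs
    (power, bandwidth, rack rental and similar)."""
    if years <= 0:
        raise ValueError("years must be positive")
    return purchase_price / years + yearly_running_cost


# Hypothetical inputs for illustration only; not figures from the episode.
PURCHASE_PRICE = 500_000
RUNNING_COST = 100_000

# Cited in the episode for HEY's database and search services.
CLOUD_RENT = 600_000

if __name__ == "__main__":
    for years in (3, 5, 7):
        owned = annual_cost_owned(PURCHASE_PRICE, years, RUNNING_COST)
        print(f"{years}-year horizon: owned ${owned:,.0f}/yr, rented ${CLOUD_RENT:,.0f}/yr")
```

The longer the horizon, the smaller the yearly share of the purchase price becomes, which is why the time horizon matters so much to the comparison.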
Kimberly (08:26): Well, and David, I do have a question, because you mentioned in your piece that it doesn’t necessarily mean that we would have to change our staffing. Like, just because you’re going off the cloud doesn’t mean you have to, you know, increase your operations staff incredibly. Is that something that we can handle without a change in that?
David (08:45): So what’s really wonderful in our particular case is we’ve actually gone the other way. We have gone from a hundred percent of our servers, or more or less a hundred percent of our servers, being on-prem and knowing what that takes to operate. We currently run Basecamp 3, well, 4 now actually, Basecamp 4. We run Basecamp 4 on our own servers predominantly. There are a few cloud services here and there, most predominantly S3, that we continue to use for all our sort of file storage. But we know what it takes. And what I was initially sold on, and thought I had understood and bought into when we many years ago started going to the cloud, was that the cloud was going to radically simplify the job of operations, that we would need far fewer people to do the operations work once we went into the cloud.
(09:37): Well, it’s basically just software. You can just flip a switch and everything just magically works. Absolutely f-ing not. It does not. We have not been able to reduce the size of our team. The work that goes into operating services like HEY and like Basecamp, it’s just not predominantly in racking hardware. First of all, we don’t even rack the hardware. When we buy a new server, no one drives from wherever they live in the country to the data center, unpacks the box, and slots in the machine. No, no, no, there are data centers that rent out what’s called white glove service. They unwrap it, they put it in your rack, they connect the cables, and then we remotely set it up. And what we’ve just found is like, that’s just not where the bulk of the complexity is hidden.
(10:24): That’s not where the man hours are. The amount of time that we’ve invested setting up on cloud, in some cases, I’d say is probably more than it’s been on similar-size setups of our own. So it’s not that one thing is clearly better than the other, but the point is that the pitch that the cloud was going to eliminate your need, or drastically reduce your need, for an operations team, when you run the size of services that we do, it’s just false. Now, important caveat: it’s not false when you’re literally just starting. To some extent, if you’re literally just starting, I do believe that some of these cloud services, especially the fully managed ones like Heroku or Render or some of these others, they can actually allow you to postpone building an entire operations team until a later stage, because it does simplify things.
(11:23): Now these services, they’re even more ludicrously expensive. It doesn’t matter when your scale is low and you have few customers. In fact, then the tradeoff is worth it. But I mean, if we were to imagine running both HEY and Basecamp on, say, Heroku, I don’t even think we’d have enough money coming into the company to pay for those bills. I mean, maybe we would, but we would be paying orders of magnitude more. So there is something that’s true at that end, and then it also continues to be true, and this was an anecdote I actually wanted to include in the piece but didn’t, that if you have this spiky or uncertain load, then it can also make a lot of sense. And that was actually where we got the key advantage for HEY. When we launched HEY.com two years ago, we had originally estimated we were going to have 30,000 users over six months.
(12:13): We had 300,000 users over three weeks. Wow. It would’ve been very difficult to respond to that surge if we had to rack our own servers. Not impossible. And I think it is getting easier all the time, but it would’ve been difficult. And I think we were quite happy that we were on cloud when that launched and we were able to turn the dial. But outside of those two extremes, in that vast middle, I actually think there are a lot of quiet companies suffering humongous bills. Like, the bulk of the revenue for AWS and Google, I would not be surprised, comes from this magic middle. There are obviously the mega customers, the Netflixes of the world and so on, who buy a lot, but they buy it at extremely discounted rates that no middle-size company will ever get within a million miles of.
(13:07): So the really profitable range is often that middle range, and that’s the range I’m talking to. You’re not sort of just starting, just three people in the garage. If you’re that, hey, look into Heroku, look into Render, look into some of these fully managed services. Shopify, I think, perhaps is a good example at the other end. Shopify’s on cloud. They have Black Friday, they have these events, right, where they have to deal with, I don’t know what the spike is, maybe it’s 50 times or a hundred times their normal load. But if you’re like Basecamp, if you’re like HEY, if you’re like normal B2C, B2B SaaS services, which is this huge industry, your load is probably not that variable. And even if it is variable within, let’s say, double, right, like you wanna be prepared for double, we’re prepared for double. Like, it’s not like you only buy exactly what you need. We have two data centers. They run live-live, and each data center essentially has capacity to run everything plus quite a lot more. And even when you do that, it is still dramatically cheaper.
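The headroom David describes (two live-live data centers, each able to carry everything, sized for roughly double the normal load) can be written down as a rough sizing rule. This is a minimal sketch under assumed simplifications: load is a single abstract number, the function names are hypothetical, and real capacity planning involves far more than one multiplier.

```python
def per_site_capacity(baseline_load, peak_multiplier):
    """Capacity one site needs so it can carry the full peak load alone."""
    return baseline_load * peak_multiplier


def normal_utilization(baseline_load, peak_multiplier, sites=2):
    """Fraction of total provisioned capacity used at baseline load
    when every site is sized to run everything by itself."""
    total = sites * per_site_capacity(baseline_load, peak_multiplier)
    return baseline_load / total


if __name__ == "__main__":
    # Abstract load units; "prepared for double" is modeled as a 2x multiplier.
    share = normal_utilization(baseline_load=100, peak_multiplier=2.0)
    print(f"Share of provisioned capacity in use on a normal day: {share:.0%}")
```

Under these assumptions most of the hardware sits idle on a normal day, which is the point being made: even with that much spare capacity bought up front, the owned setup is described as dramatically cheaper.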
Kimberly (14:14): Okay, so speaking of cost, you mentioned $600,000 a year currently, what are we thinking that we might end up saving by taking things internally? Eron, maybe that’s a question for you.
Eron (14:25): I mean, it’s kind of unclear because, you know, a lot of that is storage and associated things like that. But yeah, so first off, we run Basecamp 3 and now 4 in the data center, and have for its entire life cycle. And as you mentioned earlier, a lot of the servers that we have are the ones that we launched Basecamp 3 on, coming up on six years ago now. And they’re still doing just fine and are more than paid off. But for those two data centers, our bill is, I think, around a fifth, maybe even less than that, of what we pay every month for the cloud. And that’s running Basecamp 4, our biggest site by and large. And yeah, to bring everything that we have on AWS in house might increase that by, you know, 50 or 75% or something.
(15:17): But it’s certainly not going to be the orders of magnitude larger that the cloud is. And I mean, the other point I wanna make about spending that money is, it’s about who you’re supporting with that money and what you get for that money. We support a data center that is close to the same sort of scope of business as us. You know, they’re friendly people that we have talked to for 10 to 12 years, and they’re not going anywhere. They’re the same kind of people that we are. Whereas Amazon, we give them, you know, $275,000, $300,000 a month, and they don’t care at all about us. They don’t answer our calls. And that is a very, very optimized figure. So even at that spend, which is at some very low support levels, you know, support can get exorbitantly expensive with AWS. But even at that, you know, we open tickets with them and we don’t get responses for hours, whereas we call our data center and they are on the phone with us immediately, and they care about our problems and they wanna help us.
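Eron's rough figures can be chained together as back-of-the-envelope arithmetic. The sketch below takes his numbers at face value (a monthly cloud bill of $275,000 to $300,000, a data center bill of about a fifth of that, and a 50 to 75% increase to bring the AWS workloads in house); he hedges all of them in the conversation, so the output is only an illustration of the reasoning, not a forecast.

```python
# Figures as stated loosely in the episode; all of them are approximations.
CLOUD_MONTHLY = (275_000, 300_000)   # what Amazon is paid per month
DATA_CENTER_FRACTION = 1 / 5         # "around a fifth, maybe even less"
GROWTH = (0.50, 0.75)                # extra cost to bring AWS workloads in house


def projected_monthly_range(cloud_monthly, dc_fraction, growth):
    """Low and high estimates of the data center bill after the move."""
    low = cloud_monthly[0] * dc_fraction * (1 + growth[0])
    high = cloud_monthly[1] * dc_fraction * (1 + growth[1])
    return low, high


if __name__ == "__main__":
    low, high = projected_monthly_range(CLOUD_MONTHLY, DATA_CENTER_FRACTION, GROWTH)
    print(f"Projected data center bill: ${low:,.0f} to ${high:,.0f} per month")
    print(f"Current cloud bill: ${CLOUD_MONTHLY[0]:,} to ${CLOUD_MONTHLY[1]:,} per month")
```

The calculation leaves out the one-time cost of buying the hardware, which the conversation treats separately as a purchase depreciated over several years.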
Kimberly (16:20): Yeah, it sounds like it’s a good value alignment between the companies for sure.
David (16:23): I think there’s also just the broad sentiment here, which was part of what kicked it off for us, which is that we don’t want to have a bunch of monopolies running the internet. And when it comes to cloud computing, there really are just a handful of very, very large clouds that sit on almost everything. And this is that dilemma I hinted at: when Amazon’s data center called us-east-1, which is the original one that has the majority of usage, when that goes down, seemingly half the internet is offline, right? It’s a perversion of how the internet was designed, which was to be fully decentralized, very resilient. If one server went down one place, it had no bearing on a different server, another place. And we’ve kind of taken that beautiful model and perverted it into this scenario where so much of the internet just runs on a handful of companies’ computers.
(17:23): And I think there’s just something, for me, aesthetically offensive about that. Not just a perversion of the internet’s architecture and setup, but also this idea that these big tech companies, and I certainly include both Amazon and Google in that camp, they’re already too big, too powerful. We’ve seen it in all sorts of ways too. When you concentrate that level of power with someone, you also get this sense of, like, who gets to be on the internet, which is not a major concern for us, hopefully. But it certainly has been a concern for a lot of other people. And those mores and morals shift constantly, like who’s allowed to be on the internet and who do you answer to? So I just have this philosophical alignment with the idea that we should be buying services in a decentralized, resilient way from other companies of similar size, where we are actually happy that they get our money, right?
(18:20): We have a very direct relationship with these service providers that we use. As Eron says, we can call 'em up. There is that connection. And I think that connection counts for a lot. I think it’s the same thing the customers see when they buy from Basecamp versus buying from Google. Have you ever tried to get support from Google? I mean, we laugh about it. We actually used to be on Google’s cloud, and you barely got more support on Google’s cloud than you did if you had a problem in Gmail, which is to say almost nothing, right? Like, it’s just this black hole, because it’s this enormous octopus of an organization. Do you know what? If you’re on Basecamp and you have a problem, you’re gonna hear from someone right away. And if that problem is serious, it’s gonna get elevated very quickly to either Eron or me, and either of us will write you back or we will jump on it right away and we will solve that issue, right?
(19:12): So that sort of sense of distance, that the internet allows thousands, millions of businesses like us to operate in this way, in this resilient way, is just such a gorgeous image of what commerce can be online, what the internet can be, what a free and open marketplace can be, that we should be there. And I particularly feel like we have an obligation to be there because we have the power to do that. We don’t need to be in the cloud. There’s no investor, there’s no whatever, someone telling us, oh, you have to be on the cloud because of this, that and the other thing. Or you have such a short runway that you can’t imagine multiple years out in the distance. I mean, maybe that’s something we’ll get into. None of these changes are quick. It took us a long time to get on the cloud. It’s gonna take us quite a while to get off it too. These things do not happen, snap, overnight.
Kimberly (20:07): Yeah, let’s talk about that. So I mean, from an operations standpoint, like what is that process like? I mean, it sounds hard from someone who doesn’t know the tech side, but like what is the process to make that move?
Eron (20:19): I mean, it can be hard. We have a lot of expertise on the team and in the company, and we’ve run things in the data center for a long time. And one thing that I’ll note is, you know, David, you talked about how it’s very easy in the beginning, if you’re a startup, if you’re a very small organization, or if you have a lot of uncertainty, to sort of just spin stuff up in the cloud and run with it. And that’s true, and that’s a great way to prove out, you know, a lot of unknowns in a business. But also, once you get there, it’s pretty hard to get out of it. And if you get there and you have these huge bills, like we did, I mean, when we started moving things, I’d say we got, I don’t know, maybe 12 months down the road of moving to the cloud and really started looking at the bills and were shocked at how much it cost, and had to spend, you know, probably the next 12 months trying to optimize that.
(21:10): If you don’t keep an eye on things and you don’t do that, then yeah, you can be sort of stuck in the cloud without a good path to get out of it. And fortunately, you know, we never moved some of our services. All of our services have been in the data center in some way since the company was founded. And so we have the expertise to move it. But yeah, it’s gonna involve standing up basically similar-sized, not identical but similar, resources to what we have in the cloud. And then making a slow, calculated move to move individual services off one at a time and make sure that they work and, you know, make sure the monitoring is set up and our customers are happy. And yeah, it’s gonna take months, if not probably years, to move everything off that we moved
David (21:58): On. And I think this is what’s obviously important to understand with this proclamation too. It’s not like we can just flip a switch and then move everything next week. This is gonna take quite a while, but this is also one of the luxuries of being a company that’s been in business for a very long time and has long time horizons. Though I don’t think it’s that unique. There are plenty of solid, profitable companies who can afford to look maybe one, two or even three years into the future when it comes to this kind of stuff and plan out, like, where do we actually want to be? What kind of future do we want to participate in for the internet? But I think what’s also important to note here is that there is a lock-in effect in terms of APIs and systems and so forth.
(22:45): And then there’s a psychological lock-in that I’ve seen in a lot of the responses, where the psychological lock-in goes like, I don’t dare venture outside the cloud. Isn’t it scary out there on the internet? Aren’t all these security issues gonna pop up in new and novel ways that no one anticipated? Am I not going to be exposed? And I think that’s what’s really insidious when you start on the cloud, that you cannot even envision that, you know what, up until, I was going to say five minutes ago, but not very long ago, everything on the internet ran like this. Everyone had their own machines and they were all connected to the internet. And yes, there were some issues with that, but you are fooling yourself if you don’t think that there are availability issues in the cloud, if you don’t think there are security issues in the cloud. The large, large majority of the issues you have to face when you are operating online services, they’re the same.
(23:40): And in fact, in some ways, I’d say it can give you a false sense of security to be in the cloud, where you just think, oh, that’s someone else’s problem, right? Like, security is not something Amazon does for us. Absolutely not. There are some provisions for it, but you absolutely have to know what you’re doing. You have to set things up in the proper way. It is extremely easy, actually, to expose the wrong things, even if you’re using a cloud, even if you’re using fully hosted services. And we see this time and again when there’s another announcement of some company being hacked or whatever; plenty of them are in the cloud. So right. The cloud is not this sort of protective barrier that will free you from knowing stuff about how networks work, how security works, how resilient design works, any of these things.
(24:26): There are advantages you can take. You can use some of them, but you still gotta know your stuff here. And to think that, oh, like, we’ve lost that. What are we talking about, a lost civilization here? It’s not like the freaking pyramids, where we can’t go back and ask them, how the fuck did you get those stones up there, right? Like, this is technology where living people are still around who were there prior to the cloud. Like, the cloud is a relatively recent phenomenon in its sort of broad exploration. If you even just go 10 years back, most startups were not necessarily starting in the cloud. Some were, but it was not the automatic default that it is today. So you really have to be careful that you don’t fall into this learned helplessness trap where you think like, oh, we can’t do it unless we have Amazon or Google set things up for us.
(25:17): We’re gonna be exposed in these novel ways. And in fact, as we pointed out earlier, to some extent, you can be worse off, right? When this us-east-1 goes down and half the internet is down, right, you’re down by no fault of your own necessarily, right? Along with everyone else. When us-east-1 is down, Basecamp is up. That doesn’t mean that Basecamp is always up. We have our own challenges, even though we have a very good record on reliability, but we’re not all down at the same time. And this gets me back to this point about what the internet was designed to do: it’s really not great if your project management system goes down at the same time your email goes down, at the same time your documents go down, at the same time that everything is just dark. Right, not so great.
Kimberly (26:03): Yeah, creating a little redundancy. It’s helpful. So David, I feel like your piece created a lot of traction because people who are reading it in the tech space were like, ooh, we’ve been thinking about this, or we should be thinking about this. So my question to you is, as we leave here, is there any advice you have for someone who’s in an organization who feels like they might be in this same boat, in this medium-sized business where maybe the cloud isn’t for them? Like, what kinda advice do you have to get them started thinking about it?
David (26:31): Yes, absolutely. The majority of the feedback that I’ve gotten directly is the same kind of feedback we get on REWORK. Actually, I thought I was the only one who thought this. I thought you weren’t supposed to question these things. I thought you weren’t supposed to question the cloud because, like, we’ve all agreed, right? The cloud is the future. And then I come out and say like, hey, no, it’s not, or it doesn’t have to be. You can choose a different path. So I would start by simply just raising the discussion internally. What kind of business do we have? Do we have a highly volatile business where we have these huge surges? Are we a very early stage business where we can entirely get away without an operations team? Or are we perhaps in the middle, just like 37signals, where we might not yet be spending $3 million a year like they are, but maybe we’re already spending a million dollars a year, or maybe we’re even spending half a million dollars a year.
(27:22): And we’d go like, do you know what? That’s an awful lot of money. Should we just run the numbers? What would it cost us to buy some servers? How long would it take us to pay that back? And if we could end up in a situation like 37signals has with Basecamp, where they’re still running on servers they bought seven years ago, how much more profitable could our operations be? And that’s really the part that irks me, that we have this huge push in SaaS in particular, where we’re building all these wonderful services and then someone else is just taking this huge rake off the top by letting us rent the shovels. Buy a damn shovel every once in a while. If you like the shovel, if you know you have to keep digging, maybe you should just own your shovel, right? Like, we know this in all sorts of other areas of life.
(28:09): If you know you’re gonna live in the same house for the next 10, 20 years and you have the opportunity to purchase, do you know what? It’s probably better than just paying rent every month for the next 10, 20 years. The same thing with a car, the same thing with a lot of things. Renting implies that someone else has to make money between you and the thing you want. If you can take that part out and just put yourself directly in contact with the thing you want, the servers you want, you can save a lot of money and, even more uplifting, you can help heal the damn internet. You can help return the internet to the roots that DARPA envisioned when they first came up with this beautiful constellation, this thing that if anyone tried to present it today, there would be a million people telling us that could never work. But it does work and it’s gorgeous, and we have an obligation, as middle-size companies with the wherewithal and the capital to do so, to not just keep that dream alive but strengthen it. It is our civic obligation to the internet.
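David's suggestion to "run the numbers" (what would servers cost, and how long would it take to pay that back) comes down to a simple payback calculation. The sketch below is one assumed way to frame it; the function name and all inputs are hypothetical, and a real comparison would also account for staffing, migration effort, and hardware refresh cycles.

```python
import math


def payback_months(hardware_cost, monthly_cloud_bill, monthly_running_cost):
    """Months until the hardware purchase is recovered from monthly savings.

    Returns None when owning does not save money month to month,
    in which case the purchase never pays for itself.
    """
    monthly_saving = monthly_cloud_bill - monthly_running_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


if __name__ == "__main__":
    # Hypothetical company: all three inputs are made-up placeholders.
    months = payback_months(
        hardware_cost=300_000,
        monthly_cloud_bill=50_000,
        monthly_running_cost=20_000,
    )
    print(f"Hardware pays for itself after about {months} months")
```

If the payback period is much shorter than the years the servers are expected to stay in service, everything after that point is the saving David is describing.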
Kimberly (29:13): I love that. Well, with that, I think we’re gonna wrap it up. David and Eron, thank you so much for being here and David, I’m sure we’ll catch you more on Twitter.
David (29:21): Excellent. Thanks so much. All right, thank you.
Kimberly (29:23): If you enjoyed today’s conversation and hearing about some of the tech behind 37signals, you’ll be happy to know that we’re going to be starting a technical blog in the very near future. Make sure to follow us on Twitter at 37signals, so you get alerted as soon as it’s.