Seven Shipping Principles
How does 37signals decide when software is ready to ship? In this episode of The REWORK Podcast, 37signals’ co-founders Jason Fried and David Heinemeier Hansson discuss the company’s Seven Shipping Principles. They dive into when a product update is good enough to ship and the importance of making updates to solve actual problems.
Watch the full video episode on YouTube
Key Takeaways
- 00:11 - We only ship good work.
- 03:36 - The problem with MVPs
- 04:41 - We ship when we’re confident.
- 10:08 - We own the issues after we ship.
- 13:41 - We don’t ship if it isn’t right.
- 17:11 - Making sure your solution tackles a real problem
- 20:00 - We ship to our appetite.
Links & Resources
- 37signals’ Seven Shipping Principles
- Books by 37signals
- 30-day free trial of HEY
- HEY World
- The REWORK Podcast
- The 37signals Dev Blog
- 37signals on YouTube
- 37signals on X
Sign up for a 30-day free trial at Basecamp.com
Transcript
Kimberly (00:00): Welcome to Rework, a podcast by 37signals about the better way to work and run your business. I’m your host, Kimberly Rhodes, and I’m joined as always by the co-founders of 37signals, Jason Fried and David Heinemeier Hansson. This week I want to dive into one of the write-ups on the 37signals website called the Seven Shipping Principles, which, as it’s written, are our guidelines for how we ship, shape, and build software at a sustainable pace. David, I believe you wrote this, if I’m not mistaken. Let’s go through some of these seven principles. We won’t dive into all of them in too much detail, but a few of them I think are really important and interesting for people to hear about. The first one: we only ship good work. I think that’s pretty self-explanatory, but tell us a little bit about your thoughts on that.
David (00:45): That was actually the point that motivated writing up these shipping principles in the first place, and it sounds self-evident, but it’s not actually as self-evident as it sounds. When you have spent weeks on something, you’re out of time and you’ve built something, something that you could rationalize yourself into being okay to ship, but it’s not okay to ship because it’s not good. And good is actually a relatively high bar. It’s better than okay. We could have written, we ship okay software. Do you know what? I am sure that would actually be a high bar in some establishments, but that is not our bar. Our bar is good software and by articulating that as the bar, it gives us permission to say no. It gives us permission to say, yes, we’ve spent a fair amount of time on this one feature this cycle, but it’s not good software yet, so we’re not going to ship it.
(01:41): We’re going to resist that instinct to always deliver something and realize occasionally, and it’s quite rare, I can remember only a handful of instances where we literally had to invoke this as a stop block: this is not going out because it’s just not good enough. But it is also one of those things, and I think we talked about it in one of the previous podcasts, that the ultimate test of we only ship good work really comes in the 11th hour, and that was the other reason I wrote this down, to say, do you know what? That’s just facts. Good software is an all-inclusive evaluation of what you’ve built. It’s not this little piece here, this little piece there. It’s all of it, just as it’s ready to go out, and that’s usually when we are ready to ship or when we want to ship.
(02:27): It’s when we’ve finished assembling all the pieces. So it’s no wonder that that is the ultimate test for whether this should go out the door or not, and I wrote it down in part because this is the role Jason or I often play, that we are the last stop before something goes out the door. But it’d also be nice if more people in the organization felt empowered, because it was written down, to essentially say, you know what? We need a time out here. Maybe this project, for whatever reason, just didn’t come together in time, which is no huge shame. And in fact, I sometimes think a little suspiciously about our shipping rate. I think maybe our shipping rate is actually too high, that you should not aim to ship a hundred percent of all the work you set out to do. You should aim to ship, I don’t know, 80, 90, 95%, but there’s got to be some part of it that’s not good enough, because otherwise either you’re just the best software maker that’s ever been, highly unlikely, or your standards are just not where they need to be. So shipping good work, good software, something that I would be proud to use, that’s got to be the bar, and that’s why we wrote it down as such.
Jason (03:36): Well, I was just going to add one small thing. There’s been a lot of discussion about MVPs and whatnot lately. I think this is one of the issues we take with MVPs, which is minimally viable. That’s too low a bar. Also, people use it to test ideas. I saw David wrote up a nice tweet about this recently. It’s like, testing incomplete, half-assed ideas isn’t really going to yield the answers that you’re looking for. That’s why the stuff has to be good, because if you want to evaluate whether or not something’s any good, you have to put something out there that you think is good, not just minimally viable, because you’re going to get minimally viable feedback on that, which is not really that good either. So that’s kind of the idea: when you put something out there in the world, it should be a complete idea. It can be a smaller idea than you initially wanted it to be, but it needs to be a complete idea and not some partial, half-assed version of it. It can be half, but the half needs to be good. What you put out there needs to be good. It can’t just be a half-assed version of something.
Kimberly (04:41): Okay, I’m going to read the beginning of this next one because I think it’s interesting. I’m going to get your take on it. The next one is we ship when we’re confident. We talk a lot about confidence and gut reactions here on the podcast. It says, the reason we do automated and exploratory testing is so we can ship work with confidence that it won’t cause major problems, but exactly how much confidence is required depends both on the team and the criticality of the problem. David, to you.
David (05:08): I think this comes up often in software development circles when we talk about quantification, when we talk about, we have so-and-so test coverage for this product. I remember once upon a time people would say, I need 90% test coverage, which basically meant that out of all the lines of code you have in your product, 90% of them are going to be tested. Or they have a test ratio: we write three lines of test code for every one line of production code. That’s a ratio I’ve heard stated in the past, and I always go, that doesn’t tell me anything. If you are writing all these sort of mechanically, I need three lines of test code for every line of production code, whether what you’re implementing is completely trivial and banal or whether you’re implementing something that, if there’s a bug, costs people thousands of dollars, or, in the most extreme case of criticality, that someone could get seriously hurt or worse from it,
(05:59): you’re not keeping the context in mind. You should apply far less rigor to something where a problem is a mere inconvenience, or even perhaps an aesthetic issue, versus I am losing millions or I’m putting people in harm’s way. That’s what the criticality ladder is about, which I think is a great way of looking at the whole problem of confidence, because what are you confident about, and how confident are you really? So we generally try to have confidence that a problem is not likely, and this is almost like burdens of proof, right? Beyond a reasonable doubt is much higher as a burden of proof than, I forget what the one below it is called. Something…
Jason (06:47): Preponderance of evidence.
David (06:48): Yeah, something like that, right? In the legal world, they clarify these levels of criticality. If you’re convicting someone of murder, you damn well better be sure beyond a reasonable doubt. If you are having a dispute about some contract, that’s much lower criticality. We don’t have to deal with it with the same rigor. We may not even need 12 peers to adjudicate this issue. And we try to treat this the same way, because so much of the work that we do falls in the middle territory. Do you know what? If it doesn’t work or there’s a bug, that’s not great, but also we could fix it quickly. I have pushed out plenty of fixes where, do you know what, I know there’s not an issue here, which is always, by the way, a cue that there will be an issue, but the issue isn’t going to be that large.
(07:34): And then when we tinker with, for example, what we call QueenBee, which is where we do all our billing, we demand a higher burden of proof. We want to have more tests, they have to be more regimented, we want to have more reviews, but that level of ceremony needs to be proportionate. And the biggest error I’ve seen is to pick one protocol, one level of ceremony, and just apply it to all of it. Apply it to the trivial stuff and now, there’s a Danish saying about shooting sparrows with cannons, you’re shooting sparrows with cannons. That’s just overkill. And at the same time, that process may also be underkill if you’re literally putting people in harm’s way or you could potentially lose hundreds of thousands of dollars, right? So having that sense, the fluidity of confidence, this is why I like confidence as a word.
(08:28): It’s not a numerical scale. It doesn’t go one to five. It’s, again, a gut feel. Now, that takes a while to train. I see very often junior programmers come in and they want to do everything as safe as possible, and then it takes forever. They want to write a million tests. They want to write the five lines of test code for one line of production code, and then they’re not going to be able to ship, they’re not going to fit within our six-week cycles. They’re going to turn a two-week project into a five-week project. And then they quickly realize, you know what, okay, all this ceremony that I was just doing because that’s the protocol, I can’t actually do that and accomplish what I wanted to accomplish. I have to now gauge things and calibrate them such that they are proportionate to the criticality.
(09:10): And that’s where the magic happens, in those trade-offs. And I also think this is where the enjoyment happens. There is nothing more frustrating than going through meaningless checklists, bullshit checklists that are just, oh, I have to do this because that’s what the chart says, right? When you’re doing something that you know isn’t proportionate, that you know isn’t a fit for the right thing, it’s just demoralizing. It feels like you’re spinning your wheels and you’re wasting your time. I would rather err, if anything, a little bit on the cowboy side, right? We don’t build pacemakers, so we’re not going to kill anyone if we make a slight mistake. So we should err a little bit on the cowboy side, not so much that no one wants to use our damn software because it’s total crap and full of bugs and they constantly run into issues. That’s not rewarding either, right? But there’s a sweet spot in between these two things that fits within the context of the work that you’re doing.
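To make the criticality ladder concrete, here is a minimal sketch in Python. None of this is 37signals code, and QueenBee’s internals aren’t public; the function names, amounts, and tests are invented purely to illustrate rigor scaled to criticality.

```python
import pytest

def display_name(first: str, last: str) -> str:
    """Low criticality: a wrong result is a cosmetic annoyance."""
    return f"{first} {last}".strip()

def prorated_refund_cents(price_cents: int, days_used: int, days_in_cycle: int) -> int:
    """High criticality: a wrong result costs someone real money."""
    if not 0 <= days_used <= days_in_cycle:
        raise ValueError("days_used must fall within the billing cycle")
    return price_cents * (days_in_cycle - days_used) // days_in_cycle

# Low criticality: one happy-path test is enough confidence to ship.
def test_display_name():
    assert display_name("Jason", "Fried") == "Jason Fried"

# High criticality: cover the edges: no usage, full usage, rounding, bad input.
def test_refund_is_full_when_nothing_used():
    assert prorated_refund_cents(9900, 0, 30) == 9900

def test_refund_is_zero_when_cycle_exhausted():
    assert prorated_refund_cents(9900, 30, 30) == 0

def test_refund_rounds_down_so_customers_are_never_overcharged_back():
    assert prorated_refund_cents(1000, 1, 3) == 666

def test_refund_rejects_impossible_usage():
    with pytest.raises(ValueError):
        prorated_refund_cents(9900, 31, 30)
```

The point isn’t the specific assertions; it’s that the cosmetic helper gets one happy-path check while anything touching money gets its edges covered, rather than applying one fixed test ratio to both.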
Kimberly (10:08): Okay, David, you brought up issues, so I’m going to skip to the fourth shipping principle, which is we own the issues after we ship. I think this one’s interesting because it kind of ties in with our theory of managers of one. It says, clean up your own mess if you make one. Pay attention to the error tracker for a while after launching. Be the first port of call for support when they get feedback from customers. If you did the work, you have the context and you should put it straight if it’s crooked.
David (10:34): This is all about incentives. If you are a product designer or developer and you get to throw things over the wall and not worry about how they land or what kind of mess they make when they land, you will inevitably be less careful. Your confidence meter will be off because you have no feedback from reality to calibrate it by. You think you’re doing enough testing because you don’t see the damn bugs, you don’t see the damn feedback from customers. So you’re isolated from reality. And whenever you’re isolated from reality, you will inevitably just invent a reality of your own making. Oh, I ship great software. I don’t know what you’re talking about. This was wonderful. I did all the tests. The tests don’t matter. What matters is whether the software is fault-free or not. That’s the ultimate outcome. We’re not looking for these intermediary outcomes. We’re not looking for you to check off boxes about how many tests you’ve written.
(11:28): We’re looking for faultless software, or faultless enough for the criticality that we’re aiming for. I think we have erred on this on occasion in the past, and some of it came from before we either had or fully respected our cool-down. That’s usually part of the magic of cool-down. Cool-down for us: we work in six-week cycles, and then we have two weeks of cool-down before the next cycle starts. And that time needs to be for two things. It needs to be for you to roam and clean things up if you didn’t have a project that ran right to the line. But it’s also, essentially, overtime if you had a project that ran right to the line and you need to clean up after it once it goes into production. We have had instances where things ran too close back to back.
(12:18): Someone would work on a major feature, they’d push it out, and boom, they’d be on project number two, and then, yeah, who’s going to clean up that mess? Whoever is, as we call it, on call. And now they’re sat there going, oh, I have to understand this feature from first principles. I didn’t build it. That actually creates a bit of friction within teams, and I’ve seen that develop even within our teams in the past when we didn’t respect this fully. And I think it’s not good for any party. It’s not good for the person who feels like they’re just being handed a bag of trash and has to figure out how it all works, and it’s actually not good for the person who just gets to throw shit over the wall, because they don’t get to learn. They actually don’t get to improve.
(12:57): They don’t get to test whether their theories of software design are adequate or even good, because they’re robbed of that feedback from reality. And I think there’s a sweet spot for software developers who get to work on new stuff but also get to see that stuff in the wild. That’s the kick of shipping, the kick of seeing something you built actually being put to good use. Not from two kilometers away, but from right up front, where you’re seeing, oh shit, yeah, there’s an issue there. So I think it is just a gift to both parties to make it such that there’s a fairness principle: the shit you put in there, do you know what? You have to take it back out again.
Kimberly (13:40): Okay. This one I feel like Jason, this one probably applies to the design side just as much as the programming side, which is we don’t ship if it isn’t right. I know you mentioned a couple podcasts ago that we were working on Highrise for six months, completely redesigned it. Tell me a little bit about how this applies to the design side.
Jason (13:59): Yeah, I would say it’s more on the product side, which is all of it, but I’ll give you an example. So we’re working on a feature right now in HEY which originally started out as being able to move through emails using the keyboard, j and k or the arrows, back and forth, whatever. And there were some technical reasons why we couldn’t really do it everywhere we wanted to. So it was only going to work in a few spots, and so we kind of built it that way, and I was using it, and something isn’t right here. It’s like a smell, to use David’s line. There’s something that isn’t smelling right about this. Yeah, in most cases it works, but here’s an edge condition that’s not that unusual. I could find myself in this corner pretty frequently and it won’t work, and it’s frustrating and I don’t know why, and I can’t explain it to people.
(14:46): I mean, I could technically, but it’s awkward. It would be a very unsatisfactory explanation. So we backed off of that for a second. We could have shipped it because it mostly worked, but I’m like, yeah, let’s back off this for a second here. What are we trying to do? The main thing we were trying to do here was make it easy, basically, to power through your unread emails. And the reason you might want to use the keyboard is because you want to look at one and go to the next one and go to the next one. You want to move through them rapidly. The idea is to move through them rapidly. That’s the idea. A keyboard is faster than a mouse in most cases for that. So we stepped back and said, okay, we’re trying to do that. Is there another way we could do that?
(15:25): We looked through the way we do things in HEY already, and it happens to be an interesting layout where we stack things on top of each other. Scrolling is very fast on computers, both on mobile and desktop. What if we just let you scroll through your unreads in a way that you can then respond to the ones you need to, mark read the ones you don’t need to respond to but want to mark read, and then at the end, when you’re done, say mark ’em all done if you want to? Maybe we could do that. So we started out trying to solve the problem one way, started using it, could have shipped it, but it wasn’t quite right, backed off. Literally took a step back and looked at it more broadly and came up with, I think, a better idea.
(16:02): When I read that paragraph that David wrote, that’s what it means to me. And it can also mean an aesthetic thing, like the proportions aren’t right and the buttons aren’t in the right place and all that stuff, but to me it’s more about: is this doing what it should do? Is this the best way it can do it? Or not the best way, because there’s always a better way, I suppose, but does this feel like the right way to solve this problem? Are we really getting to the root of it, and are we leaving any unknown magical features in the corner? Let me explain what I mean by that. I don’t like special rules, basically, or hidden rules, I should say. I call these magical corners. That’s my little stupid internal version of it. But basically with hidden rules, well, the arrow works in every case but these four, and no one knows that.
(16:45): No one’s going to know that, and when they run into it, they’re going to ask us questions and we have to explain these hidden conditions. Well, in this case, if you have this situation, this won’t work. I don’t like those. So whenever I run into those, I start to step back and go, I’m not sure this is right. I don’t think we can ship this yet. Maybe sometimes we can, but oftentimes we can’t. Let’s figure out a different way where we don’t have these hidden rules. That’s how I think about that particular one. I don’t know if that’s what it was intended to be, but that’s how I interpret it.
David (17:12): I think one of the corollaries to this, or things connected to it, is this idea that when it gets hard, when there’s something that isn’t quite working, you should always question the premise. Why is this hard? As Jason said, what are we trying to do here? I think with that specific example, we started out with a solution rather than a problem. The solution is, I want to be able to go through my emails quickly using a keyboard. Okay, that’s not the same as stepping back one step further and looking at what the problem is: I want to catch up quicker on my emails. That’s a more generic version of it. That’s not about a keyboard. It’s not even about stepping from one to the other. It is about a fundamental premise. And when we ran into these technical issues as to why the solution didn’t work in all the cases, that’s exactly the sort of hard stuff these shipping principles are supposed to catch.
(18:06): They just remind you: wait a second, let’s lean back here. There’s a reason this is hard. Could we make it easy? There’s a great saying in programming that is essentially, if you’re going to make a hard change, first make the change easy, then make the easy change. Now, it may be hard to make the change easy, but that’s how you clean things up, both conceptually and practically. We needed to clean up our concept of what we’re trying to solve a little bit here. And as it often happens with these shipping principles, why they’re called shipping principles and not development principles is that that moment of clarity arrives in the 11th hour. It arrives when all the pieces are put together. That’s often the only time you can really lean back and question the entire thing. Is this right? And that’s why we needed to write it down, because you have the other influence.
(18:58): We’re done. We’re shipping, right? It’s called shipping principles because we’re ready to push it out into the world. That’s when you need to pause. That’s when the hardest pause needs to happen. You could push deploy, it could go out there, and you could convince yourself that what you’re pushing out there is better than nothing. That’s a terribly low burden of proof. I don’t ever want to be, or rarely want to be, in the situation where we sign off on better than nothing. I mean, what a depressing way to work. Better than nothing? That’s your bar, like non-existence or something, whatever that something is, even if it’s total shit? I mean, come on, you’ve got to set your sights a little higher than better than nothing. So it’s one of those things where I try to code myself such that when I hear my internal monologue go, better than nothing, there’s just a reaction, right? It connects to this: we don’t ship if it isn’t right. Better than nothing? Don’t have such low standards. Raise your chin a bit. Look proud about what you’re about to ship. Better than nothing.
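The saying David paraphrases is widely attributed to Kent Beck. As a hypothetical illustration of the two-step move, here is a tiny Python sketch; the notification code and every name in it are invented, not anything from HEY.

```python
from dataclasses import dataclass

@dataclass
class User:
    email: str
    phone: str
    device_token: str
    preferred_channel: str

# Stub delivery functions so the sketch runs on its own.
def send_email(address: str, message: str) -> None:
    print(f"email to {address}: {message}")

def send_sms(number: str, message: str) -> None:
    print(f"sms to {number}: {message}")

# Before: the hard change. Adding a channel means threading another
# branch through every place that sends a notification.
def notify_before(user: User, message: str) -> None:
    if user.preferred_channel == "email":
        send_email(user.email, message)
    else:
        send_sms(user.phone, message)

# Step 1 (which may itself be hard): refactor so channels are pluggable.
# Behavior is unchanged; only the shape of the code improves.
CHANNELS = {
    "email": lambda u, m: send_email(u.email, m),
    "sms": lambda u, m: send_sms(u.phone, m),
}

def notify(user: User, message: str) -> None:
    CHANNELS[user.preferred_channel](user, message)

# Step 2: the change that motivated the refactor is now one easy line.
def send_push(token: str, message: str) -> None:
    print(f"push to {token}: {message}")

CHANNELS["push"] = lambda u, m: send_push(u.device_token, m)

# Usage:
notify(User("j@example.com", "555-0100", "tok-1", "push"), "You're up")
```

The refactor in step 1 carries all the risk, which is why it should change the shape of the code and nothing else; the motivating change then becomes trivially reviewable.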
Kimberly (20:01): Okay, the last one I want to dive into: we ship to our appetite. David, you talked a little bit about our six-week cycles and our two-week cool-down. I do want to read this one phrase that I found interesting, which says, “Like a great company can be a terrible stock at an exorbitant price, so too can a great pitch become a mistake if it’s pursued far beyond the appetite.” I’d love to get your thoughts on that, and then if you have some examples of maybe where we’ve pushed the boundaries of the appetite.
David (20:31): I think, just as a general principle here, in programming circles this is usually called gold-plating: you take something that has a valid reason to exist and you just adorn it. There’s so much decoration, there’s so much veneer on top of it, there are all these other things, because you want to make it perfect. Which is also a strain that runs through all of this: perfect is the enemy of good, and good is what we’re trying to do. The first principle is we ship good work, not we ship perfect work. That can be a total trap to fall into, and oftentimes that perfection actually masks over issues that are deeper. You’re perfecting a specific solution to a flawed premise. There’s no joy in that. I’d much rather have a good-enough solution to a great premise, to a great insight. And I think that’s sometimes what we have to weigh, right? When we’re trying to fit things into a short amount of time, as these six-week cycles are,
(21:31): we have to weigh the quality of our insight, of our premise, of our design against the implementation itself. And when I’m being forced to choose between those two things, I’ll always pick the clearer concept, always the clearer idea, the clearer line. And if you look at the example Jason referred to, going back and forth between your emails, the implementation of using the keyboard in the more traditional way, just jumping from one to another, was actually quite sophisticated. We had to do some sophisticated work to get the caching to be right, and I could look at it and go, that’s pretty clever. That’s a clever implementation. Yeah, it’s a clever implementation of a flawed premise, because now I look at the other solution we’ve come up with and I go, oh yeah, we’re just reusing some implementational bits we already have, but they’re being used in service of the right, novel insight into what people are actually trying to do.
Kimberly (22:25): Okay, well, I’m going to link to the Seven Shipping Principles that are on 37signals.com/thoughts. Until then, REWORK is a production of 37signals. You can find show notes and transcripts on our website at 37signals.com/podcast. Full video episodes are on YouTube. If you have a question for Jason or David about a better way to work and run your business, leave us a voicemail at 708-628-7850. You can text that number or send us an email to rework@37signals.com.