SEASON 2 - EPISODE 194 - July 1, 2026

AI challenges in software development

AI didn’t just speed up the development of Basecamp 5, it changed how it was built. This week, Jason Fried and David Heinemeier Hansson break down how much of a role AI played in building their latest product, where it shines, and how 37signals had to adjust its process to keep their code base clean.

Watch the full video episode on YouTube

Key Takeaways

00:11 - How much of a role AI played in building Basecamp 5
09:55 - Where the intelligence technology truly excels
14:42 - The new challenge of saying no when features become easier to build
16:19 - Why a leaner product roadmap still matters
18:32 - Staying “easy to use” while competitors focus on AI features
20:44 - Why AI deserves both the hype and the skepticism
25:12 - Token spend, productivity, and staying profitable

Transcript

Kimberly (00:00): Welcome to Rework, a podcast by 37signals about the better way to work and run your business. I’m Kimberly Rhodes, joined as always by the co-founders of 37signals, Jason Fried and David Heinemeier Hansson. We have talked about AI a couple times on the podcast, but with the launch of our newest version of Basecamp 5, AI was involved a lot in that development. So I thought we’d talk a little bit about it. I know I saw a few messages poking around on our tech team about AI and things we needed to do to rein it in. So David, I’m going to let you kind of kick it off and talk to us a little bit about Basecamp 5 and how much was AI involved in this?

David (00:36): Basecamp 5 was the first fully AI accelerated development process that we’ve had. The past product that we launched before Basecamp 5 was Fizzy, and Fizzy had very little direct AI generated code in it. It had a little bit around the edges, but the vast majority of that code was written by hand for someone to put that product out. And it’s interesting because Fizzy only just launched back in December, but now here we are having launched Basecamp 5 just a few months later and yet it could not have been more different certainly towards the end in terms of the role that AI played. It really went from AI is this pair programmer you’ll ask for questions and it’s an easier way to look up APIs and docs and get things explained to AI is where most features start, that most things we are now doing with Basecamp and all sorts of other developments, start with a prompt.

(01:36): Not always. Sometimes there are specific features where the developer has a really clear eye on what needs to change and that doesn’t start with AI, but it’s less and less and I’d actually say at this point, it’s certainly the majority of fixes and feature upgrades that start with a prompt. And that’s just a big change. But more than the change for the developer in terms of their workflow and whether they start with a prompt or whether they start with an editor is the fact that AI allowed our design team to start developing much more of the implementation by themselves as part of the design process, which as a baseline is amazing. It is absolutely wonderful that AI is enabling designers to see their ideas and their designs literally come to life and therefore being able to interact with them and therefore being able to gauge whether their ideas are any good or not.

(02:32): The problem with traditional design is traditional application design is when you divorce it from how does it flow, how does it work, what does it look like with real data in a production situation? It’s just really difficult to land the spot. It’s really difficult to get something that’s great, but if you allow the designer to get the whole thing working and not just working but deployed and not just deployed but useful on a beta server working directly with our real data, you just move so much faster towards, yes, that’s it, that’s what I want, which is an enormous part of the whole design process. There’s the part that’s just, I know exactly what I want and I’m just going to design it. That’s usually the end stage of things or certainly the rare situation. A lot of the times we start with, I want to solve this problem.

(03:29): There’s a million different ways I could do it, but to figure out which one is right, I have to start building. I have to start developing and then I’ll tweak it and so forth. But what was the norm prior to AI was that the designer would reach a barrier quite early on in their design process where they just couldn’t get further. They just couldn’t get this thing actually working, actually usable. So then they’d have to pair up with a programmer and that process in some cases, either the designer had to wait, there just wasn’t a programmer available right now, or the project that they were working on was actually quite intricate. So it might take a week or two weeks or in some cases even three weeks before they’re fully able to evaluate the vision and design that they had. And then in some cases they figure out that what the developer had just spent several weeks working on was not right and was not what they actually wanted because no one knows what they want until they have it in their hands and can tinker with it and go like, oh yeah, no, no, no, that grip is wrong. And I couldn’t envision that.

(04:34): I can only feel it. So that process really changed how we started developing features, but it also has some limitations at the moment and one of those limitations is that AI will literally give you anything you ask for. It’s kind of like that monkey’s paw story where you get three wishes and you ask for a lot of money and the way you get a lot of money is like a death in the family with someone who had a big insurance policy where you’re like, that’s not what I meant. I didn’t want a granny to die for me to get a million dollars. I just wanted a million dollars. Some of AI development is a little bit like that. There’s a little bit of a monkey’s paw where you ask it to do something for you and it’ll just do it. It won’t say no. And sometimes the correct answer is no.

(05:20): And if you’d had a human developer with you, they would’ve said no. I would’ve said no. Jason would’ve asked for something in a particular way to say like, well, I wouldn’t say no. I’m more polite than that. I’d say like, “That’s probably not the best way to do things. If we do it this way, instead you can get 90% of what you want for 10% of the complexity.” And in most cases, it can be like, “Yeah, totally fine. Great. I didn’t even care about that as the last 10% anyway.” AI is not going to ask you that question. It’s not going to push back. It’s not going to be offended by having to come up with 700 lines of code when 35 could do. The problem is if you just keep doing that, if you keep asking the monkey’s paw for more wishes, you’re going to get a distorted code base.

(06:06): At least right now, I mean, we should timestamp this episode because it might very well change by the end of the summer as new, more powerful models come out here. But as of what are we, July now, 2026, this is still the state of the art. The state of the art is that the models are incredibly powerful and they’ve even gotten quite a lot more powerful just in those last six months that we’ve been rushing to get to the end of Basecamp 5 and then doing the follow up work afterwards. But they’re still not at a place where if you just have a designer prompt their way through, let’s say 20 features that you’re going to end up with something that has a coherent architecture that’s easy to continue to develop and that’s a pleasant place for humans to work alongside it. So we had to pull back a little bit a couple of times where the eventual rule became, you should start like this, as a designer or a product manager or anyone with a good idea for a new feature, make it work.

(07:05): Whatever you need to do, just make it work, run it on a branch, get it on a beta, show it to Jason, show it to Brian, try it with a larger share of the company. And then, if you like what you have, we should not think that that’s something you can just hit merge on. You cannot just take that output and then shove it straight into the code base and then expect that it’s always going to go well. We did that a couple of times and we ended up with some technical debt, which is the polite word for crappy code that needs to be swept up after the fact and it’s always harder to do it after the fact than it is to do it before the fact. So the final rule was if you’re making changes to the Ruby code or to the JavaScript code and you’re on the design side or a junior programmer, you should just have someone look it over.

(07:51): You should have someone more senior look it over and when they do look it over, you kind of have to be able to stomach that sometimes the answer is, “I can’t use any of this.” What the AI has produced works, but it would destroy performance, it would puncture the architecture, it would have security issues or any of the other considerations that a skilled programmer might identify with a piece of code is present here and so what? That’s fine. We are no longer attached to this piece of code that has been produced by prompts because I mean, we didn’t write it. There’s not a human on the other side. If you tell the AI to, well, you don’t tell the AI, you just start a new session and boom, amnesia. They’ve completely forgotten that they spent a lot of expensive tokens producing this stuff and they’re not offended.

(08:40): In fact, if anything, Anthropic and OpenAI, they’re going to be like, “Great, you want to start over? That’s another batch of tokens for us to sell you.” So when that’s the case, when the residual code has very little value, you don’t feel as bad about throwing it out and therefore when a senior programmer on our team looks at something produced by prompt that isn’t right, they can just go like, I’m just going to put it all into trash, and we’re going to take the idea you arrived at the concept, the design, probably the HTML, probably the CSS, and we’re going to wrap it around a new implementation and that’s going to be fine. We’ve still got to the final destination much faster. And that’s the other thing. I mean, this really is a lot faster. We were able to do a lot of things, especially in the last sprint where I would look at the amount of outstanding work and go like, do you know what?

(09:38): Pre AI, this just wasn’t going to happen. Now there was a slice of that that was whack-a-mole. There was a slice of that, especially in the phase where we were still trying to figure out how far we can push the prompt and some designers pushed the prompt very far. I mean, awesome. You got to touch the electric fence to know whether it’s actually on and it was on, so it gave us a little bit of a shock, but totally fine. They pushed it really far and then we kind of had to pull back a little bit because otherwise you can end up in sort of this prompt loop where you’re like, “Make this happen. Oh, you broke two things. Fix one thing. Oh, you broke another three things.” And that’s just not a pleasant place to be. So even if code is really cheap to produce, quality code still takes a fair bit of effort and it takes some dedicated prompting, some understanding of how the architecture already is and what fits in well and what you can just let the models run wild with and which other parts you kind of have to rein them in a bit.

(10:39): But I’d still say as a whole, Basecamp 5 is a lot better because we were able to use AI both to figure out what we wanted and then also to produce what we wanted and then finally to fix a lot of issues AI assisted. I mean, this is one of the things we saw after we launched, we talked in the previous episode about some of the performance issues, some of the pathological cases where some customers just have very rare data shapes, where they have thousands of entries in a part of the product where we don’t expect that because we very rarely see it and we don’t have it in our own account. And you take those kind of cases and AI is truly incredible, not just at fixing the issue but at diagnosing it and going through logs and figuring out where these outliers are, then trying to go through some loops where it’s rewriting queries or whatever the optimization is and then producing something quantifiable and it goes like, “Here’s the thing you had, here’s a new thing I made and it’s now 72% faster.” And it’s kind of that labor intensive, almost scientific work that AI just excels at and then still on, I want you to write me a new feature that requires a fair amount of Ruby and a fair amount of JavaScript, the reins have to be held a little tighter and it’s still possible for it to get off into the weeds. Again, as I said, we weren’t even there. Like for the initial part of Basecamp 5’s development, this started, Jason, when was the first dig, like spring of last year, I think?

Jason (12:13): We talked about it at the previous meetup. Yeah, probably about over a year ago.

David (12:17): Yeah. So about a year ago is how long we were working on Basecamp 5. That initial phase didn’t have any of this. It didn’t have any of this AI acceleration. That really only started to hit around December-ish. That was when the inflection point was. And then we had maybe half of the development phase that was AI accelerated. So it was a really interesting moment where you could see both of these things. And I’d say finally, there are qualities with the slower handwritten approach that we not just appreciate but actually miss. There’s something about the interaction when a developer and a designer are working closely together for a longer period of time on a feature where occasionally you end up with something that’s better and that’s just the reality of things. If you take three weeks to slow cook something and spend all your attention on it, it’s not all mechanical.

(13:13): Some of it is breakthroughs and concept that you’re not going to find if you only spend two days on something. So that’s the trade-off. I still think it’s very worth the trade-off. You just got to know that there is a trade-off. And then you also got to know that this acceleration, the ability to just go through long lists of things that are nice to have is also kind of dangerous. And I don’t know if the industry at large has fully appreciated the devil’s bargain it is to be able to get whatever you want, however much you want, because a lot of products don’t always improve. I think that’s a polite way of saying it when they add two times as many features, three times as many features. So what happens to product management when suddenly there’s not this great constraint of, I only have so many programmers, they can only work so many hours and therefore they can only produce so many features.

(14:09): What if you suddenly double that, triple that, 10X that? Do you actually trust yourself and your product managers to say like, “No, we’re not going to do these things for greater order.” I think even here at 37signals where we’ve prided ourselves on less software and fewer features and so forth, it’s a real challenge of figuring out what to say no to and not let your appetite sort of just gorge itself on endless streams of features until your product comes out completely bloated.

Jason (14:42): Speaking of that, really quick, just to add, I was just looking at our card table in Basecamp 5. We have a card table called Cycle 3, which is the current cycle of work we’re on and there’s about 20 things that we have listed as things we might be working on over the next six weeks. That would normally be maybe six things like a year ago and now there’s like up to 20 things. Now some of these are much smaller and quicker and not a big deal, but still, we wouldn’t have had this much stuff on the menu to choose from a year ago. So to David’s point, I think at this stage we’re making a lot of quick subtle changes because of feedback we’re getting and things we’re seeing ourselves. So there’s going to be probably more on the plate right now than there will be in six months, but still it is easy to just keep doing more and more stuff.

(15:29): And if you think about the ground has just shifted, something I’m conscious of and I was talking to Brian about is like, we’ve already upset enough people with change and I think many of those people will come around. I’m not like worried about permanently upsetting them, but like at the moment we’ve upset a lot of people, let’s make sure we don’t keep shifting heavy bits of ground under their feet. We can make some subtle changes. We’ve already made a couple, given them a big change in the home screen and the activity screen, like that’s actually enough big change right now. So let’s shift into smaller change for a while, stuff they may not even see but might appreciate because things are getting a bit more dialed in. So it’s not just about quantifying the amount of change, it’s about qualifying the size of change, the scope of these changes and understanding when to roll those out. And you don’t want to keep rolling out massive continental shifts basically all the time.

David (16:19): And I think that requirement of discipline is going to be very difficult going forward. And I think there’s some opportunity here that when it becomes easier and easier to make a bigger and bigger product, I’d like to believe that the distilled product, the carefully curated product, the product that doesn’t do everything for everyone in all different shapes and sizes is going to gain more value. And it’s not exactly easy to figure that out because there’s also pressure on the low end that if your product is really simple, even if it’s well made, AI is going to be able to eat that quite quickly, right? There’s some space in between the gigantuous behemoth that does everything and the small little feature that could be created by AI in a snap of a finger where there’s a magic space that exists. I think that’s the space that remains our big challenge as a company, as someone who’s been making Basecamp for a very long time.

(17:20): How do we make sure that we distill the essence of project collaboration to such a form that it remains humanly understandable, humanly learnable, humanly adoptable without requiring big manuals, without requiring big tours? As Jason mentioned in another episode, sometimes it’s hard to even get people to spend five minutes to watch an introduction video to a new version of the product. Good luck trying to get them to sit through an hour or read a hundred pages like that’s just not going to happen. So the natural constraint on how big can your product be in these kinds of consumer adjacent spaces is going to be the human capacity for picking something up. And that’s, I mean, historically always been one of our fortes, that Basecamp was easier to learn, easier to adopt than other softwares in our industry because it was just right, not less to some degree, but also just more approachable. And we have to be careful not to lose that as we gorge ourselves on all these new productivity gains.

Jason (18:32): And on the front end of that, we’ve been very careful about that on the marketing side as well. I invite everybody to go to basecamp.com and open a few different browser tabs. Let’s talk about other products in our industry. You can go to clickup.com, monday.com, notion.com, Asana.com. If you look at all of those products, how they’re presented today, it’s AI first. It’s AI is everything. It’s all there is AI. And I will tell you, having just interacted with literally thousands of customers over the past handful of weeks here, a lot of people don’t want that. They’re not interested in that right now. It’s not that AI isn’t useful in their life, but to lead with that, to be the primary touchpoint, to be the front door for everything is not what a lot of people want. Now there might be plenty of people online who want that, but there’s a lot of people in the world who do not want that and it’s extremely intimidating for them.

(19:30): And I’m not saying that they’re behind either. Their work is not that. That’s not what they need out of a tool. And so we’re very careful about making sure that we’re an option that is modern and you can bring your agents to Basecamp if you want. We have agents working in our Basecamp and you can do all those things, but we’re also saying like, this is straightforward and easy to use. These are words that have been lost. This used to be how everyone actually talked about their product, easy to use. We still have that in our homepage. I actually don’t think I’ve seen that phrase kind of anywhere else in the last few years. It’s sort of been forgotten and I think for a lot of people it’s still preeminently the most important thing for them. Can I use this thing? Can I get other people to use this thing with me?

(20:13): Am I going to understand this thing? If I have to explain this to someone else who’s not a technical person, are they going to get it? This is very, very important. And I would encourage people who are listening to not just jump on the AI bandwagon, going, “Every customer is looking for AI this, AI that.” There are plenty of people who are not and big, huge, huge, massive groups of customers who do not want that as the lead feature. It’s not all that everywhere. And if you talk to people, you’ll find that they are not all that everywhere all the time. So keep that in mind as well.

David (20:44): I think this is what’s so hard about our industry because if you are a developer, designer, anyone who’s making software, you will have heard almost about nothing but AI for the past two years. And therefore it’s very difficult not to let that soak into every decision you make about everything, including how you position your product, how you build for your product, that AI is such a tidal wave that it seemingly just washes all these other considerations away, including like, “Can I get humans to understand this product that we’re making and selling?” And yeah, it’s not easy because we can’t just dismiss this as it’s just a fad, right? The AI revolution that’s just coming underway now, I mean, that’s the other thing that’s important to remember. Literally when we were just talking about how AI was being used to make Basecamp, we’re talking about what happened the last six months, the last nine months, the last year at the very most.

(21:45): This is extremely recent stuff to be able to use it for these kind of productivity gains. And first of all, even when you have things that are getting adopted very quickly like AI, it doesn’t get adopted universally within six months or 12 months. In some cases it doesn’t even fit at all, but it is still the most important tidal wave that has come from our industry, quite possibly bigger than the internet and that was the tidal wave that we have been riding for the past quarter of a century with Basecamp. It’s this weird in between place where on the one hand, I think if you are a software developer and you’ve decided that AI is just not for you at all, you don’t care about it, it’s all just a fad, it’s all just hype. I don’t think that’s a good place to be. And I’m saying that with as much empathy as I can, because I mean, I occasionally also read just the 900th breathless opinion piece on the latest model and just go like, “Jesus, would you shut up for just a second about this stuff?” I mean, first of all, you were raving and ranting about just this other thing three days ago, like my brain simply can’t take that amount of just breathless boosterism.

(22:59): And I put that in a little jar and it goes over there and then there’s also just the magnificent daily encounters with these models where I just go like, “I cannot believe we’re making computers do this.” It is so exciting. It is so novel that we can just talk to the computer and it produces amazing software, not always, but increasingly persistently good software and you have to be able to hold those things in your head at the same time that there’s a tremendous amount of hype and some of the hype is ahead of where the market is right now and certainly ahead of where lots of customers are. And then there’s also the other part of the equation that is, this is absolutely real. This is not a fad. This is as strong of a signal as I’ve gotten from our industry literally since the internet.

(23:54): And I remember the internet very fondly and very vividly. And I also remember in like 95, 96, 97, how many people were just determined to tune out all the hype there was for the internet in those days and go like, “This is nothing more than an advanced fax machine. It’s not going to have a major impact on our society or our economy or whatever, just tune it out.” And clearly that was just a category error, fundamental misunderstanding of technology. So I think I’m trying to sit there. We’re trying to sit there as a company trying to embrace the enthusiasm, embrace the wonder of computers doing these things. I think if you are a person who’s a designer of software, a programmer of software, you should be excited about computers. Computers fundamentally, the idea that you can make machines do these things should be appealing, should be exciting, should be just invigorating. Soak that in, sit and enjoy that and then also go like, yeah, and then now I’m also going to make software that’s not just for me, my industry, my super early adopters and that part of the curve. We’re also making a bunch of software for totally normal people whose day and life and work has not just been completely rewritten overnight by AI.

Kimberly (25:12): Okay. Last question before we wrap it up. David, you mentioned tokens earlier and I’ve heard stories about companies unknowingly spending millions of dollars on AI development. Tell me how we’re managing that. It might be none of my business, but I’m curious, are we looking at our token spend? Is there a budget for tokens? How does that work in this small company who is framed on always staying profitable?

David (25:35): Yeah. So it’s actually interesting. We’re in a really good spot on that as of yet. Now these plans may change and so forth, but we exist just far enough below the enterprise tier where we’re not actually buying individual tokens. We’re buying these subscription plans that the providers of the frontier models are offering and they can be expensive as in like hundreds of dollars per person per month, but they’re not to the tune of some of the spends that I’ve seen or heard about from large enterprises that just buy tokens. I was going to say in bulk, but that’s not actually what they do. They buy them at piece and those tokens really can add up. I’ve heard cases where individual developers have been able to spend as much as a quarter of a million over a one year run rate and obviously, well, I don’t know, obviously, maybe not obviously, but for me, my optics, my wallet, my perception like, no, we’re not going to spend a quarter of a million on tokens for a single developer.

(26:39): That’s just not where this is, but it is still something we need to pay a lot of attention to and we need to make sure that we’re not just paying attention to what’s happening with the frontier models, whether you buy them from Anthropic or OpenAI, but also keeping them honest by using open source models or not open source, open weight models. And this has been one of the things that I’ve found just a marvelous revelation or maybe even irony of modern computing is that the open weight models are all Chinese and the open weight models are very cheap. They’re very cheap to run. You can run them on American providers or European providers. It’s not like you have to send all your data to China to get these models run, but they’re also not quite as good as the frontier models, but they’re shockingly few steps behind.

(27:32): There’s a new model that just came out a few days ago, GLM 5.2 that on a bunch of benchmarks is being compared to an Opus 4.8 or any of the other like leading edge frontier models and that’s just incredible that like in theory, you could run that model yourself in practice, not so much. It requires, I don’t know, $80,000 worth of equipment in your closet to do it, but there are providers who just do that, inference providers who will take these open weight models, they’ll put it on their commodity hardware and then they’ll just serve it for you. And there you can get pricing that’s just at a 20th the token cost of some of the frontier stuff. So paying attention to that just so that we always have an out that we don’t have to give up on all AI if, for example, the frontier labs turn off these plans, these subscriptions where we’re able to currently get a lot of value out of a fixed cost.

(28:23): And then if we were suddenly forced to just go per token pricing tomorrow, I’d be more worried about our spend because I’ve seen that spend at a lot of companies and as amazing as this is, the funny thing about productivity is you actually have to measure it against economic value. There’s a version of productivity measurements that just go on output like here’s a developer they’re able to produce three times as much code, five times as much code. Yeah, okay, that’s great. That’s exciting, but are they five times as valuable? Are the five times the amount of code that they’re producing or features or bug fixes they’re putting out there, do they have the same economic value as the first feature they produced? In almost all cases, no. There are some outliers, but for lots of companies, certainly companies of our size, if we ship twice as many features next month, we’re not going to double our revenue.

(29:20): That’s just not how the cookie crumbles. We can make a substantially better product, bigger product, more feature full product, maybe even a less bug-filled product. There’s not a linear relationship between that and revenue, but there’s a very linear relationship between that and token spend if all of this new productivity is just coming from AI and agent acceleration. So I think there’s a lot of companies that are going to have to wrestle with this curve diverging that making your individual developers a lot more productive does not mean a lot more revenue. Then there’s another conversation that thankfully we don’t need to engage in, but if what you’re creating is software as a cost center, you just have a body of work that you have to produce and it’s kind of fixed and making more of it doesn’t really add up or make sense then it’s about shrinking your cost to produce that.

(30:19): And that’s where I think the anxiety in our industry has crept in and a bunch of papers are like, is this going to lead to layoffs? And for us, we can just say like, no we’re happy with the size of this company we are. We’ve always been well within our skis. We’ve not overextended ourselves. We do not need to change our composition as a company, certainly not now from what’s happening with AI. We can just take all of this productivity and put it towards making better and more things. And in some cases that means more Basecamp, it means more Basecamp features, it means more Basecamp bug fixes and a bunch of other things, but it also means other things. We can open source more things, we can give more things back, we can do all of that, but you still have to have the economics in mind. The token spend has gotten out of hand. This is what you’re seeing with a lot of the profit warnings that are coming out of the big labs. Some of the anxiety in the stock market was about this too, that they have large customers who are going like, “Yeah, it’s making us more productive, but we’re not making more money.” So something’s going to have to give. We can’t just be exponentially increasing our token spend if customers aren’t buying more of our stuff.

Kimberly (31:32): Yeah. Well, with that, we’re going to wrap it up. Jason, you mentioned that you can bring your agents to Basecamp. You can find information on that on basecamp.com/agents. REWORK is a production of 37signals. You can find show notes and transcripts on our website at 37signals.com/podcast. Full video episodes are on YouTube. And if you have a question for Jason or David about a better way to work and run your business, leave us a video recording. You can do that at 37signals.com/podcastquestion or send us an email to rework@37signals.com.