The SDLC is Dead. Long Live the SDLC.

I recently came across a piece called "Respect the SDLC" by Saanya Ojha, and it got me thinking. Her argument is that AI doesn't reduce the need for engineering rigor; in fact, in the absence of good engineering discipline, moving faster with AI creates chaos. The argument rests on three pillars: specs need to be precise, reviews need to focus on intent rather than syntax, and testing becomes your only source of truth. That's a helpful framing to start from, but here's where I'd push back: discipline alone can't keep pace with the volume of code we're now generating. That's not a process problem you can engineer your way out of; it requires a new way of thinking.
Where "Respect the SDLC" Gets It Right
Saanya is right that specs need to be precise. I'd actually go slightly further: what matters isn't just that specs are precise, it's that the engineer has genuine clarity of intent before they ever open a coding agent. Pre-AI, an engineer could partially offload intent to a product manager and a designer – those roles owned the high-level requirements and the look & feel of the product. Then, during implementation, the engineer would discover edge cases naturally through the act of writing code. We've all had that experience: you sit down to implement something and realize midway through that you hadn't thought through a critical constraint. That discovery would lead to a further discussion of how to handle the edge cases and ensure everyone was on the same page.
That organic discovery process is gone now. Code generation compresses what used to be uncovered in hours of reflection into a few seconds of a spinning Anthropic icon. More often than not, lots of assumptions are silently encoded into the product without you ever hearing about them. The edge cases don't surface themselves anymore – they are only uncovered when a customer has a bad experience. That means you have to think through them upfront.
What this means is that taste matters more than it ever has. The teams shipping the best work with coding agents are the ones with the clearest picture of what they're building and why. The understanding of why translates into prompts and specs, which in turn ensures that the agent does exactly what it’s supposed to – and nothing more. That's a real shift, and it's one that's easy to underestimate until you've felt it firsthand.
Where It Gets Harder
The argument for reviews is sound in principle, but it runs into three problems at AI-generated code volume.
The first is sheer volume. If everyone on your team is running coding agents in parallel, the bottleneck doesn't disappear – it moves from writing code to reading it. There's simply too much code coming out too fast for engineering teams to actually read and review.
The second concern is what gets discovered in reviews. Tools like BugBot and Greptile are a promising start, but they focus only on the narrow details of what's in the codebase – race conditions, poor coding practices, and API compatibility. Humans were, frankly, never good at noticing these things pre-LLMs, so this is an improvement. But none of these tools can verify whether the feature implementation actually matches what was intended when the task was started. That still falls on humans.
This brings us to the third issue – the most challenging one to solve. These tools completely lack an understanding of why you’re doing what you’re doing. How do you know the feature does what it’s supposed to do? If you need strong verification of intent, you can't yet replace humans. And humans are already underwater.
Testing runs into its own version of this problem. Start with what TDD advocates have always said: you can't test what's in the code, you have to test what the application was supposed to do. That's hard enough when humans write the tests. When you ask an LLM to generate tests, the problem becomes recursive: the LLM will test the intent that was specified during implementation, not the intent that was necessary but never written down. If the intent was muddled going in, the tests will be muddled coming out. You end up with tests that validate the wrong thing with complete confidence.
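To make the recursion concrete, here's a hypothetical sketch (the function and the "pricing" scenario are invented for illustration): the spec's intent was that a discount can never make a total negative, but that constraint was never written into the task, so it's missing from both the code and any tests derived from the code.

```python
# Hypothetical agent-generated implementation. The unstated intent was
# "a discount never makes the total negative" – but since it was never
# specified, the code doesn't clamp.
def apply_discount(total: float, discount: float) -> float:
    return total - discount


# A test generated *from the code* encodes whatever the code does.
# This passes – and confidently certifies the bug:
assert apply_discount(10.0, 15.0) == -5.0


# A test written *from the intent* would instead demand clamping,
# and would fail against the implementation above:
def intent_check_total_never_negative():
    assert apply_discount(10.0, 15.0) >= 0.0
```

The point isn't that the generated test is wrong in isolation – it's internally consistent with the code. It's that nothing in the loop ever saw the omitted constraint.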
What We're Actually Seeing
Underlying our argument is the view that the SDLC itself is changing. This has become abundantly clear to us in conversations with engineering leaders over the last six months. Specs, reviews, and tests are very much what the doctor ordered for the pre-LLM SDLC, but the game is changing.
The SDLC post-coding-agents must, almost by definition, look different, for a few reasons. First, speed is paramount – you have to race to keep up with the competition, and slowing your team down with heavyweight reviews and tests is incompatible with that imperative. The best teams have always optimized for speed, but now it's table stakes. Second, code is cheap. Building something, trying it, and throwing it away used to be a huge waste of time – today, you don't think twice about it. Finally, as many others have observed, the question most teams are asking is no longer whether they can build something, but what's actually worth building. You can realistically build almost anything now, so exercising good taste and judgment – and knowing what users want – is critical.
Does that mean good engineering discipline is hopeless? Are we all doomed to ship low-quality code that breaks all the time, only to come back the next day and try to fix it? Hopefully not!
The solution is not to impose discipline that slows down teams – it’s to think about what developer tooling is going to enable the next version of the SDLC. Some of this might be familiar, and some of it will be net new:
- “Hands”-on testing: This is the true promise of Cursor Cloud Agents and Claude Desktop’s app-use features. You want an agent to implement something, validate what it did, and then show you a nice demo video you can sign off on. This works for basic testing today, but reality is still a long way from the full promise: an agent that can simulate lots of known failure cases (angry users, shoddy internet, etc.) and validate that the app behaves appropriately. This is what engineers (still) often do by hand.
- Precise measurement: No matter how hard we try, more code that causes issues – whether explicit (errors) or implicit (bad UX) – is going to make it into production. Without clear measurement of the impact of your code changes on key metrics, you’re going to be flying blind. Most product analytics frameworks can support this, but we’re going to have to develop better discipline around measuring the right thing.
- Incident detection: Once you have the right metrics, you’re going to have to know what to pay attention to. You can’t chase every single blip on the radar, or you’re going to drive yourself crazy. At RunLLM, we’ve been working on predictive incident detection to ensure that issues are detected as close to when they occur as possible. If Ben’s prognostication that there will be many copies of code in production is true, this will be more important than ever.
In a nutshell, the next version of the SDLC isn’t about fixing everything up front – it’s about being confident that you’re building the right thing from the get-go, iterating quickly, and making sure you proactively catch anything that falls through the cracks. If you can get that loop right, your team will be well ahead of the curve.
What That Means
Specs, reviews, and testing still matter. The goal of coding agents was never to have every engineer ship whatever they dreamed up at 2am.
But the version of the SDLC that works going forward has to be built for a world where code is generated orders of magnitude faster than it can be understood. That means new tooling, new practices, and a clear-eyed acknowledgment that we can't review our way out of this. We need systems that validate intent rather than syntax, catch issues as close to production as possible, and give teams leverage rather than more process to maintain.
The contract that software development was built on has changed. We optimized for generating code and left everything else exactly where it was. The teams that figure out how to close that gap are the ones that will ship reliably at the speed AI makes possible. Everyone else is just writing faster into the dark.