Every CTO knows the numbers by now. Developers are more than 50% faster with AI assistants, and some teams are seeing up to 2× throughput gains. As you read this, your team is probably using Cursor, Claude Code, or Copilot, and shipping more code than ever.
The problem? More code means more change—and change is where systems break. DORA research shows that more than 70% of production incidents are caused by changes to systems.
With the rapid adoption of AI coding tools, maintaining code is harder than ever. At your company, both the number and size of merged PRs have likely grown significantly year over year, and each one is another potential point of failure.
In 2015, Google researchers published a paper with a perfect metaphor: machine learning systems are "the high-interest credit card of technical debt." They documented how ML code introduces unique problems—hidden entanglement between components, configuration sprawl, glue code that nobody fully understands. The systems work, but they become exponentially more expensive to maintain, debug, and modify.
AI-generated code compounds debt even faster because nobody truly owns it.
When a human writes code, there's cognitive weight. They understand the tradeoffs, they remember why certain decisions were made, they can explain it in code review. That knowledge stays in their head and can be reconstructed when something breaks.
With AI tools, engineers click "Accept" repeatedly. The code appears, tests pass, PR merges. But reading code is not the same as writing it. The deep understanding that comes from wrestling with a problem, considering alternatives, making difficult tradeoffs—that never happens.
Six months from now, when something breaks, the person debugging it is starting from zero. Worse, they're debugging code that may be syntactically correct but doesn't encode any of your system's actual invariants, undocumented assumptions, or performance characteristics.
The economics of software haven't changed in 50 years: maintenance consumes up to 80% of lifecycle costs. But AI is radically changing the composition of that maintenance burden. It's shifting from "extend the system" to "understand what the system actually does."
Here's the uncomfortable truth: the teams using AI most aggressively will be the first to collapse under operational load.
Not because AI writes bad code—it doesn't, most of the time. But because operational complexity scales with change velocity, and AI just multiplied your change velocity without multiplying your ability to understand, monitor, or debug what's running.
Google's Site Reliability Engineering handbook has a forcing function: keep toil at or below 50% of engineering time. Toil is the repetitive, manual, interrupt-driven work that scales linearly with system growth—pages at 3am, manual deploys, debugging outages, chasing down config issues. When toil exceeds 50%, teams lose the capacity to invest in reliability improvements. They enter a death spiral: maintenance consumes all capacity, preventing the very investments that would reduce maintenance.
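To see how fast that budget gets eaten, here's a toy model, assuming toil grows roughly linearly with the number of merged changes. The team size, baseline toil, and velocity multipliers below are made-up illustrations, not figures from the SRE book:

```python
# Toy model: how a change-velocity multiplier eats the 50% toil budget.
# All numbers are illustrative assumptions, not measurements.

TEAM_HOURS_PER_WEEK = 400      # e.g. 10 engineers x 40 hours
BASELINE_TOIL_HOURS = 160      # 40% toil before AI adoption: under budget
TOIL_BUDGET = 0.50             # SRE guideline: keep toil at or below 50%

def toil_fraction(change_multiplier: float) -> float:
    """Toil fraction if toil grows linearly with merged changes."""
    toil = BASELINE_TOIL_HOURS * change_multiplier
    return min(toil, TEAM_HOURS_PER_WEEK) / TEAM_HOURS_PER_WEEK

for m in (1.0, 1.25, 1.5, 2.0):
    frac = toil_fraction(m)
    flag = "over budget" if frac > TOIL_BUDGET else "within budget"
    print(f"{m:.2f}x change velocity -> {frac:.0%} toil ({flag})")

# 1.00x -> 40% toil (within budget)
# 1.25x -> 50% toil (within budget)
# 1.50x -> 60% toil (over budget)
# 2.00x -> 80% toil (over budget)
```

Under those assumptions, a team that starts at 40% toil blows the budget somewhere between a 1.25× and 1.5× increase in change volume, well short of the 2× gains the AI tools promise.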
Most SRE teams are already stretched: Mean Time to Detect is too high, incidents come too often. Now you're flooding them with changes they didn't write, don't fully understand, and have to debug in production when they break.
Because more than 70% of incidents stem from changes, teams that keep borrowing velocity without paying down reliability debt will default in production.
The pressure to adopt AI coding tools is irresistible, especially when competitors are shipping faster. But velocity without accountability is just preloading your incident queue.
The fundamental problem isn't that AI writes bad code. It's that AI-generated code enters production with little ownership and shallow understanding: no one understands the tradeoffs, no one can explain the decisions, and six months from now, no one will remember why it was written that way. It's the difference between watching your teacher work a math problem on the whiteboard, where the answer "looks right," and being able to work the problem yourself on an exam with no help.
AI adoption doesn't eliminate responsibility; it redistributes it.
That means the solution isn't better monitoring or slower deploys. It's making ownership mandatory.
Make engineers accountable for what they accept, not just what they write. If you click "Accept" on AI-generated code, you own it—including the 3am page when it breaks. This means:
Start tracking AI-generated code as its own category in your metrics. You already measure PR velocity, incident rates, and MTTR; start measuring what percentage of your codebase is AI-generated and how it correlates with operational load (a rough sketch of that kind of rollup follows below). The goal isn't to ban AI tools, but to understand the real cost of borrowed velocity.
Budget for the maintenance debt you're creating. If AI doubles your output, your maintenance burden will eventually double too. That means doubling your investment in documentation, observability, testing, and the unglamorous work of understanding what your system actually does.
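To make the tracking point concrete, here is a minimal sketch of the rollup you could build. The `ai_assisted` flag, the idea of setting it from a PR label or commit trailer, and the sample records are all assumptions standing in for whatever your code host and incident tracker actually expose:

```python
# Minimal sketch: treat AI-assisted changes as their own metrics category.
# The sample records are hypothetical; in practice you'd pull PRs from your
# code host and incident links from your post-incident review tooling.
from dataclasses import dataclass

@dataclass
class PullRequest:
    number: int
    ai_assisted: bool      # e.g. set from a PR label or commit trailer
    caused_incident: bool  # linked back from post-incident review

prs = [
    PullRequest(101, ai_assisted=True,  caused_incident=False),
    PullRequest(102, ai_assisted=True,  caused_incident=True),
    PullRequest(103, ai_assisted=False, caused_incident=False),
    PullRequest(104, ai_assisted=True,  caused_incident=False),
    PullRequest(105, ai_assisted=False, caused_incident=True),
]

def summarize(prs: list[PullRequest]) -> None:
    """Print the AI-assisted share of merged PRs and incident rate per category."""
    ai = [p for p in prs if p.ai_assisted]
    human = [p for p in prs if not p.ai_assisted]
    print(f"AI-assisted share of merged PRs: {len(ai) / len(prs):.0%}")
    for name, bucket in (("AI-assisted", ai), ("Human-written", human)):
        rate = sum(p.caused_incident for p in bucket) / len(bucket)
        print(f"{name}: {len(bucket)} PRs, incident rate {rate:.0%}")

summarize(prs)
```

How the flag gets set matters less than where the numbers land: the category should show up in the same dashboards as MTTR and incident rate, so the cost of borrowed velocity is visible next to the velocity itself.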
Reliability can't be retrofitted. The teams that survive the next 24 months won't be the ones that shipped the most AI-generated code. They'll be the ones who made sure someone actually understood it.
Every engineering leader right now is making the same bet: that AI will increase productivity faster than it increases operational complexity. The data says that's a losing bet. AI-generated code is already here—the only question is whether you're building the systems to survive it, or whether you'll be the case study in someone else's postmortem.