Your Code Is Moving. Your Judgment Is Not.

AI ships code faster than leaders can evaluate it. The gap between those two speeds is the problem most teams haven't named yet.

The production memory issue didn't reveal itself as a speed problem at first. It arrived as a debugging session that should have taken thirty minutes and kept stretching. A senior engineer at the terminal, working through the layers the way he always had, waiting for the pattern recognition to fire. Six months earlier it would have been automatic. Now it arrived the way a word does when it's on the tip of your tongue and keeps almost forming.

He wasn't coasting. He was busy in ways that looked exactly like leadership. Sprint planning. Stakeholder management. The roadmap reviews that had to happen. The calendar was full of things that mattered. What he didn't see was that the calendar had quietly substituted for something else, and that substitution had a cost that doesn't show up until a moment like this one.

The instinct to debug wasn't gone. It was soft. And soft is the version that's hard to catch because the output keeps looking fine right up until it doesn't.

Six months of distance had done what six months always does.

The Number Worth Actually Thinking About

GitHub's research found that developers using Copilot complete tasks up to 55% faster. That statistic tends to land as good news, and in isolation it is. The tools work. Adoption makes sense. The productivity case is solid.

But 55% faster code means something downstream from the code itself. It means more decisions per day embedded in pull requests. More architectural choices shipping without a deliberate pause. More places in the system where someone with real context needs to evaluate whether what just merged connects to what's being built on three other continents, whether the pattern holds under load, whether the trade-off that seemed obvious in one service is going to be expensive when replicated across twelve of them.

The code is moving. The evaluation cycle is not keeping up.

This is the gap that doesn't have a metric yet: not whether the code ships faster, but whether the judgment evaluating it is sharp enough to catch what that speed is compressing. A 55% velocity increase doesn't arrive with a corresponding increase in review depth. It arrives with the same humans, the same available hours, the same leader who may or may not have been staying close to the work.
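
A back-of-envelope sketch makes the compression concrete. The numbers below are invented for illustration (a hypothetical team shipping 20 PRs a week against 10 fixed hours of review), not taken from any study; the shape of the arithmetic is the point.

```python
# Back-of-envelope sketch with invented numbers; illustrative, not empirical.
prs_per_week = 20           # hypothetical throughput before AI assistance
review_hours_per_week = 10  # reviewer capacity, which does not grow with velocity
speedup = 1.55              # "up to 55% faster," applied naively to throughput

minutes_per_pr_before = review_hours_per_week * 60 / prs_per_week
minutes_per_pr_after = review_hours_per_week * 60 / (prs_per_week * speedup)

print(f"Review depth before: {minutes_per_pr_before:.0f} min/PR")  # 30
print(f"Review depth after:  {minutes_per_pr_after:.0f} min/PR")   # ~19
print(f"Depth lost per decision: {1 - minutes_per_pr_after / minutes_per_pr_before:.0%}")
```

However you pick the numbers, 1.55x throughput against fixed review hours means roughly a third less evaluation per decision. The velocity dashboard shows the 55%; nothing shows the 35%.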

Leaders who were already drifting from technical proximity before AI arrived are now accumulating in months a gap that drift used to take years to build. The calendar didn't change. The speed of what's underneath it did.

That's the thing about a full calendar. It doesn't feel like drift. Every meeting has a purpose. Every conversation moves something forward. From the outside, and usually from the inside too, it looks exactly like the work.

I've Been on the Wrong Side of This

I've retreated from technical proximity more than once. Not as a decision but through the way months accumulate when you're leading an engineering organization across three time zones and three continents. The drift doesn't announce itself. It settles in through small things that each feel reasonable on the day they happen.

The 1:1s that gradually become status reports because somewhere along the way you started asking about progress instead of approach. The technical explanation you follow at about sixty percent and tell yourself is close enough. The sprint that looks good on Friday and leaves you realizing, two weeks later, that you don't actually know how the hard problems got solved.

The signal I trust most is this: when did something in the codebase last surprise me? Not concern me. Not require action. Just genuinely catch me off guard in a way that meant I was close enough to see something I hadn't expected.

When I can't answer that question cleanly, I know I've drifted.

The trouble with drift is that everything around it keeps looking fine. PRs get merged. Features ship. The sprint board fills and empties in the right rhythm. You can mistake all of that motion for proximity to the actual work. And for a while, the gap between where your judgment actually is and where you think it is stays narrow enough that nothing reveals it.

AI closes that window. Code moving 55% faster means the gap between your actual evaluation capacity and what needs evaluating grows faster too. Drift that used to be invisible for a year is now surfacing in months.
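
The window-closing effect is easy to see in a toy model. Everything here is invented (the capacity, the threshold, the pace), but it shows the asymmetry: velocity doesn't change how fast judgment drifts, it changes how fast the backlog of unevaluated decisions crosses the line where something breaks.

```python
# Toy model with invented parameters; illustrative, not empirical.
def months_until_visible(decisions_per_month: float,
                         eval_capacity_per_month: float,
                         visible_threshold: float) -> float:
    """Months until unevaluated decisions pile past the visibility threshold."""
    excess = decisions_per_month - eval_capacity_per_month
    if excess <= 0:
        return float("inf")  # evaluation keeps pace; the drift stays invisible
    return visible_threshold / excess

capacity = 80    # hypothetical: decisions a drifted leader can still evaluate monthly
threshold = 120  # hypothetical: unevaluated decisions before an incident surfaces it

print(months_until_visible(90, capacity, threshold))         # pre-AI pace: 12.0 months
print(months_until_visible(90 * 1.55, capacity, threshold))  # 55% faster: ~2.0 months
```

The parameters are made up; the asymmetry isn't. A small excess over capacity compounds slowly, a large one surfaces fast, and the dashboard shows the same green either way.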

What I'm Actually Looking for in a Pull Request

I still read pull requests every week. Not to catch syntax. My engineers are better at that than I am on most days, and AI is catching more of it than any reviewer has the hours to chase.

I read them because a PR is a decision record, and decision records are what I'm actually managing. Not lines of code. Not ticket counts. The thinking that went into the approach. The trade-offs the engineer saw. The choices that got made quietly inside a feature and will shape what's possible six months from now when someone needs to extend it.

Leading across three continents means the clearest signal I have about whether the judgment on my team is sound is not the sprint review or the roadmap deck. It's the PRs. That's where I can see whether the decisions being made in one geography connect to the architecture decisions being made in another. Whether the error handling patterns we agreed on are actually holding under delivery pressure. Whether an engineer is solving the problem in front of them in a way that creates a bigger problem two decisions from now.

When an engineer is using AI well, the PRs show it. The reasoning is tighter. Edge cases get considered. The approach names its trade-offs and sometimes names the ones it decided against. When an engineer has shifted into mostly directing and reviewing output without doing the underlying thinking, the PRs show that too. The decisions are there, but the reasoning behind them is thin. The code compiles and passes review and ships. And you can't tell whether the engineer understands why the approach works, or whether they'd make the same judgment call correctly in a different context without the assist.

The second version isn't always slower. It's often faster. The problem is what it's building toward.

That's still the best signal I have about whether the judgment is sound. You can't see it if you've stopped reading.

The Multiplier Goes Both Ways

You'll hear that AI raises the floor. Even the weaker engineers ship more. The baseline rises. The ceiling matters less. This is the optimistic read, and it's not wrong.

It's also not the whole story.

AI is compressing the margin for judgment debt. Yes, the floor rose. But so did the speed at which poor judgment compounds. A leader who isn't close to the work doesn't miss things at the rate they used to. They miss things at the speed the code is moving, which is meaningfully faster than it was two years ago.

The better frame is that AI is a multiplier, and multipliers don't choose sides. Whatever state your technical judgment is in when AI arrives in your team's workflow is what gets multiplied. Sharp, because it's been fed by proximity, by reading the PRs, by staying close enough to the system to actually see what's happening: that gets multiplied. Soft at the edges from drift, slow to pattern-match, confident in its own competence without the proximity to back it up: that gets multiplied too.

The engineers using AI as a tool for thinking faster are compounding their judgment at a rate that would have been impossible without it. They're genuinely ahead in ways the velocity dashboard is not built to capture. The leaders and engineers using it to avoid the thinking are accumulating judgment debt at a speed that also would have been impossible without it.

The gap between those two groups is widening faster than most organizations can see.

The Part That Doesn't Come Back

The senior engineer from that production incident figured it out. He got back in the code. Rebuilt the proximity. Let the codebase surprise him again through deliberate reps over time. The instinct came back, slowly, because technical instincts are recoverable when you're willing to do the work.

What doesn't come back is the time the code was moving without anyone close enough to catch what it was doing.

The skill atrophy problem is a real one, and it compounds in ways most teams aren't tracking. But atrophy is individual. The evaluation gap is structural. It's what happens when the speed of production outpaces the capacity of the people responsible for evaluating it, and nobody notices because the code keeps shipping and the sprint keeps moving and the calendar stays full of things that look exactly like leadership from the outside.

AI is a force multiplier. What it's multiplying is whatever judgment your leaders still have.
