The numbers looked good last quarter. Weekly AI credit usage was up. Active users climbing. More code reviewed with AI assistance than without. I sent a note to the team. Good progress. Keep leaning in. Then I started asking different questions, and the dashboard stopped being useful.
What wasn't on it: who had stopped sitting with a hard problem before reaching for a prompt window. Who shipped three consecutive features without once asking why the system underneath worked the way it did. Who used to run toward the gnarly debugging sessions and now routed them to an AI agent before their hands ever touched the keyboard.
I was tracking participation. I had convinced myself that was the same as tracking growth.
What 500 Engineers Described
Around the same time I was reviewing those adoption numbers, I was reading a thread that had grown past five hundred responses from experienced engineers describing what happened at their companies after mandated AI rollouts. The volume alone was striking. What stayed with me was the texture of the frustration.
Nobody was complaining about the tools. The tools worked.
The frustration was more specific than that. The engineers who had spent years building judgment in the hard places ... the ones who were exceptional at careful debugging, at reading a system and understanding why it was behaving the way it was ... were feeling a particular kind of displacement. Their form of skill, the kind that takes years and a lot of broken things to develop, was suddenly treated as equivalent to good prompting. A junior engineer who knew how to construct the right query could now generate the same output. The adoption dashboard couldn't tell the difference between the two.
The slow refactoring where you actually learn something, the debugging session where you sit with the problem for two hours before the pattern surfaces, the architectural decision that requires holding years of context in your head at once ... none of that was showing up in the metrics. None of it was being tracked. And because it wasn't being tracked, it was starting to feel like it didn't count.
The engineers who felt this most acutely were often the best ones.
Companies were measuring the spread of the tool. Nobody had built a way to measure what was happening to the engineers using it.
The Same Mistake, Different Dashboard
I've made this kind of error before. Not with AI adoption, but with velocity.
There was a period earlier in my career when I pushed for velocity numbers to climb in ways that made the acceptable behavior obvious without anyone saying it directly. The metric looked healthy. The codebase underneath it, quietly, was not. I had confused the number going up with the team doing well. By the time I understood what I had actually been measuring, the cost had already been paid.
The adoption dashboard gave me the same feeling. Usage up, credit consumption up, percentage of AI-assisted reviews up. And I was presenting those numbers with the same confidence I'd had when velocity was climbing.
Adoption metrics have the same architectural problem velocity did. They measure the presence of the tool in the workflow. They tell you nothing about whether the engineer understands the system they're shipping. A team with perfect adoption scores can be losing its debugging instincts one week at a time without a single number on the dashboard moving in the wrong direction.
I had spent years learning that lesson through velocity. Then I walked into the same room with a different dashboard and almost didn't notice.
The dashboard said we were winning. The engineers knew what they had actually built.
Generating vs. Understanding
Generating code and understanding the system you're building are two different activities that adoption metrics actively flatten into one. They produce similar surface outputs. In the short term they are almost indistinguishable to someone looking at a dashboard. Over time they produce very different engineers.
The engineers I trust most on my team use AI constantly. They also know, without needing to reach for anything, why their system does what it does. They can walk through an architectural decision and tell me the four things they considered and why they landed where they did. That knowledge lives in them. It doesn't live in a chat window or a prompt history.
Skill atrophy doesn't announce itself. It settles in quietly, in the space between what you used to do manually and what you now route somewhere else without thinking about it. The engineers losing that grounding aren't failing in any visible way. They're responding rationally to the environment around them. If the only thing being measured is how much they're using the tools, they'll optimize for using the tools. The comprehension that shows up when something breaks at 2 AM and there's no support path and no agent to route it to ... that isn't something the adoption dashboard rewards. So it stops getting practiced.
What I Track Instead
I changed what I ask for in reviews and in one-on-ones.
The question is whether they can explain why their system does what it does, without reaching for a tool to help them answer first. No agent assist. No quick look-up. The engineer and the question.
The engineers who can answer that without hesitation are the ones I trust when things go wrong. Not because they're smarter, but because their understanding of the system lives in them. It's portable in ways that a well-crafted prompt history isn't.
What's helped is treating explanation as something that gets practiced, not audited. I look for moments in architecture conversations and design reviews where an engineer articulates the reasoning behind what they built. When someone can ship three features and still reconstruct the logic underneath each one, the tool is serving them well. When the shipping is happening but the reasoning can't be reconstructed, that's the signal the adoption dashboard was actively covering up.
High usage score. Low system comprehension. The metric looked healthy. The picture underneath it wasn't.
The Gap Your Dashboard Can't See
The engineers who can explain the system without flinching are the ones building things I can trust when conditions get hard. The engineers who can't are generating a lot of code.
There's a measurable distance between those two things.
Your dashboard probably isn't measuring it.
