2 Comments
User's avatar
The AI Architect's avatar

Phenomenal articel, especially the historical arc through DeMarco's retraction to DORA's limitations. The METR perception gap (developers thought they were 24% faster but were actually 19% slower) is maybe the most underrated finding in AI tooling research. We're seeing this exact dynamic where teams report feeling more productive but quality metrics tell a different story. The sematic understanding angle is intresting because it sidesteps Goodhart's Law more elegantly than previous approaches, tho I'm curious how well LLMs can actually distinguish between high-value architectural work and busywork in practice.

Justin Cranshaw's avatar

We're definitely making progress on the semantic understanding side of things. Can't say it's totally solved, but at least to me, this seems like where the future of measurment is going.