The Code Review Habit That Was Burning Out My Team
I reviewed every PR with the same scrutiny — a one-line config fix got the same ceremony as a database migration. It burned out my reviewers.
For three years, I treated every pull request the same.
A one-line environment variable change? Full review. A 12-line copy update from the marketing team? Full review. A 400-line database migration that restructured our core data model? Full review.
Same process. Same scrutiny. Same turnaround expectation. Same Slack ping to the same three reviewers. I thought I was being rigorous. I was actually destroying my team's effectiveness.
The Ritual I Built
As a founding engineer, I set up our code review process early. The rules were simple and, I thought, unimpeachable:
Every PR requires at least one approval. No exceptions. No self-merges. Every change, no matter how small, gets a second pair of eyes.
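In practice, that handbook rule lived in GitHub branch protection. A sketch of what the settings payload looked like, assuming the GitHub REST API's update-branch-protection endpoint (`PUT /repos/{owner}/{repo}/branches/{branch}/protection`); the field names follow that API, while the repo details are omitted:

```python
# Sketch: the blanket "one approval, no exceptions" rule expressed as a
# GitHub branch-protection payload. Field names match the REST API's
# update-branch-protection endpoint; values here are illustrative.
protection = {
    "required_pull_request_reviews": {
        # The handbook rule: at least one approval on every PR.
        "required_approving_review_count": 1,
        "dismiss_stale_reviews": True,
    },
    "enforce_admins": True,          # no self-merges, not even for founders
    "required_status_checks": None,  # CI gating handled separately
    "restrictions": None,            # no push restrictions
}

print(protection["required_pull_request_reviews"]["required_approving_review_count"])
```

Note what the payload cannot express: it has no field for "how much scrutiny this change deserves." One approval is one approval, whether the diff is a README tweak or a schema migration.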
I even wrote it into our engineering handbook. I was proud of it. It felt mature. It felt like the kind of thing Real Engineering Organizations do.
Here's what actually happened in practice. An engineer would finish a feature involving a complex state machine refactor — the kind of change where a reviewer's fresh eyes genuinely matter. They'd open the PR. Then they'd wait. Because the reviewer they needed was currently blocked reviewing a PR that updated a README. And before that, they'd been reviewing a PR that bumped a dependency version. And before that, a PR that fixed a CSS margin.
By the time the reviewer got to the state machine refactor, they were exhausted. Not physically — mentally. They'd burned their review budget on changes that didn't need it. So the PR that actually required careful architectural review got a skim and an approval emoji.
I had created a system that allocated maximum attention to trivial changes and minimum attention to critical ones.
The Realization
The moment it clicked for me was embarrassingly concrete. I was looking at our review metrics and noticed our median time-to-first-review was about the same across all PR sizes. That sounded good on paper — consistent turnaround! But it meant a one-line typo fix waited the same duration as a 500-line feature. And the 500-line feature, once it finally got reviewed, accumulated roughly the same number of review comments as a 50-line change.
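The check that surfaced this is simple to reproduce. A minimal sketch, assuming you can export merged PRs as (lines changed, hours to first review) pairs; the data and bucket thresholds here are hypothetical:

```python
from statistics import median

# Hypothetical export of merged PRs: (lines_changed, hours_to_first_review)
prs = [
    (1, 6.0), (3, 5.5), (12, 6.2),   # trivial
    (60, 6.1), (90, 5.8),            # medium
    (420, 6.3), (510, 5.9),          # large
]

def bucket(lines: int) -> str:
    """Rough size buckets; thresholds are illustrative."""
    if lines <= 20:
        return "trivial"
    if lines <= 200:
        return "medium"
    return "large"

# Group wait times by bucket and compare medians.
by_bucket: dict[str, list[float]] = {}
for lines, hours in prs:
    by_bucket.setdefault(bucket(lines), []).append(hours)

for name in ("trivial", "medium", "large"):
    print(f"{name}: median time-to-first-review = {median(by_bucket[name]):.1f}h")
```

If the three medians come out roughly equal, as they did for us, the queue is treating a typo fix and a feature-sized change as the same unit of work.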
That couldn't be right. Either the big PRs were being under-reviewed, or the small PRs were being over-reviewed. Both turned out to be true.
I started watching more carefully. I'd see a reviewer leave three comments on a whitespace-only PR — not because there was anything meaningful to say, but because the process expected engagement. Then the same reviewer would approve a complex auth change with "LGTM" because they were running low on bandwidth.
The uniform process hadn't ensured uniform quality. It had ensured uniform mediocrity.
What I Should Have Known
The mistake seems obvious in hindsight: not all code changes carry the same risk, and review effort should be proportional to that risk.
A typo fix in a README has essentially zero chance of breaking production. A database schema migration could take down the entire application. Treating these with identical ceremony is like a hospital running the same diagnostic protocol for a paper cut and a chest pain complaint.
But how do you actually implement this? You could try to categorize PRs manually — label them "trivial," "standard," or "critical" and route them to different review processes. I tried this. It didn't work because engineers are terrible at self-assessing the complexity of their own changes. Everyone thinks their change is straightforward. The person writing a "quick fix" that subtly breaks an invariant in the payment system genuinely believes it's a minor patch.
You need something that can evaluate the actual substance of the code change, not the author's self-reported assessment.
The Fix: Complexity-Driven Review
What ultimately fixed our process was understanding what's actually inside each PR — not just its size in lines of code, but its real complexity across multiple dimensions. How much architectural surface area does it touch? How much risk does it introduce? How sophisticated is the implementation?
Once you can score that, you can make rational decisions about review allocation.
Low-complexity changes — config updates, copy changes, dependency bumps, straightforward test additions — get a lightweight process. Maybe a quick scan. Maybe auto-merge with post-merge review. The point is they don't consume your senior reviewer's most valuable resource: focused analytical attention.
High-complexity changes — schema migrations, auth system modifications, performance-critical algorithms, changes that touch multiple service boundaries — get the full treatment. Dedicated reviewer. Block of focused time. Architectural discussion if needed. This is where review effort actually pays dividends.
Medium-complexity changes get a standard process. One reviewer, normal turnaround, standard scrutiny.
The key insight is that this triage shouldn't be based on vibes or self-reporting. It should be based on an objective evaluation of the code change itself. This is exactly the kind of evaluation that AI-powered PR scoring was built for — it looks at the actual diff across dimensions like scope, architecture, risk, and implementation complexity, then gives you a number you can route on.
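The three tiers above reduce to a small routing function. A minimal sketch, assuming each merged PR already carries a 0-100 complexity score; the thresholds and function names are mine, not any tool's API:

```python
def route_review(score: int) -> str:
    """Map a 0-100 complexity score to a review track.
    Thresholds are illustrative, not prescriptive."""
    if score < 30:
        return "lightweight"  # quick scan, or auto-merge with post-merge review
    if score < 70:
        return "standard"     # one reviewer, normal turnaround
    return "full"             # dedicated reviewer, a block of focused time

def assign(pr_title: str, score: int) -> str:
    """Format a routing decision for a (hypothetical) PR."""
    return f"{pr_title!r} -> {route_review(score)} review (complexity {score})"

# Hypothetical PRs at each tier
print(assign("Bump lodash to 4.17.21", 8))
print(assign("Add retry logic to webhook handler", 45))
print(assign("Refactor checkout state machine", 85))
```

The point of the sketch is the shape, not the numbers: once the score comes from an objective evaluation of the diff rather than the author's self-report, the routing itself is trivial.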
What Changed for Us
After we implemented complexity-based review routing, three things happened.
First, reviewer burnout dropped. Our senior engineers were no longer spending their review cycles on trivial changes. They could focus their expertise where it mattered. One of our staff engineers told me it was the first time in years he felt like code review was a good use of his time instead of a tax.
Second, review quality on complex PRs went up. With reviewers less fatigued and more focused, they caught issues they'd previously missed. Architectural concerns got surfaced earlier. Performance implications got discussed before merge, not after an incident.
Third — and this surprised me — our overall velocity increased. Engineers weren't waiting in queue behind twelve trivial PRs to get their complex work reviewed. The critical path shortened. Features that used to take a week to get through review were landing in a day or two.
We'd been so focused on the ritual of review that we forgot the purpose: catching real problems in code that actually carries risk. The ritual had become the goal instead of the tool.
The Broader Lesson
I think this mistake generalizes beyond code review. Engineering teams love uniform processes because they feel fair and they're easy to enforce. But uniformity optimizes for simplicity of management, not for quality of outcomes.
The best engineering processes are adaptive. They match effort to risk. They concentrate resources where the leverage is highest. And they require understanding the nature of the work being done — not just that work is happening.
This is the same principle behind why engineering measurement has been broken for so long. Counting PRs tells you nothing about what's in them. Measuring cycle time tells you nothing about what got shipped. You need to understand the substance of the work to make intelligent decisions about how to manage it.
My code review mistake wasn't a process problem. It was an information problem. I didn't have the data to differentiate between a trivial change and a complex one, so I treated them all the same. Once I could see the difference, the right process was obvious.
If your team reviews every PR the same way, you're probably making the same mistake I did. Your reviewers are burning out on noise, and your most important changes aren't getting the attention they deserve.
GitVelocity measures engineering velocity by scoring every merged PR using AI. When you can see the complexity of each PR on a 0-100 scale, you can allocate review effort where it actually matters.
Conrad is CTO and Partner at Headline, where he leads data-driven investment across early stage and growth funds with over $4B in AUM. Before becoming an investor, he founded Munchery (raised $130M+) and held engineering and product leadership roles at IAC and Convio (IPO 2010). He and the Headline engineering team built GitVelocity to help engineering organizations roll out agentic coding and measure its impact.