Small PRs vs Large PRs

One of the most common questions about GitVelocity is why a brilliant 10-line bug fix gets a low score while a routine 500-line feature gets a higher one. The answer is the Effort Scale Factor (ESF), and it is working as designed.

How ESF Works

Every PR score has two components:

Final Score = Base Score × ESF

The Base Score evaluates the inherent complexity of the change across six dimensions. The Effort Scale Factor adjusts for the size of the PR, primarily based on lines changed and secondarily on files changed.

ESF ranges from near 0 for tiny changes to 1.0 for large ones. It acts as a multiplier that scales the base score according to how much code actually shipped.
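
As a minimal sketch of the formula above (not GitVelocity's actual implementation), the multiplication can be written as a small Python function. The name `final_score` and the range check are illustrative assumptions. How ESF is derived from lines and files changed is not specified here, so the sketch takes it as an input.

```python
def final_score(base_score: float, esf: float) -> float:
    """Final Score = Base Score x ESF (hypothetical helper, for illustration only)."""
    # ESF is described as ranging from near 0 up to 1.0; treating 0 itself
    # as invalid is an assumption of this sketch.
    if not 0.0 < esf <= 1.0:
        raise ValueError("ESF is expected to lie in (0, 1]")
    # Round to two decimals to hide floating-point noise in the product.
    return round(base_score * esf, 2)


print(final_score(45, 0.10))  # 4.5
```

Because the multiplier never exceeds 1.0, the result can only equal or fall below the Base Score passed in.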

Why Small PRs Get Lower Final Scores

A 10-line bug fix might demonstrate deep system knowledge and clever problem-solving. The Base Score captures that: it could be 45 or higher. But the ESF for 10 lines is around 0.10, so the Final Score lands near 4.5 (45 × 0.10).

This is intentional. GitVelocity measures the total complexity of work that shipped. A 10-line change, no matter how insightful, is a small amount of shipped work. It typically takes less time to write and review, and carries less deployment risk, than a 500-line change of equivalent base complexity.

The low final score does not mean the work was unimportant. It means it was small.

Large PRs Are Not Automatically High-Scoring

A high ESF only removes the size penalty; it never adds a size bonus. Because ESF tops out at 1.0, the Final Score can never exceed the Base Score. A 500-line PR that is mostly boilerplate will still have a low Base Score because the underlying complexity is low. The ESF might be 0.80, but 0.80 multiplied by a Base Score of 12 is still only 9.6.

Large PRs score well only when they are both large and complex. A 500-line feature that introduces new architecture, touches multiple systems, and requires careful error handling will have a high Base Score and a high ESF, resulting in a high Final Score.
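
To illustrate the ceiling, the sketch below applies several ESF values to the boilerplate example's Base Score of 12. The ESF values are arbitrary sample points chosen for illustration, not GitVelocity's actual size curve.

```python
boilerplate_base = 12  # Base Score from the boilerplate example above

# Illustrative ESF sample points (assumed, not an official curve).
esf_values = [0.10, 0.50, 0.80, 1.00]

# Final Score = Base Score x ESF, rounded to hide floating-point noise.
scores = [round(boilerplate_base * esf, 2) for esf in esf_values]
print(scores)  # [1.2, 6.0, 9.6, 12.0]

# No amount of size lifts the Final Score above the Base Score.
assert max(scores) <= boilerplate_base
```

Even at the maximum ESF of 1.0, the score stops at 12, so making a low-complexity PR larger cannot turn it into a high-scoring one.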

An Example

Consider two PRs with identical Base Scores:

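| PR                              | Base Score | Lines Changed | ESF  | Final Score |
|---------------------------------|------------|---------------|------|-------------|
| 10-line critical bug fix        | 45         | 10            | 0.10 | 4.5         |
| 500-line feature implementation | 45         | 500           | 0.80 | 36.0        |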

Both PRs demonstrate the same level of underlying complexity per the rubric. But the feature implementation shipped far more code, required more review effort, and represented a larger unit of work. The Final Score reflects that difference.
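
The two rows can be reproduced with a short Python sketch. The `PullRequest` class and its field names are hypothetical, and the ESF values are copied from the example rather than computed, since the exact size-to-ESF mapping is not given here.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical record for one row of the example."""
    description: str
    base_score: float
    lines_changed: int
    esf: float  # copied from the example, not derived from lines_changed

    @property
    def final_score(self) -> float:
        # Final Score = Base Score x ESF, rounded to hide floating-point noise.
        return round(self.base_score * self.esf, 2)


prs = [
    PullRequest("10-line critical bug fix", 45, 10, 0.10),
    PullRequest("500-line feature implementation", 45, 500, 0.80),
]

for pr in prs:
    print(f"{pr.description}: {pr.final_score}")
# 10-line critical bug fix: 4.5
# 500-line feature implementation: 36.0
```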

Do Not Artificially Inflate PR Size

Knowing how ESF works, you might be tempted to make PRs larger to get higher scores. This is a bad idea for several reasons:

  • Larger PRs are harder to review. Review quality drops as PR size increases. Bugs slip through.
  • Larger PRs are riskier to deploy. More code changed means more potential failure points.
  • The AI evaluates actual complexity. Adding padding, unnecessary refactoring, or unrelated changes does not increase the Base Score. The AI reads the code and scores what it finds.
  • Your team ships slower. Large PRs take longer to get through review and merge. Smaller PRs ship faster and more frequently.

The best strategy is also the simplest: write well-scoped PRs that solve one problem each. If the work is complex, the Base Score will be high. If the work is small, the Final Score will be low, and that is fine.

Total Velocity Tells the Full Story

If you are shipping many small, focused PRs, your individual scores will be low, but your total velocity over time can still be substantial. GitVelocity aggregates scores across all your merged PRs, so a week of ten well-crafted small PRs can contribute more total velocity than one large PR that sat in review for days.
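
A rough sketch of that comparison, assuming aggregation is a plain sum of Final Scores (the exact aggregation method is not specified here) and reusing illustrative figures of 4.5 per small PR and 36.0 for a single large PR:

```python
def total_velocity(final_scores: list[float]) -> float:
    """Sum of Final Scores for merged PRs (assumed aggregation, for illustration)."""
    return round(sum(final_scores), 2)


# Illustrative week A: ten small PRs, each with a Final Score of 4.5.
small_pr_week = [4.5] * 10

# Illustrative week B: one large PR with a Final Score of 36.0.
large_pr_week = [36.0]

print(total_velocity(small_pr_week))  # 45.0
print(total_velocity(large_pr_week))  # 36.0
```

With these particular numbers the ten small PRs come out ahead. The outcome depends on the actual scores involved, so treat it as an illustration rather than a guarantee.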

Volume and consistency matter more than any single score.