Effort Scale Factor

The Effort Scale Factor (ESF) is a multiplier that adjusts the base score for PR size. It exists because a 5-line fix and a 500-line feature can have identical base scores, but the larger change required more effort to develop, test, and review.

Final Score = Base Score x ESF

The ESF ranges from 0.10x (for the smallest changes) to 1.00x (for the largest). This page explains how the ESF is calculated, why lines are the primary signal, and how the file-count adjustment works.

Why the ESF Exists

Without the ESF, a one-line typo fix that happens to touch a critical config file could score higher than a 400-line feature implementation. The base score captures what the change does; the ESF captures how much work it took to do it.

The ESF is intentionally a scaling factor rather than an additive bonus. Small PRs still receive the full base score analysis, so nothing is lost in the breakdown, but the final number is scaled to reflect the reality that larger changes involve more engineering effort.

The Tier Table

The ESF uses a six-tier system based primarily on lines changed:

| Tier | Lines Changed | Multiplier |
| --- | --- | --- |
| Nano | 1-10 | 0.10x |
| Micro | 11-50 | 0.25x |
| Small | 51-150 | 0.40x |
| Medium | 151-400 | 0.60x |
| Large | 401-800 | 0.80x |
| XL | 801+ | 1.00x |

Lines changed includes both additions and deletions in the PR diff. Generated files, lock files, and other non-authored content are excluded from the count.
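The counting and lookup rules above can be sketched in Python. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the names `Tier`, `TIERS`, `count_lines_changed`, and `tier_for` are invented for this example, and `EXCLUDED_SUFFIXES` is a hypothetical stand-in because this page does not list which files count as generated or non-authored.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    name: str
    upper: float       # inclusive upper bound on lines changed
    multiplier: float


# Boundaries and multipliers copied from the tier table.
TIERS = [
    Tier("Nano", 10, 0.10),
    Tier("Micro", 50, 0.25),
    Tier("Small", 150, 0.40),
    Tier("Medium", 400, 0.60),
    Tier("Large", 800, 0.80),
    Tier("XL", float("inf"), 1.00),
]

# Hypothetical exclusion list (assumption): the real set is not given here.
EXCLUDED_SUFFIXES = (".lock", "package-lock.json", ".min.js")


def count_lines_changed(file_stats):
    """Sum additions and deletions, skipping excluded files.

    file_stats is an iterable of (path, additions, deletions) tuples.
    """
    return sum(
        added + deleted
        for path, added, deleted in file_stats
        if not path.endswith(EXCLUDED_SUFFIXES)
    )


def tier_for(lines_changed):
    """Return the first tier whose upper bound covers lines_changed."""
    if lines_changed < 1:
        # Assumption: the table starts at 1, so smaller counts are rejected.
        raise ValueError("expected at least one line changed")
    for tier in TIERS:
        if lines_changed <= tier.upper:
            return tier
    raise AssertionError("unreachable: XL has no upper bound")


stats = [("src/api.py", 120, 30), ("yarn.lock", 900, 850)]
lines = count_lines_changed(stats)
print(lines, tier_for(lines).name)  # 150 Small
```

The lock file is ignored, so the PR counts as 150 lines (120 additions plus 30 deletions) and lands at the top of the Small tier.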

Why Lines Are Primary

Lines changed is the primary signal for ESF because it most directly correlates with the amount of code an engineer had to write, understand, and test. While imperfect, it is a better proxy for effort than alternatives:

  • File count alone is misleading. A single-line change to 20 files (like updating an import path) is less effort than a 300-line change to 3 files.
  • Commit count is easily gamed and varies by workflow.
  • Time spent is not available from the PR data.

Lines are not the only signal, however. The file count serves as a secondary adjustment through the 2-tier gap rule.

The 2-Tier Gap Rule

Some PRs touch many files with few lines per file. A cross-cutting rename, a configuration change propagated across services, or a type definition update can touch hundreds of files while changing only a few dozen lines. The line count alone would classify such a PR as Micro (0.25x), which understates the coordination effort involved.

The 2-tier gap rule corrects for this:

How It Works

  1. Determine the Base Tier. Use lines changed to look up the tier from the table above.
  2. Determine the File Tier. Apply the same tier boundaries to the number of files changed (1-10 files = Nano, 11-50 = Micro, etc.).
  3. Compare. If the File Tier is 2 or more levels above the Base Tier, bump the ESF up by one tier.

Rules and Constraints

  • The maximum bump is +1 tier. Even if the file tier is 4 levels above, the adjustment is still only +1.
  • Files never reduce the tier. If the file tier is below the base tier, the base tier is used unchanged.
  • The bump applies to the tier, not the multiplier directly. A bump from Micro to Small means 0.25x becomes 0.40x.
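The three steps and their constraints can be sketched as a small Python function. This is an illustrative sketch, not the scoring tool's code: the names `tier_level`, `effective_tier`, and `esf` are assumptions made for this example, and the boundaries are copied from the tier table.

```python
# Index in each list = tier level (0 = Nano ... 5 = XL).
TIER_NAMES = ["Nano", "Micro", "Small", "Medium", "Large", "XL"]
TIER_UPPER = [10, 50, 150, 400, 800, float("inf")]  # inclusive upper bounds
TIER_MULTIPLIER = [0.10, 0.25, 0.40, 0.60, 0.80, 1.00]


def tier_level(count):
    """Map a count (lines or files) to a tier level using the shared boundaries."""
    for level, upper in enumerate(TIER_UPPER):
        if count <= upper:
            return level
    raise AssertionError("unreachable: XL has no upper bound")


def effective_tier(lines_changed, files_changed):
    """Apply the 2-tier gap rule and return the resulting tier level."""
    base_level = tier_level(lines_changed)
    file_level = tier_level(files_changed)
    # Bump by exactly one tier when the file tier is 2+ levels above the base.
    # A lower file tier never reduces the result, and the bump never exceeds +1.
    if file_level - base_level >= 2:
        return base_level + 1
    return base_level


def esf(lines_changed, files_changed):
    """Return (tier name, multiplier) after the file-count adjustment."""
    level = effective_tier(lines_changed, files_changed)
    return TIER_NAMES[level], TIER_MULTIPLIER[level]


print(esf(300, 3))    # ('Medium', 0.6)
print(esf(40, 200))   # ('Small', 0.4)
```

Because the bump acts on the tier level and the multiplier is looked up afterwards, the third constraint holds automatically. A bump also cannot run past XL: a gap of 2 or more is only possible when the base tier is Medium or lower.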

Worked Examples

Example 1: Standard Feature (No Bump)

A PR adds a new API endpoint with 220 lines changed across 6 files.

| Step | Calculation | Result |
| --- | --- | --- |
| Lines changed | 220 | Medium tier (151-400) |
| Files changed | 6 | Nano tier (1-10) |
| Gap check | Nano is below Medium | No bump |
| Final ESF | | 0.60x (Medium) |

The file tier (Nano) is below the line tier (Medium), so no adjustment is made.

Example 2: Cross-Cutting Rename (Bump Applies)

A PR renames a widely-used utility function. 35 lines changed across 24 files.

| Step | Calculation | Result |
| --- | --- | --- |
| Lines changed | 35 | Micro tier (11-50) |
| Files changed | 24 | Micro tier (11-50) |
| Gap check | Micro - Micro = 0 levels | No bump |
| Final ESF | | 0.25x (Micro) |

It may seem surprising that 24 files still counts as Micro, but the file tiers use the same boundaries as the line tiers: 1-10 = Nano, 11-50 = Micro. Both tiers are Micro, so the gap is 0 and no bump applies.

Now consider a different rename: 35 lines changed across 55 files.

| Step | Calculation | Result |
| --- | --- | --- |
| Lines changed | 35 | Micro tier (11-50) |
| Files changed | 55 | Small tier (51-150) |
| Gap check | Small - Micro = 1 level | No bump (need 2+) |
| Final ESF | | 0.25x (Micro) |

Still no bump: the gap is only 1, and the rule requires a gap of 2 or more.

Now: 35 lines changed across 180 files.

| Step | Calculation | Result |
| --- | --- | --- |
| Lines changed | 35 | Micro tier (11-50) |
| Files changed | 180 | Medium tier (151-400) |
| Gap check | Medium - Micro = 2 levels | Bump +1 |
| Final ESF | | 0.40x (Small) |

The file tier (Medium) is 2 levels above the line tier (Micro), so the ESF bumps from Micro (0.25x) to Small (0.40x).

Example 3: Shared Type Definition Update (Bump Capped at +1)

A PR updates a shared type definition. 8 lines changed across 450 files.

| Step | Calculation | Result |
| --- | --- | --- |
| Lines changed | 8 | Nano tier (1-10) |
| Files changed | 450 | Large tier (401-800) |
| Gap check | Large - Nano = 4 levels | Bump +1 (max) |
| Final ESF | | 0.25x (Micro) |

Even though the gap is 4 levels, the bump is capped at +1. Nano bumps to Micro (0.10x to 0.25x), not to Large.

Example 4: Full Score Calculation

Putting it all together. A PR titled "Add user notification preferences API" has a base score of 62 (Scope 14 + Architecture 12 + Implementation 15 + Risk 10 + Quality 8 + Performance/Security 3). The PR has 280 lines changed across 9 files.

| Step | Calculation | Result |
| --- | --- | --- |
| Lines changed | 280 | Medium tier (151-400) |
| Files changed | 9 | Nano tier (1-10) |
| Gap check | Nano is below Medium | No bump |
| ESF | Medium tier | 0.60x |
| Final Score | 62 x 0.60 | 37.2 |
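The same end-to-end calculation can be reproduced in a few lines of Python. This is a sketch under assumptions: the function names (`tier_level`, `esf`, `final_score`) and the dictionary layout are hypothetical, and the lookup uses the standard library's `bisect_left` against the inclusive upper bounds from the tier table.

```python
from bisect import bisect_left

UPPER_BOUNDS = [10, 50, 150, 400, 800]               # inclusive; above 800 is XL
MULTIPLIERS = [0.10, 0.25, 0.40, 0.60, 0.80, 1.00]   # Nano ... XL


def tier_level(count):
    """Return 0 (Nano) through 5 (XL) for a line or file count."""
    return bisect_left(UPPER_BOUNDS, count)


def esf(lines_changed, files_changed):
    """Look up the multiplier, applying the 2-tier gap rule (max +1 tier)."""
    level = tier_level(lines_changed)
    if tier_level(files_changed) - level >= 2:
        level += 1
    return MULTIPLIERS[level]


def final_score(components, lines_changed, files_changed):
    """Final Score = Base Score x ESF, rounded to one decimal place."""
    base_score = sum(components.values())
    return round(base_score * esf(lines_changed, files_changed), 1)


components = {
    "Scope": 14,
    "Architecture": 12,
    "Implementation": 15,
    "Risk": 10,
    "Quality": 8,
    "Performance/Security": 3,
}

print(sum(components.values()))          # 62
print(final_score(components, 280, 9))   # 37.2
```

Rounding to one decimal place is a choice made for this sketch so that floating-point noise does not leak into the displayed score; the page itself does not state a rounding rule.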

ESF and Score Interpretation

The ESF means that small PRs will almost always have low final scores, regardless of their base score. This is by design:

  • A Nano PR (1-10 lines) with a perfect base score of 100 would get a final score of 10.
  • A Medium PR (151-400 lines) with a base score of 50 would get a final score of 30.
  • An XL PR (801+ lines) with a base score of 50 would get a final score of 50.

This reflects reality: shipping a large, complex change is more work than shipping a small, complex change. The base score captures the nature of the complexity; the ESF captures the scale of the effort.
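To see how strongly the tier drives the final number, the short sketch below holds the base score fixed and varies only the tier. The helper name `scores_by_tier` is an assumption for illustration; the multipliers come from the tier table.

```python
# Multipliers from the tier table, smallest tier first.
TIERS = [
    ("Nano", 0.10),
    ("Micro", 0.25),
    ("Small", 0.40),
    ("Medium", 0.60),
    ("Large", 0.80),
    ("XL", 1.00),
]


def scores_by_tier(base_score):
    """Return {tier name: final score} for one base score across all tiers."""
    return {name: round(base_score * mult, 1) for name, mult in TIERS}


for name, score in scores_by_tier(50).items():
    print(f"{name:<6} {score:>5.1f}")
```

For a base score of 50 this yields 5.0, 12.5, 20.0, 30.0, 40.0, and 50.0 from Nano through XL, matching the Medium and XL bullets above.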

When analyzing scores, keep in mind that final scores below 15 typically indicate small changes (low ESF) rather than low-quality work. Look at the base score breakdown for a complete picture of complexity.

Frequently Asked Questions

Why not just count lines of code?

Counting raw lines of code rewards verbosity and punishes elegance. A developer who deletes 500 lines of dead code shows up as negatively productive by a net-lines metric. The ESF uses lines changed as its primary size signal and file count as a secondary adjustment; it accounts for PR size without treating lines as a direct measure of value.

How does the ESF handle large refactors that delete code?

Deletions count as lines changed, just like additions. A refactor that removes 400 lines and adds none is a Medium-tier PR (151-400 lines) and receives a 0.60x multiplier. The ESF captures effort, and deleting code thoughtfully is real engineering work.