Score Your AI Fluency
We built an open Claude skill based on Anthropic's 4D AI Fluency Framework that analyzes your conversation history and scores how effectively you collaborate with AI. Here's what we learned running it on ourselves.
Insights on engineering velocity, AI-powered measurement, and what makes teams ship faster.
AI agents now write, test, and iterate on code autonomously. Engineers are becoming orchestrators, not typists. Every existing metric is blind to this shift.
Every engineering leader has been asked "how productive is your team?" and felt their stomach drop. The honest answer is: we don't know. Here's why — and how to fix it.
SPACE, DORA, DevEx — the major developer productivity frameworks explained, what they miss, and an updated framework for measuring engineering in 2026.
Story points promised to predict engineering capacity. They never delivered. Here's why — and what actually works.
You bought Copilot seats for your entire team. Some engineers doubled their output. Others didn't change at all. You can't tell which is which. Here's how to fix that.
Commit counts, lines of code, and story points all fail when AI writes code. Here's the measurement approach that works — with real data from our team.
Engineering went from 5% of headcount to 30-50%. Boards want the same visibility they get for sales. The tools to deliver it finally exist.
DORA metrics started as research. They became a religion. Somewhere along the way, we stopped asking whether our teams are shipping great software and started asking whether our pipelines look fast enough on a dashboard.
Most engineering KPIs measure activity, not outcomes. Here are the KPIs that actually tell you something useful about your team's productivity.
Software estimates are wrong because the information required for accurate estimation doesn't exist until you're inside the work. This isn't a calibration problem — it's a fundamental limitation.
DORA metrics measure how fast your pipeline runs. They don't measure what's moving through it. Here's what's missing — and why it matters.
Lines of code rewards verbosity. Commit counts reward noise. PR counts reward splitting. These metrics feel precise and measure nothing. Here's why.
Looking for Jellyfish alternatives? Here are the engineering analytics tools worth evaluating in 2026 — from free AI-powered scoring to DORA metrics platforms.
Comparing GitVelocity and Sleuth — one scores the code inside deployments, the other tracks deployment health. DORA tells you speed; AI tells you substance.
Comparing GitVelocity and DX — one uses AI to score shipped code, the other uses surveys to capture developer sentiment. Objective output vs. subjective experience.
Pluralsight Flow tracks active days and code churn. GitVelocity scores code complexity with AI. Compare them on metrics, pricing, and what actually matters.
Comparing GitVelocity and Waydev — one measures what your engineers ship, the other tracks how active they are. Different questions, different tools.
Comparing GitVelocity and Hatica — one scores what engineers ship, the other tracks how they feel while shipping it. Both dimensions matter.
Comparing GitVelocity and Swarmia — one scores what your team ships, the other optimizes how your team works. They might be better together than apart.
Comparing GitVelocity and LinearB — one measures what you ship, the other optimizes how you ship it. Here's how to decide which you need.
Comparing GitVelocity and Jellyfish — two engineering platforms that measure fundamentally different things. One scores shipped code, the other tracks resource allocation.
Every AI code reviewer has blind spots. Here's how to build a layered review stack that catches what single-tool setups miss.
Code review went from optional to mandatory to bottleneck. AI tools are unbottlenecking it — but the real question is what you optimize for.
After studying engineering teams across 30+ portfolio companies, five patterns separate high-output teams from everyone else.
I reviewed every PR with the same scrutiny — a one-line config fix got the same ceremony as a database migration. It burned out my reviewers.
Engineering management tools evolved in waves: spreadsheets, Jira, DORA dashboards, AI analytics. Here's what the latest wave unlocks.
License dashboards and surveys don't tell you if AI tools are working. Track engineering output instead — here's the framework with real team data.
A practical metrics framework for tracking AI tool adoption across your engineering org. Lagging indicators, leading indicators, and the red flags to watch for.
CTOs are spending $50-200/seat/month on AI tools and can't prove the value. Here's a concrete framework for building the ROI case your board actually needs.
Cursor has no built-in productivity dashboard. Here's how to measure its real impact on engineering output and build the ROI case for your CFO.
You bought Claude Code seats. How do you know it's working? Seat usage doesn't cut it. Here's how to measure actual ROI from AI-assisted development.
Should you build engineering analytics in-house or buy a platform? The honest trade-offs most vendors won't tell you about.
Jellyfish, Swarmia, LinearB, DX, Waydev, Hatica, Sleuth — compared by what they actually measure and which category fits your team.
The tools every engineering manager actually needs in 2026 — from output measurement to AI code review to project tracking.
AI analytics moved past vanity dashboards into real code evaluation. Here's the landscape, three distinct categories, and how to evaluate what works.
Six dashboards that actually drive decisions — what each shows, what decisions it enables, and what healthy vs. unhealthy patterns look like.
A candid look at the engineering analytics landscape in 2026 — what each tool actually measures, who it's for, and which approach works in the AI era.
The junior vs. senior debate is outdated. AI fluency matters more than years of experience. Here's what our velocity data actually shows.
Traditional interviews test skills AI makes obsolete. Here's how to evaluate what actually matters: decomposition, evaluation, and shipping speed.
The all-seniors strategy costs more, ships less, and breaks down in the AI era. Here's what the data shows and how to fix your team composition.
When we rolled out GitVelocity internally at Headline, the reaction followed a predictable arc: skepticism, testing, acceptance, then something we didn't expect — competition.
Most engineering metrics are terrible. Engineers are right to resist them. But the answer isn't no measurement — it's better measurement.
Every engineering metric ever invented has been gamed. Story points got inflated. Commits got split. Lines of code got padded. Here's why scoring actual code is different.
Every merged PR gets a 0-100 complexity score. Here's exactly how the scoring works, why it matters, and what the numbers mean — no black boxes.