Keeping Tests Valuable: Are Code Coverage Metrics Trustworthy?
When a measure becomes a target, it ceases to be a good measure. – Charles Goodhart
Over the years, codebase quality and test coverage metrics have been a constant topic in the Software Engineering community. We keep trying to find the right balance. Many projects enforce specific types of metrics in different ways. Let’s take a moment to dive into some key questions:
• Should we set strict coverage goals?
• Does code coverage really indicate quality?
• How can we use coverage metrics to our advantage?
Let’s talk about it!
Goodhart’s Law
Have you ever heard of Goodhart’s Law? British economist Charles Goodhart once stated:
“When a measure becomes a target, it ceases to be a good measure.”
Now, here’s the question: Can turning metrics into targets be harmful in the context of software engineering—specifically when it comes to code coverage? The answer is yes. Let me explain with an example.
Imagine a development team working on a high-stakes delivery for a major client in the financial sector. The goal is to build an enterprise app to streamline critical internal processes. The QA team, along with management, sets a minimum code coverage threshold of 97% for the project to pass through the pipeline. The logic is simple: the more lines covered, the fewer bugs. From that point on, code coverage becomes the main metric driving the project.
After a few weeks, what happens? Several features come back full of bugs. Management starts to ask:
“If the metrics are where we expected, why are so many features failing?”
The answer? Focusing only on metrics like code coverage hides real problems.
The Problem in Practice
Instead of improving test quality and focusing on validating system behavior, the team starts writing tests just to meet the coverage goal. The result? Tests that don’t cover real-world scenarios, but rather just lines of code—the typical happy paths. This issue becomes especially critical in large and complex projects.
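To make that concrete, here's a minimal sketch of what such a test tends to look like. The class, the business rule, and the test framework (xUnit) are all invented for illustration; the point is only that a single happy-path test marks most lines as covered while ignoring every other branch:
using System;
using Xunit;

// Hypothetical business rule, made up purely for this illustration.
public class DiscountCalculator
{
    public decimal Apply(decimal price, bool isPremiumCustomer)
    {
        if (price < 0)
        {
            throw new ArgumentException("Price cannot be negative.", nameof(price));
        }

        return isPremiumCustomer ? price * 0.9m : price;
    }
}

public class DiscountCalculatorTests
{
    // A "coverage test": it executes the happy path, so most lines count as
    // covered, but the negative-price branch and the non-premium branch are
    // never exercised.
    [Fact]
    public void Apply_covers_lines_but_not_behavior()
    {
        var calculator = new DiscountCalculator();

        var discounted = calculator.Apply(100m, isPremiumCustomer: true);

        Assert.Equal(90m, discounted);
    }
}
A coverage report will happily count those executed lines, even though the test says nothing about what should happen for a regular customer or an invalid price.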
Without goals based on real outcomes or tangible objectives, chasing a high code coverage percentage can simply hide critical issues. And the situation gets worse when combined with factors like:
• Poor management
• Lack of documentation
• Weak understanding of the software domain
• Misaligned squads working together
This creates the perfect environment for:
• Financial losses
• Customer dissatisfaction – whether they’re end users or corporate clients.
A Common Problem
I’ve worked on projects where no one really knew how to design proper test scenarios for each feature. The result? Unit tests were written based on gut feeling. Something like:
• “I think this is what we should be testing.”
Or worse:
• “We don’t need that many tests here, just cover the lines.”
Only covering the happy path is a mistake! We need to understand what we’re testing and why it matters to the business. That’s why it’s so important to ask questions during meetings, dig into the business rules, and truly understand the product you’re working on.
Setting metrics as targets doesn’t guarantee quality. Especially in software engineering, code coverage should be treated as an outcome, not a goal.
If you’re a project manager, tech lead, senior dev, or CTO—here’s a tip: quality should be grounded in solid results! Code coverage is just a report that supports the development cycle.
What kind of results should we aim for?
• Features with fewer bugs (maybe even bug-free!), approved for production with no unexpected issues
• Increased productivity
• Greater confidence when refactoring or building new features—because the tests are reliable
Coverage is a natural result, not a target. That’s what leads to robust software and a productive team.
Is Setting Code Coverage Percentages Really That Important?
That’s an interesting question, right? After all, everyone seems to have an opinion about it. I’ll admit—on many projects I’ve worked on, code coverage percentages were always a requirement. Some companies asked for 80% across the entire codebase. Others went further, demanding over 90% for each class. And there were even those ultra-specific ones: domain classes with business logic needed to hit 100% coverage.
So, what can we actually learn from these code coverage metrics?
The Value of Metrics
First of all, they’re useful—within the right context. Overly rigid metrics can lead to the kind of problems we discussed earlier. But on the other hand, having realistic metrics, aligned with each client’s needs and focused on quality, helps maintain standards and guides the project in the right direction.
Now, let’s be honest: if there’s no minimum coverage requirement, will developers actually bother writing tests?
In practice, we know the answer is often no. A lot of developers just don’t like writing tests—and I used to be one of them. A while back, I hated writing tests, especially for those edge cases that felt unlikely to happen.
But let me tell you something: any developer who wants to grow and mature needs to take testing seriously. These days, I see tests not as some boring phase in the development cycle, but as something essential.
They bring confidence, quality, and often help us catch issues before the client does.
Realistic Coverage Goals
So back to the question: Is having realistic code coverage percentages important?
My answer is still yes.
But here’s the catch: those percentages need to be achievable and realistic. Pushing teams toward unrealistic goals only creates stress, shifts focus away from what really matters, and messes with timelines. On the flip side, working with practical standards—defined together with domain experts and project managers—can be a valuable strategy.
There’s a quote that really captures this well, from Unit Testing Principles, Practices, and Patterns by Vladimir Khorikov:
“It’s good to have a high level of coverage in the critical parts of your system. It’s bad to make that high level a requirement. The difference is subtle but critical.”
My Take
Let me share something I find important: I see more value in coverage percentages per file or class than in a generic target for the entire project. But hold on—there’s a catch! As we’ll see later in this article, blindly trusting line coverage can also be risky.
Still, setting specific coverage targets for critical classes or layers—especially those that hold business rules—can bring real benefits. Why? Because it pushes developers to better understand these more complex and business-critical parts.
And if there’s one thing we’ve learned in software engineering, it’s that domain classes usually have more paths, higher cyclomatic complexity, and naturally, require more attention.
Let’s dig deeper into branch coverage and why it matters so much!
Branch Coverage
Instead of relying on the raw number of lines covered, this metric focuses on control structures, such as if and switch statements. It tells us how many of these control branches are exercised by at least one test in the suite. Here’s an example using .NET:
using System.Linq;

public class Phone
{
    // Validates a phone number: digits only, between 11 and 13 characters.
    public bool isValid(string phone)
    {
        if (phone == null)
        {
            return false;
        }

        if (!phone.All(char.IsDigit))
        {
            return false;
        }

        if (phone.Length < 11)
        {
            return false;
        }

        if (phone.Length > 13)
        {
            return false;
        }

        return true;
    }
}
Now take a look at the generated report, paying special attention to the branch metrics reported for this file alone:
The image above is a great example of how code coverage reports can be useful. Notice how the uncovered branches are clearly highlighted, indicating exactly where we need to work. This visualization makes it easy to identify what’s missing – and in this case, pretty much everything still needs to be covered!
No if or switch statement can be overlooked. And if you did miss something? The report is there to point it out and show you exactly what's missing. It's like having an assistant reminding you of what still needs attention.
Branch Coverage vs. Code Coverage
Now, it’s important to understand this: branch coverage is a part of code coverage. Think of code coverage as a broader concept, broken down into several criteria—with branch coverage being one of them.
Branch coverage is more specialized, focusing on making sure that every decision point or path in the code is tested. For example, in the case of the image, each condition inside an if statement needs to be evaluated to see if it has been tested under all possible scenarios.
Balance Is Key
I’m not going to say that branch coverage is “better” or “more useful” than other types of coverage. It all comes down to balance. Line coverage, for example, goes hand in hand with branch coverage. One doesn’t replace the other—they complement each other.
So, what’s the takeaway here? Always pay attention to uncovered branches in your code. Ask yourself:
• Why wasn’t this branch covered?
• How can I make sure it gets tested?
Answering these questions leads to stronger, more reliable tests. After all, no detail should be overlooked!
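To tie this back to the Phone example, here's a minimal sketch of a suite that exercises every branch of isValid, one scenario per decision point. xUnit and the sample phone numbers are my assumptions; any test framework works the same way:
using Xunit;

public class PhoneTests
{
    private readonly Phone _phone = new Phone();

    // One test per branch: null, non-digit characters,
    // too short, too long, and finally the valid path.
    [Fact]
    public void Returns_false_when_phone_is_null()
        => Assert.False(_phone.isValid(null));

    [Fact]
    public void Returns_false_when_phone_contains_non_digits()
        => Assert.False(_phone.isValid("11a9999-999"));

    [Fact]
    public void Returns_false_when_phone_is_too_short()
        => Assert.False(_phone.isValid("1199999"));

    [Fact]
    public void Returns_false_when_phone_is_too_long()
        => Assert.False(_phone.isValid("55119999999999"));

    [Fact]
    public void Returns_true_for_a_valid_phone()
        => Assert.True(_phone.isValid("5511999999999"));
}
With a suite like this, the branch column in the report fills up because every decision has been pushed down both paths, not because more lines happened to run.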
The Problems with Relying Completely on Code Coverage Percentages
Some people believe that when a refactoring shrinks the code and the line coverage percentage goes up, the tests have automatically improved. That doesn't follow. Here's an example:
using System.Linq;

public class Phone
{
    // Same validation rules as before, condensed into fewer lines.
    public bool isValid(string phone)
    {
        if (phone == null) return false;

        return phone.All(char.IsDigit) && phone.Length >= 11 && phone.Length <= 13;
    }
}
I made a small refactoring without changing anything in the unit tests. Now let's compare the metrics the coverage tool reports before and after the refactoring:
BEFORE REFACTORING
AFTER REFACTORING
The percentage of covered lines increased—from 46% to 80%.
Does that mean there was progress? An improvement? No!
We simply reduced the number of lines of code, but the behaviors—the important branches—are still not covered by tests!
If you look closely at the before-and-after screenshots, you’ll notice that the branch coverage (the paths the algorithm can take) remains the same, even after the refactoring.
And that’s one of the major issues when working with code coverage. If we act without thinking or analyzing carefully, we can easily be misled!
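The original test suite isn't shown here, but a hypothetical pair of tests like the sketch below reproduces the kind of situation in the screenshots: it exercises only the null check and one valid number, so the non-digit, too-short, and too-long branches stay untested no matter how compact the production code becomes (xUnit and the sample number are assumptions; the exact percentages depend on how the tool counts coverable lines):
using Xunit;

public class PhoneHappyPathTests
{
    private readonly Phone _phone = new Phone();

    // Only two scenarios: null input and one happy path.
    // The non-digit, too-short, and too-long branches are never exercised,
    // regardless of how the production code is formatted.
    [Fact]
    public void Returns_false_when_phone_is_null()
        => Assert.False(_phone.isValid(null));

    [Fact]
    public void Returns_true_for_a_valid_phone()
        => Assert.True(_phone.isValid("5511999999999"));
}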
Manipulating coverage numbers is surprisingly easy. The more compact your code is, the better your line coverage will look, because that metric only considers the ratio of covered lines to total coverable lines.
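Some purely illustrative numbers show how the percentage math behaves. Line coverage is simply covered lines divided by coverable lines. If a file has 13 coverable lines and the tests execute 6 of them, that's roughly 46%; collapse the same logic into 5 coverable lines of which the same tests execute 4, and the report now shows 80%, even though not a single new scenario was added. Only the denominator changed.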
That’s why careful analysis from developers and testers is essential, always paying close attention to the quality of the tests being written. What does this prove?
Code coverage is not directly related to quality.
Just because your code is 100% covered doesn’t mean your tests are actually good.
Everything we’ve discussed so far reinforces the idea that coverage is just a tool—meant to support the creation of effective, high-quality tests.
And for a test to be considered high quality, it needs to meet several important criteria. For example, it should be:
• Readable
• Well-organized
• Easy to understand
• Free from dozens of asserts
• And most importantly: focused on testing real feature behavior
So, to finally answer the question:
Does code coverage indicate quality?
No, based on everything we’ve seen.
This topic is closely related to what’s coming next, so I’ve dedicated a special section to it.
Reaching a certain coverage percentage should not be a reason to push code to production.
Don’t Use Code Coverage as a Release Criterion
Let’s make this crystal clear.
Oh—and let me share something I’ve seen very rarely in my career: using code coverage as a criterion to release features to production.
And why is it important to talk about this?
Because it’s a major mistake.
Relying on coverage percentages to release critical features can be disastrous.
Code needs to go through several layers of quality checks, including a thorough code review, done by project developers—and, if possible, by the client too (if they have a technical team).
You also need a QA analyst with the skills to spot things from a different angle—someone who might catch what the dev team didn’t fully grasp or validate.
Now, if the only requirement to release code is hitting a certain coverage percentage…
Sorry to say it, but that’s amateur hour.
Large organizations need robust and reliable processes.
Here’s a warning: avoid future headaches.
Build a reliable, high-quality pipeline for your software.
Why Coverage Shouldn’t Be the Only Criterion
To drive this point home, think about the software you work on today.
What would happen if the only requirement to release a new feature was reaching a coverage threshold?
At first, it might be easy to fix bugs caused by a lack of proper tests.
But as the software grows, complexity increases.
And guess what?
Eventually, shipping new features will become a nightmare.
In the worst-case scenario, it could lead to the end of the project—or the product itself.
In fact, Vladimir Khorikov, in Unit Testing Principles, Practices, and Patterns, warns about this very thing:
“Code coverage should never be a goal. It’s just a side effect of good tests. Focusing on coverage can promote bad practices, like writing tests that don’t validate actual behavior.”
This mindset reinforces the idea that coverage is just a tool to support development—not the final objective.
What to Do Instead
It’s essential to have rigorous code reviews and solid acceptance criteria before releasing code to staging or production. These processes are far more effective at ensuring quality than blindly relying on coverage percentages.
Martin Fowler, one of the most respected voices in software engineering, also touches on this in his article Test Coverage:
“It’s not test coverage that ensures quality, but what you’re testing and how. The focus should be on behaviors and critical scenarios—not on the numbers.”
The idea is clear: prioritize the quality of your tests, not just how much of your code they cover.
But hold on—this topic is so important, it definitely deserves a dedicated post of its own.
How Can We Use Code Coverage Metrics to Our Advantage?
We’ve talked a lot about the challenges and pitfalls of code coverage—but how can we actually use this tool effectively? Here are some practical suggestions:
1. Focus on critical areas of the code
Use coverage metrics to shine a light on complex and business-critical parts of the codebase. These areas tend to have a greater impact on system behavior and deserve extra attention.
2. Write test scenarios for low-coverage files
If the team identifies files with zero or very low coverage, especially in terms of branches, that’s a great opportunity to write meaningful test scenarios that validate real behavior.
3. Emphasize the importance of testing code branches
Remember: branches represent the paths your algorithm can take. Structures like if/else need to be carefully tested to ensure all possible scenarios are validated.
4. Don’t blindly trust the coverage report
The report is useful—but don’t assume your tests are good just because coverage is high. As we’ve discussed, 100% coverage doesn’t equal quality.
5. Avoid setting unrealistic coverage goals
It’s not productive to demand high coverage percentages if test quality remains poor. Focus on quality over quantity.
6. Use coverage reports to identify cyclomatic complexity
Coverage tools can reveal cyclomatic complexity, showing whether a class or method may need refactoring. Refactoring improves readability and reduces risk.
7. Leverage coverage metrics during code reviews
Coverage metrics are great allies during code reviews. They help reviewers focus on less-tested areas and spot potential risks.
8. Never ignore complementary test scenarios
Review the reports and percentages carefully, but always remember: there are scenarios the coverage tool doesn’t show. Never rely solely on metrics to determine whether your tests are sufficient.
Find the Balance Point
In the end, every project will have its own demands, and it’s up to the team to define the right standards for code coverage. The suggestions above are lessons I’ve learned from experts and from my own experiences on both national and international projects. The key lies in striking the right balance between metrics and quality.
Thanks so much for reading all the way through!
Big hug, and see you next time! 🙂