
This article highlights the importance of QA metrics and the testing tools that capture them, showing how accurate knowledge and measurement help improve project management, track progress, monitor testing activities, and raise software quality.
QA metrics are used to evaluate and assess the quality and effectiveness of software development processes, products and testing activities. These metrics quantify various aspects of software quality and provide valuable information on the efficiency, reliability and overall performance of development and testing work. QA (quality assurance) metrics are used to monitor and control the quality of software throughout the software development lifecycle (SDLC) and the software testing lifecycle (STLC). They can be applied at various stages of the development process, including requirements gathering, design, coding, testing, and deployment. By tracking these metrics, organizations can identify areas for improvement, make data-driven decisions, and ensure that the software meets the required quality standards. There are two types of metrics:
Quantitative metrics are exactly what the name suggests: raw numbers, each measuring a single aspect of the quality assurance process.
Quantitative metrics alone cannot provide a complete picture of a QA team’s performance. For example, the average number of bugs per test tells little on its own unless it is seen in context, for instance alongside the total number of tests executed and the average time to execute each test. Qualitative metrics help in this regard by linking various relevant metrics together to provide a nuanced picture of a team’s speed, accuracy, or efficiency. In short, QA testing tools provide insight into the quality of the software developed and the team’s progress against the test plan. To achieve this insight, however, your QA testing tool must be able to track several key metrics. Although the testing metrics your organization captures may evolve, your team should consider the following:
This metric should answer the question: how many tests are we running and which areas of the software do they cover? Test coverage, calculated as a percentage, defines how much of the application is verified by existing tests. It’s easy to calculate this using two quick formulas:
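For example, two commonly used formulas are:
Test execution coverage = (number of tests executed / total number of tests planned) x 100
Requirement coverage = (number of requirements covered by at least one test / total number of requirements) x 100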
The second formula is particularly important for verifying that QA checks all (or most) software features. For example, simply running 500 tests does not by itself guarantee a high-quality product. Tests must cover critical user journeys, the performance of core features, and obvious customer preferences.
This metric assesses the variety and coverage of test data used during testing. Comprehensive test data coverage is essential to simulate real-world scenarios and uncover potential errors that could arise under different data conditions. It ensures that the application is tested on different data sets and helps to identify data-related issues.
The main reason for the existence of QA is to prevent most (or ideally all) bugs from getting into production. Ideally, customers should not detect and report any major bugs after an application or feature goes live. Therefore, the number of escaped bugs should be the primary metric for assessing the entire QA process. If serious bugs repeatedly leak and disrupt the user experience, you may need to reevaluate your test suites. Fortunately, when customers report bugs, you can quickly identify problem areas and patterns instead of having to re-examine entire architectures. Realistically, however, it is not possible to identify and resolve every possible bug before release. You can, however, settle on an acceptable number of quickly fixable bugs that won’t bother the customer too much.
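A common way to express this as a percentage: Escaped defects (%) = (number of bugs found in production / total number of bugs found) x 100.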
Tracking the number of bugs that occur in tests covering individual requirements is particularly useful. This QA metric can reveal whether some requirements are riskier than others, helping product teams decide whether those features should be released. If testing a particular requirement reveals too many bugs, it may actually reveal problems with the requirement itself. Of course, it’s possible that the test cases themselves require refactoring, but rarely will more bugs be discovered because of errors in the test structure. For example, if the tests for requirement A generate 38 errors while the tests for requirement B generate only 7, this is a signal to IT testers to investigate whether requirement A requires modified tests. It is also a signal that the requirement may not be realistically deployable in its current state.
Evaluating test effort requires taking several other indicators into account. These sub-metrics reflect how many tests are performed and for how long. Test effort numbers, generally calculated as averages, help you decide whether you run enough tests and catch enough bugs. Some important numbers:
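Typical examples include: the average number of tests run per day, the average time needed to test each requirement, the average number of bugs found per test, and the average number of bugs found per requirement.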
A perfect test suite has the following characteristics:
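Typically, such a suite fails only when it encounters a genuine bug (no false positives), passes only when the feature under test actually works (no false negatives), and produces the same result on every run against unchanged code (no flaky tests).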
The closer the suite of tests is to the above criteria, the more reliable it is. Here are some important questions:
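For example: do failed tests correspond to real defects? Do passing tests ever let bugs slip through to later stages? How often do tests fail because of the environment or the test design rather than the code under test?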
Test reliability monitoring is necessary to establish confidence that QA is adequately testing the software and doing its job. Like all effective QA metrics, it helps testers continuously improve existing test cases, scenarios, and procedures.
This metric reveals how quickly a team or tester can create and execute tests without affecting the quality of the software. Of course, this metric will vary between manual and automated test cycles, with the latter being executed much faster. In addition, the tools and frameworks used for quality assurance also have a real impact on the time it takes to test. It can be challenging to combine these numbers, so use the following averages:
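For example, you can track the average time needed to design a test case and the average time needed to execute it.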
Once you have the initial numbers for this QA team performance metric, you can incorporate best practices and update tools to improve both averages. Keep in mind that reducing average times means nothing if it lowers quality standards.
Most QA teams have to work within limited budgets. In order to justify their spending, they need to keep a close eye on how much they plan to spend and how much they end up spending. There are two main numbers here:
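Typically, these are the total allocated (budgeted) cost of testing and the actual cost of testing. Dividing either number by the number of requirements gives a cost per requirement tested: Cost per requirement = total cost of testing / number of requirements to be tested.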
For example: if your total allocated cost is 2000 euros and you have to test 200 requirements:
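Budgeted cost per requirement = 2000 / 200 = 10 euros per requirement. If the actual spend works out to more than 10 euros per requirement, the testing effort was underestimated; if it is lower, there is room left in the budget.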
The above example assumes that testing every requirement takes the same amount of time and money.
Simply put, cost per bug fix is the amount of money spent for the developer to fix each bug. The formula is as follows:
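A common formulation: Cost per bug fix = total amount spent on fixing bugs (for example, developer hours spent on fixes multiplied by the hourly rate) / number of bugs fixed.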
At any point in time, you should be able to get accurate information about how many tests have passed, failed, are blocked, incomplete or pending. This metric, represented as numbers or percentages, is needed for daily/weekly reporting. It’s also a quick snapshot of the team’s average efficiency, as these numbers can be compared to previously established benchmarks. Quick tip: For easier reporting, convert test execution status numbers into visual aids such as bar or pie charts.
Often when a new feature is added or an existing feature is changed, testing these changes will reveal bugs that did not exist in previous tests. For example, if you’ve added another button to a web page, tests may show that the previous buttons (which were rendering fine) are now crooked and have misaligned text. In other words, the bugs appeared solely because of the new change. If, say, five changes were made and 25 bugs appeared after testing, you can attribute approximately five bugs to each change. Of course, it is always possible that one change introduced more bugs than the others. If you study this QA metric long enough across multiple projects, you can make predictions about what bugs you and your team can expect to see with each change. With these numbers in hand, your team can better plan their time, resource investment, and availability when starting new testing cycles.
At the end of the test cycle, it is important to map how many bugs exist and where they come from. This will reveal whether the QA team is making progress in identifying and resolving more bugs as they work on more cycles. Breaking down bugs based on their origin also helps pinpoint which areas need more attention. Some common categorizations here are:
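For example: by platform or environment, by browser or device, by feature or module, by severity, or by root cause (requirements, design, code, test environment).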
If the number of defects is increasing in a certain category, it is easier to determine the cause. For example, if more bugs appear on one platform, it may mean that the software requires more optimization for that particular environment.
The bugs found vs. bugs fixed metric is one of the key metrics for assessing the effectiveness of the QA process. It maps the number of bugs found against the number of bugs fixed and provides a comparison that objectively shows whether QA is accomplishing its core task. This analysis is also useful for identifying patterns in how defects are discovered and eliminated, and it provides important insight into the current stage of defect management. To get this number, you must first track the number of bugs found and resolved each day of the test cycle. For example, let’s say you have a five-day test cycle and you have collected the following numbers:
Test cycle date | Bugs created | Bugs resolved | Total bugs created to date | Total bugs resolved to date |
01-09-2024 | 6 | 4 | 6 | 4 |
02-09-2024 | 3 | 0 | 9 | 4 |
03-09-2024 | 4 | 4 | 13 | 8 |
04-09-2024 | 2 | 4 | 15 | 12 |
05-09-2024 | 2 | 3 | 17 | 15 |
By the end of the test cycle, 17 bugs were created/identified and 15 were resolved. Compare this to previous test cycles and you can determine whether or not testers are improving at finding and resolving bugs.
This metric reveals how effective the development team is at analyzing and fixing bugs reported by QA teams. Although resolving bugs should ideally not be a QA issue, tracking this number can help explain delays in product deliveries – which is especially useful in conversations with management. To calculate this number, track the total number of bugs reported to the development team and the total number of bugs fixed within the test cycle. Then use this formula:
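A common formulation: % of resolved defects = (total number of bugs fixed by the development team / total number of bugs reported to the development team) x 100.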
Again, track the % of resolved defects over time to verify that QA is providing the desired results for the SDLC.
Defect age measures the average time it takes developers to fix a defect, from the moment the defect is reported to its actual resolution.
In general, the age of the defect is measured in days. Let’s say a defect was identified on 9 April 2023 and corrected on 26 April 2023. In this case, the age of the defect is 17 days. A gradually decreasing defect age is a strong indicator of the maturity of the QA team. It means that with each test cycle, defects take less time to fix.
This number, derived as a percentage, gives an indication of the effectiveness of the test cases in detecting bugs. In other words, how many test cases executed by your QA team successfully identified bugs in one test cycle? The formula is simple:
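A common formulation: Test case effectiveness = (number of bugs detected / total number of test cases executed) x 100.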
An important measure of the quality of test cases, this number should gradually increase over successive test cycles. It is one of the most obvious indicators of team performance.
In this case, the number of bugs that escape to the User Acceptance Testing (UAT) phase is tracked. Basically, it is the number of bugs that appear in UAT after the application has gone through multiple levels of testing. Ideally, test cases should filter them out before potential users even touch the product. Calculate it as follows:
Defect leakage = (total number of defects in UAT/total number of defects found before UAT) x 100
Test Case Productivity measures the effort required to create test cases for a particular sprint/cycle. You probably won’t have to report this metric to management, but measuring it helps in setting realistic expectations for your team. The formula is as follows:
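A common formulation: Test case productivity = number of test cases prepared / total effort spent preparing them (for example, in person-hours). Its inverse gives the average effort required per test case.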
Obviously, the “effort required per test case” will not be an accurate number. Some test cases require more design work than others. However, you can ask your testers to provide a fair average. This metric will give you an idea of what is reasonably achievable for the team per cycle.
Not every test case that the team has designed will be carried through to completion. Some tests will pass, some will fail, and some will end up not being executed or blocked. Tracking the completion status of tests is therefore another KPI of the team’s overall performance. Several formulas combine to provide an overall picture of test completion status:
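Commonly used formulas include:
% of passed (successful) test cases = (number of test cases passed / total number of test cases executed) x 100
% of failed test cases = (number of test cases failed / total number of test cases executed) x 100
% of blocked test cases = (number of test cases blocked / total number of test cases) x 100
% of test cases not executed = (number of test cases not executed / total number of test cases) x 100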
With these numbers in hand, you can quickly assess the current state of operations. For example, if the % of successful test cases is lower than the % of blocked test cases, there may be a fundamental problem with the test case design or the test environment. Now you know what problem to focus on to improve your results for the next sprint.
Although test cases may flag bugs, each such flag requires some review by a tester, even if it only takes a few minutes (and it usually takes longer). Depending on the software and its stage of development, tests can return a large number of bugs. The time to review each one adds up, so test review efficiency needs to be calculated.
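A common formulation: Test review efficiency = (number of reported bugs reviewed and escalated for resolution / total number of bugs found) x 100.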
Of course, the formula for this QA metric must be used in the context of a specific duration. Let’s say that in a test sprint lasting 7 days, 58 bugs were found, but due to the nature of these bugs, the team could only review and escalate to resolution 45 of them; the test review efficiency is then 77%.
If you measure the QA metrics described above, you will have a clear picture of how your test teams work and the value they deliver. The need to measure QA team performance cannot be overstated. Like any investment, QA must show a reasonable return in order to justify its existence in any SDLC. Fortunately, the necessity and effectiveness of the QA function has been proven countless times, as long as evolving practices are followed. If you are an IT tester or IT automation tester and you speak German, check out our employee benefits and respond to our latest job offers.