20 essential QA software testing metrics
This article highlights the role of QA testing metrics in software development, showing how accurate measurement helps teams improve project management, track progress, monitor testing activities, and raise software quality.
What are quality assurance metrics?
QA metrics are used to evaluate and assess the quality and effectiveness of software development processes, products, and testing activities. These metrics help quantify various aspects of software quality and can provide valuable information on the efficiency, reliability, and overall performance of development and testing work. Quality assurance (QA) metrics are used to monitor and control the quality of software throughout the software development lifecycle (SDLC) and the software testing lifecycle (STLC). They can be applied at various stages of development, including requirements gathering, design, coding, testing, and deployment. By tracking these metrics, organizations can identify areas for improvement, make data-driven decisions, and ensure that the software meets the required quality standards. There are two types of metrics:
Quantitative metrics (absolute numbers)
Quantitative metrics are exactly what the name suggests: absolute numbers, each measuring a single aspect of the quality assurance process.
Qualitative metrics (derived numbers)
Quantitative metrics alone cannot provide a complete picture of a QA team's performance. For example, the average number of bugs per test tells you little unless it is seen in context, such as the total number of tests executed and the average time to execute each test. Qualitative metrics help here by linking relevant quantitative metrics together to give a nuanced picture of a team's speed, accuracy, or efficiency. In short, QA testing metrics provide insight into the quality of the software being developed and the team's progress against the test plan. To achieve this insight, however, your QA testing tool must be able to track several key metrics. Although the metrics your organization captures may evolve, your team should consider at least the following:
1. Test Coverage
This metric should be able to answer the question: “How many tests are we running and which areas of software do they cover?” Test coverage, calculated as a percentage, defines how much of the application is verified by existing tests. It’s easy to calculate this using two quick formulas:
- Test execution = (number of tests already carried out/total number of tests to be carried out) x 100
- Requirement coverage = (number of requirements covered by existing tests/total number of requirements) x 100
The second formula is particularly important for verifying that QA covers all (or most) software features. Simply running 500 tests, for example, does not by itself guarantee a high-quality product. Tests must cover critical user journeys, the performance of core features, and the workflows customers rely on most.
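Both formulas are easy to script; here is a minimal Python sketch (the figures are illustrative, not from a real project):

```python
# Test coverage formulas (illustrative figures).
def test_execution_pct(tests_executed: int, tests_planned: int) -> float:
    """(number of tests already carried out / total tests to be carried out) x 100"""
    return tests_executed / tests_planned * 100

def requirement_coverage_pct(covered: int, total_requirements: int) -> float:
    """(requirements covered by existing tests / total requirements) x 100"""
    return covered / total_requirements * 100

print(test_execution_pct(350, 500))        # 70.0
print(requirement_coverage_pct(180, 200))  # 90.0
```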
2. Test data coverage
This metric assesses the variety and coverage of test data used during testing. Comprehensive test data coverage is essential to simulate real-world scenarios and uncover potential errors that could arise under different data conditions. It ensures that the application is tested on different data sets and helps to identify data-related issues.
3. Escaped Bugs
The main reason QA exists is to prevent most (ideally all) bugs from reaching production. Customers should not detect and report any major bugs after an application or feature goes live. The number of escaped bugs is therefore a headline metric for assessing the entire QA process. If serious bugs repeatedly leak through and disrupt the user experience, you may need to reevaluate your test suites. Fortunately, when customers do report bugs, you can quickly identify problem areas and patterns instead of having to re-examine the entire architecture. Realistically, it is not possible to identify and resolve every possible bug before release. You can, however, settle on an acceptable number of quickly fixable bugs that won't bother customers too much.
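The article prescribes no formula here, but one common way to express escaped bugs as a rate is the share of all known defects that were found only in production; the sketch below uses that convention as an assumption:

```python
# Escaped defect rate as a share of all known defects. This formulation is
# an illustrative convention, not a formula prescribed by the article.
def escaped_defect_rate(escaped_to_production: int, found_before_release: int) -> float:
    total_defects = escaped_to_production + found_before_release
    return escaped_to_production / total_defects * 100

print(escaped_defect_rate(3, 97))  # 3.0 -> 3% of all defects escaped to production
```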
4. Requirement Defect Density
Tracking the number of defects found by the tests that cover each individual requirement is particularly useful. This QA metric can reveal whether some requirements are riskier than others, helping product teams decide whether those features should be released. If testing a particular requirement reveals too many bugs, it may actually point to problems with the requirement itself. It is possible that the test cases themselves need refactoring, but it is rare for extra bugs to surface merely because of flaws in the test structure. For example, if the tests for requirement A generate 38 defects while the tests for requirement B generate only 7, that is a signal for testers to investigate whether requirement A needs modified tests. It is also a signal that the requirement may not be realistically deployable in its current state.
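A per-requirement defect tally makes this pattern easy to spot; here is an illustrative sketch (the requirement names and counts are made up):

```python
from collections import Counter

# Requirement each reported defect traces back to (made-up data).
defect_log = ["REQ-A"] * 38 + ["REQ-B"] * 7

defects_per_requirement = Counter(defect_log)
for requirement, count in defects_per_requirement.most_common():
    print(requirement, count)
# REQ-A 38  -> a candidate for a requirement review, not just test fixes
# REQ-B 7
```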
5. Test Effort
Evaluating test effort requires several sub-metrics that reflect how many tests are performed and for how long. Test effort numbers, generally calculated as averages, help you decide whether you're running enough tests and catching enough bugs. Some important numbers (computed in the sketch after this list):
- number of tests run per unit of time: number of tests run / total duration,
- test pass rate: number of tests passed / total number of tests run,
- error capture rate: total number of errors captured / total duration of testing,
- average number of errors per test: total number of errors / total number of tests.
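Putting the four sub-metrics together, a minimal sketch with illustrative figures:

```python
# Test effort sub-metrics for one testing period (illustrative figures).
tests_run = 120
tests_passed = 102
errors_found = 36
duration_hours = 40.0

tests_per_hour = tests_run / duration_hours         # 3.0 tests run per hour
pass_rate_pct = tests_passed / tests_run * 100      # 85.0% of executed tests passed
error_capture_rate = errors_found / duration_hours  # 0.9 errors captured per hour
avg_errors_per_test = errors_found / tests_run      # 0.3 errors per test
print(tests_per_hour, pass_rate_pct, error_capture_rate, avg_errors_per_test)
```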
6. Test Reliability
A perfect test suite has the following characteristics:
- a close correlation between the number of errors and failed tests,
- each failed test reflects a real defect rather than a flaw in the test itself,
- a test is only successful if the function under test is completely error-free.
The closer the suite of tests is to the above criteria, the more reliable it is. Here are some important questions:
- Are tests failing because of actual bugs or because of poor test design? How many of each?
- Are any tests themselves flawed? If so, how many, and how often?
Test reliability monitoring is necessary to establish confidence that QA is adequately testing the software – actually doing its job. Like all effective QA metrics, it helps testers continuously improve existing test cases, scenarios, and procedures.
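One way to quantify reliability is to label each failed test by its cause during triage and examine the split. The labels and data below are illustrative assumptions, not definitions from this article:

```python
from collections import Counter

# Cause assigned to each failed test during triage (made-up data).
failure_causes = [
    "real_bug", "real_bug", "flaky_test", "real_bug",
    "bad_test_design", "real_bug", "flaky_test",
]

split = Counter(failure_causes)
real_bug_share = split["real_bug"] / len(failure_causes) * 100
print(split)  # Counter({'real_bug': 4, 'flaky_test': 2, 'bad_test_design': 1})
print(f"{real_bug_share:.0f}% of failures point to real defects")  # 57%
```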
7. Time to Test
This metric reveals how quickly a team or tester can create and execute tests without affecting the quality of the software. Of course, this metric will vary between manual and automated test cycles, with the latter being executed much faster. In addition, the tools and frameworks used for quality assurance also have a real impact on the time it takes to test. It can be challenging to combine these numbers, so use the following averages:
- Average time to create tests = total time to create tests / total number of tests created.
- Average test run time = total test run time / total number of tests run.
Once you have baseline numbers for this QA team performance metric, you can adopt best practices and update tooling to bring both averages down. Keep in mind that reducing average times means nothing if it lowers quality standards.
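Both averages in code, a minimal sketch with illustrative totals:

```python
# Time-to-test averages for one cycle (illustrative figures, in hours).
total_creation_hours = 60.0
tests_created = 40
total_run_hours = 25.0
tests_run = 100

avg_time_to_create = total_creation_hours / tests_created  # 1.5 hours per test created
avg_time_to_run = total_run_hours / tests_run              # 0.25 hours per test run
print(avg_time_to_create, avg_time_to_run)
```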
8. Test Cost
Most QA teams have to work within specific budgets. In order to justify their spending, they need to keep a close eye on how much they plan to spend and how much they end up spending. There are two main numbers here:
- Total Cost Allocation for Testing: the amount of money approved by management for QA activities for a specific period (quarter, year, etc.).
- Actual cost of testing: the actual amount of money that has been spent to carry out the necessary tests. This calculation may include the cost of testing per hour, per test case, or per requirement.
For example, if your total allocated cost is 2,000 euros and you have 200 requirements to test:
- Cost of testing one requirement: 2,000 / 200 = 10 euros.
- Cost per hour of testing: 2,000 / number of testing hours (say 200) = 10 euros.
- Cost per test case: 2,000 / number of test cases (say 50) = 40 euros.
The above example assumes that testing all requirements requires the same amount of time and the same amount of money.
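The same breakdown in code, assuming (as the example does) that every requirement costs the same to test:

```python
# Test cost breakdown from the example above (2,000-euro budget).
allocated_budget = 2000
requirements = 200
testing_hours = 200
test_cases = 50

print(allocated_budget / requirements)   # 10.0 euros per requirement
print(allocated_budget / testing_hours)  # 10.0 euros per hour of testing
print(allocated_budget / test_cases)     # 40.0 euros per test case
```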
9. Cost per bug fix
Simply put, this is the amount of money spent on developer time to fix each bug. The formula is as follows:
- Cost of fixing the bug = time needed to fix * hourly rate of the developer.
10. Test Execution Status
At any point in time, you should be able to get accurate information about how many tests have passed, failed, are blocked, incomplete or pending. This metric, represented as numbers or percentages, is needed for daily/weekly reporting. It’s also a quick snapshot of the team’s average efficiency, as these numbers can be compared to previously established benchmarks. Quick tip: For easier reporting, convert test execution status numbers into visual aids such as bar or pie charts.
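A simple status tally produces exactly this snapshot; the statuses and counts below are illustrative:

```python
from collections import Counter

# Current status of every test in the cycle (made-up data).
statuses = ["passed"] * 62 + ["failed"] * 18 + ["blocked"] * 8 + ["pending"] * 12

tally = Counter(statuses)
total = sum(tally.values())
for status, count in tally.items():
    print(f"{status}: {count} ({count / total * 100:.0f}%)")
# passed: 62 (62%), failed: 18 (18%), blocked: 8 (8%), pending: 12 (12%)
```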
11. Defects per software change
Often, when a new feature is added or an existing one is changed, testing will reveal bugs that did not exist in previous runs. If you add another button to a web page, for example, tests may show that the previous buttons (which rendered fine before) are now crooked and have misaligned text. In other words, the errors appeared solely because of the new change. If five changes were made and 25 errors appeared after testing, you can attribute approximately five errors to each change, although it is always possible that one change introduced more errors than the others. If you study this QA metric across enough projects, you can predict how many bugs you and your team can expect with each change. With these numbers in hand, your team can better plan its time, resource investment, and availability when starting new testing cycles.
12. Defect Distribution over Time
At the end of the test cycle, it is important to map how many bugs exist and where they come from. This will reveal whether the QA team is making progress in identifying and resolving more bugs as they work on more cycles. Breaking down bugs based on their origin also helps pinpoint which areas need more attention. Some common categorizations here are:
- distribution of errors by cause,
- distribution of errors by module,
- distribution of errors by severity,
- distribution of errors by platform.
If the number of errors in a category increases, it is easier to determine the cause. For example, if more bugs appear in one platform, it may mean that the software requires more optimization for that particular environment.
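Tagging each defect record with its module, platform, and severity makes these breakdowns easy to produce; the records below are made up for illustration:

```python
from collections import Counter

# Each defect record tags its module, platform, and severity (made-up data).
defects = [
    {"module": "checkout", "platform": "android", "severity": "high"},
    {"module": "checkout", "platform": "android", "severity": "low"},
    {"module": "search",   "platform": "ios",     "severity": "medium"},
    {"module": "checkout", "platform": "web",     "severity": "high"},
]

print(Counter(d["module"] for d in defects))    # checkout dominates
print(Counter(d["platform"] for d in defects))  # android may need more optimization
print(Counter(d["severity"] for d in defects))
```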
13. Bugs found vs. Bugs Fixed
The Found Bugs vs. Fixed Bugs metric is one of the key metrics for assessing the effectiveness of the QA process. It maps the number of bugs found against the number of bugs fixed, giving a ratio that objectively shows whether QA is accomplishing its core task. This analysis is also useful for identifying patterns in how defects are discovered and eliminated, and it provides important insight into the current stage of defect management. To get this number, first track the number of bugs found and resolved each day of the test cycle. For example, say you have a five-day test cycle and collect the following numbers:
| Test cycle date | Bugs found | Bugs resolved | Total bugs found to date | Total bugs resolved to date |
| --- | --- | --- | --- | --- |
| 01-09-2024 | 6 | 4 | 6 | 4 |
| 02-09-2024 | 3 | 0 | 9 | 4 |
| 03-09-2024 | 4 | 4 | 13 | 8 |
| 04-09-2024 | 2 | 4 | 15 | 12 |
| 05-09-2024 | 2 | 3 | 17 | 15 |
By the end of the test cycle, 17 bugs had been identified and 15 resolved. Compare this with previous test cycles and you can determine whether testers are improving at finding and resolving bugs.
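The running totals in the table follow directly from the daily counts, as this sketch using the five-day figures shows:

```python
from itertools import accumulate

# Daily counts from the five-day cycle in the table above.
found_per_day = [6, 3, 4, 2, 2]
resolved_per_day = [4, 0, 4, 4, 3]

found_to_date = list(accumulate(found_per_day))        # [6, 9, 13, 15, 17]
resolved_to_date = list(accumulate(resolved_per_day))  # [4, 4, 8, 12, 15]
print(found_to_date[-1], resolved_to_date[-1])         # 17 found, 15 resolved
```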
14. Defect Resolution Percentage
This metric reveals how effective the development team is at analyzing and fixing bugs reported by QA teams. Although resolving bugs should ideally not be a QA issue, tracking this number can help explain delays in product deliveries – which is especially useful in conversations with management. To calculate this number, track the total number of bugs reported to the development team and the total number of bugs fixed within the test cycle. Then use this formula:
- Defect resolution percentage = (total number of bugs fixed / total number of bugs reported) x 100.
Again, track the % of resolved errors over time to verify that QA is providing the desired results for the SDLC.
15. Defect Age
Defect age measures the average time it takes developers to fix a defect, from the moment it is identified to its actual resolution.
- Defect age = the difference between the time the defect was identified and the time it was resolved.
In general, the age of the defect is measured in days. Let’s say a defect was identified on 9 April 2023 and corrected on 26 April 2023. In this case, the age of the defect is 17 days. A gradually decreasing defect age is a strong indicator of the maturity of the QA team. It means that with each test cycle, defects take less time to fix.
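With Python's standard library, defect age in days is a single subtraction; this sketch reuses the dates from the example above:

```python
from datetime import date

# Dates from the example above.
identified = date(2023, 4, 9)
resolved = date(2023, 4, 26)

defect_age_days = (resolved - identified).days
print(defect_age_days)  # 17
```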
16. Test Case Effectiveness
This number, derived as a percentage, gives an indication of the effectiveness of the test cases in detecting faults. In other words, how many test cases executed by your QA team successfully identified bugs in one test cycle? The formula is simple:
- Test case effectiveness = (number of bugs found / number of test cases executed) x 100.
An important measure of the quality of test cases, this number should gradually increase over successive test cycles. It is one of the most obvious indicators of team performance.
17. Defect Leakage
In this case, you track the number of bugs that escape into the User Acceptance Testing (UAT) phase. Essentially, it is the number of bugs that surface in UAT after the application has already gone through multiple levels of testing. Ideally, test cases should filter them out before potential users even touch the product. Here's how to calculate it:
- Defect leakage = (total number of bugs found in UAT / total number of bugs detected before UAT) x 100.
18. Test Case Productivity
You probably won’t have to report this metric to management, but measuring it helps in setting realistic expectations for your team. Test Case Productivity measures the effort required to create test cases for a particular sprint/cycle. The formula is as follows:
- Test case productivity = number of test cases created / total effort required (e.g., in hours).
Obviously, the “effort required per test case” will not be an accurate number. Some test cases require more design work than others. However, you can ask your testers to provide a fair average. This metric will give you an idea of what is reasonably achievable for the team per cycle.
19. Test Completion Status
Not every test case the team designs will be carried through to completion. Some tests will pass, some will fail, and some will end up blocked or never executed. Tracking the completion status of tests is another KPI of the team's overall performance. Several formulas combine to provide an overall picture of test completion status:
- % of test cases executed = (number of test cases executed / number of test cases created) x 100,
- % of test cases not executed = (number of test cases not executed / number of test cases created) x 100,
- % of test cases passed = (number of test cases passed / number of test cases executed) x 100,
- % of test cases failed = (number of test cases failed / number of test cases executed) x 100,
- % of test cases blocked = (number of test cases blocked / number of test cases executed) x 100.
With these numbers in hand, you can quickly assess the current state of operations. For example, if the % of passed test cases is lower than the % of blocked test cases, there may be a fundamental problem with the test case design or the test environment. You then know which problem to focus on to improve results in the next sprint.
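All five percentages can be derived from a single set of counts; the numbers below are illustrative:

```python
# Test completion status percentages (illustrative counts).
created, executed, passed, failed, blocked = 120, 100, 70, 20, 10

def pct(part: int, whole: int) -> float:
    return round(part / whole * 100, 1)

print(pct(executed, created))            # 83.3 -> % of test cases executed
print(pct(created - executed, created))  # 16.7 -> % not executed
print(pct(passed, executed))             # 70.0 -> % passed
print(pct(failed, executed))             # 20.0 -> % failed
print(pct(blocked, executed))            # 10.0 -> % blocked
```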
20. Test Review Efficiency
Even when test cases flag bugs, each flag still requires review by a tester, and while a review may take only a few minutes, it usually takes longer. Depending on the software and its stage of development, tests can return a large number of bugs. The time to review each one adds up, which is why test review efficiency needs to be calculated.
- Test review efficiency (%) = (number of flagged bugs reviewed / total number of flagged bugs requiring review) x 100.
Of course, this QA metric must be read in the context of a specific duration. Say that in a seven-day test sprint, 58 bugs were found, but due to their nature the team could only review and escalate 45 of them to resolution; the test review efficiency is then 45 / 58 ≈ 78%.
Conclusion
Measure the QA metrics described above and you will have a clear picture of how your test teams work and the value they deliver. The need to measure QA team performance cannot be overstated: like any investment, QA must show a reasonable return to justify its place in the SDLC. Fortunately, the necessity and effectiveness of the QA function has been proven countless times, as long as practices keep evolving. If you are an IT tester or automation tester and you speak German, check out our company benefits and respond to our latest job postings.