Improving Test Quality with Test Gap Analysis

In contrast to code quality, test quality is often directly visible to the user – namely, when the software does not work as it should. Maintaining high test quality is a challenging task in today's increasingly large and complex software systems, and it is made even more difficult by increasing development velocity and parallel development branches aimed at reducing time to market.

To keep up with this pace, testing often focuses on what has changed recently. This is reasonable: especially in long-lived software systems, most errors are introduced by changes to existing code or by newly written code. However, even with rigorous testing processes, we often observe that untested changes still make it into a release, causing field bugs and potentially hot-fixes. Approximately half of all code changes are not tested at all before being released, and these untested code changes (Test Gaps) cause the majority of field defects. Why is that?

In our experience, the main reason why changes remain untested is a lack of information. Since software errors have very different root causes, there is usually a pyramid of test levels, ranging from low-level unit tests to integration tests and high-level system or manual tests, where each level aims to find different types of errors. Low-level tests are often written by the developers themselves, who know precisely what changed and what their tests cover. Testers responsible for integration or system testing, on the other hand, often derive their tests from domain requirements, while information about actual code changes is unavailable to them. Therefore, they cannot know whether their test runs cover all the actual changes.

This problem is exacerbated when testing is performed in an ad-hoc or exploratory fashion. In these so-called unstructured tests, it is often unclear even what has already been tested and what still needs to be done. Thus, it is easy to forget to test some crucial code change.

Even with detailed information about code changes, it remains cumbersome and error-prone to manually check whether all of these changes are covered by tests, as it is difficult to infer which code actually gets executed from the description of a high-level test. Therefore, Teamscale automates the assessment of how complete your testing is with its Test Gap analysis.

Why Test Gap Analysis?

Teamscale's Test Gap analysis (TGA) is a form of software intelligence that combines existing data from the development and testing process. Being connected to the version control system, Teamscale knows exactly which parts of the code changed. From its integration with test coverage tools, Teamscale also knows which code was executed during testing. Combining both sources, Teamscale automatically reveals recent changes that have not been tested. We call such untested changes Test Gaps.

Since a Test Gap has not been executed in any test, you cannot have found any errors that may be hiding in the respective change. It is little surprise, then, that such Test Gaps are five times more likely to contain a production defect than tested changes [1]. Knowing your software system's Test Gaps allows you to make informed decisions about which changes still need testing before a release and to make the most of your limited testing resources.

A Test Quality Control Process

Knowing your software system's Test Gaps enables you to manage them. This does not mean you should always close all Test Gaps. Instead, we suggest you focus on Test Gaps in those parts of your system where an error would be painful. Knowing them allows you to select or define test cases for further testing.

Managing Test Quality

On the other hand, it is perfectly sensible to tolerate a Test Gap if the risk of a defect in the respective change is arguably low. For example, a change in migration code that is only ever run by your own team, so that you can easily detect and mitigate a defect later, may not be worth the testing effort.

Moreover, keep in mind that having no Test Gaps does not guarantee that there are no defects. TGA does not tell you how well a change was tested and, like any other testing technique, cannot prove the absence of defects. It is designed as a tool that helps you avoid inadvertently missing changes to your software system.

Getting Started

If you're setting up Test Gap analysis for the first time in your organization, have a look at our organizational best practices for a quick and successful start with the Test Gap analysis.

First, you need to create a Teamscale project and connect it to your source code repository. This allows Teamscale to identify code changes in the history of your software system. Furthermore, a Teamscale project is where you'll deposit code coverage for later analysis.

Second, you need to regularly upload test coverage to your Teamscale project to keep the information about what has already been tested up-to-date, ideally in an automated fashion. This is generally possible both in Continuous Integration environments and in manual testing setups. We provide tutorials that explain how to do this for Java and .NET applications.
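One common way to automate this is to script the upload against Teamscale's REST API. The sketch below only assembles the upload URL for such a request; the endpoint path, parameter names, and report format are assumptions for illustration – check the Teamscale documentation for your version before relying on them.

```python
# Illustrative sketch of scripting a coverage upload to Teamscale. The
# endpoint path and parameter names below are assumptions, not verified
# API documentation.
from urllib.parse import urlencode

def build_upload_url(server: str, project: str, report_format: str,
                     partition: str, message: str) -> str:
    """Assemble an (assumed) external-report upload URL."""
    query = urlencode({
        "format": report_format,   # e.g. JACOCO for Java coverage reports
        "partition": partition,    # groups uploads, e.g. per test stage
        "message": message,        # shown in the upload history
    })
    return (f"{server}/api/projects/{project}"
            f"/external-analysis/session/auto-create/report?{query}")

url = build_upload_url("https://teamscale.example.com", "my-project",
                       "JACOCO", "Manual Tests", "Nightly coverage upload")
print(url)
```

In a CI pipeline, a script like this would run after the test stage, sending the freshly generated coverage report with the request so Teamscale always sees the latest testing state.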

With this setup, Teamscale automatically identifies the Test Gaps in your software system. The results of this analysis appear as Test Gap treemaps in the Test Gaps perspective and on TGA dashboards. To start working with TGA results, you may want to familiarize yourself with how to read and work with Test Gap treemaps.

To take it one step further, you may also connect your issue tracker to your Teamscale project. Teamscale then additionally tracks changes and code coverage on the level of individual issues, which enables you to analyze Issue Test Gaps as a means to continuously stay on top of your testing efforts during development.
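Conceptually, issue-level TGA applies the same set difference per ticket: changes are grouped by the issue they belong to, and each group is checked against the coverage. The issue IDs and method names below are made up for illustration.

```python
# Sketch of issue-level Test Gaps: the same set difference as before,
# applied to changes grouped per issue-tracker ticket. All identifiers
# are illustrative.
changed_by_issue = {
    "TS-101": {"Checkout.applyDiscount", "Checkout.computeTax"},
    "TS-102": {"Cart.addItem"},
}
covered = {"Checkout.computeTax", "Cart.addItem"}

# Per issue: which of its changes were never executed in any test?
issue_gaps = {issue: methods - covered
              for issue, methods in changed_by_issue.items()}
print(issue_gaps)
```

Here, TS-102 would count as fully tested, while TS-101 still carries an untested change – exactly the kind of per-issue verdict that lets you stay on top of testing during development.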

Further Reading:

  1. Did We Test Our Changes? Assessing Alignment between Tests and Development in Practice.
    S. Eder, B. Hauptmann, M. Junker, E. Juergens, R. Vaas, and K.-H. Prommer.
  2. Haben wir das Richtige getestet? Erfahrungen mit Test-Gap-Analyse in der Praxis (in German: "Did We Test the Right Things? Experiences with Test Gap Analysis in Practice").
    E. Juergens and D. Pagano.