
How to Improve Analysis Results for C/C++/Objective-C/Objective-C++ Code

The challenge with analysis of C/C++/Objective-C/Objective-C++ code is that macros meant for configuration are typically set using the build system and are not automatically available to Teamscale. This article describes solutions to this problem.

Problem Description

Typically, C/C++/Objective-C/Objective-C++ code relies to some degree on conditional compilation, also referred to as #ifdef programming. This means that the preprocessor is used to include or exclude certain parts of your code base at compile time. While Teamscale finds and uses all macros found in your source code, important macros affecting the compilation process are usually provided by the build system via compiler arguments (-DMacro_Name).

Since such arguments are not visible to Teamscale, they are not considered in the code analysis. For simple projects this behavior might lead to reasonable results, but to get precise analysis results in general, it is important for Teamscale to know all macros, even those defined by the build system.
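To make the problem concrete, here is a minimal sketch of conditional compilation driven by a build-system macro. The macro name ENABLE_TLS and the function default_port are hypothetical and only serve as an illustration. A build would typically pass -DENABLE_TLS on the compiler command line, so the definition never appears in any source file.

```c
#include <assert.h>

/* ENABLE_TLS is a hypothetical macro. A build system would usually set it
 * via the compiler argument -DENABLE_TLS, not in the source code. */
#ifdef ENABLE_TLS
#define DEFAULT_PORT 443
#else
#define DEFAULT_PORT 80
#endif

/* Returns the port selected at compile time. */
int default_port(void)
{
    /* Sanity check on the value chosen by the preprocessor. */
    assert(DEFAULT_PORT > 0 && DEFAULT_PORT < 65536);
    return DEFAULT_PORT;
}
```

An analysis that does not know about ENABLE_TLS only sees the #else branch, even if every real build of the project compiles the other one.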

Adding Preprocessor Information to the Analysis Profile

The easiest way to provide information about macros is by adding it to the analysis profile. If you select the language C/C++ for an analysis profile, you will find an option Predefined Preprocessor Macros. Here you can add preprocessor #define directives that will be applied to every single code file that Teamscale analyzes with this profile. For example, the content could look like this:

```c
#define __cplusplus 199711L
#define DEBUG 1
#define USE_SPECIAL_FEATURE
```
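As a rough illustration of how such definitions take effect, the following sketch shows a file whose content depends on the three macros from the example. The function scale and its behavior are made up for this illustration. With the profile content above, the extern "C" guard, the debug output, and the special-feature branch would all be considered active.

```c
#include <assert.h>
#include <stdio.h>

/* Only active if __cplusplus is defined, e.g., via the analysis profile. */
#ifdef __cplusplus
extern "C" {
#endif

int scale(int value); /* hypothetical function used for illustration */

#ifdef __cplusplus
}
#endif

int scale(int value)
{
#if defined(DEBUG) && DEBUG
    /* Extra check and logging that exist only in debug configurations. */
    assert(value >= 0);
    fprintf(stderr, "scale(%d)\n", value);
#endif
#ifdef USE_SPECIAL_FEATURE
    return value * 4;
#else
    return value * 2;
#endif
}
```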

This works well for simple cases of global macros, e.g., setting a debug mode. However, in other scenarios more compilation information is needed to parse C/C++/Objective-C/Objective-C++ code properly. For instance, some projects use headers, library functions, type definitions, and macros from frameworks shipped in public SDKs. Moreover, projects often depend on system headers that rely on basic compilation definitions, such as the target architecture, to select the right macros, functions, and type definitions.
See the next section on how to provide this additional information, such as the target triple and additional arguments, in the analysis profile.

Defining the Target Triple and Additional Arguments

The target triple indicates the target platform for which the code should be parsed. Many systems written in C/C++/Objective-C/Objective-C++ depend on basic system libraries or SDK frameworks that, directly or indirectly, depend on basic compiler definitions, such as the CPU architecture or operating system to assume. For certain analyses, Teamscale supports the definition of the target triple as specified by LLVM.

The canonical form of the target triple is architecture-vendor-os (most common) or, optionally, architecture-vendor-os-environment or architecture-vendor-os-environment-format (least common). For example, values for the target triple field can be arm64-apple-ios, x86-pc-win32, or x86_64-intel-linux-gnu. If this information is available, adding it to the analysis profile will most likely improve the static source code analysis results.
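The following sketch shows why the target matters for parsing. Compilers such as Clang predefine macros like __APPLE__, __linux__, _WIN32, __x86_64__, or __aarch64__ depending on the target, and system headers branch on them in much the same way as this code does. The function names are hypothetical and the list of checked macros is far from complete.

```c
#include <assert.h>
#include <stdio.h>

/* Operating system as seen through compiler-predefined macros. */
const char *target_os(void)
{
#if defined(_WIN32)
    return "windows";
#elif defined(__APPLE__)
    return "apple";
#elif defined(__linux__)
    return "linux";
#else
    return "unknown";
#endif
}

/* CPU architecture as seen through compiler-predefined macros. */
const char *target_arch(void)
{
#if defined(__aarch64__) || defined(_M_ARM64)
    return "arm64";
#elif defined(__x86_64__) || defined(_M_X64)
    return "x86_64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86";
#else
    return "unknown";
#endif
}

/* Writes "<arch>-<os>" into buf. This is only a rough description of the
 * target, not a real LLVM target triple. */
int describe_target(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s-%s", target_arch(), target_os());
}
```

Which branches are active depends entirely on the target that the parser assumes, which is why a wrong or missing target can hide or distort large parts of the system headers.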

Moreover, if clang-tidy is used as an internal tool in Teamscale, the analysis profile also accepts additional arguments that are passed to Clang for parsing. These can be any valid argument supported by Clang that gives the parser more accurate compilation information. Examples are the -F option, which passes the path to frameworks that are typically included in SDKs, and the -fno-objc-arc flag, which indicates that ARC (Automatic Reference Counting) should not be assumed when parsing the source code. See the Clang documentation for all supported arguments.

However, if this additional information is not enough to reach an accurate source code analysis, we recommend using a compilation database (see next section).

Using a Compilation Database

A compilation database is a file containing the exact compilation command for every file encountered during compilation. This includes information about include paths and macro definitions passed to the compiler. The creation of a compilation database is supported by most popular build systems, such as Bazel, CMake, and Ninja, or directly via the Clang compiler. To learn how to create a compilation database from your build system, read this guide. Teamscale supports compilation databases in JSON format.
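To give an idea of what such a file contains, the following sketch writes a compilation database with a single entry, using the keys directory, command, and file from Clang's JSON compilation database format. The function name is hypothetical, and in practice the build system generates this file rather than hand-written code.

```c
#include <assert.h>
#include <stdio.h>

/* Writes a compilation database with a single entry into buf.
 * Sketch only: assumes the arguments contain no characters that need
 * JSON escaping (quotes, backslashes, control characters).
 * Returns the number of characters snprintf would write. */
int format_compile_commands(char *buf, size_t size, const char *directory,
                            const char *command, const char *file)
{
    assert(buf != NULL && directory != NULL && command != NULL && file != NULL);
    return snprintf(buf, size,
                    "[\n"
                    "  {\n"
                    "    \"directory\": \"%s\",\n"
                    "    \"command\": \"%s\",\n"
                    "    \"file\": \"%s\"\n"
                    "  }\n"
                    "]\n",
                    directory, command, file);
}
```

Called with, for example, the directory /build, the command clang -DENABLE_TLS -Iinclude -c src/net.c, and the file src/net.c (all made-up values), the result is a JSON array with one object that records the macro definition and include path used for that file.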

The JSON file containing the compilation database is passed to Teamscale just like any other external data using the format COMPILATION_DATABASE. Typically, you will adjust your build jobs to upload this database to Teamscale or an Artifactory server at least every night or even on every run. You can also use different compilation databases for different branches, just like with other external data. Whenever Teamscale processes a file, it will respect the information from the latest database containing information about this file and hence have the same view on the code as the compiler.

Analyzing Excluded Code in Architecture Analysis

By default, our architecture analysis for C/C++ finds dependencies in code after preprocessing, i.e., code that would be compiled if a compiler received the same input as Teamscale. Teamscale obtains that code by

  1. applying text filters configured in the project configuration (i.e., removing, for example, generated code) and
  2. running our preprocessor (which removes, for example, code between #if 0 and #endif).

Our C/C++ architecture analysis detects dependencies from

  1. #include directives to the included files and from
  2. function declarations to function definitions.

For the second point, we need to detect function declarations and definitions, which requires code that is as syntactically correct as possible. Therefore, this analysis is based on the code after preprocessing (and after applying text filters).

If you need dependencies from excluded code, you can enable the option Search for dependencies in excluded code in your analysis profile. Teamscale will then additionally make a best-effort attempt to find dependencies in excluded code. It tries to parse the code without preprocessing and ignores any code that cannot be parsed. Since a lot of C/C++ code requires preprocessing before it can be parsed properly, enabling this option can decrease the quality of the analysis results.
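The difference can be sketched with a small file that contains both active and excluded code. All names, including the header legacy/legacy_checksum.h, are hypothetical. By default, only the dependency from the declaration of checksum to its definition would be found. The #include and the function inside #if 0 would only be considered with the option enabled.

```c
#include <assert.h>
#include <stddef.h>

/* Declaration, as it would usually appear in a header file. */
int checksum(const unsigned char *data, int length);

/* Definition: a dependency target for the declaration above. */
int checksum(const unsigned char *data, int length)
{
    int sum = 0;
    assert(data != NULL || length == 0);
    for (int i = 0; i < length; i++) {
        sum += data[i];
    }
    return sum;
}

#if 0
/* Excluded code: removed by the preprocessor, so its dependencies are
 * only searched for when the option described above is enabled. */
#include "legacy/legacy_checksum.h" /* hypothetical header */

int checksum_legacy(const unsigned char *data, int length)
{
    return legacy_checksum(data, length);
}
#endif
```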