Code Complexity

Spoiler: We often need to estimate the quality of a piece of source code or to point out its risky parts. Whether you are maintaining it or auditing it, this information helps direct the work and speeds up the discovery of bugs or vulnerabilities. Among the useful metrics, I'll show you cyclomatic complexity, NPath complexity and cognitive complexity. The first counts the number of branches, the second the number of possible (acyclic) paths, and the third tries to capture how hard the code is for a human to read.

When I work on a piece of software, I pay attention to the quality of its source code. How easy it is to understand, test and maintain determines the cost of fixing bugs and vulnerabilities, and also affects how often they occur.

Clear code makes the auditor's job easier, but also the developers': vulnerabilities are easier to find and, above all, rarer, because the developers have been able to avoid a large number of them.

To get an idea of the quality of a source code, we can start by reading random portions of it. With experience, we quickly tell careful work from a nameless botch job. Although very useful, this first impression remains personal and difficult to communicate.

Complexity, TheDigitalArtist @ pixabay

One can then turn to software metrics, which provide a quantified measure of the quality of the source code. Their objectivity and impartiality avoid the problems of personal judgment, but interpreting them requires a good understanding of what they measure, which does not always make them easier to communicate…

In this article, I present three metrics that attempt to measure the complexity of a source code, that is to say the work necessary to understand, test and improve it.

Cyclomatic Complexity

Also called the McCabe measure, this first metric was introduced in 1976 by Thomas McCabe, who applied graph theory to programs and derived from it a complexity measure corresponding to the number of decisions in an algorithm.

A Complexity Measure, McCabe, IEEE Transactions on Software Engineering, volume SE-2, issue 4, December 1976, pdf.

To understand this measure, one must look at the control flow graph of the program: the nodes correspond to blocks of instructions and the arcs to the possible transitions between them. The complexity of a program's structure then becomes visible.

Counting the total number of paths in the graph is of no use here. For a cyclic graph, which corresponds to a program with loops, the total number of paths is infinite and therefore no longer allows programs to be compared with one another.

Rather than counting all of these paths, McCabe's idea is to count only the basic paths, from which all the others can be generated by combination. Mathematically, these are the linearly independent paths of the graph G.

After adding an arc from the endpoint of the program back to its starting point, it remains to compute the cyclomatic number V(G) of the graph G containing e arcs, n nodes and p connected components:

\begin{align} V(G) & = e - n + p \end{align}

This number corresponds to the number of branches in the graph and therefore to the number of decision points in the program. It can be computed simply by counting the conditional statements rather than building the graph.
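As a quick sanity check of the formula (a toy example of my own, not taken from McCabe's article), consider a function reduced to a single if/else. Its control flow graph has four nodes (the entry block holding the condition, the two branches and the exit) and four arcs; adding the arc from the exit back to the entry gives e = 5, n = 4 and p = 1:

\begin{align} V(G) & = e - n + p = 5 - 4 + 1 = 2 \end{align}

which is exactly what the shortcut gives: one for the function plus one for the if.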

So here is an example of source code on which to apply the complexity metrics. This code computes the sum of the proper divisors of a number and tells you whether that number is perfect, abundant or deficient. Although this is certainly very useful, its purpose here is only illustrative.

#include <iostream>
#include <string>
#include <stdexcept>   // for std::invalid_argument
//  - - - - - - - - - - - - - - - - - Cyclomatic complexity :
int main(int argc, char * argv[]) //  - - - - - - - - - - - - +1
{
    if (argc != 2) { // - - - - - - - - - - - - - - - - - - - +1
        std::cerr << "one argument needed" << std::endl ;
        return 1 ;
    }

    try {
        int a = std::stoi(argv[1]) ;

        int sum = 0 ;
        for (int i = 1; i < a ; ++i) { // - - - - - - - - - - +1
            if (a % i == 0) { //  - - - - - - - - - - - - - - +1
                sum += i ;
            }
        }

        if (sum == a) { //  - - - - - - - - - - - - - - - - - +1
            std::cout << "a is perfect" << std::endl ;
        } else if (sum < a) { //  - - - - - - - - - - - - - - +1
            std::cout << "a is deficient" << std::endl ;
        } else {
            std::cout << "is abundant" << std::endl ;
        }

    } catch (std::invalid_argument const & e) { //  - - - - - +1
        std::cout << "Error : " << e.what() << std::endl ;
        return 1 ;
    }

    return 0 ;
} // Total  - - - - - - - - - - - - - - - - - - - - - - - - -  7

The value of graph theory here is that it provides a formal framework to prove these equivalences and gives a mathematical reality to the computed number.

One of these realities is that cyclomatic complexity is the minimum number of tests needed to cover all the instructions of a program. Having fewer tests than this minimum means either that tests are missing, or that the code is unnecessarily complex (i.e. contains unnecessary branches) and can therefore be simplified.

It is generally accepted that a cyclomatic complexity of 10 is the maximum for a single function. This is the figure originally proposed by McCabe after applying the measure to real projects, and it is still used by current automatic measurement tools.

NPath Complexity

Since cyclomatic complexity does not take into account the nesting of conditional statements, Nejmeh proposed in 1988 the notion of NPath complexity, also derived from graph theory, which gives the number of acyclic paths through the program.

NPATH: a measure of execution path complexity and its applications, Brian A. Nejmeh, Communications of the ACM, volume 31, issue 2, February 1988.

Where McCabe keeps the loops and counts only the basic paths, Nejmeh prefers to count all the paths, but to do so he has to eliminate the loops. The NPath complexity is therefore the number of possible paths through a program without going through the same sequence of instructions twice: the acyclic paths.

Rather than working on the graph, it is also possible, and more practical here, to do the count by reading the program. Indeed, reducing an arbitrary directed graph to an acyclic one is NP-complete (cf. the feedback vertex set problem) and therefore too expensive on large code bases.

NPath complexity is computed by multiplying the complexities of statements in sequence and adding those of conditional branches. Technically, each structure (if, for, …) is treated specifically.

I won't go into the details, which you can find in the original Nejmeh article or in the checkstyle documentation.
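For reference, and at the risk of oversimplifying, here is roughly how the main constructs are scored, following the conventions of the checkstyle documentation (the exact rules, in particular for switch statements and for boolean operators, are in the references above):

\begin{align}
NP(\texttt{if}) & = NP(\text{then}) + NP(\text{cond}) + 1 \\
NP(\texttt{if/else}) & = NP(\text{then}) + NP(\text{else}) + NP(\text{cond}) \\
NP(\texttt{for}),\ NP(\texttt{while}) & = NP(\text{body}) + NP(\text{cond}) + 1 \\
NP(S_1 \,;\, S_2) & = NP(S_1) \times NP(S_2)
\end{align}

where NP(cond) counts the && and || operators in the condition, a simple statement counts as 1, and a try/catch (as in the example below) adds the complexities of its blocks.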

We can apply this method to the same code as before. The computation is harder to illustrate because the cost of a structure depends on its content, and the complexities of the sequences then have to be multiplied.

#include <iostream>
#include <string>
#include <stdexcept>   // for std::invalid_argument
                                                  // NPath complexity :
int main(int argc, char * argv[])
{
    if (argc != 2) {
        std::cerr << "un argument nécessaire" << std::endl ;
        return 1 ;
    } // if + 1  - - - - - - - - - - - - - - - - - - - - - => 2

    try {
        int a = std::stoi(argv[1]) ;

        int sum = 0 ;
        for (int i = 1; i < a ; ++i) {
            if (a % i == 0) {
                sum += i ; //  - - - - - - - - - - - - - - 1
            } // if + 1- - - - - - - - - - - - - - - - - - => 2
        } // for + 1 - - - - - - - - - - - - - - - - - - - => 3

        if (sum == a) {
            std::cout << "a est parfait" << std::endl ;
        } else if (sum < a) {
            std::cout << "a est déficient" << std::endl ;
        } else {
            std::cout << "a est abondant" << std::endl ;
        } // if + elif + else  - - - - - - - - - - - - - - => 3

    } catch (std::invalid_argument const & e) { //  - - - 1
        std::cout << "Error : " << e.what() << std::endl ;
        return 1 ;
    } // try (for x if) + catch  - - - - - - - - - - - - - => 10

    return 0 ;
} // Total : if x try  - - - - - - - - - - - - - - - - - - => 20

These two complexities are linked to each other. A program with cyclomatic complexity c has an NPath complexity of at worst 2^c. The minimum is reached if the conditional structures are only nested, the maximum if they are only sequential. The intermediate values give an idea of the arrangement of the structures of the program.
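To make this link concrete, here is a small sketch (two hypothetical functions of my own, with the values worked out by hand using the rules above) showing the two extremes for the same cyclomatic complexity:

// Three ifs in sequence : cyclomatic complexity 4, NPath 2 x 2 x 2 = 8
int sequential(int a, int b, int c)
{
    int n = 0 ;
    if (a > 0) { ++n ; }
    if (b > 0) { ++n ; }
    if (c > 0) { ++n ; }
    return n ;
}

// The same three ifs nested : cyclomatic complexity 4, NPath 4
int nested(int a, int b, int c)
{
    int n = 0 ;
    if (a > 0) {
        ++n ;
        if (b > 0) {
            ++n ;
            if (c > 0) {
                ++n ;
            }
        }
    }
    return n ;
}

Both functions contain the same three decision points, but placed in sequence the choices multiply, whereas nested each level only adds one more acyclic path.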

Just as cyclomatic complexity gives the minimum number of tests to cover all the code, NPath complexity gives the minimum number of tests to cover all acyclic combinations of paths.

The generally accepted threshold here is 200 and comes from the first uses at AT&T. A higher value for a function indicates that it is worth reviewing it in order to simplify it.

Cognitive Complexity

The last complexity discussed today does not seek to measure the computational complexity of an algorithm but its cognitive complexity, that is to say how easily a human will understand it. As the concept is not formally defined, it is still the subject of active research today.

The most widely accepted version (especially because it is integrated into their tool) is that of SonarSource. It was introduced in 2016 and uses a counter that is incremented when particular control statements and structures are encountered.

Cognitive Complexity, G. Ann Campbell, SonarSource (PDF).

The goal is not to measure the mathematical complexity of the algorithm but to quantify the effort required to understand it. The choice of incrementing statements follows this logic, penalizing the constructs considered complex (e.g. goto, or a break or continue to a label) and favoring the most readable ones (the definition of a function counts as 0). Taking nesting into account serves the same goal.

For example, cognitive complexity increments once for a switch statement, however many cases it contains, whereas cyclomatic complexity ignores the switch itself but increments for each case.
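As an illustration (a hypothetical function of my own; exact counts can vary slightly from one tool to another), a long but flat switch is heavily penalized by the first metric and barely by the second:

#include <string>
//  - - - - - - - - - - Cyclomatic complexity : 1 (function) + 4 (one per case) = 5
//  - - - - - - - - - - Cognitive complexity  : 1 (the switch counts once, the cases never)
std::string grade(int score)
{
    switch (score) {
        case 4 : return "excellent" ;
        case 3 : return "good" ;
        case 2 : return "fair" ;
        case 1 : return "poor" ;
        default : return "invalid" ;
    }
}

This matches the intuition behind the metric: such code may be long, but it is not hard to understand.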

Here is what this metric gives on the example code:

#include <iostream>
#include <string>
#include <stdexcept>   // for std::invalid_argument
                                                  // Cognitive complexity :
int main(int argc, char * argv[])
{
    if (argc != 2) { // - - - - - - - - - - - - - - - - - - - +1
        std::cerr << "un argument nécessaire" << std::endl ;
        return 1 ;
    }

    try {
        int a = std::stoi(argv[1]) ;

        int sum = 0 ;
        for (int i = 1; i < a ; ++i) { // - - - - - - - - - - +2 (nesting = 1)
            if (a % i == 0) { //  - - - - - - - - - - - - - - +3 (nesting = 2)
                sum += i ;
            }
        }

        if (sum == a) { //  - - - - - - - - - - - - - - - - - +2 (nesting = 1)
            std::cout << "a est parfait" << std::endl ;
        } else if (sum < a) { //  - - - - - - - - - - - - - - +1
            std::cout << "a est déficient" << std::endl ;
        } else { // - - - - - - - - - - - - - - - - - - - - - +1
            std::cout << "a est abondant" << std::endl ;
        }

    } catch (std::invalid_argument const & e) { //  - - - - - +1
        std::cout << "Error : " << e.what() << std::endl ;
        return 1 ;
    }

    return 0 ;
} // Total  - - - - - - - - - - - - - - - - - - - - - - - - -  11

The lack of a mathematical basis, presented as a quality by the authors of this metric, prevents any absolute interpretation of the value obtained. In the example, the value 11 has no meaning in itself and can only be used to compare pieces of code with one another, to see which one uses more structures or more nesting.

If you are looking for a variant of cognitive complexity that tries to correlate more reliably with the difficulty actually experienced by developers, you can turn to the npm genese-complexity module, which follows a similar principle (but with other values for the increments).

The threshold proposed by the tool is 15, but I have not found any documentation explaining where this value comes from; probably the designers' experience in software metrology.

And next?

These three metrics share the goal of measuring code complexity but differ in their approach and interpretation: the first counts the decision points, the second the acyclic paths, and the third the reading effort required of a human.

Using them globally on a project doesn't make much sense. First of all, no correlation with the quality of the product (e.g. the number of bugs) has been demonstrated, and studies on this aspect are indeed lacking. Then, there are cases where exceeding the thresholds is necessary and justifiable. Finally, the complexity of a project is tied more to its functional scope than to the code itself (refactoring will only redistribute this complexity among the different components).

The usefulness of these methods is rather to measure portions of the code and compare them, in order to point out the most relevant places to improve or to audit. These metrics thus point to the risk areas most likely to contain faults or even vulnerabilities.

They are therefore above all tools at the service of the auditor which, handled intelligently, make it possible to direct the work efficiently.