Program Testing

July 7, 1994

It is unacceptable for a programmer to release new code to his team without testing it. By “testing” I don’t mean running one or two obvious cases through it; I mean being reasonably sure that there are no bugs. There are several ways to test software: (1) banging on it; (2) doing an operational proof; and (3) doing an assertional proof. The first is what’s most commonly used to see if the program has any bugs. As Dijkstra says, it’s a great way to verify the presence of bugs, but a lousy way to show their absence. When method 1 finds problems or if there’s a particularly nasty piece of code, methods 2 or 3 should be used. Method 2 is easy: you walk through the code and “play computer”, simulating the computer’s move at every instruction. There are two problems with this technique: humans make lousy computers, and you will probably make assumptions about input values or the state at any given point that aren’t true. Method 3 avoids these problems at a fairly large cost: the proofs are tedious and can take pages of work for even a few lines of code. An assertional proof keeps track of all the possible states between each program statement and verifies through formalisms that the state we wish the program to end in is actually reachable from the start.

Here are a few examples where assertional verification would have been useful. A few years ago I wrote some code to convert a string to an integer. The top of the routine checked for invalid characters and a negative sign:

    if (!isdigit(s[0])) {
        return ERROR;
    }
    sign = 1;
    if (s[0] == ’-’) {
        sign = -1;
        s++;
    }
            

I tested the routine with method 1 fairly well, but forgot to give any negative numbers (my need for the routine at the time only involved positive numbers). A few days later my needs changed and I passed it a negative number; the routine failed. It took quite a bit of method 2 to figure out that !isdigit() was not really what I wanted to test for. An assertional proof would have immediately shown that the test for the negative sign could never succeed.

My officemate at the time wrote this snippet to free a linked list:

    while (p != NULL) {
        q = p->next;
        free(p);
        p = q;
    }
    free(p);
            

Even the simplest of assertional proofs would have shown that the second free() could only be reached if p was NULL.

That same summer (at Microsoft), I found two seperate routines written by two different people in two different programs to imitate the strstr() function. (strstr() searches a string for a particular substring.) Both had the same bug:

    char *search(char *s1, char *s2)
    {
        /*
         * Looks for s2 in s1, returns
         * location in s1 or NULL.
         */

        char *s, *match;

        for (; *s1; s1++) {
            match = s1;
            for (s = s2; *s; s++) {
                if (*s1 == *s) {
                    s1++;
                } else {
                    break;
                }
            }
            if (!*s) {
                return match;
            }
        }

        return NULL;
    }
            

(Spend a minute here finding the bug.) Method 1 is extremely unlikely to find this bug. Method 2 is also very unlikely, just because the reader will cheat in the mental simulation and not consider every possible combination of input strings.

It’s not reasonable to expect programmers to constantly write out pages of formal proofs for every routine they write. Particularly complex or bug-prone code should be scrutinized more formally than methods 1 and 2 allow, though. Two tools can help in this respect:

I wrote a program a few years ago to do trivial assertional proofs of programs. It was very limited, but while writing the program I ran into a bit of tricky code where a particular setting of a debug switch should execute one printf inside a loop, and another setting should execute another after the loop. (This was more complicated than it sounds.) I fed this 10-line segment into the program with printfs replaced with setting boolean variables (printdebug1 and printdebug2 for inside and after the loop), and asked the program to prove that exactly one of the two would get printed: that printdebug1 != printdebug2 after the code segment was executed. The program returned “TRUE”. Without any complicated testing or mental walk-through of code I was able to prove without a doubt (within the correctness of the proving program and my input to it) that my code was going to behave correctly. Programs are rarely, if ever, used to prove the correctness of critical sections of code, which is surprising considering how easy it is to do.

The second help is an intelligent editor or compiler. (The former would be harder to write but would catch more subtle bugs.) Either of these could do simple tracking of possible variable ranges. A fairly simple test would have detected my negative-value “if” above by tracking the range of s[0]. Some compilers already do this (range-tracking), although I haven’t seen them warn users on this basis. For example, gcc will convert this code:

    i *= 2;
    i++;
            

into:

    i <<= 1;  // Easy enough
    i |= 1;   // We know it’s even, just OR it in
            

This is impressive and could lead to very helpful compile-time warnings.

In general, be wary of claims that code has been tested—it is often not tested nearly enough and ends up costing much more wasted time later when bugs pop up and it’s not obvious what module they’re in. Get used to the formalisms that assertional proofs use. They even help when doing operational proofs.