Solution

First, we compile pointerbug.cxx with the debug option -g: \fbox{\tt c++ -g -o pointerbug pointerbug.cxx} . To debug with ddd, we run \fbox{\tt ddd pointerbug} . We set the command line arguments to e test with \fbox{\tt set args e test} and set a breakpoint inside the while loop with \fbox{\tt break pointerbug.cxx:26} . We start the debugging session with \fbox{\tt run} , which stops at the required breakpoint, on the line \fbox{\tt *wordPtr = ' ';} . Hovering the mouse pointer over the symbol wordPtr shows, on the window status line, that the array pointed at by wordPtr is "test". From here on, we debug one statement at a time by repeatedly issuing the command \fbox{\tt next} . See Fig. 4.1.

Figure 4.1: The ddd frontend to the gdb debugger.
\includegraphics[width=14cm]{pointerbug1.eps}

At line 27, the array has become " est" because the 't' character has been replaced by a space (this is the effect of line 26). One \fbox{\tt next} command later, we are on line 25 (the while test). As theChar is equal to 'e' and *wordPtr is also 'e', the test fails and the loop is exited. Three more \fbox{\tt next} commands exit the debugging session without detecting any problem.

We now change the command line arguments with \fbox{\tt set args y test} and debug the program again. Since the array pointed at by wordPtr, namely "test", does not contain the 'y' character, the while condition does not fail before the end of the array. As the loop is repeated, the bytes in memory after the array are therefore set to spaces (as per line 26), so that eventually the memory adjacent to the array is overwritten. As the variables are stored on the stack, the memory ``after'' the array contains the values of the variables declared before it, namely the testInt variable. This is the reason why the program behaves incorrectly and unpredictably. This type of memory bug is known as stack corruption; here it is caused by an array bound overflow, that is, by wordPtr being incremented past the end of the array it points into.

The valgrind command-line debugging tool is usually very helpful in tracing memory bugs; in this case, however, its output is somewhat obscure. Running \fbox{\tt valgrind -q ./pointerbug y test} yields

==28373== Conditional jump or move depends on uninitialised value(s)
==28373==    at 0x8048752: main (pointerbug.cxx:25)
538976288
Line 25 is the while condition. What valgrind is telling us is that, at a certain point of the program execution, the test (*wordPtr != theChar) is carried out on an uninitialised portion of memory. Now, theChar is certainly initialised (line 17), so this leaves *wordPtr as the sole possible source of the problem. We can shed further light on this matter by printing a loop counter (so we know when the problem occurs). We do this by inserting the C++ line \fbox{\tt cout << (int) (wordPtr - buffer) << endl;} after line 27. The output is now:
1
[...]
21
22
23
24
==28407== Conditional jump or move depends on uninitialised value(s)
==28407==    at 0x8048783: main (pointerbug.cxx:25)
25
26
27
538976288
At this point we immediately realize that the loop is counting past the end of the string, as our given string test is only 4 characters long, and the program stops well past it. It is also interesting to notice that the first 25 bytes after the start of buffer are all initialised parts of the stack: this is consistent with buffer being 20 bytes long (i.e. the value of bufSize), theChar occupying only 1 byte of memory, and testInt being a 32 bit (i.e. 4 bytes) integer of type int.

One further point of interest is: why should the program terminate at all? Supposing there is no 'y' character in the memory starting from buffer, the while condition would never fail and the loop would never be exited. This is indeed the case, but if the distribution of byte values in memory is close to uniform, it is likely that each byte value from 0 to 255, including the code of 'y', will eventually be found in such a large chunk of memory.
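As a rough quantification of this argument, assume (purely for illustration) that the bytes after buffer are independent and uniformly distributed over the 256 possible values. The probability that none of the first $n$ bytes equals the code of 'y' is then

\[
P(\mbox{no match in } n \mbox{ bytes}) = \left(\frac{255}{256}\right)^{n},
\qquad
\left(\frac{255}{256}\right)^{1000} \approx 0.02 .
\]

Under this assumption the loop would stop within the first 1000 bytes about 98\% of the time. Real memory contents are not uniformly distributed, so the figure is only indicative.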

In order to fix the bug, it suffices to change the while condition to

((int) (wordPtr - buffer) < bufSize && *wordPtr != '\0' && *wordPtr != theChar)
so that the end-of-char-array delimiter (the NUL byte '\0') and the maximum array size check can also stop the loop. The size check is placed first so that *wordPtr is never read once wordPtr has moved past the end of buffer.

To answer the last part of the question, the value 1 contained in testInt will not be affected by the bug described above if testInt is declared after buffer. The statement \fbox{\tt *wordPtr = ' ';} overwrites the memory after buffer which, because the variables are on the stack, holds the variables declared before buffer; a variable declared after buffer lies on the other side of it and is never reached by the loop.

Leo Liberti 2008-01-12