[10] netbsdsrc/usr.bin/expand/expand.c:36–62
[11] netbsdsrc/usr.bin/expand/expand.c:64–151
[12] netbsdsrc/usr.bin/expand/expand.c:153–185
When examining a nontrivial program, it is useful to first identify its major constituent parts. In our case, these are the global variables (Figure 2.2:1) and the functions main (Figure 2.3), getstops (see Figure 2.5:1), and usage (see Figure 2.5:8).
The integer variable nstops and the array of integers tabstops are declared as global variables, outside the scope of function blocks. They are therefore visible to all functions in the file we are examining.
The three function declarations that follow (Figure 2.2:2) declare functions that will appear later within the file. Since some of these functions are used before they are defined, in C/C++ programs the declarations allow the compiler to verify the arguments passed to the function and their return values and generate correct corresponding code. When no forward declarations are given, the C compiler will make assumptions about the function return type and the arguments when the function is first used; C++ compilers will flag such cases as errors. If the following function definition does not match these assumptions, the compiler will issue a warning or error message. However, if the wrong declaration is supplied for a function defined in another file, the program may compile without a problem and fail at runtime.
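As a minimal sketch of why the forward declarations matter, the fragment below calls a function before defining it, much as main calls getstops in expand. The names stops_in_spec and count_commas are hypothetical helpers invented for this illustration; they do not appear in expand.c.

```c
#include <stddef.h>

/*
 * Forward declaration (prototype): gives the compiler the argument and
 * return types before the first call. count_commas is a hypothetical
 * helper, not part of expand.c.
 */
static int count_commas(const char *spec);

/* The caller appears before the definition of count_commas. */
int
stops_in_spec(const char *spec)
{
	/* A specification such as "4,8,12" holds one more stop than commas. */
	if (spec == NULL || *spec == '\0')
		return 0;
	return count_commas(spec) + 1;
}

/* The definition must agree with the prototype above. */
static int
count_commas(const char *spec)
{
	int n = 0;

	for (; *spec != '\0'; spec++)
		if (*spec == ',')
			n++;
	return n;
}
```

With the prototype in place, a call such as count_commas(42) would be diagnosed at compile time rather than misbehaving at runtime.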
Figure 2.2 Expanding tab stops (declarations).
<-- a
#include
#include
#include
#include
#include
int nstops;
int tabstops[100];
static void getstops(char *);
int main(int, char *[]);
static void usage (void);
(a) Header files
Global variables
Forward function declarations
Notice how the two functions are declared as static while the variables are not. This means that the two functions are visible only within the file, while the variables are potentially visible to all files comprising the program. Since expand consists only of a single file, this distinction is not important in our case. Most linkers that combine compiled C files are rather primitive; variables that are visible to all program files (that is, not declared as static) can interact in surprising ways with variables with the same name defined in other files. It is therefore a good practice when inspecting code to ensure that all variables needed only in a single file are declared as static.
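The difference in visibility can be sketched as follows. All names here (call_count, bump, record_call) are hypothetical and chosen only for illustration; the point is that identifiers declared static at file scope get internal linkage, while the others can be referenced from any file linked into the program.

```c
/*
 * Internal linkage: visible only within this file. Another file may
 * define its own call_count or bump without any conflict.
 */
static int call_count;

static int
bump(void)
{
	return ++call_count;
}

/*
 * External linkage: any other file in the program can call this
 * function, so its name must be unique across the whole program.
 */
int
record_call(void)
{
	return bump();
}
```

Had call_count been declared without static, a second file defining a variable with the same name could end up sharing or clashing with it at link time, depending on the linker.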
Let us now look at the functions comprising expand. To understand what a function (or method) is doing you can employ one of the following strategies.
Guess, based on the function name.
Read the comment at the beginning of the function.
Examine how the function is used.
Read the code in the function body.
Consult external program documentation.
In our case we can safely guess that the function usage will display program usage information and then exit; many command-line programs have a function with the same name and functionality. When you examine a large body of code, you will gradually pick up names and naming conventions for variables and functions. These will help you correctly guess what they do. However, you should always be prepared to revise your initial guesses following new evidence that your code reading will inevitably unravel. In addition, when modifying code based on guesswork, you should plan the process that will verify your initial hypotheses. This process can involve checks by the compiler, the introduction of assertions, or the execution of appropriate test cases.
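One way to verify a guess is to state it as an assertion. The sketch below assumes we have guessed that tab stops are stored in strictly ascending order; that property is our hypothesis, not something the code shown so far confirms. The functions stops_ascending and first_stop_after are hypothetical helpers written for this illustration.

```c
#include <assert.h>

/* Returns 1 when stops[0..n-1] is strictly ascending, 0 otherwise. */
static int
stops_ascending(const int *stops, int n)
{
	int i;

	for (i = 1; i < n; i++)
		if (stops[i] <= stops[i - 1])
			return 0;
	return 1;
}

/*
 * Returns the first stop beyond column, or -1 when there is none.
 * The linear search is only correct if our guess about ordering holds,
 * so the guess is checked at runtime before the search starts.
 */
static int
first_stop_after(const int *stops, int n, int column)
{
	int i;

	assert(stops_ascending(stops, n));	/* our hypothesis */
	for (i = 0; i < n; i++)
		if (stops[i] > column)
			return stops[i];
	return -1;
}
```

If a test run ever trips the assertion, the initial guess was wrong and the reading of the surrounding code has to be revised.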
Figure 2.3 Expanding tab stops (main part).
int
main(int argc, char *argv[])
{
    int c, column;
    int n;

    while ((c = getopt(argc, argv, "t:")) != -1) {
        switch (c) {
        case 't':
            getstops(optarg);
            break;
        case '?':
        default:    <-- a
            usage();
        }
    }
    argc -= optind;
    argv += optind;
    do {
        if (argc > 0) {
            if (freopen(argv[0], "r", stdin) == NULL) {
                perror(argv[0]);
                exit(1);
            }
            argc--, argv++;
        }
        column = 0;
        while ((c = getchar()) != EOF) {
            switch (c) {
            case '\t':    <-- b
                if (nstops == 0) {
                    do {
                        putchar(' ');
                        column++;
                    } while (column & 07);
                    continue;
                }
                if (nstops == 1) {
                    do {
                        putchar(' ');
                        column++;
                    } while (((column - 1) % tabstops[0]) != (tabstops[0] - 1));
                    continue;
                }
                for (n = 0; n < nstops; n++)
                    if (tabstops[n] > column)
                        break;
                if (n == nstops) {
                    putchar(' ');
                    column++;
                    continue;
                }
                while (column < tabstops[n]) {
                    putchar(' ');
                    column++;
                }
                continue;
            case '\b':    <-- c
                if (column)
                    column--;
                putchar('\b');
                continue;
            default:    <-- d
                putchar(c);
                column++;
                continue;
            case '\n':    <-- e
                putchar(c);
                column = 0;
                continue;
            }    <-- f
        }    <-- g
    } while (argc > 0);    <-- h
    exit(0);
}
Variables local to main
Argument processing using getopt
Process the -t option
(a) Switch labels grouped together
End of switch block
At least once
(7) Process remaining arguments
Read characters until EOF
(b) Tab character
Process next character
(c) Backspace
(d) All other characters
(e) Newline
(f) End of switch block
(g) End of while block
(h) End of do block
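The tab-handling branch of Figure 2.3 (annotation b) emits spaces one at a time. As a reading aid, the same arithmetic can be sketched as a function that returns how many spaces a tab produces at a given column. The function spaces_for_tab is a hypothetical restatement written for this illustration and is not part of expand.c; it assumes column is nonnegative and that the stops are positive and in ascending order.

```c
#include <stddef.h>

/*
 * Number of spaces a tab expands to at the given column.
 * stops/nstops play the role of tabstops/nstops in Figure 2.3.
 */
int
spaces_for_tab(int column, const int *stops, int nstops)
{
	int n;

	/* No stops given: advance to the next multiple of 8. */
	if (nstops == 0)
		return 8 - (column & 07);

	/* One stop given: advance to the next multiple of that value. */
	if (nstops == 1)
		return stops[0] - column % stops[0];

	/* Several stops: advance to the first stop beyond the column. */
	for (n = 0; n < nstops; n++)
		if (stops[n] > column)
			return stops[n] - column;

	/* Past the last stop: a tab becomes a single space. */
	return 1;
}
```

For example, with no stops specified, a tab at column 5 yields three spaces, because the do loop in the figure stops as soon as column & 07 becomes zero at column 8.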
The role of getstops is more difficult to understand. There is no comment, the code in the function body is not trivial, and its name can be interpreted in different ways. Noting that it is used in a single part of the program (Figure 2.3:3) can help us further. The program part where getstops is used is the part responsible for processing the program's options (Figure 2.3:2). We can therefore safely (and correctly in our case) assume that getstops will process the tab stop specification option. This form of gradual understanding is common when reading code; understanding one part of the code can make others fall into place. Based on this form of gradual understanding you can employ a strategy for understanding difficult code similar to the one often used to combine the pieces of a jigsaw puzzle: start with the easy parts.
Exercise 2.7 Examine the visibility of functions and variables in programs in your environment. Can it be improved (made more conservative)?
Exercise 2.8 Pick some functions or methods from the book's CD-ROM or from your environment and determine their role using the strategies we outlined. Try to minimize the time you spend on each function or method. Order the strategies by their success.