Debug with GDB


Guide to Faster, Less Frustrating Debugging


Norman Matloff
University of California at Davis
(530) 752-1953
matloff@cs.ucdavis.edu
©1992-2001, N. Matloff

April 4, 2002

Contents

1  Save Time and Frustration!
2  General Debugging Strategies
    2.1  Confirmation
    2.2  Binary Search
    2.3  What If It Doesn't Even Compile?
3  Use a Debugging Tool and a Good Text Editor!
    3.1  Don't Use printf()/cout As Your Main Debugging Devices
    3.2  Which Debugging Tools?
        3.2.1  The Ubiquitous (on Unix) gdb
        3.2.2  ddd: A Better View of gdb
        3.2.3  A Good Text Editor Can Help a Lot
    3.3  Integrated Development Environments
    3.4  What Do I Use?
4  How to Use gdb
    4.1  Easy to Learn
    4.2  The Basic Strategy
    4.3  The Main gdb Commands
        4.3.1  Invoking/Quitting gdb
        4.3.2  The r (Run) Command
        4.3.3  The l (List) Command
        4.3.4  The b (Breakpoint) and c (Continue) Commands
        4.3.5  The d (Display) and p (Print) Commands
        4.3.6  The printf Command
        4.3.7  The n (Next) and s (Step) Commands
        4.3.8  The bt (Backtrace) Command
        4.3.9  The set Command
    4.4  The call Command
        4.4.1  The define Command
    4.5  Effect of Recompiling the Source Without Exiting gdb
    4.6  A gdb Example
    4.7  Documentation on gdb

http://heather.cs.ucdavis.edu/~matloff/debug.html

It's much easier and more pleasant to use gdb through the ddd interface. For example, to set a breakpoint we just click the mouse, and a little stop sign appears to show that we will stop there.

But I do recommend learning the text-based version first.

http://heather.cs.ucdavis.edu/~matloff/progedit.html. For example, it mentions that you should make good use of undo/redo operations.[1] Consider our binary-search example above, in which we were trying to find an elusive compilation error. The advice given was to delete half the lines in the function and later restore them. If your text editor has undo capabilities, restoring those deleted lines will be easy.

It's very important that you use an editor which allows subwindows. This enables you to, for instance, look at the definition of a function in one window while viewing a call to the function in another window.

Often one uses other tools in conjunction with a debugger. For example, the vim editor (an enhanced version of vi) can interface with gdb; see my vim Web page: http://heather.cs.ucdavis.edu/~matloff/vim.html. You can initiate a compile from within vim, and if there are compilation errors, vim will take you to the lines at which they occur.

http://heather.cs.ucdavis.edu/~matloff/codecrusader.html

It used to be public-domain, but unfortunately Code Crusader is about to become a commercial product.

One big drawback, from the point of view of many people, is that one cannot use one's own text editor in most IDEs. Many people would like to use their own editor for everything: programming, composing e-mail, word processing and so on. This is convenient, and it also allows them to develop personal alias/macro libraries which save them work.

A big advantage of Code Crusader is that it allows you to use your own text editor. As far as I can tell, this works best with emacs.

2

So, gdb has pinpointed the exact source of our error: the value of J is way too large on this line. Now we have to determine why J was so big. Let's take a look at the entire function, using gdb's l command:

g196  (gdb) l CheckPrime.c:12
 g53     /* the plan:  see if J divides K, for all values J which are
 g54
 g55           (a) themselves prime (no need to try J if it is nonprime), and
 g56           (b) less than or equal to sqrt(K) (if K has a divisor larger
 g57               than sqrt(K), it must also have a smaller one,
 g58               so no need to check for larger ones) */
 g59        
g200  19         J = 2;
g201  20         while (1)  {
g202  21            if (Prime[J] == 1)
g203  22               if (K % J == 0)  {
g204  23                  Prime[K] = 0;
g205  24                  return;
g206  25               }

Look at the comments in Lines g56-g58. We were supposed to stop searching after J got to sqrt(K). Yet you can see in Lines g201-g206 that we never made this check, so J just kept growing and growing, eventually reaching the value 4024, which triggered the seg fault.

After fixing this problem, the new CheckPrime.c looks like this:

g214  mole.matloff% cat CheckPrime.c
g215  
g216  
g217  #include "Defs.h"
g218  #include "Externs.h"
g219  
g220  
g221  CheckPrime(K)
g222     int K;
g223  
g224  {  int J;
g225  
g226     /* the plan:  see if J divides K, for all values J which are
g227        
g228           (a) themselves prime (no need to try J if it is nonprime), and
g229           (b) less than or equal to sqrt(K) (if K has a divisor larger
g230               than this square root, it must also have a smaller one,
g231               so no need to check for larger ones) */
g232     
g233     for (J = 2; J*J <= K; J++)  
g234        if (Prime[J] == 1)
g235           if (K % J == 0)  {
g236              Prime[K] = 0;
g237              return;
g238           }
g239     
g240     /* if we get here, then there were no divisors of K, so it is
g241        prime */
g242     Prime[K] = 1; 
g243  }
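
The corrected loop can also be tried outside the original program. The following is a minimal, self-contained sketch in ANSI C rather than the K&R-style code shown above; the names MAX_PRIMES, prime_flags, check_prime and mark_primes are hypothetical, and the table size is an assumption because Defs.h is not shown. Unlike Main.c, it checks every number from 2 upward instead of only the odd ones. It also adds an assert() on the index, since the compiler will not check array bounds for us.

```c
#include <assert.h>

#define MAX_PRIMES 50          /* assumed table size; Defs.h is not shown */

int prime_flags[MAX_PRIMES];   /* prime_flags[i] is 1 if i is prime, else 0 */

/* Marks prime_flags[k] using trial division by the primes j with j*j <= k.
   Assumes prime_flags[] is already correct for every index from 2 to k-1. */
void check_prime(int k)
{
    int j;

    /* C does not check array bounds, so confirm the index ourselves. */
    assert(k >= 2 && k < MAX_PRIMES);

    for (j = 2; j * j <= k; j++) {
        if (prime_flags[j] == 1 && k % j == 0) {
            prime_flags[k] = 0;    /* found a prime divisor: k is composite */
            return;
        }
    }
    prime_flags[k] = 1;            /* no divisor found: k is prime */
}

/* Fills prime_flags[2..upper] in increasing order and
   returns the number of primes found. */
int mark_primes(int upper)
{
    int n, count = 0;

    assert(upper < MAX_PRIMES);
    for (n = 2; n <= upper; n++) {
        check_prime(n);
        count += prime_flags[n];
    }
    return count;
}
```

For instance, mark_primes(20) returns 8, the number of primes up to 20 (counting 2). Because the loop stops once J*J exceeds K, the index J can never run past the end of the table the way it did in the original code.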

OK, let's give it another try:

g248  mole.matloff% !F
g249  FindPrimes
g250  enter upper bound
g251  20
g252  mole.matloff% 

What?! No primes reported up to the number 20? That's not right. Let's use gdb to step through the program. We will pause at the beginning of main(), and take a look around. To do that, we set up a "breakpoint," i.e. a place where gdb will suspend execution of our program, so that we can assess the situation before resuming execution:

g261  (gdb) b main
g262  Breakpoint 1 at 0x22b4: file Main.c, line 24.

So, gdb will pause execution of our program whenever it hits Line 24 of the file Main.c. This is Breakpoint 1; we might (and will) set other breakpoints later, so we need numbers to distinguish them, e.g. in order to specify which one we want to cancel.

Now let's run the program, using the r command:

g286  (gdb) r
g287  Starting program: /tmp_mnt/lion/d/guest/matloff/tmp/FindPrimes 
g288  
g289  Breakpoint 1, main () at Main.c:24
g290  24         printf("enter upper bound\n");

We see that, as planned, gdb did stop at the first line of main() (Line 24). Now we will execute the program one line at a time, using gdb's n ("next") command:

g291  (gdb) n
g292  enter upper bound
g293  25         scanf("%d",&UpperBound);

What happened was that gdb executed Line 24 of Main.c as requested (the message from the call to printf() appears on Line g292) and has now paused again at Line 25 of Main.c, displaying that line for us (Line g293 of the script file).

OK, let's execute Line 25, by typing n again:

 

g294  (gdb) n
g295  20
g296  27         Prime[2] = 1;

Since Line 25 was a scanf() call, gdb waited at Line g295 of the script file for our input, which we typed as 20. Gdb then executed the scanf() call and paused again, now at Line 27 of Main.c (Line g296 of the script file).

Now let's check to make sure that UpperBound was read in correctly. We think it was, but remember, the basic principle of debugging is to check anyway. To do this, we will use gdb's p (``print'') command:

g297  (gdb) p UpperBound
g298  $1 = 20

OK, that's fine. So, let's continue to execute the program one line at a time, by using n:

g299  (gdb) n
g300  29         for (N = 3; N <= UpperBound; N += 2)

Also, let's keep track of the value of N, using gdb's disp ("display") command. It is just like p, except that disp prints the value of the variable each time the program pauses, whereas p prints the value only once.

g301  (gdb) disp N
g302  1: N = 0
g303  (gdb) n
g304  30            CheckPrime();
g305  1: N = 3
g306  (gdb) n
g307  29         for (N = 3; N <= UpperBound; N += 2)
g308  1: N = 3

Hey, what's going on here? After executing Line 30, the program went back to Line 29, skipping Line 31. Here is what the loop looked like:

29   for (N = 3; N <= UpperBound; N += 2)
30      CheckPrime();
31      if (Prime[N]) printf("%d is a prime\n",N);

Oops! We forgot the braces. Thus only Line 30, not Lines 30 and 31, forms the body of the loop. No wonder Line 31 wasn't executed.

After fixing that, Main.c looks like this:

g314  mole.matloff% cat Main.c
g315  
g316  
g317  /* prime-number finding program
g318  
g319     will (after bugs are fixed) report a list of all primes which are
g320     less than or equal to the user-supplied upper bound
g321  
g322     riddled with errors! */
g323  
g324  
g325  
g326  #include "Defs.h"
g327  
g328  
g329  int Prime[MaxPrimes],  /* Prime[I] will be 1 if I is prime, 0 
                                otherwise */
g330      UpperBound;  /* we will check all number up through this one for
g331                      primeness */
g332  
g333  
g334  main()
g335  
g336  {  int N;
g337  
g338     printf("enter upper bound\n");
g339     scanf("%d",&UpperBound);
g340  
g341     Prime[2] = 1;
g342  
g343     for (N = 3; N <= UpperBound; N += 2)  {
g344        CheckPrime();
g345        if (Prime[N]) printf("%d is a prime\n",N);
g346     }
g347  }
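
The effect of the missing braces is easy to reproduce in isolation. In this small sketch (the function names are made up for illustration), both functions contain the same two statements under the same for heading; only the braces differ. Recent versions of gcc can warn about code like the first function through the -Wmisleading-indentation option.

```c
/* Without braces, only the first statement is the loop body. */
void loop_without_braces(int upper, int *first_runs, int *second_runs)
{
    int n;

    *first_runs = 0;
    *second_runs = 0;
    for (n = 3; n <= upper; n += 2)
        (*first_runs)++;     /* the entire loop body */
        (*second_runs)++;    /* indented like the body, but runs once, after the loop */
}

/* With braces, both statements run on every iteration. */
void loop_with_braces(int upper, int *first_runs, int *second_runs)
{
    int n;

    *first_runs = 0;
    *second_runs = 0;
    for (n = 3; n <= upper; n += 2) {
        (*first_runs)++;
        (*second_runs)++;
    }
}
```

With an upper bound of 20, the loop runs nine times (N = 3, 5, ..., 19). The version without braces executes its second statement only once, after the loop has finished, which matches what the n command showed in the session above.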

OK, try again:

g352  mole.matloff% !F
g353  FindPrimes
g354  enter upper bound
g355  20
g356  mole.matloff% 

Still no output! Well, we will now need to try a more detailed line-by-line execution of the program. Last time, we did not go through the function CheckPrime() line-by-line, so we will need to now:

g586  (gdb) l Main.c:1
g587  1       
g588  2       
g589  3       /* prime-number finding program
g590  4       
g591  5          will (after bugs are fixed) report a list of all primes which 
g592  (gdb) 
g593  6          are less than or equal to the user-supplied upper bound
g594  7       
g595  8          riddled with errors! */
g596  9       
g597  10      
g598  11      
g599  12      #include "Defs.h"
g600  13      
g601  14      
g602  15      int Prime[MaxPrimes],  /* Prime[I] will be 1 if I is prime, 0 
g603  (gdb) 
g604  16          UpperBound;  /* we will check all number up through this one
g605  17                          primeness */
g606  18      
g607  19      
g608  20      main()
g609  21      
g610  22      {  int N;
g611  23      
g612  24         printf("enter upper bound\n");
g613  25         scanf("%d",&UpperBound);
g614  (gdb) 
g615  26      
g616  27         Prime[2] = 1;
g617  28      
g618  29         for (N = 3; N <= UpperBound; N += 2)  {
g619  30            CheckPrime();
g620  31            if (Prime[N]) printf("%d is a prime\n",N);
g621  32         }
g622  33      }
g623  34       
g624  (gdb) b 30
g625  Breakpoint 1 at 0x2308: file Main.c, line 30.

Here we have placed a breakpoint at the call to CheckPrime.[3]

Now, let's run the program:

g626  (gdb) r
g627  Starting program: /tmp_mnt/lion/d/guest/matloff/tmp/FindPrimes 
g628  enter upper bound
g629  20
g630  
g631  Breakpoint 1, main () at Main.c:30
g632  30            CheckPrime();

Gdb has stopped at Line 30, as we requested. Now, instead of using the n command, we will use s ("step"). This command is the same as n, except that it enters the function being called rather than stepping over it as n does:

g633  (gdb) s
g634  CheckPrime (K=1) at CheckPrime.c:19
g635  19         for (J = 2; J*J <= K; J++)  

Sure enough, s has gotten us to the first line within CheckPrime().

Another service gdb provides for us is to tell us what the values of the parameters of the function are, in this case K = 1. But that doesn't sound right: we shouldn't be checking the number 1 for primeness. So gdb has uncovered another bug for us.

In fact, our plan was to check the numbers 3 through UpperBound for primeness. The for loop in main() had the following heading:

for (N = 3; N <= UpperBound; N += 2)

Well, what about the call to CheckPrime()? Here is the whole loop from main():

29         for (N = 3; N <= UpperBound; N += 2)  {
30            CheckPrime();
31            if (Prime[N]) printf("%d is a prime\n",N);
32         }

Look at Line 30: we forgot the parameter! This line should have been

30            CheckPrime(N);

After fixing this, try running the program again:

g699  mole.matloff% !F
g700  FindPrimes
g701  enter upper bound
g702  20
g703  3 is a prime
g704  5 is a prime
g705  7 is a prime
g706  11 is a prime
g707  13 is a prime
g708  17 is a prime
g709  19 is a prime

OK, the program now seems to be working.
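
A bug like the missing argument can also be caught before the program ever runs. The sketch below assumes an ANSI C compiler and uses a function prototype, which the K&R-style code in this example does not have; is_prime and count_odd_primes are hypothetical names, not part of the original program. With the prototype in scope, a call with no argument is rejected at compile time, instead of leaving the parameter with an unpredictable value (1 in the session above).

```c
/* Prototype: tells the compiler that is_prime takes exactly one int. */
int is_prime(int k);

/* Counts the odd primes in 3..upper, mirroring the loop in main(). */
int count_odd_primes(int upper)
{
    int n, count = 0;

    for (n = 3; n <= upper; n += 2) {
        /* Writing is_prime() here, with no argument, would be a
           compile-time error because of the prototype above. */
        if (is_prime(n))
            count++;
    }
    return count;
}

/* Trial division by every j with j*j <= k; returns 1 if k is prime, else 0. */
int is_prime(int k)
{
    int j;

    if (k < 2)
        return 0;
    for (j = 2; j * j <= k; j++)
        if (k % j == 0)
            return 0;
    return 1;
}
```

Here count_odd_primes(20) returns 7, in agreement with the seven primes printed by the corrected program.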

[1] That in turn means that you should make sure to use a good editor which has these operations.

[2] Note very, very carefully, though: Most C compilers, including the Unix one we are using here, do not generate checks for violations of array bounds. In fact, a "moderate" violation, e.g. trying to access Prime[57], would not have produced a seg fault. The reason that our attempt to access Prime[4024] did produce a seg fault is that it resulted in our trying to access memory which did not belong to our program, i.e. an address outside the regions the operating system had allocated to it. The virtual-memory hardware of the machine detected this.

By the way, the '$1' here just means that we can refer to 4024 by this name from now on if we like.

[3] We could have simply typed "b CheckPrime", which would have set a breakpoint at the first line of CheckPrime(), but doing it this way gives us a chance to see how the s command works. By the way, note Lines g592, g603 and g614; here we simply typed a carriage return, which results in gdb listing some further lines for us.



