Memory-checking debuggers are the greatest thing since sliced bread. Once you start using them, you'll wonder how you programmed without them. In addition to identifying all the "normal" causes of crashing that regular debuggers report (segmentation faults, bus errors, etc.), memory-checking debuggers look for erroneous patterns such as accessing memory outside of an array or the local stack, using heap memory that was already freed, freeing memory that was already freed, using uninitialized variables, and so on. Best of all, they report these problems by file and line number in your source code.
Popular memory-checking debuggers include Valgrind (Linux), bcheck (Solaris, part of the Forte compiler suite), Rational Purify (a commercial product available for several operating systems), and various forms of "malloc debug" (e.g., OS X has native support). Others are also available.
Most memory-checking debuggers are intended to be used non-interactively and cannot be attached to already-running processes. As such, depending on your MPI implementation, they may have to be launched via mpirun. For example, to run Valgrind in parallel:
$ mpirun -np 2 valgrind --num-callers=100 \
    --tool=memcheck --leak-check=yes \
    --show-reachable=yes --log-file=output my_mpi_app
This command will run two copies of Valgrind, which will, in turn, each launch a copy of my_mpi_app. Each Valgrind instance monitors its child process and writes its output to a file named output.pid[PID]. After the application completes, the output files can be examined to see the errors that Valgrind detected.
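With more than a couple of processes, opening every log by hand gets tedious. The following is a minimal sketch of a helper that prints the error count from each per-process log. It assumes the logs share a common prefix ("output" here, matching the --log-file value above) and that each log contains Memcheck's usual "ERROR SUMMARY" line; the function name summarize_valgrind_logs is my own invention, not part of Valgrind. Note also that, as far as I know, newer Valgrind releases no longer append .pid[PID] automatically, so you may need something like --log-file=output.%p to get one file per process.

```shell
#!/bin/sh
# Hypothetical helper (not part of Valgrind): print the error count
# recorded in each per-process Valgrind log that starts with a prefix.
summarize_valgrind_logs() {
    prefix=${1:-output}
    for log in "$prefix"*; do
        # Skip the unexpanded pattern when no log files exist.
        [ -f "$log" ] || continue
        # Assumed line layout: "==PID== ERROR SUMMARY: N errors from ..."
        # so the count is the fourth whitespace-separated field.
        summary=$(awk '
            /ERROR SUMMARY:/ { n = $4; found = 1 }
            END {
                if (found) print n " errors"
                else print "no ERROR SUMMARY line"
            }' "$log")
        echo "$log: $summary"
    done
}
```

Calling summarize_valgrind_logs output from the directory where the job ran prints one line per log file. A log with no summary line is reported as such rather than as zero errors, since that usually means the process died before Valgrind could finish its report.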
Many users ask me if they need to compile their MPI implementation with "-g" to enable them to debug their MPI application.
The answer is no. The application being debugged must be compiled with "-g", however. The MPI implementation itself does not need to be compiled with "-g"; without it, the debugger simply cannot step into MPI functions. Hence, if you try to "step" into MPI_SEND, the debugger won't let you and will likely execute the entire MPI_SEND function call. This result is what most users want anyway - you're attempting to debug your application, not the MPI implementation.
If you compile your MPI implementation with "-g", you'll be able to step into functions such as MPI_SEND, but this may not provide as much useful information as you would think - the internals of an MPI implementation are quite complex.
Note that this sidebar really only applies to MPI implementations that provide their source code; binary-only MPI implementation distributions are most likely compiled without "-g" (and, by definition, the debugger will not be able to find the source code to display).
Note that LAM/MPI should be configured with the "--with-purify" switch to be used with memory-checking debuggers. LAM uses some optimizations that are known to be safe, but that tools such as Valgrind interpret as reading from uninitialized memory. This switch eliminates many of those false positives at the expense of a slight performance loss.
Although memory-checking debuggers cannot catch all errors, they can help find a lot of errors even before you know that they exist (even for serial applications). My own experience has shown that it can be extremely helpful to use memory-checking debuggers frequently during an application's development - even when you are not aware of any current bugs.
Finally, there are debuggers specifically created to operate on parallel MPI applications. Three commercial suites are the Distributed Debugging Tool (DDT) from Allinea, Fx2 from Absoft, and TotalView from Etnus. These packages have a significant advantage over the previous approaches: they natively understand an entire parallel job. Specifically, in addition to all the normal functionality of a debugger (setting breakpoints, examining variables, stepping through code, etc.), you can individually monitor and control all processes in a running MPI job.
For example, you can step through the code in process A (while blocking all other processes) and watch a message being sent. Then you can step through process B and watch the message being received. In this manner, you have complete control over the entire parallel job.
These tools are invaluable for serious parallel application development, but they tend to be somewhat expensive. If you can afford them, parallel debuggers are extremely helpful.
Where to Go From Here?
The moral of this column is: fear not. There really is more to parallel debugging than printf, Virginia. Debugging is a tricky task, but using the proper tools can reduce it to something manageable.
Next column, we'll continue the debugging discussion and describe some common MPI programming errors and how you can use the techniques described here to find them.
Got any MPI questions you want answered? Wondering why one MPI does this and another does that? Send them to the MPI Monkey.
LAM/MPI FAQ (more information on debugging in parallel)
MPI Forum (MPI-1 and MPI-2 specification documents)
MPI - The Complete Reference: Volume 1, The MPI Core (2nd ed) (The MIT Press) By Marc Snir, Steve Otto, Steven Huss-Lederman, David Walker, and Jack Dongarra. ISBN 0-262-69215-5.
MPI - The Complete Reference: Volume 2, The MPI Extensions (The MIT Press) By William Gropp, Steven Huss-Lederman, Andrew Lumsdaine, Ewing Lusk, Bill Nitzberg, William Saphir, and Marc Snir. ISBN 0-262-57123-4.
NCSA MPI tutorial
This article was originally published in ClusterWorld Magazine. It has been updated and formatted for the web. If you want to read more about HPC clusters and Linux, you may wish to visit Linux Magazine.
Jeff Squyres is leading up Cisco's Open MPI efforts as part of the Server Virtualization Business Unit.