MPI: The Top Ten Mistakes to Avoid (Part 1)
Written by Jeff Squyres   
Friday, 03 February 2006

6: Blaming MPI for Programmer Errors

A natural tendency when an application breaks is to blame the MPI implementation, particularly when your application "works" with one MPI implementation and (for example) seg faults with another. No MPI implementation is perfect, but most go through heavy testing before release. It is quite possible (and likely) that your application has a latent bug that is simply not tripped on some architectures or MPI implementations.

This sounds arrogant (especially coming from an MPI implementer), but the vast majority of "bug reports" that we receive are actually due to errors in the user's application (and sometimes they are very subtle errors). For example, some compilers initialize variables to default values (such as zero); others do not. If your code accidentally depends on a variable having a default value, it may work fine on some platforms and compilers, yet fail on others.

Before submitting a bug report to the maintainers, double and triple check your application. Use a memory-checking debugger, such as the Linux Valgrind package, the Solaris bcheck command-line checker, or the Purify system. All of these debuggers will report on the memory usage in your application, including buffer overflows, reading from uninitialized memory, and so on. You'd be surprised what will turn up in your application.

Where to Go From Here?

So what did we learn here?

  1. Ensure your environment is set up correctly. You only need to do this once.
  2. Always check non-blocking communication for completion. Don't leak resources.
  3. Avoid MPI_PROBE and MPI_IPROBE; they're evil.
  4. Ensure that you are using the right compilers.
  5. Don't blame MPI for your errors. Use memory-checking debuggers.

If anything, realize that you are not alone if you run into MPI problems. The problems discussed in this column are all relatively easy to fix. So even if you can't get your MPI application to run, don't despair. The solution is probably just a few Google searches or a system administrator away.

Stay tuned: in the next column, we'll continue the list with my Top 5, All Time Favorite Evils to Avoid in Parallel.

Resources
MPI Forum (MPI-1 and MPI-2 specification documents) http://www.mpi-forum.org/
MPI - The Complete Reference: Volume 1, The MPI Core (2nd ed) (The MIT Press) By Marc Snir, Steve Otto, Steven Huss-Lederman, David Walker, and Jack Dongarra. ISBN 0-262-69215-5
MPI - The Complete Reference: Volume 2, The MPI Extensions (The MIT Press) By William Gropp, Steven Huss-Lederman, Andrew Lumsdaine, Ewing Lusk, Bill Nitzberg, William Saphir, and Marc Snir. ISBN 0-262-57123-4.
NCSA MPI tutorial http://webct.ncsa.uiuc.edu:8900/public/MPI/
The Tao of Programming By Geoffrey James. ISBN 0931137071
Valgrind http://www.valgrind.org/

This article was originally published in ClusterWorld Magazine. It has been updated and formatted for the web. If you want to read more about HPC clusters and Linux, you may wish to visit Linux Magazine.

Jeff Squyres is the Assistant Director for High Performance Computing for the Open Systems Laboratory at Indiana University and is one of the lead technical architects of the Open MPI project.

Last Updated ( Monday, 27 March 2006 )