All production-quality MPI implementations can handle simultaneous
progress of multiple requests, even those that do not allow true
asynchronous progress. Hence, even if polling (via TEST
operations) is required, non-blocking communication programming models
can still represent a large performance gain as compared to standard /
blocking mode communication.
Sidebar: Pre-Posting Receives
Just because an operation is non-blocking does not mean that it is
somehow automatically more efficient than if it were blocking. Indeed,
many of the same best practices that apply to blocking communication
also apply to non-blocking communication. One such best practice is
judiciously pre-posting non-blocking receives. This method potentially
helps an MPI implementation reduce the use of temporary buffers.
For example, if a message arrives at an MPI process before the
application has posted a corresponding receive, the message is
unexpected, and the MPI implementation may have to allocate a
temporary buffer to hold it. When a matching receive is eventually
posted, the MPI implementation copies the message from the temporary
buffer into the destination buffer.
However, if a non-blocking receive is posted before the message is
received, once the message arrives, it is expected and can be received
directly into the target buffer. No temporary buffer needs to be
allocated and no extra memory copy is necessary. Hence, pre-posting
receives can increase the efficiency of an MPI application.
MPI-2: Multiple Types of Requests
MPI-2 defines two new types of operations that can be started and
completed using MPI_Request handles: parallel I/O and
user-mode "generalized" requests. Although those operations are the
topics for future columns, suffice it to say that both of them follow
the same general model as non-blocking point-to-point communication:
actions are started with calls to MPI functions that generate requests
and are completed with calls to TEST or WAIT
operations.
A subtle implication is that the array-based TEST
and WAIT functions can accept multiple MPI_Request
handles regardless of the type of pending operation that they
represent. Hence, it is possible to create an array of requests that
encompasses both point-to-point communication and I/O operations, and
have MPI_WAITALL wait for the completion of all of them.
Sidebar: ROMIO: A Popular MPI-2 I/O Implementation
ROMIO, from Argonne National Laboratory, is a popular implementation
of many of the MPI-2 I/O function calls
(e.g., MPI_FILE_OPEN and MPI_FILE_READ). ROMIO's
implementation is layered on top of MPI-1 point-to-point
communication; it is specifically designed to be used as an add-on to
existing MPI implementations (such as LAM/MPI, LA-MPI, FT-MPI, and
MPICH, to name a few). This layering creates problems because ROMIO
cannot re-define the underlying type MPI_Request since it has
already been defined by the underlying MPI implementation. Moreover,
the back-end of MPI_Request is different in every MPI
implementation; ROMIO can't extend it in a portable way.
ROMIO's solution was to create a new
type: MPIO_Request. All MPI_FILE* functions that are
supposed to take an MPI_Request as a parameter instead take
an MPIO_Request. This situation means that ROMIO technically
does not conform to the MPI-2 standard, but this detail is usually
overlooked for the sake of portability and functionality.
There is a notable side effect, however. Since MPI_TEST
and MPI_WAIT (and their variants) take MPI_Request
arguments, they cannot accept ROMIO MPIO_Requests. Hence,
ROMIO implements its own MPIO_TEST and MPIO_WAIT
functions. As such, MPI implementations that use ROMIO generally do
not support invoking the various TEST and WAIT
functions with arrays of point-to-point and I/O requests.
Where to Go From Here?
Non-blocking communications, when used properly, can provide a
tremendous performance boost to parallel applications. In addition to
allowing the MPI implementation to perform at least some form of
asynchronous progress (particularly when used with communication
co-processor-based networks), they allow the MPI implementation to
progress multiple communication operations simultaneously.
Got any MPI questions you want answered? Wondering why one MPI
does this and another does that? Send
them to the MPI Monkey.
Resources
ROMIO: A High-Performance, Portable MPI-IO Implementation
http://www.mcs.anl.gov/romio

MPI Forum (MPI-1 and MPI-2 specifications documents)
http://www.mpi-forum.org

MPI - The Complete Reference: Volume 1, The MPI Core (2nd ed) (The MIT Press)
By Marc Snir, Steve Otto, Steven Huss-Lederman, David Walker, and
Jack Dongarra. ISBN 0-262-69215-5.

MPI - The Complete Reference: Volume 2, The MPI Extensions (The MIT Press)
By William Gropp, Steven Huss-Lederman, Andrew Lumsdaine, Ewing
Lusk, Bill Nitzberg, William Saphir, and Marc Snir. ISBN
0-262-57123-4.

NCSA MPI tutorial
http://webct.ncsa.uiuc.edu:8900/public/MPI/

This article was originally published in ClusterWorld Magazine. It
has been updated and formatted for the web. If you want to read more
about HPC clusters and Linux, you may wish to visit
Linux Magazine.
Jeff Squyres is the Assistant Director for High Performance Computing
for the Open Systems Laboratory at Indiana University and is one of
the lead technical architects of the Open MPI project.