[Beowulf] Writing a parallel program

Robert G. Brown rgb at phy.duke.edu
Wed Mar 10 12:28:54 EST 2004


On Wed, 10 Mar 2004, roudy wrote:

> Hello everybody,
> I have finished building my beowulf cluster. Now I am writing a parallel
> program using MPICH2. Can someone give me some help?  The program that I
> wrote takes more time to run on several nodes than when it runs on one node.
> If someone could send me a small program that distributes data among nodes,
> has each node process its share, and sends the results back to the master
> node for printing, that would be a real help for me.
> Thanks
> Roud

I can't help you much with MPI but I can help you understand the
problems you might encounter with ANY message passing system or library
in terms of parallel task scaling.

There is a ready-to-run PVM program I just posted in tarball form on my
personal website that will be featured in the May issue of Cluster World
Magazine.  

  http://www.phy.duke.edu/~rgb/General/random_pvm.php

It is designed to give you direct control over the most
important parameters that affect task scaling so that you can learn just
how it works.

The task itself consists of a "master" program and a "slave" program.
The master parses several parameters from the command line:

  -n  number of slaves
  -d  delay (to vary the amount of simulated work per communication)
  -r  number of rands (to vary the number of communications per run and
      the work burden per slave)
  -b  a flag to control whether the slaves send back EACH number as it is
      generated (lots of small messages) or "bundle" all the numbers they
      generate into a single message.  This makes a visible, rather huge
      difference in task scaling, as it should.
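
The flag handling itself is nothing fancy.  If you've never written one, a
bare-bones getopt skeleton like the following is all it takes to parse flags
like these -- just a sketch with made-up variable names for illustration, not
the actual random_pvm source:

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(int argc, char *argv[])
  {
      int  nslaves = 1;       /* -n: number of slaves to spawn            */
      long delay   = 0;       /* -d: simulated work per number (nsec)     */
      long nrands  = 1000;    /* -r: total rands to generate              */
      int  bundle  = 0;       /* -b: bundle results into one message      */
      int  c;

      while ((c = getopt(argc, argv, "n:d:r:b")) != -1) {
          switch (c) {
          case 'n': nslaves = atoi(optarg); break;
          case 'd': delay   = atol(optarg); break;
          case 'r': nrands  = atol(optarg); break;
          case 'b': bundle  = 1;            break;
          default:
              fprintf(stderr, "usage: master -n slaves -d delay -r rands [-b]\n");
              exit(1);
          }
      }
      printf("%d slaves, delay %ld, %ld rands, bundle=%d\n",
             nslaves, delay, nrands, bundle);
      return 0;
  }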

The task itself is trivial -- generating random numbers.  The master
starts by computing a trivial task partitioning among the n nodes.  It
spawns n slave tasks, sending each one the delay on the command line.
It then sends each slave the number of rands to generate and a trivially
unique seed as messages.  Each slave generates a rand, waits delay (in
nanoseconds, with a high-precision polling loop), and either sends it
back as a message immediately (the default) or saves it in a large
vector until the task is finished and sends the whole buffer as a single
message (if the -b flag was set).
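
Since the original question was about MPICH2, here is a rough, untested
sketch of what the same master/slave skeleton might look like in MPI.  This
is NOT random_pvm -- the names, seeds and details are purely illustrative,
and it only does the "bundled" variant.  Rank 0 plays the master, everyone
else is a slave; run it with at least two processes (e.g. mpirun -np 4):

  #include <stdio.h>
  #include <stdlib.h>
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
      int rank, size;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      if (size < 2) {
          if (rank == 0) fprintf(stderr, "need at least 2 ranks\n");
          MPI_Finalize();
          return 1;
      }

      int nrands  = (argc > 1) ? atoi(argv[1]) : 1000;  /* total rands      */
      int nslaves = size - 1;                           /* rank 0 = master  */
      int chunk   = nrands / nslaves;                   /* trivial split    */

      if (rank == 0) {
          /* master: hand each slave its share and a (trivially) unique seed */
          for (int i = 1; i < size; i++) {
              int work[2] = { chunk, 12345 + i };
              MPI_Send(work, 2, MPI_INT, i, 0, MPI_COMM_WORLD);
          }
          /* collect one bundled vector of results from each slave */
          double *buf = malloc(chunk * sizeof(double));
          for (int i = 1; i < size; i++) {
              MPI_Status status;
              MPI_Recv(buf, chunk, MPI_DOUBLE, i, 1, MPI_COMM_WORLD, &status);
              printf("slave %d: first rand = %f\n", i, buf[0]);
          }
          free(buf);
      } else {
          /* slave: receive work description, generate, send the bundle back */
          int work[2];
          MPI_Status status;
          MPI_Recv(work, 2, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
          srand(work[1]);
          double *results = malloc(work[0] * sizeof(double));
          for (int i = 0; i < work[0]; i++)
              results[i] = rand() / (double) RAND_MAX;   /* "the work" */
          MPI_Send(results, work[0], MPI_DOUBLE, 0, 1, MPI_COMM_WORLD);
          free(results);
      }

      MPI_Finalize();
      return 0;
  }

The per-number variant simply moves the MPI_Send inside the generation loop,
which is precisely where the latency problem discussed below comes from.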

This serves two valuable purposes for the novice.

First, it gives you a ready-to-build working master/slave program to
use as a template for pretty much any problem for which the paradigm
is a good fit.

Second, by simply playing with it, you can learn LOTS of things about
parallel programs and clusters.  If delay is small (on the order of the
packet latency, 100 usec or less) the program is in a latency-dominated
scaling regime where the communication per number actually takes longer
than generating the number, and its parallel scaling is lousy (if slowing
a task down relative to serial can be called merely lousy).  If delay is
large, so that it takes a long time to compute and a short time to send
back the results, parallel scaling is excellent, with near-linear
speedup.  Turning on the -b flag for certain ranges of the delay can
"instantly" shift you from the latency-bounded to the bandwidth-bounded
parallel scaling regime and restore decent scaling.
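
If you want a crude back-of-the-envelope model of why, something like the
toy calculation below captures it.  The latency, bandwidth and delay numbers
are pure placeholders -- plug in measurements from your own network, not
mine:

  #include <stdio.h>

  int main(void)
  {
      double N     = 1.0e6;     /* total rands to generate                */
      double n     = 8.0;       /* number of slaves                       */
      double delay = 10.0e-6;   /* simulated work per rand (seconds)      */
      double lat   = 100.0e-6;  /* per-message latency (placeholder)      */
      double bw    = 1.0e7;     /* network bandwidth in bytes/sec         */
      double msg   = 8.0;       /* bytes per rand (one double)            */

      double t_serial = N * delay;
      /* one message per rand: every rand pays a full latency hit */
      double t_permsg = (N / n) * (delay + lat);
      /* one bundled message per slave: one latency hit, then bandwidth */
      double t_bundle = (N / n) * delay + lat + (N / n) * msg / bw;

      printf("serial  %8.3f s\n", t_serial);
      printf("per-msg %8.3f s  (speedup %5.2f)\n", t_permsg, t_serial / t_permsg);
      printf("bundled %8.3f s  (speedup %5.2f)\n", t_bundle, t_serial / t_bundle);
      return 0;
  }

With those made-up numbers, sending one message per rand comes out slower
than running the job serially, while bundling gets you back close to linear
speedup over the slaves -- which is exactly the behavior described above.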

Even if you don't use it directly because it is based on PVM, if you
clone it for MPI you'll learn the same lessons there, as they are
universal and part of the theoretical basis for understanding parallel
scaling.  Eventually I'll do an MPI version myself for the column, but
the mag already HAS an MPI column, and my focus is more on the novice
learning about parallel computing in general.

BTW, obviously I think that subscribing to CWM is a good idea for
novices.  Among its many other virtues (such as articles by lots of the
luminaries of this very list :-), you can read my columns.  In fact, from
what I've seen from the first few issues, ALL the columns are pretty
damn good and getting back issues to the beginning wouldn't hurt, if it
is still possible.

If you (or anybody) DO grab random_pvm and give it a try, please send me
feedback, preferably before the actual column comes out in May, so that
I can fix it before then.  It is moderately well documented in the
tarball, but of course there is more "documentation" and explanation
in the column itself.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


