mpich on cluster of SMPs

Nathan Fredrickson 8nrf at
Mon May 26 10:48:03 EDT 2003


I have a cluster of SMP machines running linux-2.4.20 and mpich-1.2.5.  I
configured mpich with device=ch_p4 and comm=shared to allow shared-memory
communication between processes on the same node.  This seemed to be working
fine, but I was not confident that shared memory was actually being used
on-node, so I built and installed a second instance of mpich with
device=ch_shmem.  Using a simple two-process test program that measures how
long it takes to send an integer back and forth 10000 times, I compared the
three setups:

device=ch_p4, comm=shared, off-node: 1.88 seconds
device=ch_p4, comm=shared, on-node: 1.25 seconds
device=ch_shmem: 0.288314 seconds

The ch_p4 device does not seem to be using shared memory when both processes
are on the same node.  Am I misinterpreting what comm=shared is supposed to
do?  Is there additional configuration required to make the ch_p4 device use
shared memory on-node?  I expected on-node performance similar to
device=ch_shmem.  Any insights would be appreciated.


Beowulf mailing list, Beowulf at