mpich on cluster of SMPs

Nathan Fredrickson 8nrf at qlink.queensu.ca
Mon May 26 10:48:03 EDT 2003


Hi,

I have a cluster of SMP machines running linux-2.4.20 and mpich-1.2.5.  I
configured mpich with device=ch_p4 and comm=shared to allow shared-memory
communication between processes on the same node.  This seemed to be working
fine, but I was not confident that shared memory was actually being used
on-node, so I built and installed a second instance of mpich with
device=ch_shmem.  Using a simple two-process test program that measures how
long it takes to send an integer back and forth 10000 times, I compared the
three setups:

device=ch_p4, comm=shared, off-node: 1.88 seconds
device=ch_p4, comm=shared, on-node: 1.25 seconds
device=ch_shmem: 0.288314 seconds
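
To make the comparison easier to read, the totals can be divided by the
10000 round trips to get an average time per round trip.  The snippet below
is only an illustrative sketch of that arithmetic, using the three timings
listed above; the names ROUND_TRIPS, timings and per_round_trip_us are
hypothetical and are not part of mpich or of the original test program.

```python
# Timings reported above, in seconds, each for 10000 round trips.
# All names here are hypothetical, chosen only for this illustration.
ROUND_TRIPS = 10000

timings = {
    "ch_p4, comm=shared, off-node": 1.88,
    "ch_p4, comm=shared, on-node": 1.25,
    "ch_shmem": 0.288314,
}


def per_round_trip_us(total_seconds, round_trips=ROUND_TRIPS):
    """Return the average time for one round trip, in microseconds."""
    return total_seconds / round_trips * 1e6


for setup, seconds in timings.items():
    print(f"{setup}: {per_round_trip_us(seconds):.1f} us per round trip")

# Ratio of the on-node ch_p4 time to the ch_shmem time.
slowdown = timings["ch_p4, comm=shared, on-node"] / timings["ch_shmem"]
print(f"on-node ch_p4 takes {slowdown:.1f}x as long as ch_shmem")
```

This works out to roughly 188, 125 and 29 microseconds per round trip, so
the on-node ch_p4 run takes a little over four times as long as ch_shmem.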

The ch_p4 device does not seem to be using shared memory when both processes
are on the same node: the on-node time is only about a third lower than the
off-node time, and more than four times the ch_shmem time.  Am I
misinterpreting what comm=shared is supposed to do?  Is there additional
configuration required to make the ch_p4 device use shared memory on-node?
I expected on-node performance similar to device=ch_shmem.  Any insights
would be appreciated.

Thanks,
Nathan



