data storage location

Ashley Pittman ashley at
Thu Sep 11 13:03:30 EDT 2003

> The alternative approach is to keep copies of the data on local disk on
> each node. This gives you good IO rates, but you then have a substantial
> data management problem; how do you copy 100GB to each node in your
> cluster in a sensible amount of time, and how do you update the data and
> make sure it is kept consistent?
> The commonest approach to data distribution is to do some sort of
> cascading rsync/rcp which follows the topology of your network.

I've often wondered why there isn't some kind of 'mpicp' program to do
just this.

I'd imagine the command line to be something like

$ mpirun -allcps mpicp node0:~/myfile.dat /tmp/

This would then use MPI_Bcast to send the data to all the nodes.  The
assumption here is that MPI_Bcast is fairly efficient anyway so it's
best to use it rather than writing your own cascading rsync algorithm.
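
To put a rough number on "fairly efficient": one common way for a
broadcast to fan out is a binomial tree, where every node that already
holds the data forwards it to one new node per round, so N nodes are
covered in about log2(N) rounds. The sketch below only simulates that
schedule in plain Python; it does not call MPI. The function name
binomial_broadcast_schedule is made up for illustration, and real MPI
libraries pick among several broadcast algorithms depending on message
size and node count, so treat this as a model of the simplest case
rather than a description of any particular MPI_Bcast.

```python
def binomial_broadcast_schedule(num_nodes, root=0):
    """Simulate a binomial-tree broadcast (hypothetical helper, not an MPI call).

    Returns a list of rounds; each round is a list of (sender, receiver)
    node numbers. In every round, each node that already holds the data
    sends it to exactly one node that does not.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")

    rounds = []
    have = 1  # nodes 0..have-1 (numbered relative to root) hold the data
    while have < num_nodes:
        pairs = []
        for rel in range(have):
            dst = rel + have
            if dst < num_nodes:
                # Shift relative numbers back to real node numbers.
                sender = (rel + root) % num_nodes
                receiver = (dst + root) % num_nodes
                pairs.append((sender, receiver))
        rounds.append(pairs)
        have *= 2  # the set of nodes holding the data doubles each round
    return rounds


if __name__ == "__main__":
    for number, pairs in enumerate(binomial_broadcast_schedule(8), start=1):
        print("round", number, pairs)
```

With 8 nodes the data reaches everyone in 3 rounds, and with 100 nodes
in 7, whereas copying from node0 to each node in turn would take 99
sequential transfers.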


Beowulf mailing list, Beowulf at