data storage location
Ashley Pittman
ashley at pittman.co.uk
Thu Sep 11 13:03:30 EDT 2003
> The alternative approach is to keep copies of the data on local disk on
> each node. This gives you good IO rates, but you then have a substantial
> data management problem; how do you copy 100GB to each node in your
> cluster in a sensible amount of time, and how do you update the data and
> make sure it is kept consistent?
>
> The commonest approach to data distribution is to do some sort of
> cascading rsync/rcp which follows the topology of your network.
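The cascading copy described in the quoted text can be sketched as a schedule of rounds in which every node that already holds the data copies it to one node that does not, so the number of holders doubles each round. This is only an illustration: the function name cascade_schedule and the node names are hypothetical, and the sketch assumes any node can reach any other at equal cost, so it does not follow the network topology the way a real cascade would.

```python
def cascade_schedule(nodes):
    """Plan a doubling cascade starting from nodes[0].

    Returns a list of rounds. Each round is a list of
    (source, destination) pairs that can run in parallel.
    Assumption: all links are equivalent (no topology awareness).
    """
    if not nodes:
        return []
    have = [nodes[0]]        # nodes that already hold the data
    need = list(nodes[1:])   # nodes still waiting for it
    rounds = []
    while need:
        # Each holder serves at most one waiting node per round.
        senders = have[:len(need)]
        receivers = need[:len(senders)]
        need = need[len(senders):]
        rounds.append(list(zip(senders, receivers)))
        have.extend(receivers)
    return rounds


if __name__ == "__main__":
    # Hypothetical 8-node cluster; each pair would become one rsync/rcp.
    names = ["node%d" % i for i in range(8)]
    for number, pairs in enumerate(cascade_schedule(names), start=1):
        print("round", number, pairs)
```

With N nodes the schedule has about log2(N) rounds rather than the N-1 sequential copies needed if a single node serves everyone in turn.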
I've often wondered why there isn't some kind of 'mpicp' program to do
just this.
I'd imagine the command line to be something like:
$ mpirun -allcps mpicp node0:~/myfile.dat /tmp/
This would then use MPI_Bcast to send the data to all the nodes. The
assumption here is that MPI_Bcast is fairly efficient anyway so it's
best to use it rather than writing your own cascading rsync algorithm.
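A minimal sketch of what the core of such a program might look like, written in Python against an mpi4py-style communicator. The function name mpicp, the chunk size, and the driver shown in the trailing comments are assumptions made for illustration, not an existing tool. The root rank reads the file in chunks, each chunk is broadcast, and every rank (including the root) writes it to its local destination path. An empty chunk marks end of file.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # assumed value; tune for the interconnect


def mpicp(comm, src_path, dst_path, root=0, chunk_size=CHUNK_SIZE):
    """Copy src_path on `root` to dst_path on every rank via broadcast.

    `comm` is expected to provide Get_rank() and bcast(obj, root=...),
    as an mpi4py communicator does. Returns the number of bytes written
    locally. Assumes dst_path is not the same file as src_path on the
    root, and does no error recovery (a failure on one rank can leave
    the others waiting in the broadcast).
    """
    rank = comm.Get_rank()
    src = open(src_path, "rb") if rank == root else None
    total = 0
    try:
        with open(dst_path, "wb") as dst:
            while True:
                # Only the root has data; other ranks pass a placeholder.
                chunk = src.read(chunk_size) if rank == root else None
                chunk = comm.bcast(chunk, root=root)
                if not chunk:  # empty bytes object signals end of file
                    break
                dst.write(chunk)
                total += len(chunk)
    finally:
        if src is not None:
            src.close()
    return total


# Hypothetical driver, to be launched under mpirun with mpi4py installed:
#   from mpi4py import MPI
#   mpicp(MPI.COMM_WORLD, "/home/me/myfile.dat", "/tmp/myfile.dat")
```

Broadcasting in chunks keeps memory use bounded for a file of 100GB, and leaves the choice of broadcast algorithm to the MPI library, which is the point of the suggestion above.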
Ashley
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org