We've tried both multicast and snowball for data distribution on our
cluster. We have a 60Gig dataset which we have to distribute to 1000

We started off using snowball copies. They work, but you need to choose
your file-transfer tools with care. rsync works, but can have
problems with large (> 2Gig) files if you use rsh as the transport
mechanism. (This is an rsh bug on some Red Hat versions rather than an
rsync bug.)

rsync over ssh gets around that problem, but of course has the added
encryption overhead.

You should also avoid the incremental update mode of rsync (which is the
default). We've found that it will silently corrupt your files if you
rsync across different architectures (e.g. alpha to ia32). It also has
problems with large files.
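
As a rough sketch of what avoiding that mode might look like: I'm assuming
here that "incremental update mode" means rsync's delta-transfer algorithm,
which the --whole-file (-W) option turns off. The paths and the host name
node001 are made up for illustration, and the helper only builds the
command line rather than running it.

```python
import shlex


def rsync_whole_file_cmd(src, dest, use_ssh=True):
    """Build an rsync command line that sends whole files, not deltas.

    Assumption: the "incremental update mode" in the text is rsync's
    delta-transfer algorithm, which --whole-file disables.
    """
    cmd = ["rsync", "-a", "--whole-file"]
    if use_ssh:
        # ssh transport avoids the rsh large-file problem, at the cost
        # of encryption overhead.
        cmd += ["-e", "ssh"]
    cmd += [src, dest]
    return cmd


if __name__ == "__main__":
    # Hypothetical source directory and target node.
    print(shlex.join(rsync_whole_file_cmd("/data/set/", "node001:/data/set/")))
```

The command is returned as a list so it can be handed to subprocess.run()
without going through a shell.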

The only multicast code we've found that actually works is udpcast.

There are plenty of other multicast codes to choose from out on the web,
but most of them fall over horribly as soon as you cross more than one
switch or have more than 10-20 hosts.

With udpcast we get ~70-80% of wire speed on 100Mbit and gigabit ethernet,
and we've used it to successfully distribute our 60Gig dataset over large
numbers of nodes

In practice, on gigabit, we find that disk write speed is the limiting
factor rather than the network. Lawrence Livermore use udpcast to install
OS images on the MCR cluster, and I believe they side-step the disk
performance issue by writing data to a ramdisk as an intermediate step.
Obviously this only makes sense if your dataset is smaller than the size
of memory.
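
To put rough numbers on the figures above, here is a back-of-envelope
sketch. It assumes "60Gig" means 60 * 10^9 bytes. The disk write rate and
the RAM reserve fraction are invented placeholders, not measurements from
our cluster.

```python
def transfer_seconds(dataset_bytes, link_bits_per_s, efficiency):
    """Time to push the dataset at a given fraction of wire speed."""
    payload_bytes_per_s = link_bits_per_s * efficiency / 8
    return dataset_bytes / payload_bytes_per_s


def bottleneck_bytes_per_s(network_bytes_per_s, disk_bytes_per_s):
    """The slower of network and disk sets the real throughput."""
    return min(network_bytes_per_s, disk_bytes_per_s)


def fits_in_ramdisk(dataset_bytes, ram_bytes, reserve_fraction=0.25):
    """True if the dataset fits in RAM after keeping some back for the OS.

    reserve_fraction is a hypothetical safety margin.
    """
    return dataset_bytes <= ram_bytes * (1 - reserve_fraction)


if __name__ == "__main__":
    dataset = 60 * 10**9  # assumption: 60Gig = 60 * 10^9 bytes
    for name, bits_per_s in [("100Mbit", 100e6), ("gigabit", 1e9)]:
        for eff in (0.7, 0.8):
            secs = transfer_seconds(dataset, bits_per_s, eff)
            print(f"{name} at {eff:.0%} of wire speed: {secs / 60:.0f} min")

    # Hypothetical disk that sustains 40 MB/s of writes.
    net = 1e9 * 0.75 / 8
    rate = bottleneck_bytes_per_s(net, 40e6)
    print(f"gigabit with a 40 MB/s disk: {dataset / rate / 60:.0f} min")
```

On gigabit the network-only estimate comes out at roughly ten minutes, so
any disk that cannot sustain around 90-100 MB/s of writes becomes the
bottleneck.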

Our current file distribution strategy is to use a combination of rsync
and udpcast. We do a dummy rsync to find out which files need updating, tar
them up, pipe the tarball through udpcast and then untar the files and the

The main performance killer we've found for udpcast is cheap switches.
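
For reference, the rsync-plus-udpcast strategy described above could be
sketched along these lines. This is not our actual script. It assumes the
"dummy rsync" is an rsync --dry-run against one representative node, that
GNU tar is available, and that udp-sender and udp-receiver are used in
their stdin/stdout mode (no --file). All paths, the list file name and the
receiver count are hypothetical.

```python
import shlex


def dry_run_cmd(src, dest):
    """rsync dry run that prints one would-be-transferred name per line."""
    return ["rsync", "-a", "--dry-run", "--out-format=%n", src, dest]


def changed_files(dry_run_output):
    """Keep file paths; drop blank lines and directory entries ('dir/')."""
    return [
        line
        for line in dry_run_output.splitlines()
        if line and not line.endswith("/")
    ]


def sender_pipeline(src_dir, list_file, min_receivers):
    """Shell pipeline for the master: tar the listed files into udp-sender."""
    tar = ["tar", "-cf", "-", "-C", src_dir, "-T", list_file]
    # --min-receivers makes the sender wait until enough nodes have joined.
    udp = ["udp-sender", "--min-receivers", str(min_receivers)]
    return shlex.join(tar) + " | " + shlex.join(udp)


def receiver_pipeline(dest_dir):
    """Shell pipeline for each node: unpack the multicast stream in place."""
    tar = ["tar", "-xf", "-", "-C", dest_dir]
    return "udp-receiver | " + shlex.join(tar)


if __name__ == "__main__":
    # Hypothetical dry-run output, for illustration only.
    sample = "./\nblast/\nblast/nr.00.phr\nblast/nr.00.pin\n"
    print(changed_files(sample))
    print(sender_pipeline("/data/set", "/tmp/changed.txt", 16))
    print(receiver_pipeline("/data/set"))
```

Sending only the changed files as a single tar stream keeps the multicast
session to one transfer, however many files need updating.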


