AW: mulitcast copy or snowball copy
csmith at lnxi.com
Tue Aug 19 13:48:24 EDT 2003
You might want to look into the Clusterworx product from Linux Networx. It
has been used to boot and image clusters over 1100 nodes in size using
multicast, and supports image sizes over 4GB. Multiple images can be served
by a single server using ethernet. Each channel can use 100% of the network
bandwidth (12.5MB per second on Fast Ethernet) or can be throttled to a
specific rate. We typically use a transmission rate of 10MB per second on
Fast Ethernet (30 seconds for a 300MB image), allowing DHCP traffic to get
through. The multicast server can also be throttled to ensure that it
doesn't overdrive the switch or hub (if you are using cheap ones), which in
many cases can account for up to 95% of packet loss. If your switch is fast
and is IGMP enabled, you will generally experience little to no packet loss.
The technology is based on UDP and multicast and works with LinuxBios and
Etherboot, and was used to image the MCR cluster many times prior to its
deployment at LLNL. MCR could go from powered-off bare metal to running in
about 7 minutes (most of which was disk formatting).
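
The rate and timing figures above are plain arithmetic. A minimal sketch in
Python, assuming decimal megabytes (1MB = 10^6 bytes) and ignoring protocol
overhead and retransmissions. The function names are illustrative only and
are not part of Clusterworx:

```python
def line_rate_mb_per_s(link_mbit_per_s: float) -> float:
    """Convert a nominal link speed in Mbit/s to MB/s (8 bits per byte)."""
    return link_mbit_per_s / 8.0


def transfer_seconds(image_mb: float, rate_mb_per_s: float) -> float:
    """Time to stream an image once at a fixed (throttled) rate."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return image_mb / rate_mb_per_s


if __name__ == "__main__":
    # Fast Ethernet is nominally 100 Mbit/s.
    print(line_rate_mb_per_s(100))    # 12.5
    # A 300MB image throttled to 10MB/s.
    print(transfer_seconds(300, 10))  # 30.0
```

Since a multicast stream is sent once for all receivers, this estimate does
not grow with the node count, apart from any retransmissions.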
Principal Software Engineer
Linux Networx (www.lnxi.com)
----- Original Message -----
From: "Guy Coates" <gmpc at sanger.ac.uk>
To: <beowulf at scyld.com>
Sent: Tuesday, August 19, 2003 10:53 AM
Subject: Re:AW: mulitcast copy or snowball copy
> We've tried both multicast and snowball for data distribution on our
> cluster. We have a 60Gig dataset which we have to distribute to 1000
> We started off using snowball copies. They work, but care is needed in
> your choice of tools for the file-transfers. rsync works, but can have
> problems with large (> 2Gig) files if you use rsh as the transport
> mechanism. (this is an rsh bug on some redhat versions rather than an
> rsync bug).
> rsync over ssh gets around that problem, but of course has the added
> encryption overhead.
> You should also avoid the incremental update mode of rsync (which is the
> default). We've found that it will silently corrupt your files if you
> rsync across different architectures (eg alpha-->ia32). It also has
> problems with large files.
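
One way to follow the advice above is rsync's --whole-file (-W) option,
which turns off the incremental (delta-transfer) algorithm, together with
"-e ssh" to pick ssh as the transport. A minimal sketch in Python that only
builds the command line; the host name and paths are made up for
illustration, and whether this avoids the problems described depends on
your rsync version:

```python
import subprocess
from typing import List


def build_rsync_cmd(src: str, dest: str, use_ssh: bool = True) -> List[str]:
    """Build an rsync command that copies whole files (no delta transfer)."""
    cmd = ["rsync", "--archive", "--whole-file"]
    if use_ssh:
        # Use ssh instead of rsh as the remote shell.
        cmd += ["-e", "ssh"]
    cmd += [src, dest]
    return cmd


def run_copy(src: str, dest: str) -> None:
    """Run the copy and raise CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_cmd(src, dest), check=True)


if __name__ == "__main__":
    # Hypothetical source directory and target node.
    print(" ".join(build_rsync_cmd("/data/", "node001:/data/")))
```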
> The only usable multicast code we've found that actually works is udpcast.
> There are plenty of other multicast codes to choose from out on the web,
> and most of them fall over horribly as soon as you cross more than one
> switch or have more than 10-20 hosts.
> We get ~70-80% wirespeed on 100MBit and Gigabit ethernet, and we've used
> it to successfully distribute our 60Gig dataset over large numbers of nodes.
> In practice, on gigabit, we find that disk write speed is the limiting
> factor rather than the network. Lawrence Livermore use udpcast to install
> OS images on the MCR cluster, and I believe they side-step the disk
> performance issue by writing data to a ramdisk as an intermediate step.
> Obviously this only makes sense if your dataset < size of memory.
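
To make the bottleneck argument concrete: the sustained rate is the smaller
of the network rate and the disk write rate, and the ramdisk trick needs the
whole dataset to fit in memory. A rough sketch in Python. The 40MB per
second disk figure, the 4GB of RAM and the 1GB reserve are assumptions
chosen for illustration, not measurements from either cluster:

```python
def effective_rate(network_mb_s: float, disk_write_mb_s: float) -> float:
    """The slower of the two stages limits the sustained transfer rate."""
    return min(network_mb_s, disk_write_mb_s)


def fits_in_ramdisk(dataset_gb: float, ram_gb: float,
                    reserve_gb: float = 1.0) -> bool:
    """True if the dataset fits in RAM with some memory left for the OS."""
    return dataset_gb + reserve_gb <= ram_gb


if __name__ == "__main__":
    network = 125.0 * 0.75   # gigabit line rate (125MB/s) at 75% wirespeed
    disk = 40.0              # assumed disk write speed in MB/s
    rate = effective_rate(network, disk)
    dataset_mb = 60 * 1000   # 60Gig dataset, decimal units
    print("limiting rate (MB/s):", rate)
    print("estimated seconds:", dataset_mb / rate)
    print("60Gig in a 4GB ramdisk?", fits_in_ramdisk(60, 4))
```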
> Our current file distribution strategy is to use a combination of rsync
> and udpcast. We do a dummy rsync to find out what files need updating, tar
> them up, pipe the tarball through udpcast and then untar the files and the
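
The rsync-plus-udpcast scheme described above could be sketched roughly as
follows. This is not the Sanger scripts, only an illustration in Python that
builds the command lines. It assumes rsync's --dry-run option for the "dummy"
pass, tar's -T option to read the file list, and that udp-sender reads from
standard input and udp-receiver writes to standard output when no --file is
given (check this against your udpcast version). The file and directory
names are hypothetical:

```python
import shlex
from typing import List, Tuple


def dry_run_cmd(src: str, dest: str) -> List[str]:
    """rsync command that lists what would change without copying anything."""
    return ["rsync", "--archive", "--dry-run", "--verbose", src, dest]


def pipelines(file_list: str, src_root: str, dest_root: str) -> Tuple[str, str]:
    """Return (sender, receiver) shell pipelines for one multicast transfer.

    file_list is a text file with one changed path per line, relative to
    src_root (for example, extracted from the dry-run output).
    """
    sender = "tar -C {} -cf - -T {} | udp-sender".format(
        shlex.quote(src_root), shlex.quote(file_list))
    receiver = "udp-receiver | tar -C {} -xf -".format(shlex.quote(dest_root))
    return sender, receiver


if __name__ == "__main__":
    print(" ".join(dry_run_cmd("/data/", "node001:/data/")))
    send, recv = pipelines("changed.txt", "/data", "/data")
    print("on the server:    ", send)
    print("on every receiver:", recv)
```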
> The main performance killer we've found for udpcast is cheap switches.
> Guy Coates
> Guy Coates, Informatics System Group
> The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
> Tel: +44 (0)1223 834244 ex 7199
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf