[Beowulf] copying big files (Henning Fehrmann)
mm at yuhu.biz
Sun Aug 10 09:56:50 EDT 2008
On Sunday 10 August 2008 15:02:52 Scott Atchley wrote:
> On Aug 10, 2008, at 7:57 AM, Scott Atchley wrote:
> > You may want to look at http://loci.cs.utk.edu. If you need to
> > distribute large files within a cluster or across the WAN, you can
> > use the LoRS tools to stripe the file over multiple servers and the
> > clients then try pulling blocks off of each server in parallel.
> > Using Internet2 and one client at Vanderbilt and a couple servers at
> > Univ of Tennessee, they were able to saturate UT's ~400 Mb/s I2 link
> > (much to the disbelief of the Vandy IT staff). I have seen ~5 Gb/s
> > within a cluster using good 10G NICs. :-)
> > Scott
> I forgot to mention LoRS optionally uses MD5 for checksums and AES-128
> for encryption (you can use either, both or neither).
> The stored file is represented by an XML file called an exNode. If you
> want to share the data, you can email the exNode to someone and they
> can then download the data. You control the download offset and length
> so that you can extract just the parts of the file that you want. I
> believe there is a NetCDF version that can use exNodes and there may
> be an HDF5 version as well.
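
To make the striping idea quoted above concrete, here is a minimal Python
sketch. It is not the LoRS API and does not use the exNode format: the
in-memory dicts stand in for storage servers, the "manifest" list is a
hypothetical stand-in for an exNode, and the names stripe(), fetch_block()
and download() are made up for illustration. It only shows the general
technique: blocks spread round-robin over several servers, pulled in
parallel, each verified with MD5, with an optional offset and length so
that only part of the file is fetched.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # bytes; deliberately tiny so the demo is easy to follow


def stripe(data, n_servers, block_size=BLOCK_SIZE):
    """Spread blocks round-robin over n_servers (dicts stand in for servers).

    Returns (servers, manifest). The manifest is a hypothetical stand-in
    for an exNode: it records where each block lives and its MD5 digest.
    """
    servers = [dict() for _ in range(n_servers)]
    manifest = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        block = data[offset:offset + block_size]
        server_id = index % n_servers
        servers[server_id][offset] = block
        manifest.append({
            "offset": offset,
            "length": len(block),
            "server": server_id,
            "md5": hashlib.md5(block).hexdigest(),
        })
    return servers, manifest


def fetch_block(servers, entry):
    """Fetch one block and verify its MD5 digest against the manifest."""
    block = servers[entry["server"]][entry["offset"]]
    if hashlib.md5(block).hexdigest() != entry["md5"]:
        raise ValueError("checksum mismatch at offset %d" % entry["offset"])
    return entry["offset"], block


def download(servers, manifest, offset=0, length=None, workers=4):
    """Fetch bytes [offset, offset + length) by pulling blocks in parallel."""
    total = sum(e["length"] for e in manifest)
    end = total if length is None else min(offset + length, total)
    # Only fetch the blocks that overlap the requested byte range.
    wanted = [e for e in manifest
              if e["offset"] < end and e["offset"] + e["length"] > offset]
    if not wanted:
        return b""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda e: fetch_block(servers, e), wanted))
    parts.sort()  # reassemble in file order
    first = parts[0][0]
    joined = b"".join(block for _, block in parts)
    return joined[offset - first:end - first]


if __name__ == "__main__":
    data = b"0123456789abcdefghij"
    servers, manifest = stripe(data, n_servers=3)
    print(download(servers, manifest))                      # whole file
    print(download(servers, manifest, offset=5, length=6))  # partial read
```

In a real deployment each fetch would be a network request to a different
host, which is where the parallelism pays off; the threads here only show
the shape of the approach.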
I'm new to the list, so I don't know whether this has been discussed before,
but when I need to provision a file to all machines in my cluster I use a
cluster file system such as GlusterFS
(http://www.gluster.org/docs/index.php/GlusterFS) or GFarm
(http://datafarm.apgrid.org/). I started with NFS, but once you have more
than 50-60 machines, the NFS server becomes a bottleneck that every machine
sees. The usual cure for that is an expensive hardware purchase.
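
As a rough back-of-the-envelope illustration of why a single NFS server
becomes the bottleneck, here is a small Python sketch. The numbers (a
1 GB file, a 1 Gb/s server link) are assumptions picked for the example,
not measurements from my cluster, and the model ignores protocol overhead,
disks and caching: it simply assumes the server's link is the only limit
and that every client has to pull the whole file through it.

```python
def copy_time_seconds(file_bytes, link_bits_per_s, n_clients):
    """Time for n_clients to each read the whole file from one server.

    Simplified model (an assumption, not a measurement): all traffic
    goes through the server's single link, so the total number of bits
    on the wire grows linearly with the number of clients.
    """
    total_bits = file_bytes * 8 * n_clients
    return total_bits / link_bits_per_s


if __name__ == "__main__":
    one_gb = 10**9          # assumed file size in bytes
    gigabit = 10**9         # assumed server link speed in bits per second
    for clients in (1, 10, 60):
        seconds = copy_time_seconds(one_gb, gigabit, clients)
        print("%3d clients: %6.0f s" % (clients, seconds))
```

Under these assumptions the copy takes 8 s for one client and 480 s for
60 clients, which is why spreading the file over several servers helps.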