[Beowulf] Torrents for HPC

Bill Broadley bill at cse.ucdavis.edu
Fri Jun 8 20:06:19 EDT 2012

I've built Myrinet, SDR, DDR, and QDR clusters (no FDR yet), but I 
still have users whose use cases and budgets only justify GigE.

I've set up a 160TB Hadoop cluster that is working well, but haven't 
found justification for the complexity/cost of Lustre.  I have high 
hopes for Ceph, but it seems not quite ready yet.  I'd be happy to hear 
about others' experiences.

A new user on one of my GigE clusters submits batches of 500 jobs that 
need to randomly read a 30-60GB dataset.  They aren't the only user of 
the cluster, so each job waits in the queue alongside a mix of other 
users' jobs.

As you might imagine, that hammers a central GigE-connected NFS server 
pretty hard.  This cluster has 38 compute nodes / 304 cores / 608 threads.
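
Back of the envelope: a single GigE link tops out around 120 MB/sec, so 
with all 38 nodes reading at once each node's share of the NFS server 
is on the order of 3 MB/sec.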

I thought BitTorrent might be a good way to publish such a dataset to 
the compute nodes (thus avoiding the GigE bottleneck).  So I wrote a 
small/simple BitTorrent client, made a 16GB example dataset, and 
measured the performance pushing it to 38 compute nodes.

The slow ramp up is partially because I'm launching torrent clients 
with a crude serial loop:

    for i in $compute_nodes; do ssh "$i" launch_torrent.sh; done

I get approximately 2.5GB/sec sustained aggregate when writing to 38 
compute nodes.  So 38 nodes * 16GB = 608GB to distribute @ 2.5GB/sec = 
240 seconds or so.

The clients definitely see MUCH faster performance when accessing a 
local copy instead of a small share of the performance/bandwidth of a 
central file server.

Do you think it's worth bundling up for others to use?

This is how it works (a rough sketch of the wrappers follows the list):
1) User runs publish <directory> <name> before they start submitting
2) The publish command makes a torrent of that directory and starts
    seeding that torrent.
3) The user submits an arbitrary number of jobs that need that
    directory.  Inside the job they run "$ subscribe <name>"
4) The subscribe command launches one torrent client per node (not per
    job) and blocks until the directory is completely downloaded
5) /scratch/<user>/<name> has the user's data
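
Very roughly, the two wrappers could be sketched as below.  This 
assumes mktorrent for building the torrent, a tracker reachable from 
the nodes (headnode:6969 is a placeholder), a shared /torrents 
directory, and seed_torrent/fetch_torrent as stand-ins for the small 
client mentioned above; none of these names are the actual 
implementation.

    #!/bin/bash
    # publish <directory> <name>: build a torrent for the directory and
    # start seeding it from the submit host.
    publish() {
        local dir=$1 name=$2
        mktorrent -a http://headnode:6969/announce \
                  -o "/torrents/${name}.torrent" "$dir"
        seed_torrent "/torrents/${name}.torrent" "$dir" &   # stand-in client
    }

    # subscribe <name>: run inside a job; starts at most one torrent
    # client per node and blocks until the data is on local scratch.
    subscribe() {
        local name=$1 dest="/scratch/${USER}/${name}"
        mkdir -p "$dest"
        # flock lets only the first job on a node start a download;
        # the rest fall through and wait for the completion marker.
        (
            flock -n 9 &&
            fetch_torrent "/torrents/${name}.torrent" "$dest" &&   # stand-in client
            touch "${dest}/.complete"
        ) 9> "${dest}/.lock"
        while [ ! -e "${dest}/.complete" ]; do sleep 5; done
    }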

Not nearly as convenient as having a fast parallel filesystem, but it 
seems potentially useful for those who have large read-only datasets, 
GigE, and no budget for anything fancier.

