[Beowulf] copying big files (Henning Fehrmann)

David Mathog mathog at caltech.edu
Mon Aug 18 11:38:09 EDT 2008

Henning Fehrmann wrote:

> I spread successfully a 10G file to 50 nodes. The rate was 140Mb/s for
> nettee and a bit slower using dolly.
> I guess it was due to a busy node somewhere in the chain.  
> Increasing the number of clients up to 100 failed in both cases.
> For nettee I got:
> nettee: fatal error writing to child: Connection reset by peer

> I will do more systematic tests over the next few days.
> David Mathog, are you interested in bug reports?

Yes, please. 

If memory serves, you will see that error whenever a child node, or
nettee on that child, crashes.  For instance, if you "kill -9" nettee on
a child, the parent should see it.  The command option -colwf will let
the chain continue if the failure is caused by a full disk or a failing
stdout pipe.  The option -conwf should let the transfer continue down
to the node just above the failed one, and it should tell you which node
failed, so long as -v is used with the appropriate bits.
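
To illustrate the failure mode described above, here is a minimal Python
sketch (not nettee code) of a "parent" writing to a "child" over loopback
TCP after the child has gone away abruptly.  The function name
write_until_peer_failure is hypothetical, and the SO_LINGER setting is only
an assumed stand-in for a killed process; it forces the child's end to send
a reset when it closes.  It assumes a Linux-like socket stack.

```python
import socket
import struct


def write_until_peer_failure(chunk: bytes = b"x" * 65536,
                             max_chunks: int = 1000) -> str:
    """Write to a loopback peer that has aborted; return the error name seen."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)

    parent = socket.create_connection(listener.getsockname())
    child, _ = listener.accept()

    # Assumption: emulate a crashed child. SO_LINGER with a zero timeout
    # makes close() abort the connection with a reset instead of a clean
    # shutdown (struct linger is two ints on Linux and macOS).
    child.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                     struct.pack("ii", 1, 0))
    child.close()
    listener.close()

    try:
        # Keep writing, as a parent in the chain would, until the OS
        # reports that the other end is gone.
        for _ in range(max_chunks):
            parent.sendall(chunk)
    except (ConnectionResetError, BrokenPipeError) as exc:
        return type(exc).__name__
    finally:
        parent.close()
    return "no error"


if __name__ == "__main__":
    print(write_until_peer_failure())
```

On Linux the first failed write typically reports ConnectionResetError,
which is the errno behind "Connection reset by peer"; later writes, or other
platforms, may report BrokenPipeError instead.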


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
