[Beowulf] copying big files (Henning Fehrmann)

David Mathog mathog at caltech.edu
Mon Aug 18 11:38:09 EDT 2008


Henning Fehrmann wrote:

> 
> I spread successfully a 10G file to 50 nodes. The rate was 140Mb/s for
> nettee and a bit slower using dolly.
> I guess it was due to a busy node somewhere in the chain.  
> Increasing the number of clients up to 100 failed in both cases.
> 
> For nettee I got:
> nettee: fatal error writing to child: Connection reset by peer

> 
> I will do more systematic tests in the next few days.
> David Mathog, are you interested in bug reports?

Yes, please. 

If memory serves, you will see that error whenever a child node, or
nettee on that child, crashes.  For instance, if you "kill -9" nettee on
a child, the parent should see that error.  The command option -colwf will
let the chain continue if the failure is caused by a full disk or a failing
stdout pipe.  The option -conwf should let the chain continue the transfer
down to the node just above the failed one, and it should tell you which
node failed, so long as -v is used with the appropriate bits.
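
To make the difference concrete, here is a toy Python model of a transfer
chain. It is only a sketch of the behaviour described above, not nettee's
code: the function simulate_chain, its parameters, and the chunk-by-chunk
model are all made up for illustration, and the assumption that an abort
cascades back up the chain is a simplification.

```python
def simulate_chain(num_nodes, num_chunks, failed_node, fail_at_chunk,
                   continue_on_failure):
    """Toy model: node 0 is the head of the chain, each node stores a
    chunk and forwards it to node + 1.  Returns chunks stored per node."""
    stored = [0] * num_nodes
    alive = [True] * num_nodes
    for chunk in range(num_chunks):
        if chunk == fail_at_chunk:
            alive[failed_node] = False      # the child crashes here
        for node in range(num_nodes):
            if not alive[node]:
                break                       # nothing reaches a dead node
            stored[node] += 1
            child = node + 1
            if child < num_nodes and not alive[child]:
                if not continue_on_failure:
                    # Assumed behaviour without a "continue" option: the
                    # parent aborts and the failure cascades up the chain.
                    for upstream in range(node + 1):
                        alive[upstream] = False
                # Either way, this chunk goes no further down the chain.
                break
    return stored


# Node 3 of 5 dies while chunk 2 of 4 is in flight.
with_continue = simulate_chain(5, 4, 3, 2, continue_on_failure=True)
without_continue = simulate_chain(5, 4, 3, 2, continue_on_failure=False)
print(with_continue)     # nodes 0..2 hold all 4 chunks
print(without_continue)  # no node holds all 4 chunks
```

In this model the "continue" case leaves every node above the failed one
with a complete copy, which is the outcome -conwf is meant to give.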

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


