Trouble with Bonding Broken MPI

Ricardo hraa at lncc.br
Wed Jul 3 08:17:26 EDT 2002


Hi

Take a look at the MPI machines file and verify that the host names
listed there resolve to the network you intend to use (the bonded one).
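
If it helps, here is a rough sketch of that check in Python. It is only an
illustration, not part of LAM or MPICH: the function names, the sample host
names and the 192.168.1.0/24 subnet are all made up, and the parser assumes
one host per line with an optional ":ncpus" suffix or trailing options and
"#" comments. Adjust it to whatever your machines file really looks like.

```python
import ipaddress
import socket


def parse_machines(text):
    """Return the host names from machines-file text.

    Assumed format (hypothetical): one host per line, optionally followed
    by ':ncpus' or extra options, with '#' starting a comment.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # First whitespace-separated token, minus any ':ncpus' suffix.
        hosts.append(line.split()[0].split(":")[0])
    return hosts


def hosts_off_network(hosts, network, resolve=socket.gethostbyname):
    """Return (host, address) pairs that do not resolve into `network`.

    Hosts that fail to resolve are reported with address None.
    """
    net = ipaddress.ip_network(network)
    bad = []
    for host in hosts:
        try:
            addr = resolve(host)
        except OSError:
            bad.append((host, None))
            continue
        if ipaddress.ip_address(addr) not in net:
            bad.append((host, addr))
    return bad


if __name__ == "__main__":
    # Made-up data: a fake resolver stands in for DNS or /etc/hosts.
    sample = "node1:2\nnode2:2  # second box\n"
    fake_dns = {"node1": "192.168.1.1", "node2": "10.0.0.2"}
    print(hosts_off_network(parse_machines(sample),
                            "192.168.1.0/24", fake_dns.__getitem__))
```

On a real cluster you would read the actual machines file, leave out the
resolve argument so the system resolver is used, and pass the subnet of the
bonded interface. Any host the function returns is being reached over some
other network, or does not resolve at all.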


On Tue, 25 Jun 2002, Todd Broucksou wrote:

> All,
>   I have a RedHat 7.2, kernel 2.4.9-31 cluster on 8 Athlon MPs. I recently 
> added a second 3c590 NIC to bring up channel bonding. According to ifconfig 
> -a, my bond0 is up. I can ping across without any packet loss, and can ftp 
> internal to the cluster without any loss. But both the LAM and MPICH 
> mpirun are broken. I can add servers with lamboot or pg4 but cannot run 
> mpirun. I get a network error. LAM and MPICH did work before the "upgrade" 
> to channel bonding.
>   I have tried two separate Cisco switches, and one Cisco switch running 
> EtherChannel; it still does not work with mpirun-type programs.
>   I have even tried recompiling MPICH, without any change.
>   Any help would be appreciated.
> 



