Weird network connection problem

Joe Landman landman at scalableinformatics.com
Tue Oct 1 22:06:59 EDT 2002


This sounds like hardware:

Suggestions:

. Take two network cables, one going to a failing node, one to a working
node.  Swap them.  See if the problem moves to the originally OK
machine, and off of the originally not-OK machine.  If the problem
moves, restore the original connections, and try replacing the failing
machine's network cable.  See if you still have a problem.

. Drop a new network card into one of the failing machines.  Turn off
the onboard network.  Does it still fail?
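
A rough sketch, in Python, of how to read the result of the cable swap
above.  The function name and the return labels are made up for
illustration.  It assumes you have watched both nodes long enough after
the swap (the failures reportedly take hours to days to show up) to say
whether each one kept its connection.  Note that "cable" here really
means the cable plus whatever it plugs into at the far end, which is an
inference on my part and not something the test alone can separate.

```python
def interpret_cable_swap(bad_node_ok: bool, good_node_ok: bool) -> str:
    """Interpret a cable swap between a failing node and a working node.

    bad_node_ok:  did the originally failing node stay up after the swap?
    good_node_ok: did the originally working node stay up after the swap?

    Returns a hypothetical label, one of "cable", "node", "inconclusive".
    """
    if bad_node_ok and not good_node_ok:
        # The problem moved with the cable: put the cables back and
        # replace the failing machine's cable.
        return "cable"
    if not bad_node_ok and good_node_ok:
        # The problem stayed with the machine: suspect the NIC or board,
        # and try the add-in network card next.
        return "node"
    # Both up (not observed long enough?) or both down (more than one
    # fault?): the swap alone does not settle it.
    return "inconclusive"


if __name__ == "__main__":
    print(interpret_cable_swap(bad_node_ok=True, good_node_ok=False))
```

If this comes back "node", the second suggestion (new card, onboard
network off) is the next thing to try on that machine.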

...

On Tue, 2002-10-01 at 14:24, Yudong Tian wrote:
> Hi,
>    We've got 200 nodes running, with each node having a Biostar MB with
> a built-in Realtek 8192 NIC chip, and running RH 7.2, 2.4-17. Recently
> we've seen a weird problem with the network connection to some of these
> nodes. About 20 of them will lose their network connections over the
> course of a few hours to a few days. If I just unplug the network cable
> from such a node and plug it back in right away, the connection comes
> back up. Any insight into where we should look for the problem:
> hardware? BIOS? Driver module? The problem does not hit nodes at
> random. It only occurs on those 20 or so particular nodes, which are
> identical to the others except for their MAC addrs.
> 
> Thanks.
> Yudong Tian
> 
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
-- 
Joseph Landman, Ph.D
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615



