[Beowulf] Cluster doesn't like being moved

Mark Hahn hahn at mcmaster.ca
Tue Mar 10 15:05:50 EDT 2009

> moved (keep getting kicked out of the space I'm using), I end up with any
> number of different problems.

debugging is mainly about breaking down the system into components
whose correctness can be observed separately.

> Personally I suspect some type of hardware issue (this equipment is about 5
> years old), but one of my co-workers isn't so sure hardware is in play.  I
> was having problems with the RAID initializing after one move back which I
> resolved a while back by reseating the RAID controller card.

sounds a bit black-magic to me.  I don't believe I've ever had a problem
solved by card reseating (though dimm reseating does seem to clean up
40% of the nodes I see that are reporting a lot of corrected ecc's.)

> This time it appears that the file system & configuration databases became
> corrupted after moving the equipment. Several services aren't starting up
> (LDAP, DHCP, PBS to name a few) and YAST2 hangs any time an attempt is made

simplify.  to me, it sounds like your network (ip, route, dns) is confused.
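
one way to check the network pieces separately is a few small probes, each
reporting on exactly one component.  the sketch below is only an illustration,
not something from the original thread: the function names are made up, the
route check assumes a Linux-style /proc/net/route table, and the host names
and ports you feed it are whatever your cluster actually uses.

```python
import socket


def check_dns(name, resolver=socket.gethostbyname):
    """Forward-resolve a host name; return (ok, detail).

    The resolver argument is a hypothetical hook so the check can be
    exercised without a working network.
    """
    try:
        return True, resolver(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, str(exc)


def has_default_route(route_table_text):
    """Return True if a /proc/net/route dump lists a default route.

    Assumes the Linux layout: one header row, then whitespace-separated
    columns where the second column is the destination in hex.
    """
    for line in route_table_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 1 and fields[1] == "00000000":
            return True
    return False


def check_tcp(host, port, timeout=2.0):
    """Try to open a TCP connection to a service; return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        return False, str(exc)
```

run them in order (name resolution, then routing, then one service port such
as 389 for LDAP) and stop at the first failure; anything after a failed
lower-level check is not worth interpreting yet.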