[Beowulf] Re: [Linux-HA] Couldn't get watchdog to work

Alex Vrenios alex at dsrlab.com
Tue Dec 28 12:14:16 EST 2004

> -----Original Message-----
> Paul Chen wrote:
> > Both nodes did restart 
> > heartbeat but none of them reboot or shut down. Am I doing 
> > something wrong?
> >
> Alan Robertson wrote:
> The watchdog timer will only kill the system if heartbeat goes insane.
> It didn't.  So, the watchdog timer is happy.
> At this point in time, the watchdog timer is not a 
> replacement for a STONITH device.
Which is exactly what I am looking into (the STONITH device)...

I see two solutions, one hardware and one software. The hardware solution
looks expensive, but I believe the software solution will help Mr. Chen
(above), and I would appreciate comments on it.

I would have my "backup" system execute a command as part of its attempts to
assume the identity, responsibilities and resources of the "primary" system.
The command is run from backup, as follows:

   root@backup> ssh root@primary shutdown -h now

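This will not work in all cases: if the primary is hung or unreachable over
the network, the ssh command cannot get through. It should work in cases like
the one above, though, where both nodes are still up and responding. A
hardware solution is more general, but it doesn't hurt to run this command in
any case.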
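
For anyone who wants to script the command above, here is a minimal sketch,
not a tested recipe. It assumes OpenSSH with key-based root login from backup
to primary. The function name fence_primary and the SSH_CMD override are
illustrative names of my own, not part of heartbeat. The timeout keeps the
takeover from blocking when the primary does not answer.

```shell
#!/bin/sh
# Sketch only: "primary" is the host name used in this message.
# fence_primary and SSH_CMD are hypothetical names, not heartbeat features.

fence_primary() {
    host="${1:-primary}"
    # BatchMode=yes: never prompt for a password (needs key-based login).
    # ConnectTimeout=5: give up after 5 seconds if the host does not answer.
    if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 \
        "root@${host}" shutdown -h now
    then
        echo "shutdown requested on ${host}"
        return 0
    else
        # A failure here does not prove the primary is down or up; it only
        # means the request could not be confirmed over ssh.
        echo "shutdown of ${host} not confirmed" >&2
        return 1
    fi
}
```

The backup would call fence_primary from whatever script it runs when taking
over, and keep going either way, since a failed ssh may simply mean the
primary is already dead.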

Alex Vrenios
