[Beowulf] Remote console management

David Mathog mathog at mendel.bio.caltech.edu
Fri Sep 23 11:54:54 EDT 2005

Julien Leduc <julien.leduc at lri.fr> wrote

> Something interesting we used (and are still using without any problem
> since installation) is a homemade reboot solution that replaces the
> front panel with a controlled switch (in the final hardware design we
> found some industrial grade controlled transistors). Every box can
> control 16 nodes and you can chain 256 of them, which is OK for big
> clusters. The only problem is that, as a homemade solution, you have to
> solder everything (replacing front panels is not a big deal, because it
> just means replacing the original pins with those of your solution; no
> soldering should be required on the nodes).

I once considered implementing something similar but couldn't
justify the cost since we have few nodes and I can easily walk
over to them.  Anyway, the point is that the on/off and reset
switches are attached by leads to low power headers on the
motherboard.  So rather than running those leads to the
standard buttons on the front of the case, one could
thread them out the back of the machine through any
convenient ventilation hole and wire each pair to
a separate electrically controlled switch (normally open).
Those switches in turn could be provided by any number of
readily available hardware devices.
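
To make the control side concrete, here is a minimal sketch in Python of
what software driving such a bank of switches might look like.  It is only
an illustration.  The 16 nodes per box and 256 boxes come from Julien's
description; everything else is an assumption: the names locate,
LoggingSwitchBank, press, reset and force_power_off are hypothetical, the
"bank" only records what it is asked to do instead of talking to real
hardware, one channel is assumed to cover both the power and reset leads of
a node, and the pulse lengths are guesses (a short closure for reset, a
hold of several seconds to force power off).

```python
import time

CHANNELS_PER_BOX = 16   # from the quoted post: each box controls 16 nodes
MAX_BOXES = 256         # from the quoted post: up to 256 boxes can be chained


def locate(node):
    """Map a 0-based node number to a (box, channel) pair."""
    if not 0 <= node < CHANNELS_PER_BOX * MAX_BOXES:
        raise ValueError("node out of range")
    return divmod(node, CHANNELS_PER_BOX)


class LoggingSwitchBank:
    """Hypothetical stand-in for real switch hardware: it only records calls."""

    def __init__(self):
        self.log = []

    def set_closed(self, box, channel, line, closed):
        self.log.append((box, channel, line, closed))


def press(bank, node, line, seconds, sleep=time.sleep):
    """Close a normally open switch for `seconds`, then open it again."""
    box, channel = locate(node)
    bank.set_closed(box, channel, line, True)
    try:
        sleep(seconds)
    finally:
        # Always release, so a failure cannot leave the button "held".
        bank.set_closed(box, channel, line, False)


def reset(bank, node, sleep=time.sleep):
    press(bank, node, "reset", 0.5, sleep)    # assumed pulse length


def force_power_off(bank, node, sleep=time.sleep):
    press(bank, node, "power", 5.0, sleep)    # assumed: a long hold forces off
```

With this sketch, reset(LoggingSwitchBank(), 37) would address box 2,
channel 5.  A real version would replace LoggingSwitchBank with whatever
actually drives the relays or transistors.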

It would be nice if the manufacturers provided a standard jack
on the backs of the nodes wired in parallel with the front panel
switches for this purpose, but I've never seen a machine where
this has been done.  The main problem with it (other than cost)
that I see is that if these back jacks weren't used you'd
want to cover them so that they couldn't be accidentally
shorted, causing an unintentional reset or power event.  I estimate
that adding these jacks would cost Dell or any other major 
manufacturer about $1 per node.  The external box and wires
to control these switches might run $10 per node (it's just a bunch
of switches). In other words, it would be much, much, much cheaper
than the IPMI or KVM solutions, while admittedly not quite
as useful.

Also, sometimes neither the reset switch nor the power switch works
(Tyan anybody?) and the only way to reset the machine is to cut
off power where it enters the node.  That's harder to do with
a low voltage switch because the only relevant leads that exist to
do this are inside the power connector from the PS
to the motherboard.  Any power switch present on the PS (and on
rack mount nodes these tend not to be present in any case) is likely
to switch line voltage AC.  So again, it would be nice if the PS
brought out two normally open lines which, when shorted, caused the PS
to shut down.


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
