[Beowulf] Re: HVAC and room cooling...
James.P.Lux at jpl.nasa.gov
Mon Feb 2 19:56:33 EST 2004
At 04:27 PM 2/2/2004 -0500, Eckhoff.Peter at epamail.epa.gov wrote:
>Problem 2: What do you do when the AC stops? Maintenance and the
>occasional AC system oops can be devastating to a cluster in a small room.
>Solution 2a: We are tied directly into a security system. When a
>sensor in the room reaches a temperature level, "Security" responds
>dependent upon the
>Solution 2b: We installed a backup automated telephone dialer. Not
>that we don't trust "Security", but we wanted a backup to let us know what was
> When the temperature reaches a certain level, the phone dials us with
> automated message:
> " This is the Sensaphone 1108. The time is 1:36 AM and ...
> [ ed. your CPUs are about to fry... Have a nice night!!!" ;-) ]
You need to seriously consider a totally automated "failsafe" shutdown (as
in: chop the power when the room temperature gets to, say, 40C)...
Security might be busy (maybe there was a big problem with the chiller
plant catching fire or the boiler exploding). If they're directing fire
engine traffic, the last thing they're going to be thinking about is going
over to your machine room and shutting down your hardware.
The autodialer is nice, but, what if you're out of town when the balloon
A simple temperature sensor with a contact closure wired into the "shunt
trip" on your power distribution will work quite nicely as a "kill it
before it melts" switch. Sure, the file system will be corrupted, and so
forth, but at least you'll have functioning hardware to rebuild it on.
Automated monitoring and TCP sockets are nice for management in the
day-to-day situation, ideal for answering questions like "Should we get
another fan?" or "Maybe Rack #3 needs to be moved closer to the vent?" But
what if there's a DDoS attack on someone near you, and netops decides to
shut down the router? What if all those Windows desktops run amok, sending
mass emails to each other or trying to remotely manage each other's IIS,
bringing the network to a grinding halt?
The upshot is: Do not trust computers to save your computers in the
ultimate extreme. Have a totally separate, bulletproof system. It's
cheap, it's reliable, all that stuff.
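
For the day-to-day monitoring side (and only that side), the tiered logic
might be sketched roughly as below. This is a minimal Python sketch under
stated assumptions: the 30C and 35C thresholds, the names Thresholds and
classify, and the action labels are all hypothetical; only the 40C figure
comes from the discussion above. It is not a substitute for the hardwired
trip, which has to work with no computer or network involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    warn_c: float = 30.0      # assumed value: notify the admins
    shutdown_c: float = 35.0  # assumed value: start an orderly OS shutdown
    trip_c: float = 40.0      # from the post: hardwired shunt-trip point


def classify(temp_c: Optional[float], t: Thresholds = Thresholds()) -> str:
    """Map one room-temperature reading (deg C) to a software action label.

    The labels are hypothetical. "hardware-trip" only records that the
    independent thermostat should already have cut power; software is
    never the mechanism that does it.
    """
    if temp_c is None:
        # A missing reading must never be mistaken for a cool room.
        return "sensor-fault"
    if temp_c >= t.trip_c:
        return "hardware-trip"
    if temp_c >= t.shutdown_c:
        return "shutdown"
    if temp_c >= t.warn_c:
        return "warn"
    return "ok"
```

A polling loop would call classify() on each reading and act on the label.
Treating a missing reading as a fault rather than as "ok" keeps a dead
sensor from looking like a healthy room.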
James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109