Q: Building a small machine room? Materials/costs/etc.

Robert G. Brown rgb at phy.duke.edu
Fri Sep 19 12:29:50 EDT 2003


On Thu, 18 Sep 2003, Jim Lux wrote:

> At 12:25 PM 9/18/2003 -0400, Robert G. Brown wrote:
> >On Thu, 18 Sep 2003, Jim Lux wrote:
> >
> > > So, what you probably want is a sort of staged setup.. moderate overtemp
> > > shuts down computer power, bigger overtemp (i.e. a fire) shuts down the
> > > blowers, pressing the big red button next to the door shuts down all
> > > the power.
> >
> >We have some of these (including the big red switch by the door) but not
> >all.  Maybe soon -- the first is really a matter of sensors and scripts.
> >
> >    rgb
> 
> Personally, I prefer not relying on a computer (that's doing anything else) 
> to shut down the computer for a serious overtemp.  I'd rather see a sensor 
> and a hardwired connection to some sort of relay.  OK, I might go for a 
> dedicated PLC (programmable logic controller), but I'd want it to be rock 
> solid and robust in a variety of environments, something that no PC is ever 
> going to be (PCs are just too complex with too many logic gates 
> internally).  Sometimes simple is good.

I had in mind something like one or more "server" class hosts with
sensors connected that they poll every minute or five minutes or so
(depending on how small your space is and how paranoid you are).

This just gives you access to the room temperature at the sensor
locations inside e.g. scripts.  At that point, you can do anything you
like -- mail out a warning at threshold A, initiate automated powerdowns
by banks or all together at temperature B (saving servers for last so all
writebacks can occur) -- whatever.
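
A minimal sketch of the two-threshold logic described above.  The
threshold values and the names WARN_C, SHUTDOWN_C and classify() are
hypothetical, chosen only for illustration; how the readings are
obtained from the sensors is left out entirely.

```python
# Hypothetical thresholds in degrees C; tune them for your own room.
WARN_C = 27.0       # "threshold A": mail out a warning
SHUTDOWN_C = 32.0   # "temperature B": start automated powerdowns


def classify(temps_c):
    """Map a list of sensor readings (deg C) to an action name.

    The decision is driven by the hottest sensor, so a single hot spot
    is enough to trigger a warning or a shutdown.
    """
    if not temps_c:
        return "no-data"      # a silent sensor deserves its own alarm
    hottest = max(temps_c)
    if hottest >= SHUTDOWN_C:
        return "shutdown"
    if hottest >= WARN_C:
        return "warn"
    return "ok"
```

A polling script run from cron every minute or five could call
classify() on the latest readings and then mail or power down
accordingly.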

netbotz are expensive, but they provide such an interface (and more)
completely canned and SNMP or web-accessible and even have internal
mailers to do the mailing at an alarm threshold.  And expensive is cheap
for people who don't want to DIY and do want to protect a few hundred
thousand dollars of hardware.  Beyond that it isn't horribly easy to
find attachable (e.g.) RS232 readable thermal sensors, but it can be
done.  Some are linked on brahma/Resources/vendors.php:  the iButton,
sensorsoft, pico tech.  There are likely more if one is good at
googling.

One of these days I'm going to get this set up for the cluster/server
room.  It actually already has lots of built in monitoring and automated
emergency action, but it is all at the University level and the cluster
control is all local, so I think it will be good to have the
intermediate layer under our own immediate automagic control.

> The sensors and scripts would be good for the graceful shutdown as you get 
> close to the "yellow limit".

Precisely.

> Anyone who is enamored of the total software control of things that can 
> cause physical damage should remember one word: "THERAC-25"...

This is what I'm concerned about the other way, actually.  We're already
plugged into a University level IT sensor structure beyond our immediate
control AND that provides us with no immediate information at the
systems level in the room itself.  By the time even their 24 hour
service responds to a thermal problem, it is likely to be WAY too late
because whether the meltdown (or rather, thermal kill) time is less than
a minute or less than fifteen minutes, it is >>short<< once AC is really
interrupted.  
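
A rough back-of-envelope sketch of why the time is short, assuming
(pessimistically) that all the dissipated power goes into heating only
the air in the room.  The air density and specific heat are standard
approximate values; the function name and the example room are
hypothetical.  Real rooms warm more slowly than this because racks,
walls and floor also absorb heat.

```python
AIR_DENSITY = 1.2    # kg/m^3, approximate for air at room conditions
AIR_CP = 1005.0      # J/(kg K), approximate specific heat of air


def air_heating_rate(power_w, room_volume_m3):
    """Return deg C per minute if power_w heats only the room air.

    This is an upper bound on the rate: it ignores every other heat
    sink in the room.
    """
    heat_capacity = AIR_DENSITY * room_volume_m3 * AIR_CP  # J/K
    return 60.0 * power_w / heat_capacity


# Hypothetical example: 10 kW of hardware in a 50 m^3 room.
rate = air_heating_rate(10000.0, 50.0)
print(f"{rate:.1f} C/min")
```

For the hypothetical room above this comes out near 10 C per minute,
which is consistent with a kill time measured in minutes rather than
hours.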

We just got grazed by a potentially interrupting event (and yes, damn
it, I've been Franned again and my house has no electricity and probably
won't for several days, if the past is any guide).  I'd really prefer
the yellow limit scripts, sorted out so that nodes shut down first and
maybe even only.  If the room is large enough, it can sometimes manage
to handle the heat output of only a few servers via thermal ballast and
losses through walls and floor and so forth, especially if one can
provide even an ordinary rotating fan to move the air.
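
The "nodes first, and maybe even only" ordering can be sketched as
below.  The host names, the role labels and the shutdown_plan() helper
are all hypothetical; actually issuing the shutdown commands is left to
whatever remote mechanism the site already uses.

```python
def shutdown_plan(hosts, include_servers=False):
    """Return host names in the order they should be powered down.

    hosts maps a host name to its role, either "node" or "server".
    Compute nodes always go first.  Servers are appended only when
    include_servers is True, so that they stay up as long as possible
    and all writebacks can complete.
    """
    nodes = sorted(h for h, role in hosts.items() if role == "node")
    servers = sorted(h for h, role in hosts.items() if role == "server")
    return nodes + (servers if include_servers else [])


# Hypothetical inventory.
hosts = {"b01": "node", "b02": "node", "home": "server"}
print(shutdown_plan(hosts))                        # nodes only
print(shutdown_plan(hosts, include_servers=True))  # nodes, then servers
```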

   rgb

> 
> Nancy Leveson at MIT has a site at  http://sunnyday.mit.edu/ where there is 
> much about software safety.
> 
> James Lux, P.E.
> Spacecraft Telecommunications Section
> Jet Propulsion Laboratory, Mail Stop 161-213
> 4800 Oak Grove Drive
> Pasadena CA 91109
> tel: (818)354-2075
> fax: (818)393-6875
> 

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu



_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


