Environment monitoring

David Mathog mathog at mendel.bio.caltech.edu
Thu Oct 2 11:33:21 EDT 2003


Robert G. Brown  rgb at phy.duke.edu wrote:

>The bad thing is that it does NOT give you any sort of measure of room
>temperature per se,

Well, no, but to be fair that's hardly lm_sensors' fault.
The problem is that few (if any) motherboards have a
sensor positioned away from hot devices on the upstream
end of the airflow.  One can sometimes get a fair
approximation of it from a hard drive's SMART temperature reading
if the airflow across the drive is good and
the drive itself does not run very hot.  We have not yet
filled the second processor slot on the mobos of our Beowulf,
and the temperature sensor for that empty slot gives a pretty good
indication of the air temperature in the case (32C), vs. 40.5C
under a live Athlon MP 2200+ processor at no load.
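
For what it's worth, here is a rough sketch of the SMART approach in
Python.  It assumes the drive reports its temperature as attribute 194
(Temperature_Celsius) in the attribute table printed by "smartctl -A";
not every drive does, and some use a different attribute.  The function
name and the sample table are made up for illustration and are not
output from our machines.

```python
from typing import Optional


def drive_temp_c(smartctl_output: str) -> Optional[int]:
    """Return the raw value of SMART attribute 194, or None if absent.

    Expects the text printed by `smartctl -A <device>`. Assumes the raw
    value is the 10th whitespace-separated column of the attribute row.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "194":
            try:
                return int(fields[9])
            except ValueError:
                return None  # raw value was not a plain integer
    return None


# Hypothetical sample of the attribute table, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1234
194 Temperature_Celsius     0x0022   036   045   000    Old_age   Always       -       32
"""

print(drive_temp_c(SAMPLE))
```

Treat the result as an approximation of intake air temperature only
when the drive sits in good airflow and does not run hot itself.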

We use lm_sensors with mondo 

  http://mondo-daemon.sourceforge.net/

to watch the systems and shut them down if they overheat.

Generally this works well.  Mondo can compensate for
the shortcomings of the lm_sensors/motherboard combos which
sometimes arise.  For instance, on our ASUS A7V266 mobos
(workstations, not in a beowulf!) some of the sensors tend
to go wacky for one or two measurements: fan speeds go
to 0 or temps to 255C.  Mondo is set to require an out-of-range
condition for 3 seconds before triggering
a shutdown, and so far we have not seen a glitch last that
long.
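
The idea of requiring a condition to persist before acting can be
sketched in a few lines of Python.  This is not mondo's code, just an
illustration of the technique; the class name, the 0-70C limits and the
sample readings are all assumptions picked for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Debouncer:
    """Raise an alarm only after a reading stays out of range for hold_s seconds."""
    low: float             # hypothetical lower limit
    high: float            # hypothetical upper limit
    hold_s: float = 3.0    # how long the bad condition must persist
    _bad_since: Optional[float] = None

    def update(self, value: float, now: float) -> bool:
        """Feed one reading taken at time `now` (seconds); return True to alarm."""
        if self.low <= value <= self.high:
            self._bad_since = None      # back in range: forget the glitch
            return False
        if self._bad_since is None:
            self._bad_since = now       # first out-of-range reading
        return now - self._bad_since >= self.hold_s


temp = Debouncer(low=0.0, high=70.0)
# (time in seconds, temperature in C): one 255C glitch, then real overheating
readings = [(0, 41.0), (1, 255.0), (2, 40.5),
            (3, 80.0), (4, 81.0), (5, 82.0), (6, 83.0)]
alarms = [temp.update(value, t) for t, value in readings]
print(alarms)
```

The single 255C reading at t=1 is dropped because the next reading is
back in range; only the sustained excursion starting at t=3 trips the
alarm, once it has lasted the full 3 seconds.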

Regards,


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org