Top node hotter than others?

Michael Huntingdon hunting at ix.netcom.com
Fri Jul 25 14:03:17 EDT 2003


David

Do the systems have anything similar to Insight Manager that reports fan
speeds? In a rack where space is tight and systems are running hot, a
slight variance in airflow can be significant.

Do the cabinets have fans overhead to draw the warm air out? Less expensive
cabinets are not necessarily engineered to ensure consistent airflow under
the demanding conditions typical of clusters like this.

Are all 20 nodes purely compute, or do you have head nodes somewhere in the
mix? As clusters become larger and denser, a great deal of research is
going on in various labs to keep temperatures stable not just within
cabinets but across entire computer rooms. "Hot spots" are a growing issue.
Have you worked with any of the major manufacturers on this, or on any
other concerns, as your research clusters grow?

My Best
Michael

At 10:29 AM 7/25/2003 -0700, David Mathog wrote:
>We have a 20 x 2U rack and I've noticed that the
>top node is always a step hotter than the other nodes.
>
>Why?
>
>There is a slight gradient going up the rack (see
>below, 01 is on the bottom, 20 on the top) but it
>doesn't explain the jump at the top node.  At first
>I thought it might be due to hot air moving from
>the back of the rack, over the top of the highest
>node, and being sucked in by it.
>However no temperature change resulted when all
>side vents were blocked and cardboard pasted up
>the front of the rack so that only the same cold
>air as the other nodes could enter.  The only other
>difference between this node and the others is
>that there's hot air above 20 (two empty rack slots),
>but another node above all the others. So maybe all
>that hot air heats the top node's case and that
>couples the heat in?  I don't have an insulating
>panel handy to test that hypothesis.
>
>node case    cpu
>01   +34°C   +43°C 
>02   +35°C   +44°C 
>03   +37°C   +48°C 
>04   +42°C   +50°C 
>05   +38°C   +48°C 
>06   +37°C   +50°C 
>07   +36°C   +45°C 
>08   +38°C   +48°C 
>09   +38°C   +48°C 
>10   +38°C   +48°C 
>11   +36°C   +44°C 
>12   +38°C   +48°C 
>13   +38°C   +48°C 
>14   +40°C   +49°C 
>15   +38°C   +46°C 
>16   +36°C   +46°C 
>17   +39°C   +51°C 
>18   +39°C   +48°C 
>19   +39°C   +49°C 
>20   +44°C   +54°C
>
>Temperatures were measured using "sensors" on these
>Tyan S2466 motherboards (1 CPU on each currently.)
>The case value is the temperature reading by the
>diode under the socket of the absent 2nd CPU.
>The temperatures jump around a degree or two.
>
>Regards,
>
>David Mathog
>mathog at caltech.edu
>Manager, Sequence Analysis Facility, Biology Division, Caltech
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
>
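
One way to put a number on the "jump" David describes is to fit a straight
line to nodes 01-19 and see how far node 20 sits above that trend. The
following is only a minimal Python sketch, assuming the table values exactly
as posted; the names fit_line and top_node_excess are made up for
illustration and do not come from any tool mentioned in the thread.

```python
# Readings copied from the table in David's message (node 01 is the bottom).
case = [34, 35, 37, 42, 38, 37, 36, 38, 38, 38,
        36, 38, 38, 40, 38, 36, 39, 39, 39, 44]
cpu = [43, 44, 48, 50, 48, 50, 45, 48, 48, 48,
       44, 48, 48, 49, 46, 46, 51, 48, 49, 54]


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def top_node_excess(temps):
    """Fit all nodes except the top one, then return the fitted gradient
    (deg C per node) and how far the top node sits above that trend."""
    lower_nodes = list(range(1, len(temps)))       # 1 .. N-1
    slope, intercept = fit_line(lower_nodes, temps[:-1])
    predicted_top = slope * len(temps) + intercept  # extrapolate to node N
    return slope, temps[-1] - predicted_top


for label, temps in (("case", case), ("cpu", cpu)):
    slope, excess = top_node_excess(temps)
    print(f"{label}: gradient {slope:.2f} C/node, top node {excess:+.1f} C vs trend")
```

On the posted numbers the fitted gradient is a little over 0.1 C per node,
and node 20 comes out roughly 5 C above the trend in both columns, which is
well outside the degree or two of jitter mentioned in the message.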
