[Beowulf] Re: recommendation on crash cart for a cluster room: full cluster KVM is not an option I suppose?
rpnabar at gmail.com
Fri Oct 9 13:17:59 EDT 2009
On Thu, Oct 8, 2009 at 5:55 PM, Greg Lindahl <lindahl at pbm.com> wrote:
> 1) Console logging. Your machine just crashed. No clue in
> /var/log/messages. "I wonder if it printed something on the console?"
> Answer: ipmi and conman (available in an rpm in Red Hat distros).
I was "planning" on using kdump and a crash kernel for that. Note the
emphasis on "planning": I never got that working correctly. I got
started on kdump+kexec when I hit exactly the same "node crashes for
unknown reasons and I have no output" problem.
Maybe IPMI gives you the same functionality. Interesting point for me,
though: what are the pros and cons of IPMI console logging versus kdump
in such crash scenarios? Are they competitors? Is one better or easier
than the other?
> 2) Monitoring. Temp, fan speeds, power supply state, events. Answers
> the "why is the little red light on the front of the case lit?"
> question. You can get some of this via other software (lm_sensors),
> but I find ipmitool to suck less, and ipmitool accurately answers the
> red light question -- lm_sensors can only guess.
I see. Yes, you read me correctly: I was putting full faith in
lm_sensors to do this. Currently I have lm_sensors feeding
temperatures to my Nagios monitoring setup, and it has been working fine.
But I hadn't grasped the practical point about lm_sensors sucking more
than IPMI. That's interesting again: aren't they taking data from the
same bus or counters? Or is this because the sensor details tend to be
proprietary, so lm_sensors lags behind the vendor implementations of
Because if open-source IPMI is also trying to log sensor stats, it's in
competition with open-source lm_sensors (not to say this is bad or
unheard of for multiple open source projects getting the same thing
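
To make the lm_sensors-versus-ipmitool comparison concrete, here is a rough sketch of a Nagios-style check driven by the output of `ipmitool sensor`. It is not anybody's production plugin. It assumes the usual pipe-separated layout (name | reading | units | status | thresholds...) and the usual status codes for threshold sensors (ok, nc, cr, nr). The function names and the SAMPLE text are made up for illustration. The point is that the OK/critical verdict comes from the BMC's own thresholds, not from limits guessed on the host side.

```python
import subprocess

# Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Status column reported for threshold sensors, mapped to Nagios states.
# (nc = non-critical, cr = critical, nr = non-recoverable)
STATUS_TO_NAGIOS = {"ok": OK, "nc": WARNING, "cr": CRITICAL, "nr": CRITICAL}


def parse_sensors(text):
    """Yield (name, reading, units, status) from pipe-separated sensor rows."""
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 4:
            continue  # not a sensor row
        name, reading, units, status = fields[:4]
        yield name, reading, units, status


def evaluate(text):
    """Return (exit_code, message) summarising all threshold sensors."""
    worst, problems, count = OK, [], 0
    for name, reading, units, status in parse_sensors(text):
        if units == "discrete" or status not in STATUS_TO_NAGIOS:
            continue  # skip discrete sensors and sensors with no reading
        count += 1
        code = STATUS_TO_NAGIOS[status]
        if code != OK:
            problems.append("%s=%s %s (%s)" % (name, reading, units, status))
        worst = max(worst, code)
    if count == 0:
        return UNKNOWN, "UNKNOWN: no threshold sensors found"
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[worst]
    detail = ", ".join(problems) if problems else "%d sensors ok" % count
    return worst, "%s: %s" % (label, detail)


def read_local_sensors():
    """Run `ipmitool sensor` on this node (needs the IPMI kernel driver)."""
    out = subprocess.check_output(["ipmitool", "sensor"])
    return out.decode("utf-8", "replace")


# Hypothetical sample output, invented for illustration only.
SAMPLE = """\
CPU Temp  | 45.000  | degrees C | ok     | na      | na      | na      | 85.000 | 90.000 | 95.000
FAN 1     | 400.000 | RPM       | cr     | 300.000 | 500.000 | 700.000 | na     | na     | na
PS Status | 0x1     | discrete  | 0x0100 | na      | na      | na      | na     | na     | na
"""
```

On a real node you would call evaluate(read_local_sensors()), print the message, and exit with the returned code so Nagios picks up the state. Discrete sensors (power supply presence and the like) are skipped here because their status is a bitmask, not ok/nc/cr.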