[Beowulf] S.M.A.R.T usage in big clusters
timm at fnal.gov
Wed Feb 18 09:16:48 EST 2004
On Wed, 18 Feb 2004, Joseph Mack wrote:
> Steven Timm wrote:
> > On the drives where we have had the most failures, we've kept track
> > of how well SMART predicted them. It finds an error
> > in advance about half the time.
> How do you get your information out of smartd?
> I've found output in syslog - presumably I can grep for this.
At the moment we are not using smartd. I was running an older
version of the package that didn't include it, so I wrote
some cron scripts that run a short test every night and capture
the output to a file. We are going to transition to smartd
and have an agent we already run, which greps
/var/log/messages for other purposes, pick up its messages.
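
As a rough illustration of the log-scanning idea above, here is a minimal
Python sketch. It is not the agent mentioned in this message. The function
name find_smartd_warnings and the keyword pattern are hypothetical, and the
pattern would need tuning to whatever your smartd version actually writes
to syslog.

```python
import re
import sys

# Hypothetical keyword list (an assumption, not taken from smartd's docs);
# adjust it to the messages your smartd version emits.
WARNING_PATTERN = re.compile(r"fail|error|unreadable|uncorrect", re.IGNORECASE)


def find_smartd_warnings(lines):
    """Return the syslog lines from smartd that match WARNING_PATTERN."""
    hits = []
    for line in lines:
        # Skip anything not logged by smartd.
        if "smartd" not in line:
            continue
        if WARNING_PATTERN.search(line):
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # Default to the log file named in the message; a path may be given instead.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as log:
        warnings = find_smartd_warnings(log)
    for line in warnings:
        print(line)
    # A nonzero exit status lets cron or a monitoring agent notice a hit.
    sys.exit(1 if warnings else 0)
```

Run from cron, the exit status gives a simple per-node signal that a
central collector could poll.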
> I can get e-mail if I want (from the docs).
> To look at the output of the long and short tests it appears that
> I have to interactively use smartctl.
> Is there any way to have a flag that can be looked at periodically to
> say "this disk is about to fail"?
> Thanks Joe
> Joseph Mack PhD, High Performance Computing & Scientific Visualization
> SAIC, Supporting the EPA Research Triangle Park, NC 919-541-0007
> Federal Contact - John B. Smith 919-541-1087 - smith.johnb at epa.gov
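
On the question of a flag that can be checked periodically: one possible
approach, sketched here under assumptions and not taken from this thread,
is to run "smartctl -H" on each drive and reduce its output to a
true/false value. The helper name health_flag and the default device
/dev/hda are hypothetical. Given the roughly 50% prediction rate quoted
above, a passing result should not be read as a guarantee.

```python
import subprocess
import sys


def health_flag(smartctl_output):
    """Return True if healthy, False if failing, None if no verdict line is found."""
    for line in smartctl_output.splitlines():
        # ATA drives report an "overall-health self-assessment" line;
        # SCSI drives report a "SMART Health Status" line.
        if "overall-health self-assessment" in line or "SMART Health Status" in line:
            verdict = line.split(":", 1)[-1].strip()
            return verdict in ("PASSED", "OK")
    return None


if __name__ == "__main__":
    # Hypothetical default device; pass the real one as the first argument.
    device = sys.argv[1] if len(sys.argv) > 1 else "/dev/hda"
    result = subprocess.run(
        ["smartctl", "-H", device], capture_output=True, text=True
    )
    print(device, health_flag(result.stdout))
```

A None result means the output did not contain a recognizable verdict
line, for example because SMART is disabled or unsupported on the drive.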
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf