[Beowulf] S.M.A.R.T usage in big clusters

Steven Timm timm at fnal.gov
Mon Feb 16 09:08:54 EST 2004


We are using SMART monitoring on our cluster.  How much predictive
power you get depends on the drive model.  For the drive models
where we have had the most failures, we have kept fairly close
track of how well SMART predicted them: it flags an error in
advance about half the time.
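
For anyone who wants to poll drives from a script, here is a minimal
sketch in Python. It assumes the smartctl tool from smartmontools is
installed and that its "-H" health report contains either an
"overall-health self-assessment" line (ATA drives) or a
"SMART Health Status" line (SCSI drives). The function names and the
example device path are hypothetical, and this is not necessarily how
any particular cluster does its monitoring.

```python
import subprocess

# Verdict strings treated as healthy (assumption: "PASSED" for ATA
# drives, "OK" for SCSI drives, as printed by smartctl -H).
PASS_MARKERS = ("PASSED", "OK")


def parse_health(output):
    """Return True if the health verdict is a pass, False if it is
    anything else, and None if no verdict line is found."""
    for line in output.splitlines():
        if ("overall-health self-assessment" in line
                or "SMART Health Status" in line):
            # The verdict is whatever follows the first colon.
            _, _, verdict = line.partition(":")
            return verdict.strip() in PASS_MARKERS
    return None


def check_drive(device):
    """Run 'smartctl -H' on one device (hypothetical helper; usually
    needs root) and return the parsed verdict."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
    )
    return parse_health(result.stdout)
```

Calling check_drive("/dev/hda") on each node and alerting on False
(or None) would give a basic early warning. A passing verdict is not
proof of a healthy drive, since SMART misses a good share of failures.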

Steve Timm


------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525  timm at fnal.gov  http://home.fnal.gov/~timm/
Fermilab Computing Division/Core Support Services Dept.
Assistant Group Leader, Scientific Computing Support Group
Lead of Computing Farms Team

On Sat, 14 Feb 2004, Konstantin Kudin wrote:

>  I am curious if anyone is using SMART monitoring of
> IDE drives in a big cluster.
>
>  Basically, the question is: in what percentage of
> drive failures is SMART able to give some kind of
> reasonable warning beforehand, let's say more than
> 24 hours? And how often does it not predict the
> failure at all?
>
>  The reason I am asking is that recently I had a drive
> that started getting a bunch of I/O errors on certain
> sectors, yet SMART seemed to indicate that things were
> fine.
>
>  Thanks!
>
>  Konstantin
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>


