[Beowulf] Seeing ECC errors since upgraded from Opteron 246 to 275
lindahl at pbm.com
Sat Aug 23 19:51:42 EDT 2008
On Wed, Aug 06, 2008 at 02:56:51PM -0500, Jason Clinton wrote:
> We have a tool on our website called "breakin" that is Linux 188.8.131.52
> patched with K8 and K10f Opteron EDAC reporting facilities. It can
> usually find and identify failed RAM in fifteen minutes (two hours at
> most). The EDAC patches to the kernel aren't that great about naming
> the correct memory rank, though.
> Make sure you have multibit (sometimes says 4-bit) ECC enabled in your BIOS.
I just gave this a try, and it seems to be a very nicely packaged
utility. Thanks for making it available. I've used some similar stuff
before, but this is really easy.
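
For anyone who wants to look at the raw counters that EDAC reporting of this kind is built on, here is a minimal sketch in Python. It is not part of breakin and not how that tool works. It assumes a kernel with EDAC loaded that exposes per-controller counts under /sys/devices/system/edac/mc (files ce_count and ue_count in each mcN directory). The function names edac_counts and read_count are made up for this illustration.

```python
from pathlib import Path

# Assumption: EDAC memory-controller counters live here on this kernel.
EDAC_MC_ROOT = Path("/sys/devices/system/edac/mc")


def read_count(path):
    """Return the integer stored in a sysfs counter file, or 0 if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return 0


def edac_counts(root=EDAC_MC_ROOT):
    """Map each memory controller (mc0, mc1, ...) to (corrected, uncorrected)."""
    root = Path(root)
    counts = {}
    for mc in sorted(root.glob("mc[0-9]*")):
        counts[mc.name] = (
            read_count(mc / "ce_count"),  # corrected (single-bit) errors
            read_count(mc / "ue_count"),  # uncorrected (multi-bit) errors
        )
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A corrected count that keeps climbing on one controller is the usual hint that a DIMM behind it is going bad. As noted in the quoted message, mapping that back to the right memory rank may not be reliable.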