[Beowulf] ECC exerciser/exorciser?

Greg Lindahl lindahl at pbm.com
Mon Jan 26 16:53:10 EST 2009


On Mon, Jan 26, 2009 at 10:30:50AM -0500, Mark Hahn wrote:

> - first, how would you go about setting a threshold for how high is an
> acceptable CE count?  we by default are using the mce module, which by  
> default polls at 1Hz.  my thinking is that if we get overflow events
> (the multiple error bit is set), then it's too fast.

You should see about zero of these events if you're near sea
level. Almost all of my hundreds of 32 GB systems show no MCEs.

At significant altitude (5000+ feet), I don't know the current error
rate for this generation of memory, but it's probably << 1/week/system.
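
To make those rules of thumb concrete, here is a minimal sketch in
Python that turns a corrected-error (CE) count delta into a per-week
rate and flags a node. Everything named here is an assumption for
illustration: the function names, the 1.0/week cutoff (taken loosely
from the "<< 1/week/system" figure), and the idea that you already
collect a CE delta and an overflow flag from whatever your MCE tooling
reports. It does not read any kernel interface itself.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600

def ce_per_week(ce_delta, elapsed_seconds):
    """Scale a corrected-error count over an interval to errors per week."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return ce_delta * SECONDS_PER_WEEK / elapsed_seconds

def is_suspect(ce_delta, elapsed_seconds, overflow_seen=False, max_per_week=1.0):
    """Flag a node if the overflow (multiple error) bit was seen at all,
    or if the scaled CE rate reaches the assumed cutoff."""
    if overflow_seen:
        return True
    return ce_per_week(ce_delta, elapsed_seconds) >= max_per_week

if __name__ == "__main__":
    # Hypothetical readings: 3 corrected errors seen in one day.
    print(ce_per_week(3, 24 * 3600), is_suspect(3, 24 * 3600))
```

A short observation window makes the scaled rate noisy, so a single
error in an hour will trip this check; that is probably what you want
near sea level, where the expected count is about zero.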

I'm curious about the comments that indicate that the "burnin" CD's
HPL isn't as good as running HPL yourself. Very odd.

And if you're going to use stream or other programs for testing, do
keep in mind that loading down all the cores seems to be very
important for causing problems.
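
As a rough illustration of "loading down all the cores", the sketch
below starts one worker process per core, and each worker repeatedly
fills a buffer with a byte pattern and reads it back. This is not
stream or HPL, just a stand-in written in Python; the names, patterns,
and buffer size are assumptions. On an ECC system a single-bit error
would normally be corrected before software sees it, so the mismatch
count is expected to stay at zero and the interesting output is
whatever your MCE logging records while this runs.

```python
import multiprocessing as mp
import os

PATTERNS = (0x00, 0xFF, 0xAA, 0x55)

def pattern_pass(nbytes, pattern):
    """Fill a buffer with one byte pattern, read it back, count mismatches."""
    buf = bytearray([pattern]) * nbytes
    expected = bytes([pattern]) * nbytes
    if buf == expected:
        return 0
    return sum(1 for a, b in zip(buf, expected) if a != b)

def worker(args):
    """Run every pattern over the buffer for the requested number of passes."""
    nbytes, passes = args
    errors = 0
    for _ in range(passes):
        for pattern in PATTERNS:
            errors += pattern_pass(nbytes, pattern)
    return errors

def run_all_cores(nbytes=16 * 1024 * 1024, passes=2, workers=None):
    """Start one worker per core (by default) and sum the mismatches."""
    workers = workers or os.cpu_count() or 1
    with mp.Pool(workers) as pool:
        return sum(pool.map(worker, [(nbytes, passes)] * workers))

if __name__ == "__main__":
    # Assumed defaults; raise nbytes well above your cache size so the
    # traffic actually reaches DRAM, and raise passes for a longer run.
    print("mismatches:", run_all_cores())
```

Each worker holds two buffers of nbytes at once, so size the buffer
with the total across all cores in mind.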

-- greg

