[Beowulf] Memory stress testing tools.

Prentice Bisbal prentice at ias.edu
Thu Dec 9 11:08:28 EST 2010

On 12/08/2010 11:47 AM, Jason Clinton wrote:
> On Tue, Dec 7, 2010 at 10:54, Prentice Bisbal <prentice at ias.edu>
> wrote:
>     Can any of you recommend a good RAM stress testing tool?
> We have an open source ISO/netboot image that can stress-test using the
> latest Linux kernel EDAC facilities and HPL as the test code. It's
> posted here: http://www.advancedclustering.com/software/breakin.html
> It's intended to be booted into.
> There's a beta of a slightly newer version posted at:
> http://lab.advancedclustering.com/bootimage/
> I would be interested in any feedback you have on either version.


I know breakin well. I used it quite a bit in 2008 when I was
stress-testing my then-new cluster, and sent some feedback to the
developer at the time (last name Shoemaker, I think). I did find that I
could run it for days on all my cluster nodes, and then a few days
later, when running HPL as a single job across all the nodes, I'd get
memory errors. I haven't used it since, not because I don't like it,
but because I haven't had a need for it.

I've also been testing this node by running a single HPL job across all 
32 cores myself, and even after days of doing this, I couldn't trigger 
any errors, but a user program could trigger an error in only a couple 
of hours.

Based on these experiences, I don't think that HPL is good at stressing
RAM. Has anyone else had similar experiences?

Since this system has 128 GB of RAM, I think it's a good assumption that
many programs won't use all of that RAM, so I need something
memory-specific that I know will hit all 128 GB of RAM.
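
To make "hit all of the RAM" concrete, here is a minimal sketch of a
write-then-verify pattern test. It is only an illustration of the idea,
not how mprime, breakin, or HPL work. The function name pattern_test,
the pattern set, and the chunk size are all hypothetical choices. On an
ECC machine a corrected single-bit error would not show up as a
mismatch here; it would have to be read from the kernel's error
counters instead.

```python
def pattern_test(size_bytes, patterns=(0x00, 0xFF, 0xAA, 0x55), chunk=1 << 20):
    """Fill a buffer with each pattern, read it back, and count bad bytes.

    Assumption: one process can allocate size_bytes. Covering most of a
    large machine would take several such processes, with headroom left
    for the OS.
    """
    buf = bytearray(size_bytes)
    errors = 0
    for p in patterns:
        # Write pass: touch every byte so each page is really backed by RAM.
        for off in range(0, size_bytes, chunk):
            n = min(chunk, size_bytes - off)
            buf[off:off + n] = bytes([p]) * n
        # Verify pass: runs after the full write pass, so for buffers much
        # larger than the CPU caches most reads should come from DRAM.
        for off in range(0, size_bytes, chunk):
            n = min(chunk, size_bytes - off)
            expected = bytes([p]) * n
            got = buf[off:off + n]
            if got != expected:
                errors += sum(1 for a, b in zip(got, expected) if a != b)
    return errors


if __name__ == "__main__":
    # Example: test 64 MiB; scale up (or run several copies) as needed.
    print("mismatched bytes:", pattern_test(64 * 1024 * 1024))
```

A purpose-built tool would also lock its pages in memory and use more
varied patterns; this sketch does neither.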

So far, mprime appears to be working. I was able to trigger an SBE
(single-bit error) in 21 hours the first time I ran it. I plan on
running it repeatedly for the next few days to see how reliably it
reproduces errors.
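
For anyone wanting to check for corrected errors between runs, here is
a rough sketch that reads the Linux EDAC counters from sysfs. It
assumes the EDAC driver for the memory controller is loaded and that an
SBE is reported as a correctable error in ce_count. The function names
read_edac_counts and new_errors are hypothetical helpers, not part of
any tool mentioned in this thread.

```python
import glob
import os


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int, "ue_count": int}} from sysfs.

    A counter that is missing or unreadable is reported as None.
    """
    counts = {}
    for mc_dir in sorted(glob.glob(os.path.join(base, "mc*"))):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                with open(os.path.join(mc_dir, name)) as f:
                    entry[name] = int(f.read().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[os.path.basename(mc_dir)] = entry
    return counts


def new_errors(before, after):
    """Return {(controller, counter): increase} for counters that grew."""
    delta = {}
    for mc, entry in after.items():
        for name, value in entry.items():
            old = before.get(mc, {}).get(name)
            if value is not None and old is not None and value > old:
                delta[(mc, name)] = value - old
    return delta


if __name__ == "__main__":
    # Take one snapshot before the stress run and one after, then diff.
    print(read_edac_counts())
```

Taking a snapshot before and after each stress run and passing both to
new_errors shows which memory controller, if any, logged new errors.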

Prentice Bisbal
