[Beowulf] Benchmark results

Robert G. Brown rgb at phy.duke.edu
Tue Jan 6 09:51:25 EST 2004

On Tue, 6 Jan 2004, Joe Landman wrote:

> Hi Rene:
>   How long were the runs in comparison to the timer resolution?  What 
> other processes (if any) were running?  Did various cron-jobs light off 
> during any portion of this?
>    For Opterons, there is a processor affinity issue whereby you can 
> get different performance if the memory allocated and used is not 
> associated with that processor.  I have heard that there are patches 
> which deal with this, though I haven't had time to try them.  I do seem 
> to remember an alignment issue for Xeon as well. 
>    I would not expect absolutely identical timing data each time the 
> benchmark were run,  but I would expect a tight grouping around a mean 
> value.  The relative size of the error would be related to a number of 
> factors, including timer resolution, machine state, etc.  For runs of 
> reasonable length (longer than 30 minutes), the timer resolution effects 
> should be minimal.  If you have other processes lighting off, consuming 
> memory, processor cycles, cache, bandwidth, interrupts, it is likely 
> that your distribution will reflect this.
> Joe

Ya.  This is one of the things motivating the bproc beowulf design --
the ability to build a "stripped" machine with much more controlled and
predictable state and with an absolutely minimal set of competing
"system" tasks running on an otherwise quiescent system.  You can check
out Scyld (open source commercial) or Clustermatic (non-commercial/GPL)
if you want to pursue this route.

The run length issue is pure central limit theorem.  If you do short
runs (or internally time short loops), expect large fluctuations and a
relatively broad distribution of times as e.g. a task swap or context
switch in the middle of a timing loop will lead to wildly different
times on a per-case basis.  If you run "long" times you can average more
effectively over the fluctuating/uncontrolled part of the system state
and your mean times SHOULD home in on their true mean value, with a
distribution of means that narrows roughly with the inverse square root
of the run time.
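
A toy illustration of the scaling, as a minimal sketch rather than a
real benchmark: the Python below simulates a timing loop in which each
iteration takes a base time plus some jitter and is occasionally hit by
a "context switch" penalty.  Every number in it (base time, jitter,
switch probability, switch cost) and both function names are invented
for illustration; nothing here was measured on a real machine.  The
point is only that the spread of the per-run means should shrink by
roughly a factor of two each time the run length is quadrupled.

```python
import random
import statistics


def simulated_run_mean(n_iterations, rng, base=1.0e-3, jitter=0.2e-3,
                       p_switch=0.01, switch_cost=5.0e-3):
    """Mean per-iteration time (seconds) of one simulated run.

    All parameters are made-up illustrative values, not measurements.
    """
    total = 0.0
    for _ in range(n_iterations):
        t = base + rng.uniform(0.0, jitter)   # ordinary small jitter
        if rng.random() < p_switch:           # rare, expensive interruption
            t += switch_cost
        total += t
    return total / n_iterations


def spread_of_means(n_iterations, n_runs=200, seed=0):
    """Sample standard deviation of the mean time over n_runs simulated runs."""
    rng = random.Random(seed)
    means = [simulated_run_mean(n_iterations, rng) for _ in range(n_runs)]
    return statistics.stdev(means)


if __name__ == "__main__":
    # Quadrupling the run length should roughly halve the spread.
    for n in (100, 400, 1600):
        print(f"iterations per run: {n:5d}   spread of means: "
              f"{spread_of_means(n):.3e} s")
```

Note that this only models the fluctuating, uncorrelated part of the
system state.  A persistent difference between runs would show up as a
shift in the mean that no amount of averaging removes.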

Exceptions to this are, as Joe noted, persistent state differences
between runs.  These should be investigated and understood, as a 5%
runtime difference is like getting a free node in twenty and worth

ps and top are your friends.  They'll eat some cycles themselves (good
old Heisenberg, sort of:-), but they'll also give you a realtime picture
of the tasks that are competing.  xmlsysd/wulfstat might help as well
(sort of like vmstat and/or netstat run across the whole cluster at
once).  The other thing I can think of that might result in a large
persistent variance (besides memory/motherboard issues) is substantial
difference in network communication pattern.  If your communications
aren't carefully organized, they may self-organize depending on a
nucleation condition (who talks to whom and gets through first).  I
can easily imagine distinct communication patterns emerging with a large

Actually, if I were a REAL computer scientist, this would be a fairly
interesting topic for a research project -- spontaneous emergence of
persistent patterns with very different efficiencies in complex
networks.  If you look (and upon examination it looks like this might be
the problem) there are probably research papers out there on it already.
Can't imagine that this hasn't been studied.


> Rene Storm wrote:
> >Dear cluster folks,
> >
> >I saw some weird results of different benchmarks on my systems.
> >My problem is to verify these results.
> >
> >I've played with the hpl benchmark on our clusters (8 CPU Intel, 8 CPU Opteron) and wasn't able to get the same result twice. Of course there were no configuration changes. The difference is around 5-10%.
> >
> >So I went down to one machine, but saw the same behavior.
> >Standard mpich, pre-compiled lapack, two processors on an smp machine.
> >Running the same benchmark 10 times gives back 10 different results.
> >Same trouble with stream memory benchmark, dgemv matrix calculation and others.
> >
> >There was no network, but it was a big installation, so it could be, that there are some running jobs (eg cron-jobs) which disturb my benchmarks results.
> >
> >Next Step: I created a self-booting cdrom from scratch and added a little bit of X and a gui for my benchmarks.
> >
> >1) cpi - calculating pi with mpich on smp (ch_p4 on loopback)
> >2+3) dgemv - 5kx5k matrix calculation , single and smp
> >4) crafty chess benchmark - 2 threads
> >5) /bin/true - calling true via fork(), taking the time
> >6) PMB-MPI1 - Pallas benchmark
> >
> >There is only sshd for mpich running, everything is loaded from the cdrom into a ramdisk.
> >BUT the outcome is the same: the results vary by 5-10%.
> >
> >Please let me know, if you could see same behavior on your machines.
> >Does someone know a reason for that?
> >
> >
> >If you would like to check this cd out -> 
> >
> >40 MB download. You will need at least 512MB and no usb mouse or keyboard.
> >
> >Thanks in advance
> >Rene Storm
> >
> >
> >
> >
> >_______________________________________________
> >Beowulf mailing list, Beowulf at beowulf.org
> >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> >  
> >
> -- 
> Joseph Landman, Ph.D
> Scalable Informatics LLC,
> email: landman at scalableinformatics.com
> web  : http://scalableinformatics.com
> phone: +1 734 612 4615

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
