[Beowulf] Re: [Rocks-Discuss]Intel compiler specifically tuned for SPEC2k (and other benchmarks?)

Robert G. Brown rgb at phy.duke.edu
Wed Feb 11 18:32:28 EST 2004

On Wed, 11 Feb 2004, Lombard, David N wrote:

> OK. So there's our difference.  I only consider an application benchmark
> useful in this scenario.  I can't imagine using an application benchmark
> of any sort if it isn't; you enumerated all the reasons for this in the
> bits I just snipped.
> We agree completely on this.

I figured that we did -- I'm getting verbose on it because I think it is
an important issue to be precise on.  "What's a FLOP?" is a perfectly
reasonable question with a perfectly unintelligible and meaningless
answer, in spite of it being cited again and again over decades to sell
systems.  At the same time, benchmarks are certainly useful.

I think the confusion is probably my fault -- my age/history showing
again.  I can remember fairly clearly when awk was cited as a benchmark.
Quake too, and not for people who were USING awk or necessarily going to
play quake.  This is what I meant by an "application benchmark" -- some
sort of application that somebody thinks is a good measure of general
systems performance and manages to get people to take seriously.  Stuff
like this is still fairly commonly used in many WinXX "benchmarks" that
you'll see "published" both on the web and in real paper magazine
articles.  How fast can Excel update a spreadsheet that computes lunar
orbital trajectories, that sort of thing.

Sometimes they are almost a joke -- applications that do a lot of disk
I/O (apparently, who knows) are used as a "disk performance benchmark".
I won't even get started on this sort of thing and the number of
variables left completely uncontrolled (for example, the disk caching
subsystems, both hardware and software) compared to, say, bonnie or
lmbench.  I also won't comment on just how much crap there is out there
with stuff like this in it, sometimes from supposedly "reputable"
testing companies that ought to know better or be more honest.

That's why I "trust" GPL/Open microbenchmarks the most, because I can
look at their sources, understand just what they are doing and how it
compares to what I want to do, maybe even hack them if I need to because
it isn't QUITE right, and get numbers with some meaning.  Stuff like
SPEC and linpack (where linpack should probably be considered micro)
isn't horrible but (in the case of SPEC) isn't GPL or terribly
straightforward to understand microscopically or macroscopically -- it
takes experience to know how the profile it generates compares to
features in your own code.  Great for sales-speak, though -- "Our system
gets 2301.124 specoloids/second, while THEIR system is a laughable
1721.564."  Quake isn't a useful benchmark -- it is a game, and one that
generally runs as fast as it needs to wherever it runs...but it is a
GREAT benchmark for how a system plays quake:-)
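
A minimal sketch of what "a microbenchmark you can read and hack" might
look like, assuming nothing beyond the Python standard library.  It is
not taken from bonnie, lmbench, or any real suite; the names
time_kernel, summarize, and sum_of_squares are hypothetical, and the
loop is only a stand-in for whatever your own code actually does.

```python
import statistics
import time


def time_kernel(kernel, repeats=5):
    """Run kernel() `repeats` times; return the wall-clock seconds of each run."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        kernel()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce raw timings to best, median, and worst case."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def sum_of_squares(n=200_000):
    """Hypothetical stand-in kernel: a plain floating-point loop."""
    total = 0.0
    for i in range(n):
        total += float(i) * float(i)
    return total


if __name__ == "__main__":
    stats = summarize(time_kernel(sum_of_squares))
    print("min %.6f s  median %.6f s  max %.6f s"
          % (stats["min"], stats["median"], stats["max"]))
```

The value of something this small is that the kernel can be swapped
for a loop lifted from your own application, and the min/median/max
spread gives a rough idea of how noisy the measurement is on a given
box.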


Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
