Off topic - G5 question

Robert G. Brown rgb at phy.duke.edu
Thu Jun 26 13:55:25 EDT 2003


On Thu, 26 Jun 2003, Simon Hogg wrote:

> And what is A/G BLAST?  "A/G BLAST is an optimized version of NCBI BLAST 
> developed by Apple in collaboration with Genentech. Optimized for dual 
> PowerPC G5 processors, the Velocity Engine, and the symmetric 
> multiprocessing capabilities of Mac OS X[...]"

Awwww, you-all is jes' so cynical...;-)

Actually, I attended at least part of an HPC talk a week+ ago given by
an Apple guy -- about half devoted to BLAST and genetic stuff and half
devoted to marketing (as far as I could tell:-).  From what I could dig
out of him (and this is all from memory so don't shoot me if I'm wrong)
the G5 has an embedded vector unit in it (the "Velocity Engine" in the
marketspeak above?) that has a modestly spectacular peak throughput for
instruction streams that match -- I want to say 12 GFLOPs or something
like that.  The CPU itself has a fairly hefty L3 cache outside of L2 and
L1 and some cleverness designed to try to keep the gaping maw the vector
unit represents filled.  It actually looked like a thoughtful
architecture at least at the block diagram/bus level.

However, as is usually the case with vector sub- or co-processors, the
real problem is getting code to use it at all, let alone effectively, as
the compiler usually groks "CPU" and doesn't know offhand when or how to
send code to the vector processor for a speedup or when to leave it
alone and execute on the CPU itself.  I would guess that code has to
minimally be vectorized and instrumented to use the VP with pragmas or
the like, or worse, require actual cross-compiled objects to inline in
the code, but I really don't know or care enough to find out.
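
To make the vectorization point concrete, here is a minimal sketch in
plain C.  It is not taken from A/G BLAST or any Apple code, and the
function names (saxpy, prefix_sum) are made up for illustration.  The
first loop has independent iterations, the kind of thing a vectorizing
compiler or a hand-instrumented version could spread across vector
lanes; the second carries a dependency from one iteration to the next
and would typically stay on the scalar CPU.  Whether any particular
compiler actually vectorizes the first loop depends on the compiler and
its flags, so treat that as an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for i in [0, n).
 * "restrict" promises the compiler that x and y do not overlap, which is
 * one of the facts it needs before it can consider vector instructions. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* A loop-carried dependency: each element needs the previous result,
 * so this loop does not map directly onto independent vector lanes. */
void prefix_sum(size_t n, float *y)
{
    assert(y != NULL);
    for (size_t i = 1; i < n; i++) {
        y[i] += y[i - 1];
    }
}
```

Code shaped like saxpy is where a vector unit can pay off; code shaped
like prefix_sum is where it generally cannot without being restructured.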

So I think that is what they are referring to as an "optimized version"
-- one that somebody has taken the time to instrument for the VP and
tune for their cache sizes and so forth.  Apple was clearly very
interested in targeting the genomics clustering people -- they may
(correctly) see them as deep pockets backing an insatiable appetite for
cluster power over the next decade or so as projects everywhere build up
the lexicon of genetic expression of every living creature on the
planet (let alone the "human genome") and be gunning for an inside edge
with an architecture and application set deliberately tuned for good
performance in this one market.

Of course, this makes a LOT of things very difficult to properly
benchmark.  At a guess, if one hand-tuned almost any sort of
vector-intensive floating point code for the VP architecture, one might
see a fairly spectacular speedup, but note that the unit's integer
performance (indicative of how well the CPU does everyday work) was
distinctly unimpressive so a NON-vectorizable application might really
suck on it.  Also one needs to use SSE2 instrumented compilers and
instructions to get fair comparisons from alternative architectures, for
the same reasons -- if one compiles a patch of code that SSE2 would
(possibly spectacularly) speed up with a non-SSE2 compiler, well,
performance might well suck.
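
As a rough sketch of what separating the two cases might look like, the
C fragment below defines one vector-friendly kernel and one serial
integer kernel, plus a helper that turns two clock() readings into
seconds.  All three names (scale_add, integer_chain, elapsed_seconds)
are hypothetical and the kernels are toy stand-ins, not real
benchmarks; the only point is to time the two kinds of work separately
rather than report one blended number.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Vector-friendly work: every iteration is independent of the others. */
void scale_add(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Scalar integer work: a linear congruential generator step.  Each
 * iteration depends on the previous one, so as written it gains little
 * from a vector unit. */
uint32_t integer_chain(size_t n, uint32_t seed)
{
    uint32_t s = seed;
    for (size_t i = 0; i < n; i++) {
        s = s * 1664525u + 1013904223u;
    }
    return s;
}

/* Convert two clock() readings into elapsed CPU seconds. */
double elapsed_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / (double)CLOCKS_PER_SEC;
}
```

One would call clock() before and after each kernel, pass both readings
to elapsed_seconds, and print or otherwise use the kernels' results so
the compiler cannot discard the loops as dead code.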

Mark's observation about best-to-best comparisons is likely one
approach, especially if you plan to examine vendor benchmarks.  Let
Apple tune the hell out of BLAST, but only compare results to an
Intel-tuned BLAST on >>its<< favorite CPU/system du jour, etc.
Alternatively, just compiling the same source with gcc with the same
flags on both systems works for me, at least, as this is likely to be
the extent of the effort I want to make tuning per architecture, ever.
Maintaining architecture-specific optimizations in a changing world of
hardware technology is very painful and expensive.

> So once again, it would be nice to see some 'real-world' comparisons of 
> like-with-like.

Amen, although the problem that has stumped philosophers over the ages
is: just what IS the real world, anyway?

Pardon me, I have to put a cat into a superposition state with a
diabolical device...:-)

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu



_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf