rbw at ahpcrc.org
Wed Mar 12 11:34:38 EST 2003
Mark H. wrote:
>> >> PS Pentium 4 sustained performance from memory is about
>> >> 5% of peak (stream triad).
>> >that should be 50%, I think.
>> Nope ... not "from memory".
>> A 2.8 GHz P4 using SSE2 instructions can deliver two
>> 64-bit floating point results per clock or 5.6 Gflops
>> peak performance at this clock. The stream triad (a
>> from-memory, multiply-add operation) for a 2.8 GHz
>> P4 produces only 200 Mflops (see stream website). The
>> arithmetic is then:
>> 200/5600 = .0357 or 3.57% (so 5% is a gift)
>oh, I see. to me, that's a strange definition of "peak",
>since stream is, by intention, always bottlenecked on
>memory bandwidth, since its FSB is either 3.2 or 4.3 GB/s.
>it'll deliver roughly 50% of that to stream.
Not strange; reliable and consistent. Peak is always what
the processor's floating-point core can deliver without
data-delivery bottlenecks.
It is also, as you suggest, a "marketing number". The
stream triad performance defines another pole (a sort
of sea level, far beneath peak, if you like) within which
the real-world performance of most real code will sit.
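A few lines of Python (mine, not from the original post) reproduce the arithmetic:

```python
# Reproducing the arithmetic above: peak SSE2 rate vs. the cited
# stream-triad result for a 2.8 GHz Pentium 4.
clock_hz = 2.8e9
flops_per_clock = 2                      # two 64-bit FP results/clock via SSE2
peak_flops = clock_hz * flops_per_clock  # 5.6 Gflops peak
triad_flops = 200e6                      # stream triad figure cited above

fraction = triad_flops / peak_flops
print(f"peak = {peak_flops/1e9:.1f} Gflops, "
      f"sustained = {fraction * 100:.2f}% of peak")
```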
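The "roughly 50% of FSB" estimate in the quote can be checked with a back-of-envelope sketch; the per-iteration byte and flop counts below are my assumptions about the triad kernel, not figures from the post:

```python
# Bandwidth-bound ceiling for the stream triad (a[i] = b[i] + s*c[i]),
# assuming 2 flops and 24 bytes (two 8-byte loads + one 8-byte store)
# moved per iteration.
fsb_bytes_per_sec = 3.2e9   # the 3.2 GB/s FSB figure quoted above
bytes_per_iter = 3 * 8      # b[i], c[i] loads and a[i] store
flops_per_iter = 2          # one multiply, one add

ceiling = fsb_bytes_per_sec / bytes_per_iter * flops_per_iter
# Counting write-allocate traffic (the cache line holding a[i] is
# read before it is written) adds another 8 bytes per iteration:
ceiling_wa = fsb_bytes_per_sec / (bytes_per_iter + 8) * flops_per_iter

print(f"triad ceiling ~ {ceiling/1e6:.0f} Mflops "
      f"({ceiling_wa/1e6:.0f} Mflops with write-allocate)")
```

With write-allocate counted, the ceiling lands right at the 200 Mflops cited, which is why the triad is, as the quote says, "by intention, always bottlenecked on memory bandwidth."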
>> As you suggest, the P4 will (as does the Cray X1) do
>> significantly better when cache use/re-use is a
>> significant factor.
>no, it's not a matter of reuse, but what you consider "peak".
>I think the real take-home message is that this sort of
>fraction-of-theoretical-peak is useless, and you need to look
>at the actual numbers, possibly scaled by price.
"Useless" ... I like the ring of that ;-) ... not so if the ratio
of flops to mops in your kernels is low (stream triad is
.667). It sets a floor for out-of-the-box performance, from
which you may be able to raise your particular code's performance.
You can almost always get more with cache twiddling/blocking.
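The floor argument can be put as a one-liner; `r` and `M` here are my notation, not the post's:

```python
# A kernel doing r flops per memory operation, on a memory system
# sustaining M memory operations per second, has an out-of-the-box
# floor of roughly r * M flops/s. Cache blocking raises the
# effective r by re-using each loaded operand.
def floor_flops(r, mops_per_sec):
    return r * mops_per_sec

triad_r = 2 / 3   # stream triad: 2 flops per 3 mops (the .667 above)
print(f"triad flops/mops ratio = {triad_r:.3f}")
print(f"floor at 300 Mmops/s = {floor_flops(triad_r, 300e6)/1e6:.0f} Mflops")
```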
>as a matter of fact, I'm always slightly puzzled by this sort
>of conversation. yes, crays and vector computers in general
>are big/wide memory systems with a light scattering of ALU's.
>a much different ratio than the "cache-based" computing world.
>but if your data is huge and uniform, don't you win big by
>partitioning (data or work grows as dim^2, but communication
>at partitions scales much slower)? that would argue, for instance,
>that you should run on a cluster of e7205 machines, where each node
>delivers a bit more than the 200 Mflops above for under $2k, and should
>scale quite nicely until your interconnect runs out of steam,
>say, several hundred CPUs. the point is really that stream-like
>codes are almost embarrassingly parallel.
Right. This is a key (perhaps last, along with SSI) point of impact
between custom and commodity HPC systems products. By clustering you
are buying distributed bandwidth ... whether it is usable for your
code depends: your code must already have been modified for message
passing, its message-passing cycles must be hidable behind
computation, it must have a nicely blockable footprint, it must not
be too latency-dependent, and it must not run so long that it needs
check-pointing to ensure completion in a cluster environment.
There are folks with the cash who are in these situations.
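The surface-to-volume argument in the quote above (work grows as dim^2, boundary communication more slowly) can be sketched; the node count and grid sizes below are illustrative choices of mine, not figures from the post:

```python
import math

# On an n x n uniform grid split into p square sub-blocks, per-node
# work scales as n^2/p while per-node boundary exchange scales as
# n/sqrt(p), so the communication-to-work ratio falls as 4*sqrt(p)/n
# and partitioning wins bigger as the problem grows.
def comm_to_work(n, p):
    work = n * n / p              # grid points owned by one node
    comm = 4 * n / math.sqrt(p)   # perimeter of one square sub-block
    return comm / work

for n in (1_000, 10_000, 100_000):
    print(f"n = {n:>7}: comm/work = {comm_to_work(n, 256):.5f}")
```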
>so what's the cost per stream-triad gflop from Cray?
Le coup de grâce ... oui? ... but 3-year TCOs of very large
clusters, with non-COTS interconnects and utilization factored
in (large, like a la PNNL), are closer to those of the Cray X1 than
you might think. Doing 3-year TCO calculations is like massaging
the fat lady (finding a true global minimum at a given site ain't
that simple) and ends up being driven by local politics, so I am not
going to give you our site-specific/prejudiced numbers here ;-),
but I think that in certain markets and at certain sites Cray
likes their odds ... so does the bubble-wary stock market ...
a 300% gain on their stock in the last year. (I don't
own any ;-) )
# Richard Walsh
# Project Manager, Cluster Computing, Computational
# Chemistry and Finance
# netASPx, Inc.
# 1200 Washington Ave. So.
# Minneapolis, MN 55415
# VOX: 612-337-3467
# FAX: 612-337-3400
# EMAIL: rbw at networkcs.com, richard.walsh at netaspx.com
# rbw at ahpcrc.org
# "Beware, the shifting center of one's solar system.
# Today's religion/truth is some tomorrow's historical
# -Max Headroom
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf