[Beowulf] Re: Finally, a solution for the 64 core 4TB RAM market

Jason Riedy jason at acm.org
Fri May 29 08:56:09 EDT 2009

And Mark Hahn writes:
> the question is how much volume there is in the >= 8-socket market,
> and I don't mean "how many PHB's can be persuaded they need one
> because they're important".

I know of a few large companies that bought a handful of high-end
Starfires each for their database systems.  Not much in volume
(fewer than 100 machines total for these folks), but quite a bit
in profits and obscenely expensive support contracts.

The processor count (or performance) had less impact than the
amount of memory available.  I suspect this semi-vapor-hardware
announcement was targeted at current Sun users...  Showing a steady
upgrade path may move them to IBM+Intel, even if not to these
particular systems.  And, because one party is IBM, they may sell
these with 1, 2, 4, or 8 sockets activated according to your
contract.  Keeps the hardware volume up.  ;)

And I'm mostly being hopeful because I want a box built from these
8x8 boards to replace something I'm suffering with.  "Imagine a
Beowulf cluster of these!"  (with a bit more latency tolerance in
the code, although the returns there may be diminishing)

>> Likely replacing current mid-range, <100-node clusters with a
>> single box.
> unclear to me.  a current mid-range 100-node cluster is 800 cores,
> and I don't think we're talking about that in an SMP.  Intel's recent
> nehalem-ex preview was 128 hyperthreads (64 real).

That 100-node cluster likely has 400-1600 GiB of memory, which is a
bit smaller than 4000 GiB.  But that 4 TiB number includes *really*
expensive memory.
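
(Back of the envelope, with the per-node memory as my guess for a
2009 mid-range machine: 100 nodes x 4-16 GiB/node = 400-1600 GiB,
against roughly 4000 GiB in the single box.  So the box covers the
cluster's aggregate memory two to ten times over.)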

Plus, I imagine a Larrabee successor or a merged part could drop
into these boards for workloads heavier on computation.  That may
be 3 years off, but ramping up the core counts while keeping the
relatively inexpensive but fast interconnect could be quite
useful.  If your code is latency sensitive (i.e. not one-sided
linear algebra decompositions), fewer cores, more memory, and a
fast but cheap interconnect may end up being faster.
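
To make "latency sensitive" concrete, here's a minimal MPI
ping-pong sketch; this is my own illustration, not anything from
the announcement.  Small-message round trips are dominated by
per-message latency, so this measures exactly what a cheap, fast
interconnect (or a single big box) improves and what compute-heavy
kernels barely notice:

  /* Minimal MPI ping-pong latency sketch (illustrative only).
   * Ranks 0 and 1 bounce a 1-byte message back and forth; the
   * measured time is nearly all per-message latency, not
   * bandwidth or flops. */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size;
      char byte = 0;
      const int iters = 10000;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      if (size < 2) { MPI_Finalize(); return 1; }

      MPI_Barrier(MPI_COMM_WORLD);  /* start everyone together */
      double start = MPI_Wtime();
      for (int i = 0; i < iters; ++i) {
          if (rank == 0) {
              MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
              MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
          } else if (rank == 1) {
              MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
              MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
          }
      }
      double elapsed = MPI_Wtime() - start;

      if (rank == 0)  /* half a round trip = one-way latency */
          printf("one-way latency: %g us\n",
                 0.5e6 * elapsed / iters);

      MPI_Finalize();
      return 0;
  }

Run it with "mpirun -np 2" once across two cluster nodes and once
inside one box; the gap between those two numbers is what every
message in a latency-sensitive code pays.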

But then I'm more accustomed to poorly designed systems that have 2
cores per node, an expensive interconnect, and NFS as the only
shared file space.  ;) Replacing one of *those* is a no-brainer,
which is about what went into it in the first place...

