A Petaflop machine in 20 racks?

Joachim Worringen joachim at ccrl-nece.de
Fri Oct 17 03:48:17 EDT 2003

Jim Lux:
> It also doesn't say whether the architecture is, for instance, SIMD.  It
> could well be a systolic array, which would be very well suited to cranking
> out FFTs or other similar things, but probably not so hot for general
> purpose crunching.

Exactly. Such coprocessor boards (typically DSP-based, and also reaching a few 
GFlop/s) have existed for a long time, but they are obviously not suited to 
changing "the way we see computing" (place your marketing slogan here). 

One reason is the lack of portability of code that makes use of such hardware. 
But I think that if the performance for a wider range of applications actually 
came anywhere close to the peak performance, this problem would be outweighed 
by the promise of teraflop performance for something on the order of $10k.

Thus, the problem probably is that typical applications do not achieve the 
promised performance. All memory-bound applications will be held back by the 
PCI bus, in terms of both memory access latency and bandwidth. High sustained 
performance for real problems can, in the general case, only be achieved in a 
balanced system.
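
To put rough numbers on the balance argument, here is a minimal sketch of a
bandwidth-bound estimate: the sustained rate is capped by whichever is lower,
the peak rate or the rate at which the bus can deliver operands. All figures
are assumptions for illustration, not measurements of any particular board:
a 1 TFlop/s peak, the theoretical 133 MB/s of a 32-bit/33 MHz PCI bus, and a
DAXPY-like kernel doing 2 flops per 24 bytes moved. The function names are
hypothetical, and latency is ignored entirely, so real results would likely
be worse.

```python
def attainable_flops(peak_flops, bus_bytes_per_s, flops_per_byte):
    """Upper bound on sustained flop/s when operands must cross the bus.

    The bound is the lower of the compute peak and the rate at which
    the bus can feed data (bandwidth times flops done per byte moved).
    """
    return min(peak_flops, bus_bytes_per_s * flops_per_byte)


def fraction_of_peak(peak_flops, bus_bytes_per_s, flops_per_byte):
    """Share of the peak rate that the bandwidth bound allows."""
    return attainable_flops(peak_flops, bus_bytes_per_s, flops_per_byte) / peak_flops


# Assumed figures, for illustration only.
PEAK_FLOPS = 1.0e12            # hypothetical 1 TFlop/s coprocessor board
PCI_BYTES_PER_S = 133.0e6      # 32-bit/33 MHz PCI, theoretical peak
DAXPY_FLOPS_PER_BYTE = 2.0 / 24.0  # y = a*x + y with 8-byte doubles

if __name__ == "__main__":
    rate = attainable_flops(PEAK_FLOPS, PCI_BYTES_PER_S, DAXPY_FLOPS_PER_BYTE)
    share = fraction_of_peak(PEAK_FLOPS, PCI_BYTES_PER_S, DAXPY_FLOPS_PER_BYTE)
    print(f"bandwidth-bound rate: {rate / 1e6:.1f} MFlop/s")
    print(f"fraction of peak:     {share:.2e}")
```

Under these assumptions a streaming kernel stays several orders of magnitude
below peak unless its data can remain in memory local to the board, which is
the imbalance described above.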


Joachim Worringen - NEC C&C research lab St.Augustin
fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de
