[Beowulf] gpu numbers

Mark Hahn hahn at mcmaster.ca
Sun Nov 23 18:00:03 EST 2008

one thing I was surprised at is the substantial penalty that the 
current gtx280-based gpus pay for double-precision.
I think I understand the SP throughput - since these are genetically
graphics processors, their main flop-relevant op is blend:
 	pixA * alpha + pixB * beta
that's 3 sp flops, and indeed the quoted 933 gflops = 
240 cores @ 1.3 GHz * (2 mul + 1 add)/cycle.  I'm a little surprised
that they quote only 78 DP gflops - 1/12 the SP rate.
I counted ops when doing base-10 multiplication on paper,
and a double-width multiply seemed to need only 4 single-width muls.
I guess the problem might simply be that each core isn't OOO like CPUs,
or that emulating DP doesn't optimally utilize the available 2mul+add.
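
the arithmetic above can be checked with a short sketch.  the figures
are the ones quoted in this post; the 1.296 GHz clock is my assumption
for what "1.3 GHz" rounds from, and the function names are just
illustrative.  the second part only counts partial products in a
two-limb paper multiply, it does not model how the hardware does DP.

```python
# Peak-rate arithmetic using the figures quoted in the post.
# ASSUMPTION: 1.296 GHz is taken as the unrounded clock behind "1.3 GHz".

def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Peak gflops = cores * clock (GHz) * flops per core per cycle."""
    return cores * clock_ghz * flops_per_cycle

CORES = 240
FLOPS_PER_CYCLE = 3          # 2 mul + 1 add per cycle
QUOTED_SP_GFLOPS = 933
QUOTED_DP_GFLOPS = 78

sp_rounded = peak_gflops(CORES, 1.3, FLOPS_PER_CYCLE)    # with the rounded clock
sp_exact = peak_gflops(CORES, 1.296, FLOPS_PER_CYCLE)    # with the assumed clock
sp_to_dp = QUOTED_SP_GFLOPS / QUOTED_DP_GFLOPS           # roughly 12

def schoolbook_partial_products(limbs_a, limbs_b):
    """Pair every limb of a with every limb of b (limbs are low-order first).

    Returns (shift, product) tuples; two 2-limb operands give 4 products.
    """
    return [(i + j, a * b)
            for i, a in enumerate(limbs_a)
            for j, b in enumerate(limbs_b)]

def recombine(partials, base=10):
    """Sum the shifted partial products back into the full result."""
    return sum(p * base ** shift for shift, p in partials)

# 47 * 86 on paper: limbs low-order first, base 10
partials = schoolbook_partial_products([7, 4], [6, 8])

print(f"SP peak at 1.3 GHz:   {sp_rounded:.2f} gflops")
print(f"SP peak at 1.296 GHz: {sp_exact:.2f} gflops")
print(f"quoted SP/DP ratio:   {sp_to_dp:.2f}")
print(f"partial products:     {len(partials)}, result {recombine(partials)}")
```

so the quoted DP rate is about 3x worse than a naive "4 muls per DP
mul" count would suggest, which is the gap being asked about.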

note also: 78 DP Gflops/~200W.  3.2 GHz QC CPU: 51 DP Gflops/~200W.
figuring power is a bit tricky, but price is even worse.  for power,
NV claims <200W (not less than 150 in any of the GTX280 reviews, though).
but you have to add in a host, which will probably be around 300W;
assuming you go for the S1070, the total is 4*78/(800+300).
a comparison CPU-based machine would be something like 2*51/350W.
amusingly, almost the same DP flops per watt ;)
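
spelling out that comparison as a quick sketch, using only the numbers
above (4 GPUs at 78 gflops in 800W plus a 300W host, versus 2 CPUs at
51 gflops in 350W).  the 4 DP flops/cycle/core used to reproduce the
51 gflops figure is my assumption, not something stated above.

```python
# DP gflops per watt for the two configurations described in the post.

def gflops_per_watt(units, gflops_each, watts_total):
    """Aggregate peak gflops divided by total system power in watts."""
    return units * gflops_each / watts_total

gpu_eff = gflops_per_watt(4, 78, 800 + 300)   # 4 GPUs plus host
cpu_eff = gflops_per_watt(2, 51, 350)         # dual-socket CPU box
ratio = gpu_eff / cpu_eff

# ASSUMPTION: a 3.2 GHz quad-core doing 4 DP flops per cycle per core
cpu_peak = 4 * 3.2 * 4

print(f"GPU system: {gpu_eff:.3f} DP gflops/W")
print(f"CPU system: {cpu_eff:.3f} DP gflops/W")
print(f"GPU/CPU:    {ratio:.3f}")
print(f"CPU peak:   {cpu_peak:.1f} DP gflops")
```

both come out a bit under 0.3 DP gflops/W, within a few percent of
each other.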

does anyone know whether the reputed hordes of commercial Cuda apps
mostly stick to SP?