P4 dual vs P4C vs Opteron
Bill Broadley
bill at math.ucdavis.edu
Thu Jul 17 02:45:58 EDT 2003
I have been evaluating price/performance using a locally written
earthquake simulation code. It is written in C, is mostly floating point,
and is not very cache friendly. I thought people might be interested in the
performance numbers I collected.
gcc-3.2.2 was used in all cases with the -O3 flag (each binary was compiled
on the machine it ran on).
Dual P4-3.0 GHz/533 MHz machine, no HT
1 process took 86.43 seconds.
2 processes in parallel took 156.9 seconds.
Scaling efficiency =~ 10% (2 processes run at the same time have 10% greater
throughput than a single process on a single CPU)
Dual Opteron 240-1.4 GHz/333 MHz
1 process took 97.87 seconds.
2 processes in parallel took 99.79 seconds.
Scaling efficiency =~ 96% (2 processes run at the same time have 96% greater
throughput than a single process on a single CPU)
Single P4C-2.6 GHz/800 MHz FSB with HT enabled.
1 process took 81.22 seconds.
2 processes in parallel took 137.59 seconds.
Scaling efficiency =~ 18% (2 processes run at the same time have 18% greater
throughput than a single process on a single CPU)
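For anyone who wants to check the scaling figures, here is a minimal sketch
of the arithmetic, assuming "scaling efficiency" means the throughput gain of
two concurrent processes over one process run alone. The function name
throughput_gain and the dictionary labels are my own and are not part of the
simulation code.

```python
def throughput_gain(t_single, t_pair):
    """Fractional throughput gain of two concurrent processes over one.

    t_single: seconds for one process run alone.
    t_pair:   seconds for two processes run at the same time.
    """
    # Two jobs complete in t_pair seconds, versus one job in t_single.
    return 2.0 * t_single / t_pair - 1.0


# Wall-clock times in seconds, copied from the results above.
timings = {
    "Dual P4-3.0": (86.43, 156.9),
    "Dual Opteron 240": (97.87, 99.79),
    "P4C-2.6 with HT": (81.22, 137.59),
}

for name, (t1, t2) in timings.items():
    print(f"{name}: {100 * throughput_gain(t1, t2):.1f}% gain")
```

Under that definition the numbers come out at roughly 10%, 96% and 18%, in
the same order as the machines listed above.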
I'd also like to do a performance-per-watt comparison. Does anyone have a
>= 2.6 GHz dual P4 with a 533 MHz FSB, a rackmount motherboard, and a
kill-a-watt? Unfortunately my dual P4 has a fast 3D card, which would throw
off my performance-per-watt calculations.
I found it amusing that Hyper-Threading scaled somewhat poorly, but still
managed to outscale and outperform the dual P4, despite a significantly
slower clock.
So the P4C-2.6 is the fastest for a single job and the Opteron (the slowest
model sold) is the fastest for 2 jobs. For the curious, I'm seeing around
1.8 amps @ 110V running the dual Opteron with 2 busy CPUs.
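As a rough sketch of what a performance-per-watt figure could look like for
the dual Opteron, the snippet below multiplies the current and voltage
quoted above. This assumes a power factor of 1, so the result is apparent
power in volt-amps and the real draw in watts is probably lower. The function
names are hypothetical.

```python
def apparent_power_va(amps, volts):
    """Apparent power in volt-amps; equals watts only at power factor 1."""
    return amps * volts


def energy_per_job_j(power_w, elapsed_s, jobs):
    """Energy per completed job in joules, assuming constant power draw."""
    return power_w * elapsed_s / jobs


# Dual Opteron figures from this post: 1.8 A at 110 V, 2 jobs in 99.79 s.
power = apparent_power_va(1.8, 110.0)
energy = energy_per_job_j(power, 99.79, 2)

print(f"apparent power: {power:.0f} VA")
print(f"energy per job: {energy / 1000:.1f} kJ (upper bound, power factor 1)")
```

That gives about 198 VA and a little under 10 kJ per job. A kill-a-watt
reading in watts would be the number to use for a real comparison.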
--
Bill Broadley
Mathematics
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf