<html>

<body>

Thanks for the results and the link.&nbsp; Section 6.7 of the NAS

Parallel Benchmarks report on MPI

(<a href="http://www.nas.nasa.gov/News/Techreports/1996/PDF/nas-96-010.pdf">

NPB 2.1 Results Report, NAS-96-010</a>, PDF, 213 KB) has a

discussion of the clustered-SMP issues raised so far in this

thread.&nbsp; It's interesting that these issues, discussed twelve years

ago, are coming around again.&nbsp; Plus &ccedil;a change..., I

suppose.<br><br>

That section also has a table of results for an SGI Power

Challenge Array showing that idling some processors on a given node and

using more nodes improves the speed per processor across four different code

kernels and two different problem sizes.&nbsp; This doesn't tell us how a

hybrid MP/MT application would behave within a 4-core, 2-CPU node, but it

does hint that memory contention can be just as nasty a problem as

high-latency message transmission.<br><br>
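To put rough numbers on the OpenMP versus MPI comparison, here is a minimal Python sketch that takes the Class B Mops figures from the table Doug posted (quoted further down) and computes the OpenMP-to-MPI ratio for each kernel.&nbsp; The names RESULTS, openmp_to_mpi_ratios and winners are made up for illustration; none of this comes from the NAS suite itself.<br><br>

```python
# Class B results in Mops, copied from the quoted table further down
# (OpenMP built with gcc/gfortran 4.2, MPI run under LAM 7.1.2).
# Each entry is (OpenMP Mops, MPI Mops).
RESULTS = {
    "CG": (790.6, 739.1),
    "EP": (166.5, 162.8),
    "FT": (3535.9, 2090.8),
    "IS": (51.1, 122.5),
    "LU": (5620.5, 5168.8),
    "MG": (1616.0, 2046.2),
}


def openmp_to_mpi_ratios(results):
    """Return {kernel: OpenMP Mops / MPI Mops}; above 1.0 favours OpenMP."""
    return {name: omp / mpi for name, (omp, mpi) in results.items()}


def winners(results):
    """Return {kernel: "OpenMP" or "MPI"}, whichever reported more Mops."""
    return {
        name: "OpenMP" if omp > mpi else "MPI"
        for name, (omp, mpi) in results.items()
    }


if __name__ == "__main__":
    ratios = openmp_to_mpi_ratios(RESULTS)
    ahead = winners(RESULTS)
    for name in RESULTS:
        print(f"{name}: OpenMP/MPI = {ratios[name]:.2f} ({ahead[name]} ahead)")
```

On these figures OpenMP is ahead on four kernels (CG, EP, FT, LU) and MPI on two (IS, MG), with FT and IS showing the largest gaps in either direction, which is consistent with calling it a draw of sorts.<br><br>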

<br>

Mike<br><br>

At 12:52 PM 12/10/2007, you wrote:<br>

<blockquote type=cite class=cite cite="">Some people had asked for more

details:<br><br>

NAS suite version 3.2.1<br>

Test class was: B<br>

Units are Mops (Million operations per second)<br>

see the NAS docs for more information<br><br>

--<br>

Doug<br><br>

<br>

&gt; I like answering these types of questions with numbers,<br>

&gt; so in my Sept 2007 Linux magazine column (which should<br>

&gt; be showing up on the website soon) I did the following.<br>

&gt;<br>

&gt; Downloaded the latest NAS benchmarks written in both<br>

&gt; OpenMP and MPI. Ran them both on an 8 core Clovertown<br>

&gt; (dual socket) system (multiple times) and reported<br>

&gt; the following results:<br>

&gt;<br>

&gt; Test&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; OpenMP&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI<br>

&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; gcc/gfortran 4.2&nbsp;&nbsp;&nbsp; LAM 7.1.2<br>

&gt; ------------------------------------<br>

&gt; CG&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 790.6&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 739.1<br>

&gt; EP&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 166.5&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 162.8<br>

&gt; FT&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 3535.9&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2090.8<br>

&gt; IS&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 51.1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 122.5<br>

&gt; LU&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 5620.5&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 5168.8<br>

&gt; MG&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1616.0&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2046.2<br>

&gt;<br>

&gt; My conclusion: it was a draw of sorts.<br>

&gt; The article was basically looking at the<br>

&gt; lazy assumption that threads (OpenMP) are<br>

&gt; always better than MPI on an SMP machine.<br>

&gt;<br>

&gt; I'm going to re-run the tests using Harpertowns<br>

&gt; real soon, maybe try other compilers and MPI<br>

&gt; versions. It is easy to do. You can get the code here:<br>

&gt;<br>

&gt;

<a href="http://www.nas.nasa.gov/Resources/Software/npb.html" eudora="autourl">

http://www.nas.nasa.gov/Resources/Software/npb.html</a><br>

&gt;<br>

&gt; --<br>

&gt; Doug<br>

&gt;<br>


&gt;&gt; On this list there is almost unanimous agreement that MPI is the way to go<br>

&gt;&gt; for parallelism and that combining multi-threading (MT) and message-passing<br>

&gt;&gt; (MP) is not even worth it, just sticking to MP is all that is necessary.<br>

&gt;&gt;<br>

&gt;&gt; However, in real life most people are talking about and investing in MT, while very few<br>

&gt;&gt; are interested in MP. I also just read on the blog of Arch Robison: &quot;TBB<br>

&gt;&gt; perhaps gives up a little performance short of optimal so you don't have to<br>

&gt;&gt; write message-passing&quot; (here:<br>

&gt;&gt;

<a href="http://softwareblogs.intel.com/2007/11/17/supercomputing-07-computer-environment-and-evolution/" eudora="autourl">

http://softwareblogs.intel.com/2007/11/17/supercomputing-07-computer-environment-and-evolution/</a>

<br>

&gt;&gt; )<br>

&gt;&gt;<br>

&gt;&gt; How come there is almost unanimous agreement in the Beowulf community<br>

&gt;&gt; while the rest are almost unanimously convinced of the opposite? Are we just<br>

&gt;&gt; patting ourselves on the back, or is MP not sufficiently disseminated, or ...?<br>

&gt;&gt;<br>

&gt;&gt; toon<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; _______________________________________________<br>

&gt;&gt; Beowulf mailing list, Beowulf@beowulf.org<br>

&gt;&gt; To change your subscription (digest mode or unsubscribe)

visit<br>

&gt;&gt;

<a href="http://www.beowulf.org/mailman/listinfo/beowulf" eudora="autourl">

http://www.beowulf.org/mailman/listinfo/beowulf</a><br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; <br>

&gt;&gt;<br>

&gt;<br>

&gt;<br>

&gt; --<br>

&gt; Doug<br>


&gt;<br><br>

<br>

--<br>

Doug<br>

</blockquote>



</body>

</html>