mpirun + Scyld MPI

Zukaitis, Anthony ZukaitAJ at nv.doe.gov
Wed Nov 12 09:43:21 EST 2003


I am currently using the MPI distributed with Scyld, which I believe is MPICH.  I
have 6 dual-CPU nodes for a total of 12 CPUs.  Whenever I try to use 12
processors, it puts 3 processes on one of the nodes and only one process on
the master node.  I have tried using a machinefile like


master:2
.0:2
.1:2
.2:2
.3:2
.4:2

and -map, and it doesn't seem to help.  Any hints?
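
For concreteness, the invocations I have been trying look roughly like this
(./myprog stands in for the real binary, and I am quoting the -map list from
memory, so the exact syntax may be slightly off -- see the mpirun and beomap
man pages):

mpirun -np 12 -machinefile machines ./myprog
mpirun -np 12 -map -1:-1:0:0:1:1:2:2:3:3:4:4 ./myprog
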
-----Original Message-----
From: beowulf-request at scyld.com [mailto:beowulf-request at scyld.com]
Sent: Friday, November 07, 2003 10:04 AM
To: beowulf at beowulf.org
Subject: Beowulf digest, Vol 1 #1533 - 13 msgs


Send Beowulf mailing list submissions to
	beowulf at beowulf.org

To subscribe or unsubscribe via the World Wide Web, visit
	http://www.beowulf.org/mailman/listinfo/beowulf
or, via email, send a message with subject or body 'help' to
	beowulf-request at beowulf.org

You can reach the person managing the list at
	beowulf-admin at beowulf.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beowulf digest..."


Today's Topics:

   1. Re: Scyld and MPICH. (William Gropp)
   2. Re: Tyan S2880 (K8S) /S2885 (K8W) Opteron boards under Linux
      (Glen Kaukola)
   3. Tyan 2880 and 2885 (Mike Sullivan)
   4. Article: Sony Cell CPU to deliver two teraflops in 64-core config
      (Tod Hagan)
   5. Re: Cluster Poll Results (tangent into OS choices) (Åsmund Ødegård)
   6. Linux vs FreeBSD clusters (was: how are the Redhat product changes
      affecting existing and future plans?) (Rayson Ho)
   7. INTEL DISCOVERS NEW CHIP-SHRINKING MATERIAL (Joey Sims)
   8. Re: Linux vs FreeBSD clusters (was: how are the Redhat product changes
      affecting existing and future plans?) (Craig Rodrigues)
   9. Re: Article: Sony Cell CPU to deliver two teraflops in 64-core
      config (John Hearns)
  10. OctigaBay 12K (Franz Marini)
  11. Re: OctigaBay 12K (Robert G. Brown)
  12. Re: Linux vs FreeBSD clusters (was: how are the Redhat product changes
      affecting existing and future plans?) (Jan Schaumann)

--__--__--

Message: 1
Date: Thu, 06 Nov 2003 11:48:24 -0600
To: "Zukaitis, Anthony" <ZukaitAJ at nv.doe.gov>
From: William Gropp <gropp at mcs.anl.gov>
Subject: Re: Scyld and MPICH.
Cc: "'beowulf at scyld.com'" <beowulf at scyld.com>, mpi-maint at mcs.anl.gov

At 10:55 AM 11/6/2003, Zukaitis, Anthony wrote:
>I am having a problem with MPI_Reduce and I believe that it is a buffer
>size error.  Is there a way to calculate the maximum size of the buffer,
>and what is the maximum size of the buffer allowed?  It does not seem to
>be linear with the number of processors.

There should be no maximum buffer size, though the ch_p4 device does impose
a limit when shared memory is used to transfer a message.  Do you have an
example program that we could test?  (Bug reports for MPICH should be sent
to mpi-maint at mcs.anl.gov.)

Bill  
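
A bare-bones test case along the following lines is the kind of thing to
send; this is only a sketch, and the element count N is an arbitrary
placeholder to vary until the failure shows up:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N (1 << 20)   /* placeholder buffer size; vary to find the failure point */

int main(int argc, char **argv)
{
    int rank, size, i;
    double *sendbuf, *recvbuf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    sendbuf = malloc(N * sizeof(double));
    recvbuf = malloc(N * sizeof(double));
    for (i = 0; i < N; i++)
        sendbuf[i] = 1.0;

    /* Sum the buffers from all ranks onto rank 0. */
    MPI_Reduce(sendbuf, recvbuf, N, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("recvbuf[0] = %g (expected %d)\n", recvbuf[0], size);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}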


--__--__--

Message: 2
Date: Thu, 06 Nov 2003 10:37:59 -0800
From: Glen Kaukola <glen at cert.ucr.edu>
To: Konstantin Kudin <konstantin_kudin at yahoo.com>
CC: beowulf at beowulf.org
Subject: Re: Tyan S2880 (K8S) /S2885 (K8W) Opteron boards under Linux

Konstantin Kudin wrote:

> Could anyone please share experiences with these
>boards under linux? Is it still a risky proposition at
>this time?
>  
>

We have a few of the S2880s.  They were really problematic at first, in
that they'd constantly crash.  But it turned out that when I downgraded
the BIOS, all of our problems went away.  Of course, I also needed to
install the latest 2.4.22 kernel before the machines would boot with the
older BIOS installed.

I'm not sure what to tell you about the Serial ATA support, as I've
never played with it.  Linux seems to support the NIC just fine, though.

Hope that helps,
Glen


--__--__--

Message: 3
Date: Thu, 06 Nov 2003 13:39:51 -0500
From: Mike Sullivan <mike.sullivan at alltec.com>
Reply-To: mike.sullivan at alltec.com
To: beowulf at beowulf.org
Subject: Tyan 2880 and 2885

>Could anyone please share experiences with these
>boards under linux? Is it still a risky proposition at
>this time?


I have used the 2880 under Red Hat AS 2.1 and gingin64, and
it works fine except for the SATA controller.  I did not
get the Promise chip to work, but I did not spend a lot
of time on it.  The GigE interface works.  The boards have
been stable and I have been using them in NAS devices
with 3ware cards.  The SMDC option for these units
works fairly well with the most recent console, and
you can get sensor data.

 
> It seem like there are drivers for AMD-8111/8131/8151
>chipset on the AMD page, drivers for the Broadcom
>network chip in other places. Any feedback on SATA
>support for the Silicon Image Sil3114 SATA RAID
>Accelerator and on SATA support in general? Any other
>caveats?

I also have both a 2882 and a 2885 that I will be testing
early next week with SuSE Linux 9 for AMD64, and I will
post my findings.


> Thanks in advance for any help!
>
> Konstantin


-- 
Mike Sullivan                           Director Performance Computing
@lliance Technologies,                  Voice: (416) 385-3255 x 228, 
18 Wynford Dr, Suite 407                Fax:   (416) 385-1774
Toronto, ON, Canada, M3C-3S2            Toll Free:1-877-216-3199
http://www.alltec.com




--__--__--

Message: 4
Subject: Article: Sony Cell CPU to deliver two teraflops in 64-core config
From: Tod Hagan <tod at gust.sr.unh.edu>
To: Beowulf List <beowulf at beowulf.org>
Date: 06 Nov 2003 15:02:25 -0500

http://www.theregister.co.uk/content/3/33791.html

It also mentions the ClearSpeed chip that was discussed here recently.



--__--__--

Message: 5
Date: Thu, 06 Nov 2003 23:52:28 +0100
To: beowulf at beowulf.org
Subject: Re: Cluster Poll Results (tangent into OS choices)
Reply-To: aasmund at simula.no
From: Åsmund Ødegård <aasmund at simula.no>
Organization: Simula Research Laboratory AS

On Wed, 5 Nov 2003 00:05:13 +0000, Andrew M.A. Cater 
<amacater at galactic.demon.co.uk> wrote:

>
> On Tue, Nov 04, 2003 at 05:50:57PM -0500, Joe Landman wrote:
>>
>> There are interesting bits in debian.  I am not sure it is necessarily
>> the right choice for clusters due to the specific lack of commercial
>> support for cluster specific items such as Myrinet, and the other high
>> speed interconnects.
>
> Dan - if I build a _really big_ cluster, will you get Quadrics to do
> Debian :)
> Same goes for any other vendor - if you ask them nicely and make it
> worth their while, they'll do it.  In many cases, it's only a recompile
> of a device driver to account for library differences, after all.
>
> HP use Debian internally, IIRC.  Some of the Debian developers are also
> HP folk - HP are potentially looking to support more of their products
> under Linux? [See, for example, Debian Weekly News for today :) ]

Actually, we quite recently installed an Itanium2-based cluster running
Debian, because we want Debian.  We got HP to do it for us, using the
(former Compaq) CMU tool.  They did some porting to support Debian in
this tool...

So, ask nicely (and put it as a requirement for them to get the deal), and
you can get whatever you want ;-)

>> Commercial compiler support for Debian (e.g.
>> Intel, Absoft, et al) is largely non-existant as far as I know (please
>> do correct me if I am wrong).

No problem with Intel compilers on Debian (alien does the trick).
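
Roughly, it comes down to converting the vendor RPMs with alien and
installing the resulting .debs as root (the package name below is just a
placeholder for whatever the Intel installer ships):

  fakeroot alien --to-deb intel-compiler.rpm
  dpkg -i intel-compiler*.deb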



-- 
[simula.research laboratory]
                 Åsmund Ødegård
                 Scientific Programmer / Chief Sys.Adm
                 phone: 67828291 / 90069915
                 http://www.simula.no/~aasmundo

--__--__--

Message: 6
Date: Thu, 6 Nov 2003 16:59:51 -0800 (PST)
From: Rayson Ho <raysonlogin at yahoo.com>
Subject: Linux vs FreeBSD clusters (was: how are the Redhat product changes
affecting existing and future plans?)
To: bioclusters at bioinformatics.org, beowulf <beowulf at beowulf.org>,
   Linux Cluster <linux-cluster at nl.linux.org>,
   List <freebsd-hackers at freebsd.org>

A very good paper about building HPC clusters with FreeBSD:

"Building a High-performance Computing Cluster Using FreeBSD"

http://people.freebsd.org/~brooks/papers/bsdcon2003/

The author covers hardware issues (KVM, BIOS redirection, CPU choices),
then explains why he chose FreeBSD instead of Linux... He also did the
port of GridEngine (SGE) to FreeBSD.

Has anyone tried to set up an HPC cluster with *BSD?

Rayson


--- Fernan Aguero <fernan at iib.unsam.edu.ar> wrote:
> Any FreeBSD users willing to share clustering experiences
> out there?
> 
> Fernan




--__--__--

Message: 7
Subject: INTEL DISCOVERS NEW CHIP-SHRINKING MATERIAL

Date: Thu, 6 Nov 2003 22:07:53 -0500
From: "Joey Sims" <jsims at csiopen.com>
To: <beowulf at beowulf.org>

Maybe someone could lend a hand and help Intel find out what their
unknown material is.  Be careful!  Don't spill it in your lap, for
goodness' sake.... Dohh! :-O

I found this amusing:

INTEL DISCOVERS NEW CHIP-SHRINKING MATERIAL
11.07.03
by Jennifer Tabor
HPCwire
==============================================================================

Chip makers are searching for ways to create smaller and smaller
computer chips, and researchers at Intel believe they have discovered a
new material that would help them to do just that.

Intel's announcement will garner much attention in an industry where the
demand for products that push fundamental physical limits is ever
increasing.

A problem afflicting many chip makers today is preventing electrical
currents from leaking outside their proper paths.  Because the
transistor gates are now becoming as small as just five atomic layers,
chips need more power.  In turn, they also need a more efficient
cooling system.

Intel has been having difficulties with the cooling of its chips -- the
smaller they get (with etchings as small as 90-130 nanometers), the
hotter they become.  Recent reports say that the problem has even caused
a delay in the Prescott, Intel's most advanced version of the Pentium.

Though the new technology would not debut until approximately 2007,
Intel is planning to scale down their current 90 nanometer chip size
over the years to 65, followed by 45.  It is at this point that Intel's
new material, which is still unknown, would be introduced.

Intel's discovery comes at the height of an intense industry-wide search
for a new material to replace silicon dioxide, which is used as an
insulator between the gate and the channel through which current flows
in an active transistor.

Intel researchers have been working on solving the chip predicament for
five years in efforts to keep pace with Moore's Law.  Gordon E. Moore,
co-founder of Intel, believed that the number of transistors in the same
space should double every 18 months.

Intel believes they can continue to make short strides, despite the
thoughts of many who doubt their ability to keep up such a pace.

Though many researchers and competitors agree that Intel's announcement
revolves around the most important research area in the chip industry,
some feel that the lack of specific technical detail will deter
scientists from assessing their claims.                       

==================================================
Joey P. Sims			  800.995.4274 - 242
Sales Manager			  770.442.5896 - Fax
HPC/Storage Division		     www.csilabs.net
Concentric Systems, Inc.	   jsims at csiopen.com
====================================ISO9001:2000==


--__--__--

Message: 8
Date: Thu, 6 Nov 2003 23:04:15 -0500
From: Craig Rodrigues <rodrigc at crodrigues.org>
To: Rayson Ho <raysonlogin at yahoo.com>
Cc: bioclusters at bioinformatics.org, beowulf <beowulf at beowulf.org>,
   Linux Cluster <linux-cluster at nl.linux.org>,
   List <freebsd-hackers at freebsd.org>
Subject: Re: Linux vs FreeBSD clusters (was: how are the Redhat product
changes affecting existing and future plans?)

On Thu, Nov 06, 2003 at 04:59:51PM -0800, Rayson Ho wrote:
> A very good paper about building HPC clusters with FreeBSD:
> 
> "Building a High-performance Computing Cluster Using FreeBSD"
> 
> http://people.freebsd.org/~brooks/papers/bsdcon2003/
> 
> The author talked about hardware issues: KVM, BIOS redirection, CPU
> choices; and then talked about why he chose FreeBSD instead of Linux...
> he also did the port of GridEngine (SGE) to FreeBSD.
> 
> Anyone tried to setup HPC clusters with *BSD??


Hi,

Not quite the same as an HPC cluster, but take
a look at the University of Utah's Emulab:

http://www.emulab.net

It is heavily based on FreeBSD (i.e. it makes use of FreeBSD routing,
Dummynet, etc.).  Emulab is a remotely accessible testbed that
researchers can use to conduct network experiments.  It consists of
about 200 PC nodes.  The same company that Brooks works for (Aerospace)
has apparently set up an internal testbed based on the Emulab software
developed at Utah.

I use Emulab every day as part of my research work at BBN, and it is an
excellent facility.

-- 
Craig Rodrigues        
http://crodrigues.org
rodrigc at crodrigues.org

--__--__--

Message: 9
Subject: Re: Article: Sony Cell CPU to deliver two teraflops in 64-core
	config
From: John Hearns <john.hearns at clustervision.com>
To: beowulf at beowulf.org
Organization: Clustervision
Date: Fri, 07 Nov 2003 10:13:40 +0100

And also on The Reg:

http://www.theregister.co.uk/content/3/33813.html

The Reg reckons Opteron 250s by early next year.


--__--__--

Message: 10
Date: Fri, 7 Nov 2003 13:56:28 +0100 (CET)
From: Franz Marini <franz.marini at mi.infn.it>
To: beowulf at beowulf.org
Subject: OctigaBay 12K

Hello,

  just discovered this interesting (IMHO) company and its first product:

  http://www.octigabay.com/

  Their first product is a Linux Opteron-based cluster that they say
could scale up to 12K processors.  The base system is a 3.5U shelf with 12
Opterons, 1 Tb/s aggregate switching capacity, 1 microsecond interprocessor
latency, and 77 GB/s aggregate memory bandwidth.

  Seems nice; I would like to know what rgb and some of the other people
in here think about it :)

  Have a nice day,

Franz
 



---------------------------------------------------------
Franz Marini
Sys Admin and Software Analyst,
Dept. of Physics, University of Milan, Italy.
email : franz.marini at mi.infn.it
--------------------------------------------------------- 


--__--__--

Message: 11
Date: Fri, 7 Nov 2003 08:44:11 -0500 (EST)
From: "Robert G. Brown" <rgb at phy.duke.edu>
To: Franz Marini <franz.marini at mi.infn.it>
Cc: beowulf at beowulf.org
Subject: Re: OctigaBay 12K

On Fri, 7 Nov 2003, Franz Marini wrote:

> Hello,
> 
>   just discover this interesting, imho, company and its first product :
> 
>   http://www.octigabay.com/
> 
>   Their first product is a linux opteron-based cluster that they said 
> could scale up to 12K processors. The base system is a 3.5U shelf with 12 
> opterons, 1Tb/s aggregate switching capacity, 1 microsec interprocessor 
> latency and 77GB/s aggregate mem bandwidth.
> 
>   Seems nice, I would like to know what rgb and some of the other people 
> in here think about it :)

Why, it looks simply lovely, as hardware I've never actually tried goes.
I mean, if the OctigaBay people want to send me one for free just so I
can write a review of it on this list and the brahma website, well,
from the look of it I wouldn't kick it out of my machine room for
chewing crackers... and I >>can<< be bought, folks, yes I can, just look
at the brahma vendors page and my brazen demand for t-shirts in exchange
for space :-)  I'll even dig up something fine-grained to run on it so
that I can pretend to really test it.

The bottom line is, well, the bottom line.  Pretty isn't enough.
Performance (even performance that is absolutely everything promised)
isn't enough.  It is PRICE/performance that matters, or better yet
cost-benefit: how does the cost compare to the benefits the design
delivers in your environment?

For my own personal code, for example, I don't NEED their fancy
interconnect, and I can rack up a bunch of Opterons for the cost of the
basic hardware and a nice case to put them in.  They'd therefore have to
literally give it to me to make it a cost-benefit win (especially true
since I just spent the last of my money in this grant cycle buying, hey,
whaddya know, a stack of 9 dual Opteron 242s for a hair over $20K).
However, there are people out there who run fine-grained synchronous
parallel code that is bottlenecked at the network IPC level.  Even THERE
the computations have some intrinsic "value" in that there are finite
amounts of money people are willing to pay to get them done, and there
are choices.  So ultimately it will come down to whether there is a
match between the value of the computation (the amount people are willing
to pay to get it done), the needs of the computation, and the marketplace.

It's one of these people that you need to ask about whether or not this
is a good deal or a good arrangement.  My knee-jerk reaction is that it is
lovely but a bit too far into the big-iron side (SP3-ish) to be likely
to win a hard-nosed CB comparison relative to a DIY cluster with e.g.
Myrinet or SCI for MANY clustervolken (the market gets smaller and
smaller the further up one travels toward super-high-speed networks), but
corporate consumers and the larger government consumers shy away from
DIY, and even in the intermediate market it comes down to
price/performance, eh?  If they price it competitively with the other
high-speed networks and it has clear benefits (as it looks like it
might), well then, who knows?

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu




--__--__--

Message: 12
Date: Fri, 7 Nov 2003 10:00:56 -0500
From: Jan Schaumann <jschauma at netmeister.org>
To: beowulf at beowulf.org
Subject: Re: Linux vs FreeBSD clusters (was: how are the Redhat product
changes affecting existing and future plans?)

[Resending; this message was originally sent last night across the
various mailing lists, but beowulf at beowulf.org chokes on the GPG
signature. :-/ ]

Rayson Ho <raysonlogin at yahoo.com> wrote:
> A very good paper about building HPC clusters with FreeBSD:
> 
> "Building a High-performance Computing Cluster Using FreeBSD"
> 
> http://people.freebsd.org/~brooks/papers/bsdcon2003/
> 
> The author talked about hardware issues: KVM, BIOS redirection, CPU
> choices; and then talked about why he chose FreeBSD instead of Linux...
> he also did the port of GridEngine (SGE) to FreeBSD.
> 
> Anyone tried to setup HPC clusters with *BSD??

I have a 30-node NetBSD/i386 cluster, and just recently created the
tech-cluster at netbsd.org mailing list.  Some people are working on a port
of SGE to NetBSD, too.  I hope to expand awareness of NetBSD, in
particular for cluster use, in the near future.

Some URLs of relevance:

http://guinness.cs.stevens-tech.edu/~jschauma/hpcf/
http://www.netbsd.org/MailingLists/#tech-cluster
http://www.netbsd.org/
http://eurobsdcon.org/papers/#souvatzis
http://bsd.slashdot.org/article.pl?sid=03/10/20/1523252&mode=thread&tid=122&tid=185&tid=190
http://bsd.slashdot.org/bsd/03/11/05/1536226.shtml?tid=122&tid=185&tid=190

-Jan

-- 
Life," said Marvin, "don't talk to me about life."


--__--__--

_______________________________________________
Beowulf mailing list
Beowulf at beowulf.org
http://www.beowulf.org/mailman/listinfo/beowulf


End of Beowulf Digest
