From pegu at dolphinics.no Mon Mar 1 03:45:32 2004 From: pegu at dolphinics.no (Petter Gustad) Date: Mon, 01 Mar 2004 09:45:32 +0100 (CET) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds Message-ID: <20040301.094532.17863925.pegu@dolphinics.no> Taken from: http://www.dolphinics.no/news/2004/2_25.html Dolphin SCI Socket Software Delivers Record Breaking Latency New evaluation kit available at special pricing Clinton, MA and Oslo, Norway, Feb 26, 2004 Dolphin Interconnect today announced that the SCI Socket version 1.0 software library is now available to customers for high-performance computing applications interconnected with Dolphin SCI adapters. SCI Socket enables standard Berkeley sockets to use the Scalable Coherent Interface (SCI) as a transport medium with its high bandwidth and extremely low latency. "This is the lowest latency socket solution available today," said Hugo Kohmann, CTO for Dolphin Interconnect. "SCI Socket opens new high-performance possibilities for a broad range of networking applications." Dolphin has benchmarked a complete one-byte socket send/socket receive latency at 2.27 microseconds, which corresponds to more than 203,800 roundtrip transactions per second. Benchmarks using Netperf also show more than 255 MBytes/s (2,035 Megabits/s) sustained throughput using standard TCP STREAM sockets. The SCI Socket software uses Dolphin's SISCI API as its transport and most of the communication takes place in user space, avoiding time-consuming system calls and networking protocols. SCI remote memory access provides a fast and reliable connection. "These record-setting performance benchmarks underscore the capabilities of the SCI standard as a high-performance interconnect," said Kare Lochsen, CEO of Dolphin Interconnect. "Dolphin has extensive expertise in this technology having developed the first SCI-based interconnect soon after it became an IEEE standard in 1992, and we remain committed to keeping SCI at the most competitive performance levels in the future." SCI Socket requires no operating system patches or application modifications to run the software. SCI Socket is open source software available under LGPL/GPL and supports all popular Linux distributions for x86 and x86/Opteron. In Dolphin testing, the lowest latency was achieved using AMD Opteron (X86_64) processors. Support for UDP and Microsoft Windows is planned. Dolphin SCI adapters are used to build server clusters for high-performance computing and in a wide range of embedded real-time computing applications including reflective memory, simulation and visualization systems, and systems requiring high availability and fast failover. For a limited time, an evaluation kit consisting of two PCI-SCI adapter cards and cables is available directly from Dolphin Interconnect at a substantial discount. When installed in a user's application platform, the evaluation kit enables effective testing of the SCI Socket software. The software and documentation are included at no charge. Please visit the Dolphin web site for more information at www.dolphinics.com/eval. foobar GmbH (www.foobar-cpa.de), a software development and consulting firm with particular expertise in SCI and located in Chemnitz, Germany, assisted Dolphin in the development of SCI Socket.
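As a point of reference for what a "one byte socket send/socket receive latency" test actually measures, here is a minimal sketch of the usual ping-pong technique over plain Berkeley TCP sockets. It is not Dolphin's benchmark code; the peer is assumed to run a trivial one-byte echo loop, and the host and port are placeholder arguments.

    /* pingpong_client.c -- rough sketch of a one-byte socket latency test.
     * The peer is assumed to echo each byte back (recv 1 byte, send it back).
     * The reported number is half the average round trip, i.e. the kind of
     * "send/receive latency" figure quoted in the press release.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/time.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <arpa/inet.h>

    int main(int argc, char **argv)
    {
        const char *host = (argc > 1) ? argv[1] : "127.0.0.1";
        int port = (argc > 2) ? atoi(argv[2]) : 5000;
        int iters = 100000;
        int one = 1;
        char byte = 'x';
        struct sockaddr_in addr;
        struct timeval t0, t1;
        double usec;
        int s, i;

        s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0) { perror("socket"); return 1; }
        /* disable Nagle so each one-byte write goes out immediately */
        setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        addr.sin_addr.s_addr = inet_addr(host);
        if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }

        gettimeofday(&t0, NULL);
        for (i = 0; i < iters; i++) {
            if (write(s, &byte, 1) != 1 || read(s, &byte, 1) != 1) {
                perror("ping-pong");
                return 1;
            }
        }
        gettimeofday(&t1, NULL);

        usec = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
        printf("one-way latency: %.2f usec\n", usec / (2.0 * iters));
        close(s);
        return 0;
    }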
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Mon Mar 1 08:16:28 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Mon, 1 Mar 2004 21:16:28 +0800 (CST) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: <20040301131628.90307.qmail@web16813.mail.tpe.yahoo.com> > In Dolphin testing, the > lowest latency was > achieved using AMD Opteron (X86_64) processors. No wonder Intel killed IA64 and released 64-bit x86 (aka IA32e) a week or two ago... Andrew. ----------------------------------------------------------------- http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mathiasbrito at yahoo.com.br Mon Mar 1 08:59:19 2004 From: mathiasbrito at yahoo.com.br (=?iso-8859-1?q?Mathias=20Brito?=) Date: Mon, 1 Mar 2004 10:59:19 -0300 (ART) Subject: [Beowulf] Mpirun error Message-ID: <20040301135919.45861.qmail@web12202.mail.yahoo.com> I installed the latest version of mpich on my personal computer to simulate my parallel programs. I can compile my programs without problem, but when I try to run them I receive the following error message: p0_6941: p4_error: Path to program is invalid while starting /home/mathias/mpi/bubble with RSHCOMMAND on linux: -1 p4_error: latest msg from perror: No such file or directory What can I do? Thanks ===== Mathias Brito Universidade Estadual de Santa Cruz - UESC Departamento de Ciências Exatas e Tecnológicas Estudante do Curso de Ciência da Computação ______________________________________________________________________ Yahoo! Mail - O melhor e-mail do Brasil! Abra sua conta agora: http://br.yahoo.com/info/mail.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bogdan.costescu at iwr.uni-heidelberg.de Mon Mar 1 10:49:39 2004 From: bogdan.costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Mon, 1 Mar 2004 16:49:39 +0100 (CET) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: On Mon, 1 Mar 2004, Petter Gustad wrote: > Dolphin has benchmarked a completed one byte socket send/socket > receive latency at 2.27 microseconds, Is this in polling mode or interrupt-driven ? I'm interested to see if I can do something useful (like computation) _and_ get such low latency. > Benchmarks using Netperf also show more than 255 MBytes (2,035 > Megabits/s) sustained throughput using standard TCP STREAM sockets. What is the CPU usage for this throughput ?
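A quick way to answer both of Bogdan's questions on one's own hardware is netperf itself. If memory serves (treat the exact option names as a sketch rather than gospel), the -c and -C switches report local and remote CPU utilisation next to the throughput figure, and the TCP_RR test gives a one-byte transaction rate directly comparable to the roundtrip numbers in the press release; "peer" below is just a placeholder for the remote hostname:

    netperf -H peer -t TCP_STREAM -l 30 -c -C      (bulk throughput plus CPU usage at both ends)
    netperf -H peer -t TCP_RR -l 30 -- -r 1,1      (one-byte request/response transactions per second)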
-- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kfarmer at linuxhpc.org Mon Mar 1 09:51:39 2004 From: kfarmer at linuxhpc.org (Kenneth Farmer) Date: Mon, 1 Mar 2004 09:51:39 -0500 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds References: <20040301131628.90307.qmail@web16813.mail.tpe.yahoo.com> Message-ID: <097701c3ff9c$b465fe30$1601a8c0@deskpro> ----- Original Message ----- From: "Andrew Wang" To: Sent: Monday, March 01, 2004 8:16 AM Subject: Re: [Beowulf] SCI Socket latency: 2.27 microseconds > > In Dolphin testing, the > > lowest latency was > > achieved using AMD Opteron (X86_64) processors. > > No wonder Intel killed IA64 and released 64-bit x86 > (aka IA32e) a week or two ago... > > Andrew. Intel killed IA64? Where did you come up with that? -- Kenneth Farmer <>< LinuxHPC.org _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Mar 1 11:35:56 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 1 Mar 2004 11:35:56 -0500 (EST) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: well, this is interesting. it appears that AMD has given all interconnect vendors a boost, since Myri and Quadrics seem to like Opterons as well ;) > "This is the lowest latency socket solution available today," said > Hugo Kohmann, CTO for Dolphin Interconnect. "SCI Socket opens new well, Quadrics now claims 1.8 us MPI latency: http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD it's interesting that SCI is still on 64x66 PCI - it would be very interesting to know how many and what kinds of codes really require higher bandwidth. yes, some vendors (esp IB) are pushing PCI-express as bandwith salvation, but afaikt, none of my users need even >500 MB/s today. it doesn't seem like PCI-express will be any kind of major win in small-packet latency... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From csamuel at vpac.org Mon Mar 1 17:09:55 2004 From: csamuel at vpac.org (Chris Samuel) Date: Tue, 2 Mar 2004 09:09:55 +1100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <097701c3ff9c$b465fe30$1601a8c0@deskpro> References: <20040301131628.90307.qmail@web16813.mail.tpe.yahoo.com> <097701c3ff9c$b465fe30$1601a8c0@deskpro> Message-ID: <200403020910.02925.csamuel@vpac.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Tue, 2 Mar 2004 01:51 am, Kenneth Farmer wrote: > From: "Andrew Wang" > > > No wonder Intel killed IA64 and released 64-bit x86 > > (aka IA32e) a week or two ago... > > Intel killed IA64? Where did you come up with that? Intel certainly haven't announced the death of Itanium, but you've got to wonder about its long term future when Intel start producing 64-bit AMD compatible chips. Also see [1] below. 
This is more the question of what will the market do when choosing between them, especially as HPC is only really a niche (though a fairly high spending one) compared to the general computing market. The big advantage AMD have is that "legacy" 32-bit apps will be around for a long long time to come (look at the mass clamour for MS to continue supporting Win98, something they'd hoped would be dead a long time ago) and that gives the hybrids a big advantage in the general market. I guess it comes down to a business decision on Intel as to whether they feel the demand for Itanium is enough to justify its continued development. Note that I'm not saying the demand per se isn't there, I've got absolutely no idea on the matter! cheers, Chris [1] - for those who haven't seen it, here's Linus's response to the launch: http://kerneltrap.org/node/view/2466 - -- Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin Victorian Partnership for Advanced Computing http://www.vpac.org/ Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQFAQ7S2O2KABBYQAh8RAlA/AJ4yzNxJcXZc3e8I8CtYjgScQOCpUwCfdVzF lpG7iEOXSo3+xAK73kNb9c0= =eYRs -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Mon Mar 1 16:38:52 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Mon, 1 Mar 2004 13:38:52 -0800 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: <20040301213852.GA28803@cse.ucdavis.edu> > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD Note the title says "sub 2us" and the body says "close to" 1.8us. Of more interest (to me) is that further down they say: In the next quarter, Quadrics will announce a series of highly competitive switch configurations making QsNetII more cost-effective for medium sized cluster configuration deployment. Sounds like more competition for IB, Myrinet and Dolphin. Hopefully anyways. Cool, found a quadrics price list: http://doc.quadrics.com/Quadrics/QuadricsHome.nsf/DisplayPages/A3EE4AED738B6E2480256DD30057B227 http://tinyurl.com/2sn2b Looks like $3k per node or so for 64, and $4k per node for 1024, I'm guessing that is list price and is somewhat negotiable. According to my sc2003 notes the Quadrics latency was: 100ns for the sending elan4 300ns for the 128 node switch and 20 meters of cable 130ns for the receiving card. 2420ns for two trips across the PCI-X bus and a main memory write ================ 2950ns for an mpi message between 2 nodes. Anyone know what changes to get this number down to 1.8us - 2.0us? > higher bandwidth. yes, some vendors (esp IB) are pushing PCI-express > as bandwith salvation, but afaikt, none of my users need even >500 MB/s > today. it doesn't seem like PCI-express will be any kind of major win > in small-packet latency... Anyone have an expected timetable for PCI-express connected interconnect cards? Anyone have projected PCI-express latencies vs PCI-X (133 MHz/64 bit)? 
-- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From patrick at myri.com Mon Mar 1 17:40:55 2004 From: patrick at myri.com (Patrick Geoffray) Date: Mon, 01 Mar 2004 17:40:55 -0500 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: Message-ID: <4043BBF7.9090706@myri.com> Mark Hahn wrote: > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD Hum, this one claims "under 3us": http://doc.quadrics.com/quadrics/QuadricsHome.nsf/PageSectionsByName/F6E4FE91508A319580256D5900447E40/$File/QsNetII+Performance+Evaluation+ltr.pdf Maybe the 1.8us is a one-sided MPI latency, aka a PUT ? Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Mon Mar 1 18:47:33 2004 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Mon, 1 Mar 2004 23:47:33 +0000 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301213852.GA28803@cse.ucdavis.edu> References: <20040301.094532.17863925.pegu@dolphinics.no> <20040301213852.GA28803@cse.ucdavis.edu> Message-ID: <200403012347.33322.daniel.kidger@quadrics.com> On Monday 01 March 2004 9:38 pm, Bill Broadley wrote: > > well, Quadrics now claims 1.8 us MPI latency: > > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B > >398280256E44005A31DD > > Note the title says "sub 2us" and the body says "close to" 1.8us. Just ran this: [dan at opteron0]$ mpicc mping.c -o mping; prun -N2 ./mping 1: 0 bytes 1.80 uSec 0.00 MB/s This is a simple bit of MPI: proc 1 posts an MPI_Recv, proc0 then does a MPI_Send, then proc1 does MPI_Send and proc0 an MPI_Recv. The latency printed is half the round trip, averaged over say 1000 passes. This is for Opteron - it seems to have the best PCI-X implementation we have seen. Latency on IA64 is a little higher - say 2.61uSec on one platform I have just tried. MPI performance has also improved over time as we have tuned the DMA/PIO writes, etc. in the device drivers. > Of more interest (to me) is that further down they say: > In the next quarter, Quadrics will announce a series of highly competitive > switch configurations making QsNetII more cost-effective for medium > sized cluster configuration deployment. yep - yet to be announced officially - but as you might expect this revolves around introducing a wider range of smaller switch chassis and configurations. -- Yours, Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com --------------------
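For anyone who wants to reproduce this kind of measurement on their own interconnect, a minimal ping-pong along the lines Dan describes looks roughly like the following. This is only a sketch of the technique, not the actual mping.c (it bounces one byte rather than zero and hard-codes 1000 iterations):

    /* pingpong.c -- minimal MPI latency sketch (not the real mping.c).
     * Run with two ranks; rank 0 reports half the averaged round trip.
     */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, i, iters = 1000;
        char buf = 0;
        MPI_Status st;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &st);
            } else if (rank == 1) {
                MPI_Recv(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &st);
                MPI_Send(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        if (rank == 0)   /* half the round trip, averaged over all passes */
            printf("1 byte: %.2f usec one-way\n",
                   (t1 - t0) * 1e6 / (2.0 * iters));

        MPI_Finalize();
        return 0;
    }

Build and launch it on two nodes in the usual way, e.g. mpicc pingpong.c -o pingpong and then mpirun -np 2 ./pingpong (or prun -N2 on a Quadrics machine).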
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Mar 1 19:37:11 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 1 Mar 2004 19:37:11 -0500 (EST) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <200403020910.02925.csamuel@vpac.org> Message-ID: > > > No wonder Intel killed IA64 and released 64-bit x86 > > > (aka IA32e) a week or two ago... > > > > Intel killed IA64? Where did you come up with that? > > Intel certainly haven't announced the death of Itanium, but you've got to > wonder about its long term future when Intel start producing 64-bit AMD > compatible chips. Also see [1] below. bah. buying chips based on their address register width makes about as much sense as buying based on clock. yes, some people have good reason to be excited about 64b hitting the mass market. but that number is quite small - how many machines do you have with >4 GB per cpu? remember, Intel has always said that 64b wasn't terribly important for anything except the "enterprise" (mauve has more ram) market (mainframe recidivists). I think they're right, but should have also adopted AMD's cpu-integrated memory controller. > I guess it comes down to a business decision on Intel as to whether they feel > the demand for Itanium is enough to justify its continued development. maybe instead of a bazillion bytes of cache on the next it2, Intel will just drop in a P4 or two ;) regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From pegu at dolphinics.no Tue Mar 2 02:59:04 2004 From: pegu at dolphinics.no (Petter Gustad) Date: Tue, 02 Mar 2004 08:59:04 +0100 (CET) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: <20040302.085904.68044976.pegu@dolphinics.no> From: Mark Hahn Subject: Re: [Beowulf] SCI Socket latency: 2.27 microseconds Date: Mon, 1 Mar 2004 11:35:56 -0500 (EST) > > "This is the lowest latency socket solution available today," said > > Hugo Kohmann, CTO for Dolphin Interconnect. "SCI Socket opens new > > well, Quadrics now claims 1.8 us MPI latency: This is excellent MPI latency. However, the quoted 2.27 µs latency was for the *socket* library. Latency using the Dolphin SISCI library is 1.4 µs. See also: http://www.dolphinics.no/products/benchmarks.html Petter _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Tue Mar 2 03:36:42 2004 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Tue, 2 Mar 2004 09:36:42 +0100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: Message-ID: <200403020936.42553.joachim@ccrl-nece.de> Mark Hahn: > > "This is the lowest latency socket solution available today," said > > Hugo Kohmann, CTO for Dolphin Interconnect.
"SCI Socket opens new > > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B39 >8280256E44005A31DD > > it's interesting that SCI is still on 64x66 PCI - it would be very > interesting to know how many and what kinds of codes really require [..] A large fraction of the latency does indeed stem from the two PCI-buses that need to be crossed. For that reason, Dolphin would certainly get an additional latency decrease when running on a 133MHz bus. I guess they have this in the pipeline. Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sfr at foobar-cpa.de Tue Mar 2 04:40:46 2004 From: sfr at foobar-cpa.de (Friedrich Seifert) Date: Tue, 02 Mar 2004 10:40:46 +0100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds Message-ID: <4044569E.9010803@foobar-cpa.de> Bogdan Costescu wrote: > On Mon, 1 Mar 2004, Petter Gustad wrote: > > >>Dolphin has benchmarked a completed one byte socket send/socket >>receive latency at 2.27 microseconds, > > > Is this in polling mode or interrupt-driven ? I'm interested to see if > I can do something useful (like computation) _and_ get such low > latency. Actually, SCI SOCKET uses a combination of both, it polls for a configurable amount of time, and if nothing arrives meanwhile, waits for an interrupt. Something like that is necessary since the current Linux interrupt processing and wake up mechanism is quite slow and unpredictable. There is a promising project going on to provide real time interrupt capability, but it is still in an early stage (http://lwn.net/Articles/65710/) >>Benchmarks using Netperf also show more than 255 MBytes (2,035 >>Megabits/s) sustained throughput using standard TCP STREAM sockets. > > > What is the CPU usage for this throughput ? SCI SOCKET was run in PIO mode for this test, so one CPU is needed to transfer the data. Current DMA performance is lower, but is subject to optimization in future revisions. CPU usage for DMA is 8%/29% at sender/receiver. Regards, Friedrich -- Dipl.-Inf. Friedrich Seifert - foobar GmbH Phone: +49-371-5221-157 Email: sfr at foobar-cpa.de Mobil: +49-172-3740089 Web: http://www.foobar-cpa.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Mon Mar 1 20:57:48 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Mon, 1 Mar 2004 17:57:48 -0800 Subject: [Beowulf] advantages of this particular 64-bit chip In-Reply-To: References: <200403020910.02925.csamuel@vpac.org> Message-ID: <20040302015748.GA6730@greglaptop.internal.keyresearch.com> On Mon, Mar 01, 2004 at 07:37:11PM -0500, Mark Hahn wrote: > bah. buying chips based on their address register width makes > about as much sense as buying based on clock. yes, some people have > good reason to be excited about 64b hitting the mass market. but > that number is quite small - how many machines do you have with > >4 GB per cpu? Don't forget that "64 bits", in this case means "wider GPRs, and twice as many, plus a better ABI." These are substantial wins on many codes, even on machines with small memories. 
Bignums are a well known example, but there are far more general-purpose examples. For example, with the PathScale compilers on the Opteron, we find that only 1 of the SPECfp benchmarks and 3 of the SPECint benchmarks run faster in 32-bit mode than 64-bit mode -- keeping in mind that 64-bit mode features longer instructions and bigger pointers and longs. (This is our alpha 32-bit mode vs. our beta 64-bit mode, so this answer will change a little by the time both are production quality.) So yes, there's a reason to buy Opteron and IA32e chips beyond the address width: more bang for the buck. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Mar 2 08:59:55 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 2 Mar 2004 08:59:55 -0500 (EST) Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: <20040302132333.GA3957@mikee.ath.cx> Message-ID: On Tue, 2 Mar 2004, Mike Eggleston wrote: > I realize this question is not specific to beowulf clusters... however, > at 9a I'm meeting with an upset user about a bunch of workstations > using serial termainals. Things don't happen as quickly as he wants: > setup, problem diagnosis, throughput, etc. What solutions can I present > for these problems (I realize this is just a quick summary!). Also, > the serial terminals are running at 9600 baud over sometimes 50 meters. > One table I found shows 60 meters is 2400 baud and 30 meters is 4800 > baud. I think this is part of the problem. It really shouldn't be, if the wiring is decent quality TP. Back in the old days, when our department was basically NOTHING but serial terminals running over TP down to a Sun 110 with a serial port expansion, we had lots of runs over 50 meters (probably some close to 100) without difficulty at 9600 baud. Keep the wires away from e.g. fluorescent lights (BIG problem), major power cables, or other sources of low frequency noise. Running parallel to a noise source over a long distance is where most crosstalk occurs -- try to cross wires at right angles. Conduit can help as it shields, as well, but our wires were basically thrown up in a drop ceiling haphazardly by "trained professionals" a.k.a. graduate students, faculty, and sometimes a shop/maintenance guy. > Possible solutions I have thought of: > > - user stops complaining and deals with the situation Always a popular one. To accomplish it you had better be prepared to use force. Bring duct tape to the meeting... > - put ethernet->serial converts at the terminals so the terminals are > on the network Sounds expensive. Of course, terminal servers themselves are typically pretty expensive, although we used to use them in the old days when we finally had more terminals than our server could manage even with expansions. And then workstations started getting cheaper and we converted over to workstations and ethernet and never looked back. How is it that you're still using terminals? I didn't know that terminals were still a viable option -- a cheap PC is less than what, $500 these days, and by the time you compare the cost of the terminal itself, the serial port terminal server, the serial wiring, and the incredible loss of productivity associated with using what amounts to a single, slow, tty interface they just don't sound cost effective. Not to mention maintenance, user complaints, and your time... 
> - put small VIA type boards whose image is loaded through tftp and > the serial terminals actually run from the via boards > - what else? Give the terminals to somebody you don't like, replace them with cheap diskless second hand PCs on ethernet running a stripped linux that basically provides either the standard set of Alt-Fx tty's in non-graphical mode or basic X and as many xterms as memory permits. Problem solved. In fact, depending on the applications being accessed and whether they CAN run locally, problem solved even better by running them locally and reducing demand on the network and servers. rgb > > Mike > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mikee at mikee.ath.cx Tue Mar 2 08:23:33 2004 From: mikee at mikee.ath.cx (Mike Eggleston) Date: Tue, 2 Mar 2004 07:23:33 -0600 Subject: [Beowulf] help! need thoughts/recommendations quickly Message-ID: <20040302132333.GA3957@mikee.ath.cx> I realize this question is not specific to beowulf clusters... however, at 9a I'm meeting with an upset user about a bunch of workstations using serial terminals. Things don't happen as quickly as he wants: setup, problem diagnosis, throughput, etc. What solutions can I present for these problems (I realize this is just a quick summary!). Also, the serial terminals are running at 9600 baud over sometimes 50 meters. One table I found shows 60 meters is 2400 baud and 30 meters is 4800 baud. I think this is part of the problem. Possible solutions I have thought of: - user stops complaining and deals with the situation - put ethernet->serial converters at the terminals so the terminals are on the network - put small VIA type boards whose image is loaded through tftp and the serial terminals actually run from the via boards - what else? Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mikee at mikee.ath.cx Tue Mar 2 09:08:56 2004 From: mikee at mikee.ath.cx (Mike Eggleston) Date: Tue, 2 Mar 2004 08:08:56 -0600 Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: References: <20040302132333.GA3957@mikee.ath.cx> Message-ID: <20040302140856.GA4615@mikee.ath.cx> On Tue, 02 Mar 2004, Robert G. Brown wrote: > On Tue, 2 Mar 2004, Mike Eggleston wrote: > > > I realize this question is not specific to beowulf clusters... however, > > at 9a I'm meeting with an upset user about a bunch of workstations > > using serial termainals. Things don't happen as quickly as he wants: > > setup, problem diagnosis, throughput, etc. What solutions can I present > > for these problems (I realize this is just a quick summary!). Also, > > the serial terminals are running at 9600 baud over sometimes 50 meters. > > One table I found shows 60 meters is 2400 baud and 30 meters is 4800 > > baud. I think this is part of the problem. > > It really shouldn't be, if the wiring is decent quality TP.
Back in the > old days, when our department was basically NOTHING but serial terminals > running over TP down to a Sun 110 with a serial port expansion, we had > lots of runs over 50 meters (probably some close to 100) without > difficulty at 9600 baud. Keep the wires away from e.g. fluorescent > lights (BIG problem), major power cables, or other sources of low > frequency noise. Running parallel to a noise source over a long > distance is where most crosstalk occurs -- try to cross wires at right > angles. Conduit can help as it shields, as well, but our wires were > basically thrown up in a drop ceiling haphazardly by "trained > professionals" a.k.a. graduate students, faculty, and sometimes a > shop/maintenance guy. I know it should work and the old way it does work, but I've always seen problems with serial and printers. I much prefer getting away from them to full ethernet. > > Possible solutions I have thought of: > > > > - user stops complaining and deals with the situation > > Always a popular one. To accomplish it you had better be prepared to > use force. Bring duct tape to the meeting... This problem is happening in the warehouse, so there is lots of packing material and tape around. :) > > - put ethernet->serial converts at the terminals so the terminals are > > on the network > > Sounds expensive. Of course, terminal servers themselves are typically > pretty expensive, although we used to use them in the old days when we > finally had more terminals than our server could manage even with > expansions. And then workstations started getting cheaper and we > converted over to workstations and ethernet and never looked back. > > How is it that you're still using terminals? I didn't know that > terminals were still a viable option -- a cheap PC is less than what, > $500 these days, and by the time you compare the cost of the terminal > itself, the serial port terminal server, the serial wiring, and the > incredible loss of productivity associated with using what amounts to a > single, slow, tty interface they just don't sound cost effective. Not > to mention maintenance, user complaints, and your time... This is an application in the warehouse. We have many serial (dumb) terminals and printers. We are using 'Dorio's(?). Similiar to the Wyse 60. I've not used a dorio before, but wyse terminals lots. The application is all curses based and doesn't require much. The users are not even concerned about the speed of the application (display, etc.) just that the terminals are quick to setup and work all the time. > > - put small VIA type boards whose image is loaded through tftp and > > the serial terminals actually run from the via boards > > - what else? > > Give the terminals to somebody you don't like, replace them with cheap > diskless second hand PCs on ethernet running a stripped linux that > basically provides either the standard set of Alt-Fx tty's in > non-graphical mode or basic X and as many xterms as memory permits. > Problem solved. > > In fact, depending on the applications being accessed and whether they > CAN run locally, problem solved even better by running them locally and > reducing demand on the network and servers. I can use the terminals on the via boards and not have to replace them with crt monitors and keyboards, until they all start failing. I'd prefer to use the crt monitors through vga (fewer problems with linux and getty). Do you (anyone) know of a cheap motherboard that would do this? 
Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Tue Mar 2 10:44:47 2004 From: john.hearns at clustervision.com (John Hearns) Date: Tue, 2 Mar 2004 16:44:47 +0100 (CET) Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: <20040302140856.GA4615@mikee.ath.cx> Message-ID: On Tue, 2 Mar 2004, Mike Eggleston wrote: > > Do you (anyone) know of a cheap motherboard that would do this? Sorry to sound like a Cyclades salesman, but from their webpages the Cyclades TS-100 would fit the bill. Plus lots of packing tape. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at demec.ufpe.br Tue Mar 2 15:51:05 2004 From: rbw at demec.ufpe.br (Ramiro Brito Willmersdorf) Date: Tue, 2 Mar 2004 17:51:05 -0300 Subject: [Beowulf] Invitation to Conference Message-ID: <20040302205105.GA30141@demec.ufpe.br> Dear Colleagues, The XXV CILAMCE (Iberian Latin American Congress on Computational Methods for Engineering) will be held from November 10th to the 12th in Recife, Brazil. This Congress will encompass more than 30 mini-symposia over a very wide range of multidisciplinary methods in engineering and applied sciences. Please check the congress home page (http://www.demec.ufpe.br/cilamce2004/) for more specific details. We would like to invite you to participate in the High Performance Computing mini-symposium. If you are interested, you should submit an abstract by March 29th, 2004. This is one of the most important conferences on this subject in South America, and top researchers from here and abroad will attend. On a personal note, we would like to tell you that Recife is one of the top tourist destinations in Brazil, with very pleasant weather and very nice beaches. We are grateful for your attention and ask that this information be passed along to other people in your institution who may be interested. Many Thanks, A. L. G. Coutinho, COPPE/UFRJ, alvaro at nacad.ufrj.br R. B. Willmersdorf, DEMEC/UFPE, rbw at demec.ufpe.br -- Ramiro Brito Willmersdorf rbw at demec.ufpe.br GPG key: http://www.demec.ufpe.br/~rbw/GPG/gpg_key.txt _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Wed Mar 3 12:52:30 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Wed, 03 Mar 2004 09:52:30 -0800 Subject: [Beowulf] mpich program segfaults Message-ID: <40461B5E.6010003@cert.ucr.edu> Hi, Sorry if this is off topic. Anyway, I've got an mpich Fortran program I'm trying to get going, which produces a segmentation fault right at a subroutine call. I put a print statement right before and right after the call and when I run the program, I'm only seeing the one before. I've also put a print statement right at the beginning of the subroutine which is being called and never see that either. The real strange part is that when I run this under a debugger, the program runs fine. So would anyone happen to have any insight into what's going on here? I'd really appreciate it.
Thanks, Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sdutta at deas.harvard.edu Wed Mar 3 14:42:13 2004 From: sdutta at deas.harvard.edu (Suvendra Nath Dutta) Date: Wed, 3 Mar 2004 14:42:13 -0500 Subject: [Beowulf] mpich program segfaults In-Reply-To: <40461B5E.6010003@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> Message-ID: Glen, Does your program seg fault when compiled with debugging off or on? Sometimes compilers will initialize arrays when compiling for debugging, but not waste time doing that when compiled without debugging. Also if you compile with optimization which line follows which one isn't always clear. You want to make sure you aren't over-running memory. Because what you say sounds suspiciously like that. Also you want to be sure its nothing to do with MPICH. Try calling the subroutine from a serial program if possible. Suvendra. On Mar 3, 2004, at 12:52 PM, Glen Kaukola wrote: > Hi, > > Sorry if this is off topic. Anyway, I've got an mpich Fortran program > I'm trying to get going, which produces a segmentation fault right at > a subroutine call. I put a print statement right before and right > after the call and when I run the program, I'm only seeing the one > before. I've also put a print statement right at the beginning of the > subroutine which is being called and never see that either. The real > strange part is when I run this under a debugger, the program runs > fine. So would anyone happen to have any insight to what's going on > here? I'd really appriciate it. > > Thanks, > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Wed Mar 3 15:46:36 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Wed, 03 Mar 2004 12:46:36 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: <40461B5E.6010003@cert.ucr.edu> Message-ID: <4046442C.4090704@cert.ucr.edu> Suvendra Nath Dutta wrote: > Glen, > Does your program seg fault when compiled with debugging off or on? Either way. > Sometimes compilers will initialize arrays when compiling for > debugging, but not waste time doing that when compiled without debugging. The arguments being passed to the subroutine are two arrays of real numbers and a few integers. Nothing being passed to the subroutine has been dynamically allocated. The compiler, IBM's XLF compiler, initializes the array to 0. At least I'm pretty sure it does, since I can print things before the subroutine call. > Also if you compile with optimization which line follows which one > isn't always clear. I don't have any optimizations turned on. > You want to make sure you aren't over-running memory. The machine has 2 gigs of memory, which should be plenty. The same program runs on an x86 machine with 1 gig of memory just fine (I'm trying to get the program working on an Apple G5 by the way). > Also you want to be sure its nothing to do with MPICH. Try calling the > subroutine from a serial program if possible. 
I've tried telling mpirun to only use one cpu and I get the same results. I've also tried running the program all by itself and it still crashes. Like I said though, it runs just fine under a debugger. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sdutta at deas.harvard.edu Thu Mar 4 06:26:49 2004 From: sdutta at deas.harvard.edu (Suvendra Nath Dutta) Date: Thu, 4 Mar 2004 06:26:49 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: Glen, I am sorry, I meant buffer overrun instead of memory overrun. It is of course impossible to say, but you are describing a classic symptom of a buffer overrun: the program seg-faults somewhere there shouldn't be a problem. This is usually because you've overrun the array limits and are writing on the program space. Suvendra. On Wed, 3 Mar 2004, Glen Kaukola wrote: > Suvendra Nath Dutta wrote: > > > Glen, > > Does your program seg fault when compiled with debugging off or on? > > > Either way. > > > Sometimes compilers will initialize arrays when compiling for > > debugging, but not waste time doing that when compiled without debugging. > > > The arguments being passed to the subroutine are two arrays of real > numbers and a few integers. Nothing being passed to the subroutine has > been dynamically allocated. The compiler, IBM's XLF compiler, > initializes the array to 0. At least I'm pretty sure it does, since I > can print things before the subroutine call. > > > Also if you compile with optimization which line follows which one > > isn't always clear. > > > I don't have any optimizations turned on. > > > You want to make sure you aren't over-running memory. > > > The machine has 2 gigs of memory, which should be plenty. The same > program runs on an x86 machine with 1 gig of memory just fine (I'm > trying to get the program working on an Apple G5 by the way). > > > Also you want to be sure its nothing to do with MPICH. Try calling the > > subroutine from a serial program if possible. > > > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wseas at canada.com Thu Mar 4 09:34:46 2004 From: wseas at canada.com (WSEAS Newsletter on MECHANICAL ENGINEERING) Date: Thu, 4 Mar 2004 16:34:46 +0200 Subject: [Beowulf] WSEAS NEWSLETTER in MECHANICAL ENGINEERING Message-ID: <3FE20F40001FB40E@fesscrpp1.tellas.gr> (added by postmaster@fesscrpp1.tellas.gr) If you want to contact us, the Subject of your email must contains the code: WSEAS CALL FOR PAPERS -- CALL FOR REVIEWERS -- CALL FOR SPECIAL SESSIONS wseas at canada.com http://wseas.freeservers.com **************************************************************** Udine, Italy, March 25-27, 2004: IASME/WSEAS 2004 Int.Conf.
on MECHANICS and MECHATRONICS **************************************************************** Miami, Florida, USA, April 21-23, 2004 5th WSEAS International Conference on APPLIED MATHEMATICS (SYMPOSIA on: Linear Algebra and Applications, Numerical Analysis and Applications, Differential Equations and Applications, Probabilities, Statistics, Operational Research, Optimization, Algorithms, Discrete Mathematics, Systems, Communications, Control, Computers, Education) **************************************************************** Corfu Island, Greece, August 17-19, 2004 WSEAS/IASME Int.Conf. on FLUID MECHANICS WSEAS/IASME Int.Conf. on HEAT and MASS TRANSFER ********************************************************** Vouliagmeni, Athens, Greece, July 12-13, 2004 WSEAS ELECTROSCIENCE AND TECHNOLOGY FOR NAVAL ENGINEERING and ALL-ELECTRIC SHIP ********************************************************** Copacabana, Rio de Janeiro, Brazil, October 12-15, 2004 3rd WSEAS Int.Conf. on INFORMATION SECURITY, HARDWARE/SOFTWARE CODESIGN and COMPUTER NETWORKS (ISCOCO 2004) 3rd WSEAS Int. Conf. on APPLIED MATHEMATICS and COMPUTER SCIENCE (AMCOS 2004) 3rd WSEAS Int.Conf. on SYSTEM SCIENCE and ENGINEERING (ICOSSE 2004) 4th WSEAS Int.Conf. on POWER ENGINEERING SYSTEMS (ICOPES 2004) **************************************************************** Cancun, Mexico, May 12-15, 2004 6th WSEAS Int.Conf. on ALGORITHMS, SCIENTIFIC COMPUTING, MODELLING AND SIMULATION (ASCOMS '04) ********************************************************** NOTE THAT IN WSEAS CONFERENCES YOU CAN HAVE PROCEEDINGS 1) HARD COPY 2) CD-ROM and 3) Web Publishing SELECTED PAPERS are also published (after further review) * as regular papers in WSEAS TRANSACTIONS (Journals) or * as Chapters in WSEAS Book Series. WSEAS Books, Journals, Proceedings participate now in all major science citation indexes. ISI, ELSEVIER, CSA, AMS. Mathematical Reviews, ELP, NLG, Engineering Index Directory of Published Proceedings, INSPEC (IEE) Thanks Alexis Espen WSEAS NEWSLETTER in MECHANICAL ENGINEERING wseas at canada.com http://wseas.freeservers.com ##### HOW TO UNSUBSCRIBE #### You receive this newsletter from your email address: beowulf at beowulf.org If you want to unsubscribe, send an email to: wseas at canada.com The Subject of your message must be exactly: REMOVE beowulf at beowulf.org WSEAS If you want to unsubscribe more than one email addresses, send a message to nata at wseas.org with Subject: REMOVE [email1, emal2, ...., emailn] WSEAS _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From robl at mcs.anl.gov Thu Mar 4 13:46:26 2004 From: robl at mcs.anl.gov (Robert Latham) Date: Thu, 4 Mar 2004 12:46:26 -0600 Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: <20040304184626.GA2746@mcs.anl.gov> On Wed, Mar 03, 2004 at 12:46:36PM -0800, Glen Kaukola wrote: > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. since you see this crash when the program runs by itself, try running under a memory checker (valgrind is good and free, also purify, insure++...).
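To make the buffer-overrun theory concrete: the usual reason a program dies "at" an innocent-looking call is that an earlier out-of-bounds store has already corrupted the stack or heap, and the damage only surfaces later. A memory checker reports the bad store where it actually happens. A tiny, purely illustrative C example of the kind of bug valgrind's default memcheck tool will flag (this is hypothetical code, not Glen's program):

    /* overrun.c -- the kind of bug a memory checker reports at the point of
     * the bad store, long before the program actually crashes somewhere else.
     * Build with -g and run it under the checker, e.g.:  valgrind ./overrun
     */
    #include <stdio.h>
    #include <stdlib.h>

    static void fill(double *a, int n)
    {
        int i;
        for (i = 0; i <= n; i++)   /* off-by-one: writes one element past the end */
            a[i] = 0.0;
    }

    int main(void)
    {
        double *a = malloc(100 * sizeof(double));
        fill(a, 100);              /* corrupts whatever happens to follow the array */
        printf("still running -- the damage shows up later\n");
        free(a);                   /* a checker may also complain here */
        return 0;
    }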
==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Thu Mar 4 14:32:12 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Thu, 4 Mar 2004 11:32:12 -0800 (PST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> Message-ID: <20040304193213.411.qmail@web11407.mail.yahoo.com> Then run the program by hand, and attach a debugger... Rayson --- Glen Kaukola wrote: > Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Thu Mar 4 13:45:25 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Thu, 04 Mar 2004 10:45:25 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: <40477945.9090808@cert.ucr.edu> Suvendra Nath Dutta wrote: >Glen, > I am sorry, I meant buffer-overrun instead of memory overrun. It >is of course impossible to say, but you are describing a classic >description of buffer overrun. Program seg-faulting, some where there >shouldn't be a problem. This is usually because you've over run the array >limits and are writing on the program space. > > Ok, but simply calling a subroutine shouldn't cause a buffer overrun should it? Especially when none of the arguments being passed to the subroutine are dynamically allocated. I'm beginning to suspect it's a problem with the compiler actually. Maybe the stack that holds subroutine arguments isn't big enough. And when my problematic subroutine call is 4 levels deep or so like it is, then there isn't enough room on the stack for it's arguments. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at linux-mag.com Thu Mar 4 17:34:29 2004 From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine) Date: Thu, 4 Mar 2004 17:34:29 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> Message-ID: What type of machine is this? Doug On Wed, 3 Mar 2004, Glen Kaukola wrote: > Suvendra Nath Dutta wrote: > > > Glen, > > Does your program seg fault when compiled with debugging off or on? > > > Either way. > > > Sometimes compilers will initialize arrays when compiling for > > debugging, but not waste time doing that when compiled without debugging. > > > The arguments being passed to the subroutine are two arrays of real > numbers and a few integers. Nothing being passed to the subroutine has > been dynamically allocated. The compiler, IBM's XLF compiler, > initializes the array to 0. 
At least I'm pretty sure it does, since I > can print things before the subroutine call. > > > Also if you compile with optimization which line follows which one > > isn't always clear. > > > I don't have any optimizations turned on. > > > You want to make sure you aren't over-running memory. > > > The machine has 2 gigs of memory, which should be plenty. The same > program runs on an x86 machine with 1 gig of memory just fine (I'm > trying to get the program working on an Apple G5 by the way). > > > Also you want to be sure its nothing to do with MPICH. Try calling the > > subroutine from a serial program if possible. > > > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- ---------------------------------------------------------------- Editor-in-chief ClusterWorld Magazine Desk: 610.865.6061 Cell: 610.390.7765 Redefining High Performance Computing Fax: 610.865.6618 www.clusterworld.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From smcdaniel at kciinc.net Thu Mar 4 13:59:37 2004 From: smcdaniel at kciinc.net (smcdaniel) Date: Thu, 4 Mar 2004 12:59:37 -0600 Subject: [Beowulf] mpich program segfaults (Glen Kaukola) Message-ID: <002501c4021a$d77830c0$2a01010a@kciinc.local> Physical memory errors could be the problem if they occur between the pointer and offset of your array location in the stack. Other than that I would suspect a buffer overrun that Suvendra Nath Dutta mentioned. Sam McDaniel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Thu Mar 4 19:48:21 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Thu, 04 Mar 2004 16:48:21 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: Message-ID: <4047CE55.6010300@cert.ucr.edu> Douglas Eadline, Cluster World Magazine wrote: >What type of machine is this? > > An Apple G5. And actually I've figured out what's wrong. Sorta. =) I replaced my problematic subroutine with a dummy subroutine that contains nothing but variable declarations and a print statement. This still caused a segmentation fault. So I commented pretty much everything out. No segmentation fault. Alright then. I slowly added it all back in, checking each time to see if I got a segmentation fault. And now I'm down to 4 variable declarations that are causing a problem: REAL ZFGLURG ( NCOLS,NROWS,0:NLAYS ) INTEGER ICASE( NCOLS,NROWS,0:NLAYS ) REAL THETAV( NCOLS,NROWS,NLAYS ) REAL ZINT ( NCOLS,NROWS,NLAYS ) If I uncomment any one of those, I get a segmentation fault again. But it still doesn't make any sense. First of all, there are variable declarations almost exactly like the ones I listed and those don't cause a problem. I also made a small test case that called my dummy subroutine and that worked just fine. I then commented out everything but the problematic variable declarations I listed above and that worked just fine. 
I tried changing the variable names but that didn't seem to make a difference, as I still got a segmentation fault. So I have no idea what the heck is going on. I think I need to tell my boss we need to give up on G5's. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Thu Mar 4 20:05:28 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Fri, 5 Mar 2004 09:05:28 +0800 (CST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <40477945.9090808@cert.ucr.edu> Message-ID: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> The default stack size on OS X is 512 KB; try increasing it to 64 MB. I encountered this problem before. Andrew. --- Glen Kaukola wrote: > Suvendra Nath Dutta wrote: > > >Glen, > > I am sorry, I meant buffer-overrun instead of > memory overrun. It > >is of course impossible to say, but you are > describing a classic > >description of buffer overrun. Program > seg-faulting, some where there > >shouldn't be a problem. This is usually because > you've over run the array > >limits and are writing on the program space. > > > > > > Ok, but simply calling a subroutine shouldn't cause > a buffer overrun > should it? Especially when none of the arguments > being passed to the > subroutine are dynamically allocated. I'm beginning > to suspect it's a > problem with the compiler actually. Maybe the stack > that holds > subroutine arguments isn't big enough. And when my > problematic > subroutine call is 4 levels deep or so like it is, > then there isn't > enough room on the stack for it's arguments. > > Glen > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf ----------------------------------------------------------------- http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at linux-mag.com Thu Mar 4 21:46:16 2004 From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine) Date: Thu, 4 Mar 2004 21:46:16 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4047CE55.6010300@cert.ucr.edu> Message-ID: Don't give up on the G5 just yet. Sounds to me like you may be stepping on some memory somehow. Which means the crash occurs at that particular spot in the code, but the cause of the crash probably is occurring somewhere else in the program. There are several "simple" things you can do to collect evidence that may help you solve this "crime". (This is detective work, by the way.) First, this sounds like the kind of thing that happens in C programs. Is it pure Fortran? What version of MPICH? 1) try another compiler, if you are lucky it will find the problem. It may also work, in which case you will want to blame the first compiler; don't, because that is probably not the case. The new compiler probably lays out the memory differently than the first one and you just got lucky. 2) run your code on another architecture. 3) try another MPI (LAM?) I am sure there are more, but not knowing the particulars, I can not suggest anything else. Doug
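Andrew's stack-size suggestion fits the symptoms above: local arrays like ZFGLURG(NCOLS,NROWS,0:NLAYS) typically live on the stack, so a 512 KB default limit can be exhausted the moment the subroutine is entered. As a sketch (a hypothetical little launcher, equivalent to running "ulimit -s" in the shell or job script that starts the program), one way to inspect and raise the limit before running the real binary:

    /* stacklimit.c -- query and (try to) raise the stack size limit, then
     * exec the real program.  Sketch only; "ulimit -s" in the startup shell
     * achieves the same thing.
     */
    #include <stdio.h>
    #include <sys/resource.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_STACK, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        printf("stack soft limit: %ld KB (hard %ld KB)\n",
               (long)(rl.rlim_cur / 1024), (long)(rl.rlim_max / 1024));

        rl.rlim_cur = 64 * 1024 * 1024;      /* ask for 64 MB, as Andrew suggests */
        if (rl.rlim_cur > rl.rlim_max)
            rl.rlim_cur = rl.rlim_max;       /* cannot exceed the hard limit */
        if (setrlimit(RLIMIT_STACK, &rl) != 0)
            perror("setrlimit");

        if (argc > 1)
            execvp(argv[1], &argv[1]);       /* run the real program with the new limit */
        return 0;
    }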
Doug On Thu, 4 Mar 2004, Glen Kaukola wrote: > Douglas Eadline, Cluster World Magazine wrote: > > >What type of machine is this? > > > > > > An Apple G5. > > And actually I've figured out what's wrong. Sorta. =) > > I replaced my problematic subroutine with a dummy subroutine that > contains nothing but variable declarations and a print statement. This > still caused a segmentation fault. So I commented pretty much > everything out. No segmentation fault. Alright then. I slowly added > it all back in, checking each time to see if I got a segmentation fault. > > And now I'm down to 4 variable declarations that are causing a problem: > REAL ZFGLURG ( NCOLS,NROWS,0:NLAYS ) > INTEGER ICASE( NCOLS,NROWS,0:NLAYS ) > REAL THETAV( NCOLS,NROWS,NLAYS ) > REAL ZINT ( NCOLS,NROWS,NLAYS ) > > If I uncomment any one of those, I get a segmentation fault again. > > But it still doesn't make any sense. First of all, there are variable > declarations almost exactly like the ones I listed and those don't cause > a problem. I also made a small test case that called my dummy > subroutine and that worked just fine. I then commented out everything > but the problematic variable declarations I listed above and that worked > just fine. I tried changing the variable names but that didn't seem to > make a difference, as I still got a segmentation fault. So I have no > idea what the heck is going on. I think I need to tell my boss we need > to give up on G5's. > > > Glen > -- ---------------------------------------------------------------- Editor-in-chief ClusterWorld Magazine Desk: 610.865.6061 Cell: 610.390.7765 Redefining High Performance Computing Fax: 610.865.6618 www.clusterworld.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mathiasbrito at yahoo.com.br Fri Mar 5 08:43:33 2004 From: mathiasbrito at yahoo.com.br (=?iso-8859-1?q?Mathias=20Brito?=) Date: Fri, 5 Mar 2004 10:43:33 -0300 (ART) Subject: [Beowulf] Benchmarking with HPL Message-ID: <20040305134333.90538.qmail@web12201.mail.yahoo.com> Hello, I'm benchmarking my cluster with HPL, the cluster have 16 nodes, 8 nodes athlon 1600+ with 512MB RAM and 20GB H.D. , and 8 nodes athlan 1700+ with 512MB RAM and 20GB, all with a 100Mbit fast ethernet linked in a switch. Well, the problem is, what the best setup for the HPL.dat, to obtain the maximum performance of the cluster? Mathias ===== Mathias Brito Universidade Estadual de Santa Cruz - UESC Departamento de Ci?ncias Exatas e Tecnol?gicas Estudante do Curso de Ci?ncia da Computa??o ______________________________________________________________________ Yahoo! Mail - O melhor e-mail do Brasil! 
Abra sua conta agora: http://br.yahoo.com/info/mail.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Sebastien.Georget at sophia.inria.fr Fri Mar 5 10:10:10 2004 From: Sebastien.Georget at sophia.inria.fr (=?ISO-8859-1?Q?S=E9bastien_Georget?=) Date: Fri, 05 Mar 2004 16:10:10 +0100 Subject: [Beowulf] Benchmarking with HPL In-Reply-To: <20040305134333.90538.qmail@web12201.mail.yahoo.com> References: <20040305134333.90538.qmail@web12201.mail.yahoo.com> Message-ID: <40489852.3050206@sophia.inria.fr> Mathias Brito wrote: > Hello, > > I'm benchmarking my cluster with HPL, the cluster have > 16 nodes, 8 nodes athlon 1600+ with 512MB RAM and 20GB > H.D. , and 8 nodes athlan 1700+ with 512MB RAM and > 20GB, all with a 100Mbit fast ethernet linked in a > switch. Well, the problem is, what the best setup for > the HPL.dat, to obtain the maximum performance of the > cluster? > > Mathias Hi, starting points for HPL tuning here: http://www.netlib.org/benchmark/hpl/faqs.html http://www.netlib.org/benchmark/hpl/tuning.html ++ -- S?bastien Georget INRIA Sophia-Antipolis, Service DREAM, B.P. 93 06902 Sophia-Antipolis Cedex, FRANCE E-mail:sebastien.georget at sophia.inria.fr _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Mar 5 12:28:36 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 5 Mar 2004 12:28:36 -0500 (EST) Subject: [Beowulf] Newbie on beowulf clustering In-Reply-To: <20040305171757.15481.qmail@web20730.mail.yahoo.com> Message-ID: On Fri, 5 Mar 2004, khurram b wrote: > hi! > i am newbie to beowulf clustering, have done some work > in MOSIX linux clustering and got interested in > beowulf clustering, please guide me where to start , > tutorials, documents. http://www.phy.duke.edu/brahma Has many resources and links to many more. Also think about subscribing to Cluster World magazine. rgb > > Thanks! > > __________________________________ > Do you Yahoo!? > Yahoo! Search - Find what you?re looking for faster > http://search.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From myaoha at yahoo.com Fri Mar 5 12:17:57 2004 From: myaoha at yahoo.com (khurram b) Date: Fri, 5 Mar 2004 09:17:57 -0800 (PST) Subject: [Beowulf] Newbie on beowulf clustering Message-ID: <20040305171757.15481.qmail@web20730.mail.yahoo.com> hi! i am newbie to beowulf clustering, have done some work in MOSIX linux clustering and got interested in beowulf clustering, please guide me where to start , tutorials, documents. Thanks! __________________________________ Do you Yahoo!? Yahoo! 
Search - Find what you're looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mprinkey at aeolusresearch.com Fri Mar 5 14:02:13 2004 From: mprinkey at aeolusresearch.com (Michael T. Prinkey) Date: Fri, 5 Mar 2004 14:02:13 -0500 (EST) Subject: [Beowulf] "noht" in 2.4.24? Message-ID: Hi Everyone, I installed 2.4.24 on a dual Xeon system with a Tyan 7501-chipset motherboard and the noht option seems to be ignored. The RH9 kernel (2.4.20?) respected noht. Has this been changed or is there a patch that I missed? I can't think that it is a BIOS issue or otherwise hardware related as I can shut it off with RH9 kernel. Thanks, Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hartner at cs.utah.edu Fri Mar 5 16:22:37 2004 From: hartner at cs.utah.edu (Mark Hartner) Date: Fri, 5 Mar 2004 14:22:37 -0700 (MST) Subject: [Beowulf] "noht" in 2.4.24? In-Reply-To: Message-ID: > I installed 2.4.24 on a dual Xeon system with a Tyan 7501-chipset > motherboard and the noht option seems to be ignored. The RH9 kernel > (2.4.20?) respected noht. Has this been changed or is there a patch that I think that option was removed around 2.4.21. If you look at Documentation/kernel-parameters.txt in the kernel source it will give you a list of options for the 2.4.24 kernel. > missed? I can't think that it is a BIOS issue or otherwise hardware > related as I can shut it off with RH9 kernel. 'acpi=off' will disable ht'ing (and a bunch of other stuff). The other option is to disable it in your BIOS. Mark _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rdn at uchicago.edu Fri Mar 5 18:27:34 2004 From: rdn at uchicago.edu (Russell Nordquist) Date: Fri, 5 Mar 2004 17:27:34 -0600 (CST) Subject: [Beowulf] good 24 port gige switch Message-ID: Does anyone have a recommendation for a good 24 port gige switch for clustering? I know this issue has been discussed, but I didn't find any actual manufacturer/models people like. We're not really looking at the very high end models from Cisco, but I am wary of the many low end switches on the market with regard to bisection bandwidth issues. Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches and found one to be better than the other. There are a bunch of 24 port gige switches for <$2000, but are they any good? are some better than others (likely so i'd guess)? thanks and have a good weekend.
russell - - - - - - - - - - - - Russell Nordquist UNIX Systems Administrator Geophysical Sciences Computing http://geosci.uchicago.edu/computing NSIT, University of Chicago - - - - - - - - - - - _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 5 20:24:55 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 09:24:55 +0800 (CST) Subject: [Beowulf] SGEEE free and more platform offically supported Message-ID: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> I used to think that SGE is free, but SGEEE (with more advanced scheduling algorithms) is not. But it is not true, both are free and open source. In SGE 6.0, there will be no "SGEEE mode", but the default mode will have all the SGEEE functionality! And Sun is adding more support too, instead of looking at the source or finding other people to support non-Sun OSes: "Sun will also support non Sun platforms beginning with Grid Engine 6 (HP, IBM, SGI, MAC)." http://gridengine.sunsource.net/servlets/ReadMsg?msgId=16510&listName=users Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 5 20:04:33 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 09:04:33 +0800 (CST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <40491AA7.6050703@cert.ucr.edu> Message-ID: <20040306010433.99259.qmail@web16812.mail.tpe.yahoo.com> It's not your code, I think there is a compiler flag to not allocate variables from the stack, but I need to look at the XLF manuals again. BTW, there are several OSX settings that you can do to tune the performance of your fortran on the G5. I said fortran since it has to do with the hardware prefetching on the Power4 and the G5, if you have c programs with a lot of vector computation, you can set those too. Andrew. --- Glen Kaukola > >the default stack size on OSX is 512 KB, try to > >increase it to 64MB, I encountered this problem > >before. > Yep, that did the trick. Thanks a bunch! > > I'm wondering though, does this indicate there's > some sort of problem > with the code? > > > Glen ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Fri Mar 5 19:26:15 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Fri, 05 Mar 2004 16:26:15 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> References: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> Message-ID: <40491AA7.6050703@cert.ucr.edu> Andrew Wang wrote: >the default stack size on OSX is 512 KB, try to >increase it to 64MB, I encountered this problem >before. > > Yep, that did the trick. Thanks a bunch! 
I'm wondering though, does this indicate there's some sort of problem with the code? Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Fri Mar 5 19:34:05 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Fri, 5 Mar 2004 19:34:05 -0500 (EST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: Message-ID: > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other. There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? I've had good luck with SMC 8624t's, and know of one quite large cluster that uses a lot of them of them (mckenzie, #140). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Sat Mar 6 04:55:22 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: 06 Mar 2004 09:55:22 +0000 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: Message-ID: <1078566922.2547.6.camel@fermi> On Fri, 2004-03-05 at 23:27, Russell Nordquist wrote: > Does anyone have a recommendation for a good 24 port gige switch for > clustering? > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other. There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? We mostly use HP2724's for this size of clusters. We have found them to perform ok and they are stable under heavy load - and they are priced at around $2000 (in Denmark, that is, might be cheaper in the US) best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Gr?br?drestr?de 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Sat Mar 6 09:01:49 2004 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Sat, 6 Mar 2004 06:01:49 -0800 (PST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: <1078566922.2547.6.camel@fermi> Message-ID: On 6 Mar 2004, Lars Henriksen wrote: > On Fri, 2004-03-05 at 23:27, Russell Nordquist wrote: > > Does anyone have a recommendation for a good 24 port gige switch for > > clustering? > > > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > > and found one to be better than the other. There are a bunch of 24 port > > gige switches for <$2000, but are they any good? are some better than > > others (likely so i'd guess)? > > We mostly use HP2724's for this size of clusters. We have found them to > perform ok and they are stable under heavy load - and they are priced at > around $2000 (in Denmark, that is, might be cheaper in the US) hp doesn't do jumbo frames on anything other than their top of the line l3 switch products which may or may not be an issue for certain applications. 
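Whether the missing jumbo-frame support matters is easy to check per node, since the interface MTU tells you what frame size the NIC driver is currently using. Below is a small sketch, assuming Linux and its SIOCGIFMTU ioctl (the interface name is only an example, pass your own on the command line); anything above 1500 only helps if every NIC and the switch in the path accept it.

/*
 * Illustrative sketch (Linux-specific): read an interface's MTU with
 * the SIOCGIFMTU ioctl.  "eth0" is just a default example name.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

int main(int argc, char **argv)
{
    const char *ifname = (argc > 1) ? argv[1] : "eth0";
    struct ifreq ifr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0) {
        perror("socket");
        return 1;
    }
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

    if (ioctl(fd, SIOCGIFMTU, &ifr) < 0) {
        perror("SIOCGIFMTU");
        close(fd);
        return 1;
    }
    printf("%s MTU is %d bytes%s\n", ifname, ifr.ifr_mtu,
           ifr.ifr_mtu > 1500 ? " (jumbo frames enabled)" : "");
    close(fd);
    return 0;
}

Raising the MTU itself is normally done as root from the command line (for example with ifconfig's mtu option) rather than from application code, and it has to be done consistently on every node and on the switch.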
> best regards > Lars > -- > Lars Henriksen | MESH-Technologies A/S > Systems Manager & Consultant | Lille Gr?br?drestr?de 1 > www.meshtechnologies.com | DK-5000 Odense C, Denmark > lars at meshtechnologies.com | mobile: +45 2291 2904 > direct: +45 6311 1187 | fax: +45 6311 1189 > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Unix Consulting joelja at darkwing.uoregon.edu GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Sat Mar 6 10:02:37 2004 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Sat, 06 Mar 2004 16:02:37 +0100 Subject: [Beowulf] SGEEE free and more platform offically supported In-Reply-To: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> References: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> Message-ID: <20040306160237D.hanzl@unknown-domain> > I used to think that SGE is free, but SGEEE (with more > advanced scheduling algorithms) is not. But it is not > true, both are free and open source. SGEEE is free and opensource but many many people did not know this. I thing this confusion made big harm to SGE project and I invested a lot of effort in clarifying this (Google "hanzl SGEEE" to see all that). > In SGE 6.0, there will be no "SGEEE mode", but the > default mode will have all the SGEEE functionality! Great, hope this will stop the confusion once for ever. Regards Vaclav Hanzl _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sat Mar 6 10:00:35 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 23:00:35 +0800 (CST) Subject: [Beowulf] SGEEE free and more platform offically supported In-Reply-To: <20040306160237D.hanzl@unknown-domain> Message-ID: <20040306150035.75079.qmail@web16806.mail.tpe.yahoo.com> --- hanzl at noel.feld.cvut.cz ????> > SGEEE is free and opensource but many many people > did not know this. I > thing this confusion made big harm to SGE project > and I invested a lot > of effort in clarifying this (Google "hanzl SGEEE" > to see all that). I think it is because Sun called it "Enterprise Edition" (EE), and when people think of Enterprise, they think of $$$. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From atp at piskorski.com Sat Mar 6 15:43:28 2004 From: atp at piskorski.com (Andrew Piskorski) Date: Sat, 6 Mar 2004 15:43:28 -0500 Subject: [Beowulf] DC powered clusters? Message-ID: <20040306204328.GA49615@piskorski.com> Some rackmount vendors now offer systems with a small DC-to-DC power supply for each node, with separate AC-DC rectifiers feeding power. 
I imagine the DC is probably at 48 V rather than 12 V or whatever, but often they don't even seem to ay that, e.g.: http://rackable.com/products/dcpower.htm Has anyone OTHER than commercial rackmount vendors designed and built a cluster using such DC-to-DC power supplies? Is there detailed info on such anywhere on the web? Anybody have any idea exactly what components those vendors are using for their power systems, where they can be purchased (in small quantities), and/or how much they cost? I'm curious how the purchase and operating costs compare to the normal "stick a standard desktop AC-to-DC PUSE in each node" approach, or even the hackish "wire on extra connectors and use one high qualtiy desktop PSU to power 2 or 3 nodes" approach. The only DC-to-DC supplies I've seen on the web seem quite expensive, e.g.: http://www.rackmountpro.com/productsearch.cfm?catid=118 http://www.mini-box.com/power-faq.htm So I suspect the DC-to-DC approach would only ever make economic sense for large high-end clusters, those with unusual space or heat constraints, or the like. But I'm still curious about the details... -- Andrew Piskorski http://www.piskorski.com/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Mar 5 23:41:07 2004 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 05 Mar 2004 22:41:07 -0600 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: Message-ID: <40495663.7010507@tamu.edu> Caveats: 1. It's been arough week. 2. I've got some specific opinions about 3Com hardware these days. I just ordered a 16 node cluster. I'm using the Foundry EdgeIron 24G as the basic switch. More than adequate backplane, pretty good small and large packet performance as tested with an Anritsu MD1230. Cost is expected to be about $3000, for the 24 port model. I'm getting 2, and have dual nics on the nodes, for some playing with channel bonding, and so that I've got a failover hot spare if/when one dies. Remember: Murphy was an optimist. For the record I don't expect the EdgeIron to die, but conversely (perversely?) I expect any and all network devices to die at the least opportune time! I didn't even consider 3Com. Didn't test it. The 3Com "gigabit" hardware I've seen recently in the LAN-space was usually capable of gig uplinks, but had trouble with congestion when gig and 100BaseT were mixed on the switch. HP had been OEM'ing Foundry. I'm not sure if that's still the case or if they went recently to someone else; my Foundry rep won't say, and I don't have a close HP rep. We have programmatically stayed away from Asante in our LAN operations here. That translates to no experience an dno contacts. Sorry. Cluster should be in within a month, and so should the switches. I'll do some latency runs and report objective data. gerry Russell Nordquist wrote: > Does anyone have a recommendation for a good 24 port gige switch for > clustering? I know this issue has been discussed, but I didn't find any > actual manufacturer/models people like. Were not really looking at the > very high end models from Cisco, but I am wary of the many low end > switches on the market with regard to bisectional bandwidth issues. > > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other. 
There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? > > thanks and have a good weekend. > russell > > > - - - - - - - - - - - - > Russell Nordquist > UNIX Systems Administrator > Geophysical Sciences Computing > http://geosci.uchicago.edu/computing > NSIT, University of Chicago > - - - - - - - - - - - > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Mail.Linux-Consulting.com Sun Mar 7 03:00:56 2004 From: alvin at Mail.Linux-Consulting.com (Alvin Oga) Date: Sun, 7 Mar 2004 00:00:56 -0800 (PST) Subject: [Beowulf] DC powered clusters? - fun In-Reply-To: <20040306204328.GA49615@piskorski.com> Message-ID: hi ya andrew fun stuff ... :-) good techie vitamins ;-) - lots of thinking of why it is the way it is vs what the real measure power consumption is On Sat, 6 Mar 2004, Andrew Piskorski wrote: > Some rackmount vendors now offer systems with a small DC-to-DC power > supply for each node, with separate AC-DC rectifiers feeding power. I > imagine the DC is probably at 48 V rather than 12 V or whatever, but > often they don't even seem to ay that, e.g.: > > http://rackable.com/products/dcpower.htm i don't like that they claim "back-to-back rackmounts" is their "patented technology" ... geez ... - anybody can mount a generic 1U in the rack .. one in the front and one in the back ( other side ) ... ( obviously the 1U chassis cannot be too deep ) > Has anyone OTHER than commercial rackmount vendors designed and built > a cluster using such DC-to-DC power supplies? Is there detailed info > on such anywhere on the web? dc-dc power supplies are made literally and figuratively by the million various combination of voltage, current capacity and footprint http://www.Linux-1U.net/PowerSupp ( see the list of various power supply manufacturers ) > Anybody have any idea exactly what components those vendors are using > for their power systems, where they can be purchased (in small > quantities), and/or how much they cost? you can buy any size dc-dc power supplies from $1.oo to the thousands if you want the dc-dc power supply to have atx output capabilities, than you have 2 or 3 choice of dc-atx output power supplies: - mini-box.com ( and they have a few resellers ) - there's a power supply company that also did a variation of mini-box.com's design ... i cant find the orig url at this time http://www.dc2dc.com is a resller of the "other option" - probably a bunch of power supp working on dc-atx convertors > The only DC-to-DC supplies I've seen on the web seem quite expensive, > e.g.: > > http://www.rackmountpro.com/productsearch.cfm?catid=118 99% of the rackmount vendors are just reselling (adding $$$ to ) a power supply manufacturer's power supply ... 
- you can save a good chunk of change by buying direct from the generic power supply OEM distributors - somtimes as much or mroe than 50% cost savings of the cost of the power supply > http://www.mini-box.com/power-faq.htm most of their data are measured data per their test setups and more info about dc-dc stuff http://www.via.com.tw/en/VInternet/power.pdf see the rest of the +12v DC input "atx power supply" vendors http://www.Linux-1U.net/PowerSupp/DC/ http://www.Linux-1U.net/PowerSupp/12v/ ( +12v at up to 500A or more ) > So I suspect the DC-to-DC approach would only ever make economic sense > for large high-end clusters, those with unusual space or heat > constraints, or the like. But I'm still curious about the details... dc-atx power supply makes sense when: - power supply heat and airflow is a problem or you dont like having too many power cords ( 400 cords vs 40 in a rack ) - simple cabling is a big problem ( rats nest ) - you want to reduce the costs of the system by throwing away un-used power supply capacity that is available with the traditional one power supply per 1 motherboard and peripherals - most power supplies used are used for maximum supported load (NOT a motherboard + cpu + disk + mem only) - you have a huge airconditioning bill problem - that should motivate you to find and test a system with "less heat generated solutions" - your cluster only needs to have enough power for the cpu + 1disk - you have a space consideration problems - dc-atx power supply allows 420 cpus per 42U rack and up to 840 cpus for front and back loaded cluster - on and on ... for a typical 4U-8U height blade clusters ( 10 blades ) - you only need one 600-800W atx power supply to drive the 10 mini-itx or flex-atx blades - cpu is 25W ?? motherboard is 25W ... - disks need 1A at 12v to spin up.. normal operation current is 80ma at 12v ... etc .. per disk specs - how you want to do power calculations is the trick 10 full-tower system with a 450W power does NOT imply you';re using 4500W of power for 10 systems :-) have fun alvin http://www.1U-ITX.net 100TB - 200TB of disks per 42U racks ?? -- even more fun http://www.itx-blades.net/1U-Blades ( blades are with mini-box.com's dc-dc atx power supply ) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mayank_kaushik at vsnl.net Sun Mar 7 03:29:42 2004 From: mayank_kaushik at vsnl.net (mayank_kaushik at vsnl.net) Date: Sun, 07 Mar 2004 13:29:42 +0500 Subject: [Beowulf] PVM says `PVM_ROOT not set..cant start pvmd` on remote computer Message-ID: <69bcf6b69beb90.69beb9069bcf6b@vsnl.net> hi... im trying to make a two-machine PVM virtual machine. but im having problems with PVM. the names of the two machines are "mayank" and "manish".."mayank" runs fedora core 1, "manish" runs red hat linux -9..both are part of a simple 100mbps LAN, connected by a 100mbps switch. iv *disabled* the firewall on both machines. iv installed pvm-3.4.4-14 on both machines. 
the problem is: when i try to add "mayank" to the virtual machine from "manish" using "add mayank", pvm is unable to do so..gives an error message "cant start pvmd"..then it tries to diagnose what went wrong..it passes all tests but one -- says "PVM_ROOT" is set to "" on the target machine ("mayank")...but thats ABSURD..iv checked a mill-yun times, the said variable is correctly set..when i ssh to mayank from manish, and then echo $PVM_ROOT , i get the correct answer... plz note that im using ssh instead of rsh, by changing the variable PVM_RSH=/usr/bin/ssh..since im more comfortable with ssh... but when i try the opposite--adding "manish" to the virtual machine from "mayank" runnnig fedora..it works! furthermore....before i installed fedora core 1 on mayank, it too had red hat 9..and then i was getting the same problem from BOTH machines..but after installing fedora on mayank, things began to work from that end. what going on??? (apart from me whos going nuts) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Sun Mar 7 11:10:20 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Sun, 7 Mar 2004 08:10:20 -0800 (PST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: <40495663.7010507@tamu.edu> Message-ID: Does anyone have experience with Dell's new 2624 unmanaged 24 port gigE switch? It's only about $330, around a 1/10 the cost of the managed switches. >From what I've read, the Dell/Linksys 5224 managed gigE switch is good. It could be that the unmanaged switch uses the exact same Broadcom switch chips, but just doesn't have management. On Fri, 5 Mar 2004, Gerry Creager N5JXS wrote: > expected to be about $3000, for the 24 port model. I'm getting 2, and > have dual nics on the nodes, for some playing with channel bonding, and Last I heard, the interrupt mitigation on gigE cards messes up channel bonding for extra bandwidth. The packets arrive in batches out of order, and Linux's TCP/IP stack doesn't like this, so you get less bandwidth with two cards than you would with just one. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sun Mar 7 17:13:26 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sun, 7 Mar 2004 17:13:26 -0500 (EST) Subject: [Beowulf] PVM says `PVM_ROOT not set..cant start pvmd` on remote computer In-Reply-To: <69bcf6b69beb90.69beb9069bcf6b@vsnl.net> Message-ID: On Sun, 7 Mar 2004 mayank_kaushik at vsnl.net wrote: > hi... > > > im trying to make a two-machine PVM virtual machine. but im having problems with PVM. > the names of the two machines are "mayank" and "manish".."mayank" runs fedora core 1, "manish" runs red hat linux -9..both are part of a simple 100mbps LAN, connected by a 100mbps switch. > iv *disabled* the firewall on both machines. > > iv installed pvm-3.4.4-14 on both machines. 
> the problem is: > when i try to add "mayank" to the virtual machine from "manish" using > "add mayank", pvm is unable to do so..gives an error message "cant start > pvmd"..then it tries to diagnose what went wrong..it passes all tests > but one -- says "PVM_ROOT" is set to "" on the target machine > ("mayank")...but thats ABSURD..iv checked a mill-yun times, the said > variable is correctly set..when i ssh to mayank from manish, and then > echo $PVM_ROOT , i get the correct answer... This COULD be associated with the order things like .bash_profile and so forth are run for interactive shells vs login shells. If you are setting PVM_ROOT in .bash_profile (so it would be correct on a login) be sure to ALSO set it in .bashrc so that it is set for the remote shell likely used to start PVM. I haven't looked at the fedora RPM so I don't know if /usr/bin/pvm is still a script that sets this variable for you anyway. > plz note that im using ssh instead of rsh, by changing the variable > PVM_RSH=/usr/bin/ssh..since im more comfortable with ssh... Me too. ssh also has a very nice feature that permits an environment to be set on the remote machine for non-interactive remote commands that CAN be useful for PVM, although I think the stuff above might fix it. > but when i try the opposite--adding "manish" to the virtual machine > from "mayank" runnnig fedora..it works! > furthermore....before i installed fedora core 1 on mayank, it too had > red hat 9..and then i was getting the same problem from BOTH > machines..but after installing fedora on mayank, things began to work > from that end. I've encountered a similar problem only once, trying to add nodes FROM a wireless laptop. Didn't work. Adding the wireless laptop from anywhere else worked fine, all systems RH 9 and clean (new) installs from RPM of pvm, I explicitly set PVM_ROOT and PVM_RSH when logging in. PVM_ROOT is additionally set (correctly) by the /usr/bin/pvm command, which is really a shell. > what going on??? (apart from me whos going nuts) Try checking your environment to make sure it is set for both a remote command: ssh mayank echo "\$PVM_ROOT" and in a remote login: ssh mayank $ echo "$PVM_ROOT" rgb > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sunyy_2004 at hotmail.com Mon Mar 8 11:33:18 2004 From: sunyy_2004 at hotmail.com (Yiyang Sun) Date: Tue, 09 Mar 2004 00:33:18 +0800 Subject: [Beowulf] Relation between Marvell Yukon Controller and SysKonnect GbE Adapters Message-ID: Hi, Beowulf users, We're going to setup a small cluster. The motherboard we ordered is the newly released Gigabyte GA-8IPE1000-G which integrates Marvell's Yukon 8001 GbE Controller. I tried to find the Linux driver for this controller on Google and was directed to SysKonnect's website http://www.syskonnect.com/syskonnect/support/driver/d0102_driver.html which provides a driver for Marvell Yukon/SysKonnect SK-98xx Gigabit Ethernet Adapters. 
However, there is no explicit indication on this website that SysKonnect's adapters use Marvell's chips. Does any here have experience using Marvell's controllers? Is it easy to install Yukon 8001 on Linux? Thanks! Yiyang _________________________________________________________________ Get MSN Hotmail alerts on your mobile. http://en-asiasms.mobile.msn.com/ac.aspx?cid=1002 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Mon Mar 8 14:44:50 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 8 Mar 2004 14:44:50 -0500 (EST) Subject: [Beowulf] Re: beowulf In-Reply-To: <20040308184024.955.qmail@web21501.mail.yahoo.com> Message-ID: On Mon, 8 Mar 2004, prakash borade wrote: > how should i proceed for a client which takes dta from 5 servers > reoetadly after every 15 seconds > i get the data but it prints the garbage value > > what can be the problem i am usiung sockets on redhat 9 > > i am creting new sockets for it every time on clien side Dear Prakash, There is such a dazzling array of possible problems with your code that (not being psychic) I cannot possibly help you. For example -- You could be printing an integer as a float without a cast (purely misusing printf). Or vice versa. I do this all the time; it is a common mistake. You could be sending the data on a bigendian system, receiving it and trying to print it on a littleendian system. You could have a trivial offset wrong in your receive buffers -- printing an integer (for example) starting a byte in and overlapping some other data in your stack would yield garbage. You could have a serious problem with your read algorithm. Reading reliably from a socket is not trivial. I use a routine that I developed over a fairly long time and it STILL has bugs that surface. The reading/writing are fundamentally asynchronous, and a read can easily leave data behind in the socket buffer (so that what IS read is garbage). ...and this is the tip of an immense iceberg of possible programming errors. The best way to proceed to write network code is to a) start with a working template of networking/socket code. There are examples in a number of texts, for example, as well as lots of socket-based applications. Pick a template, get it working. b) SLOWLY and GENTLY change your working template into your application, ensuring that the networking component never breaks at intermediary revisions. or c) learn, slowly, surely, and by making many mistakes, to write socket code from scratch without using a template. Me, I use a template. rgb P.S. to get more help, you're really going to have to provide a LOT more detail than this. Possibly including the actual source code. -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Mon Mar 8 14:54:40 2004 From: beowulf at studio26.be (Maikel Punie) Date: Mon, 8 Mar 2004 20:54:40 +0100 Subject: [Beowulf] Cluster school project Message-ID: hi, I need to make a smaal beowulf cluster for a school project i have like 2 months for this stuff, but i need to make my own task asignment. 
So basicly what do you guys think that would be possible to realize in 2 months time? The only thing they told me, is that the nodes must be discless systems. any ideas about what could be donne in 2 months. Maikel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Mon Mar 8 16:03:41 2004 From: beowulf at studio26.be (Maikel Punie) Date: Mon, 8 Mar 2004 22:03:41 +0100 Subject: [Beowulf] Cluster school project In-Reply-To: Message-ID: hmm, ok, maybe i explained badly, at the moment i just need to create a project discryption on what would be possible to realize in 2 months, and off course i could use the cluster knoppix, but then its not a real project anymore, then its just an install task. also the openmosix structure is it using diskless nodes? or what because i can't find a lot off info about it. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. Well its the whole other part off the country, but yeah it was a great conference i was there to :) Thanks Miakle -----Oorspronkelijk bericht----- Van: John Hearns [mailto:john.hearns at clustervision.com] Verzonden: maandag 8 maart 2004 21:52 Aan: Maikel Punie CC: Beowul-f Mailing lists Onderwerp: Re: [Beowulf] Cluster school project On Mon, 8 Mar 2004, Maikel Punie wrote: > > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > Maikel, first you need the computers! Then you should first look at ClusterKnoppix http://bofh.be/clusterknoppix/ Once you have that running, come back and tell us how you got on. We'll help you do more then. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Mon Mar 8 15:51:58 2004 From: john.hearns at clustervision.com (John Hearns) Date: Mon, 8 Mar 2004 21:51:58 +0100 (CET) Subject: [Beowulf] Cluster school project In-Reply-To: Message-ID: On Mon, 8 Mar 2004, Maikel Punie wrote: > > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > Maikel, first you need the computers! Then you should first look at ClusterKnoppix http://bofh.be/clusterknoppix/ Once you have that running, come back and tell us how you got on. We'll help you do more then. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. 
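As a concrete answer to "what could be done in 2 months": once the diskless nodes boot (with ClusterKnoppix or a hand-rolled NFS root), getting one small MPI program to run across them is a realistic, demonstrable goal, and a parallel pi calculation, which also comes up later in this thread, is the textbook first example. A minimal sketch, assuming an MPI implementation such as MPICH or LAM (both discussed elsewhere in this digest) with mpicc and mpirun available:

/*
 * Classic "first cluster program": estimate pi by numerically
 * integrating 4/(1+x^2) over [0,1], splitting the intervals across
 * MPI ranks and combining the partial sums with MPI_Reduce.
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    long i, n = 10000000;              /* number of intervals */
    double h, x, local = 0.0, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    h = 1.0 / (double)n;
    /* Each rank sums every size-th interval, starting at its own rank. */
    for (i = rank; i < n; i += size) {
        x = h * ((double)i + 0.5);
        local += 4.0 / (1.0 + x * x);
    }
    local *= h;

    /* Collect the partial sums on rank 0. */
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi is approximately %.12f with %d processes\n", pi, size);

    MPI_Finalize();
    return 0;
}

Building it with mpicc and launching it with mpirun across one, two, four and eight nodes, then plotting the run times, already makes a presentable two-month project write-up on top of the diskless setup work itself.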
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mprinkey at aeolusresearch.com Mon Mar 8 14:39:44 2004 From: mprinkey at aeolusresearch.com (Michael T. Prinkey) Date: Mon, 8 Mar 2004 14:39:44 -0500 (EST) Subject: [Beowulf] e1000 performance Message-ID: Hello everyone, I am building a small cluster that uses Tyan S2723GNN motherboards that include an integrated Intel e1000 gigabit NIC. I have installed two Netgear 302T gigabit cards in the 66 MHz slots as well. With point-to-point links, I can get a very respectable 890 Mbps with the tg3 cards, but the e1000 lags significantly at 300 to 450 Mbps. I am using the NAPI e1000 driver in the 2.4.24 kernel. I have tried the following measures without any improvement: - changed the tcp_mem,_wmem,_rmem to larger values. - increased the MTU to values >1500. - reniced the ksoftirq processes to 0. The 2.4.24 kernel contains the 4.x version of the e1000. I plan to try the 5.x version this evening. Also, want to try increasing the Txqueuelen as well. Has anyone had similar experience with these embedded e1000s? Googling leads me to several sites like this one: http://www.hep.ucl.ac.uk/~ytl/tcpip/tuning/ that seem to indicate that I should expect much more from the e1000. Any help here is welcome? Thanks, Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Mon Mar 8 16:59:59 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Mon, 8 Mar 2004 13:59:59 -0800 (PST) Subject: [Beowulf] e1000 performance In-Reply-To: Message-ID: On Mon, 8 Mar 2004, Michael T. Prinkey wrote: > I am building a small cluster that uses Tyan S2723GNN motherboards that > include an integrated Intel e1000 gigabit NIC. I have installed two >From a supermicro X5DPL-iGM (E7501 chipset) with onboard e1000 to supermicro E7500 board with an e1000 PCI-X gigabit card, via a dell 5224 switch. The E7501 board has a 3ware 8506 card on the same PCI-X bus as the e1000 chip, so it's running at 64/66. The PCI-X card is running at 133 MHz. TCP STREAM TEST to duet Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131070 131070 1472 9.99 940.86 Kernel versions are 2.4.20 (PCI-X card) and 2.4.22-pre2 (the onboard chip). 2.4.20 has driver 4.4.12-k1, while 2.4.22-pre2 has driver 5.1.11-k1. The old e1000 driver has a very useful proc file in /proc/net/PRO_LAN_Adapters that gives all kind of information. I have RX checksum on and flow control turned on. The newer driver doesn't have this information. > the NAPI e1000 driver in the 2.4.24 kernel. I have tried the following NAPI? > measures without any improvement: I've done nothing wrt gigabit performance, other than turn on flow control. I found that without flowcontrol, tcp connections to 100 mbit hosts would hang. 
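One knob that goes hand in hand with the tcp_rmem/tcp_wmem sysctls Michael mentions is the per-socket buffer size the application itself requests; the 131070-byte sockets in the netperf numbers above are exactly that. A minimal sketch, assuming Linux (the 256 KB request is only an example figure), of asking for larger buffers with setsockopt() and reading back what the kernel actually granted:

/*
 * Illustrative sketch: request larger per-socket buffers with
 * setsockopt(SO_SNDBUF/SO_RCVBUF) and read back the granted size.
 * Linux caps the request at net.core.wmem_max / net.core.rmem_max,
 * which is why the sysctl side of the tuning matters too.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int want = 256 * 1024;
    int got;
    socklen_t len = sizeof(got);

    if (fd < 0) {
        perror("socket");
        return 1;
    }
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &want, sizeof(want)) < 0)
        perror("SO_SNDBUF");
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want)) < 0)
        perror("SO_RCVBUF");

    /* Linux reports roughly twice the granted value here, since its
       own bookkeeping overhead is included in the number. */
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &got, &len) == 0)
        printf("send buffer: asked %d, kernel reports %d\n", want, got);
    len = sizeof(got);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) == 0)
        printf("recv buffer: asked %d, kernel reports %d\n", want, got);

    close(fd);
    return 0;
}

If the reported value stays small no matter what is requested, the net.core.rmem_max/wmem_max caps are what need raising; that is the sysctl side of the tuning pages Michael linked to.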
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mnerren at paracel.com Mon Mar 8 17:31:55 2004 From: mnerren at paracel.com (micah nerren) Date: Mon, 08 Mar 2004 14:31:55 -0800 Subject: [Beowulf] Cluster school project In-Reply-To: References: Message-ID: <1078785115.30523.89.camel@angmar> On Mon, 2004-03-08 at 11:54, Maikel Punie wrote: > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > > any ideas about what could be donne in 2 months. > > Maikel To actually build a small (or large!) beowulf of discless systems is pretty easy, I guess the hardest part will be determining what the purpose of the cluster will be. What type of code will be running on it? They will basically be network booting a kernel and mounting an nfs filesystem. Research these aspects, and research what kind of tools you want to have on the cluster, ie. distributed shell, monitoring, mpi, etc. 2 months should be plenty, you should be able to get a basic small beowulf up and running in 2 hours once you know what to do and how to set it up. Time to fire up google and start researching beowulf's and diskless booting. There is a lot of good info out there. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Mon Mar 8 17:15:46 2004 From: becker at scyld.com (Donald Becker) Date: Mon, 8 Mar 2004 17:15:46 -0500 (EST) Subject: [Beowulf] BWBUG Greenbelt: Intel HPC and Grid, Beowulf Clusters Message-ID: Special notes: This month's meeting is in Greenbelt Maryland, not Virginia! From pre-registration we expect a full room, so please register on line at http://bwbug.org and show up at least 15 minutes early. Title: Intel's Perspective on Beowulf's Clusters Speaker: Stephen Wheat Ph.D This talk will review Intel's perspective on technology trends and transitions in this decade. The focus will be on bringing the latest technology to the scientists' labs in the shortest amount of time. The technologies reviewed will include processors, chipsets, I/O, systems management, and software tools. Come with your questions; the presentation is designed to be interactive. Date: March 9, 2004 Time: 3:00 PM (doors open at 2:30) Location: Northrop Grumman IT 7501 Greenway Center Drive (Intersection of BW Parkway and DC beltway) Suite 1200 (12th floor) Greenbelt Maryland Need to be a member?: No ( guests are welcome ) Parking: Free As usual there will be door prizes, food and refreshments. From: "Fitzmaurice, Michael" Dr. Wheat from Intel must be a popular speaker we have a big turn out expected. If you have not registered yet please do so. We may need to plan for extra chairs and we need to predict how many pizzas to order. This would be great meeting to invite a friend or your boss. It may be crowded, therefore, getting there a little early is recommended. This event is sponsored by the Baltimore-Washington Beowulf Users Group (BWBUG) Please register on line at http://bwbug.org As usual there will be door prizes, food and refreshments. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From nathan at iwantka.com Mon Mar 8 18:21:15 2004 From: nathan at iwantka.com (Nathan Littlepage) Date: Mon, 8 Mar 2004 17:21:15 -0600 Subject: [Beowulf] SCTP Message-ID: <00d701c40564$1d21a830$6c45a8c0@ntbrt.bigrivertelephone.com> Has anyone looked into incorporating SCTP in the cluster environment? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Mon Mar 8 20:44:52 2004 From: becker at scyld.com (Donald Becker) Date: Mon, 8 Mar 2004 20:44:52 -0500 (EST) Subject: [Beowulf] SCTP In-Reply-To: <00d701c40564$1d21a830$6c45a8c0@ntbrt.bigrivertelephone.com> Message-ID: On Mon, 8 Mar 2004, Nathan Littlepage wrote: > Has anyone looked into incorporating SCTP in the cluster environment? What advantage would it provide for a SAN- or LAN-based cluster? Not that TCP is especially light-weight. TCP implementations are WAN-oriented and have increasingly costly features (look at the CPU cost of iptables/ipchains) and defenses against spoofing (TCP stream start-up is much more costly than the early BSD implementations). The only reason SCTP would be a better cluster protocol is that it hasn't yet accumulated the cruft ("features") of a typical TCP stack. But if it became popular, that would change pretty much instantly. -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 914 Bay Ridge Road, Suite 220 Scyld Beowulf cluster systems Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rdn at uchicago.edu Mon Mar 8 23:40:01 2004 From: rdn at uchicago.edu (Russell Nordquist) Date: Mon, 8 Mar 2004 22:40:01 -0600 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: <1078566922.2547.6.camel@fermi> Message-ID: <20040308224001.50f2f728@vitalstatistix> thanks for all the good info. it got me to thinking....i have resources for comparing most components of a cluster excepts network switches. it would be nice to have a source of information for this as well. something like: *bandwidth/latency between 2 hosts *bandwidth/latency at 25%/50%/75%/100% port usage *short vs long message comparisons great so far, but what about the issues: *what SW to use for the benchmark. perhaps netpipe? *the NICS used will make a difference. how does one account for the difference between a realtec and syskonnect chipset, bus speeds, etc? *do we have enough variation of cluster sizes and HW to make a useful repository? *and i'm sure there's more Is this feasible? Is it a case where any info is useful even if it is not very reliable/accurate? With more MB's coming with decent gige on board there will be a greater chance the the difference between to setups will only be the switch. so, is this a worthwhile are useful project for the community? or are there to many variables to make the results useful? 
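For the two-host bandwidth/latency numbers Russell lists first, netpipe and netperf are the obvious candidates, and it may help to spell out how little is involved in the latency half of such a measurement. Below is a bare-bones TCP ping-pong sketch, assuming Linux/POSIX sockets (the port number, message size and iteration count are arbitrary example values, and error handling is trimmed for brevity): run it with no arguments on one node as the echo server, then with the server's IP on a second node, and the client prints the average round trip for small messages. The readn()/writen() loops are the important detail; as rgb pointed out earlier in this digest, a single read() on a stream socket is not guaranteed to return a whole message.

/*
 * Bare-bones TCP ping-pong latency sketch (netpipe/netperf are the
 * real tools).  Server: ./pingpong    Client: ./pingpong <server-ip>
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>

#define PORT 5678          /* arbitrary example port   */
#define MSG  64            /* message size in bytes    */
#define ITER 10000         /* round trips to average   */

static int readn(int fd, char *buf, int n)      /* read exactly n bytes */
{
    int done = 0, r;
    while (done < n) {
        r = read(fd, buf + done, n - done);
        if (r <= 0) return -1;
        done += r;
    }
    return n;
}

static int writen(int fd, const char *buf, int n)  /* write exactly n bytes */
{
    int done = 0, w;
    while (done < n) {
        w = write(fd, buf + done, n - done);
        if (w <= 0) return -1;
        done += w;
    }
    return n;
}

int main(int argc, char **argv)
{
    char buf[MSG] = {0};
    int one = 1, fd, i;
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(PORT);

    if (argc < 2) {                               /* server: echo back  */
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
        bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
        listen(lfd, 1);
        fd = accept(lfd, NULL, NULL);
        if (fd < 0) { perror("accept"); return 1; }
        setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
        while (readn(fd, buf, MSG) == MSG)
            writen(fd, buf, MSG);
    } else {                                      /* client: time RTTs  */
        struct timeval t0, t1;
        fd = socket(AF_INET, SOCK_STREAM, 0);
        inet_pton(AF_INET, argv[1], &addr.sin_addr);
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }
        setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
        gettimeofday(&t0, NULL);
        for (i = 0; i < ITER; i++) {
            writen(fd, buf, MSG);
            readn(fd, buf, MSG);
        }
        gettimeofday(&t1, NULL);
        printf("avg round trip: %.2f microseconds (%d-byte messages)\n",
               ((t1.tv_sec - t0.tv_sec) * 1e6 +
                (t1.tv_usec - t0.tv_usec)) / ITER, MSG);
    }
    close(fd);
    return 0;
}

Results from a toy like this are only comparable if the NIC, driver and kernel are held fixed across tests, which is exactly the difficulty Russell raises about building a public switch-comparison repository.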
russell -- - - - - - - - - - - - - Russell Nordquist UNIX Systems Administrator Geophysical Sciences Computing http://geosci.uchicago.edu/computing NSIT, University of Chicago - - - - - - - - - - - _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Tue Mar 9 12:45:47 2004 From: beowulf at studio26.be (Maikel Punie) Date: Tue, 9 Mar 2004 18:45:47 +0100 Subject: [Beowulf] Cluster school project In-Reply-To: <644D9337A02FC24689647BF9E48EC39E08ABB797@drm556> Message-ID: >> ok, maybe i explained badly, at the moment i just need to create a project >> discryption on what would be possible to realize in 2 months, and off course >Do you mean a computing/programming project could you do, >like calculating pi to some large number of digits? yeah something like that, i realy have no idea what is possible. if there are any suggestions, they are always welcome. Maikel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From paulojjs at bragatel.pt Tue Mar 9 04:15:05 2004 From: paulojjs at bragatel.pt (Paulo Silva) Date: Tue, 09 Mar 2004 09:15:05 +0000 Subject: [Beowulf] How to choose an UPS for a Beowulf cluster Message-ID: <1078823704.1882.33.camel@blackTiger> Hi, I'm building a small Beowulf cluster for HPC (about 16 nodes) and I need some advices on choosing the right UPS. The UPS should be able to signal the central node when the battery reaches some level (I think this is common usage) and it should be able to turn itself off before running out of battery (I was told that this extends the life of the battery). 10 minutes of runtime sould be enough. I was looking in the APC site but I was rather confused by all the models available. Can anyone give me some advice on the type of device to choose? Thanks for any tip -- Paulo Jorge Jesus Silva perl -we 'print "paulojjs".reverse "\ntp.letagarb@"' If a guru falls in the forest with no one to hear him, was he really a guru at all? -- Strange de Jim, "The Metasexuals" -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Esta ? uma parte de mensagem assinada digitalmente URL: From brichard at clusterworldexpo.com Tue Mar 9 13:45:15 2004 From: brichard at clusterworldexpo.com (Bryan Richard) Date: Tue, 9 Mar 2004 13:45:15 -0500 Subject: [Beowulf] Join Don Becker and Thomas Sterling at ClusterWorld Conference & Expo Message-ID: <20040309184515.GB47601@clusterworldexpo.com> ClusterWorld Conference & Expo welcomes Scyld's Don Becker and Keynote Thomas Sterling to the program! If you work in Beowulf and clusters, you can't miss the following program events: - Donald Becker, Scyld Computing Corporation: "Scyld Beowulf Introductory Workshop" - Donald Becker, Scyld Computing Corporation: "Scyld Beowulf Advanced Workshop" - Thomas Sterling, California Institute of Technology: "Beowulf Cluster Computing a Decade of Accomplishment, a Decade of Challenge" PLUS, ClusterWorld's exciting program of intensive tutorials, special events, and expert presentations in 8 vertical industry tracks: Applications, Automotive & Aerospace Engineering, Bioinformatics, Digital Content Creation, Grid, Finance, Petroleum & Geophysical Exploration, and Systems. 
A Special Offer for Beowulf Members =================================== Beowulf.org members get 20% off registration prices when registering online! You MUST use your special Priority Code - BEOW -- when registering online to receive your 20% discount! Online registration ends March 31, 2004 so don't delay! Just go to http://www.clusterworldexpo.com and click on "REGISTER NOW!" to fill out our quick enrollment form. Associations, Universities and Labs Get 50% off Registration ============================================================ Students and employees of universities, associations, and government labs are eligible for 50% off ClusterWorld registration! This offer is only available via fax or mail. Please log on to www.clusterworldexpo.com and click on "Register Now" to download registration PDFs. Or call 415-321-3062 for more information A TERRIFIC PROGRAM ================== At ClusterWorld Conference & Expo, you will: * LEARN from top clustering experts in our extensive conference program. * EXPERIENCE the latest cluster technology from the top vendors on our expo floor. * MEET AND NETWORK with colleagues from across the world of clustering at our social events and parties. Keynotes: - Ian Foster, Argonne National Laboratory, University of Chicago, Globus Alliance, and co-author of "The Grid: Blueprint for a New Computing Infrastructure", - Thomas Sterling, California Institute of Technology, author of "How to Build a Beowulf," and co-author of "Enabling Technologies for Petaflops Computing". - Andrew Mendelsohn, Senior Vice President, Database & Application Server Technology, Oracle Corporation - David Kuck, Intel Fellow, Manager, Software and Solutions Group, Intel Corporation Want to know which sessions are getting the biggest buzz? Click on http://www.clusterworldexpo.com/SessionSpotlight for a list of highlights by Technical Session Track. REGISTER TODAY! ClusterWorld Conference and Expo April 5 - 8, 2004 San Jose Convention Center San Jose, California http://www.clusterworldexpo.com ClusterWorld Conference & Expo Sponsors ======================================= Platinum: Oracle Corporation, Intel Corporation Gold: AMD, Dell, Hewlett Packard, Linux Networx, Mountain View Data, Panasas, Penguin Computing, and RLX Technologies Silver: Appro, Engineered Intelligence, Microway, NEC, Platform Computing, and PolyServe Media & Association Sponsors: Bioinformatics.org, ClusterWorld Magazine, Distributed Systems Online, Dr. 
Dobbs Journal, Gelato Federation, Global Grid Forum, GlobusWorld, LinuxHPC, Linux Magazine, PR Newswire, Storage Management, and SysAdmin Magazine _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wseas at canada.com Tue Mar 9 12:25:56 2004 From: wseas at canada.com (WSEAS newsletter in mechanical engineering) Date: Tue, 9 Mar 2004 19:25:56 +0200 Subject: [Beowulf] WSEAS and IASME newsletter in mechanical engineering, March 9, 2004 Message-ID: <3FE20F4000220BB2@fesscrpp1.tellas.gr> (added by postmaster@fesscrpp1.tellas.gr) If you want to contact us, the Subject of your email must contains the code: WSEAS CALL FOR PAPERS -- CALL FOR REVIEWERS -- CALL FOR SPECIAL SESSIONS http://www.wseas.org IASME / WSEAS International Conference on "FLUID MECHANICS" (FLUIDS 2004) August 17-19, Corfu Island, Greece The papers of this conference will be published: (a) as regular papers in the IASME/WSEAS conference proceedings (b) regular papers in the IASME TRANSACTIONS ON MECHANICAL ENGINEERING http://www.wseas.org REGISTRATION FEES: 250 EUR DEADLINE: APRIL 10, 2004 ACCOMODATION: Incredible low prices in a 5 Star Sea Resort (former HILTON of Corfu Island), Greece, 5 Star Sea resort where the multiconference of WSEAS will take place in August 2004: 51 EUR in double room and 81 EUR in single room. (in August 2004, in the Capital of Greece, Athens, the 2004 Olympic Games will take place) ---> Sponsored by IASME <---- Topics of FLUIDS 2004 Mathematical Modelling in fluid mechanics Simulation in fluid mechanics Numerical methods in fluid mechanics Convection, heat and mass transfer Experimental Methodologies in fluid mechanics Thin film technologies Multiphase flow Boundary layer flow Material properties Fluid structure interaction Hydrotechnology Hydrodynamics Coastal and estuarial modelling Wave modelling Industrial applications Environmental Problems Air Pollution Problems Fluid Mechanics for Civil Engineering Fluid Mechanics in Geosciences Flow visualisation Biofluids Meteorology Waste Management Environmental protection Management of living resources Mathematical models Management of Rivers and Lakes Underwater Ecology Hydrology Oceanology Ocean Engineering Others INTERNATIONAL SCIENTIFIC COMMITTEE Andrei Fedorov (USA) A. C. Baytas (Turkey) Albert R. George (USA) Alexander I. Leontiev (Russia) Andreas Dillmann (Germany) Bruce Caswell (USA) Chris Swan (UK) David A. Caughey (USA) Derek B Ingham (UK) Donatien Njomo (CM) Dong Chen (Australia) Dong-Ryul Lee (Korea) Edward E. Anderson (USA) G. Gaiser (Germany) G.D. Raithby (Canada) Gad Hetsroni (Israel) H. Beir?o da Veiga (Italy) Ingegerd Sjfholm (Sweden) Jerry R. Dunn (USA) Joseph T. C. Liu (USA) Karl B?hler (Germany) Kenneth S. Breuer (USA) Kumar K. Tamma (USA) Kyungkeun Kang (USA) M. A. Hossain (UK) M. F. El-Amin (USA) M.-Y. Wen (Taiwan) Michiel Nijemeisland (USA) Ming-C. Chyu (USA) Naoto Tanaka (Japan) Natalia V. Medvetskaya (Russia) O. Liungman (Sweden) Philip Marcus (USA) Pradip Majumdar (USA) Rama Subba Reddy Gorla (USA) Robert Nerem (USA) Rod Sobey (UK) Ruairi Maciver (UK) S.M.Ghiaasiaan (USA) Stanley Berger (USA) Tak?o Takahashi (France) Vassilis Gekas (Sweden) Yinping Zhang (China) Yoshitaka Watanabe (Japan) NOTE THAT IN WSEAS CONFERENCES YOU CAN HAVE PROCEEDINGS 1) HARD COPY 2) CD-ROM and 3) Web Publishing WSEAS Books, Journals, Proceedings participate now in all major science citation indexes. 
ISI, ELSEVIER, CSA, AMS. Mathematical Reviews, ELP, NLG, Engineering Index Directory of Published Proceedings, INSPEC (IEE) More Details: http://www.wseas.org Thanks Alexis Espen ##### HOW TO UNSUBSCRIBE #### You receive this newsletter from your email address: beowulf at beowulf.org If you want to unsubscribe, send an email to: wseas at canada.com The Subject of your message must be exactly: REMOVE beowulf at beowulf.org WSEAS If you want to unsubscribe more than one email addresses, send a message to nata at wseas.org with Subject: REMOVE [email1, emal2, ...., emailn] WSEAS _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From michael.worsham at mci.com Tue Mar 9 13:32:14 2004 From: michael.worsham at mci.com (Michael Worsham) Date: Tue, 09 Mar 2004 13:32:14 -0500 Subject: [Beowulf] Cluster school project Message-ID: <000f01c40604$d8ef6520$987a32a6@Wcomnet.com> I would say also check out the Bootable Cluster CD (http://bccd.cs.uni.edu/) as well. It is very easy to use and was specifically designed so you could cluster an entire network lab, without having to worry about the hard drives being written to. -- Michael _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Tue Mar 9 16:13:24 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Tue, 9 Mar 2004 13:13:24 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? Message-ID: Has anyone with dual opteron machines and a kill-a-watt measured how much power they consume? I measured the dual P3 and xeons we have here, but no dual opterons yet. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Tue Mar 9 17:36:05 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Tue, 9 Mar 2004 14:36:05 -0800 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <20040309223605.GA29912@cse.ucdavis.edu> On Tue, Mar 09, 2004 at 01:13:24PM -0800, Trent Piepho wrote: > Has anyone with dual opteron machines and a kill-a-watt measured how much > power they consume? > > I measured the dual P3 and xeons we have here, but no dual opterons yet. I recently measured a Sunfire V20z (dual 2.2 GHz) opteron, I believe it had 2 scsi disks, 4 GB ram. watts VA Idle 237-249 260-281 Pstream 1 thread 260-277 290-311 Pstream 2 threads 265-280 303-313 Pstream is very much like McCalpin's stream, except it uses pthreads 2 run parallel threads in sync, and it runs over a range of array sizes. It's the most power intensive application I've found, anything with heave disk usage tends to decrease the power usage. It's also great for showing memory system parallelism, say for a dual p4 vs opteron. I also find it useful for finding misconfigured dual opterons. 
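The heart of such a test is nothing more than a McCalpin-style triad run in one pthread per CPU. A stripped-down sketch (illustrative only; far simpler than the real pstream.c linked below, and the array size, repeat count and MB/s accounting here are arbitrary choices) looks roughly like:

/* tinystream.c -- minimal pthreads memory bandwidth loop (sketch only).
 * Build: gcc -O2 -o tinystream tinystream.c -lpthread
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

#define N        (4*1024*1024)   /* doubles per array, well beyond cache */
#define REPS     20
#define NTHREADS 2               /* one per CPU */

static volatile double sink;     /* keeps the compiler from removing the loop */

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + 1e-6 * tv.tv_usec;
}

static void *triad(void *arg)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    long i, r;

    (void)arg;
    for (i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }
    for (r = 0; r < REPS; r++)            /* STREAM-style triad: a = b + s*c */
        for (i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i];
    sink = a[N/2];
    free(a); free(b); free(c);
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];
    int i;
    double t0 = now();

    for (i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, triad, NULL);
    for (i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);

    /* three arrays touched per triad iteration, 8 bytes each */
    printf("aggregate ~%.0f MB/s\n",
           3.0 * 8.0 * N * REPS * NTHREADS / (now() - t0) / 1e6);
    return 0;
}

Watching the kill-a-watt while NTHREADS goes from 1 to 2 gives the same idle/one-thread/two-thread comparison as the table above.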
For those interested: http://cse.ucdavis.edu/bill/pstream.c -- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Tue Mar 9 17:49:14 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Tue, 9 Mar 2004 14:49:14 -0800 Subject: [Beowulf] good 24 port gige switch In-Reply-To: <20040308224001.50f2f728@vitalstatistix> References: <1078566922.2547.6.camel@fermi> <20040308224001.50f2f728@vitalstatistix> Message-ID: <20040309224914.GB29912@cse.ucdavis.edu> On Mon, Mar 08, 2004 at 10:40:01PM -0600, Russell Nordquist wrote: > > thanks for all the good info. it got me to thinking....i have resources > for comparing most components of a cluster excepts network switches. it > would be nice to have a source of information for this as well. > something like: > > *bandwidth/latency between 2 hosts > *bandwidth/latency at 25%/50%/75%/100% port usage > *short vs long message comparisons I use nrelay.c a small simple program I wrote that will MPI_Send MPI_send very size packets between sets of nodes. So I do something like the following to find best base latency and bandwidth: mpirun -np 2 ./nrelay 1 # then run with 10 100 1000 10000 size = 1, 2 nodes in 2.97 sec ( 5.7 us/hop) 690 KB/sec size= 10, 524288 hops, 2 nodes in 3.06 sec ( 5.8 us/hop) 6688 KB/sec size= 100, 524288 hops, 2 nodes in 4.19 sec ( 8.0 us/hop) 48868 KB/sec size= 1000, 524288 hops, 2 nodes in 15.37 sec ( 29.3 us/hop) 133267 KB/sec size= 10000, 524288 hops, 2 nodes in 40.72 sec ( 77.7 us/hop) 502908 KB/sec So we have an interconnect that manages 5.8 us for small messages and 500 MB/sec or so for large (10000 MPI_INTs). Then I run: mpirun -np 2,4,8,16,32,64 ./nrelay 10000 size= 10000, 524288 hops, 2 nodes in 40.72 sec ( 77.7 us/hop) 502908 KB/sec size= 10000, 524288 hops, 4 nodes in 39.79 sec ( 75.9 us/hop) 514698 KB/sec size= 10000, 524288 hops, 8 nodes in 39.21 sec ( 74.8 us/hop) 522253 KB/sec size= 10000, 524288 hops, 16 nodes in 45.53 sec ( 86.8 us/hop) 449772 KB/sec size= 10000, 524288 hops, 32 nodes in 49.25 sec ( 93.9 us/hop) 415876 KB/sec size= 10000, 524288 hops, 64 nodes in 52.90 sec (100.9 us/hop) 387111 KB/sec So in this case it looks like the switch is becoming saturated. The source is at: http://cse.ucdavis.edu/bill/nrelay.c I'd love to see numbers posted for various GigE, Myrinet, Dolphin and IB configurations -- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Mar 9 19:32:49 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 9 Mar 2004 19:32:49 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Tue, 9 Mar 2004, Trent Piepho wrote: > Has anyone with dual opteron machines and a kill-a-watt measured how much > power they consume? > > I measured the dual P3 and xeons we have here, but no dual opterons yet. By strange chance yes. An astoundingly low 154 watts (IIRC -- I'm home, the kill-a-watt is at Duke -- but it was definitely ballpark of 150W) under load. That's a load average of 2, one task per processor, without testing under a variety of KINDS of load. Around 75W per loaded CPU. 
That's a bit less than the draw of an >>idle<< dual Athlon (165W). I'm actually racking six more boxes tomorrow and will recheck the draw and verify that it really is under load, but I was with Seth when I measured it and we remarked back and forth about it, really pleased, so I'm pretty sure I'm right. It has several very positive implications and seems believable. They are 1U cases (Penguin Altus 1000's) but the air coming out of the back is not that hot, really, again compared to the E-Z Bake Oven 2U 2466 dual Athlons (something like 260W under load). So we gain significantly in CPU, get access to larger memory if/when we care, get 64 bit memory bus, and drop power and cooling requirements (per CPU, but very nearly per rack U). It just don't get any better than this. I think they are 242's, FWIW. YMMV. I could be wrong, mistaken, deaf, dumb, blind, and stupid. My kill-a-watt could be on drugs. I could be on drugs. Maybe I dropped a decimal and they really draw 1500W. Perhaps the beer I spilled in my kill-a-watt confused it. I was up to 3:30 am finishing a month-late column for deadline himself (leaving me only days late on the CURRENT column) and my brain doesn't work very well any more. Caveat auditor. rgb > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Tue Mar 9 20:41:45 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Wed, 10 Mar 2004 09:41:45 +0800 (CST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040309223605.GA29912@cse.ucdavis.edu> Message-ID: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> --- Bill Broadley ??? > I recently measured a Sunfire V20z (dual 2.2 GHz) > opteron, I believe it had 2 scsi disks, 4 GB ram. > > watts VA > Idle 237-249 260-281 > Pstream 1 thread 260-277 290-311 > Pstream 2 threads 265-280 303-313 But that is with the disks, RAM, and other hardware you have. Anyone with similar configurations but have P4s instead? It just looks too good to believe the numbers... consider that the similar performance one IA64 processor ALONE draws over 120W. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Tue Mar 9 21:08:45 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Tue, 9 Mar 2004 18:08:45 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Tue, 9 Mar 2004, C J Kenneth Tan -- Heuchera Technologies wrote: > What is the power consumption that you measured for your dual P3 and > Xeons? 
System #1: Dual P3-500 Katmai, BX motherbaord, 512 MB PC100 ECC RAM, two tulip NICs, cheap graphics card, 5400 RPM IDE drive, floppy drive, one case fan, and a normal 250W ATX PS with a fan: System #2: Nearly the same as system #1 more or less, but with dual P3-850 Coppermines and no case fan. System #3: Dual Xeon 2.4 GHz 533FSB, E7501 chipset, 1 GB PC2100 ECC memory, two 3Ware 8506-8 cards, a firewire card, onboard intel GB and FE, one Maxtor 6Y200P0 drive, 6 high speed case fans (rated 4.44W each), floppy drive, CD-ROM drive, 550W PS with power factor correction (rated minimum 63% efficient), SATA backplane, and 16 Maxtor 6Y200M0 SATA drives (rated 7.4W idle each) in hotswap carriers. I measured system #3 with the SATA drives both installed and removed. Unfortunately I don't have a dual Xeon with minimal extra hardware to test. #1 Idle 42W 72 VA (.58 PF) #1 Loaded 103W 157 VA (.66 PF) #2 Idle 39W 67 VA (.58 PF) #2 Loaded 96W 148 VA (.65 PF) #3 Idle w/o RAID 162W 168 VA (.96 PF) #3 Loaded w/o RAID 283W 289 VA (.98 PF) #3 Idle w/ RAID 375W (stays at .98) #3 Loaded w/ RAID 510W (stays at .98) #3 Loaded w/RAID/bonnie 534W (stays at .98) For the load, I used two processes of burnP6, part of cpuburn at http://users.ev1.net/~redelm/ For a load breakdown by load type for system 1: 1 process 2 processes burnP5 65W burnP6 72WA 103W (exactly 30W per CPU over idle) burnMMX 64W burnK6 69W burnK7 67W burnBX 87W 90W stream 84W 85W The stream and burnBX memory loaders use more power than a single CPU load program, but two at once and the CPU loaders use more power. To load system #3 with the disks on, I ran bonnie++ on all 16 drives. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Mar 10 00:48:45 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 00:48:45 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> Message-ID: > > I recently measured a Sunfire V20z (dual 2.2 GHz) > > opteron, I believe it had 2 scsi disks, 4 GB ram. > > > > watts VA > > Idle 237-249 260-281 > > Pstream 1 thread 260-277 290-311 > > Pstream 2 threads 265-280 303-313 that's about right. my dual 240's peak at about 250 running two copies of stream and one bonnie (2GB, 40G 7200rpm IDE). > But that is with the disks, RAM, and other hardware > you have. nothing else counts for much. for instance, dimms are a couple watts apiece (makes you wonder about the heatspreaders that gamers/overclockers love so much), nics and disks are ~10W, etc. > Anyone with similar configurations but have > P4s instead? iirc my dual xeon/2.4's peak at around 190W (1-2GB, otherwise same). > It just looks too good to believe the numbers... > consider that the similar performance one IA64 > processor ALONE draws over 120W. hey, to marketing planners, massive power dissipation is probably a *good* thing. 
serious "enterprise" computers must have an impressive dissipation to set them apart from those piddly little game/surfing boxes ;) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From burcu at ulakbim.gov.tr Wed Mar 10 02:30:47 2004 From: burcu at ulakbim.gov.tr (Burcu Akcan) Date: Wed, 10 Mar 2004 09:30:47 +0200 Subject: [Beowulf] SPBS problem Message-ID: <404EC427.7070200@ulakbim.gov.tr> Hi, We have built a beowulf Debian cluster that contains 128 PIV nodes and one dual xeon server. I need some help about SPBS (Storm). We have already installed SPBS on the server and nodes and all daemons seem to work regularly. When any job is given to the system by using pbs scripting, the job can be seen on defined queue by running status and related nodes are allocated for the job. On the other hand there is no cpu or memory consumption on the nodes, the job does not run exactly and at the end of estimated cpu time there is no output file. Can anyone give some advice on my problem about SPBS. Thank you... Burcu Akcan ULAKBIM High Performance Computing Center _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Wed Mar 10 09:56:49 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Wed, 10 Mar 2004 06:56:49 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> Message-ID: On Wed, 10 Mar 2004, [big5] Andrew Wang wrote: > --- Bill Broadley > > I recently measured a Sunfire V20z (dual 2.2 GHz) > > opteron, I believe it had 2 scsi disks, 4 GB ram. > > > > watts VA > > Idle 237-249 260-281 > > Pstream 1 thread 260-277 290-311 > > Pstream 2 threads 265-280 303-313 > > It just looks too good to believe the numbers... > consider that the similar performance one IA64 > processor ALONE draws over 120W. You also have to consider that the typical computer power supply is only around 60% to 80% efficient. If the CPU draws 120W, then that's going to be something like 150 to 200 watts measured with a power meter, and really, that's what matters. It makes no difference to the AC and circuit breakers if the power is dissipated in the CPU or in the power supply. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Mar 10 10:14:18 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 10 Mar 2004 07:14:18 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: <20040310151418.51414.qmail@web11413.mail.yahoo.com> But the Itanium 2 is using so much energy that Intel couldn't rise the frequency... or else the machine would melt :( See the online lecture: "Things CPU Architects Need To Think About" http://www.stanford.edu/class/ee380/ BTW, that guy used to work for Intel, and he also mentioned about the compiler guys tuned the IA-64 compiler for the benchmarks... Rayson > hey, to marketing planners, massive power dissipation is > probably a *good* thing. 
serious "enterprise" computers must > have an impressive dissipation to set them apart from those > piddly little game/surfing boxes ;) > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Mar 10 10:13:58 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 10 Mar 2004 07:13:58 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: <20040310151358.43826.qmail@web11407.mail.yahoo.com> But the Itanium 2 is using so much energy that Intel couldn't rise the frequency... or else the machine would melt :( See the online lecture: "Things CPU Architects Need To Think About" http://www.stanford.edu/class/ee380/ BTW, that guy used to work for Intel, and he also mentioned about the compiler guys tuned the IA-64 compiler for the benchmarks... Rayson > hey, to marketing planners, massive power dissipation is > probably a *good* thing. serious "enterprise" computers must > have an impressive dissipation to set them apart from those > piddly little game/surfing boxes ;) > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgoornaden at intnet.mu Wed Mar 10 11:42:39 2004 From: rgoornaden at intnet.mu (roudy) Date: Wed, 10 Mar 2004 20:42:39 +0400 Subject: [Beowulf] Writing a parallel program References: <200403101448.i2AEmIA22804@NewBlue.scyld.com> Message-ID: <003701c406bf$085f25b0$590b7bca@roudy> Hello everybody, I completed to build my beowulf cluster. Now I am writing a parallel program using MPICH2. Can someone give me a help. Because, the program that I wrote take more time to run on several nodes compare when it is run on one node. If there is a small program that someone can send me about distributing data among nodes, then each node process the data, and the information is sent back to the master node for printing. This will be a real help for me. Thanks Roud _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Mar 10 12:28:54 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 10 Mar 2004 12:28:54 -0500 (EST) Subject: [Beowulf] Writing a parallel program In-Reply-To: <003701c406bf$085f25b0$590b7bca@roudy> Message-ID: On Wed, 10 Mar 2004, roudy wrote: > Hello everybody, > I completed to build my beowulf cluster. Now I am writing a parallel program > using MPICH2. Can someone give me a help. 
Because, the program that I wrote > take more time to run on several nodes compare when it is run on one node. > If there is a small program that someone can send me about distributing data > among nodes, then each node process the data, and the information is sent > back to the master node for printing. This will be a real help for me. > Thanks > Roud I can't help you much with MPI but I can help you understand the problems you might encounter with ANY message passing system or library in terms of parallel task scaling. There is a ready-to-run PVM program I just posted in tarball form on my personal website that will be featured in the May issue of Cluster World Magazine. http:www.phy.duke.edu/~rgb/General/random_pvm.php It is designed to give you direct control over the most important parameters that affect task scaling so that you can learn just how it works. The task itself consists of a "master" program and a "slave" program. The master parses several parameters from the command line: -n number of slaves -d delay (to vary the amount of simulated work per communication) -r number of rands (to vary the number of communications per run and work burdent per slave) -b a flag to control whether the slaves send back EACH number as it is generated (lots of small messags) or "bundles" all the numbers they generate into a single message. This makes a visible, rather huge difference in task scaling, as it should. The task itself is trivial -- generating random numbers. The master starts by computing a trivial task partitioning among the n nodes. It spawns n slave tasks, sending each one the delay on the command line. It then sends each slave the number of rands to generate and a trivially unique seed as messages. Each slave generates a rand, waits delay (in nanoseconds, with a high-precision polling loop), and either sends it back as a message immediately (the default) or saves it in a large vector until the task is finished and sends the whole buffer as a single message (if the -b flag was set). This serves two valuable purposes for the novice. First, it gives you a ready-to-build working master/slave program to use as a template for a pretty much any problem for which the paradigm is a good fit. Second, by simply playing with it, you can learn LOTS of things about parallel programs and clusters. If delay is small (order of the packet latency, 100 usec or less) the program is in a latency dominated scaling regime where communications per number actually takes longer than generating the numbers and its parallel scaling is lousy (if slowing a task down relative to serial can be called merely lousy). If delay is large, so that it takes a long time to compute and a short time to send back the results, parallel scaling is excellent with near linear speedup. Turning on the -b flag for certain ranges of the delay can "instantly" shift one from latency bounded to bandwidth bounded parallel scaling regimes, and restore decent scaling. Even if you don't use it because it is based on PVM, if you clone it for MPI you'll learn the same lessons there, as they are universal and part of the theoretical basis for understanding parallel scaling. Eventually I'll do an MPI version myself for the column, but the mag HAS an MPI column and my focus would be more for the novice learning about parallel computing in general. BTW, obviously I think that subscribing to CWM is a good idea for novices. Among its many other virtues (such as articles by lots of the luminaries of this vary list:-), you can read my columns. 
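(For the MPI side of the original question, the same master/slave shape can be sketched in a few dozen lines. This is purely illustrative, NOT random_pvm or a port of it; the rand count, message tags and the single bundled reply are assumptions chosen just to show the pattern:

/* mpi_randslave.c -- illustrative MPI master/worker skeleton, loosely
 * analogous to (but not the same program as) the PVM example above.
 * Build: mpicc -O2 -o mpi_randslave mpi_randslave.c
 * Run:   mpirun -np 5 ./mpi_randslave
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NRANDS 100000                     /* rands generated per worker */

int main(int argc, char **argv)
{
    int rank, size, i;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                      /* master: seed workers, collect */
        double *buf = malloc(NRANDS * sizeof(double));
        int src, seed;
        MPI_Status st;
        for (src = 1; src < size; src++) {
            seed = 12345 + src;           /* trivially unique seed per worker */
            MPI_Send(&seed, 1, MPI_INT, src, 0, MPI_COMM_WORLD);
        }
        for (src = 1; src < size; src++) {
            MPI_Recv(buf, NRANDS, MPI_DOUBLE, MPI_ANY_SOURCE, 1,
                     MPI_COMM_WORLD, &st);
            printf("worker %d: first rand %f\n", st.MPI_SOURCE, buf[0]);
        }
        free(buf);
    } else {                              /* worker: compute, reply once */
        int seed;
        double *buf = malloc(NRANDS * sizeof(double));
        MPI_Status st;
        MPI_Recv(&seed, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &st);
        srand(seed);
        for (i = 0; i < NRANDS; i++)
            buf[i] = rand() / (double)RAND_MAX;
        /* one big bundled send amortizes the per-message latency -- the
         * "-b" lesson described above */
        MPI_Send(buf, NRANDS, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD);
        free(buf);
    }
    MPI_Finalize();
    return 0;
}

Timing the bundled reply against a loop of NRANDS one-double sends reproduces the latency-bounded versus bandwidth-bounded regimes discussed above.)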
In fact, from what I've seen from the first few issues, ALL the columns are pretty damn good and getting back issues to the beginning wouldn't hurt, if it is still possible. If you (or anybody) DO grab random_pvm and give it a try, please send me feedback, preferrably before the actual column comes out in May, so that I can fix it before then. It is moderately well documented in the tarball, but of course there is more "documentation" and explanation in the column itself. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From hahn at physics.mcmaster.ca Wed Mar 10 12:07:10 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 12:07:10 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310151358.43826.qmail@web11407.mail.yahoo.com> Message-ID: > See the online lecture: "Things CPU Architects Need To Think About" > http://www.stanford.edu/class/ee380/ does anyone have a lead on an open-source player for these .asx files? or at least something not tied to windows? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sp at scali.com Wed Mar 10 13:41:59 2004 From: sp at scali.com (Steffen Persvold) Date: Wed, 10 Mar 2004 19:41:59 +0100 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <404F6177.8050108@scali.com> Mark Hahn wrote: >>See the online lecture: "Things CPU Architects Need To Think About" >>http://www.stanford.edu/class/ee380/ > > > does anyone have a lead on an open-source player for these .asx files? > or at least something not tied to windows? > The .asx file is just a link to a .wmv (Windows Media) file, which again just contains a streaming media reference.
I haven't tried, but I think you could use mplayer to play them : http://www.mplayerhq.hu Best regards, Steffen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Mar 10 16:11:07 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 16:11:07 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <404F7BE0.6040900@nada.kth.se> Message-ID: > Seems to be running fine with xine. wow, you're right! thanks... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Mar 10 18:56:06 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 10 Mar 2004 18:56:06 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Wed, 10 Mar 2004, Mark Hahn wrote: > > Seems to be running fine with xine. > > wow, you're right! thanks... (sorry to jump back on the thread this way, but it is easier than scrolling back through mail to find the original:-) I went downstairs again today and really paid attention to the kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 (I don't know why but they are running three jobs instead of two at the moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over 120 V line voltage). This seems lower than a lot of the other numbers being reported (although it is a bit higher than my memory recalled yesterday -- I TOLD you not to trust me:-). It is still considerably better than a dual Athlon at much higher clock as well. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ddw at dreamscape.com Wed Mar 10 20:36:13 2004 From: ddw at dreamscape.com (Daniel Williams) Date: Wed, 10 Mar 2004 20:36:13 -0500 Subject: [Beowulf] Cluster school project References: <200403101446.i2AEknA22660@NewBlue.scyld.com> Message-ID: <404FC28A.7607EF77@dreamscape.com> > From: "Maikel Punie" > Subject: RE: [Beowulf] Cluster school project > Date: Tue, 9 Mar 2004 18:45:47 +0100 > [snip...] > >>Do you mean a computing/programming project could you do, >>like calculating pi to some large number of digits? > >yeah something like that, i realy have no idea what is possible. >if there are any suggestions, they are always welcome. Here's what I want to do once I get enough junk 500mhz machines together: Make a model of the spread of genetic diseases in a population of a few hundred million. I've been wanting to do that for years, but it would probably take a few months to run on any single machine I own. I figure it should run in a few weeks as soon as I get a 16 node cluster together to run it. Is that something you could maybe use? 
DDW _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Thu Mar 11 04:56:24 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Thu, 11 Mar 2004 09:56:24 +0000 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> On Wednesday 10 March 2004 11:56 pm, Robert G. Brown wrote: > > I went downstairs again today and really paid attention to the > kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 > (I don't know why but they are running three jobs instead of two at the > moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over > 120 V line voltage). > > This seems lower than a lot of the other numbers being reported > (although it is a bit higher than my memory recalled yesterday -- I TOLD > you not to trust me:-). It is still considerably better than a dual > Athlon at much higher clock as well. > > rgb I find you numbers a bit surprising still As part of our latest procurement I looked up the power consumption in the INTEL/AMD documention for the various processors under consideration: Athlon model 6 2200MP 58.9 W model 8 2400MP 54.5 W model 11 2800MP (Barton) 47.2 W Opteron 240-244 82.1 W 246-248 89.0 W Xeon 2.8 GHz 77 W (512K Cache) 3.06 GHz 87 W I think these numbers are meant to be maximum? -- ------------------------------------------------------------------------------ | | | Dr. Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Thu Mar 11 07:47:57 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Thu, 11 Mar 2004 12:47:57 +0000 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <200403111247.i2BClv215026@heppcb.ph.qmw.ac.uk> On Thursday 11 March 2004 12:35 pm, Bogdan Costescu wrote: > On Thu, 11 Mar 2004, Alex Martin wrote: > > I find you numbers a bit surprising still > > I don't :-) I was suprised that rgb's opteron numbers were so low! > While I can't remember what was the exact figure for the dual Opteron > 246 (2 GHz) system, I'm sure that it was over 200W. > > > Athlon model 11 2800MP (Barton) 47.2 W > > dual Athlon 2800MP (2133MHz) under load from 2 cpuburn ~ 230W > > > Xeon (512K Cache) 3.06 GHz 87 W > > dual Xeon 3.06GHz under load from 2 cpuburn ~ 275W your system numbers are pretty consistent with what I've measured. ( ~230 W for Athlon 2200MP and ~250W for Xeon 2.8GHz ) -- ------------------------------------------------------------------------------ | | | Dr. 
Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bogdan.costescu at iwr.uni-heidelberg.de Thu Mar 11 07:35:30 2004 From: bogdan.costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Thu, 11 Mar 2004 13:35:30 +0100 (CET) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: On Thu, 11 Mar 2004, Alex Martin wrote: > I find you numbers a bit surprising still I don't :-) While I can't remember what was the exact figure for the dual Opteron 246 (2 GHz) system, I'm sure that it was over 200W. > Athlon model 11 2800MP (Barton) 47.2 W dual Athlon 2800MP (2133MHz) under load from 2 cpuburn ~ 230W > Xeon (512K Cache) 3.06 GHz 87 W dual Xeon 3.06GHz under load from 2 cpuburn ~ 275W -- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Mar 11 08:39:02 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 11 Mar 2004 08:39:02 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: On Thu, 11 Mar 2004, Alex Martin wrote: > I find you numbers a bit surprising still As part of our latest procurement > I looked up the power consumption in the INTEL/AMD documention for the > various processors under consideration: ... > Opteron 240-244 82.1 W > 246-248 89.0 W > I think these numbers are meant to be maximum? You've got me -- dunno. I can post a digital photo of the kill-a-watt reading if you like (I was going to take a camera down there anyway to add a new rack photo to the brahma tour). I can also take the kill-a-watt and plug in an electric light bulb or something with a fairly predictable draw and see if it is broken somehow. Right now a system in production work is plugged into it -- I'll try to retrieve it soon and plug one of my new systems into it so that I can run more detailed tests under more controlled loads. I don't know exactly what kind of work is being done in the current jobs being run. One advantage may be that the cases are apparently equipped with a PFC power supply. The power factor appears to be very good -- close to 1. This may make the power supplies themselves run cooler, so that the power draw of the rest of the system IS only 20 or so more watts. The systems also have a bare minimum of peripherals -- a hard disk (sitting idle), onboard dual gig NICs (one idle) and video (idle). Will post newer/better tests as I have time and make them, although others may beat me to it...;-) rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu Mar 11 11:10:16 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 11 Mar 2004 08:10:16 -0800 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: <5.2.0.9.2.20040311080304.017d8008@mailhost4.jpl.nasa.gov> At 08:39 AM 3/11/2004 -0500, Robert G. Brown wrote: >On Thu, 11 Mar 2004, Alex Martin wrote: > > > I find you numbers a bit surprising still As part of our latest > procurement > > I looked up the power consumption in the INTEL/AMD documention for the > > various processors under consideration: >... > > Opteron 240-244 82.1 W > > 246-248 89.0 W > > I think these numbers are meant to be maximum? > >You've got me -- dunno. I can post a digital photo of the kill-a-watt >reading if you like (I was going to take a camera down there anyway to >add a new rack photo to the brahma tour). I can also take the >kill-a-watt and plug in an electric light bulb or something with a >fairly predictable draw and see if it is broken somehow. > >Right now a system in production work is plugged into it -- I'll try to >retrieve it soon and plug one of my new systems into it so that I can >run more detailed tests under more controlled loads. I don't know >exactly what kind of work is being done in the current jobs being run. > >One advantage may be that the cases are apparently equipped with a PFC >power supply. The power factor appears to be very good -- close to 1. >This may make the power supplies themselves run cooler, so that the >power draw of the rest of the system IS only 20 or so more watts. The >systems also have a bare minimum of peripherals -- a hard disk (sitting >idle), onboard dual gig NICs (one idle) and video (idle). Those power supplies are impressive PFC wise.. I'd venture to say, though, that the rated powers are peak over some fairly short time. The Kill-A-Watt averages over some reasonable time (a second or two?), so you could actually have an average that's half the peak. Everytime there's a pipeline stall, or a cache miss, etc, the current's going to change. We used processor current to debug DSP code, because you could actually see interrupts come in during the other steps(FFT = very high power, sudden drop for a few microseconds while ISR is running). You could also accurately time how long each "pass" in the FFT took, since the CPU power dropped while setting up the parameters for the next set of butterflies. To really track this kind of thing down, you'd want to hook a DC current probe around the wires from the Power supply to the motherboard. Then, write some benchmark program with a fairly repeatable computational resource requirement pattern. Look at the current on an oscilloscope. I suspect that onboard filtering will get rid of variations that last less than, say, 1-10 mSec, so a program that has a basic cyclical nature lasting 10 times that would be nice. Ideally, you'd probe the current going to the CPU, vs the rest of the mobo, but that's probably a bit of a challenge. Another experiment would be to write a small program that you KNOW will stay in cache and never go off chip and measure the current draw when running it. James Lux, P.E. 
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tegner at nada.kth.se Wed Mar 10 15:34:40 2004 From: tegner at nada.kth.se (Jon Tegner) Date: Wed, 10 Mar 2004 21:34:40 +0100 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <404F7BE0.6040900@nada.kth.se> Seems to be running fine with xine. /jon Mark Hahn wrote: >>See the online lecture: "Things CPU Architects Need To Think About" >>http://www.stanford.edu/class/ee380/ >> >> > >does anyone have a lead on an open-source player for these .asx files? >or at least something not tied to windows? > > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jimlux at earthlink.net Thu Mar 11 09:07:09 2004 From: jimlux at earthlink.net (Jim Lux) Date: Thu, 11 Mar 2004 06:07:09 -0800 Subject: [Beowulf] Power consumption for opterons? References: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: <000e01c40772$2611bf60$36a8a8c0@LAPTOP152422> ----- Original Message ----- From: "Alex Martin" To: "Robert G. Brown" ; "Mark Hahn" Cc: "Jon Tegner" ; Sent: Thursday, March 11, 2004 1:56 AM Subject: Re: [Beowulf] Power consumption for opterons? > On Wednesday 10 March 2004 11:56 pm, Robert G. Brown wrote: > > > > > I went downstairs again today and really paid attention to the > > kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 > > (I don't know why but they are running three jobs instead of two at the > > moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over > > 120 V line voltage). > > I find you numbers a bit surprising still As part of our latest procurement > I looked up the power consumption in the INTEL/AMD documention for the > various processors under consideration: > surprising high or surprising low? You're comparing DC power to just the processor vs wall plug power to the whole system (including cooling fans, RAM, PCI bridge chips, etc.) I think that the databook numbers of ca 50-80 W per CPU (probably the highest continuous average power) is nicely matched with 180 W from the wall for a dual CPU... The databook number is probably a bit on the high side... 180W from the wall probably equates to about 140W DC. 
There's probably 10W or so in fans and glue, maybe 100W for both processors, and 30W for the rest of the logic and RAM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mathiasbrito at yahoo.com.br Fri Mar 12 08:51:22 2004 From: mathiasbrito at yahoo.com.br (=?iso-8859-1?q?Mathias=20Brito?=) Date: Fri, 12 Mar 2004 10:51:22 -0300 (ART) Subject: [Beowulf] Strange Behavior Message-ID: <20040312135122.92643.qmail@web12208.mail.yahoo.com> Hi, I'm benchmarking my 16-node cluster with HPL and I get a strange result, different from anything I have seen before. When I run a larger problem with a big N, the performance is worse than with small values of N. I used N=5000 with NB=20 and the performance was 3.3 Gflops; when I run N=10000 with NB=20 I get only 2.1 Gflops. I don't like the result; the nodes are Athlon XP 1600+ with 512MB RAM, and I think the cluster is very slow. Has someone had the same problem and could help me? Mathias ===== Mathias Brito Universidade Estadual de Santa Cruz - UESC Departamento de Ciências Exatas e Tecnológicas Estudante do Curso de Ciência da Computação ______________________________________________________________________ Yahoo! Mail - O melhor e-mail do Brasil! Abra sua conta agora: http://br.yahoo.com/info/mail.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Fri Mar 12 11:43:47 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: Fri, 12 Mar 2004 16:43:47 +0000 Subject: [Beowulf] Strange Behavior In-Reply-To: <20040312135122.92643.qmail@web12208.mail.yahoo.com> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> Message-ID: <1079109827.3745.7.camel@tp1.mesh-hq> On Fri, 2004-03-12 at 13:51, Mathias Brito wrote: > I'm benchmarking my 16-node cluster with HPL and I > get a strange result, different from anything I have seen > before. When I run a larger problem with a big N, the > performance is worse than with small values of N. I > used N=5000 with NB=20 and the performance was 3.3 Gflops; > when I run N=10000 with NB=20 I get only 2.1 Gflops. I > don't like the result; the nodes are Athlon XP 1600+ > with 512MB RAM, and I think the cluster is very slow. > Has someone had the same problem and could help me? Please correct me, anybody, if I'm wrong: it seems to me that the best results are achieved with approximately 85-90% memory utilization (leaving something for the rest of the system). (0.85*16*512*1024*1024/8)^0.5 ~= 30200, so that would be close to the best N value. Isn't NB=20 very low? I currently use around 145 for P4 CPUs. What performance do you get from a setup like the one above?
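The same rule of thumb is easy to keep as a tiny helper (a sketch only; the 0.85 is the 85% memory fraction assumed above, and the program name and arguments are made up):

/* hpl_n.c -- rough HPL problem size: use ~85% of total cluster RAM.
 * Build: gcc -O2 -o hpl_n hpl_n.c -lm
 * Use:   ./hpl_n <nodes> <MB of RAM per node>
 */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    double nodes, mb, bytes, n;

    if (argc != 3) {
        fprintf(stderr, "usage: %s nodes MB_per_node\n", argv[0]);
        return 1;
    }
    nodes = atof(argv[1]);
    mb    = atof(argv[2]);
    bytes = 0.85 * nodes * mb * 1024 * 1024;  /* memory available to HPL */
    n     = sqrt(bytes / 8.0);                /* N*N doubles, 8 bytes each */
    printf("suggested N ~ %.0f\n", n);
    return 0;
}

./hpl_n 16 512 prints an N of roughly 30200, the figure above.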
best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Graabroedrestraede 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From M.Arndt at science-computing.de Fri Mar 12 07:06:36 2004 From: M.Arndt at science-computing.de (Michael Arndt) Date: Fri, 12 Mar 2004 13:06:36 +0100 Subject: [Beowulf] Cluster Uplink via Wireless Message-ID: <20040312130636.D49119@blnsrv1.science-computing.de> Hello * has anyone done a wireless technology uplink to a compute cluster that is in real use ? If so, i would be interested to know how and how is the experinece in transferring "greater" (e.g. 2 GB ++ ) Result files? explanation: We have a cluster with gigabit interconnect where it would make life cheaper, if there is a possibility to upload input data and download output data via wireless link, since connecting twisted pair between WS and CLuster would be expensive. TIA Micha _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Fri Mar 12 17:22:58 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Fri, 12 Mar 2004 14:22:58 -0800 Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <20040312130636.D49119@blnsrv1.science-computing.de> Message-ID: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> At 01:06 PM 3/12/2004 +0100, Michael Arndt wrote: >Hello * > >has anyone done a wireless technology uplink to a compute cluster >that is in real use ? >If so, i would be interested to know how and how is the experinece in >transferring "greater" (e.g. 2 GB ++ ) Result files? > >explanation: >We have a cluster with gigabit interconnect >where it would make life cheaper, if there is a possibility to upload >input data and download output data via wireless link, since connecting >twisted pair between WS and CLuster would be expensive. > I have a very small cluster that is using wireless interconnect for everything, and based upon my early observations, I'd be real, real leery of contemplating transferring Gigabytes in any practical time. For instance, loading a 25 MB compressed ram file system using tftp during PXE boot takes about a minute. This is on a very non-optimized configuration using 802.11a, through a variety of devices. Yes, indeed, the ad literature claims 54 Mbps, but that's not the actual data rate, but more the "bit rate" of the over the air signal. Wireless LANs are NOT full duplex, and there are synchronization preambles, etc. that make the throughput much lower. On a standard "11 Mbps" 802.11b type network, the "real data throughput" in a unidirectional transfer is probably closer to 3-5 Mbps. Say you get that wireless link really humming at 20 Mbps real data rate. Transferring 16,000 Mbit is still going to take 10-15 minutes. Your situation might be a bit better, especially if you can use a point to point wireless link. James Lux, P.E. 
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Fri Mar 12 19:29:41 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Fri, 12 Mar 2004 16:29:41 -0800 Subject: [Beowulf] Cluster Uplink via Wireless References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> Message-ID: <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> At 06:04 PM 3/12/2004 -0500, Mark Hahn wrote: > > Say you get that wireless link really humming at 20 Mbps real data > > rate. Transferring 16,000 Mbit is still going to take 10-15 minutes. > >out of truely morbid curiosity, what's the latency like? I'll have some numbers next week. The configuration is sort of weird.. diskless node booting w/PXE D-Link Wireless AP in multi AP connect mode over the air D-Link wireless AP in multi AP connect mode network w/NFS and DHCP server The D-Link boxes try to be smart and not push packets across the air link that are for MACs they know are on the wired side, and that whole process is "tricky"... James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From clwang at csis.hku.hk Fri Mar 12 21:29:43 2004 From: clwang at csis.hku.hk (Cho Li Wang) Date: Sat, 13 Mar 2004 10:29:43 +0800 Subject: [Beowulf] NPC2004 CFP : Deadline Extended to March 22, 2004 Message-ID: <40527217.92D67387@csis.hku.hk> ******************************************************************* NPC2004 IFIP International Conference on Network and Parallel Computing October 18-20, 2004 Wuhan, China http://grid.hust.edu.cn/npc04 ------------------------------------------------------------------- Important Dates Paper Submission March 22, 2004 (extended) Author Notification May 1, 2004 Final Camera Ready Manuscript June 1, 2004 ******************************************************************* Call For Papers The goal of IFIP International Conference on Network and Parallel Computing (NPC 2004) is to establish an international forum for engineers and scientists to present their excellent ideas and experiences in all system fields of network and parallel computing. NPC 2004, hosted by the Huazhong University of Science and Technology, will be held in the city of Wuhan, China - the "Homeland of White Clouds and the Yellow Crane." Topics of interest include, but are not limited to: - Parallel & Distributed Architectures - Parallel & Distributed Applications/Algorithms - Parallel Programming Environments & Tools - Network & Interconnect Architecture - Network Security - Network Storage - Advanced Web and Proxy Services - Middleware Frameworks & Toolkits - Cluster and Grid Computing - Ubiquitous Computing - Peer-to-peer Computing - Multimedia Streaming Services - Performance Modeling & Evaluation Submitted papers may not have appeared in or be considered for another conference. Papers must be written in English and must be in PDF format. 
Detailed electronic submission instructions will be posted on the conference web site. The conference proceedings will be published by Springer Verlag in the Lecture Notes in Computer Science Series (cited by SCI). Best papers from NPC 2004 will be published in a special issue of International Journal of High Performance Computing and Networking (IJHPCN) after conference. ************************************************************************** Committee General Co-Chairs: H. J. Siegel Colorado State University, USA Guojie Li The Institute of Computing Technology, CAS, China Steering Committee Chair: Kemal Ebcioglu IBM T.J. Watson Research Center, USA Program Co-Chairs: Guangrong Gao University of Delaware, USA Zhiwei Xu Chinese Academy of Sciences, China Program Vice-Chairs: Victor K. Prasanna University of Southern California, USA Albert Y. Zomaya University of Sydney, Australia Hai Jin Huazhong University of Science and Technology, China Publicity Co-Chairs: Cho-Li Wang The University of Hong Kong, Hong Kong Chris Jesshope The University of Hull, UK Local Arrangement Chair: Song Wu Huazhong University of Science and Technology, China Steering Committee Members: Jack Dongarra University of Tennessee, USA Guangrong Gao University of Delaware, USA Jean-Luc Gaudiot University of California, Irvine, USA Guojie Li The Institute of Computing Technology, CAS, China Yoichi Muraoka Waseda University, Japan Daniel Reed University of North Carolina, USA Program Committee Members: Ishfaq Ahmad University of Texas at Arlington, USA Shoukat Ali University of Missouri-Rolla, USA Makoto Amamiya Kyushu University, Japan David Bader University of New Mexico, USA Luc Bouge IRISA/ENS Cachan, France Pascal Bouvry University of Luxembourg, Luxembourg Ralph Castain Los Alamos National Laboratory, USA Guoliang Chen University of Science and Technology of China, China Alain Darte CNRS, ENS-Lyon, France Chen Ding University of Rochester, USA Jianping Fan Institute of Computing Technology, CAS, China Xiaobing Feng Institute of Computing Technology, CAS, China Jean-Luc Gaudiot University of California, Irvine, USA Minyi Guo University of Aizu, Japan Mary Hall University of Southern California, USA Salim Hariri University of Arizona, USA Kai Hwang University of Southern California, USA Anura Jayasumana Colorado State Univeristy, USA Chris R. Jesshop The University of Hull, UK Ricky Kwok The University of Hong Kong, Hong Kong Francis Lau The University of Hong Kong, Hong Kong Chuang Lin Tsinghua University, China John Morrison University College Cork, Ireland Lionel Ni Hong Kong University of Science and Technology, Hong Kong Stephan Olariu Old Dominion University, USA Yi Pan Georgia State University, USA Depei Qian Xi'an Jiaotong University, China Daniel A. Reed University of North Carolina at Chapel Hill, USA Jose Rolim University of Geneva, Switzerland Arnold Rosenberg University of Massachusetts at Amherst, USA Sartaj Sahni University of Florida, USA Selvakennedy Selvadurai University of Sydney, Australia Franciszek Seredynski Polish Academy of Sciences, Poland Hong Shen Japan Advanced Institute of Science and Technology, Japan Xiaowei Shen IBM T. J. Watson Research Center, USA Gabby Silberman IBM Centers for Advanced Studies, USA Per Stenstrom Chalmers University of Technology, Sweden Ivan Stojmenovic University of Ottawa, Canada Ninghui Sun Institute of Computing Technology, CAS, China El-Ghazali Talbi University of Lille, France Domenico Talia University of Calabria, Italy Mitchell D. 
Theys University of Illinois at Chicago, USA Xinmin Tian Intel Corporation, USA Dean Tullsen University of California, San Diego, USA Cho-Li Wang The University of Hong Kong, Hong Kong Qing Yang University of Rhode Island, USA Yuanyuan Yang State University of New York at Stony Brook, USA Xiaodong Zhang College of William and Mary, USA Weimin Zheng Tsinghua University, China Bingbing Zhou University of Sydney, Australia Chuanqi Zhu Fudan University, China ------------------------------------------------------------------------ For more information, please contact the program vice-chair at the address below: Dr. Hai Jin, Professor Director, Cluster and Grid Computing Lab Vice-Dean, School of Computer Huazhong University of Science and Technology Wuhan, 430074, China Tel: +86-27-87543529 Fax: +86-27-87557354 e-fax: +1-425-920-8937 e-mail: hjin at hust.edu.cn _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mayank_kaushik at vsnl.net Sat Mar 13 05:24:13 2004 From: mayank_kaushik at vsnl.net (mayank_kaushik at vsnl.net) Date: Sat, 13 Mar 2004 15:24:13 +0500 Subject: [Beowulf] Benchmarking with PVM Message-ID: <74070c77404a91.7404a9174070c7@vsnl.net> hi everyone, first of all, I'd like to thank Robert G. Brown for his help in solving my PVM problem and getting my cluster running! Now that it's running, I've been trying to run tests on it to see how fast it really is, so I ran PVMPOV, and the results were pretty impressive: I had two P4s clustered, and the rendering time was reduced by half. That may sound trivial to you guys, but to a first-timer like me, it looks great! :-) Okay, so here's the deal: we've got lots of idle computers in the college computer lab, an eclectic mix of P2 350s and P3 733s, which everyone has abandoned in favour of flashy new Compaq Evo P4 2.4 GHz machines, so along comes me the evangelist and turns all the outcasts into cluster nodes (we've got a gigabit LAN too). Now I'd like to run benchmarking tests on the cluster so as to outline the increase in performance as individual nodes are added, and also the increase in the load on the network. Are there tools available that would let me do all this and, say, get graphs etc. too? Tools that are compatible with PVM? Could anyone provide links to places where they can be downloaded? (I'm running Red Hat 9.0 on all systems.) Thanks in anticipation, Mayank PS. Those proud Compaq Evos are giving me trouble: they've got WinXP with an NTFS filesystem, and I'm trying to use Partition Magic to make a partition so that I can set up a dual-boot system and install Linux, but Partition Magic always exits with an error on all the systems. fips won't work with NTFS. Has anyone ever done this? The quick-restore CD says it would remove all partitions and make just one NTFS partition, so I didn't try that.
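Back on the benchmarking question: one low-tech way to get time-versus-nodes numbers for a plot is to time a fixed workload from a PVM master and rerun it with 1, 2, 3, ... spawned workers, feeding the printed times to gnuplot. The following is only a sketch under stated assumptions: it presumes a separate worker binary (here called "worker", a hypothetical name) that unpacks one int of work with message tag 1, computes, and sends back one double with tag 2.

/* timing_master.c - minimal PVM timing master (a sketch, not a tool).
 * Assumes a separate PVM worker binary named "worker" (hypothetical):
 * it should unpack one int (tag 1), do a fixed amount of work, and
 * send back one double (tag 2).
 * Typical build: cc timing_master.c -o timing_master -lpvm3
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include "pvm3.h"

#define MAXWORKERS 64

int main(int argc, char **argv)
{
    int nworkers = (argc > 1) ? atoi(argv[1]) : 2;
    int tids[MAXWORKERS];
    int work = 1000000;                 /* arbitrary fixed problem size */
    double result, total = 0.0;
    struct timeval t0, t1;
    int i, spawned;

    if (nworkers > MAXWORKERS) nworkers = MAXWORKERS;
    pvm_mytid();                        /* enrol this task in PVM */
    gettimeofday(&t0, NULL);

    spawned = pvm_spawn("worker", (char **)0, PvmTaskDefault, "",
                        nworkers, tids);
    if (spawned < nworkers) {
        fprintf(stderr, "only spawned %d of %d workers\n", spawned, nworkers);
        pvm_exit();
        return 1;
    }
    for (i = 0; i < nworkers; i++) {    /* hand out the work */
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&work, 1, 1);
        pvm_send(tids[i], 1);
    }
    for (i = 0; i < nworkers; i++) {    /* collect one result per worker */
        pvm_recv(-1, 2);
        pvm_upkdouble(&result, 1, 1);
        total += result;
    }
    gettimeofday(&t1, NULL);
    printf("%d workers: %.3f s (checksum %g)\n", nworkers,
           (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6, total);
    pvm_exit();
    return 0;
}

The same wall-clock-and-rerun approach works around PVMPOV itself, if timing the real application is preferred over a synthetic workload.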
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Sat Mar 13 17:00:40 2004 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Sat, 13 Mar 2004 22:00:40 +0000 Subject: [Beowulf] Strange Behavior In-Reply-To: <1079109827.3745.7.camel@tp1.mesh-hq> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> <1079109827.3745.7.camel@tp1.mesh-hq> Message-ID: <200403132200.40877.daniel.kidger@quadrics.com> On Friday 12 March 2004 4:43 pm, Lars Henriksen wrote: > On Fri, 2004-03-12 at 13:51, Mathias Brito wrote: > > I'm benchmarking my 16 nodes cluster with HPL and I > > obtain a estrange result, different of all I ever seen > > before. When I send more data with a big N, the > > performance is worse than with small values of N. I > > used N=5000 with NB=20 and the performance was 3.3GB, > > when I send N=10000 with NB=20 i get only 2.1GB. I > > don't liked the result, the nodes are athlon xp 1600+ > > with 512MB RAM, and I think the cluster very slow. > > Someone had the same problem and could help me? > > Please correct me anybody, if im wrong: > It seems to me, that the best results are acheived with approx 85-90% > memory utilization (leaving something to the rest of the system). > > (16*512*1024*1024/8)^0.5 ~= 30200, that would close to the best N value Your target should be say 75% of theoretical peak performance 0.75 * 16nodes * 1 cpupernode * 1.4Ghz * 1 floppertick = 16.8 Gflops/s So figures like '3.1' Gflops/s (14% peak) are much lower than what you should be achieving (Only vendors like IBM post figures on the top500 with %peak figures as low as this (Nov2003) ) Linpack figures are dominated by the choice of maths library - you do not say which one you are using (MKL, libgoto, Atlas, ACML) ? > isn't Nb=20 very low? I currently use arround 145 for P4 cpu's Remember choice of NB depends on which maths library you use rather than simply on the platform - but in general the best values lie between 80 to 256; 20x20 is far too small for a matrix multiply. Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com -------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ratscus at hotmail.com Sat Mar 13 20:55:17 2004 From: ratscus at hotmail.com (Joe Manning) Date: Sat, 13 Mar 2004 18:55:17 -0700 Subject: [Beowulf] project Message-ID: Does anyone know of a good non-profit that posts data to be processed? Kind of like how SETI dispenses its data, but for cancer or something? I have a whole school to my disposal and am just going to run a diskless system pushed down from a server. I can't really do much about the network, but will use it as a working model for some personal curiosities. (hopefully I will be able to contribute to this group at some point) Also, if anyone does know of a good place to get this type of data, can they please point me in the right direction of the type of process said sight uses, so I can decide what version I want to use to implement the process. 
Thanks, Joe Manning _________________________________________________________________ Get a FREE online computer virus scan from McAfee when you click here. http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From patrick at myri.com Sat Mar 13 21:57:40 2004 From: patrick at myri.com (Patrick Geoffray) Date: Sat, 13 Mar 2004 21:57:40 -0500 Subject: [Beowulf] Strange Behavior In-Reply-To: <200403132200.40877.daniel.kidger@quadrics.com> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> <1079109827.3745.7.camel@tp1.mesh-hq> <200403132200.40877.daniel.kidger@quadrics.com> Message-ID: <4053CA24.1020901@myri.com> Hi Dan. Dan Kidger wrote: > Your target should be say 75% of theoretical peak performance He is likely using IP over Ethernet, so 50% would be a more reasonable expectation. > So figures like '3.1' Gflops/s (14% peak) are much lower than what you should > be achieving (Only vendors like IBM post figures on the top500 with %peak > figures as low as this (Nov2003) ) Which ones ? Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From unix_no_win at yahoo.com Sun Mar 14 11:49:17 2004 From: unix_no_win at yahoo.com (unix_no_win) Date: Sun, 14 Mar 2004 08:49:17 -0800 (PST) Subject: [Beowulf] project In-Reply-To: Message-ID: <20040314164917.45310.qmail@web40412.mail.yahoo.com> You might want to check out: www.distributedfolding.org --- Joe Manning wrote: > Does anyone know of a good non-profit that posts > data to be processed? Kind > of like how SETI dispenses its data, but for cancer > or something? I have a > whole school to my disposal and am just going to run > a diskless system > pushed down from a server. I can't really do much > about the network, but > will use it as a working model for some personal > curiosities. (hopefully I > will be able to contribute to this group at some > point) Also, if anyone > does know of a good place to get this type of data, > can they please point me > in the right direction of the type of process said > sight uses, so I can > decide what version I want to use to implement the > process. > > Thanks, > > > Joe Manning > > _________________________________________________________________ > Get a FREE online computer virus scan from McAfee > when you click here. > http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963 > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! 
Mail - More reliable, more storage, less spam http://mail.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From peter at cs.usfca.edu Sun Mar 14 13:57:22 2004 From: peter at cs.usfca.edu (Peter Pacheco) Date: Sun, 14 Mar 2004 10:57:22 -0800 Subject: [Beowulf] Flashmob Supercomputer Message-ID: <20040314185722.GB14301@cs.usfca.edu> The University of San Francisco is sponsoring the first FlashMob Supercomputer on - Saturday, April 3, from 8 am to 6 pm, in the - Koret Center of the University of San Francisco. We're planning to network 1200-1400 laptops with Myrinet and Foundry Switches. We'll be running High-Performance Linpack, and we're hoping to achieve 600 GFLOPS, which is faster than some of the Top500 fastest supercomputers. We need volunteers to - Bring their laptops: Pentium III or IV or AMD, minimum requirements 1.3 GHz with 256 MBytes of RAM - Be table captains: help people set up laptops before running the benchmark - Speak on subjects related to high-performance computing For further information, please visit our website http://flashmobcomputing.org Peter Pacheco Department of Computer Science University of San Francisco San Francisco, CA 94117 peter at cs.usfca.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sun Mar 14 21:17:50 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Mon, 15 Mar 2004 10:17:50 +0800 (CST) Subject: [Beowulf] Oh MyGrid Message-ID: <20040315021750.49880.qmail@web16813.mail.tpe.yahoo.com> http://mygrid.sourceforge.net/ "MyGrid is designed with the modern concepts in mind, simple naming and transparent class hierarchy." It's targeting DataSynapse, licensed under GPL, and more features. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgoornaden at intnet.mu Sun Mar 14 22:12:55 2004 From: rgoornaden at intnet.mu (roudy) Date: Mon, 15 Mar 2004 07:12:55 +0400 Subject: [Beowulf] Re: Writing a parallel program Message-ID: <000701c40a3b$9a415e60$2b007bca@roudy> Hello, I don't know if it will be here that I can get a solution to my problem. Well, I have an array of elements and I would like to divide the array by the number of processors and then each processor process parts of the whole array. Below is the source code of how I am proceeding, can someone tell me what is wrong? Assume that the I have an array allval[tdegree] void share_data(void) { double nleft; int i, k, j, nmin; nmin = tdegree/size; /* Number of degrees to be handled by each processor */ nleft = tdegree%size; for(i=0;i References: <404EC427.7070200@ulakbim.gov.tr> Message-ID: <40556EA3.60400@ulakbim.gov.tr> Hi, We have built a beowulf Debian cluster that contains 128 PIV nodes and one dual xeon server. I need some help about SPBS (Storm). We have already installed SPBS on the server and nodes and all daemons seem to work regularly. 
When any job is given to the system by using pbs scripting, the job can be seen on defined queue by running status and related nodes are allocated for the job. On the other hand there is no cpu or memory consumption on the nodes, the job does not run exactly and at the end of estimated cpu time there is no output file. Can anyone give some advice on my problem about SPBS. Thank you... Burcu Akcan ULAKBIM High Performance Computing Center _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From klamman.gard at telia.com Mon Mar 15 13:42:43 2004 From: klamman.gard at telia.com (Per Lindstrom) Date: Mon, 15 Mar 2004 19:42:43 +0100 Subject: [Beowulf] MOSIX cluster Message-ID: <4055F923.70203@telia.com> Hi, I wonder if some of you have experience of MOSIX? (www.mosix.org) What do you think about that solution for FEA-simulations? Can MOSIX be regarded as a form of a Beowulf cluster? Best regards Per Lindstrom Per.Lindstrom at me.chalmers.se , klamman.gard at telia.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john4482 at umn.edu Mon Mar 15 15:02:40 2004 From: john4482 at umn.edu (Eric R Johnson) Date: Mon, 15 Mar 2004 14:02:40 -0600 Subject: [Beowulf] Scyld system mysteriously locks up Message-ID: <40560BE0.1090808@umn.edu> Hello, I purchased a 4 node, 8 processor Scyld (version 28) cluster approximately 6 months ago. About 5 days ago, it started mysteriously locking up on me. Once it is locked up, I can't do anything except physically reboot the machine. Unfortunately, I am rather new to Linux clusters and, since it worked "right out of the box", I have had no experience in troubleshooting. Can someone give me an idea of where I should start? I have the BIOS on all machines set to do a full memory check on startup and the /var/log/message file shows nothing. Thanks, Eric -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Mon Mar 15 16:23:34 2004 From: agrajag at dragaera.net (Jag) Date: Mon, 15 Mar 2004 16:23:34 -0500 Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <20040312130636.D49119@blnsrv1.science-computing.de> References: <20040312130636.D49119@blnsrv1.science-computing.de> Message-ID: <1079385814.4352.86.camel@pel> On Fri, 2004-03-12 at 07:06, Michael Arndt wrote: > Hello * > > has anyone done a wireless technology uplink to a compute cluster > that is in real use ? > If so, i would be interested to know how and how is the experinece in > transferring "greater" (e.g. 
2 GB ++ ) Result files? > > explanation: > We have a cluster with gigabit interconnect > where it would make life cheaper, if there is a possibility to upload > input data and download output data via wireless link, since connecting > twisted pair between WS and CLuster would be expensive. Depending on your setup, some kind of "wireless" besides 802.11[bg] may be worth considering. I'm assuming the expense in wiring the WS to the cluster isn't wire costs so much as where you'd have to put the cable. One thing you might consider is IR uplink. I don't remember what speed they get, but a few years back I saw a college use IR to get connectivity to a building, that otherwise would have required digging up a busy public street to wire. In the long run it was a lot cheaper. If your expense in wiring is something similar, you may want to look into IR or similar technologies. (The IR guns weren't cheap by any means, except when compared to digging up a city street) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mnerren at paracel.com Mon Mar 15 18:21:04 2004 From: mnerren at paracel.com (micah nerren) Date: Mon, 15 Mar 2004 15:21:04 -0800 Subject: [Beowulf] Scyld system mysteriously locks up In-Reply-To: <40560BE0.1090808@umn.edu> References: <40560BE0.1090808@umn.edu> Message-ID: <1079392863.27739.25.camel@angmar> On Mon, 2004-03-15 at 12:02, Eric R Johnson wrote: > Hello, > > I purchased a 4 node, 8 processor Scyld (version 28) cluster > approximately 6 months ago. About 5 days ago, it started mysteriously > locking up on me. Once it is locked up, I can't do anything except > physically reboot the machine. I would check heating issues. Has the ventilation changed, does the machine feel hot? How long between lockups? Micah _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Tue Mar 16 04:42:30 2004 From: john.hearns at clustervision.com (John Hearns) Date: Tue, 16 Mar 2004 10:42:30 +0100 (CET) Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <1079385814.4352.86.camel@pel> Message-ID: On Mon, 15 Mar 2004, Jag wrote: > be worth considering. I'm assuming the expense in wiring the WS to the > cluster isn't wire costs so much as where you'd have to put the cable. > One thing you might consider is IR uplink. I don't remember what speed > they get, but a few years back I saw a college use IR to get > connectivity to a building, that otherwise would have required digging > up a busy public street to wire. In the long run it was a lot cheaper. When I worked in Soho, we had a laser link over the rooftops of London. At the time a 155Mbps ATM link, which we later used for 100Mbps Ethernet. Main problem was cleaning the lenses every so often, in the lovely London air conditions. We later put in a gigabit laser from Nbase to another building. We needed much more bandwidth than 100Mbps in the end, and had our own trench dug and put in dark fibre. 
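To put numbers on the thread's main concern, the arithmetic in Jim Lux's earlier reply is easy to wrap in a few lines of C: pick an effective (not advertised) link rate and see how long a 2 GB result file takes. The rates below are illustrative assumptions, not measurements.

/* xfer_time.c - the transfer-time arithmetic from the wireless thread,
 * for a 2 GB result file at a few assumed *effective* link rates
 * (the rates are illustrative guesses, not measurements).
 */
#include <stdio.h>

int main(void)
{
    double file_gbytes = 2.0;
    double rate_mbps[] = { 4.0, 20.0, 54.0, 100.0, 1000.0 };
    const char *label[] = { "802.11b, realistic", "802.11a/g, optimistic",
                            "802.11a/g, nominal", "Fast Ethernet",
                            "Gigabit Ethernet" };
    int i;

    for (i = 0; i < 5; i++) {
        double secs = file_gbytes * 8.0 * 1024.0 / rate_mbps[i];  /* GB -> Mbit */
        printf("%-24s %7.0f Mbit/s: %7.0f s (%5.1f min)\n",
               label[i], rate_mbps[i], secs, secs / 60.0);
    }
    return 0;
}

At an optimistic 20 Mbit/s effective rate the 2 GB file comes out at roughly 13-14 minutes, consistent with the 10-15 minute estimate earlier in the thread.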
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Tue Mar 16 04:58:23 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Tue, 16 Mar 2004 17:58:23 +0800 (CST) Subject: [Beowulf] MOSIX cluster In-Reply-To: <4055F923.70203@telia.com> Message-ID: <20040316095823.57806.qmail@web16813.mail.tpe.yahoo.com> Since you know the number of tasks your simulations use, I think using a batch system would make it easier to management - MOSIX is usually for jobs which are very dynamic. You can take a look at the common batch systems such as SGE or SPBS. http://gridengine.sunsource.net http://www.supercluster.org/projects/torque/ Andrew. --- Per Lindstrom ????> Hi, > > I wonder if some of you have experience of MOSIX? > (www.mosix.org) > > What do you think about that solution for > FEA-simulations? > > Can MOSIX be regarded as a form of a Beowulf > cluster? > > Best regards > Per Lindstrom > > Per.Lindstrom at me.chalmers.se , > klamman.gard at telia.com > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From prml at na.chalmers.se Mon Mar 15 13:39:23 2004 From: prml at na.chalmers.se (Per R M Lindstrom) Date: Mon, 15 Mar 2004 19:39:23 +0100 (CET) Subject: [Beowulf] (no subject) Message-ID: Hi, I wonder if some of you have experience of MOSIX? (www.mosix.org) What do you think about that solution for FEA-simulations? Can MOSIX be regarded as a form of a Beowulf cluster? Best regards Per Lindstrom _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bioinformaticist at mn.rr.com Mon Mar 15 14:49:36 2004 From: bioinformaticist at mn.rr.com (Eric R Johnson) Date: Mon, 15 Mar 2004 13:49:36 -0600 Subject: [Beowulf] Scyld system mysteriously locks up Message-ID: <405608D0.60501@mn.rr.com> Hello, I purchased a 4 node, 8 processor Scyld (version 28) cluster approximately 6 months ago. About 5 days ago, it started mysteriously locking up on me. Once it is locked up, I can't do anything except physically reboot the machine. Unfortunately, I am rather new to Linux clusters and, since it worked "right out of the box", I have had no experience in troubleshooting. Can someone give me an idea of where I should start? I have the BIOS on all machines set to do a full memory check on startup and the /var/log/message file shows nothing. Thanks, Eric -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. 
of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From br66 at HPCL.CSE.MsState.Edu Mon Mar 15 18:09:37 2004 From: br66 at HPCL.CSE.MsState.Edu (Balaji Rangasamy) Date: Mon, 15 Mar 2004 17:09:37 -0600 (CST) Subject: [Beowulf] MPICH Exporting environment variables. Message-ID: Hi, Has anyone successfully exported any environment variables (specifically LD_PRELOAD) in MPICH? There is an easy way to do this with LAM/MPI; there is this -x switch in mpirun command that comes with LAM/MPI that will export the environment variable you specify to all the child processes. Is there any easy way to do this in MPICH? Thanks for your reply, Balaji. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tsariysk at craft-tech.com Tue Mar 16 14:12:46 2004 From: tsariysk at craft-tech.com (Ted Sariyski) Date: Tue, 16 Mar 2004 14:12:46 -0500 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> Message-ID: <405751AE.2040806@craft-tech.com> Hi, I am about to configure a 16 node dual xeon cluster based on Supermicro X5DPA-TGM motherboard. The cluster may grow so I am looking for a manageable, nonblocking 24 or 32 port gigabit switch. Any comments or recommendations will be highly appreciated. Thanks, Ted _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Tue Mar 16 13:49:02 2004 From: agrajag at dragaera.net (Sean Dilda) Date: Tue, 16 Mar 2004 13:49:02 -0500 Subject: [Beowulf] Scyld system mysteriously locks up In-Reply-To: <405608D0.60501@mn.rr.com> References: <405608D0.60501@mn.rr.com> Message-ID: <1079462942.4354.49.camel@pel> On Mon, 2004-03-15 at 14:49, Eric R Johnson wrote: > Hello, > > I purchased a 4 node, 8 processor Scyld (version 28) cluster > approximately 6 months ago. About 5 days ago, it started mysteriously > locking up on me. Once it is locked up, I can't do anything except > physically reboot the machine. > Unfortunately, I am rather new to Linux clusters and, since it worked > "right out of the box", I have had no experience in troubleshooting. > Can someone give me an idea of where I should start? > I have the BIOS on all machines set to do a full memory check on startup > and the /var/log/message file shows nothing. It might be useful to try to figure out what is locking up. Is it just the head node that's locking? Have you made any recent changes that might account for it? Or are you running any new programs that might be stressing the machine in a way it wasn't stressed before? If its completely locking (if you can no longer toggle the numlock light on your keyboard, then its completely locked), then its either a kernel hang, or a hardware issue. 
If the kernel is the same and the usage pattern hasn't changed, then it might be a hardware issue. Hardware can degrade over time and dying hardware can be unpredictable. You may also consider contacting Scyld, and possibly the hardware manufacturer for help diagnosing the problem. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From david.n.lombard at intel.com Tue Mar 16 16:19:12 2004 From: david.n.lombard at intel.com (Lombard, David N) Date: Tue, 16 Mar 2004 13:19:12 -0800 Subject: [Beowulf] MOSIX for FEA (was: no subject) Message-ID: <187D3A7CAB42A54DB61F1D05F0125722025F5662@orsmsx402.jf.intel.com> From: Per R M Lindstrom; Monday, March 15, 2004 10:39 AM > > Hi, > > I wonder if some of you have experience of MOSIX? (www.mosix.org) > > What do you think about that solution for FEA-simulations? As with all things, "it depends." More specifically, it depends on the characteristics of the FEA app. For the FEA app that I have intimate familiarity with, MOSIX would not work well at all. The reason is the app is highly sensitive to sustained memory bandwidth and sustained disk I/O bandwidth. While memory bandwidth is not an issue with MOSIX, disk I/O bandwidth will become an issue once MOSIX migrates a process to balance CPU load. The (local scratch) disk I/O will then be forced through both the current and original nodes, severely impacting the bandwidth. Having said that, I can imagine an in-memory FEA app that could work quite well on MOSIX. More specifically, the hypothetical app would read its data from disk, crunch for a while, and then write its results to disk. -- David N. Lombard My comments represent my opinions, not those of Intel Corporation _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gropp at mcs.anl.gov Tue Mar 16 15:14:58 2004 From: gropp at mcs.anl.gov (William Gropp) Date: Tue, 16 Mar 2004 14:14:58 -0600 Subject: [Beowulf] MPICH Exporting environment variables. In-Reply-To: References: Message-ID: <6.0.0.22.2.20040316141246.025e4f48@localhost> At 05:09 PM 3/15/2004, Balaji Rangasamy wrote: >Hi, >Has anyone successfully exported any environment variables (specifically >LD_PRELOAD) in MPICH? There is an easy way to do this with LAM/MPI; there >is this -x switch in mpirun command that comes with LAM/MPI that will >export the environment variable you specify to all the child processes. Is >there any easy way to do this in MPICH? It depends on the process manager/startup system that you are using with MPICH. With the "p4 secure server", environment variables can be exported. With the default ch_p4 device, environment variables are not exported. Under MPICH2, most process managers export the environment to the user processes. Bill _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From csamuel at vpac.org Tue Mar 16 22:09:44 2004 From: csamuel at vpac.org (Chris Samuel) Date: Wed, 17 Mar 2004 14:09:44 +1100 Subject: [Beowulf] cfengine users ? 
Message-ID: <200403171409.45273.csamuel@vpac.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi folks, Anyone out there using cfengine to manage clusters, or who's tried and failed? Just curious as to whether it's worth looking at.. cheers! Chris - -- Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin Victorian Partnership for Advanced Computing http://www.vpac.org/ Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQFAV8F4O2KABBYQAh8RAth7AJ9NkRhIUqcykX1zWGZyi/vZcB7JhwCgkVej uX5R/EcQrBPX+/Pyew55FC0= =tRe+ -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Wed Mar 17 05:09:28 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Wed, 17 Mar 2004 10:09:28 +0000 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <405751AE.2040806@craft-tech.com> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> <405751AE.2040806@craft-tech.com> Message-ID: <200403171009.i2HA9S314735@heppcb.ph.qmw.ac.uk> On Tuesday 16 March 2004 7:12 pm, Ted Sariyski wrote: > Hi, > I am about to configure a 16 node dual xeon cluster based on Supermicro > X5DPA-TGM motherboard. The cluster may grow so I am looking for a > manageable, nonblocking 24 or 32 port gigabit switch. Any comments or > recommendations will be highly appreciated. > Thanks, > Ted > You might want to look at the HP ProCurve 2824 or 2848 series. We choose the latter, because it means we only need one switch per (logical) rack and the cost/port is pretty low. I can't yet comment on performance. cheers, Alex -- ------------------------------------------------------------------------------ | | | Dr. Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bogdan.costescu at iwr.uni-heidelberg.de Wed Mar 17 07:17:06 2004 From: bogdan.costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed, 17 Mar 2004 13:17:06 +0100 (CET) Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <200403171009.i2HA9S314735@heppcb.ph.qmw.ac.uk> Message-ID: On Wed, 17 Mar 2004, Alex Martin wrote: > You might want to look at the HP ProCurve 2824 or 2848 series. We > choose the latter, because it means we only need one switch per > (logical) rack and the cost/port is pretty low. I can't yet comment > on performance. I'm interested in buying a 48 port Gigabit switch as well, and I was looking at the 2848 as it has the advantage of 48 ports in only 1U. One thing that is not clear from the descriptions that I find on the net is if it has support for Jumbo frames. Does the documentation that come with it mention something like this or, even better, have you tried using Jumbo frames ? I'm also interested in hearing opinions about other 48 ports Gigabit switches. 
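On the host side, using jumbo frames (once a switch that passes them is in place) is just a matter of raising the interface MTU, e.g. with "ifconfig eth0 mtu 9000". For the curious, a rough C sketch of the ioctl underneath follows; the interface name and MTU value are placeholders, and both the NIC driver and every switch in the path have to cooperate.

/* set_mtu.c - what "ifconfig eth0 mtu 9000" does underneath, as a sketch.
 * The interface name and MTU below are placeholders; needs root, a
 * jumbo-capable NIC/driver, and switches that actually pass jumbo frames.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

int main(int argc, char **argv)
{
    const char *ifname = (argc > 1) ? argv[1] : "eth0";   /* placeholder */
    int mtu = (argc > 2) ? atoi(argv[2]) : 9000;
    struct ifreq ifr;
    int s = socket(AF_INET, SOCK_DGRAM, 0);  /* any socket will do for the ioctl */

    if (s < 0) { perror("socket"); return 1; }
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    ifr.ifr_mtu = mtu;
    if (ioctl(s, SIOCSIFMTU, &ifr) < 0) {    /* set the interface MTU */
        perror("SIOCSIFMTU");
        close(s);
        return 1;
    }
    printf("%s MTU set to %d\n", ifname, mtu);
    close(s);
    return 0;
}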
-- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Wed Mar 17 08:04:10 2004 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Wed, 17 Mar 2004 05:04:10 -0800 (PST) Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: Message-ID: On Wed, 17 Mar 2004, Bogdan Costescu wrote: > On Wed, 17 Mar 2004, Alex Martin wrote: > > > You might want to look at the HP ProCurve 2824 or 2848 series. We > > choose the latter, because it means we only need one switch per > > (logical) rack and the cost/port is pretty low. I can't yet comment > > on performance. > > I'm interested in buying a 48 port Gigabit switch as well, and I was > looking at the 2848 as it has the advantage of 48 ports in only 1U. > One thing that is not clear from the descriptions that I find on the > net is if it has support for Jumbo frames. Does the documentation that > come with it mention something like this or, even better, have you > tried using Jumbo frames ? hp does not support jumbo frames on anything except their high-end l3 products... > I'm also interested in hearing opinions about other 48 ports Gigabit > switches. > > -- -------------------------------------------------------------------------- Joel Jaeggli Unix Consulting joelja at darkwing.uoregon.edu GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tsariysk at craft-tech.com Wed Mar 17 07:28:34 2004 From: tsariysk at craft-tech.com (Ted Sariyski) Date: Wed, 17 Mar 2004 07:28:34 -0500 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: References: Message-ID: <40584472.1050600@craft-tech.com> If jumboframes are important you may look at Foundry EdgeIron 24G or 48G. Ted Bogdan Costescu wrote: > On Wed, 17 Mar 2004, Alex Martin wrote: > > >>You might want to look at the HP ProCurve 2824 or 2848 series. We >>choose the latter, because it means we only need one switch per >>(logical) rack and the cost/port is pretty low. I can't yet comment >>on performance. > > > I'm interested in buying a 48 port Gigabit switch as well, and I was > looking at the 2848 as it has the advantage of 48 ports in only 1U. > One thing that is not clear from the descriptions that I find on the > net is if it has support for Jumbo frames. Does the documentation that > come with it mention something like this or, even better, have you > tried using Jumbo frames ? > > I'm also interested in hearing opinions about other 48 ports Gigabit > switches. > -- Ted Sariyski ------------ Combustion Research and Flow Technology, Inc. 6210 Keller's Church Road Pipersville, PA 18947 Tel: 215-766-1520 Fax: 215-766-1524 www.craft-tech.com tsariysk at craft-tech.com ----------------------- "Our experiment is perfect and is not limited by fundamental principles." 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From canon at nersc.gov Wed Mar 17 10:26:28 2004 From: canon at nersc.gov (canon at nersc.gov) Date: Wed, 17 Mar 2004 07:26:28 -0800 Subject: [Beowulf] cfengine users ? In-Reply-To: Message from Chris Samuel of "Wed, 17 Mar 2004 14:09:44 +1100." <200403171409.45273.csamuel@vpac.org> Message-ID: <200403171526.i2HFQSni004735@pookie.nersc.gov> Chris, We use cfengine to help manage our ~400 node linux cluster and 416 nodes (6656 processor) SP system. I highly recommend it. We typically use an rpm update script (we are moving to yum now) to manage the binaries and use cfengine to manage config files and scripts. There are some aspects of cfengine that can be a little convoluted, but it is very flexible. --Shane ------------------------------------------------------------------------ Shane Canon PSDF Project Lead National Energy Research Scientific Computing Center 1 Cyclotron Road Mailstop 943-256 Berkeley, CA 94720 ------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From anandv at singnet.com.sg Wed Mar 17 00:40:39 2004 From: anandv at singnet.com.sg (Anand Vaidya) Date: Wed, 17 Mar 2004 13:40:39 +0800 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <405751AE.2040806@craft-tech.com> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> <405751AE.2040806@craft-tech.com> Message-ID: <200403171340.39601.anandv@singnet.com.sg> You can try Foundry Networks EIF24G or EIF48G, offers full BW, 1U, we like it. -Anand On Wednesday 17 March 2004 03:12, Ted Sariyski wrote: > Hi, > I am about to configure a 16 node dual xeon cluster based on Supermicro > X5DPA-TGM motherboard. The cluster may grow so I am looking for a > manageable, nonblocking 24 or 32 port gigabit switch. Any comments or > recommendations will be highly appreciated. > Thanks, > Ted > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Thu Mar 18 10:26:51 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Thu, 18 Mar 2004 10:26:51 -0500 (EST) Subject: [Beowulf] Intel CSA performance? Message-ID: Intel added a special connection on their chipset to connect gigabit on some chipsets (CSA). I've been wondering whether this would offer a latency advantage, since it's conventional wisdom that PCI latency is a noticable part of MPI latency. this article: http://tinyurl.com/2vlez claims that CSA actually hurts latency, which is a bit puzzling. it is, admittedly, "gamepc.com", so perhaps they are unaware of tuning issues like interrupt-coalescing/mitigation. do any of you have CSA-based networks and have done performance tests? thanks, mark hahn. 
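For anyone wanting to answer this on their own hardware, a crude userspace test is enough to see whether CSA (or different interrupt-coalescing settings) moves socket latency at all: bounce one byte back and forth over TCP and average the round trip. The sketch below deliberately has almost no error handling; the port number and iteration count are arbitrary choices.

/* pingpong.c - crude one-byte TCP round-trip timer, the kind of number
 * that shows whether CSA or interrupt-coalescing settings change socket
 * latency. Port and iteration count are arbitrary; error handling is
 * mostly omitted on purpose.
 * Usage: "pingpong server" on one node, "pingpong client <server-ip>"
 * on another.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>

#define PORT  5678
#define ITERS 10000

int main(int argc, char **argv)
{
    int one = 1, s, c, i;
    char byte = 'x';
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(PORT);

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        addr.sin_addr.s_addr = INADDR_ANY;
        s = socket(AF_INET, SOCK_STREAM, 0);
        setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
        bind(s, (struct sockaddr *)&addr, sizeof(addr));
        listen(s, 1);
        c = accept(s, NULL, NULL);
        setsockopt(c, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
        for (i = 0; i < ITERS; i++) {         /* echo every byte straight back */
            if (read(c, &byte, 1) != 1) break;
            write(c, &byte, 1);
        }
    } else if (argc > 2 && strcmp(argv[1], "client") == 0) {
        struct timeval t0, t1;
        addr.sin_addr.s_addr = inet_addr(argv[2]);
        c = socket(AF_INET, SOCK_STREAM, 0);
        setsockopt(c, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
        if (connect(c, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }
        gettimeofday(&t0, NULL);
        for (i = 0; i < ITERS; i++) {         /* one byte out, one byte back */
            write(c, &byte, 1);
            read(c, &byte, 1);
        }
        gettimeofday(&t1, NULL);
        printf("average round trip: %.2f usec over %d iterations\n",
               ((t1.tv_sec - t0.tv_sec) * 1e6 +
                (t1.tv_usec - t0.tv_usec)) / ITERS, ITERS);
    } else {
        fprintf(stderr, "usage: %s server | %s client <server-ip>\n",
                argv[0], argv[0]);
        return 1;
    }
    return 0;
}

Comparing runs with a CSA-attached port against a PCI NIC, and with different coalescing settings on the NIC, would give the apples-to-apples numbers the article apparently did not.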
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From venkatraman at programmer.net Thu Mar 18 07:03:57 2004 From: venkatraman at programmer.net (Venkatraman Madurai Venkatasubramanyam) Date: Thu, 18 Mar 2004 07:03:57 -0500 Subject: [Beowulf] Suggest me on my attempt!! Message-ID: <20040318120357.A52A91D435B@ws1-12.us4.outblaze.com> Hello ppl! I am a Computer Science and Engineering student of India. I am planning to build a Beowulf Cluster for my Project as a part of my curriculum. Resource I have are four laptops with Intel Celeron 2 GHz, 18 GB HDD, HP Compaq Presario 2100 series, 192 MB RAM and I dont know what else shud I specify here. I have RedHat Linux 9 running on it. So I seek your help here to suggest me on how to build a Cluster. Please show me a way, as I am new to the Linux Platform. If you can personally help me, I will be really appreciated. MOkShAA. -- ___________________________________________________________ Sign-up for Ads Free at Mail.com http://promo.mail.com/adsfreejump.htm _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Mar 18 15:01:10 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 18 Mar 2004 15:01:10 -0500 (EST) Subject: [Beowulf] Suggest me on my attempt!! In-Reply-To: <20040318120357.A52A91D435B@ws1-12.us4.outblaze.com> Message-ID: On Thu, 18 Mar 2004, Venkatraman Madurai Venkatasubramanyam wrote: > Hello ppl! > I am a Computer Science and Engineering student of India. I am > planning to build a Beowulf Cluster for my Project as a part of my > curriculum. Resource I have are four laptops with Intel Celeron 2 GHz, > 18 GB HDD, HP Compaq Presario 2100 series, 192 MB RAM and I dont know > what else shud I specify here. I have RedHat Linux 9 running on it. So I > seek your help here to suggest me on how to build a Cluster. Please show > me a way, as I am new to the Linux Platform. If you can personally help > me, I will be really appreciated. a) Visit http://www.phy.duke.edu/brahma Among other things on this site is an online book on building clusters. Read/skim it. b) In your case the recipe is almost certainly going to be: i) Put laptops on a common switched network (cheap 100 Mbps switch). ii) Install PVM, MPI (lam and/or mpich), programming tools and support if you haven't already on all nodes. iii) Set them up with a common home directory space NFS exported from one to the rest, and with common accounts to match. You can distribute account information on so small a cluster by just copying e.g. /etc/passwd and /etc/group and so on or by using NIS (or other ways). iv) Set up a remote shell so that you can freely login from any node to any other node without a password. I recommend ssh (openssh rpms) but rsh is OK if your network is otherwise isolated and secure. v) Obtain, write, build parallel applications to explore what your cluster can do. There are demo programs for both PVM and MPI that come with the distributions and more are available on the web. There is a PVM program template and an example PVM application suitable for demonstrating scaling (also a potential template for master/slave code) on: http://www.phy.duke.edu/~rgb under "General". vi) Proceed from there as your skills increase. 
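As a concrete starting point for step (v) in the recipe above, the classic numerical-integration pi program is about as small as a real MPI application gets, scales with the number of processes, and needs nothing beyond MPI itself. A minimal version (a sketch, not tuned for anything) follows; build it with mpicc and run it with, e.g., "mpirun -np 4 ./pi_mpi 100000000".

/* pi_mpi.c - classic numerical-integration pi estimate; a minimal
 * "first application" for a small cluster. The interval count on the
 * command line is arbitrary.
 */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    long n, i;
    int rank, size;
    double h, x, local = 0.0, pi = 0.0, t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    n = (argc > 1) ? atol(argv[1]) : 10000000L;   /* number of intervals */
    MPI_Bcast(&n, 1, MPI_LONG, 0, MPI_COMM_WORLD);

    t0 = MPI_Wtime();
    h = 1.0 / (double)n;
    for (i = rank; i < n; i += size) {            /* each rank takes a stride */
        x = h * ((double)i + 0.5);
        local += 4.0 / (1.0 + x * x);
    }
    local *= h;
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("pi ~= %.12f  (%ld intervals, %d procs, %.3f s)\n",
               pi, n, size, t1 - t0);
    MPI_Finalize();
    return 0;
}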
I think that you'll find that after this you'll be in pretty good shape for further progress, guided as you think necessary by this list. There are also books out there that can help, but they cost money. Finally, I'd strongly suggest subscribing to Cluster World Magazine, where there are both articles and monthly columns that cover how to do all of the above and much more. rgb > MOkShAA. > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rouds at servihoo.com Fri Mar 19 06:48:38 2004 From: rouds at servihoo.com (RoUdY) Date: Fri, 19 Mar 2004 15:48:38 +0400 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: <200310011901.h91J1LY06826@NewBlue.Scyld.com> Message-ID: Hello I really need a very big hand from you... I have to run a program on my cluster for the final year project, which require a lot of computation power... Can someone sent me a program (the source code) or a site where i can download a big program PLEASE ... Using MPI.... Hope to hear from you Roud -------------------------------------------------- Get your free email address from Servihoo.com! http://www.servihoo.com The Portal of Mauritius _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Fri Mar 19 09:31:25 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: Fri, 19 Mar 2004 14:31:25 +0000 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: References: Message-ID: <1079706684.2520.1.camel@tp1.mesh-hq> On Fri, 2004-03-19 at 11:48, RoUdY wrote: > I have to run a program on my cluster for the final year > project, which require a lot of computation power... > Can someone sent me a program (the source code) or a site > where i can download a big program PLEASE ... > Using MPI.... Try HPL (High-Performance Linpack): http://www.netlib.org/benchmark/hpl/ best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Graabroedrestraede 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gropp at mcs.anl.gov Fri Mar 19 08:43:43 2004 From: gropp at mcs.anl.gov (William Gropp) Date: Fri, 19 Mar 2004 07:43:43 -0600 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: References: <200310011901.h91J1LY06826@NewBlue.Scyld.com> Message-ID: <6.0.0.22.2.20040319074111.02505e60@localhost> At 05:48 AM 3/19/2004, RoUdY wrote: >Hello >I really need a very big hand from you... >I have to run a program on my cluster for the final year project, which >require a lot of computation power... >Can someone sent me a program (the source code) or a site where i can >download a big program PLEASE ... >Using MPI.... >Hope to hear from you Roud There are many examples included with PETSc (www.mcs.anl.gov/petsc) that can be sized to use as much power as you have. 
HPLinpack will also use as much computational power as you have and allows you to compare your cluster to the Top500 list. Both use MPI for communication. Bill _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gharinarayana at yahoo.com Fri Mar 19 11:34:57 2004 From: gharinarayana at yahoo.com (HARINARAYANA G) Date: Fri, 19 Mar 2004 08:34:57 -0800 (PST) Subject: [Beowulf] Give an application to PARALLELIZE Message-ID: <20040319163457.3051.qmail@web11306.mail.yahoo.com> Dear friends, Please give me a very good application which uses pda(algorithms) and MPI to the maximum extent and which is POSSIBLE to do in 2 months(It's OK even if you have done it already, just send the NAME of the topic and the problem requirements). I am doing my Bachelor of Engineering in Comp. Science at RNSIT,Bangalore,INDIA. I am with a team of 4 people. With regards, Sivaram. __________________________________ Do you Yahoo!? Yahoo! Mail - More reliable, more storage, less spam http://mail.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 19 21:18:31 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 20 Mar 2004 10:18:31 +0800 (CST) Subject: [Beowulf] GridEngine 6.0 beta is ready! Message-ID: <20040320021831.65847.qmail@web16811.mail.tpe.yahoo.com> It's finally available, follow this link to download the binary packages or source: http://gridengine.sunsource.net/project/gridengine/news/SGE60beta-announce.html Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Fri Mar 19 21:51:56 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Fri, 19 Mar 2004 18:51:56 -0800 Subject: [Beowulf] Intel CSA performance? In-Reply-To: References: Message-ID: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> On Thu, Mar 18, 2004 at 10:26:51AM -0500, Mark Hahn wrote: > Intel added a special connection on their chipset to connect > gigabit on some chipsets (CSA). I've been wondering whether > this would offer a latency advantage, since it's conventional wisdom > that PCI latency is a noticable part of MPI latency. Eh? PCI latency can be noticable when you have a low latency network, but gigE latency isn't nearly that low, especially once you've gone through a switch. The only reference to gigabit latency in the article didn't say what they measured. I'd assume that it was using the normal drivers, which means the kernel networking stack, which means you're looking through the telescope from the wrong end. 
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From desi_star786 at yahoo.com Sat Mar 20 13:38:10 2004 From: desi_star786 at yahoo.com (desi star) Date: Sat, 20 Mar 2004 10:38:10 -0800 (PST) Subject: [Beowulf] Problem running Jaguar on Scyld-Beowulf in parallel mode. Message-ID: <20040320183810.94267.qmail@web40812.mail.yahoo.com> Hi.. I have installed a molecular modeling software Jaguar by Schrodinger Inc. on my scyld-beowulf 16 node cluster. The software runs perfectly fine on the master node but gives an error when I try to run the program on more than one CPU. User manual of the program suggests following steps to run Jaguar in parallel mode: 1. Install MPICH and configure with option: --with-comm=shared --with-device=ch_p4 2. Edit the machine.LINUX file in the MPICH directory and list the name of the host and number of processors on that host. 3. Test that 'rsh' is working 4. Launch the secure server ch4p_servs We already have the MPICH installed on the cluster using package 'mpich-p4-inter-1.3.2-5_scyld.i368.rpm'. I do not know whether package installation was done with specific configure options in step#1. Do I need to re-install the MPICH? I know that MPICH works perfectly fine for the FORTRAN 90 programs on different nodes. Also, Is it really important to enable 'rsh' on scyld? The cluster is not protected by firewall so I want to use the more secure 'ssh' but then do I need to install the MPICH again telling it to use ssh rather than rsh for communication? I am also wondering if the reason I am not been able to run program on more than one CPU has to do with the fact that Jaguar is not linked to MPICH libraries? This is my first experience with MPICH and running programs in parallel. I would really appreciate quick tips and suggestions as to why I am not been to make Jaguar run in the parallel mode. Thanks in advance. Eagerly waiting for a response. -- Pratap Singh. Graduate Student, The Chemical and Biomolecular Eng. Johns Hopkins Univ. __________________________________ Do you Yahoo!? Yahoo! Finance Tax Center - File online. File on time. http://taxes.yahoo.com/filing.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rmyers1400 at comcast.net Fri Mar 19 22:58:53 2004 From: rmyers1400 at comcast.net (Robert Myers) Date: Fri, 19 Mar 2004 22:58:53 -0500 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> Message-ID: <405BC17D.3010504@comcast.net> Greg Lindahl wrote: >On Thu, Mar 18, 2004 at 10:26:51AM -0500, Mark Hahn wrote: > > > >>Intel added a special connection on their chipset to connect >>gigabit on some chipsets (CSA). I've been wondering whether >>this would offer a latency advantage, since it's conventional wisdom >>that PCI latency is a noticable part of MPI latency. >> >> > >Eh? PCI latency can be noticable when you have a low latency network, >but gigE latency isn't nearly that low, especially once you've gone >through a switch. > >The only reference to gigabit latency in the article didn't say what >they measured. 
I'd assume that it was using the normal drivers, which >means the kernel networking stack, which means you're looking through >the telescope from the wrong end. > > > I had thought it might be interesting to fool around with trying to use CSA for hyperscsi, but I think you're saying if you're going to use a switched network, don't bother, if you're trying to win on latency. When Intel abandoned infiniband and the memory controller hub sprouted this ethernet link, I figured that was their opening shot in stomping what's left of infiniband. Maybe it is, and they just don't care about latency, but it sounds like nobody's got any reliable information as to what the latency effects of CSA may be, anyway. Every indication I can find is that Intel has all its bets on ethernet, and I don't know that there is any technological obstacle to building a low-latency ethernet. RM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jimlux at earthlink.net Sat Mar 20 17:42:54 2004 From: jimlux at earthlink.net (Jim Lux) Date: Sat, 20 Mar 2004 14:42:54 -0800 Subject: [Beowulf] Wireless network speed for clusters Message-ID: <002b01c40ecc$cd7cec50$32a8a8c0@LAPTOP152422> Some preliminary results for those of you wondering just how slow it actually is... Configuration is basically this: node (Via EPIA C3 533MHz) running freevix kernel (ramdisk filesystem) wired connection through Dlink 5 port hub DWL-7000AP set up for point to multipoint 802.11a (5GHz band) luminiferous aether DWL-7000AP ancient 10Mbps hub Clunky PPro running Knoppix/debian Maxtor NAS with a NFS mount Pings with default 63 byte packets give 1.2-2.0 ms both ways... Compare to <0.1 ms with a wired connection (i.e. plugging a cable from the Dlink hub to the ancient hub) DHCP/PXE booting sort of works (not exhaustively tested) For some reason, the nodes can't see the NAS so NFS doesn't mount There are a lot of "issues" with the DWL-7000AP... I think it's trying to be clever about not routing traffic to MACs on the local side over the air, but then, it doesn't know to route the traffic to the NFS server. The DWL-7000's also don't like to be powered up with no live (as in responding to packets) device hooked up to them, so there's sort of a potential power sequencing thing with the EPIA boards and the DWL-7000AP. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Sat Mar 20 23:01:36 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Sat, 20 Mar 2004 20:01:36 -0800 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <405BC17D.3010504@comcast.net> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> <405BC17D.3010504@comcast.net> Message-ID: <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> On Fri, Mar 19, 2004 at 10:58:53PM -0500, Robert Myers wrote: > I had thought it might be interesting to fool around with trying to use > CSA for hyperscsi, but I think you're saying if you're going to use a > switched network, don't bother, if you're trying to win on latency. I've never heard of "hyperscsi", and I am not saying what you think I'm saying. 
What I am saying is that if you're going to use 1 gigabit Ethernet, which has high latency in the switches, AND go through the kernel, don't bother. I was pretty clear, so I don't see how you missed it. There are certainly many examples of switched networks that are low latency, such as Myrinet, IB, Quadrics, SCI, and so forth. > When Intel abandoned infiniband Intel has not abandoned Infiniband. They discontinued a 1X interface that was going to get stomped in the market that was developing more slowly than expected. Just like you drew the wrong lesson from what I said, don't draw the wrong lesson from what Intel did. > Every indication I can find is that Intel has all its bets on ethernet, This contradicts what Intel says. They are not betting against ethernet, but they are certainly encouraging FC and IB where FC and IB make sense. However, this is straying beyond beowulf, and I hope that this mailing list can avoid being the cesspool that comp.arch has been for many years. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rmyers1400 at comcast.net Sun Mar 21 01:40:30 2004 From: rmyers1400 at comcast.net (Robert Myers) Date: Sun, 21 Mar 2004 01:40:30 -0500 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> <405BC17D.3010504@comcast.net> <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> Message-ID: <405D38DE.1010409@comcast.net> Greg Lindahl wrote: >On Fri, Mar 19, 2004 at 10:58:53PM -0500, Robert Myers wrote: > > > >>I had thought it might be interesting to fool around with trying to use >>CSA for hyperscsi, but I think you're saying if you're going to use a >>switched network, don't bother, if you're trying to win on latency. >> >> > >I've never heard of "hyperscsi", and I am not saying what you think >I'm saying. What I am saying is that if you're going to use 1 gigabit >Ethernet, which has high latency in the switches, AND go through the >kernel, don't bother. I was pretty clear, so I don't see how you >missed it. There are certainly many examples of switched networks that >are low latency, such as Myrinet, IB, Quadrics, SCI, and so forth. > I should have been explicit. "If you are going through a switched _ethernet_ connection." If you do the groups.google.com search low-latency infiniband group:comp.arch author:Robert author:Myers you will find that you really don't need to educate me about the existence of low-latency interconnects. As to hyperscsi, I gather that it is incumbent only on others to check google. Hyperscsi is a way to pass raw data over ethernet without going through the TCP/IP stack: http://www.linuxdevices.com/files/misc/hyperscsi.pdf so it doesn't consume nearly the CPU resources that TCP/IP does without hardware offload, and I don't think CSA allows you to use separate hardware TCP/IP offload. It looks potentially interesting as a low-cost clustering interconnect, especially if, as I expect, Intel continues to push ethernet. 
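To make the idea of pushing raw data over Ethernet without the TCP/IP stack a bit more concrete, a minimal Linux sketch of sending a payload in a hand-built Ethernet frame looks roughly like the following. This is not HyperSCSI itself, just the underlying AF_PACKET mechanism it (and schemes like it) sit on; the interface name, broadcast destination, and experimental EtherType are placeholder choices, and it has to run as root.

/* rawsend.c -- send one raw Ethernet frame, bypassing the TCP/IP stack (sketch only).
 * Build: cc -O2 -o rawsend rawsend.c      Run (as root): ./rawsend eth0
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <net/ethernet.h>
#include <netpacket/packet.h>
#include <sys/socket.h>

int main(int argc, char **argv)
{
    const char *ifname = argc > 1 ? argv[1] : "eth0";        /* placeholder interface name */
    unsigned char dst[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff}; /* broadcast, for demo only */
    unsigned char frame[ETH_FRAME_LEN];
    struct ether_header *eh = (struct ether_header *)frame;
    struct sockaddr_ll sll;
    const char *payload = "hello, raw ethernet";
    int s, len;

    s = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (s < 0) { perror("socket (root required)"); return 1; }

    memset(&sll, 0, sizeof sll);
    sll.sll_family  = AF_PACKET;
    sll.sll_ifindex = if_nametoindex(ifname);
    sll.sll_halen   = ETH_ALEN;
    memcpy(sll.sll_addr, dst, ETH_ALEN);

    /* Build the frame by hand: destination MAC, source MAC (zeros for the demo),
     * and the IEEE "local experimental" EtherType 0x88b5 so it is clearly not IP. */
    memcpy(eh->ether_dhost, dst, ETH_ALEN);
    memset(eh->ether_shost, 0, ETH_ALEN);
    eh->ether_type = htons(0x88b5);
    len = sizeof(*eh) + strlen(payload);
    memcpy(frame + sizeof(*eh), payload, strlen(payload));

    if (sendto(s, frame, len, 0, (struct sockaddr *)&sll, sizeof sll) < 0)
        perror("sendto");
    close(s);
    return 0;
}

The point of the sketch is only that nothing above the driver touches the data, so the per-message cost is the NIC, the wire, and one system call, with no TCP/IP processing on either side.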
RM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sun Mar 21 09:46:36 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sun, 21 Mar 2004 22:46:36 +0800 (CST) Subject: [Beowulf] Re: GridEngine 6.0 beta is ready! In-Reply-To: Message-ID: <20040321144636.49074.qmail@web16808.mail.tpe.yahoo.com> SGE 6.1 will be available at the end of the year, so when the newer version of Rocks Cluster picks up SGE 6.0, SGE 6.1 will be available at around the same time. Andrew. --- "Mason J. Katz" wrote: > Thanks for the update. We're not going to include > this in our April > release, but we will update to the official Opteron > port and remove our > version of this port. We hope to build experience > with SGE 6.0 in the > coming months and include it as part of our November > release as 6.0 > goes from beta to release. Thanks. > > -mjk > > On Mar 19, 2004, at 6:18 PM, Andrew Wang wrote: > > > It's finally available, follow this link to > download > > the binary packages or source: > > > > > http://gridengine.sunsource.net/project/gridengine/news/SGE60beta- > > > announce.html > > > > Andrew. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Mar 22 12:33:15 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 22 Mar 2004 09:33:15 -0800 Subject: [Beowulf] Give an application to PARALLELIZE In-Reply-To: <20040319163457.3051.qmail@web11306.mail.yahoo.com> Message-ID: <5.2.0.9.2.20040322093203.017e1000@mailhost4.jpl.nasa.gov> At 08:34 AM 3/19/2004 -0800, HARINARAYANA G wrote: >Dear friends, > >Please give me a very good application which uses >pda(algorithms) and MPI to the maximum extent and >which is POSSIBLE to do in 2 months(It's OK even if >you have done it already, just send the NAME of the >topic and the problem requirements). > > I am doing my Bachelor of Engineering in Comp. >Science at RNSIT,Bangalore,INDIA. > > I am with a team of 4 people. > >With regards, >Sivaram. A couple of issues back in the IEEE Proceedings, there were several papers describing doing acoustic source localization with a bunch of iPAQs. I don't know if they were doing MPI for node/node communication, but there's fairly extensive literature out there, and the papers describe the algorithms used. James Lux, P.E.
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From clwang at csis.hku.hk Sun Mar 21 23:55:00 2004 From: clwang at csis.hku.hk (Cho Li Wang) Date: Mon, 22 Mar 2004 12:55:00 +0800 Subject: [Beowulf] Final Call : NPC2004 (Deadline: March 22, 2004) Message-ID: <405E71A4.1556E651@csis.hku.hk> ******************************************************************* NPC2004 IFIP International Conference on Network and Parallel Computing October 18-20, 2004 Wuhan, China http://grid.hust.edu.cn/npc04 - ------------------------------------------------------------------- Important Dates Paper Submission March 22, 2004 (extended) Author Notification May 1, 2004 Final Camera Ready Manuscript June 1, 2004 ******************************************************************* Call For Papers The goal of IFIP International Conference on Network and Parallel Computing (NPC 2004) is to establish an international forum for engineers and scientists to present their excellent ideas and experiences in all system fields of network and parallel computing. NPC 2004, hosted by the Huazhong University of Science and Technology, will be held in the city of Wuhan, China - the "Homeland of White Clouds and the Yellow Crane." Topics of interest include, but are not limited to: - Parallel & Distributed Architectures - Parallel & Distributed Applications/Algorithms - Parallel Programming Environments & Tools - Network & Interconnect Architecture - Network Security - Network Storage - Advanced Web and Proxy Services - Middleware Frameworks & Toolkits - Cluster and Grid Computing - Ubiquitous Computing - Peer-to-peer Computing - Multimedia Streaming Services - Performance Modeling & Evaluation Submitted papers may not have appeared in or be considered for another conference. Papers must be written in English and must be in PDF format. Detailed electronic submission instructions will be posted on the conference web site. The conference proceedings will be published by Springer Verlag in the Lecture Notes in Computer Science Series (cited by SCI). Best papers from NPC 2004 will be published in a special issue of International Journal of High Performance Computing and Networking (IJHPCN) after conference. ************************************************************************** Committee General Co-Chairs: H. J. Siegel Colorado State University, USA Guojie Li The Institute of Computing Technology, CAS, China Steering Committee Chair: Kemal Ebcioglu IBM T.J. Watson Research Center, USA Program Co-Chairs: Guangrong Gao University of Delaware, USA Zhiwei Xu Chinese Academy of Sciences, China Program Vice-Chairs: Victor K. Prasanna University of Southern California, USA Albert Y. 
Zomaya University of Sydney, Australia Hai Jin Huazhong University of Science and Technology, China Publicity Co-Chairs: Cho-Li Wang The University of Hong Kong, Hong Kong Chris Jesshope The University of Hull, UK Local Arrangement Chair: Song Wu Huazhong University of Science and Technology, China Steering Committee Members: Jack Dongarra University of Tennessee, USA Guangrong Gao University of Delaware, USA Jean-Luc Gaudiot University of California, Irvine, USA Guojie Li The Institute of Computing Technology, CAS, China Yoichi Muraoka Waseda University, Japan Daniel Reed University of North Carolina, USA Program Committee Members: Ishfaq Ahmad University of Texas at Arlington, USA Shoukat Ali University of Missouri-Rolla, USA Makoto Amamiya Kyushu University, Japan David Bader University of New Mexico, USA Luc Bouge IRISA/ENS Cachan, France Pascal Bouvry University of Luxembourg, Luxembourg Ralph Castain Los Alamos National Laboratory, USA Guoliang Chen University of Science and Technology of China, China Alain Darte CNRS, ENS-Lyon, France Chen Ding University of Rochester, USA Jianping Fan Institute of Computing Technology, CAS, China Xiaobing Feng Institute of Computing Technology, CAS, China Jean-Luc Gaudiot University of California, Irvine, USA Minyi Guo University of Aizu, Japan Mary Hall University of Southern California, USA Salim Hariri University of Arizona, USA Kai Hwang University of Southern California, USA Anura Jayasumana Colorado State Univeristy, USA Chris R. Jesshop The University of Hull, UK Ricky Kwok The University of Hong Kong, Hong Kong Francis Lau The University of Hong Kong, Hong Kong Chuang Lin Tsinghua University, China John Morrison University College Cork, Ireland Lionel Ni Hong Kong University of Science and Technology, Hong Kong Stephan Olariu Old Dominion University, USA Yi Pan Georgia State University, USA Depei Qian Xi'an Jiaotong University, China Daniel A. Reed University of North Carolina at Chapel Hill, USA Jose Rolim University of Geneva, Switzerland Arnold Rosenberg University of Massachusetts at Amherst, USA Sartaj Sahni University of Florida, USA Selvakennedy Selvadurai University of Sydney, Australia Franciszek Seredynski Polish Academy of Sciences, Poland Hong Shen Japan Advanced Institute of Science and Technology, Japan Xiaowei Shen IBM T. J. Watson Research Center, USA Gabby Silberman IBM Centers for Advanced Studies, USA Per Stenstrom Chalmers University of Technology, Sweden Ivan Stojmenovic University of Ottawa, Canada Ninghui Sun Institute of Computing Technology, CAS, China El-Ghazali Talbi University of Lille, France Domenico Talia University of Calabria, Italy Mitchell D. 
Theys University of Illinois at Chicago, USA Xinmin Tian Intel Corporation, USA Dean Tullsen University of California, San Diego, USA Cho-Li Wang The University of Hong Kong, Hong Kong Qing Yang University of Rhode Island, USA Yuanyuan Yang State University of New York at Stony Brook, USA Xiaodong Zhang College of William and Mary, USA Weimin Zheng Tsinghua University, China Bingbing Zhou University of Sydney, Australia Chuanqi Zhu Fudan University, China _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 12:20:47 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 09:20:47 -0800 Subject: [Beowulf] Re: scyld and jaguar Message-ID: <200403220920.47878.mwill@penguincomputing.com> Hi, I saw your email on the beowulf list, and have a few comments: 1. MPICH on Scyld does not require rsh or ssh but rather it will take advantage of the bproc features of Scyld to achieve the same faster. 2. If your fortran programs work fine, so should the c programs. Unless you have an executable that is statically linked with its own mpich implementation. You can test that by using 'ldd' on the executable, it will list which libraries it is loading. If there are no mpich libs mentioned, you might have a statically linked program. Let me know how it goes. Michael Will -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jeffrey.b.layton at lmco.com Mon Mar 22 15:03:35 2004 From: jeffrey.b.layton at lmco.com (Jeff Layton) Date: Mon, 22 Mar 2004 15:03:35 -0500 Subject: [Beowulf] NUMA Patches for AMD64 in 2.4? Message-ID: <405F4697.9070507@lmco.com> Good Afternoon! Does anyone know if the latest stock 2.4 kernel has the NUMA patches in it? If not, where can I get NUMA patches that will work for AMD64? TIA! Jeff -- Dr. Jeff Layton Aerodynamics and CFD Lockheed-Martin Aeronautical Company - Marietta _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Mon Mar 22 16:30:04 2004 From: landman at scalableinformatics.com (Joe Landman) Date: Mon, 22 Mar 2004 16:30:04 -0500 Subject: [Beowulf] NUMA Patches for AMD64 in 2.4? In-Reply-To: <405F4697.9070507@lmco.com> References: <405F4697.9070507@lmco.com> Message-ID: <405F5ADC.2080101@scalableinformatics.com> You can pull x86_64 patches from ftp://ftp.x86-64.org/pub/linux/v2.6/ . The 2.4 kernels would need backports in some cases (RedHat is doing this, and I think SUSE might be as well). Not sure if Fedora is doing this as well (no /proc/numa in it or in the SUSE 9.0 AMD64). Joe Jeff Layton wrote: > Good Afternoon! > > Does anyone know if the latest stock 2.4 kernel has the > NUMA patches in it? If not, where can I get NUMA patches > that will work for AMD64? > > TIA! 
> > Jeff > -- Joseph Landman, Ph.D Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From desi_star786 at yahoo.com Mon Mar 22 15:15:27 2004 From: desi_star786 at yahoo.com (desi star) Date: Mon, 22 Mar 2004 12:15:27 -0800 (PST) Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <200403220920.47878.mwill@penguincomputing.com> Message-ID: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Hi Mike, Thanks much for responding. Jaguar is indeed staticaly linked to the MPICH libraries as per manuals. When I ran the ldd commands as you suggested: -- $ ldd Jaguar not a dynamic executable $ -- Thats why the very first step sugested in the Jaguar installation is to build and configure MPICH from the start. Where do I go from here? I also worked on Alan's suggestion and created a dynamic link between the ssh and rsh. I am now stuck in making ssh passwordless. Using 'ssh-keygen -t' I generated public and private keys and then copied public key to the authorised_keys2 in ~/.ssh/. I am not sure if thats all I need to make ssh passwordless. I was wondering if I will have to copy public keys on each node using bpcp command. I would appreciate suggestions in this matter. Thanks. Pratap. --- Michael Will wrote: > Hi, > > I saw your email on the beowulf list, and have a few > comments: > > 1. MPICH on Scyld does not require rsh or ssh but > rather it will take > advantage of the bproc features of Scyld to achieve > the same faster. > > > 2. If your fortran programs work fine, so should the > c programs. Unless you > have an executable that is statically linked with > its own mpich > implementation. You can test that by using 'ldd' on > the executable, it will > list which libraries it is loading. If there are no > mpich libs mentioned, you > might have a statically linked program. > > Let me know how it goes. > > Michael Will > -- > Michael Will, Linux Sales Engineer > NEWS: We have moved to a larger iceberg :-) > NEWS: 300 California St., San Francisco, CA. > Tel: 415-954-2822 Toll Free: 888-PENGUIN > Fax: 415-954-2899 > www.penguincomputing.com > __________________________________ Do you Yahoo!? Yahoo! Finance Tax Center - File online. File on time. http://taxes.yahoo.com/filing.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 17:01:31 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 14:01:31 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <200403221401.31370.mwill@penguincomputing.com> The problem is that a statically linked executable will not be able to use the Scyld infrastructure. It won't take advantage of your Infinidband or Myrinet, it won't use bproc, etcpp. You might set up the compute nodes to look like general unix nodes in order to run that particular implementation, but then you loose all the advantages of Scyld. > I also worked on Alan's suggestion and created a > dynamic link between the ssh and rsh. 
AFAIK you would be better off to set the enviroment variable to force it to use rsh or ssh. I think its P4_RSHCOMMAND="ssh" . The best way would be to ask your vendor to provide you with a dymanically linked executable, or even the source code and compile it yourself. > I am now stuck > in making ssh passwordless. Using 'ssh-keygen -t' I > generated public and private keys and then copied > public key to the authorised_keys2 in ~/.ssh/. I am > not sure if thats all I need to make ssh passwordless. Does it work with localhost? It sometimes is tricky to get it right. then it could also work remotely, given that you 1) have sshd running 2) have your home NFS mounted 3) have made /dev/random accessible, at least for ssh I believe thats necessary > I was wondering if I will have to copy public keys on > each node using bpcp command. You could do that too if you do not want to NFS mount the home. That you could easily do by editing /etc/exports to export /home and /etc/beowulf/fstab to mount $MASTER, after that rebooting your compute node. (might be possible without rebooting, but I don't know off of the top of my head) Michael -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From m.dierks at skynet.be Mon Mar 22 18:39:30 2004 From: m.dierks at skynet.be (Michel Dierks) Date: Tue, 23 Mar 2004 00:39:30 +0100 Subject: [Beowulf] Minimal OS Message-ID: <405F7932.20404@skynet.be> Hello, I?m a beginner in the Beowulf world. To achieve my school graduate I choose to make a Beowulf cluster. My cluster: 8 slaves: pc IBM 166 Mhz, 96 Mb ram, HD 2 Giga. 1 master: Dell PowerEdge 2200 bi processor 233 Mhz, 320 Mb ram, 3 SCSI HD (9.1, 2.1 and 2.1 Giga). 1 switch 10/100 Ethernet. The application must calculate a mesh 2D for a research over stream in fluid mechanics. I must use the MPI library for communication and PARMS for the calculation. This application will be developed in C. The operating system is the Red Hat distribution 9.0. My question is: for the slave pc?s , which is the minimal operating system to install. (Kernell + which package?). Thank you. Michel D. Belgium _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 18:01:00 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 15:01:00 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <1079996184.4352.14.camel@pel> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> <1079996184.4352.14.camel@pel> Message-ID: <200403221501.00766.mwill@penguincomputing.com> I agree that rather than compiling your own MPICH you should try to make it work with the existing one. However 1) there is no source 2) the binary is statically linked. 3) Scyld does have an mpirun which should set the enviromentvariables right. The right attempt is to make it use bpsh instead of rsh or ssh. I saw that some of the calls are done with shell scripts, which might be a way to fix it as well if the enviroment variables don't help. 
Michael On Monday 22 March 2004 02:56 pm, Sean Dilda wrote: > On Mon, 2004-03-22 at 15:15, desi star wrote: > > Hi Mike, > > > > Thanks much for responding. Jaguar is indeed staticaly > > linked to the MPICH libraries as per manuals. When I > > ran the ldd commands as you suggested: > > > > -- > > $ ldd Jaguar > > not a dynamic executable > > $ > > -- > > > > Thats why the very first step sugested in the Jaguar > > installation is to build and configure MPICH from the > > start. Where do I go from here? > > I'm not familiar with Jaguar, but I am somewhat familiar with Scyld. I > believe you are taking the wrong approach with this. > > Even though Jaguar says you should start with building mpich, I don't > think that's what you want to do. You almost certainly want to stick > with the MPICH binaries that were provided by Scyld. First make sure > there is no confusion and remove the copy of mpich that you built. Next > make sure the mpich and mpich-devel packages are installed on your > system. 'rpm -q mpich ; rpm -q mpich-devel' should tell you this. If > they're not, 'rpm -i mpich-XXXXX.rpm' should install the package. You > can find the packages on your Scyld cd(s). > > Once you have those packages installed, then attempt to compile jaguar. > It should link against Scyld's copy of mpich and just work. I suggest > following Scyld's instructions for running mpich jobs, not Jaguars. > Scyld has made adjustments to their copy of MPICH that make it work > right on their system. In the process they also change the way jobs are > launched. So Scyld may not have 'mpirun', but has a better way to start > the job. > > As Michael pointed out, Scyld's version of MPICH doesn't require rsh, > ssh, or anything like it. So your questions along those lines are > somewhat moot. > > > Sean -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 17:11:42 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 14:11:42 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <200403221411.42975.mwill@penguincomputing.com> Another idea - make it use bpsh by setting export P4_RSHCOMMAND="bpsh" or set it to use some shell script of yours that massages its parameters into the format bpsh expects. bpsh will start a process without requiring rsh or ssh, using Scylds bproc support. Michael. -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. 
Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Mon Mar 22 17:56:24 2004 From: agrajag at dragaera.net (Sean Dilda) Date: Mon, 22 Mar 2004 17:56:24 -0500 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <1079996184.4352.14.camel@pel> On Mon, 2004-03-22 at 15:15, desi star wrote: > Hi Mike, > > Thanks much for responding. Jaguar is indeed staticaly > linked to the MPICH libraries as per manuals. When I > ran the ldd commands as you suggested: > > -- > $ ldd Jaguar > not a dynamic executable > $ > -- > > Thats why the very first step sugested in the Jaguar > installation is to build and configure MPICH from the > start. Where do I go from here? > I'm not familiar with Jaguar, but I am somewhat familiar with Scyld. I believe you are taking the wrong approach with this. Even though Jaguar says you should start with building mpich, I don't think that's what you want to do. You almost certainly want to stick with the MPICH binaries that were provided by Scyld. First make sure there is no confusion and remove the copy of mpich that you built. Next make sure the mpich and mpich-devel packages are installed on your system. 'rpm -q mpich ; rpm -q mpich-devel' should tell you this. If they're not, 'rpm -i mpich-XXXXX.rpm' should install the package. You can find the packages on your Scyld cd(s). Once you have those packages installed, then attempt to compile jaguar. It should link against Scyld's copy of mpich and just work. I suggest following Scyld's instructions for running mpich jobs, not Jaguars. Scyld has made adjustments to their copy of MPICH that make it work right on their system. In the process they also change the way jobs are launched. So Scyld may not have 'mpirun', but has a better way to start the job. As Michael pointed out, Scyld's version of MPICH doesn't require rsh, ssh, or anything like it. So your questions along those lines are somewhat moot. Sean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Mon Mar 22 21:24:16 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Mon, 22 Mar 2004 18:24:16 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <000e01c40772$2611bf60$36a8a8c0@LAPTOP152422> Message-ID: Two weeks ago, I asked about power consumption for dual opteron systems. This is summary of the numbers I saw posted here. 237 idle to 280 loaded for a dual 248 with two SCSI drives from Bill Broadley 250 loaded for a dual 240 from Mark Hahn 182 loaded for a dual 242 from Robert G. Brown The 182 numbers seems to be too low, but it would be nice to have some other data points. Combine fewer fans, less memory, lower power or no harddrive, more efficient power supply, and less load on the CPU, and you could see 182 vs 250 watts I think. 
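As a rough back-of-the-envelope use of these figures, the arithmetic for a small rack looks like the sketch below; the 16-node count and the 208 V circuit are assumptions chosen for illustration, not numbers reported in the thread.

/* powercalc.c -- rough cluster power/cooling arithmetic (illustrative assumptions only). */
#include <stdio.h>

int main(void)
{
    const int    nodes        = 16;     /* assumed rack of dual-Opteron nodes */
    const double watts_loaded = 250.0;  /* per-node draw, from the loaded figures above */
    const double volts        = 208.0;  /* assumed circuit voltage */
    const double btu_per_watt = 3.412;  /* 1 W is about 3.412 BTU/hr of heat to remove */

    double total_w = nodes * watts_loaded;
    printf("total draw : %.0f W (about %.1f A at %.0f V)\n", total_w, total_w / volts, volts);
    printf("cooling    : about %.0f BTU/hr\n", total_w * btu_per_watt);
    return 0;
}

With those assumptions the 182 W versus 250 W spread is roughly a kilowatt per rack, which is why the low outlier is worth pinning down.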
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kfarmer at linuxhpc.org Mon Mar 1 09:51:39 2004 From: kfarmer at linuxhpc.org (Kenneth Farmer) Date: Mon, 1 Mar 2004 09:51:39 -0500 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds References: <20040301131628.90307.qmail@web16813.mail.tpe.yahoo.com> Message-ID: <097701c3ff9c$b465fe30$1601a8c0@deskpro> ----- Original Message ----- From: "Andrew Wang" To: Sent: Monday, March 01, 2004 8:16 AM Subject: Re: [Beowulf] SCI Socket latency: 2.27 microseconds > > In Dolphin testing, the > > lowest latency was > > achieved using AMD Opteron (X86_64) processors. > > No wonder Intel killed IA64 and released 64-bit x86 > (aka IA32e) a week or two ago... > > Andrew. Intel killed IA64? Where did you come up with that? -- Kenneth Farmer <>< LinuxHPC.org _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Mar 1 11:35:56 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 1 Mar 2004 11:35:56 -0500 (EST) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: well, this is interesting. it appears that AMD has given all interconnect vendors a boost, since Myri and Quadrics seem to like Opterons as well ;) > "This is the lowest latency socket solution available today," said > Hugo Kohmann, CTO for Dolphin Interconnect. "SCI Socket opens new well, Quadrics now claims 1.8 us MPI latency: http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD it's interesting that SCI is still on 64x66 PCI - it would be very interesting to know how many and what kinds of codes really require higher bandwidth. yes, some vendors (esp IB) are pushing PCI-express as bandwith salvation, but afaikt, none of my users need even >500 MB/s today. it doesn't seem like PCI-express will be any kind of major win in small-packet latency... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From csamuel at vpac.org Mon Mar 1 17:09:55 2004 From: csamuel at vpac.org (Chris Samuel) Date: Tue, 2 Mar 2004 09:09:55 +1100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <097701c3ff9c$b465fe30$1601a8c0@deskpro> References: <20040301131628.90307.qmail@web16813.mail.tpe.yahoo.com> <097701c3ff9c$b465fe30$1601a8c0@deskpro> Message-ID: <200403020910.02925.csamuel@vpac.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Tue, 2 Mar 2004 01:51 am, Kenneth Farmer wrote: > From: "Andrew Wang" > > > No wonder Intel killed IA64 and released 64-bit x86 > > (aka IA32e) a week or two ago... > > Intel killed IA64? Where did you come up with that? Intel certainly haven't announced the death of Itanium, but you've got to wonder about its long term future when Intel start producing 64-bit AMD compatible chips. Also see [1] below.
This is more the question of what will the market do when choosing between them, especially as HPC is only really a niche (though a fairly high spending one) compared to the general computing market. The big advantage AMD have is that "legacy" 32-bit apps will be around for a long long time to come (look at the mass clamour for MS to continue supporting Win98, something they'd hoped would be dead a long time ago) and that gives the hybrids a big advantage in the general market. I guess it comes down to a business decision on Intel as to whether they feel the demand for Itanium is enough to justify its continued development. Note that I'm not saying the demand per se isn't there, I've got absolutely no idea on the matter! cheers, Chris [1] - for those who haven't seen it, here's Linus's response to the launch: http://kerneltrap.org/node/view/2466 - -- Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin Victorian Partnership for Advanced Computing http://www.vpac.org/ Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQFAQ7S2O2KABBYQAh8RAlA/AJ4yzNxJcXZc3e8I8CtYjgScQOCpUwCfdVzF lpG7iEOXSo3+xAK73kNb9c0= =eYRs -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Mon Mar 1 16:38:52 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Mon, 1 Mar 2004 13:38:52 -0800 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: <20040301213852.GA28803@cse.ucdavis.edu> > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD Note the title says "sub 2us" and the body says "close to" 1.8us. Of more interest (to me) is that further down they say: In the next quarter, Quadrics will announce a series of highly competitive switch configurations making QsNetII more cost-effective for medium sized cluster configuration deployment. Sounds like more competition for IB, Myrinet and Dolphin. Hopefully anyways. Cool, found a quadrics price list: http://doc.quadrics.com/Quadrics/QuadricsHome.nsf/DisplayPages/A3EE4AED738B6E2480256DD30057B227 http://tinyurl.com/2sn2b Looks like $3k per node or so for 64, and $4k per node for 1024, I'm guessing that is list price and is somewhat negotiable. According to my sc2003 notes the Quadrics latency was: 100ns for the sending elan4 300ns for the 128 node switch and 20 meters of cable 130ns for the receiving card. 2420ns for two trips across the PCI-X bus and a main memory write ================ 2950ns for an mpi message between 2 nodes. Anyone know what changes to get this number down to 1.8us - 2.0us? > higher bandwidth. yes, some vendors (esp IB) are pushing PCI-express > as bandwith salvation, but afaikt, none of my users need even >500 MB/s > today. it doesn't seem like PCI-express will be any kind of major win > in small-packet latency... Anyone have an expected timetable for PCI-express connected interconnect cards? Anyone have projected PCI-express latencies vs PCI-X (133 MHz/64 bit)? 
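For anyone wanting to reproduce numbers of this kind on their own interconnect, a minimal MPI ping-pong in the spirit of the mping run quoted a little later in this thread by Dan Kidger might look like the sketch below. It is illustrative only, not the actual mping source; it reports half the averaged round-trip time for a one-byte message, the same convention described there.

/* pingpong_mpi.c -- one-byte MPI ping-pong latency (illustrative sketch only).
 * Build: mpicc -O2 -o pingpong_mpi pingpong_mpi.c
 * Run:   mpirun -np 2 ./pingpong_mpi   (or prun -N2 on Quadrics systems)
 */
#include <mpi.h>
#include <stdio.h>

#define REPS 1000

int main(int argc, char **argv)
{
    int rank, i;
    char byte = 0;
    double t0, t1;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < REPS; i++) {
        if (rank == 0) {                       /* send the ping, wait for the echo */
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &st);
        } else if (rank == 1) {                /* wait for the ping, echo it back */
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("1 byte: %.2f usec one-way (avg over %d round trips)\n",
               (t1 - t0) * 1e6 / (2.0 * REPS), REPS);

    MPI_Finalize();
    return 0;
}

On a commodity gigabit Ethernet cluster a sketch like this typically reports tens of microseconds, which is the gap the SCI, Myrinet, Quadrics and InfiniBand numbers in this thread are measured against.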
-- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From patrick at myri.com Mon Mar 1 17:40:55 2004 From: patrick at myri.com (Patrick Geoffray) Date: Mon, 01 Mar 2004 17:40:55 -0500 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: Message-ID: <4043BBF7.9090706@myri.com> Mark Hahn wrote: > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD Hum, this one claims "under 3us": http://doc.quadrics.com/quadrics/QuadricsHome.nsf/PageSectionsByName/F6E4FE91508A319580256D5900447E40/$File/QsNetII+Performance+Evaluation+ltr.pdf Maybe the 1.8us is a one-sided MPI latency, aka a PUT ? Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Mon Mar 1 18:47:33 2004 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Mon, 1 Mar 2004 23:47:33 +0000 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301213852.GA28803@cse.ucdavis.edu> References: <20040301.094532.17863925.pegu@dolphinics.no> <20040301213852.GA28803@cse.ucdavis.edu> Message-ID: <200403012347.33322.daniel.kidger@quadrics.com> On Monday 01 March 2004 9:38 pm, Bill Broadley wrote: > > well, Quadrics now claims 1.8 us MPI latency: > > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B > >398280256E44005A31DD > > Note the title says "sub 2us" and the body says "close to" 1.8us. Just ran this: [dan at opteron0]$ mpicc mping.c -o mping; prun -N2 ./mping 1: 0 bytes 1.80 uSec 0.00 MB/s This is a simple bit of MPI: proc 1 posts an MPI_Recv, proc 0 then does an MPI_Send, then proc 1 does an MPI_Send and proc 0 an MPI_Recv. The latency printed is half the round trip averaged over, say, 1000 passes. This is for Opteron - it seems to have the best PCI-X implementation we have seen. Latency on IA64 is a little higher - say 2.61uSec on one platform I have just tried. MPI performance has also improved over time as we have tuned the DMA/PIO writes, etc. in the device drivers. > Of more interest (to me) is that further down they say: > In the next quarter, Quadrics will announce a series of highly competitive > switch configurations making QsNetII more cost-effective for medium > sized cluster configuration deployment. yep - yet to be announced officially - but as you might expect this revolves around introducing a wider range of smaller switch chassis and configurations. -- Yours, Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd.
daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com -------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Mar 1 19:37:11 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 1 Mar 2004 19:37:11 -0500 (EST) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <200403020910.02925.csamuel@vpac.org> Message-ID: > > > No wonder Intel killed IA64 and released 64-bit x86 > > > (aka IA32e) a week or two ago... > > > > Intel killed IA64? Where did you come up with that? > > Intel certainly haven't announced the death of Itanium, but you've got to > wonder about its long term future when Intel start producing 64-bit AMD > compatible chips. Also see [1] below. bah. buying chips based on their address register width makes about as much sense as buying based on clock. yes, some people have good reason to be excited about 64b hitting the mass market. but that number is quite small - how many machines do you have with >4 GB per cpu? remember, Intel has always said that 64b wasn't terribly important for anything except the "enterprise" (mauve has more ram) market (mainframe recidivists). I think they're right, but should have also adopted AMD's cpu-integrated memory controller. > I guess it comes down to a business decision on Intel as to whether they feel > the demand for Itanium is enough to justify its continued development. maybe instead of a bazillion bytes of cache on the next it2, Intel will just drop in a P4 or two ;) regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From pegu at dolphinics.no Tue Mar 2 02:59:04 2004 From: pegu at dolphinics.no (Petter Gustad) Date: Tue, 02 Mar 2004 08:59:04 +0100 (CET) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: <20040302.085904.68044976.pegu@dolphinics.no> From: Mark Hahn Subject: Re: [Beowulf] SCI Socket latency: 2.27 microseconds Date: Mon, 1 Mar 2004 11:35:56 -0500 (EST) > > "This is the lowest latency socket solution available today," said > > Hugo Kohmann, CTO for Dolphin Interconnect. "SCI Socket opens new > > well, Quadrics now claims 1.8 us MPI latency: This is excellent MPI latency. However, the quoted 2.27 ?s latency was for the *socket* library. Latency using the Dolphin SISCI library is 1.4 ?s. See also: http://www.dolphinics.no/products/benchmarks.html Petter _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Tue Mar 2 03:36:42 2004 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Tue, 2 Mar 2004 09:36:42 +0100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: Message-ID: <200403020936.42553.joachim@ccrl-nece.de> Mark Hahn: > > "This is the lowest latency socket solution available today," said > > Hugo Kohmann, CTO for Dolphin Interconnect. 
"SCI Socket opens new > > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B39 >8280256E44005A31DD > > it's interesting that SCI is still on 64x66 PCI - it would be very > interesting to know how many and what kinds of codes really require [..] A large fraction of the latency does indeed stem from the two PCI-buses that need to be crossed. For that reason, Dolphin would certainly get an additional latency decrease when running on a 133MHz bus. I guess they have this in the pipeline. Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sfr at foobar-cpa.de Tue Mar 2 04:40:46 2004 From: sfr at foobar-cpa.de (Friedrich Seifert) Date: Tue, 02 Mar 2004 10:40:46 +0100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds Message-ID: <4044569E.9010803@foobar-cpa.de> Bogdan Costescu wrote: > On Mon, 1 Mar 2004, Petter Gustad wrote: > > >>Dolphin has benchmarked a completed one byte socket send/socket >>receive latency at 2.27 microseconds, > > > Is this in polling mode or interrupt-driven ? I'm interested to see if > I can do something useful (like computation) _and_ get such low > latency. Actually, SCI SOCKET uses a combination of both, it polls for a configurable amount of time, and if nothing arrives meanwhile, waits for an interrupt. Something like that is necessary since the current Linux interrupt processing and wake up mechanism is quite slow and unpredictable. There is a promising project going on to provide real time interrupt capability, but it is still in an early stage (http://lwn.net/Articles/65710/) >>Benchmarks using Netperf also show more than 255 MBytes (2,035 >>Megabits/s) sustained throughput using standard TCP STREAM sockets. > > > What is the CPU usage for this throughput ? SCI SOCKET was run in PIO mode for this test, so one CPU is needed to transfer the data. Current DMA performance is lower, but is subject to optimization in future revisions. CPU usage for DMA is 8%/29% at sender/receiver. Regards, Friedrich -- Dipl.-Inf. Friedrich Seifert - foobar GmbH Phone: +49-371-5221-157 Email: sfr at foobar-cpa.de Mobil: +49-172-3740089 Web: http://www.foobar-cpa.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Mon Mar 1 20:57:48 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Mon, 1 Mar 2004 17:57:48 -0800 Subject: [Beowulf] advantages of this particular 64-bit chip In-Reply-To: References: <200403020910.02925.csamuel@vpac.org> Message-ID: <20040302015748.GA6730@greglaptop.internal.keyresearch.com> On Mon, Mar 01, 2004 at 07:37:11PM -0500, Mark Hahn wrote: > bah. buying chips based on their address register width makes > about as much sense as buying based on clock. yes, some people have > good reason to be excited about 64b hitting the mass market. but > that number is quite small - how many machines do you have with > >4 GB per cpu? Don't forget that "64 bits", in this case means "wider GPRs, and twice as many, plus a better ABI." These are substantial wins on many codes, even on machines with small memories. 
Bignums are a well known example, but there are far more general-purpose examples. For example, with the PathScale compilers on the Opteron, we find that only 1 of the SPECfp benchmarks and 3 of the SPECint benchmarks run faster in 32-bit mode than 64-bit mode -- keeping in mind that 64-bit mode features longer instructions and bigger pointers and longs. (This is our alpha 32-bit mode vs. our beta 64-bit mode, so this answer will change a little by the time both are production quality.) So yes, there's a reason to buy Opteron and IA32e chips beyond the address width: more bang for the buck. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Mar 2 08:59:55 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 2 Mar 2004 08:59:55 -0500 (EST) Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: <20040302132333.GA3957@mikee.ath.cx> Message-ID: On Tue, 2 Mar 2004, Mike Eggleston wrote: > I realize this question is not specific to beowulf clusters... however, > at 9a I'm meeting with an upset user about a bunch of workstations > using serial termainals. Things don't happen as quickly as he wants: > setup, problem diagnosis, throughput, etc. What solutions can I present > for these problems (I realize this is just a quick summary!). Also, > the serial terminals are running at 9600 baud over sometimes 50 meters. > One table I found shows 60 meters is 2400 baud and 30 meters is 4800 > baud. I think this is part of the problem. It really shouldn't be, if the wiring is decent quality TP. Back in the old days, when our department was basically NOTHING but serial terminals running over TP down to a Sun 110 with a serial port expansion, we had lots of runs over 50 meters (probably some close to 100) without difficulty at 9600 baud. Keep the wires away from e.g. fluorescent lights (BIG problem), major power cables, or other sources of low frequency noise. Running parallel to a noise source over a long distance is where most crosstalk occurs -- try to cross wires at right angles. Conduit can help as it shields, as well, but our wires were basically thrown up in a drop ceiling haphazardly by "trained professionals" a.k.a. graduate students, faculty, and sometimes a shop/maintenance guy. > Possible solutions I have thought of: > > - user stops complaining and deals with the situation Always a popular one. To accomplish it you had better be prepared to use force. Bring duct tape to the meeting... > - put ethernet->serial converts at the terminals so the terminals are > on the network Sounds expensive. Of course, terminal servers themselves are typically pretty expensive, although we used to use them in the old days when we finally had more terminals than our server could manage even with expansions. And then workstations started getting cheaper and we converted over to workstations and ethernet and never looked back. How is it that you're still using terminals? I didn't know that terminals were still a viable option -- a cheap PC is less than what, $500 these days, and by the time you compare the cost of the terminal itself, the serial port terminal server, the serial wiring, and the incredible loss of productivity associated with using what amounts to a single, slow, tty interface they just don't sound cost effective. Not to mention maintenance, user complaints, and your time... 
> - put small VIA type boards whose image is loaded through tftp and > the serial terminals actually run from the via boards > - what else? Give the terminals to somebody you don't like, replace them with cheap diskless second hand PCs on ethernet running a stripped linux that basically provides either the standard set of Alt-Fx tty's in non-graphical mode or basic X and as many xterms as memory permits. Problem solved. In fact, depending on the applications being accessed and whether they CAN run locally, problem solved even better by running them locally and reducing demand on the network and servers. rgb > > Mike > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mikee at mikee.ath.cx Tue Mar 2 08:23:33 2004 From: mikee at mikee.ath.cx (Mike Eggleston) Date: Tue, 2 Mar 2004 07:23:33 -0600 Subject: [Beowulf] help! need thoughts/recommendations quickly Message-ID: <20040302132333.GA3957@mikee.ath.cx> I realize this question is not specific to beowulf clusters... however, at 9a I'm meeting with an upset user about a bunch of workstations using serial termainals. Things don't happen as quickly as he wants: setup, problem diagnosis, throughput, etc. What solutions can I present for these problems (I realize this is just a quick summary!). Also, the serial terminals are running at 9600 baud over sometimes 50 meters. One table I found shows 60 meters is 2400 baud and 30 meters is 4800 baud. I think this is part of the problem. Possible solutions I have thought of: - user stops complaining and deals with the situation - put ethernet->serial converts at the terminals so the terminals are on the network - put small VIA type boards whose image is loaded through tftp and the serial terminals actually run from the via boards - what else? Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mikee at mikee.ath.cx Tue Mar 2 09:08:56 2004 From: mikee at mikee.ath.cx (Mike Eggleston) Date: Tue, 2 Mar 2004 08:08:56 -0600 Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: References: <20040302132333.GA3957@mikee.ath.cx> Message-ID: <20040302140856.GA4615@mikee.ath.cx> On Tue, 02 Mar 2004, Robert G. Brown wrote: > On Tue, 2 Mar 2004, Mike Eggleston wrote: > > > I realize this question is not specific to beowulf clusters... however, > > at 9a I'm meeting with an upset user about a bunch of workstations > > using serial termainals. Things don't happen as quickly as he wants: > > setup, problem diagnosis, throughput, etc. What solutions can I present > > for these problems (I realize this is just a quick summary!). Also, > > the serial terminals are running at 9600 baud over sometimes 50 meters. > > One table I found shows 60 meters is 2400 baud and 30 meters is 4800 > > baud. I think this is part of the problem. > > It really shouldn't be, if the wiring is decent quality TP. 
Back in the > old days, when our department was basically NOTHING but serial terminals > running over TP down to a Sun 110 with a serial port expansion, we had > lots of runs over 50 meters (probably some close to 100) without > difficulty at 9600 baud. Keep the wires away from e.g. fluorescent > lights (BIG problem), major power cables, or other sources of low > frequency noise. Running parallel to a noise source over a long > distance is where most crosstalk occurs -- try to cross wires at right > angles. Conduit can help as it shields, as well, but our wires were > basically thrown up in a drop ceiling haphazardly by "trained > professionals" a.k.a. graduate students, faculty, and sometimes a > shop/maintenance guy. I know it should work and the old way it does work, but I've always seen problems with serial and printers. I much prefer getting away from them to full ethernet. > > Possible solutions I have thought of: > > > > - user stops complaining and deals with the situation > > Always a popular one. To accomplish it you had better be prepared to > use force. Bring duct tape to the meeting... This problem is happening in the warehouse, so there is lots of packing material and tape around. :) > > - put ethernet->serial converts at the terminals so the terminals are > > on the network > > Sounds expensive. Of course, terminal servers themselves are typically > pretty expensive, although we used to use them in the old days when we > finally had more terminals than our server could manage even with > expansions. And then workstations started getting cheaper and we > converted over to workstations and ethernet and never looked back. > > How is it that you're still using terminals? I didn't know that > terminals were still a viable option -- a cheap PC is less than what, > $500 these days, and by the time you compare the cost of the terminal > itself, the serial port terminal server, the serial wiring, and the > incredible loss of productivity associated with using what amounts to a > single, slow, tty interface they just don't sound cost effective. Not > to mention maintenance, user complaints, and your time... This is an application in the warehouse. We have many serial (dumb) terminals and printers. We are using 'Dorio's(?). Similiar to the Wyse 60. I've not used a dorio before, but wyse terminals lots. The application is all curses based and doesn't require much. The users are not even concerned about the speed of the application (display, etc.) just that the terminals are quick to setup and work all the time. > > - put small VIA type boards whose image is loaded through tftp and > > the serial terminals actually run from the via boards > > - what else? > > Give the terminals to somebody you don't like, replace them with cheap > diskless second hand PCs on ethernet running a stripped linux that > basically provides either the standard set of Alt-Fx tty's in > non-graphical mode or basic X and as many xterms as memory permits. > Problem solved. > > In fact, depending on the applications being accessed and whether they > CAN run locally, problem solved even better by running them locally and > reducing demand on the network and servers. I can use the terminals on the via boards and not have to replace them with crt monitors and keyboards, until they all start failing. I'd prefer to use the crt monitors through vga (fewer problems with linux and getty). Do you (anyone) know of a cheap motherboard that would do this? 
Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Tue Mar 2 10:44:47 2004 From: john.hearns at clustervision.com (John Hearns) Date: Tue, 2 Mar 2004 16:44:47 +0100 (CET) Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: <20040302140856.GA4615@mikee.ath.cx> Message-ID: On Tue, 2 Mar 2004, Mike Eggleston wrote: > > Do you (anyone) know of a cheap motherboard that would do this? Sorry to sound like a Cyclades salesman, but from their webpages the Cyclades TS-100 would fit the bill. Plus lots of packing tape. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at demec.ufpe.br Tue Mar 2 15:51:05 2004 From: rbw at demec.ufpe.br (Ramiro Brito Willmersdorf) Date: Tue, 2 Mar 2004 17:51:05 -0300 Subject: [Beowulf] Invitation to Conference Message-ID: <20040302205105.GA30141@demec.ufpe.br> Dear Colleagues, The XXV CILAMCE (Iberian Latin American Congress on Computational Methods for Engineering) will be held from November 10th to the 12th at Recife, Brazil. This Congress will encompass more than 30 mini-symposia over a very wide range of multidisciplinary methods in engineering and applied sciences. Please check the congress home page (http://www.demec.ufpe.br/cilamce2004/) for more specific details. We would like to invite you to participate in the High Performance Computing mini-symposium. If you are interested, you should submit an abstract by March 29th, 2004. This is one of the most important conferences on this subject in South America, and top researchers from here and abroad will attend. On a personal note, we would like to tell you that Recife is one of the top touristic destinations in Brazil, with a very pleasant weather and very nice beaches. We are grateful for you attention are ask that this information be passed along to other people in your institution that may be interested. Many Thanks, A. L. G. Coutinho, COPPE/UFRJ, alvaro at nacad.ufrj.br R. B. Willmersdorf, DEMEC/UFPE, rbw at demec.ufpe.br -- Ramiro Brito Willmersdorf rbw at demec.ufpe.br GPG key: http://www.demec.ufpe.br/~rbw/GPG/gpg_key.txt _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Wed Mar 3 12:52:30 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Wed, 03 Mar 2004 09:52:30 -0800 Subject: [Beowulf] mpich program segfaults Message-ID: <40461B5E.6010003@cert.ucr.edu> Hi, Sorry if this is off topic. Anyway, I've got an mpich Fortran program I'm trying to get going, which produces a segmentation fault right at a subroutine call. I put a print statement right before and right after the call and when I run the program, I'm only seeing the one before. I've also put a print statement right at the beginning of the subroutine which is being called and never see that either. The real strange part is when I run this under a debugger, the program runs fine. So would anyone happen to have any insight to what's going on here? I'd really appriciate it. 
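One generic way to see where a crash like that lands when it only happens outside the debugger is to dump a backtrace from a SIGSEGV handler. A minimal sketch using glibc's execinfo interface (Linux; other systems have their own equivalents), with an alternate signal stack so the handler still runs if the fault is a blown stack:

  #include <execinfo.h>
  #include <signal.h>
  #include <string.h>
  #include <stdlib.h>
  #include <unistd.h>

  static char altstack[64 * 1024];

  /* Not strictly async-signal-safe, but good enough as a debugging
   * aid: print the call chain that led to the fault, then give up. */
  static void on_segv(int sig)
  {
      void *frames[64];
      int n;
      (void)sig;
      n = backtrace(frames, 64);
      backtrace_symbols_fd(frames, n, STDERR_FILENO);
      _exit(1);
  }

  static void install_segv_handler(void)
  {
      stack_t ss;
      struct sigaction sa;

      ss.ss_sp = altstack;
      ss.ss_size = sizeof altstack;
      ss.ss_flags = 0;
      sigaltstack(&ss, NULL);

      memset(&sa, 0, sizeof sa);
      sa.sa_handler = on_segv;
      sa.sa_flags = SA_ONSTACK;      /* run the handler on the alternate stack */
      sigaction(SIGSEGV, &sa, NULL);
  }

  int main(void)
  {
      install_segv_handler();
      volatile int *p = 0;
      *p = 42;                       /* deliberate fault to show the trace */
      return 0;
  }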
Thanks, Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sdutta at deas.harvard.edu Wed Mar 3 14:42:13 2004 From: sdutta at deas.harvard.edu (Suvendra Nath Dutta) Date: Wed, 3 Mar 2004 14:42:13 -0500 Subject: [Beowulf] mpich program segfaults In-Reply-To: <40461B5E.6010003@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> Message-ID: Glen, Does your program seg fault when compiled with debugging off or on? Sometimes compilers will initialize arrays when compiling for debugging, but not waste time doing that when compiled without debugging. Also if you compile with optimization which line follows which one isn't always clear. You want to make sure you aren't over-running memory. Because what you say sounds suspiciously like that. Also you want to be sure its nothing to do with MPICH. Try calling the subroutine from a serial program if possible. Suvendra. On Mar 3, 2004, at 12:52 PM, Glen Kaukola wrote: > Hi, > > Sorry if this is off topic. Anyway, I've got an mpich Fortran program > I'm trying to get going, which produces a segmentation fault right at > a subroutine call. I put a print statement right before and right > after the call and when I run the program, I'm only seeing the one > before. I've also put a print statement right at the beginning of the > subroutine which is being called and never see that either. The real > strange part is when I run this under a debugger, the program runs > fine. So would anyone happen to have any insight to what's going on > here? I'd really appriciate it. > > Thanks, > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Wed Mar 3 15:46:36 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Wed, 03 Mar 2004 12:46:36 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: <40461B5E.6010003@cert.ucr.edu> Message-ID: <4046442C.4090704@cert.ucr.edu> Suvendra Nath Dutta wrote: > Glen, > Does your program seg fault when compiled with debugging off or on? Either way. > Sometimes compilers will initialize arrays when compiling for > debugging, but not waste time doing that when compiled without debugging. The arguments being passed to the subroutine are two arrays of real numbers and a few integers. Nothing being passed to the subroutine has been dynamically allocated. The compiler, IBM's XLF compiler, initializes the array to 0. At least I'm pretty sure it does, since I can print things before the subroutine call. > Also if you compile with optimization which line follows which one > isn't always clear. I don't have any optimizations turned on. > You want to make sure you aren't over-running memory. The machine has 2 gigs of memory, which should be plenty. The same program runs on an x86 machine with 1 gig of memory just fine (I'm trying to get the program working on an Apple G5 by the way). > Also you want to be sure its nothing to do with MPICH. Try calling the > subroutine from a serial program if possible. 
I've tried telling mpirun to only use one cpu and I get the same results. I've also tried running the program all by itself and it still crashes. Like I said though, it runs just fine under the a debugger. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sdutta at deas.harvard.edu Thu Mar 4 06:26:49 2004 From: sdutta at deas.harvard.edu (Suvendra Nath Dutta) Date: Thu, 4 Mar 2004 06:26:49 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: Glen, I am sorry, I meant buffer-overrun instead of memory overrun. It is of course impossible to say, but you are describing a classic description of buffer overrun. Program seg-faulting, some where there shouldn't be a problem. This is usually because you've over run the array limits and are writing on the program space. Suvendra. On Wed, 3 Mar 2004, Glen Kaukola wrote: > Suvendra Nath Dutta wrote: > > > Glen, > > Does your program seg fault when compiled with debugging off or on? > > > Either way. > > > Sometimes compilers will initialize arrays when compiling for > > debugging, but not waste time doing that when compiled without debugging. > > > The arguments being passed to the subroutine are two arrays of real > numbers and a few integers. Nothing being passed to the subroutine has > been dynamically allocated. The compiler, IBM's XLF compiler, > initializes the array to 0. At least I'm pretty sure it does, since I > can print things before the subroutine call. > > > Also if you compile with optimization which line follows which one > > isn't always clear. > > > I don't have any optimizations turned on. > > > You want to make sure you aren't over-running memory. > > > The machine has 2 gigs of memory, which should be plenty. The same > program runs on an x86 machine with 1 gig of memory just fine (I'm > trying to get the program working on an Apple G5 by the way). > > > Also you want to be sure its nothing to do with MPICH. Try calling the > > subroutine from a serial program if possible. > > > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wseas at canada.com Thu Mar 4 09:34:46 2004 From: wseas at canada.com (WSEAS Newsletter on MECHANICAL ENGINEERING) Date: Thu, 4 Mar 2004 16:34:46 +0200 Subject: [Beowulf] WSEAS NEWSLETTER in MECHANICAL ENGINEERING Message-ID: <3FE20F40001FB40E@fesscrpp1.tellas.gr> (added by postmaster@fesscrpp1.tellas.gr) If you want to contact us, the Subject of your email must contains the code: WSEAS CALL FOR PAPERS -- CALL FOR REVIEWERS -- CALL FOR SPECIAL SESSIONS wseas at canada.com http://wseas.freeservers.com **************************************************************** Udine, Italy, March 25-27, 2004: IASME/WSEAS 2004 Int.Conf. 
on MECHANICS and MECHATRONICS **************************************************************** Miami, Florida, USA, April 21-23, 2004 5th WSEAS International Conference on APPLIED MATHEMATICS (SYMPOSIA on: Linear Algebra and Applications, Numerical Analysis and Applications, Differential Equations and Applications, Probabilities, Statistics, Operational Research, Optimization, Algorithms, Discrete Mathematics, Systems, Communications, Control, Computers, Education) **************************************************************** Corfu Island, Greece, August 17-19, 2004 WSEAS/IASME Int.Conf. on FLUID MECHANICS WSEAS/IASME Int.Conf. on HEAT and MASS TRANSFER ********************************************************** Vouliagmeni, Athens, Greece, July 12-13, 2004 WSEAS ELECTROSCIENCE AND TECHNOLOGY FOR NAVAL ENGINEERING and ALL-ELECTRIC SHIP ********************************************************** Copacabana, Rio de Janeiro, Brazil, October 12-15, 2004 3rd WSEAS Int.Conf. on INFORMATION SECURITY, HARDWARE/SOFTWARE CODESIGN and COMPUTER NETWORKS (ISCOCO 2004) 3rd WSEAS Int. Conf. on APPLIED MATHEMATICS and COMPUTER SCIENCE (AMCOS 2004) 3rd WSEAS Int.Conf. on SYSTEM SCIENCE and ENGINEERING (ICOSSE 2004) 4th WSEAS Int.Conf. on POWER ENGINEERING SYSTEMS (ICOPES 2004) **************************************************************** Cancun, Mexico, May 12-15, 2004 6th WSEAS Int.Conf. on ALGORITHMS, SCIENTIFIC COMPUTING, MODELLING AND SIMULATION (ASCOMS '04) ********************************************************** NOTE THAT IN WSEAS CONFERENCES YOU CAN HAVE PROCEEDINGS 1) HARD COPY 2) CD-ROM and 3) Web Publishing SELECTED PAPERS are also published (after further review) * as regular papers in WSEAS TRANSACTIONS (Journals) or * as Chapters in WSEAS Book Series. WSEAS Books, Journals, Proceedings participate now in all major science citation indexes. ISI, ELSEVIER, CSA, AMS. Mathematical Reviews, ELP, NLG, Engineering Index Directory of Published Proceedings, INSPEC (IEE) Thanks Alexis Espen WSEAS NEWSLETTER in MECHANICAL ENGINEERING wseas at canada.com http://wseas.freeservers.com ##### HOW TO UNSUBSCRIBE #### You receive this newsletter from your email address: beowulf at beowulf.org If you want to unsubscribe, send an email to: wseas at canada.com The Subject of your message must be exactly: REMOVE beowulf at beowulf.org WSEAS If you want to unsubscribe more than one email addresses, send a message to nata at wseas.org with Subject: REMOVE [email1, emal2, ...., emailn] WSEAS _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From robl at mcs.anl.gov Thu Mar 4 13:46:26 2004 From: robl at mcs.anl.gov (Robert Latham) Date: Thu, 4 Mar 2004 12:46:26 -0600 Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: <20040304184626.GA2746@mcs.anl.gov> On Wed, Mar 03, 2004 at 12:46:36PM -0800, Glen Kaukola wrote: > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. since you see this crash when the program runs by itself, try running under a memory checker (valgrid is good and free, also purify, insure++...). 
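For anyone who has not tried one of those tools: the kind of bug they pinpoint immediately is an out-of-bounds write that silently corrupts a neighbouring allocation. A contrived C sketch, nothing to do with the code being discussed:

  #include <stdlib.h>

  int main(void)
  {
      double *a = malloc(100 * sizeof *a);
      /* off-by-one: valid indices are 0..99, the last iteration writes
       * one element past the end of the allocation */
      for (int i = 0; i <= 100; i++)
          a[i] = 0.0;
      free(a);
      return 0;
  }

  /* gcc -g overrun.c -o overrun && valgrind ./overrun
   * reports an "Invalid write of size 8" with the offending line. */

The report points at the bad write itself, even when the visible crash happens somewhere else entirely.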
==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Thu Mar 4 14:32:12 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Thu, 4 Mar 2004 11:32:12 -0800 (PST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> Message-ID: <20040304193213.411.qmail@web11407.mail.yahoo.com> Then run the program by hand, and attach a debugger... Rayson --- Glen Kaukola wrote: > Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Thu Mar 4 13:45:25 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Thu, 04 Mar 2004 10:45:25 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: <40477945.9090808@cert.ucr.edu> Suvendra Nath Dutta wrote: >Glen, > I am sorry, I meant buffer-overrun instead of memory overrun. It >is of course impossible to say, but you are describing a classic >description of buffer overrun. Program seg-faulting, some where there >shouldn't be a problem. This is usually because you've over run the array >limits and are writing on the program space. > > Ok, but simply calling a subroutine shouldn't cause a buffer overrun should it? Especially when none of the arguments being passed to the subroutine are dynamically allocated. I'm beginning to suspect it's a problem with the compiler actually. Maybe the stack that holds subroutine arguments isn't big enough. And when my problematic subroutine call is 4 levels deep or so like it is, then there isn't enough room on the stack for it's arguments. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at linux-mag.com Thu Mar 4 17:34:29 2004 From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine) Date: Thu, 4 Mar 2004 17:34:29 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> Message-ID: What type of machine is this? Doug On Wed, 3 Mar 2004, Glen Kaukola wrote: > Suvendra Nath Dutta wrote: > > > Glen, > > Does your program seg fault when compiled with debugging off or on? > > > Either way. > > > Sometimes compilers will initialize arrays when compiling for > > debugging, but not waste time doing that when compiled without debugging. > > > The arguments being passed to the subroutine are two arrays of real > numbers and a few integers. Nothing being passed to the subroutine has > been dynamically allocated. The compiler, IBM's XLF compiler, > initializes the array to 0. 
At least I'm pretty sure it does, since I > can print things before the subroutine call. > > > Also if you compile with optimization which line follows which one > > isn't always clear. > > > I don't have any optimizations turned on. > > > You want to make sure you aren't over-running memory. > > > The machine has 2 gigs of memory, which should be plenty. The same > program runs on an x86 machine with 1 gig of memory just fine (I'm > trying to get the program working on an Apple G5 by the way). > > > Also you want to be sure its nothing to do with MPICH. Try calling the > > subroutine from a serial program if possible. > > > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- ---------------------------------------------------------------- Editor-in-chief ClusterWorld Magazine Desk: 610.865.6061 Cell: 610.390.7765 Redefining High Performance Computing Fax: 610.865.6618 www.clusterworld.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From smcdaniel at kciinc.net Thu Mar 4 13:59:37 2004 From: smcdaniel at kciinc.net (smcdaniel) Date: Thu, 4 Mar 2004 12:59:37 -0600 Subject: [Beowulf] mpich program segfaults (Glen Kaukola) Message-ID: <002501c4021a$d77830c0$2a01010a@kciinc.local> Physical memory errors could be the problem if they occur between the pointer and offset of your array location in the stack. Other than that I would suspect a buffer overrun that Suvendra Nath Dutta mentioned. Sam McDaniel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Thu Mar 4 19:48:21 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Thu, 04 Mar 2004 16:48:21 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: Message-ID: <4047CE55.6010300@cert.ucr.edu> Douglas Eadline, Cluster World Magazine wrote: >What type of machine is this? > > An Apple G5. And actually I've figured out what's wrong. Sorta. =) I replaced my problematic subroutine with a dummy subroutine that contains nothing but variable declarations and a print statement. This still caused a segmentation fault. So I commented pretty much everything out. No segmentation fault. Alright then. I slowly added it all back in, checking each time to see if I got a segmentation fault. And now I'm down to 4 variable declarations that are causing a problem: REAL ZFGLURG ( NCOLS,NROWS,0:NLAYS ) INTEGER ICASE( NCOLS,NROWS,0:NLAYS ) REAL THETAV( NCOLS,NROWS,NLAYS ) REAL ZINT ( NCOLS,NROWS,NLAYS ) If I uncomment any one of those, I get a segmentation fault again. But it still doesn't make any sense. First of all, there are variable declarations almost exactly like the ones I listed and those don't cause a problem. I also made a small test case that called my dummy subroutine and that worked just fine. I then commented out everything but the problematic variable declarations I listed above and that worked just fine. 
I tried changing the variable names but that didn't seem to make a difference, as I still got a segmentation fault. So I have no idea what the heck is going on. I think I need to tell my boss we need to give up on G5's. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Thu Mar 4 20:05:28 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Fri, 5 Mar 2004 09:05:28 +0800 (CST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <40477945.9090808@cert.ucr.edu> Message-ID: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> the default stack size on OSX is 512 KB, try to increase it to 64MB, I encountered this problem before. Andrew. --- Glen Kaukola ????> Suvendra Nath Dutta wrote: > > >Glen, > > I am sorry, I meant buffer-overrun instead of > memory overrun. It > >is of course impossible to say, but you are > describing a classic > >description of buffer overrun. Program > seg-faulting, some where there > >shouldn't be a problem. This is usually because > you've over run the array > >limits and are writing on the program space. > > > > > > Ok, but simply calling a subroutine shouldn't cause > a buffer overrun > should it? Especially when none of the arguments > being passed to the > subroutine are dynamically allocated. I'm beginning > to suspect it's a > problem with the compiler actually. Maybe the stack > that holds > subroutine arguments isn't big enough. And when my > problematic > subroutine call is 4 levels deep or so like it is, > then there isn't > enough room on the stack for it's arguments. > > Glen > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at linux-mag.com Thu Mar 4 21:46:16 2004 From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine) Date: Thu, 4 Mar 2004 21:46:16 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4047CE55.6010300@cert.ucr.edu> Message-ID: Don't give up on the G5 just yet. Sounds like to me you may be stepping on some memory somehow. Which means the crash occurs at that particular spot in the code, but the cause of the crash probably is occurring somewhere else in the program. There are "simple" several you can do to collect evidence that may help you solve this "crime". (this is detective work by the way) First, this sounds like the kind of thing that happens in C programs. Is it pure Fortran? What version of MPICH? 1) try another compiler, if you are lucky it will find the problem. It may also work, in which case you will want to blame the first compiler, don't, because that is probably not the case. The new compiler probably lays out the memory different than the first one and you just got lucky. 2) run your code on another architecture. 3) try another MPI (LAM?) I am sure there are more, but not knowing the particulars, I can not suggest anything else. 
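Worth adding to that list: Andrew's stack-size suggestion above fits Glen's four declarations exactly. They are local to the subroutine, and with many compilers' defaults large local arrays live on the stack, so the routine can fault the instant it is entered, before any executable statement runs. A minimal C analogue (the dimensions and the 512 KB figure are only illustrative):

  #include <stdio.h>

  /* Rough C analogue of a Fortran routine with big local arrays: the
   * frame for work[] is automatic (stack) storage, so with a small
   * stack limit -- 512 KB by default on OS X, per Andrew -- the fault
   * happens on entry to the routine.  That is why a print right before
   * the CALL appears and a print inside the routine never does. */
  static void big_locals(void)
  {
      double work[200 * 200 * 30];   /* ~9.6 MB of automatic storage */
      work[0] = 1.0;
      printf("entered: %f\n", work[0]);
  }

  int main(void)
  {
      printf("about to call\n");
      big_locals();                  /* dies here if the limit is too small */
      printf("returned\n");
      return 0;
  }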
Doug On Thu, 4 Mar 2004, Glen Kaukola wrote: > Douglas Eadline, Cluster World Magazine wrote: > > >What type of machine is this? > > > > > > An Apple G5. > > And actually I've figured out what's wrong. Sorta. =) > > I replaced my problematic subroutine with a dummy subroutine that > contains nothing but variable declarations and a print statement. This > still caused a segmentation fault. So I commented pretty much > everything out. No segmentation fault. Alright then. I slowly added > it all back in, checking each time to see if I got a segmentation fault. > > And now I'm down to 4 variable declarations that are causing a problem: > REAL ZFGLURG ( NCOLS,NROWS,0:NLAYS ) > INTEGER ICASE( NCOLS,NROWS,0:NLAYS ) > REAL THETAV( NCOLS,NROWS,NLAYS ) > REAL ZINT ( NCOLS,NROWS,NLAYS ) > > If I uncomment any one of those, I get a segmentation fault again. > > But it still doesn't make any sense. First of all, there are variable > declarations almost exactly like the ones I listed and those don't cause > a problem. I also made a small test case that called my dummy > subroutine and that worked just fine. I then commented out everything > but the problematic variable declarations I listed above and that worked > just fine. I tried changing the variable names but that didn't seem to > make a difference, as I still got a segmentation fault. So I have no > idea what the heck is going on. I think I need to tell my boss we need > to give up on G5's. > > > Glen > -- ---------------------------------------------------------------- Editor-in-chief ClusterWorld Magazine Desk: 610.865.6061 Cell: 610.390.7765 Redefining High Performance Computing Fax: 610.865.6618 www.clusterworld.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mathiasbrito at yahoo.com.br Fri Mar 5 08:43:33 2004 From: mathiasbrito at yahoo.com.br (=?iso-8859-1?q?Mathias=20Brito?=) Date: Fri, 5 Mar 2004 10:43:33 -0300 (ART) Subject: [Beowulf] Benchmarking with HPL Message-ID: <20040305134333.90538.qmail@web12201.mail.yahoo.com> Hello, I'm benchmarking my cluster with HPL, the cluster have 16 nodes, 8 nodes athlon 1600+ with 512MB RAM and 20GB H.D. , and 8 nodes athlan 1700+ with 512MB RAM and 20GB, all with a 100Mbit fast ethernet linked in a switch. Well, the problem is, what the best setup for the HPL.dat, to obtain the maximum performance of the cluster? Mathias ===== Mathias Brito Universidade Estadual de Santa Cruz - UESC Departamento de Ci?ncias Exatas e Tecnol?gicas Estudante do Curso de Ci?ncia da Computa??o ______________________________________________________________________ Yahoo! Mail - O melhor e-mail do Brasil! 
Abra sua conta agora: http://br.yahoo.com/info/mail.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Sebastien.Georget at sophia.inria.fr Fri Mar 5 10:10:10 2004 From: Sebastien.Georget at sophia.inria.fr (=?ISO-8859-1?Q?S=E9bastien_Georget?=) Date: Fri, 05 Mar 2004 16:10:10 +0100 Subject: [Beowulf] Benchmarking with HPL In-Reply-To: <20040305134333.90538.qmail@web12201.mail.yahoo.com> References: <20040305134333.90538.qmail@web12201.mail.yahoo.com> Message-ID: <40489852.3050206@sophia.inria.fr> Mathias Brito wrote: > Hello, > > I'm benchmarking my cluster with HPL, the cluster have > 16 nodes, 8 nodes athlon 1600+ with 512MB RAM and 20GB > H.D. , and 8 nodes athlan 1700+ with 512MB RAM and > 20GB, all with a 100Mbit fast ethernet linked in a > switch. Well, the problem is, what the best setup for > the HPL.dat, to obtain the maximum performance of the > cluster? > > Mathias Hi, starting points for HPL tuning here: http://www.netlib.org/benchmark/hpl/faqs.html http://www.netlib.org/benchmark/hpl/tuning.html ++ -- S?bastien Georget INRIA Sophia-Antipolis, Service DREAM, B.P. 93 06902 Sophia-Antipolis Cedex, FRANCE E-mail:sebastien.georget at sophia.inria.fr _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Mar 5 12:28:36 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 5 Mar 2004 12:28:36 -0500 (EST) Subject: [Beowulf] Newbie on beowulf clustering In-Reply-To: <20040305171757.15481.qmail@web20730.mail.yahoo.com> Message-ID: On Fri, 5 Mar 2004, khurram b wrote: > hi! > i am newbie to beowulf clustering, have done some work > in MOSIX linux clustering and got interested in > beowulf clustering, please guide me where to start , > tutorials, documents. http://www.phy.duke.edu/brahma Has many resources and links to many more. Also think about subscribing to Cluster World magazine. rgb > > Thanks! > > __________________________________ > Do you Yahoo!? > Yahoo! Search - Find what you?re looking for faster > http://search.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From myaoha at yahoo.com Fri Mar 5 12:17:57 2004 From: myaoha at yahoo.com (khurram b) Date: Fri, 5 Mar 2004 09:17:57 -0800 (PST) Subject: [Beowulf] Newbie on beowulf clustering Message-ID: <20040305171757.15481.qmail@web20730.mail.yahoo.com> hi! i am newbie to beowulf clustering, have done some work in MOSIX linux clustering and got interested in beowulf clustering, please guide me where to start , tutorials, documents. Thanks! __________________________________ Do you Yahoo!? Yahoo! 
Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mprinkey at aeolusresearch.com Fri Mar 5 14:02:13 2004 From: mprinkey at aeolusresearch.com (Michael T. Prinkey) Date: Fri, 5 Mar 2004 14:02:13 -0500 (EST) Subject: [Beowulf] "noht" in 2.4.24? Message-ID: Hi Everyone, I installed 2.4.24 on a dual Xeon system with a Tyan 7501-chipset motherboard and the noht option seems to be ignored. The RH9 kernel (2.4.20?) repected noht. Has this been changed or is there a patch that I missed? I can't think that it is a BIOS issue or otherwise hardware related as I can shut it off with RH9 kernel. Thanks, Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hartner at cs.utah.edu Fri Mar 5 16:22:37 2004 From: hartner at cs.utah.edu (Mark Hartner) Date: Fri, 5 Mar 2004 14:22:37 -0700 (MST) Subject: [Beowulf] "noht" in 2.4.24? In-Reply-To: Message-ID: > I installed 2.4.24 on a dual Xeon system with a Tyan 7501-chipset > motherboard and the noht option seems to be ignored. The RH9 kernel > (2.4.20?) repected noht. Has this been changed or is there a patch that I think that option was removed around 2.4.21 If you look at Documentation/kernel-parameters.txt in the kernel source it will give you a list of options for the 2.4.24 kernel. > missed? I can't think that it is a BIOS issue or otherwise hardware > related as I can shut it off with RH9 kernel. 'acpi=off' will disable ht'ing (and a bunch of other stuff) The other option is to disable it in your BIOS. Mark _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rdn at uchicago.edu Fri Mar 5 18:27:34 2004 From: rdn at uchicago.edu (Russell Nordquist) Date: Fri, 5 Mar 2004 17:27:34 -0600 (CST) Subject: [Beowulf] good 24 port gige switch Message-ID: Does anyone have a recommendation for a good 24 port gige switch for clustering? I know this issue has been discussed, but I didn't find any actual manufacturer/models people like. Were not really looking at the very high end models from Cisco, but I am wary of the many low end switches on the market with regard to bisectional bandwidth issues. Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches and found one to be better than the other. There are a bunch of 24 port gige switches for <$2000, but are they any good? are some better than others (likely so i'd guess)? thanks and have a good weekend. 
russell - - - - - - - - - - - - Russell Nordquist UNIX Systems Administrator Geophysical Sciences Computing http://geosci.uchicago.edu/computing NSIT, University of Chicago - - - - - - - - - - - _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 5 20:24:55 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 09:24:55 +0800 (CST) Subject: [Beowulf] SGEEE free and more platform offically supported Message-ID: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> I used to think that SGE is free, but SGEEE (with more advanced scheduling algorithms) is not. But it is not true, both are free and open source. In SGE 6.0, there will be no "SGEEE mode", but the default mode will have all the SGEEE functionality! And Sun is adding more support too, instead of looking at the source or finding other people to support non-Sun OSes: "Sun will also support non Sun platforms beginning with Grid Engine 6 (HP, IBM, SGI, MAC)." http://gridengine.sunsource.net/servlets/ReadMsg?msgId=16510&listName=users Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 5 20:04:33 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 09:04:33 +0800 (CST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <40491AA7.6050703@cert.ucr.edu> Message-ID: <20040306010433.99259.qmail@web16812.mail.tpe.yahoo.com> It's not your code, I think there is a compiler flag to not allocate variables from the stack, but I need to look at the XLF manuals again. BTW, there are several OSX settings that you can do to tune the performance of your fortran on the G5. I said fortran since it has to do with the hardware prefetching on the Power4 and the G5, if you have c programs with a lot of vector computation, you can set those too. Andrew. --- Glen Kaukola > >the default stack size on OSX is 512 KB, try to > >increase it to 64MB, I encountered this problem > >before. > Yep, that did the trick. Thanks a bunch! > > I'm wondering though, does this indicate there's > some sort of problem > with the code? > > > Glen ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Fri Mar 5 19:26:15 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Fri, 05 Mar 2004 16:26:15 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> References: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> Message-ID: <40491AA7.6050703@cert.ucr.edu> Andrew Wang wrote: >the default stack size on OSX is 512 KB, try to >increase it to 64MB, I encountered this problem >before. > > Yep, that did the trick. Thanks a bunch! 
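For reference, the limit in question is the per-process stack rlimit: "ulimit -s 65536" (sh/bash) or "limit stacksize 65536" (csh) raises it for jobs started from that shell, and a program can also inspect or raise its own soft limit early in main(), before the big routine is entered. A small sketch using the standard getrlimit/setrlimit calls:

  #include <stdio.h>
  #include <sys/resource.h>

  /* Show the current stack limit, then raise the soft limit to 64 MB
   * (an unprivileged process may only go up to the hard limit).  The
   * new soft limit also applies to anything exec'd afterwards. */
  int main(void)
  {
      struct rlimit rl;

      getrlimit(RLIMIT_STACK, &rl);
      printf("stack soft limit: %ld KB, hard limit: %ld KB\n",
             (long)(rl.rlim_cur / 1024), (long)(rl.rlim_max / 1024));

      rl.rlim_cur = 64L * 1024 * 1024;
      if (rl.rlim_max != RLIM_INFINITY && rl.rlim_cur > rl.rlim_max)
          rl.rlim_cur = rl.rlim_max;
      if (setrlimit(RLIMIT_STACK, &rl) != 0)
          perror("setrlimit");
      return 0;
  }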
I'm wondering though, does this indicate there's some sort of problem with the code? Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Fri Mar 5 19:34:05 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Fri, 5 Mar 2004 19:34:05 -0500 (EST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: Message-ID: > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other. There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? I've had good luck with SMC 8624t's, and know of one quite large cluster that uses a lot of them of them (mckenzie, #140). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Sat Mar 6 04:55:22 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: 06 Mar 2004 09:55:22 +0000 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: Message-ID: <1078566922.2547.6.camel@fermi> On Fri, 2004-03-05 at 23:27, Russell Nordquist wrote: > Does anyone have a recommendation for a good 24 port gige switch for > clustering? > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other. There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? We mostly use HP2724's for this size of clusters. We have found them to perform ok and they are stable under heavy load - and they are priced at around $2000 (in Denmark, that is, might be cheaper in the US) best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Gr?br?drestr?de 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Sat Mar 6 09:01:49 2004 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Sat, 6 Mar 2004 06:01:49 -0800 (PST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: <1078566922.2547.6.camel@fermi> Message-ID: On 6 Mar 2004, Lars Henriksen wrote: > On Fri, 2004-03-05 at 23:27, Russell Nordquist wrote: > > Does anyone have a recommendation for a good 24 port gige switch for > > clustering? > > > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > > and found one to be better than the other. There are a bunch of 24 port > > gige switches for <$2000, but are they any good? are some better than > > others (likely so i'd guess)? > > We mostly use HP2724's for this size of clusters. We have found them to > perform ok and they are stable under heavy load - and they are priced at > around $2000 (in Denmark, that is, might be cheaper in the US) hp doesn't do jumbo frames on anything other than their top of the line l3 switch products which may or may not be an issue for certain applications. 
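Whatever the brand, it is worth measuring a candidate switch yourself with a one-byte TCP ping-pong between two nodes; netperf or NetPIPE do this properly, but even a bare-bones sketch like the one below (arbitrary port, minimal error handling) shows latency differences clearly. Run "pingpong server" on one node and "pingpong <other-node>" on the second.

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <netinet/tcp.h>
  #include <netdb.h>
  #include <sys/time.h>
  #include <unistd.h>

  #define PORT  5001                 /* arbitrary */
  #define ITERS 10000

  int main(int argc, char **argv)
  {
      char byte = 0;
      int one = 1;

      if (argc > 1 && strcmp(argv[1], "server") == 0) {
          int ls = socket(AF_INET, SOCK_STREAM, 0), s;
          struct sockaddr_in a;
          memset(&a, 0, sizeof a);
          a.sin_family = AF_INET;
          a.sin_port = htons(PORT);
          a.sin_addr.s_addr = INADDR_ANY;
          bind(ls, (struct sockaddr *)&a, sizeof a);
          listen(ls, 1);
          s = accept(ls, NULL, NULL);
          setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
          while (read(s, &byte, 1) == 1)   /* echo single bytes back */
              write(s, &byte, 1);
      } else if (argc > 1) {
          struct hostent *h = gethostbyname(argv[1]);
          int s = socket(AF_INET, SOCK_STREAM, 0);
          struct sockaddr_in a;
          struct timeval t0, t1;
          double us;
          if (h == NULL) { fprintf(stderr, "unknown host %s\n", argv[1]); return 1; }
          memset(&a, 0, sizeof a);
          a.sin_family = AF_INET;
          a.sin_port = htons(PORT);
          memcpy(&a.sin_addr, h->h_addr, h->h_length);
          connect(s, (struct sockaddr *)&a, sizeof a);
          setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
          gettimeofday(&t0, NULL);
          for (int i = 0; i < ITERS; i++) {   /* 1-byte round trips */
              write(s, &byte, 1);
              read(s, &byte, 1);
          }
          gettimeofday(&t1, NULL);
          us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
          printf("avg round trip: %.1f us\n", us / ITERS);
      } else {
          fprintf(stderr, "usage: %s server | %s <serverhost>\n", argv[0], argv[0]);
      }
      return 0;
  }

Numbers straight out of a toy like this move around with the NICs' interrupt coalescing settings, so treat them as relative comparisons between switches rather than absolute latencies.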
> best regards > Lars > -- > Lars Henriksen | MESH-Technologies A/S > Systems Manager & Consultant | Lille Gr?br?drestr?de 1 > www.meshtechnologies.com | DK-5000 Odense C, Denmark > lars at meshtechnologies.com | mobile: +45 2291 2904 > direct: +45 6311 1187 | fax: +45 6311 1189 > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Unix Consulting joelja at darkwing.uoregon.edu GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Sat Mar 6 10:02:37 2004 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Sat, 06 Mar 2004 16:02:37 +0100 Subject: [Beowulf] SGEEE free and more platform offically supported In-Reply-To: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> References: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> Message-ID: <20040306160237D.hanzl@unknown-domain> > I used to think that SGE is free, but SGEEE (with more > advanced scheduling algorithms) is not. But it is not > true, both are free and open source. SGEEE is free and opensource but many many people did not know this. I thing this confusion made big harm to SGE project and I invested a lot of effort in clarifying this (Google "hanzl SGEEE" to see all that). > In SGE 6.0, there will be no "SGEEE mode", but the > default mode will have all the SGEEE functionality! Great, hope this will stop the confusion once for ever. Regards Vaclav Hanzl _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sat Mar 6 10:00:35 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 23:00:35 +0800 (CST) Subject: [Beowulf] SGEEE free and more platform offically supported In-Reply-To: <20040306160237D.hanzl@unknown-domain> Message-ID: <20040306150035.75079.qmail@web16806.mail.tpe.yahoo.com> --- hanzl at noel.feld.cvut.cz ????> > SGEEE is free and opensource but many many people > did not know this. I > thing this confusion made big harm to SGE project > and I invested a lot > of effort in clarifying this (Google "hanzl SGEEE" > to see all that). I think it is because Sun called it "Enterprise Edition" (EE), and when people think of Enterprise, they think of $$$. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From atp at piskorski.com Sat Mar 6 15:43:28 2004 From: atp at piskorski.com (Andrew Piskorski) Date: Sat, 6 Mar 2004 15:43:28 -0500 Subject: [Beowulf] DC powered clusters? Message-ID: <20040306204328.GA49615@piskorski.com> Some rackmount vendors now offer systems with a small DC-to-DC power supply for each node, with separate AC-DC rectifiers feeding power. 
I imagine the DC is probably at 48 V rather than 12 V or whatever, but often they don't even seem to say that, e.g.: http://rackable.com/products/dcpower.htm Has anyone OTHER than commercial rackmount vendors designed and built a cluster using such DC-to-DC power supplies? Is there detailed info on such anywhere on the web? Anybody have any idea exactly what components those vendors are using for their power systems, where they can be purchased (in small quantities), and/or how much they cost? I'm curious how the purchase and operating costs compare to the normal "stick a standard desktop AC-to-DC PSU in each node" approach, or even the hackish "wire on extra connectors and use one high quality desktop PSU to power 2 or 3 nodes" approach. The only DC-to-DC supplies I've seen on the web seem quite expensive, e.g.: http://www.rackmountpro.com/productsearch.cfm?catid=118 http://www.mini-box.com/power-faq.htm So I suspect the DC-to-DC approach would only ever make economic sense for large high-end clusters, those with unusual space or heat constraints, or the like. But I'm still curious about the details... -- Andrew Piskorski http://www.piskorski.com/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Mar 5 23:41:07 2004 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 05 Mar 2004 22:41:07 -0600 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: Message-ID: <40495663.7010507@tamu.edu> Caveats: 1. It's been a rough week. 2. I've got some specific opinions about 3Com hardware these days. I just ordered a 16 node cluster. I'm using the Foundry EdgeIron 24G as the basic switch. More than adequate backplane, pretty good small and large packet performance as tested with an Anritsu MD1230. Cost is expected to be about $3000, for the 24 port model. I'm getting 2, and have dual nics on the nodes, for some playing with channel bonding, and so that I've got a failover hot spare if/when one dies. Remember: Murphy was an optimist. For the record I don't expect the EdgeIron to die, but conversely (perversely?) I expect any and all network devices to die at the least opportune time! I didn't even consider 3Com. Didn't test it. The 3Com "gigabit" hardware I've seen recently in the LAN-space was usually capable of gig uplinks, but had trouble with congestion when gig and 100BaseT were mixed on the switch. HP had been OEM'ing Foundry. I'm not sure if that's still the case or if they went recently to someone else; my Foundry rep won't say, and I don't have a close HP rep. We have programmatically stayed away from Asante in our LAN operations here. That translates to no experience and no contacts. Sorry. Cluster should be in within a month, and so should the switches. I'll do some latency runs and report objective data. gerry Russell Nordquist wrote: > Does anyone have a recommendation for a good 24 port gige switch for > clustering? I know this issue has been discussed, but I didn't find any > actual manufacturer/models people like. Were not really looking at the > very high end models from Cisco, but I am wary of the many low end > switches on the market with regard to bisectional bandwidth issues. > > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other.
There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? > > thanks and have a good weekend. > russell > > > - - - - - - - - - - - - > Russell Nordquist > UNIX Systems Administrator > Geophysical Sciences Computing > http://geosci.uchicago.edu/computing > NSIT, University of Chicago > - - - - - - - - - - - > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Mail.Linux-Consulting.com Sun Mar 7 03:00:56 2004 From: alvin at Mail.Linux-Consulting.com (Alvin Oga) Date: Sun, 7 Mar 2004 00:00:56 -0800 (PST) Subject: [Beowulf] DC powered clusters? - fun In-Reply-To: <20040306204328.GA49615@piskorski.com> Message-ID: hi ya andrew fun stuff ... :-) good techie vitamins ;-) - lots of thinking of why it is the way it is vs what the real measure power consumption is On Sat, 6 Mar 2004, Andrew Piskorski wrote: > Some rackmount vendors now offer systems with a small DC-to-DC power > supply for each node, with separate AC-DC rectifiers feeding power. I > imagine the DC is probably at 48 V rather than 12 V or whatever, but > often they don't even seem to ay that, e.g.: > > http://rackable.com/products/dcpower.htm i don't like that they claim "back-to-back rackmounts" is their "patented technology" ... geez ... - anybody can mount a generic 1U in the rack .. one in the front and one in the back ( other side ) ... ( obviously the 1U chassis cannot be too deep ) > Has anyone OTHER than commercial rackmount vendors designed and built > a cluster using such DC-to-DC power supplies? Is there detailed info > on such anywhere on the web? dc-dc power supplies are made literally and figuratively by the million various combination of voltage, current capacity and footprint http://www.Linux-1U.net/PowerSupp ( see the list of various power supply manufacturers ) > Anybody have any idea exactly what components those vendors are using > for their power systems, where they can be purchased (in small > quantities), and/or how much they cost? you can buy any size dc-dc power supplies from $1.oo to the thousands if you want the dc-dc power supply to have atx output capabilities, than you have 2 or 3 choice of dc-atx output power supplies: - mini-box.com ( and they have a few resellers ) - there's a power supply company that also did a variation of mini-box.com's design ... i cant find the orig url at this time http://www.dc2dc.com is a resller of the "other option" - probably a bunch of power supp working on dc-atx convertors > The only DC-to-DC supplies I've seen on the web seem quite expensive, > e.g.: > > http://www.rackmountpro.com/productsearch.cfm?catid=118 99% of the rackmount vendors are just reselling (adding $$$ to ) a power supply manufacturer's power supply ... 
- you can save a good chunk of change by buying direct from the generic power supply OEM distributors - somtimes as much or mroe than 50% cost savings of the cost of the power supply > http://www.mini-box.com/power-faq.htm most of their data are measured data per their test setups and more info about dc-dc stuff http://www.via.com.tw/en/VInternet/power.pdf see the rest of the +12v DC input "atx power supply" vendors http://www.Linux-1U.net/PowerSupp/DC/ http://www.Linux-1U.net/PowerSupp/12v/ ( +12v at up to 500A or more ) > So I suspect the DC-to-DC approach would only ever make economic sense > for large high-end clusters, those with unusual space or heat > constraints, or the like. But I'm still curious about the details... dc-atx power supply makes sense when: - power supply heat and airflow is a problem or you dont like having too many power cords ( 400 cords vs 40 in a rack ) - simple cabling is a big problem ( rats nest ) - you want to reduce the costs of the system by throwing away un-used power supply capacity that is available with the traditional one power supply per 1 motherboard and peripherals - most power supplies used are used for maximum supported load (NOT a motherboard + cpu + disk + mem only) - you have a huge airconditioning bill problem - that should motivate you to find and test a system with "less heat generated solutions" - your cluster only needs to have enough power for the cpu + 1disk - you have a space consideration problems - dc-atx power supply allows 420 cpus per 42U rack and up to 840 cpus for front and back loaded cluster - on and on ... for a typical 4U-8U height blade clusters ( 10 blades ) - you only need one 600-800W atx power supply to drive the 10 mini-itx or flex-atx blades - cpu is 25W ?? motherboard is 25W ... - disks need 1A at 12v to spin up.. normal operation current is 80ma at 12v ... etc .. per disk specs - how you want to do power calculations is the trick 10 full-tower system with a 450W power does NOT imply you';re using 4500W of power for 10 systems :-) have fun alvin http://www.1U-ITX.net 100TB - 200TB of disks per 42U racks ?? -- even more fun http://www.itx-blades.net/1U-Blades ( blades are with mini-box.com's dc-dc atx power supply ) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mayank_kaushik at vsnl.net Sun Mar 7 03:29:42 2004 From: mayank_kaushik at vsnl.net (mayank_kaushik at vsnl.net) Date: Sun, 07 Mar 2004 13:29:42 +0500 Subject: [Beowulf] PVM says `PVM_ROOT not set..cant start pvmd` on remote computer Message-ID: <69bcf6b69beb90.69beb9069bcf6b@vsnl.net> hi... im trying to make a two-machine PVM virtual machine. but im having problems with PVM. the names of the two machines are "mayank" and "manish".."mayank" runs fedora core 1, "manish" runs red hat linux -9..both are part of a simple 100mbps LAN, connected by a 100mbps switch. iv *disabled* the firewall on both machines. iv installed pvm-3.4.4-14 on both machines. 
the problem is: when i try to add "mayank" to the virtual machine from "manish" using "add mayank", pvm is unable to do so..gives an error message "cant start pvmd"..then it tries to diagnose what went wrong..it passes all tests but one -- says "PVM_ROOT" is set to "" on the target machine ("mayank")...but thats ABSURD..iv checked a mill-yun times, the said variable is correctly set..when i ssh to mayank from manish, and then echo $PVM_ROOT , i get the correct answer... plz note that im using ssh instead of rsh, by changing the variable PVM_RSH=/usr/bin/ssh..since im more comfortable with ssh... but when i try the opposite--adding "manish" to the virtual machine from "mayank" runnnig fedora..it works! furthermore....before i installed fedora core 1 on mayank, it too had red hat 9..and then i was getting the same problem from BOTH machines..but after installing fedora on mayank, things began to work from that end. what going on??? (apart from me whos going nuts) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Sun Mar 7 11:10:20 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Sun, 7 Mar 2004 08:10:20 -0800 (PST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: <40495663.7010507@tamu.edu> Message-ID: Does anyone have experience with Dell's new 2624 unmanaged 24 port gigE switch? It's only about $330, around a 1/10 the cost of the managed switches. >From what I've read, the Dell/Linksys 5224 managed gigE switch is good. It could be that the unmanaged switch uses the exact same Broadcom switch chips, but just doesn't have management. On Fri, 5 Mar 2004, Gerry Creager N5JXS wrote: > expected to be about $3000, for the 24 port model. I'm getting 2, and > have dual nics on the nodes, for some playing with channel bonding, and Last I heard, the interrupt mitigation on gigE cards messes up channel bonding for extra bandwidth. The packets arrive in batches out of order, and Linux's TCP/IP stack doesn't like this, so you get less bandwidth with two cards than you would with just one. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sun Mar 7 17:13:26 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sun, 7 Mar 2004 17:13:26 -0500 (EST) Subject: [Beowulf] PVM says `PVM_ROOT not set..cant start pvmd` on remote computer In-Reply-To: <69bcf6b69beb90.69beb9069bcf6b@vsnl.net> Message-ID: On Sun, 7 Mar 2004 mayank_kaushik at vsnl.net wrote: > hi... > > > im trying to make a two-machine PVM virtual machine. but im having problems with PVM. > the names of the two machines are "mayank" and "manish".."mayank" runs fedora core 1, "manish" runs red hat linux -9..both are part of a simple 100mbps LAN, connected by a 100mbps switch. > iv *disabled* the firewall on both machines. > > iv installed pvm-3.4.4-14 on both machines. 
> the problem is: > when i try to add "mayank" to the virtual machine from "manish" using > "add mayank", pvm is unable to do so..gives an error message "cant start > pvmd"..then it tries to diagnose what went wrong..it passes all tests > but one -- says "PVM_ROOT" is set to "" on the target machine > ("mayank")...but thats ABSURD..iv checked a mill-yun times, the said > variable is correctly set..when i ssh to mayank from manish, and then > echo $PVM_ROOT , i get the correct answer... This COULD be associated with the order things like .bash_profile and so forth are run for interactive shells vs login shells. If you are setting PVM_ROOT in .bash_profile (so it would be correct on a login) be sure to ALSO set it in .bashrc so that it is set for the remote shell likely used to start PVM. I haven't looked at the fedora RPM so I don't know if /usr/bin/pvm is still a script that sets this variable for you anyway. > plz note that im using ssh instead of rsh, by changing the variable > PVM_RSH=/usr/bin/ssh..since im more comfortable with ssh... Me too. ssh also has a very nice feature that permits an environment to be set on the remote machine for non-interactive remote commands that CAN be useful for PVM, although I think the stuff above might fix it. > but when i try the opposite--adding "manish" to the virtual machine > from "mayank" runnnig fedora..it works! > furthermore....before i installed fedora core 1 on mayank, it too had > red hat 9..and then i was getting the same problem from BOTH > machines..but after installing fedora on mayank, things began to work > from that end. I've encountered a similar problem only once, trying to add nodes FROM a wireless laptop. Didn't work. Adding the wireless laptop from anywhere else worked fine, all systems RH 9 and clean (new) installs from RPM of pvm, I explicitly set PVM_ROOT and PVM_RSH when logging in. PVM_ROOT is additionally set (correctly) by the /usr/bin/pvm command, which is really a shell. > what going on??? (apart from me whos going nuts) Try checking your environment to make sure it is set for both a remote command: ssh mayank echo "\$PVM_ROOT" and in a remote login: ssh mayank $ echo "$PVM_ROOT" rgb > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sunyy_2004 at hotmail.com Mon Mar 8 11:33:18 2004 From: sunyy_2004 at hotmail.com (Yiyang Sun) Date: Tue, 09 Mar 2004 00:33:18 +0800 Subject: [Beowulf] Relation between Marvell Yukon Controller and SysKonnect GbE Adapters Message-ID: Hi, Beowulf users, We're going to setup a small cluster. The motherboard we ordered is the newly released Gigabyte GA-8IPE1000-G which integrates Marvell's Yukon 8001 GbE Controller. I tried to find the Linux driver for this controller on Google and was directed to SysKonnect's website http://www.syskonnect.com/syskonnect/support/driver/d0102_driver.html which provides a driver for Marvell Yukon/SysKonnect SK-98xx Gigabit Ethernet Adapters. 
However, there is no explicit indication on this website that SysKonnect's adapters use Marvell's chips. Does any here have experience using Marvell's controllers? Is it easy to install Yukon 8001 on Linux? Thanks! Yiyang _________________________________________________________________ Get MSN Hotmail alerts on your mobile. http://en-asiasms.mobile.msn.com/ac.aspx?cid=1002 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Mon Mar 8 14:44:50 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 8 Mar 2004 14:44:50 -0500 (EST) Subject: [Beowulf] Re: beowulf In-Reply-To: <20040308184024.955.qmail@web21501.mail.yahoo.com> Message-ID: On Mon, 8 Mar 2004, prakash borade wrote: > how should i proceed for a client which takes dta from 5 servers > reoetadly after every 15 seconds > i get the data but it prints the garbage value > > what can be the problem i am usiung sockets on redhat 9 > > i am creting new sockets for it every time on clien side Dear Prakash, There is such a dazzling array of possible problems with your code that (not being psychic) I cannot possibly help you. For example -- You could be printing an integer as a float without a cast (purely misusing printf). Or vice versa. I do this all the time; it is a common mistake. You could be sending the data on a bigendian system, receiving it and trying to print it on a littleendian system. You could have a trivial offset wrong in your receive buffers -- printing an integer (for example) starting a byte in and overlapping some other data in your stack would yield garbage. You could have a serious problem with your read algorithm. Reading reliably from a socket is not trivial. I use a routine that I developed over a fairly long time and it STILL has bugs that surface. The reading/writing are fundamentally asynchronous, and a read can easily leave data behind in the socket buffer (so that what IS read is garbage). ...and this is the tip of an immense iceberg of possible programming errors. The best way to proceed to write network code is to a) start with a working template of networking/socket code. There are examples in a number of texts, for example, as well as lots of socket-based applications. Pick a template, get it working. b) SLOWLY and GENTLY change your working template into your application, ensuring that the networking component never breaks at intermediary revisions. or c) learn, slowly, surely, and by making many mistakes, to write socket code from scratch without using a template. Me, I use a template. rgb P.S. to get more help, you're really going to have to provide a LOT more detail than this. Possibly including the actual source code. -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Mon Mar 8 14:54:40 2004 From: beowulf at studio26.be (Maikel Punie) Date: Mon, 8 Mar 2004 20:54:40 +0100 Subject: [Beowulf] Cluster school project Message-ID: hi, I need to make a smaal beowulf cluster for a school project i have like 2 months for this stuff, but i need to make my own task asignment. 
So basicly what do you guys think that would be possible to realize in 2 months time? The only thing they told me, is that the nodes must be discless systems. any ideas about what could be donne in 2 months. Maikel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Mon Mar 8 16:03:41 2004 From: beowulf at studio26.be (Maikel Punie) Date: Mon, 8 Mar 2004 22:03:41 +0100 Subject: [Beowulf] Cluster school project In-Reply-To: Message-ID: hmm, ok, maybe i explained badly, at the moment i just need to create a project discryption on what would be possible to realize in 2 months, and off course i could use the cluster knoppix, but then its not a real project anymore, then its just an install task. also the openmosix structure is it using diskless nodes? or what because i can't find a lot off info about it. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. Well its the whole other part off the country, but yeah it was a great conference i was there to :) Thanks Miakle -----Oorspronkelijk bericht----- Van: John Hearns [mailto:john.hearns at clustervision.com] Verzonden: maandag 8 maart 2004 21:52 Aan: Maikel Punie CC: Beowul-f Mailing lists Onderwerp: Re: [Beowulf] Cluster school project On Mon, 8 Mar 2004, Maikel Punie wrote: > > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > Maikel, first you need the computers! Then you should first look at ClusterKnoppix http://bofh.be/clusterknoppix/ Once you have that running, come back and tell us how you got on. We'll help you do more then. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Mon Mar 8 15:51:58 2004 From: john.hearns at clustervision.com (John Hearns) Date: Mon, 8 Mar 2004 21:51:58 +0100 (CET) Subject: [Beowulf] Cluster school project In-Reply-To: Message-ID: On Mon, 8 Mar 2004, Maikel Punie wrote: > > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > Maikel, first you need the computers! Then you should first look at ClusterKnoppix http://bofh.be/clusterknoppix/ Once you have that running, come back and tell us how you got on. We'll help you do more then. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. 
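For what it's worth, a classic first program for a school cluster of this kind is the "estimate pi" toy that comes up later in this thread. Below is a minimal sketch in C with MPI (the file name and interval count are made up for illustration; any MPI implementation, such as the MPICH mentioned elsewhere on the list, should build it with its mpicc wrapper and run it with mpirun):

/* pi_mpi.c -- tiny example of the kind of program a two-month diskless
 * cluster project could start from (hypothetical name; any MPI such as
 * MPICH or LAM should work).  Estimates pi by integrating 4/(1+x^2) over
 * [0,1], with the intervals divided among the processes.
 * Build/run: mpicc pi_mpi.c -o pi_mpi ; mpirun -np 4 ./pi_mpi
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    long i, n = 10000000;           /* number of integration intervals */
    double h, x, local = 0.0, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Rank 0 owns the parameter; everyone else gets it in one broadcast. */
    MPI_Bcast(&n, 1, MPI_LONG, 0, MPI_COMM_WORLD);

    h = 1.0 / (double)n;
    /* Each rank sums every size-th interval; no other communication
     * happens until the single reduction at the end. */
    for (i = rank; i < n; i += size) {
        x = h * ((double)i + 0.5);
        local += 4.0 / (1.0 + x * x);
    }
    local *= h;

    /* Combine the partial sums on rank 0. */
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi is approximately %.16f with %ld intervals on %d processes\n",
               pi, n, size);

    MPI_Finalize();
    return 0;
}

The single MPI_Bcast and single MPI_Reduce are the only communication, so it scales reasonably even over plain fast Ethernet; treat it as a starting point for a project description, not a finished project.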
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mprinkey at aeolusresearch.com Mon Mar 8 14:39:44 2004 From: mprinkey at aeolusresearch.com (Michael T. Prinkey) Date: Mon, 8 Mar 2004 14:39:44 -0500 (EST) Subject: [Beowulf] e1000 performance Message-ID: Hello everyone, I am building a small cluster that uses Tyan S2723GNN motherboards that include an integrated Intel e1000 gigabit NIC. I have installed two Netgear 302T gigabit cards in the 66 MHz slots as well. With point-to-point links, I can get a very respectable 890 Mbps with the tg3 cards, but the e1000 lags significantly at 300 to 450 Mbps. I am using the NAPI e1000 driver in the 2.4.24 kernel. I have tried the following measures without any improvement: - changed the tcp_mem,_wmem,_rmem to larger values. - increased the MTU to values >1500. - reniced the ksoftirq processes to 0. The 2.4.24 kernel contains the 4.x version of the e1000. I plan to try the 5.x version this evening. Also, want to try increasing the Txqueuelen as well. Has anyone had similar experience with these embedded e1000s? Googling leads me to several sites like this one: http://www.hep.ucl.ac.uk/~ytl/tcpip/tuning/ that seem to indicate that I should expect much more from the e1000. Any help here is welcome? Thanks, Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Mon Mar 8 16:59:59 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Mon, 8 Mar 2004 13:59:59 -0800 (PST) Subject: [Beowulf] e1000 performance In-Reply-To: Message-ID: On Mon, 8 Mar 2004, Michael T. Prinkey wrote: > I am building a small cluster that uses Tyan S2723GNN motherboards that > include an integrated Intel e1000 gigabit NIC. I have installed two >From a supermicro X5DPL-iGM (E7501 chipset) with onboard e1000 to supermicro E7500 board with an e1000 PCI-X gigabit card, via a dell 5224 switch. The E7501 board has a 3ware 8506 card on the same PCI-X bus as the e1000 chip, so it's running at 64/66. The PCI-X card is running at 133 MHz. TCP STREAM TEST to duet Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131070 131070 1472 9.99 940.86 Kernel versions are 2.4.20 (PCI-X card) and 2.4.22-pre2 (the onboard chip). 2.4.20 has driver 4.4.12-k1, while 2.4.22-pre2 has driver 5.1.11-k1. The old e1000 driver has a very useful proc file in /proc/net/PRO_LAN_Adapters that gives all kind of information. I have RX checksum on and flow control turned on. The newer driver doesn't have this information. > the NAPI e1000 driver in the 2.4.24 kernel. I have tried the following NAPI? > measures without any improvement: I've done nothing wrt gigabit performance, other than turn on flow control. I found that without flowcontrol, tcp connections to 100 mbit hosts would hang. 
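A related knob on the application side: the 131070-byte "Socket Size" numbers in the netperf output above are per-socket buffer sizes, and a test program can ask for larger ones itself rather than relying only on the tcp_rmem/tcp_wmem sysctls. A generic sketch follows (not taken from netperf or the e1000 driver); note that Linux silently caps the request at net.core.rmem_max / net.core.wmem_max, and getsockopt() then reports back roughly twice the value granted, a kernel bookkeeping quirk:

/* sockbuf.c -- generic sketch of requesting larger per-socket buffers from
 * an application, the per-socket counterpart of the tcp_rmem/tcp_wmem
 * sysctl tuning mentioned above.  Hypothetical example, not library code.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int requested = 512 * 1024;          /* ask for 512 KB, for example */
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (fd < 0) { perror("socket"); return 1; }

    /* Ask for bigger send and receive buffers before connect()/listen(). */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof(requested)) < 0)
        perror("SO_RCVBUF");
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &requested, sizeof(requested)) < 0)
        perror("SO_SNDBUF");

    /* Read back what the kernel actually granted. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) == 0)
        printf("asked for %d bytes of receive buffer, got %d\n",
               requested, actual);

    close(fd);
    return 0;
}

Whether bigger buffers help at gigabit speeds depends on the latency-bandwidth product of the path; on a short LAN hop the defaults are often already adequate, which is why the driver-level issues discussed above usually matter more.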
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mnerren at paracel.com Mon Mar 8 17:31:55 2004 From: mnerren at paracel.com (micah nerren) Date: Mon, 08 Mar 2004 14:31:55 -0800 Subject: [Beowulf] Cluster school project In-Reply-To: References: Message-ID: <1078785115.30523.89.camel@angmar> On Mon, 2004-03-08 at 11:54, Maikel Punie wrote: > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > > any ideas about what could be donne in 2 months. > > Maikel To actually build a small (or large!) beowulf of discless systems is pretty easy, I guess the hardest part will be determining what the purpose of the cluster will be. What type of code will be running on it? They will basically be network booting a kernel and mounting an nfs filesystem. Research these aspects, and research what kind of tools you want to have on the cluster, ie. distributed shell, monitoring, mpi, etc. 2 months should be plenty, you should be able to get a basic small beowulf up and running in 2 hours once you know what to do and how to set it up. Time to fire up google and start researching beowulf's and diskless booting. There is a lot of good info out there. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Mon Mar 8 17:15:46 2004 From: becker at scyld.com (Donald Becker) Date: Mon, 8 Mar 2004 17:15:46 -0500 (EST) Subject: [Beowulf] BWBUG Greenbelt: Intel HPC and Grid, Beowulf Clusters Message-ID: Special notes: This month's meeting is in Greenbelt Maryland, not Virginia! From pre-registration we expect a full room, so please register on line at http://bwbug.org and show up at least 15 minutes early. Title: Intel's Perspective on Beowulf's Clusters Speaker: Stephen Wheat Ph.D This talk will review Intel's perspective on technology trends and transitions in this decade. The focus will be on bringing the latest technology to the scientists' labs in the shortest amount of time. The technologies reviewed will include processors, chipsets, I/O, systems management, and software tools. Come with your questions; the presentation is designed to be interactive. Date: March 9, 2004 Time: 3:00 PM (doors open at 2:30) Location: Northrop Grumman IT 7501 Greenway Center Drive (Intersection of BW Parkway and DC beltway) Suite 1200 (12th floor) Greenbelt Maryland Need to be a member?: No ( guests are welcome ) Parking: Free As usual there will be door prizes, food and refreshments. From: "Fitzmaurice, Michael" Dr. Wheat from Intel must be a popular speaker we have a big turn out expected. If you have not registered yet please do so. We may need to plan for extra chairs and we need to predict how many pizzas to order. This would be great meeting to invite a friend or your boss. It may be crowded, therefore, getting there a little early is recommended. This event is sponsored by the Baltimore-Washington Beowulf Users Group (BWBUG) Please register on line at http://bwbug.org As usual there will be door prizes, food and refreshments. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From nathan at iwantka.com Mon Mar 8 18:21:15 2004 From: nathan at iwantka.com (Nathan Littlepage) Date: Mon, 8 Mar 2004 17:21:15 -0600 Subject: [Beowulf] SCTP Message-ID: <00d701c40564$1d21a830$6c45a8c0@ntbrt.bigrivertelephone.com> Has anyone looked into incorporating SCTP in the cluster environment? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Mon Mar 8 20:44:52 2004 From: becker at scyld.com (Donald Becker) Date: Mon, 8 Mar 2004 20:44:52 -0500 (EST) Subject: [Beowulf] SCTP In-Reply-To: <00d701c40564$1d21a830$6c45a8c0@ntbrt.bigrivertelephone.com> Message-ID: On Mon, 8 Mar 2004, Nathan Littlepage wrote: > Has anyone looked into incorporating SCTP in the cluster environment? What advantage would it provide for a SAN- or LAN-based cluster? Not that TCP is especially light-weight. TCP implementations are WAN-oriented and have increasingly costly features (look at the CPU cost of iptables/ipchains) and defenses against spoofing (TCP stream start-up is much more costly than the early BSD implementations). The only reason SCTP would be a better cluster protocol is that it hasn't yet accumulated the cruft ("features") of a typical TCP stack. But if it became popular, that would change pretty much instantly. -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 914 Bay Ridge Road, Suite 220 Scyld Beowulf cluster systems Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rdn at uchicago.edu Mon Mar 8 23:40:01 2004 From: rdn at uchicago.edu (Russell Nordquist) Date: Mon, 8 Mar 2004 22:40:01 -0600 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: <1078566922.2547.6.camel@fermi> Message-ID: <20040308224001.50f2f728@vitalstatistix> thanks for all the good info. it got me to thinking....i have resources for comparing most components of a cluster excepts network switches. it would be nice to have a source of information for this as well. something like: *bandwidth/latency between 2 hosts *bandwidth/latency at 25%/50%/75%/100% port usage *short vs long message comparisons great so far, but what about the issues: *what SW to use for the benchmark. perhaps netpipe? *the NICS used will make a difference. how does one account for the difference between a realtec and syskonnect chipset, bus speeds, etc? *do we have enough variation of cluster sizes and HW to make a useful repository? *and i'm sure there's more Is this feasible? Is it a case where any info is useful even if it is not very reliable/accurate? With more MB's coming with decent gige on board there will be a greater chance the the difference between to setups will only be the switch. so, is this a worthwhile are useful project for the community? or are there to many variables to make the results useful? 
russell -- - - - - - - - - - - - - Russell Nordquist UNIX Systems Administrator Geophysical Sciences Computing http://geosci.uchicago.edu/computing NSIT, University of Chicago - - - - - - - - - - - _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Tue Mar 9 12:45:47 2004 From: beowulf at studio26.be (Maikel Punie) Date: Tue, 9 Mar 2004 18:45:47 +0100 Subject: [Beowulf] Cluster school project In-Reply-To: <644D9337A02FC24689647BF9E48EC39E08ABB797@drm556> Message-ID: >> ok, maybe i explained badly, at the moment i just need to create a project >> discryption on what would be possible to realize in 2 months, and off course >Do you mean a computing/programming project could you do, >like calculating pi to some large number of digits? yeah something like that, i realy have no idea what is possible. if there are any suggestions, they are always welcome. Maikel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From paulojjs at bragatel.pt Tue Mar 9 04:15:05 2004 From: paulojjs at bragatel.pt (Paulo Silva) Date: Tue, 09 Mar 2004 09:15:05 +0000 Subject: [Beowulf] How to choose an UPS for a Beowulf cluster Message-ID: <1078823704.1882.33.camel@blackTiger> Hi, I'm building a small Beowulf cluster for HPC (about 16 nodes) and I need some advices on choosing the right UPS. The UPS should be able to signal the central node when the battery reaches some level (I think this is common usage) and it should be able to turn itself off before running out of battery (I was told that this extends the life of the battery). 10 minutes of runtime sould be enough. I was looking in the APC site but I was rather confused by all the models available. Can anyone give me some advice on the type of device to choose? Thanks for any tip -- Paulo Jorge Jesus Silva perl -we 'print "paulojjs".reverse "\ntp.letagarb@"' If a guru falls in the forest with no one to hear him, was he really a guru at all? -- Strange de Jim, "The Metasexuals" -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Esta ? uma parte de mensagem assinada digitalmente URL: From brichard at clusterworldexpo.com Tue Mar 9 13:45:15 2004 From: brichard at clusterworldexpo.com (Bryan Richard) Date: Tue, 9 Mar 2004 13:45:15 -0500 Subject: [Beowulf] Join Don Becker and Thomas Sterling at ClusterWorld Conference & Expo Message-ID: <20040309184515.GB47601@clusterworldexpo.com> ClusterWorld Conference & Expo welcomes Scyld's Don Becker and Keynote Thomas Sterling to the program! If you work in Beowulf and clusters, you can't miss the following program events: - Donald Becker, Scyld Computing Corporation: "Scyld Beowulf Introductory Workshop" - Donald Becker, Scyld Computing Corporation: "Scyld Beowulf Advanced Workshop" - Thomas Sterling, California Institute of Technology: "Beowulf Cluster Computing a Decade of Accomplishment, a Decade of Challenge" PLUS, ClusterWorld's exciting program of intensive tutorials, special events, and expert presentations in 8 vertical industry tracks: Applications, Automotive & Aerospace Engineering, Bioinformatics, Digital Content Creation, Grid, Finance, Petroleum & Geophysical Exploration, and Systems. 
A Special Offer for Beowulf Members =================================== Beowulf.org members get 20% off registration prices when registering online! You MUST use your special Priority Code - BEOW -- when registering online to receive your 20% discount! Online registration ends March 31, 2004 so don't delay! Just go to http://www.clusterworldexpo.com and click on "REGISTER NOW!" to fill out our quick enrollment form. Associations, Universities and Labs Get 50% off Registration ============================================================ Students and employees of universities, associations, and government labs are eligible for 50% off ClusterWorld registration! This offer is only available via fax or mail. Please log on to www.clusterworldexpo.com and click on "Register Now" to download registration PDFs. Or call 415-321-3062 for more information A TERRIFIC PROGRAM ================== At ClusterWorld Conference & Expo, you will: * LEARN from top clustering experts in our extensive conference program. * EXPERIENCE the latest cluster technology from the top vendors on our expo floor. * MEET AND NETWORK with colleagues from across the world of clustering at our social events and parties. Keynotes: - Ian Foster, Argonne National Laboratory, University of Chicago, Globus Alliance, and co-author of "The Grid: Blueprint for a New Computing Infrastructure", - Thomas Sterling, California Institute of Technology, author of "How to Build a Beowulf," and co-author of "Enabling Technologies for Petaflops Computing". - Andrew Mendelsohn, Senior Vice President, Database & Application Server Technology, Oracle Corporation - David Kuck, Intel Fellow, Manager, Software and Solutions Group, Intel Corporation Want to know which sessions are getting the biggest buzz? Click on http://www.clusterworldexpo.com/SessionSpotlight for a list of highlights by Technical Session Track. REGISTER TODAY! ClusterWorld Conference and Expo April 5 - 8, 2004 San Jose Convention Center San Jose, California http://www.clusterworldexpo.com ClusterWorld Conference & Expo Sponsors ======================================= Platinum: Oracle Corporation, Intel Corporation Gold: AMD, Dell, Hewlett Packard, Linux Networx, Mountain View Data, Panasas, Penguin Computing, and RLX Technologies Silver: Appro, Engineered Intelligence, Microway, NEC, Platform Computing, and PolyServe Media & Association Sponsors: Bioinformatics.org, ClusterWorld Magazine, Distributed Systems Online, Dr. 
Dobbs Journal, Gelato Federation, Global Grid Forum, GlobusWorld, LinuxHPC, Linux Magazine, PR Newswire, Storage Management, and SysAdmin Magazine _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wseas at canada.com Tue Mar 9 12:25:56 2004 From: wseas at canada.com (WSEAS newsletter in mechanical engineering) Date: Tue, 9 Mar 2004 19:25:56 +0200 Subject: [Beowulf] WSEAS and IASME newsletter in mechanical engineering, March 9, 2004 Message-ID: <3FE20F4000220BB2@fesscrpp1.tellas.gr> (added by postmaster@fesscrpp1.tellas.gr) If you want to contact us, the Subject of your email must contains the code: WSEAS CALL FOR PAPERS -- CALL FOR REVIEWERS -- CALL FOR SPECIAL SESSIONS http://www.wseas.org IASME / WSEAS International Conference on "FLUID MECHANICS" (FLUIDS 2004) August 17-19, Corfu Island, Greece The papers of this conference will be published: (a) as regular papers in the IASME/WSEAS conference proceedings (b) regular papers in the IASME TRANSACTIONS ON MECHANICAL ENGINEERING http://www.wseas.org REGISTRATION FEES: 250 EUR DEADLINE: APRIL 10, 2004 ACCOMODATION: Incredible low prices in a 5 Star Sea Resort (former HILTON of Corfu Island), Greece, 5 Star Sea resort where the multiconference of WSEAS will take place in August 2004: 51 EUR in double room and 81 EUR in single room. (in August 2004, in the Capital of Greece, Athens, the 2004 Olympic Games will take place) ---> Sponsored by IASME <---- Topics of FLUIDS 2004 Mathematical Modelling in fluid mechanics Simulation in fluid mechanics Numerical methods in fluid mechanics Convection, heat and mass transfer Experimental Methodologies in fluid mechanics Thin film technologies Multiphase flow Boundary layer flow Material properties Fluid structure interaction Hydrotechnology Hydrodynamics Coastal and estuarial modelling Wave modelling Industrial applications Environmental Problems Air Pollution Problems Fluid Mechanics for Civil Engineering Fluid Mechanics in Geosciences Flow visualisation Biofluids Meteorology Waste Management Environmental protection Management of living resources Mathematical models Management of Rivers and Lakes Underwater Ecology Hydrology Oceanology Ocean Engineering Others INTERNATIONAL SCIENTIFIC COMMITTEE Andrei Fedorov (USA) A. C. Baytas (Turkey) Albert R. George (USA) Alexander I. Leontiev (Russia) Andreas Dillmann (Germany) Bruce Caswell (USA) Chris Swan (UK) David A. Caughey (USA) Derek B Ingham (UK) Donatien Njomo (CM) Dong Chen (Australia) Dong-Ryul Lee (Korea) Edward E. Anderson (USA) G. Gaiser (Germany) G.D. Raithby (Canada) Gad Hetsroni (Israel) H. Beir?o da Veiga (Italy) Ingegerd Sjfholm (Sweden) Jerry R. Dunn (USA) Joseph T. C. Liu (USA) Karl B?hler (Germany) Kenneth S. Breuer (USA) Kumar K. Tamma (USA) Kyungkeun Kang (USA) M. A. Hossain (UK) M. F. El-Amin (USA) M.-Y. Wen (Taiwan) Michiel Nijemeisland (USA) Ming-C. Chyu (USA) Naoto Tanaka (Japan) Natalia V. Medvetskaya (Russia) O. Liungman (Sweden) Philip Marcus (USA) Pradip Majumdar (USA) Rama Subba Reddy Gorla (USA) Robert Nerem (USA) Rod Sobey (UK) Ruairi Maciver (UK) S.M.Ghiaasiaan (USA) Stanley Berger (USA) Tak?o Takahashi (France) Vassilis Gekas (Sweden) Yinping Zhang (China) Yoshitaka Watanabe (Japan) NOTE THAT IN WSEAS CONFERENCES YOU CAN HAVE PROCEEDINGS 1) HARD COPY 2) CD-ROM and 3) Web Publishing WSEAS Books, Journals, Proceedings participate now in all major science citation indexes. 
ISI, ELSEVIER, CSA, AMS. Mathematical Reviews, ELP, NLG, Engineering Index Directory of Published Proceedings, INSPEC (IEE) More Details: http://www.wseas.org Thanks Alexis Espen ##### HOW TO UNSUBSCRIBE #### You receive this newsletter from your email address: beowulf at beowulf.org If you want to unsubscribe, send an email to: wseas at canada.com The Subject of your message must be exactly: REMOVE beowulf at beowulf.org WSEAS If you want to unsubscribe more than one email addresses, send a message to nata at wseas.org with Subject: REMOVE [email1, emal2, ...., emailn] WSEAS _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From michael.worsham at mci.com Tue Mar 9 13:32:14 2004 From: michael.worsham at mci.com (Michael Worsham) Date: Tue, 09 Mar 2004 13:32:14 -0500 Subject: [Beowulf] Cluster school project Message-ID: <000f01c40604$d8ef6520$987a32a6@Wcomnet.com> I would say also check out the Bootable Cluster CD (http://bccd.cs.uni.edu/) as well. It is very easy to use and was specifically designed so you could cluster an entire network lab, without having to worry about the hard drives being written to. -- Michael _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Tue Mar 9 16:13:24 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Tue, 9 Mar 2004 13:13:24 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? Message-ID: Has anyone with dual opteron machines and a kill-a-watt measured how much power they consume? I measured the dual P3 and xeons we have here, but no dual opterons yet. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Tue Mar 9 17:36:05 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Tue, 9 Mar 2004 14:36:05 -0800 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <20040309223605.GA29912@cse.ucdavis.edu> On Tue, Mar 09, 2004 at 01:13:24PM -0800, Trent Piepho wrote: > Has anyone with dual opteron machines and a kill-a-watt measured how much > power they consume? > > I measured the dual P3 and xeons we have here, but no dual opterons yet. I recently measured a Sunfire V20z (dual 2.2 GHz) opteron, I believe it had 2 scsi disks, 4 GB ram. watts VA Idle 237-249 260-281 Pstream 1 thread 260-277 290-311 Pstream 2 threads 265-280 303-313 Pstream is very much like McCalpin's stream, except it uses pthreads 2 run parallel threads in sync, and it runs over a range of array sizes. It's the most power intensive application I've found, anything with heave disk usage tends to decrease the power usage. It's also great for showing memory system parallelism, say for a dual p4 vs opteron. I also find it useful for finding misconfigured dual opterons. 
For those interested: http://cse.ucdavis.edu/bill/pstream.c -- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Tue Mar 9 17:49:14 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Tue, 9 Mar 2004 14:49:14 -0800 Subject: [Beowulf] good 24 port gige switch In-Reply-To: <20040308224001.50f2f728@vitalstatistix> References: <1078566922.2547.6.camel@fermi> <20040308224001.50f2f728@vitalstatistix> Message-ID: <20040309224914.GB29912@cse.ucdavis.edu> On Mon, Mar 08, 2004 at 10:40:01PM -0600, Russell Nordquist wrote: > > thanks for all the good info. it got me to thinking....i have resources > for comparing most components of a cluster excepts network switches. it > would be nice to have a source of information for this as well. > something like: > > *bandwidth/latency between 2 hosts > *bandwidth/latency at 25%/50%/75%/100% port usage > *short vs long message comparisons I use nrelay.c a small simple program I wrote that will MPI_Send MPI_send very size packets between sets of nodes. So I do something like the following to find best base latency and bandwidth: mpirun -np 2 ./nrelay 1 # then run with 10 100 1000 10000 size = 1, 2 nodes in 2.97 sec ( 5.7 us/hop) 690 KB/sec size= 10, 524288 hops, 2 nodes in 3.06 sec ( 5.8 us/hop) 6688 KB/sec size= 100, 524288 hops, 2 nodes in 4.19 sec ( 8.0 us/hop) 48868 KB/sec size= 1000, 524288 hops, 2 nodes in 15.37 sec ( 29.3 us/hop) 133267 KB/sec size= 10000, 524288 hops, 2 nodes in 40.72 sec ( 77.7 us/hop) 502908 KB/sec So we have an interconnect that manages 5.8 us for small messages and 500 MB/sec or so for large (10000 MPI_INTs). Then I run: mpirun -np 2,4,8,16,32,64 ./nrelay 10000 size= 10000, 524288 hops, 2 nodes in 40.72 sec ( 77.7 us/hop) 502908 KB/sec size= 10000, 524288 hops, 4 nodes in 39.79 sec ( 75.9 us/hop) 514698 KB/sec size= 10000, 524288 hops, 8 nodes in 39.21 sec ( 74.8 us/hop) 522253 KB/sec size= 10000, 524288 hops, 16 nodes in 45.53 sec ( 86.8 us/hop) 449772 KB/sec size= 10000, 524288 hops, 32 nodes in 49.25 sec ( 93.9 us/hop) 415876 KB/sec size= 10000, 524288 hops, 64 nodes in 52.90 sec (100.9 us/hop) 387111 KB/sec So in this case it looks like the switch is becoming saturated. The source is at: http://cse.ucdavis.edu/bill/nrelay.c I'd love to see numbers posted for various GigE, Myrinet, Dolphin and IB configurations -- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Mar 9 19:32:49 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 9 Mar 2004 19:32:49 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Tue, 9 Mar 2004, Trent Piepho wrote: > Has anyone with dual opteron machines and a kill-a-watt measured how much > power they consume? > > I measured the dual P3 and xeons we have here, but no dual opterons yet. By strange chance yes. An astoundingly low 154 watts (IIRC -- I'm home, the kill-a-watt is at Duke -- but it was definitely ballpark of 150W) under load. That's a load average of 2, one task per processor, without testing under a variety of KINDS of load. Around 75W per loaded CPU. 
That's a bit less than the draw of an >>idle<< dual Athlon (165W). I'm actually racking six more boxes tomorrow and will recheck the draw and verify that it really is under load, but I was with Seth when I measured it and we remarked back and forth about it, really pleased, so I'm pretty sure I'm right. It has several very positive implications and seems believable. They are 1U cases (Penguin Altus 1000's) but the air coming out of the back is not that hot, really, again compared to the E-Z Bake Oven 2U 2466 dual Athlons (something like 260W under load). So we gain significantly in CPU, get access to larger memory if/when we care, get 64 bit memory bus, and drop power and cooling requirements (per CPU, but very nearly per rack U). It just don't get any better than this. I think they are 242's, FWIW. YMMV. I could be wrong, mistaken, deaf, dumb, blind, and stupid. My kill-a-watt could be on drugs. I could be on drugs. Maybe I dropped a decimal and they really draw 1500W. Perhaps the beer I spilled in my kill-a-watt confused it. I was up to 3:30 am finishing a month-late column for deadline himself (leaving me only days late on the CURRENT column) and my brain doesn't work very well any more. Caveat auditor. rgb > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Tue Mar 9 20:41:45 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Wed, 10 Mar 2004 09:41:45 +0800 (CST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040309223605.GA29912@cse.ucdavis.edu> Message-ID: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> --- Bill Broadley ??? > I recently measured a Sunfire V20z (dual 2.2 GHz) > opteron, I believe it had 2 scsi disks, 4 GB ram. > > watts VA > Idle 237-249 260-281 > Pstream 1 thread 260-277 290-311 > Pstream 2 threads 265-280 303-313 But that is with the disks, RAM, and other hardware you have. Anyone with similar configurations but have P4s instead? It just looks too good to believe the numbers... consider that the similar performance one IA64 processor ALONE draws over 120W. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Tue Mar 9 21:08:45 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Tue, 9 Mar 2004 18:08:45 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Tue, 9 Mar 2004, C J Kenneth Tan -- Heuchera Technologies wrote: > What is the power consumption that you measured for your dual P3 and > Xeons? 
System #1: Dual P3-500 Katmai, BX motherbaord, 512 MB PC100 ECC RAM, two tulip NICs, cheap graphics card, 5400 RPM IDE drive, floppy drive, one case fan, and a normal 250W ATX PS with a fan: System #2: Nearly the same as system #1 more or less, but with dual P3-850 Coppermines and no case fan. System #3: Dual Xeon 2.4 GHz 533FSB, E7501 chipset, 1 GB PC2100 ECC memory, two 3Ware 8506-8 cards, a firewire card, onboard intel GB and FE, one Maxtor 6Y200P0 drive, 6 high speed case fans (rated 4.44W each), floppy drive, CD-ROM drive, 550W PS with power factor correction (rated minimum 63% efficient), SATA backplane, and 16 Maxtor 6Y200M0 SATA drives (rated 7.4W idle each) in hotswap carriers. I measured system #3 with the SATA drives both installed and removed. Unfortunately I don't have a dual Xeon with minimal extra hardware to test. #1 Idle 42W 72 VA (.58 PF) #1 Loaded 103W 157 VA (.66 PF) #2 Idle 39W 67 VA (.58 PF) #2 Loaded 96W 148 VA (.65 PF) #3 Idle w/o RAID 162W 168 VA (.96 PF) #3 Loaded w/o RAID 283W 289 VA (.98 PF) #3 Idle w/ RAID 375W (stays at .98) #3 Loaded w/ RAID 510W (stays at .98) #3 Loaded w/RAID/bonnie 534W (stays at .98) For the load, I used two processes of burnP6, part of cpuburn at http://users.ev1.net/~redelm/ For a load breakdown by load type for system 1: 1 process 2 processes burnP5 65W burnP6 72WA 103W (exactly 30W per CPU over idle) burnMMX 64W burnK6 69W burnK7 67W burnBX 87W 90W stream 84W 85W The stream and burnBX memory loaders use more power than a single CPU load program, but two at once and the CPU loaders use more power. To load system #3 with the disks on, I ran bonnie++ on all 16 drives. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Mar 10 00:48:45 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 00:48:45 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> Message-ID: > > I recently measured a Sunfire V20z (dual 2.2 GHz) > > opteron, I believe it had 2 scsi disks, 4 GB ram. > > > > watts VA > > Idle 237-249 260-281 > > Pstream 1 thread 260-277 290-311 > > Pstream 2 threads 265-280 303-313 that's about right. my dual 240's peak at about 250 running two copies of stream and one bonnie (2GB, 40G 7200rpm IDE). > But that is with the disks, RAM, and other hardware > you have. nothing else counts for much. for instance, dimms are a couple watts apiece (makes you wonder about the heatspreaders that gamers/overclockers love so much), nics and disks are ~10W, etc. > Anyone with similar configurations but have > P4s instead? iirc my dual xeon/2.4's peak at around 190W (1-2GB, otherwise same). > It just looks too good to believe the numbers... > consider that the similar performance one IA64 > processor ALONE draws over 120W. hey, to marketing planners, massive power dissipation is probably a *good* thing. 
serious "enterprise" computers must have an impressive dissipation to set them apart from those piddly little game/surfing boxes ;) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From burcu at ulakbim.gov.tr Wed Mar 10 02:30:47 2004 From: burcu at ulakbim.gov.tr (Burcu Akcan) Date: Wed, 10 Mar 2004 09:30:47 +0200 Subject: [Beowulf] SPBS problem Message-ID: <404EC427.7070200@ulakbim.gov.tr> Hi, We have built a beowulf Debian cluster that contains 128 PIV nodes and one dual xeon server. I need some help about SPBS (Storm). We have already installed SPBS on the server and nodes and all daemons seem to work regularly. When any job is given to the system by using pbs scripting, the job can be seen on defined queue by running status and related nodes are allocated for the job. On the other hand there is no cpu or memory consumption on the nodes, the job does not run exactly and at the end of estimated cpu time there is no output file. Can anyone give some advice on my problem about SPBS. Thank you... Burcu Akcan ULAKBIM High Performance Computing Center _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Wed Mar 10 09:56:49 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Wed, 10 Mar 2004 06:56:49 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> Message-ID: On Wed, 10 Mar 2004, [big5] Andrew Wang wrote: > --- Bill Broadley > > I recently measured a Sunfire V20z (dual 2.2 GHz) > > opteron, I believe it had 2 scsi disks, 4 GB ram. > > > > watts VA > > Idle 237-249 260-281 > > Pstream 1 thread 260-277 290-311 > > Pstream 2 threads 265-280 303-313 > > It just looks too good to believe the numbers... > consider that the similar performance one IA64 > processor ALONE draws over 120W. You also have to consider that the typical computer power supply is only around 60% to 80% efficient. If the CPU draws 120W, then that's going to be something like 150 to 200 watts measured with a power meter, and really, that's what matters. It makes no difference to the AC and circuit breakers if the power is dissipated in the CPU or in the power supply. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Mar 10 10:14:18 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 10 Mar 2004 07:14:18 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: <20040310151418.51414.qmail@web11413.mail.yahoo.com> But the Itanium 2 is using so much energy that Intel couldn't rise the frequency... or else the machine would melt :( See the online lecture: "Things CPU Architects Need To Think About" http://www.stanford.edu/class/ee380/ BTW, that guy used to work for Intel, and he also mentioned about the compiler guys tuned the IA-64 compiler for the benchmarks... Rayson > hey, to marketing planners, massive power dissipation is > probably a *good* thing. 
serious "enterprise" computers must > have an impressive dissipation to set them apart from those > piddly little game/surfing boxes ;) > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Mar 10 10:13:58 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 10 Mar 2004 07:13:58 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: <20040310151358.43826.qmail@web11407.mail.yahoo.com> But the Itanium 2 is using so much energy that Intel couldn't rise the frequency... or else the machine would melt :( See the online lecture: "Things CPU Architects Need To Think About" http://www.stanford.edu/class/ee380/ BTW, that guy used to work for Intel, and he also mentioned about the compiler guys tuned the IA-64 compiler for the benchmarks... Rayson > hey, to marketing planners, massive power dissipation is > probably a *good* thing. serious "enterprise" computers must > have an impressive dissipation to set them apart from those > piddly little game/surfing boxes ;) > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgoornaden at intnet.mu Wed Mar 10 11:42:39 2004 From: rgoornaden at intnet.mu (roudy) Date: Wed, 10 Mar 2004 20:42:39 +0400 Subject: [Beowulf] Writing a parallel program References: <200403101448.i2AEmIA22804@NewBlue.scyld.com> Message-ID: <003701c406bf$085f25b0$590b7bca@roudy> Hello everybody, I completed to build my beowulf cluster. Now I am writing a parallel program using MPICH2. Can someone give me a help. Because, the program that I wrote take more time to run on several nodes compare when it is run on one node. If there is a small program that someone can send me about distributing data among nodes, then each node process the data, and the information is sent back to the master node for printing. This will be a real help for me. Thanks Roud _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Mar 10 12:28:54 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 10 Mar 2004 12:28:54 -0500 (EST) Subject: [Beowulf] Writing a parallel program In-Reply-To: <003701c406bf$085f25b0$590b7bca@roudy> Message-ID: On Wed, 10 Mar 2004, roudy wrote: > Hello everybody, > I completed to build my beowulf cluster. Now I am writing a parallel program > using MPICH2. Can someone give me a help. 
Because, the program that I wrote > take more time to run on several nodes compare when it is run on one node. > If there is a small program that someone can send me about distributing data > among nodes, then each node process the data, and the information is sent > back to the master node for printing. This will be a real help for me. > Thanks > Roud I can't help you much with MPI but I can help you understand the problems you might encounter with ANY message passing system or library in terms of parallel task scaling. There is a ready-to-run PVM program I just posted in tarball form on my personal website that will be featured in the May issue of Cluster World Magazine. http:www.phy.duke.edu/~rgb/General/random_pvm.php It is designed to give you direct control over the most important parameters that affect task scaling so that you can learn just how it works. The task itself consists of a "master" program and a "slave" program. The master parses several parameters from the command line: -n number of slaves -d delay (to vary the amount of simulated work per communication) -r number of rands (to vary the number of communications per run and work burdent per slave) -b a flag to control whether the slaves send back EACH number as it is generated (lots of small messags) or "bundles" all the numbers they generate into a single message. This makes a visible, rather huge difference in task scaling, as it should. The task itself is trivial -- generating random numbers. The master starts by computing a trivial task partitioning among the n nodes. It spawns n slave tasks, sending each one the delay on the command line. It then sends each slave the number of rands to generate and a trivially unique seed as messages. Each slave generates a rand, waits delay (in nanoseconds, with a high-precision polling loop), and either sends it back as a message immediately (the default) or saves it in a large vector until the task is finished and sends the whole buffer as a single message (if the -b flag was set). This serves two valuable purposes for the novice. First, it gives you a ready-to-build working master/slave program to use as a template for a pretty much any problem for which the paradigm is a good fit. Second, by simply playing with it, you can learn LOTS of things about parallel programs and clusters. If delay is small (order of the packet latency, 100 usec or less) the program is in a latency dominated scaling regime where communications per number actually takes longer than generating the numbers and its parallel scaling is lousy (if slowing a task down relative to serial can be called merely lousy). If delay is large, so that it takes a long time to compute and a short time to send back the results, parallel scaling is excellent with near linear speedup. Turning on the -b flag for certain ranges of the delay can "instantly" shift one from latency bounded to bandwidth bounded parallel scaling regimes, and restore decent scaling. Even if you don't use it because it is based on PVM, if you clone it for MPI you'll learn the same lessons there, as they are universal and part of the theoretical basis for understanding parallel scaling. Eventually I'll do an MPI version myself for the column, but the mag HAS an MPI column and my focus would be more for the novice learning about parallel computing in general. BTW, obviously I think that subscribing to CWM is a good idea for novices. Among its many other virtues (such as articles by lots of the luminaries of this vary list:-), you can read my columns. 
In fact, from what I've seen from the first few issues, ALL the columns are pretty damn good and getting back issues to the beginning wouldn't hurt, if it is still possible. If you (or anybody) DO grab random_pvm and give it a try, please send me feedback, preferably before the actual column comes out in May, so that I can fix it before then. It is moderately well documented in the tarball, but of course there is more "documentation" and explanation in the column itself. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Mar 10 12:07:10 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 12:07:10 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310151358.43826.qmail@web11407.mail.yahoo.com> Message-ID: > See the online lecture: "Things CPU Architects Need To Think About" > http://www.stanford.edu/class/ee380/ does anyone have a lead on an open-source player for these .asx files? or at least something not tied to windows? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sp at scali.com Wed Mar 10 13:41:59 2004 From: sp at scali.com (Steffen Persvold) Date: Wed, 10 Mar 2004 19:41:59 +0100 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <404F6177.8050108@scali.com> Mark Hahn wrote: >>See the online lecture: "Things CPU Architects Need To Think About" >>http://www.stanford.edu/class/ee380/ > > > does anyone have a lead on an open-source player for these .asx files? > or at least something not tied to windows? > The .asx file is just a link to a .wmv (Windows Media) file, which again just contains a streaming media reference.
I haven't tried, but I think you could use mplayer to play them : http://www.mplayerhq.hu Best regards, Steffen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Mar 10 16:11:07 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 16:11:07 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <404F7BE0.6040900@nada.kth.se> Message-ID: > Seems to be running fine with xine. wow, you're right! thanks... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Mar 10 18:56:06 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 10 Mar 2004 18:56:06 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Wed, 10 Mar 2004, Mark Hahn wrote: > > Seems to be running fine with xine. > > wow, you're right! thanks... (sorry to jump back on the thread this way, but it is easier than scrolling back through mail to find the original:-) I went downstairs again today and really paid attention to the kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 (I don't know why but they are running three jobs instead of two at the moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over 120 V line voltage). This seems lower than a lot of the other numbers being reported (although it is a bit higher than my memory recalled yesterday -- I TOLD you not to trust me:-). It is still considerably better than a dual Athlon at much higher clock as well. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ddw at dreamscape.com Wed Mar 10 20:36:13 2004 From: ddw at dreamscape.com (Daniel Williams) Date: Wed, 10 Mar 2004 20:36:13 -0500 Subject: [Beowulf] Cluster school project References: <200403101446.i2AEknA22660@NewBlue.scyld.com> Message-ID: <404FC28A.7607EF77@dreamscape.com> > From: "Maikel Punie" > Subject: RE: [Beowulf] Cluster school project > Date: Tue, 9 Mar 2004 18:45:47 +0100 > [snip...] > >>Do you mean a computing/programming project could you do, >>like calculating pi to some large number of digits? > >yeah something like that, i realy have no idea what is possible. >if there are any suggestions, they are always welcome. Here's what I want to do once I get enough junk 500mhz machines together: Make a model of the spread of genetic diseases in a population of a few hundred million. I've been wanting to do that for years, but it would probably take a few months to run on any single machine I own. I figure it should run in a few weeks as soon as I get a 16 node cluster together to run it. Is that something you could maybe use? 
DDW _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Thu Mar 11 04:56:24 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Thu, 11 Mar 2004 09:56:24 +0000 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> On Wednesday 10 March 2004 11:56 pm, Robert G. Brown wrote: > > I went downstairs again today and really paid attention to the > kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 > (I don't know why but they are running three jobs instead of two at the > moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over > 120 V line voltage). > > This seems lower than a lot of the other numbers being reported > (although it is a bit higher than my memory recalled yesterday -- I TOLD > you not to trust me:-). It is still considerably better than a dual > Athlon at much higher clock as well. > > rgb I find your numbers a bit surprising still. As part of our latest procurement I looked up the power consumption in the INTEL/AMD documentation for the various processors under consideration:

Athlon   model 6   2200MP            58.9 W
         model 8   2400MP            54.5 W
         model 11  2800MP (Barton)   47.2 W
Opteron  240-244                     82.1 W
         246-248                     89.0 W
Xeon     2.8 GHz                     77 W (512K Cache)
         3.06 GHz                    87 W

I think these numbers are meant to be maximum? -- ------------------------------------------------------------------------------ | | | Dr. Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Thu Mar 11 07:47:57 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Thu, 11 Mar 2004 12:47:57 +0000 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <200403111247.i2BClv215026@heppcb.ph.qmw.ac.uk> On Thursday 11 March 2004 12:35 pm, Bogdan Costescu wrote: > On Thu, 11 Mar 2004, Alex Martin wrote: > > I find your numbers a bit surprising still > > I don't :-) I was surprised that rgb's opteron numbers were so low! > While I can't remember what was the exact figure for the dual Opteron > 246 (2 GHz) system, I'm sure that it was over 200W. > > > Athlon model 11 2800MP (Barton) 47.2 W > > dual Athlon 2800MP (2133MHz) under load from 2 cpuburn ~ 230W > > > Xeon (512K Cache) 3.06 GHz 87 W > > dual Xeon 3.06GHz under load from 2 cpuburn ~ 275W your system numbers are pretty consistent with what I've measured. ( ~230 W for Athlon 2200MP and ~250W for Xeon 2.8GHz ) -- ------------------------------------------------------------------------------ | | | Dr.
Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bogdan.costescu at iwr.uni-heidelberg.de Thu Mar 11 07:35:30 2004 From: bogdan.costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Thu, 11 Mar 2004 13:35:30 +0100 (CET) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: On Thu, 11 Mar 2004, Alex Martin wrote: > I find you numbers a bit surprising still I don't :-) While I can't remember what was the exact figure for the dual Opteron 246 (2 GHz) system, I'm sure that it was over 200W. > Athlon model 11 2800MP (Barton) 47.2 W dual Athlon 2800MP (2133MHz) under load from 2 cpuburn ~ 230W > Xeon (512K Cache) 3.06 GHz 87 W dual Xeon 3.06GHz under load from 2 cpuburn ~ 275W -- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Mar 11 08:39:02 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 11 Mar 2004 08:39:02 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: On Thu, 11 Mar 2004, Alex Martin wrote: > I find you numbers a bit surprising still As part of our latest procurement > I looked up the power consumption in the INTEL/AMD documention for the > various processors under consideration: ... > Opteron 240-244 82.1 W > 246-248 89.0 W > I think these numbers are meant to be maximum? You've got me -- dunno. I can post a digital photo of the kill-a-watt reading if you like (I was going to take a camera down there anyway to add a new rack photo to the brahma tour). I can also take the kill-a-watt and plug in an electric light bulb or something with a fairly predictable draw and see if it is broken somehow. Right now a system in production work is plugged into it -- I'll try to retrieve it soon and plug one of my new systems into it so that I can run more detailed tests under more controlled loads. I don't know exactly what kind of work is being done in the current jobs being run. One advantage may be that the cases are apparently equipped with a PFC power supply. The power factor appears to be very good -- close to 1. This may make the power supplies themselves run cooler, so that the power draw of the rest of the system IS only 20 or so more watts. The systems also have a bare minimum of peripherals -- a hard disk (sitting idle), onboard dual gig NICs (one idle) and video (idle). Will post newer/better tests as I have time and make them, although others may beat me to it...;-) rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu Mar 11 11:10:16 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 11 Mar 2004 08:10:16 -0800 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: <5.2.0.9.2.20040311080304.017d8008@mailhost4.jpl.nasa.gov> At 08:39 AM 3/11/2004 -0500, Robert G. Brown wrote: >On Thu, 11 Mar 2004, Alex Martin wrote: > > > I find you numbers a bit surprising still As part of our latest > procurement > > I looked up the power consumption in the INTEL/AMD documention for the > > various processors under consideration: >... > > Opteron 240-244 82.1 W > > 246-248 89.0 W > > I think these numbers are meant to be maximum? > >You've got me -- dunno. I can post a digital photo of the kill-a-watt >reading if you like (I was going to take a camera down there anyway to >add a new rack photo to the brahma tour). I can also take the >kill-a-watt and plug in an electric light bulb or something with a >fairly predictable draw and see if it is broken somehow. > >Right now a system in production work is plugged into it -- I'll try to >retrieve it soon and plug one of my new systems into it so that I can >run more detailed tests under more controlled loads. I don't know >exactly what kind of work is being done in the current jobs being run. > >One advantage may be that the cases are apparently equipped with a PFC >power supply. The power factor appears to be very good -- close to 1. >This may make the power supplies themselves run cooler, so that the >power draw of the rest of the system IS only 20 or so more watts. The >systems also have a bare minimum of peripherals -- a hard disk (sitting >idle), onboard dual gig NICs (one idle) and video (idle). Those power supplies are impressive PFC wise.. I'd venture to say, though, that the rated powers are peak over some fairly short time. The Kill-A-Watt averages over some reasonable time (a second or two?), so you could actually have an average that's half the peak. Everytime there's a pipeline stall, or a cache miss, etc, the current's going to change. We used processor current to debug DSP code, because you could actually see interrupts come in during the other steps(FFT = very high power, sudden drop for a few microseconds while ISR is running). You could also accurately time how long each "pass" in the FFT took, since the CPU power dropped while setting up the parameters for the next set of butterflies. To really track this kind of thing down, you'd want to hook a DC current probe around the wires from the Power supply to the motherboard. Then, write some benchmark program with a fairly repeatable computational resource requirement pattern. Look at the current on an oscilloscope. I suspect that onboard filtering will get rid of variations that last less than, say, 1-10 mSec, so a program that has a basic cyclical nature lasting 10 times that would be nice. Ideally, you'd probe the current going to the CPU, vs the rest of the mobo, but that's probably a bit of a challenge. Another experiment would be to write a small program that you KNOW will stay in cache and never go off chip and measure the current draw when running it. James Lux, P.E. 
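(A trivial load generator is enough to try the cyclical-draw experiment Jim describes above. This is just a sketch in plain C, nothing JPL- or vendor-specific; the 500 ms busy/idle phases are an arbitrary assumption chosen so the cycle is slow enough to show up on a current probe or even a kill-a-watt.)

/* Toy cyclical load generator for power-draw experiments (sketch only).
 * Alternates roughly 500 ms of floating-point work with roughly 500 ms
 * of sleep, so the supply current should rise and fall with a ~1 s period.
 * Build: gcc -O2 -o loadcycle loadcycle.c
 */
#include <sys/time.h>
#include <unistd.h>

static double now(void)                     /* wall clock, seconds */
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec * 1e-6;
}

int main(void)
{
    volatile double x = 1.0;                /* volatile so the work isn't optimized away */
    for (;;) {
        double t0 = now();
        while (now() - t0 < 0.5) {          /* busy phase: bursts of FP work between clock checks */
            int i;
            for (i = 0; i < 100000; i++)
                x = x * 1.0000001 + 1e-9;
        }
        usleep(500000);                     /* idle phase: ~500 ms */
    }
    return 0;                               /* not reached */
}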
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tegner at nada.kth.se Wed Mar 10 15:34:40 2004 From: tegner at nada.kth.se (Jon Tegner) Date: Wed, 10 Mar 2004 21:34:40 +0100 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <404F7BE0.6040900@nada.kth.se> Seems to be running fine with xine. /jon Mark Hahn wrote: >>See the online lecture: "Things CPU Architects Need To Think About" >>http://www.stanford.edu/class/ee380/ >> >> > >does anyone have a lead on an open-source player for these .asx files? >or at least something not tied to windows? > > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jimlux at earthlink.net Thu Mar 11 09:07:09 2004 From: jimlux at earthlink.net (Jim Lux) Date: Thu, 11 Mar 2004 06:07:09 -0800 Subject: [Beowulf] Power consumption for opterons? References: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: <000e01c40772$2611bf60$36a8a8c0@LAPTOP152422> ----- Original Message ----- From: "Alex Martin" To: "Robert G. Brown" ; "Mark Hahn" Cc: "Jon Tegner" ; Sent: Thursday, March 11, 2004 1:56 AM Subject: Re: [Beowulf] Power consumption for opterons? > On Wednesday 10 March 2004 11:56 pm, Robert G. Brown wrote: > > > > > I went downstairs again today and really paid attention to the > > kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 > > (I don't know why but they are running three jobs instead of two at the > > moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over > > 120 V line voltage). > > I find you numbers a bit surprising still As part of our latest procurement > I looked up the power consumption in the INTEL/AMD documention for the > various processors under consideration: > surprising high or surprising low? You're comparing DC power to just the processor vs wall plug power to the whole system (including cooling fans, RAM, PCI bridge chips, etc.) I think that the databook numbers of ca 50-80 W per CPU (probably the highest continuous average power) is nicely matched with 180 W from the wall for a dual CPU... The databook number is probably a bit on the high side... 180W from the wall probably equates to about 140W DC. 
There's probably 10W or so in fans and glue, maybe 100W for both procesors, and 30W for the rest of the logic and RAM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mathiasbrito at yahoo.com.br Fri Mar 12 08:51:22 2004 From: mathiasbrito at yahoo.com.br (=?iso-8859-1?q?Mathias=20Brito?=) Date: Fri, 12 Mar 2004 10:51:22 -0300 (ART) Subject: [Beowulf] Strange Behavior Message-ID: <20040312135122.92643.qmail@web12208.mail.yahoo.com> Hi, I'm benchmarking my 16 nodes cluster with HPL and I obtain a estrange result, different of all I ever seen before. When I send more data with a big N, the performance is worse than with small values of N. I used N=5000 with NB=20 and the performance was 3.3GB, when I send N=10000 with NB=20 i get only 2.1GB. I don't liked the result, the nodes are athlon xp 1600+ with 512MB RAM, and I think the cluster very slow. Someone had the same problem and could help me? Mathias ===== Mathias Brito Universidade Estadual de Santa Cruz - UESC Departamento de Ci?ncias Exatas e Tecnol?gicas Estudante do Curso de Ci?ncia da Computa??o ______________________________________________________________________ Yahoo! Mail - O melhor e-mail do Brasil! Abra sua conta agora: http://br.yahoo.com/info/mail.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Fri Mar 12 11:43:47 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: Fri, 12 Mar 2004 16:43:47 +0000 Subject: [Beowulf] Strange Behavior In-Reply-To: <20040312135122.92643.qmail@web12208.mail.yahoo.com> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> Message-ID: <1079109827.3745.7.camel@tp1.mesh-hq> On Fri, 2004-03-12 at 13:51, Mathias Brito wrote: > I'm benchmarking my 16 nodes cluster with HPL and I > obtain a estrange result, different of all I ever seen > before. When I send more data with a big N, the > performance is worse than with small values of N. I > used N=5000 with NB=20 and the performance was 3.3GB, > when I send N=10000 with NB=20 i get only 2.1GB. I > don't liked the result, the nodes are athlon xp 1600+ > with 512MB RAM, and I think the cluster very slow. > Someone had the same problem and could help me? Please correct me anybody, if im wrong: It seems to me, that the best results are acheived with approx 85-90% memory utilization (leaving something to the rest of the system). (16*512*1024*1024/8)^0.5 ~= 30200, that would close to the best N value isn't Nb=20 very low? I currently use arround 145 for P4 cpu's What performance du you get from a setup like the one above? 
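(A throwaway helper for the sizing rule of thumb above; my own sketch, not part of HPL or anything Lars posted. The 85% memory fraction and NB=128 are assumptions taken from the discussion, not measured values; for 16 nodes with 512 MB each it lands near the ~30200 figure quoted.)

/* Quick HPL problem-size estimator (sketch, not part of HPL).
 * Picks N so the N x N double-precision matrix fills roughly `frac`
 * of the cluster's total RAM, then rounds N down to a multiple of NB.
 *
 *   gcc -o hplsize hplsize.c -lm   &&   ./hplsize 16 512
 */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char **argv)
{
    int    nodes   = (argc > 1) ? atoi(argv[1]) : 16;    /* number of nodes        */
    double mb_node = (argc > 2) ? atof(argv[2]) : 512.0; /* MB of RAM per node     */
    double frac    = 0.85;                               /* assumed usable fraction */
    int    nb      = 128;                                /* assumed block size NB   */

    double bytes = nodes * mb_node * 1024.0 * 1024.0 * frac;
    long   N     = (long)sqrt(bytes / 8.0);              /* 8 bytes per double      */

    N -= N % nb;                                         /* keep N a multiple of NB */
    printf("suggested N ~ %ld (NB = %d)\n", N, nb);
    return 0;
}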
best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Graabroedrestraede 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From M.Arndt at science-computing.de Fri Mar 12 07:06:36 2004 From: M.Arndt at science-computing.de (Michael Arndt) Date: Fri, 12 Mar 2004 13:06:36 +0100 Subject: [Beowulf] Cluster Uplink via Wireless Message-ID: <20040312130636.D49119@blnsrv1.science-computing.de> Hello * has anyone done a wireless technology uplink to a compute cluster that is in real use ? If so, i would be interested to know how and how is the experinece in transferring "greater" (e.g. 2 GB ++ ) Result files? explanation: We have a cluster with gigabit interconnect where it would make life cheaper, if there is a possibility to upload input data and download output data via wireless link, since connecting twisted pair between WS and CLuster would be expensive. TIA Micha _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Fri Mar 12 17:22:58 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Fri, 12 Mar 2004 14:22:58 -0800 Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <20040312130636.D49119@blnsrv1.science-computing.de> Message-ID: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> At 01:06 PM 3/12/2004 +0100, Michael Arndt wrote: >Hello * > >has anyone done a wireless technology uplink to a compute cluster >that is in real use ? >If so, i would be interested to know how and how is the experinece in >transferring "greater" (e.g. 2 GB ++ ) Result files? > >explanation: >We have a cluster with gigabit interconnect >where it would make life cheaper, if there is a possibility to upload >input data and download output data via wireless link, since connecting >twisted pair between WS and CLuster would be expensive. > I have a very small cluster that is using wireless interconnect for everything, and based upon my early observations, I'd be real, real leery of contemplating transferring Gigabytes in any practical time. For instance, loading a 25 MB compressed ram file system using tftp during PXE boot takes about a minute. This is on a very non-optimized configuration using 802.11a, through a variety of devices. Yes, indeed, the ad literature claims 54 Mbps, but that's not the actual data rate, but more the "bit rate" of the over the air signal. Wireless LANs are NOT full duplex, and there are synchronization preambles, etc. that make the throughput much lower. On a standard "11 Mbps" 802.11b type network, the "real data throughput" in a unidirectional transfer is probably closer to 3-5 Mbps. Say you get that wireless link really humming at 20 Mbps real data rate. Transferring 16,000 Mbit is still going to take 10-15 minutes. Your situation might be a bit better, especially if you can use a point to point wireless link. James Lux, P.E. 
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Fri Mar 12 19:29:41 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Fri, 12 Mar 2004 16:29:41 -0800 Subject: [Beowulf] Cluster Uplink via Wireless References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> Message-ID: <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> At 06:04 PM 3/12/2004 -0500, Mark Hahn wrote: > > Say you get that wireless link really humming at 20 Mbps real data > > rate. Transferring 16,000 Mbit is still going to take 10-15 minutes. > >out of truely morbid curiosity, what's the latency like? I'll have some numbers next week. The configuration is sort of weird.. diskless node booting w/PXE D-Link Wireless AP in multi AP connect mode over the air D-Link wireless AP in multi AP connect mode network w/NFS and DHCP server The D-Link boxes try to be smart and not push packets across the air link that are for MACs they know are on the wired side, and that whole process is "tricky"... James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From clwang at csis.hku.hk Fri Mar 12 21:29:43 2004 From: clwang at csis.hku.hk (Cho Li Wang) Date: Sat, 13 Mar 2004 10:29:43 +0800 Subject: [Beowulf] NPC2004 CFP : Deadline Extended to March 22, 2004 Message-ID: <40527217.92D67387@csis.hku.hk> ******************************************************************* NPC2004 IFIP International Conference on Network and Parallel Computing October 18-20, 2004 Wuhan, China http://grid.hust.edu.cn/npc04 ------------------------------------------------------------------- Important Dates Paper Submission March 22, 2004 (extended) Author Notification May 1, 2004 Final Camera Ready Manuscript June 1, 2004 ******************************************************************* Call For Papers The goal of IFIP International Conference on Network and Parallel Computing (NPC 2004) is to establish an international forum for engineers and scientists to present their excellent ideas and experiences in all system fields of network and parallel computing. NPC 2004, hosted by the Huazhong University of Science and Technology, will be held in the city of Wuhan, China - the "Homeland of White Clouds and the Yellow Crane." Topics of interest include, but are not limited to: - Parallel & Distributed Architectures - Parallel & Distributed Applications/Algorithms - Parallel Programming Environments & Tools - Network & Interconnect Architecture - Network Security - Network Storage - Advanced Web and Proxy Services - Middleware Frameworks & Toolkits - Cluster and Grid Computing - Ubiquitous Computing - Peer-to-peer Computing - Multimedia Streaming Services - Performance Modeling & Evaluation Submitted papers may not have appeared in or be considered for another conference. Papers must be written in English and must be in PDF format. 
Detailed electronic submission instructions will be posted on the conference web site. The conference proceedings will be published by Springer Verlag in the Lecture Notes in Computer Science Series (cited by SCI). Best papers from NPC 2004 will be published in a special issue of International Journal of High Performance Computing and Networking (IJHPCN) after conference. ************************************************************************** Committee General Co-Chairs: H. J. Siegel Colorado State University, USA Guojie Li The Institute of Computing Technology, CAS, China Steering Committee Chair: Kemal Ebcioglu IBM T.J. Watson Research Center, USA Program Co-Chairs: Guangrong Gao University of Delaware, USA Zhiwei Xu Chinese Academy of Sciences, China Program Vice-Chairs: Victor K. Prasanna University of Southern California, USA Albert Y. Zomaya University of Sydney, Australia Hai Jin Huazhong University of Science and Technology, China Publicity Co-Chairs: Cho-Li Wang The University of Hong Kong, Hong Kong Chris Jesshope The University of Hull, UK Local Arrangement Chair: Song Wu Huazhong University of Science and Technology, China Steering Committee Members: Jack Dongarra University of Tennessee, USA Guangrong Gao University of Delaware, USA Jean-Luc Gaudiot University of California, Irvine, USA Guojie Li The Institute of Computing Technology, CAS, China Yoichi Muraoka Waseda University, Japan Daniel Reed University of North Carolina, USA Program Committee Members: Ishfaq Ahmad University of Texas at Arlington, USA Shoukat Ali University of Missouri-Rolla, USA Makoto Amamiya Kyushu University, Japan David Bader University of New Mexico, USA Luc Bouge IRISA/ENS Cachan, France Pascal Bouvry University of Luxembourg, Luxembourg Ralph Castain Los Alamos National Laboratory, USA Guoliang Chen University of Science and Technology of China, China Alain Darte CNRS, ENS-Lyon, France Chen Ding University of Rochester, USA Jianping Fan Institute of Computing Technology, CAS, China Xiaobing Feng Institute of Computing Technology, CAS, China Jean-Luc Gaudiot University of California, Irvine, USA Minyi Guo University of Aizu, Japan Mary Hall University of Southern California, USA Salim Hariri University of Arizona, USA Kai Hwang University of Southern California, USA Anura Jayasumana Colorado State Univeristy, USA Chris R. Jesshop The University of Hull, UK Ricky Kwok The University of Hong Kong, Hong Kong Francis Lau The University of Hong Kong, Hong Kong Chuang Lin Tsinghua University, China John Morrison University College Cork, Ireland Lionel Ni Hong Kong University of Science and Technology, Hong Kong Stephan Olariu Old Dominion University, USA Yi Pan Georgia State University, USA Depei Qian Xi'an Jiaotong University, China Daniel A. Reed University of North Carolina at Chapel Hill, USA Jose Rolim University of Geneva, Switzerland Arnold Rosenberg University of Massachusetts at Amherst, USA Sartaj Sahni University of Florida, USA Selvakennedy Selvadurai University of Sydney, Australia Franciszek Seredynski Polish Academy of Sciences, Poland Hong Shen Japan Advanced Institute of Science and Technology, Japan Xiaowei Shen IBM T. J. Watson Research Center, USA Gabby Silberman IBM Centers for Advanced Studies, USA Per Stenstrom Chalmers University of Technology, Sweden Ivan Stojmenovic University of Ottawa, Canada Ninghui Sun Institute of Computing Technology, CAS, China El-Ghazali Talbi University of Lille, France Domenico Talia University of Calabria, Italy Mitchell D. 
Theys University of Illinois at Chicago, USA Xinmin Tian Intel Corporation, USA Dean Tullsen University of California, San Diego, USA Cho-Li Wang The University of Hong Kong, Hong Kong Qing Yang University of Rhode Island, USA Yuanyuan Yang State University of New York at Stony Brook, USA Xiaodong Zhang College of William and Mary, USA Weimin Zheng Tsinghua University, China Bingbing Zhou University of Sydney, Australia Chuanqi Zhu Fudan University, China ------------------------------------------------------------------------ For more information, please contact the program vice-chair at the address below: Dr. Hai Jin, Professor Director, Cluster and Grid Computing Lab Vice-Dean, School of Computer Huazhong University of Science and Technology Wuhan, 430074, China Tel: +86-27-87543529 Fax: +86-27-87557354 e-fax: +1-425-920-8937 e-mail: hjin at hust.edu.cn _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mayank_kaushik at vsnl.net Sat Mar 13 05:24:13 2004 From: mayank_kaushik at vsnl.net (mayank_kaushik at vsnl.net) Date: Sat, 13 Mar 2004 15:24:13 +0500 Subject: [Beowulf] Benchmarking with PVM Message-ID: <74070c77404a91.7404a9174070c7@vsnl.net> hi everyone first of all, id like to thank Robert G. Brown for his help in solving my PVM problem, and getting my cluster running! now that its running, iv been trying to run tests on it to see how fast it really is..so i ran PVMPOV, and the results were pretty impressive- i had two P4s clustered, and the rendering time was reduced by half..may sound trivial to you guys, but to a first-timer like me, it looks great! :-) okay, so heres the deal- we`v got lots of idle computers in the college computer lab..an eclectic mix of P2 350s and P3 733s, which everyone has abandoned in favour of flashy new compaq evo P4 2.4ghzs, so along comes me the evangelist and turns all the outcasts into cluster nodes.. (wev got a gigabit LAN too) now,id like to run benchmarking tests on the cluster so as to outline the increase in performance as individual nodes are added..and also the increase in the load on the network.. are there tools available that would let me do all this..and, say, get graphs etc too? tools that are compatible with PVM? could anyone provide links to places where they can be downloaded? (im running red-hat 9.0 on all systems) thanx in anticipation Mayank PS. those proud compaq evos are giving me trouble..thev got winXP with an NTFS filesystem, n im trying to use partition magic to make a pratition so that i can make a dual boot system and install linux...but partition magic always exits with an error, on all the systems..fips wont work with NTFS..has anyone ever done this? the quick-restor cd says it would remove all partitions and make just one NTFS partition, so i didnt try that. 
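(On the benchmarking question just above: XPVM will display message traffic and task activity for a PVM virtual machine, and Ganglia is a common choice for watching node and network load across a cluster. For the speedup-versus-nodes curve itself, a tiny fixed-work probe is enough; the sketch below happens to use MPI rather than PVM, and the loop count is an arbitrary assumption. Run it with 1, 2, 4, ... processes and plot T(1)/T(p).)

/* Fixed-work scaling probe (sketch).  The total amount of "work" is the
 * same no matter how many processes run it, so the elapsed times give a
 * speedup curve for this toy problem -- not for PVMPOV or any real app.
 *
 *   mpicc -o probe probe.c   &&   mpirun -np 4 ./probe
 */
#include <stdio.h>
#include <mpi.h>

#define TOTAL_WORK 400000000L   /* total loop iterations, fixed regardless of -np */

int main(int argc, char **argv)
{
    int rank, size;
    long i, chunk;
    double local = 0.0, global = 0.0, t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    chunk = TOTAL_WORK / size;             /* each rank gets an equal slice of the work */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();

    for (i = 0; i < chunk; i++)            /* the fixed "work": a partial harmonic sum */
        local += 1.0 / (double)(rank * chunk + i + 1);

    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("np=%d  elapsed=%.3f s  (checksum %.6f)\n", size, t1 - t0, global);

    MPI_Finalize();
    return 0;
}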
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Sat Mar 13 17:00:40 2004 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Sat, 13 Mar 2004 22:00:40 +0000 Subject: [Beowulf] Strange Behavior In-Reply-To: <1079109827.3745.7.camel@tp1.mesh-hq> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> <1079109827.3745.7.camel@tp1.mesh-hq> Message-ID: <200403132200.40877.daniel.kidger@quadrics.com> On Friday 12 March 2004 4:43 pm, Lars Henriksen wrote: > On Fri, 2004-03-12 at 13:51, Mathias Brito wrote: > > I'm benchmarking my 16 nodes cluster with HPL and I > > obtain a estrange result, different of all I ever seen > > before. When I send more data with a big N, the > > performance is worse than with small values of N. I > > used N=5000 with NB=20 and the performance was 3.3GB, > > when I send N=10000 with NB=20 i get only 2.1GB. I > > don't liked the result, the nodes are athlon xp 1600+ > > with 512MB RAM, and I think the cluster very slow. > > Someone had the same problem and could help me? > > Please correct me anybody, if im wrong: > It seems to me, that the best results are acheived with approx 85-90% > memory utilization (leaving something to the rest of the system). > > (16*512*1024*1024/8)^0.5 ~= 30200, that would close to the best N value Your target should be say 75% of theoretical peak performance 0.75 * 16nodes * 1 cpupernode * 1.4Ghz * 1 floppertick = 16.8 Gflops/s So figures like '3.1' Gflops/s (14% peak) are much lower than what you should be achieving (Only vendors like IBM post figures on the top500 with %peak figures as low as this (Nov2003) ) Linpack figures are dominated by the choice of maths library - you do not say which one you are using (MKL, libgoto, Atlas, ACML) ? > isn't Nb=20 very low? I currently use arround 145 for P4 cpu's Remember choice of NB depends on which maths library you use rather than simply on the platform - but in general the best values lie between 80 to 256; 20x20 is far too small for a matrix multiply. Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com -------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ratscus at hotmail.com Sat Mar 13 20:55:17 2004 From: ratscus at hotmail.com (Joe Manning) Date: Sat, 13 Mar 2004 18:55:17 -0700 Subject: [Beowulf] project Message-ID: Does anyone know of a good non-profit that posts data to be processed? Kind of like how SETI dispenses its data, but for cancer or something? I have a whole school to my disposal and am just going to run a diskless system pushed down from a server. I can't really do much about the network, but will use it as a working model for some personal curiosities. (hopefully I will be able to contribute to this group at some point) Also, if anyone does know of a good place to get this type of data, can they please point me in the right direction of the type of process said sight uses, so I can decide what version I want to use to implement the process. 
Thanks, Joe Manning _________________________________________________________________ Get a FREE online computer virus scan from McAfee when you click here. http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From patrick at myri.com Sat Mar 13 21:57:40 2004 From: patrick at myri.com (Patrick Geoffray) Date: Sat, 13 Mar 2004 21:57:40 -0500 Subject: [Beowulf] Strange Behavior In-Reply-To: <200403132200.40877.daniel.kidger@quadrics.com> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> <1079109827.3745.7.camel@tp1.mesh-hq> <200403132200.40877.daniel.kidger@quadrics.com> Message-ID: <4053CA24.1020901@myri.com> Hi Dan. Dan Kidger wrote: > Your target should be say 75% of theoretical peak performance He is likely using IP over Ethernet, so 50% would be a more reasonable expectation. > So figures like '3.1' Gflops/s (14% peak) are much lower than what you should > be achieving (Only vendors like IBM post figures on the top500 with %peak > figures as low as this (Nov2003) ) Which ones ? Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From unix_no_win at yahoo.com Sun Mar 14 11:49:17 2004 From: unix_no_win at yahoo.com (unix_no_win) Date: Sun, 14 Mar 2004 08:49:17 -0800 (PST) Subject: [Beowulf] project In-Reply-To: Message-ID: <20040314164917.45310.qmail@web40412.mail.yahoo.com> You might want to check out: www.distributedfolding.org --- Joe Manning wrote: > Does anyone know of a good non-profit that posts > data to be processed? Kind > of like how SETI dispenses its data, but for cancer > or something? I have a > whole school to my disposal and am just going to run > a diskless system > pushed down from a server. I can't really do much > about the network, but > will use it as a working model for some personal > curiosities. (hopefully I > will be able to contribute to this group at some > point) Also, if anyone > does know of a good place to get this type of data, > can they please point me > in the right direction of the type of process said > sight uses, so I can > decide what version I want to use to implement the > process. > > Thanks, > > > Joe Manning > > _________________________________________________________________ > Get a FREE online computer virus scan from McAfee > when you click here. > http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963 > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! 
Mail - More reliable, more storage, less spam http://mail.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From peter at cs.usfca.edu Sun Mar 14 13:57:22 2004 From: peter at cs.usfca.edu (Peter Pacheco) Date: Sun, 14 Mar 2004 10:57:22 -0800 Subject: [Beowulf] Flashmob Supercomputer Message-ID: <20040314185722.GB14301@cs.usfca.edu> The University of San Francisco is sponsoring the first FlashMob Supercomputer on - Saturday, April 3, from 8 am to 6 pm, in the - Koret Center of the University of San Francisco. We're planning to network 1200-1400 laptops with Myrinet and Foundry Switches. We'll be running High-Performance Linpack, and we're hoping to achieve 600 GFLOPS, which is faster than some of the Top500 fastest supercomputers. We need volunteers to - Bring their laptops: Pentium III or IV or AMD, minimum requirements 1.3 GHz with 256 MBytes of RAM - Be table captains: help people set up laptops before running the benchmark - Speak on subjects related to high-performance computing For further information, please visit our website http://flashmobcomputing.org Peter Pacheco Department of Computer Science University of San Francisco San Francisco, CA 94117 peter at cs.usfca.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sun Mar 14 21:17:50 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Mon, 15 Mar 2004 10:17:50 +0800 (CST) Subject: [Beowulf] Oh MyGrid Message-ID: <20040315021750.49880.qmail@web16813.mail.tpe.yahoo.com> http://mygrid.sourceforge.net/ "MyGrid is designed with the modern concepts in mind, simple naming and transparent class hierarchy." It's targeting DataSynapse, licensed under GPL, and more features. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgoornaden at intnet.mu Sun Mar 14 22:12:55 2004 From: rgoornaden at intnet.mu (roudy) Date: Mon, 15 Mar 2004 07:12:55 +0400 Subject: [Beowulf] Re: Writing a parallel program Message-ID: <000701c40a3b$9a415e60$2b007bca@roudy> Hello, I don't know if it will be here that I can get a solution to my problem. Well, I have an array of elements and I would like to divide the array by the number of processors and then each processor process parts of the whole array. Below is the source code of how I am proceeding, can someone tell me what is wrong? Assume that the I have an array allval[tdegree] void share_data(void) { double nleft; int i, k, j, nmin; nmin = tdegree/size; /* Number of degrees to be handled by each processor */ nleft = tdegree%size; for(i=0;i References: <404EC427.7070200@ulakbim.gov.tr> Message-ID: <40556EA3.60400@ulakbim.gov.tr> Hi, We have built a beowulf Debian cluster that contains 128 PIV nodes and one dual xeon server. I need some help about SPBS (Storm). We have already installed SPBS on the server and nodes and all daemons seem to work regularly. 
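(Back to the array-splitting question from roudy a few messages up, whose share_data() is cut off in the archive: the usual MPI way to do that block split, remainder included, is MPI_Scatterv/MPI_Gatherv. The sketch below is not a reconstruction of his missing code; the names allval and tdegree are borrowed from his fragment, everything else is assumed.)

/* Block-distributing an array with MPI_Scatterv/MPI_Gatherv (sketch).
 * Each rank gets tdegree/size elements, and the first tdegree%size ranks
 * get one extra -- the bookkeeping the hand-rolled nmin/nleft loop attempts.
 */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define TDEGREE 1000                     /* assumed array length */

int main(int argc, char **argv)
{
    int rank, size, i;
    double *allval = NULL, *myval;
    int *counts, *displs;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    counts = malloc(size * sizeof(int));
    displs = malloc(size * sizeof(int));
    for (i = 0; i < size; i++) {         /* per-rank element counts and offsets */
        counts[i] = TDEGREE / size + (i < TDEGREE % size ? 1 : 0);
        displs[i] = (i == 0) ? 0 : displs[i - 1] + counts[i - 1];
    }

    if (rank == 0) {                     /* master owns the full array */
        allval = malloc(TDEGREE * sizeof(double));
        for (i = 0; i < TDEGREE; i++)
            allval[i] = (double)i;
    }
    myval = malloc(counts[rank] * sizeof(double));

    MPI_Scatterv(allval, counts, displs, MPI_DOUBLE,
                 myval, counts[rank], MPI_DOUBLE, 0, MPI_COMM_WORLD);

    for (i = 0; i < counts[rank]; i++)   /* each node processes its own piece */
        myval[i] = myval[i] * myval[i];

    MPI_Gatherv(myval, counts[rank], MPI_DOUBLE,
                allval, counts, displs, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0) {                     /* master prints the collected result */
        printf("allval[0]=%g  allval[%d]=%g\n", allval[0], TDEGREE - 1, allval[TDEGREE - 1]);
        free(allval);
    }
    free(myval); free(counts); free(displs);
    MPI_Finalize();
    return 0;
}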
When any job is given to the system by using pbs scripting, the job can be seen on defined queue by running status and related nodes are allocated for the job. On the other hand there is no cpu or memory consumption on the nodes, the job does not run exactly and at the end of estimated cpu time there is no output file. Can anyone give some advice on my problem about SPBS. Thank you... Burcu Akcan ULAKBIM High Performance Computing Center _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From klamman.gard at telia.com Mon Mar 15 13:42:43 2004 From: klamman.gard at telia.com (Per Lindstrom) Date: Mon, 15 Mar 2004 19:42:43 +0100 Subject: [Beowulf] MOSIX cluster Message-ID: <4055F923.70203@telia.com> Hi, I wonder if some of you have experience of MOSIX? (www.mosix.org) What do you think about that solution for FEA-simulations? Can MOSIX be regarded as a form of a Beowulf cluster? Best regards Per Lindstrom Per.Lindstrom at me.chalmers.se , klamman.gard at telia.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john4482 at umn.edu Mon Mar 15 15:02:40 2004 From: john4482 at umn.edu (Eric R Johnson) Date: Mon, 15 Mar 2004 14:02:40 -0600 Subject: [Beowulf] Scyld system mysteriously locks up Message-ID: <40560BE0.1090808@umn.edu> Hello, I purchased a 4 node, 8 processor Scyld (version 28) cluster approximately 6 months ago. About 5 days ago, it started mysteriously locking up on me. Once it is locked up, I can't do anything except physically reboot the machine. Unfortunately, I am rather new to Linux clusters and, since it worked "right out of the box", I have had no experience in troubleshooting. Can someone give me an idea of where I should start? I have the BIOS on all machines set to do a full memory check on startup and the /var/log/message file shows nothing. Thanks, Eric -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Mon Mar 15 16:23:34 2004 From: agrajag at dragaera.net (Jag) Date: Mon, 15 Mar 2004 16:23:34 -0500 Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <20040312130636.D49119@blnsrv1.science-computing.de> References: <20040312130636.D49119@blnsrv1.science-computing.de> Message-ID: <1079385814.4352.86.camel@pel> On Fri, 2004-03-12 at 07:06, Michael Arndt wrote: > Hello * > > has anyone done a wireless technology uplink to a compute cluster > that is in real use ? > If so, i would be interested to know how and how is the experinece in > transferring "greater" (e.g. 
2 GB ++ ) Result files? > > explanation: > We have a cluster with gigabit interconnect > where it would make life cheaper, if there is a possibility to upload > input data and download output data via wireless link, since connecting > twisted pair between WS and CLuster would be expensive. Depending on your setup, some kind of "wireless" besides 802.11[bg] may be worth considering. I'm assuming the expense in wiring the WS to the cluster isn't wire costs so much as where you'd have to put the cable. One thing you might consider is IR uplink. I don't remember what speed they get, but a few years back I saw a college use IR to get connectivity to a building, that otherwise would have required digging up a busy public street to wire. In the long run it was a lot cheaper. If your expense in wiring is something similar, you may want to look into IR or similar technologies. (The IR guns weren't cheap by any means, except when compared to digging up a city street) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mnerren at paracel.com Mon Mar 15 18:21:04 2004 From: mnerren at paracel.com (micah nerren) Date: Mon, 15 Mar 2004 15:21:04 -0800 Subject: [Beowulf] Scyld system mysteriously locks up In-Reply-To: <40560BE0.1090808@umn.edu> References: <40560BE0.1090808@umn.edu> Message-ID: <1079392863.27739.25.camel@angmar> On Mon, 2004-03-15 at 12:02, Eric R Johnson wrote: > Hello, > > I purchased a 4 node, 8 processor Scyld (version 28) cluster > approximately 6 months ago. About 5 days ago, it started mysteriously > locking up on me. Once it is locked up, I can't do anything except > physically reboot the machine. I would check heating issues. Has the ventilation changed, does the machine feel hot? How long between lockups? Micah _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Tue Mar 16 04:42:30 2004 From: john.hearns at clustervision.com (John Hearns) Date: Tue, 16 Mar 2004 10:42:30 +0100 (CET) Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <1079385814.4352.86.camel@pel> Message-ID: On Mon, 15 Mar 2004, Jag wrote: > be worth considering. I'm assuming the expense in wiring the WS to the > cluster isn't wire costs so much as where you'd have to put the cable. > One thing you might consider is IR uplink. I don't remember what speed > they get, but a few years back I saw a college use IR to get > connectivity to a building, that otherwise would have required digging > up a busy public street to wire. In the long run it was a lot cheaper. When I worked in Soho, we had a laser link over the rooftops of London. At the time a 155Mbps ATM link, which we later used for 100Mbps Ethernet. Main problem was cleaning the lenses every so often, in the lovely London air conditions. We later put in a gigabit laser from Nbase to another building. We needed much more bandwidth than 100Mbps in the end, and had our own trench dug and put in dark fibre. 
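(For anyone sizing up such an uplink, the transfer-time arithmetic Jim Lux used earlier in this thread is easy to reproduce; the 2 GB file and the 20 Mbit/s effective rate below are the same assumed numbers, not measurements.)

/* Back-of-envelope transfer time: file size / effective throughput.
 * Note the 20 Mbit/s is the optimistic *effective* rate discussed above,
 * not the advertised 54 Mbit/s signalling rate.
 */
#include <stdio.h>

int main(void)
{
    double file_gbytes = 2.0;    /* the 2 GB result file */
    double eff_mbps    = 20.0;   /* assumed effective throughput */
    double seconds     = file_gbytes * 8.0 * 1024.0 / eff_mbps;  /* GB -> Mbit -> s */

    printf("%.1f GB at %.0f Mbit/s effective: about %.0f minutes\n",
           file_gbytes, eff_mbps, seconds / 60.0);
    return 0;
}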
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Tue Mar 16 04:58:23 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Tue, 16 Mar 2004 17:58:23 +0800 (CST) Subject: [Beowulf] MOSIX cluster In-Reply-To: <4055F923.70203@telia.com> Message-ID: <20040316095823.57806.qmail@web16813.mail.tpe.yahoo.com> Since you know the number of tasks your simulations use, I think using a batch system would make it easier to management - MOSIX is usually for jobs which are very dynamic. You can take a look at the common batch systems such as SGE or SPBS. http://gridengine.sunsource.net http://www.supercluster.org/projects/torque/ Andrew. --- Per Lindstrom ????> Hi, > > I wonder if some of you have experience of MOSIX? > (www.mosix.org) > > What do you think about that solution for > FEA-simulations? > > Can MOSIX be regarded as a form of a Beowulf > cluster? > > Best regards > Per Lindstrom > > Per.Lindstrom at me.chalmers.se , > klamman.gard at telia.com > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From prml at na.chalmers.se Mon Mar 15 13:39:23 2004 From: prml at na.chalmers.se (Per R M Lindstrom) Date: Mon, 15 Mar 2004 19:39:23 +0100 (CET) Subject: [Beowulf] (no subject) Message-ID: Hi, I wonder if some of you have experience of MOSIX? (www.mosix.org) What do you think about that solution for FEA-simulations? Can MOSIX be regarded as a form of a Beowulf cluster? Best regards Per Lindstrom _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bioinformaticist at mn.rr.com Mon Mar 15 14:49:36 2004 From: bioinformaticist at mn.rr.com (Eric R Johnson) Date: Mon, 15 Mar 2004 13:49:36 -0600 Subject: [Beowulf] Scyld system mysteriously locks up Message-ID: <405608D0.60501@mn.rr.com> Hello, I purchased a 4 node, 8 processor Scyld (version 28) cluster approximately 6 months ago. About 5 days ago, it started mysteriously locking up on me. Once it is locked up, I can't do anything except physically reboot the machine. Unfortunately, I am rather new to Linux clusters and, since it worked "right out of the box", I have had no experience in troubleshooting. Can someone give me an idea of where I should start? I have the BIOS on all machines set to do a full memory check on startup and the /var/log/message file shows nothing. Thanks, Eric -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. 
of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From br66 at HPCL.CSE.MsState.Edu Mon Mar 15 18:09:37 2004 From: br66 at HPCL.CSE.MsState.Edu (Balaji Rangasamy) Date: Mon, 15 Mar 2004 17:09:37 -0600 (CST) Subject: [Beowulf] MPICH Exporting environment variables. Message-ID: Hi, Has anyone successfully exported any environment variables (specifically LD_PRELOAD) in MPICH? There is an easy way to do this with LAM/MPI; there is this -x switch in mpirun command that comes with LAM/MPI that will export the environment variable you specify to all the child processes. Is there any easy way to do this in MPICH? Thanks for your reply, Balaji. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tsariysk at craft-tech.com Tue Mar 16 14:12:46 2004 From: tsariysk at craft-tech.com (Ted Sariyski) Date: Tue, 16 Mar 2004 14:12:46 -0500 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> Message-ID: <405751AE.2040806@craft-tech.com> Hi, I am about to configure a 16 node dual xeon cluster based on Supermicro X5DPA-TGM motherboard. The cluster may grow so I am looking for a manageable, nonblocking 24 or 32 port gigabit switch. Any comments or recommendations will be highly appreciated. Thanks, Ted _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Tue Mar 16 13:49:02 2004 From: agrajag at dragaera.net (Sean Dilda) Date: Tue, 16 Mar 2004 13:49:02 -0500 Subject: [Beowulf] Scyld system mysteriously locks up In-Reply-To: <405608D0.60501@mn.rr.com> References: <405608D0.60501@mn.rr.com> Message-ID: <1079462942.4354.49.camel@pel> On Mon, 2004-03-15 at 14:49, Eric R Johnson wrote: > Hello, > > I purchased a 4 node, 8 processor Scyld (version 28) cluster > approximately 6 months ago. About 5 days ago, it started mysteriously > locking up on me. Once it is locked up, I can't do anything except > physically reboot the machine. > Unfortunately, I am rather new to Linux clusters and, since it worked > "right out of the box", I have had no experience in troubleshooting. > Can someone give me an idea of where I should start? > I have the BIOS on all machines set to do a full memory check on startup > and the /var/log/message file shows nothing. It might be useful to try to figure out what is locking up. Is it just the head node that's locking? Have you made any recent changes that might account for it? Or are you running any new programs that might be stressing the machine in a way it wasn't stressed before? If its completely locking (if you can no longer toggle the numlock light on your keyboard, then its completely locked), then its either a kernel hang, or a hardware issue. 
If the kernel is the same and the usage pattern hasn't changed, then it might be a hardware issue. Hardware can degrade over time and dying hardware can be unpredictable. You may also consider contacting Scyld, and possibly the hardware manufacturer for help diagnosing the problem. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From david.n.lombard at intel.com Tue Mar 16 16:19:12 2004 From: david.n.lombard at intel.com (Lombard, David N) Date: Tue, 16 Mar 2004 13:19:12 -0800 Subject: [Beowulf] MOSIX for FEA (was: no subject) Message-ID: <187D3A7CAB42A54DB61F1D05F0125722025F5662@orsmsx402.jf.intel.com> From: Per R M Lindstrom; Monday, March 15, 2004 10:39 AM > > Hi, > > I wonder if some of you have experience of MOSIX? (www.mosix.org) > > What do you think about that solution for FEA-simulations? As with all things, "it depends." More specifically, it depends on the characteristics of the FEA app. For the FEA app that I have intimate familiarity with, MOSIX would not work well at all. The reason is the app is highly sensitive to sustained memory bandwidth and sustained disk I/O bandwidth. While memory bandwidth is not an issue with MOSIX, disk I/O bandwidth will become an issue once MOSIX migrates a process to balance CPU load. The (local scratch) disk I/O will then be forced through both the current and original nodes, severely impacting the bandwidth. Having said that, I can imagine an in-memory FEA app that could work quite well on MOSIX. More specifically, the hypothetical app would read its data from disk, crunch for a while, and then write its results to disk. -- David N. Lombard My comments represent my opinions, not those of Intel Corporation _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gropp at mcs.anl.gov Tue Mar 16 15:14:58 2004 From: gropp at mcs.anl.gov (William Gropp) Date: Tue, 16 Mar 2004 14:14:58 -0600 Subject: [Beowulf] MPICH Exporting environment variables. In-Reply-To: References: Message-ID: <6.0.0.22.2.20040316141246.025e4f48@localhost> At 05:09 PM 3/15/2004, Balaji Rangasamy wrote: >Hi, >Has anyone successfully exported any environment variables (specifically >LD_PRELOAD) in MPICH? There is an easy way to do this with LAM/MPI; there >is this -x switch in mpirun command that comes with LAM/MPI that will >export the environment variable you specify to all the child processes. Is >there any easy way to do this in MPICH? It depends on the process manager/startup system that you are using with MPICH. With the "p4 secure server", environment variables can be exported. With the default ch_p4 device, environment variables are not exported. Under MPICH2, most process managers export the environment to the user processes. Bill _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From csamuel at vpac.org Tue Mar 16 22:09:44 2004 From: csamuel at vpac.org (Chris Samuel) Date: Wed, 17 Mar 2004 14:09:44 +1100 Subject: [Beowulf] cfengine users ? 
Message-ID: <200403171409.45273.csamuel@vpac.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi folks, Anyone out there using cfengine to manage clusters, or who's tried and failed? Just curious as to whether it's worth looking at.. cheers! Chris - -- Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin Victorian Partnership for Advanced Computing http://www.vpac.org/ Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQFAV8F4O2KABBYQAh8RAth7AJ9NkRhIUqcykX1zWGZyi/vZcB7JhwCgkVej uX5R/EcQrBPX+/Pyew55FC0= =tRe+ -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Wed Mar 17 05:09:28 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Wed, 17 Mar 2004 10:09:28 +0000 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <405751AE.2040806@craft-tech.com> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> <405751AE.2040806@craft-tech.com> Message-ID: <200403171009.i2HA9S314735@heppcb.ph.qmw.ac.uk> On Tuesday 16 March 2004 7:12 pm, Ted Sariyski wrote: > Hi, > I am about to configure a 16 node dual xeon cluster based on Supermicro > X5DPA-TGM motherboard. The cluster may grow so I am looking for a > manageable, nonblocking 24 or 32 port gigabit switch. Any comments or > recommendations will be highly appreciated. > Thanks, > Ted > You might want to look at the HP ProCurve 2824 or 2848 series. We choose the latter, because it means we only need one switch per (logical) rack and the cost/port is pretty low. I can't yet comment on performance. cheers, Alex -- ------------------------------------------------------------------------------ | | | Dr. Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bogdan.costescu at iwr.uni-heidelberg.de Wed Mar 17 07:17:06 2004 From: bogdan.costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed, 17 Mar 2004 13:17:06 +0100 (CET) Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <200403171009.i2HA9S314735@heppcb.ph.qmw.ac.uk> Message-ID: On Wed, 17 Mar 2004, Alex Martin wrote: > You might want to look at the HP ProCurve 2824 or 2848 series. We > choose the latter, because it means we only need one switch per > (logical) rack and the cost/port is pretty low. I can't yet comment > on performance. I'm interested in buying a 48 port Gigabit switch as well, and I was looking at the 2848 as it has the advantage of 48 ports in only 1U. One thing that is not clear from the descriptions that I find on the net is if it has support for Jumbo frames. Does the documentation that come with it mention something like this or, even better, have you tried using Jumbo frames ? I'm also interested in hearing opinions about other 48 ports Gigabit switches. 
-- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Wed Mar 17 08:04:10 2004 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Wed, 17 Mar 2004 05:04:10 -0800 (PST) Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: Message-ID: On Wed, 17 Mar 2004, Bogdan Costescu wrote: > On Wed, 17 Mar 2004, Alex Martin wrote: > > > You might want to look at the HP ProCurve 2824 or 2848 series. We > > choose the latter, because it means we only need one switch per > > (logical) rack and the cost/port is pretty low. I can't yet comment > > on performance. > > I'm interested in buying a 48 port Gigabit switch as well, and I was > looking at the 2848 as it has the advantage of 48 ports in only 1U. > One thing that is not clear from the descriptions that I find on the > net is if it has support for Jumbo frames. Does the documentation that > come with it mention something like this or, even better, have you > tried using Jumbo frames ? hp does not support jumbo frames on anything except their high-end l3 products... > I'm also interested in hearing opinions about other 48 ports Gigabit > switches. > > -- -------------------------------------------------------------------------- Joel Jaeggli Unix Consulting joelja at darkwing.uoregon.edu GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tsariysk at craft-tech.com Wed Mar 17 07:28:34 2004 From: tsariysk at craft-tech.com (Ted Sariyski) Date: Wed, 17 Mar 2004 07:28:34 -0500 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: References: Message-ID: <40584472.1050600@craft-tech.com> If jumboframes are important you may look at Foundry EdgeIron 24G or 48G. Ted Bogdan Costescu wrote: > On Wed, 17 Mar 2004, Alex Martin wrote: > > >>You might want to look at the HP ProCurve 2824 or 2848 series. We >>choose the latter, because it means we only need one switch per >>(logical) rack and the cost/port is pretty low. I can't yet comment >>on performance. > > > I'm interested in buying a 48 port Gigabit switch as well, and I was > looking at the 2848 as it has the advantage of 48 ports in only 1U. > One thing that is not clear from the descriptions that I find on the > net is if it has support for Jumbo frames. Does the documentation that > come with it mention something like this or, even better, have you > tried using Jumbo frames ? > > I'm also interested in hearing opinions about other 48 ports Gigabit > switches. > -- Ted Sariyski ------------ Combustion Research and Flow Technology, Inc. 6210 Keller's Church Road Pipersville, PA 18947 Tel: 215-766-1520 Fax: 215-766-1524 www.craft-tech.com tsariysk at craft-tech.com ----------------------- "Our experiment is perfect and is not limited by fundamental principles." 
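Whichever switch ends up in the rack, it is worth confirming that jumbo frames are actually in effect end to end, since a single 1500-byte hop silently caps the whole path. The following is only a minimal sketch of a node-side check, assuming a Linux host and an interface name such as eth0 (adjust to whatever your nodes use); it reads the interface MTU with the standard SIOCGIFMTU ioctl.

/* mtucheck.c - print the MTU of a network interface (name is arbitrary).
 * Build: gcc -O2 mtucheck.c -o mtucheck    Run: ./mtucheck eth0
 */
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct ifreq ifr;
    const char *dev = (argc > 1) ? argv[1] : "eth0";
    int fd = socket(AF_INET, SOCK_DGRAM, 0);   /* any socket will do for the ioctl */

    if (fd < 0) {
        perror("socket");
        return 1;
    }
    strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);
    ifr.ifr_name[IFNAMSIZ - 1] = '\0';
    if (ioctl(fd, SIOCGIFMTU, &ifr) < 0) {     /* fetch the current MTU */
        perror("SIOCGIFMTU");
        close(fd);
        return 1;
    }
    printf("%s MTU = %d\n", dev, ifr.ifr_mtu);
    close(fd);
    return 0;
}

Run it on each node after raising the MTU (typically something like ifconfig eth0 mtu 9000, or the equivalent in your distribution's network scripts) and compare with what the switch claims to support; a mismatch anywhere on the path means jumbo frames are not really being used.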
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From canon at nersc.gov Wed Mar 17 10:26:28 2004 From: canon at nersc.gov (canon at nersc.gov) Date: Wed, 17 Mar 2004 07:26:28 -0800 Subject: [Beowulf] cfengine users ? In-Reply-To: Message from Chris Samuel of "Wed, 17 Mar 2004 14:09:44 +1100." <200403171409.45273.csamuel@vpac.org> Message-ID: <200403171526.i2HFQSni004735@pookie.nersc.gov> Chris, We use cfengine to help manage our ~400 node linux cluster and 416 nodes (6656 processor) SP system. I highly recommend it. We typically use an rpm update script (we are moving to yum now) to manage the binaries and use cfengine to manage config files and scripts. There are some aspects of cfengine that can be a little convoluted, but it is very flexible. --Shane ------------------------------------------------------------------------ Shane Canon PSDF Project Lead National Energy Research Scientific Computing Center 1 Cyclotron Road Mailstop 943-256 Berkeley, CA 94720 ------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From anandv at singnet.com.sg Wed Mar 17 00:40:39 2004 From: anandv at singnet.com.sg (Anand Vaidya) Date: Wed, 17 Mar 2004 13:40:39 +0800 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <405751AE.2040806@craft-tech.com> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> <405751AE.2040806@craft-tech.com> Message-ID: <200403171340.39601.anandv@singnet.com.sg> You can try Foundry Networks EIF24G or EIF48G, offers full BW, 1U, we like it. -Anand On Wednesday 17 March 2004 03:12, Ted Sariyski wrote: > Hi, > I am about to configure a 16 node dual xeon cluster based on Supermicro > X5DPA-TGM motherboard. The cluster may grow so I am looking for a > manageable, nonblocking 24 or 32 port gigabit switch. Any comments or > recommendations will be highly appreciated. > Thanks, > Ted > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Thu Mar 18 10:26:51 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Thu, 18 Mar 2004 10:26:51 -0500 (EST) Subject: [Beowulf] Intel CSA performance? Message-ID: Intel added a special connection on their chipset to connect gigabit on some chipsets (CSA). I've been wondering whether this would offer a latency advantage, since it's conventional wisdom that PCI latency is a noticable part of MPI latency. this article: http://tinyurl.com/2vlez claims that CSA actually hurts latency, which is a bit puzzling. it is, admittedly, "gamepc.com", so perhaps they are unaware of tuning issues like interrupt-coalescing/mitigation. do any of you have CSA-based networks and have done performance tests? thanks, mark hahn. 
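For anyone wanting to run such a test themselves, the usual way to put a number on this is a small ping-pong benchmark (NetPIPE and netperf do the same thing more thoroughly). What follows is only a minimal sketch at the MPI level, assuming an MPI implementation such as MPICH or LAM is installed and that mpicc/mpirun are on the path (the file name is arbitrary); over plain gigabit Ethernet the result will be dominated by the kernel TCP stack and the switch rather than by the PCI-vs-CSA attachment, so treat it as an application-visible number, not a measurement of the chipset itself.

/* pingpong.c - half round-trip latency for 1-byte messages between ranks 0 and 1.
 * Build: mpicc -O2 pingpong.c -o pingpong
 * Run:   mpirun -np 2 -machinefile hosts ./pingpong   (one process per node)
 */
#include <stdio.h>
#include <mpi.h>

#define NWARM 1000    /* warm-up round trips, not timed */
#define NITER 10000   /* timed round trips */

int main(int argc, char **argv)
{
    int rank, size, i;
    char buf = 0;
    double t0 = 0.0, t1 = 0.0;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "needs at least 2 processes\n");
        MPI_Finalize();
        return 1;
    }

    for (i = 0; i < NWARM + NITER; i++) {
        if (i == NWARM) {                 /* start the clock after warm-up */
            MPI_Barrier(MPI_COMM_WORLD);
            t0 = MPI_Wtime();
        }
        if (rank == 0) {
            MPI_Send(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &st);
        } else if (rank == 1) {
            MPI_Recv(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("1-byte half round-trip latency: %.2f microseconds\n",
               (t1 - t0) / NITER / 2.0 * 1e6);

    MPI_Finalize();
    return 0;
}

Comparing the same run over a directly wired port and over whatever CSA-attached or tuned (interrupt-coalescing) configuration is in question should show whether the difference is even visible above the rest of the path.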
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From venkatraman at programmer.net Thu Mar 18 07:03:57 2004 From: venkatraman at programmer.net (Venkatraman Madurai Venkatasubramanyam) Date: Thu, 18 Mar 2004 07:03:57 -0500 Subject: [Beowulf] Suggest me on my attempt!! Message-ID: <20040318120357.A52A91D435B@ws1-12.us4.outblaze.com> Hello people! I am a Computer Science and Engineering student from India. I am planning to build a Beowulf cluster for my project as a part of my curriculum. The resources I have are four laptops (HP Compaq Presario 2100 series) with an Intel Celeron 2 GHz, an 18 GB HDD and 192 MB RAM each, and I don't know what else I should specify here. I have Red Hat Linux 9 running on them. So I am seeking your help on how to build a cluster. Please show me a way, as I am new to the Linux platform. If you can personally help me, I would really appreciate it. MOkShAA. -- ___________________________________________________________ Sign-up for Ads Free at Mail.com http://promo.mail.com/adsfreejump.htm _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Mar 18 15:01:10 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 18 Mar 2004 15:01:10 -0500 (EST) Subject: [Beowulf] Suggest me on my attempt!! In-Reply-To: <20040318120357.A52A91D435B@ws1-12.us4.outblaze.com> Message-ID: On Thu, 18 Mar 2004, Venkatraman Madurai Venkatasubramanyam wrote: > Hello people! > I am a Computer Science and Engineering student from India. I am > planning to build a Beowulf cluster for my project as a part of my > curriculum. The resources I have are four laptops (HP Compaq Presario > 2100 series) with an Intel Celeron 2 GHz, an 18 GB HDD and 192 MB RAM > each, and I don't know what else I should specify here. I have Red Hat > Linux 9 running on them. So I am seeking your help on how to build a > cluster. Please show me a way, as I am new to the Linux platform. If you > can personally help me, I would really appreciate it. a) Visit http://www.phy.duke.edu/brahma Among other things on this site is an online book on building clusters. Read/skim it. b) In your case the recipe is almost certainly going to be: i) Put laptops on a common switched network (cheap 100 Mbps switch). ii) Install PVM, MPI (lam and/or mpich), programming tools and support if you haven't already on all nodes. iii) Set them up with a common home directory space NFS exported from one to the rest, and with common accounts to match. You can distribute account information on so small a cluster by just copying e.g. /etc/passwd and /etc/group and so on or by using NIS (or other ways). iv) Set up a remote shell so that you can freely login from any node to any other node without a password. I recommend ssh (openssh rpms) but rsh is OK if your network is otherwise isolated and secure. v) Obtain, write, build parallel applications to explore what your cluster can do. There are demo programs for both PVM and MPI that come with the distributions and more are available on the web (a minimal MPI example is sketched just after this list). There is a PVM program template and an example PVM application suitable for demonstrating scaling (also a potential template for master/slave code) on: http://www.phy.duke.edu/~rgb under "General". vi) Proceed from there as your skills increase.
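As a concrete starting point for step v), a first smoke test can be nothing more than the classic MPI hello-world sketched below. It only assumes the mpicc and mpirun wrappers that come with the MPICH or LAM packages from step ii) (the file name is arbitrary); running it with one process per laptop confirms in one go that the network, the common accounts and the password-less remote shell are all working.

/* hello.c - trivial MPI smoke test for a new cluster.
 * Build:  mpicc hello.c -o hello
 * Run:    mpirun -np 4 ./hello      (one process per laptop)
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes in total */
    MPI_Get_processor_name(host, &len);     /* which node am I running on */

    printf("hello from rank %d of %d on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}

Once this prints one line per node, the same build-and-run cycle carries over to the PVM and MPI demo programs mentioned above.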
I think that you'll find that after this you'll be in pretty good shape for further progress, guided as you think necessary by this list. There are also books out there that can help, but they cost money. Finally, I'd strongly suggest subscribing to Cluster World Magazine, where there are both articles and monthly columns that cover how to do all of the above and much more. rgb > MOkShAA. > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rouds at servihoo.com Fri Mar 19 06:48:38 2004 From: rouds at servihoo.com (RoUdY) Date: Fri, 19 Mar 2004 15:48:38 +0400 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: <200310011901.h91J1LY06826@NewBlue.Scyld.com> Message-ID: Hello I really need a very big hand from you... I have to run a program on my cluster for the final year project, which require a lot of computation power... Can someone sent me a program (the source code) or a site where i can download a big program PLEASE ... Using MPI.... Hope to hear from you Roud -------------------------------------------------- Get your free email address from Servihoo.com! http://www.servihoo.com The Portal of Mauritius _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Fri Mar 19 09:31:25 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: Fri, 19 Mar 2004 14:31:25 +0000 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: References: Message-ID: <1079706684.2520.1.camel@tp1.mesh-hq> On Fri, 2004-03-19 at 11:48, RoUdY wrote: > I have to run a program on my cluster for the final year > project, which require a lot of computation power... > Can someone sent me a program (the source code) or a site > where i can download a big program PLEASE ... > Using MPI.... Try HPL (High-Performance Linpack): http://www.netlib.org/benchmark/hpl/ best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Graabroedrestraede 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gropp at mcs.anl.gov Fri Mar 19 08:43:43 2004 From: gropp at mcs.anl.gov (William Gropp) Date: Fri, 19 Mar 2004 07:43:43 -0600 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: References: <200310011901.h91J1LY06826@NewBlue.Scyld.com> Message-ID: <6.0.0.22.2.20040319074111.02505e60@localhost> At 05:48 AM 3/19/2004, RoUdY wrote: >Hello >I really need a very big hand from you... >I have to run a program on my cluster for the final year project, which >require a lot of computation power... >Can someone sent me a program (the source code) or a site where i can >download a big program PLEASE ... >Using MPI.... >Hope to hear from you Roud There are many examples included with PETSc (www.mcs.anl.gov/petsc) that can be sized to use as much power as you have. 
HPLinpack will also use as much computational power as you have and allows you to compare your cluster to the Top500 list. Both use MPI for communication. Bill _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gharinarayana at yahoo.com Fri Mar 19 11:34:57 2004 From: gharinarayana at yahoo.com (HARINARAYANA G) Date: Fri, 19 Mar 2004 08:34:57 -0800 (PST) Subject: [Beowulf] Give an application to PARALLELIZE Message-ID: <20040319163457.3051.qmail@web11306.mail.yahoo.com> Dear friends, Please give me a very good application which uses pda(algorithms) and MPI to the maximum extent and which is POSSIBLE to do in 2 months(It's OK even if you have done it already, just send the NAME of the topic and the problem requirements). I am doing my Bachelor of Engineering in Comp. Science at RNSIT,Bangalore,INDIA. I am with a team of 4 people. With regards, Sivaram. __________________________________ Do you Yahoo!? Yahoo! Mail - More reliable, more storage, less spam http://mail.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 19 21:18:31 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 20 Mar 2004 10:18:31 +0800 (CST) Subject: [Beowulf] GridEngine 6.0 beta is ready! Message-ID: <20040320021831.65847.qmail@web16811.mail.tpe.yahoo.com> It's finally available, follow this link to download the binary packages or source: http://gridengine.sunsource.net/project/gridengine/news/SGE60beta-announce.html Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Fri Mar 19 21:51:56 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Fri, 19 Mar 2004 18:51:56 -0800 Subject: [Beowulf] Intel CSA performance? In-Reply-To: References: Message-ID: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> On Thu, Mar 18, 2004 at 10:26:51AM -0500, Mark Hahn wrote: > Intel added a special connection on their chipset to connect > gigabit on some chipsets (CSA). I've been wondering whether > this would offer a latency advantage, since it's conventional wisdom > that PCI latency is a noticable part of MPI latency. Eh? PCI latency can be noticable when you have a low latency network, but gigE latency isn't nearly that low, especially once you've gone through a switch. The only reference to gigabit latency in the article didn't say what they measured. I'd assume that it was using the normal drivers, which means the kernel networking stack, which means you're looking through the telescope from the wrong end. 
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From desi_star786 at yahoo.com Sat Mar 20 13:38:10 2004 From: desi_star786 at yahoo.com (desi star) Date: Sat, 20 Mar 2004 10:38:10 -0800 (PST) Subject: [Beowulf] Problem running Jaguar on Scyld-Beowulf in parallel mode. Message-ID: <20040320183810.94267.qmail@web40812.mail.yahoo.com> Hi.. I have installed a molecular modeling software Jaguar by Schrodinger Inc. on my scyld-beowulf 16 node cluster. The software runs perfectly fine on the master node but gives an error when I try to run the program on more than one CPU. User manual of the program suggests following steps to run Jaguar in parallel mode: 1. Install MPICH and configure with option: --with-comm=shared --with-device=ch_p4 2. Edit the machine.LINUX file in the MPICH directory and list the name of the host and number of processors on that host. 3. Test that 'rsh' is working 4. Launch the secure server ch4p_servs We already have the MPICH installed on the cluster using package 'mpich-p4-inter-1.3.2-5_scyld.i368.rpm'. I do not know whether package installation was done with specific configure options in step#1. Do I need to re-install the MPICH? I know that MPICH works perfectly fine for the FORTRAN 90 programs on different nodes. Also, Is it really important to enable 'rsh' on scyld? The cluster is not protected by firewall so I want to use the more secure 'ssh' but then do I need to install the MPICH again telling it to use ssh rather than rsh for communication? I am also wondering if the reason I am not been able to run program on more than one CPU has to do with the fact that Jaguar is not linked to MPICH libraries? This is my first experience with MPICH and running programs in parallel. I would really appreciate quick tips and suggestions as to why I am not been to make Jaguar run in the parallel mode. Thanks in advance. Eagerly waiting for a response. -- Pratap Singh. Graduate Student, The Chemical and Biomolecular Eng. Johns Hopkins Univ. __________________________________ Do you Yahoo!? Yahoo! Finance Tax Center - File online. File on time. http://taxes.yahoo.com/filing.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rmyers1400 at comcast.net Fri Mar 19 22:58:53 2004 From: rmyers1400 at comcast.net (Robert Myers) Date: Fri, 19 Mar 2004 22:58:53 -0500 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> Message-ID: <405BC17D.3010504@comcast.net> Greg Lindahl wrote: >On Thu, Mar 18, 2004 at 10:26:51AM -0500, Mark Hahn wrote: > > > >>Intel added a special connection on their chipset to connect >>gigabit on some chipsets (CSA). I've been wondering whether >>this would offer a latency advantage, since it's conventional wisdom >>that PCI latency is a noticable part of MPI latency. >> >> > >Eh? PCI latency can be noticable when you have a low latency network, >but gigE latency isn't nearly that low, especially once you've gone >through a switch. > >The only reference to gigabit latency in the article didn't say what >they measured. 
I'd assume that it was using the normal drivers, which >means the kernel networking stack, which means you're looking through >the telescope from the wrong end. > > > I had thought it might be interesting to fool around with trying to use CSA for hyperscsi, but I think you're saying if you're going to use a switched network, don't bother, if you're trying to win on latency. When Intel abandoned infiniband and the memory controller hub sprouted this ethernet link, I figured that was their opening shot in stomping what's left of infiniband. Maybe it is, and they just don't care about latency, but it sounds like nobody's got any reliable information as to what the latency effects of CSA may be, anyway. Every indication I can find is that Intel has all its bets on ethernet, and I don't know that there is any technological obstacle to building a low-latency ethernet. RM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jimlux at earthlink.net Sat Mar 20 17:42:54 2004 From: jimlux at earthlink.net (Jim Lux) Date: Sat, 20 Mar 2004 14:42:54 -0800 Subject: [Beowulf] Wireless network speed for clusters Message-ID: <002b01c40ecc$cd7cec50$32a8a8c0@LAPTOP152422> Some preliminary results for those of you wondering just how slow it actually is... Configuration is basically this: node (Via EPIA C3 533MHz) running freevix kernel (ramdisk filesystem) wired connection through Dlink 5 port hub DWL-7000AP set up for point to multipoint 802.11a (5GHz band) luminiferous aether DWL-7000AP ancient 10Mbps hub Clunky PPro running Knoppix/debian Maxtor NAS with a NFS mount Pings with default 63 byte packets give 1.2-2.0 ms both ways... Compare to <0.1 ms with a wired connection (i.e. plugging a cable from the Dlink hub to the ancient hub) DHCP/PXE booting sort of works (not exhaustively tested) For some reason, the nodes can't see the NAS so NFS doesn't mount There are a lot of "issues" with the DWL-7000AP... I think it's trying to be clever about not routing traffic to MACs on the local side over the air, but then, it doesn't know to route the traffic to the NFS server. The DWL-7000's also don't like to be powered up with no live (as in responding to packets) device hooked up to them, so there's sort of a potential power sequencing thing with the EPIA boards and the DWL-7000AP. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Sat Mar 20 23:01:36 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Sat, 20 Mar 2004 20:01:36 -0800 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <405BC17D.3010504@comcast.net> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> <405BC17D.3010504@comcast.net> Message-ID: <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> On Fri, Mar 19, 2004 at 10:58:53PM -0500, Robert Myers wrote: > I had thought it might be interesting to fool around with trying to use > CSA for hyperscsi, but I think you're saying if you're going to use a > switched network, don't bother, if you're trying to win on latency. I've never heard of "hyperscsi", and I am not saying what you think I'm saying. 
What I am saying is that if you're going to use 1 gigabit Ethernet, which has high latency in the switches, AND go through the kernel, don't bother. I was pretty clear, so I don't see how you missed it. There are certainly many examples of switched networks that are low latency, such as Myrinet, IB, Quadrics, SCI, and so forth. > When Intel abandoned infiniband Intel has not abandoned Infiniband. They discontinued a 1X interface that was going to get stomped in the market that was developing more slowly than expected. Just like you drew the wrong lesson from what I said, don't draw the wrong lesson from what Intel did. > Every indication I can find is that Intel has all its bets on ethernet, This contradicts what Intel says. They are not betting against ethernet, but they are certainly encouraging FC and IB where FC and IB make sense. However, this is straying beyond beowulf, and I hope that this mailing list can avoid being the cesspool that comp.arch has been for many years. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rmyers1400 at comcast.net Sun Mar 21 01:40:30 2004 From: rmyers1400 at comcast.net (Robert Myers) Date: Sun, 21 Mar 2004 01:40:30 -0500 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> <405BC17D.3010504@comcast.net> <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> Message-ID: <405D38DE.1010409@comcast.net> Greg Lindahl wrote: >On Fri, Mar 19, 2004 at 10:58:53PM -0500, Robert Myers wrote: > > > >>I had thought it might be interesting to fool around with trying to use >>CSA for hyperscsi, but I think you're saying if you're going to use a >>switched network, don't bother, if you're trying to win on latency. >> >> > >I've never heard of "hyperscsi", and I am not saying what you think >I'm saying. What I am saying is that if you're going to use 1 gigabit >Ethernet, which has high latency in the switches, AND go through the >kernel, don't bother. I was pretty clear, so I don't see how you >missed it. There are certainly many examples of switched networks that >are low latency, such as Myrinet, IB, Quadrics, SCI, and so forth. > I should have been explicit. "If you are going through a switched _ethernet_ connection." If you do the groups.google.com search low-latency infiniband group:comp.arch author:Robert author:Myers you will find that you really don't need to educate me about the existence of low-latency interconnects. As to hyperscsi, I gather that it is incumbent only on others to check google. Hyperscsi is a way to pass raw data over ethernet without going through the TCP/IP stack: http://www.linuxdevices.com/files/misc/hyperscsi.pdf so it doesn't consume nearly the CPU resources that TCP/IP does without hardware offload, and I don't think CSA allows you to use separate hardware TCP/IP offload. It looks potentially interesting as a low-cost clustering interconnect, especially if, as I expect, Intel continues to push ethernet. 
RM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sun Mar 21 09:46:36 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sun, 21 Mar 2004 22:46:36 +0800 (CST) Subject: [Beowulf] Re: GridEngine 6.0 beta is ready! In-Reply-To: Message-ID: <20040321144636.49074.qmail@web16808.mail.tpe.yahoo.com> SGE 6.1 will be avaiable at the end of the year, so when the newer version of Rocks Cluster picks up SGE 6.0, SGE 6.1 will be available at around the same time. Andrew. --- "Mason J. Katz" ???T???G> Thanks for the update. We're not going to include > this in our April > release, but we will update to the official Opteron > port and remove our > version of this port. We hope to build experience > with SGE 6.0 in the > coming months and include it as part of our November > release as 6.0 > goes from beta to release. Thanks. > > -mjk > > On Mar 19, 2004, at 6:18 PM, Andrew Wang wrote: > > > It's finally available, follow this link to > download > > the binary packages or source: > > > > > http://gridengine.sunsource.net/project/gridengine/news/SGE60beta- > > > announce.html > > > > Andrew. > > > > > > > ----------------------------------------------------------------- > > ????????Yahoo!?????? > > > ?????????????????????????????????????????????????????????> > > http://tw.promo.yahoo.com/mail_premium/stationery.html > ----------------------------------------------------------------- ?C???? Yahoo!?_?? ?????C???B?????????B?R?A???????A???b?H?????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Mar 22 12:33:15 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 22 Mar 2004 09:33:15 -0800 Subject: [Beowulf] Give an application to PARALLELIZE In-Reply-To: <20040319163457.3051.qmail@web11306.mail.yahoo.com> Message-ID: <5.2.0.9.2.20040322093203.017e1000@mailhost4.jpl.nasa.gov> At 08:34 AM 3/19/2004 -0800, HARINARAYANA G wrote: >Dear friends, > >Please give me a very good application which uses >pda(algorithms) and MPI to the maximum extent and >which is POSSIBLE to do in 2 months(It's OK even if >you have done it already, just send the NAME of the >topic and the problem requirements). > > I am doing my Bachelor of Engineering in Comp. >Science at RNSIT,Bangalore,INDIA. > > I am with a team of 4 people. > >With regards, >Sivaram. A couple issues back of IEEE Proceedings, there were several papers describing doing acoustic source localization with a bunch of iPAQs. I don't know if they were doing MPI for node/node communication, but there's fairly extensive literature out there, and the papers describe the algorithms used. James Lux, P.E. 
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From clwang at csis.hku.hk Sun Mar 21 23:55:00 2004 From: clwang at csis.hku.hk (Cho Li Wang) Date: Mon, 22 Mar 2004 12:55:00 +0800 Subject: [Beowulf] Final Call : NPC2004 (Deadline: March 22, 2004) Message-ID: <405E71A4.1556E651@csis.hku.hk> ******************************************************************* NPC2004 IFIP International Conference on Network and Parallel Computing October 18-20, 2004 Wuhan, China http://grid.hust.edu.cn/npc04 - ------------------------------------------------------------------- Important Dates Paper Submission March 22, 2004 (extended) Author Notification May 1, 2004 Final Camera Ready Manuscript June 1, 2004 ******************************************************************* Call For Papers The goal of IFIP International Conference on Network and Parallel Computing (NPC 2004) is to establish an international forum for engineers and scientists to present their excellent ideas and experiences in all system fields of network and parallel computing. NPC 2004, hosted by the Huazhong University of Science and Technology, will be held in the city of Wuhan, China - the "Homeland of White Clouds and the Yellow Crane." Topics of interest include, but are not limited to: - Parallel & Distributed Architectures - Parallel & Distributed Applications/Algorithms - Parallel Programming Environments & Tools - Network & Interconnect Architecture - Network Security - Network Storage - Advanced Web and Proxy Services - Middleware Frameworks & Toolkits - Cluster and Grid Computing - Ubiquitous Computing - Peer-to-peer Computing - Multimedia Streaming Services - Performance Modeling & Evaluation Submitted papers may not have appeared in or be considered for another conference. Papers must be written in English and must be in PDF format. Detailed electronic submission instructions will be posted on the conference web site. The conference proceedings will be published by Springer Verlag in the Lecture Notes in Computer Science Series (cited by SCI). Best papers from NPC 2004 will be published in a special issue of International Journal of High Performance Computing and Networking (IJHPCN) after conference. ************************************************************************** Committee General Co-Chairs: H. J. Siegel Colorado State University, USA Guojie Li The Institute of Computing Technology, CAS, China Steering Committee Chair: Kemal Ebcioglu IBM T.J. Watson Research Center, USA Program Co-Chairs: Guangrong Gao University of Delaware, USA Zhiwei Xu Chinese Academy of Sciences, China Program Vice-Chairs: Victor K. Prasanna University of Southern California, USA Albert Y. 
Zomaya University of Sydney, Australia Hai Jin Huazhong University of Science and Technology, China Publicity Co-Chairs: Cho-Li Wang The University of Hong Kong, Hong Kong Chris Jesshope The University of Hull, UK Local Arrangement Chair: Song Wu Huazhong University of Science and Technology, China Steering Committee Members: Jack Dongarra University of Tennessee, USA Guangrong Gao University of Delaware, USA Jean-Luc Gaudiot University of California, Irvine, USA Guojie Li The Institute of Computing Technology, CAS, China Yoichi Muraoka Waseda University, Japan Daniel Reed University of North Carolina, USA Program Committee Members: Ishfaq Ahmad University of Texas at Arlington, USA Shoukat Ali University of Missouri-Rolla, USA Makoto Amamiya Kyushu University, Japan David Bader University of New Mexico, USA Luc Bouge IRISA/ENS Cachan, France Pascal Bouvry University of Luxembourg, Luxembourg Ralph Castain Los Alamos National Laboratory, USA Guoliang Chen University of Science and Technology of China, China Alain Darte CNRS, ENS-Lyon, France Chen Ding University of Rochester, USA Jianping Fan Institute of Computing Technology, CAS, China Xiaobing Feng Institute of Computing Technology, CAS, China Jean-Luc Gaudiot University of California, Irvine, USA Minyi Guo University of Aizu, Japan Mary Hall University of Southern California, USA Salim Hariri University of Arizona, USA Kai Hwang University of Southern California, USA Anura Jayasumana Colorado State Univeristy, USA Chris R. Jesshop The University of Hull, UK Ricky Kwok The University of Hong Kong, Hong Kong Francis Lau The University of Hong Kong, Hong Kong Chuang Lin Tsinghua University, China John Morrison University College Cork, Ireland Lionel Ni Hong Kong University of Science and Technology, Hong Kong Stephan Olariu Old Dominion University, USA Yi Pan Georgia State University, USA Depei Qian Xi'an Jiaotong University, China Daniel A. Reed University of North Carolina at Chapel Hill, USA Jose Rolim University of Geneva, Switzerland Arnold Rosenberg University of Massachusetts at Amherst, USA Sartaj Sahni University of Florida, USA Selvakennedy Selvadurai University of Sydney, Australia Franciszek Seredynski Polish Academy of Sciences, Poland Hong Shen Japan Advanced Institute of Science and Technology, Japan Xiaowei Shen IBM T. J. Watson Research Center, USA Gabby Silberman IBM Centers for Advanced Studies, USA Per Stenstrom Chalmers University of Technology, Sweden Ivan Stojmenovic University of Ottawa, Canada Ninghui Sun Institute of Computing Technology, CAS, China El-Ghazali Talbi University of Lille, France Domenico Talia University of Calabria, Italy Mitchell D. 
Theys University of Illinois at Chicago, USA Xinmin Tian Intel Corporation, USA Dean Tullsen University of California, San Diego, USA Cho-Li Wang The University of Hong Kong, Hong Kong Qing Yang University of Rhode Island, USA Yuanyuan Yang State University of New York at Stony Brook, USA Xiaodong Zhang College of William and Mary, USA Weimin Zheng Tsinghua University, China Bingbing Zhou University of Sydney, Australia Chuanqi Zhu Fudan University, China _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 12:20:47 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 09:20:47 -0800 Subject: [Beowulf] Re: scyld and jaguar Message-ID: <200403220920.47878.mwill@penguincomputing.com> Hi, I saw your email on the beowulf list, and have a few comments: 1. MPICH on Scyld does not require rsh or ssh but rather it will take advantage of the bproc features of Scyld to achieve the same faster. 2. If your fortran programs work fine, so should the c programs. Unless you have an executable that is statically linked with its own mpich implementation. You can test that by using 'ldd' on the executable, it will list which libraries it is loading. If there are no mpich libs mentioned, you might have a statically linked program. Let me know how it goes. Michael Will -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jeffrey.b.layton at lmco.com Mon Mar 22 15:03:35 2004 From: jeffrey.b.layton at lmco.com (Jeff Layton) Date: Mon, 22 Mar 2004 15:03:35 -0500 Subject: [Beowulf] NUMA Patches for AMD64 in 2.4? Message-ID: <405F4697.9070507@lmco.com> Good Afternoon! Does anyone know if the latest stock 2.4 kernel has the NUMA patches in it? If not, where can I get NUMA patches that will work for AMD64? TIA! Jeff -- Dr. Jeff Layton Aerodynamics and CFD Lockheed-Martin Aeronautical Company - Marietta _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Mon Mar 22 16:30:04 2004 From: landman at scalableinformatics.com (Joe Landman) Date: Mon, 22 Mar 2004 16:30:04 -0500 Subject: [Beowulf] NUMA Patches for AMD64 in 2.4? In-Reply-To: <405F4697.9070507@lmco.com> References: <405F4697.9070507@lmco.com> Message-ID: <405F5ADC.2080101@scalableinformatics.com> You can pull x86_64 patches from ftp://ftp.x86-64.org/pub/linux/v2.6/ . The 2.4 kernels would need backports in some cases (RedHat is doing this, and I think SUSE might be as well). Not sure if Fedora is doing this as well (no /proc/numa in it or in the SUSE 9.0 AMD64). Joe Jeff Layton wrote: > Good Afternoon! > > Does anyone know if the latest stock 2.4 kernel has the > NUMA patches in it? If not, where can I get NUMA patches > that will work for AMD64? > > TIA! 
> > Jeff > -- Joseph Landman, Ph.D Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From desi_star786 at yahoo.com Mon Mar 22 15:15:27 2004 From: desi_star786 at yahoo.com (desi star) Date: Mon, 22 Mar 2004 12:15:27 -0800 (PST) Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <200403220920.47878.mwill@penguincomputing.com> Message-ID: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Hi Mike, Thanks much for responding. Jaguar is indeed staticaly linked to the MPICH libraries as per manuals. When I ran the ldd commands as you suggested: -- $ ldd Jaguar not a dynamic executable $ -- Thats why the very first step sugested in the Jaguar installation is to build and configure MPICH from the start. Where do I go from here? I also worked on Alan's suggestion and created a dynamic link between the ssh and rsh. I am now stuck in making ssh passwordless. Using 'ssh-keygen -t' I generated public and private keys and then copied public key to the authorised_keys2 in ~/.ssh/. I am not sure if thats all I need to make ssh passwordless. I was wondering if I will have to copy public keys on each node using bpcp command. I would appreciate suggestions in this matter. Thanks. Pratap. --- Michael Will wrote: > Hi, > > I saw your email on the beowulf list, and have a few > comments: > > 1. MPICH on Scyld does not require rsh or ssh but > rather it will take > advantage of the bproc features of Scyld to achieve > the same faster. > > > 2. If your fortran programs work fine, so should the > c programs. Unless you > have an executable that is statically linked with > its own mpich > implementation. You can test that by using 'ldd' on > the executable, it will > list which libraries it is loading. If there are no > mpich libs mentioned, you > might have a statically linked program. > > Let me know how it goes. > > Michael Will > -- > Michael Will, Linux Sales Engineer > NEWS: We have moved to a larger iceberg :-) > NEWS: 300 California St., San Francisco, CA. > Tel: 415-954-2822 Toll Free: 888-PENGUIN > Fax: 415-954-2899 > www.penguincomputing.com > __________________________________ Do you Yahoo!? Yahoo! Finance Tax Center - File online. File on time. http://taxes.yahoo.com/filing.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 17:01:31 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 14:01:31 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <200403221401.31370.mwill@penguincomputing.com> The problem is that a statically linked executable will not be able to use the Scyld infrastructure. It won't take advantage of your Infinidband or Myrinet, it won't use bproc, etcpp. You might set up the compute nodes to look like general unix nodes in order to run that particular implementation, but then you loose all the advantages of Scyld. > I also worked on Alan's suggestion and created a > dynamic link between the ssh and rsh. 
AFAIK you would be better off to set the environment variable to force it to use rsh or ssh. I think it's P4_RSHCOMMAND="ssh". The best way would be to ask your vendor to provide you with a dynamically linked executable, or even the source code and compile it yourself. > I am now stuck > in making ssh passwordless. Using 'ssh-keygen -t' I > generated public and private keys and then copied > public key to the authorised_keys2 in ~/.ssh/. I am > not sure if thats all I need to make ssh passwordless. Does it work with localhost? It sometimes is tricky to get it right. Then it could also work remotely, given that you 1) have sshd running 2) have your home NFS mounted 3) have made /dev/random accessible, at least for ssh I believe that's necessary > I was wondering if I will have to copy public keys on > each node using bpcp command. You could do that too if you do not want to NFS mount the home. That you could easily do by editing /etc/exports to export /home and /etc/beowulf/fstab to mount $MASTER, after that rebooting your compute node. (might be possible without rebooting, but I don't know off of the top of my head) Michael -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From m.dierks at skynet.be Mon Mar 22 18:39:30 2004 From: m.dierks at skynet.be (Michel Dierks) Date: Tue, 23 Mar 2004 00:39:30 +0100 Subject: [Beowulf] Minimal OS Message-ID: <405F7932.20404@skynet.be> Hello, I'm a beginner in the Beowulf world. To complete my degree I chose to build a Beowulf cluster. My cluster: 8 slaves: IBM PCs, 166 MHz, 96 MB RAM, 2 GB HD. 1 master: Dell PowerEdge 2200, dual 233 MHz processors, 320 MB RAM, 3 SCSI disks (9.1, 2.1 and 2.1 GB). 1 10/100 Ethernet switch. The application must compute a 2D mesh for research on flows in fluid mechanics. I must use the MPI library for communication and PARMS for the calculation. This application will be developed in C. The operating system is the Red Hat 9.0 distribution. My question is: for the slave PCs, what is the minimal operating system to install (kernel + which packages)? Thank you. Michel D. Belgium _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 18:01:00 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 15:01:00 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <1079996184.4352.14.camel@pel> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> <1079996184.4352.14.camel@pel> Message-ID: <200403221501.00766.mwill@penguincomputing.com> I agree that rather than compiling your own MPICH you should try to make it work with the existing one. However 1) there is no source 2) the binary is statically linked. 3) Scyld does have an mpirun which should set the environment variables right. The right approach is to make it use bpsh instead of rsh or ssh. I saw that some of the calls are done with shell scripts, which might be a way to fix it as well if the environment variables don't help.
Michael On Monday 22 March 2004 02:56 pm, Sean Dilda wrote: > On Mon, 2004-03-22 at 15:15, desi star wrote: > > Hi Mike, > > > > Thanks much for responding. Jaguar is indeed staticaly > > linked to the MPICH libraries as per manuals. When I > > ran the ldd commands as you suggested: > > > > -- > > $ ldd Jaguar > > not a dynamic executable > > $ > > -- > > > > Thats why the very first step sugested in the Jaguar > > installation is to build and configure MPICH from the > > start. Where do I go from here? > > I'm not familiar with Jaguar, but I am somewhat familiar with Scyld. I > believe you are taking the wrong approach with this. > > Even though Jaguar says you should start with building mpich, I don't > think that's what you want to do. You almost certainly want to stick > with the MPICH binaries that were provided by Scyld. First make sure > there is no confusion and remove the copy of mpich that you built. Next > make sure the mpich and mpich-devel packages are installed on your > system. 'rpm -q mpich ; rpm -q mpich-devel' should tell you this. If > they're not, 'rpm -i mpich-XXXXX.rpm' should install the package. You > can find the packages on your Scyld cd(s). > > Once you have those packages installed, then attempt to compile jaguar. > It should link against Scyld's copy of mpich and just work. I suggest > following Scyld's instructions for running mpich jobs, not Jaguars. > Scyld has made adjustments to their copy of MPICH that make it work > right on their system. In the process they also change the way jobs are > launched. So Scyld may not have 'mpirun', but has a better way to start > the job. > > As Michael pointed out, Scyld's version of MPICH doesn't require rsh, > ssh, or anything like it. So your questions along those lines are > somewhat moot. > > > Sean -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 17:11:42 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 14:11:42 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <200403221411.42975.mwill@penguincomputing.com> Another idea - make it use bpsh by setting export P4_RSHCOMMAND="bpsh" or set it to use some shell script of yours that massages its parameters into the format bpsh expects. bpsh will start a process without requiring rsh or ssh, using Scylds bproc support. Michael. -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. 
Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Mon Mar 22 17:56:24 2004 From: agrajag at dragaera.net (Sean Dilda) Date: Mon, 22 Mar 2004 17:56:24 -0500 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <1079996184.4352.14.camel@pel> On Mon, 2004-03-22 at 15:15, desi star wrote: > Hi Mike, > > Thanks much for responding. Jaguar is indeed staticaly > linked to the MPICH libraries as per manuals. When I > ran the ldd commands as you suggested: > > -- > $ ldd Jaguar > not a dynamic executable > $ > -- > > Thats why the very first step sugested in the Jaguar > installation is to build and configure MPICH from the > start. Where do I go from here? > I'm not familiar with Jaguar, but I am somewhat familiar with Scyld. I believe you are taking the wrong approach with this. Even though Jaguar says you should start with building mpich, I don't think that's what you want to do. You almost certainly want to stick with the MPICH binaries that were provided by Scyld. First make sure there is no confusion and remove the copy of mpich that you built. Next make sure the mpich and mpich-devel packages are installed on your system. 'rpm -q mpich ; rpm -q mpich-devel' should tell you this. If they're not, 'rpm -i mpich-XXXXX.rpm' should install the package. You can find the packages on your Scyld cd(s). Once you have those packages installed, then attempt to compile jaguar. It should link against Scyld's copy of mpich and just work. I suggest following Scyld's instructions for running mpich jobs, not Jaguars. Scyld has made adjustments to their copy of MPICH that make it work right on their system. In the process they also change the way jobs are launched. So Scyld may not have 'mpirun', but has a better way to start the job. As Michael pointed out, Scyld's version of MPICH doesn't require rsh, ssh, or anything like it. So your questions along those lines are somewhat moot. Sean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Mon Mar 22 21:24:16 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Mon, 22 Mar 2004 18:24:16 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <000e01c40772$2611bf60$36a8a8c0@LAPTOP152422> Message-ID: Two weeks ago, I asked about power consumption for dual opteron systems. This is summary of the numbers I saw posted here. 237 idle to 280 loaded for a dual 248 with two SCSI drives from Bill Broadley 250 loaded for a dual 240 from Mark Hahn 182 loaded for a dual 242 from Robert G. Brown The 182 numbers seems to be too low, but it would be nice to have some other data points. Combine fewer fans, less memory, lower power or no harddrive, more efficient power supply, and less load on the CPU, and you could see 182 vs 250 watts I think. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
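To put those wattages in money terms, a rough operating-cost estimate (assuming an electricity price of US$0.10 per kWh, which is purely an illustrative figure): a node drawing 250 W around the clock uses 0.250 kW x 8760 h = 2190 kWh per year, or about $219, while one drawing 182 W uses about 1594 kWh, or roughly $159. That is on the order of $60 per node per year between the highest and lowest loaded figures quoted above, before counting the extra cooling needed to remove the heat.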