From adam4098 at uidaho.edu Tue Apr 1 00:47:58 2003 From: adam4098 at uidaho.edu (Adam Phillabaum) Date: Mon, 31 Mar 2003 21:47:58 -0800 Subject: MSI motherboards Message-ID: <003f01c2f812$40d1db70$ee506581@lookout> Hello, I'm looking for some information about A MSI motherboard, the MS-9138 http://www.msicomputer.com/product/detail_spec/product_detail.asp?model=E750 1_Master-LS2 Its a dual Xeon motherboard. Just checking if anyone has anything positive or negative to say about it. -- Adam _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dlane at ap.stmarys.ca Tue Apr 1 13:21:40 2003 From: dlane at ap.stmarys.ca (Dave Lane) Date: Tue, 01 Apr 2003 14:21:40 -0400 Subject: SMC8624T vs DLINK DGC-1024T Message-ID: <5.1.1.6.0.20030401141622.033c1e20@crux.stmarys.ca> Can anyone comment on the strengths/weaknesses of these two 24-port gigabit switches. We're going to be building a 16 node dual-Xeon cluster this spring and were planning on the SMC switch (which has received good review here before), but a vendor pointed out the DLINK switch as a less expensive alternative. ... Dave _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mprinkey at aeolusresearch.com Tue Apr 1 16:17:26 2003 From: mprinkey at aeolusresearch.com (Michael T. Prinkey) Date: Tue, 1 Apr 2003 16:17:26 -0500 (EST) Subject: DDR Xeon Chipsets Message-ID: <36051.66.118.77.29.1049231846.squirrel@ra.aeolustec.com> This question was asked a few months ago but not answered. Are stream or other memory benchmark data available for the new DDR Xeon chipsets, specifically the Intel 7501 and 7505 and the Serverworks GC-SL and GC-LE? In particular, I was looking at the Supermicro x5DEi which uses the GC-SL chipset. It seems that this north bridge uses only a single DDR channel. I would think that this would seriously impact performance. The GC-LE and the 7501/5 chipsets use dual DDR channels which seems more appropriate. Any insights here? Any numbers? Thanks, Mike Prinkey _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Wed Apr 2 02:53:14 2003 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Wed, 2 Apr 2003 09:53:14 +0200 Subject: Which MPI implementation for MPI-2? In-Reply-To: <200303301650.34552.exa@kablonet.com.tr> References: <200303301650.34552.exa@kablonet.com.tr> Message-ID: <200304020953.14361.joachim@ccrl-nece.de> Eray Ozkural: > In order to make use of MPI-2 features such as one-sided communications, > new collective operations and I/O which implementation do you think is > preferable? Which platform? Do you want it for free? When do you need the MPI-2 features? 
Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Wed Apr 2 13:33:49 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Wed, 2 Apr 2003 21:33:49 +0300 Subject: Which MPI implementation for MPI-2? In-Reply-To: <200304020953.14361.joachim@ccrl-nece.de> References: <200303301650.34552.exa@kablonet.com.tr> <200304020953.14361.joachim@ccrl-nece.de> Message-ID: <200304022133.49554.exa@kablonet.com.tr> On Wednesday 02 April 2003 10:53, Joachim Worringen wrote: > > Which platform? Do you want it for free? When do you need the MPI-2 > features? Beowulf class, linux :) We currently have a switched fast ethernet network. NICs eepro100, switch 3com superstackii I wasn't able to get LAM one sided ops to run at all last year when I gave it a shot. (I have a feeling their architecture is a little buggy and inefficient). Maybe mpich can cope better with MPI-2, well aren't they using libraries like global arrays at Sandia which do one sided comms? Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Wed Apr 2 15:11:20 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Wed, 2 Apr 2003 14:11:20 -0600 Subject: Which MPI implementation for MPI-2? ... Message-ID: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> Eray Ozkural wrote: >I wasn't able to get LAM one sided ops to run at all last year when I gave it >a shot. (I have a feeling their architecture is a little buggy and >inefficient). Maybe mpich can cope better with MPI-2, well aren't they using >libraries like global arrays at Sandia which do one sided comms? To anyone ... Somewhat tangentially, but while we are on the subject of one-sided communications in MPI-2, am I correct in assuming that this capability is implemented as it is in SHMEM ... via communication to/from symmetric (or known asymmetric) memory locations inside the companion processes memory space. It would seem to be a requirement for speed and would also seem to require the use of identical binaries on each processor (and COMMON or static to place data in a symmetric location). Thanks for your guidance ... rbw #--------------------------------------------------- # Richard Walsh # Project Manager, Cluster Computing, Computational # Chemistry and Finance # netASPx, Inc. # 1200 Washington Ave. So. # Minneapolis, MN 55415 # VOX: 612-337-3467 # FAX: 612-337-3400 # EMAIL: rbw at networkcs.com, richard.walsh at netaspx.com # rbw at ahpcrc.org # #--------------------------------------------------- # "When Noah built the arc, it was not raining." 
# -Anonymous #--------------------------------------------------- # _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Wed Apr 2 16:33:16 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Thu, 3 Apr 2003 00:33:16 +0300 Subject: Which MPI implementation for MPI-2? In-Reply-To: <200304020953.14361.joachim@ccrl-nece.de> References: <200303301650.34552.exa@kablonet.com.tr> <200304020953.14361.joachim@ccrl-nece.de> Message-ID: <200304030033.16374.exa@kablonet.com.tr> On Wednesday 02 April 2003 10:53, Joachim Worringen wrote: > > Which platform? Do you want it for free? When do you need the MPI-2 > features? About free-ness, yes probably :) We don't have any important MPI-2 code right now, I think I only used IO features till now. Maybe in one of our future projects but I've no idea when. Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dsarvis at zcorum.com Wed Apr 2 15:41:03 2003 From: dsarvis at zcorum.com (Dennis Sarvis, II) Date: 02 Apr 2003 15:41:03 -0500 Subject: small cluster Message-ID: <1049316063.1932.4.camel@skull.america.net> How does one go about creating a 2 PC cluster? I have a redhat 400Mhz PII and a Debian Celeron 550Mhz. Can I do something like use 2 NICs in the controller and one in the slave (1 NIC for the office network/internet and the other connecting via crossover 10baseT to the NIC on node1 slave)? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Thu Apr 3 02:07:41 2003 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Thu, 3 Apr 2003 09:07:41 +0200 Subject: Which MPI implementation for MPI-2? In-Reply-To: <200304030033.16374.exa@kablonet.com.tr> References: <200303301650.34552.exa@kablonet.com.tr> <200304020953.14361.joachim@ccrl-nece.de> <200304030033.16374.exa@kablonet.com.tr> Message-ID: <200304030907.41246.joachim@ccrl-nece.de> Eray Ozkural: > About free-ness, yes probably :) > > We don't have any important MPI-2 code right now, I think I only used IO > features till now. Maybe in one of our future projects but I've no idea > when. So you could easily use MPICH (if LAM doesn't work for you) and wait until MPICH-2 is available (or test the beta-release...). Or you could buy a full MPI-2 implementaion from NEC which runs on SCore-clusters... oops, wrong mailinglist. ;-) Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu Apr 3 04:04:01 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 3 Apr 2003 01:04:01 -0800 Subject: Which MPI implementation for MPI-2? ... 
In-Reply-To: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> References: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> Message-ID: <20030403090401.GB1447@greglaptop.attbi.com> On Wed, Apr 02, 2003 at 02:11:20PM -0600, Richard Walsh wrote: > Somewhat tangentially, but while we are on the subject of one-sided > communications in MPI-2, am I correct in assuming that this capability > is implemented as it is in SHMEM ... No. It's much more complicated and general. You have to register windows within which one-sided ops can be used, and there are some extra calls that you make to make sure operations have completed. UPC is a much more compact method of expressing one-sided calls, and unlike shmem, it can benefit from pipelined transfers. > It would seem to be a requirement for speed and would > also seem to require the use of identical binaries on each processor > (and COMMON or static to place data in a symmetric location). shmem doesn't require that; you can use a common address (I'm very punny at 1am) to exchange addresses of malloc-ed data. But with shmem, you get a free registration of all static & common variables, and the stack too, as long as you use it in a consistant fashion. greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Thu Apr 3 09:51:50 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Thu, 3 Apr 2003 08:51:50 -0600 Subject: Which MPI implementation for MPI-2? ... Message-ID: <200304031451.h33Epo500310@mycroft.ahpcrc.org> On Thu Apr 3 03:43:24 2003 Greg Lindahl wrote: >On Wed, Apr 02, 2003 at 02:11:20PM -0600, Richard Walsh wrote: > >> Somewhat tangentially, but while we are on the subject of one-sided >> communications in MPI-2, am I correct in assuming that this capability >> is implemented as it is in SHMEM ... > >No. It's much more complicated and general. You have to register >windows within which one-sided ops can be used, and there are some >extra calls that you make to make sure operations have completed. I see ... then I should also anticipate some loss of performance (higher latency) when using one-sided MPI communications compared to SHMEM. Or perhaps this is one-time overhead paid at registration only? >UPC is a much more compact method of expressing one-sided calls, and >unlike shmem, it can benefit from pipelined transfers. Right (so also with CAF) for messages, but you still have to explicitly sychronize/lock, etc. >> It would seem to be a requirement for speed and would >> also seem to require the use of identical binaries on each processor >> (and COMMON or static to place data in a symmetric location). > >shmem doesn't require that; you can use a common address (I'm very >punny at 1am) to exchange addresses of malloc-ed data. But with shmem, >you get a free registration of all static & common variables, and the >stack too, as long as you use it in a consistant fashion. As far as I know, SHMEM requires a known address either explicitly passed (asymmetric location) between partners or a implicitly determined from the symmetry relationships of the images communicating (static or common). As you say, this is "free" for COMMON/STATIC data. Perhaps we are actually agreeing ... explicitly exchange addresses of malloc-ed locations in different binaries would be fine. 
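In code, the distinction I have in mind looks roughly like this (a T3E-style
SHMEM sketch, untested, run on two or more PEs; names and sizes are just
illustrative):

    #include <mpp/shmem.h>

    static long counters[8];          /* static data: symmetric "for free" */

    int main(void)
    {
        long *buf, val;
        int  me;

        start_pes(0);
        me  = _my_pe();
        val = me;

        /* Case 1: the target is static/COMMON, so every PE already
           knows its remote address. */
        if (me == 1)
            shmem_long_put(&counters[0], &val, 1, 0);   /* store into PE 0 */

        /* Case 2: heap data.  shmalloc() allocates from the symmetric
           heap at the same offset on every PE, so the address is again
           "known".  A buffer from plain malloc() would first require
           explicitly exchanging its address with the partner. */
        buf = (long *) shmalloc(8 * sizeof(long));
        if (me == 1)
            shmem_long_put(buf, &val, 1, 0);

        shmem_barrier_all();          /* the barrier also completes the puts */
        return 0;
    }

With MPI-2 the target would instead be a displacement into a window created
beforehand by a collective call, which is the extra setup Greg describes.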
Thanks, rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Thu Apr 3 10:35:41 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Thu, 3 Apr 2003 08:35:41 -0700 Subject: small cluster In-Reply-To: <1049316063.1932.4.camel@skull.america.net> References: <1049316063.1932.4.camel@skull.america.net> Message-ID: <20030403153541.GB30047@plk.af.mil> I have a two node cluster running just Debian. Indeed you need two NICS on the head node (one for the outside and one for the local network). I'm still running NFS to mount /home and /usr/local on the internal node. Also, I'm running MPICH. It was an exercise to learn rudimentary skills in cluster building, but it still runs. Art Edwards On Wed, Apr 02, 2003 at 03:41:03PM -0500, Dennis Sarvis, II wrote: > How does one go about creating a 2 PC cluster? I have a redhat 400Mhz > PII and a Debian Celeron 550Mhz. Can I do something like use 2 NICs in > the controller and one in the slave (1 NIC for the office > network/internet and the other connecting via crossover 10baseT to the > NIC on node1 slave)? > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tony at mpi-softtech.com Thu Apr 3 10:45:58 2003 From: tony at mpi-softtech.com (Anthony Skjellum) Date: Thu, 3 Apr 2003 09:45:58 -0600 (CST) Subject: Which MPI implementation for MPI-2? ... In-Reply-To: <200304031451.h33Epo500310@mycroft.ahpcrc.org> Message-ID: Our experience in ChaMPIon/Pro is that we get higher latency and higher bandwidth than 2-sided, vs. the design target of lower latency and lower bandwidth; the standard missed the mark, but it is still useful. On Thu, 3 Apr 2003, Richard Walsh wrote: > On Thu Apr 3 03:43:24 2003 Greg Lindahl wrote: > > >On Wed, Apr 02, 2003 at 02:11:20PM -0600, Richard Walsh wrote: > > > >> Somewhat tangentially, but while we are on the subject of one-sided > >> communications in MPI-2, am I correct in assuming that this capability > >> is implemented as it is in SHMEM ... > > > >No. It's much more complicated and general. You have to register > >windows within which one-sided ops can be used, and there are some > >extra calls that you make to make sure operations have completed. > > I see ... then I should also anticipate some loss of performance > (higher latency) when using one-sided MPI communications compared > to SHMEM. Or perhaps this is one-time overhead paid at registration > only? > > >UPC is a much more compact method of expressing one-sided calls, and > >unlike shmem, it can benefit from pipelined transfers. > > Right (so also with CAF) for messages, but you still have to explicitly > sychronize/lock, etc. > > >> It would seem to be a requirement for speed and would > >> also seem to require the use of identical binaries on each processor > >> (and COMMON or static to place data in a symmetric location). 
> > > >shmem doesn't require that; you can use a common address (I'm very > >punny at 1am) to exchange addresses of malloc-ed data. But with shmem, > >you get a free registration of all static & common variables, and the > >stack too, as long as you use it in a consistant fashion. > > As far as I know, SHMEM requires a known address either explicitly > passed (asymmetric location) between partners or a implicitly determined > from the symmetry relationships of the images communicating (static > or common). As you say, this is "free" for COMMON/STATIC data. > Perhaps we are actually agreeing ... explicitly exchange addresses > of malloc-ed locations in different binaries would be fine. > > > Thanks, > > rbw > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Anthony Skjellum PhD, CTO | MPI Software Technology, Inc. 101 South Lafayette St, Ste. 33 | Starkville, MS 39759, USA Ph: +1-(662)320-4300 x15 | FAX: +1-(662)320-4301 http://www.mpi-softtech.com | tony at mpi-softtech.com Middleware that's hard at work for you and your enterprise.(SM) The information contained in this communication may be confidential and is intended only for the use of the recipient(s) named above. If the reader of this communication is not the intended recipient(s), you are hereby notified that any dissemination, distribution, or copying of this communication, or any of its contents, is strictly prohibited. If you are not a named recipient or received this communication by mistake, please notify the sender and delete the communication and all copies of it. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Wed Apr 2 21:20:04 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Thu, 3 Apr 2003 05:20:04 +0300 Subject: Which MPI implementation for MPI-2? ... In-Reply-To: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> References: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> Message-ID: <200304030520.05254.exa@kablonet.com.tr> On Wednesday 02 April 2003 23:11, Richard Walsh wrote: > > To anyone ... > > Somewhat tangentially, but while we are on the subject of one-sided > communications in MPI-2, am I correct in assuming that this capability > is implemented as it is in SHMEM ... via communication to/from symmetric > (or known asymmetric) memory locations inside the companion processes > memory space. It would seem to be a requirement for speed and would > also seem to require the use of identical binaries on each processor > (and COMMON or static to place data in a symmetric location). > > Thanks for your guidance ... I think that's the idea but it isn't shared memory! http://www.mpi-forum.org/docs/mpi-20-html/node117.htm#Node117 Remote Memory Access ( RMA) extends the communication mechanisms of MPI by allowing one process to specify all communication parameters, both for the sending side and for the receiving side. .... Evidently, you cannot treat RMA like shared memory. 
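To make that concrete, the basic pattern looks something like this (a rough
sketch assuming any MPI-2 implementation with one-sided support, run on two
or more processes; the buffer and counts are made up):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int     rank;
        double  local[100], remote_val = 0.0;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        local[5] = (double) rank;

        /* Collectively expose "local" as a window others may access. */
        MPI_Win_create(local, (MPI_Aint)(100 * sizeof(double)), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);              /* open an access epoch         */
        if (rank == 1)                      /* rank 1 reads local[5] of 0   */
            MPI_Get(&remote_val, 1, MPI_DOUBLE, 0, 5, 1, MPI_DOUBLE, win);
        MPI_Win_fence(0, win);              /* only now is the get complete */

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

Nothing is guaranteed to have arrived until the second fence returns, which
is exactly the weakly coherent behaviour described next.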
What RMA really is what shared memory should have been (in a sense): The design is similar to that of weakly coherent memory systems: correct ordering of memory accesses has to be imposed by the user, using synchronization calls; the implementation can delay communication operations until the synchronization calls occur, for efficiency. Using RMA you have to design your algorithm like before however it is much easier to cope with dynamic communication. In usual MPI-1 code you would have to specify tons of custom async. send/recv. routines, sync. code etc. to accomplish the same thing. So the answer is, yes it's like shared memory but you know that each call (put, get, accumulate) will incur a message passing eventually. IMO the greatest advantage comes from the (possibly) higher level of abstraction attained this way. Of course a nicer thing is the ease of writing in-place routines, that could potentially make a difference in a lot of places, for example sparse codes like graph partitioning. What we had thought was perhaps implementing complex parallel algorithms (like fold/expand) could be easier with one sided comms. Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 3 12:35:00 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 3 Apr 2003 12:35:00 -0500 (EST) Subject: small cluster In-Reply-To: <200304031559.TAA01399@nocserv.free.net> Message-ID: On Thu, 3 Apr 103, Mikhail Kuzminsky wrote: > According to Dennis Sarvis, II > > How does one go about creating a 2 PC cluster? I have a redhat 400Mhz > > PII and a Debian Celeron 550Mhz. Can I do something like use 2 NICs in > > the controller and one in the slave (1 NIC for the office > > network/internet and the other connecting via crossover 10baseT to the > > NIC on node1 slave)? > Yes, I use like configuration in my home (but w/o permanent > external link to Internet). However, small switches are SO cheap ($50?) that it is hard not to justify buying a switch unless you are stone cold broke. Even a small switch also makes it much easier to add more nodes as you find them. They are almost certainly going to be 100BT as well, so that you can use faster NICs on future systems without having to deal with 10BT to 100BT crossover connections, multiple NICs in the head node, and so forth. I mean, the cost of a switch (per port) can actually be less than the cost of the NIC ports that connect to it these days. Go for it. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu Apr 3 14:29:44 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 3 Apr 2003 11:29:44 -0800 Subject: Which MPI implementation for MPI-2? ... 
In-Reply-To: <200304030520.05254.exa@kablonet.com.tr> References: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> <200304030520.05254.exa@kablonet.com.tr> Message-ID: <20030403192944.GC1201@greglaptop.internal.keyresearch.com> On Thu, Apr 03, 2003 at 05:20:04AM +0300, Eray Ozkural wrote: > I think that's the idea but it isn't shared memory! He is talking about the Cray T3E SHMEM library, not shared memory. SHMEM is a SALC (Shared address, local consistancy) model, and is very similar to the MPI-2 one-sided stuff, but with much simpler syntax. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu Apr 3 14:56:08 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 3 Apr 2003 11:56:08 -0800 Subject: Which MPI implementation for MPI-2? ... In-Reply-To: <200304031451.h33Epo500310@mycroft.ahpcrc.org> References: <200304031451.h33Epo500310@mycroft.ahpcrc.org> Message-ID: <20030403195608.GD1201@greglaptop.internal.keyresearch.com> On Thu, Apr 03, 2003 at 08:51:50AM -0600, Richard Walsh wrote: > I see ... then I should also anticipate some loss of performance > (higher latency) when using one-sided MPI communications compared > to SHMEM. Or perhaps this is one-time overhead paid at registration > only? Registration is a one-time overhead. However, the exact semantics of MPI-2 are annoying and may end up introducing some significant overhead for tiny messages. A machine which only allows a limited amout of memory to be registered might have significant overhead all the time. The T3E didn't have that problem because it had a direct mapping from virtual to physical addresses, so the communications system didn't need to know what the TLB mappings looked like. For modern interconnects like Myrinet, there's enough SRAM on the card to map the entire process: 3 bytes per page times (4 GB/4k per page) = 3 megabytes, so the larger memory version of the card suffices for a 32-bit system. The current GM only supports put, not get. I have no idea how much memory SCI or Quadrics could map. You may be able to hack Linux such that it always handed out groups of pages; this would waste some memory, but could reduce the memory needed to hold the full set of mappings by a factor of 4 for 16k groups, 16 for 64k groups, etc. It's a shame that x86 doesn't have support for slightly larger pages; the Opteron has the same problem. > As far as I know, SHMEM requires a known address either explicitly > passed (asymmetric location) between partners or a implicitly determined > from the symmetry relationships of the images communicating (static > or common). As you say, this is "free" for COMMON/STATIC data. > Perhaps we are actually agreeing ... explicitly exchange addresses > of malloc-ed locations in different binaries would be fine. We are agreeing. It comes down to "does this object have the same address in both processes?" greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From purp at acm.org Thu Apr 3 17:22:07 2003 From: purp at acm.org (Jim Meyer) Date: 03 Apr 2003 14:22:07 -0800 Subject: FOLLOWUP: Re: Platform adopts DRMAA? 
In-Reply-To: <1049138047.18608.48.camel@utonium.pdi.com> References: <20030329072430.66227.qmail@web41315.mail.yahoo.com> <1049138047.18608.48.camel@utonium.pdi.com> Message-ID: <1049408526.13758.98.camel@utonium.pdi.com> On Mon, 2003-03-31 at 11:14, Jim Meyer wrote: > On Fri, 2003-03-28 at 23:24, Ron Chen wrote: > > Something more interesting! Platform throws away its > > own API and joined the DRMAA camp. > > You've mentioned this twice but I've not seen any mention of Platform > [...] > Can you (or anyone) point to something more specific indicating that > Platform is adopting DRMAA (and hopefully providing an estimated > timeline =)? Failing any response from the original poster, I made a few phone calls and determined that while Platform did indeed discard NPi, they have not announced any effort to support DRMAA (nor did the couple of folks I spoke with know of any such effort under consideration). Perhaps that's a shame as it'd be nice to be able to integrate with one interface and have a selection of DRM packages to choose from ... but then, I suspect that an easy bidirectional migration path doesn't top the list of any software vendor, proprietary or otherwise. A pity, that. Cheers! --j, with salt shaker in hand. -- Jim Meyer, Geek at Large purp at acm.org _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ron_chen_123 at yahoo.com Fri Apr 4 02:36:40 2003 From: ron_chen_123 at yahoo.com (Ron Chen) Date: Thu, 3 Apr 2003 23:36:40 -0800 (PST) Subject: FOLLOWUP: Re: Platform adopts DRMAA? In-Reply-To: <1049408526.13758.98.camel@utonium.pdi.com> Message-ID: <20030404073640.8869.qmail@web41313.mail.yahoo.com> I have to confess that I did not get the information directly form Platform. I have been looking for the forum post which mentioned that Platform's NPi is dead, and Platform is joining DRMAA. I still couldn't find it. Nevertheless, I just found that Platform is a member of GGF: http://www.gridforum.org/L_About/who.htm which defines DRMAA: http://www.gridforum.org/3_SRM/drmaa.htm So that made the original poster on the forum believed that Platform is joining DRMAA. Do you know why Platform discarded NPi? (Looks like another M$ API standard!) -Ron --- Jim Meyer wrote: > Failing any response from the original poster, I > made a few phone calls > and determined that while Platform did indeed > discard NPi, they have not > announced any effort to support DRMAA (nor did the > couple of folks I > spoke with know of any such effort under > consideration). > > Perhaps that's a shame as it'd be nice to be able to > integrate with one > interface and have a selection of DRM packages to > choose from ... but > then, I suspect that an easy bidirectional migration > path doesn't top > the list of any software vendor, proprietary or > otherwise. > > A pity, that. > > Cheers! > > --j, with salt shaker in hand. > -- > Jim Meyer, Geek at Large > purp at acm.org > __________________________________________________ Do you Yahoo!? Yahoo! 
Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Fri Apr 4 03:01:55 2003 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Fri, 4 Apr 2003 10:01:55 +0200 Subject: Which MPI implementation for MPI-2? ... In-Reply-To: References: Message-ID: <200304041001.55470.joachim@ccrl-nece.de> Anthony Skjellum: > Our experience in ChaMPIon/Pro is that we get higher latency and higher > bandwidth than 2-sided, vs. the design target of lower latency and lower > bandwidth; the standard missed the mark, but it is still useful. I would be surprised if the (primary) design target for one-sided communication on non-shared-memory architectures was lower latency and higher bandwidth - this can obviously not be achieved if you need to use messages. I'd say it's the different communication paradigm ("origin process chooses which data to read or write, independant from target process") which helps to adopt certain communication patterns more easily/naturally, and *maybe* avoid some synchronization delays. But then again, MPI-2 one sided with it's higly relaxed consistency model does not come really naturally for most users... Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at plogic.com Fri Apr 4 09:29:02 2003 From: deadline at plogic.com (Douglas Eadline) Date: Fri, 4 Apr 2003 09:29:02 -0500 (EST) Subject: More SMP Memory Data Message-ID: I have posted more memory contention data (including wall clock times) for PIII and Athlon SMP systems at: http://www.cluster-rant.com/article.pl?sid=03/04/03/1429239 Doug -- ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ilumb at platform.com Fri Apr 4 06:47:33 2003 From: ilumb at platform.com (Ian Lumb) Date: Fri, 4 Apr 2003 06:47:33 -0500 Subject: FOLLOWUP: Re: Platform adopts DRMAA? Message-ID: <4AB0624F069DAD4E90F18B13A818EEFE287D7D@catoexm04.noam.corp.platform.com> NPi merged with the GGF in April 2002 - see http://www.ggf.org/5_ARCH/npi.htm for more. -Ian -----Original Message----- From: Ron Chen [mailto:ron_chen_123 at yahoo.com] Sent: Friday, April 04, 2003 2:37 AM To: Jim Meyer Cc: Beowulf Mailing List Subject: Re: FOLLOWUP: Re: Platform adopts DRMAA? I have to confess that I did not get the information directly form Platform. I have been looking for the forum post which mentioned that Platform's NPi is dead, and Platform is joining DRMAA. I still couldn't find it. 
Nevertheless, I just found that Platform is a member of GGF: http://www.gridforum.org/L_About/who.htm which defines DRMAA: http://www.gridforum.org/3_SRM/drmaa.htm So that made the original poster on the forum believed that Platform is joining DRMAA. Do you know why Platform discarded NPi? (Looks like another M$ API standard!) -Ron --- Jim Meyer wrote: > Failing any response from the original poster, I > made a few phone calls > and determined that while Platform did indeed > discard NPi, they have not > announced any effort to support DRMAA (nor did the > couple of folks I > spoke with know of any such effort under > consideration). > > Perhaps that's a shame as it'd be nice to be able to > integrate with one > interface and have a selection of DRM packages to > choose from ... but > then, I suspect that an easy bidirectional migration > path doesn't top > the list of any software vendor, proprietary or > otherwise. > > A pity, that. > > Cheers! > > --j, with salt shaker in hand. > -- > Jim Meyer, Geek at Large > purp at acm.org > __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: From hahn at physics.mcmaster.ca Fri Apr 4 10:23:03 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Fri, 4 Apr 2003 10:23:03 -0500 (EST) Subject: FOLLOWUP: Re: Platform adopts DRMAA? In-Reply-To: <20030404073640.8869.qmail@web41313.mail.yahoo.com> Message-ID: I'm rather puzzled at this exchange: why would anyone care? from the quick look I had at the drmaa stuff, it seemed quite trivial. if I was writing a tool that needed to interact with a queueing system, I'd just have an encapsulation layer anyway, so wouldn't care about exact interfaces. this is one of those nasty (and ultimately useless) lowest-common-denominator types of interfaces, and seems to be driven by marketing weasels. > I have to confess that I did not get the information > directly form Platform. > > I have been looking for the forum post which mentioned > that Platform's NPi is dead, and Platform is joining > DRMAA. I still couldn't find it. > > Nevertheless, I just found that Platform is a member > of GGF: > http://www.gridforum.org/L_About/who.htm > > which defines DRMAA: > http://www.gridforum.org/3_SRM/drmaa.htm > > So that made the original poster on the forum believed > that Platform is joining DRMAA. > > Do you know why Platform discarded NPi? > > (Looks like another M$ API standard!) > > -Ron > > --- Jim Meyer wrote: > > Failing any response from the original poster, I > > made a few phone calls > > and determined that while Platform did indeed > > discard NPi, they have not > > announced any effort to support DRMAA (nor did the > > couple of folks I > > spoke with know of any such effort under > > consideration). > > > > Perhaps that's a shame as it'd be nice to be able to > > integrate with one > > interface and have a selection of DRM packages to > > choose from ... but > > then, I suspect that an easy bidirectional migration > > path doesn't top > > the list of any software vendor, proprietary or > > otherwise. > > > > A pity, that. > > > > Cheers! > > > > --j, with salt shaker in hand. 
> > -- > > Jim Meyer, Geek at Large > > purp at acm.org > > > > > __________________________________________________ > Do you Yahoo!? > Yahoo! Tax Center - File online, calculators, forms, and more > http://tax.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- operator may differ from spokesperson. hahn at mcmaster.ca http://hahn.mcmaster.ca/~hahn _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From purp at acm.org Fri Apr 4 11:45:50 2003 From: purp at acm.org (Jim Meyer) Date: 04 Apr 2003 08:45:50 -0800 Subject: FOLLOWUP: Re: Platform adopts DRMAA? In-Reply-To: <4AB0624F069DAD4E90F18B13A818EEFE287D7D@catoexm04.noam.corp.platform.com> References: <4AB0624F069DAD4E90F18B13A818EEFE287D7D@catoexm04.noam.corp.platform.com> Message-ID: <1049474750.19892.3.camel@utonium.pdi.com> On Fri, 2003-04-04 at 03:47, Ian Lumb wrote: > NPi merged with the GGF in April 2002 - see > http://www.ggf.org/5_ARCH/npi.htm for more. -Ian It seems my reports of NPi's demise are much exaggerated. My apologies. On a brighter note, the OGSA/OGSI bits look promising. Thanks for the link! --j -- Jim Meyer, Geek at Large purp at acm.org _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ron_chen_123 at yahoo.com Sun Apr 6 00:20:00 2003 From: ron_chen_123 at yahoo.com (Ron Chen) Date: Sat, 5 Apr 2003 21:20:00 -0800 (PST) Subject: Fwd: Re: [GE users] SGEEE is opensource, opensource, please repeat with me... Message-ID: <20030406052000.35183.qmail@web41303.mail.yahoo.com> --- Fritz Ferstl wrote: > Date: Fri, 28 Mar 2003 16:53:33 +0100 (MET) > From: Fritz Ferstl > To: users at gridengine.sunsource.net > Subject: Re: [GE users] SGEEE is opensource, > opensource, please repeat with > me... > > Hi Vaclav, > > thanks for pointing this out again. > > The reply you wrote to the beowulf list is > completely correct and not > speculative at all. In reality, it goes even > farther. If Sun or anyone > else who want's to create a commercial Grid Engine > version intends to > change the sources such that project defined > compatibility requirements > are violated then those changes would need to be > documented and the > interfaces which were introduced would need to be > published. This is the > spirit of the SISSL open source licensing model > under which Grid Engine > has been released and it tries to ensure > interoperability between open > source and commercialized versions. This is 100% the > case today. You can > use Sun's commercial version completely > interchangeably with the > same-release-level versions from the project. > > Note that everyone has the same rights in the > project and can create a > commercial version of Grid Engine. The thing that > needs to be honored is > the SISSL. See > > http://gridengine.sunsource.net/project/gridengine/Gridengine_SISSL_license.html > > for details. > > We'll upgrade the site to a new version of > SourceCast within the next > few weeks and have planned to refurbish the content > a bit in this > process. 
We'll keep an eye on your recommendation to > make more clear > that the full Enterprise Edition functionality is > indeed contained in > the published project source code and in the project > builds. > > Cheers, > > Fritz > > > > On Fri, 28 Mar 2003 hanzl at noel.feld.cvut.cz wrote: > > > Sorry to repeat this old topic, but I see this > happen again and again: > > > > PEOPLE THINK THAT SGEEE IS NOT OPENSOURCE ! > > > > And some of them get upset and it is hard to > explain them that they > > are mistaken. I spent quite big effort on > beowulf at beowulf.org to make > > this clear but confused people arise again and > again. > > > > Please: > > > > - If you can find more suitable places where to > put bold label > > "SGEEE is OPENSOURCE", please do it > > > > - Kindly verify my explanation below - I did not > intend to send a copy > > here but later I realized that maybe I was too > speculative, so please > > check that my claims are true. > > > > Thanks a lot > > > > Vaclav Hanzl > > > > ------- one of my beowulf posts - please verify my > claims: ----------- > > > > Subject: Re: sun grid engine? > > From: hanzl > > To: kus at free.net > > Cc: beowulf at beowulf.org > > Date: Thu, 27 Mar 2003 20:12:03 +0100 > > > > > May be it's integrated into SGE 5.3 Enterprise > Edition ? I said about > > > *free* SGE 5.3. Both "Sun ONE Grid Engine > Administartor and User's Guide" > > > and "Sun ONE Grid Engine Release Notes" don't > have just the word "MAUI". > > > Moreover, the only sheduler algorithm allowed in > usual > > > (free) SGE 5.3 is "standard" (see SGE > Administrator & User's guide, p.225). > > > > It is easy to get confused by SGE versions. > > > > Enterprise Edition is also free. MAUI was > integrated with it - most of > > this work was done by MAUI team with help from SGE > team. > > > > Regarding SGE versions, I think it works as > follows: > > > > 1) Developers create opensource SGE version. They > work using publicly > > available CVS software repository. All new > features come to this > > version. > > > > This opensource version is both "SGE" ans "SGE > Enterprise Edition" - > > the difference is just an instalation option. You > install both using > > the same files, you may compile both using the > same sources from the > > CVS archive. > > > > 2) 'Commercial' part of SUN takes these sources > (probably without any > > important changes) and compiles 'commercial' SGE > and SGEEE. They add > > word 'ONE' to the name. They create nice manuals. > You can buy this > > software and get usual support you expect for > commercial software. > > You can still download the manuals for free. Just > skip word 'ONE' > > while reading them - they are perfectly usable for > free SGE as well. > > They just may be out of date because the free > version already has new > > features (like MAUI integration). They may also > never mention MAUI > > integration because the 'commercial' part of SUN > has no support for > > it. > > > > > > All this is just too nice to believe it so people > often get confused. > > > > Note that it probably quite differs from > PBS/OpenPBS development model > > - I am no expert on PBS (experienced experts, > please correct me if I > > am wrong!) but I think that commercial PBS and > OpenPBS are split and > > the development team has quite hard times deciding > what to do - they > > introduce new features to commercial branch to > make it more attractive > > (to make any money on it) but in the same time > similar features are > > wanted in the OpenPBS version. 
They themselves > created their own enemy > > on the market (OpenPBS) and now they are not sure > how to behave to it > > - support it as their child? Kill it as their > enemy? > > > > Even if I am wrong in my thoughts on PBS (and I > may easily be wrong as > > it is a long time I left PBS maillists) I am > pretty sure many PBS > > users percieve it like this (as I got few quite > few emails from them > > indicating this). > > > > PBS is older than SGE (and yes, PBS did many good > things, no doubt) > > and everybody knew PBS when opensource SGE was > born. And many people > > could easily expect that SGE used the same model > as PBS did. (It was > > easy to think that SGE EE is the commercial > version - no, it is not.) > > > > SGE did not use the same model as PBS. It used > more open one. And this > > choice was huge success I think. > > > > ... > > > > Regards > > > > Vaclav > > > > > > ---- one more example of confusion ------- > > > > Subject: Re: sun grid engine? > > From: Alan Scheinine > > To: Beowulf at beowulf.org > > Date: Fri, 28 Mar 2003 10:33:09 +0100 > > > > I see "Vaclav" posted a message. Last week we > began the installation > > of SGE and someone involved with the installation > said that in order > > to have the options of sgeee it is necessary to > buy that version. > > Using grep on the messages I had saved, I found > the message from > > Vaclav from the year 2001 showing how to convert > sge to sgeee. > > In 2001 Vaclav said that the information was at > the end of the > > download page, now it is in a readme file in the > distribution. > > In any case, the note from Vaclav in 2001 proved > to be useful also > > in 2003, the file is easily overlooked if the > system administrator > > does not know it can be done. By the way, the > file is > > root>/README.inst_sgeee > > Alan > > > > ---- I think Alan means this my old note: ----- > > > > Subject: SGEEE easily mistaken as commercial > version > > From: hanzl at noel.feld.cvut.cz > > To: dev at gridengine.sunsource.net, > beowulf at beowulf.org > > Date: Thu, 25 Oct 2001 11:22:40 +0200 > > > > Prospective SGE users could very easily be > mistaken and suppose that > > Enterprise Edition is a commercial close source > version. If you > > download the three tarfiles from "Binary > Downloads" page, unpack them > > and look around, you will install non-EE version. > There is no way to > > find easily (from unpacked files) that you could > install EE. The pdf > > manual will tell you about EE features but your > instalation is missing > > them. No hint at all that EE is also opensource. > > > > This IMHO seriously harms the SGE project and > should be corrected as > > soon as possible by including inst_sgeee script in > tar files. > > > > Potential SGE users and opensource co-developers > are likely to know > > PBS, which exists in both opensource and > commercial version. During SGE > > test-install many of them will be systematically > driven into false > > assumption that SGE project is organised the same. > > > > I wish all the best to Veridian and PBS and > everybody making free > > versions of commercial software. 
This setup of > things however > > inevitably makes opensource users to assess danger > that core > > developement team will be torn between opensource > and commercial > > version support, will be reluctant to port > commercial version fixes to > > opensource version (cause it takes time) and will > be unable to > > integrate opensource-community created patches > cause without knowledge > > of the commercial version source these patches > will diverge. > > > > It is very sad to have these worries about SGE by > mistake. > > > > > > Only after lot of hacking around I found that all > you have to do to > > install EE is to rename inst_sge to inst_sgeee > (and it behaves > > accordingly). Only after this I looked around once > more and found this > > at the bottom of binary download page: > > > > Only for Grid Engine Enterprise Edition you have > to make slight modifications: > > % cd $SGE_ROOT > > % ln -s inst_sge inst_sgeee > > % replace inst_sge with inst_sgeee in the last > line of the files install_qmaster and install_execd > > Then you can proceed as with the standard Grid > Engine installation. > > > > Well, you may say it is my fault not to notice > this before. Sure it is > > but I think this fault is quite common and harms > SGE a lot. It is > > worth it to include inst_sgeee in tar files now as > many Beowulf > > maillist readers might be prompted by recent SGE > discussions to go and > > try SGE - and maybe forget about it if they make > the same mistake as I > > did. > > > > > > With all the best wishes to SGE team (and thanks > for all the work done > > so far) > > > > Vaclav > > > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: > users-unsubscribe at gridengine.sunsource.net > > For additional commands, e-mail: > users-help at gridengine.sunsource.net > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: > users-unsubscribe at gridengine.sunsource.net > For additional commands, e-mail: > users-help at gridengine.sunsource.net > __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From virtualsuresh at yahoo.co.in Sun Apr 6 03:34:41 2003 From: virtualsuresh at yahoo.co.in (=?iso-8859-1?q?suresh=20chandra?=) Date: Sun, 6 Apr 2003 08:34:41 +0100 (BST) Subject: small cluster Message-ID: <20030406073441.50944.qmail@web8102.in.yahoo.com> Hi, I am also Interested in Building a two node cluster, I had Athlon 850Mhz and old Pentium 133Mhz(Hard Disk Less). We are going to build a 16 node cluster for our university. So as a practice, I want to build 2 node cluster in Home. I am planning to use OpenMosix2(SSI). I want to share Ideas with all of you in implementing. Thanks & Regards, Suresh Chandra, India ===== ________________________________________________________________________ Missed your favourite TV serial last night? Try the new, Yahoo! TV. 
visit http://in.tv.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Sun Apr 6 00:35:31 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Sat, 5 Apr 2003 23:35:31 -0600 Subject: advice on cluster purchase Message-ID: <3E9602A5@itsnt5.its.uiowa.edu> Hi, I am an undergraduate involved with a totally student run parallel computing experience. We have approximately 10,000 of university money with which to produce the best possible machine. I would be interested to hear from you all what configuration you would choose if someone just said "here's the money, build the best system you can." The system will do both cpu dominated and network intensive activities, so it would be tailored for neither. Do SMP nodes tend to be superior in a cost/performance framework? I have worked with other peoples systems and they are always dual cpu nodes, my impression being that it is for the purpose of minimizing overall size- as I tend to start a process on each cpu. Any advice would be appreciated. Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From chih-houng-king at uiowa.edu Sun Apr 6 23:33:44 2003 From: chih-houng-king at uiowa.edu (Chih King) Date: Sun, 6 Apr 2003 22:33:44 -0500 Subject: Specific Question about Single vs. Dual Processor System Message-ID: <002501c2fcb6$7e369930$6401a8c0@chihking> Hello. I am a member of the University of Iowa Student Supercomputing Project (UISSP), and we are planning for the purchase of our first cluster. Currently we are divided between a sixteen node single-processor Pentium 4 system and a seven node dual-processor Xeon system. Here are the brief specification of both machines: 16 Pentium 4 single-processor system (total cost $7,407): Intel Pentium 4 2.4GHz 533FSB 512KB ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN 512MB PC2700 DDR333 Maxtor 20GB Ultra100 Hard Drive ATI Rage Mobility VGA Card 8MB AGP CG 6039L 350W USB Midtower Case Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) 7 Xeon dual-processor system (total cost $8,400): INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 TYAN S2723GNN E7501 GLAN MOTHERBOARD PC2100 256MB ECC/REG DDR x2 Maxtor 20GB Ultra100 Hard Drive Chenbro Beige Server Case NMB 460W Xeon Power Supply MITSUMI 54X CD-ROM Drive As you can see, the single-processor system is about $1,000 cheaper than the dual-processor system. We have a total of $9,500 in our budget (to pay for the system, the switch, and everything else). Taking into consideration both performance and economical issues which system would you choose and why? Some more details: since Gigabit LAN is built in both motherboards we will probably establish one Gigabit channel, and if necessary have a second 100Mbps LAN channel as well. Therefore we will probably have to spend an additional $500-600 on switches. Currently we are not sure about specific application that we will be running on the cluster, but we would like to run a broad range of calculations/simulations (ie. biological, economical, mathematical, etc.) We would really appreciate any response in this matter. Thank you very much! 
Sincerely, Chih King UISSP _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sun Apr 6 19:57:24 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sun, 6 Apr 2003 19:57:24 -0400 (EDT) Subject: Specific Question about Single vs. Dual Processor System In-Reply-To: <002501c2fcb6$7e369930$6401a8c0@chihking> Message-ID: On Sun, 6 Apr 2003, Chih King wrote: > Hello. I am a member of the University of Iowa Student Supercomputing > Project (UISSP), and we are planning for the purchase of our first cluster. > Currently we are divided between a sixteen node single-processor Pentium 4 > system and a seven node dual-processor Xeon system. Here are the brief > specification of both machines: > > 16 Pentium 4 single-processor system (total cost $7,407): > > Intel Pentium 4 2.4GHz 533FSB 512KB > ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN > 512MB PC2700 DDR333 > Maxtor 20GB Ultra100 Hard Drive > ATI Rage Mobility VGA Card 8MB AGP > CG 6039L 350W USB Midtower Case > Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) > > 7 Xeon dual-processor system (total cost $8,400): > > INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 > TYAN S2723GNN E7501 GLAN MOTHERBOARD > PC2100 256MB ECC/REG DDR x2 > Maxtor 20GB Ultra100 Hard Drive > Chenbro Beige Server Case > NMB 460W Xeon Power Supply > MITSUMI 54X CD-ROM Drive > > As you can see, the single-processor system is about $1,000 cheaper than the > dual-processor system. We have a total of $9,500 in our budget (to pay for > the system, the switch, and everything else). Taking into consideration > both performance and economical issues which system would you choose and > why? Some more details: since Gigabit LAN is built in both motherboards we > will probably establish one Gigabit channel, and if necessary have a second > 100Mbps LAN channel as well. Therefore we will probably have to spend an > additional $500-600 on switches. Currently we are not sure about specific > application that we will be running on the cluster, but we would like to run > a broad range of calculations/simulations (ie. biological, economical, > mathematical, etc.) We would really appreciate any response in this matter. > Thank you very much! Hmmm, I think I just responded with ONE plan -- looks like you already have better quotes than I expected EXCEPT that you look like you're getting less memory than I think you should get on the duals. I'd recommend at least 512 MB per processor, maybe 1 GB per processor if you can afford it. You also haven't said anything about a server -- if the cluster is going to do any serious work, you'll likely want a "server node" in either configuration with a lot more than 20GB in relatively unreliable IDE drives. You are also getting more nodes (either way) than will comfortably fit on a cheap KVM, which is ok but not as convenient on a starter/demo cluster when you'll have relatively many occasions to connect directly to nodes to mess with them. So replace the KVM with just the monitor, keyboard, mouse themselves and a cart to put them on. Now, about your question. The UP systems have faster memory and more memory and more processors total. If you REALLY have no more money, having 16 systems is better than having 7 if you have to deal with node failures. You do have to get a bigger switch, you do give up a bit of speed when processors have to talk at least some of the time. 
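Just to put rough numbers on "more memory and more processors", using the two
quotes above:

    UP P4:     16 CPUs, 16 x 512 MB = 8.0 GB total, $7,407  ->  about $463/CPU
    dual Xeon: 14 CPUs,  7 x 512 MB = 3.5 GB total, $8,400  ->  about $600/CPU

and the dual price would rise further if you bump each node to 512 MB or 1 GB
per processor as suggested above.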
I think I'd get the UP configuration, with one node pulled and beefed up in a bigger case into a server node. The other 15 and the switch will fit very neatly onto a single heavy duty steel shelf unit, and can be cabled up to look lovely. This should be very serviceable. For coarse grained or embarrassingly parallel code you've optimized CPU and have more memory for applications; for parallel code with a fair bit of IPC's you no longer can talk to at least ONE processor locally, but neither do you have to share a single gigE connection among two processors. It will look more impressive. It will run slightly hotter and cost slightly more to operate, if you are paying the power bill (about $2K/year, at a guess, so I hope you are NOT paying the power bill:-). rgb > > Sincerely, > > Chih King > UISSP > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at plogic.com Mon Apr 7 10:09:19 2003 From: deadline at plogic.com (Douglas Eadline) Date: Mon, 7 Apr 2003 10:09:19 -0400 (EDT) Subject: Specific Question about Single vs. Dual Processor System In-Reply-To: <002501c2fcb6$7e369930$6401a8c0@chihking> Message-ID: On Sun, 6 Apr 2003, Chih King wrote: > Hello. I am a member of the University of Iowa Student Supercomputing > Project (UISSP), and we are planning for the purchase of our first cluster. > Currently we are divided between a sixteen node single-processor Pentium 4 > system and a seven node dual-processor Xeon system. Here are the brief > specification of both machines: > > 16 Pentium 4 single-processor system (total cost $7,407): > > Intel Pentium 4 2.4GHz 533FSB 512KB > ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN According to Asus this MB has an on-board 10/100 (realtek) interface is there a 10/100/1000 option for this board. Or are you adding a GigE NIC? Doug > 512MB PC2700 DDR333 > Maxtor 20GB Ultra100 Hard Drive > ATI Rage Mobility VGA Card 8MB AGP > CG 6039L 350W USB Midtower Case > Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) > > 7 Xeon dual-processor system (total cost $8,400): > > INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 > TYAN S2723GNN E7501 GLAN MOTHERBOARD > PC2100 256MB ECC/REG DDR x2 > Maxtor 20GB Ultra100 Hard Drive > Chenbro Beige Server Case > NMB 460W Xeon Power Supply > MITSUMI 54X CD-ROM Drive > > As you can see, the single-processor system is about $1,000 cheaper than the > dual-processor system. We have a total of $9,500 in our budget (to pay for > the system, the switch, and everything else). Taking into consideration > both performance and economical issues which system would you choose and > why? Some more details: since Gigabit LAN is built in both motherboards we > will probably establish one Gigabit channel, and if necessary have a second > 100Mbps LAN channel as well. Therefore we will probably have to spend an > additional $500-600 on switches. 
Currently we are not sure about specific > application that we will be running on the cluster, but we would like to run > a broad range of calculations/simulations (ie. biological, economical, > mathematical, etc.) We would really appreciate any response in this matter. > Thank you very much! > > Sincerely, > > Chih King > UISSP > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sun Apr 6 19:40:42 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sun, 6 Apr 2003 19:40:42 -0400 (EDT) Subject: advice on cluster purchase In-Reply-To: <3E9602A5@itsnt5.its.uiowa.edu> Message-ID: On Sat, 5 Apr 2003, jbassett wrote: > Hi, I am an undergraduate involved with a totally student run parallel > computing experience. We have approximately 10,000 of university money with > which to produce the best possible machine. I would be interested to hear > from you all what configuration you would choose if someone just said "here's > the money, build the best system you can." The system will do both cpu > dominated and network intensive activities, so it would be tailored for > neither. Do SMP nodes tend to be superior in a cost/performance framework? I > have worked with other peoples systems and they are always dual cpu nodes, my > impression being that it is for the purpose of minimizing overall size- as I > tend to start a process on each cpu. Any advice would be appreciated. I think you can barely afford the following: 3 dual Xeon or dual Athlon systems. Budget them for $1800-2000, get at least 512 MB of memory per, small/wimpy IDE hard disk, gigabit ethernet card. Tower cases are cheaper and 4 nodes don't need a rackmount. No CD drives. A floppy is ok, a cheap video card is ok although likely to be onboard on the motherboard along with a possibly useful 100BT interface. 1 dual processor P4 or Athlon with a gig card, a SCSI interface, and 3-4 SCSI disks set up in a RAID, in a server (supertower) case. If data preservation is very importanty to you and you can afford it, add a tape or CD-RW to back it up. If this "head node" is to connect to an external network, buy it an extra 100BT interface. Get it some bric-a-brac, as well -- a CD RW, a nice sound card (if one isn't onboard), some decent speakers -- this is where one will "work". A bit of extra memory (relative to the nodes) wouldn't hurt as well. 1 small gigabit ethernet switch. Netgear has a cheap one. So do other vendors. I leave it to your shopping process to determine the number of ports -- at least 4, of course, but you might want to try for 8, or 16, if you think your cluster might grow later. You may want a cheap 100BT switch as well (or extra ports for the 100BT interfaces) if you'd like to preserve the gig network for IPC computations only. 1 four port KVM switch. Don't go cheap -- good cables, maybe a Belkin switch. 
This should cost you $200+ (including cables) not $100-. The cheap serial/switch ones suck, and cheap cables will distort video. 1 monitor as large and nice as you wish. If you can afford it, I'd go for e.g a NEC 17" flatpanel that does 1280x1024. Oh, and a nice mouse and keyboard too. 1 heavy duty shelf unit. See pictures on http://www.phy.duke.edu/brahma for a nice one I got at Home Depot for $60 or so -- you only need a half of one for four nodes, but your cluster might grow... Miscellaneous cables, UPS/surge protectors, some nifty LEDs and glowing lights to make people think it is a really powerful computer;-) A name, and a nice logo. Never underestimate the importance of marketing...:-) I make it (3x$1800=$5400) + (1x$3000) + $600 + $400 = $9400, plus several hundred for the miscellaneous -- cables, shelf, KVM, UPS, and anything I might have forgotten. At least you have something to structure a price search around while shopping. Note that you won't get bleeding edge systems at these prices. I'd guess 2.4 GHz P4 Xeons or 2000+ Athlons with 1 GB of DDR, maybe a bit better. rgb > Joseph Bassett > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Mon Apr 7 08:56:05 2003 From: eugen at leitl.org (Eugen Leitl) Date: Mon, 7 Apr 2003 14:56:05 +0200 Subject: renting time on a cluster Message-ID: <20030407125604.GR2067@leitl.org> A friend of mine has a project requiring a lot of crunch (no idea yet which bandwidth/latency requirements). Can you think of places where one can rent nontrivial amount of crunch for money? TIA, Eugene -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From stan at temple.edu Mon Apr 7 11:27:55 2003 From: stan at temple.edu (Stan Horwitz) Date: Mon, 7 Apr 2003 11:27:55 -0400 (EDT) Subject: Question about linking Beowulf nodes Message-ID: Hello all; Sorry if this is a FAQ. I have been assigned the job of budgeting for a six-node Beowulf cluster. I have no experience in this area, yet. We would like to use PCs with AMD processors in them and have disk storage reside on a Compaq Storageworks SAN. What I am not clear on is the best hardware solution to link up the six nodes and the appropriate type of network cards for the individual PCs that will form the cluster. The purpose of this cluster will be to run computational jobs such as SAS, Gausian, SPSS, IMSL, and various and sundry FORTRAN and C programs that our faculty and graduate require for their research projects. Not surprisingly, we also want to keep implementation and maintenance costs as low as possible. I have looked through the faq on the beowfulf.org web site, but I did not come across any specific hardware recommendations. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From natorro at fisica.unam.mx Mon Apr 7 12:33:30 2003 From: natorro at fisica.unam.mx (Carlos Ernesto Lopez Nataren) Date: 07 Apr 2003 11:33:30 -0500 Subject: Mac OS X or Linux? Message-ID: <1049733210.7632.3.camel@linux> Hi!, we recently acquired 6 Xserve nodes at my institute, and we are planning to setup a beowulf cluster, can anyone tell if it is worthy to set it up with the OS it brings??? (Mac OS X server jaguar) or if it is better to try to use linux on these beauties??? Thanks a lot in advance for any help -- Carlos Ernesto Lopez Nataren IFISICA _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rocky at atipa.com Mon Apr 7 10:54:24 2003 From: rocky at atipa.com (Rocky McGaugh) Date: Mon, 7 Apr 2003 09:54:24 -0500 (CDT) Subject: Specific Question about Single vs. Dual Processor System In-Reply-To: <002501c2fcb6$7e369930$6401a8c0@chihking> Message-ID: On Sun, 6 Apr 2003, Chih King wrote: > Hello. I am a member of the University of Iowa Student Supercomputing > Project (UISSP), and we are planning for the purchase of our first cluster. > Currently we are divided between a sixteen node single-processor Pentium 4 > system and a seven node dual-processor Xeon system. Here are the brief > specification of both machines: > > 16 Pentium 4 single-processor system (total cost $7,407): > > Intel Pentium 4 2.4GHz 533FSB 512KB > ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN > 512MB PC2700 DDR333 > Maxtor 20GB Ultra100 Hard Drive > ATI Rage Mobility VGA Card 8MB AGP > CG 6039L 350W USB Midtower Case > Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) > > 7 Xeon dual-processor system (total cost $8,400): > > INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 > TYAN S2723GNN E7501 GLAN MOTHERBOARD > PC2100 256MB ECC/REG DDR x2 > Maxtor 20GB Ultra100 Hard Drive > Chenbro Beige Server Case > NMB 460W Xeon Power Supply > MITSUMI 54X CD-ROM Drive > Given the above options, i'd go with the dual Xeons. Memory bandwidth is greater on the 2723 due to its dual-DDR setup. It is also a server-class motherboard. The i7501 chipset was designed for server use and works well for clusters. The Asus board has a SiS chipset. It is fairly safe to assume that this board was optimized for AGP speed which wont matter much to you. I think you'll find much better reliability with the servers. -- Rocky McGaugh Atipa Technologies rocky at atipatechnologies.com rmcgaugh at atipa.com 1-785-841-9513 x3110 http://1087800222/ perl -e 'print unpack(u, ".=W=W+F%T:7\!A+F-O;0H`");' _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From robl at mcs.anl.gov Mon Apr 7 13:57:48 2003 From: robl at mcs.anl.gov (Robert Latham) Date: Mon, 7 Apr 2003 12:57:48 -0500 Subject: Mac OS X or Linux? 
In-Reply-To: <1049733210.7632.3.camel@linux> References: <1049733210.7632.3.camel@linux> Message-ID: <20030407175748.GB20765@mcs.anl.gov> On Mon, Apr 07, 2003 at 11:33:30AM -0500, Carlos Ernesto Lopez Nataren wrote: > Hi!, we recently acquired 6 Xserve nodes at my institute, and we are > planning to setup a beowulf cluster, can anyone tell if it is worthy to > set it up with the OS it brings??? (Mac OS X server jaguar) or if it is > better to try to use linux on these beauties??? please, if you have the time and resources, make them dual-boot (granted, with six, it will mildy annoying to switch operating systems) and tell us how well your applications run under an os x cluster versus under a powerpc linux cluster. You'll spark a massive flame war, but there is a dearth of real data showing how good or bad mac os X is in a cluster environment (where the benefits of "good user interface" and "i can watch quicktime trailers" aren't important) compared to linux on the same hardware. please note that you'll need a quite recent linux kernel to support the xserve hardware I can show you lmbench numbers that show linux outperforming os x in *operating system specific* tasks, but real applications carry more weight than microbenchmarks. ==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 7 12:53:37 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 7 Apr 2003 12:53:37 -0400 (EDT) Subject: Specific Question about Single vs. Dual Processor System In-Reply-To: Message-ID: > > 16 Pentium 4 single-processor system (total cost $7,407): pretty cheap! > > Intel Pentium 4 2.4GHz 533FSB 512KB > > ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN > > According to Asus this MB has an on-board 10/100 (realtek) interface > is there a 10/100/1000 option for this board. > Or are you adding a GigE NIC? recent SiS chipsets also have a bit of a smell about them, Linux-wise. Alan Cox says that SiS hasn't been cooperative in producing docs that permit good linux support. I don't see anything really attractive about this board - most of the features are useless (agp, for instance, S/PDIF, 1394). it doesn't seem like SATA is happening fast enough to be a good motive, either. if you insist on P4's, I'd probably go with a 845PE or maybe GE. several vendors bundle such boards with gigabit. I have mixed info on whether the integrated video on the GE causes problems - I expect that if you're in text mode, it wouldn't steal enough dram bandwidth to notice. Intel chipsets are a bit of a conservative choice, but sometimes that's the right move (heck, Asus is fairly conservative). it's worth at least considering e7205 boards, since doubling the bandwidth does definitely help many compute codes (unlike most desktop apps.) finally, AMD remains a viable option, though mainly as a low-end approach. for instance, an ECS K7S5a-pro is incredibly cheap, has builtin 100bT, and is pretty snappy. for $500 computers, saving a hundred dollars on the motherboard, along with a hundred on the CPU can add up quickly. > > Maxtor 20GB Ultra100 Hard Drive are you sure you really want that? 
the U100 part is not important (since current disks peak at around 50 MB/s), but the size implies density, and means that the disk is ~2 generations old. consider getting a 30-40G disk just so you get current (60-80G/platter) mechanisms. > > ATI Rage Mobility VGA Card 8MB AGP or integrated video. > > Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) a good choice 5 years ago, but I'd probably consider something more interesting. if you're planning to just treat it as a management net, (which I don't understand the appeal of), then just go with realtek nics. if you're going to use it for anything interesting, try to get gigabit. > > 7 Xeon dual-processor system (total cost $8,400): > > > > INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 > > TYAN S2723GNN E7501 GLAN MOTHERBOARD > > PC2100 256MB ECC/REG DDR x2 that's not much ram (since ram is cheap). if you're interested in exploring the benefits of duals and/or double-wide DDR, perhaps you should wait for the next generation chipsets (springdale/etc). > > Maxtor 20GB Ultra100 Hard Drive > > Chenbro Beige Server Case > > NMB 460W Xeon Power Supply 350 is actually plenty for a dual 2.4. naturally, people will be more impressed with a cluster of non-beige cases ;) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ganeshnamboothiri at yahoo.com Mon Apr 7 13:28:18 2003 From: ganeshnamboothiri at yahoo.com (Ganesh Namboothiri) Date: Mon, 7 Apr 2003 10:28:18 -0700 (PDT) Subject: Matrix Multiplication Message-ID: <20030407172818.97982.qmail@web21504.mail.yahoo.com> Hello, I want to implement a parallel matrix multiplication algorithm and I don't know how to split the array and send it. Please help me to split the n x n matrix into pieces so that I can do parallel matrix multiplication. ganeshnamboothiri at yahoo.com --------------------------------- Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more -------------- next part -------------- An HTML attachment was scrubbed... URL: From deadline at plogic.com Mon Apr 7 14:31:34 2003 From: deadline at plogic.com (Douglas Eadline) Date: Mon, 7 Apr 2003 14:31:34 -0400 (EDT) Subject: Specific Question about Single vs. Dual Processor System In-Reply-To: Message-ID: On Mon, 7 Apr 2003, Rocky McGaugh wrote: > On Sun, 6 Apr 2003, Chih King wrote: > > > Hello. I am a member of the University of Iowa Student Supercomputing > > Project (UISSP), and we are planning for the purchase of our first cluster. > > Currently we are divided between a sixteen node single-processor Pentium 4 > > system and a seven node dual-processor Xeon system. Here are the brief > > specification of both machines: > > > > 16 Pentium 4 single-processor system (total cost $7,407): > > > > Intel Pentium 4 2.4GHz 533FSB 512KB > > ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN > > 512MB PC2700 DDR333 > > Maxtor 20GB Ultra100 Hard Drive > > ATI Rage Mobility VGA Card 8MB AGP > > CG 6039L 350W USB Midtower Case > > Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) > > > > 7 Xeon dual-processor system (total cost $8,400): > > > > INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 > > TYAN S2723GNN E7501 GLAN MOTHERBOARD > > PC2100 256MB ECC/REG DDR x2 > > Maxtor 20GB Ultra100 Hard Drive > > Chenbro Beige Server Case > > NMB 460W Xeon Power Supply > > MITSUMI 54X CD-ROM Drive > > > > Given the above options, i'd go with the dual Xeons.
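On Ganesh's matrix multiplication question above: the simplest decomposition is to split A into blocks of consecutive rows, give every process a full copy of B, and let each process compute its own rows of C. Below is a minimal sketch in C with MPI; it assumes N is divisible by the number of processes, fills the matrices with test values, and is an illustration only (a serious code would call a tuned BLAS dgemm for the local product, or move to a 2-D decomposition such as Cannon's algorithm or SUMMA once the process count gets large).

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    #define N 512                /* matrix size; assumed divisible by the number of processes */

    int main(int argc, char **argv)
    {
        int rank, size, rows, i, j, k;
        double *A = NULL, *C = NULL, *Aloc, *Cloc, *B, s;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        rows = N / size;                      /* each process owns a block of rows of A and C */
        Aloc = malloc(rows * N * sizeof(double));
        Cloc = malloc(rows * N * sizeof(double));
        B    = malloc(N * N * sizeof(double));

        if (rank == 0) {                      /* root sets up the full A and B with test values */
            A = malloc(N * N * sizeof(double));
            C = malloc(N * N * sizeof(double));
            for (i = 0; i < N * N; i++) { A[i] = 1.0; B[i] = 2.0; }
        }

        /* split A row-wise across the processes; give everyone a full copy of B */
        MPI_Scatter(A, rows * N, MPI_DOUBLE, Aloc, rows * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        MPI_Bcast(B, N * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        for (i = 0; i < rows; i++)            /* local rows of C = Aloc * B */
            for (j = 0; j < N; j++) {
                s = 0.0;
                for (k = 0; k < N; k++)
                    s += Aloc[i * N + k] * B[k * N + j];
                Cloc[i * N + j] = s;
            }

        /* reassemble C on the root in the same row order */
        MPI_Gather(Cloc, rows * N, MPI_DOUBLE, C, rows * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("C[0][0] = %g (expected %g)\n", C[0], 2.0 * N);

        MPI_Finalize();
        return 0;
    }

Compile with mpicc and run with, e.g., mpirun -np 4; with the test values above every entry of C should come out as 2N.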
Memory bandwidth is > greater on the 2723 due to its dual-DDR setup. It is also a server-class > motherboard. The i7501 chipset was designed for server use and works well > for clusters. Of course it all depends on the application(s). Depending on the application mix, you may not realize the full potential DDR offers. (look at some of the numbers for SMP motherboards on cluster-rant.com) This is a very important question - single vs. dual. In addition to sharing the memory, they will also share the interconnect. I wonder if systems built from boards like the Tyan 2707 or the SM X5SSE would provide better price to performance for some applications than using dual MB's. I have some testing planned. Nice thing about the Tyan board is it has GigE on a PCI-X bus and a PCI-X slot if you need one. Doug > > The Asus board has a SiS chipset. It is fairly safe to assume that this > board was optimized for AGP speed which wont matter much to you. > > I think you'll find much better reliability with the servers. > > -- ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Mon Apr 7 14:33:39 2003 From: landman at scalableinformatics.com (Joseph Landman) Date: 07 Apr 2003 14:33:39 -0400 Subject: Mac OS X or Linux? In-Reply-To: <20030407175748.GB20765@mcs.anl.gov> References: <1049733210.7632.3.camel@linux> <20030407175748.GB20765@mcs.anl.gov> Message-ID: <1049740419.6940.43.camel@protein.scalableinformatics.com> On Mon, 2003-04-07 at 13:57, Robert Latham wrote: > You'll spark a massive flame war, but there is a dearth of real data > showing how good or bad mac os X is in a cluster environment (where > the benefits of "good user interface" and "i can watch quicktime > trailers" aren't important) compared to linux on the same hardware. Happens with everything though... > please note that you'll need a quite recent linux kernel to support > the xserve hardware > > I can show you lmbench numbers that show linux outperforming os x in > *operating system specific* tasks, but real applications carry more > weight than microbenchmarks. Numbers I have seen for bioinfo apps seem to indicate that the hardware is faster when code is redone for the built in vector registers (gcc compiler doesn't automatically do this). Then again, this is comparing non-SIMD to SIMD, and I would expect that the SIMD could be faster at specific code patterns/fragments. Single CPU (non-SIMD) to single CPU (non-SIMD) the performance comparison seems not to favor the current Apple PPC hardware against current IA32 machines. I am curious about other apps as well. Please summarize the lmbench microbenchmarks. I would be curious about heavy FP/memory codes. I would think that the PPC would have some interesting performance there. 
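Since memory bandwidth keeps coming up in this thread (single vs. dual channel DDR, PPC vs. IA32), here is a tiny STREAM-style triad loop that is easy to carry from machine to machine. It is not the official STREAM benchmark, just a stand-in with a hand-picked array size; compile with full optimization and only compare numbers between machines run the same way.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>

    #define N      4000000       /* 4M doubles per array (~32 MB each): enough to defeat the caches */
    #define NTRIES 10

    static double now(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + 1e-6 * tv.tv_usec;
    }

    int main(void)
    {
        double *a = malloc(N * sizeof(double));
        double *b = malloc(N * sizeof(double));
        double *c = malloc(N * sizeof(double));
        double t0, t1, best = 1e30;
        long i;
        int t;

        for (i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; c[i] = 0.5; }

        for (t = 0; t < NTRIES; t++) {        /* keep the fastest of several passes */
            t0 = now();
            for (i = 0; i < N; i++)
                a[i] = b[i] + 3.0 * c[i];     /* "triad": two loads and one store per element */
            t1 = now();
            if (t1 - t0 < best)
                best = t1 - t0;
        }

        /* count 3 x 8 bytes of traffic per element, as STREAM does */
        printf("triad: %.1f MB/s  (a[0] = %g)\n", 3.0 * 8.0 * N / best / 1e6, a[0]);
        return 0;
    }

The difference between, say, a dual-channel E7501 board and a single-channel 845PE should show up directly in this number.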
> > ==rob -- Joseph Landman, Ph.D Scalable Informatics LLC email: landman at scalableinformatics.com web: http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Mon Apr 7 15:46:11 2003 From: eugen at leitl.org (Eugen Leitl) Date: Mon, 7 Apr 2003 21:46:11 +0200 Subject: thanks [was renting time on a clustger] Message-ID: <20030407194611.GC3245@leitl.org> Thanks for all the helpful responses, both on-list and off-list. I've passed on the information to the party in question. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From rgb at phy.duke.edu Mon Apr 7 17:21:07 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 7 Apr 2003 17:21:07 -0400 (EDT) Subject: Mac OS X or Linux? In-Reply-To: <20030407175748.GB20765@mcs.anl.gov> Message-ID: On Mon, 7 Apr 2003, Robert Latham wrote: > On Mon, Apr 07, 2003 at 11:33:30AM -0500, Carlos Ernesto Lopez Nataren wrote: > > Hi!, we recently acquired 6 Xserve nodes at my institute, and we are > > planning to setup a beowulf cluster, can anyone tell if it is worthy to > > set it up with the OS it brings??? (Mac OS X server jaguar) or if it is > > better to try to use linux on these beauties??? > > I can show you lmbench numbers that show linux outperforming os x in > *operating system specific* tasks, but real applications carry more > weight than microbenchmarks. Hola, Carlos! Como Estas? Say hello to Carmela and Jaime (if they are still there). My only modification to this is that I'd recommend looking at the non-hardware costs a bit to determine if messing with either solution is worth it. Remember, it costs time and money to set things up and run them, and this differential cost is very sensitive to things like the scalability of what you build. For example, with linux on intel or amd, you can fully automate installation and upgrade for an entire cluster so that it takes only a tiny bit of time per node per year to run the thing. All software is prebuilt ready to run, basically for free. With linux on mac, or mac os on macs, are you going to have anything like this level of scaling? No, because there will be lots of things you have to build (and possibly port) for linux on mac, and because the mac os on mac was built for a PC environment and I doubt that it scales like rpm distros or debian. As in, hardware can be "free" and not be worth it, if it costs you a huge amount of human time to make everything work. The further you get from any of the "standard beowulf models" (whatever they might be at any point in time) the more of YOUR time you're going to put in screwing around getting things to work. If your time is cheap and the benefit of eventual success is great, this is no problem. If your time is costly and the benefit is at best "ok" if everything works when your done, you might better consider ways of setting up 'wulfs closer to the standard approaches. Of course, you may have cheap labor in the form of graduate students...but it is still something to think about.;-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ron_chen_123 at yahoo.com Mon Apr 7 22:53:28 2003 From: ron_chen_123 at yahoo.com (Ron Chen) Date: Mon, 7 Apr 2003 19:53:28 -0700 (PDT) Subject: Fwd: [PBS-USERS] cost for educational sites Message-ID: <20030408025328.7453.qmail@web41310.mail.yahoo.com> Looks like PBSPro is *not* free for educational sites anymore. When PBS was owned by Veridian, OpenPBS was quite broken, as not all PBSPro fixes went into OpenPBS. I guess more and more sites will switch to GridEngine. -Ron --- Jenn Sturm wrote: > I see from PBSPro's new website that educational > sites need to submit a > grant application in order to purchase PBSPro at > reduced costs, where > previously it was free to educational sites. Has > anyone submitted this > yet and found out what the price actually is? I'm > building a new > machine right now and am surprised to find this out > (didn't exactly > expect to have to complete a grant application in > order to build this > new machine...) and now have to move back to > OpenPBS, but I'm curious, > still... > > Thanks, > > Jenn Sturm > > > +-------------------------------------------------------------------+ > Jennifer Sturm > System Administrator and Research Support Specialist > Chemistry Department > Hamilton College > > jsturm at hamilton.edu > help at mercury.chem.hamilton.edu > 315-859-4745 > > http://www.chem.hamilton.edu/ > http://mars.chem.hamilton.edu/ > +-------------------------------------------------------------------+ > > __________________________________________________________________________ > To unsubscribe: email majordomo at OpenPBS.org with > body "unsubscribe pbs-users" > For message archives: > http://www.OpenPBS.org/UserArea/pbs-users.html > - - - - - - - - - - > - - - - > OpenPBS and the pbs-users mailing list are sponsored > by Altair. > __________________________________________________________________________ __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jeff at aslab.com Mon Apr 7 20:22:04 2003 From: jeff at aslab.com (Jeff Nguyen) Date: Mon, 7 Apr 2003 17:22:04 -0700 Subject: Specific Question about Single vs. Dual Processor System References: Message-ID: <0b6a01c2fd64$e2155140$6502a8c0@jeff> > I don't see anything really attractive about this board - > most of the features are useless (agp, for instance, S/PDIF, 1394). > it doesn't seem like SATA is happening fast enough to be a good > motive, either. > > if you insist on P4's, I'd probably go with a 845PE or maybe GE. > several vendors bundle such boards with gigabit. I have mixed info > on whether the integrated video on the GE causes problems - I expect > that if you're in text mode, it wouldn't steal enough dram bandwidth > to notice. Intel chipsets are a bit of a conservative choice, > but sometimes that's the right move (heck, Asus is fairly conservative). > > it's worth at least considering e7205 boards, since doubling the bandwidth > does definitely help many compute codes (unlike most desktop apps.) 
> I would rather wait for 865P (Springsdale) or 875P (Canterwood) instead of going for E7205. These new platforms will out really soon. :) They will offer higher front side bus (800mhz) and faster memory bus (400mhz) at the same cost as the existing E7205 machines. Jeff ASL Inc. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ron_chen_123 at yahoo.com Tue Apr 8 02:02:36 2003 From: ron_chen_123 at yahoo.com (Ron Chen) Date: Mon, 7 Apr 2003 23:02:36 -0700 (PDT) Subject: Fwd: [PBS-USERS] cost for educational sites In-Reply-To: Message-ID: <20030408060236.98444.qmail@web41303.mail.yahoo.com> First, both SGE and SGEEE are free and opensource. They have better features, and far better fault tolerance. If PBSPro is starting to cost $$ even for educational sites, why not do the switch now? Second, by "I guess", I am talking about the trend, if you follow the discussions on beowulf lately, you will find that there really are a lot of people switching from PBS to SGE. Lastly, I don't work for Sun, and besides, Sun is not making a dollar even if the whole world is using SGE. (also note that Sun is making SGE free not only for Solaris, but for other platforms -- AIX, HP, Alpha, Mac...) Next time if I suggest people to use Linux instead of Windows, I hope people don't ask me whether I work for Linus or Redhat :-) -Ron --- Mark Hahn wrote: > > I guess more and more sites will switch to > GridEngine. > > do you work for Sun? > __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From robl at mcs.anl.gov Wed Apr 9 02:19:21 2003 From: robl at mcs.anl.gov (Robert Latham) Date: Wed, 9 Apr 2003 01:19:21 -0500 Subject: Mac OS X or Linux? In-Reply-To: <1049740419.6940.43.camel@protein.scalableinformatics.com> References: <1049733210.7632.3.camel@linux> <20030407175748.GB20765@mcs.anl.gov> <1049740419.6940.43.camel@protein.scalableinformatics.com> Message-ID: <20030409061920.GA32255@mcs.anl.gov> On Mon, Apr 07, 2003 at 02:33:39PM -0400, Joseph Landman wrote: > I am curious about other apps as well. Please summarize the lmbench > microbenchmarks. I would be curious about heavy FP/memory codes. I > would think that the PPC would have some interesting performance there. 
Usual caveats about benchmarks and misleading numbers apply, but this is as fair as i can make it: same hardware, same benchmark, different operating systems: new: http://terizla.org/~robl/pbook/benchmarks/lmbench-linux_vs_osx.1 old (but same results): http://drmirage.clustermonkey.org/~laz/pbook/lmbench.powerbook.txt ==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Tue Apr 8 13:37:07 2003 From: eugen at leitl.org (Eugen Leitl) Date: Tue, 8 Apr 2003 19:37:07 +0200 Subject: [baillot@ait.nrl.navy.mil: [ARFORUM] Fwd: Cluster job opportunity] Message-ID: <20030408173707.GK13969@leitl.org> ----- Forwarded message from yohan baillot ----- From: yohan baillot Date: Tue, 08 Apr 2003 13:00:01 -0400 To: ARforum Subject: [ARFORUM] Fwd: Cluster job opportunity Reply-To: arforum at topica.com X-Mailer: QUALCOMM Windows Eudora Version 5.1 FYI Yohan >From: Jonathan Gratch >To: >Subject: Cluster job opportunity >Date: Tue, 8 Apr 2003 07:43:36 -0700 >X-Mailer: Internet Mail Service (5.5.2653.19) > >Hi, > >I saw your recent paper at the VR2003 confrence. >Our research institute is hoping to hire someone to lead >a R&D effort to develop a cluster system for VR applications. I don't >know if this community has a mailing list where it might be more >appropriate to post job openings so I've contacted you directly in the >hope that you might let me know if there is a mailing list or if there >might be a place at your institute that you could post the following >job opening. > >Thanks in advance, > >jon gratch >______________________________________________ >Jonathan Gratch | www.ict.usc.edu/~gratch >Project Leader, Research Assistant Professor | Phone: (310) 448-0306 >USC Institute for Creative Technologies | Fax: (310) 574-5725 >13274 Fiji Way, Suite 600 | E-mail: gratch at ict.usc.edu >Marina del Rey, CA 90292 | > > > > >Job Posting for Cluster Project Leader (Req# 14490) > >The University of Southern California's Institute for Creative >Technologies is involved in fundamental research on advancing the >state of virtual reality training systems through a combination of >advanced graphics, audio, artificial intelligence and Hollywood >production techniques. We are currently seeking a senior programmer >with project management experience to coordinate the research and >development of a distributed rendering and animation engine that will >serve as the backbone of the next generation of ICT training >simulators. The goal of this multi-year project is to create a >flexible architecture, using commercial off-the-shelf software where >possible, that will support the real-time graphics, audio, >animation and simulation requirements of multiple ICT research >efforts. A key aspect of the project is to support distributed >rendering of real-time graphics on a cluster of PC computers. > >The applicant can expect to spend half of their time performing >management duties and half programming. Management duties include >working with research project leaders to refine system requirements, >defining tasks and priorities, creating milestones and managing a >small team of developers, contacting vendors and attending conferences >to stay informed of developments in the area. 
Programming duties >include developing new software and evaluating and integrating >commercial solutions. > >The ideal applicant will have: >* project management experience >* familiarity with virtual reality systems (military simulations, > computer games) >* expertise in computer graphics (specifically, Performer and OpenGL), >* familiarity with graphics clusters and supporting software > (Renderizer, ClusterJuggler, Chromium) >* expertise with C++, UNIX/LINUX/IRIX and Windows operating systems >* familiarity with high-speed network solutions (Myrinet, Gigabit > ethernet) >* familiarity with commercial content production tools (Maya, 3D > Sudio, Diva) > >Interested applicants should apply to job Requisition number 14490 >at http://www.usc.edu/bus-affairs/ers/search.html. Yohan BAILLOT Virtual Reality Laboratory, Advanced Information Technology (Code 5580), Naval Research Laboratory, 4555 Overlook Avenue SW, Washington, DC 20375-5337 Email : baillot at ait.nrl.navy.mil Work : (202) 404 7801 Home : (202) 518 3960 Cell : (703) 732 5679 Fax : (202) 767 1122 Web : http://ait.nrl.navy.mil/vrlab/projects/BARS/BARS.html ==^================================================================ This email was sent to: eugen at leitl.org EASY UNSUBSCRIBE click here: http://topica.com/u/?a84Ao5.bb5321.ZXVnZW5A Or send an email to: arforum-unsubscribe at topica.com TOPICA - Start your own email discussion group. FREE! http://www.topica.com/partner/tag02/create/index2.html ==^================================================================ ----- End forwarded message ----- -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From landman at scalableinformatics.com Tue Apr 8 17:19:59 2003 From: landman at scalableinformatics.com (Joseph Landman) Date: 08 Apr 2003 17:19:59 -0400 Subject: Mac OS X or Linux? In-Reply-To: <200304081541.36834.exa@kablonet.com.tr> References: <1049733210.7632.3.camel@linux> <20030407175748.GB20765@mcs.anl.gov> <1049740419.6940.43.camel@protein.scalableinformatics.com> <200304081541.36834.exa@kablonet.com.tr> Message-ID: <1049836799.16158.4.camel@protein.scalableinformatics.com> On Tue, 2003-04-08 at 08:41, Eray Ozkural wrote: > I wonder if we can really classify those vector operations as SIMD which means > Single Instruction Multiple Data architecture. (no this isn't a troll!) I would word that the other way. They look like SIMD to me, and not "vector" in the Cray-ish model. I could be wrong on this, but I didn't see long "vectors", rather large (128 bit) data types upon which you can apply simultaneous operations. > > Thanks, -- Joseph Landman, Ph.D Scalable Informatics LLC email: landman at scalableinformatics.com web: http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Tue Apr 8 08:41:36 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Tue, 8 Apr 2003 15:41:36 +0300 Subject: Mac OS X or Linux? 
In-Reply-To: <1049740419.6940.43.camel@protein.scalableinformatics.com> References: <1049733210.7632.3.camel@linux> <20030407175748.GB20765@mcs.anl.gov> <1049740419.6940.43.camel@protein.scalableinformatics.com> Message-ID: <200304081541.36834.exa@kablonet.com.tr> On Monday 07 April 2003 21:33, Joseph Landman wrote: > Numbers I have seen for bioinfo apps seem to indicate that the hardware > is faster when code is redone for the built in vector registers (gcc > compiler doesn't automatically do this). Then again, this is comparing > non-SIMD to SIMD, and I would expect that the SIMD could be faster at > specific code patterns/fragments. Single CPU (non-SIMD) to single CPU > (non-SIMD) the performance comparison seems not to favor the current > Apple PPC hardware against current IA32 machines. > I wonder if we can really classify those vector operations as SIMD which means Single Instruction Multiple Data architecture. (no this isn't a troll!) Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From whately at lcp.coppe.ufrj.br Wed Apr 9 16:11:10 2003 From: whately at lcp.coppe.ufrj.br (Lauro L. A. Whately) Date: Wed, 09 Apr 2003 17:11:10 -0300 Subject: setting PXE on a different net interface Message-ID: <3E947E5E.1020601@lcp.coppe.ufrj.br> Hi, I would like to make the machines in the cluster boot from a remote server. The mainboard of the nodes has a giga-ethernet interface on-board. Also, each node has a pci fast-ethernet interface that I want to use for administration and services (nfs, nis, monitoring, ...). The only configuration I find in the bios for the PXE is booting from the on-board interface. Does anyone know I way to reconfigure the (Intel) boot agent bypass the onboard interface and boot from the pci interface ? TIA, Lauro Whately. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Apr 9 17:04:52 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 9 Apr 2003 17:04:52 -0400 (EDT) Subject: Mac OS X or Linux? In-Reply-To: <20030409061920.GA32255@mcs.anl.gov> Message-ID: > http://terizla.org/~robl/pbook/benchmarks/lmbench-linux_vs_osx.1 yow! I think it's fair to say that Apple has some work to do. I suppose it's also possible that the OS is tuned for models (such as desktop ones, perhaps with different cpu/cache/dram configs.) does OS X have page coloring inherited from *BSD? perhaps that explains the only place it comes out ahead (memory bandwidth/latency). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Wed Apr 9 17:45:52 2003 From: becker at scyld.com (Donald Becker) Date: Wed, 9 Apr 2003 17:45:52 -0400 (EDT) Subject: setting PXE on a different net interface In-Reply-To: <3E947E5E.1020601@lcp.coppe.ufrj.br> Message-ID: On Wed, 9 Apr 2003, Lauro L. A. 
Whately wrote: > I would like to make the machines in the cluster boot from a remote > server. The mainboard of the nodes has a giga-ethernet interface > on-board. Also, each node has a pci fast-ethernet interface that I want > to use for administration and services (nfs, nis, monitoring, ...). > The only configuration I find in the bios for the PXE is booting from > the on-board interface. This one is pretty easy to answer: the on-board interface is the only interface that the BIOS knows how to use. > Does anyone know I way to reconfigure the (Intel) boot agent bypass the > onboard interface and boot from the pci interface ? The only way your add-on PCI NIC can support PXE boot is if it has its own boot agent code. Note that most PXE clients out there use the Intel framework as the basis of their PXE boot. If the add-on card can do PXE boot, it will have a duplicate copy of the boot agent code. It might have the same message as the on-board NIC, but it's different boot step. If you have multiple on-board NICs, the Intel boot agent will attempt to PXE boot sequentially, rather than send requests out on all interfaces simultaneously and then using the best response. (If you read the PXE specs you would expect the latter behavior.) -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 410 Severn Ave. Suite 210 Scyld Beowulf cluster system Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Wolfgang.Dobler at kis.uni-freiburg.de Thu Apr 10 04:47:04 2003 From: Wolfgang.Dobler at kis.uni-freiburg.de (Wolfgang Dobler) Date: Thu, 10 Apr 2003 10:47:04 +0200 Subject: Scaling of hydro codes Message-ID: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> We have a 3-d finite-difference hydro code and find that the time per time step and grid point scales almost linearly, t_step ~ Ncpu^(-1) , on an Origin3000 from 1 up to 64 CPUs. On our Linux cluster (Gbit ethernet, 8x2 CPUs) however, we get a scaling that is well represented by t_step ~ Ncpu^(-0.75) . More or less the same scaling is obtained on another machine (100Mbit, 128 nodes), and also for another hydro code (parallelized using Cactus). Note that the number of grid points was adapted for these timings, so that the problem size per CPU is roughly constant. My question is: do others find the same type of scaling for hydro codes? If so, how can this be understood? I don't expect latency to play a role for these timings, as we are only communicating a reasonably low number of large arrays in every time step; I suppose, Cactus does the same. And if saturation of the switch played a role, I would expect a well-defined drop at some critical value of Ncpu, not a power law. 
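One way to make the two scalings comparable, assuming the plotted quantity is wall-clock time per step divided by the total number of grid points, and that the points per CPU are held fixed as described:

\[
\tau(P) = \frac{T_{\mathrm{step}}(P)}{N_{\mathrm{grid}}(P)}, \qquad N_{\mathrm{grid}} \propto P
\quad\Longrightarrow\quad \tau \propto P^{-1} \ \text{exactly when } T_{\mathrm{step}} \text{ is constant,}
\]
\[
\tau \propto P^{-0.75}
\quad\Longrightarrow\quad T_{\mathrm{step}} \propto P^{0.25},
\qquad T_{\mathrm{step}} = t_{\mathrm{comp}} + t_{\mathrm{comm}}(P), \quad t_{\mathrm{comp}} \approx \text{const.}
\]

On this reading the Origin result is essentially ideal weak scaling, while on the clusters the communication term grows slowly but steadily, roughly as the fourth root of Ncpu. A smoothly growing term like that looks more like rising contention or a per-step collective (a global reduction for the time step, say, if the code does one) than like outright switch saturation, which would indeed appear as a knee rather than a power law; a profiler would separate the two.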
W o l f g a n g -- ------------------------------------------------------------------------- | Wolfgang Dobler Phone: ++49/(0)761/3198-224 | | Kiepenheuer Institute for Solar Physics Fax: ++49/(0)761/3198-111 | | Sch?neckstra?e 6 | | D-79104 Freiburg E-Mail: Dobler at kis.uni-freiburg.de | | Germany http://www.kis.uni-freiburg.de/~dobler/ | ------------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mail_for_anand at yahoo.com Thu Apr 10 01:12:48 2003 From: mail_for_anand at yahoo.com (anand bagchi) Date: Wed, 9 Apr 2003 22:12:48 -0700 (PDT) Subject: help!!!!!!!!!-suggest a parallel program to be run on a beowulf cluster Message-ID: <20030410051248.81786.qmail@web21508.mail.yahoo.com> hi all , i am working on a beowulf cluster as a part of my undergraduate training and need to run a program on it . Could anybody suggest a parallel program that can be implemented or suggest a book or a website from where i can get some help . anand(India) __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Thu Apr 10 08:54:43 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Thu, 10 Apr 2003 15:54:43 +0300 Subject: Scaling of hydro codes In-Reply-To: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> References: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> Message-ID: <200304101554.43012.exa@kablonet.com.tr> On Thursday 10 April 2003 11:47, Wolfgang Dobler wrote: > My question is: do others find the same type of scaling for hydro codes? > If so, how can this be understood? Those are quite different architectures, that's why. Same parallel algorithm will show different performance on such different architectures. Your beowulf is a cluster of SMP nodes, does your algorithm take that into account? I think it probably doesn't. What exactly is the topology and architecture of the network on Origin3000? How fast are the nodes (cpu/mem bandwidth), and how much memory does it have? Same goes for the beowulf cluster. By scaling I take it that you increase problem size as well as number of processors. If you don't increase problem size it's called speedup. A scalability plot together with a speedup plot can say more about your problem. Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 10 11:54:57 2003 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Thu, 10 Apr 2003 11:54:57 -0400 (EDT) Subject: help!!!!!!!!!-suggest a parallel program to be run on a beowulf cluster In-Reply-To: <20030410051248.81786.qmail@web21508.mail.yahoo.com> Message-ID: On Wed, 9 Apr 2003, anand bagchi wrote: > hi all , > i am working on a beowulf cluster as a > part of my undergraduate training and need to run a > program on it . Could anybody suggest a parallel > program that can be implemented or suggest a book or a > website from where i can get some help . The two most common demo programs to my experience are pvmpov (povray parallelized on PVM) and one of several parallelized Mandelbrot set generators, under either PVM or MPI. There are also mini-demo's and example programs in the distributions themselves or on their primary web homes, although they tend to be less graphical. The nice thing about either of these is that parallel speedup is clearly evident on almost any network, and that the speedup is beautifully (literally) rendered on the screen. One can rubberband one's way down into the visually stunning mandelbrot set and "see" individual patches of the new image being returned by the nodes, ditto for the rendering of pvmpov's standard pitcher/picture. If you use e.g. xpvm to add or remove nodes to your cluster, you can watch the computation speed up or slow down. Beyond these, there are of course many other resources you can use to write demos of your own or adopt demos from code in books. There are nice books on both PVM and MPI from e.g. MIT press that you can probably order via Amazon from anywhere in the world. A perusal of the mpich and pvm websites will turn up lots of useful things. I'm sure others on the list will return other specific reference programs and resources, for example parallelized linpack computations and the like, that are sometimes used to "benchmark" a cluster. rgb > > anand(India) > > __________________________________________________ > Do you Yahoo!? > Yahoo! Tax Center - File online, calculators, forms, and more > http://tax.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Thu Apr 10 12:41:47 2003 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Thu, 10 Apr 2003 18:41:47 +0200 Subject: Scaling of hydro codes In-Reply-To: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> References: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> Message-ID: <200304101841.47669.joachim@ccrl-nece.de> Wolfgang Dobler: > I don't expect latency to play a role for these timings, as we are only > communicating a reasonably low number of large arrays in every time step; > I suppose, Cactus does the same. > And if saturation of the switch played a role, I would expect a > well-defined drop at some critical value of Ncpu, not a power law. I'd say that it's just your network which is to slow (Gbit ethernet is not necessarily fast!) in relation to the speed of the CPUs. 
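To make the Mandelbrot demo suggestion above concrete, here is a minimal static row-farm in C with MPI. It has none of the interactive rubber-banding of the real demos; each rank renders one horizontal band of a fixed view and rank 0 writes a greyscale PGM. The image height is assumed to divide evenly by the process count, and a nicer demo would hand rows out dynamically from a master so the slow rows near the set do not leave the other ranks idle.

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    #define W     800
    #define H     800            /* image height; assumed divisible by the number of processes */
    #define MAXIT 255

    static unsigned char escape(double cr, double ci)
    {
        double zr = 0.0, zi = 0.0, t;
        int it = 0;
        while (zr * zr + zi * zi < 4.0 && it < MAXIT) {
            t  = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            it++;
        }
        return (unsigned char)it;
    }

    int main(int argc, char **argv)
    {
        int rank, size, rows, x, y, gy;
        double cr, ci;
        unsigned char *band, *image = NULL;
        FILE *f;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        rows = H / size;                          /* each rank renders one horizontal band */
        band = malloc((size_t)rows * W);

        for (y = 0; y < rows; y++) {
            gy = rank * rows + y;                 /* global row index of this local row */
            for (x = 0; x < W; x++) {
                cr = -2.0 + 3.0 * x  / W;         /* map pixel to the complex plane */
                ci = -1.5 + 3.0 * gy / H;
                band[y * W + x] = escape(cr, ci);
            }
        }

        if (rank == 0)
            image = malloc((size_t)W * H);
        MPI_Gather(band, rows * W, MPI_UNSIGNED_CHAR,
                   image, rows * W, MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);

        if (rank == 0) {                          /* write a plain greyscale PGM */
            f = fopen("mandel.pgm", "wb");
            fprintf(f, "P5\n%d %d\n255\n", W, H);
            fwrite(image, 1, (size_t)W * H, f);
            fclose(f);
        }

        MPI_Finalize();
        return 0;
    }

Something like "mpicc mandel.c -o mandel && mpirun -np 8 ./mandel" produces mandel.pgm, which any image viewer can display.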
Without knowing your code, I guess that with increasing Ncpu, the number of communication operations and the transported volume of data increases, too. This leads to increased communication time, while the time that each CPU needs to run through its timestep remains constant (as you adapted the problem size ~ Ncpu). But wait, if you keep the workload per CPU constant with increasing Ncpu, how comes that t_step scales with 1/Ncpu at all? Am I missing something here? Anyway, you should check if a faster network could help you (by verifying if the reason I suspected is valid). You might do this with MPE or Vampir (commercial tool from Pallas, demo licenses available), or some other way of profiling. Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sgaudet at wildopensource.com Thu Apr 10 13:04:04 2003 From: sgaudet at wildopensource.com (Stephen Gaudet) Date: Thu, 10 Apr 2003 13:04:04 -0400 Subject: Itanium gets supercomputing software Message-ID: http://msnbc-cnet.com.com/2100-1012-996357.html?type=pt&part=msnbc&tag=alert &form=feed&subj=cnetnews Stephen Gaudet ..... <(???)> ---------------------- Wild Open Source Bedford, NH 03110 pH: 603-488-1599 cell: 603-498-1600 Home: 603-472-8040 http://www.wildopensource.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dsarvis at zcorum.com Thu Apr 10 12:32:49 2003 From: dsarvis at zcorum.com (Dennis Sarvis, II) Date: 10 Apr 2003 12:32:49 -0400 Subject: problem with load balancing Message-ID: <1049992368.16688.4.camel@skull.america.net> I tried implementing a 2 node cluster (both redhat, 1 a PII400 and 1 a Celeron550) with a cross-over cable I built. I tried implementing an open-mosix kernel and they talk to each other, I can manually migrate processes in x-windows, but they will not auto share. I also tried PVM but it just freezes up. I wanted to try mpi-ch but I need some guidance. I did 'try' to turn on RSH, but I may not have done it correctly. -- Alpharetta, GA Dennis Sarvis, II _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jakob at unthought.net Thu Apr 10 13:52:32 2003 From: jakob at unthought.net (Jakob Oestergaard) Date: Thu, 10 Apr 2003 19:52:32 +0200 Subject: renting time on a cluster In-Reply-To: <20030407125604.GR2067@leitl.org> References: <20030407125604.GR2067@leitl.org> Message-ID: <20030410175232.GB16320@unthought.net> On Mon, Apr 07, 2003 at 02:56:05PM +0200, Eugen Leitl wrote: > A friend of mine has a project requiring a lot > of crunch (no idea yet which bandwidth/latency > requirements). > > Can you think of places where one can rent nontrivial > amount of crunch for money? I know people who would be interested in providing such a service. As in, getting a cluster and start renting out time on it. (and no, I'm not affiliated with them, I just know them well :) So, while I don't have a real answer to your question, allow me to add yet another question: Is there interest in such a service? 
How many here would, or know people who might, rent time on a remote cluster ? I personally think that security concerns is the main showstopper here - you often cannot really do paid research on such a system, if the results are supposed to help getting patents etc. Larger organizations would rather buy their own cluster, than risk losing a patent to the competition. And for hobbyists? I guess most hobbyists can sneak in low-priority jobs at work :) So, is the original question a once-in-a-decade thing, or do people generally feel that there is interest in such a service? (haven't seen many of those requests on this list, AFAIR) Cheers, -- ................................................................ : jakob at unthought.net : And I see the elder races, : :.........................: putrid forms of man : : Jakob ?stergaard : See him rise and claim the earth, : : OZ9ABN : his downfall is at hand. : :.........................:............{Konkhra}...............: _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From craig.tierney at noaa.gov Thu Apr 10 11:42:10 2003 From: craig.tierney at noaa.gov (Craig Tierney) Date: Thu, 10 Apr 2003 09:42:10 -0600 Subject: Scaling of hydro codes In-Reply-To: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> References: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> Message-ID: <20030410154210.GA24214@hpti.com> On Thu, Apr 10, 2003 at 10:47:04AM +0200, Wolfgang Dobler wrote: > We have a 3-d finite-difference hydro code and find that the time per time > step and grid point scales almost linearly, > t_step ~ Ncpu^(-1) , > on an Origin3000 from 1 up to 64 CPUs. > > On our Linux cluster (Gbit ethernet, 8x2 CPUs) however, we get a scaling > that is well represented by > t_step ~ Ncpu^(-0.75) . > More or less the same scaling is obtained on another machine (100Mbit, 128 > nodes), and also for another hydro code (parallelized using Cactus). > Note that the number of grid points was adapted for these timings, so that > the problem size per CPU is roughly constant. Did determine this number scaling from 1 to 16 cpus, or from 2 to 16 cpus? You aren't going to get good scaling from 1 to 2 because lack of memory bandwidth (this is usually the case). Scale from 1 to 8 nodes (2 to 16 processors) to see how the code scales due to the interconnect. Craig > > My question is: do others find the same type of scaling for hydro codes? > If so, how can this be understood? > > I don't expect latency to play a role for these timings, as we are only > communicating a reasonably low number of large arrays in every time step; > I suppose, Cactus does the same. > And if saturation of the switch played a role, I would expect a > well-defined drop at some critical value of Ncpu, not a power law. 
> > > W o l f g a n g > > -- > > ------------------------------------------------------------------------- > | Wolfgang Dobler Phone: ++49/(0)761/3198-224 | > | Kiepenheuer Institute for Solar Physics Fax: ++49/(0)761/3198-111 | > | Sch?neckstra?e 6 | > | D-79104 Freiburg E-Mail: Dobler at kis.uni-freiburg.de | > | Germany http://www.kis.uni-freiburg.de/~dobler/ | > ------------------------------------------------------------------------- > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Craig Tierney (ctierney at hpti.com) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From keith.murphy at attglobal.net Thu Apr 10 14:23:23 2003 From: keith.murphy at attglobal.net (Keith Murphy) Date: Thu, 10 Apr 2003 11:23:23 -0700 Subject: renting time on a cluster References: <20030407125604.GR2067@leitl.org> <20030410175232.GB16320@unthought.net> Message-ID: <060401c2ff8e$45f4df70$02fea8c0@oemcomputer> There is a company in Florida Tsunamic Technologies, who already offers such a service. No, they are not a customer or even a friend http://www.tsunamictechnologies.com/ Regards Keith Murphy Dolphin Interconnect T: 818-597-2114 F: 818-597-2119 C: 818-292-5100 www.dolphinics.com www.scali.com ----- Original Message ----- From: "Jakob Oestergaard" To: "Eugen Leitl" Cc: Sent: Thursday, April 10, 2003 10:52 AM Subject: Re: renting time on a cluster > On Mon, Apr 07, 2003 at 02:56:05PM +0200, Eugen Leitl wrote: > > A friend of mine has a project requiring a lot > > of crunch (no idea yet which bandwidth/latency > > requirements). > > > > Can you think of places where one can rent nontrivial > > amount of crunch for money? > > I know people who would be interested in providing such a service. As > in, getting a cluster and start renting out time on it. > > (and no, I'm not affiliated with them, I just know them well :) > > So, while I don't have a real answer to your question, allow me to add > yet another question: Is there interest in such a service? > > How many here would, or know people who might, rent time on a remote > cluster ? > > I personally think that security concerns is the main showstopper here - > you often cannot really do paid research on such a system, if the > results are supposed to help getting patents etc. Larger organizations > would rather buy their own cluster, than risk losing a patent to the > competition. And for hobbyists? I guess most hobbyists can sneak in > low-priority jobs at work :) > > So, is the original question a once-in-a-decade thing, or do people > generally feel that there is interest in such a service? > > (haven't seen many of those requests on this list, AFAIR) > > Cheers, > > -- > ................................................................ > : jakob at unthought.net : And I see the elder races, : > :.........................: putrid forms of man : > : Jakob ?stergaard : See him rise and claim the earth, : > : OZ9ABN : his downfall is at hand. 
: > :.........................:............{Konkhra}...............: > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From iod00d at hp.com Thu Apr 10 14:26:09 2003 From: iod00d at hp.com (Grant Grundler) Date: Thu, 10 Apr 2003 11:26:09 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: References: Message-ID: <20030410182609.GF29125@cup.hp.com> On Thu, Apr 10, 2003 at 01:04:04PM -0400, Stephen Gaudet wrote: > > http://msnbc-cnet.com.com/2100-1012-996357.html?type=pt&part=msnbc&tag=alert > &form=feed&subj=cnetnews ... | That barrier has hindered adoption of Itanium in broad business markets, | but it's been less of a problem in the supercomputing niche, where | customers often control their own software instead of relying on products | such as Oracle's database or Computer Associates' management software. Gah! Both Oracle and Computer Associates have ia64-linux product available. grant _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 10 15:57:46 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 10 Apr 2003 15:57:46 -0400 (EDT) Subject: renting time on a cluster In-Reply-To: <20030410175232.GB16320@unthought.net> Message-ID: On Thu, 10 Apr 2003, Jakob Oestergaard wrote: > > So, is the original question a once-in-a-decade thing, or do people > generally feel that there is interest in such a service? > > (haven't seen many of those requests on this list, AFAIR) The problem is that there is a fairly narrow profile of problems for which such a service is optimal; in "most cases" the cost benefit of doing it yourself or in your existing IT organization are superior, as you have to pay any such service provider the real costs plus depreciation plus a profit; in your own organization some parts of these costs are low marginal cost rescalings of existing infrastructure or opportunity cost time paid out of a pool of low priority competing tasks or FTE surplus hours (i.e. free). The same problem exists, actually, for "centralized" shared compute resources at universities or supercomputer centers -- for these to be a cost win they generally need a pool of clients that is: a) Big enough to keep their cluster operating close to capacity all the time, since the only way to be the fixed costs of dead time is to amortize it over active time, raising rates and starting a deadly spiral of still fewer clients. b) With demand that can be spread out to keep the duty cycle high a la a). It does no good to have one cluster-year's worth of tasks for your cluster if all your clients insist on having their work done in the same three month time of the year -- you'll have to have a cluster 3x bigger (and idle 3/4 of the year) or lose 2/3 of your clients and STILL be idle 3/4 of the year. Oooo, hate to even do the math on that one. c) Poor enough in local computing resources that a locally purchased and administered cluster doesn't make more sense. 
d) Almost by definition, with a problem that needs only a short, intense burst of computation. People with longrunning problems tend NOT to use this sort of resource because they almost always are better off with their own cluster. It's people who need a 128 node cluster for a month who can't make do with a 16 node cluster for a year that will be your primary clients (along with a FEW of those local-resource poor groups -- this is an important client base of a shared resource in a University, for example). A good sized campus is likely to have enough of a mix where a centralized cluster can make sense, especially one that is "owned" by the primary groups that operate it who effectively subscribe most of its time with a clear understanding of how it is to be split up among long runners and on demanders. A commercial cluster is pretty tough. I think you'd need a bunch of long term "subscribers" there as well, contractually bound for periods on the order of a year, to keep risks sane and costs reasonably competitive with DIY. If you had some sort of auction/market model whereby you could resell idle time at or even below cost to keep from losing money actively while charging a lot more than cost to the on demand short term users (who would pay it as it is still cheaper than building their own) you might work out a stable and profitable business. You'd also do better reselling to businesses than to university or government researchers. We're notoriously cheap and like to DIY anyway. Small businesses especially often have significant infrastructure barriers that would make purchasing rented time desirable in at least the short run, IF you could identify the small businesses that need it... rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Thu Apr 10 19:58:08 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Fri, 11 Apr 2003 09:58:08 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <16022.194.793900.97453@napali.hpl.hp.com> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> Message-ID: <3E960510.6070503@octopus.com.au> David Mosberger wrote: > Remember that Intel is targeting Itanium 2 against Power4 and SPARC. > In that space, the price of Itanium 2 is very competitive. OK, I want to be clear on this. I asked why Itanium hardware is still so expensive. Your answer seems to be marketing speak for "The prices are still high because we are _happy_ selling small quantities of this equipment to people used to paying through the nose for good quality hardware." Is this correct? Can I then conclude that Intel has not yet had any interest whatsoever in driving IA64 into the realm of reasonble prices? It's sad to see so much work being put into this Linux port when, if things remain as they are, it will hardly be used. > Duraid> Seriously, IA64 must be the first architecture in history > Duraid> where a software simulator is still being developed 4 years > Duraid> after commercial availability of silicon (indeed, entire > Duraid> systems). > > What's a software simulator got to do with anything? 
Certain things > are easier to develop on a simulator, others are easier to develop on > hardware. Nothing unique to IA64. I put it to you that software is easier to develop on hardware. Nothing unique to IA64, indeed. Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Thu Apr 10 16:55:30 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Fri, 11 Apr 2003 06:55:30 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <20030410182609.GF29125@cup.hp.com> References: <20030410182609.GF29125@cup.hp.com> Message-ID: <3E95DA42.7000607@octopus.com.au> You and I both know the only real barrier to Itanium adoption is the price. Can anyone here shed some light on this? Why is Itanium hardware still so expensive? Seriously, IA64 must be the first architecture in history where a software simulator is still being developed 4 years after commercial availability of silicon (indeed, entire systems). Hello? Is anyone home? If Intel thinks an 0.13u respin of Itanium 2 going for $1000 a pop is going to save them from the horrible onslaught of horrible hardware (x86-64 ;) it'd seem they have another thing coming! We live in Carly times. :\ Duraid Grant Grundler wrote: > Gah! > Both Oracle and Computer Associates have ia64-linux product available. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From randolph at tausq.org Thu Apr 10 19:56:37 2003 From: randolph at tausq.org (Randolph Chung) Date: Thu, 10 Apr 2003 16:56:37 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <16022.194.793900.97453@napali.hpl.hp.com> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> Message-ID: <20030410235636.GO12993@tausq.org> > What's a software simulator got to do with anything? Certain things > are easier to develop on a simulator, others are easier to develop on > hardware. Nothing unique to IA64. hear hear... i might have access to a bunch of parisc hardware, but i would love to get my hands on a good parisc simulator. i setup the ia64 simulator to play with kernel modules support.. but now that david got it working... :-) randolph -- Randolph Chung Debian GNU/Linux Developer, hppa/ia64 ports http://www.tausq.org/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From davidm at napali.hpl.hp.com Thu Apr 10 19:39:46 2003 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Thu, 10 Apr 2003 16:39:46 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E95DA42.7000607@octopus.com.au> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> Message-ID: <16022.194.793900.97453@napali.hpl.hp.com> >>>>> On Fri, 11 Apr 2003 06:55:30 +1000, Duraid Madina said: Duraid> You and I both know the only real barrier to Itanium Duraid> adoption is the price. Can anyone here shed some light on Duraid> this? Why is Itanium hardware still so expensive? Remember that Intel is targeting Itanium 2 against Power4 and SPARC. 
In that space, the price of Itanium 2 is very competitive. Duraid> Seriously, IA64 must be the first architecture in history Duraid> where a software simulator is still being developed 4 years Duraid> after commercial availability of silicon (indeed, entire Duraid> systems). What's a software simulator got to do with anything? Certain things are easier to develop on a simulator, others are easier to develop on hardware. Nothing unique to IA64. --david _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bob at drzyzgula.org Thu Apr 10 21:51:39 2003 From: bob at drzyzgula.org (Bob Drzyzgula) Date: Thu, 10 Apr 2003 21:51:39 -0400 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E960510.6070503@octopus.com.au> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> Message-ID: <20030410215139.O3614@www2> On Fri, Apr 11, 2003 at 09:58:08AM +1000, Duraid Madina wrote: > > David Mosberger wrote: > >Remember that Intel is targeting Itanium 2 against Power4 and SPARC. > >In that space, the price of Itanium 2 is very competitive. > > OK, I want to be clear on this. I asked why Itanium hardware is still so > expensive. Your answer seems to be marketing speak for "The prices are > still high because we are _happy_ selling small quantities of this > equipment to people used to paying through the nose for good quality > hardware." Is this correct? I'm not sure that it works this way. I think it's more like "We are making the best processor we know (or, perhaps, "knew", or "thought we knew", or even "allowed ourselves to know") how to make that will/would/might in our dreams be profitable to sell at this high price in moderate quantities." I expect that if they could sell one hundred times as many Itaniums at a tenth the price, they would ramp up the fabs and do it. But then you get into the chicken-or-egg problem: There's no software, and hence no demand, and hence no software, and hence no demand, that would justify the production of a hundred times as many Itaniums. > Can I then conclude that Intel has not yet had any interest whatsoever > in driving IA64 into the realm of reasonble prices? It's sad to see so > much work being put into this Linux port when, if things remain as they > are, it will hardly be used. Be careful that you put the horse before the cart. Might it not be that the people doing this work are wagering that it will ultimately cause demand for the Itanium to increase? Could it really be expected that demand for Itanium *would* materialize without such investment in software happening first? In any event, virtually nothing remains as it is. 
--Bob Drzyzgula _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From matthewc at cse.unsw.edu.au Thu Apr 10 22:20:46 2003 From: matthewc at cse.unsw.edu.au (Matt Chapman) Date: Fri, 11 Apr 2003 12:20:46 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E960510.6070503@octopus.com.au> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> Message-ID: <20030411022046.GA22381@cse.unsw.edu.au> On Fri, Apr 11, 2003 at 09:58:08AM +1000, Duraid Madina wrote: > > Can I then conclude that Intel has not yet had any interest whatsoever > in driving IA64 into the realm of reasonble prices? My understanding is that Deerfield will be targeted at the lower-cost market, though I haven't seen much info about it recently. > I put it to you that software is easier to develop on hardware. Nothing > unique to IA64, indeed. We still use simulators despite the availability of hardware. Operating system software is often easier to debug on a simulator. Matt _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rochus.schmid at ch.tum.de Fri Apr 11 06:34:08 2003 From: rochus.schmid at ch.tum.de (rochus.schmid at ch.tum.de) Date: Fri, 11 Apr 2003 12:34:08 +0200 (CEST) Subject: SMC8624T vs DLINK DGC-1024T / Jumbo Frames ? In-Reply-To: <5.1.1.6.0.20030401141622.033c1e20@crux.stmarys.ca> References: <5.1.1.6.0.20030401141622.033c1e20@crux.stmarys.ca> Message-ID: <1050057248.3e969a20125dd@intranet.anorg.chemie.tu-muenchen.de> Dear Beowulfers, we are in a similar situation to Dave's: we are getting an 8-node dual-Xeon cluster (with the Tyan E7501 motherboard) with Intel GigE on board, and now the "switch issue" comes up. My vendor also suggested the DLINK, whereas I found the discussion on this list about the (more expensive, managed) SMC supporting jumbo frames. The issue was whether or not any of the cheaper (unmanaged) switches support jumbo frames. I couldn't figure out whether this has been resolved yet. It sounded like they might, but since they are unmanaged the problem is how to switch it on or off. Is that right? I also found this document: http://www.scl.ameslab.gov/Publications/HalsteadPubs/usenix_halstead.pdf It says that the effect of jumbo frames on bandwidth is only seen for TCP/IP communication (NetPIPE) but is completely lost using MPI. Since my code is MPI based, it wouldn't matter to have jumbo frames and I could go with the cheaper DLINK. Is this info right? Or outdated? Misunderstood? Any hints highly appreciated. Greetings, Rochus Quoting Dave Lane : > Can anyone comment on the strengths/weaknesses of these two 24-port > gigabit > switches. We're going to be building a 16 node dual-Xeon cluster this > spring and were planning on the SMC switch (which has received good > review > here before), but a vendor pointed out the DLINK switch as a less > expensive > alternative. > > ...
Dave > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From virtualsuresh at yahoo.co.in Fri Apr 11 06:49:48 2003 From: virtualsuresh at yahoo.co.in (=?iso-8859-1?q?suresh=20chandra?=) Date: Fri, 11 Apr 2003 11:49:48 +0100 (BST) Subject: remote booting Message-ID: <20030411104948.94018.qmail@web8107.mail.in.yahoo.com> Hi, I am building a 2-node cluster as a practice for building a 16-node cluster in University. I want to remote boot for client (Diskless), I found PXELINUX should be flashed or burned into a PROM on the network card. Is there any other way for remote booting by using a Floppy disk (which in turn invoke my NIC for remote booting), I have less time to get a PROM for my network card. I am going to use OpenMosix. Thanks in Advance. Regards, Suresh Chandra Mannava, India. ===== ________________________________________________________________________ Missed your favourite TV serial last night? Try the new, Yahoo! TV. visit http://in.tv.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Andrew.Cannon at nnc.co.uk Fri Apr 11 08:20:33 2003 From: Andrew.Cannon at nnc.co.uk (Cannon, Andrew) Date: Fri, 11 Apr 2003 13:20:33 +0100 Subject: PVM and MPI differences? Message-ID: Hi All, I've recently set up a Monte Carlo compute cluster of 4 computers (RH8) running pvm. I have heard about MPI and I was wondering what the differences between mpi and pvm are? Regards Andrew Andrew Cannon, Nuclear Technology (J2), NNC Ltd, Booths Hall, Knutsford, Cheshire, WA16 8QZ. Telephone; +44 (0) 1565 843768 email: mailto:andrew.cannon at nnc.co.uk NNC website: http://www.nnc.co.uk *********************************************************************************** NNC Limited Booths Hall Chelford Road Knutsford Cheshire WA16 8QZ Country of Registration: United Kingdom Registered Number: 1120437 This e-mail and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in error please notify the NNC system manager by e-mail at eadm at nnc.co.uk. *********************************************************************************** _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From seth at hogg.org Fri Apr 11 04:39:12 2003 From: seth at hogg.org (Simon Hogg) Date: Fri, 11 Apr 2003 09:39:12 +0100 Subject: IA-64 related question (tangentially) Message-ID: <4.3.2.7.2.20030411093017.00c3cc70@pop.freeuk.net> Don't everybody get excited straight away - this needs to be approved by a few people first :-) Suppose 'a friend' had some Itanium hardware to be donated to a 'good cause' - what's the best way of going about it? Should it go to FSF / Gnu / Debian projects to further their development (in general software terms)? Or, should it go to a local university in support of a specific 'end-user' / project (maybe beowulf related) role. What are the pros and cons of each route (for the receiver)? I am more tempted by the FSF / Debian donation, but maybe there are other benefits. Simon p.s. 
The 'gift' won't be on the order of the 300+ nodes at OSC! _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From adriano at satec.es Fri Apr 11 04:37:39 2003 From: adriano at satec.es (Adriano Galano) Date: Fri, 11 Apr 2003 10:37:39 +0200 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <16022.194.793900.97453@napali.hpl.hp.com> Message-ID: <003601c30005$9d740a10$a620a4d5@tsatec.int> > >>>>> On Fri, 11 Apr 2003 06:55:30 +1000, Duraid Madina > said: > > Duraid> You and I both know the only real barrier to Itanium > Duraid> adoption is the price. Can anyone here shed some light on > Duraid> this? Why is Itanium hardware still so expensive? > > Remember that Intel is targeting Itanium 2 against Power4 and SPARC. > In that space, the price of Itanium 2 is very competitive. > What does "very competitive" mean? How does it compare with Power*, for example? > Duraid> Seriously, IA64 must be the first architecture in history > Duraid> where a software simulator is still being developed 4 years > Duraid> after commercial availability of silicon (indeed, entire > Duraid> systems). > > What's a software simulator got to do with anything? Certain things > are easier to develop on a simulator, others are easier to develop on > hardware. Nothing unique to IA64. > AMD's Opteron is still in a simulator, too... Regards, --Adriano _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mtina at tahoe.com Fri Apr 11 10:18:25 2003 From: mtina at tahoe.com (Mohammad Tina) Date: 11 Apr 2003 15:18:25 +0100 Subject: new to linux clustering Message-ID: <57d7501c30035$3721bb10$4701020a@corp.load.com> Hi, I am new to Linux clustering and I am planning to install a cluster on 3 machines (Red Hat 7). I was reading about clustering and found many packages. Can anyone recommend a package for me? Thanks ================================================================== Get Your Free Web-Based Email at http://www.tahoe.com! ================================================================== _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From yudong at lshp.gsfc.nasa.gov Fri Apr 11 11:00:12 2003 From: yudong at lshp.gsfc.nasa.gov (Yudong Tian) Date: Fri, 11 Apr 2003 11:00:12 -0400 Subject: remote booting In-Reply-To: <20030411104948.94018.qmail@web8107.mail.in.yahoo.com> Message-ID: Please check whether your NIC supports PXE boot. If it does, then you can boot over the network without using a floppy. If it does not, you might need to use syslinux on a floppy. I did a network boot and installation before, and here you can find the steps I took: http://lis.gsfc.nasa.gov/yudong/notes/net-install.txt ------------------------------------------------------------ Falun Dafa: The Tao of Meditation (http://www.falundafa.org) ------------------------------------------------------------ Yudong Tian, Ph.D.
NASA/GSFC (301) 286-2275 > -----Original Message----- > From: beowulf-admin at beowulf.org [mailto:beowulf-admin at beowulf.org]On > Behalf Of suresh chandra > Sent: Friday, April 11, 2003 6:50 AM > To: Beowulf at beowulf.org > Subject: remote booting > > > Hi, > I am building a 2-node cluster as a practice for > building a 16-node cluster in University. > I want to remote boot for client (Diskless), I found > PXELINUX should be flashed or burned into a PROM on > the network card. > Is there any other way for remote booting by using a > Floppy disk (which in turn invoke my NIC for remote > booting), I have less time to get a PROM for my > network card. > > I am going to use OpenMosix. > Thanks in Advance. > > Regards, > Suresh Chandra Mannava, India. > > > ===== > > > ________________________________________________________________________ > Missed your favourite TV serial last night? Try the new, Yahoo! TV. > visit http://in.tv.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 11 10:27:31 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 11 Apr 2003 10:27:31 -0400 (EDT) Subject: IA-64 related question (tangentially) In-Reply-To: <4.3.2.7.2.20030411093017.00c3cc70@pop.freeuk.net> Message-ID: On Fri, 11 Apr 2003, Simon Hogg wrote: > Don't everybody get excited straight away - this needs to be approved by a > few people first :-) > > Suppose 'a friend' had some Itanium hardware to be donated to a 'good > cause' - what's the best way of going about it? Should it go to FSF / Gnu > / Debian projects to further their development (in general software terms)? > > Or, should it go to a local university in support of a specific 'end-user' > / project (maybe beowulf related) role. > > What are the pros and cons of each route (for the receiver)? I am more > tempted by the FSF / Debian donation, but maybe there are other benefits. > > Simon > p.s. The 'gift' won't be in the order of the 300+ nodes at OSC! Goodness! I just HAVE to take a stab at answering this one (I answer everything else, after all...:-) It's perfectly clear that the best way is to donate it to a University, in fact, more specifically, to the Duke University Physics Department. Indeed, most specifically of all, to the group of Brown and Ciftan in the Duke University Physics Department, to be used in Monte Carlo computations in O(3) Symmetric critical systems and a new Multiple Scattering Band Theory project just getting underway. FSF or Debian don't need Itaniums, really, except for maybe one or two to ensure that builds work on the architecture. They don't "compute". I do. To me a cycle is a precious thing as I use so MANY of them over the years. Selflessly yours (just trying to make sure you Do The Right Thing...:-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gmpc at sanger.ac.uk Fri Apr 11 10:53:09 2003 From: gmpc at sanger.ac.uk (Guy Coates) Date: Fri, 11 Apr 2003 15:53:09 +0100 (BST) Subject: remote booting In-Reply-To: <200304111422.h3BEMUs01923@NewBlue.Scyld.com> References: <200304111422.h3BEMUs01923@NewBlue.Scyld.com> Message-ID: >Is there any other way for remote booting by using a >Floppy disk (which in turn invoke my NIC for remote >booting) Yup, take a look at the etherboot project http://etherboot.sourceforge.net/ which does exactly this for a wide range of ethernet hardware. There is even a nice webpage which will build your boot image for you: http://www.rom-o-matic.net/ Cheers, Guy Coates -- Guy Coates, Informatics System Group The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK Tel: +44 (0)1223 834244 ex 7199 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From david at virtutech.se Fri Apr 11 03:51:36 2003 From: david at virtutech.se (David =?iso-8859-1?q?K=E5gedal?=) Date: Fri, 11 Apr 2003 09:51:36 +0200 Subject: Itanium gets supercomputing software In-Reply-To: <20030411022046.GA22381@cse.unsw.edu.au> (Matt Chapman's message of "Fri, 11 Apr 2003 12:20:46 +1000") References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> Message-ID: Matt Chapman writes: > On Fri, Apr 11, 2003 at 09:58:08AM +1000, Duraid Madina wrote: >> >> I put it to you that software is easier to develop on hardware. Nothing >> unique to IA64, indeed. > > We still use simulators despite the availability of hardware. Operating > system software is often easier to debug on a simulator. Exactly. There are a lot of things that you can do with a simulator that you can't do with hardware. Developing software before hardware is available is just one of them. (plug mode on) That's why we sell simulators for most major current CPU architectures. Including IA64. -- David K?gedal, Virtutech http://www.simics.com/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From seth at hogg.org Fri Apr 11 11:07:33 2003 From: seth at hogg.org (Simon Hogg) Date: Fri, 11 Apr 2003 16:07:33 +0100 Subject: IA-64 related question (tangentially) Message-ID: <4.3.2.7.2.20030411160726.00c3d700@pop.freeuk.net> At 10:27 11/04/03 -0400, you wrote: >On Fri, 11 Apr 2003, Simon Hogg wrote: > > Suppose 'a friend' had some Itanium hardware to be donated to a 'good > > cause' - what's the best way of going about it? Should it go to FSF / Gnu > > / Debian projects to further their development (in general software terms)? > > > > Or, should it go to a local university in support of a specific 'end-user' > > / project (maybe beowulf related) role. > >Goodness! 
I just HAVE to take a stab at answering this one (I answer >everything else, after all...:-) > >It's perfectly clear that the best way is to donate it to a >University, in fact, more specifically, to the Duke University Physics >Department. Indeed, most specifically of all, to the group of Brown and >Ciftan in the Duke University Physics Department, to be used in Monte >Carlo computations in O(3) Symmetric critical systems and a new Multiple >Scattering Band Theory project just getting underway. > >FSF or Debian don't need Itaniums, really, except for maybe one or two >to ensure that builds work on the architecture. They don't "compute". >I do. To me a cycle is a precious thing as I use so MANY of them over >the years. > >Selflessly yours (just trying to make sure you Do The Right Thing...:-) Well, how surprised was I by this answer? :-) Thinking about this a little bit more, would 'you' be happier getting the cash equivalent (with strings attached as to what you could buy). Is this more tax-efficient? Or is it better to 'loan' you the equipment which then depreciates over two or three years, then I donate it to you for zero cost? Simon _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Fri Apr 11 11:16:43 2003 From: raysonlogin at yahoo.com (Rayson Ho) Date: Fri, 11 Apr 2003 08:16:43 -0700 (PDT) Subject: new to linux clustering In-Reply-To: <57d7501c30035$3721bb10$4701020a@corp.load.com> Message-ID: <20030411151643.92394.qmail@web11405.mail.yahoo.com> Tell us what you want to run first... Rayson --- Mohammad Tina wrote: > Hi, > i am new to linux clustering, i am planning to install cluster on 3 > machines (redhat 7). > i was reading about clustering and i found many packages. > can anyone recommend a package for me?? > > Thanks > > > > > ================================================================== > Get Your Free Web-Based Email at http://www.tahoe.com! > ================================================================== > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From chris_oubre at hotmail.com Fri Apr 11 12:29:49 2003 From: chris_oubre at hotmail.com (Chris Oubre) Date: Fri, 11 Apr 2003 11:29:49 -0500 Subject: new to linux clustering In-Reply-To: <200304111420.h3BEKos01501@NewBlue.Scyld.com> Message-ID: <000e01c30047$92978f30$25462a80@rice.edu> I am using OSCAR 2.1 to run my cluster of 15 dual Xeons. I quite like the package. It lays on top of Red Hat 7.2 7.3 or Mandrake 8.2. OSCAR is basically a suite of packages (PBS, MPI,LAM, PVM, C3, HDF5, ...) which make "culsterize" and make administration easier. 
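Once the MPI part of such a stack is installed, the quickest sanity check that all the nodes can talk is the usual hello-world (a generic sketch, nothing OSCAR-specific; mpicc and mpirun are the usual wrapper names, but your installation may differ):

/* minimal MPI sanity check: every rank reports which node it landed on */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which process am I        */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* how many processes total  */
    MPI_Get_processor_name(host, &len);      /* which node am I on        */

    printf("rank %d of %d on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}

If every node shows up when you launch this across all 30 CPUs, the batch system and the libraries underneath are at least wired together correctly.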
Check them out at http://oscar.sourceforge.net/ -----Original Message----- From: beowulf-admin at beowulf.org [mailto:beowulf-admin at beowulf.org] On Behalf Of beowulf-request at beowulf.org Sent: Friday, April 11, 2003 9:21 AM To: beowulf at beowulf.org Subject: Beowulf digest, Vol 1 #1243 - 14 msgs --__--__-- Message: 14 Date: 11 Apr 2003 15:18:25 +0100 From: "Mohammad Tina" To: "beowulf" Subject: new to linux clustering Hi, i am new to linux clustering, i am planning to install cluster on 3 machines (redhat 7). i was reading about clustering and i found many packages. can anyone recommend a package for me?? Thanks ================================================================== Get Your Free Web-Based Email at http://www.tahoe.com! ================================================================== --__--__-- _______________________________________________ Beowulf mailing list Beowulf at beowulf.org http://www.beowulf.org/mailman/listinfo/beowulf End of Beowulf Digest _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Fri Apr 11 12:41:07 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Fri, 11 Apr 2003 11:41:07 -0500 Subject: cooling systems Message-ID: <3EA0F5C5@itsnt5.its.uiowa.edu> Does anyone know of if it is possible to buy a rackmount cluster with an integrated cooling system? It seems against the philosophy of Beowulf to look for low cost computing solutions, and then find that you need to make a substantial investment just to cool the room. I had an Athlon system shut down on me due to overheat, so I look at the cases and I think- why aren't people looking to use airflow in a more efficient manner. I know the ambient air temp isn't this high. I may be in left field, but it seems like the flow inside a case is so turbulent that the mean air velocity is not carrying the warm air away from the cpu as quickly as it could. Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Fri Apr 11 13:29:01 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Fri, 11 Apr 2003 10:29:01 -0700 (PDT) Subject: cooling systems In-Reply-To: <3EA0F5C5@itsnt5.its.uiowa.edu> Message-ID: rack systems with integrated hvac do exist. as a general rule it cheaper to buy one or two large ac units and move air around as needed than it is to buy lots of small ac units have to deal seperately with the exhaust from all their little heat-exchangers... On Fri, 11 Apr 2003, jbassett wrote: > Does anyone know of if it is possible to buy a rackmount cluster with an > integrated cooling system? It seems against the philosophy of Beowulf to look > for low cost computing solutions, and then find that you need to make a > substantial investment just to cool the room. I had an Athlon system shut down > on me due to overheat, so I look at the cases and I think- why aren't people > looking to use airflow in a more efficient manner. I know the ambient air temp > isn't this high. I may be in left field, but it seems like the flow inside a > case is so turbulent that the mean air velocity is not carrying the warm air > away from the cpu as quickly as it could. 
there's a substantial amount of engineering that has to go into the thermal management in a 1u or 2u case that simply doen't have to happen in the desktop pc industry. even if you have sufficient cooling for the room you may still need dedicated airhandlers to move it to the right location for the rack with the nodes in them... > Joseph Bassett > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Fri Apr 11 14:13:35 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Fri, 11 Apr 2003 11:13:35 -0700 Subject: [Linux-IA64] Itanium gets supercomputer software In-Reply-To: <200304111505.TAA23036@nocserv.free.net> References: <200304111505.TAA23036@nocserv.free.net> Message-ID: <20030411181335.GB1321@greglaptop.internal.keyresearch.com> On Fri, Apr 11, 2003 at 07:05:13PM +0400, Mikhail Kuzminsky wrote: > It looks that there is some "gentleman's" agreement between Intel > and companies, manufacturing IA64-based systems, about "price increase". With the first Itanium generation, only Intel built boxes, and everyone else OEMed and sold the same 2 boxes (one dual, one quad). There shouldn't be any surprise that they were roughly the same price. Things are a bit more diverse with the Itanium2, but it's still a low volume item. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Fri Apr 11 14:00:34 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Fri, 11 Apr 2003 11:00:34 -0700 Subject: IA-64 related question (tangentially) In-Reply-To: References: <4.3.2.7.2.20030411093017.00c3cc70@pop.freeuk.net> Message-ID: <20030411180034.GA1321@greglaptop.internal.keyresearch.com> On Fri, Apr 11, 2003 at 10:27:31AM -0400, Robert G. Brown wrote: > FSF or Debian don't need Itaniums, really, except for maybe one or two > to ensure that builds work on the architecture. They don't "compute". Joking aside, every compiler group needs a cluster, because their nightly testing is: build kernel, test kernel build compiler, test compiler build all rpms in your distro build and run SPECcpu build and run misc tests run a search over combinations of optimization flags to see if any are broken or have performance regressions And that's not even considering parallelizing builds. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 11 14:11:01 2003 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Fri, 11 Apr 2003 14:11:01 -0400 (EDT) Subject: cooling systems In-Reply-To: <3EA0F5C5@itsnt5.its.uiowa.edu> Message-ID: On Fri, 11 Apr 2003, jbassett wrote: > Does anyone know of if it is possible to buy a rackmount cluster with an > integrated cooling system? It seems against the philosophy of Beowulf to look > for low cost computing solutions, and then find that you need to make a > substantial investment just to cool the room. I had an Athlon system shut down This is in some ways impossible, if I understand what you mean. Or from another point of view, it is already standard. Let's understand refrigeration and thermodynamics a bit: All the energy used to run your systems and do computations turns into heat (1st law). One cannot make heat "go away"; it either naturally flows from hot places to cooler places, or one can move it forcibly from a hot place to a hotter place. It costs energy which makes still MORE heat to move it forcibly around (2nd law). Now view the CPUs as little heaters -- 50W to 100W apiece (as hot as most incandescent light bulbs) and confined inside a 1U or 2U case. Add on another 50W plus for the motherboard, memory, disk, network, and the switching power supply itself inside the case. Even the "refrigeration" devices already standard in the case (case fans intended to speed the heat on its way) add heat to the case exhaust in the process. Cases are already designed to move heat from the hot spots inside out into the ambient air as efficiently as possible (within the quality of engineering and layout of any particular case with any particular motherboard). There are even cooling devices designed for e.g. CPU cooling that are active electronic refrigerators (peltier coolers) and not just fan+heat sink conduction+convection coolers. The problem is out in the room. Once you remove the heat from the cases, with or without an actual case refrigerator at work (in general one will exhaust MORE heat into the room than a case cooled with fan alone) the heat still HAS to get out of the room. If the room has lots of nodes making heat, nice thick walls, ceilings, floors, and lots of dead air (as do most uncooled cluster rooms, it seems), it won't get out quickly enough on its own, so it will start to build up. This makes the room get hot -- temperature being a measure of the "heat" (random kinetic energy) in the room's air. Now, a passive cooler fan can only cool the CPU if the ambient air is cooler than the CPU. It can move air through more quickly, but basically heat is flowing from hot to cold. As the room air temperature goes up, so does the CPU temperature as the fan is less successful in helping to remove its power-generated heat. An active cooler is in no fundamentally better shape. Yest, it will maintain a temperature gradient, and keep the CPU actually cooler than ambient air, but as ambient air goes up in temperature so will the CPU temperature AND the ambient air will get still hotter as a result of the extra energy the cooler itself consumes (which in turn goes up as the ambient air temperature increases in a vicious cycle). It also heats the other components in the case more while keeping the CPU a bit cooler, so other things may fail at a higher rate unless you remove all that heat. ONE WAY OR ANOTHER you will HAVE to remove the heat from the room JUST AS FAST as all the systems and other heat sources (including electric lights and human bodies) produce it to maintain the room's temperature as constant. 
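To put a rough number on how fast that buildup happens, here is a trivial back-of-the-envelope sketch (all figures assumed purely for illustration: a 5 kW load dumped into a sealed room holding roughly 60 kg of air, not measurements of any real machine room):

/* rate of temperature rise in a sealed, uncooled room (illustration only) */
#include <stdio.h>

int main(void)
{
    double q_watts = 5000.0;  /* assumed cluster heat load, in watts       */
    double air_kg  = 60.0;    /* roughly 50 m^3 of air at ~1.2 kg per m^3  */
    double cp      = 1005.0;  /* specific heat of air, J/(kg K)            */

    double k_per_s = q_watts / (air_kg * cp);  /* dT/dt = Q / (m * c_p)    */

    printf("air warms at %.2f K/s, roughly %.0f degrees C per minute,\n",
           k_per_s, 60.0 * k_per_s);
    printf("until the walls and furniture start soaking some of it up.\n");
    return 0;
}

In practice the walls and contents buy you a little time, but the conclusion is the same: the cooling has to carry essentially the whole load, continuously.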
If you live in a cold climate or have some handy "cold reservoir" that can absorb the heat from your cluster indefinitely without getting warmer itself, maybe you can metaphorically open a window and stick in a fan and blow the hot air out into the snow, replacing it with nice cool air from outside. If you live in Durham NC in the summer, the air outside the building is a lot HOTTER than you'd like the cluster room to be, so you have to do work to actively move the heat from your nice cool cluster room to the much hotter out of doors, moving it "uphill" so to speak. This work WILL be done by a refrigeration unit -- an air conditioner -- as that's what they are and what they do. You can even estimate fairly accurately how much air conditioning you'll require to keep up with the rate at which the cluster produces heat, using 3500 Watts per "ton" of A/C (and remembering to provide a lot more capacity than you think you'll need, maybe twice as much). You can install an "off-the-shelf" air conditioning solution if one is possible and makes sense for your cluster room, or you can (likely better) have a pro come and install a proper climate control system. You'd have to do this for EITHER a "big iron" supercomputer OR a beowulf -- in both cases they make lots of heat, in both cases you MUST remove that heat as fast as it is made and dump it outside to maintain ambient air temperatures in the 60's (ideally). Beowulfish clusters are cheap to build, they are relatively cheap and scalable to operate in most environments, but there are most definitely infrastructure costs and requirements -- adequate power and ac and networking in the physical space, and the actual cost of power to run and cool the nodes. The former can usually be "amortized" over many years so that it adds a few tens of dollars per year to the cost of operating the nodes themselves. The latter is unavoidable -- roughly $1/watt/year for heating and cooling. This is another "killer" surprise for cluster builders -- a 100 node cluster of 100 Watt nodes might cost $75,000 in direct hardware costs, AND $25,000 in renovation costs for new power and AC (amortized over ten years and 100 nodes -- maybe $30 per node per year "payback", including the cost of the money), AND $10,000 a year for power and A/C. It's still cheap, really, compared to big iron -- just not as cheap as you might have thought looking at hardware costs alone. This serious, thoughtful approach to infrastructure, is the best way to keep from having problems with overheating. The best fans or Peltier coolers in the world aren't going to do much if ambient air in the cluster room is in the 80's or 90's, and without AC a cluster room can get well into the 100's and beyond in a remarkably short period of time. If you have 50 KW or so being given off in an office-sized space with insulating walls and no AC, you'll be able to bake brownies by leaving cups of batter out on top of your racks, at least until something melts, shorts, starts a fire, and burns down the whole thing. As far as the rest of your remarks on case design are concerned, they may be well-justified but there are a lot of cases out there and you should look at more than one. It isn't horribly easy to design airflow inside a 1U space filled with big block-like components, and some do a better job of it than others. 
Even with a good case design, something like using a flat ribbon cable ide/floppy connector instead of a round cable can defeat your purpose, in SOME units, by virtue of the ribbon accidentally blocking part of the airflow! I "like" 2U cases a bit better than 1U's for that reason, but there are some people that make very lovely 1U cases that seem to be quite robust and reliable -- as long as you keep ambient air in the 60's or at worst low 70's at the fan intake. rgb > on me due to overheat, so I look at the cases and I think- why aren't people > looking to use airflow in a more efficient manner. I know the ambient air temp > isn't this high. I may be in left field, but it seems like the flow inside a > case is so turbulent that the mean air velocity is not carrying the warm air > away from the cpu as quickly as it could. > Joseph Bassett > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Fri Apr 11 14:26:30 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Fri, 11 Apr 2003 13:26:30 -0500 Subject: cooling systems Message-ID: <3EA1CB6B@itsnt5.its.uiowa.edu> This is precisely the point that I am getting at. It seems indirect to me to cool the ambient atmosphere in a room using air conditioners, then expect the heat to distribute itself so that the temperature is at equilibrium throughout the system. It seems entirely more sensible to have a system such that cold air would be directed more precisely at the cpus, then have the exiting flow directed through some sort of exhaust system which would take it to some place that would act as a heat resovoir. In that way you could use the existing airconditioning infrastructure at a facility by distributing the hot exhaust from a cluster into the building. You could even use a venturi on an air duct pipe to keep a vacuum going and redistribute the hot air. Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From davidm at napali.hpl.hp.com Fri Apr 11 13:59:12 2003 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Fri, 11 Apr 2003 10:59:12 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <003601c30005$9d740a10$a620a4d5@tsatec.int> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> Message-ID: <16023.624.775864.544742@napali.hpl.hp.com> >>>>> On Fri, 11 Apr 2003 10:37:39 +0200, "Adriano Galano" said: >> >>>>> On Fri, 11 Apr 2003 06:55:30 +1000, Duraid Madina >> said: Duraid> You and I both know the only real barrier to Itanium Duraid> adoption is the price. Can anyone here shed some light on Duraid> this? Why is Itanium hardware still so expensive? >> Remember that Intel is targeting Itanium 2 against Power4 and SPARC. >> In that space, the price of Itanium 2 is very competitive. Adriano> What's mean very competitive? 
How it compare with Power* for example? AFAIK, Power4 CPUs are not sold on the open market, so it's difficult to compare the price of the CPU alone (surely IBM has a list price, but with different discount schedules, that price may or may not be meaningful in practice). Here is one real price point for an Itanium 2 workstation: - hp workstation zx2000 (Linux software enablement kit) - Intel? Itanium 2 900MHz Processor with 1.5MB on-chip L3 cache - 512MB Total PC2100 Registered ECC DDR 266 SDRAM Memory (2x256MB) - 40GB EIDE Hard Drive - NVIDIA Quadro2 EX - 10/100/1000BT LAN integrated - 16X Max DVD-ROM - Linux software enablement kit (not an operating system) - 3-year warranty, next-day, onsite hardware response, Mon - Fri, 8am - 5pm - $3,298 (To see this config, go to www.hp.com, then click on "online shopping" -> "small and medium business store" -> "workstations" -> "hp Itanium 2-based workstations" -> "zx2000"). I don't know exactly what price/configuration Power4 machines start. Perhaps one of the IBMers on this list could chime in? --david _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Fri Apr 11 14:50:24 2003 From: becker at scyld.com (Donald Becker) Date: Fri, 11 Apr 2003 14:50:24 -0400 (EDT) Subject: remote booting In-Reply-To: Message-ID: On Fri, 11 Apr 2003, Yudong Tian wrote: > Please make sure your NIC supports PXE boot or not. If it does, > then you can boot over the network without using a floppy. > If it does not, you might need to use syslinux on a floppy. The Scyld system boots using almost any boot media, including floppy. And since it uses a Linux kernel as part of the boot system, it supports almost any network devices that the cluster might end up using. But we still strongly recommend using PXE boot instead of the "stage 1" system we developed. The PXE boot protocol isn't as technically strong, but that is out-weighted by - the tens of millions of machines that already have PXE support - its near ubiquity in current production machines, and - its very low cost to retrofit by installing a NIC with a PXE ROM. -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 410 Severn Ave. Suite 210 Scyld Beowulf cluster system Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Fri Apr 11 14:18:42 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Fri, 11 Apr 2003 11:18:42 -0700 Subject: SMC8624T vs DLINK DGC-1024T / Jumbo Frames ? In-Reply-To: <1050057248.3e969a20125dd@intranet.anorg.chemie.tu-muenchen.de> References: <5.1.1.6.0.20030401141622.033c1e20@crux.stmarys.ca> <1050057248.3e969a20125dd@intranet.anorg.chemie.tu-muenchen.de> Message-ID: <20030411181841.GC1321@greglaptop.internal.keyresearch.com> On Fri, Apr 11, 2003 at 12:34:08PM +0200, rochus.schmid at ch.tum.de wrote: > http://www.scl.ameslab.gov/Publications/ HalsteadPubs/usenix_halstead.pdf > it says that the effect on bandwith with jumbo frames is only seen for tcp/ip > commun (netpipe) but is completely lost using MPI. since my code is MPI based > it wouldn't matter to have jumbo frames and i could go with the cheaper DLINK. > is this info right? or outdated? missunderstood? 
This is a function of how big your communications are -- if you always send multi-megabyte messages, you probably will get a bit better bandwidth with jumbo frames. By the way, interrupt coalescence can get most of the improvement of jumbo frames without the pain that jumbo frames can cause. But interrupt coalescence might make short messages take a bit longer to arrive. greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Fri Apr 11 15:29:11 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Fri, 11 Apr 2003 12:29:11 -0700 (PDT) Subject: cooling systems In-Reply-To: <3EA1CB6B@itsnt5.its.uiowa.edu> Message-ID: On Fri, 11 Apr 2003, jbassett wrote: > This is precisely the point that I am getting at. It seems indirect to me to > cool the ambient atmosphere in a room using air conditioners, then expect the > heat to distribute itself so that the temperature is at equilibrium throughout > the system. It seems entirely more sensible to have a system such that cold > air would be directed more precisely at the cpus, then have the exiting flow > directed through some sort of exhaust system which would take it to some place > that would act as a heat resovoir. In that way you could use the existing > airconditioning infrastructure at a facility by distributing the hot exhaust > from a cluster into the building. You're better off just taking the heat out of the air and exhausting it out of the building, which you do with either vapor-phase refridgeration or chilled water in general... when you have large datacenters, you don't use the rest of the building as a heat resovoir, it's not nearly big enough a space unless you happen to work in one of the moffet field blimp hangars or something. if you had enough space or a convenient lake you could also use a heat-pump. > You could even use a venturi on an air duct > pipe to keep a vacuum going and redistribute the hot air. > > Joseph Bassett > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Fri Apr 11 14:26:16 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Fri, 11 Apr 2003 11:26:16 -0700 Subject: Scaling of hydro codes In-Reply-To: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> References: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> Message-ID: <20030411182616.GD1321@greglaptop.internal.keyresearch.com> On Thu, Apr 10, 2003 at 10:47:04AM +0200, Wolfgang Dobler wrote: > My question is: do others find the same type of scaling for hydro codes? 
> If so, how can this be understood? CFD can vary widely. Some algorithms are cache friendly (operator splitting, the compute part of spectral codes), some are not (3D operators). Sometimes the data size is huge (1+ gbytes/cpu) and sometimes it's small enough to fit in the combined L2 caches of your cluster. A non-cache-friendly code won't get a great speedup when you use the 2nd cpu. This is what Craig Tierney mentioned, and you can test for this effect using a 1-cpu and 2-cpu run. Large data sizes mean easier network scaling. You can look at that separately by running the code at several sizes using 1 cpu per machine. If you increase the data size as you use more cpus, this scaling should be nearly linear. greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mike.sullivan at alltec.com Fri Apr 11 14:47:26 2003 From: mike.sullivan at alltec.com (Mike Sullivan) Date: Fri, 11 Apr 2003 14:47:26 -0400 Subject: cooling systems (jbassett) Message-ID: <3E970DBE.8020301@alltec.com> We have designed a custom cabinet for a client that must run in a room without AC. The system has an integral centrifugal blower that can have the air ducted to an outside sink. ( or to double as a furnace in the winter). The Motherboards mount to trays inside the cabinet so we do not use standard 1U cases. -- Mike Sullivan Director Performance Computing @lliance Technologies, Voice: (416) 385-3255 x 228, 18 Wynford Dr, Suite 407 Fax: (416) 385-1774 Toronto, ON, Canada, M3C-3S2 Toll Free:1-877-216-3199 http://www.alltec.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Fri Apr 11 15:38:38 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Fri, 11 Apr 2003 12:38:38 -0700 Subject: cooling systems In-Reply-To: <3EA0F5C5@itsnt5.its.uiowa.edu> References: <3EA0F5C5@itsnt5.its.uiowa.edu> Message-ID: <20030411193838.GB1690@greglaptop.internal.keyresearch.com> On Fri, Apr 11, 2003 at 11:41:07AM -0500, jbassett wrote: > Does anyone know of if it is possible to buy a rackmount cluster with an > integrated cooling system? ASCI Red has a little air conditioner on the top of every rack, but that's undoubtedly more expensive than using standard commercial units. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rmyers1400 at attbi.com Fri Apr 11 15:16:00 2003 From: rmyers1400 at attbi.com (Robert Myers) Date: Fri, 11 Apr 2003 15:16:00 -0400 Subject: cooling systems In-Reply-To: <3EA0F5C5@itsnt5.its.uiowa.edu> References: <3EA0F5C5@itsnt5.its.uiowa.edu> Message-ID: <3E971470.7020004@attbi.com> jbassett wrote: >Does anyone know of if it is possible to buy a rackmount cluster with an >integrated cooling system? It seems against the philosophy of Beowulf to look >for low cost computing solutions, and then find that you need to make a >substantial investment just to cool the room. I had an Athlon system shut down >on me due to overheat, so I look at the cases and I think- why aren't people >looking to use airflow in a more efficient manner. I know the ambient air temp >isn't this high. 
I may be in left field, but it seems like the flow inside a >case is so turbulent that the mean air velocity is not carrying the warm air >away from the cpu as quickly as it could. > > A rackmount cluster with an integrated cooling system sounds like big money. If you're looking at your case and worrying about overheat and especially if you think airflow is the problem, you might want to look at lower hanging fruit, like getting cables out of the way of the airflow and/or going to round cables instead of ribbon cables. A homebuilt or commercial ducted fan solution that brings air directly to the CPU from the outside is a big, relatively low-cost win more in line with the typical economics of a beowulf cluster. The typical homebuilt installation involves a case fan, some off the shelf flexible ducting, and a home built shroud over the CPU heatsink. Google groups.google.com and www.google.com on CPU "ducted fan" to get an idea of the range of possibilities and results. RM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 11 15:52:03 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 11 Apr 2003 15:52:03 -0400 (EDT) Subject: cooling systems In-Reply-To: <3EA1CB6B@itsnt5.its.uiowa.edu> Message-ID: On Fri, 11 Apr 2003, jbassett wrote: > This is precisely the point that I am getting at. It seems indirect to me to > cool the ambient atmosphere in a room using air conditioners, then expect the > heat to distribute itself so that the temperature is at equilibrium throughout > the system. It seems entirely more sensible to have a system such that cold > air would be directed more precisely at the cpus, then have the exiting flow > directed through some sort of exhaust system which would take it to some place > that would act as a heat resovoir. In that way you could use the existing > airconditioning infrastructure at a facility by distributing the hot exhaust > from a cluster into the building. You could even use a venturi on an air duct > pipe to keep a vacuum going and redistribute the hot air. I think that if you worked it all out you'd find that you CAN do this, AND that it would provide you more precise control of temperatures, AND that it would cost you a lot MORE than a standard AC/chiller/heat exchanger with relatively simple but suitable ductwork and fans with the ability to balance and redirect airflow. Although I could easily be wrong, since I don't really know what kind of cluster we're talking about, or how big, or what kind of space you're trying to put it in. First of all, I think you're almost certainly mistaken when you say that you can use existing A/C infrastructure to cool a cluster, at least if that cluster has more than 16 nodes or so (small clusters you often can, and many do). Orindary building AC that is servicing "offices" isn't really generally engineered to remove more than a couple or three of KW in a single room-sized space and deliver it back to its heat exchanger -- the ductwork and delivery/return systems simply aren't adequate to do more. Sometimes cool air is delivered in one office and only exhausted/returned several offices away, passing through several interoffice vents in between! My office gets air through a square foot or two of ductwork. That air would have to howl in at ten below zero to cool a big cluster, and the hot air would have to howl out just as fast. 
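To make that arithmetic concrete, here is a back-of-the-envelope sketch in C. The 64 nodes at 150 W apiece are made-up illustrative figures, not numbers from this thread; the conversion factors are the usual rules of thumb (1 W of electrical load is about 3.412 BTU/hr of heat, and one "ton" of air conditioning removes 12,000 BTU/hr).

/* heatload.c -- rough cooling requirement for a cluster.
 * Rules of thumb: 1 W of load ~= 3.412 BTU/hr of heat,
 * and 1 ton of air conditioning = 12,000 BTU/hr.
 * The node count and per-node wattage are hypothetical examples.
 */
#include <stdio.h>

int main(void)
{
    double nodes = 64.0;            /* hypothetical node count    */
    double watts_per_node = 150.0;  /* hypothetical draw per node */
    double watts = nodes * watts_per_node;
    double btu_per_hr = watts * 3.412;
    double tons = btu_per_hr / 12000.0;

    printf("%.0f W -> %.0f BTU/hr -> %.1f tons of cooling\n",
           watts, btu_per_hr, tons);
    return 0;
}

Even at those modest made-up figures the answer is already around 2.7 tons of dedicated cooling, which is well past what a single office air drop is engineered to carry.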
Worse still, as we found out the hard way, most physical plants will shut A/C chillers down altogether in the winter time, or run them on a standby/intermittant basis. After all, who needs AC in the winter? It's COLD outside, isn't it? So don't count on building air to be adequate OR reliable, unless reengineered to be both for your particular needs. Those needs can be quite variable. A good sized cluster can consume as much power as all the offices in a good sized building put together -- tens of KW -- and needs A/C just as much in the winter as in the summer, unless you figure out a clever way of using the cluster room as a "furnace" to warm all the offices while cooling the cluster. To put it in perspective, the A/C heat exchanger/blower alone in our server room sits in a unit about two meters cubed in size and sounds like a 747 at cruising altitude in operation, which is all the time. Then there is the actual chiller, which is far away and (fractionally speaking, as it is shared) just as big. Its air delivery and return are about a square meter or more each in cross section before it starts splitting down. At the moment it is removing a few tens of KW continuously, day and night, and the ambient room air (delivered in a balanced way from overhead down to the general fronts of the racks but not ducted right down into them) hovers between 60 and 70, except in the air columns right BEHIND the node racks where it is more like 70-80 (about a 15 degree difference between incoming and exiting air). To work in the room you need a jacket, unless you're standing behind the racks where you could work comfortably in shorts and a tee shirt. This is still a pretty "small" cluster, too -- around 150 dual CPU nodes, plus sundry single processor nodes and some servers -- with infrastructure capacity that might support about twice as many nodes eventually as the room fills. How could this ever be managed by an ordinary office AC duct? Second, look at the costs. Putting active, directed coolers in each chassis costs more than just fans (and generates more total heat). Also, there will ALREADY be a warm air return in the room if it is A/C'd at all -- your "some sort of exhaust system". Air, like energy, is conserved and whatever comes out of the blowers into the room must go out of the room into the blowers. In many cases simply directing cold incoming air down the case fronts (intakes) and permitting the rising warm air off the case backs (outflows) to get to the ceiling and into the return is sufficient -- a reasonably stable airflow will set up to balance cool air out against warm air in. Note that this is NOT equilibrium -- the air going in at the front is cool, out at the back warm, and they do NOT mix before the warm air is exhausted -- it is just a stable pattern of circulation. This may or may not be good enough -- for us it seems to be working, but for you it might not. If you want to do better, all it costs you is more ductwork, more fans and control systems to ensure that the ducting does its job of dumping cool air in a balanced way (so it all doesn't come out of the ducts closest to the blower, leaving none for the back part of the room) and picking up the outflowing warm air ditto, maybe a raised floor (since without a raised floor the ductwork will interfere with access to the nodes, front and back). 
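The same sort of arithmetic says how much air has to move. A sketch using the standard sensible-heat rule of thumb for air near sea level (BTU/hr = 1.08 x CFM x deltaT in degrees F) -- the 30 KW load and 15 degree rise below are illustrative round numbers of the same order as the room described above, not measured values:

/* airflow.c -- airflow needed to carry a given heat load.
 * Sensible-heat rule of thumb for air near sea level:
 *   BTU/hr = 1.08 * CFM * deltaT(F)
 * The load and temperature rise are illustrative round numbers.
 */
#include <stdio.h>

int main(void)
{
    double watts = 30000.0;    /* illustrative total heat load      */
    double delta_t_f = 15.0;   /* intake-to-exhaust rise, degrees F */
    double btu_per_hr = watts * 3.412;
    double cfm = btu_per_hr / (1.08 * delta_t_f);

    printf("%.0f W at a %.0f F rise needs roughly %.0f CFM\n",
           watts, delta_t_f, cfm);
    return 0;
}

That comes out to something over 6000 CFM of air in constant motion, which is why the delivery and return ducts end up a square meter or more in cross section and why a single office register has no hope of doing the job.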
The capture of the exhaust will be particularly tricky as you will no longer be exhausting the hot air trapped on the ceiling as actively and will need to make sure that spillover doesn't build up there and anomalously heat the upper part of the room so that it DOES eventually mix with the ambient air. You can also build closed or open racks on raised floors and deliver cool air directly in at the bottom and remove it at the top in what amounts to a dedicated cooling chamber or airflow pattern, per rack. However, as Joel said, folks do all of this -- lots of really big clusters or server farms have very carefully engineered cooling systems, and you can find websites where they discuss and illustrate particular patterns of cool/warm airflow for various designs. They're just expensive. All of this just costs more money, and you seemed to be complaining about the (lesser) cost of ordinary A/C or using A/C at all, not looking for ways of increasing costs still further with complicated ductwork on top of ordinary A/C. So the only point I was (and am) making is that NOTHING you do at the node level gets you around the fundamental problem that leads you to A/C in the first place -- ensuring that power in = heat out such that input cooling air (or ambient air at the fan intakes) stays at or below may 75F, ideally much cooler. There are lots of ways to make this happen with or without fancy ductwork depending on cluster size and design, but you MUST make it happen. If your space DOES have adequate capacity in building AC and your cluster isn't too huge, then you are lucky -- some fans and maybe some ductwork and you'll likely be able to fly. If you're building a cluster that will draw, say, 4 KW and up, and don't "happen" to have a space with lots of power and surplus AC capacity and ductwork that can deliver and return air in a balanced way, (like a former server room) you're almost certainly going to be looking at some cluster-specific renovation to provide the required power and cooling. rgb > > Joseph Bassett > > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 11 16:02:33 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 11 Apr 2003 16:02:33 -0400 (EDT) Subject: IA-64 related question (tangentially) In-Reply-To: <3E97144D.4050201@moene.indiv.nluug.nl> Message-ID: On Fri, 11 Apr 2003, Toon Moene wrote: > Robert G. Brown wrote: > > > FSF or Debian don't need Itaniums, really, except for maybe one or two > > to ensure that builds work on the architecture. They don't "compute". > > I do. To me a cycle is a precious thing as I use so MANY of them over > > the years. > > Given what you told us about your code (how you didn't need Fortran and > all those nice multi-rank array loops), I gather that you could get lots > of cycles with an Itanium, but no speed. > > Perhaps David M-T can explain this better than me ... I know, I know. It would be a horrible waste of money to give them to me in terms of raw cost benefit, EXCEPT of course that to me they'd be FREE, and it is hard to beat that for cost benefit;-) So I guess you can ask for them instead. At least the Netherlands is closer to the UK...:-) rgb Robert G. 
Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu In a little known book of ancient wisdom appears the following Koan: The Devil finds work For idle systems Nature abhors a NoOp The sages have argued about the meaning of this for megacycles, some contending that idle systems are easily turned to evil tasks, others arguing that whoever uses an idle system must be possessed of the Devil and should be smote with a sucker rod until purified. I myself interpret "Devil" to be an obvious mistranslation of the word "Daemon". It is for this reason, my son, that I wish to place a simple daemon on your system so that Nature is satisfied, for it is clear that a NoOp is merely a Void waiting to be filled... This is the true Tao. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 11 14:56:18 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 11 Apr 2003 14:56:18 -0400 (EDT) Subject: IA-64 related question (tangentially) In-Reply-To: <20030411180034.GA1321@greglaptop.internal.keyresearch.com> Message-ID: On Fri, 11 Apr 2003, Greg Lindahl wrote: > On Fri, Apr 11, 2003 at 10:27:31AM -0400, Robert G. Brown wrote: > > > FSF or Debian don't need Itaniums, really, except for maybe one or two > > to ensure that builds work on the architecture. They don't "compute". > > Joking aside, every compiler group needs a cluster, because their > nightly testing is: > > build kernel, test kernel > build compiler, test compiler > build all rpms in your distro > build and run SPECcpu > build and run misc tests > run a search over combinations of optimization flags to see if > any are broken or have performance regressions > > And that's not even considering parallelizing builds. Good point; I had forgotten the compiler people. Kernel people in general as well. I was thinking more in terms of application level people. Not to mention trying to weasel some free high end systems by talking down the competition...;-) Alas, alas, It is not meant to be. They in UK, Me in NC. :-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sgaudet at wildopensource.com Fri Apr 11 16:46:25 2003 From: sgaudet at wildopensource.com (Stephen Gaudet) Date: Fri, 11 Apr 2003 16:46:25 -0400 Subject: [Linux-ia64] Itanium gets supercomputing software References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030410215139.O3614@www2> Message-ID: <3E9729A1.1040100@wildopensource.com> Bob Drzyzgula wrote: > On Fri, Apr 11, 2003 at 09:58:08AM +1000, Duraid Madina wrote: > >>David Mosberger wrote: >> >>>Remember that Intel is targeting Itanium 2 against Power4 and SPARC. >>>In that space, the price of Itanium 2 is very competitive. >> >>OK, I want to be clear on this. I asked why Itanium hardware is still so >>expensive. 
Your answer seems to be marketing speak for "The prices are >>still high because we are _happy_ selling small quantities of this >>equipment to people used to paying through the nose for good quality >>hardware." Is this correct? > > I'm not sure that it works this way. I think it's more like > "We are making the best processor we know (or, perhaps, > "knew", or "thought we knew", or even "allowed ourselves > to know") how to make that will/would/might in our dreams > be profitable to sell at this high price in moderate > quantities." I expect that if they could sell one hundred > times as many Itaniums at a tenth the price, they would > ramp up the fabs and do it. But then you get into the > chicken-or-egg problem: There's no software, and hence > no demand, and hence no software, and hence no demand, > that would justify the production of a hundred times as > many Itaniums. Based on over 25 years in computers owning a company for 5 years you see this change in the computing market over time. When I sold Alpha based systems there was always a bitch about cost. However, people that needed the compute cycles were more than willing to purchase Alpha over Intel because of what it brought them in total TCO. More compute cycles, memory and bandwith. Main problem was Digital at the time, was they never knew how to sell Alpha other than with UNIX. They tried selling it with MS Windows and never made a dent in the market until OEM's starting selling it in the 3D space with a little package called Renderman. This was big hit with film studios. Remember the movie Titanic? Rendered on Alpha to give you a time line. The fastest cpu was a 21064, 275MHz and a system cost about $12,000.00. The Alpha market started to take off when Digital screwed up with a product call the Multia. This was a 21066 processor, 166Mhz or 233Mhz. The Multia was Digital's attempt to build a X terminal for Windows NT. It failed and left DEC with 15,000 of these pigs sitting around. Now Digital needed to get rid of them quick. The plan was to sell them with Linux and hopefully develop the Linux space. These Multia/UDB sold for less than $2000.00. That's when Alpha started to take off. I personally sold tons of them. In fact, in a former life I even sold a system or two to David Mosberger. So I'll agree that when the cost comes down more people will get involved with the ia64. BTW: Intel is looking to release a single cpu version of the ia64 sometime this year. When this happens I believe you'll see the market open up. >>Can I then conclude that Intel has not yet had any interest whatsoever >>in driving IA64 into the realm of reasonble prices? It's sad to see so >>much work being put into this Linux port when, if things remain as they >>are, it will hardly be used. Main reason as David alluded to these systems are meant to compete with high end Sun, HP and IBM servers. Not in the commodity market. Remember, the cost in R&D on ia64 development. > Be careful that you put the horse before the cart. > Might it not be that the people doing this work are > wagering that it will ultimately cause demand for > the Itanium to increase? Could it really be expected > that demand for Itanium *would* materialize without > such investment in software happening first? > > In any event, virtually nothing remains as it is. Myself I wouldn't worry, over time Intel has a way of getting the price down. Heck, Dell has P4 desktops selling for $449.00 and notebooks for $799.00. Wow. Cheers, Steve Gaudet ..... 
<(???)> ---------------------- Wild Open Source Bedford, NH 03110 pH: 603-488-1599 cell: 603-498-1600 Home: 603-472-8040 http://www.wildopensource.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Fri Apr 11 17:20:05 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Fri, 11 Apr 2003 17:20:05 -0400 (EDT) Subject: cooling systems In-Reply-To: Message-ID: as usual, Robert (et al) gave a comprehensive answer. I'd like to just emphasize one thing: expect trouble if you use chilled-water cooling that's designed/managed for offices. ours works fairly well during the summer (16 tons of chillers and 35 KW dissipated, which should work out to 10/16 utilization.) but facilities people tend not to think of chilled water as a critical resource, and construction people certainly do not. you *will* have problems with the temperature of your chilled water (not to mention whether it's even flowing). I was surprised how little thermal capacity the chillers/pipes have - our room heats up in seconds if there's any disruption. and don't forget to run your chiller blowers on your UPS :( I expect we'll be adding supplemental electical cooling soon ;( consider wiring up a few ibutton thermo sensors - I have 5 now (incoming chilled water pipe, chilled air duct, dead/ambient and hot/return air), and log them every 30 seconds or so. yes, I have a little script that monitors the temps, logs them in mysql, pages me, powers off. clusters are getting bigger, and these problems aren't going away. yes, one solution is to use laptop processors. that works, but is simply inapropriate for some applications. another is to try water-cooling, which I've heard some cluster vendors are working on. the main appeal there is to avoid flakey CPU fans, and potentially to exchange and transport heat more effectively. but you're still probably dependent on a chiller somewhere. regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Duclam80 at gmx.net Fri Apr 11 17:57:56 2003 From: Duclam80 at gmx.net (Vu Duc Lam) Date: Sat, 12 Apr 2003 04:57:56 +0700 Subject: Can't run NAS Benchmark Message-ID: <002f01c30075$71b38570$1a3afea9@conan> Hi, To run NAS Benchmark correctly, may be or not to install Scalapack library. I have some problem when trying to run 5 Kernel Benchmarks with class B and C. I have installed NAS Benchmark in a cluster System. The system is collection of Intel processor-based workstations and server interconnected by TCP/IP network. Each node is Intel with Pentium 800 MHz processor and 256 megabytes of memory, 2GB of Hark Disk. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Fri Apr 11 18:19:15 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Fri, 11 Apr 2003 15:19:15 -0700 (PDT) Subject: cooling systems In-Reply-To: Message-ID: In our system a 24 ton chiller has a backup water source in the form of a city water supply... 
There's no return on that one, it gets dumped into the waste water treatment stream so your water bill goes up in a big hurry, and it's way less effective because the water is in the upper 50s instead of the mid 30's. It's controlled by a vacuum break valve so switchover is automatic if one supply goes away... Backup fans, extensive temperature monitoring, and thermal kill switches are still necessary, esp as there have been about 6 chiller failures since I arrived in 93 (the thing is 22 years old at this point). water under the floor is always one of those exciting alarms... joelja On Fri, 11 Apr 2003, Mark Hahn wrote: > as usual, Robert (et al) gave a comprehensive answer. > I'd like to just emphasize one thing: expect trouble if you > use chilled-water cooling that's designed/managed for offices. > ours works fairly well during the summer (16 tons of chillers and > 35 KW dissipated, which should work out to 10/16 utilization.) > > but facilities people tend not to think of chilled water as a > critical resource, and construction people certainly do not. > you *will* have problems with the temperature of your chilled > water (not to mention whether it's even flowing). I was surprised > how little thermal capacity the chillers/pipes have - our room > heats up in seconds if there's any disruption. > > and don't forget to run your chiller blowers on your UPS :( > > I expect we'll be adding supplemental electical cooling soon ;( > > consider wiring up a few ibutton thermo sensors - I have 5 now > (incoming chilled water pipe, chilled air duct, dead/ambient > and hot/return air), and log them every 30 seconds or so. > yes, I have a little script that monitors the temps, logs them > in mysql, pages me, powers off. > > clusters are getting bigger, and these problems aren't going away. > yes, one solution is to use laptop processors. that works, but is > simply inapropriate for some applications. another is to try > water-cooling, which I've heard some cluster vendors are working on. > the main appeal there is to avoid flakey CPU fans, and potentially > to exchange and transport heat more effectively. but you're still > probably dependent on a chiller somewhere. > > regards, mark hahn. > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. 
-- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Fri Apr 11 16:31:58 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Sat, 12 Apr 2003 06:31:58 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <16023.624.775864.544742@napali.hpl.hp.com> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> <16023.624.775864.544742@napali.hpl.hp.com> Message-ID: <3E97263E.5010605@octopus.com.au> David, Itanium 2 isn't even competitive with other offerings from your own company. Compare: David Mosberger wrote: > Here is one real price point for an Itanium 2 workstation: > > - hp workstation zx2000 (Linux software enablement kit) > - Intel? Itanium 2 900MHz Processor with 1.5MB on-chip L3 cache > - 512MB Total PC2100 Registered ECC DDR 266 SDRAM Memory (2x256MB) > - 40GB EIDE Hard Drive > - NVIDIA Quadro2 EX > - 10/100/1000BT LAN integrated > - 16X Max DVD-ROM > - Linux software enablement kit (not an operating system) > - 3-year warranty, next-day, onsite hardware response, Mon - Fri, 8am - 5pm > - $3,298 with: - HP server rp2430 - 1xHP PA-8700 650MHz CPU with 2.25MB on-chip L1 cache - 128MB Roughly-2GB/sec-God-Knows-What ECC Memory - HP-UX 11i - 1-year warranty, next-day onsite hardware response - $1,095 (missing things like disk, a reasonable amount of RAM, etc can be brought to the level of the Itanium system you quote for another $700 or so - to see this config, go to www.e-solutions.hp.com, and try to buy an rp2430 (HP part #A6889A)) I bought one of these, and it is excellent (if a little loud. ;) I would happily buy a bare-bones Itanium 2 system at the same price. This doesn't seem to like it's going to be possible any time soon. In less than two weeks, I will be able to buy an Opteron system that runs a great deal faster at the same price. Good luck. Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From iod00d at hp.com Fri Apr 11 16:42:17 2003 From: iod00d at hp.com (Grant Grundler) Date: Fri, 11 Apr 2003 13:42:17 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E97263E.5010605@octopus.com.au> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> <16023.624.775864.544742@napali.hpl.hp.com> <3E97263E.5010605@octopus.com.au> Message-ID: <20030411204217.GA4306@cup.hp.com> On Sat, Apr 12, 2003 at 06:31:58AM +1000, Duraid Madina wrote: > Itanium 2 isn't even competitive with other offerings from your own company. Try comparing like products. Single CPU rp2430 is about 1/4 to 1/2 the perf of dual zx6000 depending on what one measures. 
grant _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Maggie.Linux-Consulting.com Fri Apr 11 16:52:47 2003 From: alvin at Maggie.Linux-Consulting.com (Alvin Oga) Date: Fri, 11 Apr 2003 13:52:47 -0700 (PDT) Subject: cooling systems In-Reply-To: <20030411193838.GB1690@greglaptop.internal.keyresearch.com> Message-ID: hi ya On Fri, 11 Apr 2003, Greg Lindahl wrote: > On Fri, Apr 11, 2003 at 11:41:07AM -0500, jbassett wrote: > > > Does anyone know of if it is possible to buy a rackmount cluster with an > > integrated cooling system? > > ASCI Red has a little air conditioner on the top of every rack, but > that's undoubtedly more expensive than using standard commercial > units. putting a real AC is nice but... also using standard household fans to blow air into the racks is goood too ( as long as air can get in and get out of the chassis and up and out the ( other side of the cabinet c ya alvin _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From toon at moene.indiv.nluug.nl Fri Apr 11 15:15:25 2003 From: toon at moene.indiv.nluug.nl (Toon Moene) Date: Fri, 11 Apr 2003 21:15:25 +0200 Subject: IA-64 related question (tangentially) References: Message-ID: <3E97144D.4050201@moene.indiv.nluug.nl> Robert G. Brown wrote: > FSF or Debian don't need Itaniums, really, except for maybe one or two > to ensure that builds work on the architecture. They don't "compute". > I do. To me a cycle is a precious thing as I use so MANY of them over > the years. Given what you told us about your code (how you didn't need Fortran and all those nice multi-rank array loops), I gather that you could get lots of cycles with an Itanium, but no speed. Perhaps David M-T can explain this better than me ... -- Toon Moene - mailto:toon at moene.indiv.nluug.nl - phoneto: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html GNU Fortran 95: http://gcc-g95.sourceforge.net/ (under construction) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From davidm at napali.hpl.hp.com Fri Apr 11 17:35:07 2003 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Fri, 11 Apr 2003 14:35:07 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <20030411212516.GD4306@cup.hp.com> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> <16023.624.775864.544742@napali.hpl.hp.com> <3E97263E.5010605@octopus.com.au> <20030411204217.GA4306@cup.hp.com> <3E972BA5.4010603@octopus.com.au> <20030411212516.GD4306@cup.hp.com> Message-ID: <16023.13579.676695.490297@napali.hpl.hp.com> >>>>> On Fri, 11 Apr 2003 14:25:16 -0700, Grant Grundler said: Grant> Anyway, my point still stands, comparing a "server" (rackable, remote console) Grant> with a "workstation" (3D gfx, sound, DVD-ROM) has alot of variables and Grant> different folks value these things differently. But even here, I would Grant> guess the difference in CPU perf alone is 2x or more by most measures. 
Grant> Yes, I know the price is 3x but other things make zx2000 attract to Grant> a different set of customers (like linux support). Also, don't forget memory bandwidth. The zx2000 is a very nice workstation. Granted, it's not $1k, but it doesn't perform like a Multia either! And yes, it's quiet, too (I use one as my main workstation these days... ;-) As for what the future holds, I guess we'll just have to wait and see. Remember though: just a year ago, the cheapest ia64 workstation you could get was priced at $7k+. This year, you can get a zx2000 for $3k+, so, judging from where I sit, prices certainly are coming down. --david _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From davidm at napali.hpl.hp.com Fri Apr 11 19:05:46 2003 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Fri, 11 Apr 2003 16:05:46 -0700 Subject: [Linux-ia64] Re: Itanium gets supercomputing software In-Reply-To: <3E9747C0.5080603@octopus.com.au> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> <3E972A5F.6000807@octopus.com.au> <16023.15589.825794.344192@napali.hpl.hp.com> <3E9747C0.5080603@octopus.com.au> Message-ID: <16023.19018.83511.297444@napali.hpl.hp.com> >>>>> On Sat, 12 Apr 2003 08:54:56 +1000, Duraid Madina said: Duraid> That's right - _you_ use real hardware because you actually have it. Duraid> Everyone else (figuratively speaking, though it's not far off the mark) Duraid> has no choice _but_ to use Ski. That sucks, regardless of whether or not Duraid> Ski is accurate, fast, or easy to use. Duraid> Anyway, I don't think there's much more that can be said. As Matt Duraid> indicated, we must pray for Deerfield, so I will continue to align my Duraid> holy carpet of hope to Fort Collins/Portland/Carly's hotel bedroom and Duraid> pray for reasonably priced IA64 hardware. Anyone can get access to ia64 hardware at: http://testdrive.hp.com/ They're shared machines, so kernel development is out, but for user-level development, they are very handy. --david _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Fri Apr 11 18:54:56 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Sat, 12 Apr 2003 08:54:56 +1000 Subject: [Linux-ia64] Re: Itanium gets supercomputing software In-Reply-To: <16023.15589.825794.344192@napali.hpl.hp.com> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> <3E972A5F.6000807@octopus.com.au> <16023.15589.825794.344192@napali.hpl.hp.com> Message-ID: <3E9747C0.5080603@octopus.com.au> David Mosberger wrote: >>>>>>On Sat, 12 Apr 2003 06:49:35 +1000, Duraid Madina said: > > Duraid> My point wasn't that software simulators are useless, but > Duraid> that software simulators _should_ be useless **4 years** > Duraid> (!!) after the public availability of hardware. > > Then how do you explain the popularity of user-mode linux on x86? If user-mode linux is "popular", then linux is "f@#%g buggy s#!t". 
I mean really, when your attitude to software development is: hey somethings wrong my swap is full what gives???? stop running 91589 copies of XMMS 8) shut up riel god i hate you ** Riel is banned from linux-kernel YO CHECK OUT THE NEW VM SYSTEM I WROTE THIS MORNING^H^H^H^H^H^H^HWEEK!! ITLL FIX YOUR PROBLEMS!!!! k i know 2.4 is supposed to be a "stable" kernel but god i hate that riel dude!! :| welp.. out with the old, in with the new!!!!! ** Linus integrates new VM THANKS D00D no probs m8 ..then yes, having UML as a sandbox can help. The UML guys see it differently though. According to them: "It doesn't need to be good for anything. It's fun!" Maybe Ski can embrace this spirit also. ;) > The reason I continue to use Ski is because it's one of the very few > simulators out there that are (a) architecturally extremely accurate, > (b) fast, and (c) very easy to setup & use. Ski is an asset for ia64 > linux, not a weakness. > > (And no, just because we have Ski doesn't mean we don't use real > hardware. Nothing could be further from the truth.) That's right - _you_ use real hardware because you actually have it. Everyone else (figuratively speaking, though it's not far off the mark) has no choice _but_ to use Ski. That sucks, regardless of whether or not Ski is accurate, fast, or easy to use. Anyway, I don't think there's much more that can be said. As Matt indicated, we must pray for Deerfield, so I will continue to align my holy carpet of hope to Fort Collins/Portland/Carly's hotel bedroom and pray for reasonably priced IA64 hardware. Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Fri Apr 11 16:49:35 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Sat, 12 Apr 2003 06:49:35 +1000 Subject: [Linux-ia64] Re: Itanium gets supercomputing software In-Reply-To: References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> Message-ID: <3E972A5F.6000807@octopus.com.au> I guess I was being a bit subtle. I'm well aware there are things you can do with a simulator that you can't do with hardware. Like test your code against what's supposed to happen, not what actually happens. ;) My point wasn't that software simulators are useless, but that software simulators _should_ be useless **4 years** (!!) after the public availability of hardware. When I said: > I put it to you that software is easier to develop on hardware. I meant that at this late stage, one would expect that people would be writing software, on their hardware. And not a whole lot else, all things considered. Do you see x86 linux people using simulators? Once in a blue moon, perhaps. Does anyone doubt that the x86-64 port will mature a heck of a lot faster than linux-ia64 has? One doesn't need to think for very long to realise why this might be. Don't get me wrong, I think Linus was being a complete idiot for his comments against IA64 and for x86-64, but insofar as keeping hardware pricing so high that Joe K. Hacker can't even dream of affording it is "good business" on Intel/HP's part, it's an even better way of keeping your kernel untested. Duraid David K?gedal wrote: > Exactly. There are a lot of things that you can do with a simulator > that you can't do with hardware. 
Developing software before hardware > is available is just one of them. (plug mode on) That's why we sell > simulators for most major current CPU architectures. Including IA64. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bropers at lsu.edu Fri Apr 11 17:52:57 2003 From: bropers at lsu.edu (Brian D. Ropers-Huilman) Date: Fri, 11 Apr 2003 16:52:57 -0500 (CDT) Subject: cooling systems In-Reply-To: References: Message-ID: On Fri, 11 Apr 2003, Joel Jaeggli wrote: > if you had enough space or a convenient lake you could also use a heat-pump. One of the "old-timers" here tells a story of some old system, possibly a VAXen, that was water cooled. The water came from the deep nearby lake. There was apparantly an intricate screening system to prevent muck, plants, and the like from getting into the system. Unfortunately, ... one day the temperature suddenly shot through the roof and the system crashed (there may have also been some form of physical damage). The cause, as you may already have guessed: a fish somehow made it into the system and was literally gumming up the pumps. :( Possibly geek-urban legend, but the source is reliable. -- Brian D. Ropers-Huilman (225) 578-0461 (V) Systems Administrator AIX (225) 578-6400 (F) Office of Computing Services GNU Linux brian at ropers-huilman.net High Performance Computing .^. http://www.ropers-huilman.net/ Fred Frey Building, Rm. 201, E-1Q /V\ \o/ Louisiana State University (/ \) -- __o / | Baton Rouge, LA 70803-1900 ( ) --- `\<, / `\\, ^^-^^ O/ O / O/ O _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From davidm at napali.hpl.hp.com Fri Apr 11 18:08:37 2003 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Fri, 11 Apr 2003 15:08:37 -0700 Subject: [Linux-ia64] Re: Itanium gets supercomputing software In-Reply-To: <3E972A5F.6000807@octopus.com.au> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> <3E972A5F.6000807@octopus.com.au> Message-ID: <16023.15589.825794.344192@napali.hpl.hp.com> >>>>> On Sat, 12 Apr 2003 06:49:35 +1000, Duraid Madina said: Duraid> My point wasn't that software simulators are useless, but Duraid> that software simulators _should_ be useless **4 years** Duraid> (!!) after the public availability of hardware. Then how do you explain the popularity of user-mode linux on x86? The reason I continue to use Ski is because it's one of the very few simulators out there that are (a) architecturally extremely accurate, (b) fast, and (c) very easy to setup & use. Ski is an asset for ia64 linux, not a weakness. (And no, just because we have Ski doesn't mean we don't use real hardware. Nothing could be further from the truth.) 
--david _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kevin.vanmaren at unisys.com Fri Apr 11 18:36:13 2003 From: kevin.vanmaren at unisys.com (Van Maren, Kevin) Date: Fri, 11 Apr 2003 17:36:13 -0500 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <3FAD1088D4556046AEC48D80B47B478C0101F734@usslc-exch-4.slc.unisys.com> > I don't know exactly what price/configuration Power4 machines start. > Perhaps one of the IBMers on this list could chime in? > > --david Okay, my turn to plug: Itanium 2-based systems _are_ very competitive with RISC machines, even at the mid-range and high end. Unisys is currently selling 4 to 16-processor Itanium 2 machines. They are very competitively priced against mid-sized RISC machines, although pricing information is not available on the web. SCO's UnitedLinux will be available as soon as SCO ships. http://www.unisys.com/products/es7000__servers/hardware/aries__130.htm For a quote in North America, you can contact Rob Luke, rob.luke at unisys.com (801) 594-5088. I can get you contact info for other parts of the world. Kevin Van Maren Unisys _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Fri Apr 11 16:55:01 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Sat, 12 Apr 2003 06:55:01 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <20030411204217.GA4306@cup.hp.com> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> <16023.624.775864.544742@napali.hpl.hp.com> <3E97263E.5010605@octopus.com.au> <20030411204217.GA4306@cup.hp.com> Message-ID: <3E972BA5.4010603@octopus.com.au> Grant Grundler wrote: > Try comparing like products. > > Single CPU rp2430 is about 1/4 to 1/2 the perf of dual zx6000 depending > on what one measures. Isn't a single CPU rp2430 somewhere between 1/8 and 1/4 the price of a dual zx6000? Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From iod00d at hp.com Fri Apr 11 17:25:16 2003 From: iod00d at hp.com (Grant Grundler) Date: Fri, 11 Apr 2003 14:25:16 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E972BA5.4010603@octopus.com.au> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> <16023.624.775864.544742@napali.hpl.hp.com> <3E97263E.5010605@octopus.com.au> <20030411204217.GA4306@cup.hp.com> <3E972BA5.4010603@octopus.com.au> Message-ID: <20030411212516.GD4306@cup.hp.com> On Sat, Apr 12, 2003 at 06:55:01AM +1000, Duraid Madina wrote: > Isn't a single CPU rp2430 somewhere between 1/8 and 1/4 the price of a > dual zx6000? Dunno. is it? Try comparing dual rp2470 and dual rx2600 with similar products from other vendors. David wrote: | - hp workstation zx2000 (Linux software enablement kit) | - Intel? Itanium 2 900MHz Processor with 1.5MB on-chip L3 cache Sorry - my bad. I misread that as "2x 900 Mhz". 
Anyway, my point still stands, comparing a "server" (rackable, remote console) with a "workstation" (3D gfx, sound, DVD-ROM) has alot of variables and different folks value these things differently. But even here, I would guess the difference in CPU perf alone is 2x or more by most measures. Yes, I know the price is 3x but other things make zx2000 attract to a different set of customers (like linux support). grant _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Sat Apr 12 01:19:28 2003 From: astroguy at bellsouth.net (astroguy at bellsouth.net) Date: Sat, 12 Apr 2003 1:19:28 -0400 Subject: cooling systems (jbassett) Message-ID: <20030412051928.DXW28599.imf60bis.bellsouth.net@mail.bellsouth.net> Hi List, (I posted this to the group a bit earlier and it did not get posted but I see from Mike's post that this is not as strange as it first may seem;) I know this might sound a little strange... but strange is my stock and trade I've been working on cooling solutions ever since I had my first 64/128... I have submerged, entire boards into a liquid silicon solution... playing around with liquid coolant is sometimes messy but to my surprise it worked! Perhaps not very practical for rack nodes and I'm not really sure of how this might work over the long haul but for a day it worked find and did demonstrate at least in principle an approach... Now, something off the shelf and not requiring large vats of slick and messy goo, maybe is to build a rack of nodes built inside a self contained refrigeration units like the ones we all have seen at some of the mom and pop convenient stores... the boxed ones with the glass doors... make them air tight and then you are only cooling your rack. Like I said, a little strange twist on an"inside" the box approach;P C.Clary Spartan Sys.analyst PO 1515 Spartanburg, SC 29304-0243 Fax# 801-858-2722 > > From: Mike Sullivan > Date: 2003/04/11 Fri PM 02:47:26 EDT > To: beowulf at beowulf.org > Subject: Re:cooling systems (jbassett) > > We have designed a custom cabinet for a client that must run in a room > without AC. The system has > an integral centrifugal blower that can have the air ducted to an > outside sink. ( or to double as a furnace > in the winter). The Motherboards mount to trays inside the cabinet so we > do not use standard 1U cases. > > -- > Mike Sullivan Director Performance Computing > @lliance Technologies, Voice: (416) 385-3255 x 228, > 18 Wynford Dr, Suite 407 Fax: (416) 385-1774 > Toronto, ON, Canada, M3C-3S2 Toll Free:1-877-216-3199 > http://www.alltec.com > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bclem at rice.edu Sat Apr 12 12:06:47 2003 From: bclem at rice.edu (Brent M. Clements) Date: Sat, 12 Apr 2003 11:06:47 -0500 (CDT) Subject: HPL Benchmark on Itanium 2 box In-Reply-To: <87adevu343.fsf@bix.grotte> Message-ID: Hi Guys, I'm trying to compile the hpl benchmark on a HP zx6000 box. I have the hp math libraries and the intel 7.0 compilers. 
Has anyone ever tried compiling the hpl benchmark using this compile configuration? If so could they send me their Makefile The reason I'm asking is because I keep on getting the following error HPL_pdtest.o: In function `HPL_pdtest': HPL_pdtest.o(.text+0x1a82): undefined reference to `cblas_dgemv' HPL_pdtest.o(.text+0x1ad2): undefined reference to `cblas_dgemv' Anyone have a clue? Thanks, Brentr Clements _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From david at virtutech.se Sat Apr 12 11:50:20 2003 From: david at virtutech.se (David =?iso-8859-1?q?K=E5gedal?=) Date: Sat, 12 Apr 2003 17:50:20 +0200 Subject: Itanium gets supercomputing software In-Reply-To: <3E972A5F.6000807@octopus.com.au> (Duraid Madina's message of "Sat, 12 Apr 2003 06:49:35 +1000") References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> <3E972A5F.6000807@octopus.com.au> Message-ID: <87adevu343.fsf@bix.grotte> Duraid Madina writes: > I guess I was being a bit subtle. > > I'm well aware there are things you can do with a simulator that you > can't do with hardware. Like test your code against what's supposed to > happen, not what actually happens. ;) > > My point wasn't that software simulators are useless, but that > software simulators _should_ be useless **4 years** (!!) after the > public availability of hardware. Why is that? There are numerous reasons for using simulators to develop software, especially low-level software (OS, drivers, firmware etc.) You get things like full system visibility, non-intrusive debugging, deterministic repeatability, fault injection, and more. -- David K?gedal, Virtutech _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Sat Apr 12 11:54:23 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Sat, 12 Apr 2003 08:54:23 -0700 Subject: HPL Benchmark on Itanium 2 box In-Reply-To: References: <87adevu343.fsf@bix.grotte> Message-ID: <20030412155423.GA2884@greglaptop.attbi.com> On Sat, Apr 12, 2003 at 11:06:47AM -0500, Brent M. Clements wrote: > HPL_pdtest.o: In function `HPL_pdtest': > HPL_pdtest.o(.text+0x1a82): undefined reference to `cblas_dgemv' > HPL_pdtest.o(.text+0x1ad2): undefined reference to `cblas_dgemv' You're missing some BLAS (linear algebra) subroutines. I'm pretty sure that the HPL documentation explains this. For most machines, ATLAS provides a pretty good BLAS library. greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Sat Apr 12 12:22:34 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Sat, 12 Apr 2003 09:22:34 -0700 Subject: Can't run NAS Benchmark In-Reply-To: <002f01c30075$71b38570$1a3afea9@conan> References: <002f01c30075$71b38570$1a3afea9@conan> Message-ID: <20030412162234.GB2884@greglaptop.attbi.com> On Sat, Apr 12, 2003 at 04:57:56AM +0700, Vu Duc Lam wrote: > I have some problem when trying to run 5 Kernel Benchmarks with class B and > C. 
From lindahl at keyresearch.com Sat Apr 12 12:22:34 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Sat, 12 Apr 2003 09:22:34 -0700 Subject: Can't run NAS Benchmark In-Reply-To: <002f01c30075$71b38570$1a3afea9@conan> References: <002f01c30075$71b38570$1a3afea9@conan> Message-ID: <20030412162234.GB2884@greglaptop.attbi.com> On Sat, Apr 12, 2003 at 04:57:56AM +0700, Vu Duc Lam wrote: > I have some problem when trying to run 5 Kernel Benchmarks with class B and > C. These benchmarks take a large amount of memory -- do you have enough? greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mizhou at coe.neu.edu Sat Apr 12 15:34:38 2003 From: mizhou at coe.neu.edu (Mi Zhou) Date: Sat, 12 Apr 2003 14:34:38 -0500 Subject: CPU time accounting Message-ID: <008b01c3012a$8edc6ad0$0402a8c0@HuaMao> I am new to cluster management. I want to get some statistics on the usage of the cluster. Is there some utility that can summarize CPU usage of each user/group? Thanks, Mi _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at paralline.com Mon Apr 14 06:57:46 2003 From: beowulf at paralline.com (Pierre BRUA) Date: Mon, 14 Apr 2003 12:57:46 +0200 Subject: renting time on a cluster In-Reply-To: <20030407125604.GR2067@leitl.org> References: <20030407125604.GR2067@leitl.org> Message-ID: <3E9A942A.1030005@paralline.com> Eugen Leitl wrote: > Can you think of places where one can rent nontrivial > amount of crunch for money? Paralline can rent a nontrivial amount of crunch for money. The question is: how much money is your friend ready to spend, how much crunch is he looking for (requested node config would be nice), and for how much time. Pierre -- PARALLINE /// Clusters, Linux, Java /// 71,av des Vosges Phone:+33 388 141 740 F-67000 STRASBOURG Fax:+33 388 141 741 http://www.paralline.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From robl at mcs.anl.gov Sun Apr 13 13:57:59 2003 From: robl at mcs.anl.gov (Robert Latham) Date: Sun, 13 Apr 2003 12:57:59 -0500 Subject: Mac OS X or Linux? In-Reply-To: References: <20030409061920.GA32255@mcs.anl.gov> Message-ID: <20030413175756.GA19214@mcs.anl.gov> On Wed, Apr 09, 2003 at 05:04:52PM -0400, Mark Hahn wrote: > > http://terizla.org/~robl/pbook/benchmarks/lmbench-linux_vs_osx.1 > does OS X have page coloring inherited from *BSD? perhaps that > explains the only place it comes out ahead (memory bandwidth/latency). linux loses out on the Libc(bcopy) score because gnu libc doesn't have ppc-optimised string and memory operations. This really surprised me, but hopefully someone (maybe me, if i can learn ppc assembly fast enough :> ) will implement them. The other memory bandwidth and latency numbers are too close to call, unless i'm missing something. ==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From young_yuen at yahoo.com Sun Apr 13 12:36:41 2003 From: young_yuen at yahoo.com (Young Yuen) Date: Sun, 13 Apr 2003 09:36:41 -0700 (PDT) Subject: problem with ANA-6911A/TX under kernel 2.4.18 Message-ID: <20030413163641.31713.qmail@web41303.mail.yahoo.com> Hi, The Tulip driver doesn't seem to detect the RJ45 port. My kernel ver is 2.4.18 and Tulip driver ver is 0.9.15. Linux Tulip driver version 0.9.15-pre11 (May 11, 2002) tulip0: EEPROM default media type Autosense.
tulip0: Index #0 - Media MII (#11) described by a 21142 MII PHY (3) block. tulip0: Index #1 - Media 10base2 (#1) described by a 21142 Serial PHY (2) block. tulip0: ***WARNING***: No MII transceiver found! divert: allocating divert_blk for eth0 eth0: Digital DS21143 Tulip rev 33 at 0xc6855000, 00:00:D1:00:0B:4B, IRQ 11. Sometimes after a reboot the warning message is gone. tulip0: MII transceiver #1 config 3100 status 7809 advertising 0101. divert: allocating divert_blk for eth0 eth0: Digital DS21143 Tulip rev 33 at 0xc6855000, 00:00:D1:00:0B:4B, IRQ 11. But in either case, it fails to ping any nodes on the network besides its own. The ANA-6911A/TX is a 100BaseT/10Base2 combo card; the RJ45 port is connected to the LAN. Windows, dual-booted on the same machine, works fine, which shows no problem with the network configuration or hardware. Can you please kindly advise. Thx & Rgds, Young __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Sat Apr 12 19:16:03 2003 From: landman at scalableinformatics.com (Joseph Landman) Date: 12 Apr 2003 19:16:03 -0400 Subject: HPL Benchmark on Itanium 2 box In-Reply-To: References: Message-ID: <1050189362.20946.89.camel@protein.scalableinformatics.com> Hi Brent: Looks like either a missing library, or a library order issue. The HPL_pdtest.o is trying to find the cblas_dgemv function. This function is likely supplied in the optimized Intel libs (though I don't know which library, but it would be one supplying BLAS and LAPACK routines optimized for the platform). You may have a -L/path in front of the correct -lcblas (or similar library name). If you can find out which library is supposed to provide that function, try moving it to a different position in the link line. Joe On Sat, 2003-04-12 at 12:06, Brent M. Clements wrote: > Hi Guys, > I'm trying to compile the hpl benchmark on a HP zx6000 box. > > I have the hp math libraries and the intel 7.0 compilers. > > Has anyone ever tried compiling the hpl benchmark using this compile > configuration? If so could they send me their Makefile > > The reason I'm asking is because I keep on getting the following error > > HPL_pdtest.o: In function `HPL_pdtest': > HPL_pdtest.o(.text+0x1a82): undefined reference to `cblas_dgemv' > HPL_pdtest.o(.text+0x1ad2): undefined reference to `cblas_dgemv' > > Anyone have a clue? > > Thanks, > > > Brent Clements > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Joseph Landman, Ph.D Scalable Informatics LLC email: landman at scalableinformatics.com web: http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Mon Apr 14 09:29:55 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 14 Apr 2003 09:29:55 -0400 (EDT) Subject: CPU time accounting In-Reply-To: <008b01c3012a$8edc6ad0$0402a8c0@HuaMao> Message-ID: On Sat, 12 Apr 2003, Mi Zhou wrote: > I am new to cluster management.
I wan to get some statistics on the usage of > the cluster. Is there some utility that can summarize CPU usage of each > user/group? What an interesting question! The "psacct" package in Red Hat et. al. linuces contains the BSD system accounting package (accton, sa, ac, etc). Install and read the man pages to see what you get, on a node by node basis. I have no idea if any of the other cluster monitor tools for general workstation clusters interfaces with psacct -- xmlsysd (my own) does not, although it wouldn't be terribly difficult to hack it so that it did. Alternatively, and perhaps more intelligently (since this isn't the kind of question one generally cares to have answered in a 5-10 second polling loop as the changes are usually fairly predictable given knowledge of who's on a cluster at any given time) it would be fairly straightforward to write a collection script in e.g. perl that polled each node on demand and cumulated results across a cluster. That would be relatively resource expensive -- order of a second per remote ssh call to get the cumulated results -- but presumably one would only run it once a day or thereabouts to cumulate the usage du jour. Note that accounting isn't usually turned on by default because it is "expensive" in its own right -- the system creates an accounting file that gets a record for each process run, and adds a write to this file to the termination sequence for each job as it finishes to preserve its cumulated stat data. On a typical normal node this won't be a horrible problem, but on a system running lots of little commands or with a broken looping command stream it can be. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From anand at novaglobal.com.sg Sun Apr 13 22:44:57 2003 From: anand at novaglobal.com.sg (Anand Vaidya) Date: Mon, 14 Apr 2003 10:44:57 +0800 Subject: SMC8624T vs DLINK DGC-1024T / Jumbo Frames ? In-Reply-To: <1050057248.3e969a20125dd@intranet.anorg.chemie.tu-muenchen.de> References: <5.1.1.6.0.20030401141622.033c1e20@crux.stmarys.ca> <1050057248.3e969a20125dd@intranet.anorg.chemie.tu-muenchen.de> Message-ID: <200304141044.59457.anand@novaglobal.com.sg> On Friday 11 April 2003 06:34 pm, rochus.schmid at ch.tum.de wrote: > dear beowulfers, > > we are in a similar situation as dave: we get an 8nodes dual-xeon cluster > (with tyan e7501 mobo) with intel gige on board. and now the "switch issue" > comes up. my vendor also suggested the DLINK, whereas i found the > discussion on the (more expensive managed) SMC on this list supporting > jumbo frames. the issue was whether or not any of the cheaper (unmanaged) > switches support jumbo frames. i couldnt figure out if this is resolved, > yet. it sounded like they might, but since they are unmanaged the problem > is to switch it on or off. is that right? > > i also found this document: > http://www.scl.ameslab.gov/Publications/ HalsteadPubs/usenix_halstead.pdf ----------------------------------------- The above URL seems to be outdated. 
Please use http://www.scl.ameslab.gov/Publications/Halstead/usenix_halstead.pdf ----------------------------------------- > it says that the effect on bandwith with jumbo frames is only seen for > tcp/ip commun (netpipe) but is completely lost using MPI. since my code is > MPI based it wouldn't matter to have jumbo frames and i could go with the > cheaper DLINK. is this info right? or outdated? missunderstood? > > any hints highly appreciated. > > greetings > > rochus > > Quoting Dave Lane : > > Can anyone comment on the strengths/weaknesses of these two 24-port > > gigabit > > switches. We're going to be building a 16 node dual-Xeon cluster this > > spring and were planning on the SMC switch (which has received good > > review > > here before), but a vendor pointed out the DLINK switch as a less > > expensive > > alternative. > > > > ... Dave > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From chris_oubre at hotmail.com Mon Apr 14 10:46:58 2003 From: chris_oubre at hotmail.com (Chris Oubre) Date: Mon, 14 Apr 2003 09:46:58 -0500 Subject: HPL Benchmark on Itanium 2 box In-Reply-To: <200304121901.h3CJ19s24252@NewBlue.Scyld.com> Message-ID: <001501c30294$b3608b50$25462a80@rice.edu> Have you tried using the Intel MKL (Math Kernel Library)? http://www.intel.com/software/products/mkl/mkl52/ This is what we use. We have found this library very fast! **************************************************** Christopher D. Oubre * email: chris_oubre at hotmail.com * research: http://cmt.rice.edu/~coubre * Web: http://www.angelfire.com/la2/oubre * Hangout: http://pub44.ezboard.com/bsouthterrebonne * Phone:(713)348-3541 Fax: (713)348-4150 * Rice University * Department of Physics, M.S. 61 * 6100 Main St. ^-^ * Houston, Tx 77251-1892, USA (O O) * -= Phlax=- ( v ) * ************************************m*m************* Message: 2 Date: Sat, 12 Apr 2003 11:06:47 -0500 (CDT) From: "Brent M. Clements" To: , Subject: HPL Benchmark on Itanium 2 box Hi Guys, I'm trying to compile the hpl benchmark on a HP zx6000 box. I have the hp math libraries and the intel 7.0 compilers. Has anyone ever tried compiling the hpl benchmark using this compile configuration? If so could they send me their Makefile The reason I'm asking is because I keep on getting the following error HPL_pdtest.o: In function `HPL_pdtest': HPL_pdtest.o(.text+0x1a82): undefined reference to `cblas_dgemv' HPL_pdtest.o(.text+0x1ad2): undefined reference to `cblas_dgemv' Anyone have a clue? 
Thanks, Brentr Clements --__--__-- _______________________________________________ Beowulf mailing list Beowulf at beowulf.org http://www.beowulf.org/mailman/listinfo/beowulf End of Beowulf Digest _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at plogic.com Mon Apr 14 13:13:41 2003 From: deadline at plogic.com (Douglas Eadline) Date: Mon, 14 Apr 2003 12:13:41 -0500 (CDT) Subject: Can't run NAS Benchmark In-Reply-To: <002f01c30075$71b38570$1a3afea9@conan> Message-ID: On Sat, 12 Apr 2003, Vu Duc Lam wrote: > Hi, > > To run NAS Benchmark correctly, may be or not to install Scalapack library. > I have some problem when trying to run 5 Kernel Benchmarks with class B and > C. I have installed NAS Benchmark in a cluster System. The system is > collection of Intel processor-based workstations and server interconnected > by TCP/IP network. Each node is Intel with Pentium 800 MHz processor and > 256 megabytes of memory, 2GB of Hark Disk. You may wish to look at the BPS (Beowulf Performance Suite): http://www.cluster-rant.com/article.pl?sid=03/03/17/1838236 and: http://www.hpc-design.com/reports/bps1/index.html BPS has the NAS suite included. It also has a script that allows different compilers (gnu,pgi,intel), numbers of CPUs, test size, and MPI's (mpich,lam,mpipro) It has everything you need to run the tests (and more). Doug > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From JairoArbey at gmx.net Mon Apr 14 11:20:01 2003 From: JairoArbey at gmx.net (Jairo Arbey Rodriguez) Date: Mon, 14 Apr 2003 10:20:01 -0500 Subject: Help f90 - intel Message-ID: <000601c30299$52de7b70$72b0fea9@Q13197.tjdo.com> Hi Friends: I have two PC. One with intel processor and Athlon processor the other one. I got the fortran intel compiler (ifc) and it was installed successful on the first pc. Now I want to install on the second pc (with Athlon processor), but when I issued the command ?./install?, the installer stopped with a message saying: install can't identify your machine type, glibc or kernel. This product is supported for use with the following combinations. Machine Type Kernel glibc 1. IA-32 2.4.7 2.2.4, or IA-32 2.4.18 2.2.5, or 2. Itanium(R)-based system 2.4.3 2.2.3, or Itanium(R)-based system 2.4.9 2.2.4, or Itanium(R)-based system 2.4.18 2.2.4 x. Exit For an unsupported install, select the platform most similar to yours. [H[2JRPM shows no Intel packages as installed. Which of the following would you like to install? 1. Intel(R) Fortran Compiler for 32-bit applications, Version 7.0 Build 20030212Z 2. Linux Application Debugger for 32-bit applications, Version 7.0, Build 20021218 x. 
Exit [H[2JIntel(R) Fortran Compiler for 32-bit applications, Version 7.0 Build 20030212Z ------------------------------------------------------------------------ -------- Please carefully read the following license agreement. Prior to installing the software you will be asked to agree to the terms and conditions of the following license agreement. ------------------------------------------------------------------------ -------- Press Enter to continue. 'accept' to continue, 'reject' to return to the main menu. Where do you want to install to? Specify directory starting with '/'. [/opt/intel] What rpm install options would you like? [-U --replacefiles] ------------------------------------------------------------------------ -------- Intel(R) Fortran Compiler for 32-bit applications, Version 7.0 Build 20030212Z Installing... error: failed dependencies: ld-linux.so.2 is needed by intel-ifc7-7.0-87 libc.so.6 is needed by intel-ifc7-7.0-87 libm.so.6 is needed by intel-ifc7-7.0-87 libpthread.so.0 is needed by intel-ifc7-7.0-87 libc.so.6(GLIBC_2.0) is needed by intel-ifc7-7.0-87 libc.so.6(GLIBC_2.1) is needed by intel-ifc7-7.0-87 libc.so.6(GLIBC_2.1.3) is needed by intel-ifc7-7.0-87 libc.so.6(GLIBC_2.2) is needed by intel-ifc7-7.0-87 libm.so.6(GLIBC_2.0) is needed by intel-ifc7-7.0-87 libpthread.so.0(GLIBC_2.0) is needed by intel-ifc7-7.0-87 libpthread.so.0(GLIBC_2.1) is needed by intel-ifc7-7.0-87 Installation failed. ------------------------------------------------------------------------ -------- Press Enter to continue. RPM shows no Intel packages as installed. Which of the following would you like to install? 1. Intel(R) Fortran Compiler for 32-bit applications, Version 7.0 Build 20030212Z 2. Linux Application Debugger for 32-bit applications, Version 7.0, Build 20021218 x. Exit Exiting... I am sure that the libraries mentioned above are in /lib/ and /usr/lib. I want to question you: What do I do? Thanks in advance. Jairo Arbey Rodriguez M. Grupo de Fisica de la Materia Condensada Dept. de F?sica, Universidad Nacional de Colombia, Colombia FAX: (571) 244 9122 & (571) 316 5135 TEL: (571) 316 5000 Ext. 13047 / 13081 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 1163 bytes Desc: not available URL: From chettri at gst.com Mon Apr 14 14:59:42 2003 From: chettri at gst.com (chettri at gst.com) Date: Mon, 14 Apr 2003 11:59:42 -0700 Subject: beowulf in space Message-ID: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Has anybody considered the theoretical aspects of placing beowulfs on a cluster of satellites? I understand that communication will be slower AND unreliable, and it would restrict the set of problems that could be solved. I'm looking for papers/tech reps etc on the subject. Regards, Samir Chettro _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Mon Apr 14 13:53:07 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 14 Apr 2003 13:53:07 -0400 (EDT) Subject: beowulf in space In-Reply-To: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Message-ID: On Mon, 14 Apr 2003 chettri at gst.com wrote: > Has anybody considered the theoretical aspects of placing beowulfs on a > cluster of satellites? 
I understand that communication will be slower AND > unreliable, > and it would restrict the set of problems that could be solved. I'm looking > for papers/tech reps etc on the subject. Well, let's see. Beowulfs for what purpose? As far as building general purpose computational supercomputing centers in space, that is such a phenomenally silly idea that anyone that DID have it would probably shake their head after a minute or two of reflection and resolve never to use those particular drugs again. As you say, problems include: a) expense b) communications latency (bandwidth actually can be as big as you like or are likely to ever need, since you ARE a satellite, after all...:-) c) access/maintenance difficulties d) expense e) cooling (think of the cluster as being located a really big vacuum flask) f) onsite staff (astrobots? astroadministrators?) g) radiation and shielding h) energy supply i) hard to get 24 hour turnaround on spare parts j) did I mention expense? Even if you think about some sort of space station as being just another cluster room and the cluster nodes being just off-the-shelf units from Dell, you're looking at one hell of a delivery charge... Now, with all of that said, it may be perfectly reasonable and sane to send small clusters aloft -- I suspect that we already do, every time we launch a shuttle or send experiments up. Many modern jets are architected like a "cluster" in many ways, with sensors and processing units all over the place, interconnected by a network of sorts. A compute cluster has a lot of desirable features -- an extension of the available total computational power that can be brought to bear on certain problems, for example, in addition to some highly desirable redundancy (if a node dies out of five or six you've got, you can proceed to function a bit slower -- if a system dies and is all you've got, you're in a lot more trouble). The "problems" that would be solved are thus restricted by common sense -- dedicated tasks in many cases to accomplish some specific purpose, or MAYBE a very small general purpose cluster on something like a space station doing science that happened to need some local processing power. In most cases, though, it would still make more sense to locate the processing power on the ground and use a dedicated comm channel to the ground to access it. Something out in space has an excellent vantage point to establish high bandwidth (high latency) with any number of ground stations. rgb > > Regards, > > Samir Chettro > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Mon Apr 14 13:47:55 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Mon, 14 Apr 2003 10:47:55 -0700 (PDT) Subject: beowulf in space In-Reply-To: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Message-ID: On Mon, 14 Apr 2003 chettri at gst.com wrote: > Has anybody considered the theoretical aspects of placing beowulfs on a > cluster of satellites? 
> I understand that communication will be slower AND > unreliable, Communication to satellites needs to be neither slow nor unreliable; it is generally fairly high latency... It can be quite expensive. There are clusters of computers in space. They generally aren't what you would consider heavy computation platforms... The biggest issues with computer resources in space are:
mass - a large satellite such as the hughes galaxy 4r bird is around 2500kg for everything; that's half the mass of the UPS that backs up our racks. every gram you send up costs you.
power - solar power and long life cadmium batteries mean your whole platform has to run on pretty thin resources. again using something like galaxy 4r, which is a very powerful satellite, 8800 watts is what you get max to power everything... That's with a 26 meter span of gallium arsenide solar cells. most of the power is going to communications equipment; in the case of galaxy 4r that would be 24 c band at 40w each and 24 ku at 108w each.
radiation hardening - without 50 miles of atmosphere overhead we're kinda close to the sun, and gamma ray bursts from other parts of the galaxy are kinda hard on the equipment.
thermal management - air cooling doesn't work given no atmosphere... even on something like the iss hot air doesn't rise in microgravity, you have to resort to fairly extreme measures to deal with the thermal management issues. if you see the laptops on the shuttle they're mostly pentium class thinkpads with some fairly serious mods. There's mission specific equipment as well, but you won't find a rack of dual xeons floating around due to thermal issues alone (disregarding mass or power requirements).
expected service life - if you plan on going to the expense of putting it in geostationary orbit you're probably planning on keeping it up there for a minimum of 10-15 years, so it has to still work after a decade in a hostile environment, and upgrades and service calls aren't in the plan.
It's pretty easy to spend a billion dollars by the time everything is said and done putting up a large satellite. you generally try to loft only what's critical to the mission of the satellite. > and it would restrict the set of problems that could be solved. I'm looking > for papers/tech reps etc on the subject. > > Regards, > > Samir Chettro > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 14 14:46:05 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 14 Apr 2003 14:46:05 -0400 (EDT) Subject: beowulf in space In-Reply-To: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Message-ID: > Has anybody considered the theoretical aspects of placing beowulfs on a > cluster of satellites?
if this interests you, I highly recommend reading Vernor Vinge's recent books (A Deepness in the Sky, for instance). Robert Forward has some topical ones, too. they are science fiction, though... > I understand that communication will be slower AND > unreliable, well, to the extent that such a cluster would be spread out, I can understand the "slower" part. though c in vacuum is higher than c in fiber or TP. I don't see the "unreliable" part. are you presuming some kind of traditional RF modulation? using free-space optics seems like the more obvious way to network satellites, and I don't see why that would be flakey. > and it would restrict the set of problems that could be solved. I'm looking > for papers/tech reps etc on the subject. I doubt they exist, simply because there's no practical reason, given the huge cost and unclear advantage. I can imagine some really great advertisements for colo though ;) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From derek.richardson at pgs.com Mon Apr 14 13:46:05 2003 From: derek.richardson at pgs.com (Derek Richardson) Date: Mon, 14 Apr 2003 12:46:05 -0500 Subject: Machine Check Exception Message-ID: <3E9AF3DD.1070002@pgs.com> All, Does anyone know if a power supply can cause a machine check exception? ( I would think that the VRM would stop it from affecting the processor, but what about the rest of the system - seems odd that the machine wouldn't fail in other ways... ) I have a cluster node that keeps crashing w/ one, and I've looked it up in the Intel ia32 manual, and it's not specific to the processor or RAM ( which I have already changed out ), so I've just been swapping parts out ( so far I've swapped CPU0, where the Exception took place, all the RAM, all the fibre, network, and RSA cards, the motherboard, etc. - basically the only things that are the same as the original node are the chassis, power supply, scsi disk ( but not controller ), CPU1, and CPU1's VRM ). I just changed out the VRM for CPU0 and am putting the node back into use once its fibre disk fscks: this might fix the problem. Does anyone have any thoughts on this? I'd hate to throw the entire scenario out and just replace the entire node ( since I'll eventually have to find and replace the faulty hardware and I've already done so much, I'd like to finish it ). Thanks, Derek R. -- Linux Administrator derek.richardson at pgs.com derek.richardson at ieee.org Office 713-781-4000 Cell 713-817-1197 bureaucracy, n: A method for transforming energy into solid waste. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Mon Apr 14 15:40:57 2003 From: astroguy at bellsouth.net (astroguy at bellsouth.net) Date: Mon, 14 Apr 2003 15:40:57 -0400 Subject: beowulf in space Message-ID: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> Hi list, Ok, you computer geniuses and rocket scientists all... I tend to agree with Dr. Brown's position but for the sake of argument... Let's think along the lines of where a computational cluster might find some in-space application. Say for example we were to launch a probe into the Sun's outer corona...
let's assume also that we have some shielding device that would sustain the craft in the 10 million degrees C or so that such a craft is sure to encounter... Even with our best science fiction such a craft could only endure a few precious moments in such a space environment, so we would have to use the advantage of speed... Ok, so we use an ion engine to get the craft up to speed... since the sun's corona extends apparently 700,000 km or so into space... the craft would have to get up to a speed of, say, 250,000 mph, which we have yet to achieve but is not impossible... Slingshot around Jupiter and Mars and back to the sun with the ion engine, in a bit of celestial magic provided by our on-ground navigational cluster... certainly we can achieve a very high velocity for our death plunge into the Sun's outer atmosphere... Computational real time observations within those few precious moments before the probe vaporised would certainly be enhanced by an on board beowulf cluster... You asked for speculation as to an application... I think this is perhaps one. Chip > > From: Mark Hahn > Date: 2003/04/14 Mon PM 02:46:05 EDT > To: chettri at gst.com > CC: beowulf at beowulf.org > Subject: Re: beowulf in space > > > Has anybody considered the theoretical aspects of placing beowulfs on a > > cluster of satellites? > > if this interests you, I highly recommend reading Vernor Vinge's > recent books (A Deepness in the Sky, for instance). Robert Forward > has some topical ones, too. they are science fiction, though... > > > I understand that communication will be slower AND > > unreliable, > > well, to the extent that such a cluster would be spread out, > I can understand the "slower" part. though c in vacuum is higher > than c in fiber or TP. > > I don't see the "unreliable" part. are you presuming some kind of > traditional RF modulation? using free-space optics seems like the > more obvious way to network satellites, and I don't see why that would > be flakey. > > > and it would restrict the set of problems that could be solved. I'm looking > > for papers/tech reps etc on the subject. > > I doubt they exist, simply because there's no practical reason, > given the huge cost and unclear advantage. > > I can imagine some really great advertisements for colo though ;) > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gmpc at sanger.ac.uk Mon Apr 14 16:08:32 2003 From: gmpc at sanger.ac.uk (Guy Coates) Date: Mon, 14 Apr 2003 21:08:32 +0100 (BST) Subject: Help f90 - intel (Jairo Arbey Rodriguez) In-Reply-To: <200304141636.h3EGa1s16542@NewBlue.Scyld.com> References: <200304141636.h3EGa1s16542@NewBlue.Scyld.com> Message-ID: Hi, It looks like you may be installing on a system which does not support RPM as its native packaging format. The other gotcha is that v7.0 of the C/Fortran compilers needs glibc <= 2.2.4 and some newer distros ship with later versions. There are workarounds for both problems. You can force an install of the RPMs by specifying the --nodeps option when the install script asks: >What rpm install options would you like?
If your distribution ships with a version of glibc > 2.2.4 then you may need to install the glibc-2.2.4 include files; the compiler cannot parse the include files in newer versions. The easiest way to do this is to grab glibc-2.2.4 from the GNU website, compile (using gcc) and install it in /usr/local/glibc-2.2.4 or /opt/glibc-2.2.4. Just make sure you don't install it on top of your existing glibc in /usr/lib, or anywhere where LD_LIBRARY_PATH or ld.so.conf is going to pick it up, otherwise you will break your system horribly. Once you've installed the glibc headers you need to tell the compiler where to find them. Add -I/usr/local/glibc-2.2.4/include and maybe -restrict to your compiler flags and you should be set. I've used this trick to compile stuff on x86 and ia64 glibc-2.3.x systems without a problem. Cheers, Guy Coates -- Guy Coates, Informatics System Group The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK Tel: +44 (0)1223 834244 ex 7199 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jim_windle at eudoramail.com Mon Apr 14 15:56:48 2003 From: jim_windle at eudoramail.com (Jim Windle) Date: Mon, 14 Apr 2003 15:56:48 -0400 Subject: beowulf in space Message-ID: -- On Mon, 14 Apr 2003 11:59:42 chettri wrote: >Has anybody considered the theoretical aspects of placing beowulfs on a >cluster of satellites? I understand that communication will be slower AND >unreliable, >and it would restrict the set of problems that could be solved. I'm looking >for papers/tech reps etc on the subject. > I am not aware of any published work on placing beowulfs in orbit and as Bob Brown points out the expense and practical difficulties would be immense. The only place I can think of where related issues would be discussed would be in technical papers related to the Iridium satellite network. It has been a few years since I looked at it but if I recall correctly the architecture of their systems was different from all others. In systems like Globalstar the satellites are controlled from the ground and satellites relay to ground stations which in turn relay to other satellites with all routing decisions for network traffic being made on Earth. In Iridium, if I recall correctly, the satellites communicated directly with each other and all routing decisions were made in the satellites themselves. There was no satellite designated as a control node, but each satellite would have some processing power for making routing decisions and each satellite would be in communication with other satellites directly, so in some sense it would be beowulf-like. Whatever technical papers they published when they were looking for funding for the network might address some of the issues you are interested in.
Jim > >Samir Chettro > > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Need a new email address that people can remember Check out the new EudoraMail at http://www.eudoramail.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 14 16:11:52 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 14 Apr 2003 16:11:52 -0400 (EDT) Subject: beowulf in space In-Reply-To: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> Message-ID: > ground navigational cluster... certainly we can achieve a very high > velocity for our death plunge into the Sun's outer > atmosphere... Computational real time observations within those few > precious moments before the probe vaporised would certainly be enhanced by > an on board beowulf cluster... You asked for speculation, as to an > application... I think this is perhaps one. nice plan ;) you have to remember that beowulfery is basically for cheapskates and penny-pinchers. the whole idea is to use hardware that's been made cheap by the commodity PC market, and build something powerful out of it. there's really no special sauce (ie, "grid"), just a bunch of cost-effective hardware. the point is that for space applications, costs are already sky high (heh), so saving a few bucks by running a cluster doesn't make that much sense. the real cost-savings is in reducing mass... I'd also guess that space apps don't need that much compute power. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From shaeffer at neuralscape.com Mon Apr 14 16:32:58 2003 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Mon, 14 Apr 2003 13:32:58 -0700 Subject: beowulf in space In-Reply-To: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> References: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> Message-ID: <20030414203258.GA28080@synapse.neuralscape.com> > ground navigational cluster... certainly we can achieve a very high velocity for our death plunge into the Sun's outer atmosphere... Computational real time observations within those few precious moments before the probe vaporised would certainly be enhanced by an on board beowulf cluster... You asked for speculation, as to an application... I think this is perhaps one. This is getting silly. You still need to transmit the data back to earth. It has already been asserted that it is far more efficient energy wise and cost wise to transmit data than to process it. So you should invest in a system that can transmit all the raw data back to earth. Then you even have the benefit of saving the raw data set for future computations as more is learned... (giggles) Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 
94306 shaeffer at neuralscape.com http://www.neuralscape.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Mon Apr 14 16:18:50 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 15 Apr 2003 06:18:50 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304141425.SAA07170@nocserv.free.net> References: <200304141425.SAA07170@nocserv.free.net> Message-ID: <3E9B17AA.1000806@octopus.com.au> Mikhail Kuzminsky wrote: > Taking into account that Itanium 2 has much more high performance, > the price from HP looks reasonable. On FP-heavy workloads, perhaps. On integer workloads, hardly. And that's only comparing it against HP's own PA-8700 hardware! Compare it to more mainstream hardware and you'll see just how laughable Itanium 2 prices are. The Itanium 2 doesn't have significantly higher performance than today's Xeons. Opteron, at least for the time being, performs significantly better again. > Yes, Opteron may give good alternative, but I'm not sure > that price/performance ratio for Opteron servers will be better > than for P4 Xeon dual servers. Let me tell you now: the price/performance ratio of Opteron systems _is_ better than that of their dual Xeon counterparts. How long this will remain the case is yet to be seen. Thank your lucky stars for AMD though, as they're the only people who have a chance at making Intel cut their prices. ;) > Only if you need badly 64-bit processor ... It's thanks to Intel that people even think like this. It's now 2003: you shouldn't have to sell your children to get fast 64-bit systems. Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ctierney at hpti.com Mon Apr 14 18:09:12 2003 From: ctierney at hpti.com (Craig Tierney) Date: 14 Apr 2003 16:09:12 -0600 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E9B17AA.1000806@octopus.com.au> References: <200304141425.SAA07170@nocserv.free.net> <3E9B17AA.1000806@octopus.com.au> Message-ID: <1050358151.6451.226.camel@woody> On Mon, 2003-04-14 at 14:18, Duraid Madina wrote: > Mikhail Kuzminsky wrote: > > Taking into account that Itanium 2 has much more high performance, > > the price from HP looks reasonable. > > On FP-heavy workloads, perhaps. On integer workloads, hardly. And that's > only comparing it against HP's own PA-8700 hardware! Compare it to more > mainstream hardware and you'll see just how laughable Itanium 2 prices > are. The Itanium 2 doesn't have significantly higher performance than > today's Xeons. Opteron, at least for the time being, performs > significantly better again. Er, really? What are your comparisons? My comparisons show that Itanium 2's operate about 100% faster on my codes than my 2.2 Ghz Xeons (400 Mhz FSB). This is without going in and trying to tweak the code. I don't know if it is running as fast as it should be. No, this still doesn't justify the price difference, but the performance isn't as bad as you are implying. The I2 does have integer math performance problems, but is supposed to be corrected with the next generation chip (Madison). 
> > > Yes, Opteron may give good alternative, but I'm not sure > > that price/performance ratio for Opteron servers will be better > > than for P4 Xeon dual servers. > > Let me tell you now: the price/performance ratio of Opteron systems _is_ > better than that of their dual Xeon counterparts. How long this will > remain the case is yet to be seen. Thank your lucky stars for AMD > though, as they're the only people who have a chance at making Intel cut > their prices. ;) > Unless we want start a flame war on NDA hardware, I think should be adding the phrase 'It depends' to any Opteron benchmarks, because it does. Lets get to arguing numbers in about 2 weeks when we can. And no, for MY CODES, Opteron does not perform significantly better than the Itanium 2 in all cases. However, when we start to talk price/performance the Opteron will be the right choice for many applications. However, not necessarily all of them. I am not trying to be negative about a platform that does not exist (yet). Personally I want 8 to 1000 Opteron nodes to do some real work. However blanket statements about any hardware platform don't do any good. But I agree with you, all competition is good. It makes our toys cheaper. Craig > > Only if you need badly 64-bit processor ... > > It's thanks to Intel that people even think like this. It's now 2003: > you shouldn't have to sell your children to get fast 64-bit systems. > > Duraid > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Craig Tierney _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Mon Apr 14 18:15:34 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Mon, 14 Apr 2003 17:15:34 -0500 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> Duraid Madina wrote: >On FP-heavy workloads, perhaps. On integer workloads, hardly. And that's >only comparing it against HP's own PA-8700 hardware! Compare it to more >mainstream hardware and you'll see just how laughable Itanium 2 prices >are. The Itanium 2 doesn't have significantly higher performance than >today's Xeons. Opteron, at least for the time being, performs >significantly better again. The SPECFP numbers rate the Itanium 2 at about 1425 (relative to the base Sun) and the Pentium 4 at about 1100. That's about a 20% advantage on floating point (PA-RISC rates a 600 I think). The integer ratio is about 1100 to 800 in favor of Pentium 4. Bandwidth to memory as measured by stream triad is 50% better on the Itanium implying that you will get a larger percentage of peak for out-of-cache workloads. Then there is the 64-bit address space, EPIC compiler technology, etc. ... but ... Itanium 2 prices seem high to me. However, the questions is really one for Intel and HP ... is the current price generating enough volume to hit the revenue sweet spot. They could care less whether I, you, or any random individual buyer likes the price ;-). The price is right if they are maximizing the time-integrated return on the product. Initial pricing should err high ... you can always lower it, but can never raise it. 
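As an aside on the measurement itself: the stream triad cited above is just a bandwidth-bound loop, so the quoted numbers track sustainable memory bandwidth rather than peak flops. A minimal sketch of the kernel follows (the array size and crude clock()-based timing are illustrative only, not the official STREAM benchmark):

/* triad.c -- the STREAM "triad" kernel: a[i] = b[i] + scalar * c[i].
 * Reported bandwidth is bytes touched (3 arrays x 8 bytes x N) over time.
 * N must be much larger than the last-level cache so that memory, not
 * cache, is what gets measured.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 2000000L                /* ~48 MB across the three arrays */

int main(void)
{
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    double scalar = 3.0;
    double secs;
    clock_t t0;
    long i;

    if (!a || !b || !c) return 1;
    for (i = 0; i < N; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }

    t0 = clock();
    for (i = 0; i < N; i++)
        a[i] = b[i] + scalar * c[i];          /* the triad */
    secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    if (secs > 0.0)
        printf("triad: %.1f MB/s\n", 3.0 * N * sizeof(double) / secs / 1e6);
    free(a); free(b); free(c);
    return 0;
}

In practice one would run the real STREAM code (repeated trials, best-of timing, flags that keep the compiler from optimising the loop away); the sketch only shows what "triad" actually measures.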
Until Opteron is available, the only, long-lived, direct competition is the Power 4 (is it available in 1 and 2 processor configurations?). Plus, why should Intel compete with their own price-performance Pentium 4 systems by lowering Itanium 2 prices? They are serving two markets segments those with more money than brains and those with more brains than money ... ;-). The market is quantized ... each product has its own quantum number. I would be interested in SPECFP and Stream Triad numbers for the Opteron if you have them. Regards, rbw #--------------------------------------------------- # Richard Walsh # Project Manager, Cluster Computing, Computational # Chemistry and Finance # netASPx, Inc. # 1200 Washington Ave. So. # Minneapolis, MN 55415 # VOX: 612-337-3467 # FAX: 612-337-3400 # EMAIL: rbw at networkcs.com, richard.walsh at netaspx.com # rbw at ahpcrc.org # #--------------------------------------------------- # "I'm quite contented to take my chances with # the Gildensterns and Rosenkrantzes. # -SpinDoctors #--------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Apr 14 18:34:33 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 14 Apr 2003 15:34:33 -0700 Subject: beowulf in space References: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Message-ID: <5.1.0.14.2.20030414150845.02f024f0@mailhost4.jpl.nasa.gov> At 02:46 PM 4/14/2003 -0400, Mark Hahn wrote: > > Has anybody considered the theoretical aspects of placing beowulfs on a > > cluster of satellites? > >if this interests you, I highly recommend reading Vernor Vinge's >recent books (A Deepness in the Sky, for instance). Robert Forward >has some topical ones, too. they are science fiction, though... > > > I understand that communication will be slower AND > > unreliable, > >well, to the extent that such a cluster would be spread out, >I can understand the "slower" part. though c in vacuum is higher >than c in fiber or TP. > >I don't see the "unreliable" part. are you presuming some kind of >traditional RF modulation? using free-space optics seems like the >more obvious way to network satellites, and I don't see why that would >be flakey. RF for short (<1000km) links can be very reliable (certainly better than Ethernet, once you've factored in collisions, etc.). Don't take 802.11 kinds of links as an example. > > and it would restrict the set of problems that could be solved. I'm > looking > > for papers/tech reps etc on the subject. > >I doubt they exist, simply because there's no practical reason, >given the huge cost and unclear advantage. Certainly, flying a Beowulf to provide computing services to a terrestrial user makes no sense, but flying a Beowulf to provide insitu computing crunch for, e.g., data reduction on a deep space mission, makes a lot of sense. While a broadband high rate "pipe" from GEO orbit isn't too tough (all it takes is money to buy or rent a transponder and suitable ground station equipment), the same from a LEO orbit is much more of a challenge. Take something like the Shuttle Radar Topographic Mission (SRTM) as an example. The 2 radars produce 180 and 90 Mbit/sec raw data rate for C and X band, respectively. There isn't any convenient way to get that kind of data pipe for something orbiting the earth every 90 minutes or so. 
So, they recorded the data on a whole pile of tapes, which they brought back, and which will take some years to ground process the 10 Tbyte of data. And that's for a mere 10-11 days of data. Clearly, some sort of onboard processing would be useful. SRTM was designed to measure the topography of all land surfaces on a 10 meter (or so) grid. Figuring the Earth's surface at 564E6 square kilometers, figuring 40% land area, and 1E4 measurements/square km, you're looking at about 2E12 measurements reduced from around 1E13 bytes of data. Topography is actually one of the easier measurements.. the ground elevation doesn't change much on a day to day basis (usually). Now consider a couple much more difficult problems: 1) quasi real time imaging of some parameter that varies quickly (wind, rainfall, vegetation) 2) moving target detection.. Say you wanted to track all airplanes in flight with an accuracy of, say, 100 meters. For a constellation of spacecraft, one could do things like atmospheric sounding or tomography, the latter of which requires some serious processing crunch to reduce the raw data to usable output. Imagine that you want to tomographically process atmospheric sounding through the atmosphere of Jupiter, but you need to send the data back through a datalink with a bandwidth of, maybe, 1 Mbit/second, 8 hours a day. It's also got to tolerate the somewhat(!) harsh radiation environment of Jupiter. One can argue that for deep space missions, costing hundreds of millions of dollars, that you're not going to be using commodity PCs mail-ordered from WalMart stacked up on baker's carts. However, one might very well use the Beowulf concept of lots of fairly simple, fairly slow processors, interconnected by a high latency, moderate bandwidth fabric of sorts. >I can imagine some really great advertisements for colo though ;) > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Apr 14 16:43:39 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 14 Apr 2003 13:43:39 -0700 Subject: beowulf in space In-Reply-To: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Message-ID: <5.1.0.14.2.20030414133602.030c6cd0@mailhost4.jpl.nasa.gov> The short answer is yes, it has been and is being considered, in several forms. The interprocessor comm is not necessarily slower (very wideband optical links are practical), but latency is an issue. However, there are many space applications that can benefit from this sort of thing that aren't particularly bandwidth or latency constrained. While the scientists would generally like to have a big pipe to the ground and just send raw data for later processing, there are situations where you just can't send that much data back, and it has to be on-board processed in some way. Of course, inasmuch as part of Beowulfery is the idea of commodity off the shelf computers being used, real Beowulfs in space aren't likely to come any time soon, since almost NOTHING in space is a commodity part. 
It costs so much to get it there, that the additional cost for a "custom" part is a small fraction of the launch cost. If you were to search back proceedings of the IEEE Aerospace Conference (Big Sky MT), you'll find some papers on Beowulf type systems proposed for space applications, and also some novel ideas for high bandwidth cluster interconnects based on optical techniques. As RGB pointed out, the design environment for space is somewhat different.. power consumption and cooling (even if you have a reactor a'la Prometheus) are signficant challenges, as is the radiation environment, both in an single event and in a total dose. At 11:59 AM 4/14/2003 -0700, chettri at gst.com wrote: >Has anybody considered the theoretical aspects of placing beowulfs on a >cluster of satellites? I understand that communication will be slower AND >unreliable, >and it would restrict the set of problems that could be solved. I'm >looking for papers/tech reps etc on the subject. > >Regards, > >Samir Chettro > > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Apr 14 18:49:24 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 14 Apr 2003 15:49:24 -0700 Subject: beowulf in space In-Reply-To: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellso uth.net> Message-ID: <5.1.0.14.2.20030414153502.01978198@mailhost4.jpl.nasa.gov> At 03:40 PM 4/14/2003 -0400, astroguy at bellsouth.net wrote: >Hi list, >Ok, you computer genius and rocket scientist all... I tend to agree with >Dr.Brown's position but for the sake of argument... Let's think of along >the lines of where a computational cluster might find some in space >application. Say for example we were to launch a probe into the >Sun's outer corona... let assume also that we have some shielding device >that would sustain the craft in the 10 million degree C or so that such a >craft is sure to encounter.. The corona is a fairly non-dense plasma (100 ions/cm^3 viz 2E19 atoms/cm^3 for STP air), more closely resembling a really good vacuum(1E-15 torr?), where the ions are moving moderately fast (1-10kEv), corresponding to a temperature of 10 million K, but I don't know that the heat content is all that great, and I don't know that it would actually heat a real body placed in it all that much, any more than the CRT in your TV or monitor heats up from the 100 million K electrons in the internal beam (which has a much, much higher number density than the corona) For some data on a real solar atmosphere probe: http://umbra.nascom.nasa.gov/solar_connections/probe.html and http://umbra.nascom.nasa.gov/spd/solar_probe.html and a nice technical presentation at http://solarprobe2.jpl.nasa.gov/SPBR.html >. Even with our best science fiction such a craft could only endure a few >precious moments in such a space environment, so we would have to use the >advantage of speed... Ok, so we use an ion engine to get the craft up to >speed... 
since the sun's corona extends apparently 700,000 km or so into >space... the craft would have to get up to a speed say 250,000 mph. Which >we have yet to achieve but not impossible... Sling shot around Jupiter and >Mars and back to the sun with the ion engine in a bit of celestial magic >provided by or on! > ground navigational cluster... certainly we can achieve a very high > velocity for our death plunge into the Sun's outer atmosphere... > Computational real time observations within those few precious moments > before the probe vaporised would certainly be enhanced by an on board > beowulf cluster... You asked for speculation, as to an application... I > think this is perhaps one. While your nav scenario is a bit unrealistic, the need for on-board processing is precisely right..you're limited in your downlink (total bits that can be sent before immolation) >Chip > > > > From: Mark Hahn > > Date: 2003/04/14 Mon PM 02:46:05 EDT > > To: chettri at gst.com > > CC: beowulf at beowulf.org > > Subject: Re: beowulf in space > > > > > Has anybody considered the theoretical aspects of placing beowulfs on a > > > cluster of satellites? > > James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 14 19:53:27 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 14 Apr 2003 19:53:27 -0400 (EDT) Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> Message-ID: > about 1100 to 800 in favor of Pentium 4. Bandwidth to memory as measured > by stream triad is 50% better on the Itanium implying that you will get > a larger percentage of peak for out-of-cache workloads. Then there is until the next-gen P4 chipsets arrive (and they have). > I would be interested in SPECFP and Stream Triad numbers for the > Opteron if you have them. me too . but if I understand AMD's marketing "plan", we won't see the interesting Opteron systems at launch. that is, since Opteron bandwidth scales with ncpus, it's really the 4-8-way systems that will look dramatically more attractive than any competitors (cept maybe Marvel). it is sort of interesting that much of It2's rep rests on fairly single-threaded benchmarks (cfp2000, stream). but I don't see a lot of people buying uniprocessor It2's, and all It2 systems use a shared 6.4 GB/s FSB. by comparison, a dual-opt has 10.8 GB/s aggregate, which starts to be interesting. I'm hoping AMD will get pumped and support PC3200 on apr 22. I fear that 4x and 8x systems will be late as usual. 
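The "stream triad" figure being traded back and forth in this thread comes from timing a kernel that is essentially the loop below. This is a minimal, single-threaded sketch of the idea only, not John McCalpin's official STREAM benchmark (which uses wall-clock timing, several trials, and result verification); the array size here is an arbitrary choice that merely has to dwarf the caches.

#include <stdio.h>
#include <time.h>

#define N 2000000            /* doubles per array, ~16 MB each: far larger than any cache */

static double a[N], b[N], c[N];

int main(void)
{
    const double scalar = 3.0;
    const int ntimes = 10;

    /* touch every page before timing */
    for (long j = 0; j < N; j++) { a[j] = 1.0; b[j] = 2.0; c[j] = 0.5; }

    clock_t t0 = clock();

    /* the triad kernel: two loads, one store, one multiply-add per element */
    for (int k = 0; k < ntimes; k++)
        for (long j = 0; j < N; j++)
            a[j] = b[j] + scalar * c[j];

    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* 24 bytes (three doubles) move per element per pass */
    printf("triad: %.1f MB/s\n", 24.0 * N * ntimes / secs / 1e6);
    printf("check: %g\n", a[N / 2]);   /* keep the loop from being optimized away */
    return 0;
}

Compiled with optimization and run on one CPU, the reported rate approximates the sustainable memory bandwidth the posters are comparing, which is why a shared 6.4 GB/s front-side bus and per-processor memory controllers give such different aggregate numbers.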
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Apr 14 19:57:52 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 14 Apr 2003 16:57:52 -0700 Subject: beowulf in space In-Reply-To: <20030414203258.GA28080@synapse.neuralscape.com> References: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> Message-ID: <5.1.0.14.2.20030414164248.03049040@mailhost4.jpl.nasa.gov> At 01:32 PM 4/14/2003 -0700, Karen Shaeffer wrote: > > ground navigational cluster... certainly we can achieve a very high > velocity for our death plunge into the Sun's outer atmosphere... > Computational real time observations within those few precious moments > before the probe vaporised would certainly be enhanced by an on board > beowulf cluster... You asked for speculation, as to an application... I > think this is perhaps one. > > >This is getting silly. You still need to transmit the data back to earth. It >has already been asserted that it is far more efficient energy wise and cost >wise to transmit data than to process it. This is not necessarily true... While the scientist generally prefers to get the raw data (it allows deferring some of the analysis work to a later time, and, it reduces the risk of making a bad design decision, because you can go reprocess later), in many, many cases it is NOT cheaper to send the data back to earth than to process it in situ and send the processed data. Most spacecraft are severely power constrained, and that sets the basis for the tradeoff of joules expended on processing vs joules expended on sending data (hence my earlier posts about MIPS/Watt being important). Inherent in the fact that one CAN do processing is that the raw data must contain some redundancy, and the processing, in an information theoretic sense, consists of removing the redundancy (consider it as "lossless compression" if you will). For an arbitrary communication link, the key thing is the received energy at the other end compared with the noise energy (usually talked about in terms of Eb/No). You can divvy up the energy in a lot of ways: 1) You can send each (nonredundant) bit multiple times, increasing the energy for each information bit, thereby improving the signal to noise ratio for that bit. There are lots of clever schemes for how you do this (generically called "coding"). Essentially, you put some amount of redundancy back into the data stream, and then remove it at the receiving end. 2) You can not bother removing the redundancy in the first place, transmitting more bits, with less power per bit. I would contend that in an idealized case, the raw sensor data is unlikely to be an efficient coding strategy for the actual information contained in the data. Consider a trivial case where the sensor measures the slowly varying (timeconstant >1 second) temperature of the spacecraft 100 times a second with an accuracy of 8 bits. Clearly, there is a very high correlation between one measurement and the next, so the actual "information" in each sample is quite small. A "send the raw data" strategy would require 800 bits/sec of bandwidth. One could trivially encode it by averaging 10 measurements at a time into an 8 bit average, probably without losing much data. 
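A minimal sketch of that trivial encoding, using only the figures from the example above (100 samples/s, 8 bits each, averaged in blocks of 10); the function name and buffer handling are illustrative, not any flight-software interface:

#include <stdint.h>
#include <stddef.h>

/*
 * Downsample 8-bit temperature telemetry by averaging blocks of 10 samples:
 * 100 samples/s * 8 bits = 800 bit/s of raw data becomes
 *  10 averages/s * 8 bits =  80 bit/s to transmit.
 */
size_t average_blocks(const uint8_t *raw, size_t nraw, uint8_t *out)
{
    const size_t BLOCK = 10;
    size_t nout = 0;

    for (size_t i = 0; i + BLOCK <= nraw; i += BLOCK) {
        unsigned sum = 0;
        for (size_t j = 0; j < BLOCK; j++)
            sum += raw[i + j];
        out[nout++] = (uint8_t)(sum / BLOCK);   /* 8-bit block average */
    }
    return nout;   /* bytes to send instead of nraw */
}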
If one was worried about excursions, one could also transmit the min and max values, for a total of 240 bits/second. One could also use any of a number of simple lossless compression schemes to greatly reduce the bit rate. The question to be answered, in a real system, is it better to put your precious joules to work sending all those 800 bits, and not spend any on the processing, hoping that the greater error rate from the low joules/bit can be overcome by ground processing, OR, should one do some onboard processing, say lossless compression, putting more joules in each of the fewer bits (less some amount of energy used in the compression process), then transmit those fewer bits using some form of coding, which increases the transmitted bit rate. It all depends on the link budget, and how close you are to the ragged edge of the Shannon limit. > So you should invest in a system >that can transmit all the raw data back to earth. Then you even have the >benefit of saving the raw data set for future computations as more is >learned... There's also the possibility that it is not feasible to send all data back, and that it HAS to be processed at the sensor. SRTM is a good example of this. >(giggles) >Karen >-- > Karen Shaeffer > Neuralscape, Palo Alto, Ca. 94306 > shaeffer at neuralscape.com http://www.neuralscape.com >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Mon Apr 14 19:33:54 2003 From: raysonlogin at yahoo.com (Rayson Ho) Date: Mon, 14 Apr 2003 16:33:54 -0700 (PDT) Subject: Fwd: [PBS-USERS] PBS technical specialists Message-ID: <20030414233354.95477.qmail@web11402.mail.yahoo.com> Any PBS hackers?? Rayson --- Michael Humphrey wrote: > > Dear PBS users, > > > > > > Altair is looking to expand our PBS Pro technical staff in our Troy > > Michigan office. > > Our ideal candidate will have the followings experiences: > > > > > > Required experiences > > > > 5 or more years as systems Administrator in UNIX environment > > 2 or more years experience with openPBS or PBS Pro > > BS degree in Computer Science or Engineering > > Experience in writing Unix shell scripts > > Experience with PERL > > Good communications skills > > Willing to do some travel > > > > > > Desirable experiences > > Experience administering Windows environments (1-2 years) preferred > > > Experience in MCAE applications environments preferred > > > > If you know someone who might be interested in these positions > please have > > them forward their resume to me via email or contact me via > telephone. > > Thank you for any referrals which may come forward. > > > > > > > > Michael Humphrey > > Altair Engineering > > > > 1820 East Big Beaver Rd. > > Troy, Mi. 48083 > > > > (248) 614-2400 Ext 495 > > > > humphrey at altair.com > > > > > > > > > > > __________________________________________________ Do you Yahoo!? Yahoo! 
Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ctierney at hpti.com Mon Apr 14 20:27:35 2003 From: ctierney at hpti.com (Craig Tierney) Date: 14 Apr 2003 18:27:35 -0600 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: References: Message-ID: <1050366454.6451.400.camel@woody> On Mon, 2003-04-14 at 17:53, Mark Hahn wrote: > > about 1100 to 800 in favor of Pentium 4. Bandwidth to memory as measured > > by stream triad is 50% better on the Itanium implying that you will get > > a larger percentage of peak for out-of-cache workloads. Then there is > > until the next-gen P4 chipsets arrive (and they have). > > > I would be interested in SPECFP and Stream Triad numbers for the > > Opteron if you have them. > > me too . but if I understand AMD's marketing "plan", > we won't see the interesting Opteron systems at launch. > that is, since Opteron bandwidth scales with ncpus, it's > really the 4-8-way systems that will look dramatically > more attractive than any competitors (cept maybe Marvel). > drooling even more.... > it is sort of interesting that much of It2's rep rests on > fairly single-threaded benchmarks (cfp2000, stream). but I don't > see a lot of people buying uniprocessor It2's, and all It2 > systems use a shared 6.4 GB/s FSB. by comparison, a dual-opt > has 10.8 GB/s aggregate, which starts to be interesting. Feel free to flame me if I am wrong, but the HP chipset for It2 is 8.5 GB/s. The Intel chipset is 6.4 GB/s. Each cpu can push 6.4 GB/s. > > I'm hoping AMD will get pumped and support PC3200 on apr 22. > I fear that 4x and 8x systems will be late as usual. > Lots to talk about on the 22nd! Did AMD pick Earth Day for any particular reason to announce the new product? I do not think this is going to be a 'green' cpu. Craig > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Craig Tierney _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 14 23:55:34 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 14 Apr 2003 23:55:34 -0400 (EDT) Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <1050366454.6451.400.camel@woody> Message-ID: > > see a lot of people buying uniprocessor It2's, and all It2 > > systems use a shared 6.4 GB/s FSB. by comparison, a dual-opt > > has 10.8 GB/s aggregate, which starts to be interesting. > > Feel free to flame me if I am wrong, but the HP chipset for It2 > is 8.5 GB/s. The Intel chipset is 6.4 GB/s. Each cpu can > push 6.4 GB/s. heck, the rx5670 claims 12.8 GB/s: http://www.hp.com/products1/servers/rackoptimized/rx5670/specifications.html alas, all the CPUs sit on a 6.4 GB/s bus: http://www.hp.com/products1/itanium/chipset/4_way_block.html (note that it lists 4 GB/s aggregate IO bandwidth; in short, the 12.8 is simply false; 10.4 is theoretically possible, but in reality, the CPUs will sustain 4ish and IO will probably total less than 1.) the real flames go out to the marketing pinheads who claim 12.8! 
in fairness, I should note that HP's rx2600 stream scores (3.5 3.5 4.0) are quite excellent. not nearly as good as Marvel, but competitive with a number of traditional vector supers. quite a bit better than an Altix, too (1.7 1.7 1.9). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Tue Apr 15 05:49:52 2003 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Tue, 15 Apr 2003 11:49:52 +0200 Subject: Power supply problems (was: Machine Check Exception) In-Reply-To: <3E9AF3DD.1070002@pgs.com> References: <3E9AF3DD.1070002@pgs.com> Message-ID: <20030415114952S.hanzl@unknown-domain> > Does anyone know if a power supply can cause a machine check exception Power supply can probably cause all sorts of weird problems, with variety similar to RAM problems - problem can surface just anywhere and resemble other hardware or software problem to such an extent that usual diagnostic steps like replacing hardware and software components clearly indicate that something else is faulty, replacing that 'faulty' component seems to fix it but later on problems reoccur. In another words, if your power supply works near the limits, problems may be observed just on few nodes in the cluster (or on a single node) and there may be just certain hardware/software/environmental circumstances which trigger the problem. I've seen certain indications that these days power supply can cause more headaches than before: - note from Abit tech staff saying that certain power supplies send POWER_GOOD 'too early', meaning before on-board power conversion circuits had time to stabilise - overclockers are starting to take power supply more seriously. (However stupid overclocking is, their web sites gives good indication which parts of hardware work near to the limits.) - note that certain Abit BIOS upgrade fixes PSU problems - many problems with Abit IT7 MAX2 and 300W PSU I've encountered myself - fact that modern PSU is under software control and hardware control of that part of mainboard which gets standby power, opening new possibilities for intermixing PSU/hardware/software problems. Most of my experience is Abit-related but may be general I am affraid... Regards Vaclav _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Andrew.Cannon at nnc.co.uk Tue Apr 15 06:41:49 2003 From: Andrew.Cannon at nnc.co.uk (Cannon, Andrew) Date: Tue, 15 Apr 2003 11:41:49 +0100 Subject: UK Cluster hardware suppliers? Message-ID: Hi All, Does anyone know of a supplier (or suppliers) of clustering hardware in the UK? I need to get some quotes for a 16 node cluster. Thanks Andrew Andrew Cannon, Nuclear Technology (J2), NNC Ltd, Booths Hall, Knutsford, Cheshire, WA16 8QZ. Telephone; +44 (0) 1565 843768 email: mailto:andrew.cannon at nnc.co.uk NNC website: http://www.nnc.co.uk *********************************************************************************** NNC Limited Booths Hall Chelford Road Knutsford Cheshire WA16 8QZ Country of Registration: United Kingdom Registered Number: 1120437 This e-mail and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. 
If you have received this e-mail in error please notify the NNC system manager by e-mail at eadm at nnc.co.uk. *********************************************************************************** _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jhearns at freesolutions.net Tue Apr 15 07:20:03 2003 From: jhearns at freesolutions.net (John Hearns) Date: 15 Apr 2003 12:20:03 +0100 Subject: UK Cluster hardware suppliers? In-Reply-To: References: Message-ID: <1050405611.10673.16.camel@harwood.home> On Tue, 2003-04-15 at 11:41, Cannon, Andrew wrote: > Hi All, > > Does anyone know of a supplier (or suppliers) of clustering hardware in the > UK? I need to get some quotes for a 16 node cluster. > Off the top of my head, in no particular order, Streamline Computing http://www.streamline-computing.com Workstations UK http://www.workstationsuk.co.uk OCF http://www.ocf.co.uk Clustervision http://www.clustervision.com Compusys http://www.compusys.co.uk Max Black http://www.maxblack.co.uk Quadrics http://www.quadrics.com for fast interconnects SGI IBM HP Dell Oh, and if anyone from these companies spots their name, I'm looking for a job. Apologies to anybody I've missed. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ajt at rri.sari.ac.uk Tue Apr 15 08:44:59 2003 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Tue, 15 Apr 2003 13:44:59 +0100 Subject: UK Cluster hardware suppliers? In-Reply-To: <1050405611.10673.16.camel@harwood.home> References: <1050405611.10673.16.camel@harwood.home> Message-ID: <3E9BFECB.3050802@rri.sari.ac.uk> John Hearns wrote: > On Tue, 2003-04-15 at 11:41, Cannon, Andrew wrote: > >> Hi All, >> >> Does anyone know of a supplier (or suppliers) of clustering hardware in the >> UK? I need to get some quotes for a 16 node cluster. Hello, Andrew. I've just bought 24 Athlon XP 2400+ nodes for a beowulf cluster from Eclipse Computing (mailto:sales at eclipsecomputing.co.uk). They also sell complete Beowulf systems. Tony. -- Dr. A.J.Travis, | mailto:ajt at rri.sari.ac.uk Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 Aberdeen AB2 9SB, Scotland, UK. | fax:+44 (0)1224 716687 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Tue Apr 15 09:04:42 2003 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Tue, 15 Apr 2003 15:04:42 +0200 Subject: UK Cluster hardware suppliers? In-Reply-To: References: Message-ID: <200304151504.42313.joachim@ccrl-nece.de> Cannon, Andrew: > Does anyone know of a supplier (or suppliers) of clustering hardware in the > UK? I need to get some quotes for a 16 node cluster. Take a look at http://www.workstationsuk.co.uk . Have no experience with them, though. 
Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From shaeffer at neuralscape.com Mon Apr 14 20:48:40 2003 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Mon, 14 Apr 2003 17:48:40 -0700 Subject: beowulf in space In-Reply-To: <5.1.0.14.2.20030414164248.03049040@mailhost4.jpl.nasa.gov> References: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> <5.1.0.14.2.20030414164248.03049040@mailhost4.jpl.nasa.gov> Message-ID: <20030415004840.GA28478@synapse.neuralscape.com> On Mon, Apr 14, 2003 at 04:57:52PM -0700, Jim Lux wrote: ...snip... > data (hence my earlier posts about MIPS/Watt being important). Inherent in > the fact that one CAN do processing is that the raw data must contain some > redundancy, and the processing, in an information theoretic sense, consists > of removing the redundancy (consider it as "lossless compression" if you > will). Hello Jim, Sure. But you are going to perform lossless compression with a DSP chip built into the pipeline. It was assumed you would remove redundancy prior to transmitting from space. Lossless compression with a DSP core is not even remotely comparable to hoisting a Beowulf cluster into space to computationally exploit raw data from some exotic remote event. (smiles ;) cheers, Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer at neuralscape.com http://www.neuralscape.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Mon Apr 14 19:23:26 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 15 Apr 2003 09:23:26 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <1050358151.6451.226.camel@woody> References: <200304141425.SAA07170@nocserv.free.net> <3E9B17AA.1000806@octopus.com.au> <1050358151.6451.226.camel@woody> Message-ID: <3E9B42EE.7020509@octopus.com.au> Craig Tierney wrote: > On Mon, 2003-04-14 at 14:18, Duraid Madina wrote: >>Mikhail Kuzminsky wrote: >>> Taking into account that Itanium 2 has much more high performance, >>>the price from HP looks reasonable. >> >>On FP-heavy workloads, perhaps. On integer workloads, hardly. And that's >>only comparing it against HP's own PA-8700 hardware! Compare it to more >>mainstream hardware and you'll see just how laughable Itanium 2 prices >>are. The Itanium 2 doesn't have significantly higher performance than >>today's Xeons. Opteron, at least for the time being, performs >>significantly better again. > > Er, really? What are your comparisons? SPECint2000. > My comparisons show that > Itanium 2's operate about 100% faster on my codes than my 2.2 Ghz Xeons > (400 Mhz FSB). That's nice. > This is without going in and trying to tweak the code. > I don't know if it is running as fast as it should be. Why am I not surprised. > No, this still doesn't justify the price difference, but the performance > isn't as bad as you are implying. I was mistaken. 
Since Itanium 2 is in the habit of routinely performing integer codes at double the speed of 2.2GHz Xeons, it's > The I2 does have integer math performance problems, but is supposed to > be corrected with the next generation chip (Madison). Madison will be called Itanium 2, and for good reason. Isn't it an Itanium 2, respun on a 130nm process and with (potentially) more cache? If there are any other differences, I'd be grateful if you could tell me about those. As far as I can see, this year's ISSCC papers still aren't online :( > Unless we want start a flame war on NDA hardware, I think should be > adding the phrase 'It depends' to any Opteron benchmarks, because it > does. Lets get to arguing numbers in about 2 weeks when we can. One week to go! ;) > And no, for MY CODES, Opteron does not perform significantly better than > the Itanium 2 in all cases. Does your code fit in cache? I can't think of any other reason why you'd perform so badly on an Opteron. Which stepping of Opteron have you been using? > However, when we start to talk > price/performance the Opteron will be the right choice for many > applications. However, not necessarily all of them. Nope, just most of them. > I am not trying to be negative about a platform that does not exist > (yet). I am trying to be negative about a platform that is overpriced. ;) > Personally I want 8 to 1000 Opteron nodes to do some real work. Personally, I want two Itanium 2s to stick under my desk. But I can't afford them! Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hunting at ix.netcom.com Tue Apr 15 02:01:22 2003 From: hunting at ix.netcom.com (Michael Huntingdon) Date: Mon, 14 Apr 2003 23:01:22 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: References: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> Message-ID: <3.0.3.32.20030414230122.01d2cfa0@popd.ix.netcom.com> I'm absolutely surprised at the notion that we (you/me/and those in this group) seem to have such advanced knowledge. Forget it. You can, at each avenue press for what you want in terms of compute performance, and perhaps compress an expectation into price/performance that meets an immediate need; however, at some point in time it becomes obvious to each of us that's it's not the chip set. If what we wanted/needed was an advanced CPU, Alpha would have been the mantra ten years ago, Itanium 2 would be embraced now. Hardware waits for advancements around software applications. Let's be clear about what's really expected. While so many members of this group expect answers around: - electrical requirements for super computing - cooling requirements for super computing - advancements in CPU rates - advancements in memory bandwidth - advancements along PCI paths - mainstream development for PCI-x - advancements in storage to system bandwidth - advancements along the network path - better effiencies within operating systems - drivers that allow inter operability with any number of options Any number of groups can come up with inexpensive solutions to the above once the industry standards are developed and the engineering is in place. It's an inexpensive process to piggy-back on original efforts do to investment in engineer and design. Dell, Gateway and "grey-box" manufactures provide excellent templates. 
Yet when research, in its purest form, requires advancements, these are not the groups any of us look to. In addition to this, along with the grants that so many receive and the free layered products and support available, should there somehow be this ongoing discussion about a push for more hardware technology at x86 technology pricing? If there is truly a dedicated and compelling requirement for advanced technologies, perhaps some consideration should be given to what's being asked for within research, what's needed, who is expected to deliver, what's being delivered, and yes....some acumen specific to research and the development of technology required to support advanced technology needs. Let's get over ourselves, folks. There are only three companies that are going to drive technology in the foreseeable future and allow the advancements we all expect. Over the past several weeks I've followed speculative threads regarding "Super Computing Environments". And although RGB has authored a great deal of solid data, when it comes to creating and maintaining really large environments, you'll want a single "throat to choke" and it won't be Robert's. Who currently has the technology to construct huge environments and guarantee results? I can testify that the "authorities" in the field could not, recently, during the design of one of the QB3 facilities. My advice: look to an engineering organization that not only knows today's technology, but can also advise you on advanced technologies that take you 3-5-7 years out. In this, you should only have to decide upon one of two product vendors; however, you can assume either of the two can advise you around what's possible today and perhaps as many as seven years out. Think this might save you a (budget) dollar or two? YOU BET! I've seen it to the tune of $500,000. It's not just the additional cost, but delays, and the cost associated with the academic professionals who relied on the (so-called) consultants. With so few in the commercial space investing in the future of advanced research, I sometimes have to question how the view of those within academic research could possibly be so narrow. In academia of all places, look around at who is trying to work with you, what each hopes to accomplish, and how each will reinvest each dollar you spend with them. Does your investment go to research in new technologies or marketing? This one is a "no brainer". In a separate thread there is a topic of "beowulf in space", the feasibility, the cost etc. I can't imagine this becoming a reality without the dedication and investment of a very few select manufacturers, but I'm certain it's something I'll see in my lifetime (and I'm an old guy). It will be smaller, lighter, cheaper, faster, more reliable.....and it won't come from a "one-off". In another thread I recall reading about those who might have "more money than brains". To that I would suggest that their investment in both money and brains will trickle down, and be a benefit to all. Behind those (implied) financial investments we will find the driving force for future technologies that each of us will see in our data centers and research labs. Let's see just how interesting things become. cheers ~m At 07:53 PM 4/14/2003 -0400, you wrote: >> about 1100 to 800 in favor of Pentium 4. Bandwidth to memory as measured >> by stream triad is 50% better on the Itanium implying that you will get >> a larger percentage of peak for out-of-cache workloads.
Then there is > >until the next-gen P4 chipsets arrive (and they have). > >> I would be interested in SPECFP and Stream Triad numbers for the >> Opteron if you have them. > >me too . but if I understand AMD's marketing "plan", >we won't see the interesting Opteron systems at launch. >that is, since Opteron bandwidth scales with ncpus, it's >really the 4-8-way systems that will look dramatically >more attractive than any competitors (cept maybe Marvel). > >it is sort of interesting that much of It2's rep rests on >fairly single-threaded benchmarks (cfp2000, stream). but I don't >see a lot of people buying uniprocessor It2's, and all It2 >systems use a shared 6.4 GB/s FSB. by comparison, a dual-opt >has 10.8 GB/s aggregate, which starts to be interesting. > >I'm hoping AMD will get pumped and support PC3200 on apr 22. >I fear that 4x and 8x systems will be late as usual. > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Mon Apr 14 19:28:33 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 15 Apr 2003 09:28:33 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E9B42EE.7020509@octopus.com.au> References: <200304141425.SAA07170@nocserv.free.net> <3E9B17AA.1000806@octopus.com.au> <1050358151.6451.226.camel@woody> <3E9B42EE.7020509@octopus.com.au> Message-ID: <3E9B4421.9060307@octopus.com.au> I wrote: > I was mistaken. Since Itanium 2 is in the habit of routinely performing > integer codes at double the speed of 2.2GHz Xeons, it's I meant to write: I was mistaken. Since Itanium 2 is in the habit of routinely performing integer codes at double the speed of 2.2GHz Xeons, it's obviously good value. Damn office distractions! ;) Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Tue Apr 15 09:57:55 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Tue, 15 Apr 2003 08:57:55 -0500 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <200304151357.h3FDvtC09517@mycroft.ahpcrc.org> Duraid Madina wrote: >SPECfp2000 is ~1170 for a 2GHz 1MB L2 Opteron. Not too bad. The SPECint >figure is fantastic though (~1200). I hate x86 as much as the next guy, >but it looks like this is what I'm going to be working with for some >time, _thanks to Intel and their incredibly uninspired pricing strategy_. Thanks for the numbers :-). Looks like Opteron comes in at slightly better than the 2.8 GHz Pentium 4 in both cases (1100 FP, 1100 INT). So the marginal additional price that AMD charges for Opteron will be for 64-bit addresses and its SMP capability ... whether/when they try to sell it as a one-chip-fits- all product will depend on how quickly they wish to destroy their x86-only markets ... that would seem to be their best strategy though ... otherwise Intel makes them into a sandwich by lowering the I2's price and raising the P4's clock ... a kind of adiabatic squeeze play. It will be interesting to watch ... from the sidelines. 
rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jcownie at etnus.com Tue Apr 15 05:12:46 2003 From: jcownie at etnus.com (James Cownie) Date: Tue, 15 Apr 2003 10:12:46 +0100 Subject: beowulf in space In-Reply-To: Message from "Robert G. Brown" of "Mon, 14 Apr 2003 13:53:07 EDT." Message-ID: <195MUg-178-00@etnus.com> > b) communications latency (bandwidth actually can be as big as you > like or are likely to ever need, since you ARE a satellite, after > all...:-) Well, 18 months ago ESA were getting 50Mb optically between satellites http://www.esa.int/export/esaCP/ESASGBZ84UC_Improving_0.html since that is 1) a Moore generation ago :-) (though I think development times are longer in space technology) 2) public information I expect that the people on the "dark side" who do this can indeed get an awful lot of bandwidth... (ISTR Chaisson in "Hubble Wars" mentioning that they were borrowing the 10Mb link to space that the NSA folks had back in 1991). -- Jim James Cownie Etnus, LLC. +44 117 9071438 http://www.etnus.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rtomek at cis.com.pl Mon Apr 14 19:29:17 2003 From: rtomek at cis.com.pl (Tomasz Rola) Date: Tue, 15 Apr 2003 01:29:17 +0200 (CEST) Subject: beowulf in space In-Reply-To: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Mon, 14 Apr 2003, Joel Jaeggli wrote: > On Mon, 14 Apr 2003 chettri at gst.com wrote: > > > Has anybody considered the theoretical aspects of placing beowulfs on a > > cluster of satellites? I understand that communication will be slower AND > > unreliable, > > Communication to satellites needs to be neither slow nor unreliable, it is > generally fairly high latency... It can be quite expensive. > > There are clusters of computers in space. they generally aren't what you > would consider heavy computation platforms... Correct. However, I think that since this question really belongs to s-f (at least today) one can put some s-f behind the answer... > The biggest issues with with computer resources in space are: > > mass - a large sattellite such as the hughes galaxy 4r bird is around [...] I think mass is an issue when you have to export everything up from the Earth. It won't be if you start to get materials from celestial sources. The cost of launch from the Moon should be ca. 6 times less than from the Earth. Even less when you start to explore asteroids. Kuiper belt should have plenty of materials. Of course, the cost of building facilities there is so high that it will take a long time to become feasible and next to pay off. > power - solar power and long life cadmium batteries mean your whole I think there is plenty of solar power in space. At least within some specified orbit. It's only that you can't get enough of solar grids there to use it in a practical way. BTW, some people are reconsidering the use of atomic power up there. http://www.spacedaily.com/news/oped-03i.html (There are some links at the bottom of the page too). > radiation hardening - without 50 miles of atmosphere overhead we're kinda Yes it is an issue. 
Perhaps it could help if you buried a cluster under the surface of the Moon or put it on the dark side of Mercury (you would need to move slowly your cluster there to avoid being rotated into the very hot sunlight - not very practical, I think). It seems that magnetic field helps but this page: http://isaac.exploratorium.edu/~pauld/activities/magnetism/magnetismofplanets.html shows that Earth-like field is scarce in Solar System. Placing such systems, especially built from off the shelf components, on orbit is probably not very bright idea unless you can protect them. > thermal management - air cooling doesn't work given no atomosphere... even I'm not a specialist but I think you can force the (air | water) flow in space. Otherwise, astronauts would have very dangerous time sleeping in one place for few hours, with no ventilation at all (CO2 bubble growing around their heads). > expected service life - if you plan on go to the expense of putting it in Today, the longer you can use orbital device the better but nobody applies this kind of measures to clusters. So you are right it would not be worth to expedition units from Earth. On the other hand, the use of automated production facilities, maybe on the Moon, would make the project possible. When connected with some inexpensive transport system (who knows, electromagnetic cargo ejectors or orbital lift) it could provide upgrades and replacements (provided that you solve the radiation problem). It is also quite possible that some time from now the Moore's law will no longer hold. If so, the computing unit longevity would be measured in tens of years. So even without cheap transportation it may be ok to hold it on orbit for 20 years and still have fun (but not if you use today's cpus). technology vs automation issues - - From what I know the technology for all this is right now very primitive and/or requires human attention to work properly. Maybe you should ask the question again about 10-20 years from now. Frankly, I don't see much sense in putting cluster on orbit and than paying lots of money for sending human operators there too. So automatic operation is probably a must for this kind of projects. mental sanity and business issues - (sorry, I just couldn't stop myself :-) ) BTW I can't understand WHY anybody would like to place a cluster on orbit? For the control of some weapon system with sofisticated AI? For autonomic management of exploration mission? Do you have any concept of computational device that would work better on the orbit, by chance? Environmental issues? Nah. I doubt if we could build such big clusters anytime soon. Milions of units in one place? What for... You know, the idea is nice but if what I know is correct, one can do the same job on the surface, under the surface and even under the sea for a fraction of cost and without waiting for better tech. > > and it would restrict the set of problems that could be solved. I'm looking > > for papers/tech reps etc on the subject. > > > > Regards, > > > > Samir Chettro Probably you would suffer from signal propagation times. The longest path you have to deal on the Earth's surface is some 20 000 km. In case of a geostationary cluster the diameter is about 70 000 km and even longer if you want to send via the neighbours. Ok, I expect that you want to have more than one cluster up there. So I think you are restricted to the tasks with high processing / communication ratio. [...] 
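To put rough numbers on that propagation argument, here is a back-of-the-envelope sketch using the two distances quoted above (20,000 km for the longest terrestrial path, ~70,000 km across a geostationary constellation) and assuming free-space speed of light with no routing or processing delays:

#include <stdio.h>

int main(void)
{
    const double c_km_s = 299792.458;    /* speed of light in vacuum, km/s */
    const double paths_km[] = { 20000.0, 70000.0 };
    const char  *label[]   = { "half-way around the Earth",
                               "across a geostationary constellation" };

    for (int i = 0; i < 2; i++)
        printf("%-40s one-way latency >= %.0f ms\n",
               label[i], 1000.0 * paths_km[i] / c_km_s);
    /* prints roughly 67 ms and 233 ms: tightly coupled message passing is
       out, which is why only jobs with a high compute-to-communication
       ratio make sense on such a "cluster" */
    return 0;
}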
> -- > -------------------------------------------------------------------------- > Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu > -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- bye T. - -- ** A C programmer asked whether computer had Buddha's nature. ** ** As the answer, master did "rm -rif" on the programmer's home ** ** directory. And then the C programmer became enlightened... ** ** ** ** Tomasz Rola mailto:tomasz_rola at bigfoot.com ** -----BEGIN PGP SIGNATURE----- Version: PGPfreeware 5.0i for non-commercial use Charset: noconv iQA/AwUBPptEWBETUsyL9vbiEQIwDQCfUZtUICa+ecU5SsAjOHGLHg8yL9wAnRGT meGiR1y1vvZYrho51MkLqY+e =fwa/ -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Mon Apr 14 19:14:26 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 15 Apr 2003 09:14:26 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> References: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> Message-ID: <3E9B40D2.9010400@octopus.com.au> Richard Walsh wrote: > Duraid Madina wrote: > >>On FP-heavy workloads, perhaps. On integer workloads, hardly. And that's >>only comparing it against HP's own PA-8700 hardware! Compare it to more >>mainstream hardware and you'll see just how laughable Itanium 2 prices >>are. The Itanium 2 doesn't have significantly higher performance than >>today's Xeons. Opteron, at least for the time being, performs >>significantly better again. > > The SPECFP numbers rate the Itanium 2 at about 1425 (relative to the > base Sun) and the Pentium 4 at about 1100. That's about a 20% advantage > on floating point (PA-RISC rates a 600 I think). The integer ratio is > about 1100 to 800 in favor of Pentium 4. You think a 20% advantage justifies the price difference, or qualifies as significantly higher performance? > Bandwidth to memory as measured > by stream triad is 50% better on the Itanium implying that you will get > a larger percentage of peak for out-of-cache workloads. That's pretty pathetic in the light of Opteron and even today's Intel 875 desktop chipset. Is Madison going to bring Itanium 2 a new FSB? Nope. > Then there is the 64-bit address space, What a great reason to charge through the roof. > EPIC compiler technology, Even better!! > etc. ... but ... > > Itanium 2 prices seem high to me. However, the questions is really one for > Intel and HP ... is the current price generating enough volume to hit the > revenue sweet spot. They could care less whether I, you, or any random individual > buyer likes the price ;-). The price is right if they are maximizing the > time-integrated return on the product. Initial pricing should err high ... > you can always lower it, but can never raise it. Until Opteron is available, > the only, long-lived, direct competition is the Power 4 (is it available in 1 > and 2 processor configurations?). Well if we believe AMD, Opteron arrives next week. You go buy up your Itanium 2s (or POWER 4s). > Plus, why should Intel compete with their > own price-performance Pentium 4 systems by lowering Itanium 2 prices? Because if they lowered Itanium 2 prices, Opteron wouldn't have a market. It's too late now, for Itanium 2. Itanium 2.5/3 may be a different story (we can only hope). 
They > are serving two markets segments those with more money than brains and those > with more brains than money ... ;-). The market is quantized ... each product > has its own quantum number. Itanium 2 certainly seems to be in a superposition of "fantastic" and "worthless" that the computer market hasn't seen for quite some time. > I would be interested in SPECFP and Stream Triad numbers for the > Opteron if you have them. SPECfp2000 is ~1170 for a 2GHz 1MB L2 Opteron. Not too bad. The SPECint figure is fantastic though (~1200). I hate x86 as much as the next guy, but it looks like this is what I'm going to be working with for some time, _thanks to Intel and their incredibly uninspired pricing strategy_. Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From anand at novaglobal.com.sg Mon Apr 14 23:12:32 2003 From: anand at novaglobal.com.sg (Anand Vaidya) Date: Tue, 15 Apr 2003 11:12:32 +0800 Subject: Question regarding M-VIA & Linux Message-ID: <200304151112.38782.anand@novaglobal.com.sg> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I am attempting to setup a cluster with Linux (x86) & M-VIA (virtual interface). I find that the project has been abandoned last year. Also, most of the VIA drivers (eepro100, e1000, tulip etc) do not compile or if they compile, do not work as expected. They hang (tulip) or don't even recognise the ethernet card (e1000), or fail in vnettest (eepro100). Unfortunately, I don't have access to hamachi or syskonnect hardware. I would like to know whether any of you have production clusters running on MVIA, especially Intel GB NICs, since that is what I have (easy) access to. I would be grateful if you have any patches or related documents. Regards, Anand -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQE+m3ilQR28l/pNhTkRAhc1AKCjIGv8Nf2pexnUt6+X6OiV+Hu2xQCfb4Mg vqvnADWhSrysUuJqtB8cFIA= =pzwQ -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Tue Apr 15 10:40:29 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Tue, 15 Apr 2003 09:40:29 -0500 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <200304151440.h3FEeTY10399@mycroft.ahpcrc.org> On Mon Apr 14 19:13:40 2003, Mark Hahn wrote: >> about 1100 to 800 in favor of Pentium 4. Bandwidth to memory as measured >> by stream triad is 50% better on the Itanium implying that you will get >> a larger percentage of peak for out-of-cache workloads. Then there is > >until the next-gen P4 chipsets arrive (and they have). Have you see any P4 stream numbers that break the 3 GB/s level yet? What chipsets/boards? > >> I would be interested in SPECFP and Stream Triad numbers for the >> Opteron if you have them. > >me too . but if I understand AMD's marketing "plan", >we won't see the interesting Opteron systems at launch. >that is, since Opteron bandwidth scales with ncpus, it's >really the 4-8-way systems that will look dramatically >more attractive than any competitors (cept maybe Marvel). I agree on the SMP play. The Opteron's inter-chip interconnect capability resembles the EV7's ... on the other hand, HP can use the EV7 (Marvel) do defend I2's flank while Intel does something about its weak shared bus, 4-way SMP design. 
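For keeping the bandwidth claims in this thread straight, the peak figures are just bus-width times transfer-rate products. The sketch below assumes the commonly cited configurations (a 128-bit, 400 MT/s shared front-side bus for Itanium 2, and two independent dual-channel DDR333 memory controllers for a two-way Opteron); it reproduces the 6.4 GB/s and roughly 10.8 GB/s numbers being quoted, and is not taken from any vendor-published spec sheet.

#include <stdio.h>

/* peak = bus width (bytes) * transfer rate (million transfers/s) */
static double peak_gb_s(double width_bytes, double mt_per_s)
{
    return width_bytes * mt_per_s / 1000.0;   /* decimal GB/s */
}

int main(void)
{
    /* Itanium 2: one 128-bit FSB at 400 MT/s shared by all CPUs on the bus */
    printf("It2 shared FSB      : %.1f GB/s\n", peak_gb_s(16, 400));

    /* One Opteron: two 64-bit DDR333 channels on the CPU itself */
    double per_cpu = 2 * peak_gb_s(8, 333.33);
    printf("Opteron, per CPU    : %.1f GB/s\n", per_cpu);

    /* Two Opterons: the controllers are independent, so the peaks add.
       (PC2700 marketing rounds each channel to 2.7 GB/s, which is where
       the "10.8 GB/s aggregate" figure quoted above comes from.) */
    printf("dual Opteron, total : %.1f GB/s\n", 2 * per_cpu);
    return 0;
}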
>it is sort of interesting that much of It2's rep rests on >fairly single-threaded benchmarks (cfp2000, stream). but I don't >see a lot of people buying uniprocessor It2's, and all It2 >systems use a shared 6.4 GB/s FSB. by comparison, a dual-opt >has 10.8 GB/s aggregate, which starts to be interesting. Good points, but if the price on an I2 with 3 MB L2 cache comes down and you place it into a larger cluster context where people care less about a system's SMP content it could be a winner ... that was/is what PNNL is thinking I guess ... but theirs is a traditional supercomputer budget really. rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Tue Apr 15 11:08:20 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Tue, 15 Apr 2003 10:08:20 -0500 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> On Tue Apr 15 09:45:03 2003, Joseph Landman wrote: >I remember the Trace Multiflow in 1991 or so, where compiling my >molecular dynamics code took the better part of a day. Made debugging >interesting, as the "bug" only appeared in the optimized code. Don't think EPIC compile times compare to those of the MultiFlow, but I have no direct experience. I do think EPIC is valuable on several scores. First, it frees real estate on the chip by reducing/eliminatin out-of-order execution hardware allowing for larger caches (3 MB on chip today) and future additional functional unit parallelism or additional cores on the same chip. Second, it allows generated code to be tuned to the width (number of simultaneous instructions alowed) of the processor. Finally, its predicate/nat analysis can completely remove traditional stall points where the CPU must wait for data from memory or conditions to be computed before proceeding to execution. The last advantage is a useful way of using increasingly redundant core hardware to speed results through the processor. "Micro- threads/paths" are simultaneously computed using hardware that for the moment would be idle anyway and results that are later proven to to un-needed are be discarded. The benefits are hard to quantify, but I believe significant part of the I2's SpecFP score is EPIC derived. I am guessing others disagree ... ;-) ... Oui? rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Tue Apr 15 10:44:48 2003 From: landman at scalableinformatics.com (Joe Landman) Date: 15 Apr 2003 10:44:48 -0400 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E9B40D2.9010400@octopus.com.au> References: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> <3E9B40D2.9010400@octopus.com.au> Message-ID: <1050417888.3474.6.camel@squash.scalableinformatics.com> On Mon, 2003-04-14 at 19:14, Duraid Madina wrote: > What a great reason to charge through the roof. > > > EPIC compiler technology, (hauling out an old VLIW story) I remember the Trace Multiflow in 1991 or so, where compiling my molecular dynamics code took the better part of a day. Made debugging interesting, as the "bug" only appeared in the optimized code. 
Out of curiosity, is all the good compiler technology for IA64 going to be retained in the Intel (and other commercial) compilers? Someone had posted a link to a machine to play with and I had blown it away (the link that is)... could we get a quick repost of that? Thanks. -- Joseph Landman, Ph.D Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dsarvis at zcorum.com Tue Apr 15 10:08:40 2003 From: dsarvis at zcorum.com (Dennis Sarvis, II) Date: 15 Apr 2003 10:08:40 -0400 Subject: task sharing Message-ID: <1050415720.2122.12.camel@skull.america.net> I attempted to build a 2 node cluster, simply because my workstations are slow and irritating whilst developing web applications and graphics, with a crossover cable I made. They are running Redhat 9 (a P2 and a Celron550). The machines can see and talk to each other, but do not share tasks. My question is, what is the "best" software to control the cluster and share tasks for X-windows on Redhat 9, and is this even a suitable use? -- Web Applications Designer/Developer, Project Manager, Graphic Designer, Commercial Web Site Designer, Research & Development, Systems Administrator, etc... Dennis Sarvis, II _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Apr 15 12:36:59 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 15 Apr 2003 12:36:59 -0400 (EDT) Subject: beowulf in space In-Reply-To: <195MUg-178-00@etnus.com> Message-ID: On Tue, 15 Apr 2003, James Cownie wrote: > > > b) communications latency (bandwidth actually can be as big as you > > like or are likely to ever need, since you ARE a satellite, after > > all...:-) > > Well, 18 months ago ESA were getting 50Mb optically between satellites > > http://www.esa.int/export/esaCP/ESASGBZ84UC_Improving_0.html > > since that is > > 1) a Moore generation ago :-) (though I think development times are > longer in space technology) > 2) public information > > I expect that the people on the "dark side" who do this can indeed get > an awful lot of bandwidth... I think the relevant numbers that indicate the limits of what technology CAN do from satellites are more likely to come from looking at the humble satellite dish attached to many homes. Order of 100 channels in the television range, some of them HDTV, at a guess order of 100 MB/sec per second (assume order of a MB/sec per channel). Or look at phone satellites. And that is using only a small part of the spectrum, and ignores the possibility of multiple channels reusing the same spectrum with directional links. I would guess that one could, with some effort downlink some orders of magnitude more than gigabytes per second, and I meant bytes. How many orders (and I meant plural there, too) probably does depend on a lot of things including distance from earth, ambient atmospheric conditions in the intervening space between transmitter and receiver, what frequencies one is using, the number of directional-parallel channels one can maintain. A single visible-light laser link, for example, could likely carry many gigabits per second even allowing for atmospheric distortion. 
However, one of the NASA guys on the list probably knows at least the comsat or tvsat numbers (Jim?). And as you say, the military probably has lots of bandwidth down from theirs although how much they aren't likely to say. However, they take HIGH resolution pictures in a pretty much steady stream... I think we should just accept Jim's statement that near the earth we can get a "lot" of bandwidth on demand, but that things get dicier for obvious reasons when you get far away. Less power, harder to hold a tight beam, less signal to noise on both ends' receivers, more retransmissions. It's pretty astounding that we were able to get the incredible flyby pictures of Jupiter and the outer planets at all that we got, given the minute size of the spacecraft and their power supplies, their extreme distance from the earth, and the decades they were in space. Nasa does literally incredible engineering. Expensive, sure, but the REALLY expensive missions are the ones where something breaks and the whole investment (human and otherwise) is wasted. Let's also not forget who "invented" the beowulf, as well (tip of the hat to Nasa Goddard, Don and Tom and all the rest:-). I'm quite certain that they use beowulfish concepts all the time in their engineering, and Jim did an excellent job of indicating some of the reasons why. This isn't even inconsistent with the original beowulf idea -- sure, one would be silly to throw a general purpose cluster up into space to do e.g. my computations, but real optimizing beowulf engineering matches the design to the task. Of course they're going to engineer a "cluster" that matches their precise needs and specifications. It just isn't going to be doing work "for earth" -- it will be doing signal processing and so forth, and even there only when the economics of the available data bandwidth and/or robust engineering requirements dictate. If I were engineering a space vehicle, I'd make even the onboard navigation computer redundant. This might be more of a "high availability" model than high performance, but an ideal design might mix both. Lots of processors reduces the time required for a parallelizeable navigation computation AND can make the computer more robust against the failure of one or more processors -- as long as you have at least one left, you can complete key computations, just more slowly. Heck, from one point of view every compute node is already a specialized "parallel cluster" -- the system has a CPU and a variety of bridges, dedicated special purpose processors, and so forth all on board, so how could NASA NOT make "parallel" environments for spacecraft. The one thing they won't do is use off-the-shelf parts, and I can't blame them. The damn things break all the time here on earth. I'm typing away on a computer that has a dying hard drive, while waiting for it to rsync a final time with my fingers crossed. For me it is no biggie. A trip to Intrex, a hundred bucks (or more likely warranty replacement). In space that's kinda hard. SO although I'm certain that they use clusters on spacecraft in at least one sense of the word, I'm equally certain that they are NOT beowulfs, according to the standard definition. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mn216 at columbia.edu Tue Apr 15 11:12:36 2003 From: mn216 at columbia.edu (Murad Nayal) Date: Tue, 15 Apr 2003 11:12:36 -0400 Subject: beginner help References: <200304141425.SAA07170@nocserv.free.net> <3E9B17AA.1000806@octopus.com.au> <1050358151.6451.226.camel@woody> <3E9B42EE.7020509@octopus.com.au> <3E9B4421.9060307@octopus.com.au> Message-ID: <3E9C2164.7AB1843D@columbia.edu> Hello, I am new to the beowulf system. our cluster has been having problems mostly with bpsh where for example 'bpsh 0 ls' returns either nothing or errors: bpsh 0 ls ls: error while loading shared libraries: /lib/libtermcap.so.2: cannot read file data: Error 116 this sounds like /lib is not accessible on node 0. I wrote a small program to print the /lib directory contents to a file, and another program that uses bproc_execmove to run the previous program on node 0. both programs linked static as not to need dynamic linking. and in fact I do obtain a listing for /lib on node 0. as I said I am a novice and have no idea where to go from here. any suggestions. the problem seems to go away after reboot. many thanks in advance. Murad _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kdunder at sandia.gov Tue Apr 15 11:41:32 2003 From: kdunder at sandia.gov (Keith D. Underwood) Date: 15 Apr 2003 09:41:32 -0600 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> References: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> Message-ID: <1050421292.27085.9.camel@sadl16603.sandia.gov> > Don't think EPIC compile times compare to those of the MultiFlow, but > I have no direct experience. >From what I hear, compile times are not particularly good for EPIC... > The benefits are hard to quantify, but I believe significant part > of the I2's SpecFP score is EPIC derived. > > I am guessing others disagree ... ;-) ... Oui? You should actually look at those numbers. See here: http://www.spec.org/cpu2000/results/res2002q4/cpu2000-20021119-01859.html The only way you get graphs like that is when a couple of your benchmarks actually fit in cache. Benchmarks running from cache are not terribly representative of most real applications. Keith -- Keith D. Underwood _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Tue Apr 15 12:45:22 2003 From: landman at scalableinformatics.com (Joseph Landman) Date: Tue, 15 Apr 2003 12:45:22 -0400 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <1050421292.27085.9.camel@sadl16603.sandia.gov> References: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> <1050421292.27085.9.camel@sadl16603.sandia.gov> Message-ID: <3E9C3722.4070900@scalableinformatics.com> Keith D. Underwood wrote: > The only way you get graphs like that is when a couple of your > benchmarks actually fit in cache. Benchmarks running from cache are not > terribly representative of most real applications. 
I seem to remember that being one of my major complaints about SPEC in general. I would much prefer to see small, medium, large, huge, and I-cant-beleive-you-expect-results-from-something-that-size type runs than the old "fit-in-the-cache" variety. My runs, and my customers runs are quite a bit larger than the 3MB caches, so we tend to take SPEC with a kg or two of salt. -- Joseph Landman, Ph.D Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From derek.richardson at pgs.com Tue Apr 15 15:03:33 2003 From: derek.richardson at pgs.com (Derek Richardson) Date: Tue, 15 Apr 2003 14:03:33 -0500 Subject: Power supply problems (was: Machine Check Exception) In-Reply-To: <20030415114952S.hanzl@unknown-domain> References: <3E9AF3DD.1070002@pgs.com> <20030415114952S.hanzl@unknown-domain> Message-ID: <3E9C5785.5090006@pgs.com> Vaclav, Thanks for the info, I ending up finding the problem : CPU1. Which of course was the last thing I checked...it took a while for it to occur to me that a machine check exception might not be particular to the CPU that generates it. Thanks, Derek R. hanzl at noel.feld.cvut.cz wrote: >>Does anyone know if a power supply can cause a machine check exception >> >> > >Power supply can probably cause all sorts of weird problems, with >variety similar to RAM problems - problem can surface just anywhere >and resemble other hardware or software problem to such an extent that >usual diagnostic steps like replacing hardware and software components >clearly indicate that something else is faulty, replacing that >'faulty' component seems to fix it but later on problems reoccur. > >In another words, if your power supply works near the limits, problems >may be observed just on few nodes in the cluster (or on a single node) >and there may be just certain hardware/software/environmental >circumstances which trigger the problem. > >I've seen certain indications that these days power supply can cause >more headaches than before: > >- note from Abit tech staff saying that certain power supplies send >POWER_GOOD 'too early', meaning before on-board power conversion >circuits had time to stabilise > >- overclockers are starting to take power supply more seriously. >(However stupid overclocking is, their web sites gives good >indication which parts of hardware work near to the limits.) > >- note that certain Abit BIOS upgrade fixes PSU problems > >- many problems with Abit IT7 MAX2 and 300W PSU I've encountered >myself > >- fact that modern PSU is under software control and hardware control >of that part of mainboard which gets standby power, opening new >possibilities for intermixing PSU/hardware/software problems. > > >Most of my experience is Abit-related but may be general I am >affraid... > >Regards > >Vaclav > > -- Linux Administrator derek.richardson at pgs.com derek.richardson at ieee.org Office 713-781-4000 Cell 713-817-1197 bureaucracy, n: A method for transforming energy into solid waste. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alangrimes at starpower.net Tue Apr 15 17:27:49 2003 From: alangrimes at starpower.net (Alan Grimes) Date: Tue, 15 Apr 2003 14:27:49 -0700 Subject: beowulf in space References: Message-ID: <3E9C7955.AC911463@starpower.net> To the best of my limited understanding of space applications, the biggest problem of computing in space is not the actual computing but the communications. The biggest problem in sending 20 probes out into the universe is the issue of _TRACKING_ those 20 probes. NASA has found that its Deep Space Network is stretched thin tracking all the probes it has sent up over the years (in addition to its other astronomical duties)... The solution is to design a version of the internet so that the various probes can communicate among themselves, reducing the workload of the DSN to only a few targets (or eliminating the need for elaborate ground tracking altogether..) Some interesting projects could be: --- Establishing relay stations on the moon... There are no stable orbits around the moon, so any communications relay station would need to be ground based. Apparently there are certain points where the various gravitational forces balance out, called "Lagrange points". One would place relay satellites at these locations and then build comms towers on the ground to relay local traffic up to the satellites. This is tricky work because there would need to be a direct line of sight from your mission to the nearest comm tower and from there to one of the stationary satellites... The technologies for this have already been developed... The big challenge is, of course, establishing this network for operations on the so-called dark side of the moon... -- Establishing relays to distant planets: We would like to have networks going to Venus, Mars, and Jupiter... This would require at least one satellite around each of the remote planets and one in orbit around earth. A satellite in high orbit would have a clear line of sight to its target planet for 22+ hours a day and would only require a single ground station each. Such a network would drastically reduce the maintenance costs of all future missions... =) -- Having never read a manual, it takes less effort to hack something together with www.squeak.org than it does with C++ and five books. http://users.rcn.com/alangrimes/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From amitvyas_cse at hotmail.com Wed Apr 16 00:06:16 2003 From: amitvyas_cse at hotmail.com (amit vyas) Date: Wed, 16 Apr 2003 09:36:16 +0530 Subject: network(cluster) load balancing Message-ID: hi all, we are working on an experimental 3+1 node beowulf cluster (OSCAR 2.1 + Linux 7.3), and we have lately been thinking about how to use this cluster to provide various services for the college campus, mainly: 1. diskless clients (network booting) 2. X-terminals (XDMCP) 3. parallel programs. That is, we want to know how to modify the cluster so that it handles NETWORK LOAD BALANCING (NLB). I think I have made myself clear. Can anyone help us on how to deploy this mainframe-terminal model so as to demonstrate supercomputing power?
Lastly, RGB (rgb at phy.duke.edu) provided us with valuable information for our problem that helped us a lot. Thank you, RGB. Thanks in advance. Amit vyas RIET, JAIPUR INDIA CSE deptt. _________________________________________________________________ Find old batchmates. Renew lost friendship. http://www.batchmates.com/msn.asp Right here! _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eric at fnordsystems.com Wed Apr 16 01:02:16 2003 From: eric at fnordsystems.com (Eric Kuhnke) Date: Tue, 15 Apr 2003 22:02:16 -0700 Subject: Electricity Bill Message-ID: <5.2.0.9.2.20030415215707.02a62c70@66.250.215.18> I have a quick survey regarding electricity use, kWh rates, etc: 1) How much did you pay last year for the electricity and HVAC consumption of your cluster? 2) How big is it? What sort of CPUs? etc. 3) What are you paying in kilowatt-hour rates to the power company? 4) Would you have built a much larger cluster if the projected yearly electrical bill was significantly lower? E.g., if you were located in an area such as Vancouver or Winnipeg, which have the lowest electricity rates in North America. See http://www.bchydro.com/policies/rates/rates759.html for kWh rates (in Canadian currency). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Apr 16 10:51:11 2003 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 16 Apr 2003 07:51:11 -0700 (PDT) Subject: new to linux clustering In-Reply-To: Message-ID: <20030416145111.29937.qmail@web11404.mail.yahoo.com> MPI stuff ========= MPICH: http://www-unix.mcs.anl.gov/mpi/mpich/ LAM-MPI: http://www.lam-mpi.org (LAM has a mailing list) For batch systems, take a look at GridEngine: http://gridengine.sunsource.net Rayson --- Mohammad Tina wrote: > Hi, > what I am trying to do is for my class project: we want to set up a > cluster of 3 machines. I think we will run MPI applications, then if > it is possible I will try to set up a grid with another cluster. > > thanks > __________________________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. Bingo http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Tue Apr 15 19:13:34 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Wed, 16 Apr 2003 02:13:34 +0300 Subject: beowulf in space In-Reply-To: References: Message-ID: <200304160213.34165.exa@kablonet.com.tr> On Tuesday 15 April 2003 19:36, Robert G. Brown wrote: > SO although I'm certain that they use clusters on spacecraft in at least > one sense of the word, I'm equally certain that they are NOT beowulfs, > according to the standard definition. In my mind, the real difficulty comes not from the costs but from the exotic and diverse nature of the hardware. This might even more conceivably link up with the "grid computing" idea. If I had such a heterogeneous network and I needed to run some high-performance computation, could I simply submit it to a grid system that would automatically reconfigure the computation for optimal performance, etc.?
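One small ingredient of such a system is easy to sketch: weight the work handed to each node by its measured speed. The C fragment below does only that; the node names and relative benchmark scores are invented for illustration, and a real grid scheduler would also have to discover those scores itself and account for network topology, current load, and node failures.

#include <stdio.h>

struct node { const char *name; double score; };   /* higher = faster */

int main(void)
{
    /* Invented, illustrative node list -- not real benchmark data. */
    struct node nodes[] = {
        { "lab-p2-350",  1.0 },
        { "lab-p3-1000", 2.8 },
        { "onboard-dsp", 0.6 },
        { "xeon-2400",   6.5 },
    };
    int  n          = sizeof nodes / sizeof nodes[0];
    long total_work = 1000000;          /* e.g. loop iterations to split */

    double total_score = 0.0;
    for (int i = 0; i < n; i++)
        total_score += nodes[i].score;

    long assigned = 0;
    for (int i = 0; i < n; i++) {
        long share = (long)(total_work * nodes[i].score / total_score);
        if (i == n - 1)                 /* give any remainder to the last node */
            share = total_work - assigned;
        assigned += share;
        printf("%-12s gets %8ld units (%.1f%%)\n",
               nodes[i].name, share, 100.0 * share / total_work);
    }
    return 0;
}

The proportional split is the easy part; the hard part being pointed at here is discovering the scores, the topology, and the failure behaviour automatically.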
I'm thinking something like a space vessel being able to take advantage of all general-purpose processors on-board and in sufficiently close proximity (like say a space station). The problem abstractly not too different from a university campus with a lot of motley computer labs and unpredictable network setups (ie. dynamic nodes, network topology, so forth). Thus, I think it's more of a software problem. Can you really build "the" high performance platform that successfully and completely abstracts the OS/network/CPU? What would be needed for such a thing? (I'm not thinking 'Java', that's slow, thank you) Such software is surely in line with Beowulf thinking. Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Apr 16 12:34:10 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 16 Apr 2003 12:34:10 -0400 (EDT) Subject: beowulf in space In-Reply-To: <200304160213.34165.exa@kablonet.com.tr> Message-ID: On Wed, 16 Apr 2003, Eray Ozkural wrote: > On Tuesday 15 April 2003 19:36, Robert G. Brown wrote: > > SO although I'm certain that they use clusters on spacecraft in at least > > one sense of the word, I'm equally certain that they are NOT beowulfs, > > according to the standard definition. ... > Thus, I think it's more of a software problem. Can you really build "the" high > performance platform that successfully and completely abstracts the > OS/network/CPU? What would be needed for such a thing? (I'm not thinking > 'Java', that's slow, thank you) Such software is surely in line with Beowulf > thinking. Absolutely, although I'd refer to it more precisely as "cluster computing" thinking and not beowulfs per se. To be picky, a beowulf is "single machine" supercomputer built out of COTS (commodity, off the shelf) components, running an open source operating system, traditionally linux, and possibly some software such as Scyld or bproc that flattens PID space or is otherwise designed to promote that image of "a beowulf" as being a single machine. All beowulfs are clusters, not all clusters are beowulfs, see interminable discussions in years past in the archives and Kragen's FAQ. A spacecraft cluster will simply never be built with real COTS parts -- they are too unreliable and not nearly expensive enough:-). They might conceivably be built with "customized" parts that have a COTS origin -- a system homologous to or derived from a COTS design but subjected to a far more rigorous manufacture and testing regimen. It might also be built with one of the beowulf networks since COTS (in the usual sense of the term) or not in some cases they are the only game in town. So I wouldn't be incredibly surprised to see a spacecraft containing a bunch of "intel" or "amd" nodes, interconnected with e.g. SCI (because it is switchless and hence arguably more robust). Those nodes, however, will be built on motherboards and CPUs custom engineered for low power, radiation hardness, fault tolerance, redundancy, and tested ad nauseam before ever leaving the earth. 
It is cheaper to spend $100K or even more on each on those nodes ("identical" in function to a $2000 board+network interface here on earth) and be almost certain that they won't fail than it is to deal with the roughly 10% failure rate per year observed for at least one component in a lot of COTS systems. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at plogic.com Wed Apr 16 15:25:53 2003 From: deadline at plogic.com (Douglas Eadline) Date: Wed, 16 Apr 2003 14:25:53 -0500 (CDT) Subject: SMP and Network Connections Message-ID: Just posted some more SMP tests on www.cluster-rant.com. This time, I tested the interconnects and asked the question "What if a dual SMP used two Ethernet connections instead of one?" Seems to help! Take a look at: http://www.cluster-rant.com/article.pl?sid=03/04/16/1815257 to get the full report. Doug ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kjm31 at cu-genome.org Wed Apr 16 15:57:50 2003 From: kjm31 at cu-genome.org (Kristen J. McFadden) Date: Wed, 16 Apr 2003 15:57:50 -0400 Subject: Running perl scripts and non-mpi programs on scyld Message-ID: Hi, We have a Scyld Beowulf cluster currently running on 28cz-4 (we are getting -5 soon). We have been running into a lot of problems with users that are trying to run scripts on the child nodes. To start with, what is the best way to run serial (non-MPI) programs? Here is the current issue I'm trying to tackle. Say I have a perl script. (I NFS mount /usr /lib etc. on the child nodes) I want to run this perl script on N nodes with N DIFFERENT arguments. Right now, even when I write up a small file with "mpprun my_program arg1 arg2 | batch now" in 100 lines or something for all different arguments, bbq does NOT properly distribute these programs. It overloads some nodes and behaves essentially unpredictably. Is there any tools or info anyone has about running Perl scripts and the like safely on a Scyld implementation? Thanks, Kristen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Apr 16 17:10:03 2003 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 16 Apr 2003 14:10:03 -0700 (PDT) Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: Message-ID: <20030416211003.95964.qmail@web11404.mail.yahoo.com> What is the ratio of parallel programs and serial programs ran on your cluster?? Rayson --- "Kristen J. McFadden" wrote: > Hi, > > We have a Scyld Beowulf cluster currently running on 28cz-4 (we are > getting -5 soon). 
We have been running into a lot of problems with > users that are trying to run scripts on the child nodes. To start > with, what is the best way to run serial (non-MPI) programs? > > Here is the current issue I'm trying to tackle. > > Say I have a perl script. (I NFS mount /usr /lib etc. on the child > nodes) > > I want to run this perl script on N nodes with N DIFFERENT arguments. > > Right now, even when I write up a small file with "mpprun my_program > arg1 arg2 | batch now" in 100 lines or something for all different > arguments, bbq does NOT properly distribute these programs. It > overloads some nodes and behaves essentially unpredictably. > > > > Is there any tools or info anyone has about running Perl scripts and > the > like safely on a Scyld implementation? > > > > Thanks, Kristen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. Bingo http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Wed Apr 16 16:38:11 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Wed, 16 Apr 2003 13:38:11 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <1050417888.3474.6.camel@squash.scalableinformatics.com> References: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> <3E9B40D2.9010400@octopus.com.au> <1050417888.3474.6.camel@squash.scalableinformatics.com> Message-ID: <20030416203811.GB1149@greglaptop.internal.keyresearch.com> On Tue, Apr 15, 2003 at 10:44:48AM -0400, Joe Landman wrote: > Out of curiosity, is all the good compiler technology for IA64 going to > be retained in the Intel (and other commercial) compilers? Open64 has a GPLed IA64 backend. While it's unfortunate that SGI has stopped GPLing new work on it, it's still a pretty good compiler. It is currently being used by several companies for new cpus, and by some research groups, both compiling for the IA-64 and using it as a source-to-source tool. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Wed Apr 16 17:13:17 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Wed, 16 Apr 2003 14:13:17 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> References: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> Message-ID: <20030416211317.GC1149@greglaptop.internal.keyresearch.com> On Tue, Apr 15, 2003 at 10:08:20AM -0500, Richard Walsh wrote: > I do think EPIC is valuable on several scores. First, it frees real > estate on the chip by reducing/eliminatin out-of-order execution hardware > allowing for larger caches (3 MB on chip today) and future additional > functional unit parallelism or additional cores on the same chip. Nope. You can look up the size of the EPIC core; it's not small. It only can have 3 MB on chip cache today because it's the largest possible chip you can build. That cache is much larger than the processor core. 
> Second, it allows generated code to be tuned to the width (number > of simultaneous instructions allowed) of the processor. Good compilers have instruction scheduling that does this on other chips. While it's easier to understand what's going on when the parallelism is explicit, you'll find that scientific codes get a pretty amazing number of instructions per cycle on quite a few cpus and compilers. The promise of EPIC was that it would be easier to do this. You'll have to talk to some compiler people to find out if they think it was easier. The ones I know hate EPIC with a passion. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Wed Apr 16 17:53:50 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed, 16 Apr 2003 14:53:50 -0700 Subject: beowulf in space In-Reply-To: References: <200304160213.34165.exa@kablonet.com.tr> Message-ID: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> A >So I wouldn't be incredibly surprised to see a spacecraft containing a >bunch of "intel" or "amd" nodes, interconnected with e.g. SCI (because >it is switchless and hence arguably more robust). Those nodes, however, >will be built on motherboards and CPUs custom engineered for low power, >radiation hardness, fault tolerance, redundancy, and tested ad nauseam >before ever leaving the earth. It is cheaper to spend $100K or even >more on each on those nodes ("identical" in function to a $2000 >board+network interface here on earth) and be almost certain that they >won't fail than it is to deal with the roughly 10% failure rate per year >observed for at least one component in a lot of COTS systems. Interestingly, they needn't cost $100K... There are several firms that sell (flight qualified) processor cards with interfaces for less. This would generally be in a 6U form factor, conduction cooled, with some degree of radiation tolerance, and with "flight quality" parts. You can, for about $30-40K, buy a nifty hybrid package about 2.5x3.5 inches with a 21020 DSP, a bunch of RAM, various and sundry peripheral glue logic (timers, serial ports, etc.) and 3 high speed IEEE-1355 serial ports. There's also a SPARC version in the same package. Sandia is developing a rad hard Pentium, for those preferring an x86 processor. There's also a rad hard/tolerant PowerPC (133 MHz, I think) available from BAE. I'm pretty sure there's a '386 or '486 available as well. One of the appeals of a Beowulf kind of concept is the idea of using a bunch of commodity processors ganged together to get more processing resources. For space, the difference is that commodity means something a bit different. However, anytime you can spread the NRE cost across a system composed of a bunch of identical parts, it's a good thing. This is because you're always buying spares, redundant strings, engineering models, etc., and those can help to spread the development cost, so the "flight article" cost is less. There's also a non-negligible cost of having more items on the "bill of materials": each different kind of part needs drawings, documentation, test procedures, etc., a lot of which is what makes space stuff so expensive compared to the commercial parts (for which the primary cost driver is that of sand (raw materials) and marketing) so again, systems comprised of many identical parts have advantages. >James Lux, P.E.
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Wed Apr 16 19:41:36 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed, 16 Apr 2003 16:41:36 -0700 Subject: beowulf in space In-Reply-To: References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> Message-ID: <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> A > > There's also a non-negligble cost of having more items on the "bill of > > materials": each different kind of part needs drawings, documentation, > test > > procedures, etc., a lot of which is what makes space stuff so expensive > > compared to the commercial parts (for which the primary cost driver is > that > > of sand (raw materials) and marketing) so again, systems comprised of many > > identical parts have advantages. > >Hmmm, so the primary cost determinant of VLSIC's is the cost of sand...? > >Verrry Eeenteresting... > >Now marketing, that I'd believe;-) Say it costs a billion dollars to set up the fab (which can be spread over 2-3 years, probably), and maybe another half billion to design the processor (I don't know... 2500 work years seems like a lot, but...?)... How many Pentiums does Intel make? It's kind of hard to figure out just how many chips Intel makes in a given time (such being a critical aspect of their profitibility), but... consider that Intel Revenue for 2002 was about $27B.... As for marketing... in an article about P4s from April of 2001: Intel has told news sources that it plans to spend roughly $500 million to promote the new technology among software makers, and another $300 million on general advertising. Such enormous volumes are why commodity computing even works..The NRE for truly high performance computing devices is spread over so many units... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Apr 16 19:08:21 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 16 Apr 2003 19:08:21 -0400 (EDT) Subject: beowulf in space In-Reply-To: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> Message-ID: On Wed, 16 Apr 2003, Jim Lux wrote: > Interestingly, they needn't cost $100K... There are several firms that sell > (flight qualified) processor cards with interfaces for less. This would > generally be in a 6U form factor, conduction cooled, with some degree of > radiation tolerance, and with "flight quality" parts. I stand corrected. Perhaps general aviation and military creates a market large enough to be considered COTS in its own, somewhat elevated right. Cool. Seems useful to know. Perhaps I'll have to write a chapter on "Beowulfs in Space", or "Beowulfs in Super Secret Weapons Systems" (kidding!) in my online book. > > You can, for about $30-40K, buy a nifty hybrid package about 2.5x3.5 inches > with a 21020DSP, a bunch of RAM, various and sundry peripheral glue logic > (timers, serial ports, etc.) and 3 high speed IEEE-1355 serial ports. > > There's also a SPARC version in the same package. > > Sandia is developing a rad hard Pentium, for those preferring a x86 > processor. 
There's also a rad hard/tolerant PowerPC (133 MHz, I think) > available from BAE. I'm pretty sure there's a '386 or '486 available as well. > > One of the appeals of a Beowulf kind of concept is the idea of using a > bunch of commodity processors ganged together to get more processing > resources. For space, the difference is that commodity means something a > bit different. However, anytime you can spread the NRE cost across a > system composed of a bunch of identical parts, it's a good thing. This is > because you're always buying spares, redundant strings, engineering models, > etc., and those can help to spread the development cost, so the "flight > article" cost is less. > > There's also a non-negligble cost of having more items on the "bill of > materials": each different kind of part needs drawings, documentation, test > procedures, etc., a lot of which is what makes space stuff so expensive > compared to the commercial parts (for which the primary cost driver is that > of sand (raw materials) and marketing) so again, systems comprised of many > identical parts have advantages. Hmmm, so the primary cost determinant of VLSIC's is the cost of sand...? Verrry Eeenteresting... Now marketing, that I'd believe;-) rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Wed Apr 16 20:30:37 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Wed, 16 Apr 2003 17:30:37 -0700 (PDT) Subject: Electricity Bill In-Reply-To: <5.2.0.9.2.20030415215707.02a62c70@66.250.215.18> Message-ID: If we were located in Winnipeg, our cooling costs would be way lower in the winter... joelja On Tue, 15 Apr 2003, Eric Kuhnke wrote: > I have a quick survey regarding electricity use, kW/H rates, etc: > > 1) How much did you pay last year for the electricity and HVAC consumption > of your cluster? > > 2) How big is it? What sort of CPUs? etc > > 3) What are you paying in kilowatt-hour rates to the power company? > > 4) Would you have built a much larger cluster if the projected yearly > electrical bill was significantly lower? Ex: if you were located in an > area such as Vancouver or Winnipeg, lowest electricity rates in North > America. See http://www.bchydro.com/policies/rates/rates759.html for kwH > rates (in Canadian currency). > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. 
-- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From young_yuen at yahoo.com Wed Apr 16 23:22:59 2003 From: young_yuen at yahoo.com (Young Yuen) Date: Wed, 16 Apr 2003 20:22:59 -0700 (PDT) Subject: problem with ANA-6911A/TX under kernel 2.4.18 In-Reply-To: <20030413163641.31713.qmail@web41303.mail.yahoo.com> Message-ID: <20030417032259.18109.qmail@web41303.mail.yahoo.com> Sorry but is this the right place for questions for problems with the tulip (DEC chip based NIC) driver? --- Young Yuen wrote: > Hi, > > The Tulip driver doesn't seem to detect the RJ45 > port. > My kernel ver is 2.4.18 and Tulip driver ver is > 0.9.15. > > Linux Tulip driver version 0.9.15-pre11 (May 11, > 2002) > tulip0: EEPROM default media type Autosense. > tulip0: Index #0 - Media MII (#11) described by a > 21142 MII PHY (3) block. > tulip0: Index #1 - Media 10base2 (#1) described by > a > 21142 Serial PHY (2) block. > tulip0: ***WARNING***: No MII transceiver found! > divert: allocating divert_blk for eth0 > eth0: Digital DS21143 Tulip rev 33 at 0xc6855000, > 00:00:D1:00:0B:4B, IRQ 11. > > Somtimes after a reboot the warning message is gone. > > tulip0: MII transceiver #1 config 3100 status 7809 > advertising 0101. > divert: allocating divert_blk for eth0 > eth0: Digital DS21143 Tulip rev 33 at 0xc6855000, > 00:00:D1:00:0B:4B, IRQ 11. > > But in either cases, it fails to ping any nodes on > the > network besides its own. ANA-6911A/TX is a > 100BaseT/10Base2 combo card, RJ45 port is > connected to LAN. Windows dual boot from the same > machine works fine shows no problem with the network > configuration or hardware. > > Can you please kindly advise. > > Thx & Rgds, > Young > > __________________________________________________ > Do you Yahoo!? > Yahoo! Tax Center - File online, calculators, forms, > and more > http://tax.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. Bingo http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Wed Apr 16 23:21:51 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Wed, 16 Apr 2003 21:21:51 -0600 Subject: beowulf in space In-Reply-To: <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> Message-ID: <20030417032151.GA13826@plk.af.mil> I think I'm jumping into the middle of a conversation here, but our branch is the shop through which most of the DoD processor programs are managed. For real space applications there are radiation issues like total dose hardness and single even upset that require special design and, still, special processing. That is, you can't make these parts at any foundry (yet). There are currently two hardened foundries through which the most tolerant parts are fabricated. 
Where the commercial market is ~100's of Billions/year, the space electronics industry is ~200million/year. So parts are expensive, as Jim Lux says. But more importantly, the current state-of-the-art for space processors is several generations back. Now, with a 200 million market/year, who is going to spend the money to build a new foundry? (anyone?) It's a huge problem, and beowulfs in space will not give the economies of scale necessary to move us forward. I don't know if this has been discussed here, but have you thought about launch costs? They're huge. Weight, power, and mission lifetime are the crucial factors for space. These are the reasons that so much R&D goes into space electronics. I apologize if I have gone over old ground. Art Edwards On Wed, Apr 16, 2003 at 04:41:36PM -0700, Jim Lux wrote: > A > >> There's also a non-negligble cost of having more items on the "bill of > >> materials": each different kind of part needs drawings, documentation, > >test > >> procedures, etc., a lot of which is what makes space stuff so expensive > >> compared to the commercial parts (for which the primary cost driver is > >that > >> of sand (raw materials) and marketing) so again, systems comprised of > >many > >> identical parts have advantages. > > > >Hmmm, so the primary cost determinant of VLSIC's is the cost of sand...? > > > >Verrry Eeenteresting... > > > >Now marketing, that I'd believe;-) > > Say it costs a billion dollars to set up the fab (which can be spread over > 2-3 years, probably), and maybe another half billion to design the > processor (I don't know... 2500 work years seems like a lot, but...?)... > How many Pentiums does Intel make? It's kind of hard to figure out just how > many chips Intel makes in a given time (such being a critical aspect of > their profitibility), but... > > consider that Intel Revenue for 2002 was about $27B.... > > As for marketing... in an article about P4s from April of 2001: > Intel has told news sources that it plans to spend roughly $500 million to > promote the new technology among software makers, and another $300 million > on general advertising. > > > Such enormous volumes are why commodity computing even works..The NRE for > truly high performance computing devices is spread over so many units... > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Thu Apr 17 03:24:17 2003 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Thu, 17 Apr 2003 09:24:17 +0200 Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: References: Message-ID: <20030417092417S.hanzl@unknown-domain> > We have a Scyld Beowulf cluster currently running on 28cz-4 (we are > getting -5 soon). We have been running into a lot of problems with > users that are trying to run scripts on the child nodes. To start > with, what is the best way to run serial (non-MPI) programs? > ... > Say I have a perl script. (I NFS mount /usr /lib etc. 
on the child > nodes) > > I want to run this perl script on N nodes with N DIFFERENT arguments. For these types of jobs, we are using SGE on a Scyld-like cluster (we are using HDDCS, which is a variant of Clustermatic, which is similar to Scyld, but this should not matter here). SGE is a quite nice, open-source batch spooling system. Using it with a Scyld-like cluster for this type of job is a bit tricky but quite easy. We just create one 'queue' for every slave node and use the node number as the queue name. Then we use a 'starter method' script like this:

file /usr/local/bin/sge-bproc-starter-method:

#!/bin/sh
bpsh $QUEUE $*

All these queues are defined as running on the master node, but the starter method in fact moves the perl scripts onto the individual slave nodes. To run scripts on N nodes with N DIFFERENT arguments, you may use 'array jobs' or submit many individual jobs. (And there is much more you can do with SGE, I highly recommend it.) Regards Vaclav Hanzl _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From scheinin at crs4.it Thu Apr 17 03:56:48 2003 From: scheinin at crs4.it (Alan Scheinine) Date: Thu, 17 Apr 2003 09:56:48 +0200 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <200304170756.h3H7umB02357@dali.crs4.it> Greg Lindahl wrote: Good compilers have instruction scheduling that does this on other chips. While it's easier to understand what's going on when the parallelism is explicit, you'll find that scientific codes get a pretty amazing number of instructions per cycle on quite a few cpus and compilers. The promise of EPIC was that it would be easier to do this. You'll have to talk to some compiler people to find out if they think it was easier. The ones I know hate EPIC with a passion. ================================================= I do not think there was a promise that getting efficiency would be easier with EPIC. My understanding of the situation is that the logic of dynamic allocation of resources, that is, the various tricks done in silicon, could not scale to a large number of processing units on a chip. That is, the complexity grows faster than linear, much faster. If you take that as a postulate, then it is logical to conclude that optimization must move to the compiler. The problem is that writing a compiler to maximize efficiency is difficult. Fifteen years ago I heard a talk in which it was claimed that compiler advances developed at universities arrive in commercial compilers after a delay of ten years. More recently people tell me that the development cycle is shorter, but nonetheless, writing optimizing compilers is a very difficult task. Greg Lindahl wrote that "The ones [compiler people] I know hate EPIC with a passion". Why? Do they say that the concept is wrong, or is the problem that they cannot meet their deadlines because of the quantity of analysis that has been moved from the silicon to the compiler writer? This is not a rhetorical question; it would be interesting to learn more details from the "compiler people". By the way, it may be a good idea to develop more packages like Atlas and FFTW which optimize themselves based on the actual computer, since memory latency and other factors are variable. But then, optimizing through experimentation takes a long time.
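As a toy illustration of that optimize-by-experiment idea (enormously simpler than what Atlas or FFTW actually do), the C sketch below times two traversal orders of the same array on the machine at hand and keeps whichever is faster; the matrix size and the kernel itself are arbitrary choices made only for the sketch.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 1500

static double a[N][N];

static double sum_rowwise(void)
{
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

static double sum_colwise(void)
{
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}

static double time_it(double (*f)(void), double *result)
{
    clock_t t0 = clock();
    *result = f();
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = (double)rand() / RAND_MAX;

    double r1, r2;
    double t_row = time_it(sum_rowwise, &r1);
    double t_col = time_it(sum_colwise, &r2);

    printf("row-wise: %.3f s   column-wise: %.3f s   (sums %.3f / %.3f)\n",
           t_row, t_col, r1, r2);
    printf("chosen variant: %s\n", t_row <= t_col ? "row-wise" : "column-wise");
    return 0;
}

Real auto-tuners search a much larger space of block sizes and code variants, which is exactly why the search takes as long as it does.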
-- Alan Scheinine _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 17 08:14:38 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 17 Apr 2003 08:14:38 -0400 (EDT) Subject: beowulf in space In-Reply-To: <20030417032151.GA13826@plk.af.mil> Message-ID: On Wed, 16 Apr 2003, Art Edwards wrote: > I think I'm jumping into the middle of a conversation here, but our > branch is the shop through which most of the DoD processor programs are > managed. For real space applications there are radiation issues like > total dose hardness and single even upset that require special design > and, still, special processing. That is, you can't make these parts at > any foundry (yet). There are currently two hardened foundries through > which the most tolerant parts are fabricated. Where the commercial > market is ~100's of Billions/year, the space electronics industry is > ~200million/year. So parts are expensive, as Jim Lux says. But more > importantly, the current state-of-the-art for space processors is > several generations back. Now, with a 200 million market/year, who is > going to spend the money to build a new foundry? (anyone?) It's a huge > problem, and beowulfs in space will not give the economies of scale > necessary to move us forward. > > I don't know if this has been discussed here, but have you thought about > launch costs? They're huge. Weight, power, and mission lifetime are the > crucial factors for space. These are the reasons that so much R&D goes > into space electronics. I apologize if I have gone over old ground. Actually, this is the sort of thing that makes (as Eray pointed out) the idea of a cluster (leaving aside the COTS issue, the single-headed issue, and whether or not it could be a true "beowulf" cluster) attractive in space applications. What you (and Gerry) are saying is that the space and DoD market is stuck using specially engineered, radiation hard, not-so-bleeding-VLSI processors from what amounts to several VLSI generations ago. The parts are expensive, but the cost of building a newer better foundry for such a small and inelastic market are prohibitive, so they are the only game in town. If you have an orbital project or application that needs considerably more speed than the undoubtedly pedestrian clock of these devices can provide, you have a HUGE cost barrier to developing a faster processor, and that barrier is largely out of your (DoD) or Nasa's control -- you can only ask/hope for an industrial partner to make the investment required to up the chip generation in hardened technology with the promise of at least some guaranteed sales. You also have a known per kilogram per liter cost for lifting stuff into space, and this is at least modestly under your own control. So (presuming an efficiently parallelizable task) instead of effectively financing a couple of billion dollars in developing the nextgen hard chips to get a speedup of ten or so, you can engineer twelve systems based on the current, relatively cheap chips into a robust and fault tolerant cluster and pay the known immediate costs of lifting those twelve systems into orbit. 
Again presuming that it is for some reason not feasible to simply establish a link to earth and do the processing here -- an application for which the latency would be bad, an application that requires immediate response in a changing environment when downlink communications may not be robust. A question that you or Gerry or Jim may or may not be able to answer (with which Chip started this discussion): Are there any specific non-classified instances that you know of where an actual "cluster" (defined loosely as multiple identical CPUs interconnected with some sort of communications bus or network and running a specific parallel numerical task, not e.g. task-specific processors in several parts of a military jet) has been engineered, built, and shot into space? This has been interesting enough that if there are any, I may indeed add a chapter to the book, if/when I next actually work on it. I got dem end of semester blues, at the moment...:-) rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Thu Apr 17 01:29:58 2003 From: astroguy at bellsouth.net (c.clary) Date: Thu, 17 Apr 2003 01:29:58 -0400 Subject: beowulf in space References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> <20030417032151.GA13826@plk.af.mil> Message-ID: <3E9E3BD6.8030706@bellsouth.net> Art Edwards wrote: >I think I'm jumping into the middle of a conversation here, but our >branch is the shop through which most of the DoD processor programs are >managed. For real space applications there are radiation issues like >total dose hardness and single even upset that require special design >and, still, special processing. That is, you can't make these parts at >any foundry (yet). There are currently two hardened foundries through >which the most tolerant parts are fabricated. Where the commercial >market is ~100's of Billions/year, the space electronics industry is >~200million/year. So parts are expensive, as Jim Lux says. But more >importantly, the current state-of-the-art for space processors is >several generations back. Now, with a 200 million market/year, who is >going to spend the money to build a new foundry? (anyone?) It's a huge >problem, and beowulfs in space will not give the economies of scale >necessary to move us forward. > >I don't know if this has been discussed here, but have you thought about >launch costs? They're huge. Weight, power, and mission lifetime are the >crucial factors for space. These are the reasons that so much R&D goes >into space electronics. I apologize if I have gone over old ground. > >Art Edwards > >On Wed, Apr 16, 2003 at 04:41:36PM -0700, Jim Lux wrote: > > >>A >> >> >>>>There's also a non-negligble cost of having more items on the "bill of >>>>materials": each different kind of part needs drawings, documentation, >>>> >>>> >>>test >>> >>> >>>>procedures, etc., a lot of which is what makes space stuff so expensive >>>>compared to the commercial parts (for which the primary cost driver is >>>> >>>> >>>that >>> >>> >>>>of sand (raw materials) and marketing) so again, systems comprised of >>>> >>>> >>>many >>> >>> >>>>identical parts have advantages. 
>>>> >>>> >>>Hmmm, so the primary cost determinant of VLSIC's is the cost of sand...? >>> >>>Verrry Eeenteresting... >>> >>>Now marketing, that I'd believe;-) >>> >>> >>Say it costs a billion dollars to set up the fab (which can be spread over >>2-3 years, probably), and maybe another half billion to design the >>processor (I don't know... 2500 work years seems like a lot, but...?)... >>How many Pentiums does Intel make? It's kind of hard to figure out just how >>many chips Intel makes in a given time (such being a critical aspect of >>their profitibility), but... >> >>consider that Intel Revenue for 2002 was about $27B.... >> >>As for marketing... in an article about P4s from April of 2001: >>Intel has told news sources that it plans to spend roughly $500 million to >>promote the new technology among software makers, and another $300 million >>on general advertising. >> >> >>Such enormous volumes are why commodity computing even works..The NRE for >>truly high performance computing devices is spread over so many units... >> >>_______________________________________________ >>Beowulf mailing list, Beowulf at beowulf.org >>To change your subscription (digest mode or unsubscribe) visit >>http://www.beowulf.org/mailman/listinfo/beowulf >> >> > > > Dear sir, Plz feel free to jump right in, nice to have you posting on this most exceptional list,( for the most part one of the best on the web, IMHO)... But you do bring to mind an excellent point.. One of endless debate since I can recall in my early days of high school science club and launching rockets and modeling ballistic scenario's at the local Wofford College computer lab time that Dr. Olds was so generous and kind to provide... What we concluded then and applies equally as to the current discussion is that cost of access to space could be greatly reduce if we changed the launch platform to that of the earliest days of high speed space research... such as the X-15 project... Some of us went on to working world married a gypsy princes and so locked into a certain destiny... Others in our class went on to places like M.I.T where they continued to pursue their space dreams... Like David Thompson founder of Orbital Research and the launch of the first commercial space rocket called Project Pegasus ... Which was, in fact, first carried into space by the same B-52 used to launch the X-15... I think recent events clearly demonstrate that there is certainly a need to re visit this equation.... Everything old is new again... "Generations come and generations go... and they have no memory." Thanks again Art, nice to have your post C.Clary Spartan sys. analyst PO 1515 Spartanburg, SC 29304-0243 Fax# (801) 858-2722 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mack.joseph at epa.gov Thu Apr 17 10:45:51 2003 From: mack.joseph at epa.gov (Joseph Mack) Date: Thu, 17 Apr 2003 10:45:51 -0400 Subject: SMP and Network Connections References: Message-ID: <3E9EBE1F.3E769C13@epa.gov> Douglas Eadline wrote: > > Just posted some more SMP tests on www.cluster-rant.com. > This time, I tested the interconnects and asked the > question "What if a dual SMP used two Ethernet connections > instead of one?" Seems to help! Take a look at: Thanks for your work and write up. I did some performance tests a few years ago on a router, using multiple copies of netpipe through a single interface to a set of nodes, to determine the effect of multiple streams on throughput on the router (this was 100Mbps ethernet). 
I found that as I increased the number of nodes connecting to the router, the throughput increased, rising above 100Mbps (when I totalled the throughput from each netpipe job). Looking at the netpipe code I saw that netpipe waits for a quiet time on the network before entering the next round of the test. Thus for 4 connections, if each instance of netpipe waited for a quiet time to run the test on the next packet size, I could (in principle) get the result of 4 connections of 100Mbps for a total of 400Mbps. I contacted the netpipe author, who sent me a preliminary version of a multi-netpipe, where multiple connections are synchronised and stepped through the range of packet sizes together. He said that it wasn't ready to use and I didn't have time to work on it myself. I never solved the multiple connection problem and wound up doing tests with a single connection. Do you know if this problem is affecting your measurements? (The report is at http://www.linuxvirtualserver.org/Joseph.Mack/performance/single_realserver_performance.html) Joe -- Joseph Mack PhD, Senior Systems Engineer, SAIC contractor to the National Environmental Supercomputer Center, ph# 919-541-0007, RTP, NC, USA. mailto:mack.joseph at epa.gov _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mof at labf.org Thu Apr 17 11:47:34 2003 From: mof at labf.org (Mof) Date: Fri, 18 Apr 2003 01:17:34 +0930 Subject: beowulf in space In-Reply-To: <3E9DFC75.50504@tamu.edu> References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> <3E9DFC75.50504@tamu.edu> Message-ID: <200304180117.35195.mof@labf.org> OK, excuse my ignorance, but what is involved in rad-hardening hardware? Is the cost really necessary, in that couldn't you put the unprotected hardware into some sort of shielded container? Or am I just being silly? :-) Mof. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Thu Apr 17 12:33:12 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Thu, 17 Apr 2003 09:33:12 -0700 (PDT) Subject: beowulf in space In-Reply-To: <200304180117.35195.mof@labf.org> Message-ID: High levels of ionizing radiation can induce shorts in semiconductor junctions... if you short a junction that has a voltage source behind it you can do serious damage to whatever is on the other side. Then you have issues like higher instances of single-bit errors, the need for all-ceramic chip packages, and probably a couple of other things I've already forgotten. Taken as a whole they make substantial redesign necessary for components that were not designed to work in this environment from the outset. The other thing to keep in mind is that hardening systems against nuclear attacks is a substantially different exercise, given the short-term nature of that particular radiation exposure... joelja On Fri, 18 Apr 2003, Mof wrote: > OK, excuse my ignorance, but what is involved in rad-hardening hardware? > Is the cost really necessary, in that couldn't you put the unprotected > hardware into some sort of shielded container? > Or am I just being silly? :-) > > Mof.
> _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Wed Apr 16 20:59:33 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Wed, 16 Apr 2003 19:59:33 -0500 Subject: beowulf in space References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> Message-ID: <3E9DFC75.50504@tamu.edu> We can consider a 10e2 (or more) cost multiplier for space-qualified hardware, excluding the design work someone like Harris does to radiation harden the processors... and memory... and glue-logic. Intel doesn't tend to make space-qualified hardware, or rad-hard hardware. They license that out to Harris and some of the research labs. Now: Using industrial-grade devices is more cost effective, and loses some of the paperwork burden (the 2 are tied intimately). But nothing's been done about radiation hardening. Which is an issue. Let's talk about radiation hardening and single-event upsets. Radiation hardening refers, generally to resistance to the effects of transient bit resets due to hits by heavy particles. (Is that the sound of RGB winding up?) Transient bit flips are one thing: You have to do error detection (and correction?) but the device recovers. In spacecraft memory, one runs almost continuous housecleaning code to detect permanent holes and remap the memory around them. This is a very important aspect of planning. If we're talking about losing enough cycles to housecleaning to drag our processing power down, are we really gaining much in "flying" a cluster? Ah, yes... speed. It's generally accepted in building flight processors, that the faster they go, the easier they are to upset. Thus , that 3GHz Pentium.... Oh. Sorry. The 2.4GHz (non-vaporware) device is significantly more prone to SEU than the Pentium I/166. Trace/mask sizing makes a difference. The finer the lines, the more prone to failure. So, once again, the old stuff (especially CMOS) outlasts the new x-ray lithography chips. OKAY. Pretty pessimistic. The real world of space-qualified processors _IS_ conservative, as changing a CPU requires a service call of a couple of hundred miles (vertical) plus the delta-v and guidance to manage to match orbits... So you get your industrial-grade devices, burn 'em in on the ground, in higher-than-expected temperatures ("accelerated life testing") and qualify your systems that way. You review the literature (Sandia National Labs has some great stuff) and decide the break-points for memory, processor and bus speeds. Overclocking is _right_out_. You design your spaceframe to accommodate adequate cooling (remember those heat-pipes for the new processors? 
Ever wonder where the technology came from? Thank the USAF.) You add some layered polyethylene and gold layers to improve hardening, and you rewrite your code to accomplish memory and (processor) register housecleaning. It's not impossible but it's not quite the same as building a 256 node COTS cluster, either. gerry Jim Lux wrote: > A > >> > There's also a non-negligble cost of having more items on the "bill of >> > materials": each different kind of part needs drawings, >> documentation, test >> > procedures, etc., a lot of which is what makes space stuff so expensive >> > compared to the commercial parts (for which the primary cost driver >> is that >> > of sand (raw materials) and marketing) so again, systems comprised >> of many >> > identical parts have advantages. >> >> Hmmm, so the primary cost determinant of VLSIC's is the cost of sand...? >> >> Verrry Eeenteresting... >> >> Now marketing, that I'd believe;-) > > > Say it costs a billion dollars to set up the fab (which can be spread > over 2-3 years, probably), and maybe another half billion to design the > processor (I don't know... 2500 work years seems like a lot, but...?)... > How many Pentiums does Intel make? It's kind of hard to figure out just > how many chips Intel makes in a given time (such being a critical aspect > of their profitibility), but... > > consider that Intel Revenue for 2002 was about $27B.... > > As for marketing... in an article about P4s from April of 2001: > Intel has told news sources that it plans to spend roughly $500 million > to promote the new technology among software makers, and another $300 > million on general advertising. > > > Such enormous volumes are why commodity computing even works..The NRE > for truly high performance computing devices is spread over so many > units... > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Thu Apr 17 12:22:47 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Thu, 17 Apr 2003 12:22:47 -0400 (EDT) Subject: beowulf in space In-Reply-To: <200304180117.35195.mof@labf.org> Message-ID: > Is the cost really necessary, in that couldn't you put the unprotected > hardware into some sort of shielded container ? that would clearly work, but would be very heavy. I'm definitely not in the field, though. and to me, a 100x multiplier makes the whole idea very dubious - why not just use fast hardware and run every task 3 times and vote for the results? obviously, there are some places where software/temporal redundancy can't be used. 
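For readers unfamiliar with the idea, the "run every task 3 times and vote" scheme mentioned above amounts to software triple-modular redundancy. A minimal sketch in C follows; compute_checksum() and run_with_tmr() are hypothetical stand-ins for whatever deterministic task is being protected, not code from any real flight system.

/* Triple-run majority voting, as in "run every task 3 times and vote".
 * compute_checksum() stands in for any deterministic computation.
 */
#include <stdio.h>
#include <stdint.h>

static uint32_t compute_checksum(const uint32_t *data, int n)
{
    uint32_t sum = 0;
    int i;
    for (i = 0; i < n; i++)
        sum = sum * 31 + data[i];   /* deterministic, so reruns should agree */
    return sum;
}

/* Returns 0 and stores the voted result if at least two runs agree,
 * -1 if all three disagree (uncorrectable upset: rerun or flag). */
static int run_with_tmr(const uint32_t *data, int n, uint32_t *voted)
{
    uint32_t a = compute_checksum(data, n);
    uint32_t b = compute_checksum(data, n);
    uint32_t c = compute_checksum(data, n);

    if (a == b || a == c) { *voted = a; return 0; }
    if (b == c)           { *voted = b; return 0; }
    return -1;
}

int main(void)
{
    uint32_t data[4] = { 1, 2, 3, 4 };
    uint32_t result;

    if (run_with_tmr(data, 4, &result) == 0)
        printf("voted result: %lu\n", (unsigned long)result);
    else
        printf("all three runs disagreed -- rerun the task\n");
    return 0;
}

On the ground all three runs of a pure function will trivially agree; the point is that on upset-prone hardware a single flipped bit in one run is outvoted by the other two, and a three-way disagreement is at least detected rather than silently propagated.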
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu Apr 17 12:41:39 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 17 Apr 2003 09:41:39 -0700 Subject: beowulf in space In-Reply-To: References: <20030417032151.GA13826@plk.af.mil> Message-ID: <5.1.0.14.2.20030417092926.030d5c80@mailhost4.jpl.nasa.gov> >If you have an orbital project or application that needs considerably >more speed than the undoubtedly pedestrian clock of these devices can >provide, you have a HUGE cost barrier to developing a faster processor, >and that barrier is largely out of your (DoD) or Nasa's control -- you >can only ask/hope for an industrial partner to make the investment >required to up the chip generation in hardened technology with the >promise of at least some guaranteed sales. You also have a known per >kilogram per liter cost for lifting stuff into space, and this is at >least modestly under your own control. So (presuming an efficiently >parallelizable task) instead of effectively financing a couple of >billion dollars in developing the nextgen hard chips to get a speedup of >ten or so, you can engineer twelve systems based on the current, >relatively cheap chips into a robust and fault tolerant cluster and pay >the known immediate costs of lifting those twelve systems into orbit. > >A question that you or Gerry or Jim may or may not be able to answer >(with which Chip started this discussion): Are there any specific >non-classified instances that you know of where an actual "cluster" >(defined loosely as multiple identical CPUs interconnected with some >sort of communications bus or network and running a specific parallel >numerical task, not e.g. task-specific processors in several parts of a >military jet) has been engineered, built, and shot into space? I was involved with development of a breadboard scatterometer ( a specialized type of radar that measures the radar reflectivity of the target (the ocean surface, in this case)) using multiple off the shelf space qualified DSP processors to get the numerical processing crunch needed. It was more a proof of concept or feasibility demonstration than a flight instrument, and designed to provide a reasonable basis for cost estimates for an eventual flight instrument. It was specifically the concept you address above: You're not going to get one special processor built custom for you at a reasonable price, but you can get a bunch of generic ones and gang em together. The "going in constraint" was that the approach had to use existing off the shelf flight qualified technology, which in this case is the rad tolerant ADSP21020 clone funded by ESA, made by Atmel/Temic. We used SpaceWire as the interconnect (it's a routable high speed serial link, based on IEEE 1355), wrote drivers that implement a subset of MPI, and did all the fancy stuff in fairly vanilla C doing the interprocessor comms with calls to the MPI-like API. The breadboard illustrated scalability (i.e. you could add and drop identical processors to achieve any desired performance; manifested as either "amount of signal processing required" or "max pulse repetition frequency handled") Interestingly, mass wasn't a big design driver (adding a processor to the cluster only adds <1kg to an instrument that already is on the order of 100kg). 
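To make the programming model concrete, the sketch below shows, in ordinary C with standard MPI calls, the kind of add-or-drop-identical-processors pattern Jim Lux describes above. It is purely illustrative: the instrument's actual MPI-like driver API, data formats, and per-pulse signal processing are not reproduced here, and names such as process_pulse, PULSE_LEN, NUM_PULSES, TAG_WORK and TAG_DONE are made up for the example. Rank 0 plays the front-end handing out pulses; the remaining ranks are interchangeable workers, so capacity scales with the number of ranks launched.

#include <mpi.h>
#include <stdio.h>

#define PULSE_LEN  256   /* samples per pulse (illustrative) */
#define NUM_PULSES 64
#define TAG_WORK   1
#define TAG_DONE   2

static double process_pulse(const double *x, int n)
{
    double acc = 0.0;
    int i;
    for (i = 0; i < n; i++)
        acc += x[i] * x[i];   /* stand-in for the real per-pulse processing */
    return acc;
}

int main(int argc, char **argv)
{
    int rank, size, i;
    double pulse[PULSE_LEN];
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "need at least one worker rank\n");
        MPI_Finalize();
        return 1;
    }

    if (rank == 0) {                 /* front-end: self-scheduling master */
        int sent = 0, done = 0, w;
        double result, total = 0.0;

        /* prime each worker with one pulse of fake sensor data */
        for (w = 1; w < size && sent < NUM_PULSES; w++, sent++) {
            for (i = 0; i < PULSE_LEN; i++) pulse[i] = (double)(sent + i);
            MPI_Send(pulse, PULSE_LEN, MPI_DOUBLE, w, TAG_WORK, MPI_COMM_WORLD);
        }
        /* collect results; hand out remaining pulses as workers free up */
        while (done < sent) {
            MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            total += result;
            done++;
            if (sent < NUM_PULSES) {
                for (i = 0; i < PULSE_LEN; i++) pulse[i] = (double)(sent + i);
                MPI_Send(pulse, PULSE_LEN, MPI_DOUBLE, st.MPI_SOURCE,
                         TAG_WORK, MPI_COMM_WORLD);
                sent++;
            }
        }
        for (w = 1; w < size; w++)   /* tell every worker to shut down */
            MPI_Send(pulse, 0, MPI_DOUBLE, w, TAG_DONE, MPI_COMM_WORLD);
        printf("processed %d pulses on %d workers, checksum %g\n",
               done, size - 1, total);
    } else {                         /* worker: interchangeable processing node */
        double result;
        while (1) {
            MPI_Recv(pulse, PULSE_LEN, MPI_DOUBLE, 0, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_DONE)
                break;
            result = process_pulse(pulse, PULSE_LEN);
            MPI_Send(&result, 1, MPI_DOUBLE, 0, TAG_WORK, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}

The self-scheduling master/worker structure is what makes adding or dropping a processor a launch-time decision rather than a code change, which is the scalability property the breadboard was built to demonstrate.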
Power was a bit of a concern (mostly because it hadn't ever been built), but the real hurdle for the review boards was just the unfamiliarity with the concept of accepting inefficiency in exchange for use of generic parts. Most spacecraft systems are very purpose designed and highly customized. >This has been interesting enough that if there are any, I may indeed add >a chapter to the book, if/when I next actually work on it. I got dem >end of semester blues, at the moment...:-) > > rgb > >-- >Robert G. Brown http://www.phy.duke.edu/~rgb/ >Duke University Dept. of Physics, Box 90305 >Durham, N.C. 27708-0305 >Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at plogic.com Thu Apr 17 14:04:03 2003 From: deadline at plogic.com (Douglas Eadline) Date: Thu, 17 Apr 2003 13:04:03 -0500 (CDT) Subject: SMP and Network Connections In-Reply-To: <3E9EBE1F.3E769C13@epa.gov> Message-ID: On Thu, 17 Apr 2003, Joseph Mack wrote: > Douglas Eadline wrote: > > > > Just posted some more SMP tests on www.cluster-rant.com. > > This time, I tested the interconnects and asked the > > question "What if a dual SMP used two Ethernet connections > > instead of one?" Seems to help! Take a look at: > > Thanks for your work and write up. > > I did some performance tests a few years ago on a router, > using multiple copies of netpipe through a single interface > to a set of nodes, to determine the effect of multiple streams > on throughput on the router (this was 100Mbps ethernet). > > I found that as I increased the number of nodes connecting > to the router, that the throughput increased, rising above > 100Mbps (when I totalled the throughput from each netpipe > job). > > Looking at the netpipe code I saw that netpipe waits for a > quiet time on the network before entering the next round of > the test. Thus for 4 connections, if each instance of > netpipe waited for a quiet time to run the test on the next > packet size, I could (in principle) get the result of 4 connections > of 100Mpbs for a total of 400Mpbs. > > I contacted the netpipe author, who sent me a preliminary version > of a multi-netpipe, where multiple connections are synchronised > and stepped through the range of packet sizes together. He > said that it wasn't ready to use and I didn't have time to > work on it myself. I never solved the multiple connection > problem and wound up doing tests with a single connection. > > Do you know if this problem is affecting your measurements? I was not aware of this, however, the netpipe data seems to indicate that when two netpipes are using the same interface, there is some degradation when compared to a single run. I'll have a look at the code as well. Thanks for the information. Doug > > (The report is at > http://www.linuxvirtualserver.org/Joseph.Mack/performance/single_realserver_performance.html) > > Joe > > -- ------------------------------------------------------------------- Paralogic, Inc. 
| PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From shewa at inel.gov Thu Apr 17 13:20:03 2003 From: shewa at inel.gov (Andrew Shewmaker) Date: Thu, 17 Apr 2003 11:20:03 -0600 Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: <20030417092417S.hanzl@unknown-domain> References: <20030417092417S.hanzl@unknown-domain> Message-ID: <200304171120.03430.shewa@inel.gov> On Thursday 17 April 2003 01:24 am, hanzl at noel.feld.cvut.cz wrote: > For these types of jobs, we are using SGE on scyld-like cluster (we > are using HDDCS which is a variant of Clustermatic which is similar to > Scyld but this should not matter here). > > SGE is quite nice and opensource batch spooling. Using it with > scyld-like cluster for this type of jobs is a bit tricky but quite > easy. We just create one 'queue' for every slave node and use node > number as a queue name. Then we use 'starter method' script like this: > > file /usr/local/bin/sge-bproc-starter-method: > > #/bin/sh > bpsh $QUEUE $* > > All these queues are defined as running on master node but starter > method in fact moves perl scripts on individual slave nodes. > > To run scripts on N nodes with N DIFFERENT arguments, you may use > 'array jobs' or submit many individual jobs. > > (And there is much more you can do with SGE, I highly recommend it.) So does SGE see the load on the slave node queues? Does it show the number of processors and total memory in qhost? I didn't realize it was so easy to use SGE on top of a bproc based system. I suppose it would take quite a bit more work to get it integrated to the point where you only needed one queue for the entire cluster. Andrew -- Andrew Shewmaker Associate Engineer, INEEL Phone: 1-208-526-1415 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 17 16:54:17 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 17 Apr 2003 16:54:17 -0400 (EDT) Subject: beowulf in space In-Reply-To: <200304180117.35195.mof@labf.org> Message-ID: On Fri, 18 Apr 2003, Mof wrote: > Ok excuse my ignorance, but what is involved in rad harding hardware ? > Is the cost really necessary, in that couldn't you put the unprotected > hardware into some sort of shielded container ? > > Or am I just being silly ? :-) Not really silly, but IIRC shielding is both difficult and expensive and sometimes actively counterproductive in space. I'm sure the NASA guys will have even more detail, but: a) Difficult, because there is a very wide range of KINDS and ENERGIES of radiation out there. Some are easy to stop, but some (like massive, very high energy nucleii or very high energy gamma rays) are not. b) Expensive, because to stop radiation you basically have to interpolate matter in sufficient density to absorb and disperse the energy via single and multiple scattering events. Some radiation has a relatively high cross section with matter and low energy and is easily stopped, but the most destructive sort requires quite a lot of shielding, which is dense and thick. 
This means heavy and occupying lots of volume, which means expensive in terms of lifting it out of the gravity well. I don't know what it costs to lift a kilogram of mass to geosynchronous orbit, but I'll bet it is a LOT. c) Counterproductive, because SOME of the kinds of radiation present are by themselves not horribly dangerous -- they have a lot of energy but are relatively unlikely to hit anything. So when they hit they kill a cell or a chromosome or a bit or something, but in a fairly localized way. However, when they hit the right densities of matter in shielding they can produce a literal shower or shotgun blast of secondary particles that ARE the right particles at the right energies to do a lot of damage (to humans or hardware). So either you need enough shielding to stop these particles and all their secondary byproducts, or you can be better off just letting those particles (probably) pass right on through, hopefully without hitting anything. Basically, we are pretty fortunate to live way down here at the bottom of several miles of atmosphere, where most of the dangerous crap hits and showers its secondary stuff miles overhead and is absorbed before it becomes a hazard. Our computer hardware is similarly fortunate. Even a mile up the radiation levels are significantly higher -- even growing up in subtropical India I was NEVER sunburned as badly as I was in a mere two hours of late afternoon exposure in Taxco, Mexico, just one mile up. A single six hour cross-country plane ride exposes you to 1/8 of the rems you'd receive, on average, in an entire year spent at ground level. God only knows what astronauts get. Maybe they bank gametes before leaving, dunno... So definitely not silly, but things are more complex than they might seem. I'm sure that if a cost-effective solution were as easy as just "more shielding" the rocket scientists (literally:-) at NASA would have already thunk it. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Thu Apr 17 16:03:21 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Thu, 17 Apr 2003 14:03:21 -0600 Subject: beowulf in space In-Reply-To: References: <20030417032151.GA13826@plk.af.mil> Message-ID: <20030417200321.GB15077@plk.af.mil> Just so you don't think that the space program is run by a bunch of out-of-the-loop dopes, we have been doing clustering, althought these are by no means beowulfs. I sent a message to one of the brightest architectural designers, who is in our branch, and I paste his reply. Please copy to him any posts/responses to this. >From Jim Lyke Pretty cool. Sure, there has been publications on SAFE, and I have submitted a longer paper for publication. Sensor and Fusion Engine (SAFE) in its best case is 96 processors, broken into 12 bussed groups (the bus a customized Futurebus+) with a Myrinet bridge. The system is small enough in scale to be serviced by a single, 16-port duplex non-blocking Myrinet crossbar. So, 12 of the hubs are occupied with the 96 processors, which are of a special design (microprogrammable with IEEE 754(?) double-precision floating point support). 
Two of the remaining four hubs are equipped with FPGA-based front-end processors, to massage real-time sensor data into the packeted formats amenable to the 96-nodes. One of the remaining two hubs is occupied by a boot processor, which distributes program loads over the network and kicks off processor groups. The final port is a user/telemetry port, which could be a simple Linux box equipped with a Myrinet card. Everything above (except Linux box) is designed to be crammed into a conduction-cooled 5x5x8 inch parallelopiped container. The Myrinet protocol was gutted and replaced with a lower latency protocol with a one sigma latency of about 2uS on messages based on the statistics of our problem. The max sustainable peak is about 12 GFLOPS, which is because the chips were built on 0.5um. The theoretic density of the system (even so) is slightly over one TFLOP/cu.ft. We are moving forward to modernize the system, but are funding limited. The ultimate barrier will be thermal. Even though we use carbon-matrix composite materials that have 5X better heat conduction than aluminum, the ultimate power densities as we encroach on >10TFLOPS/cu.ft. will overtake the ability of that material to draw heat away. There is discussion of trying to create a new type of thermal management material based on either carbon or boron nanotubes, which are claimed to beat natural diamond by about 2X. I wouldn't mind being copied on the posts/replies either. END OF LYKE Art Edwards On Thu, Apr 17, 2003 at 08:14:38AM -0400, Robert G. Brown wrote: > On Wed, 16 Apr 2003, Art Edwards wrote: > > > I think I'm jumping into the middle of a conversation here, but our > > branch is the shop through which most of the DoD processor programs are > > managed. For real space applications there are radiation issues like > > total dose hardness and single even upset that require special design > > and, still, special processing. That is, you can't make these parts at > > any foundry (yet). There are currently two hardened foundries through > > which the most tolerant parts are fabricated. Where the commercial > > market is ~100's of Billions/year, the space electronics industry is > > ~200million/year. So parts are expensive, as Jim Lux says. But more > > importantly, the current state-of-the-art for space processors is > > several generations back. Now, with a 200 million market/year, who is > > going to spend the money to build a new foundry? (anyone?) It's a huge > > problem, and beowulfs in space will not give the economies of scale > > necessary to move us forward. > > > > I don't know if this has been discussed here, but have you thought about > > launch costs? They're huge. Weight, power, and mission lifetime are the > > crucial factors for space. These are the reasons that so much R&D goes > > into space electronics. I apologize if I have gone over old ground. > > Actually, this is the sort of thing that makes (as Eray pointed out) the > idea of a cluster (leaving aside the COTS issue, the single-headed > issue, and whether or not it could be a true "beowulf" cluster) > attractive in space applications. What you (and Gerry) are saying is > that the space and DoD market is stuck using specially engineered, > radiation hard, not-so-bleeding-VLSI processors from what amounts to > several VLSI generations ago. The parts are expensive, but the cost of > building a newer better foundry for such a small and inelastic market > are prohibitive, so they are the only game in town. 
> > If you have an orbital project or application that needs considerably > more speed than the undoubtedly pedestrian clock of these devices can > provide, you have a HUGE cost barrier to developing a faster processor, > and that barrier is largely out of your (DoD) or Nasa's control -- you > can only ask/hope for an industrial partner to make the investment > required to up the chip generation in hardened technology with the > promise of at least some guaranteed sales. You also have a known per > kilogram per liter cost for lifting stuff into space, and this is at > least modestly under your own control. So (presuming an efficiently > parallelizable task) instead of effectively financing a couple of > billion dollars in developing the nextgen hard chips to get a speedup of > ten or so, you can engineer twelve systems based on the current, > relatively cheap chips into a robust and fault tolerant cluster and pay > the known immediate costs of lifting those twelve systems into orbit. > > Again presuming that it is for some reason not feasible to simply > establish a link to earth and do the processing here -- an application > for which the latency would be bad, an application that requires > immediate response in a changing environment when downlink > communications may not be robust. > > A question that you or Gerry or Jim may or may not be able to answer > (with which Chip started this discussion): Are there any specific > non-classified instances that you know of where an actual "cluster" > (defined loosely as multiple identical CPUs interconnected with some > sort of communications bus or network and running a specific parallel > numerical task, not e.g. task-specific processors in several parts of a > military jet) has been engineered, built, and shot into space? > > This has been interesting enough that if there are any, I may indeed add > a chapter to the book, if/when I next actually work on it. I got dem > end of semester blues, at the moment...:-) > > rgb > > -- > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > > -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu Apr 17 16:33:00 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 17 Apr 2003 13:33:00 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304170756.h3H7umB02357@dali.crs4.it> References: <200304170756.h3H7umB02357@dali.crs4.it> Message-ID: <20030417203300.GG1345@greglaptop.internal.keyresearch.com> On Thu, Apr 17, 2003 at 09:56:48AM +0200, Alan Scheinine wrote: > I do not think there was a promise that getting efficiency would > be easier with EPIC. My understanding of the situation is that > the logic of dynamic allocation of resources, that is, the various > tricks done in silicon, could not scale to a large number of > processing units on a chip. That's not what I said. I said that getting more instructions per cycle was what was supposed to be easier, and indeed, that means more compiler complexity. 
> Fifteen years ago I heard a talk in which > it was claimed that compiler advances developed at universities > arrive in commercial compilers after a delay of ten years. That's an over-generalization. For example, a lot of compiler research is done on the framework provided by Open64, which is SGI's compiler. You can get research frameworks in which you can play with a particular optimization idea, but if you want a research framework which is already a really great compiler, Open64 is the only choice. > Greg Lindahl wrote that "The ones [compiler people] I know hate EPIC > with a passion". Why? They think it's a pig with lipstick. > Do they say that the concept is wrong or > is that problem that they cannot meet their deadlines because of > the quantity of analysis that has been moved from the silicon to > the compiler writer? The lack of uptake in the marketplace meant that they had a couple of extra years to do the compiler work, so deadlines weren't a problem. Now in comparison, x86 chips are not very good compilation targets either: trying to figure out how x86 instructions actually work after they are translated into some unknown micro-ops isn't exactly easy. But I suspect that a poll of compiler people would vote for x86 over ia-64. > By the way, it may be a good idea to develop more packages like > Atlas and FFTW which optimize themselves based on the actual computer, > since memory latency and other factors are variable. But then, > optimizing through experimentation takes a long time. It is a good idea, but it's worth pointing out that Atlas and FFTW work best on machines which are either out of order, where a bad compiler isn't a problem, or on in-order machines with great compilers. I'd love to hear about how well gcc does on ia-64 with Atlas or FFTW. greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 17 16:56:34 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 17 Apr 2003 16:56:34 -0400 (EDT) Subject: beowulf in space In-Reply-To: <5.1.0.14.2.20030417092926.030d5c80@mailhost4.jpl.nasa.gov> Message-ID: On Thu, 17 Apr 2003, Jim Lux wrote: Thanks! rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 17 17:14:15 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 17 Apr 2003 17:14:15 -0400 (EDT) Subject: beowulf in space In-Reply-To: <20030417200321.GB15077@plk.af.mil> Message-ID: On Thu, 17 Apr 2003, Art Edwards wrote: > Just so you don't think that the space program is run by a bunch of > out-of-the-loop dopes, we have been doing clustering, althought these > are by no means beowulfs. I sent a message to one of the brightest > architectural designers, who is in our branch, and I paste his reply. > Please copy to him any posts/responses to this. 
Who could possibly think that, given that the first beowulf was built and named by a NASA program and that CESDIS for years housed both the list and beowulf.org (with only one short hiatus when high level program overseers became inexplicably stricken with some sort of mental disease:-)? However, this TOO looks very cool, sort of at the edge of the possible with clustering technology altogether. It is obvious that clusters are indeed making their way into space, or will be soon. I won't even ask what a "Sensor and Fusion Engine" might be -- it would be too much to hope that it would be a thermonuclear fusion engine that cannot AFAIK exist with current technology, existing anyway and preparing to really change the way we do space..:-) rgb > > >From Jim Lyke > > Pretty cool. Sure, there has been publications on SAFE, and I have > submitted a longer paper for publication. > > Sensor and Fusion Engine (SAFE) in its best case is 96 processors, > broken > into 12 bussed groups (the bus a customized Futurebus+) with a Myrinet > bridge. The system is small enough in scale to be serviced by a single, > 16-port duplex non-blocking Myrinet crossbar. So, 12 of the hubs are > occupied with the 96 processors, which are of a special design > (microprogrammable with IEEE 754(?) double-precision floating point > support). Two of the remaining four hubs are equipped with FPGA-based > front-end processors, to massage real-time sensor data into the packeted > formats amenable to the 96-nodes. One of the remaining two hubs is > occupied > by a boot processor, which distributes program loads over the network > and > kicks off processor groups. The final port is a user/telemetry port, > which > could be a simple Linux box equipped with a Myrinet card. > > Everything above (except Linux box) is designed to be crammed into a > conduction-cooled 5x5x8 inch parallelopiped container. The Myrinet > protocol > was gutted and replaced with a lower latency protocol with a one sigma > latency of about 2uS on messages based on the statistics of our problem. > The max sustainable peak is about 12 GFLOPS, which is because the chips > were > built on 0.5um. The theoretic density of the system (even so) is > slightly > over one TFLOP/cu.ft. We are moving forward to modernize the system, > but > are funding limited. The ultimate barrier will be thermal. Even though > we > use carbon-matrix composite materials that have 5X better heat > conduction > than aluminum, the ultimate power densities as we encroach on > >10TFLOPS/cu.ft. will overtake the ability of that material to draw heat > away. There is discussion of trying to create a new type of thermal > management material based on either carbon or boron nanotubes, which are > claimed to beat natural diamond by about 2X. > > I wouldn't mind being copied on the posts/replies either. > > END OF LYKE > > Art Edwards > > On Thu, Apr 17, 2003 at 08:14:38AM -0400, Robert G. Brown wrote: > > On Wed, 16 Apr 2003, Art Edwards wrote: > > > > > I think I'm jumping into the middle of a conversation here, but our > > > branch is the shop through which most of the DoD processor programs are > > > managed. For real space applications there are radiation issues like > > > total dose hardness and single even upset that require special design > > > and, still, special processing. That is, you can't make these parts at > > > any foundry (yet). There are currently two hardened foundries through > > > which the most tolerant parts are fabricated. 
Where the commercial > > > market is ~100's of Billions/year, the space electronics industry is > > > ~200million/year. So parts are expensive, as Jim Lux says. But more > > > importantly, the current state-of-the-art for space processors is > > > several generations back. Now, with a 200 million market/year, who is > > > going to spend the money to build a new foundry? (anyone?) It's a huge > > > problem, and beowulfs in space will not give the economies of scale > > > necessary to move us forward. > > > > > > I don't know if this has been discussed here, but have you thought about > > > launch costs? They're huge. Weight, power, and mission lifetime are the > > > crucial factors for space. These are the reasons that so much R&D goes > > > into space electronics. I apologize if I have gone over old ground. > > > > Actually, this is the sort of thing that makes (as Eray pointed out) the > > idea of a cluster (leaving aside the COTS issue, the single-headed > > issue, and whether or not it could be a true "beowulf" cluster) > > attractive in space applications. What you (and Gerry) are saying is > > that the space and DoD market is stuck using specially engineered, > > radiation hard, not-so-bleeding-VLSI processors from what amounts to > > several VLSI generations ago. The parts are expensive, but the cost of > > building a newer better foundry for such a small and inelastic market > > are prohibitive, so they are the only game in town. > > > > If you have an orbital project or application that needs considerably > > more speed than the undoubtedly pedestrian clock of these devices can > > provide, you have a HUGE cost barrier to developing a faster processor, > > and that barrier is largely out of your (DoD) or Nasa's control -- you > > can only ask/hope for an industrial partner to make the investment > > required to up the chip generation in hardened technology with the > > promise of at least some guaranteed sales. You also have a known per > > kilogram per liter cost for lifting stuff into space, and this is at > > least modestly under your own control. So (presuming an efficiently > > parallelizable task) instead of effectively financing a couple of > > billion dollars in developing the nextgen hard chips to get a speedup of > > ten or so, you can engineer twelve systems based on the current, > > relatively cheap chips into a robust and fault tolerant cluster and pay > > the known immediate costs of lifting those twelve systems into orbit. > > > > Again presuming that it is for some reason not feasible to simply > > establish a link to earth and do the processing here -- an application > > for which the latency would be bad, an application that requires > > immediate response in a changing environment when downlink > > communications may not be robust. > > > > A question that you or Gerry or Jim may or may not be able to answer > > (with which Chip started this discussion): Are there any specific > > non-classified instances that you know of where an actual "cluster" > > (defined loosely as multiple identical CPUs interconnected with some > > sort of communications bus or network and running a specific parallel > > numerical task, not e.g. task-specific processors in several parts of a > > military jet) has been engineered, built, and shot into space? > > > > This has been interesting enough that if there are any, I may indeed add > > a chapter to the book, if/when I next actually work on it. 
I got dem > > end of semester blues, at the moment...:-) > > > > rgb > > > > -- > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > > Duke University Dept. of Physics, Box 90305 > > Durham, N.C. 27708-0305 > > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > > > > > > > > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 17 17:35:13 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 17 Apr 2003 17:35:13 -0400 (EDT) Subject: beowulf in space In-Reply-To: <20030417205423.GA15534@plk.af.mil> Message-ID: On Thu, 17 Apr 2003, Art Edwards wrote: > In this case fusion just refers to data-fusion from sensors. Data > integration and processing might capture what is meant by fusion. Signal > processing is a biggee for the Air Force. Awww, rats. Sexy name though -- "Fusion Engine"... rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wade.hampton at nsc1.net Thu Apr 17 12:57:34 2003 From: wade.hampton at nsc1.net (Wade Hampton) Date: Thu, 17 Apr 2003 12:57:34 -0400 Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: References: Message-ID: <3E9EDCFE.5060100@nsc1.net> Kristen J. McFadden wrote: >Hi, > >We have a Scyld Beowulf cluster currently running on 28cz-4 (we are >getting -5 soon). We have been running into a lot of problems with >users that are trying to run scripts on the child nodes. To start >with, what is the best way to run serial (non-MPI) programs? > >Here is the current issue I'm trying to tackle. > >Say I have a perl script. (I NFS mount /usr /lib etc. on the child >nodes) > >I want to run this perl script on N nodes with N DIFFERENT arguments. > This is sort of what we are doing. Our solution currently is: 1. custom scheduler using bproc_rfork to fork our processing jobs (up to 2 per SMP node). 2. local disk on each node: hda1 beoboot hda2 swap hda3 /tmp hda4 /usr/local 3. cache of common tools to /usr/local including: some tools from /bin, /usr/bin, /usr/local/bin /usr/lib/perl - we are not currently rsync'ing this, but we could in the future.... 4. special scripts to run our collection of software as an independent run on each node Hope this helps, -- Wade Hampton _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Thu Apr 17 16:54:23 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Thu, 17 Apr 2003 14:54:23 -0600 Subject: beowulf in space In-Reply-To: References: <20030417200321.GB15077@plk.af.mil> Message-ID: <20030417205423.GA15534@plk.af.mil> On Thu, Apr 17, 2003 at 05:14:15PM -0400, Robert G. 
Brown wrote: > On Thu, 17 Apr 2003, Art Edwards wrote: > > > Just so you don't think that the space program is run by a bunch of > > out-of-the-loop dopes, we have been doing clustering, althought these > > are by no means beowulfs. I sent a message to one of the brightest > > architectural designers, who is in our branch, and I paste his reply. > > Please copy to him any posts/responses to this. > > Who could possibly think that, given that the first beowulf was built > and named by a NASA program and that CESDIS for years housed both the > list and beowulf.org (with only one short hiatus when high level program > overseers became inexplicably stricken with some sort of mental > disease:-)? > > However, this TOO looks very cool, sort of at the edge of the possible > with clustering technology altogether. > > It is obvious that clusters are indeed making their way into space, or > will be soon. > > I won't even ask what a "Sensor and Fusion Engine" might be -- it would > be too much to hope that it would be a thermonuclear fusion engine that > cannot AFAIK exist with current technology, existing anyway and > preparing to really change the way we do space..:-) In this case fusion just refers to data-fusion from sensors. Data integration and processing might capture what is meant by fusion. Signal processing is a biggee for the Air Force. Art Edwards > > rgb > > > > > >From Jim Lyke > > > > Pretty cool. Sure, there has been publications on SAFE, and I have > > submitted a longer paper for publication. > > > > Sensor and Fusion Engine (SAFE) in its best case is 96 processors, > > broken > > into 12 bussed groups (the bus a customized Futurebus+) with a Myrinet > > bridge. The system is small enough in scale to be serviced by a single, > > 16-port duplex non-blocking Myrinet crossbar. So, 12 of the hubs are > > occupied with the 96 processors, which are of a special design > > (microprogrammable with IEEE 754(?) double-precision floating point > > support). Two of the remaining four hubs are equipped with FPGA-based > > front-end processors, to massage real-time sensor data into the packeted > > formats amenable to the 96-nodes. One of the remaining two hubs is > > occupied > > by a boot processor, which distributes program loads over the network > > and > > kicks off processor groups. The final port is a user/telemetry port, > > which > > could be a simple Linux box equipped with a Myrinet card. > > > > Everything above (except Linux box) is designed to be crammed into a > > conduction-cooled 5x5x8 inch parallelopiped container. The Myrinet > > protocol > > was gutted and replaced with a lower latency protocol with a one sigma > > latency of about 2uS on messages based on the statistics of our problem. > > The max sustainable peak is about 12 GFLOPS, which is because the chips > > were > > built on 0.5um. The theoretic density of the system (even so) is > > slightly > > over one TFLOP/cu.ft. We are moving forward to modernize the system, > > but > > are funding limited. The ultimate barrier will be thermal. Even though > > we > > use carbon-matrix composite materials that have 5X better heat > > conduction > > than aluminum, the ultimate power densities as we encroach on > > >10TFLOPS/cu.ft. will overtake the ability of that material to draw heat > > away. There is discussion of trying to create a new type of thermal > > management material based on either carbon or boron nanotubes, which are > > claimed to beat natural diamond by about 2X. 
> > > > I wouldn't mind being copied on the posts/replies either. > > > > END OF LYKE > > > > Art Edwards > > > > On Thu, Apr 17, 2003 at 08:14:38AM -0400, Robert G. Brown wrote: > > > On Wed, 16 Apr 2003, Art Edwards wrote: > > > > > > > I think I'm jumping into the middle of a conversation here, but our > > > > branch is the shop through which most of the DoD processor programs are > > > > managed. For real space applications there are radiation issues like > > > > total dose hardness and single even upset that require special design > > > > and, still, special processing. That is, you can't make these parts at > > > > any foundry (yet). There are currently two hardened foundries through > > > > which the most tolerant parts are fabricated. Where the commercial > > > > market is ~100's of Billions/year, the space electronics industry is > > > > ~200million/year. So parts are expensive, as Jim Lux says. But more > > > > importantly, the current state-of-the-art for space processors is > > > > several generations back. Now, with a 200 million market/year, who is > > > > going to spend the money to build a new foundry? (anyone?) It's a huge > > > > problem, and beowulfs in space will not give the economies of scale > > > > necessary to move us forward. > > > > > > > > I don't know if this has been discussed here, but have you thought about > > > > launch costs? They're huge. Weight, power, and mission lifetime are the > > > > crucial factors for space. These are the reasons that so much R&D goes > > > > into space electronics. I apologize if I have gone over old ground. > > > > > > Actually, this is the sort of thing that makes (as Eray pointed out) the > > > idea of a cluster (leaving aside the COTS issue, the single-headed > > > issue, and whether or not it could be a true "beowulf" cluster) > > > attractive in space applications. What you (and Gerry) are saying is > > > that the space and DoD market is stuck using specially engineered, > > > radiation hard, not-so-bleeding-VLSI processors from what amounts to > > > several VLSI generations ago. The parts are expensive, but the cost of > > > building a newer better foundry for such a small and inelastic market > > > are prohibitive, so they are the only game in town. > > > > > > If you have an orbital project or application that needs considerably > > > more speed than the undoubtedly pedestrian clock of these devices can > > > provide, you have a HUGE cost barrier to developing a faster processor, > > > and that barrier is largely out of your (DoD) or Nasa's control -- you > > > can only ask/hope for an industrial partner to make the investment > > > required to up the chip generation in hardened technology with the > > > promise of at least some guaranteed sales. You also have a known per > > > kilogram per liter cost for lifting stuff into space, and this is at > > > least modestly under your own control. So (presuming an efficiently > > > parallelizable task) instead of effectively financing a couple of > > > billion dollars in developing the nextgen hard chips to get a speedup of > > > ten or so, you can engineer twelve systems based on the current, > > > relatively cheap chips into a robust and fault tolerant cluster and pay > > > the known immediate costs of lifting those twelve systems into orbit. 
> > > > > > Again presuming that it is for some reason not feasible to simply > > > establish a link to earth and do the processing here -- an application > > > for which the latency would be bad, an application that requires > > > immediate response in a changing environment when downlink > > > communications may not be robust. > > > > > > A question that you or Gerry or Jim may or may not be able to answer > > > (with which Chip started this discussion): Are there any specific > > > non-classified instances that you know of where an actual "cluster" > > > (defined loosely as multiple identical CPUs interconnected with some > > > sort of communications bus or network and running a specific parallel > > > numerical task, not e.g. task-specific processors in several parts of a > > > military jet) has been engineered, built, and shot into space? > > > > > > This has been interesting enough that if there are any, I may indeed add > > > a chapter to the book, if/when I next actually work on it. I got dem > > > end of semester blues, at the moment...:-) > > > > > > rgb > > > > > > -- > > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > > > Duke University Dept. of Physics, Box 90305 > > > Durham, N.C. 27708-0305 > > > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > > > > > > > > > > > > > > > > -- > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Thu Apr 17 21:33:19 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Thu, 17 Apr 2003 19:33:19 -0600 Subject: beowulf in space In-Reply-To: References: <200304180117.35195.mof@labf.org> Message-ID: <20030418013319.GA15875@plk.af.mil> There are two basic strategies for hardening: Design and process. Processing involves special anneals, implants and oxide recipes that are outside standard processing and so cannot be fabbed in standard foundaries. Designs are rather old and infolve specially shaped transistors. This is an active and promising area of pursuit. If you are really interested, you can look at past December issues of IEEE Transactions on Nuclear Science. You can also attend the Nuclear and Space Radiation Effects Conference this July in Monterrey Ca. You can, in fact, shield against alot of threats. The question is whether you want to launch shielding or active electronics. Art Edwards On Thu, Apr 17, 2003 at 04:54:17PM -0400, Robert G. Brown wrote: > On Fri, 18 Apr 2003, Mof wrote: > > > Ok excuse my ignorance, but what is involved in rad harding hardware ? > > Is the cost really necessary, in that couldn't you put the unprotected > > hardware into some sort of shielded container ? > > > > Or am I just being silly ? :-) > > Not really silly, but IIRC shielding is both difficult and expensive and > sometimes actively counterproductive in space. I'm sure the NASA guys > will have even more detail, but: > > a) Difficult, because there is a very wide range of KINDS and ENERGIES > of radiation out there. 
Some are easy to stop, but some (like massive, > very high energy nucleii or very high energy gamma rays) are not. > > b) Expensive, because to stop radiation you basically have to > interpolate matter in sufficient density to absorb and disperse the > energy via single and multiple scattering events. Some radiation has a > relatively high cross section with matter and low energy and is easily > stopped, but the most destructive sort requires quite a lot of > shielding, which is dense and thick. This means heavy and occupying > lots of volume, which means expensive in terms of lifting it out of the > gravity well. I don't know what it costs to lift a kilogram of mass to > geosynchronous orbit, but I'll bet it is a LOT. > > c) Counterproductive, because SOME of the kinds of radiation present > are by themselves not horribly dangerous -- they have a lot of energy > but are relatively unlikely to hit anything. So when they hit they kill > a cell or a chromosome or a bit or something, but in a fairly localized > way. However, when they hit the right densities of matter in shielding > they can produce a literal shower or shotgun blast of secondary > particles that ARE the right particles at the right energies to do a lot > of damage (to humans or hardware). So either you need enough shielding > to stop these particles and all their secondary byproducts, or you can > be better off just letting those particles (probably) pass right on > through, hopefully without hitting anything. > > Basically, we are pretty fortunate to live way down here at the bottom > of several miles of atmosphere, where most of the dangerous crap hits > and showers its secondary stuff miles overhead and is absorbed before it > becomes a hazard. Our computer hardware is similarly fortunate. Even a > mile up the radiation levels are significantly higher -- even growing up > in subtropical India I was NEVER sunburned as badly as I was in a mere > two hours of late afternoon exposure in Taxco, Mexico, just one mile up. > A single six hour cross-country plane ride exposes you to 1/8 of the > rems you'd receive, on average, in an entire year spent at ground > level. God only knows what astronauts get. Maybe they bank gametes > before leaving, dunno... > > So definitely not silly, but things are more complex than they might > seem. I'm sure that if a cost-effective solution were as easy as just > "more shielding" the rocket scientists (literally:-) at NASA would have > already thunk it. > > rgb > > -- > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 
27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jdc at uwo.ca Thu Apr 17 22:57:40 2003 From: jdc at uwo.ca (Dan Christensen) Date: Thu, 17 Apr 2003 22:57:40 -0400 Subject: Can't run NAS Benchmark In-Reply-To: (Douglas Eadline's message of "Mon, 14 Apr 2003 12:13:41 -0500 (CDT)") References: Message-ID: <87el40cy1n.fsf@uwo.ca> Douglas Eadline writes: > You may wish to look at the BPS (Beowulf Performance Suite): > > http://www.cluster-rant.com/article.pl?sid=03/03/17/1838236 > > and: > > http://www.hpc-design.com/reports/bps1/index.html I've been trying the download links on the page http://www.plogic.com/bps for a couple of days, without success. Any idea what's up? E.g. the link to ftp://ftp.plogic.com/pub/software/bps/RPMS/bps-1.2-11.i386.rpm doesn't work. Anonymous login and "cd" work, but "dir" and "get" just freeze up. I tried it with several ftp clients and browsers, and from two hosts. Dan _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Fri Apr 18 06:17:41 2003 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Fri, 18 Apr 2003 12:17:41 +0200 Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: <3E9EDCFE.5060100@nsc1.net> References: <3E9EDCFE.5060100@nsc1.net> Message-ID: <20030418121741S.hanzl@unknown-domain> > 3. cache of common tools to /usr/local including: > some tools from /bin, /usr/bin, /usr/local/bin > /usr/lib/perl > - we are not currently rsync'ing this, but we > could in the future.... This is where cachefs could do really great job. Unfortunately there is no cachefs for linux as far as I know. Regards Vaclav Hanzl _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Fri Apr 18 07:35:04 2003 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Fri, 18 Apr 2003 13:35:04 +0200 Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: <200304171120.03430.shewa@inel.gov> References: <20030417092417S.hanzl@unknown-domain> <200304171120.03430.shewa@inel.gov> Message-ID: <20030418133504V.hanzl@unknown-domain> > > SGE is quite nice and opensource batch spooling. Using it with > > scyld-like cluster for this type of jobs is a bit tricky but quite > > easy. We just create one 'queue' for every slave node and use node > > number as a queue name. Then we use 'starter method' script like this: > > > > #/bin/sh > > bpsh $QUEUE $* > > I didn't realize it was so easy to use SGE on top of a bproc based system. It was super-easy to meet our particular requirements, there might be little more work if somebody needs more. 
> I suppose it would take quite a bit more work to get it integrated to > the point where you only needed one queue for the entire cluster. The term "queue" in SGE is rather misleading. In fact there is no queue - there is just a global set of jobs to be done (this set behaves as a queue if there is nothing else to order jobs) and a set of places where jobs execute (these places are called "queues" in SGE). In our solution, we have "one queue for the entire cluster" in the sense of a set of jobs to be done. We have however N queues in the latter sense, but all these "queues" have identical definitions and one can just copy them using a script or the qmon GUI. There is however just one sge_execd daemon running (and one 'execution host' to install - the head host). ps -auxf on the head node gives something like this:

sge_execd
 \_ sge_shepherd-6889 -bg
 |   \_ /bin/sh sge-bproc-starter-method job_scripts/6889
 |       \_ bpsh 3 job_scripts/6889
 |           \_ [6889]
 |               \_ [te]
 |                   \_ [sh]
 |                       \_ [HERest]
 \_ sge_shepherd-6888 -bg
 |   \_ /bin/sh sge-bproc-starter-method job_scripts/6888
 |       \_ bpsh 8 job_scripts/6888
 |           \_ [6888]
 |               \_ [Ser]
 |                   \_ [sh]
 |                       \_ [HVite]
...

> So does SGE see the load on the slave node queues? We blindly schedule a fixed number of jobs (1 or 2) per node. It is however possible to write a so-called "load sensor script" and see and use real loads. Creating load sensors is well documented and these scripts can use any commands like "bpsh -ap w" or "supermon". > Does it show the number of processors and total memory in qhost? No. Maybe one could trick it somehow, we did not care. We use qstat to get rough per-box information on what is going on. Load sensors could make this more exact and could probably also provide memory information. But I am happy with just "bpsh -ap free". In general, SGE is a bit confused because we used just one sge_execd instead of N. Information collected by sge_execd is incorrect as it does not run on the real execution box. The rest of the SGE setup could compensate for this if somebody cares. (Or better yet, SGE can be changed to work even better with bproc. I think bproc systems are important enough and the SGE team is nice and responsive enough for this to happen if we can demonstrate interest and propose sensible improvements.) Regards Vaclav _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Apr 18 09:04:21 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 18 Apr 2003 08:04:21 -0500 Subject: beowulf in space In-Reply-To: <200304180117.35195.mof@labf.org> References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> <3E9DFC75.50504@tamu.edu> <200304180117.35195.mof@labf.org> Message-ID: <3E9FF7D5.8030809@tamu.edu> Generally, hardening takes a lot more than a sealed box, although packaging of the processor is part of it. It involves considerations of trace sizing on the die, deposition thickness, and a number of other things. Generally, it takes a lot of work to test and certify a lot of processors as to their rad-hardness. There's really a lot of effort that goes into it. gerry Mof wrote: > Ok excuse my ignorance, but what is involved in rad harding hardware ? > Is the cost really necessary, in that couldn't you put the unprotected > hardware into some sort of shielded container ? > Or am I just being silly ? :-) > > Mof.
> _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Thu Apr 17 19:53:41 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Fri, 18 Apr 2003 02:53:41 +0300 Subject: beowulf in space In-Reply-To: <20030417200321.GB15077@plk.af.mil> References: <20030417032151.GA13826@plk.af.mil> <20030417200321.GB15077@plk.af.mil> Message-ID: <200304180253.41114.exa@kablonet.com.tr> On Thursday 17 April 2003 23:03, Art Edwards wrote: > Sensor and Fusion Engine (SAFE) in its best case is 96 processors, > broken > into 12 bussed groups (the bus a customized Futurebus+) with a Myrinet > bridge. The system is small enough in scale to be serviced by a single, > 16-port duplex non-blocking Myrinet crossbar. So, 12 of the hubs are > occupied with the 96 processors, which are of a special design > (microprogrammable with IEEE 754(?) double-precision floating point > support). !!! My! Maybe I can go to space as a parallel programming researcher after all! (^_^) Seriously, this is pretty cool stuff :) We know what to say when they ask "are there supercomputers in space?" Cheers, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ron_chen_123 at yahoo.com Thu Apr 17 23:27:13 2003 From: ron_chen_123 at yahoo.com (Ron Chen) Date: Thu, 17 Apr 2003 20:27:13 -0700 (PDT) Subject: rocks cluster -- SGE preconfigured Message-ID: <20030418032713.69875.qmail@web41311.mail.yahoo.com> SGE support for IA64 and IA32 on rocks cluster. -Ron --- "Matthew C.H. Lee" wrote: > Are you trying to build a beowulf type of cluster or > just want to have a > load managing software for your site to manage the > collection of > heterogeneous workstations? If you are building a > beowulf cluster, you > might also want to check out Rocks > > rocks.npaci.edu > > The latest version already has SGE preconfigured and > ready to run out of the > box. People in my lab without prior cluster or sys > admin experience were > able to build a function cluster using Rocks within > ~ 1 hr. Very cool > stuff. > > -- Matt > > ----- Original Message ----- > From: "Ron Chen" > To: "Benjamin Goldsteen" > > Cc: "Matthew C. H. Lee" ; > ; > > Sent: Thursday, April 17, 2003 9:26 PM > Subject: Re: [PBS-USERS] cost for educational sites > > > > Some users told me that PBSPro is still "free", > but > > now they charge for support. > > > > However, if you want to get PBSPro, you *must* pay > for > > support. PBS developers, please correct me if that > is > > wrong. 
> > > > GridEngine is much, much better than OpenPBS: > > > > 1) it has job arrays > > 2) Better fault tolerance features such as shadow > > master and automatically job rerun. > > 3) better scheduler performance and scheduling > > policies. > > 4) Better platform support, including AIX, > FreeBSD, > > HP-UX, MacOSX, Tru64, Solaris, Linux, Cray, NEC > > SX-5/6, IRIX, and initial support for Win2K. > > > > Even with 70,000 submitted jobs, SGE is able to > handle > > that easily, and some sites even tried with as > many as > > 600,000-task job array, and further, it can handle > > 1,300 hosts. > > > > Notes that those are in production environments, > and > > the numbers user reported are not actual limits. > > > > And you can get commercial support: > > > http://wwws.sun.com/software/gridware/partners/index.html > > > > And you can see the list is very popular: > > > http://gridengine.sunsource.net/servlets/SummarizeList?listName=users > > > > And people are switching from L$F to GridEngine: > > http://www.veus.hr/linux/gemonitor.html > > > > -Ron > > > > > > --- Benjamin Goldsteen wrote: > > > Hi Ron and Matthew, > > > Any update on the new policy? If they now plan > to > > > charge .edu for what should > > > be free and open-source under the original PBS > terms > > > then I plan to support > > > another company or product. If I am going to > pay > > > money, I will pay that money > > > to a company like LSF rather than this company > which > > > takes a government > > > supported project and violates its original > terms. > > > > > > Otherwise, HPC and .edu should put its efforts > into > > > enhancing OpenPBS, SGE, > > > etc. I don't know why a company would drive the > > > .edu/HPC market into supporting > > > competing free products, but I think that is > what's > > > going to happen here. > > > -- > > > Benjamin Z. Goldsteen > > > Physiology & Biophysics > > > Mount Sinai School of Medicine > > > 212-241-1614 / 212-860-3369 (FAX) > > > > > > > > > __________________________________________________ > > Do you Yahoo!? > > The New Yahoo! Search - Faster. Easier. Bingo > > http://search.yahoo.com > > > __________________________________________________________________________ > > To unsubscribe: email majordomo at OpenPBS.org with > body "unsubscribe > pbs-users" > > For message archives: > http://www.OpenPBS.org/UserArea/pbs-users.html > > - - - - - - - - - - > - - - - > > OpenPBS and the pbs-users mailing list are > sponsored by Altair. > > > __________________________________________________________________________ > __________________________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. Bingo http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dtj at uberh4x0r.org Fri Apr 18 11:42:39 2003 From: dtj at uberh4x0r.org (Dean Johnson) Date: 18 Apr 2003 10:42:39 -0500 Subject: beowulf in space In-Reply-To: <3E9FF7D5.8030809@tamu.edu> References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> <3E9DFC75.50504@tamu.edu> <200304180117.35195.mof@labf.org> <3E9FF7D5.8030809@tamu.edu> Message-ID: <1050680559.25053.15.camel@terra> Howdy all, Is it just me, or does anybody else have the problem when reading the "beowulf in space" subject line, you hear that echoey 60's sci-fi movie sound effect? 
Okay, maybe I need some sleep or something. ;-) -Dean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From chip at chip.bellsouth.net Fri Apr 18 13:17:47 2003 From: chip at chip.bellsouth.net (chip) Date: Fri, 18 Apr 2003 13:17:47 -0400 (EDT) Subject: beowulf in space Message-ID: Dean Johnson wrote: Howdy all, Is it just me, or does anybody else have the problem when reading the "beowulf in space" subject line, you hear that echoey 60's sci-fi movie sound effect? Okay, maybe I need some sleep or something. ;-) -Dean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf Hi Dean, No, I don't think it's you... I've got that same sound coming thru my box and I've had plenty of time to sleep on it. But you would actually have had to live in the 1960's to understand what the sound was like. A flight to the moon was considered reckless not to mention impossible and at the time considering the state of technology, it was certainly risky business. Flying a mission thru the Sun's outer atmosphere certainly presses our technology to the extreme of theoretical limits... Not just the processing power of the on board beowulfy type cluster processors necessary for such a mission to succeed but the velocity that would be required is on the order of 10 times what we have yet to achieve, but theoretically possible based on a space tested Ion design. The shielding would have to incorporate a combination of our most advanced composites, not to mention the electrostatic field that would have to be generated to protect the sensitive sensor array would perhaps come close to what Dr. Brown half jokingly describes as a fusion engine... since it would require an inverted plasma bubble... Yep I would say it is a purdy near impossible mission... but not outside our theoretical limits... It's enticing in its appeal... And in the same manner as the missions to the moon in the 60's... It's got that mythic quality about it that tends to capture the imagination... sends a chill of excitement in the confrontation of overwhelming and impossible odds... From our not so distant past it has a sound that is hauntingly familiar... something the thunder of the sound barrier perhaps booming from the past... It rings history heroic... It's a great sound, I still love that sixties music :-) C.Clary Spartan sys. analyst PO 1515 Spartanburg, SC 29304-0243 Fax# (801) 858-2722 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbbernard at eng.uab.edu Fri Apr 18 13:57:44 2003 From: jbbernard at eng.uab.edu (jbbernard at eng.uab.edu) Date: Fri, 18 Apr 2003 12:57:44 -0500 Subject: Google's cluster Message-ID: <836A226C5200104C8A4AFDB31F8529BD2316A7@engem0.eng.uab.edu> In the past there's been some discussion on the list about the hardware required for Google to work its magic. I recently came across this talk by Urs Hoelzle of Google, given last year at the Univ of Washington.
http://www.cs.washington.edu/info/videos/asx/colloq/UHoelzle_2002_11_05.asx Jon _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Fri Apr 18 13:56:31 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Fri, 18 Apr 2003 10:56:31 -0700 Subject: beowulf in space In-Reply-To: References: <200304180117.35195.mof@labf.org> Message-ID: <5.1.0.14.2.20030418104052.0303c7d8@mailhost4.jpl.nasa.gov> I have to say that while this discussion is straying a bit from the usual enjoyable stuff on the list about switches, interconnects, and how fast one processor or another is, I find it gratifying that folks are coming up with creative ideas, and thinking about other applications for Beowulves than just computer rooms full of rackmounted computers. Look back on the growth of cluster computing in the overall supercomputing business. Did anyone think, back in 1995, that there would be the penetration there is today? My hope is that the same can occur for space applications, which, while different in details, have many of the same drivers. And on to my comments interspersed within... At 04:54 PM 4/17/2003 -0400, Robert G. Brown wrote: >On Fri, 18 Apr 2003, Mof wrote: > > > Ok excuse my ignorance, but what is involved in rad harding hardware ? > > Is the cost really necessary, in that couldn't you put the unprotected > > hardware into some sort of shielded container ? > > > > Or am I just being silly ? :-) > >Not really silly, but IIRC shielding is both difficult and expensive and >sometimes actively counterproductive in space. I'm sure the NASA guys >will have even more detail, but: > > a) Difficult, because there is a very wide range of KINDS and ENERGIES >of radiation out there. Some are easy to stop, but some (like massive, >very high energy nucleii or very high energy gamma rays) are not. Precisely the case. There's two aspects: total dose, which gradually degrades the components, and single event effects (SEE), which come from the "one big fast (high energy) particle" kind of things. SEEs can be either transient (upsets) or permanent (Gate rupture). Total dose is talked about in terms of kiloRads or MegaRads (Yeah, I know the real units are Grays and Sieverts, but we work in rads for historical reasons). And, of course, taking dose and trying to collapse it into a single number ignores important things like the energy spectrum and dose rate effects (some degradation processes are enhanced and others reduced at low dose rates) Single events are usually talked about in terms of Linear Energy Transfer (LET).. typically some number of MeV per cm, etc. A neutrino may have high energy, but because it won't hit anything, it doesn't transfer any energy to the victim circuit, hence have low LET. On the other hand, a big old heavy ion, moving slow, has a very high collision cross section, so the LET might be quite high. LET is sort of a way to represent a combination of particle energy and reaction cross-section. > b) Expensive, because to stop radiation you basically have to >interpolate matter in sufficient density to absorb and disperse the >energy via single and multiple scattering events. Some radiation has a >relatively high cross section with matter and low energy and is easily >stopped, but the most destructive sort requires quite a lot of >shielding, which is dense and thick. 
This means heavy and occupying >lots of volume, which means expensive in terms of lifting it out of the >gravity well. I don't know what it costs to lift a kilogram of mass to >geosynchronous orbit, but I'll bet it is a LOT. $100K/kg is a nice round number... Of more significance is that launch capability comes in chunks. You might have 300kg of lift, and if your box winds up being 320kg, you have to buy the next bigger rocket, at a substantial cost increase. At an early stage, your mass budget gets set according to your dollar budget. The mission designer divvies up the mass budget among all the folks clamoring for it (so many kg for attitude control, so many kg for thermal management, so many kg for instruments, etc.) holding a bit back in reserve (because systems ALWAYS get heavier), so that when the inevitable happens, they can still buy the cheaper rocket. > c) Counterproductive, because SOME of the kinds of radiation present >are by themselves not horribly dangerous -- they have a lot of energy >but are relatively unlikely to hit anything. So when they hit they kill >a cell or a chromosome or a bit or something, but in a fairly localized >way. However, when they hit the right densities of matter in shielding >they can produce a literal shower or shotgun blast of secondary >particles that ARE the right particles at the right energies to do a lot >of damage (to humans or hardware). So either you need enough shielding >to stop these particles and all their secondary byproducts, or you can >be better off just letting those particles (probably) pass right on >through, hopefully without hitting anything. Scattering is one of those horrible things.. adding shielding might make things worse. And the real problem is that it is very, very difficult to model accurately. So we make approximations (spherical shells, etc.), add a bit of margin, and go from there. Think of this.. high energy neutrons are actually safer than thermalized neutrons, for human exposure (looking at RBE numbers), because the cross section is much higher for thermalized neutrons... they're slower. Same kinds of things apply to electronics. >Basically, we are pretty fortunate to live way down here at the bottom >of several miles of atmosphere, where most of the dangerous crap hits >and showers its secondary stuff miles overhead and is absorbed before it >becomes a hazard. Our computer hardware is similarly fortunate. Even a >mile up the radiation levels are significantly higher -- even growing up >in subtropical India I was NEVER sunburned as badly as I was in a mere >two hours of late afternoon exposure in Taxco, Mexico, just one mile up. >A single six hour cross-country plane ride exposes you to 1/8 of the >rems you'd receive, on average, in an entire year spent at ground >level. God only knows what astronauts get. Maybe they bank gametes >before leaving, dunno... IBM did a bunch of studies a while back comparing DRAM error rates at computers installed at sea level and in Colorado and in a mine in Colorado, and found significant (in a statistical sense) differences. >So definitely not silly, but things are more complex than they might >seem. I'm sure that if a cost-effective solution were as easy as just >"more shielding" the rocket scientists (literally:-) at NASA would have >already thunk it. All it takes is money... James Lux, P.E. 
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Apr 18 19:41:59 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 18 Apr 2003 18:41:59 -0500 Subject: beowulf in space Warning: 'WAY OT In-Reply-To: <1050680559.25053.15.camel@terra> References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> <3E9DFC75.50504@tamu.edu> <200304180117.35195.mof@labf.org> <3E9FF7D5.8030809@tamu.edu> <1050680559.25053.15.camel@terra> Message-ID: <3EA08D47.2040304@tamu.edu> You asked for it. about 8 years ago, in response to a need to better and more accurately track cattle for a USDA-sponsored entomology research project, Cows In Space was born (or borne, as the case may be). There were lots of 68030's moooving around the pasture, all reporting back to a head node made of a Pentium-I. All of the primary data was provided by direct sequence spread-spectrum signalling at L-Band. Biasing and additional input data was provided by a low-speed VHF datalink. ALthough the general application was NUMA, in fact, all of the processors had a uniform architecture and memory distribution. These were GPS receivers. On cows. With differential corrections data provided. For the record, we were able to determine the animals' location on a 30-sec interval with sub-bovine accuracy. Sorry. It just HAD to be told. gerry Dean Johnson wrote: > Howdy all, > Is it just me, or does anybody else have the problem when reading the "beowulf in space" > subject line, you hear that echoey 60's sci-fi movie sound effect? Okay, maybe I need some > sleep or something. ;-) > > -Dean > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sat Apr 19 14:54:59 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sat, 19 Apr 2003 14:54:59 -0400 (EDT) Subject: beowulf in space Warning: 'WAY OT In-Reply-To: <3EA08D47.2040304@tamu.edu> Message-ID: On Fri, 18 Apr 2003, Gerry Creager N5JXS wrote: > You asked for it. > > about 8 years ago, in response to a need to better and more accurately > track cattle for a USDA-sponsored entomology research project, Cows In > Space was born (or borne, as the case may be). There were lots of > 68030's moooving around the pasture, all reporting back to a head node > made of a Pentium-I. All of the primary data was provided by direct > sequence spread-spectrum signalling at L-Band. Biasing and additional > input data was provided by a low-speed VHF datalink. 
ALthough the > general application was NUMA, in fact, all of the processors had a > uniform architecture and memory distribution. > > These were GPS receivers. On cows. With differential corrections data > provided. For the record, we were able to determine the animals' > location on a 30-sec interval with sub-bovine accuracy. > > Sorry. It just HAD to be told. You'll be sorry. I'm recording all of this in my Src/beowulf_book/List_Ideas/Space file. One day Cows In Space (as told by a certain GC:-) will be on the tongues of all humans on the planet...or at least all of those interested in building clusters. It's pretty clear that there is a chapter in all this, but I've spent some 24 out of the last 36 hours rewriting the brahma website in php. That (by the way) is almost done -- people who have used http://www.phy.duke.edu/brahma in the past might check it out again at http://www.phy.duke.edu/brahma/index.php and comment if they so desire. When my energy banks recharge, perhaps I'll tackle the Cows. So to speak... rgb > > gerry > > Dean Johnson wrote: > > Howdy all, > > Is it just me, or does anybody else have the problem when reading the "beowulf in space" > > subject line, you hear that echoey 60's sci-fi movie sound effect? Okay, maybe I need some > > sleep or something. ;-) > > > > -Dean > > > > _______________________________________________ > > Beowulf mailing list, Beowulf at beowulf.org > > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sat Apr 19 19:49:39 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sat, 19 Apr 2003 19:49:39 -0400 (EDT) Subject: Brahma Site Officially Rebuilt Message-ID: Dear All, The brahma website has just been completely redone in php so that it looks much prettier and is much easier to navigate. I have worked very hard to validate all the links, although I'm sure that some links are still broken or missing from the previous site. I have also written a moderately detailed description of the various clusters that are part of the "brahma" project. The beowulf engineering book is still there (in three forms), and any old bookmarks to it should be forwarded. The links and vendors lists are updated and augmented with new entries. Software, talks and papers, and other resources are much better organized. Nearly everything has meta tags that should make the associated resource more visible to search engines on campus or off. The old site is even there, preserved in its entirety, in case there are things you rely on or find in a search engine that failed to get moved (yet). Beowulf list persons who have used the site in the past are invited to revisit it, browse it, and update their bookmarks or URL's. Beowulf-associated managers or vendors with sites of their own are invited to check it out and send me links or update requests if your site is missing or broken. DBUG persons on campus are invited to visit it and the DBUG site and to REGISTER THEIR CLUSTERS on the DBUG site if they have not already done so. 
DULUG persons on campus who are interested in cluster computing or interested in learning about a bit of what is being done with linux at the high performance computing edge are invited to visit it just to check it out for fun. Finally, physics department persons who use the cluster or operate one of the subclusters in brahma are invited to send me comments, entries for the various (woefully incomplete) tables crossreferencing personnel that use the cluster or research group pages that detail some of what is being done with the cluster. Look especially under the Users, Clusters and Research pages and let me know if you find egregious errors or want to send me updated information for the tables. Thank you, rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Sun Apr 20 13:05:31 2003 From: astroguy at bellsouth.net (astroguy at bellsouth.net) Date: Sun, 20 Apr 2003 13:05:31 -0400 Subject: beowulf in space Message-ID: <20030420170531.BUSW1506.imf45bis.bellsouth.net@mail.bellsouth.net> PS. I'm not sure if this got out, as I haven't been getting mail out cause I've been fiddling with my pop and IMAP system settings... forever tinkering. But I have to agree this topic has spilled over into areas that this beowulfy seldom visits but if we are going to fly the cluster into the hostile environment of space all these areas must be considered... the thermal as well as a combination of electrostatic and magnetic shielding .... and sure, as Jim points out it is going to take money... but perhaps even more than money... Is the passion (in the 60's it was electric and contagious)... the commitment and enthusiastic support of another generation will have to have their imaginations stirred as to the complexity and daunting challenge of the goals of space exploration and research, not just in terms of gold and treasure but in the blood, sweat, and as recent events again revisit the tears of it all.. As space exploration is forever to remain the most risky business... But it is also the most exciting and noble of all human adventures. > > From: chip > Date: 2003/04/18 Fri PM 01:17:47 EDT > To: Dean Johnson > CC: Beowulf at beowulf.org > Subject: Re: beowulf in space > > Dean Johnson wrote: > > Howdy all, > Is it just me, or does anybody else have the problem when reading the > "beowulf in space" > subject line, you hear that echoey 60's sci-fi movie sound effect? Okay, > maybe I need some > sleep or something. ;-) > > -Dean > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > Hi Dean, > No, I don't think its you... I've got that same sound coming thru my box > and I've had plenty of time to sleep on it. But you would have actually > have had to live in the 1960's to understand what the sound was like. A > flight to the moon was considered reckless not to mention impossible and > at the time considering the state of technology, it was certainly risky > business. Flying a mission thru the Sun's outer atmosphere certainly > presses our technology to the extreme of theoretical limits...
Not just > the processing power of the on board beowulfy type cluster processors > necessary for such a mission to succeed but the velocity that would > required is on the order of 10 times of what we have yet to achieve, but > theoretically possible based on a space tested Ion design. The shielding > would have incorporate a combination of our most advanced composites and > not to mention the electrostatic field that would have to be generated to > protect the sensitive sensor array would perhaps come close to what Dr. > Brown half jokingly describes as a fusion engine... since it would > required an inverted plasma bubble... Yep I would say it is a purdy near > impossible mission... but not outside our theoretical limits... It's > enticing in its appeal... And in the same manner as the missions to the > moon in the 60's... It's got that mythic quality about it that tends to > capture the imagination... sends a chill of excitement in the > confrontation of overwhelming and impossible odds... From our not so > distant past it has a sound that is hauntingly familiar... something the > thunder of the sound barrier perhaps booming from the past... It rings > history heroic... Its a great sound, I still love that sixties music :-) > > C.Clary > Spartan sys. analyst > PO 1515 > Spartanburg, SC 29304-0243 > > Fax# (801) 858-2722 > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Mon Apr 21 16:46:34 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Mon, 21 Apr 2003 15:46:34 -0500 Subject: back to the issue of cooling Message-ID: <3EA82302@itsnt5.its.uiowa.edu> Sorry to keep kicking a dead horse guys, but the issue of increasing thermal efficiency in large clusters and data centers has been keeping me awake at nights. Has anyone tried to use a stirling engine or other system for instance: http://www.stmpower.com/Technology/Technology.asp that can take as input pure heat, not just potential in the form of btus, in order to recover some of the heat energy that would simply be wasted at a large facility. Could this be economically viable? Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 21 18:23:10 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 21 Apr 2003 18:23:10 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: <3EA82302@itsnt5.its.uiowa.edu> Message-ID: > Sorry to keep kicking a dead horse guys, but the issue of increasing > thermal efficiency in large clusters and data centers has been keeping > me awake at nights. wow. you need to get a cluster of your own to worry about ;) > Has anyone tried to use a stirling engine or other afaikt, this sort of thing depends on the presence of a substantial temperature differential, not just a lot of energy. 
my machineroom dissipates around 35 kW, but the return air isn't supposed to get above about 30C (unlike today, when we hit 35.8 :( ) I have a vague recollection that the efficiency of heat engines is strongly dependent on the temp differential, which would be only about 20C assuming our chilled water stayed chilled... > order to recover some of the heat energy that would simply be wasted at a > large facility. Could this be economically viable? I think it would be more effective to reduce at the source. for instance, my ES40's have 2-of-3 redundant power supplies, which seem to dissipate a lot more heat than another cluster which has 1-of-2. of course, the mere fact that we're using Alphas is a declaration of war on coolness ;) I hear those Opterons are pretty cool... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at math.ucdavis.edu Tue Apr 22 01:35:51 2003 From: bill at math.ucdavis.edu (Bill Broadley) Date: Mon, 21 Apr 2003 22:35:51 -0700 Subject: Opteron announcement Message-ID: <20030422053550.GA6923@sphere.math.ucdavis.edu> Apparently the link to http://www.amd.com/opteronservers just went live. Tons of cool docs/benchmarks.

SPECfp rate 2000 (dual cpu)
================
it2-1.0    30.7
amd-244    26.7
amd-242    25.1
amd-240    22.7
Xeon-2.8   14.7

SPECfp_peak 2000 (single cpu)
================
it2-1.0    1431
amd-144    1219
Xeon-3.06  1103

SPECint_peak 2000 (single cpu)
=================
Amd-144    1170
Xeon-3.06  1130
IT2-1.0     719

SPECint_rate 2000 (dual cpu)
=================
amd-244    26.8
amd-242    24.0
amd-240    21.2
Xeon-2.8   19.6
it2-900    15.5

SPECint_rate2000 (windows) 4p
=============================
amd-844      48.5
amd-852      45.1
amd-840      40
Xeon MP 2.0  34.7
it2-1.0      32.9

SPECfp_rate2000 4P
===================
it2-1.0    49.3
amd-844    49.2
amd-842    45.0
amd-840    40.7
Xeon-2.0   20.2

Oh and one more interesting link: Software Optimization Guide for AMD athlon 64 and AMD Opteron Processors http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_739_7203,00.html Amusingly, all the submissions that I looked at the full reports for use the Intel compiler. So the Opteron's extra registers are ignored. Time will tell if 3rd party compilers that fully utilize the additional registers can win benchmarks against Intel's compiler. Based on the preliminary pricing I have, the Opterons look to make for very nice beowulf nodes. -- Bill Broadley Mathematics UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Apr 22 09:36:52 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 22 Apr 2003 09:36:52 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: <3EA82302@itsnt5.its.uiowa.edu> Message-ID: On Mon, 21 Apr 2003, jbassett wrote: > Sorry to keep kicking a dead horse guys, but the issue of increasing > thermal efficiency in large clusters and data centers has been keeping > me awake at nights. Has anyone tried to use a stirling engine or other > system > for instance: > > http://www.stmpower.com/Technology/Technology.asp > > that can take as input pure heat, not just potential in the form of btus, in > order to recover some of the heat energy that would simply be wasted at a > large facility. Could this be economically viable? In almost all cases, no.
It's the problem with heat -- you have to pay for it to get it where you want it, then you have to pay for it again to get rid of it when it is where you DON'T want it. I won't inflict a full review of the laws of thermodynamics on the list, but the relevant one (second) here says that you can only extract work when running e.g. a heat engine between two reservoirs, one "hot", one "cold(er)". Even then, one can only extract strictly less than \eta = \frac{T_h - T_c}{T_h} (degrees kelvin only) of the energy in the heat that flows from hot to cold through your engine. Even to start with, this makes it hardly worth it. You're trying NOT to allow T_h to exceed 340K at the CPU (the room itself would need to be far colder or you'd be in deep trouble out of the gate); you'd have to work very hard (and spend a lot of energy and money!) to come up with a "free" cold reservoir at T_c = 290K (got a glacier or springfed lake handy?). So you could recover at most about 15% of the heat energy from the CPUs themselves (\eta = (340 - 290)/340 \approx 0.15), probably not enough to run the pumps from your "free" cold reservoir. The only way to recover any fraction at all is to use your A/C as a "heat pump" in the wintertime and pump the heat elsewhere in your building where it could be of use. A modern new building facility might well do that, if the architect designed things appropriately for that purpose from the beginning. It would be quite difficult to cost-effectively retrofit in most other environments. This would still cost money to operate, but you'd get a gain on the investment as the coefficient of performance of your heat pump/AC could be 3-5 (giving you a solid gain on the energy used). The BEST way to remember the second law is that it says that "There ain't no such thing as a free lunch" (tanstaafl). So any clever scheme to get something for nothing will almost certainly fail. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jcownie at etnus.com Tue Apr 22 07:12:28 2003 From: jcownie at etnus.com (James Cownie) Date: Tue, 22 Apr 2003 12:12:28 +0100 Subject: beowulf in space In-Reply-To: Message from Jim Lux of "Fri, 18 Apr 2003 10:56:31 PDT." <5.1.0.14.2.20030418104052.0303c7d8@mailhost4.jpl.nasa.gov> Message-ID: <197vhM-3Yi-00@etnus.com> As you no doubt already know, some people have used standard (non-rad-hardened) CPUs in satellite applications. For instance Clementine used a MIPS R3081 for its sensor interface processor. (See table 9 in http://www.google.com/search?q=cache:UWVj-wMWHLUC:www.pxi.com/praxis_publicpages/pdfs/Lun_Orb_Alabama.pdf+clementine+MIPS+computer+&hl=en&ie=UTF-8 ) Of course lunar orbit is likely a lower radiation environment than low-earth orbit, and there were two lower level processors for basic control which _were_ rad-hardened. -- Jim James Cownie Etnus, LLC.
+44 117 9071438 http://www.etnus.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Tue Apr 22 00:45:23 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 21 Apr 2003 21:45:23 -0700 Subject: back to the issue of cooling References: Message-ID: <000c01c30889$fce650c0$02a8a8c0@office1> It always takes energy to move the heat against temperature differential. (one of those pesky laws of thermodynamics) So, to use the waste heat from your cluster to move that heat outside would require the addition of extra energy. ----- Original Message ----- From: "Mark Hahn" To: "jbassett" Cc: Sent: Monday, April 21, 2003 3:23 PM Subject: Re: back to the issue of cooling > > Sorry to keep kicking a dead horse guys, but the issue of increasing > > thermal efficiency in large clusters and data centers has been keeping > > me awake at nights. > > wow. you need to get a cluster of your own to worry about ;) > > > Has anyone tried to use a stirling engine or other > > afaikt, this sort of thing depends on the presence of a substantial > temperature differential, not just a lot of energy. my machineroom > dissipates around 35 KW, but the return air isn't supposed to get > above about 30C (unlike today, when we hit 35.8 :( ) > > I have a vague recollection that the efficiency of heat engines is > strongly dependent on the temp differential, which would be only about 20C > assuming our chilled water stayed chilled... > > > order to recover some of the heat energy that would simply be wasted at a > > large facility. Could this be economically viable? > > I think it would be more effective to reduce at the source. for instance, > my ES40's have 2-of-3 redundant power supplies, which seem to dissipate > a lot more heat than another cluster which has 1-of-2. of course, the mere > fact that we're using Alphas is a declaration of war on coolness ;) > > I hear those Opterons are pretty cool... > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wharman at prism.net Tue Apr 22 10:57:36 2003 From: wharman at prism.net (William Harman) Date: Tue, 22 Apr 2003 08:57:36 -0600 Subject: Opteron announcement In-Reply-To: <20030422053550.GA6923@sphere.math.ucdavis.edu> Message-ID: <002701c308df$98af7270$318a010a@WHARMAN> Bill; If you need a demo unit, let me know, I can supply. Bill Harman, High Performance Cluster Systems Toll Free 866-883-4689 Ext 203 Salt Lake City Office (801) 572-9252 wharman at prism.net wharman at einux.com www.einux.com -----Original Message----- From: beowulf-admin at beowulf.org [mailto:beowulf-admin at beowulf.org] On Behalf Of Bill Broadley Sent: Monday, April 21, 2003 11:36 PM To: beowulf at beowulf.org Subject: Opteron announcement Apparently the link to http://www.amd.com/opteronservers just went live. Tons of cool docs/benchmarks. 
SPECfp rate 2000 (dual cpu) ================ it2-1.0 30.7 amd-244 26.7 amd-242 25.1 amd-240 22.7 Xeon-2.8 14.7 SPECfp_peak 2000 (single cpu) ================ it2-1.0 1431 amd-144 1219 Xeon-3.06 1103 SPECint_peak 2000 (single cpu) ================= Amd-144 1170 Xeon-3.06 1130 IT2-1.0 719 SPECint_rate 2000 (dual cpu) ================= amd-244 26.8 amd-242 24.0 amd-240 21.2 Xeon-2.8 19.6 it2-900 15.5 SPECint_rate2000 (windows) 4p ============================= amd-844 48.5 amd-852 45.1 amd-840 40 Xeon MP 2.0 34.7 it2-1.0 32.9 SPECfp_rate20000 4P =================== it2-1.0 49.3 amd-844 49.2 amd-842 45.0 amd-840 40.7 Xeon-2.0 20.2 Oh and one more interesting link: Software Optimization Guide for AMD athlon 64 and AMD Opteron Processors http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_739_720 3,00.html Amusingly all the submissions that I looked at the full reports for use the Intel compiler. So the Opterons extra registers are ignored. Time will tell if 3rd party compilers that fully utilize the additional registers can win benchmarks against Intel's compiler. Based on the preliminary pricing I have the Opterons look to make for very nice beowulf nodes. -- Bill Broadley Mathematics UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From brian.dobbins at yale.edu Tue Apr 22 11:54:09 2003 From: brian.dobbins at yale.edu (Brian Dobbins) Date: Tue, 22 Apr 2003 11:54:09 -0400 (EDT) Subject: PGI v5.0 [Beta] avail.. (was: Opteron announcement) Message-ID: [Mikhail Kuzminksy said:] > PGI (Portland Group) 5.0 will have Opteron support. The product >will be available at summer (June, if I remember correctly). >It'll be very interesting to compare ! You can download a beta now, though it doesn't support cross-compiling, so you need an Opteron system. I recently used it to benchmark a code and was thoroughly impressed. I haven't yet run that same code on an Opteron via the Intel compilers, but I should have a system arriving soon and will certainly try that out to see the difference. PGI v5.0 Beta: http://www.pgroup.com/AMD64.htm Cheers, - Brian Brian Dobbins Yale Mechanical Engineering _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Tue Apr 22 12:05:31 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Tue, 22 Apr 2003 09:05:31 -0700 Subject: Opteron announcement In-Reply-To: <20030422053550.GA6923@sphere.math.ucdavis.edu> References: <20030422053550.GA6923@sphere.math.ucdavis.edu> Message-ID: <20030422160531.GA1299@greglaptop.attbi.com> > SPECfp_peak 2000 (single cpu) > ================ > it2-1.0 1431 > amd-144 1219 > Xeon-3.06 1103 By the way, if you get rid of art (which gets a major cache benefit at 3 MB) and swim (main memory bound), the Opteron is faster than Itanium on the rest. Pretty amazing. 
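If anyone wants to redo that kind of comparison themselves, the SPEC aggregate is just the geometric mean of the per-benchmark ratios, so dropping a couple of components and recomputing is a one-liner. A rough sketch -- the two-column input (benchmark name, ratio) and the file names are made up for the example, they are not files from the actual SPEC reports:

#!/bin/sh
# usage: geomean.sh ratios.txt [benchmarks to exclude ...]
# e.g.:  geomean.sh opteron_fp.txt art swim
# Prints the geometric mean of column 2 over the lines whose first
# field is not in the exclusion list.
file=$1; shift
awk -v skip="$*" '
  BEGIN { n = split(skip, s, " "); for (i = 1; i <= n; i++) drop[s[i]] = 1 }
  !($1 in drop) { sum += log($2); count++ }
  END { if (count) printf "geomean over %d benchmarks: %.1f\n", count, exp(sum / count) }
' "$file"

Run it once per machine on the published per-benchmark ratios and the head-to-head comparison falls out directly.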
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Tue Apr 22 13:51:54 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Tue, 22 Apr 2003 12:51:54 -0500 Subject: back to the issue of cooling Message-ID: <3EAA34FF@itsnt5.its.uiowa.edu> Yes , yes I haven't forgotten that yellow stat phys book that I enjoyed a couple of semesters ago. It depresses me that there is not an easy solution. I suppose it is better to make the engine more efficient than to try to trap unburned fuel. Cluster o' Transmeta. On another note, I am now the proud owner of a Sun Ultra 10 workstation. I have a very heterogeneous cluster in my apartment, including x86 and DEC alpha. I cannot seem to get the Sun (running FreeBSD 5.0, fortran compiler broken) to shake hands correctly with the rest of the team. But if I run a PVM code, he plays ball. Is there something peculiar to Sparc that would keep it from integrating well into a hetero mpich cluster. Is this a 512/1024 problem? Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From timm at fnal.gov Tue Apr 22 14:14:18 2003 From: timm at fnal.gov (Steven Timm) Date: Tue, 22 Apr 2003 13:14:18 -0500 (CDT) Subject: Opteron announcement In-Reply-To: <20030422160531.GA1299@greglaptop.attbi.com> Message-ID: How do the specint_peak 2000 numbers compare? Steve Timm ------------------------------------------------------------------ Steven C. Timm (630) 840-8525 timm at fnal.gov http://home.fnal.gov/~timm/ Fermilab Computing Division/Core Support Services Dept. Assistant Group Leader, Scientific Computing Support Group Lead of Computing Farms Team On Tue, 22 Apr 2003, Greg Lindahl wrote: > > SPECfp_peak 2000 (single cpu) > > ================ > > it2-1.0 1431 > > amd-144 1219 > > Xeon-3.06 1103 > > By the way, if you get rid of art (which gets a major cache benefit at > 3 MB) and swim (main memory bound), the Opteron is faster than Itanium > on the rest. Pretty amazing. > > -- greg > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Tue Apr 22 14:33:17 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Tue, 22 Apr 2003 11:33:17 -0700 Subject: PGI v5.0 [Beta] avail.. (was: Opteron announcement) In-Reply-To: References: Message-ID: <20030422183316.GA1631@greglaptop.internal.keyresearch.com> On Tue, Apr 22, 2003 at 11:54:09AM -0400, Brian Dobbins wrote: > I recently used it to benchmark a code and > was thoroughly impressed. I'd give my opinions, but: BETA LICENSE CONDITIONS iv. In the absense of explicit permission from STMicroelectronics, The Portland Group Compiler Technology, performance results obtained using this Software will not be published or presented in a public forum This is fairly typical for beta releases... 
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jeff at aslab.com Tue Apr 22 14:27:00 2003 From: jeff at aslab.com (Jeff Nguyen) Date: Tue, 22 Apr 2003 11:27:00 -0700 Subject: Opteron announcement References: <20030422053550.GA6923@sphere.math.ucdavis.edu> Message-ID: <056a01c308fc$c390e6a0$6502a8c0@jeff> Hi Bill, Here is an interesting benchmark result of the Opteron platforms running on the combination of 32-bit/64-bit operating system and applications. For this benchmark, Povray 3D ray tracer is used.

Platform             Render Time (smaller is faster)
-----------------------------------------------------------------
Opteron Model 242    41m 44s
  1.6ghz, 1MB L2, 32-bit OS (RH 8.0), 32-bit Povray binary
Opteron Model 242    41m 44s
  1.6ghz, 1MB L2, 64-bit OS (UnitedLinux x86-64 v1.0), 32-bit Povray binary
Opteron Model 242    30m 12s
  1.6ghz, 1MB L2, 64-bit OS (UnitedLinux x86-64 v1.0), 64-bit Povray binary
Intel Xeon 3.06ghz   31m 11s
  32-bit OS (RH 8.0), 32-bit Povray binary

Jeff ASL Inc. ----- Original Message ----- From: "Bill Broadley" To: Sent: Monday, April 21, 2003 10:35 PM Subject: Opteron announcement > > Apparently the link to http://www.amd.com/opteronservers just went > live. Tons of cool docs/benchmarks. > > SPECfp rate 2000 (dual cpu) > ================ > it2-1.0 30.7 > amd-244 26.7 > amd-242 25.1 > amd-240 22.7 > Xeon-2.8 14.7 > > SPECfp_peak 2000 (single cpu) > ================ > it2-1.0 1431 > amd-144 1219 > Xeon-3.06 1103 > > SPECint_peak 2000 (single cpu) > ================= > Amd-144 1170 > Xeon-3.06 1130 > IT2-1.0 719 > > SPECint_rate 2000 (dual cpu) > ================= > amd-244 26.8 > amd-242 24.0 > amd-240 21.2 > Xeon-2.8 19.6 > it2-900 15.5 > > SPECint_rate2000 (windows) 4p > ============================= > amd-844 48.5 > amd-852 45.1 > amd-840 40 > Xeon MP 2.0 34.7 > it2-1.0 32.9 > > SPECfp_rate20000 4P > =================== > it2-1.0 49.3 > amd-844 49.2 > amd-842 45.0 > amd-840 40.7 > Xeon-2.0 20.2 > > Oh and one more interesting link: > Software Optimization Guide for AMD athlon 64 and AMD Opteron Processors > http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_739_7203,00 .html > > Amusingly all the submissions that I looked at the full reports for > use the Intel compiler. So the Opterons extra registers are ignored. > > Time will tell if 3rd party compilers that fully utilize the additional > registers can win benchmarks against Intel's compiler. > > Based on the preliminary pricing I have the Opterons look to make for > very nice beowulf nodes.
> > > -- > Bill Broadley > Mathematics > UC Davis > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From erwan at mandrakesoft.com Tue Apr 22 18:12:58 2003 From: erwan at mandrakesoft.com (Erwan Velu) Date: Wed, 23 Apr 2003 00:12:58 +0200 (CEST) Subject: Opteron announcement In-Reply-To: <20030422053550.GA6923@sphere.math.ucdavis.edu> References: <20030422053550.GA6923@sphere.math.ucdavis.edu> Message-ID: <32813.81.56.219.165.1051049578.squirrel@webmail.mandrakesoft.com> As some may have noticed, Mandrakesoft is one of the AMD launch partners. The Mandrake Linux products are ready-to-run under this platform except the clustering side (June). - http://www.mandrakesoft.com/company/press/briefs?n=/mandrakesoft/news/2414 [...] Later in June 2003, MandrakeSoft will release 'MandrakeClustering' for Opteron, an easy-to-use clustering solution designed to answer needs in the intensive calculation area that will greatly benefit from the power of AMD 64-bit technology. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Apr 22 17:08:49 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 22 Apr 2003 17:08:49 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: <3EAA34FF@itsnt5.its.uiowa.edu> Message-ID: On Tue, 22 Apr 2003, jbassett wrote: > Yes , yes I haven't forgotten that yellow stat phys book that I enjoyed a > couple of semesters ago. It depresses me that there is not an easy solution. I > suppose it is better to make the engine more efficient than to try to trap > unburned fuel. Cluster o' Transmeta. No, not even this does it in the HPC market where there are few idle cycles, unless my back-of-the-envelope computations are wrong (entirely possible as I suck at arithmetic:-). IIRC there is an energy cost per switching operation in VLSI that provides a basic, physical limitation on the energy efficiency per "flop". Beyond that, it is the battle of the chip maskers. How to lay out the chip at a given fabrication scale so that the switching operations are reliable, so that pathways are minimized, so that energy isn't radiated away. If you work out the actual energy cost per average "instruction" for the different silicon foundries, you don't get all that profound a difference between them -- well within a factor of two in most cases. So you can get more slower, cooler chips, or fewer faster, hotter chips, but the net amount of energy you consume doing a GFLOPS-year of mixed computation isn't likely to vary tremendously, from at least the seat of the pants computations I've done. Don't forget the auxiliary costs, as well -- one case, motherboard, memory, disk for a dual 2.5 GHz CPU (5 aggregate GHz of instructions) vs five cases for single 1 GHz CPUs means that even if the 1 GHz CPU runs more than 2.5x cooler (often it won't) you may be spending an extra 200 Watts running the extra cases and peripherals.
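To put rough dollars on that last point, here is a sketch with invented wattages -- only the ~200 Watt overhead figure above and the roughly $1 per watt per year power-and-cooling rule of thumb used elsewhere in this thread are taken from the discussion, the per-box numbers are made-up round values:

#!/bin/sh
# Back-of-the-envelope: five single 1 GHz boxes vs one dual 2.5 GHz box
# for the same aggregate clock.  All wattages are assumed, not measured.
DUAL_WATTS=250        # assumed draw of one dual-CPU node
SINGLE_WATTS=90       # assumed draw of one single-CPU node
N_SINGLE=5
COST_PER_WATT_YEAR=1  # ~$1/watt/year for power plus cooling
EXTRA=$(( N_SINGLE * SINGLE_WATTS - DUAL_WATTS ))
echo "extra draw of the five-box setup: ${EXTRA} W"
echo "extra power and cooling cost: roughly \$$(( EXTRA * COST_PER_WATT_YEAR )) per year"

Over a three-year cluster lifetime that adds up, though it is still modest next to the purchase price of the extra boxes.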
You might save 20% or 30% of your energy costs PER UNIT OF WORK ACCOMPLISHED shopping for energy-efficient processors, but I would be surprised if you did much better than that. So -- tanstaafl. Barring real technical breakthroughs at the silicon level -- teensy switches that switch, reliably, just as fast, at lower voltage with lower energy, the best you are dealing with is rearrangements of the same scaling laws at each level of VLSI masking. Not that there aren't real differences that appear over time -- my palm pilot is about as fast as my original IBM PC was, but runs MUCH cooler:-) but there is a pretty significant lag in performance to where cpu masking rearrangements and implementation in different technologies starts making them happen. Believe me, if it were possible to run silicon cooler (and over time it is), chip designers would "immediately" implement the cooler technology to increase switch densities and make more powerful chips, since heat dissipation is a major limitation on chip design as it is. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Tue Apr 22 16:26:27 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Tue, 22 Apr 2003 23:26:27 +0300 Subject: back to the issue of cooling In-Reply-To: <3EAA34FF@itsnt5.its.uiowa.edu> References: <3EAA34FF@itsnt5.its.uiowa.edu> Message-ID: <200304222326.27878.exa@kablonet.com.tr> On Tuesday 22 April 2003 20:51, jbassett wrote: > Yes , yes I haven't forgotten that yellow stat phys book that I enjoyed a > couple of semesters ago. It depresses me that there is not an easy > solution. I suppose it is better to make the engine more efficient than to > try to trap unburned fuel. Cluster o' Transmeta. Has anybody calculated if the operation of a low-power cluster can amortize the actual price of the system in a couple of years? I'm thinking something like an apple cluster or something, might actually be viable! What about total cost/processing power of the cluster ? Cheers, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Apr 22 17:47:36 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 22 Apr 2003 17:47:36 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: <200304222326.27878.exa@kablonet.com.tr> Message-ID: On Tue, 22 Apr 2003, Eray Ozkural wrote: > Has anybody calculated if the operation of a low-power cluster can amortize > the actual price of the system in a couple of years? I'm thinking something > like an apple cluster or something, might actually be viable! What about > total cost/processing power of the cluster ? This is the relevant measure, to be sure. Total cost of ownership with a vengeance, per unit of work done, amortized over the life of a cluster. 
$1 per watt per year for heating and cooling, add cost of systems themselves, correct for SPEED of systems (ideally including your Amdahl's law scaling hit for using more slower processors!). I predict that the sweet spot is probably Athlon 2400's or possibly 2.4 GHz P4's (depending on your code), with fine grained people shifted toward the even higher end processors and with some room for Celerons or Durons on the low end. I think the main reason to get transmetas is likely to be to get the processing densities, not to save power or money. For things like apples, you may have to factor in increased sysadmin costs, if you're not careful. Intel or Athlon clusters are pretty much plug'n'play with multiple distributions and techniques, if you shop your hardware at all carefully. rgb > > Cheers, > > -- > Eray Ozkural (exa) > Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org > www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza > GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From toon at moene.indiv.nluug.nl Tue Apr 22 18:15:29 2003 From: toon at moene.indiv.nluug.nl (Toon Moene) Date: Wed, 23 Apr 2003 00:15:29 +0200 Subject: Opteron announcement References: <20030422053550.GA6923@sphere.math.ucdavis.edu> Message-ID: <3EA5BF01.7090308@moene.indiv.nluug.nl> Bill Broadley wrote: > Amusingly all the submissions that I looked at the full reports for > use the Intel compiler. So the Opterons extra registers are ignored. Yes, I'd hoped AMD would use g77 for the Fortran 77 parts of SPECfp2000 :-) > Time will tell if 3rd party compilers that fully utilize the additional > registers can win benchmarks against Intel's compiler. You want to look at this page of the (15 !) page report by Aces Hardware (the results are mixed): http://www.aceshardware.com/read.jsp?id=55000265 [ the very last table on that page ] -- Toon Moene - mailto:toon at moene.indiv.nluug.nl - phoneto: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html GNU Fortran 95: http://gcc-g95.sourceforge.net/ (under construction) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Tue Apr 22 20:09:20 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Tue, 22 Apr 2003 20:09:20 -0400 (EDT) Subject: Opteron announcement In-Reply-To: <3EA5BF01.7090308@moene.indiv.nluug.nl> Message-ID: > You want to look at this page of the (15 !) page report by Aces Hardware > (the results are mixed): I think the results are pretty clear: AMD has successfully transformed the well-regarded Athlon core into a serious competitor to anything Intel has to offer. 
the core isn't really changed that much: it's got SSE2 now (which helps blas a lot), seems to gain somewhat from extra regs in 64b mode (but it's not clear how you go about using them, since until now, Intel's compiler has been by far the best). I'm still a little puzzled about the 1MB onchip L2, since low-latency ram at least partially obviates the need for it. at the system level, AMD can now compete against Intel's aggressive bandwidth scaling (where the Athlon recently lagged) and has a clearly superior SMP architecture (especially for >2-way). in my opinion, AMD will quickly realize that their CPU price ($794 for opt/244(1800)) has to roughly match that of the Xeon/2.8 (pricewatch: $425), since they're comparable in performance. there's no reason AMD can't play $794 as an opening bid, to capitalize on the buzz and make a point about being serious players. I don't see any reason for a serious difference in prices of motherboards: Intel's i7xxx Xeon chipsets are well-understood and perform well, but if anything, Opteron boards should be cheaper, since the chipset has fewer responsibilities. so for those of us looking at dual-CPU cluster bricks, $740 difference due to CPU price is a serious issue. AMD can either let the prices slide or bump up the clocks (for which AMD has already paid the SOI price, as well as the cost of a couple extra pipestages.) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From math at velocet.ca Tue Apr 22 23:46:23 2003 From: math at velocet.ca (Ken Chase) Date: Tue, 22 Apr 2003 23:46:23 -0400 Subject: back to the issue of cooling In-Reply-To: ; from rgb@phy.duke.edu on Tue, Apr 22, 2003 at 05:47:36PM -0400 References: <200304222326.27878.exa@kablonet.com.tr> Message-ID: <20030422234623.Z25523@velocet.ca> On Tue, Apr 22, 2003 at 05:47:36PM -0400, Robert G. Brown's all... >On Tue, 22 Apr 2003, Eray Ozkural wrote: > >> Has anybody calculated if the operation of a low-power cluster can amortize >> the actual price of the system in a couple of years? I'm thinking something >> like an apple cluster or something, might actually be viable! What about >> total cost/processing power of the cluster ? Depends on where you live. Canada is a pretty cheap place to waste power, tho Seattle is better than Toronto: http://www.bchydro.com/policies/rates/rates759.html (But then you might factor in real estate pricing >For things like apples, you may have to factor in increased sysadmin >costs, if you're not careful. Intel or Athlon clusters are pretty much >plug'n'play with multiple distributions and techniques, if you shop your >hardware at all carefully. Suddenly diskless vs non-diskless isn't just a management issue too - 15krpm drives can eat a fair bit of power. (We have power supplies in smaller servers that can't handle 2x 15Krpm + Dual p3). Start adding up to 30-40W+ per drive across 1000 nodes and you have a fair chunk of power. /kc > > rgb > >> >> Cheers, >> >> -- >> Eray Ozkural (exa) >> Comp. Sci.
Dept., Bilkent University, Ankara KDE Project: http://www.kde.org >> www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza >> GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C >> >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org >> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf >> > >Robert G. Brown http://www.phy.duke.edu/~rgb/ >Duke University Dept. of Physics, Box 90305 >Durham, N.C. 27708-0305 >Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Ken Chase, math at velocet.ca * Velocet Communications Inc. * Toronto, Canada Wiznet Velocet DSL.ca Datavaults 24/7: 416-967-4414 tollfree: 1-866-353-0363 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From timm at fnal.gov Wed Apr 23 09:26:38 2003 From: timm at fnal.gov (Steven Timm) Date: Wed, 23 Apr 2003 08:26:38 -0500 (CDT) Subject: back to the issue of cooling In-Reply-To: Message-ID: > On Tue, 22 Apr 2003, Eray Ozkural wrote: > > > Has anybody calculated if the operation of a low-power cluster can amortize > > the actual price of the system in a couple of years? I'm thinking something > > like an apple cluster or something, might actually be viable! What about > > total cost/processing power of the cluster ? > > This is the relevant measure, to be sure. Total cost of ownership with > a vengeance, per unit of work done, amortized over the life of a > cluster. $1 per watt per year for heating and cooling, add cost of > systems themselves, correct for SPEED of systems (ideally including your > Amdahl's law scaling hit for using more slower processors!). I predict > that the sweet spot is probably Athlon 2400's or possibly 2.4 GHz P4's > (depending on your code), with fine grained people shifted toward the > even higher end processors and with some room for Celerons or Durons on > the low end. I think the main reason to get transmetas is likely to be > to get the processing densities, not to save power or money. > The problem with the above calculation is that oftentimes the cost to get the electrical infrastructure into your facility in the first place is much, much greater than the cost of the electricity it delivers. We are spending $560K here at Fermilab to add 250 kVA of electrical capacity to our floor. We calculate that the cost of the electricity to run the machines over 3 years will only be $50K. We considered whether to actually put a weighting factor into our bids so that more electrically-efficient machines would be preferred, but then when we thought about it, we figured that (1) these machines are usually slower so you need more of them (2) they also use up more floor space which isn't free, and (3) within the same CPU speed class, those machines which use up the most electricity are also likely to be the ones with the biggest fans which are the best cooled internally. 
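Just to put those two numbers side by side, a quick sketch (Python; the ten-year amortization period is an assumption, the dollar figures are the ones quoted above):

  # Infrastructure vs. electricity for the buildout described above.
  infrastructure_cost = 560000.0   # $ to add 250 kVA of capacity
  capacity_kva = 250.0
  electricity_3yr = 50000.0        # $ of electricity over 3 years
  amortization_years = 10.0        # assumed useful life of the buildout
  print("infrastructure: $%.0f per kVA, ~$%.0f/year amortized"
        % (infrastructure_cost / capacity_kva,
           infrastructure_cost / amortization_years))
  print("electricity:    ~$%.0f/year" % (electricity_3yr / 3.0))

That works out to about $2240 per kVA of capacity, roughly $56K per year of buildout against roughly $17K per year of electricity at our current loading.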
Steve Timm _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Apr 23 09:51:41 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 23 Apr 2003 09:51:41 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: Message-ID: On Wed, 23 Apr 2003, Steven Timm wrote: > The problem with the above calculation is that oftentimes the cost > to get the electrical infrastructure into your facility in the > first place is much, much greater than the cost of the electricity > it delivers. We are spending $560K here at Fermilab to add 250 kVA > of electrical capacity to our floor. We calculate that the > cost of the electricity to run the machines over 3 years will only > be $50K. Indeed true. However, the infrastructure cost is amortized over a longer time, as well, and it varies strongly and nonlinearly from site to site, depending on how much capacity you already have "handy" (or how far they have to go to find a transformer with the capacity to deliver it). If you have 250 kVA worth of machines running in your machine room, and spend 8 cents or so a kVA-hour, then power and cooling combined for your room would cost about $250K a year (so that's a fairer measure of the running capacity you are purchasing, even if you don't use it right away). Amortized over ten years the cost of the renovation would be roughly $60-65K per year, including interest. That's still high -- you obviously had to install some BIG transformers to get the capacity you need or something -- but not insanely high. > We considered whether to actually put a weighting factor into our > bids so that more electrically-efficient machines would be preferred, > but then when we thought about it, we figured that (1) these > machines are usually slower so you need more of them (2) they > also use up more floor space which isn't free, and (3) within > the same CPU speed class, those machines which use up the most > electricity are also likely to be the ones with the biggest fans > which are the best cooled internally. Indeed and agreed. If you like, what matters is getting the most work done per dollar spent, regardless of how you get the work done. rgb > > Steve Timm > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From adamgood at linux-mag.com Wed Apr 23 12:19:32 2003 From: adamgood at linux-mag.com (Adam Goodman) Date: Wed, 23 Apr 2003 09:19:32 -0700 (PDT) Subject: ClusterWorld Conference & Expo Announcement Message-ID: Hello Everyone, We would really be thrilled to have any and all of you involved in our conference! We're excited to inform you that ClusterWorld Conference & Expo San Jose 2003 Registration is now open! You can register today for your FREE Exhibits Pass, or for one of our in-depth conference passes! Please use your SPECIAL PRIORITY CODE - BELOW when registering.
Just go to http://www.clusterworldexpo.com and click on "REGISTER NOW!" to sign up today!

ClusterWorld Conference and Expo
June 23 - 26, 2003
San Jose Convention Center
San Jose, CA
http://www.clusterworldexpo.com

Cluster technology is changing everything - from supercomputing to high-availability - and it's growing faster than any other segment of the market. ClusterWorld Conference & Expo stands at the very center of this amazing movement. At ClusterWorld Conference & Expo, you can:
* LEARN from top clustering experts from all industries in our extensive conference programs.
* EXPERIENCE the latest cluster technology from all the top vendors on our awesome expo floor.
* MEET AND NETWORK with colleagues from across the world of clustering at our numerous sponsored social events and parties.

*** Keynote Speakers ***
(Keynotes are open to all attendees)
John Picklo, Manager, High Performance Computing, DaimlerChrysler
John Reynders, Vice-President, Informatics, Celera Therapeutics
Jacobus N. Buur, Principal Research Physicist, Shell International Exploration and Production B.V.
Dr. Tilak Agerwala, Vice President, Systems, IBM Research

The ClusterWorld conference program was created in conjunction with the Linux Clusters Institute (LCI) and offers something for everyone interested in cluster technology. If you work with clusters in any capacity, ClusterWorld Conference & Expo is the one event you cannot afford to miss this year. Learn more at http://www.clusterworldexpo.com.

*** ClusterWorld Conference & Expo Sponsors ***
Platinum: HP and Intel
Gold: AMD, Dell, MSC Software, Myricom, RackSaver, RLX Technologies
Silver: APC - American Power Conversion, Appro International, Linux Networx, Microway, Penguin Computing, Promicro Systems, and Western Scientific
Media Sponsors: Dr. Dobbs Journal, GridToday, HPCwire, Linux Magazine, and Sys Admin

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Wed Apr 23 12:31:59 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Wed, 23 Apr 2003 11:31:59 -0500 Subject: back to the issue of cooling Message-ID: <3EA854CE@itsnt5.its.uiowa.edu> Transmeta quotes a TDP for their 1-GHz Crusoe of 7.5 watts. An Athlon XP at around twice the clock speed is around 10x that, at 75 watts, but at $0.05/kWh I agree that it is unlikely that you could ever find an operating cost that would be able to offset the greater cost and slower performance of the Crusoe. But the density that you could pack them would be incredible. If you were running so much cooler that less of a cooling system investment were required, that might change the equation. Or if there was ever a need for a highly mobile cluster system. You could pack a great number into a single box and carry it about, and because in theory 10 Crusoes would dissipate the heat of a single Athlon, you could perhaps easily cool many of them. Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gary at umsl.edu Wed Apr 23 14:23:19 2003 From: gary at umsl.edu (Gary Stiehr) Date: Wed, 23 Apr 2003 13:23:19 -0500 Subject: Serial Port Concentrators vs.
KVMs Message-ID: <3EA6DA17.6040100@umsl.edu> Hi, Having used both KVMs and serial port concentrators, I have my own opinions about the advantages and disadvantages of each. I was hoping that list members might share their opinions as well. My experience is with Belkin 8-port KVMs and with a Computone RAS2000 serial port concentrator. Here are some of my opinions; please feel free to add to the list or correct me if I'm wrong. In particular, any comments on scalability or some price comparisons would be interesting.

KVM Advantages
--------------
* Ease of Setup: usually you just run the keyboard/video/mouse cables to the KVM and then a set of keyboard/video/mouse cables from the KVM to some other node from which you can access the console for all of the nodes attached to the KVM. There usually is nothing that needs to be done with the OS (although I've heard of some BIOSes having problems but I've never experienced this). There is also usually nothing to set up with the KVM itself--just hook up the cables.

KVM Disadvantages
-----------------
* Lots of cables: Even if you do not use a mouse cable, you still have two cables running from the back of each node. I have heard of some KVMs lately that use an adapter to combine all three KVM cables into one. I have not actually seen or used one but that would certainly help.

* No remote access: The only KVM switches that I have seen with remote access are "enterprise" KVM switches that have a high price tag. I have no experience with this type of KVM switch but I would imagine it would be like a hybrid KVM/serial port concentrator.

Serial Port Concentrator (SPC) Advantages
-----------------------------------------
* Remote access: Most SPCs that I looked at listed remote access as a feature. And some, including the Computone RAS2000 that I use, allow you to access them via ssh.

* Fewer cables: You only need to run one cable from the back of each node (from the serial port) to the SPC.

* Multiple access methods: As noted above, you can access a lot of SPCs via the network. But if that is down, you can also access the SPC via a node that is attached via serial port to a special port on the SPC.

Serial Port Concentrator (SPC) Disadvantages
--------------------------------------------
* Need to set up the SPC itself: In my case, this wasn't too bad. Unfortunately, I would think that each vendor would have its own set of procedures to follow for the setup of its own SPC.

* Somewhat of a learning curve: If you have not had experience with serial ports (i.e., you know what they are but you've never done anything with them), there will be a lot of terms that are unfamiliar. You will also need to find out a lot of information about your hardware, OS and BIOS. For instance, what speed do they support (9600 baud, 115200 baud, etc.)? What terminal emulation do they support (vt100, vt102, ansi)? Is my serial port enabled in the BIOS? Which serial port is which (For Linux: /dev/ttyS0, /dev/ttyS1, etc.)? And so on.

* A significant number of small changes to OS: There are a number of changes that you need to make to the OS (in my case Linux) in order for the console messages to be sent to the serial port. Thanks to various how-tos and other docs, I was able to make all of the appropriate changes but a lot of them were not very obvious (although once you read about them you can see why it would be necessary).
* Must access the BIOS on each system: Unless your BIOS has serial port redirection enabled by default (if it has this feature at all), you will need to access each BIOS as you set the systems up (if you want to see console messages generated by the BIOS).

Thanks for reading this somewhat lengthy e-mail. I would appreciate your comments. Thank you, Gary Stiehr gary at umsl.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From henken at seas.upenn.edu Wed Apr 23 15:04:15 2003 From: henken at seas.upenn.edu (Nicholas Henke) Date: 23 Apr 2003 15:04:15 -0400 Subject: Serial Port Concentrators vs. KVMs In-Reply-To: <3EA6DA17.6040100@umsl.edu> References: <3EA6DA17.6040100@umsl.edu> Message-ID: <1051124655.7370.45.camel@roughneck.liniac.upenn.edu> On Wed, 2003-04-23 at 14:23, Gary Stiehr wrote: > > KVM Disadvantages > ----------------- > * Lots of cables: Even if you do not use a mouse cable, you still have > two cables running from the back of each node. I have heard of some > KVMs lately that use an adapter to combine all three kvm cables into > one. I have not actually seen or used one but that would certainly help. This gets to be a price issue too -- good KVM cables are darned expensive. > > * No remote access: The only KVM switches that I have seen with remote > access are "enterprise" KVM switches that have a high price tag. I have > no experience with this type of KVM switch but I would imagine it would > be like a hybrid KVM/serial port concentrator. > > Serial Port Concentrator (SPC) Advantages > ----------------------------------------- > * Remote access: Most SPCs that I looked at listed remote access as a > feature. And some, including the Computone RAS2000 that I use, allow > you to access the them via ssh. Best part -- never leave the comfort of your own desk :) Generally cheaper than KVM -- I am sure there is some knee point, but for clusters in the >24 node size, remote serial is the way to go. You will need some sort of KVM to access the nodes in the machine room, for those problems where hardware is !#@$-ed, or if BIOS redirection is not an option, as we have seen on some of our machines. Also, when using a nice package like conserver (conserver.com (free :)), you get logs of the console output -- absolutely critical for debugging oops output -- unless you like transcribing oops to notepad, and then back to a text file for ksymoops. Nic -- Nicholas Henke Penguin Herder & Linux Cluster System Programmer Liniac Project - Univ. of Pennsylvania _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Apr 23 17:15:18 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 23 Apr 2003 17:15:18 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: <3EA854CE@itsnt5.its.uiowa.edu> Message-ID: On Wed, 23 Apr 2003, jbassett wrote: > Transmeta quotes a TDP for their 1-Ghz Crusoe as 7.5 watts > > An Athlon XP at around twice the clock-speed is around 10* that at 75 watts > > but at .05$/kw*h I agree that it is unlikely that you could ever find an > operating cost that would be able to offset the greater cost and slower > performance of the Crusoe. But the density that you could pack them would be
If you were running so much cooler that less of a cooling system > investment were required that might change the equation. (People bored with blades can skip the following:-) Right, this is really their niche at the moment, especially in environments where installing them in high densities saves you from REALLY costly infrastructure investments or where space itself is just plain tight. Be careful about comparing raw clocks, though -- they are different architectures, with the transmeta according to Feng's own paper (Feng is the Green Destiny cluster guy) only delivering 1/3 to 1/2 the performance at equivalent clock to Athlons. I don't think he came CLOSE to systematically exploring system performance to get those numbers, but that's just me -- maybe I'm misreading. I'd like to see systematic e.g. specmarks, systematic lmbench's, stream, netpipes, and more, not just sqrt's; less emphasis on "fraction of peak" and more on wall-clock completion times. To put it another way, I don't think Feng's paper is a sound basis for would be cluster engineers trying to guestimate the performance of a bladed system on a given problem. This makes it very difficult to compare "theoretically" with competing designs (but of course that won't stop me below -- just take it with a grain of salt:-). Also be careful about comparing CPU-only numbers for e.g. power. The CPU is mounted on a card with memory, a hard disk (or two), a network interface, and a bus/backplane interface. All of these draw power. The power itself comes from a chassis with a power supply that gets hot while operating. What I looked for, but failed to find in Feng's paper, is the actual wall-plug power load of a 24 blade chassis running code flat out. If the chassis power supply capacity is any indication, it is probably more like 20 watts per blade (and maybe more, as some fraction of the "blade load" goes to running chassis electronics and heat dissipated by the chassis power supply). The only good way to find out is to stick a kill-a-watt between a blade chassis and the wall and read out its draw under a mix of loads. I don't think Feng did that (hard to tell from the paper, at any rate). I suspect he used published numbers for the CPU draw or the blade draw instead of measuring it himself but if he told WHAT he did I missed it. If we assume 20 W and compare it to the 85 full chassis load watts (or so) burned (per CPU) in a loaded dual Athlon at 1.6 GHz, then the transmeta gets 0.3 to 0.5 "Athlon GHz" (AGHz) (or 1/5 to 1/3 the performance) for 1/4 the power draw. Hmmm. Where is the big win here? Even if my power numbers are off by a power of two and a fully loaded blade burns only 10 W -- a number I'd doubt since NICs alone tend to burn 5 W and a blade has a NIC -- I'm not impressed, given the cost differential, because we have NOT YET CONSIDERED the scaling laws associated with parallelizing tasks themselves, which often strongly favor faster processors (i.e. faster processors on systems with faster busses can often be used to make clusters that scale near-linearly to far higher total performance and to far more processors). Nor have we considered micro-determinants of performance. How expensive is a context switch? How well does it manage cache and dataflow? How smoothly does it process interrupts so it can USE the NIC or disk? Is there an all-things-equal network latency hit of 3x or more relative to an Athlon? There might be (or not)...but barring a published measurement we won't know. 
I think a far more sophisticated analysis is called for to determine what the real performance/power scaling is PER NODE since the crux of the argument is whether more slower cooler processors are going to perform as well as fewer faster hotter processors. This is a problem dependent question, as has been a focus of the list forever, and might well be TRUE for one problem and FALSE for another. I was REALLY unimpressed by Feng's TCO argument, and especially by his analysis of the processor scaling laws that are limiting processor speedup and leading to an increase in power draw as Moore's law cranks along. First of all, those things are well known -- on chip or off chip, parallelism is a way to get better usage of chip real estate, as Ian Foster points out in a lot more convincing detail in his book on parallel program design. Second of all, Feng's proposed "solution": "quit using the `increasing frequency = increasing performance' marketing ploy" -- isn't a solution at all, it isn't even an argument -- it is raw polemic. What marketing ploy, and what does marketing ploy have to do with chip scaling laws? Increasing frequency DOES, visibly and obviously, increase performance on CPU bound problems, including mine, in a marvelously linear fashion. On the transmeta too, at a guess, just as it has for generations of in-family CPUs. Quantum jumps (relative to clock) occur when the chip is rearchitectured with more parallelism and finer scale, e.g. changing from 8 to 16 to 32 bit architectures, from no pipelines to several pipelines. These are the realities of CPU design, not marketing ploys. Second of all, he offers no argument at all, convincing or otherwise, for how using lots of cooler slower chips is going to actually beat the scaling laws he himself introduces (and the ones he omits). Foster does, in the explicit context of parallel task execution (so I'm familiar with them in a fair bit of detail) but Feng doesn't. A good argument would require him to account for various kinds of overhead, account for parallel scaling on tasks (where his argument OBVIOUSLY fails for a task that will run, fast, from memory on a single CPU with no IPC's but require lots of slow IPC's to run in parallel on two or more transmetas) and would inevitably restrict the classes of task that can be distributed cost-efficiently on the bladed architecture. It is NOT a "substitute" for the increased clock single CPU cycle, it is something different for solving different problems. And finally, there is the good old tanstaafl, which makes me suspicious of the whole line of argument from the beginning. Chip designers at Intel and AMD (and at Transmeta, for that matter) are not idiots. They are REALLY familiar with the chip real estate, clockspeed, parallelism scaling laws and have introduced a LOT of on-chip parallelism in part because of them. They are real experts on this and don't do stupid things and are all working with the same microscopic "components" on their VLSI layouts, trying to optimize a highly nonlinear cost-benefit function in truly creative ways. Their chip designs are genius, not dumb, expensive genius at that (up to order $billion per CPU generation foundry at this point?). RISC itself is something of a response to these laws, and Transmeta's architecture seems almost like "super-RISC" with a lot of code translation and pre-processing to conserve chip real estate. 
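To make the parallel-scaling point above slightly more concrete, a toy Amdahl's law comparison (Python; the 5% serial fraction and the per-CPU speeds are pure assumptions, chosen only to show the shape of the effect):

  # Toy Amdahl's law: few fast CPUs vs. many slow ones at equal aggregate "AGHz".
  def speedup(n, serial):
      return 1.0 / (serial + (1.0 - serial) / n)

  serial = 0.05                      # assumed: 5% of the work will not parallelize
  fast_n, fast_aghz = 16, 2.0        # e.g. 16 Athlon-class CPUs at ~2 AGHz each
  slow_n, slow_aghz = 64, 0.5        # e.g. 64 slow blades at ~0.5 AGHz each
  print("fast: %.1f AGHz delivered" % (fast_aghz * speedup(fast_n, serial)))
  print("slow: %.1f AGHz delivered" % (slow_aghz * speedup(slow_n, serial)))

Both configurations have the same 32 AGHz of aggregate raw clock, but on this assumed workload the many-slow-CPU farm delivers well under half as much (roughly 7.7 versus 18.3), and that is before any communication overhead is counted.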
Ultimately, winning the performance war requires either finding a really fundamentally different design that has different scaling laws or finding a niche market where the design you have (which may be a different emphasis or design focus of existing designs) can be successful. So far I don't see it, although I've seen some intriguing ideas kicked around. I'd be at least intrigued, for example, by an "8 processor motherboard" where 8 transmeta's were slotted right up on a very fast memory bus with a standard peripheral (PCI-x) interconnect. That would give you e.g. "8 transmeta GHz" on a system that drew roughly the same power as a P4 or Athlon in the 2 GHz range. Multiply by 0.4 (say) and perhaps it is competitive, and gets you out to decent performance without an ethernet interconnect, giving you better parallelism for certain classes of task. Transmeta's in PDA's are also very intriguing -- building a handheld device that can run for hours at high speed on a small battery is very cool indeed. THAT kind of (SMP) design in a mainstream mass market delivery would require a new kernel and a fundamentally parallel approach to programming, to make "happen". It might not make it -- lots of stuff on PC's is single threaded and CPU or memory I/O bound, and lots of CPUs competing for memory or trying to deliver a threaded task are a known headache. It would be interesting, though, especially if the design was modular and could be scaled up to 24, 48, 1024 processors. > Or if there was ever a need for a highly mobile cluster system. You could pack > a great number into a single box and carry it about and perhaps because in > theory 10 Crusoes would dissipate the heat of a single Athlon you could easily > cool many of them. Joseph Bassett Well, yes, unless you needed the single-threaded PERFORMANCE of a single Athlon. And remember, until that 10 way SMP motherboard for the Crusoe comes along, you're feeding the CPU, its own memory, its own disk, and a network (ten of each), and suddenly it isn't anything like 10 for one to the Athlon, more like four for one or even five for one, and when you multiply by the speed differential per clock, suddenly you're back dangerously close to where you started in BOTH FLOPS/Watt AND in absolute FLOPS, with now 10 CPUs to care for, feed, network, and program. The single AMD will run ANY application over the counter, no parallel programming required. Lower TCO? I think that's obvious. I'm not down on blades -- I think they have their niche and power/cooling/space starved environments are it. I don't think that they are even close to a cost/benefit win in most other environments, and not because of "marketing hype". I'm not selling anything; if anything I'm buying. Should I spend my (say) $15K on Crusoes in a blade configuration or on dual Athlons? Hmmm, I can afford just about 8 dual Athlon 2400+'s or just about 12 Crusoes (presuming $1K each by the time a chassis and so forth is thrown in). 16 Athlon CPUs buys me some 32 "Athlon GHz" (and costs me about $1300 a year in utility bills). The alternative gives me 12 "Transmeta GHz", where a TGHz is "worth" perhaps 0.5 AGHz in FLOPS, according to Feng's incomplete measurements. So it buys me roughly 6 AGHz, five times fewer, and costs me (heck, I'll GIVE you 10x less power) $150 year to run. I'd still need to spend $75,000 on transmetas, assuming my application scaled linearly to 60 transmetas at all, to equal the power of my (8 dual FF) 16 Athlons CPUs (assuming I'm still scaling linearly there, as well). 
My three-year power bill would be maybe $2000 less, but my overall bill would be $53,000 more for the Crusoes. In a lot of environments, I could buy brand new wiring, a dedicated air conditioner, and STILL get back enough change to travel business class to Australia going with the Athlons, especially if my goal is to feed 8 whole dual Athlons (ballpark of 170W each, 1400 watts to perhaps 2000 watts total consumption under load, one to two 20 Amp circuits, installable in most locations that have a bit of surplus capacity at the box for maybe $1000 tops even if they have to pull wire). If there is something wrong with this analysis, I'd be interested in hearing it. At $200/blade, blades would be a great deal from a TCO or cost/benefit perspective. At $400/blade they would be "interesting" and often competitive. At $1000/blade, they are a niche market only item, as I see it -- people who have $100K in renovation required otherwise to build a cluster, people who have an uncooled broom closet available as a "cluster room" and who inexplicably can STILL afford a Transmeta cluster in the first place. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Wed Apr 23 19:00:39 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Wed, 23 Apr 2003 18:00:39 -0500 Subject: Serial Port Concentrators vs. KVMs In-Reply-To: <3EA6DA17.6040100@umsl.edu> References: <3EA6DA17.6040100@umsl.edu> Message-ID: <3EA71B17.2010206@tamu.edu> Although in a different context, we use both. And an additional approach. We use KVM switches in my virtual network engineering lab environment when we're working in the lab and doing "console" access locally: Resetting/rebuilding systems, adding hardware and fine-tuning, etc. We use Concentrators for our remote access, although our application is a little different from most: Our Cyclades systems are isolated in a private network, and we have front-end systems to access them. This system has worked very well for our lab exercises. We can keep the number of connections and logins relatively low, which is one place the terminal concentrators fall down on: too long to log in, IMHO. We also use the generic equivalent of a Head node, which we refer to as a "Bastion" box. This allows ssh access to the sandbox area for terminal-type connections. The Bastion Host is accessible on ssh through our campus firewall. We use a series of restrictive rules, especially during our security classes (which are real, hacking, attack-defend classes) to prevent students from accessing a system not their own from the Bastion Host, and the Rules of the Game preclude use of the Bastion Host from attacking. In summary, we like all three approaches. There are times where the Bastion approach is best and we try to utilize it there. We like the Belkin 16 port KVMs when we've got to be in the room. We don't like the 8- or 4-ports as they are not cost effective. We like the serial concentrators when doing "serial console" tasks. If you reboot a system, the Bastion Host won't maintain the connection during the process, while the serial system will. Series of tradeoffs and we've tried to ascertain what works best for us...
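As an aside for anyone doing the "serial console" setup for the first time, the Linux-side changes Gary alludes to usually come down to a handful of lines. A rough sketch only -- device names, speeds and boot-loader details are assumptions here and vary with distribution, kernel and BIOS:

  # Kernel messages to the first serial port (lilo.conf shown; for GRUB,
  # put the console= options on the kernel line instead):
  append="console=ttyS0,9600n8 console=tty0"

  # A login on the serial line, in /etc/inittab:
  S0:2345:respawn:/sbin/agetty -L 9600 ttyS0 vt100

  # Let root log in there:
  echo ttyS0 >> /etc/securetty

BIOS-level redirection, where the board supports it at all, still has to be enabled in each machine's BIOS setup by hand.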
Gerry Gary Stiehr wrote: > Hi, > Having used both KVMs and serial port concentrators, I have my own > opinions about the advantages and disadvantes of each. I was hoping > that list members might share their opinions as well. My experience is > with Belkin 8-port KVMs and with a Computone RAS2000 serial port > concentrator. Here are some of my opinions, please feel free to add to > the list or correct me if I'm wrong. In particular, any comments on > scalability or some price comparisons would be interesting. > > KVM Advantages > -------------- > * Ease of Setup: usually you just run the keyboard/video/mouse cables to > the KVM and then a set of keyboard/video/mouse cables from the KVM to > some other node from which you can access the console for all of the > nodes attached to the KVM. There usually is nothing that needs to be > done with the OS (although I've heard of some BIOSes having problems but > I've never experienced this). There is also usually nothing to set up > with the KVM itself--just hook up the cables. > > KVM Disadvantages > ----------------- > * Lots of cables: Even if you do not use a mouse cable, you still have > two cables running from the back of each node. I have heard of some > KVMs lately that use an adapter to combine all three kvm cables into > one. I have not actually seen or used one but that would certainly help. > > * No remote access: The only KVM switches that I have seen with remote > access are "enterprise" KVM switches that have a high price tag. I have > no experience with this type of KVM switch but I would imagine it would > be like a hybrid KVM/serial port concentrator. > > Serial Port Concentrator (SPC) Advantages > ----------------------------------------- > * Remote access: Most SPCs that I looked at listed remote access as a > feature. And some, including the Computone RAS2000 that I use, allow > you to access the them via ssh. > > * Less cables: You only need to run one cable from the back of each node > (from the serial port) to the SPC. > > * Multiple access methods: As noted above, you can access a lot of SPCs > via the network. But if that is down, you can also access the SPC via a > node that is attached via serial port to a special port on the SPC. > > Serial Port Concentrator (SPC) Disadvantages > -------------------------------------------- > * Need to set up the SPC itself: In my case, this wasn't too bad. > Unfortunately, I would think that each vendor would have its own set of > procedures to follow for the setup of its own SPC. > > * Somewhat of a learning curve: If you have not had experience with > serial ports (i.e., you know what they are but you've never done > anything with them), there will be a lot of terms that are unfamiliar. > You will also need to find out a lot of information about your hardware, > OS and BIOS. For instance, what speed do they support (9600 baud, > 115200 baud, etc.)? What terminal emulation do they support (vt100, > vt102, ansi)? Is my serial port enabled in the BIOS? Which serial port > is which (For Linux: /dev/ttyS0, /dev/ttyS1, etc.)? And so on. > > * A significant number of small changes to OS: There are a number of > changes that you need to make to the OS (in my case Linux) in order for > the console messages to be sent to the serial port. Thanks to various > how-tos and other docs, I was able to make all of the appropriate > changes but a lot of them were not very obvious (although once you read > about them you can see why it would be necessary). 
> > * Must access the BIOS on each system: Unless your BIOS has serial port > redirection enabled by default (if it has this feature at all), you will > need to access each BIOS as you set the systems up (if you want to see > console messages generated by the BIOS). > > > Thanks for reading this somewhat lengthy e-mail. I would appreciate > your comments. > > Thank you, > Gary Stiehr > gary at umsl.edu > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From award at andorra.ad Thu Apr 24 01:08:42 2003 From: award at andorra.ad (Alan Ward) Date: Thu, 24 Apr 2003 07:08:42 +0200 Subject: back to the issue of cooling References: Message-ID: <3EA7715A.44EFAA24@andorra.ad> I tend to think Transmeta and other low-power CPUs belong on the desktop, so you can run them without the noisy fans (and they don't heat up the air). Alan Ward Robert G. Brown ha escrit: (big snip) > I'm not down on blades -- I think they have their niche and > power/cooling/space starved environments are it. I don't think that (other big snip) > rgb > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gmpc at sanger.ac.uk Thu Apr 24 05:03:41 2003 From: gmpc at sanger.ac.uk (Guy Coates) Date: Thu, 24 Apr 2003 10:03:41 +0100 (BST) Subject: Serial Port Concentrators vs. KVMs In-Reply-To: <200304231901.h3NJ1Vs17842@NewBlue.Scyld.com> References: <200304231901.h3NJ1Vs17842@NewBlue.Scyld.com> Message-ID: The other important aspects are logging and automation. If you use serial port concentrators then you can use the wonders of Conserver (http://www.conserver.com/) to manage and log the console output, allowing you to capture your kernel panics in their full glory. You can script against serial consoles, something that you can't do with KVM. The case we like to taunt our prospective hardware vendors with is: "How do I change the bios settings of all of the machines in a 200 node cluster?" Assuming serial access to the bios, you can automate that process with expect. With KVM (or the horribly broken VNC "remote management" solutions some manufacturers seem so keen on) you have to do it by hand.
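For the curious, the sort of script I mean looks roughly like this -- everything in it is a placeholder (terminal-server host name, TCP port, banner text and keystrokes all depend on your particular kit), so treat it purely as a sketch:

  #!/usr/bin/expect -f
  # Drive one node's redirected BIOS console through a terminal-server port.
  # Run it once per node from a shell loop; the first argument is the TCP port.
  set port [lindex $argv 0]
  set timeout 120
  spawn telnet console-server $port    ;# placeholder host name
  expect "to enter SETUP"              ;# assumed banner from the BIOS redirect
  send "\x1b\[12~"                     ;# e.g. an F2 escape sequence -- board specific
  expect "Exit Saving Changes"         ;# assumed menu text
  send "\r"
  expect eof

The real value is that the same dialogue, with the right prompts filled in, runs unattended across all 200 nodes.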
Cheers, Guy Coates -- Guy Coates, Informatics System Group The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK Tel: +44 (0)1223 834244 ex 7199 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jahia at mail.umesd.k12.or.us Thu Apr 24 05:34:41 2003 From: jahia at mail.umesd.k12.or.us (Jim Ahia) Date: Thu, 24 Apr 2003 02:34:41 -0700 Subject: beowulf in space Message-ID: As I was reading this thread, some things came to mind that might add to the discussion: 1 ) although Dells and Gateways are too heavy to lift into orbit, pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class motherboards with a single 5v power requirement make things much smaller. It is completely possible to have each node fit into the space of a half-height CD-ROM drive. Can anyone say "cluster in one box"? 2 ) Has anyone yet mentioned the possibility of mesh networks using 802.11 for robotics clustering? Such networks of robots might make site construction, ship construction, and mining feasible. Mining the surface of the moon is well documented to provide hydrogen, oxygen, aluminum, silica, and titanium. Launching fuel & materials for spacecraft to an orbital construction facility might make more sense than the billions we are spending now, if the mine, transport, and construction are largely carried out by robotics under the oversight of a resident cluster with ground-based monitoring. Using a similar swarm of robots for site construction on mars prior to human arrival can have a major impact on mission success. If all robots use identical motion base and cpu, then 2 broken bots can be cannibalized to return one working bot to service. If all of the robots that are currently recharging batteries are added to the cluster as mains-connected nodes, then a cluster of sorts is in effect to speed control processing of the 'hive'. This is assuming that the central site has the main power supply system online, be it solar, nuc, whatever. -Jim Ahia -makenamicro at charter.net _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jhearns at freesolutions.net Thu Apr 24 10:44:42 2003 From: jhearns at freesolutions.net (John Hearns) Date: 24 Apr 2003 15:44:42 +0100 Subject: beowulf in space In-Reply-To: References: Message-ID: <1051195484.16295.22.camel@harwood.home> On Thu, 2003-04-24 at 10:34, Jim Ahia wrote: > As I was reading this thread, some things came to mind that might add to > the discussion: > > 1 ) although Dells and Gateways are too heavy to lift into orbit, > pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class > motherboards with a single 5v power requirement make things much > smaller. It is completely possible to have each node fit into the space > of a half-height CD-ROM drive. Can anyone say "cluster in one box"? A PC-104 cluster has been constructed at Sandia: http://eri.ca.sandia.gov/eri/howto.html BTW, one of the UK Sunday newspapers recently carried a magazine article on Sandia, and the projects going on there. What an interesting place. I think we've all heard about the fun computing things at Sandia, but this article talked about other things like the very high powered laser they have. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Thu Apr 24 11:28:12 2003 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Thu, 24 Apr 2003 16:28:12 +0100 Subject: Serial Port Concentrators vs. KVMs References: <3EA6DA17.6040100@umsl.edu> <3EA71B17.2010206@tamu.edu> Message-ID: <012d01c30a76$1ebc6e30$04a8a8c0@spot> > > Having used both KVMs and serial port concentrators, I have my own > > opinions about the advantages and disadvantes of each. I was hoping > > that list members might share their opinions as well. Traditionally we have used serial lines for all large clusters, often with a small (say 4- or 8-way) KVM. The KVM gives us access to the management node(s) (and hence various X11 based tools), other servers (e.g. raid controllers, QsNet switches), together with a spare connection or two. The spare connections are used only when we have odd 'problem nodes' or when setting the BIOS to serial for the first time. IMHO Linux based serial line concentrators like Cyclades et al. are particularly easy to set up and maintain. However the current trend is for all new rack-mount nodes to offer some sort of BMC (baseboard management controller) with an ethernet connection. As well as giving remote power cycling, this should allow 'Serial over LAN' - the headnode in the cluster acts as a server and you can telnet/ssh to it on a certain port and hence reach the console on any compute node. As a result, the latest cluster we have got has neither a KVM nor a serial port concentrator. Yours, Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com -------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From cozzi at nd.edu Thu Apr 24 12:41:15 2003 From: cozzi at nd.edu (Marc Cozzi) Date: Thu, 24 Apr 2003 11:41:15 -0500 Subject: Cluster World Expo Message-ID: Someone posted http://www.clusterworldexpo.com here a day or so ago. Have (m)any of you been to the Cluster World Expo before? Any comments on the value of this meeting versus other similar conferences would be appreciated. I've built a few Intel clusters and maintain a few. I would be primarily interested in support tools and cluster software installation tools. It looks like they have BOF sessions that I would most likely benefit from. thanks --marc _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bari at onelabs.com Thu Apr 24 12:57:20 2003 From: bari at onelabs.com (Bari Ari) Date: Thu, 24 Apr 2003 11:57:20 -0500 Subject: back to the issue of cooling References: <3EA854CE@itsnt5.its.uiowa.edu> Message-ID: <3EA81770.9000802@onelabs.com> jbassett wrote: >Transmeta quotes a TDP for their 1-Ghz Crusoe as 7.5 watts > >An Athlon XP at around twice the clock-speed is around 10* that at 75 watts > >but at .05$/kw*h I agree that it is unlikely that you could ever find an >operating cost that would be able to offset the greater cost and slower >performance of the Crusoe.
But the density that you could pack them would be >incredible. If you were running so much cooler that less of a cooling system >investment were required that might change the equation. > >Or if there was ever a need for a highly mobile cluster system. You could pack >a great number into a single box and carry it about and perhaps because in >theory 10 Crusoes would dissipate the heat of a single Athlon you could easily >cool many of them. Joseph Bassett > > > Density is not a problem using 75W Athlon's or Xeon's. You can stuff 8-16 Athlon's or Xeon's into a 16.5" x 25" x 1.7" 1U box. Heat is transferred away from them (cpu, memory, chipset) using a combination of conduction cooling techniques and heat pipes in the case tied to a "heatbus" outside the case. The heatbus can be (depending on the heat generated) a large highly profiled heatsink requiring forced air convection, heat exchange coil (evaporator) or combination of the two. What's more of a limiting factor in tightly packing cpu's is the distance required between cpu and chipset and also chipset to memory that eats up board space. There are very tight PCB routing rules that limit how closely devices can be spaced. 1" - 1.5" min. is common. There's lots of talk about very dense systems but nobody ever really wants them. Everyone wants clusters with COTS motherboards and enclosures. They then rely on forced air cooling through slots in the enclosures and then cool the room air down with A/C. --Bari Ari _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu Apr 24 16:43:49 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 24 Apr 2003 13:43:49 -0700 Subject: beowulf in space In-Reply-To: Message-ID: <5.1.0.14.2.20030424133401.030443d8@mailhost4.jpl.nasa.gov> At 02:34 AM 4/24/2003 -0700, Jim Ahia wrote: >As I was reading this thread, some things came to mind that might add to >the discussion: > >1 ) although Dells and Gateways are too heavy to lift into orbit, >pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class >motherboards with a single 5v power requirement make things much >smaller. It is completely possible to have each node fit into the space >of a half-height CD-ROM drive. Can anyone say "cluster in one box"? I have seen PC104 stuff being used in prototypes, but for space applications, they prefer a more robust packaging. cPCI is showing some signs of popularity, as is the venerable VME. ESA has funded and is flying quite a lot of stuff that is essentially single board computers interconnected with high speed serial links. >2 ) Has anyone yet mentioned the possibility of mesh networks using >802.11 for robotics clustering? Such networks of robots might make site >construction, ship construction, and mining feasible. There is a huge amount of this kind of work going on at JPL: cooperative robotics. Take a look at the JPL planetary robotics web site http://prl.jpl.nasa.gov/ However, to my knowledge, they're not doing much cluster computing. >Mining the surface of the moon is well documented to provide hydrogen, >oxygen, aluminum, silica, and titanium. Uhhhhh... yes, in the sense that the moon is made of rock, which is made of hydrogen, oxygen, aluminum, etc. Turning rock into metal is a non-trivial process, even on Earth where there are literally millenia of history for the process. 
>Using a similar swarm of robots for site construction on mars prior to >human arrival can have a major impact on mission success. > >If all robots use identical motion base and cpu, then 2 broken bots can >be cannibalized to return one working bot to service. Of course, this means that the robots have to be a lot smarter and more capable because not only do they have to do their primary job, they also have to be "dismantleable", and have the ability to dismantle things. While this is feasible, in an abstract sense, it might not be worth it; you might be able to spend the resources you'd spend on providing that additional capability on just buying more simpler robots in the first place.. Hmmm.. kind of like buying a bunch of generic commodity computers instead of one big specialized computer to do a particular job... I'll also point out that it's a pretty big job just to design and build rovers that drive and make a few measurements, much less do construction work, smelt metal, do scrap recovery, etc. The Mars Exploration Rovers are quite capable as far as spacecraft go, but weren't particularly easy or cheap to develop, and are hardly a production line item, nor are they likely to be anytime in the next couple decades. James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Thu Apr 24 17:07:52 2003 From: becker at scyld.com (Donald Becker) Date: Thu, 24 Apr 2003 17:07:52 -0400 (EDT) Subject: Cluster World Expo In-Reply-To: Message-ID: On Thu, 24 Apr 2003, Marc Cozzi wrote: > Someone posted http://www.clusterworldexpo.com here a day > or so ago. > Have (m)any of you been to the Cluster World Expo before? It's a new show combined with an older conference. The focus is more on deployed and end-to-end use of clusters and cluster applications, rather than algorithm or theory oriented conferences. Thanks to the hard work of Adam Goodman, it's shaping up to be a really good event. (Adam is the well-known editor of Linux Magazine -- if you have been to a Linux conference in the U.S., you have seen Adam.) > Any comments on the value of this meeting versus other similar > conferences would be appreciated. I've built a few Intel > clusters and maintain a few. I would be primarily interested > in support tools and cluster software installation tools. > It looks like they have BOF sessions that I would most > likely benefit from. -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 410 Severn Ave. Suite 210 Scyld Beowulf cluster system Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rodmur at maybe.org Thu Apr 24 18:24:44 2003 From: rodmur at maybe.org (Dale Harris) Date: Thu, 24 Apr 2003 15:24:44 -0700 Subject: Serial Port Concentrators vs. 
KVMs In-Reply-To: <012d01c30a76$1ebc6e30$04a8a8c0@spot> References: <3EA6DA17.6040100@umsl.edu> <3EA71B17.2010206@tamu.edu> <012d01c30a76$1ebc6e30$04a8a8c0@spot> Message-ID: <20030424222444.GY9122@maybe.org> On Thu, Apr 24, 2003 at 04:28:12PM +0100, Dan Kidger elucidated: > However the current trend is for all new rack-mount nodes to offer some sort > of BMC (baseboard management controller) with an ethernet connection. As > well as giving remote power cycling, this should allow 'Serial_over_Lan" - Course the thing I wonder about that is then would seem to loose some redundancy, unless you have a separate network setup to run the serial over LAN. Basically stuff like Cyclades is out of band from your administrative network. Course I guess if a switch blows up, there may not be any particular need to access a node. -- Dale Harris rodmur at maybe.org /.-) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Duclam80 at gmx.net Thu Apr 24 19:38:04 2003 From: Duclam80 at gmx.net (Vu Duc Lam) Date: Fri, 25 Apr 2003 06:38:04 +0700 Subject: Mom Config Message-ID: <004f01c30abb$46e300f0$1a3afea9@conan> Hi, I want to install OpenPBS in a cluster with 16 nodes IBM PC and 1 front-end machine HP Server. I want to use FIFO scheduler. So could any one can give me the detail of mom config file, scheduler config file which I can use to configure the cluster. I try serveral times to config these file according to PBS admin administration document but when I submit a Job, I receive a message error:"Job execeeds queue resources limit" although i set a resources resquest for a job as small as posible. Thanks for yours help. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Thu Apr 24 20:35:32 2003 From: raysonlogin at yahoo.com (Rayson Ho) Date: Thu, 24 Apr 2003 17:35:32 -0700 (PDT) Subject: Mom Config In-Reply-To: <004f01c30abb$46e300f0$1a3afea9@conan> Message-ID: <20030425003532.34945.qmail@web11407.mail.yahoo.com> I would strongly suggest any new batch system installations to start with GridEngine. It is "more opensource" than OpenPBS, easier to install, easier to use, more features, and more friendly developers. http://gridengine.sunsource.net/ If you are in doubt, read the thread "sun grid engine?": http://www.beowulf.org/pipermail/beowulf/2003-March/date.html Rayson --- Vu Duc Lam wrote: > Hi, > > I want to install OpenPBS in a cluster with 16 nodes IBM PC and 1 > front-end > machine HP Server. I want to use FIFO scheduler. So could any one can > give > me the detail of mom config file, scheduler config file which I can > use to > configure the cluster. I try serveral times to config these file > according > to PBS admin administration document but when I submit a Job, I > receive a > message error:"Job execeeds queue resources limit" although i set a > resources resquest for a job as small as posible. Thanks for yours > help. > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. 
Bingo http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From m053546 at usna.edu Thu Apr 24 20:48:42 2003 From: m053546 at usna.edu (MIDN Sean Jones) Date: 24 Apr 2003 20:48:42 -0400 Subject: beowulf in space In-Reply-To: References: Message-ID: <1051231726.1859.9.camel@Eagle.mid4.usna.edu> For reference the United States Naval Academy is putting up a PowerPC 405 SoC in PC/104 form factor up as the Command and Data Handling System of the MidSTAR I satellite slated for launch in March 2006. Sean Jones MIDN USN MidSTAR C&DH Lead Armada Cluster Asst. Admin On Thu, 2003-04-24 at 05:34, Jim Ahia wrote: > As I was reading this thread, some things came to mind that might add to > the discussion: > > 1 ) although Dells and Gateways are too heavy to lift into orbit, > pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class > motherboards with a single 5v power requirement make things much > smaller. It is completely possible to have each node fit into the space > of a half-height CD-ROM drive. Can anyone say "cluster in one box"? > > 2 ) Has anyone yet mentioned the possibility of mesh networks using > 802.11 for robotics clustering? Such networks of robots might make site > construction, ship construction, and mining feasible. > > Mining the surface of the moon is well documented to provide hydrogen, > oxygen, aluminum, silica, and titanium. Launching fuel & materials for > spacecraft to an orbital construction facility might make more sense > than the billions we are spending now, if the mine, transport, and > construction are largely carried out by robotics under the oversight of > a resident cluster with ground-based monitoring. > > Using a similar swarm of robots for site construction on mars prior to > human arrival can have a major impact on mission success. > > If all robots use identical motion base and cpu, then 2 broken bots can > be cannibalized to return one working bot to service. > > If all of the robots that are currently recharging batteries are added > to the cluster as mains-connected nodes, then a cluster of sorts is in > effect to speed control processing of the 'hive'. This is assuming that > the central site has the main power supply system online, be it solar, > nuc, whatever. 
> > -Jim Ahia > -makenamicro at charter.net > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- ============================================================================== /\ | Sean Jones / \ _ __ __| __ MIDN USN /====\ |/ \ /\/\ / | / | / | m053546 at usna.edu / \ | | | \_/| \_/| \_/| United States Naval Academy Annapolis, MD 21412 ============================================================================== _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dvancon at alineos.com Fri Apr 25 03:28:01 2003 From: dvancon at alineos.com (=?ISO-8859-1?Q?Dominique_Van=E7on?=) Date: Fri, 25 Apr 2003 09:28:01 +0200 Subject: AMD Opteron benchmarks Message-ID: <3EA8E381.6020202@alineos.com> Hi All, we performed some benchmarks on AMD Opteron 1400 (also Intel XEON, Itanium2 900, AMD MP, Apple and Alpha processors) : http://www.alineos.com/benchs_eng.html We also could make some comments about these tests, so feel free to contact. -- Dominique Van?on | http://www.alineos.com mailto:dvancon at alineos.com | tel/fax +33 1 64 78 57 65/66 ALINEOS SA, 14 bis rue du Mar?chal Foch, F-77780 Bourron Marlotte France _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Fri Apr 25 03:59:59 2003 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Fri, 25 Apr 2003 08:59:59 +0100 Subject: Serial Port Concentrators vs. KVMs References: <3EA6DA17.6040100@umsl.edu> <3EA71B17.2010206@tamu.edu> <012d01c30a76$1ebc6e30$04a8a8c0@spot> <20030424222444.GY9122@maybe.org> Message-ID: <018201c30b01$65998aa0$04a8a8c0@spot> > On Thu, Apr 24, 2003 at 04:28:12PM +0100, Dan Kidger elucidated: > > However the current trend is for all new rack-mount nodes to offer some sort > > of BMC (baseboard management controller) with an ethernet connection. As > > well as giving remote power cycling, this should allow 'Serial_over_Lan" - > > Course the thing I wonder about that is then would seem to loose some > redundancy, unless you have a separate network setup to run the serial > over LAN. Basically stuff like Cyclades is out of band from your > administrative network. Course I guess if a switch blows up, there may > not be any particular need to access a node. You do not necessirly lose any reduncancy.. Compaq nodes have a extra ethernet socket for the BMC (Hence serial_over_lan). You then can use a seperate ethernet hub in place of the Cyclades, which is of course much cheaper (you could even recycle an old 10Mbit hub from the cupboard). Alternatively the Intel MoBo's like E7501 hijack the same physical ethernet socket for the BMC. This may lose a little redundancy (if for example a cable falls out), but halves the amount of cat5 cabling needed Yours, Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd. 
daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com -------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jahia at mail.umesd.k12.or.us Fri Apr 25 01:31:21 2003 From: jahia at mail.umesd.k12.or.us (Jim Ahia) Date: Thu, 24 Apr 2003 22:31:21 -0700 Subject: beowulf in space Message-ID: So the radiation concerns with rad-hardened computer equipment are not as much of a problem once clear of the Van Allen Radiation Belt? How does this affect the space station and the planned missions to mars? What about the lunar environment? I admit to having a lot of ignorance on this subject, but I am concerned because part of my college project is for robotic teams to do excavation / mining using a "hive" concept. The end result is to get more information on the challenges that will be faced by the robotic workers that are eventually sent to the moon first, and to mars second. I am not speaking about the exploration missions by NASA, but rather the much-farther-down-the-road commercial mining interests that will want to build a foundry on the moon and a spacedock in earth orbit prior to the big colonization push into our solar system. I believe it is going to happen someday, because we already know that eventually our sun will go nova and earth will be no longer habitable. Sooner or later mankind, if it is to survive, will need to undergo some kind of diaspora and migrate out into space. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Apr 25 00:32:13 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Thu, 24 Apr 2003 23:32:13 -0500 Subject: beowulf in space In-Reply-To: <1051231726.1859.9.camel@Eagle.mid4.usna.edu> References: <1051231726.1859.9.camel@Eagle.mid4.usna.edu> Message-ID: <3EA8BA4D.9070104@tamu.edu> One of the things I established when I was working on the old Space Station Freedon, in the early '90s, is that the space-rated CPUs, less the issues with radiation hardening and single-event upset recovery, were hardly different from good CPUs. What we discovered was that the MIL-SPEC components differed little from the "industrial-grade" components, save in the degree of paperwork delivered with the device. And the costs. Thus, we drove toward the use of the lower-cost, similar quality Industrial-Grade devices. Now, for low- and mid-earth-orbit altitudes, the radiation environment is pretty harsh. One should be cognizant of that environment, and model the potential for radiation induced transient problems. If you're not ready for transient failures, and at that, failures that may or may not heal (aneal), you shouldn't use non-radiation hardened, commercial, processors. I've not looked at the specs for rad-hardening and SEU performance. If it's a commercial- as opposed to an industrial-grade processor, I'd not be too sure of reliability, either, although those specs have come up markedly over the last 10 years. gerry MIDN Sean Jones wrote: > For reference the United States Naval Academy is putting up a PowerPC > 405 SoC in PC/104 form factor up as the Command and Data Handling System > of the MidSTAR I satellite slated for launch in March 2006. 
> > Sean Jones > MIDN USN > > MidSTAR C&DH Lead > Armada Cluster Asst. Admin > > On Thu, 2003-04-24 at 05:34, Jim Ahia wrote: > >>As I was reading this thread, some things came to mind that might add to >>the discussion: >> >>1 ) although Dells and Gateways are too heavy to lift into orbit, >>pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class >>motherboards with a single 5v power requirement make things much >>smaller. It is completely possible to have each node fit into the space >>of a half-height CD-ROM drive. Can anyone say "cluster in one box"? >> >>2 ) Has anyone yet mentioned the possibility of mesh networks using >>802.11 for robotics clustering? Such networks of robots might make site >>construction, ship construction, and mining feasible. >> >>Mining the surface of the moon is well documented to provide hydrogen, >>oxygen, aluminum, silica, and titanium. Launching fuel & materials for >>spacecraft to an orbital construction facility might make more sense >>than the billions we are spending now, if the mine, transport, and >>construction are largely carried out by robotics under the oversight of >>a resident cluster with ground-based monitoring. >> >>Using a similar swarm of robots for site construction on mars prior to >>human arrival can have a major impact on mission success. >> >>If all robots use identical motion base and cpu, then 2 broken bots can >>be cannibalized to return one working bot to service. >> >>If all of the robots that are currently recharging batteries are added >>to the cluster as mains-connected nodes, then a cluster of sorts is in >>effect to speed control processing of the 'hive'. This is assuming that >>the central site has the main power supply system online, be it solar, >>nuc, whatever. >> >>-Jim Ahia >>-makenamicro at charter.net >>_______________________________________________ >>Beowulf mailing list, Beowulf at beowulf.org >>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf >> -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 25 10:14:30 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 25 Apr 2003 10:14:30 -0400 (EDT) Subject: beowulf in space In-Reply-To: Message-ID: On Thu, 24 Apr 2003, Jim Ahia wrote: > So the radiation concerns with rad-hardened computer equipment are not > as much of a problem once clear of the Van Allen Radiation Belt? How >From what I've read, the main concern is solar activity. The sun can relatively suddenly decide to spew significantly higher levels of radiation our way. When this happens it least appears that space can be quite dangerous, and it can actually mess up the ionosphere and radiotransmission all the way down here. We lack an adequate baseline for proper measurement and comparison or prediction, but it wouldn't horribly surprise me if at least some events get to the point where radiation levels on the surface reach mutogenetic levels. An "interesting" possibility that might explain the relatively sudden emergence of new species, for example. However, the ones we've seen are enough to confirm the potential risk. 
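Gerry's suggestion a few messages back to "model the potential for radiation induced transient problems" can be made roughly quantitative with a back-of-the-envelope Poisson estimate. A minimal sketch in Python; the per-node upset rate below is a placeholder invented purely for illustration, since real rates depend on orbit, shielding, process and solar activity:

    # Toy single-event-upset (SEU) model for a small unhardened cluster,
    # treating upsets as a Poisson process.  The rate is hypothetical.
    import math

    upsets_per_node_day = 0.01   # ASSUMED: one upset per node per 100 days
    nodes = 16
    mission_days = 365

    expected = upsets_per_node_day * nodes * mission_days  # Poisson mean
    p_clean = math.exp(-expected)   # probability of zero upsets all mission

    print("expected upsets over the mission: %.1f" % expected)
    print("probability of zero upsets      : %.2e" % p_clean)
    # Even a modest per-node rate makes cluster-wide upsets a near
    # certainty, so checkpointing and task retry matter at least as
    # much as the hardening of any single node.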
> I am not speaking about the exploration missions by NASA, but rather > the much-farther-down-the-road commercial mining interests that will > want to build a foundry on the moon and a spacedock in earth orbit prior > to the big colonization push into our solar system. I believe it is > going to happen someday, because we already know that eventually our sun > will go nova and earth will be no longer habitable. Sooner or later > mankind, if it is to survive, will need to undergo some kind of diaspora > and migrate out into space. Ah, a far thinking person, I see. Time to start planning for the big implosion already? Mankind will not survive. Whatever it is that is around when the sun goes nova (if anything) will resemble man as man resembles small furry rodents, at least if they are our descendants. The event isn't due for a rather long time;-) rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dtj at uberh4x0r.org Fri Apr 25 11:05:07 2003 From: dtj at uberh4x0r.org (Dean Johnson) Date: 25 Apr 2003 10:05:07 -0500 Subject: beowulf in space In-Reply-To: References: Message-ID: <1051283107.27185.12.camel@terra> On Fri, 2003-04-25 at 00:31, Jim Ahia wrote: > So the radiation concerns with rad-hardened computer equipment are not > as much of a problem once clear of the Van Allen Radiation Belt? How > does this affect the space station and the planned missions to mars? > What about the lunar environment? I admit to having a lot of ignorance > on this subject, but I am concerned because part of my college project > is for robotic teams to do excavation / mining using a "hive" concept. > The end result is to get more information on the challenges that will be > faced by the robotic workers that are eventually sent to the moon first, > and to mars second. > > I am not speaking about the exploration missions by NASA, but rather > the much-farther-down-the-road commercial mining interests that will > want to build a foundry on the moon and a spacedock in earth orbit prior > to the big colonization push into our solar system. I believe it is > going to happen someday, because we already know that eventually our sun > will go nova and earth will be no longer habitable. Sooner or later > mankind, if it is to survive, will need to undergo some kind of diaspora > and migrate out into space. > In terms of setting up mining operations on other celestial bodies, cpu stability and radiation protection are amongst the least of your worries. What will be needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical issues are largely tractable, one way or another, but the institutional and international issues will be like herding cats on crack. -Dean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dtj at uberh4x0r.org Fri Apr 25 11:07:30 2003 From: dtj at uberh4x0r.org (Dean Johnson) Date: 25 Apr 2003 10:07:30 -0500 Subject: beowulf in space In-Reply-To: References: Message-ID: <1051283250.27174.14.camel@terra> On Fri, 2003-04-25 at 09:14, Robert G. Brown wrote: > Mankind will not survive. 
Whatever it is that is around when the sun > goes nova (if anything) will resemble man as man resembles small furry > rodents, at least if they are our descendants. I resemble that remark! Okay, maybe a big furry rodent. -- -Dean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Fri Apr 25 11:03:05 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Fri, 25 Apr 2003 09:03:05 -0600 Subject: beowulf in space In-Reply-To: <3EA8BA4D.9070104@tamu.edu> References: <1051231726.1859.9.camel@Eagle.mid4.usna.edu> <3EA8BA4D.9070104@tamu.edu> Message-ID: <20030425150305.GB21431@plk.af.mil> I should also mention that there is a very large drive for radiation hardening by design. There is work in the literature indicating that device redesign (annular transistors, for example, to handle total dose effects, and circuit redesign for SEU, SET) can lead to strategic hardening from commercial foundries. This is for digital logic circuits. The disadvantage is that these design techniques always lead to degradation of circuit density and of performance. However, cost should be dramatically improved. An issue brought up by Jim Lyke but not addressed elsewhere is cooling. Recall that clusters generate lots of heat and that we use convection to transfer it. In space there is either radiation or conduction. This has to be a major focus for compact clusters that will go in space. Art Edwards On Thu, Apr 24, 2003 at 11:32:13PM -0500, Gerry Creager N5JXS wrote: > One of the things I established when I was working on the old Space > Station Freedon, in the early '90s, is that the space-rated CPUs, less > the issues with radiation hardening and single-event upset recovery, > were hardly different from good CPUs. What we discovered was that the > MIL-SPEC components differed little from the "industrial-grade" > components, save in the degree of paperwork delivered with the device. > And the costs. Thus, we drove toward the use of the lower-cost, similar > quality Industrial-Grade devices. > > Now, for low- and mid-earth-orbit altitudes, the radiation environment > is pretty harsh. One should be cognizant of that environment, and model > the potential for radiation induced transient problems. If you're not > ready for transient failures, and at that, failures that may or may not > heal (aneal), you shouldn't use non-radiation hardened, commercial, > processors. > > I've not looked at the specs for rad-hardening and SEU performance. If > it's a commercial- as opposed to an industrial-grade processor, I'd not > be too sure of reliability, either, although those specs have come up > markedly over the last 10 years. > > gerry > > MIDN Sean Jones wrote: > >For reference the United States Naval Academy is putting up a PowerPC > >405 SoC in PC/104 form factor up as the Command and Data Handling System > >of the MidSTAR I satellite slated for launch in March 2006. > > > >Sean Jones > >MIDN USN > > > >MidSTAR C&DH Lead > >Armada Cluster Asst. Admin > > > >On Thu, 2003-04-24 at 05:34, Jim Ahia wrote: > > > >>As I was reading this thread, some things came to mind that might add to > >>the discussion: > >> > >>1 ) although Dells and Gateways are too heavy to lift into orbit, > >>pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class > >>motherboards with a single 5v power requirement make things much > >>smaller. 
It is completely possible to have each node fit into the space > >>of a half-height CD-ROM drive. Can anyone say "cluster in one box"? > >> > >>2 ) Has anyone yet mentioned the possibility of mesh networks using > >>802.11 for robotics clustering? Such networks of robots might make site > >>construction, ship construction, and mining feasible. > >> > >>Mining the surface of the moon is well documented to provide hydrogen, > >>oxygen, aluminum, silica, and titanium. Launching fuel & materials for > >>spacecraft to an orbital construction facility might make more sense > >>than the billions we are spending now, if the mine, transport, and > >>construction are largely carried out by robotics under the oversight of > >>a resident cluster with ground-based monitoring. > >> > >>Using a similar swarm of robots for site construction on mars prior to > >>human arrival can have a major impact on mission success. > >> > >>If all robots use identical motion base and cpu, then 2 broken bots can > >>be cannibalized to return one working bot to service. > >> > >>If all of the robots that are currently recharging batteries are added > >>to the cluster as mains-connected nodes, then a cluster of sorts is in > >>effect to speed control processing of the 'hive'. This is assuming that > >>the central site has the main power supply system online, be it solar, > >>nuc, whatever. > >> > >>-Jim Ahia > >>-makenamicro at charter.net > >>_______________________________________________ > >>Beowulf mailing list, Beowulf at beowulf.org > >>To change your subscription (digest mode or unsubscribe) visit > >>http://www.beowulf.org/mailman/listinfo/beowulf > >> > > -- > Gerry Creager -- gerry.creager at tamu.edu > Network Engineering -- AATLT, Texas A&M University > Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 > Page: 979.228.0173 > Office: 903A Eller Bldg, TAMU, College Station, TX 77843 > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 25 12:21:27 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 25 Apr 2003 12:21:27 -0400 (EDT) Subject: beowulf in space In-Reply-To: <1051283107.27185.12.camel@terra> Message-ID: On 25 Apr 2003, Dean Johnson wrote: > In terms of setting up mining operations on other celestial bodies, cpu stability > and radiation protection are amongst the least of your worries. What will be > needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical > issues are largely tractable, one way or another, but the institutional and > international issues will be like herding cats on crack. Well, there are also the umm, "economic" issues as well. As in no matter what you do, no matter what you say, the theoretical minimum cost of lifting something out of the earth's gravity well is on the order of a 100 megajoules per kilogram (mgR_earth is "escape energy"). Ignoring things like the 2nd law, call it a first-law cost of a buck purchased as raw electricity at commercial rates. 
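The arithmetic behind that "cost of a buck" figure is easy to reproduce. A minimal sketch in Python; the $0.05/kWh electricity price is my own reading of "commercial rates", borrowed from the cooling thread earlier in this digest:

    # First-law cost of escape energy per kilogram, E = m * g * R_earth.
    g = 9.81              # m/s^2
    R_earth = 6.371e6     # m
    price_per_kwh = 0.05  # ASSUMED commercial electricity rate, $/kWh

    energy_j = g * R_earth         # J per kg, ~6.2e7 J ("order of 100 MJ")
    energy_kwh = energy_j / 3.6e6  # 1 kWh = 3.6e6 J
    cost = energy_kwh * price_per_kwh

    print("escape energy : %.0f MJ/kg" % (energy_j / 1e6))
    print("              = %.1f kWh/kg" % energy_kwh)
    print("first-law cost: $%.2f/kg" % cost)
    # ~62 MJ/kg, about 17 kWh, roughly $0.87/kg -- "a buck" -- before any
    # of the second-law and engineering losses discussed next.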
However, using rockets to provide lift, the net 2nd law efficiency is some appallingly low number, as one has to lift the fuel to lift the fuel to lift the fuel to lift the payload, and then there are things like drag forces and the fact that failure has a very high cost so everything is overengineered, and the fact that you have to build a REALLY BIG vehicle to deliver a REALLY SMALL payload, which kicks in several orders of magnitude in cost (like 5?). As in, we are never going to "explore space" on chemical rockets more than slowly and infrequently, period, at $100K/kg. We can't afford it. Sorry, that's just a fact and I don't see it changing, not even if we figure out how to make electricity from fusion reactors and drop the fuel costs. Electromagnet mass drivers (a la Heinlein) would get rid of the lifting of reaction mass/fuel problem (which would make a BIG difference) but leaves you with lots of OTHER problems (like accelerating something to order of 10 km/sec against drag forces and without exceeding (say) 3-4 g's or cooking the contents with eddy currents, punching it through the thicker lower atmosphere against nonlinear turbulence that makes the stuff that ripped up the space shuttle seem like kid's stuff, and more). This approach would require a huge capital investment, new technologies galore (and maybe a bit of new physics), and might not ever work. We could spend a significant fraction of a terabuck just finding out. IF it worked, though, it could reduce the cost (ignoring the amortization of the initial investment, which was a mostly-ignored hundreds of gigabucks for chemical rockets and NASA as well, truth be told) to perhaps $100/kg, which is at least in the not-completely-insane range (assuming that one could achieve 1% efficiency, which is open to doubt). I see nobody designing earth-orbit mass drivers. I see little serious investment in the entire concept (although my physics students love the idea and regularly do exam questions on it:-). Until somebody does, the space program will be restricted to rare manned big ticket "exploration of space" trips and lots of unmanned earth orbit flights with a predictable economic payoff (weather and comm and military satellites). This goes for the indefinite future. Just buying the fuel to fill a shuttle mission has to cost a literally insane sum compared to the weight of the orbital payload, and the cost of that energy (viewed as energy) literally defines the value of money and cannot ever become "cheap". So sorry, although I >>love<< space exploration and have read a signficant fraction of all science fiction, the tragic thing about being a physicist is one has to really work to suspend that disbelief thing when one can do the math. Look on the bright side. With mass drivers it at least >>is<< feasible to contemplate exploiting here to the moon, maybe even near solar system (although the cost issue starts creeping in again when you get outside the moon, as do lots of other things like time of travel). Until you hear of work being done on them (with some degree of success) then forget mining the moon. They may well have to wait on other breakthroughs, as well (like viable hi-T superconductors) -- in addition to atmospheric drag forces and friction, there are eddy currents to consider, where resistance in the lifted shell is a "bad thing". rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Apr 25 08:16:38 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 25 Apr 2003 07:16:38 -0500 Subject: beowulf in space In-Reply-To: References: Message-ID: <3EA92726.801@tamu.edu> The Van Allen belts are not a continuum of radiation that you see as a "shell", but rather, are modulated by the solar environment and the earth's own geomagnetic environment. One might look at the variations in geomagnetic potential and gravetic potential for the earth at its surface, and some projections (and measurements) in low earth orbit. The possibility of an increased ion/particle based event is higher when passing through the "belts" than when outside of them. Thus, there is concern. There tends to be a higher concentration of radiation environments ("belts") in the mid-earth-orbit range (1500-5000km) than at lower altitudes, but the concentration of particles is higher in the lower orbits within the belts. So, it's a catch-22. I might add, it's hard to explain this, as there's not a board handy I can drawon, and I can't be seen waving my hands. Then there's the issue of adequate coffee levels... In the ISS, there are "safe haven" areas with more shielding than other parts of the station. In the event of a strong solar event, the crew could be ordered into the safe haven area for a period of time. Or, they could be ordered to prepare for evacuation via Soyuz. Medical planning for a Mars mission was problemmatic when I was at NASA. The concept of a safe haven has to be considered as the potential for a solar storm is non-trivial during a mission transit of the necessary duration, and theweight penalty for such a safe haven area is very great. Addition of multiple layers of heavy metal, which might mitigate some of the ion transitions has its own drawbacks, as mentioned earlier. Layering of heavy metals and differing dense materials is one path that's been evaluated. Let's add to the discussion: There are conditions where the human machine will self-repair better than silicon or germanium... or silicon-on-saphire, or, pick your substrate. In these cases, we have to protect the computers more. However, the reverse can also be true, if the hit is a soft, but fairly frequent set of radiation hits: The machine sees these as single event upsets, while the human could well see them as enough ionizing radiation to modify the immune system, the neurological system or synapses. So protecting hardware _and_ liveware becomes an important, and difficult task. And there should be no separation of NASA and follow-on commercial concerns: These concerns should be echoed all down the chain, because the concept of protecting the hardware and the liveware isn't something NASA does to ramp up the costs. I feel obligated to note that there was a Shuttle mission several years ago, where the crew were exposed to a sudden and unanticipated solar event. They were placed in safe-haven in the airlock for a period of a couple of hours until the majority of the event had passed. THere was inadequate time to de-orbet and protect them via the atmosphere. All dosimeters were over normal exposure limits. 
Certain post-flight medical recommendations were made with regard to the potential for reproductive health, and they were subjected to more, and longer follow-up medically than other crews. I'm not aware of any lasting consequences... but then, if I were, I'd probably have violated some of the confidentiality tenets. The story here is simple, though: Solar events happen, sometimes unexpectedly. When they do, some contingency planning must be on hand to rapidly (or by design) protect the liveware and hardware, or something may get damaged. It's really hard to make a service call to a broken device when it's vertical offset is 250 or so km, and its delta-V is over 10m/s... gerry Jim Ahia wrote: > So the radiation concerns with rad-hardened computer equipment are not > as much of a problem once clear of the Van Allen Radiation Belt? How > does this affect the space station and the planned missions to mars? > What about the lunar environment? I admit to having a lot of ignorance > on this subject, but I am concerned because part of my college project > is for robotic teams to do excavation / mining using a "hive" concept. > The end result is to get more information on the challenges that will be > faced by the robotic workers that are eventually sent to the moon first, > and to mars second. > > I am not speaking about the exploration missions by NASA, but rather > the much-farther-down-the-road commercial mining interests that will > want to build a foundry on the moon and a spacedock in earth orbit prior > to the big colonization push into our solar system. I believe it is > going to happen someday, because we already know that eventually our sun > will go nova and earth will be no longer habitable. Sooner or later > mankind, if it is to survive, will need to undergo some kind of diaspora > and migrate out into space. -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Fri Apr 25 12:43:35 2003 From: becker at scyld.com (Donald Becker) Date: Fri, 25 Apr 2003 12:43:35 -0400 (EDT) Subject: Serial Port Concentrators vs. KVMs In-Reply-To: <018201c30b01$65998aa0$04a8a8c0@spot> Message-ID: On Fri, 25 Apr 2003, Dan Kidger wrote: > > On Thu, Apr 24, 2003 at 04:28:12PM +0100, Dan Kidger elucidated: > > > However the current trend is for all new rack-mount nodes to offer some > sort > > > of BMC (baseboard management controller) with an ethernet connection. As > > > well as giving remote power cycling, this should allow > 'Serial_over_Lan" - > > > > Course the thing I wonder about that is then would seem to loose some > > redundancy ... > > Course I guess if a switch blows up, there may > > not be any particular need to access a node. That's the key idea: by reducing the cable count, you reduce the number of things that can wrong. And if the communications to the node is down, there is little point in having anything else working. > You do not necessirly lose any reduncancy.. > Compaq nodes have a extra ethernet socket for the BMC (Hence > serial_over_lan). ... > Alternatively the Intel MoBo's like E7501 hijack the same physical ethernet > socket for the BMC. 
The different approach is determined by the Ethernet chip is in use. A special NIC design is required to transparently piggyback management traffic on the main network channel, and to continue to do so when the main system is powered off. The selection is pretty much limited to a few 10/100 chips from 3Com and Intel. If the system board uses a gigabit NIC, the BMC has to have its own network connection. An second Ethernet network is far less expensive, more reliable and easier to diagnose than a KVM or serial setup. While I prefer having only two cables, power and one Cat5, rather than three, even three is no comparison to the complexity of other solutions. > This may lose a little redundancy (if for example a > cable falls out), but halves the amount of cat5 cabling needed This is a good example of how fewer cables doesn't lose redundancy, it decreases the points of failure and increases the reliability. -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 410 Severn Ave. Suite 210 Scyld Beowulf cluster system Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From math at velocet.ca Fri Apr 25 15:18:00 2003 From: math at velocet.ca (Ken Chase) Date: Fri, 25 Apr 2003 15:18:00 -0400 Subject: back to the issue of cooling In-Reply-To: <3EA7715A.44EFAA24@andorra.ad>; from award@andorra.ad on Thu, Apr 24, 2003 at 07:08:42AM +0200 References: <3EA7715A.44EFAA24@andorra.ad> Message-ID: <20030425151800.A69860@velocet.ca> On Thu, Apr 24, 2003 at 07:08:42AM +0200, Alan Ward's all... >I tend to think Transmeta and other low-power CPUs belong on the >desktop, so you can run them without the noisy fans (and they don't >heat up the air). make them diskless and you have no moving parts - great for remote machines that are mission critical but you cant get to them quick for repair. Im gonna stick my firewall on one of these, hide it in the closet, and boot it off my desktop cuz the $WIFE "doesnt want to see anymore bloody computers in the house!" /kc _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at math.ucdavis.edu Fri Apr 25 17:07:29 2003 From: bill at math.ucdavis.edu (Bill Broadley) Date: Fri, 25 Apr 2003 14:07:29 -0700 Subject: AMD Opteron benchmarks In-Reply-To: <3EA8E381.6020202@alineos.com> References: <3EA8E381.6020202@alineos.com> Message-ID: <20030425210729.GA12550@sphere.math.ucdavis.edu> On Fri, Apr 25, 2003 at 09:28:01AM +0200, Dominique Van?on wrote: > Hi All, > we performed some benchmarks on AMD Opteron 1400 (also Intel XEON, > Itanium2 900, AMD MP, Apple and Alpha processors) : > http://www.alineos.com/benchs_eng.html > We also could make some comments about these tests, so feel free to > contact. Interesting. What were the exact configurations of the hardware? Unlike most other hardware the opterons can be significantly slower based on the configuration. Apparently there is a shortage of PC2700 ECC Registered memory. Did your test opterons have PC2100 or PC2700? Did each opteron have 2 matched dimms (so 4 for a dual cpu)? I've seen duals with only 1 of the 2 memory banks populated. Why did you use Intel's compiler for the Xeons, but gcc for the Opterons? 
-- Bill Broadley Mathematics UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Apr 25 17:27:29 2003 From: gerry.creager at tamu.edu (Gerry Creager) Date: Fri, 25 Apr 2003 16:27:29 -0500 Subject: beowulf in space References: <1051283107.27185.12.camel@terra> Message-ID: <3EA9A841.9030302@tamu.edu> Layers 8 & 9 (fiscal/political) of the ISO model? gerry Dean Johnson wrote: > On Fri, 2003-04-25 at 00:31, Jim Ahia wrote: > > In terms of setting up mining operations on other celestial bodies, cpu stability > and radiation protection are amongst the least of your worries. What will be > needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical > issues are largely tractable, one way or another, but the institutional and > international issues will be like herding cats on crack. -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Office: 979.458.4020 FAX: 979.847.8578 Cell: 979.229.5301 Pager: 979.228.0173 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Fri Apr 25 18:15:44 2003 From: landman at scalableinformatics.com (Joseph Landman) Date: 25 Apr 2003 18:15:44 -0400 Subject: AMD Opteron benchmarks In-Reply-To: <20030425210729.GA12550@sphere.math.ucdavis.edu> References: <3EA8E381.6020202@alineos.com> <20030425210729.GA12550@sphere.math.ucdavis.edu> Message-ID: <1051308944.2898.15.camel@protein.scalableinformatics.com> I'd be curious to see the Intel compiled code run on the Opteron, and the gcc compiled code run on the Xeon. The BLAST results were quite suprising, so I would like to see if that is the identical binary on both systems, and if so, what is the config of each. On Fri, 2003-04-25 at 17:07, Bill Broadley wrote: > On Fri, Apr 25, 2003 at 09:28:01AM +0200, Dominique Van?on wrote: > > Hi All, > > we performed some benchmarks on AMD Opteron 1400 (also Intel XEON, > > Itanium2 900, AMD MP, Apple and Alpha processors) : > > http://www.alineos.com/benchs_eng.html > > We also could make some comments about these tests, so feel free to > > contact. > > Interesting. What were the exact configurations of the hardware? > Unlike most other hardware the opterons can be significantly slower > based on the configuration. > > Apparently there is a shortage of PC2700 ECC Registered memory. Did your > test opterons have PC2100 or PC2700? > > Did each opteron have 2 matched dimms (so 4 for a dual cpu)? I've seen > duals with only 1 of the 2 memory banks populated. > > Why did you use Intel's compiler for the Xeons, but gcc for the Opterons? -- Joseph Landman _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Sat Apr 26 07:40:19 2003 From: astroguy at bellsouth.net (astroguy at bellsouth.net) Date: Sat, 26 Apr 2003 7:40:19 -0400 Subject: OT warning...Re: beowulf in space, now wandering far afield... 
Message-ID: <20030426114019.YYXZ1247.imf54bis.bellsouth.net@mail.bellsouth.net> > > From: Gerry Creager N5JXS > Date: 2003/04/26 Sat AM 01:55:02 EDT > To: astroguy > CC: beowulf at beowulf.org > Subject: OT warning...Re: beowulf in space, now wandering far afield... > > I can't conceive of a reason to locate a cluster in the vicinity of > CHernobyl. As you note, the death toll continues to climb, and the > mutation rate is non-trivial among the retilian population. The cancer > rate is considerably higher than background, as well. > > However, parking a cluster under the sarcophogus doesn't strike me as > adding anything to the mass of knowledge, nor is there too much I can > think of, from a research perspective, that'd require or benefit from > on-site cluster computations. > > I suspect one reason no one is wanting to talk about Chernobyl is that > it occurred so long ago, at least in American terms, that it's ancient > history. We know what caused it (carelessness) and we know a lot of > damage was wrought. Right now, aside from documenting mutation and > cancer rates, neither of which requires massively parallel applications, > there's some interesting structural engineering data to be gleaned from > the concrete sarcophygus (overshroud of reinforced concrete that's > decaying at a pretty amazing rate. But, I think I'd to my assessments > on-site, then go home to decontaminate and run my datasets. > > gerry > > astroguy wrote: > > Gerry Creager wrote: > > > > > >>Layers 8 & 9 (fiscal/political) of the ISO model? > >> > >>gerry > >> > >>Dean Johnson wrote: > >> > >>>On Fri, 2003-04-25 at 00:31, Jim Ahia wrote: > >>> > >>>In terms of setting up mining operations on other celestial bodies, cpu stability > >>>and radiation protection are amongst the least of your worries. What will be > >>>needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical > >>>issues are largely tractable, one way or another, but the institutional and > >>>international issues will be like herding cats on crack. > >> > >>-- > >>Gerry Creager -- gerry.creager at tamu.edu > >>Network Engineering -- AATLT, Texas A&M University > >>Office: 979.458.4020 FAX: 979.847.8578 > >>Cell: 979.229.5301 Pager: 979.228.0173 > >> > >>_______________________________________________ > >>Beowulf mailing list, Beowulf at beowulf.org > >>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > Hi Gerry, > > Since you asked... not sure what distro of ISO we may need but there is a lot > > of work yet here on Terra fir ma... one area of convergence that the Beowolf > > in space might hold some measure of promise in synergistic high radiation > > environment would be, at least in my mind, the site of Chernobyl that no one > > is very keen to talk about but all are certanly aware it is a site that must be revisited... in the death toll > > even greater that our 9/11... and their Russian firefighters are still adding their heroic numbers to the list... > > Who are we to ask them to sacrifice more than they have already... But I think all agree we have a daunting and > > serious job to do in a very difficult almost space like environment. > > . Crazy Russians a little nuts but ya just got to love'em > > Just posting, thanks for list indulgence > > c.clary > > spartan sys. 
> > po box 1515 > > spartanburg, sc 29304-0243 > > > > fax# (801) 858-2722 > > -- > Gerry Creager -- gerry.creager at tamu.edu > Network Engineering -- AATLT, Texas A&M University > Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 As I recall we were speaking of diplomacy and high radiation... it just seems that a bot that could function would because of mission demand to function in an actual worker labor intensive multi functional capacity... visual demands alone of simply walking or rolling are considerable in any independent fashion... basic demands beyond the simple robotic independent of a tether leash and logic demands of simply picking up a piece of pipe... even to see the pipe and understand and distinguish a difference from a wooden broom on the floor are considerable and daunting task that have eluded our top engineers to date... as from the inception from this debate I hold to Dr.Browns position.... Lots of work to be done yet on earth before we might place pie in the sky theoretical magnetic space drives and magic space monkey's into space... some call proof of concept or failure analysis... It to me is basic foundation groundwork that we build the steps before we just rocket into heaven! or hell. c.clary spartan sys po box 1515 spartanburg, sc 29304-0243 PS sorry to poop on the party... but beyond this whist of some H.G.Wells there is a ton of real work attached to not only theory but test... test and more test. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Sat Apr 26 01:55:02 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Sat, 26 Apr 2003 00:55:02 -0500 Subject: OT warning...Re: beowulf in space, now wandering far afield... In-Reply-To: <3EAA0A33.D7D7BD0A@bellsouth.net> References: <1051283107.27185.12.camel@terra> <3EA9A841.9030302@tamu.edu> <3EAA0A33.D7D7BD0A@bellsouth.net> Message-ID: <3EAA1F36.5010100@tamu.edu> I can't conceive of a reason to locate a cluster in the vicinity of CHernobyl. As you note, the death toll continues to climb, and the mutation rate is non-trivial among the retilian population. The cancer rate is considerably higher than background, as well. However, parking a cluster under the sarcophogus doesn't strike me as adding anything to the mass of knowledge, nor is there too much I can think of, from a research perspective, that'd require or benefit from on-site cluster computations. I suspect one reason no one is wanting to talk about Chernobyl is that it occurred so long ago, at least in American terms, that it's ancient history. We know what caused it (carelessness) and we know a lot of damage was wrought. Right now, aside from documenting mutation and cancer rates, neither of which requires massively parallel applications, there's some interesting structural engineering data to be gleaned from the concrete sarcophygus (overshroud of reinforced concrete that's decaying at a pretty amazing rate. But, I think I'd to my assessments on-site, then go home to decontaminate and run my datasets. gerry astroguy wrote: > Gerry Creager wrote: > > >>Layers 8 & 9 (fiscal/political) of the ISO model? >> >>gerry >> >>Dean Johnson wrote: >> >>>On Fri, 2003-04-25 at 00:31, Jim Ahia wrote: >>> >>>In terms of setting up mining operations on other celestial bodies, cpu stability >>>and radiation protection are amongst the least of your worries. 
What will be >>>needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical >>>issues are largely tractable, one way or another, but the institutional and >>>international issues will be like herding cats on crack. >> >>-- >>Gerry Creager -- gerry.creager at tamu.edu >>Network Engineering -- AATLT, Texas A&M University >>Office: 979.458.4020 FAX: 979.847.8578 >>Cell: 979.229.5301 Pager: 979.228.0173 >> >>_______________________________________________ >>Beowulf mailing list, Beowulf at beowulf.org >>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > > Hi Gerry, > Since you asked... not sure what distro of ISO we may need but there is a lot > of work yet here on Terra fir ma... one area of convergence that the Beowolf > in space might hold some measure of promise in synergistic high radiation > environment would be, at least in my mind, the site of Chernobyl that no one > is very keen to talk about but all are certanly aware it is a site that must be revisited... in the death toll > even greater that our 9/11... and their Russian firefighters are still adding their heroic numbers to the list... > Who are we to ask them to sacrifice more than they have already... But I think all agree we have a daunting and > serious job to do in a very difficult almost space like environment. > . Crazy Russians a little nuts but ya just got to love'em > Just posting, thanks for list indulgence > c.clary > spartan sys. > po box 1515 > spartanburg, sc 29304-0243 > > fax# (801) 858-2722 -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Sat Apr 26 00:25:23 2003 From: astroguy at bellsouth.net (astroguy) Date: Sat, 26 Apr 2003 00:25:23 -0400 Subject: beowulf in space References: <1051283107.27185.12.camel@terra> <3EA9A841.9030302@tamu.edu> Message-ID: <3EAA0A33.D7D7BD0A@bellsouth.net> Gerry Creager wrote: > Layers 8 & 9 (fiscal/political) of the ISO model? > > gerry > > Dean Johnson wrote: > > On Fri, 2003-04-25 at 00:31, Jim Ahia wrote: > > > > In terms of setting up mining operations on other celestial bodies, cpu stability > > and radiation protection are amongst the least of your worries. What will be > > needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical > > issues are largely tractable, one way or another, but the institutional and > > international issues will be like herding cats on crack. > > -- > Gerry Creager -- gerry.creager at tamu.edu > Network Engineering -- AATLT, Texas A&M University > Office: 979.458.4020 FAX: 979.847.8578 > Cell: 979.229.5301 Pager: 979.228.0173 > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf Hi Gerry, Since you asked... not sure what distro of ISO we may need but there is a lot of work yet here on Terra fir ma... 
one area of convergence where the Beowulf-in-space idea might hold some measure of promise in a synergistic high-radiation environment would be, at least in my mind, the site of Chernobyl, which no one is very keen to talk about but all are certainly aware is a site that must be revisited... with a death toll even greater than our 9/11... and the Russian firefighters are still adding their heroic numbers to the list... Who are we to ask them to sacrifice more than they have already... But I think all agree we have a daunting and serious job to do in a very difficult, almost space-like environment. Crazy Russians, a little nuts, but ya just got to love 'em.
Just posting, thanks for list indulgence
c.clary
spartan sys.
po box 1515
spartanburg, sc 29304-0243

fax# (801) 858-2722

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf