From adam4098 at uidaho.edu Tue Apr 1 00:47:58 2003 From: adam4098 at uidaho.edu (Adam Phillabaum) Date: Mon, 31 Mar 2003 21:47:58 -0800 Subject: MSI motherboards Message-ID: <003f01c2f812$40d1db70$ee506581@lookout> Hello, I'm looking for some information about A MSI motherboard, the MS-9138 http://www.msicomputer.com/product/detail_spec/product_detail.asp?model=E750 1_Master-LS2 Its a dual Xeon motherboard. Just checking if anyone has anything positive or negative to say about it. -- Adam _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dlane at ap.stmarys.ca Tue Apr 1 13:21:40 2003 From: dlane at ap.stmarys.ca (Dave Lane) Date: Tue, 01 Apr 2003 14:21:40 -0400 Subject: SMC8624T vs DLINK DGC-1024T Message-ID: <5.1.1.6.0.20030401141622.033c1e20@crux.stmarys.ca> Can anyone comment on the strengths/weaknesses of these two 24-port gigabit switches. We're going to be building a 16 node dual-Xeon cluster this spring and were planning on the SMC switch (which has received good review here before), but a vendor pointed out the DLINK switch as a less expensive alternative. ... Dave _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mprinkey at aeolusresearch.com Tue Apr 1 16:17:26 2003 From: mprinkey at aeolusresearch.com (Michael T. Prinkey) Date: Tue, 1 Apr 2003 16:17:26 -0500 (EST) Subject: DDR Xeon Chipsets Message-ID: <36051.66.118.77.29.1049231846.squirrel@ra.aeolustec.com> This question was asked a few months ago but not answered. Are stream or other memory benchmark data available for the new DDR Xeon chipsets, specifically the Intel 7501 and 7505 and the Serverworks GC-SL and GC-LE? In particular, I was looking at the Supermicro x5DEi which uses the GC-SL chipset. It seems that this north bridge uses only a single DDR channel. I would think that this would seriously impact performance. The GC-LE and the 7501/5 chipsets use dual DDR channels which seems more appropriate. Any insights here? Any numbers? Thanks, Mike Prinkey _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Wed Apr 2 02:53:14 2003 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Wed, 2 Apr 2003 09:53:14 +0200 Subject: Which MPI implementation for MPI-2? In-Reply-To: <200303301650.34552.exa@kablonet.com.tr> References: <200303301650.34552.exa@kablonet.com.tr> Message-ID: <200304020953.14361.joachim@ccrl-nece.de> Eray Ozkural: > In order to make use of MPI-2 features such as one-sided communications, > new collective operations and I/O which implementation do you think is > preferable? Which platform? Do you want it for free? When do you need the MPI-2 features? 
Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Wed Apr 2 13:33:49 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Wed, 2 Apr 2003 21:33:49 +0300 Subject: Which MPI implementation for MPI-2? In-Reply-To: <200304020953.14361.joachim@ccrl-nece.de> References: <200303301650.34552.exa@kablonet.com.tr> <200304020953.14361.joachim@ccrl-nece.de> Message-ID: <200304022133.49554.exa@kablonet.com.tr> On Wednesday 02 April 2003 10:53, Joachim Worringen wrote: > > Which platform? Do you want it for free? When do you need the MPI-2 > features? Beowulf class, linux :) We currently have a switched fast ethernet network. NICs eepro100, switch 3com superstackii I wasn't able to get LAM one sided ops to run at all last year when I gave it a shot. (I have a feeling their architecture is a little buggy and inefficient). Maybe mpich can cope better with MPI-2, well aren't they using libraries like global arrays at Sandia which do one sided comms? Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Wed Apr 2 15:11:20 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Wed, 2 Apr 2003 14:11:20 -0600 Subject: Which MPI implementation for MPI-2? ... Message-ID: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> Eray Ozkural wrote: >I wasn't able to get LAM one sided ops to run at all last year when I gave it >a shot. (I have a feeling their architecture is a little buggy and >inefficient). Maybe mpich can cope better with MPI-2, well aren't they using >libraries like global arrays at Sandia which do one sided comms? To anyone ... Somewhat tangentially, but while we are on the subject of one-sided communications in MPI-2, am I correct in assuming that this capability is implemented as it is in SHMEM ... via communication to/from symmetric (or known asymmetric) memory locations inside the companion processes memory space. It would seem to be a requirement for speed and would also seem to require the use of identical binaries on each processor (and COMMON or static to place data in a symmetric location). Thanks for your guidance ... rbw #--------------------------------------------------- # Richard Walsh # Project Manager, Cluster Computing, Computational # Chemistry and Finance # netASPx, Inc. # 1200 Washington Ave. So. # Minneapolis, MN 55415 # VOX: 612-337-3467 # FAX: 612-337-3400 # EMAIL: rbw at networkcs.com, richard.walsh at netaspx.com # rbw at ahpcrc.org # #--------------------------------------------------- # "When Noah built the arc, it was not raining." 
# -Anonymous #--------------------------------------------------- # _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Wed Apr 2 16:33:16 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Thu, 3 Apr 2003 00:33:16 +0300 Subject: Which MPI implementation for MPI-2? In-Reply-To: <200304020953.14361.joachim@ccrl-nece.de> References: <200303301650.34552.exa@kablonet.com.tr> <200304020953.14361.joachim@ccrl-nece.de> Message-ID: <200304030033.16374.exa@kablonet.com.tr> On Wednesday 02 April 2003 10:53, Joachim Worringen wrote: > > Which platform? Do you want it for free? When do you need the MPI-2 > features? About free-ness, yes probably :) We don't have any important MPI-2 code right now, I think I only used IO features till now. Maybe in one of our future projects but I've no idea when. Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dsarvis at zcorum.com Wed Apr 2 15:41:03 2003 From: dsarvis at zcorum.com (Dennis Sarvis, II) Date: 02 Apr 2003 15:41:03 -0500 Subject: small cluster Message-ID: <1049316063.1932.4.camel@skull.america.net> How does one go about creating a 2 PC cluster? I have a redhat 400Mhz PII and a Debian Celeron 550Mhz. Can I do something like use 2 NICs in the controller and one in the slave (1 NIC for the office network/internet and the other connecting via crossover 10baseT to the NIC on node1 slave)? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Thu Apr 3 02:07:41 2003 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Thu, 3 Apr 2003 09:07:41 +0200 Subject: Which MPI implementation for MPI-2? In-Reply-To: <200304030033.16374.exa@kablonet.com.tr> References: <200303301650.34552.exa@kablonet.com.tr> <200304020953.14361.joachim@ccrl-nece.de> <200304030033.16374.exa@kablonet.com.tr> Message-ID: <200304030907.41246.joachim@ccrl-nece.de> Eray Ozkural: > About free-ness, yes probably :) > > We don't have any important MPI-2 code right now, I think I only used IO > features till now. Maybe in one of our future projects but I've no idea > when. So you could easily use MPICH (if LAM doesn't work for you) and wait until MPICH-2 is available (or test the beta-release...). Or you could buy a full MPI-2 implementaion from NEC which runs on SCore-clusters... oops, wrong mailinglist. ;-) Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu Apr 3 04:04:01 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 3 Apr 2003 01:04:01 -0800 Subject: Which MPI implementation for MPI-2? ... 
In-Reply-To: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> References: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> Message-ID: <20030403090401.GB1447@greglaptop.attbi.com> On Wed, Apr 02, 2003 at 02:11:20PM -0600, Richard Walsh wrote: > Somewhat tangentially, but while we are on the subject of one-sided > communications in MPI-2, am I correct in assuming that this capability > is implemented as it is in SHMEM ... No. It's much more complicated and general. You have to register windows within which one-sided ops can be used, and there are some extra calls that you make to make sure operations have completed. UPC is a much more compact method of expressing one-sided calls, and unlike shmem, it can benefit from pipelined transfers. > It would seem to be a requirement for speed and would > also seem to require the use of identical binaries on each processor > (and COMMON or static to place data in a symmetric location). shmem doesn't require that; you can use a common address (I'm very punny at 1am) to exchange addresses of malloc-ed data. But with shmem, you get a free registration of all static & common variables, and the stack too, as long as you use it in a consistant fashion. greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Thu Apr 3 09:51:50 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Thu, 3 Apr 2003 08:51:50 -0600 Subject: Which MPI implementation for MPI-2? ... Message-ID: <200304031451.h33Epo500310@mycroft.ahpcrc.org> On Thu Apr 3 03:43:24 2003 Greg Lindahl wrote: >On Wed, Apr 02, 2003 at 02:11:20PM -0600, Richard Walsh wrote: > >> Somewhat tangentially, but while we are on the subject of one-sided >> communications in MPI-2, am I correct in assuming that this capability >> is implemented as it is in SHMEM ... > >No. It's much more complicated and general. You have to register >windows within which one-sided ops can be used, and there are some >extra calls that you make to make sure operations have completed. I see ... then I should also anticipate some loss of performance (higher latency) when using one-sided MPI communications compared to SHMEM. Or perhaps this is one-time overhead paid at registration only? >UPC is a much more compact method of expressing one-sided calls, and >unlike shmem, it can benefit from pipelined transfers. Right (so also with CAF) for messages, but you still have to explicitly sychronize/lock, etc. >> It would seem to be a requirement for speed and would >> also seem to require the use of identical binaries on each processor >> (and COMMON or static to place data in a symmetric location). > >shmem doesn't require that; you can use a common address (I'm very >punny at 1am) to exchange addresses of malloc-ed data. But with shmem, >you get a free registration of all static & common variables, and the >stack too, as long as you use it in a consistant fashion. As far as I know, SHMEM requires a known address either explicitly passed (asymmetric location) between partners or a implicitly determined from the symmetry relationships of the images communicating (static or common). As you say, this is "free" for COMMON/STATIC data. Perhaps we are actually agreeing ... explicitly exchange addresses of malloc-ed locations in different binaries would be fine. 
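In code, the distinction I have in mind looks roughly like this (a T3E-style
SHMEM sketch, untested, run on two or more PEs; names and sizes are just
illustrative):

    #include <mpp/shmem.h>

    static long counters[8];          /* static data: symmetric "for free" */

    int main(void)
    {
        long *buf, val;
        int  me;

        start_pes(0);
        me  = _my_pe();
        val = me;

        /* Case 1: the target is static/COMMON, so every PE already
           knows its remote address. */
        if (me == 1)
            shmem_long_put(&counters[0], &val, 1, 0);   /* store into PE 0 */

        /* Case 2: heap data.  shmalloc() allocates from the symmetric
           heap at the same offset on every PE, so the address is again
           "known".  A buffer from plain malloc() would first require
           explicitly exchanging its address with the partner. */
        buf = (long *) shmalloc(8 * sizeof(long));
        if (me == 1)
            shmem_long_put(buf, &val, 1, 0);

        shmem_barrier_all();          /* the barrier also completes the puts */
        return 0;
    }

With MPI-2 the target would instead be a displacement into a window created
beforehand by a collective call, which is the extra setup Greg describes.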
Thanks, rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Thu Apr 3 10:35:41 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Thu, 3 Apr 2003 08:35:41 -0700 Subject: small cluster In-Reply-To: <1049316063.1932.4.camel@skull.america.net> References: <1049316063.1932.4.camel@skull.america.net> Message-ID: <20030403153541.GB30047@plk.af.mil> I have a two node cluster running just Debian. Indeed you need two NICS on the head node (one for the outside and one for the local network). I'm still running NFS to mount /home and /usr/local on the internal node. Also, I'm running MPICH. It was an exercise to learn rudimentary skills in cluster building, but it still runs. Art Edwards On Wed, Apr 02, 2003 at 03:41:03PM -0500, Dennis Sarvis, II wrote: > How does one go about creating a 2 PC cluster? I have a redhat 400Mhz > PII and a Debian Celeron 550Mhz. Can I do something like use 2 NICs in > the controller and one in the slave (1 NIC for the office > network/internet and the other connecting via crossover 10baseT to the > NIC on node1 slave)? > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tony at mpi-softtech.com Thu Apr 3 10:45:58 2003 From: tony at mpi-softtech.com (Anthony Skjellum) Date: Thu, 3 Apr 2003 09:45:58 -0600 (CST) Subject: Which MPI implementation for MPI-2? ... In-Reply-To: <200304031451.h33Epo500310@mycroft.ahpcrc.org> Message-ID: Our experience in ChaMPIon/Pro is that we get higher latency and higher bandwidth than 2-sided, vs. the design target of lower latency and lower bandwidth; the standard missed the mark, but it is still useful. On Thu, 3 Apr 2003, Richard Walsh wrote: > On Thu Apr 3 03:43:24 2003 Greg Lindahl wrote: > > >On Wed, Apr 02, 2003 at 02:11:20PM -0600, Richard Walsh wrote: > > > >> Somewhat tangentially, but while we are on the subject of one-sided > >> communications in MPI-2, am I correct in assuming that this capability > >> is implemented as it is in SHMEM ... > > > >No. It's much more complicated and general. You have to register > >windows within which one-sided ops can be used, and there are some > >extra calls that you make to make sure operations have completed. > > I see ... then I should also anticipate some loss of performance > (higher latency) when using one-sided MPI communications compared > to SHMEM. Or perhaps this is one-time overhead paid at registration > only? > > >UPC is a much more compact method of expressing one-sided calls, and > >unlike shmem, it can benefit from pipelined transfers. > > Right (so also with CAF) for messages, but you still have to explicitly > sychronize/lock, etc. > > >> It would seem to be a requirement for speed and would > >> also seem to require the use of identical binaries on each processor > >> (and COMMON or static to place data in a symmetric location). 
> > > >shmem doesn't require that; you can use a common address (I'm very > >punny at 1am) to exchange addresses of malloc-ed data. But with shmem, > >you get a free registration of all static & common variables, and the > >stack too, as long as you use it in a consistant fashion. > > As far as I know, SHMEM requires a known address either explicitly > passed (asymmetric location) between partners or a implicitly determined > from the symmetry relationships of the images communicating (static > or common). As you say, this is "free" for COMMON/STATIC data. > Perhaps we are actually agreeing ... explicitly exchange addresses > of malloc-ed locations in different binaries would be fine. > > > Thanks, > > rbw > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Anthony Skjellum PhD, CTO | MPI Software Technology, Inc. 101 South Lafayette St, Ste. 33 | Starkville, MS 39759, USA Ph: +1-(662)320-4300 x15 | FAX: +1-(662)320-4301 http://www.mpi-softtech.com | tony at mpi-softtech.com Middleware that's hard at work for you and your enterprise.(SM) The information contained in this communication may be confidential and is intended only for the use of the recipient(s) named above. If the reader of this communication is not the intended recipient(s), you are hereby notified that any dissemination, distribution, or copying of this communication, or any of its contents, is strictly prohibited. If you are not a named recipient or received this communication by mistake, please notify the sender and delete the communication and all copies of it. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Wed Apr 2 21:20:04 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Thu, 3 Apr 2003 05:20:04 +0300 Subject: Which MPI implementation for MPI-2? ... In-Reply-To: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> References: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> Message-ID: <200304030520.05254.exa@kablonet.com.tr> On Wednesday 02 April 2003 23:11, Richard Walsh wrote: > > To anyone ... > > Somewhat tangentially, but while we are on the subject of one-sided > communications in MPI-2, am I correct in assuming that this capability > is implemented as it is in SHMEM ... via communication to/from symmetric > (or known asymmetric) memory locations inside the companion processes > memory space. It would seem to be a requirement for speed and would > also seem to require the use of identical binaries on each processor > (and COMMON or static to place data in a symmetric location). > > Thanks for your guidance ... I think that's the idea but it isn't shared memory! http://www.mpi-forum.org/docs/mpi-20-html/node117.htm#Node117 Remote Memory Access ( RMA) extends the communication mechanisms of MPI by allowing one process to specify all communication parameters, both for the sending side and for the receiving side. .... Evidently, you cannot treat RMA like shared memory. 
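To make that concrete, the basic pattern looks something like this (a rough
sketch assuming any MPI-2 implementation with one-sided support, run on two
or more processes; the buffer and counts are made up):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int     rank;
        double  local[100], remote_val = 0.0;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        local[5] = (double) rank;

        /* Collectively expose "local" as a window others may access. */
        MPI_Win_create(local, (MPI_Aint)(100 * sizeof(double)), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);              /* open an access epoch         */
        if (rank == 1)                      /* rank 1 reads local[5] of 0   */
            MPI_Get(&remote_val, 1, MPI_DOUBLE, 0, 5, 1, MPI_DOUBLE, win);
        MPI_Win_fence(0, win);              /* only now is the get complete */

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

Nothing is guaranteed to have arrived until the second fence returns, which
is exactly the weakly coherent behaviour described next.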
What RMA really is what shared memory should have been (in a sense): The design is similar to that of weakly coherent memory systems: correct ordering of memory accesses has to be imposed by the user, using synchronization calls; the implementation can delay communication operations until the synchronization calls occur, for efficiency. Using RMA you have to design your algorithm like before however it is much easier to cope with dynamic communication. In usual MPI-1 code you would have to specify tons of custom async. send/recv. routines, sync. code etc. to accomplish the same thing. So the answer is, yes it's like shared memory but you know that each call (put, get, accumulate) will incur a message passing eventually. IMO the greatest advantage comes from the (possibly) higher level of abstraction attained this way. Of course a nicer thing is the ease of writing in-place routines, that could potentially make a difference in a lot of places, for example sparse codes like graph partitioning. What we had thought was perhaps implementing complex parallel algorithms (like fold/expand) could be easier with one sided comms. Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 3 12:35:00 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 3 Apr 2003 12:35:00 -0500 (EST) Subject: small cluster In-Reply-To: <200304031559.TAA01399@nocserv.free.net> Message-ID: On Thu, 3 Apr 103, Mikhail Kuzminsky wrote: > According to Dennis Sarvis, II > > How does one go about creating a 2 PC cluster? I have a redhat 400Mhz > > PII and a Debian Celeron 550Mhz. Can I do something like use 2 NICs in > > the controller and one in the slave (1 NIC for the office > > network/internet and the other connecting via crossover 10baseT to the > > NIC on node1 slave)? > Yes, I use like configuration in my home (but w/o permanent > external link to Internet). However, small switches are SO cheap ($50?) that it is hard not to justify buying a switch unless you are stone cold broke. Even a small switch also makes it much easier to add more nodes as you find them. They are almost certainly going to be 100BT as well, so that you can use faster NICs on future systems without having to deal with 10BT to 100BT crossover connections, multiple NICs in the head node, and so forth. I mean, the cost of a switch (per port) can actually be less than the cost of the NIC ports that connect to it these days. Go for it. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu Apr 3 14:29:44 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 3 Apr 2003 11:29:44 -0800 Subject: Which MPI implementation for MPI-2? ... 
In-Reply-To: <200304030520.05254.exa@kablonet.com.tr> References: <200304022011.h32KBKv10390@mycroft.ahpcrc.org> <200304030520.05254.exa@kablonet.com.tr> Message-ID: <20030403192944.GC1201@greglaptop.internal.keyresearch.com> On Thu, Apr 03, 2003 at 05:20:04AM +0300, Eray Ozkural wrote: > I think that's the idea but it isn't shared memory! He is talking about the Cray T3E SHMEM library, not shared memory. SHMEM is a SALC (Shared address, local consistancy) model, and is very similar to the MPI-2 one-sided stuff, but with much simpler syntax. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu Apr 3 14:56:08 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 3 Apr 2003 11:56:08 -0800 Subject: Which MPI implementation for MPI-2? ... In-Reply-To: <200304031451.h33Epo500310@mycroft.ahpcrc.org> References: <200304031451.h33Epo500310@mycroft.ahpcrc.org> Message-ID: <20030403195608.GD1201@greglaptop.internal.keyresearch.com> On Thu, Apr 03, 2003 at 08:51:50AM -0600, Richard Walsh wrote: > I see ... then I should also anticipate some loss of performance > (higher latency) when using one-sided MPI communications compared > to SHMEM. Or perhaps this is one-time overhead paid at registration > only? Registration is a one-time overhead. However, the exact semantics of MPI-2 are annoying and may end up introducing some significant overhead for tiny messages. A machine which only allows a limited amout of memory to be registered might have significant overhead all the time. The T3E didn't have that problem because it had a direct mapping from virtual to physical addresses, so the communications system didn't need to know what the TLB mappings looked like. For modern interconnects like Myrinet, there's enough SRAM on the card to map the entire process: 3 bytes per page times (4 GB/4k per page) = 3 megabytes, so the larger memory version of the card suffices for a 32-bit system. The current GM only supports put, not get. I have no idea how much memory SCI or Quadrics could map. You may be able to hack Linux such that it always handed out groups of pages; this would waste some memory, but could reduce the memory needed to hold the full set of mappings by a factor of 4 for 16k groups, 16 for 64k groups, etc. It's a shame that x86 doesn't have support for slightly larger pages; the Opteron has the same problem. > As far as I know, SHMEM requires a known address either explicitly > passed (asymmetric location) between partners or a implicitly determined > from the symmetry relationships of the images communicating (static > or common). As you say, this is "free" for COMMON/STATIC data. > Perhaps we are actually agreeing ... explicitly exchange addresses > of malloc-ed locations in different binaries would be fine. We are agreeing. It comes down to "does this object have the same address in both processes?" greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From purp at acm.org Thu Apr 3 17:22:07 2003 From: purp at acm.org (Jim Meyer) Date: 03 Apr 2003 14:22:07 -0800 Subject: FOLLOWUP: Re: Platform adopts DRMAA? 
In-Reply-To: <1049138047.18608.48.camel@utonium.pdi.com> References: <20030329072430.66227.qmail@web41315.mail.yahoo.com> <1049138047.18608.48.camel@utonium.pdi.com> Message-ID: <1049408526.13758.98.camel@utonium.pdi.com> On Mon, 2003-03-31 at 11:14, Jim Meyer wrote: > On Fri, 2003-03-28 at 23:24, Ron Chen wrote: > > Something more interesting! Platform throws away its > > own API and joined the DRMAA camp. > > You've mentioned this twice but I've not seen any mention of Platform > [...] > Can you (or anyone) point to something more specific indicating that > Platform is adopting DRMAA (and hopefully providing an estimated > timeline =)? Failing any response from the original poster, I made a few phone calls and determined that while Platform did indeed discard NPi, they have not announced any effort to support DRMAA (nor did the couple of folks I spoke with know of any such effort under consideration). Perhaps that's a shame as it'd be nice to be able to integrate with one interface and have a selection of DRM packages to choose from ... but then, I suspect that an easy bidirectional migration path doesn't top the list of any software vendor, proprietary or otherwise. A pity, that. Cheers! --j, with salt shaker in hand. -- Jim Meyer, Geek at Large purp at acm.org _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ron_chen_123 at yahoo.com Fri Apr 4 02:36:40 2003 From: ron_chen_123 at yahoo.com (Ron Chen) Date: Thu, 3 Apr 2003 23:36:40 -0800 (PST) Subject: FOLLOWUP: Re: Platform adopts DRMAA? In-Reply-To: <1049408526.13758.98.camel@utonium.pdi.com> Message-ID: <20030404073640.8869.qmail@web41313.mail.yahoo.com> I have to confess that I did not get the information directly form Platform. I have been looking for the forum post which mentioned that Platform's NPi is dead, and Platform is joining DRMAA. I still couldn't find it. Nevertheless, I just found that Platform is a member of GGF: http://www.gridforum.org/L_About/who.htm which defines DRMAA: http://www.gridforum.org/3_SRM/drmaa.htm So that made the original poster on the forum believed that Platform is joining DRMAA. Do you know why Platform discarded NPi? (Looks like another M$ API standard!) -Ron --- Jim Meyer wrote: > Failing any response from the original poster, I > made a few phone calls > and determined that while Platform did indeed > discard NPi, they have not > announced any effort to support DRMAA (nor did the > couple of folks I > spoke with know of any such effort under > consideration). > > Perhaps that's a shame as it'd be nice to be able to > integrate with one > interface and have a selection of DRM packages to > choose from ... but > then, I suspect that an easy bidirectional migration > path doesn't top > the list of any software vendor, proprietary or > otherwise. > > A pity, that. > > Cheers! > > --j, with salt shaker in hand. > -- > Jim Meyer, Geek at Large > purp at acm.org > __________________________________________________ Do you Yahoo!? Yahoo! 
Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Fri Apr 4 03:01:55 2003 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Fri, 4 Apr 2003 10:01:55 +0200 Subject: Which MPI implementation for MPI-2? ... In-Reply-To: References: Message-ID: <200304041001.55470.joachim@ccrl-nece.de> Anthony Skjellum: > Our experience in ChaMPIon/Pro is that we get higher latency and higher > bandwidth than 2-sided, vs. the design target of lower latency and lower > bandwidth; the standard missed the mark, but it is still useful. I would be surprised if the (primary) design target for one-sided communication on non-shared-memory architectures was lower latency and higher bandwidth - this can obviously not be achieved if you need to use messages. I'd say it's the different communication paradigm ("origin process chooses which data to read or write, independant from target process") which helps to adopt certain communication patterns more easily/naturally, and *maybe* avoid some synchronization delays. But then again, MPI-2 one sided with it's higly relaxed consistency model does not come really naturally for most users... Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at plogic.com Fri Apr 4 09:29:02 2003 From: deadline at plogic.com (Douglas Eadline) Date: Fri, 4 Apr 2003 09:29:02 -0500 (EST) Subject: More SMP Memory Data Message-ID: I have posted more memory contention data (including wall clock times) for PIII and Athlon SMP systems at: http://www.cluster-rant.com/article.pl?sid=03/04/03/1429239 Doug -- ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ilumb at platform.com Fri Apr 4 06:47:33 2003 From: ilumb at platform.com (Ian Lumb) Date: Fri, 4 Apr 2003 06:47:33 -0500 Subject: FOLLOWUP: Re: Platform adopts DRMAA? Message-ID: <4AB0624F069DAD4E90F18B13A818EEFE287D7D@catoexm04.noam.corp.platform.com> NPi merged with the GGF in April 2002 - see http://www.ggf.org/5_ARCH/npi.htm for more. -Ian -----Original Message----- From: Ron Chen [mailto:ron_chen_123 at yahoo.com] Sent: Friday, April 04, 2003 2:37 AM To: Jim Meyer Cc: Beowulf Mailing List Subject: Re: FOLLOWUP: Re: Platform adopts DRMAA? I have to confess that I did not get the information directly form Platform. I have been looking for the forum post which mentioned that Platform's NPi is dead, and Platform is joining DRMAA. I still couldn't find it. 
Nevertheless, I just found that Platform is a member of GGF: http://www.gridforum.org/L_About/who.htm which defines DRMAA: http://www.gridforum.org/3_SRM/drmaa.htm So that made the original poster on the forum believed that Platform is joining DRMAA. Do you know why Platform discarded NPi? (Looks like another M$ API standard!) -Ron --- Jim Meyer wrote: > Failing any response from the original poster, I > made a few phone calls > and determined that while Platform did indeed > discard NPi, they have not > announced any effort to support DRMAA (nor did the > couple of folks I > spoke with know of any such effort under > consideration). > > Perhaps that's a shame as it'd be nice to be able to > integrate with one > interface and have a selection of DRM packages to > choose from ... but > then, I suspect that an easy bidirectional migration > path doesn't top > the list of any software vendor, proprietary or > otherwise. > > A pity, that. > > Cheers! > > --j, with salt shaker in hand. > -- > Jim Meyer, Geek at Large > purp at acm.org > __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: From hahn at physics.mcmaster.ca Fri Apr 4 10:23:03 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Fri, 4 Apr 2003 10:23:03 -0500 (EST) Subject: FOLLOWUP: Re: Platform adopts DRMAA? In-Reply-To: <20030404073640.8869.qmail@web41313.mail.yahoo.com> Message-ID: I'm rather puzzled at this exchange: why would anyone care? from the quick look I had at the drmaa stuff, it seemed quite trivial. if I was writing a tool that needed to interact with a queueing system, I'd just have an encapsulation layer anyway, so wouldn't care about exact interfaces. this is one of those nasty (and ultimately useless) lowest-common-denominator types of interfaces, and seems to be driven by marketing weasels. > I have to confess that I did not get the information > directly form Platform. > > I have been looking for the forum post which mentioned > that Platform's NPi is dead, and Platform is joining > DRMAA. I still couldn't find it. > > Nevertheless, I just found that Platform is a member > of GGF: > http://www.gridforum.org/L_About/who.htm > > which defines DRMAA: > http://www.gridforum.org/3_SRM/drmaa.htm > > So that made the original poster on the forum believed > that Platform is joining DRMAA. > > Do you know why Platform discarded NPi? > > (Looks like another M$ API standard!) > > -Ron > > --- Jim Meyer wrote: > > Failing any response from the original poster, I > > made a few phone calls > > and determined that while Platform did indeed > > discard NPi, they have not > > announced any effort to support DRMAA (nor did the > > couple of folks I > > spoke with know of any such effort under > > consideration). > > > > Perhaps that's a shame as it'd be nice to be able to > > integrate with one > > interface and have a selection of DRM packages to > > choose from ... but > > then, I suspect that an easy bidirectional migration > > path doesn't top > > the list of any software vendor, proprietary or > > otherwise. > > > > A pity, that. > > > > Cheers! > > > > --j, with salt shaker in hand. 
> > -- > > Jim Meyer, Geek at Large > > purp at acm.org > > > > > __________________________________________________ > Do you Yahoo!? > Yahoo! Tax Center - File online, calculators, forms, and more > http://tax.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- operator may differ from spokesperson. hahn at mcmaster.ca http://hahn.mcmaster.ca/~hahn _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From purp at acm.org Fri Apr 4 11:45:50 2003 From: purp at acm.org (Jim Meyer) Date: 04 Apr 2003 08:45:50 -0800 Subject: FOLLOWUP: Re: Platform adopts DRMAA? In-Reply-To: <4AB0624F069DAD4E90F18B13A818EEFE287D7D@catoexm04.noam.corp.platform.com> References: <4AB0624F069DAD4E90F18B13A818EEFE287D7D@catoexm04.noam.corp.platform.com> Message-ID: <1049474750.19892.3.camel@utonium.pdi.com> On Fri, 2003-04-04 at 03:47, Ian Lumb wrote: > NPi merged with the GGF in April 2002 - see > http://www.ggf.org/5_ARCH/npi.htm for more. -Ian It seems my reports of NPi's demise are much exaggerated. My apologies. On a brighter note, the OGSA/OGSI bits look promising. Thanks for the link! --j -- Jim Meyer, Geek at Large purp at acm.org _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ron_chen_123 at yahoo.com Sun Apr 6 00:20:00 2003 From: ron_chen_123 at yahoo.com (Ron Chen) Date: Sat, 5 Apr 2003 21:20:00 -0800 (PST) Subject: Fwd: Re: [GE users] SGEEE is opensource, opensource, please repeat with me... Message-ID: <20030406052000.35183.qmail@web41303.mail.yahoo.com> --- Fritz Ferstl wrote: > Date: Fri, 28 Mar 2003 16:53:33 +0100 (MET) > From: Fritz Ferstl > To: users at gridengine.sunsource.net > Subject: Re: [GE users] SGEEE is opensource, > opensource, please repeat with > me... > > Hi Vaclav, > > thanks for pointing this out again. > > The reply you wrote to the beowulf list is > completely correct and not > speculative at all. In reality, it goes even > farther. If Sun or anyone > else who want's to create a commercial Grid Engine > version intends to > change the sources such that project defined > compatibility requirements > are violated then those changes would need to be > documented and the > interfaces which were introduced would need to be > published. This is the > spirit of the SISSL open source licensing model > under which Grid Engine > has been released and it tries to ensure > interoperability between open > source and commercialized versions. This is 100% the > case today. You can > use Sun's commercial version completely > interchangeably with the > same-release-level versions from the project. > > Note that everyone has the same rights in the > project and can create a > commercial version of Grid Engine. The thing that > needs to be honored is > the SISSL. See > > http://gridengine.sunsource.net/project/gridengine/Gridengine_SISSL_license.html > > for details. > > We'll upgrade the site to a new version of > SourceCast within the next > few weeks and have planned to refurbish the content > a bit in this > process. 
We'll keep an eye on your recommendation to > make more clear > that the full Enterprise Edition functionality is > indeed contained in > the published project source code and in the project > builds. > > Cheers, > > Fritz > > > > On Fri, 28 Mar 2003 hanzl at noel.feld.cvut.cz wrote: > > > Sorry to repeat this old topic, but I see this > happen again and again: > > > > PEOPLE THINK THAT SGEEE IS NOT OPENSOURCE ! > > > > And some of them get upset and it is hard to > explain them that they > > are mistaken. I spent quite big effort on > beowulf at beowulf.org to make > > this clear but confused people arise again and > again. > > > > Please: > > > > - If you can find more suitable places where to > put bold label > > "SGEEE is OPENSOURCE", please do it > > > > - Kindly verify my explanation below - I did not > intend to send a copy > > here but later I realized that maybe I was too > speculative, so please > > check that my claims are true. > > > > Thanks a lot > > > > Vaclav Hanzl > > > > ------- one of my beowulf posts - please verify my > claims: ----------- > > > > Subject: Re: sun grid engine? > > From: hanzl > > To: kus at free.net > > Cc: beowulf at beowulf.org > > Date: Thu, 27 Mar 2003 20:12:03 +0100 > > > > > May be it's integrated into SGE 5.3 Enterprise > Edition ? I said about > > > *free* SGE 5.3. Both "Sun ONE Grid Engine > Administartor and User's Guide" > > > and "Sun ONE Grid Engine Release Notes" don't > have just the word "MAUI". > > > Moreover, the only sheduler algorithm allowed in > usual > > > (free) SGE 5.3 is "standard" (see SGE > Administrator & User's guide, p.225). > > > > It is easy to get confused by SGE versions. > > > > Enterprise Edition is also free. MAUI was > integrated with it - most of > > this work was done by MAUI team with help from SGE > team. > > > > Regarding SGE versions, I think it works as > follows: > > > > 1) Developers create opensource SGE version. They > work using publicly > > available CVS software repository. All new > features come to this > > version. > > > > This opensource version is both "SGE" ans "SGE > Enterprise Edition" - > > the difference is just an instalation option. You > install both using > > the same files, you may compile both using the > same sources from the > > CVS archive. > > > > 2) 'Commercial' part of SUN takes these sources > (probably without any > > important changes) and compiles 'commercial' SGE > and SGEEE. They add > > word 'ONE' to the name. They create nice manuals. > You can buy this > > software and get usual support you expect for > commercial software. > > You can still download the manuals for free. Just > skip word 'ONE' > > while reading them - they are perfectly usable for > free SGE as well. > > They just may be out of date because the free > version already has new > > features (like MAUI integration). They may also > never mention MAUI > > integration because the 'commercial' part of SUN > has no support for > > it. > > > > > > All this is just too nice to believe it so people > often get confused. > > > > Note that it probably quite differs from > PBS/OpenPBS development model > > - I am no expert on PBS (experienced experts, > please correct me if I > > am wrong!) but I think that commercial PBS and > OpenPBS are split and > > the development team has quite hard times deciding > what to do - they > > introduce new features to commercial branch to > make it more attractive > > (to make any money on it) but in the same time > similar features are > > wanted in the OpenPBS version. 
They themselves > created their own enemy > > on the market (OpenPBS) and now they are not sure > how to behave to it > > - support it as their child? Kill it as their > enemy? > > > > Even if I am wrong in my thoughts on PBS (and I > may easily be wrong as > > it is a long time I left PBS maillists) I am > pretty sure many PBS > > users percieve it like this (as I got few quite > few emails from them > > indicating this). > > > > PBS is older than SGE (and yes, PBS did many good > things, no doubt) > > and everybody knew PBS when opensource SGE was > born. And many people > > could easily expect that SGE used the same model > as PBS did. (It was > > easy to think that SGE EE is the commercial > version - no, it is not.) > > > > SGE did not use the same model as PBS. It used > more open one. And this > > choice was huge success I think. > > > > ... > > > > Regards > > > > Vaclav > > > > > > ---- one more example of confusion ------- > > > > Subject: Re: sun grid engine? > > From: Alan Scheinine > > To: Beowulf at beowulf.org > > Date: Fri, 28 Mar 2003 10:33:09 +0100 > > > > I see "Vaclav" posted a message. Last week we > began the installation > > of SGE and someone involved with the installation > said that in order > > to have the options of sgeee it is necessary to > buy that version. > > Using grep on the messages I had saved, I found > the message from > > Vaclav from the year 2001 showing how to convert > sge to sgeee. > > In 2001 Vaclav said that the information was at > the end of the > > download page, now it is in a readme file in the > distribution. > > In any case, the note from Vaclav in 2001 proved > to be useful also > > in 2003, the file is easily overlooked if the > system administrator > > does not know it can be done. By the way, the > file is > > root>/README.inst_sgeee > > Alan > > > > ---- I think Alan means this my old note: ----- > > > > Subject: SGEEE easily mistaken as commercial > version > > From: hanzl at noel.feld.cvut.cz > > To: dev at gridengine.sunsource.net, > beowulf at beowulf.org > > Date: Thu, 25 Oct 2001 11:22:40 +0200 > > > > Prospective SGE users could very easily be > mistaken and suppose that > > Enterprise Edition is a commercial close source > version. If you > > download the three tarfiles from "Binary > Downloads" page, unpack them > > and look around, you will install non-EE version. > There is no way to > > find easily (from unpacked files) that you could > install EE. The pdf > > manual will tell you about EE features but your > instalation is missing > > them. No hint at all that EE is also opensource. > > > > This IMHO seriously harms the SGE project and > should be corrected as > > soon as possible by including inst_sgeee script in > tar files. > > > > Potential SGE users and opensource co-developers > are likely to know > > PBS, which exists in both opensource and > commercial version. During SGE > > test-install many of them will be systematically > driven into false > > assumption that SGE project is organised the same. > > > > I wish all the best to Veridian and PBS and > everybody making free > > versions of commercial software. 
This setup of > things however > > inevitably makes opensource users to assess danger > that core > > developement team will be torn between opensource > and commercial > > version support, will be reluctant to port > commercial version fixes to > > opensource version (cause it takes time) and will > be unable to > > integrate opensource-community created patches > cause without knowledge > > of the commercial version source these patches > will diverge. > > > > It is very sad to have these worries about SGE by > mistake. > > > > > > Only after lot of hacking around I found that all > you have to do to > > install EE is to rename inst_sge to inst_sgeee > (and it behaves > > accordingly). Only after this I looked around once > more and found this > > at the bottom of binary download page: > > > > Only for Grid Engine Enterprise Edition you have > to make slight modifications: > > % cd $SGE_ROOT > > % ln -s inst_sge inst_sgeee > > % replace inst_sge with inst_sgeee in the last > line of the files install_qmaster and install_execd > > Then you can proceed as with the standard Grid > Engine installation. > > > > Well, you may say it is my fault not to notice > this before. Sure it is > > but I think this fault is quite common and harms > SGE a lot. It is > > worth it to include inst_sgeee in tar files now as > many Beowulf > > maillist readers might be prompted by recent SGE > discussions to go and > > try SGE - and maybe forget about it if they make > the same mistake as I > > did. > > > > > > With all the best wishes to SGE team (and thanks > for all the work done > > so far) > > > > Vaclav > > > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: > users-unsubscribe at gridengine.sunsource.net > > For additional commands, e-mail: > users-help at gridengine.sunsource.net > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: > users-unsubscribe at gridengine.sunsource.net > For additional commands, e-mail: > users-help at gridengine.sunsource.net > __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From virtualsuresh at yahoo.co.in Sun Apr 6 03:34:41 2003 From: virtualsuresh at yahoo.co.in (=?iso-8859-1?q?suresh=20chandra?=) Date: Sun, 6 Apr 2003 08:34:41 +0100 (BST) Subject: small cluster Message-ID: <20030406073441.50944.qmail@web8102.in.yahoo.com> Hi, I am also Interested in Building a two node cluster, I had Athlon 850Mhz and old Pentium 133Mhz(Hard Disk Less). We are going to build a 16 node cluster for our university. So as a practice, I want to build 2 node cluster in Home. I am planning to use OpenMosix2(SSI). I want to share Ideas with all of you in implementing. Thanks & Regards, Suresh Chandra, India ===== ________________________________________________________________________ Missed your favourite TV serial last night? Try the new, Yahoo! TV. 
visit http://in.tv.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Sun Apr 6 00:35:31 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Sat, 5 Apr 2003 23:35:31 -0600 Subject: advice on cluster purchase Message-ID: <3E9602A5@itsnt5.its.uiowa.edu> Hi, I am an undergraduate involved with a totally student run parallel computing experience. We have approximately 10,000 of university money with which to produce the best possible machine. I would be interested to hear from you all what configuration you would choose if someone just said "here's the money, build the best system you can." The system will do both cpu dominated and network intensive activities, so it would be tailored for neither. Do SMP nodes tend to be superior in a cost/performance framework? I have worked with other peoples systems and they are always dual cpu nodes, my impression being that it is for the purpose of minimizing overall size- as I tend to start a process on each cpu. Any advice would be appreciated. Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From chih-houng-king at uiowa.edu Sun Apr 6 23:33:44 2003 From: chih-houng-king at uiowa.edu (Chih King) Date: Sun, 6 Apr 2003 22:33:44 -0500 Subject: Specific Question about Single vs. Dual Processor System Message-ID: <002501c2fcb6$7e369930$6401a8c0@chihking> Hello. I am a member of the University of Iowa Student Supercomputing Project (UISSP), and we are planning for the purchase of our first cluster. Currently we are divided between a sixteen node single-processor Pentium 4 system and a seven node dual-processor Xeon system. Here are the brief specification of both machines: 16 Pentium 4 single-processor system (total cost $7,407): Intel Pentium 4 2.4GHz 533FSB 512KB ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN 512MB PC2700 DDR333 Maxtor 20GB Ultra100 Hard Drive ATI Rage Mobility VGA Card 8MB AGP CG 6039L 350W USB Midtower Case Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) 7 Xeon dual-processor system (total cost $8,400): INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 TYAN S2723GNN E7501 GLAN MOTHERBOARD PC2100 256MB ECC/REG DDR x2 Maxtor 20GB Ultra100 Hard Drive Chenbro Beige Server Case NMB 460W Xeon Power Supply MITSUMI 54X CD-ROM Drive As you can see, the single-processor system is about $1,000 cheaper than the dual-processor system. We have a total of $9,500 in our budget (to pay for the system, the switch, and everything else). Taking into consideration both performance and economical issues which system would you choose and why? Some more details: since Gigabit LAN is built in both motherboards we will probably establish one Gigabit channel, and if necessary have a second 100Mbps LAN channel as well. Therefore we will probably have to spend an additional $500-600 on switches. Currently we are not sure about specific application that we will be running on the cluster, but we would like to run a broad range of calculations/simulations (ie. biological, economical, mathematical, etc.) We would really appreciate any response in this matter. Thank you very much! 
Sincerely, Chih King UISSP _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sun Apr 6 19:57:24 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sun, 6 Apr 2003 19:57:24 -0400 (EDT) Subject: Specific Question about Single vs. Dual Processor System In-Reply-To: <002501c2fcb6$7e369930$6401a8c0@chihking> Message-ID: On Sun, 6 Apr 2003, Chih King wrote: > Hello. I am a member of the University of Iowa Student Supercomputing > Project (UISSP), and we are planning for the purchase of our first cluster. > Currently we are divided between a sixteen node single-processor Pentium 4 > system and a seven node dual-processor Xeon system. Here are the brief > specification of both machines: > > 16 Pentium 4 single-processor system (total cost $7,407): > > Intel Pentium 4 2.4GHz 533FSB 512KB > ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN > 512MB PC2700 DDR333 > Maxtor 20GB Ultra100 Hard Drive > ATI Rage Mobility VGA Card 8MB AGP > CG 6039L 350W USB Midtower Case > Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) > > 7 Xeon dual-processor system (total cost $8,400): > > INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 > TYAN S2723GNN E7501 GLAN MOTHERBOARD > PC2100 256MB ECC/REG DDR x2 > Maxtor 20GB Ultra100 Hard Drive > Chenbro Beige Server Case > NMB 460W Xeon Power Supply > MITSUMI 54X CD-ROM Drive > > As you can see, the single-processor system is about $1,000 cheaper than the > dual-processor system. We have a total of $9,500 in our budget (to pay for > the system, the switch, and everything else). Taking into consideration > both performance and economical issues which system would you choose and > why? Some more details: since Gigabit LAN is built in both motherboards we > will probably establish one Gigabit channel, and if necessary have a second > 100Mbps LAN channel as well. Therefore we will probably have to spend an > additional $500-600 on switches. Currently we are not sure about specific > application that we will be running on the cluster, but we would like to run > a broad range of calculations/simulations (ie. biological, economical, > mathematical, etc.) We would really appreciate any response in this matter. > Thank you very much! Hmmm, I think I just responded with ONE plan -- looks like you already have better quotes than I expected EXCEPT that you look like you're getting less memory than I think you should get on the duals. I'd recommend at least 512 MB per processor, maybe 1 GB per processor if you can afford it. You also haven't said anything about a server -- if the cluster is going to do any serious work, you'll likely want a "server node" in either configuration with a lot more than 20GB in relatively unreliable IDE drives. You are also getting more nodes (either way) than will comfortably fit on a cheap KVM, which is ok but not as convenient on a starter/demo cluster when you'll have relatively many occasions to connect directly to nodes to mess with them. So replace the KVM with just the monitor, keyboard, mouse themselves and a cart to put them on. Now, about your question. The UP systems have faster memory and more memory and more processors total. If you REALLY have no more money, having 16 systems is better than having 7 if you have to deal with node failures. You do have to get a bigger switch, you do give up a bit of speed when processors have to talk at least some of the time. 
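Just to put rough numbers on "more memory and more processors", using the two
quotes above:

    UP P4:     16 CPUs, 16 x 512 MB = 8.0 GB total, $7,407  ->  about $463/CPU
    dual Xeon: 14 CPUs,  7 x 512 MB = 3.5 GB total, $8,400  ->  about $600/CPU

and the dual price would rise further if you bump each node to 512 MB or 1 GB
per processor as suggested above.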
I think I'd get the UP configuration, with one node pulled and beefed up in a bigger case into a server node. The other 15 and the switch will fit very neatly onto a single heavy duty steel shelf unit, and can be cabled up to look lovely. This should be very serviceable. For coarse grained or embarrassingly parallel code you've optimized CPU and have more memory for applications; for parallel code with a fair bit of IPC's you no longer can talk to at least ONE processor locally, but neither do you have to share a single gigE connection among two processors. It will look more impressive. It will run slightly hotter and cost slightly more to operate, if you are paying the power bill (about $2K/year, at a guess, so I hope you are NOT paying the power bill:-). rgb > > Sincerely, > > Chih King > UISSP > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at plogic.com Mon Apr 7 10:09:19 2003 From: deadline at plogic.com (Douglas Eadline) Date: Mon, 7 Apr 2003 10:09:19 -0400 (EDT) Subject: Specific Question about Single vs. Dual Processor System In-Reply-To: <002501c2fcb6$7e369930$6401a8c0@chihking> Message-ID: On Sun, 6 Apr 2003, Chih King wrote: > Hello. I am a member of the University of Iowa Student Supercomputing > Project (UISSP), and we are planning for the purchase of our first cluster. > Currently we are divided between a sixteen node single-processor Pentium 4 > system and a seven node dual-processor Xeon system. Here are the brief > specification of both machines: > > 16 Pentium 4 single-processor system (total cost $7,407): > > Intel Pentium 4 2.4GHz 533FSB 512KB > ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN According to Asus this MB has an on-board 10/100 (realtek) interface is there a 10/100/1000 option for this board. Or are you adding a GigE NIC? Doug > 512MB PC2700 DDR333 > Maxtor 20GB Ultra100 Hard Drive > ATI Rage Mobility VGA Card 8MB AGP > CG 6039L 350W USB Midtower Case > Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) > > 7 Xeon dual-processor system (total cost $8,400): > > INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 > TYAN S2723GNN E7501 GLAN MOTHERBOARD > PC2100 256MB ECC/REG DDR x2 > Maxtor 20GB Ultra100 Hard Drive > Chenbro Beige Server Case > NMB 460W Xeon Power Supply > MITSUMI 54X CD-ROM Drive > > As you can see, the single-processor system is about $1,000 cheaper than the > dual-processor system. We have a total of $9,500 in our budget (to pay for > the system, the switch, and everything else). Taking into consideration > both performance and economical issues which system would you choose and > why? Some more details: since Gigabit LAN is built in both motherboards we > will probably establish one Gigabit channel, and if necessary have a second > 100Mbps LAN channel as well. Therefore we will probably have to spend an > additional $500-600 on switches. 
Currently we are not sure about specific > application that we will be running on the cluster, but we would like to run > a broad range of calculations/simulations (ie. biological, economical, > mathematical, etc.) We would really appreciate any response in this matter. > Thank you very much! > > Sincerely, > > Chih King > UISSP > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sun Apr 6 19:40:42 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sun, 6 Apr 2003 19:40:42 -0400 (EDT) Subject: advice on cluster purchase In-Reply-To: <3E9602A5@itsnt5.its.uiowa.edu> Message-ID: On Sat, 5 Apr 2003, jbassett wrote: > Hi, I am an undergraduate involved with a totally student run parallel > computing experience. We have approximately 10,000 of university money with > which to produce the best possible machine. I would be interested to hear > from you all what configuration you would choose if someone just said "here's > the money, build the best system you can." The system will do both cpu > dominated and network intensive activities, so it would be tailored for > neither. Do SMP nodes tend to be superior in a cost/performance framework? I > have worked with other peoples systems and they are always dual cpu nodes, my > impression being that it is for the purpose of minimizing overall size- as I > tend to start a process on each cpu. Any advice would be appreciated. I think you can barely afford the following: 3 dual Xeon or dual Athlon systems. Budget them for $1800-2000, get at least 512 MB of memory per, small/wimpy IDE hard disk, gigabit ethernet card. Tower cases are cheaper and 4 nodes don't need a rackmount. No CD drives. A floppy is ok, a cheap video card is ok although likely to be onboard on the motherboard along with a possibly useful 100BT interface. 1 dual processor P4 or Athlon with a gig card, a SCSI interface, and 3-4 SCSI disks set up in a RAID, in a server (supertower) case. If data preservation is very importanty to you and you can afford it, add a tape or CD-RW to back it up. If this "head node" is to connect to an external network, buy it an extra 100BT interface. Get it some bric-a-brac, as well -- a CD RW, a nice sound card (if one isn't onboard), some decent speakers -- this is where one will "work". A bit of extra memory (relative to the nodes) wouldn't hurt as well. 1 small gigabit ethernet switch. Netgear has a cheap one. So do other vendors. I leave it to your shopping process to determine the number of ports -- at least 4, of course, but you might want to try for 8, or 16, if you think your cluster might grow later. You may want a cheap 100BT switch as well (or extra ports for the 100BT interfaces) if you'd like to preserve the gig network for IPC computations only. 1 four port KVM switch. Don't go cheap -- good cables, maybe a Belkin switch. 
This should cost you $200+ (including cables) not $100-. The cheap serial/switch ones suck, and cheap cables will distort video. 1 monitor as large and nice as you wish. If you can afford it, I'd go for e.g a NEC 17" flatpanel that does 1280x1024. Oh, and a nice mouse and keyboard too. 1 heavy duty shelf unit. See pictures on http://www.phy.duke.edu/brahma for a nice one I got at Home Depot for $60 or so -- you only need a half of one for four nodes, but your cluster might grow... Miscellaneous cables, UPS/surge protectors, some nifty LEDs and glowing lights to make people think it is a really powerful computer;-) A name, and a nice logo. Never underestimate the importance of marketing...:-) I make it (3x$1800=$5400) + (1x$3000) + $600 + $400 = $9400, plus several hundred for the miscellaneous -- cables, shelf, KVM, UPS, and anything I might have forgotten. At least you have something to structure a price search around while shopping. Note that you won't get bleeding edge systems at these prices. I'd guess 2.4 GHz P4 Xeons or 2000+ Athlons with 1 GB of DDR, maybe a bit better. rgb > Joseph Bassett > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Mon Apr 7 08:56:05 2003 From: eugen at leitl.org (Eugen Leitl) Date: Mon, 7 Apr 2003 14:56:05 +0200 Subject: renting time on a cluster Message-ID: <20030407125604.GR2067@leitl.org> A friend of mine has a project requiring a lot of crunch (no idea yet which bandwidth/latency requirements). Can you think of places where one can rent nontrivial amount of crunch for money? TIA, Eugene -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From stan at temple.edu Mon Apr 7 11:27:55 2003 From: stan at temple.edu (Stan Horwitz) Date: Mon, 7 Apr 2003 11:27:55 -0400 (EDT) Subject: Question about linking Beowulf nodes Message-ID: Hello all; Sorry if this is a FAQ. I have been assigned the job of budgeting for a six-node Beowulf cluster. I have no experience in this area, yet. We would like to use PCs with AMD processors in them and have disk storage reside on a Compaq Storageworks SAN. What I am not clear on is the best hardware solution to link up the six nodes and the appropriate type of network cards for the individual PCs that will form the cluster. The purpose of this cluster will be to run computational jobs such as SAS, Gausian, SPSS, IMSL, and various and sundry FORTRAN and C programs that our faculty and graduate require for their research projects. Not surprisingly, we also want to keep implementation and maintenance costs as low as possible. I have looked through the faq on the beowfulf.org web site, but I did not come across any specific hardware recommendations. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From natorro at fisica.unam.mx Mon Apr 7 12:33:30 2003 From: natorro at fisica.unam.mx (Carlos Ernesto Lopez Nataren) Date: 07 Apr 2003 11:33:30 -0500 Subject: Mac OS X or Linux? Message-ID: <1049733210.7632.3.camel@linux> Hi!, we recently acquired 6 Xserve nodes at my institute, and we are planning to setup a beowulf cluster, can anyone tell if it is worthy to set it up with the OS it brings??? (Mac OS X server jaguar) or if it is better to try to use linux on these beauties??? Thanks a lot in advance for any help -- Carlos Ernesto Lopez Nataren IFISICA _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rocky at atipa.com Mon Apr 7 10:54:24 2003 From: rocky at atipa.com (Rocky McGaugh) Date: Mon, 7 Apr 2003 09:54:24 -0500 (CDT) Subject: Specific Question about Single vs. Dual Processor System In-Reply-To: <002501c2fcb6$7e369930$6401a8c0@chihking> Message-ID: On Sun, 6 Apr 2003, Chih King wrote: > Hello. I am a member of the University of Iowa Student Supercomputing > Project (UISSP), and we are planning for the purchase of our first cluster. > Currently we are divided between a sixteen node single-processor Pentium 4 > system and a seven node dual-processor Xeon system. Here are the brief > specification of both machines: > > 16 Pentium 4 single-processor system (total cost $7,407): > > Intel Pentium 4 2.4GHz 533FSB 512KB > ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN > 512MB PC2700 DDR333 > Maxtor 20GB Ultra100 Hard Drive > ATI Rage Mobility VGA Card 8MB AGP > CG 6039L 350W USB Midtower Case > Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) > > 7 Xeon dual-processor system (total cost $8,400): > > INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 > TYAN S2723GNN E7501 GLAN MOTHERBOARD > PC2100 256MB ECC/REG DDR x2 > Maxtor 20GB Ultra100 Hard Drive > Chenbro Beige Server Case > NMB 460W Xeon Power Supply > MITSUMI 54X CD-ROM Drive > Given the above options, i'd go with the dual Xeons. Memory bandwidth is greater on the 2723 due to its dual-DDR setup. It is also a server-class motherboard. The i7501 chipset was designed for server use and works well for clusters. The Asus board has a SiS chipset. It is fairly safe to assume that this board was optimized for AGP speed which wont matter much to you. I think you'll find much better reliability with the servers. -- Rocky McGaugh Atipa Technologies rocky at atipatechnologies.com rmcgaugh at atipa.com 1-785-841-9513 x3110 http://1087800222/ perl -e 'print unpack(u, ".=W=W+F%T:7\!A+F-O;0H`");' _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From robl at mcs.anl.gov Mon Apr 7 13:57:48 2003 From: robl at mcs.anl.gov (Robert Latham) Date: Mon, 7 Apr 2003 12:57:48 -0500 Subject: Mac OS X or Linux? 
In-Reply-To: <1049733210.7632.3.camel@linux> References: <1049733210.7632.3.camel@linux> Message-ID: <20030407175748.GB20765@mcs.anl.gov> On Mon, Apr 07, 2003 at 11:33:30AM -0500, Carlos Ernesto Lopez Nataren wrote: > Hi!, we recently acquired 6 Xserve nodes at my institute, and we are > planning to setup a beowulf cluster, can anyone tell if it is worthy to > set it up with the OS it brings??? (Mac OS X server jaguar) or if it is > better to try to use linux on these beauties??? please, if you have the time and resources, make them dual-boot (granted, with six, it will mildy annoying to switch operating systems) and tell us how well your applications run under an os x cluster versus under a powerpc linux cluster. You'll spark a massive flame war, but there is a dearth of real data showing how good or bad mac os X is in a cluster environment (where the benefits of "good user interface" and "i can watch quicktime trailers" aren't important) compared to linux on the same hardware. please note that you'll need a quite recent linux kernel to support the xserve hardware I can show you lmbench numbers that show linux outperforming os x in *operating system specific* tasks, but real applications carry more weight than microbenchmarks. ==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 7 12:53:37 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 7 Apr 2003 12:53:37 -0400 (EDT) Subject: Specific Question about Single vs. Dual Processor System In-Reply-To: Message-ID: > > 16 Pentium 4 single-processor system (total cost $7,407): pretty cheap! > > Intel Pentium 4 2.4GHz 533FSB 512KB > > ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN > > According to Asus this MB has an on-board 10/100 (realtek) interface > is there a 10/100/1000 option for this board. > Or are you adding a GigE NIC? recent SiS chipsets also have a bit of a smell about them, Linux-wise. Alan Cox says that SiS hasn't been cooperative in producing docs that permit good linux support. I don't see anything really attractive about this board - most of the features are useless (agp, for instance, S/PDIF, 1394). it doesn't seem like SATA is happening fast enough to be a good motive, either. if you insist on P4's, I'd probably go with a 845PE or maybe GE. several vendors bundle such boards with gigabit. I have mixed info on whether the integrated video on the GE causes problems - I expect that if you're in text mode, it wouldn't steal enough dram bandwidth to notice. Intel chipsets are a bit of a conservative choice, but sometimes that's the right move (heck, Asus is fairly conservative). it's worth at least considering e7205 boards, since doubling the bandwidth does definitely help many compute codes (unlike most desktop apps.) finally, AMD remains a viable option, though mainly as a low-end approach. for instance, an ECS K7S5a-pro is incredibly cheap, has builtin 100bT, and is pretty snappy. for $500 computers, saving a hundred dollars on the motherboard, along with a hundred on the CPU can add up quickly. > > Maxtor 20GB Ultra100 Hard Drive are you sure you really want that? 
the U100 part is not important (since current disks peak at around 50 MB/s), but the size implies density, and means that the disk is ~2 generations old. consider getting a 30-40G disk just so you get current (60-80G/platter) mechanisms. > > ATI Rage Mobility VGA Card 8MB AGP or integrated video. > > Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) a good choice 5 years ago, but I'd probably consider something more interesting. if you're planning to just treat it as a management net, (which I don't understand the appeal of), then just go with realtek nics. if you're going to use it for anything interesting, try to get gigabit. > > 7 Xeon dual-processor system (total cost $8,400): > > > > INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 > > TYAN S2723GNN E7501 GLAN MOTHERBOARD > > PC2100 256MB ECC/REG DDR x2 that's not much ram (since ram is cheap). if you're interested in exploring the benefits of duals and/or double-wide DDR, perhaps you should wait for the next generation chipsets (springdale/etc). > > Maxtor 20GB Ultra100 Hard Drive > > Chenbro Beige Server Case > > NMB 460W Xeon Power Supply 350 is actually plenty for a dual 2.4. naturally, people will be more impressed with a cluster of non-beige cases ;) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ganeshnamboothiri at yahoo.com Mon Apr 7 13:28:18 2003 From: ganeshnamboothiri at yahoo.com (Ganesh Namboothiri) Date: Mon, 7 Apr 2003 10:28:18 -0700 (PDT) Subject: Matrix Multiplication Message-ID: <20030407172818.97982.qmail@web21504.mail.yahoo.com> Hello, I want to implement a parallel matrix multiplication algorithm and I don't know how to split the array and send it. Please help me to split the n x n matrix into pieces so that I can do parallel matrix multiplication. ganeshnamboothiri at yahoo.com --------------------------------- Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more -------------- next part -------------- An HTML attachment was scrubbed... URL: From deadline at plogic.com Mon Apr 7 14:31:34 2003 From: deadline at plogic.com (Douglas Eadline) Date: Mon, 7 Apr 2003 14:31:34 -0400 (EDT) Subject: Specific Question about Single vs. Dual Processor System In-Reply-To: Message-ID: On Mon, 7 Apr 2003, Rocky McGaugh wrote: > On Sun, 6 Apr 2003, Chih King wrote: > > > Hello. I am a member of the University of Iowa Student Supercomputing > > Project (UISSP), and we are planning for the purchase of our first cluster. > > Currently we are divided between a sixteen node single-processor Pentium 4 > > system and a seven node dual-processor Xeon system. Here are the brief > > specification of both machines: > > > > 16 Pentium 4 single-processor system (total cost $7,407): > > > > Intel Pentium 4 2.4GHz 533FSB 512KB > > ASUS P4S8X 533FSB SATA GB 8X MB w/ 10/100/1000 LAN > > 512MB PC2700 DDR333 > > Maxtor 20GB Ultra100 Hard Drive > > ATI Rage Mobility VGA Card 8MB AGP > > CG 6039L 350W USB Midtower Case > > Linksys LNE100TX 10/100 Network Adaptor (2nd LAN channel) > > > > 7 Xeon dual-processor system (total cost $8,400): > > > > INTEL XEON 2.4GHZ 533FSB PROCESSOR x2 > > TYAN S2723GNN E7501 GLAN MOTHERBOARD > > PC2100 256MB ECC/REG DDR x2 > > Maxtor 20GB Ultra100 Hard Drive > > Chenbro Beige Server Case > > NMB 460W Xeon Power Supply > > MITSUMI 54X CD-ROM Drive > > > > Given the above options, i'd go with the dual Xeons.
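On Ganesh's matrix multiplication question above: the simplest decomposition is to split A into blocks of consecutive rows, give every process a full copy of B, and let each process compute its own rows of C. Below is a minimal sketch in C with MPI; it assumes N is divisible by the number of processes, fills the matrices with test values, and is an illustration only (a serious code would call a tuned BLAS dgemm for the local product, or move to a 2-D decomposition such as Cannon's algorithm or SUMMA once the process count gets large).

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    #define N 512                /* matrix size; assumed divisible by the number of processes */

    int main(int argc, char **argv)
    {
        int rank, size, rows, i, j, k;
        double *A = NULL, *C = NULL, *Aloc, *Cloc, *B, s;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        rows = N / size;                      /* each process owns a block of rows of A and C */
        Aloc = malloc(rows * N * sizeof(double));
        Cloc = malloc(rows * N * sizeof(double));
        B    = malloc(N * N * sizeof(double));

        if (rank == 0) {                      /* root sets up the full A and B with test values */
            A = malloc(N * N * sizeof(double));
            C = malloc(N * N * sizeof(double));
            for (i = 0; i < N * N; i++) { A[i] = 1.0; B[i] = 2.0; }
        }

        /* split A row-wise across the processes; give everyone a full copy of B */
        MPI_Scatter(A, rows * N, MPI_DOUBLE, Aloc, rows * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        MPI_Bcast(B, N * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        for (i = 0; i < rows; i++)            /* local rows of C = Aloc * B */
            for (j = 0; j < N; j++) {
                s = 0.0;
                for (k = 0; k < N; k++)
                    s += Aloc[i * N + k] * B[k * N + j];
                Cloc[i * N + j] = s;
            }

        /* reassemble C on the root in the same row order */
        MPI_Gather(Cloc, rows * N, MPI_DOUBLE, C, rows * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("C[0][0] = %g (expected %g)\n", C[0], 2.0 * N);

        MPI_Finalize();
        return 0;
    }

Compile with mpicc and run with, e.g., mpirun -np 4; with the test values above every entry of C should come out as 2N.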
Memory bandwidth is > greater on the 2723 due to its dual-DDR setup. It is also a server-class > motherboard. The i7501 chipset was designed for server use and works well > for clusters. Of course it all depends on the application(s). Depending on the application mix, you may not realize the full potential DDR offers. (look at some of the numbers for SMP motherboards on cluster-rant.com) This is a very important question - single vs. dual. In addition to sharing the memory, they will also share the interconnect. I wonder if systems built from boards like the Tyan 2707 or the SM X5SSE would provide better price to performance for some applications than using dual MB's. I have some testing planned. Nice thing about the Tyan board is it has GigE on a PCI-X bus and a PCI-X slot if you need one. Doug > > The Asus board has a SiS chipset. It is fairly safe to assume that this > board was optimized for AGP speed which wont matter much to you. > > I think you'll find much better reliability with the servers. > > -- ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Mon Apr 7 14:33:39 2003 From: landman at scalableinformatics.com (Joseph Landman) Date: 07 Apr 2003 14:33:39 -0400 Subject: Mac OS X or Linux? In-Reply-To: <20030407175748.GB20765@mcs.anl.gov> References: <1049733210.7632.3.camel@linux> <20030407175748.GB20765@mcs.anl.gov> Message-ID: <1049740419.6940.43.camel@protein.scalableinformatics.com> On Mon, 2003-04-07 at 13:57, Robert Latham wrote: > You'll spark a massive flame war, but there is a dearth of real data > showing how good or bad mac os X is in a cluster environment (where > the benefits of "good user interface" and "i can watch quicktime > trailers" aren't important) compared to linux on the same hardware. Happens with everything though... > please note that you'll need a quite recent linux kernel to support > the xserve hardware > > I can show you lmbench numbers that show linux outperforming os x in > *operating system specific* tasks, but real applications carry more > weight than microbenchmarks. Numbers I have seen for bioinfo apps seem to indicate that the hardware is faster when code is redone for the built in vector registers (gcc compiler doesn't automatically do this). Then again, this is comparing non-SIMD to SIMD, and I would expect that the SIMD could be faster at specific code patterns/fragments. Single CPU (non-SIMD) to single CPU (non-SIMD) the performance comparison seems not to favor the current Apple PPC hardware against current IA32 machines. I am curious about other apps as well. Please summarize the lmbench microbenchmarks. I would be curious about heavy FP/memory codes. I would think that the PPC would have some interesting performance there. 
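Since memory bandwidth keeps coming up in this thread (single vs. dual channel DDR, PPC vs. IA32), here is a tiny STREAM-style triad loop that is easy to carry from machine to machine. It is not the official STREAM benchmark, just a stand-in with a hand-picked array size; compile with full optimization and only compare numbers between machines run the same way.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>

    #define N      4000000       /* 4M doubles per array (~32 MB each): enough to defeat the caches */
    #define NTRIES 10

    static double now(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + 1e-6 * tv.tv_usec;
    }

    int main(void)
    {
        double *a = malloc(N * sizeof(double));
        double *b = malloc(N * sizeof(double));
        double *c = malloc(N * sizeof(double));
        double t0, t1, best = 1e30;
        long i;
        int t;

        for (i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; c[i] = 0.5; }

        for (t = 0; t < NTRIES; t++) {        /* keep the fastest of several passes */
            t0 = now();
            for (i = 0; i < N; i++)
                a[i] = b[i] + 3.0 * c[i];     /* "triad": two loads and one store per element */
            t1 = now();
            if (t1 - t0 < best)
                best = t1 - t0;
        }

        /* count 3 x 8 bytes of traffic per element, as STREAM does */
        printf("triad: %.1f MB/s  (a[0] = %g)\n", 3.0 * 8.0 * N / best / 1e6, a[0]);
        return 0;
    }

The difference between, say, a dual-channel E7501 board and a single-channel 845PE should show up directly in this number.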
> > ==rob -- Joseph Landman, Ph.D Scalable Informatics LLC email: landman at scalableinformatics.com web: http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Mon Apr 7 15:46:11 2003 From: eugen at leitl.org (Eugen Leitl) Date: Mon, 7 Apr 2003 21:46:11 +0200 Subject: thanks [was renting time on a clustger] Message-ID: <20030407194611.GC3245@leitl.org> Thanks for all the helpful responses, both on-list and off-list. I've passed on the information to the party in question. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From rgb at phy.duke.edu Mon Apr 7 17:21:07 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 7 Apr 2003 17:21:07 -0400 (EDT) Subject: Mac OS X or Linux? In-Reply-To: <20030407175748.GB20765@mcs.anl.gov> Message-ID: On Mon, 7 Apr 2003, Robert Latham wrote: > On Mon, Apr 07, 2003 at 11:33:30AM -0500, Carlos Ernesto Lopez Nataren wrote: > > Hi!, we recently acquired 6 Xserve nodes at my institute, and we are > > planning to setup a beowulf cluster, can anyone tell if it is worthy to > > set it up with the OS it brings??? (Mac OS X server jaguar) or if it is > > better to try to use linux on these beauties??? > > I can show you lmbench numbers that show linux outperforming os x in > *operating system specific* tasks, but real applications carry more > weight than microbenchmarks. Hola, Carlos! Como Estas? Say hello to Carmela and Jaime (if they are still there). My only modification to this is that I'd recommend looking at the non-hardware costs a bit to determine if messing with either solution is worth it. Remember, it costs time and money to set things up and run them, and this differential cost is very sensitive to things like the scalability of what you build. For example, with linux on intel or amd, you can fully automate installation and upgrade for an entire cluster so that it takes only a tiny bit of time per node per year to run the thing. All software is prebuilt ready to run, basically for free. With linux on mac, or mac os on macs, are you going to have anything like this level of scaling? No, because there will be lots of things you have to build (and possibly port) for linux on mac, and because the mac os on mac was built for a PC environment and I doubt that it scales like rpm distros or debian. As in, hardware can be "free" and not be worth it, if it costs you a huge amount of human time to make everything work. The further you get from any of the "standard beowulf models" (whatever they might be at any point in time) the more of YOUR time you're going to put in screwing around getting things to work. If your time is cheap and the benefit of eventual success is great, this is no problem. If your time is costly and the benefit is at best "ok" if everything works when your done, you might better consider ways of setting up 'wulfs closer to the standard approaches. Of course, you may have cheap labor in the form of graduate students...but it is still something to think about.;-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ron_chen_123 at yahoo.com Mon Apr 7 22:53:28 2003 From: ron_chen_123 at yahoo.com (Ron Chen) Date: Mon, 7 Apr 2003 19:53:28 -0700 (PDT) Subject: Fwd: [PBS-USERS] cost for educational sites Message-ID: <20030408025328.7453.qmail@web41310.mail.yahoo.com> Looks like PBSPro is *not* free for educational sites anymore. When PBS was owned by Veridian, OpenPBS was quite broken, as not all PBSPro fixes went into OpenPBS. I guess more and more sites will switch to GridEngine. -Ron --- Jenn Sturm wrote: > I see from PBSPro's new website that educational > sites need to submit a > grant application in order to purchase PBSPro at > reduced costs, where > previously it was free to educational sites. Has > anyone submitted this > yet and found out what the price actually is? I'm > building a new > machine right now and am surprised to find this out > (didn't exactly > expect to have to complete a grant application in > order to build this > new machine...) and now have to move back to > OpenPBS, but I'm curious, > still... > > Thanks, > > Jenn Sturm > > > +-------------------------------------------------------------------+ > Jennifer Sturm > System Administrator and Research Support Specialist > Chemistry Department > Hamilton College > > jsturm at hamilton.edu > help at mercury.chem.hamilton.edu > 315-859-4745 > > http://www.chem.hamilton.edu/ > http://mars.chem.hamilton.edu/ > +-------------------------------------------------------------------+ > > __________________________________________________________________________ > To unsubscribe: email majordomo at OpenPBS.org with > body "unsubscribe pbs-users" > For message archives: > http://www.OpenPBS.org/UserArea/pbs-users.html > - - - - - - - - - - > - - - - > OpenPBS and the pbs-users mailing list are sponsored > by Altair. > __________________________________________________________________________ __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jeff at aslab.com Mon Apr 7 20:22:04 2003 From: jeff at aslab.com (Jeff Nguyen) Date: Mon, 7 Apr 2003 17:22:04 -0700 Subject: Specific Question about Single vs. Dual Processor System References: Message-ID: <0b6a01c2fd64$e2155140$6502a8c0@jeff> > I don't see anything really attractive about this board - > most of the features are useless (agp, for instance, S/PDIF, 1394). > it doesn't seem like SATA is happening fast enough to be a good > motive, either. > > if you insist on P4's, I'd probably go with a 845PE or maybe GE. > several vendors bundle such boards with gigabit. I have mixed info > on whether the integrated video on the GE causes problems - I expect > that if you're in text mode, it wouldn't steal enough dram bandwidth > to notice. Intel chipsets are a bit of a conservative choice, > but sometimes that's the right move (heck, Asus is fairly conservative). > > it's worth at least considering e7205 boards, since doubling the bandwidth > does definitely help many compute codes (unlike most desktop apps.) 
> I would rather wait for 865P (Springsdale) or 875P (Canterwood) instead of going for E7205. These new platforms will out really soon. :) They will offer higher front side bus (800mhz) and faster memory bus (400mhz) at the same cost as the existing E7205 machines. Jeff ASL Inc. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ron_chen_123 at yahoo.com Tue Apr 8 02:02:36 2003 From: ron_chen_123 at yahoo.com (Ron Chen) Date: Mon, 7 Apr 2003 23:02:36 -0700 (PDT) Subject: Fwd: [PBS-USERS] cost for educational sites In-Reply-To: Message-ID: <20030408060236.98444.qmail@web41303.mail.yahoo.com> First, both SGE and SGEEE are free and opensource. They have better features, and far better fault tolerance. If PBSPro is starting to cost $$ even for educational sites, why not do the switch now? Second, by "I guess", I am talking about the trend, if you follow the discussions on beowulf lately, you will find that there really are a lot of people switching from PBS to SGE. Lastly, I don't work for Sun, and besides, Sun is not making a dollar even if the whole world is using SGE. (also note that Sun is making SGE free not only for Solaris, but for other platforms -- AIX, HP, Alpha, Mac...) Next time if I suggest people to use Linux instead of Windows, I hope people don't ask me whether I work for Linus or Redhat :-) -Ron --- Mark Hahn wrote: > > I guess more and more sites will switch to > GridEngine. > > do you work for Sun? > __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From robl at mcs.anl.gov Wed Apr 9 02:19:21 2003 From: robl at mcs.anl.gov (Robert Latham) Date: Wed, 9 Apr 2003 01:19:21 -0500 Subject: Mac OS X or Linux? In-Reply-To: <1049740419.6940.43.camel@protein.scalableinformatics.com> References: <1049733210.7632.3.camel@linux> <20030407175748.GB20765@mcs.anl.gov> <1049740419.6940.43.camel@protein.scalableinformatics.com> Message-ID: <20030409061920.GA32255@mcs.anl.gov> On Mon, Apr 07, 2003 at 02:33:39PM -0400, Joseph Landman wrote: > I am curious about other apps as well. Please summarize the lmbench > microbenchmarks. I would be curious about heavy FP/memory codes. I > would think that the PPC would have some interesting performance there. 
Usual caveats about benchmarks and misleading numbers apply, but this is as fair as i can make it: same hardware, same benchmark, different operating systems: new: http://terizla.org/~robl/pbook/benchmarks/lmbench-linux_vs_osx.1 old (but same results): http://drmirage.clustermonkey.org/~laz/pbook/lmbench.powerbook.txt ==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Tue Apr 8 13:37:07 2003 From: eugen at leitl.org (Eugen Leitl) Date: Tue, 8 Apr 2003 19:37:07 +0200 Subject: [baillot@ait.nrl.navy.mil: [ARFORUM] Fwd: Cluster job opportunity] Message-ID: <20030408173707.GK13969@leitl.org> ----- Forwarded message from yohan baillot ----- From: yohan baillot Date: Tue, 08 Apr 2003 13:00:01 -0400 To: ARforum Subject: [ARFORUM] Fwd: Cluster job opportunity Reply-To: arforum at topica.com X-Mailer: QUALCOMM Windows Eudora Version 5.1 FYI Yohan >From: Jonathan Gratch >To: >Subject: Cluster job opportunity >Date: Tue, 8 Apr 2003 07:43:36 -0700 >X-Mailer: Internet Mail Service (5.5.2653.19) > >Hi, > >I saw your recent paper at the VR2003 confrence. >Our research institute is hoping to hire someone to lead >a R&D effort to develop a cluster system for VR applications. I don't >know if this community has a mailing list where it might be more >appropriate to post job openings so I've contacted you directly in the >hope that you might let me know if there is a mailing list or if there >might be a place at your institute that you could post the following >job opening. > >Thanks in advance, > >jon gratch >______________________________________________ >Jonathan Gratch | www.ict.usc.edu/~gratch >Project Leader, Research Assistant Professor | Phone: (310) 448-0306 >USC Institute for Creative Technologies | Fax: (310) 574-5725 >13274 Fiji Way, Suite 600 | E-mail: gratch at ict.usc.edu >Marina del Rey, CA 90292 | > > > > >Job Posting for Cluster Project Leader (Req# 14490) > >The University of Southern California's Institute for Creative >Technologies is involved in fundamental research on advancing the >state of virtual reality training systems through a combination of >advanced graphics, audio, artificial intelligence and Hollywood >production techniques. We are currently seeking a senior programmer >with project management experience to coordinate the research and >development of a distributed rendering and animation engine that will >serve as the backbone of the next generation of ICT training >simulators. The goal of this multi-year project is to create a >flexible architecture, using commercial off-the-shelf software where >possible, that will support the real-time graphics, audio, >animation and simulation requirements of multiple ICT research >efforts. A key aspect of the project is to support distributed >rendering of real-time graphics on a cluster of PC computers. > >The applicant can expect to spend half of their time performing >management duties and half programming. Management duties include >working with research project leaders to refine system requirements, >defining tasks and priorities, creating milestones and managing a >small team of developers, contacting vendors and attending conferences >to stay informed of developments in the area. 
Programming duties >include developing new software and evaluating and integrating >commercial solutions. > >The ideal applicant will have: >* project management experience >* familiarity with virtual reality systems (military simulations, > computer games) >* expertise in computer graphics (specifically, Performer and OpenGL), >* familiarity with graphics clusters and supporting software > (Renderizer, ClusterJuggler, Chromium) >* expertise with C++, UNIX/LINUX/IRIX and Windows operating systems >* familiarity with high-speed network solutions (Myrinet, Gigabit > ethernet) >* familiarity with commercial content production tools (Maya, 3D > Sudio, Diva) > >Interested applicants should apply to job Requisition number 14490 >at http://www.usc.edu/bus-affairs/ers/search.html. Yohan BAILLOT Virtual Reality Laboratory, Advanced Information Technology (Code 5580), Naval Research Laboratory, 4555 Overlook Avenue SW, Washington, DC 20375-5337 Email : baillot at ait.nrl.navy.mil Work : (202) 404 7801 Home : (202) 518 3960 Cell : (703) 732 5679 Fax : (202) 767 1122 Web : http://ait.nrl.navy.mil/vrlab/projects/BARS/BARS.html ==^================================================================ This email was sent to: eugen at leitl.org EASY UNSUBSCRIBE click here: http://topica.com/u/?a84Ao5.bb5321.ZXVnZW5A Or send an email to: arforum-unsubscribe at topica.com TOPICA - Start your own email discussion group. FREE! http://www.topica.com/partner/tag02/create/index2.html ==^================================================================ ----- End forwarded message ----- -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From landman at scalableinformatics.com Tue Apr 8 17:19:59 2003 From: landman at scalableinformatics.com (Joseph Landman) Date: 08 Apr 2003 17:19:59 -0400 Subject: Mac OS X or Linux? In-Reply-To: <200304081541.36834.exa@kablonet.com.tr> References: <1049733210.7632.3.camel@linux> <20030407175748.GB20765@mcs.anl.gov> <1049740419.6940.43.camel@protein.scalableinformatics.com> <200304081541.36834.exa@kablonet.com.tr> Message-ID: <1049836799.16158.4.camel@protein.scalableinformatics.com> On Tue, 2003-04-08 at 08:41, Eray Ozkural wrote: > I wonder if we can really classify those vector operations as SIMD which means > Single Instruction Multiple Data architecture. (no this isn't a troll!) I would word that the other way. They look like SIMD to me, and not "vector" in the Cray-ish model. I could be wrong on this, but I didn't see long "vectors", rather large (128 bit) data types upon which you can apply simultaneous operations. > > Thanks, -- Joseph Landman, Ph.D Scalable Informatics LLC email: landman at scalableinformatics.com web: http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Tue Apr 8 08:41:36 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Tue, 8 Apr 2003 15:41:36 +0300 Subject: Mac OS X or Linux? 
In-Reply-To: <1049740419.6940.43.camel@protein.scalableinformatics.com> References: <1049733210.7632.3.camel@linux> <20030407175748.GB20765@mcs.anl.gov> <1049740419.6940.43.camel@protein.scalableinformatics.com> Message-ID: <200304081541.36834.exa@kablonet.com.tr> On Monday 07 April 2003 21:33, Joseph Landman wrote: > Numbers I have seen for bioinfo apps seem to indicate that the hardware > is faster when code is redone for the built in vector registers (gcc > compiler doesn't automatically do this). Then again, this is comparing > non-SIMD to SIMD, and I would expect that the SIMD could be faster at > specific code patterns/fragments. Single CPU (non-SIMD) to single CPU > (non-SIMD) the performance comparison seems not to favor the current > Apple PPC hardware against current IA32 machines. > I wonder if we can really classify those vector operations as SIMD which means Single Instruction Multiple Data architecture. (no this isn't a troll!) Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From whately at lcp.coppe.ufrj.br Wed Apr 9 16:11:10 2003 From: whately at lcp.coppe.ufrj.br (Lauro L. A. Whately) Date: Wed, 09 Apr 2003 17:11:10 -0300 Subject: setting PXE on a different net interface Message-ID: <3E947E5E.1020601@lcp.coppe.ufrj.br> Hi, I would like to make the machines in the cluster boot from a remote server. The mainboard of the nodes has a giga-ethernet interface on-board. Also, each node has a pci fast-ethernet interface that I want to use for administration and services (nfs, nis, monitoring, ...). The only configuration I find in the bios for the PXE is booting from the on-board interface. Does anyone know I way to reconfigure the (Intel) boot agent bypass the onboard interface and boot from the pci interface ? TIA, Lauro Whately. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Apr 9 17:04:52 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 9 Apr 2003 17:04:52 -0400 (EDT) Subject: Mac OS X or Linux? In-Reply-To: <20030409061920.GA32255@mcs.anl.gov> Message-ID: > http://terizla.org/~robl/pbook/benchmarks/lmbench-linux_vs_osx.1 yow! I think it's fair to say that Apple has some work to do. I suppose it's also possible that the OS is tuned for models (such as desktop ones, perhaps with different cpu/cache/dram configs.) does OS X have page coloring inherited from *BSD? perhaps that explains the only place it comes out ahead (memory bandwidth/latency). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Wed Apr 9 17:45:52 2003 From: becker at scyld.com (Donald Becker) Date: Wed, 9 Apr 2003 17:45:52 -0400 (EDT) Subject: setting PXE on a different net interface In-Reply-To: <3E947E5E.1020601@lcp.coppe.ufrj.br> Message-ID: On Wed, 9 Apr 2003, Lauro L. A. 
Whately wrote: > I would like to make the machines in the cluster boot from a remote > server. The mainboard of the nodes has a giga-ethernet interface > on-board. Also, each node has a pci fast-ethernet interface that I want > to use for administration and services (nfs, nis, monitoring, ...). > The only configuration I find in the bios for the PXE is booting from > the on-board interface. This one is pretty easy to answer: the on-board interface is the only interface that the BIOS knows how to use. > Does anyone know I way to reconfigure the (Intel) boot agent bypass the > onboard interface and boot from the pci interface ? The only way your add-on PCI NIC can support PXE boot is if it has its own boot agent code. Note that most PXE clients out there use the Intel framework as the basis of their PXE boot. If the add-on card can do PXE boot, it will have a duplicate copy of the boot agent code. It might have the same message as the on-board NIC, but it's different boot step. If you have multiple on-board NICs, the Intel boot agent will attempt to PXE boot sequentially, rather than send requests out on all interfaces simultaneously and then using the best response. (If you read the PXE specs you would expect the latter behavior.) -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 410 Severn Ave. Suite 210 Scyld Beowulf cluster system Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Wolfgang.Dobler at kis.uni-freiburg.de Thu Apr 10 04:47:04 2003 From: Wolfgang.Dobler at kis.uni-freiburg.de (Wolfgang Dobler) Date: Thu, 10 Apr 2003 10:47:04 +0200 Subject: Scaling of hydro codes Message-ID: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> We have a 3-d finite-difference hydro code and find that the time per time step and grid point scales almost linearly, t_step ~ Ncpu^(-1) , on an Origin3000 from 1 up to 64 CPUs. On our Linux cluster (Gbit ethernet, 8x2 CPUs) however, we get a scaling that is well represented by t_step ~ Ncpu^(-0.75) . More or less the same scaling is obtained on another machine (100Mbit, 128 nodes), and also for another hydro code (parallelized using Cactus). Note that the number of grid points was adapted for these timings, so that the problem size per CPU is roughly constant. My question is: do others find the same type of scaling for hydro codes? If so, how can this be understood? I don't expect latency to play a role for these timings, as we are only communicating a reasonably low number of large arrays in every time step; I suppose, Cactus does the same. And if saturation of the switch played a role, I would expect a well-defined drop at some critical value of Ncpu, not a power law. 
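One way to make the two scalings comparable, assuming the plotted quantity is wall-clock time per step divided by the total number of grid points, and that the points per CPU are held fixed as described:

\[
\tau(P) = \frac{T_{\mathrm{step}}(P)}{N_{\mathrm{grid}}(P)}, \qquad N_{\mathrm{grid}} \propto P
\quad\Longrightarrow\quad \tau \propto P^{-1} \ \text{exactly when } T_{\mathrm{step}} \text{ is constant,}
\]
\[
\tau \propto P^{-0.75}
\quad\Longrightarrow\quad T_{\mathrm{step}} \propto P^{0.25},
\qquad T_{\mathrm{step}} = t_{\mathrm{comp}} + t_{\mathrm{comm}}(P), \quad t_{\mathrm{comp}} \approx \text{const.}
\]

On this reading the Origin result is essentially ideal weak scaling, while on the clusters the communication term grows slowly but steadily, roughly as the fourth root of Ncpu. A smoothly growing term like that looks more like rising contention or a per-step collective (a global reduction for the time step, say, if the code does one) than like outright switch saturation, which would indeed appear as a knee rather than a power law; a profiler would separate the two.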
W o l f g a n g -- ------------------------------------------------------------------------- | Wolfgang Dobler Phone: ++49/(0)761/3198-224 | | Kiepenheuer Institute for Solar Physics Fax: ++49/(0)761/3198-111 | | Sch?neckstra?e 6 | | D-79104 Freiburg E-Mail: Dobler at kis.uni-freiburg.de | | Germany http://www.kis.uni-freiburg.de/~dobler/ | ------------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mail_for_anand at yahoo.com Thu Apr 10 01:12:48 2003 From: mail_for_anand at yahoo.com (anand bagchi) Date: Wed, 9 Apr 2003 22:12:48 -0700 (PDT) Subject: help!!!!!!!!!-suggest a parallel program to be run on a beowulf cluster Message-ID: <20030410051248.81786.qmail@web21508.mail.yahoo.com> hi all , i am working on a beowulf cluster as a part of my undergraduate training and need to run a program on it . Could anybody suggest a parallel program that can be implemented or suggest a book or a website from where i can get some help . anand(India) __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Thu Apr 10 08:54:43 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Thu, 10 Apr 2003 15:54:43 +0300 Subject: Scaling of hydro codes In-Reply-To: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> References: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> Message-ID: <200304101554.43012.exa@kablonet.com.tr> On Thursday 10 April 2003 11:47, Wolfgang Dobler wrote: > My question is: do others find the same type of scaling for hydro codes? > If so, how can this be understood? Those are quite different architectures, that's why. Same parallel algorithm will show different performance on such different architectures. Your beowulf is a cluster of SMP nodes, does your algorithm take that into account? I think it probably doesn't. What exactly is the topology and architecture of the network on Origin3000? How fast are the nodes (cpu/mem bandwidth), and how much memory does it have? Same goes for the beowulf cluster. By scaling I take it that you increase problem size as well as number of processors. If you don't increase problem size it's called speedup. A scalability plot together with a speedup plot can say more about your problem. Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 10 11:54:57 2003 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Thu, 10 Apr 2003 11:54:57 -0400 (EDT) Subject: help!!!!!!!!!-suggest a parallel program to be run on a beowulf cluster In-Reply-To: <20030410051248.81786.qmail@web21508.mail.yahoo.com> Message-ID: On Wed, 9 Apr 2003, anand bagchi wrote: > hi all , > i am working on a beowulf cluster as a > part of my undergraduate training and need to run a > program on it . Could anybody suggest a parallel > program that can be implemented or suggest a book or a > website from where i can get some help . The two most common demo programs to my experience are pvmpov (povray parallelized on PVM) and one of several parallelized Mandelbrot set generators, under either PVM or MPI. There are also mini-demo's and example programs in the distributions themselves or on their primary web homes, although they tend to be less graphical. The nice thing about either of these is that parallel speedup is clearly evident on almost any network, and that the speedup is beautifully (literally) rendered on the screen. One can rubberband one's way down into the visually stunning mandelbrot set and "see" individual patches of the new image being returned by the nodes, ditto for the rendering of pvmpov's standard pitcher/picture. If you use e.g. xpvm to add or remove nodes to your cluster, you can watch the computation speed up or slow down. Beyond these, there are of course many other resources you can use to write demos of your own or adopt demos from code in books. There are nice books on both PVM and MPI from e.g. MIT press that you can probably order via Amazon from anywhere in the world. A perusal of the mpich and pvm websites will turn up lots of useful things. I'm sure others on the list will return other specific reference programs and resources, for example parallelized linpack computations and the like, that are sometimes used to "benchmark" a cluster. rgb > > anand(India) > > __________________________________________________ > Do you Yahoo!? > Yahoo! Tax Center - File online, calculators, forms, and more > http://tax.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Thu Apr 10 12:41:47 2003 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Thu, 10 Apr 2003 18:41:47 +0200 Subject: Scaling of hydro codes In-Reply-To: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> References: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> Message-ID: <200304101841.47669.joachim@ccrl-nece.de> Wolfgang Dobler: > I don't expect latency to play a role for these timings, as we are only > communicating a reasonably low number of large arrays in every time step; > I suppose, Cactus does the same. > And if saturation of the switch played a role, I would expect a > well-defined drop at some critical value of Ncpu, not a power law. I'd say that it's just your network which is to slow (Gbit ethernet is not necessarily fast!) in relation to the speed of the CPUs. 
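To make the Mandelbrot demo suggestion above concrete, here is a minimal static row-farm in C with MPI. It has none of the interactive rubber-banding of the real demos; each rank renders one horizontal band of a fixed view and rank 0 writes a greyscale PGM. The image height is assumed to divide evenly by the process count, and a nicer demo would hand rows out dynamically from a master so the slow rows near the set do not leave the other ranks idle.

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    #define W     800
    #define H     800            /* image height; assumed divisible by the number of processes */
    #define MAXIT 255

    static unsigned char escape(double cr, double ci)
    {
        double zr = 0.0, zi = 0.0, t;
        int it = 0;
        while (zr * zr + zi * zi < 4.0 && it < MAXIT) {
            t  = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            it++;
        }
        return (unsigned char)it;
    }

    int main(int argc, char **argv)
    {
        int rank, size, rows, x, y, gy;
        double cr, ci;
        unsigned char *band, *image = NULL;
        FILE *f;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        rows = H / size;                          /* each rank renders one horizontal band */
        band = malloc((size_t)rows * W);

        for (y = 0; y < rows; y++) {
            gy = rank * rows + y;                 /* global row index of this local row */
            for (x = 0; x < W; x++) {
                cr = -2.0 + 3.0 * x  / W;         /* map pixel to the complex plane */
                ci = -1.5 + 3.0 * gy / H;
                band[y * W + x] = escape(cr, ci);
            }
        }

        if (rank == 0)
            image = malloc((size_t)W * H);
        MPI_Gather(band, rows * W, MPI_UNSIGNED_CHAR,
                   image, rows * W, MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);

        if (rank == 0) {                          /* write a plain greyscale PGM */
            f = fopen("mandel.pgm", "wb");
            fprintf(f, "P5\n%d %d\n255\n", W, H);
            fwrite(image, 1, (size_t)W * H, f);
            fclose(f);
        }

        MPI_Finalize();
        return 0;
    }

Something like "mpicc mandel.c -o mandel && mpirun -np 8 ./mandel" produces mandel.pgm, which any image viewer can display.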
Without knowing your code, I guess that with increasing Ncpu, the number of communication operations and the transported volume of data increases, too. This leads to increased communication time, while the time that each CPU needs to run through its timestep remains constant (as you adapted the problem size ~ Ncpu). But wait, if you keep the workload per CPU constant with increasing Ncpu, how comes that t_step scales with 1/Ncpu at all? Am I missing something here? Anyway, you should check if a faster network could help you (by verifying if the reason I suspected is valid). You might do this with MPE or Vampir (commercial tool from Pallas, demo licenses available), or some other way of profiling. Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sgaudet at wildopensource.com Thu Apr 10 13:04:04 2003 From: sgaudet at wildopensource.com (Stephen Gaudet) Date: Thu, 10 Apr 2003 13:04:04 -0400 Subject: Itanium gets supercomputing software Message-ID: http://msnbc-cnet.com.com/2100-1012-996357.html?type=pt&part=msnbc&tag=alert &form=feed&subj=cnetnews Stephen Gaudet ..... <(???)> ---------------------- Wild Open Source Bedford, NH 03110 pH: 603-488-1599 cell: 603-498-1600 Home: 603-472-8040 http://www.wildopensource.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dsarvis at zcorum.com Thu Apr 10 12:32:49 2003 From: dsarvis at zcorum.com (Dennis Sarvis, II) Date: 10 Apr 2003 12:32:49 -0400 Subject: problem with load balancing Message-ID: <1049992368.16688.4.camel@skull.america.net> I tried implementing a 2 node cluster (both redhat, 1 a PII400 and 1 a Celeron550) with a cross-over cable I built. I tried implementing an open-mosix kernel and they talk to each other, I can manually migrate processes in x-windows, but they will not auto share. I also tried PVM but it just freezes up. I wanted to try mpi-ch but I need some guidance. I did 'try' to turn on RSH, but I may not have done it correctly. -- Alpharetta, GA Dennis Sarvis, II _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jakob at unthought.net Thu Apr 10 13:52:32 2003 From: jakob at unthought.net (Jakob Oestergaard) Date: Thu, 10 Apr 2003 19:52:32 +0200 Subject: renting time on a cluster In-Reply-To: <20030407125604.GR2067@leitl.org> References: <20030407125604.GR2067@leitl.org> Message-ID: <20030410175232.GB16320@unthought.net> On Mon, Apr 07, 2003 at 02:56:05PM +0200, Eugen Leitl wrote: > A friend of mine has a project requiring a lot > of crunch (no idea yet which bandwidth/latency > requirements). > > Can you think of places where one can rent nontrivial > amount of crunch for money? I know people who would be interested in providing such a service. As in, getting a cluster and start renting out time on it. (and no, I'm not affiliated with them, I just know them well :) So, while I don't have a real answer to your question, allow me to add yet another question: Is there interest in such a service? 
How many here would, or know people who might, rent time on a remote cluster ? I personally think that security concerns is the main showstopper here - you often cannot really do paid research on such a system, if the results are supposed to help getting patents etc. Larger organizations would rather buy their own cluster, than risk losing a patent to the competition. And for hobbyists? I guess most hobbyists can sneak in low-priority jobs at work :) So, is the original question a once-in-a-decade thing, or do people generally feel that there is interest in such a service? (haven't seen many of those requests on this list, AFAIR) Cheers, -- ................................................................ : jakob at unthought.net : And I see the elder races, : :.........................: putrid forms of man : : Jakob ?stergaard : See him rise and claim the earth, : : OZ9ABN : his downfall is at hand. : :.........................:............{Konkhra}...............: _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From craig.tierney at noaa.gov Thu Apr 10 11:42:10 2003 From: craig.tierney at noaa.gov (Craig Tierney) Date: Thu, 10 Apr 2003 09:42:10 -0600 Subject: Scaling of hydro codes In-Reply-To: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> References: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> Message-ID: <20030410154210.GA24214@hpti.com> On Thu, Apr 10, 2003 at 10:47:04AM +0200, Wolfgang Dobler wrote: > We have a 3-d finite-difference hydro code and find that the time per time > step and grid point scales almost linearly, > t_step ~ Ncpu^(-1) , > on an Origin3000 from 1 up to 64 CPUs. > > On our Linux cluster (Gbit ethernet, 8x2 CPUs) however, we get a scaling > that is well represented by > t_step ~ Ncpu^(-0.75) . > More or less the same scaling is obtained on another machine (100Mbit, 128 > nodes), and also for another hydro code (parallelized using Cactus). > Note that the number of grid points was adapted for these timings, so that > the problem size per CPU is roughly constant. Did determine this number scaling from 1 to 16 cpus, or from 2 to 16 cpus? You aren't going to get good scaling from 1 to 2 because lack of memory bandwidth (this is usually the case). Scale from 1 to 8 nodes (2 to 16 processors) to see how the code scales due to the interconnect. Craig > > My question is: do others find the same type of scaling for hydro codes? > If so, how can this be understood? > > I don't expect latency to play a role for these timings, as we are only > communicating a reasonably low number of large arrays in every time step; > I suppose, Cactus does the same. > And if saturation of the switch played a role, I would expect a > well-defined drop at some critical value of Ncpu, not a power law. 
> > > W o l f g a n g > > -- > > ------------------------------------------------------------------------- > | Wolfgang Dobler Phone: ++49/(0)761/3198-224 | > | Kiepenheuer Institute for Solar Physics Fax: ++49/(0)761/3198-111 | > | Sch?neckstra?e 6 | > | D-79104 Freiburg E-Mail: Dobler at kis.uni-freiburg.de | > | Germany http://www.kis.uni-freiburg.de/~dobler/ | > ------------------------------------------------------------------------- > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Craig Tierney (ctierney at hpti.com) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From keith.murphy at attglobal.net Thu Apr 10 14:23:23 2003 From: keith.murphy at attglobal.net (Keith Murphy) Date: Thu, 10 Apr 2003 11:23:23 -0700 Subject: renting time on a cluster References: <20030407125604.GR2067@leitl.org> <20030410175232.GB16320@unthought.net> Message-ID: <060401c2ff8e$45f4df70$02fea8c0@oemcomputer> There is a company in Florida Tsunamic Technologies, who already offers such a service. No, they are not a customer or even a friend http://www.tsunamictechnologies.com/ Regards Keith Murphy Dolphin Interconnect T: 818-597-2114 F: 818-597-2119 C: 818-292-5100 www.dolphinics.com www.scali.com ----- Original Message ----- From: "Jakob Oestergaard" To: "Eugen Leitl" Cc: Sent: Thursday, April 10, 2003 10:52 AM Subject: Re: renting time on a cluster > On Mon, Apr 07, 2003 at 02:56:05PM +0200, Eugen Leitl wrote: > > A friend of mine has a project requiring a lot > > of crunch (no idea yet which bandwidth/latency > > requirements). > > > > Can you think of places where one can rent nontrivial > > amount of crunch for money? > > I know people who would be interested in providing such a service. As > in, getting a cluster and start renting out time on it. > > (and no, I'm not affiliated with them, I just know them well :) > > So, while I don't have a real answer to your question, allow me to add > yet another question: Is there interest in such a service? > > How many here would, or know people who might, rent time on a remote > cluster ? > > I personally think that security concerns is the main showstopper here - > you often cannot really do paid research on such a system, if the > results are supposed to help getting patents etc. Larger organizations > would rather buy their own cluster, than risk losing a patent to the > competition. And for hobbyists? I guess most hobbyists can sneak in > low-priority jobs at work :) > > So, is the original question a once-in-a-decade thing, or do people > generally feel that there is interest in such a service? > > (haven't seen many of those requests on this list, AFAIR) > > Cheers, > > -- > ................................................................ > : jakob at unthought.net : And I see the elder races, : > :.........................: putrid forms of man : > : Jakob ?stergaard : See him rise and claim the earth, : > : OZ9ABN : his downfall is at hand. 
: > :.........................:............{Konkhra}...............: > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From iod00d at hp.com Thu Apr 10 14:26:09 2003 From: iod00d at hp.com (Grant Grundler) Date: Thu, 10 Apr 2003 11:26:09 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: References: Message-ID: <20030410182609.GF29125@cup.hp.com> On Thu, Apr 10, 2003 at 01:04:04PM -0400, Stephen Gaudet wrote: > > http://msnbc-cnet.com.com/2100-1012-996357.html?type=pt&part=msnbc&tag=alert > &form=feed&subj=cnetnews ... | That barrier has hindered adoption of Itanium in broad business markets, | but it's been less of a problem in the supercomputing niche, where | customers often control their own software instead of relying on products | such as Oracle's database or Computer Associates' management software. Gah! Both Oracle and Computer Associates have ia64-linux product available. grant _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 10 15:57:46 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 10 Apr 2003 15:57:46 -0400 (EDT) Subject: renting time on a cluster In-Reply-To: <20030410175232.GB16320@unthought.net> Message-ID: On Thu, 10 Apr 2003, Jakob Oestergaard wrote: > > So, is the original question a once-in-a-decade thing, or do people > generally feel that there is interest in such a service? > > (haven't seen many of those requests on this list, AFAIR) The problem is that there is a fairly narrow profile of problems for which such a service is optimal; in "most cases" the cost benefit of doing it yourself or in your existing IT organization are superior, as you have to pay any such service provider the real costs plus depreciation plus a profit; in your own organization some parts of these costs are low marginal cost rescalings of existing infrastructure or opportunity cost time paid out of a pool of low priority competing tasks or FTE surplus hours (i.e. free). The same problem exists, actually, for "centralized" shared compute resources at universities or supercomputer centers -- for these to be a cost win they generally need a pool of clients that is: a) Big enough to keep their cluster operating close to capacity all the time, since the only way to be the fixed costs of dead time is to amortize it over active time, raising rates and starting a deadly spiral of still fewer clients. b) With demand that can be spread out to keep the duty cycle high a la a). It does no good to have one cluster-year's worth of tasks for your cluster if all your clients insist on having their work done in the same three month time of the year -- you'll have to have a cluster 3x bigger (and idle 3/4 of the year) or lose 2/3 of your clients and STILL be idle 3/4 of the year. Oooo, hate to even do the math on that one. c) Poor enough in local computing resources that a locally purchased and administered cluster doesn't make more sense. 
d) Almost by definition, with a problem that needs only a short, intense burst of computation. People with longrunning problems tend NOT to use this sort of resource because they almost always are better off with their own cluster. It's people who need a 128 node cluster for a month who can't make do with a 16 node cluster for a year that will be your primary clients (along with a FEW of those local-resource poor groups -- this is an important client base of a shared resource in a University, for example). A good sized campus is likely to have enough of a mix where a centralized cluster can make sense, especially one that is "owned" by the primary groups that operate it who effectively subscribe most of its time with a clear understanding of how it is to be split up among long runners and on demanders. A commercial cluster is pretty tough. I think you'd need a bunch of long term "subscribers" there as well, contractually bound for periods on the order of a year, to keep risks sane and costs reasonably competitive with DIY. If you had some sort of auction/market model whereby you could resell idle time at or even below cost to keep from losing money actively while charging a lot more than cost to the on demand short term users (who would pay it as it is still cheaper than building their own) you might work out a stable and profitable business. You'd also do better reselling to businesses than to university or government researchers. We're notoriously cheap and like to DIY anyway. Small businesses especially often have significant infrastructure barriers that would make purchasing rented time desirable in at least the short run, IF you could identify the small businesses that need it... rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Thu Apr 10 19:58:08 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Fri, 11 Apr 2003 09:58:08 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <16022.194.793900.97453@napali.hpl.hp.com> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> Message-ID: <3E960510.6070503@octopus.com.au> David Mosberger wrote: > Remember that Intel is targeting Itanium 2 against Power4 and SPARC. > In that space, the price of Itanium 2 is very competitive. OK, I want to be clear on this. I asked why Itanium hardware is still so expensive. Your answer seems to be marketing speak for "The prices are still high because we are _happy_ selling small quantities of this equipment to people used to paying through the nose for good quality hardware." Is this correct? Can I then conclude that Intel has not yet had any interest whatsoever in driving IA64 into the realm of reasonble prices? It's sad to see so much work being put into this Linux port when, if things remain as they are, it will hardly be used. > Duraid> Seriously, IA64 must be the first architecture in history > Duraid> where a software simulator is still being developed 4 years > Duraid> after commercial availability of silicon (indeed, entire > Duraid> systems). > > What's a software simulator got to do with anything? 
Certain things > are easier to develop on a simulator, others are easier to develop on > hardware. Nothing unique to IA64. I put it to you that software is easier to develop on hardware. Nothing unique to IA64, indeed. Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Thu Apr 10 16:55:30 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Fri, 11 Apr 2003 06:55:30 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <20030410182609.GF29125@cup.hp.com> References: <20030410182609.GF29125@cup.hp.com> Message-ID: <3E95DA42.7000607@octopus.com.au> You and I both know the only real barrier to Itanium adoption is the price. Can anyone here shed some light on this? Why is Itanium hardware still so expensive? Seriously, IA64 must be the first architecture in history where a software simulator is still being developed 4 years after commercial availability of silicon (indeed, entire systems). Hello? Is anyone home? If Intel thinks an 0.13u respin of Itanium 2 going for $1000 a pop is going to save them from the horrible onslaught of horrible hardware (x86-64 ;) it'd seem they have another thing coming! We live in Carly times. :\ Duraid Grant Grundler wrote: > Gah! > Both Oracle and Computer Associates have ia64-linux product available. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From randolph at tausq.org Thu Apr 10 19:56:37 2003 From: randolph at tausq.org (Randolph Chung) Date: Thu, 10 Apr 2003 16:56:37 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <16022.194.793900.97453@napali.hpl.hp.com> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> Message-ID: <20030410235636.GO12993@tausq.org> > What's a software simulator got to do with anything? Certain things > are easier to develop on a simulator, others are easier to develop on > hardware. Nothing unique to IA64. hear hear... i might have access to a bunch of parisc hardware, but i would love to get my hands on a good parisc simulator. i setup the ia64 simulator to play with kernel modules support.. but now that david got it working... :-) randolph -- Randolph Chung Debian GNU/Linux Developer, hppa/ia64 ports http://www.tausq.org/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From davidm at napali.hpl.hp.com Thu Apr 10 19:39:46 2003 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Thu, 10 Apr 2003 16:39:46 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E95DA42.7000607@octopus.com.au> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> Message-ID: <16022.194.793900.97453@napali.hpl.hp.com> >>>>> On Fri, 11 Apr 2003 06:55:30 +1000, Duraid Madina said: Duraid> You and I both know the only real barrier to Itanium Duraid> adoption is the price. Can anyone here shed some light on Duraid> this? Why is Itanium hardware still so expensive? Remember that Intel is targeting Itanium 2 against Power4 and SPARC. 
In that space, the price of Itanium 2 is very competitive. Duraid> Seriously, IA64 must be the first architecture in history Duraid> where a software simulator is still being developed 4 years Duraid> after commercial availability of silicon (indeed, entire Duraid> systems). What's a software simulator got to do with anything? Certain things are easier to develop on a simulator, others are easier to develop on hardware. Nothing unique to IA64. --david _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bob at drzyzgula.org Thu Apr 10 21:51:39 2003 From: bob at drzyzgula.org (Bob Drzyzgula) Date: Thu, 10 Apr 2003 21:51:39 -0400 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E960510.6070503@octopus.com.au> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> Message-ID: <20030410215139.O3614@www2> On Fri, Apr 11, 2003 at 09:58:08AM +1000, Duraid Madina wrote: > > David Mosberger wrote: > >Remember that Intel is targeting Itanium 2 against Power4 and SPARC. > >In that space, the price of Itanium 2 is very competitive. > > OK, I want to be clear on this. I asked why Itanium hardware is still so > expensive. Your answer seems to be marketing speak for "The prices are > still high because we are _happy_ selling small quantities of this > equipment to people used to paying through the nose for good quality > hardware." Is this correct? I'm not sure that it works this way. I think it's more like "We are making the best processor we know (or, perhaps, "knew", or "thought we knew", or even "allowed ourselves to know") how to make that will/would/might in our dreams be profitable to sell at this high price in moderate quantities." I expect that if they could sell one hundred times as many Itaniums at a tenth the price, they would ramp up the fabs and do it. But then you get into the chicken-or-egg problem: There's no software, and hence no demand, and hence no software, and hence no demand, that would justify the production of a hundred times as many Itaniums. > Can I then conclude that Intel has not yet had any interest whatsoever > in driving IA64 into the realm of reasonble prices? It's sad to see so > much work being put into this Linux port when, if things remain as they > are, it will hardly be used. Be careful that you put the horse before the cart. Might it not be that the people doing this work are wagering that it will ultimately cause demand for the Itanium to increase? Could it really be expected that demand for Itanium *would* materialize without such investment in software happening first? In any event, virtually nothing remains as it is. 
--Bob Drzyzgula _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From matthewc at cse.unsw.edu.au Thu Apr 10 22:20:46 2003 From: matthewc at cse.unsw.edu.au (Matt Chapman) Date: Fri, 11 Apr 2003 12:20:46 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E960510.6070503@octopus.com.au> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> Message-ID: <20030411022046.GA22381@cse.unsw.edu.au> On Fri, Apr 11, 2003 at 09:58:08AM +1000, Duraid Madina wrote: > > Can I then conclude that Intel has not yet had any interest whatsoever > in driving IA64 into the realm of reasonble prices? My understanding is that Deerfield will be targeted at the lower-cost market, though I haven't seen much info about it recently. > I put it to you that software is easier to develop on hardware. Nothing > unique to IA64, indeed. We still use simulators despite the availability of hardware. Operating system software is often easier to debug on a simulator. Matt _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rochus.schmid at ch.tum.de Fri Apr 11 06:34:08 2003 From: rochus.schmid at ch.tum.de (rochus.schmid at ch.tum.de) Date: Fri, 11 Apr 2003 12:34:08 +0200 (CEST) Subject: SMC8624T vs DLINK DGC-1024T / Jumbo Frames ? In-Reply-To: <5.1.1.6.0.20030401141622.033c1e20@crux.stmarys.ca> References: <5.1.1.6.0.20030401141622.033c1e20@crux.stmarys.ca> Message-ID: <1050057248.3e969a20125dd@intranet.anorg.chemie.tu-muenchen.de> Dear Beowulfers, we are in a similar situation to Dave's: we are getting an 8-node dual-Xeon cluster (with the Tyan E7501 motherboard) with Intel GigE on board, and now the "switch issue" comes up. My vendor also suggested the DLINK, whereas I found the discussion on this list about the (more expensive, managed) SMC supporting jumbo frames. The issue was whether or not any of the cheaper (unmanaged) switches support jumbo frames. I couldn't figure out whether this has been resolved yet. It sounded like they might, but since they are unmanaged the problem is how to switch it on or off. Is that right? I also found this document: http://www.scl.ameslab.gov/Publications/HalsteadPubs/usenix_halstead.pdf It says that the effect of jumbo frames on bandwidth is only seen for TCP/IP communication (NetPIPE) but is completely lost using MPI. Since my code is MPI based, it wouldn't matter to have jumbo frames and I could go with the cheaper DLINK. Is this info right? Or outdated? Misunderstood? Any hints highly appreciated. Greetings, Rochus Quoting Dave Lane : > Can anyone comment on the strengths/weaknesses of these two 24-port > gigabit > switches. We're going to be building a 16 node dual-Xeon cluster this > spring and were planning on the SMC switch (which has received good > review > here before), but a vendor pointed out the DLINK switch as a less > expensive > alternative. > > ...
Dave > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From virtualsuresh at yahoo.co.in Fri Apr 11 06:49:48 2003 From: virtualsuresh at yahoo.co.in (=?iso-8859-1?q?suresh=20chandra?=) Date: Fri, 11 Apr 2003 11:49:48 +0100 (BST) Subject: remote booting Message-ID: <20030411104948.94018.qmail@web8107.mail.in.yahoo.com> Hi, I am building a 2-node cluster as a practice for building a 16-node cluster in University. I want to remote boot for client (Diskless), I found PXELINUX should be flashed or burned into a PROM on the network card. Is there any other way for remote booting by using a Floppy disk (which in turn invoke my NIC for remote booting), I have less time to get a PROM for my network card. I am going to use OpenMosix. Thanks in Advance. Regards, Suresh Chandra Mannava, India. ===== ________________________________________________________________________ Missed your favourite TV serial last night? Try the new, Yahoo! TV. visit http://in.tv.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Andrew.Cannon at nnc.co.uk Fri Apr 11 08:20:33 2003 From: Andrew.Cannon at nnc.co.uk (Cannon, Andrew) Date: Fri, 11 Apr 2003 13:20:33 +0100 Subject: PVM and MPI differences? Message-ID: Hi All, I've recently set up a Monte Carlo compute cluster of 4 computers (RH8) running pvm. I have heard about MPI and I was wondering what the differences between mpi and pvm are? Regards Andrew Andrew Cannon, Nuclear Technology (J2), NNC Ltd, Booths Hall, Knutsford, Cheshire, WA16 8QZ. Telephone; +44 (0) 1565 843768 email: mailto:andrew.cannon at nnc.co.uk NNC website: http://www.nnc.co.uk *********************************************************************************** NNC Limited Booths Hall Chelford Road Knutsford Cheshire WA16 8QZ Country of Registration: United Kingdom Registered Number: 1120437 This e-mail and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in error please notify the NNC system manager by e-mail at eadm at nnc.co.uk. *********************************************************************************** _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From seth at hogg.org Fri Apr 11 04:39:12 2003 From: seth at hogg.org (Simon Hogg) Date: Fri, 11 Apr 2003 09:39:12 +0100 Subject: IA-64 related question (tangentially) Message-ID: <4.3.2.7.2.20030411093017.00c3cc70@pop.freeuk.net> Don't everybody get excited straight away - this needs to be approved by a few people first :-) Suppose 'a friend' had some Itanium hardware to be donated to a 'good cause' - what's the best way of going about it? Should it go to FSF / Gnu / Debian projects to further their development (in general software terms)? Or, should it go to a local university in support of a specific 'end-user' / project (maybe beowulf related) role. What are the pros and cons of each route (for the receiver)? I am more tempted by the FSF / Debian donation, but maybe there are other benefits. Simon p.s. 
The 'gift' won't be on the order of the 300+ nodes at OSC! _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From adriano at satec.es Fri Apr 11 04:37:39 2003 From: adriano at satec.es (Adriano Galano) Date: Fri, 11 Apr 2003 10:37:39 +0200 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <16022.194.793900.97453@napali.hpl.hp.com> Message-ID: <003601c30005$9d740a10$a620a4d5@tsatec.int> > >>>>> On Fri, 11 Apr 2003 06:55:30 +1000, Duraid Madina > said: > > Duraid> You and I both know the only real barrier to Itanium > Duraid> adoption is the price. Can anyone here shed some light on > Duraid> this? Why is Itanium hardware still so expensive? > > Remember that Intel is targeting Itanium 2 against Power4 and SPARC. > In that space, the price of Itanium 2 is very competitive. > What does "very competitive" mean? How does it compare with Power*, for example? > Duraid> Seriously, IA64 must be the first architecture in history > Duraid> where a software simulator is still being developed 4 years > Duraid> after commercial availability of silicon (indeed, entire > Duraid> systems). > > What's a software simulator got to do with anything? Certain things > are easier to develop on a simulator, others are easier to develop on > hardware. Nothing unique to IA64. > AMD's Opteron is still in a simulator, too... Regards, --Adriano _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mtina at tahoe.com Fri Apr 11 10:18:25 2003 From: mtina at tahoe.com (Mohammad Tina) Date: 11 Apr 2003 15:18:25 +0100 Subject: new to linux clustering Message-ID: <57d7501c30035$3721bb10$4701020a@corp.load.com> Hi, I am new to Linux clustering and I am planning to install a cluster on 3 machines (Red Hat 7). I was reading about clustering and found many packages. Can anyone recommend a package for me? Thanks ================================================================== Get Your Free Web-Based Email at http://www.tahoe.com! ================================================================== _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From yudong at lshp.gsfc.nasa.gov Fri Apr 11 11:00:12 2003 From: yudong at lshp.gsfc.nasa.gov (Yudong Tian) Date: Fri, 11 Apr 2003 11:00:12 -0400 Subject: remote booting In-Reply-To: <20030411104948.94018.qmail@web8107.mail.in.yahoo.com> Message-ID: Please check whether your NIC supports PXE boot. If it does, then you can boot over the network without using a floppy. If it does not, you might need to use syslinux on a floppy. I did a network boot and installation before, and here you can find the steps I took: http://lis.gsfc.nasa.gov/yudong/notes/net-install.txt ------------------------------------------------------------ Falun Dafa: The Tao of Meditation (http://www.falundafa.org) ------------------------------------------------------------ Yudong Tian, Ph.D.
NASA/GSFC (301) 286-2275 > -----Original Message----- > From: beowulf-admin at beowulf.org [mailto:beowulf-admin at beowulf.org]On > Behalf Of suresh chandra > Sent: Friday, April 11, 2003 6:50 AM > To: Beowulf at beowulf.org > Subject: remote booting > > > Hi, > I am building a 2-node cluster as a practice for > building a 16-node cluster in University. > I want to remote boot for client (Diskless), I found > PXELINUX should be flashed or burned into a PROM on > the network card. > Is there any other way for remote booting by using a > Floppy disk (which in turn invoke my NIC for remote > booting), I have less time to get a PROM for my > network card. > > I am going to use OpenMosix. > Thanks in Advance. > > Regards, > Suresh Chandra Mannava, India. > > > ===== > > > ________________________________________________________________________ > Missed your favourite TV serial last night? Try the new, Yahoo! TV. > visit http://in.tv.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 11 10:27:31 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 11 Apr 2003 10:27:31 -0400 (EDT) Subject: IA-64 related question (tangentially) In-Reply-To: <4.3.2.7.2.20030411093017.00c3cc70@pop.freeuk.net> Message-ID: On Fri, 11 Apr 2003, Simon Hogg wrote: > Don't everybody get excited straight away - this needs to be approved by a > few people first :-) > > Suppose 'a friend' had some Itanium hardware to be donated to a 'good > cause' - what's the best way of going about it? Should it go to FSF / Gnu > / Debian projects to further their development (in general software terms)? > > Or, should it go to a local university in support of a specific 'end-user' > / project (maybe beowulf related) role. > > What are the pros and cons of each route (for the receiver)? I am more > tempted by the FSF / Debian donation, but maybe there are other benefits. > > Simon > p.s. The 'gift' won't be in the order of the 300+ nodes at OSC! Goodness! I just HAVE to take a stab at answering this one (I answer everything else, after all...:-) It's perfectly clear that the best way is to donate it to a University, in fact, more specifically, to the Duke University Physics Department. Indeed, most specifically of all, to the group of Brown and Ciftan in the Duke University Physics Department, to be used in Monte Carlo computations in O(3) Symmetric critical systems and a new Multiple Scattering Band Theory project just getting underway. FSF or Debian don't need Itaniums, really, except for maybe one or two to ensure that builds work on the architecture. They don't "compute". I do. To me a cycle is a precious thing as I use so MANY of them over the years. Selflessly yours (just trying to make sure you Do The Right Thing...:-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gmpc at sanger.ac.uk Fri Apr 11 10:53:09 2003 From: gmpc at sanger.ac.uk (Guy Coates) Date: Fri, 11 Apr 2003 15:53:09 +0100 (BST) Subject: remote booting In-Reply-To: <200304111422.h3BEMUs01923@NewBlue.Scyld.com> References: <200304111422.h3BEMUs01923@NewBlue.Scyld.com> Message-ID: >Is there any other way for remote booting by using a >Floppy disk (which in turn invoke my NIC for remote >booting) Yup, take a look at the etherboot project http://etherboot.sourceforge.net/ which does exactly this for a wide range of ethernet hardware. There is even a nice webpage which will build your boot image for you: http://www.rom-o-matic.net/ Cheers, Guy Coates -- Guy Coates, Informatics System Group The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK Tel: +44 (0)1223 834244 ex 7199 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From david at virtutech.se Fri Apr 11 03:51:36 2003 From: david at virtutech.se (David =?iso-8859-1?q?K=E5gedal?=) Date: Fri, 11 Apr 2003 09:51:36 +0200 Subject: Itanium gets supercomputing software In-Reply-To: <20030411022046.GA22381@cse.unsw.edu.au> (Matt Chapman's message of "Fri, 11 Apr 2003 12:20:46 +1000") References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> Message-ID: Matt Chapman writes: > On Fri, Apr 11, 2003 at 09:58:08AM +1000, Duraid Madina wrote: >> >> I put it to you that software is easier to develop on hardware. Nothing >> unique to IA64, indeed. > > We still use simulators despite the availability of hardware. Operating > system software is often easier to debug on a simulator. Exactly. There are a lot of things that you can do with a simulator that you can't do with hardware. Developing software before hardware is available is just one of them. (plug mode on) That's why we sell simulators for most major current CPU architectures. Including IA64. -- David K?gedal, Virtutech http://www.simics.com/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From seth at hogg.org Fri Apr 11 11:07:33 2003 From: seth at hogg.org (Simon Hogg) Date: Fri, 11 Apr 2003 16:07:33 +0100 Subject: IA-64 related question (tangentially) Message-ID: <4.3.2.7.2.20030411160726.00c3d700@pop.freeuk.net> At 10:27 11/04/03 -0400, you wrote: >On Fri, 11 Apr 2003, Simon Hogg wrote: > > Suppose 'a friend' had some Itanium hardware to be donated to a 'good > > cause' - what's the best way of going about it? Should it go to FSF / Gnu > > / Debian projects to further their development (in general software terms)? > > > > Or, should it go to a local university in support of a specific 'end-user' > > / project (maybe beowulf related) role. > >Goodness! 
I just HAVE to take a stab at answering this one (I answer >everything else, after all...:-) > >It's perfectly clear that the best way is to donate it to a >University, in fact, more specifically, to the Duke University Physics >Department. Indeed, most specifically of all, to the group of Brown and >Ciftan in the Duke University Physics Department, to be used in Monte >Carlo computations in O(3) Symmetric critical systems and a new Multiple >Scattering Band Theory project just getting underway. > >FSF or Debian don't need Itaniums, really, except for maybe one or two >to ensure that builds work on the architecture. They don't "compute". >I do. To me a cycle is a precious thing as I use so MANY of them over >the years. > >Selflessly yours (just trying to make sure you Do The Right Thing...:-) Well, how surprised was I by this answer? :-) Thinking about this a little bit more, would 'you' be happier getting the cash equivalent (with strings attached as to what you could buy). Is this more tax-efficient? Or is it better to 'loan' you the equipment which then depreciates over two or three years, then I donate it to you for zero cost? Simon _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Fri Apr 11 11:16:43 2003 From: raysonlogin at yahoo.com (Rayson Ho) Date: Fri, 11 Apr 2003 08:16:43 -0700 (PDT) Subject: new to linux clustering In-Reply-To: <57d7501c30035$3721bb10$4701020a@corp.load.com> Message-ID: <20030411151643.92394.qmail@web11405.mail.yahoo.com> Tell us what you want to run first... Rayson --- Mohammad Tina wrote: > Hi, > i am new to linux clustering, i am planning to install cluster on 3 > machines (redhat 7). > i was reading about clustering and i found many packages. > can anyone recommend a package for me?? > > Thanks > > > > > ================================================================== > Get Your Free Web-Based Email at http://www.tahoe.com! > ================================================================== > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From chris_oubre at hotmail.com Fri Apr 11 12:29:49 2003 From: chris_oubre at hotmail.com (Chris Oubre) Date: Fri, 11 Apr 2003 11:29:49 -0500 Subject: new to linux clustering In-Reply-To: <200304111420.h3BEKos01501@NewBlue.Scyld.com> Message-ID: <000e01c30047$92978f30$25462a80@rice.edu> I am using OSCAR 2.1 to run my cluster of 15 dual Xeons. I quite like the package. It lays on top of Red Hat 7.2 7.3 or Mandrake 8.2. OSCAR is basically a suite of packages (PBS, MPI,LAM, PVM, C3, HDF5, ...) which make "culsterize" and make administration easier. 
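Once the MPI part of such a stack is installed, the quickest sanity check that all the nodes can talk is the usual hello-world (a generic sketch, nothing OSCAR-specific; mpicc and mpirun are the usual wrapper names, but your installation may differ):

/* minimal MPI sanity check: every rank reports which node it landed on */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which process am I        */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* how many processes total  */
    MPI_Get_processor_name(host, &len);      /* which node am I on        */

    printf("rank %d of %d on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}

If every node shows up when you launch this across all 30 CPUs, the batch system and the libraries underneath are at least wired together correctly.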
Check them out at http://oscar.sourceforge.net/ -----Original Message----- From: beowulf-admin at beowulf.org [mailto:beowulf-admin at beowulf.org] On Behalf Of beowulf-request at beowulf.org Sent: Friday, April 11, 2003 9:21 AM To: beowulf at beowulf.org Subject: Beowulf digest, Vol 1 #1243 - 14 msgs --__--__-- Message: 14 Date: 11 Apr 2003 15:18:25 +0100 From: "Mohammad Tina" To: "beowulf" Subject: new to linux clustering Hi, i am new to linux clustering, i am planning to install cluster on 3 machines (redhat 7). i was reading about clustering and i found many packages. can anyone recommend a package for me?? Thanks ================================================================== Get Your Free Web-Based Email at http://www.tahoe.com! ================================================================== --__--__-- _______________________________________________ Beowulf mailing list Beowulf at beowulf.org http://www.beowulf.org/mailman/listinfo/beowulf End of Beowulf Digest _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Fri Apr 11 12:41:07 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Fri, 11 Apr 2003 11:41:07 -0500 Subject: cooling systems Message-ID: <3EA0F5C5@itsnt5.its.uiowa.edu> Does anyone know of if it is possible to buy a rackmount cluster with an integrated cooling system? It seems against the philosophy of Beowulf to look for low cost computing solutions, and then find that you need to make a substantial investment just to cool the room. I had an Athlon system shut down on me due to overheat, so I look at the cases and I think- why aren't people looking to use airflow in a more efficient manner. I know the ambient air temp isn't this high. I may be in left field, but it seems like the flow inside a case is so turbulent that the mean air velocity is not carrying the warm air away from the cpu as quickly as it could. Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Fri Apr 11 13:29:01 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Fri, 11 Apr 2003 10:29:01 -0700 (PDT) Subject: cooling systems In-Reply-To: <3EA0F5C5@itsnt5.its.uiowa.edu> Message-ID: rack systems with integrated hvac do exist. as a general rule it cheaper to buy one or two large ac units and move air around as needed than it is to buy lots of small ac units have to deal seperately with the exhaust from all their little heat-exchangers... On Fri, 11 Apr 2003, jbassett wrote: > Does anyone know of if it is possible to buy a rackmount cluster with an > integrated cooling system? It seems against the philosophy of Beowulf to look > for low cost computing solutions, and then find that you need to make a > substantial investment just to cool the room. I had an Athlon system shut down > on me due to overheat, so I look at the cases and I think- why aren't people > looking to use airflow in a more efficient manner. I know the ambient air temp > isn't this high. I may be in left field, but it seems like the flow inside a > case is so turbulent that the mean air velocity is not carrying the warm air > away from the cpu as quickly as it could. 
there's a substantial amount of engineering that has to go into the thermal management in a 1u or 2u case that simply doen't have to happen in the desktop pc industry. even if you have sufficient cooling for the room you may still need dedicated airhandlers to move it to the right location for the rack with the nodes in them... > Joseph Bassett > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Fri Apr 11 14:13:35 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Fri, 11 Apr 2003 11:13:35 -0700 Subject: [Linux-IA64] Itanium gets supercomputer software In-Reply-To: <200304111505.TAA23036@nocserv.free.net> References: <200304111505.TAA23036@nocserv.free.net> Message-ID: <20030411181335.GB1321@greglaptop.internal.keyresearch.com> On Fri, Apr 11, 2003 at 07:05:13PM +0400, Mikhail Kuzminsky wrote: > It looks that there is some "gentleman's" agreement between Intel > and companies, manufacturing IA64-based systems, about "price increase". With the first Itanium generation, only Intel built boxes, and everyone else OEMed and sold the same 2 boxes (one dual, one quad). There shouldn't be any surprise that they were roughly the same price. Things are a bit more diverse with the Itanium2, but it's still a low volume item. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Fri Apr 11 14:00:34 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Fri, 11 Apr 2003 11:00:34 -0700 Subject: IA-64 related question (tangentially) In-Reply-To: References: <4.3.2.7.2.20030411093017.00c3cc70@pop.freeuk.net> Message-ID: <20030411180034.GA1321@greglaptop.internal.keyresearch.com> On Fri, Apr 11, 2003 at 10:27:31AM -0400, Robert G. Brown wrote: > FSF or Debian don't need Itaniums, really, except for maybe one or two > to ensure that builds work on the architecture. They don't "compute". Joking aside, every compiler group needs a cluster, because their nightly testing is: build kernel, test kernel build compiler, test compiler build all rpms in your distro build and run SPECcpu build and run misc tests run a search over combinations of optimization flags to see if any are broken or have performance regressions And that's not even considering parallelizing builds. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 11 14:11:01 2003 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Fri, 11 Apr 2003 14:11:01 -0400 (EDT) Subject: cooling systems In-Reply-To: <3EA0F5C5@itsnt5.its.uiowa.edu> Message-ID: On Fri, 11 Apr 2003, jbassett wrote: > Does anyone know of if it is possible to buy a rackmount cluster with an > integrated cooling system? It seems against the philosophy of Beowulf to look > for low cost computing solutions, and then find that you need to make a > substantial investment just to cool the room. I had an Athlon system shut down This is in some ways impossible, if I understand what you mean. Or from another point of view, it is already standard. Let's understand refrigeration and thermodynamics a bit: All the energy used to run your systems and do computations turns into heat (1st law). One cannot make heat "go away"; it either naturally flows from hot places to cooler places, or one can move it forcibly from a hot place to a hotter place. It costs energy which makes still MORE heat to move it forcibly around (2nd law). Now view the CPUs as little heaters -- 50W to 100W apiece (as hot as most incandescent light bulbs) and confined inside a 1U or 2U case. Add on another 50W plus for the motherboard, memory, disk, network, and the switching power supply itself inside the case. Even the "refrigeration" devices already standard in the case (case fans intended to speed the heat on its way) add heat to the case exhaust in the process. Cases are already designed to move heat from the hot spots inside out into the ambient air as efficiently as possible (within the quality of engineering and layout of any particular case with any particular motherboard). There are even cooling devices designed for e.g. CPU cooling that are active electronic refrigerators (peltier coolers) and not just fan+heat sink conduction+convection coolers. The problem is out in the room. Once you remove the heat from the cases, with or without an actual case refrigerator at work (in general one will exhaust MORE heat into the room than a case cooled with fan alone) the heat still HAS to get out of the room. If the room has lots of nodes making heat, nice thick walls, ceilings, floors, and lots of dead air (as do most uncooled cluster rooms, it seems), it won't get out quickly enough on its own, so it will start to build up. This makes the room get hot -- temperature being a measure of the "heat" (random kinetic energy) in the room's air. Now, a passive cooler fan can only cool the CPU if the ambient air is cooler than the CPU. It can move air through more quickly, but basically heat is flowing from hot to cold. As the room air temperature goes up, so does the CPU temperature as the fan is less successful in helping to remove its power-generated heat. An active cooler is in no fundamentally better shape. Yest, it will maintain a temperature gradient, and keep the CPU actually cooler than ambient air, but as ambient air goes up in temperature so will the CPU temperature AND the ambient air will get still hotter as a result of the extra energy the cooler itself consumes (which in turn goes up as the ambient air temperature increases in a vicious cycle). It also heats the other components in the case more while keeping the CPU a bit cooler, so other things may fail at a higher rate unless you remove all that heat. ONE WAY OR ANOTHER you will HAVE to remove the heat from the room JUST AS FAST as all the systems and other heat sources (including electric lights and human bodies) produce it to maintain the room's temperature as constant. 
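To put a rough number on how fast that buildup happens, here is a trivial back-of-the-envelope sketch (all figures assumed purely for illustration: a 5 kW load dumped into a sealed room holding roughly 60 kg of air, not measurements of any real machine room):

/* rate of temperature rise in a sealed, uncooled room (illustration only) */
#include <stdio.h>

int main(void)
{
    double q_watts = 5000.0;  /* assumed cluster heat load, in watts       */
    double air_kg  = 60.0;    /* roughly 50 m^3 of air at ~1.2 kg per m^3  */
    double cp      = 1005.0;  /* specific heat of air, J/(kg K)            */

    double k_per_s = q_watts / (air_kg * cp);  /* dT/dt = Q / (m * c_p)    */

    printf("air warms at %.2f K/s, roughly %.0f degrees C per minute,\n",
           k_per_s, 60.0 * k_per_s);
    printf("until the walls and furniture start soaking some of it up.\n");
    return 0;
}

In practice the walls and contents buy you a little time, but the conclusion is the same: the cooling has to carry essentially the whole load, continuously.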
If you live in a cold climate or have some handy "cold reservoir" that can absorb the heat from your cluster indefinitely without getting warmer itself, maybe you can metaphorically open a window and stick in a fan and blow the hot air out into the snow, replacing it with nice cool air from outside. If you live in Durham NC in the summer, the air outside the building is a lot HOTTER than you'd like the cluster room to be, so you have to do work to actively move the heat from your nice cool cluster room to the much hotter out of doors, moving it "uphill" so to speak. This work WILL be done by a refrigeration unit -- an air conditioner -- as that's what they are and what they do. You can even estimate fairly accurately how much air conditioning you'll require to keep up with the rate at which the cluster produces heat, using 3500 Watts per "ton" of A/C (and remembering to provide a lot more capacity than you think you'll need, maybe twice as much). You can install an "off-the-shelf" air conditioning solution if one is possible and makes sense for your cluster room, or you can (likely better) have a pro come and install a proper climate control system. You'd have to do this for EITHER a "big iron" supercomputer OR a beowulf -- in both cases they make lots of heat, in both cases you MUST remove that heat as fast as it is made and dump it outside to maintain ambient air temperatures in the 60's (ideally). Beowulfish clusters are cheap to build, they are relatively cheap and scalable to operate in most environments, but there are most definitely infrastructure costs and requirements -- adequate power and ac and networking in the physical space, and the actual cost of power to run and cool the nodes. The former can usually be "amortized" over many years so that it adds a few tens of dollars per year to the cost of operating the nodes themselves. The latter is unavoidable -- roughly $1/watt/year for heating and cooling. This is another "killer" surprise for cluster builders -- a 100 node cluster of 100 Watt nodes might cost $75,000 in direct hardware costs, AND $25,000 in renovation costs for new power and AC (amortized over ten years and 100 nodes -- maybe $30 per node per year "payback", including the cost of the money), AND $10,000 a year for power and A/C. It's still cheap, really, compared to big iron -- just not as cheap as you might have thought looking at hardware costs alone. This serious, thoughtful approach to infrastructure, is the best way to keep from having problems with overheating. The best fans or Peltier coolers in the world aren't going to do much if ambient air in the cluster room is in the 80's or 90's, and without AC a cluster room can get well into the 100's and beyond in a remarkably short period of time. If you have 50 KW or so being given off in an office-sized space with insulating walls and no AC, you'll be able to bake brownies by leaving cups of batter out on top of your racks, at least until something melts, shorts, starts a fire, and burns down the whole thing. As far as the rest of your remarks on case design are concerned, they may be well-justified but there are a lot of cases out there and you should look at more than one. It isn't horribly easy to design airflow inside a 1U space filled with big block-like components, and some do a better job of it than others. 
Even with a good case design, something like using a flat ribbon cable ide/floppy connector instead of a round cable can defeat your purpose, in SOME units, by virtue of the ribbon accidentally blocking part of the airflow! I "like" 2U cases a bit better than 1U's for that reason, but there are some people that make very lovely 1U cases that seem to be quite robust and reliable -- as long as you keep ambient air in the 60's or at worst low 70's at the fan intake. rgb > on me due to overheat, so I look at the cases and I think- why aren't people > looking to use airflow in a more efficient manner. I know the ambient air temp > isn't this high. I may be in left field, but it seems like the flow inside a > case is so turbulent that the mean air velocity is not carrying the warm air > away from the cpu as quickly as it could. > Joseph Bassett > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Fri Apr 11 14:26:30 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Fri, 11 Apr 2003 13:26:30 -0500 Subject: cooling systems Message-ID: <3EA1CB6B@itsnt5.its.uiowa.edu> This is precisely the point that I am getting at. It seems indirect to me to cool the ambient atmosphere in a room using air conditioners, then expect the heat to distribute itself so that the temperature is at equilibrium throughout the system. It seems entirely more sensible to have a system such that cold air would be directed more precisely at the cpus, then have the exiting flow directed through some sort of exhaust system which would take it to some place that would act as a heat resovoir. In that way you could use the existing airconditioning infrastructure at a facility by distributing the hot exhaust from a cluster into the building. You could even use a venturi on an air duct pipe to keep a vacuum going and redistribute the hot air. Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From davidm at napali.hpl.hp.com Fri Apr 11 13:59:12 2003 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Fri, 11 Apr 2003 10:59:12 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <003601c30005$9d740a10$a620a4d5@tsatec.int> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> Message-ID: <16023.624.775864.544742@napali.hpl.hp.com> >>>>> On Fri, 11 Apr 2003 10:37:39 +0200, "Adriano Galano" said: >> >>>>> On Fri, 11 Apr 2003 06:55:30 +1000, Duraid Madina >> said: Duraid> You and I both know the only real barrier to Itanium Duraid> adoption is the price. Can anyone here shed some light on Duraid> this? Why is Itanium hardware still so expensive? >> Remember that Intel is targeting Itanium 2 against Power4 and SPARC. >> In that space, the price of Itanium 2 is very competitive. Adriano> What's mean very competitive? 
How it compare with Power* for example? AFAIK, Power4 CPUs are not sold on the open market, so it's difficult to compare the price of the CPU alone (surely IBM has a list price, but with different discount schedules, that price may or may not be meaningful in practice). Here is one real price point for an Itanium 2 workstation: - hp workstation zx2000 (Linux software enablement kit) - Intel? Itanium 2 900MHz Processor with 1.5MB on-chip L3 cache - 512MB Total PC2100 Registered ECC DDR 266 SDRAM Memory (2x256MB) - 40GB EIDE Hard Drive - NVIDIA Quadro2 EX - 10/100/1000BT LAN integrated - 16X Max DVD-ROM - Linux software enablement kit (not an operating system) - 3-year warranty, next-day, onsite hardware response, Mon - Fri, 8am - 5pm - $3,298 (To see this config, go to www.hp.com, then click on "online shopping" -> "small and medium business store" -> "workstations" -> "hp Itanium 2-based workstations" -> "zx2000"). I don't know exactly what price/configuration Power4 machines start. Perhaps one of the IBMers on this list could chime in? --david _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Fri Apr 11 14:50:24 2003 From: becker at scyld.com (Donald Becker) Date: Fri, 11 Apr 2003 14:50:24 -0400 (EDT) Subject: remote booting In-Reply-To: Message-ID: On Fri, 11 Apr 2003, Yudong Tian wrote: > Please make sure your NIC supports PXE boot or not. If it does, > then you can boot over the network without using a floppy. > If it does not, you might need to use syslinux on a floppy. The Scyld system boots using almost any boot media, including floppy. And since it uses a Linux kernel as part of the boot system, it supports almost any network devices that the cluster might end up using. But we still strongly recommend using PXE boot instead of the "stage 1" system we developed. The PXE boot protocol isn't as technically strong, but that is out-weighted by - the tens of millions of machines that already have PXE support - its near ubiquity in current production machines, and - its very low cost to retrofit by installing a NIC with a PXE ROM. -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 410 Severn Ave. Suite 210 Scyld Beowulf cluster system Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Fri Apr 11 14:18:42 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Fri, 11 Apr 2003 11:18:42 -0700 Subject: SMC8624T vs DLINK DGC-1024T / Jumbo Frames ? In-Reply-To: <1050057248.3e969a20125dd@intranet.anorg.chemie.tu-muenchen.de> References: <5.1.1.6.0.20030401141622.033c1e20@crux.stmarys.ca> <1050057248.3e969a20125dd@intranet.anorg.chemie.tu-muenchen.de> Message-ID: <20030411181841.GC1321@greglaptop.internal.keyresearch.com> On Fri, Apr 11, 2003 at 12:34:08PM +0200, rochus.schmid at ch.tum.de wrote: > http://www.scl.ameslab.gov/Publications/ HalsteadPubs/usenix_halstead.pdf > it says that the effect on bandwith with jumbo frames is only seen for tcp/ip > commun (netpipe) but is completely lost using MPI. since my code is MPI based > it wouldn't matter to have jumbo frames and i could go with the cheaper DLINK. > is this info right? or outdated? missunderstood? 
This is a function of how big your communications are -- if you always send multi-megabyte messages, you probably will get a bit better bandwidth with jumbo frames. By the way, interrupt coalescence can get most of the improvement of jumbo frames without the pain that jumbo frames can cause. But interrupt coalescence might make short messages take a bit longer to arrive. greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Fri Apr 11 15:29:11 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Fri, 11 Apr 2003 12:29:11 -0700 (PDT) Subject: cooling systems In-Reply-To: <3EA1CB6B@itsnt5.its.uiowa.edu> Message-ID: On Fri, 11 Apr 2003, jbassett wrote: > This is precisely the point that I am getting at. It seems indirect to me to > cool the ambient atmosphere in a room using air conditioners, then expect the > heat to distribute itself so that the temperature is at equilibrium throughout > the system. It seems entirely more sensible to have a system such that cold > air would be directed more precisely at the cpus, then have the exiting flow > directed through some sort of exhaust system which would take it to some place > that would act as a heat resovoir. In that way you could use the existing > airconditioning infrastructure at a facility by distributing the hot exhaust > from a cluster into the building. You're better off just taking the heat out of the air and exhausting it out of the building, which you do with either vapor-phase refridgeration or chilled water in general... when you have large datacenters, you don't use the rest of the building as a heat resovoir, it's not nearly big enough a space unless you happen to work in one of the moffet field blimp hangars or something. if you had enough space or a convenient lake you could also use a heat-pump. > You could even use a venturi on an air duct > pipe to keep a vacuum going and redistribute the hot air. > > Joseph Bassett > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Fri Apr 11 14:26:16 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Fri, 11 Apr 2003 11:26:16 -0700 Subject: Scaling of hydro codes In-Reply-To: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> References: <16021.12168.589577.793395@cincinnatus.kis.uni-freiburg.de> Message-ID: <20030411182616.GD1321@greglaptop.internal.keyresearch.com> On Thu, Apr 10, 2003 at 10:47:04AM +0200, Wolfgang Dobler wrote: > My question is: do others find the same type of scaling for hydro codes? 
> If so, how can this be understood? CFD can vary widely. Some algorithms are cache friendly (operator splitting, the compute part of spectral codes), some are not (3D operators). Sometimes the data size is huge (1+ gbytes/cpu) and sometimes it's small enough to fit in the combined L2 caches of your cluster. A non-cache-friendly code won't get a great speedup when you use the 2nd cpu. This is what Craig Tierney mentioned, and you can test for this effect using a 1-cpu and 2-cpu run. Large data sizes mean easier network scaling. You can look at that separately by running the code at several sizes using 1 cpu per machine. If you increase the data size as you use more cpus, this scaling should be nearly linear. greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mike.sullivan at alltec.com Fri Apr 11 14:47:26 2003 From: mike.sullivan at alltec.com (Mike Sullivan) Date: Fri, 11 Apr 2003 14:47:26 -0400 Subject: cooling systems (jbassett) Message-ID: <3E970DBE.8020301@alltec.com> We have designed a custom cabinet for a client that must run in a room without AC. The system has an integral centrifugal blower that can have the air ducted to an outside sink. ( or to double as a furnace in the winter). The Motherboards mount to trays inside the cabinet so we do not use standard 1U cases. -- Mike Sullivan Director Performance Computing @lliance Technologies, Voice: (416) 385-3255 x 228, 18 Wynford Dr, Suite 407 Fax: (416) 385-1774 Toronto, ON, Canada, M3C-3S2 Toll Free:1-877-216-3199 http://www.alltec.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Fri Apr 11 15:38:38 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Fri, 11 Apr 2003 12:38:38 -0700 Subject: cooling systems In-Reply-To: <3EA0F5C5@itsnt5.its.uiowa.edu> References: <3EA0F5C5@itsnt5.its.uiowa.edu> Message-ID: <20030411193838.GB1690@greglaptop.internal.keyresearch.com> On Fri, Apr 11, 2003 at 11:41:07AM -0500, jbassett wrote: > Does anyone know of if it is possible to buy a rackmount cluster with an > integrated cooling system? ASCI Red has a little air conditioner on the top of every rack, but that's undoubtedly more expensive than using standard commercial units. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rmyers1400 at attbi.com Fri Apr 11 15:16:00 2003 From: rmyers1400 at attbi.com (Robert Myers) Date: Fri, 11 Apr 2003 15:16:00 -0400 Subject: cooling systems In-Reply-To: <3EA0F5C5@itsnt5.its.uiowa.edu> References: <3EA0F5C5@itsnt5.its.uiowa.edu> Message-ID: <3E971470.7020004@attbi.com> jbassett wrote: >Does anyone know of if it is possible to buy a rackmount cluster with an >integrated cooling system? It seems against the philosophy of Beowulf to look >for low cost computing solutions, and then find that you need to make a >substantial investment just to cool the room. I had an Athlon system shut down >on me due to overheat, so I look at the cases and I think- why aren't people >looking to use airflow in a more efficient manner. I know the ambient air temp >isn't this high. 
I may be in left field, but it seems like the flow inside a >case is so turbulent that the mean air velocity is not carrying the warm air >away from the cpu as quickly as it could. > > A rackmount cluster with an integrated cooling system sounds like big money. If you're looking at your case and worrying about overheat and especially if you think airflow is the problem, you might want to look at lower hanging fruit, like getting cables out of the way of the airflow and/or going to round cables instead of ribbon cables. A homebuilt or commercial ducted fan solution that brings air directly to the CPU from the outside is a big, relatively low-cost win more in line with the typical economics of a beowulf cluster. The typical homebuilt installation involves a case fan, some off the shelf flexible ducting, and a home built shroud over the CPU heatsink. Google groups.google.com and www.google.com on CPU "ducted fan" to get an idea of the range of possibilities and results. RM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 11 15:52:03 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 11 Apr 2003 15:52:03 -0400 (EDT) Subject: cooling systems In-Reply-To: <3EA1CB6B@itsnt5.its.uiowa.edu> Message-ID: On Fri, 11 Apr 2003, jbassett wrote: > This is precisely the point that I am getting at. It seems indirect to me to > cool the ambient atmosphere in a room using air conditioners, then expect the > heat to distribute itself so that the temperature is at equilibrium throughout > the system. It seems entirely more sensible to have a system such that cold > air would be directed more precisely at the cpus, then have the exiting flow > directed through some sort of exhaust system which would take it to some place > that would act as a heat resovoir. In that way you could use the existing > airconditioning infrastructure at a facility by distributing the hot exhaust > from a cluster into the building. You could even use a venturi on an air duct > pipe to keep a vacuum going and redistribute the hot air. I think that if you worked it all out you'd find that you CAN do this, AND that it would provide you more precise control of temperatures, AND that it would cost you a lot MORE than a standard AC/chiller/heat exchanger with relatively simple but suitable ductwork and fans with the ability to balance and redirect airflow. Although I could easily be wrong, since I don't really know what kind of cluster we're talking about, or how big, or what kind of space you're trying to put it in. First of all, I think you're almost certainly mistaken when you say that you can use existing A/C infrastructure to cool a cluster, at least if that cluster has more than 16 nodes or so (small clusters you often can, and many do). Orindary building AC that is servicing "offices" isn't really generally engineered to remove more than a couple or three of KW in a single room-sized space and deliver it back to its heat exchanger -- the ductwork and delivery/return systems simply aren't adequate to do more. Sometimes cool air is delivered in one office and only exhausted/returned several offices away, passing through several interoffice vents in between! My office gets air through a square foot or two of ductwork. That air would have to howl in at ten below zero to cool a big cluster, and the hot air would have to howl out just as fast. 
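To make that arithmetic concrete, here is a back-of-the-envelope sketch in C. The 64 nodes at 150 W apiece are made-up illustrative figures, not numbers from this thread; the conversion factors are the usual rules of thumb (1 W of electrical load is about 3.412 BTU/hr of heat, and one "ton" of air conditioning removes 12,000 BTU/hr).

/* heatload.c -- rough cooling requirement for a cluster.
 * Rules of thumb: 1 W of load ~= 3.412 BTU/hr of heat,
 * and 1 ton of air conditioning = 12,000 BTU/hr.
 * The node count and per-node wattage are hypothetical examples.
 */
#include <stdio.h>

int main(void)
{
    double nodes = 64.0;            /* hypothetical node count    */
    double watts_per_node = 150.0;  /* hypothetical draw per node */
    double watts = nodes * watts_per_node;
    double btu_per_hr = watts * 3.412;
    double tons = btu_per_hr / 12000.0;

    printf("%.0f W -> %.0f BTU/hr -> %.1f tons of cooling\n",
           watts, btu_per_hr, tons);
    return 0;
}

Even at those modest made-up figures the answer is already around 2.7 tons of dedicated cooling, which is well past what a single office air drop is engineered to carry.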
Worse still, as we found out the hard way, most physical plants will shut A/C chillers down altogether in the winter time, or run them on a standby/intermittant basis. After all, who needs AC in the winter? It's COLD outside, isn't it? So don't count on building air to be adequate OR reliable, unless reengineered to be both for your particular needs. Those needs can be quite variable. A good sized cluster can consume as much power as all the offices in a good sized building put together -- tens of KW -- and needs A/C just as much in the winter as in the summer, unless you figure out a clever way of using the cluster room as a "furnace" to warm all the offices while cooling the cluster. To put it in perspective, the A/C heat exchanger/blower alone in our server room sits in a unit about two meters cubed in size and sounds like a 747 at cruising altitude in operation, which is all the time. Then there is the actual chiller, which is far away and (fractionally speaking, as it is shared) just as big. Its air delivery and return are about a square meter or more each in cross section before it starts splitting down. At the moment it is removing a few tens of KW continuously, day and night, and the ambient room air (delivered in a balanced way from overhead down to the general fronts of the racks but not ducted right down into them) hovers between 60 and 70, except in the air columns right BEHIND the node racks where it is more like 70-80 (about a 15 degree difference between incoming and exiting air). To work in the room you need a jacket, unless you're standing behind the racks where you could work comfortably in shorts and a tee shirt. This is still a pretty "small" cluster, too -- around 150 dual CPU nodes, plus sundry single processor nodes and some servers -- with infrastructure capacity that might support about twice as many nodes eventually as the room fills. How could this ever be managed by an ordinary office AC duct? Second, look at the costs. Putting active, directed coolers in each chassis costs more than just fans (and generates more total heat). Also, there will ALREADY be a warm air return in the room if it is A/C'd at all -- your "some sort of exhaust system". Air, like energy, is conserved and whatever comes out of the blowers into the room must go out of the room into the blowers. In many cases simply directing cold incoming air down the case fronts (intakes) and permitting the rising warm air off the case backs (outflows) to get to the ceiling and into the return is sufficient -- a reasonably stable airflow will set up to balance cool air out against warm air in. Note that this is NOT equilibrium -- the air going in at the front is cool, out at the back warm, and they do NOT mix before the warm air is exhausted -- it is just a stable pattern of circulation. This may or may not be good enough -- for us it seems to be working, but for you it might not. If you want to do better, all it costs you is more ductwork, more fans and control systems to ensure that the ducting does its job of dumping cool air in a balanced way (so it all doesn't come out of the ducts closest to the blower, leaving none for the back part of the room) and picking up the outflowing warm air ditto, maybe a raised floor (since without a raised floor the ductwork will interfere with access to the nodes, front and back). 
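The same sort of arithmetic says how much air has to move. A sketch using the standard sensible-heat rule of thumb for air near sea level (BTU/hr = 1.08 x CFM x deltaT in degrees F) -- the 30 KW load and 15 degree rise below are illustrative round numbers of the same order as the room described above, not measured values:

/* airflow.c -- airflow needed to carry a given heat load.
 * Sensible-heat rule of thumb for air near sea level:
 *   BTU/hr = 1.08 * CFM * deltaT(F)
 * The load and temperature rise are illustrative round numbers.
 */
#include <stdio.h>

int main(void)
{
    double watts = 30000.0;    /* illustrative total heat load      */
    double delta_t_f = 15.0;   /* intake-to-exhaust rise, degrees F */
    double btu_per_hr = watts * 3.412;
    double cfm = btu_per_hr / (1.08 * delta_t_f);

    printf("%.0f W at a %.0f F rise needs roughly %.0f CFM\n",
           watts, delta_t_f, cfm);
    return 0;
}

That comes out to something over 6000 CFM of air in constant motion, which is why the delivery and return ducts end up a square meter or more in cross section and why a single office register has no hope of doing the job.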
The capture of the exhaust will be particularly tricky as you will no longer be exhausting the hot air trapped on the ceiling as actively and will need to make sure that spillover doesn't build up there and anomalously heat the upper part of the room so that it DOES eventually mix with the ambient air. You can also build closed or open racks on raised floors and deliver cool air directly in at the bottom and remove it at the top in what amounts to a dedicated cooling chamber or airflow pattern, per rack. However, as Joel said, folks do all of this -- lots of really big clusters or server farms have very carefully engineered cooling systems, and you can find websites where they discuss and illustrate particular patterns of cool/warm airflow for various designs. They're just expensive. All of this just costs more money, and you seemed to be complaining about the (lesser) cost of ordinary A/C or using A/C at all, not looking for ways of increasing costs still further with complicated ductwork on top of ordinary A/C. So the only point I was (and am) making is that NOTHING you do at the node level gets you around the fundamental problem that leads you to A/C in the first place -- ensuring that power in = heat out such that input cooling air (or ambient air at the fan intakes) stays at or below may 75F, ideally much cooler. There are lots of ways to make this happen with or without fancy ductwork depending on cluster size and design, but you MUST make it happen. If your space DOES have adequate capacity in building AC and your cluster isn't too huge, then you are lucky -- some fans and maybe some ductwork and you'll likely be able to fly. If you're building a cluster that will draw, say, 4 KW and up, and don't "happen" to have a space with lots of power and surplus AC capacity and ductwork that can deliver and return air in a balanced way, (like a former server room) you're almost certainly going to be looking at some cluster-specific renovation to provide the required power and cooling. rgb > > Joseph Bassett > > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 11 16:02:33 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 11 Apr 2003 16:02:33 -0400 (EDT) Subject: IA-64 related question (tangentially) In-Reply-To: <3E97144D.4050201@moene.indiv.nluug.nl> Message-ID: On Fri, 11 Apr 2003, Toon Moene wrote: > Robert G. Brown wrote: > > > FSF or Debian don't need Itaniums, really, except for maybe one or two > > to ensure that builds work on the architecture. They don't "compute". > > I do. To me a cycle is a precious thing as I use so MANY of them over > > the years. > > Given what you told us about your code (how you didn't need Fortran and > all those nice multi-rank array loops), I gather that you could get lots > of cycles with an Itanium, but no speed. > > Perhaps David M-T can explain this better than me ... I know, I know. It would be a horrible waste of money to give them to me in terms of raw cost benefit, EXCEPT of course that to me they'd be FREE, and it is hard to beat that for cost benefit;-) So I guess you can ask for them instead. At least the Netherlands is closer to the UK...:-) rgb Robert G. 
Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu In a little known book of ancient wisdom appears the following Koan: The Devil finds work For idle systems Nature abhors a NoOp The sages have argued about the meaning of this for megacycles, some contending that idle systems are easily turned to evil tasks, others arguing that whoever uses an idle system must be possessed of the Devil and should be smote with a sucker rod until purified. I myself interpret "Devil" to be an obvious mistranslation of the word "Daemon". It is for this reason, my son, that I wish to place a simple daemon on your system so that Nature is satisfied, for it is clear that a NoOp is merely a Void waiting to be filled... This is the true Tao. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 11 14:56:18 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 11 Apr 2003 14:56:18 -0400 (EDT) Subject: IA-64 related question (tangentially) In-Reply-To: <20030411180034.GA1321@greglaptop.internal.keyresearch.com> Message-ID: On Fri, 11 Apr 2003, Greg Lindahl wrote: > On Fri, Apr 11, 2003 at 10:27:31AM -0400, Robert G. Brown wrote: > > > FSF or Debian don't need Itaniums, really, except for maybe one or two > > to ensure that builds work on the architecture. They don't "compute". > > Joking aside, every compiler group needs a cluster, because their > nightly testing is: > > build kernel, test kernel > build compiler, test compiler > build all rpms in your distro > build and run SPECcpu > build and run misc tests > run a search over combinations of optimization flags to see if > any are broken or have performance regressions > > And that's not even considering parallelizing builds. Good point; I had forgotten the compiler people. Kernel people in general as well. I was thinking more in terms of application level people. Not to mention trying to weasel some free high end systems by talking down the competition...;-) Alas, alas, It is not meant to be. They in UK, Me in NC. :-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sgaudet at wildopensource.com Fri Apr 11 16:46:25 2003 From: sgaudet at wildopensource.com (Stephen Gaudet) Date: Fri, 11 Apr 2003 16:46:25 -0400 Subject: [Linux-ia64] Itanium gets supercomputing software References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030410215139.O3614@www2> Message-ID: <3E9729A1.1040100@wildopensource.com> Bob Drzyzgula wrote: > On Fri, Apr 11, 2003 at 09:58:08AM +1000, Duraid Madina wrote: > >>David Mosberger wrote: >> >>>Remember that Intel is targeting Itanium 2 against Power4 and SPARC. >>>In that space, the price of Itanium 2 is very competitive. >> >>OK, I want to be clear on this. I asked why Itanium hardware is still so >>expensive. 
Your answer seems to be marketing speak for "The prices are >>still high because we are _happy_ selling small quantities of this >>equipment to people used to paying through the nose for good quality >>hardware." Is this correct? > > I'm not sure that it works this way. I think it's more like > "We are making the best processor we know (or, perhaps, > "knew", or "thought we knew", or even "allowed ourselves > to know") how to make that will/would/might in our dreams > be profitable to sell at this high price in moderate > quantities." I expect that if they could sell one hundred > times as many Itaniums at a tenth the price, they would > ramp up the fabs and do it. But then you get into the > chicken-or-egg problem: There's no software, and hence > no demand, and hence no software, and hence no demand, > that would justify the production of a hundred times as > many Itaniums. Based on over 25 years in computers owning a company for 5 years you see this change in the computing market over time. When I sold Alpha based systems there was always a bitch about cost. However, people that needed the compute cycles were more than willing to purchase Alpha over Intel because of what it brought them in total TCO. More compute cycles, memory and bandwith. Main problem was Digital at the time, was they never knew how to sell Alpha other than with UNIX. They tried selling it with MS Windows and never made a dent in the market until OEM's starting selling it in the 3D space with a little package called Renderman. This was big hit with film studios. Remember the movie Titanic? Rendered on Alpha to give you a time line. The fastest cpu was a 21064, 275MHz and a system cost about $12,000.00. The Alpha market started to take off when Digital screwed up with a product call the Multia. This was a 21066 processor, 166Mhz or 233Mhz. The Multia was Digital's attempt to build a X terminal for Windows NT. It failed and left DEC with 15,000 of these pigs sitting around. Now Digital needed to get rid of them quick. The plan was to sell them with Linux and hopefully develop the Linux space. These Multia/UDB sold for less than $2000.00. That's when Alpha started to take off. I personally sold tons of them. In fact, in a former life I even sold a system or two to David Mosberger. So I'll agree that when the cost comes down more people will get involved with the ia64. BTW: Intel is looking to release a single cpu version of the ia64 sometime this year. When this happens I believe you'll see the market open up. >>Can I then conclude that Intel has not yet had any interest whatsoever >>in driving IA64 into the realm of reasonble prices? It's sad to see so >>much work being put into this Linux port when, if things remain as they >>are, it will hardly be used. Main reason as David alluded to these systems are meant to compete with high end Sun, HP and IBM servers. Not in the commodity market. Remember, the cost in R&D on ia64 development. > Be careful that you put the horse before the cart. > Might it not be that the people doing this work are > wagering that it will ultimately cause demand for > the Itanium to increase? Could it really be expected > that demand for Itanium *would* materialize without > such investment in software happening first? > > In any event, virtually nothing remains as it is. Myself I wouldn't worry, over time Intel has a way of getting the price down. Heck, Dell has P4 desktops selling for $449.00 and notebooks for $799.00. Wow. Cheers, Steve Gaudet ..... 
<(???)> ---------------------- Wild Open Source Bedford, NH 03110 pH: 603-488-1599 cell: 603-498-1600 Home: 603-472-8040 http://www.wildopensource.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Fri Apr 11 17:20:05 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Fri, 11 Apr 2003 17:20:05 -0400 (EDT) Subject: cooling systems In-Reply-To: Message-ID: as usual, Robert (et al) gave a comprehensive answer. I'd like to just emphasize one thing: expect trouble if you use chilled-water cooling that's designed/managed for offices. ours works fairly well during the summer (16 tons of chillers and 35 KW dissipated, which should work out to 10/16 utilization.) but facilities people tend not to think of chilled water as a critical resource, and construction people certainly do not. you *will* have problems with the temperature of your chilled water (not to mention whether it's even flowing). I was surprised how little thermal capacity the chillers/pipes have - our room heats up in seconds if there's any disruption. and don't forget to run your chiller blowers on your UPS :( I expect we'll be adding supplemental electical cooling soon ;( consider wiring up a few ibutton thermo sensors - I have 5 now (incoming chilled water pipe, chilled air duct, dead/ambient and hot/return air), and log them every 30 seconds or so. yes, I have a little script that monitors the temps, logs them in mysql, pages me, powers off. clusters are getting bigger, and these problems aren't going away. yes, one solution is to use laptop processors. that works, but is simply inapropriate for some applications. another is to try water-cooling, which I've heard some cluster vendors are working on. the main appeal there is to avoid flakey CPU fans, and potentially to exchange and transport heat more effectively. but you're still probably dependent on a chiller somewhere. regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Duclam80 at gmx.net Fri Apr 11 17:57:56 2003 From: Duclam80 at gmx.net (Vu Duc Lam) Date: Sat, 12 Apr 2003 04:57:56 +0700 Subject: Can't run NAS Benchmark Message-ID: <002f01c30075$71b38570$1a3afea9@conan> Hi, To run NAS Benchmark correctly, may be or not to install Scalapack library. I have some problem when trying to run 5 Kernel Benchmarks with class B and C. I have installed NAS Benchmark in a cluster System. The system is collection of Intel processor-based workstations and server interconnected by TCP/IP network. Each node is Intel with Pentium 800 MHz processor and 256 megabytes of memory, 2GB of Hark Disk. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Fri Apr 11 18:19:15 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Fri, 11 Apr 2003 15:19:15 -0700 (PDT) Subject: cooling systems In-Reply-To: Message-ID: In our system a 24 ton chiller has a backup water source in the form of a city water supply... 
There's no return on that one, it gets dumped into the waste water treatment stream so your water bill goes up in a big hurry, and it's way less effective because the water is in the upper 50s instead of the mid 30's. It's controlled by a vacuum break valve so switchover is automatic if one supply goes away... Backup fans, extensive temperature monitoring, and thermal kill switches are still necessary, esp as there have been about 6 chiller failures since I arrived in 93 (the thing is 22 years old at this point). water under the floor is always one of those exciting alarms... joelja On Fri, 11 Apr 2003, Mark Hahn wrote: > as usual, Robert (et al) gave a comprehensive answer. > I'd like to just emphasize one thing: expect trouble if you > use chilled-water cooling that's designed/managed for offices. > ours works fairly well during the summer (16 tons of chillers and > 35 KW dissipated, which should work out to 10/16 utilization.) > > but facilities people tend not to think of chilled water as a > critical resource, and construction people certainly do not. > you *will* have problems with the temperature of your chilled > water (not to mention whether it's even flowing). I was surprised > how little thermal capacity the chillers/pipes have - our room > heats up in seconds if there's any disruption. > > and don't forget to run your chiller blowers on your UPS :( > > I expect we'll be adding supplemental electical cooling soon ;( > > consider wiring up a few ibutton thermo sensors - I have 5 now > (incoming chilled water pipe, chilled air duct, dead/ambient > and hot/return air), and log them every 30 seconds or so. > yes, I have a little script that monitors the temps, logs them > in mysql, pages me, powers off. > > clusters are getting bigger, and these problems aren't going away. > yes, one solution is to use laptop processors. that works, but is > simply inapropriate for some applications. another is to try > water-cooling, which I've heard some cluster vendors are working on. > the main appeal there is to avoid flakey CPU fans, and potentially > to exchange and transport heat more effectively. but you're still > probably dependent on a chiller somewhere. > > regards, mark hahn. > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. 
-- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Fri Apr 11 16:31:58 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Sat, 12 Apr 2003 06:31:58 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <16023.624.775864.544742@napali.hpl.hp.com> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> <16023.624.775864.544742@napali.hpl.hp.com> Message-ID: <3E97263E.5010605@octopus.com.au> David, Itanium 2 isn't even competitive with other offerings from your own company. Compare: David Mosberger wrote: > Here is one real price point for an Itanium 2 workstation: > > - hp workstation zx2000 (Linux software enablement kit) > - Intel? Itanium 2 900MHz Processor with 1.5MB on-chip L3 cache > - 512MB Total PC2100 Registered ECC DDR 266 SDRAM Memory (2x256MB) > - 40GB EIDE Hard Drive > - NVIDIA Quadro2 EX > - 10/100/1000BT LAN integrated > - 16X Max DVD-ROM > - Linux software enablement kit (not an operating system) > - 3-year warranty, next-day, onsite hardware response, Mon - Fri, 8am - 5pm > - $3,298 with: - HP server rp2430 - 1xHP PA-8700 650MHz CPU with 2.25MB on-chip L1 cache - 128MB Roughly-2GB/sec-God-Knows-What ECC Memory - HP-UX 11i - 1-year warranty, next-day onsite hardware response - $1,095 (missing things like disk, a reasonable amount of RAM, etc can be brought to the level of the Itanium system you quote for another $700 or so - to see this config, go to www.e-solutions.hp.com, and try to buy an rp2430 (HP part #A6889A)) I bought one of these, and it is excellent (if a little loud. ;) I would happily buy a bare-bones Itanium 2 system at the same price. This doesn't seem to like it's going to be possible any time soon. In less than two weeks, I will be able to buy an Opteron system that runs a great deal faster at the same price. Good luck. Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From iod00d at hp.com Fri Apr 11 16:42:17 2003 From: iod00d at hp.com (Grant Grundler) Date: Fri, 11 Apr 2003 13:42:17 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E97263E.5010605@octopus.com.au> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> <16023.624.775864.544742@napali.hpl.hp.com> <3E97263E.5010605@octopus.com.au> Message-ID: <20030411204217.GA4306@cup.hp.com> On Sat, Apr 12, 2003 at 06:31:58AM +1000, Duraid Madina wrote: > Itanium 2 isn't even competitive with other offerings from your own company. Try comparing like products. Single CPU rp2430 is about 1/4 to 1/2 the perf of dual zx6000 depending on what one measures. 
grant _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Maggie.Linux-Consulting.com Fri Apr 11 16:52:47 2003 From: alvin at Maggie.Linux-Consulting.com (Alvin Oga) Date: Fri, 11 Apr 2003 13:52:47 -0700 (PDT) Subject: cooling systems In-Reply-To: <20030411193838.GB1690@greglaptop.internal.keyresearch.com> Message-ID: hi ya On Fri, 11 Apr 2003, Greg Lindahl wrote: > On Fri, Apr 11, 2003 at 11:41:07AM -0500, jbassett wrote: > > > Does anyone know of if it is possible to buy a rackmount cluster with an > > integrated cooling system? > > ASCI Red has a little air conditioner on the top of every rack, but > that's undoubtedly more expensive than using standard commercial > units. putting a real AC is nice but... also using standard household fans to blow air into the racks is goood too ( as long as air can get in and get out of the chassis and up and out the ( other side of the cabinet c ya alvin _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From toon at moene.indiv.nluug.nl Fri Apr 11 15:15:25 2003 From: toon at moene.indiv.nluug.nl (Toon Moene) Date: Fri, 11 Apr 2003 21:15:25 +0200 Subject: IA-64 related question (tangentially) References: Message-ID: <3E97144D.4050201@moene.indiv.nluug.nl> Robert G. Brown wrote: > FSF or Debian don't need Itaniums, really, except for maybe one or two > to ensure that builds work on the architecture. They don't "compute". > I do. To me a cycle is a precious thing as I use so MANY of them over > the years. Given what you told us about your code (how you didn't need Fortran and all those nice multi-rank array loops), I gather that you could get lots of cycles with an Itanium, but no speed. Perhaps David M-T can explain this better than me ... -- Toon Moene - mailto:toon at moene.indiv.nluug.nl - phoneto: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html GNU Fortran 95: http://gcc-g95.sourceforge.net/ (under construction) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From davidm at napali.hpl.hp.com Fri Apr 11 17:35:07 2003 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Fri, 11 Apr 2003 14:35:07 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <20030411212516.GD4306@cup.hp.com> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> <16023.624.775864.544742@napali.hpl.hp.com> <3E97263E.5010605@octopus.com.au> <20030411204217.GA4306@cup.hp.com> <3E972BA5.4010603@octopus.com.au> <20030411212516.GD4306@cup.hp.com> Message-ID: <16023.13579.676695.490297@napali.hpl.hp.com> >>>>> On Fri, 11 Apr 2003 14:25:16 -0700, Grant Grundler said: Grant> Anyway, my point still stands, comparing a "server" (rackable, remote console) Grant> with a "workstation" (3D gfx, sound, DVD-ROM) has alot of variables and Grant> different folks value these things differently. But even here, I would Grant> guess the difference in CPU perf alone is 2x or more by most measures. 
Grant> Yes, I know the price is 3x but other things make zx2000 attract to Grant> a different set of customers (like linux support). Also, don't forget memory bandwidth. The zx2000 is a very nice workstation. Granted, it's not $1k, but it doesn't perform like a Multia either! And yes, it's quiet, too (I use one as my main workstation these days... ;-) As for what the future holds, I guess we'll just have to wait and see. Remember though: just a year ago, the cheapest ia64 workstation you could get was priced at $7k+. This year, you can get a zx2000 for $3k+, so, judging from where I sit, prices certainly are coming down. --david _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From davidm at napali.hpl.hp.com Fri Apr 11 19:05:46 2003 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Fri, 11 Apr 2003 16:05:46 -0700 Subject: [Linux-ia64] Re: Itanium gets supercomputing software In-Reply-To: <3E9747C0.5080603@octopus.com.au> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> <3E972A5F.6000807@octopus.com.au> <16023.15589.825794.344192@napali.hpl.hp.com> <3E9747C0.5080603@octopus.com.au> Message-ID: <16023.19018.83511.297444@napali.hpl.hp.com> >>>>> On Sat, 12 Apr 2003 08:54:56 +1000, Duraid Madina said: Duraid> That's right - _you_ use real hardware because you actually have it. Duraid> Everyone else (figuratively speaking, though it's not far off the mark) Duraid> has no choice _but_ to use Ski. That sucks, regardless of whether or not Duraid> Ski is accurate, fast, or easy to use. Duraid> Anyway, I don't think there's much more that can be said. As Matt Duraid> indicated, we must pray for Deerfield, so I will continue to align my Duraid> holy carpet of hope to Fort Collins/Portland/Carly's hotel bedroom and Duraid> pray for reasonably priced IA64 hardware. Anyone can get access to ia64 hardware at: http://testdrive.hp.com/ They're shared machines, so kernel development is out, but for user-level development, they are very handy. --david _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Fri Apr 11 18:54:56 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Sat, 12 Apr 2003 08:54:56 +1000 Subject: [Linux-ia64] Re: Itanium gets supercomputing software In-Reply-To: <16023.15589.825794.344192@napali.hpl.hp.com> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> <3E972A5F.6000807@octopus.com.au> <16023.15589.825794.344192@napali.hpl.hp.com> Message-ID: <3E9747C0.5080603@octopus.com.au> David Mosberger wrote: >>>>>>On Sat, 12 Apr 2003 06:49:35 +1000, Duraid Madina said: > > Duraid> My point wasn't that software simulators are useless, but > Duraid> that software simulators _should_ be useless **4 years** > Duraid> (!!) after the public availability of hardware. > > Then how do you explain the popularity of user-mode linux on x86? If user-mode linux is "popular", then linux is "f@#%g buggy s#!t". 
I mean really, when your attitude to software development is: hey somethings wrong my swap is full what gives???? stop running 91589 copies of XMMS 8) shut up riel god i hate you ** Riel is banned from linux-kernel YO CHECK OUT THE NEW VM SYSTEM I WROTE THIS MORNING^H^H^H^H^H^H^HWEEK!! ITLL FIX YOUR PROBLEMS!!!! k i know 2.4 is supposed to be a "stable" kernel but god i hate that riel dude!! :| welp.. out with the old, in with the new!!!!! ** Linus integrates new VM THANKS D00D no probs m8 ..then yes, having UML as a sandbox can help. The UML guys see it differently though. According to them: "It doesn't need to be good for anything. It's fun!" Maybe Ski can embrace this spirit also. ;) > The reason I continue to use Ski is because it's one of the very few > simulators out there that are (a) architecturally extremely accurate, > (b) fast, and (c) very easy to setup & use. Ski is an asset for ia64 > linux, not a weakness. > > (And no, just because we have Ski doesn't mean we don't use real > hardware. Nothing could be further from the truth.) That's right - _you_ use real hardware because you actually have it. Everyone else (figuratively speaking, though it's not far off the mark) has no choice _but_ to use Ski. That sucks, regardless of whether or not Ski is accurate, fast, or easy to use. Anyway, I don't think there's much more that can be said. As Matt indicated, we must pray for Deerfield, so I will continue to align my holy carpet of hope to Fort Collins/Portland/Carly's hotel bedroom and pray for reasonably priced IA64 hardware. Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Fri Apr 11 16:49:35 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Sat, 12 Apr 2003 06:49:35 +1000 Subject: [Linux-ia64] Re: Itanium gets supercomputing software In-Reply-To: References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> Message-ID: <3E972A5F.6000807@octopus.com.au> I guess I was being a bit subtle. I'm well aware there are things you can do with a simulator that you can't do with hardware. Like test your code against what's supposed to happen, not what actually happens. ;) My point wasn't that software simulators are useless, but that software simulators _should_ be useless **4 years** (!!) after the public availability of hardware. When I said: > I put it to you that software is easier to develop on hardware. I meant that at this late stage, one would expect that people would be writing software, on their hardware. And not a whole lot else, all things considered. Do you see x86 linux people using simulators? Once in a blue moon, perhaps. Does anyone doubt that the x86-64 port will mature a heck of a lot faster than linux-ia64 has? One doesn't need to think for very long to realise why this might be. Don't get me wrong, I think Linus was being a complete idiot for his comments against IA64 and for x86-64, but insofar as keeping hardware pricing so high that Joe K. Hacker can't even dream of affording it is "good business" on Intel/HP's part, it's an even better way of keeping your kernel untested. Duraid David K?gedal wrote: > Exactly. There are a lot of things that you can do with a simulator > that you can't do with hardware. 
Developing software before hardware > is available is just one of them. (plug mode on) That's why we sell > simulators for most major current CPU architectures. Including IA64. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bropers at lsu.edu Fri Apr 11 17:52:57 2003 From: bropers at lsu.edu (Brian D. Ropers-Huilman) Date: Fri, 11 Apr 2003 16:52:57 -0500 (CDT) Subject: cooling systems In-Reply-To: References: Message-ID: On Fri, 11 Apr 2003, Joel Jaeggli wrote: > if you had enough space or a convenient lake you could also use a heat-pump. One of the "old-timers" here tells a story of some old system, possibly a VAXen, that was water cooled. The water came from the deep nearby lake. There was apparantly an intricate screening system to prevent muck, plants, and the like from getting into the system. Unfortunately, ... one day the temperature suddenly shot through the roof and the system crashed (there may have also been some form of physical damage). The cause, as you may already have guessed: a fish somehow made it into the system and was literally gumming up the pumps. :( Possibly geek-urban legend, but the source is reliable. -- Brian D. Ropers-Huilman (225) 578-0461 (V) Systems Administrator AIX (225) 578-6400 (F) Office of Computing Services GNU Linux brian at ropers-huilman.net High Performance Computing .^. http://www.ropers-huilman.net/ Fred Frey Building, Rm. 201, E-1Q /V\ \o/ Louisiana State University (/ \) -- __o / | Baton Rouge, LA 70803-1900 ( ) --- `\<, / `\\, ^^-^^ O/ O / O/ O _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From davidm at napali.hpl.hp.com Fri Apr 11 18:08:37 2003 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Fri, 11 Apr 2003 15:08:37 -0700 Subject: [Linux-ia64] Re: Itanium gets supercomputing software In-Reply-To: <3E972A5F.6000807@octopus.com.au> References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> <3E972A5F.6000807@octopus.com.au> Message-ID: <16023.15589.825794.344192@napali.hpl.hp.com> >>>>> On Sat, 12 Apr 2003 06:49:35 +1000, Duraid Madina said: Duraid> My point wasn't that software simulators are useless, but Duraid> that software simulators _should_ be useless **4 years** Duraid> (!!) after the public availability of hardware. Then how do you explain the popularity of user-mode linux on x86? The reason I continue to use Ski is because it's one of the very few simulators out there that are (a) architecturally extremely accurate, (b) fast, and (c) very easy to setup & use. Ski is an asset for ia64 linux, not a weakness. (And no, just because we have Ski doesn't mean we don't use real hardware. Nothing could be further from the truth.) 
--david _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kevin.vanmaren at unisys.com Fri Apr 11 18:36:13 2003 From: kevin.vanmaren at unisys.com (Van Maren, Kevin) Date: Fri, 11 Apr 2003 17:36:13 -0500 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <3FAD1088D4556046AEC48D80B47B478C0101F734@usslc-exch-4.slc.unisys.com> > I don't know exactly what price/configuration Power4 machines start. > Perhaps one of the IBMers on this list could chime in? > > --david Okay, my turn to plug: Itanium 2-based systems _are_ very competitive with RISC machines, even at the mid-range and high end. Unisys is currently selling 4 to 16-processor Itanium 2 machines. They are very competitively priced against mid-sized RISC machines, although pricing information is not available on the web. SCO's UnitedLinux will be available as soon as SCO ships. http://www.unisys.com/products/es7000__servers/hardware/aries__130.htm For a quote in North America, you can contact Rob Luke, rob.luke at unisys.com (801) 594-5088. I can get you contact info for other parts of the world. Kevin Van Maren Unisys _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Fri Apr 11 16:55:01 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Sat, 12 Apr 2003 06:55:01 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <20030411204217.GA4306@cup.hp.com> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> <16023.624.775864.544742@napali.hpl.hp.com> <3E97263E.5010605@octopus.com.au> <20030411204217.GA4306@cup.hp.com> Message-ID: <3E972BA5.4010603@octopus.com.au> Grant Grundler wrote: > Try comparing like products. > > Single CPU rp2430 is about 1/4 to 1/2 the perf of dual zx6000 depending > on what one measures. Isn't a single CPU rp2430 somewhere between 1/8 and 1/4 the price of a dual zx6000? Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From iod00d at hp.com Fri Apr 11 17:25:16 2003 From: iod00d at hp.com (Grant Grundler) Date: Fri, 11 Apr 2003 14:25:16 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E972BA5.4010603@octopus.com.au> References: <16022.194.793900.97453@napali.hpl.hp.com> <003601c30005$9d740a10$a620a4d5@tsatec.int> <16023.624.775864.544742@napali.hpl.hp.com> <3E97263E.5010605@octopus.com.au> <20030411204217.GA4306@cup.hp.com> <3E972BA5.4010603@octopus.com.au> Message-ID: <20030411212516.GD4306@cup.hp.com> On Sat, Apr 12, 2003 at 06:55:01AM +1000, Duraid Madina wrote: > Isn't a single CPU rp2430 somewhere between 1/8 and 1/4 the price of a > dual zx6000? Dunno. is it? Try comparing dual rp2470 and dual rx2600 with similar products from other vendors. David wrote: | - hp workstation zx2000 (Linux software enablement kit) | - Intel? Itanium 2 900MHz Processor with 1.5MB on-chip L3 cache Sorry - my bad. I misread that as "2x 900 Mhz". 
Anyway, my point still stands, comparing a "server" (rackable, remote console) with a "workstation" (3D gfx, sound, DVD-ROM) has alot of variables and different folks value these things differently. But even here, I would guess the difference in CPU perf alone is 2x or more by most measures. Yes, I know the price is 3x but other things make zx2000 attract to a different set of customers (like linux support). grant _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Sat Apr 12 01:19:28 2003 From: astroguy at bellsouth.net (astroguy at bellsouth.net) Date: Sat, 12 Apr 2003 1:19:28 -0400 Subject: cooling systems (jbassett) Message-ID: <20030412051928.DXW28599.imf60bis.bellsouth.net@mail.bellsouth.net> Hi List, (I posted this to the group a bit earlier and it did not get posted but I see from Mike's post that this is not as strange as it first may seem;) I know this might sound a little strange... but strange is my stock and trade I've been working on cooling solutions ever since I had my first 64/128... I have submerged, entire boards into a liquid silicon solution... playing around with liquid coolant is sometimes messy but to my surprise it worked! Perhaps not very practical for rack nodes and I'm not really sure of how this might work over the long haul but for a day it worked find and did demonstrate at least in principle an approach... Now, something off the shelf and not requiring large vats of slick and messy goo, maybe is to build a rack of nodes built inside a self contained refrigeration units like the ones we all have seen at some of the mom and pop convenient stores... the boxed ones with the glass doors... make them air tight and then you are only cooling your rack. Like I said, a little strange twist on an"inside" the box approach;P C.Clary Spartan Sys.analyst PO 1515 Spartanburg, SC 29304-0243 Fax# 801-858-2722 > > From: Mike Sullivan > Date: 2003/04/11 Fri PM 02:47:26 EDT > To: beowulf at beowulf.org > Subject: Re:cooling systems (jbassett) > > We have designed a custom cabinet for a client that must run in a room > without AC. The system has > an integral centrifugal blower that can have the air ducted to an > outside sink. ( or to double as a furnace > in the winter). The Motherboards mount to trays inside the cabinet so we > do not use standard 1U cases. > > -- > Mike Sullivan Director Performance Computing > @lliance Technologies, Voice: (416) 385-3255 x 228, > 18 Wynford Dr, Suite 407 Fax: (416) 385-1774 > Toronto, ON, Canada, M3C-3S2 Toll Free:1-877-216-3199 > http://www.alltec.com > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bclem at rice.edu Sat Apr 12 12:06:47 2003 From: bclem at rice.edu (Brent M. Clements) Date: Sat, 12 Apr 2003 11:06:47 -0500 (CDT) Subject: HPL Benchmark on Itanium 2 box In-Reply-To: <87adevu343.fsf@bix.grotte> Message-ID: Hi Guys, I'm trying to compile the hpl benchmark on a HP zx6000 box. I have the hp math libraries and the intel 7.0 compilers. 
Has anyone ever tried compiling the hpl benchmark using this compile configuration? If so could they send me their Makefile The reason I'm asking is because I keep on getting the following error HPL_pdtest.o: In function `HPL_pdtest': HPL_pdtest.o(.text+0x1a82): undefined reference to `cblas_dgemv' HPL_pdtest.o(.text+0x1ad2): undefined reference to `cblas_dgemv' Anyone have a clue? Thanks, Brentr Clements _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From david at virtutech.se Sat Apr 12 11:50:20 2003 From: david at virtutech.se (David =?iso-8859-1?q?K=E5gedal?=) Date: Sat, 12 Apr 2003 17:50:20 +0200 Subject: Itanium gets supercomputing software In-Reply-To: <3E972A5F.6000807@octopus.com.au> (Duraid Madina's message of "Sat, 12 Apr 2003 06:49:35 +1000") References: <20030410182609.GF29125@cup.hp.com> <3E95DA42.7000607@octopus.com.au> <16022.194.793900.97453@napali.hpl.hp.com> <3E960510.6070503@octopus.com.au> <20030411022046.GA22381@cse.unsw.edu.au> <3E972A5F.6000807@octopus.com.au> Message-ID: <87adevu343.fsf@bix.grotte> Duraid Madina writes: > I guess I was being a bit subtle. > > I'm well aware there are things you can do with a simulator that you > can't do with hardware. Like test your code against what's supposed to > happen, not what actually happens. ;) > > My point wasn't that software simulators are useless, but that > software simulators _should_ be useless **4 years** (!!) after the > public availability of hardware. Why is that? There are numerous reasons for using simulators to develop software, especially low-level software (OS, drivers, firmware etc.) You get things like full system visibility, non-intrusive debugging, deterministic repeatability, fault injection, and more. -- David K?gedal, Virtutech _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Sat Apr 12 11:54:23 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Sat, 12 Apr 2003 08:54:23 -0700 Subject: HPL Benchmark on Itanium 2 box In-Reply-To: References: <87adevu343.fsf@bix.grotte> Message-ID: <20030412155423.GA2884@greglaptop.attbi.com> On Sat, Apr 12, 2003 at 11:06:47AM -0500, Brent M. Clements wrote: > HPL_pdtest.o: In function `HPL_pdtest': > HPL_pdtest.o(.text+0x1a82): undefined reference to `cblas_dgemv' > HPL_pdtest.o(.text+0x1ad2): undefined reference to `cblas_dgemv' You're missing some BLAS (linear algebra) subroutines. I'm pretty sure that the HPL documentation explains this. For most machines, ATLAS provides a pretty good BLAS library. greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Sat Apr 12 12:22:34 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Sat, 12 Apr 2003 09:22:34 -0700 Subject: Can't run NAS Benchmark In-Reply-To: <002f01c30075$71b38570$1a3afea9@conan> References: <002f01c30075$71b38570$1a3afea9@conan> Message-ID: <20030412162234.GB2884@greglaptop.attbi.com> On Sat, Apr 12, 2003 at 04:57:56AM +0700, Vu Duc Lam wrote: > I have some problem when trying to run 5 Kernel Benchmarks with class B and > C. 
From lindahl at keyresearch.com Sat Apr 12 12:22:34 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Sat, 12 Apr 2003 09:22:34 -0700 Subject: Can't run NAS Benchmark In-Reply-To: <002f01c30075$71b38570$1a3afea9@conan> References: <002f01c30075$71b38570$1a3afea9@conan> Message-ID: <20030412162234.GB2884@greglaptop.attbi.com> On Sat, Apr 12, 2003 at 04:57:56AM +0700, Vu Duc Lam wrote: > I have some problem when trying to run 5 Kernel Benchmarks with class B and > C. These benchmarks take a large amount of memory -- do you have enough? greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mizhou at coe.neu.edu Sat Apr 12 15:34:38 2003 From: mizhou at coe.neu.edu (Mi Zhou) Date: Sat, 12 Apr 2003 14:34:38 -0500 Subject: CPU time accounting Message-ID: <008b01c3012a$8edc6ad0$0402a8c0@HuaMao> I am new to cluster management. I want to get some statistics on the usage of the cluster. Is there some utility that can summarize CPU usage of each user/group? Thanks, Mi _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at paralline.com Mon Apr 14 06:57:46 2003 From: beowulf at paralline.com (Pierre BRUA) Date: Mon, 14 Apr 2003 12:57:46 +0200 Subject: renting time on a cluster In-Reply-To: <20030407125604.GR2067@leitl.org> References: <20030407125604.GR2067@leitl.org> Message-ID: <3E9A942A.1030005@paralline.com> Eugen Leitl wrote: > Can you think of places where one can rent nontrivial > amount of crunch for money? Paralline can rent a nontrivial amount of crunch for money. The question is: how much money is your friend ready to spend, how much crunch is he looking for (requested node config would be nice), and for how much time. Pierre -- PARALLINE /// Clusters, Linux, Java /// 71,av des Vosges Phone:+33 388 141 740 F-67000 STRASBOURG Fax:+33 388 141 741 http://www.paralline.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From robl at mcs.anl.gov Sun Apr 13 13:57:59 2003 From: robl at mcs.anl.gov (Robert Latham) Date: Sun, 13 Apr 2003 12:57:59 -0500 Subject: Mac OS X or Linux? In-Reply-To: References: <20030409061920.GA32255@mcs.anl.gov> Message-ID: <20030413175756.GA19214@mcs.anl.gov> On Wed, Apr 09, 2003 at 05:04:52PM -0400, Mark Hahn wrote: > > http://terizla.org/~robl/pbook/benchmarks/lmbench-linux_vs_osx.1 > does OS X have page coloring inherited from *BSD? perhaps that > explains the only place it comes out ahead (memory bandwidth/latency). linux loses out on the Libc(bcopy) score because gnu libc doesn't have ppc-optimised string and memory operations. This really surprised me, but hopefully someone (maybe me, if i can learn ppc assembly fast enough :> ) will implement them. The other memory bandwidth and latency numbers are too close to call, unless i'm missing something. ==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From young_yuen at yahoo.com Sun Apr 13 12:36:41 2003 From: young_yuen at yahoo.com (Young Yuen) Date: Sun, 13 Apr 2003 09:36:41 -0700 (PDT) Subject: problem with ANA-6911A/TX under kernel 2.4.18 Message-ID: <20030413163641.31713.qmail@web41303.mail.yahoo.com> Hi, The Tulip driver doesn't seem to detect the RJ45 port. My kernel ver is 2.4.18 and Tulip driver ver is 0.9.15. Linux Tulip driver version 0.9.15-pre11 (May 11, 2002) tulip0: EEPROM default media type Autosense.
tulip0: Index #0 - Media MII (#11) described by a 21142 MII PHY (3) block. tulip0: Index #1 - Media 10base2 (#1) described by a 21142 Serial PHY (2) block. tulip0: ***WARNING***: No MII transceiver found! divert: allocating divert_blk for eth0 eth0: Digital DS21143 Tulip rev 33 at 0xc6855000, 00:00:D1:00:0B:4B, IRQ 11. Sometimes after a reboot the warning message is gone. tulip0: MII transceiver #1 config 3100 status 7809 advertising 0101. divert: allocating divert_blk for eth0 eth0: Digital DS21143 Tulip rev 33 at 0xc6855000, 00:00:D1:00:0B:4B, IRQ 11. But in either case, it fails to ping any nodes on the network besides its own. The ANA-6911A/TX is a 100BaseT/10Base2 combo card; the RJ45 port is connected to the LAN. Windows, dual-booted on the same machine, works fine, which shows no problem with the network configuration or hardware. Can you please kindly advise. Thx & Rgds, Young __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Sat Apr 12 19:16:03 2003 From: landman at scalableinformatics.com (Joseph Landman) Date: 12 Apr 2003 19:16:03 -0400 Subject: HPL Benchmark on Itanium 2 box In-Reply-To: References: Message-ID: <1050189362.20946.89.camel@protein.scalableinformatics.com> Hi Brent: Looks like either a missing library, or a library order issue. The HPL_pdtest.o is trying to find the cblas_dgemv function. This function is likely supplied in the optimized Intel libs (though I don't know which library, but it would be one supplying BLAS and LAPACK routines optimized for the platform). You may have a -L/path in front of the correct -lcblas (or similar library name). If you can find out which library is supposed to provide that function, try moving it to a different position in the link line. Joe On Sat, 2003-04-12 at 12:06, Brent M. Clements wrote: > Hi Guys, > I'm trying to compile the hpl benchmark on a HP zx6000 box. > > I have the hp math libraries and the intel 7.0 compilers. > > Has anyone ever tried compiling the hpl benchmark using this compile > configuration? If so could they send me their Makefile > > The reason I'm asking is because I keep on getting the following error > > HPL_pdtest.o: In function `HPL_pdtest': > HPL_pdtest.o(.text+0x1a82): undefined reference to `cblas_dgemv' > HPL_pdtest.o(.text+0x1ad2): undefined reference to `cblas_dgemv' > > Anyone have a clue? > > Thanks, > > > Brent Clements > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Joseph Landman, Ph.D Scalable Informatics LLC email: landman at scalableinformatics.com web: http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Mon Apr 14 09:29:55 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 14 Apr 2003 09:29:55 -0400 (EDT) Subject: CPU time accounting In-Reply-To: <008b01c3012a$8edc6ad0$0402a8c0@HuaMao> Message-ID: On Sat, 12 Apr 2003, Mi Zhou wrote: > I am new to cluster management.
I wan to get some statistics on the usage of > the cluster. Is there some utility that can summarize CPU usage of each > user/group? What an interesting question! The "psacct" package in Red Hat et. al. linuces contains the BSD system accounting package (accton, sa, ac, etc). Install and read the man pages to see what you get, on a node by node basis. I have no idea if any of the other cluster monitor tools for general workstation clusters interfaces with psacct -- xmlsysd (my own) does not, although it wouldn't be terribly difficult to hack it so that it did. Alternatively, and perhaps more intelligently (since this isn't the kind of question one generally cares to have answered in a 5-10 second polling loop as the changes are usually fairly predictable given knowledge of who's on a cluster at any given time) it would be fairly straightforward to write a collection script in e.g. perl that polled each node on demand and cumulated results across a cluster. That would be relatively resource expensive -- order of a second per remote ssh call to get the cumulated results -- but presumably one would only run it once a day or thereabouts to cumulate the usage du jour. Note that accounting isn't usually turned on by default because it is "expensive" in its own right -- the system creates an accounting file that gets a record for each process run, and adds a write to this file to the termination sequence for each job as it finishes to preserve its cumulated stat data. On a typical normal node this won't be a horrible problem, but on a system running lots of little commands or with a broken looping command stream it can be. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From anand at novaglobal.com.sg Sun Apr 13 22:44:57 2003 From: anand at novaglobal.com.sg (Anand Vaidya) Date: Mon, 14 Apr 2003 10:44:57 +0800 Subject: SMC8624T vs DLINK DGC-1024T / Jumbo Frames ? In-Reply-To: <1050057248.3e969a20125dd@intranet.anorg.chemie.tu-muenchen.de> References: <5.1.1.6.0.20030401141622.033c1e20@crux.stmarys.ca> <1050057248.3e969a20125dd@intranet.anorg.chemie.tu-muenchen.de> Message-ID: <200304141044.59457.anand@novaglobal.com.sg> On Friday 11 April 2003 06:34 pm, rochus.schmid at ch.tum.de wrote: > dear beowulfers, > > we are in a similar situation as dave: we get an 8nodes dual-xeon cluster > (with tyan e7501 mobo) with intel gige on board. and now the "switch issue" > comes up. my vendor also suggested the DLINK, whereas i found the > discussion on the (more expensive managed) SMC on this list supporting > jumbo frames. the issue was whether or not any of the cheaper (unmanaged) > switches support jumbo frames. i couldnt figure out if this is resolved, > yet. it sounded like they might, but since they are unmanaged the problem > is to switch it on or off. is that right? > > i also found this document: > http://www.scl.ameslab.gov/Publications/ HalsteadPubs/usenix_halstead.pdf ----------------------------------------- The above URL seems to be outdated. 
Please use http://www.scl.ameslab.gov/Publications/Halstead/usenix_halstead.pdf ----------------------------------------- > it says that the effect on bandwith with jumbo frames is only seen for > tcp/ip commun (netpipe) but is completely lost using MPI. since my code is > MPI based it wouldn't matter to have jumbo frames and i could go with the > cheaper DLINK. is this info right? or outdated? missunderstood? > > any hints highly appreciated. > > greetings > > rochus > > Quoting Dave Lane : > > Can anyone comment on the strengths/weaknesses of these two 24-port > > gigabit > > switches. We're going to be building a 16 node dual-Xeon cluster this > > spring and were planning on the SMC switch (which has received good > > review > > here before), but a vendor pointed out the DLINK switch as a less > > expensive > > alternative. > > > > ... Dave > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From chris_oubre at hotmail.com Mon Apr 14 10:46:58 2003 From: chris_oubre at hotmail.com (Chris Oubre) Date: Mon, 14 Apr 2003 09:46:58 -0500 Subject: HPL Benchmark on Itanium 2 box In-Reply-To: <200304121901.h3CJ19s24252@NewBlue.Scyld.com> Message-ID: <001501c30294$b3608b50$25462a80@rice.edu> Have you tried using the Intel MKL (Math Kernel Library)? http://www.intel.com/software/products/mkl/mkl52/ This is what we use. We have found this library very fast! **************************************************** Christopher D. Oubre * email: chris_oubre at hotmail.com * research: http://cmt.rice.edu/~coubre * Web: http://www.angelfire.com/la2/oubre * Hangout: http://pub44.ezboard.com/bsouthterrebonne * Phone:(713)348-3541 Fax: (713)348-4150 * Rice University * Department of Physics, M.S. 61 * 6100 Main St. ^-^ * Houston, Tx 77251-1892, USA (O O) * -= Phlax=- ( v ) * ************************************m*m************* Message: 2 Date: Sat, 12 Apr 2003 11:06:47 -0500 (CDT) From: "Brent M. Clements" To: , Subject: HPL Benchmark on Itanium 2 box Hi Guys, I'm trying to compile the hpl benchmark on a HP zx6000 box. I have the hp math libraries and the intel 7.0 compilers. Has anyone ever tried compiling the hpl benchmark using this compile configuration? If so could they send me their Makefile The reason I'm asking is because I keep on getting the following error HPL_pdtest.o: In function `HPL_pdtest': HPL_pdtest.o(.text+0x1a82): undefined reference to `cblas_dgemv' HPL_pdtest.o(.text+0x1ad2): undefined reference to `cblas_dgemv' Anyone have a clue? 
Thanks, Brentr Clements --__--__-- _______________________________________________ Beowulf mailing list Beowulf at beowulf.org http://www.beowulf.org/mailman/listinfo/beowulf End of Beowulf Digest _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at plogic.com Mon Apr 14 13:13:41 2003 From: deadline at plogic.com (Douglas Eadline) Date: Mon, 14 Apr 2003 12:13:41 -0500 (CDT) Subject: Can't run NAS Benchmark In-Reply-To: <002f01c30075$71b38570$1a3afea9@conan> Message-ID: On Sat, 12 Apr 2003, Vu Duc Lam wrote: > Hi, > > To run NAS Benchmark correctly, may be or not to install Scalapack library. > I have some problem when trying to run 5 Kernel Benchmarks with class B and > C. I have installed NAS Benchmark in a cluster System. The system is > collection of Intel processor-based workstations and server interconnected > by TCP/IP network. Each node is Intel with Pentium 800 MHz processor and > 256 megabytes of memory, 2GB of Hark Disk. You may wish to look at the BPS (Beowulf Performance Suite): http://www.cluster-rant.com/article.pl?sid=03/03/17/1838236 and: http://www.hpc-design.com/reports/bps1/index.html BPS has the NAS suite included. It also has a script that allows different compilers (gnu,pgi,intel), numbers of CPUs, test size, and MPI's (mpich,lam,mpipro) It has everything you need to run the tests (and more). Doug > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From JairoArbey at gmx.net Mon Apr 14 11:20:01 2003 From: JairoArbey at gmx.net (Jairo Arbey Rodriguez) Date: Mon, 14 Apr 2003 10:20:01 -0500 Subject: Help f90 - intel Message-ID: <000601c30299$52de7b70$72b0fea9@Q13197.tjdo.com> Hi Friends: I have two PC. One with intel processor and Athlon processor the other one. I got the fortran intel compiler (ifc) and it was installed successful on the first pc. Now I want to install on the second pc (with Athlon processor), but when I issued the command ?./install?, the installer stopped with a message saying: install can't identify your machine type, glibc or kernel. This product is supported for use with the following combinations. Machine Type Kernel glibc 1. IA-32 2.4.7 2.2.4, or IA-32 2.4.18 2.2.5, or 2. Itanium(R)-based system 2.4.3 2.2.3, or Itanium(R)-based system 2.4.9 2.2.4, or Itanium(R)-based system 2.4.18 2.2.4 x. Exit For an unsupported install, select the platform most similar to yours. [H[2JRPM shows no Intel packages as installed. Which of the following would you like to install? 1. Intel(R) Fortran Compiler for 32-bit applications, Version 7.0 Build 20030212Z 2. Linux Application Debugger for 32-bit applications, Version 7.0, Build 20021218 x. 
Exit [H[2JIntel(R) Fortran Compiler for 32-bit applications, Version 7.0 Build 20030212Z ------------------------------------------------------------------------ -------- Please carefully read the following license agreement. Prior to installing the software you will be asked to agree to the terms and conditions of the following license agreement. ------------------------------------------------------------------------ -------- Press Enter to continue. 'accept' to continue, 'reject' to return to the main menu. Where do you want to install to? Specify directory starting with '/'. [/opt/intel] What rpm install options would you like? [-U --replacefiles] ------------------------------------------------------------------------ -------- Intel(R) Fortran Compiler for 32-bit applications, Version 7.0 Build 20030212Z Installing... error: failed dependencies: ld-linux.so.2 is needed by intel-ifc7-7.0-87 libc.so.6 is needed by intel-ifc7-7.0-87 libm.so.6 is needed by intel-ifc7-7.0-87 libpthread.so.0 is needed by intel-ifc7-7.0-87 libc.so.6(GLIBC_2.0) is needed by intel-ifc7-7.0-87 libc.so.6(GLIBC_2.1) is needed by intel-ifc7-7.0-87 libc.so.6(GLIBC_2.1.3) is needed by intel-ifc7-7.0-87 libc.so.6(GLIBC_2.2) is needed by intel-ifc7-7.0-87 libm.so.6(GLIBC_2.0) is needed by intel-ifc7-7.0-87 libpthread.so.0(GLIBC_2.0) is needed by intel-ifc7-7.0-87 libpthread.so.0(GLIBC_2.1) is needed by intel-ifc7-7.0-87 Installation failed. ------------------------------------------------------------------------ -------- Press Enter to continue. RPM shows no Intel packages as installed. Which of the following would you like to install? 1. Intel(R) Fortran Compiler for 32-bit applications, Version 7.0 Build 20030212Z 2. Linux Application Debugger for 32-bit applications, Version 7.0, Build 20021218 x. Exit Exiting... I am sure that the libraries mentioned above are in /lib/ and /usr/lib. I want to question you: What do I do? Thanks in advance. Jairo Arbey Rodriguez M. Grupo de Fisica de la Materia Condensada Dept. de F?sica, Universidad Nacional de Colombia, Colombia FAX: (571) 244 9122 & (571) 316 5135 TEL: (571) 316 5000 Ext. 13047 / 13081 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 1163 bytes Desc: not available URL: From chettri at gst.com Mon Apr 14 14:59:42 2003 From: chettri at gst.com (chettri at gst.com) Date: Mon, 14 Apr 2003 11:59:42 -0700 Subject: beowulf in space Message-ID: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Has anybody considered the theoretical aspects of placing beowulfs on a cluster of satellites? I understand that communication will be slower AND unreliable, and it would restrict the set of problems that could be solved. I'm looking for papers/tech reps etc on the subject. Regards, Samir Chettro _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Mon Apr 14 13:53:07 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 14 Apr 2003 13:53:07 -0400 (EDT) Subject: beowulf in space In-Reply-To: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Message-ID: On Mon, 14 Apr 2003 chettri at gst.com wrote: > Has anybody considered the theoretical aspects of placing beowulfs on a > cluster of satellites? 
I understand that communication will be slower AND > unreliable, > and it would restrict the set of problems that could be solved. I'm looking > for papers/tech reps etc on the subject. Well, let's see. Beowulfs for what purpose? As far as building general purpose computational supercomputing centers in space, that is such a phenomenally silly idea that anyone that DID have it would probably shake their head after a minute or two of reflection and resolve never to use those particular drugs again. As you say, problems include: a) expense b) communications latency (bandwidth actually can be as big as you like or are likely to ever need, since you ARE a satellite, after all...:-) c) access/maintenance difficulties d) expense e) cooling (think of the cluster as being located a really big vacuum flask) f) onsite staff (astrobots? astroadministrators?) g) radiation and shielding h) energy supply i) hard to get 24 hour turnaround on spare parts j) did I mention expense? Even if you think about some sort of space station as being just another cluster room and the cluster nodes being just off-the-shelf units from Dell, you're looking at one hell of a delivery charge... Now, with all of that said, it may be perfectly reasonable and sane to send small clusters aloft -- I suspect that we already do, every time we launch a shuttle or send experiments up. Many modern jets are architected like a "cluster" in many ways, with sensors and processing units all over the place, interconnected by a network of sorts. A compute cluster has a lot of desirable features -- an extension of the available total computational power that can be brought to bear on certain problems, for example, in addition to some highly desirable redundancy (if a node dies out of five or six you've got, you can proceed to function a bit slower -- if a system dies and is all you've got, you're in a lot more trouble). The "problems" that would be solved are thus restricted by common sense -- dedicated tasks in many cases to accomplish some specific purpose, or MAYBE a very small general purpose cluster on something like a space station doing science that happened to need some local processing power. In most cases, though, it would still make more sense to locate the processing power on the ground and use a dedicated comm channel to the ground to access it. Something out in space has an excellent vantage point to establish high bandwidth (high latency) with any number of ground stations. rgb > > Regards, > > Samir Chettro > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Mon Apr 14 13:47:55 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Mon, 14 Apr 2003 10:47:55 -0700 (PDT) Subject: beowulf in space In-Reply-To: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Message-ID: On Mon, 14 Apr 2003 chettri at gst.com wrote: > Has anybody considered the theoretical aspects of placing beowulfs on a > cluster of satellites? 
> I understand that communication will be slower AND > unreliable, Communication to satellites needs to be neither slow nor unreliable; it is generally fairly high latency... It can be quite expensive. There are clusters of computers in space. They generally aren't what you would consider heavy computation platforms... The biggest issues with computer resources in space are:
mass - a large satellite such as the hughes galaxy 4r bird is around 2500kg for everything; that's half the mass of the UPS that backs up our racks. every gram you send up costs you.
power - solar power and long life cadmium batteries mean your whole platform has to run on pretty thin resources. again using something like galaxy 4r, which is a very powerful satellite, 8800 watts is what you get max to power everything... That's with a 26 meter span of gallium arsenide solar cells. most of the power is going to communications equipment; in the case of galaxy 4r that would be 24 c band at 40w each and 24 ku at 108w each.
radiation hardening - without 50 miles of atmosphere overhead we're kinda close to the sun, and gamma ray bursts from other parts of the galaxy are kinda hard on the equipment.
thermal management - air cooling doesn't work given no atmosphere... even on something like the iss hot air doesn't rise in microgravity, you have to resort to fairly extreme measures to deal with the thermal management issues. if you see the laptops on the shuttle they're mostly pentium class thinkpads with some fairly serious mods. There's mission specific equipment as well, but you won't find a rack of dual xeons floating around due to thermal issues alone (disregarding mass or power requirements).
expected service life - if you plan on going to the expense of putting it in geostationary orbit you're probably planning on keeping it up there for a minimum of 10-15 years, so it has to still work after a decade in a hostile environment, and upgrades and service calls aren't in the plan.
It's pretty easy to spend a billion dollars by the time everything is said and done putting up a large satellite. you generally try to loft only what's critical to the mission of the satellite. > and it would restrict the set of problems that could be solved. I'm looking > for papers/tech reps etc on the subject. > > Regards, > > Samir Chettro > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 14 14:46:05 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 14 Apr 2003 14:46:05 -0400 (EDT) Subject: beowulf in space In-Reply-To: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Message-ID: > Has anybody considered the theoretical aspects of placing beowulfs on a > cluster of satellites?
if this interests you, I highly recommend reading Vernor Vinge's recent books (A Deepness in the Sky, for instance). Robert Forward has some topical ones, too. they are science fiction, though... > I understand that communication will be slower AND > unreliable, well, to the extent that such a cluster would be spread out, I can understand the "slower" part. though c in vacuum is higher than c in fiber or TP. I don't see the "unreliable" part. are you presuming some kind of traditional RF modulation? using free-space optics seems like the more obvious way to network satellites, and I don't see why that would be flakey. > and it would restrict the set of problems that could be solved. I'm looking > for papers/tech reps etc on the subject. I doubt they exist, simply because there's no practical reason, given the huge cost and unclear advantage. I can imagine some really great advertisements for colo though ;) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From derek.richardson at pgs.com Mon Apr 14 13:46:05 2003 From: derek.richardson at pgs.com (Derek Richardson) Date: Mon, 14 Apr 2003 12:46:05 -0500 Subject: Machine Check Exception Message-ID: <3E9AF3DD.1070002@pgs.com> All, Does anyone know if a power supply can cause a machine check exception? ( I would think that the VRM would stop it from affecting the processor, but what about the rest of the system - seems odd that the machine wouldn't fail in other ways... ) I have a cluster node that keeps crashing w/ one, and I've looked it up in the Intel ia32 manual, and it's not specific to the processor or RAM ( which I have already changed out ), so I've just been swapping parts out ( so far I've swapped CPU0, where the Exception took place, all the RAM, all the fibre, network, and RSA cards, the motherboard, etc. - basically the only things that are the same as the original node are the chassis, power supply, scsi disk ( but not controller ), CPU1, and CPU1's VRM ). I just changed out the VRM for CPU0 and am putting the node back into use once its fibre disk fscks: this might fix the problem. Does anyone have any thoughts on this? I'd hate to throw the entire scenario out and just replace the entire node ( since I'll eventually have to find and replace the faulty hardware and I've already done so much, I'd like to finish it ). Thanks, Derek R. -- Linux Administrator derek.richardson at pgs.com derek.richardson at ieee.org Office 713-781-4000 Cell 713-817-1197 bureaucracy, n: A method for transforming energy into solid waste. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Mon Apr 14 15:40:57 2003 From: astroguy at bellsouth.net (astroguy at bellsouth.net) Date: Mon, 14 Apr 2003 15:40:57 -0400 Subject: beowulf in space Message-ID: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> Hi list, Ok, you computer geniuses and rocket scientists all... I tend to agree with Dr. Brown's position but for the sake of argument... Let's think along the lines of where a computational cluster might find some in-space application. Say for example we were to launch a probe into the Sun's outer corona...
let's assume also that we have some shielding device that would sustain the craft in the 10 million degrees C or so that such a craft is sure to encounter... Even with our best science fiction such a craft could only endure a few precious moments in such a space environment, so we would have to use the advantage of speed... Ok, so we use an ion engine to get the craft up to speed... since the sun's corona extends apparently 700,000 km or so into space... the craft would have to get up to a speed of, say, 250,000 mph, which we have yet to achieve but is not impossible... Slingshot around Jupiter and Mars and back to the sun with the ion engine, in a bit of celestial magic provided by our on-ground navigational cluster... certainly we can achieve a very high velocity for our death plunge into the Sun's outer atmosphere... Computational real time observations within those few precious moments before the probe vaporised would certainly be enhanced by an on board beowulf cluster... You asked for speculation as to an application... I think this is perhaps one. Chip > > From: Mark Hahn > Date: 2003/04/14 Mon PM 02:46:05 EDT > To: chettri at gst.com > CC: beowulf at beowulf.org > Subject: Re: beowulf in space > > > Has anybody considered the theoretical aspects of placing beowulfs on a > > cluster of satellites? > > if this interests you, I highly recommend reading Vernor Vinge's > recent books (A Deepness in the Sky, for instance). Robert Forward > has some topical ones, too. they are science fiction, though... > > > I understand that communication will be slower AND > > unreliable, > > well, to the extent that such a cluster would be spread out, > I can understand the "slower" part. though c in vacuum is higher > than c in fiber or TP. > > I don't see the "unreliable" part. are you presuming some kind of > traditional RF modulation? using free-space optics seems like the > more obvious way to network satellites, and I don't see why that would > be flakey. > > > and it would restrict the set of problems that could be solved. I'm looking > > for papers/tech reps etc on the subject. > > I doubt they exist, simply because there's no practical reason, > given the huge cost and unclear advantage. > > I can imagine some really great advertisements for colo though ;) > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gmpc at sanger.ac.uk Mon Apr 14 16:08:32 2003 From: gmpc at sanger.ac.uk (Guy Coates) Date: Mon, 14 Apr 2003 21:08:32 +0100 (BST) Subject: Help f90 - intel (Jairo Arbey Rodriguez) In-Reply-To: <200304141636.h3EGa1s16542@NewBlue.Scyld.com> References: <200304141636.h3EGa1s16542@NewBlue.Scyld.com> Message-ID: Hi, It looks like you may be installing on a system which does not support RPM as its native packaging format. The other gotcha is that v7.0 of the C/Fortran compilers needs glibc <= 2.2.4 and some newer distros ship with later versions. There are workarounds for both problems. You can force an install of the RPMs by specifying the --nodeps option when the install script asks: >What rpm install options would you like?
If your distribution ships with a version of glibc > 2.2.4 then you may need to install the glibc-2.2.4 include files; the compiler cannot parse the include files in newer versions. The easiest way to do this is to grab glibc-2.2.4 from the GNU website, compile (using gcc) and install it in /usr/local/glibc-2.2.4 or /opt/glibc-2.2.4. Just make sure you don't install it on top of your existing glibc in /usr/lib, or anywhere where LD_LIBRARY_PATH or ld.so.conf is going to pick it up, otherwise you will break your system horribly. Once you've installed the glibc headers you need to tell the compiler where to find them. Add -I/usr/local/glibc-2.2.4/include and maybe -restrict to your compiler flags and you should be set. I've used this trick to compile stuff on x86 and ia64 glibc-2.3.x systems without a problem. Cheers, Guy Coates -- Guy Coates, Informatics System Group The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK Tel: +44 (0)1223 834244 ex 7199 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jim_windle at eudoramail.com Mon Apr 14 15:56:48 2003 From: jim_windle at eudoramail.com (Jim Windle) Date: Mon, 14 Apr 2003 15:56:48 -0400 Subject: beowulf in space Message-ID: -- On Mon, 14 Apr 2003 11:59:42 chettri wrote: >Has anybody considered the theoretical aspects of placing beowulfs on a >cluster of satellites? I understand that communication will be slower AND >unreliable, >and it would restrict the set of problems that could be solved. I'm looking >for papers/tech reps etc on the subject. > I am not aware of any published work on placing beowulfs in orbit and as Bob Brown points out the expense and practical difficulties would be immense. The only place I can think of where related issues would be discussed would be in technical papers related to the Iridium satellite network. It has been a few years since I looked at it but if I recall correctly the architecture of their systems was different from all others. In systems like Globalstar the satellites are controlled from the ground and satellites relay to ground stations which in turn relay to other satellites with all routing decisions for network traffic being made on Earth. In Iridium, if I recall correctly, the satellites communicated directly with each other and all routing decisions were made in the satellites themselves. There was no satellite designated as a control node, but each satellite would have some processing power for making routing decisions and each satellite would be in communication with other satellites directly, so in some sense it would be beowulf-like. Whatever technical papers they published when they were looking for funding for the network might address some of the issues you are interested in.
Jim > >Samir Chettro > > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Need a new email address that people can remember Check out the new EudoraMail at http://www.eudoramail.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 14 16:11:52 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 14 Apr 2003 16:11:52 -0400 (EDT) Subject: beowulf in space In-Reply-To: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> Message-ID: > ground navigational cluster... certainly we can achieve a very high > velocity for our death plunge into the Sun's outer > atmosphere... Computational real time observations within those few > precious moments before the probe vaporised would certainly be enhanced by > an on board beowulf cluster... You asked for speculation, as to an > application... I think this is perhaps one. nice plan ;) you have to remember that beowulfery is basically for cheapskates and penny-pinchers. the whole idea is to use hardware that's been made cheap by the commodity PC market, and build something powerful out of it. there's really no special sauce (ie, "grid"), just a bunch of cost-effective hardware. the point is that for space applications, costs are already sky high (heh), so saving a few bucks by running a cluster doesn't make that much sense. the real cost-savings is in reducing mass... I'd also guess that space apps don't need that much compute power. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From shaeffer at neuralscape.com Mon Apr 14 16:32:58 2003 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Mon, 14 Apr 2003 13:32:58 -0700 Subject: beowulf in space In-Reply-To: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> References: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> Message-ID: <20030414203258.GA28080@synapse.neuralscape.com> > ground navigational cluster... certainly we can achieve a very high velocity for our death plunge into the Sun's outer atmosphere... Computational real time observations within those few precious moments before the probe vaporised would certainly be enhanced by an on board beowulf cluster... You asked for speculation, as to an application... I think this is perhaps one. This is getting silly. You still need to transmit the data back to earth. It has already been asserted that it is far more efficient energy wise and cost wise to transmit data than to process it. So you should invest in a system that can transmit all the raw data back to earth. Then you even have the benefit of saving the raw data set for future computations as more is learned... (giggles) Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 
94306 shaeffer at neuralscape.com http://www.neuralscape.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Mon Apr 14 16:18:50 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 15 Apr 2003 06:18:50 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304141425.SAA07170@nocserv.free.net> References: <200304141425.SAA07170@nocserv.free.net> Message-ID: <3E9B17AA.1000806@octopus.com.au> Mikhail Kuzminsky wrote: > Taking into account that Itanium 2 has much more high performance, > the price from HP looks reasonable. On FP-heavy workloads, perhaps. On integer workloads, hardly. And that's only comparing it against HP's own PA-8700 hardware! Compare it to more mainstream hardware and you'll see just how laughable Itanium 2 prices are. The Itanium 2 doesn't have significantly higher performance than today's Xeons. Opteron, at least for the time being, performs significantly better again. > Yes, Opteron may give good alternative, but I'm not sure > that price/performance ratio for Opteron servers will be better > than for P4 Xeon dual servers. Let me tell you now: the price/performance ratio of Opteron systems _is_ better than that of their dual Xeon counterparts. How long this will remain the case is yet to be seen. Thank your lucky stars for AMD though, as they're the only people who have a chance at making Intel cut their prices. ;) > Only if you need badly 64-bit processor ... It's thanks to Intel that people even think like this. It's now 2003: you shouldn't have to sell your children to get fast 64-bit systems. Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ctierney at hpti.com Mon Apr 14 18:09:12 2003 From: ctierney at hpti.com (Craig Tierney) Date: 14 Apr 2003 16:09:12 -0600 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E9B17AA.1000806@octopus.com.au> References: <200304141425.SAA07170@nocserv.free.net> <3E9B17AA.1000806@octopus.com.au> Message-ID: <1050358151.6451.226.camel@woody> On Mon, 2003-04-14 at 14:18, Duraid Madina wrote: > Mikhail Kuzminsky wrote: > > Taking into account that Itanium 2 has much more high performance, > > the price from HP looks reasonable. > > On FP-heavy workloads, perhaps. On integer workloads, hardly. And that's > only comparing it against HP's own PA-8700 hardware! Compare it to more > mainstream hardware and you'll see just how laughable Itanium 2 prices > are. The Itanium 2 doesn't have significantly higher performance than > today's Xeons. Opteron, at least for the time being, performs > significantly better again. Er, really? What are your comparisons? My comparisons show that Itanium 2's operate about 100% faster on my codes than my 2.2 Ghz Xeons (400 Mhz FSB). This is without going in and trying to tweak the code. I don't know if it is running as fast as it should be. No, this still doesn't justify the price difference, but the performance isn't as bad as you are implying. The I2 does have integer math performance problems, but is supposed to be corrected with the next generation chip (Madison). 
> > > Yes, Opteron may give good alternative, but I'm not sure > > that price/performance ratio for Opteron servers will be better > > than for P4 Xeon dual servers. > > Let me tell you now: the price/performance ratio of Opteron systems _is_ > better than that of their dual Xeon counterparts. How long this will > remain the case is yet to be seen. Thank your lucky stars for AMD > though, as they're the only people who have a chance at making Intel cut > their prices. ;) > Unless we want start a flame war on NDA hardware, I think should be adding the phrase 'It depends' to any Opteron benchmarks, because it does. Lets get to arguing numbers in about 2 weeks when we can. And no, for MY CODES, Opteron does not perform significantly better than the Itanium 2 in all cases. However, when we start to talk price/performance the Opteron will be the right choice for many applications. However, not necessarily all of them. I am not trying to be negative about a platform that does not exist (yet). Personally I want 8 to 1000 Opteron nodes to do some real work. However blanket statements about any hardware platform don't do any good. But I agree with you, all competition is good. It makes our toys cheaper. Craig > > Only if you need badly 64-bit processor ... > > It's thanks to Intel that people even think like this. It's now 2003: > you shouldn't have to sell your children to get fast 64-bit systems. > > Duraid > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Craig Tierney _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Mon Apr 14 18:15:34 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Mon, 14 Apr 2003 17:15:34 -0500 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> Duraid Madina wrote: >On FP-heavy workloads, perhaps. On integer workloads, hardly. And that's >only comparing it against HP's own PA-8700 hardware! Compare it to more >mainstream hardware and you'll see just how laughable Itanium 2 prices >are. The Itanium 2 doesn't have significantly higher performance than >today's Xeons. Opteron, at least for the time being, performs >significantly better again. The SPECFP numbers rate the Itanium 2 at about 1425 (relative to the base Sun) and the Pentium 4 at about 1100. That's about a 20% advantage on floating point (PA-RISC rates a 600 I think). The integer ratio is about 1100 to 800 in favor of Pentium 4. Bandwidth to memory as measured by stream triad is 50% better on the Itanium implying that you will get a larger percentage of peak for out-of-cache workloads. Then there is the 64-bit address space, EPIC compiler technology, etc. ... but ... Itanium 2 prices seem high to me. However, the questions is really one for Intel and HP ... is the current price generating enough volume to hit the revenue sweet spot. They could care less whether I, you, or any random individual buyer likes the price ;-). The price is right if they are maximizing the time-integrated return on the product. Initial pricing should err high ... you can always lower it, but can never raise it. 
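As an aside on the measurement itself: the stream triad cited above is just a bandwidth-bound loop, so the quoted numbers track sustainable memory bandwidth rather than peak flops. A minimal sketch of the kernel follows (the array size and crude clock()-based timing are illustrative only, not the official STREAM benchmark):

/* triad.c -- the STREAM "triad" kernel: a[i] = b[i] + scalar * c[i].
 * Reported bandwidth is bytes touched (3 arrays x 8 bytes x N) over time.
 * N must be much larger than the last-level cache so that memory, not
 * cache, is what gets measured.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 2000000L                /* ~48 MB across the three arrays */

int main(void)
{
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    double scalar = 3.0;
    double secs;
    clock_t t0;
    long i;

    if (!a || !b || !c) return 1;
    for (i = 0; i < N; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }

    t0 = clock();
    for (i = 0; i < N; i++)
        a[i] = b[i] + scalar * c[i];          /* the triad */
    secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    if (secs > 0.0)
        printf("triad: %.1f MB/s\n", 3.0 * N * sizeof(double) / secs / 1e6);
    free(a); free(b); free(c);
    return 0;
}

In practice one would run the real STREAM code (repeated trials, best-of timing, flags that keep the compiler from optimising the loop away); the sketch only shows what "triad" actually measures.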
Until Opteron is available, the only, long-lived, direct competition is the Power 4 (is it available in 1 and 2 processor configurations?). Plus, why should Intel compete with their own price-performance Pentium 4 systems by lowering Itanium 2 prices? They are serving two markets segments those with more money than brains and those with more brains than money ... ;-). The market is quantized ... each product has its own quantum number. I would be interested in SPECFP and Stream Triad numbers for the Opteron if you have them. Regards, rbw #--------------------------------------------------- # Richard Walsh # Project Manager, Cluster Computing, Computational # Chemistry and Finance # netASPx, Inc. # 1200 Washington Ave. So. # Minneapolis, MN 55415 # VOX: 612-337-3467 # FAX: 612-337-3400 # EMAIL: rbw at networkcs.com, richard.walsh at netaspx.com # rbw at ahpcrc.org # #--------------------------------------------------- # "I'm quite contented to take my chances with # the Gildensterns and Rosenkrantzes. # -SpinDoctors #--------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Apr 14 18:34:33 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 14 Apr 2003 15:34:33 -0700 Subject: beowulf in space References: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Message-ID: <5.1.0.14.2.20030414150845.02f024f0@mailhost4.jpl.nasa.gov> At 02:46 PM 4/14/2003 -0400, Mark Hahn wrote: > > Has anybody considered the theoretical aspects of placing beowulfs on a > > cluster of satellites? > >if this interests you, I highly recommend reading Vernor Vinge's >recent books (A Deepness in the Sky, for instance). Robert Forward >has some topical ones, too. they are science fiction, though... > > > I understand that communication will be slower AND > > unreliable, > >well, to the extent that such a cluster would be spread out, >I can understand the "slower" part. though c in vacuum is higher >than c in fiber or TP. > >I don't see the "unreliable" part. are you presuming some kind of >traditional RF modulation? using free-space optics seems like the >more obvious way to network satellites, and I don't see why that would >be flakey. RF for short (<1000km) links can be very reliable (certainly better than Ethernet, once you've factored in collisions, etc.). Don't take 802.11 kinds of links as an example. > > and it would restrict the set of problems that could be solved. I'm > looking > > for papers/tech reps etc on the subject. > >I doubt they exist, simply because there's no practical reason, >given the huge cost and unclear advantage. Certainly, flying a Beowulf to provide computing services to a terrestrial user makes no sense, but flying a Beowulf to provide insitu computing crunch for, e.g., data reduction on a deep space mission, makes a lot of sense. While a broadband high rate "pipe" from GEO orbit isn't too tough (all it takes is money to buy or rent a transponder and suitable ground station equipment), the same from a LEO orbit is much more of a challenge. Take something like the Shuttle Radar Topographic Mission (SRTM) as an example. The 2 radars produce 180 and 90 Mbit/sec raw data rate for C and X band, respectively. There isn't any convenient way to get that kind of data pipe for something orbiting the earth every 90 minutes or so. 
So, they recorded the data on a whole pile of tapes, which they brought back, and which will take some years to ground process the 10 Tbyte of data. And that's for a mere 10-11 days of data. Clearly, some sort of onboard processing would be useful. SRTM was designed to measure the topography of all land surfaces on a 10 meter (or so) grid. Figuring the Earth's surface at 564E6 square kilometers, figuring 40% land area, and 1E4 measurements/square km, you're looking at about 2E12 measurements reduced from around 1E13 bytes of data. Topography is actually one of the easier measurements.. the ground elevation doesn't change much on a day to day basis (usually). Now consider a couple much more difficult problems: 1) quasi real time imaging of some parameter that varies quickly (wind, rainfall, vegetation) 2) moving target detection.. Say you wanted to track all airplanes in flight with an accuracy of, say, 100 meters. For a constellation of spacecraft, one could do things like atmospheric sounding or tomography, the latter of which requires some serious processing crunch to reduce the raw data to usable output. Imagine that you want to tomographically process atmospheric sounding through the atmosphere of Jupiter, but you need to send the data back through a datalink with a bandwidth of, maybe, 1 Mbit/second, 8 hours a day. It's also got to tolerate the somewhat(!) harsh radiation environment of Jupiter. One can argue that for deep space missions, costing hundreds of millions of dollars, that you're not going to be using commodity PCs mail-ordered from WalMart stacked up on baker's carts. However, one might very well use the Beowulf concept of lots of fairly simple, fairly slow processors, interconnected by a high latency, moderate bandwidth fabric of sorts. >I can imagine some really great advertisements for colo though ;) > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Apr 14 16:43:39 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 14 Apr 2003 13:43:39 -0700 Subject: beowulf in space In-Reply-To: <5.1.0.14.0.20030414115725.01c5cb00@gst.gst.com> Message-ID: <5.1.0.14.2.20030414133602.030c6cd0@mailhost4.jpl.nasa.gov> The short answer is yes, it has been and is being considered, in several forms. The interprocessor comm is not necessarily slower (very wideband optical links are practical), but latency is an issue. However, there are many space applications that can benefit from this sort of thing that aren't particularly bandwidth or latency constrained. While the scientists would generally like to have a big pipe to the ground and just send raw data for later processing, there are situations where you just can't send that much data back, and it has to be on-board processed in some way. Of course, inasmuch as part of Beowulfery is the idea of commodity off the shelf computers being used, real Beowulfs in space aren't likely to come any time soon, since almost NOTHING in space is a commodity part. 
It costs so much to get it there, that the additional cost for a "custom" part is a small fraction of the launch cost. If you were to search back proceedings of the IEEE Aerospace Conference (Big Sky MT), you'll find some papers on Beowulf type systems proposed for space applications, and also some novel ideas for high bandwidth cluster interconnects based on optical techniques. As RGB pointed out, the design environment for space is somewhat different.. power consumption and cooling (even if you have a reactor a'la Prometheus) are signficant challenges, as is the radiation environment, both in an single event and in a total dose. At 11:59 AM 4/14/2003 -0700, chettri at gst.com wrote: >Has anybody considered the theoretical aspects of placing beowulfs on a >cluster of satellites? I understand that communication will be slower AND >unreliable, >and it would restrict the set of problems that could be solved. I'm >looking for papers/tech reps etc on the subject. > >Regards, > >Samir Chettro > > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Apr 14 18:49:24 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 14 Apr 2003 15:49:24 -0700 Subject: beowulf in space In-Reply-To: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellso uth.net> Message-ID: <5.1.0.14.2.20030414153502.01978198@mailhost4.jpl.nasa.gov> At 03:40 PM 4/14/2003 -0400, astroguy at bellsouth.net wrote: >Hi list, >Ok, you computer genius and rocket scientist all... I tend to agree with >Dr.Brown's position but for the sake of argument... Let's think of along >the lines of where a computational cluster might find some in space >application. Say for example we were to launch a probe into the >Sun's outer corona... let assume also that we have some shielding device >that would sustain the craft in the 10 million degree C or so that such a >craft is sure to encounter.. The corona is a fairly non-dense plasma (100 ions/cm^3 viz 2E19 atoms/cm^3 for STP air), more closely resembling a really good vacuum(1E-15 torr?), where the ions are moving moderately fast (1-10kEv), corresponding to a temperature of 10 million K, but I don't know that the heat content is all that great, and I don't know that it would actually heat a real body placed in it all that much, any more than the CRT in your TV or monitor heats up from the 100 million K electrons in the internal beam (which has a much, much higher number density than the corona) For some data on a real solar atmosphere probe: http://umbra.nascom.nasa.gov/solar_connections/probe.html and http://umbra.nascom.nasa.gov/spd/solar_probe.html and a nice technical presentation at http://solarprobe2.jpl.nasa.gov/SPBR.html >. Even with our best science fiction such a craft could only endure a few >precious moments in such a space environment, so we would have to use the >advantage of speed... Ok, so we use an ion engine to get the craft up to >speed... 
since the sun's corona extends apparently 700,000 km or so into >space... the craft would have to get up to a speed say 250,000 mph. Which >we have yet to achieve but not impossible... Sling shot around Jupiter and >Mars and back to the sun with the ion engine in a bit of celestial magic >provided by or on! > ground navigational cluster... certainly we can achieve a very high > velocity for our death plunge into the Sun's outer atmosphere... > Computational real time observations within those few precious moments > before the probe vaporised would certainly be enhanced by an on board > beowulf cluster... You asked for speculation, as to an application... I > think this is perhaps one. While your nav scenario is a bit unrealistic, the need for on-board processing is precisely right..you're limited in your downlink (total bits that can be sent before immolation) >Chip > > > > From: Mark Hahn > > Date: 2003/04/14 Mon PM 02:46:05 EDT > > To: chettri at gst.com > > CC: beowulf at beowulf.org > > Subject: Re: beowulf in space > > > > > Has anybody considered the theoretical aspects of placing beowulfs on a > > > cluster of satellites? > > James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 14 19:53:27 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 14 Apr 2003 19:53:27 -0400 (EDT) Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> Message-ID: > about 1100 to 800 in favor of Pentium 4. Bandwidth to memory as measured > by stream triad is 50% better on the Itanium implying that you will get > a larger percentage of peak for out-of-cache workloads. Then there is until the next-gen P4 chipsets arrive (and they have). > I would be interested in SPECFP and Stream Triad numbers for the > Opteron if you have them. me too . but if I understand AMD's marketing "plan", we won't see the interesting Opteron systems at launch. that is, since Opteron bandwidth scales with ncpus, it's really the 4-8-way systems that will look dramatically more attractive than any competitors (cept maybe Marvel). it is sort of interesting that much of It2's rep rests on fairly single-threaded benchmarks (cfp2000, stream). but I don't see a lot of people buying uniprocessor It2's, and all It2 systems use a shared 6.4 GB/s FSB. by comparison, a dual-opt has 10.8 GB/s aggregate, which starts to be interesting. I'm hoping AMD will get pumped and support PC3200 on apr 22. I fear that 4x and 8x systems will be late as usual. 
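The "stream triad" figure being traded back and forth in this thread comes from timing a kernel that is essentially the loop below. This is a minimal, single-threaded sketch of the idea only, not John McCalpin's official STREAM benchmark (which uses wall-clock timing, several trials, and result verification); the array size here is an arbitrary choice that merely has to dwarf the caches.

#include <stdio.h>
#include <time.h>

#define N 2000000            /* doubles per array, ~16 MB each: far larger than any cache */

static double a[N], b[N], c[N];

int main(void)
{
    const double scalar = 3.0;
    const int ntimes = 10;

    /* touch every page before timing */
    for (long j = 0; j < N; j++) { a[j] = 1.0; b[j] = 2.0; c[j] = 0.5; }

    clock_t t0 = clock();

    /* the triad kernel: two loads, one store, one multiply-add per element */
    for (int k = 0; k < ntimes; k++)
        for (long j = 0; j < N; j++)
            a[j] = b[j] + scalar * c[j];

    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* 24 bytes (three doubles) move per element per pass */
    printf("triad: %.1f MB/s\n", 24.0 * N * ntimes / secs / 1e6);
    printf("check: %g\n", a[N / 2]);   /* keep the loop from being optimized away */
    return 0;
}

Compiled with optimization and run on one CPU, the reported rate approximates the sustainable memory bandwidth the posters are comparing, which is why a shared 6.4 GB/s front-side bus and per-processor memory controllers give such different aggregate numbers.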
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Apr 14 19:57:52 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 14 Apr 2003 16:57:52 -0700 Subject: beowulf in space In-Reply-To: <20030414203258.GA28080@synapse.neuralscape.com> References: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> Message-ID: <5.1.0.14.2.20030414164248.03049040@mailhost4.jpl.nasa.gov> At 01:32 PM 4/14/2003 -0700, Karen Shaeffer wrote: > > ground navigational cluster... certainly we can achieve a very high > velocity for our death plunge into the Sun's outer atmosphere... > Computational real time observations within those few precious moments > before the probe vaporised would certainly be enhanced by an on board > beowulf cluster... You asked for speculation, as to an application... I > think this is perhaps one. > > >This is getting silly. You still need to transmit the data back to earth. It >has already been asserted that it is far more efficient energy wise and cost >wise to transmit data than to process it. This is not necessarily true... While the scientist generally prefers to get the raw data (it allows deferring some of the analysis work to a later time, and, it reduces the risk of making a bad design decision, because you can go reprocess later), in many, many cases it is NOT cheaper to send the data back to earth than to process it in situ and send the processed data. Most spacecraft are severely power constrained, and that sets the basis for the tradeoff of joules expended on processing vs joules expended on sending data (hence my earlier posts about MIPS/Watt being important). Inherent in the fact that one CAN do processing is that the raw data must contain some redundancy, and the processing, in an information theoretic sense, consists of removing the redundancy (consider it as "lossless compression" if you will). For an arbitrary communication link, the key thing is the received energy at the other end compared with the noise energy (usually talked about in terms of Eb/No). You can divvy up the energy in a lot of ways: 1) You can send each (nonredundant) bit multiple times, increasing the energy for each information bit, thereby improving the signal to noise ratio for that bit. There are lots of clever schemes for how you do this (generically called "coding"). Essentially, you put some amount of redundancy back into the data stream, and then remove it at the receiving end. 2) You can not bother removing the redundancy in the first place, transmitting more bits, with less power per bit. I would contend that in an idealized case, the raw sensor data is unlikely to be an efficient coding strategy for the actual information contained in the data. Consider a trivial case where the sensor measures the slowly varying (timeconstant >1 second) temperature of the spacecraft 100 times a second with an accuracy of 8 bits. Clearly, there is a very high correlation between one measurement and the next, so the actual "information" in each sample is quite small. A "send the raw data" strategy would require 800 bits/sec of bandwidth. One could trivially encode it by averaging 10 measurements at a time into an 8 bit average, probably without losing much data. 
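A minimal sketch of that trivial encoding, using only the figures from the example above (100 samples/s, 8 bits each, averaged in blocks of 10); the function name and buffer handling are illustrative, not any flight-software interface:

#include <stdint.h>
#include <stddef.h>

/*
 * Downsample 8-bit temperature telemetry by averaging blocks of 10 samples:
 * 100 samples/s * 8 bits = 800 bit/s of raw data becomes
 *  10 averages/s * 8 bits =  80 bit/s to transmit.
 */
size_t average_blocks(const uint8_t *raw, size_t nraw, uint8_t *out)
{
    const size_t BLOCK = 10;
    size_t nout = 0;

    for (size_t i = 0; i + BLOCK <= nraw; i += BLOCK) {
        unsigned sum = 0;
        for (size_t j = 0; j < BLOCK; j++)
            sum += raw[i + j];
        out[nout++] = (uint8_t)(sum / BLOCK);   /* 8-bit block average */
    }
    return nout;   /* bytes to send instead of nraw */
}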
If one was worried about excursions, one could also transmit the min and max values, for a total of 240 bits/second. One could also use any of a number of simple lossless compression schemes to greatly reduce the bit rate. The question to be answered, in a real system, is it better to put your precious joules to work sending all those 800 bits, and not spend any on the processing, hoping that the greater error rate from the low joules/bit can be overcome by ground processing, OR, should one do some onboard processing, say lossless compression, putting more joules in each of the fewer bits (less some amount of energy used in the compression process), then transmit those fewer bits using some form of coding, which increases the transmitted bit rate. It all depends on the link budget, and how close you are to the ragged edge of the Shannon limit. > So you should invest in a system >that can transmit all the raw data back to earth. Then you even have the >benefit of saving the raw data set for future computations as more is >learned... There's also the possibility that it is not feasible to send all data back, and that it HAS to be processed at the sensor. SRTM is a good example of this. >(giggles) >Karen >-- > Karen Shaeffer > Neuralscape, Palo Alto, Ca. 94306 > shaeffer at neuralscape.com http://www.neuralscape.com >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Mon Apr 14 19:33:54 2003 From: raysonlogin at yahoo.com (Rayson Ho) Date: Mon, 14 Apr 2003 16:33:54 -0700 (PDT) Subject: Fwd: [PBS-USERS] PBS technical specialists Message-ID: <20030414233354.95477.qmail@web11402.mail.yahoo.com> Any PBS hackers?? Rayson --- Michael Humphrey wrote: > > Dear PBS users, > > > > > > Altair is looking to expand our PBS Pro technical staff in our Troy > > Michigan office. > > Our ideal candidate will have the followings experiences: > > > > > > Required experiences > > > > 5 or more years as systems Administrator in UNIX environment > > 2 or more years experience with openPBS or PBS Pro > > BS degree in Computer Science or Engineering > > Experience in writing Unix shell scripts > > Experience with PERL > > Good communications skills > > Willing to do some travel > > > > > > Desirable experiences > > Experience administering Windows environments (1-2 years) preferred > > > Experience in MCAE applications environments preferred > > > > If you know someone who might be interested in these positions > please have > > them forward their resume to me via email or contact me via > telephone. > > Thank you for any referrals which may come forward. > > > > > > > > Michael Humphrey > > Altair Engineering > > > > 1820 East Big Beaver Rd. > > Troy, Mi. 48083 > > > > (248) 614-2400 Ext 495 > > > > humphrey at altair.com > > > > > > > > > > > __________________________________________________ Do you Yahoo!? Yahoo! 
Tax Center - File online, calculators, forms, and more http://tax.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ctierney at hpti.com Mon Apr 14 20:27:35 2003 From: ctierney at hpti.com (Craig Tierney) Date: 14 Apr 2003 18:27:35 -0600 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: References: Message-ID: <1050366454.6451.400.camel@woody> On Mon, 2003-04-14 at 17:53, Mark Hahn wrote: > > about 1100 to 800 in favor of Pentium 4. Bandwidth to memory as measured > > by stream triad is 50% better on the Itanium implying that you will get > > a larger percentage of peak for out-of-cache workloads. Then there is > > until the next-gen P4 chipsets arrive (and they have). > > > I would be interested in SPECFP and Stream Triad numbers for the > > Opteron if you have them. > > me too . but if I understand AMD's marketing "plan", > we won't see the interesting Opteron systems at launch. > that is, since Opteron bandwidth scales with ncpus, it's > really the 4-8-way systems that will look dramatically > more attractive than any competitors (cept maybe Marvel). > drooling even more.... > it is sort of interesting that much of It2's rep rests on > fairly single-threaded benchmarks (cfp2000, stream). but I don't > see a lot of people buying uniprocessor It2's, and all It2 > systems use a shared 6.4 GB/s FSB. by comparison, a dual-opt > has 10.8 GB/s aggregate, which starts to be interesting. Feel free to flame me if I am wrong, but the HP chipset for It2 is 8.5 GB/s. The Intel chipset is 6.4 GB/s. Each cpu can push 6.4 GB/s. > > I'm hoping AMD will get pumped and support PC3200 on apr 22. > I fear that 4x and 8x systems will be late as usual. > Lots to talk about on the 22nd! Did AMD pick Earth Day for any particular reason to announce the new product? I do not think this is going to be a 'green' cpu. Craig > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Craig Tierney _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 14 23:55:34 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 14 Apr 2003 23:55:34 -0400 (EDT) Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <1050366454.6451.400.camel@woody> Message-ID: > > see a lot of people buying uniprocessor It2's, and all It2 > > systems use a shared 6.4 GB/s FSB. by comparison, a dual-opt > > has 10.8 GB/s aggregate, which starts to be interesting. > > Feel free to flame me if I am wrong, but the HP chipset for It2 > is 8.5 GB/s. The Intel chipset is 6.4 GB/s. Each cpu can > push 6.4 GB/s. heck, the rx5670 claims 12.8 GB/s: http://www.hp.com/products1/servers/rackoptimized/rx5670/specifications.html alas, all the CPUs sit on a 6.4 GB/s bus: http://www.hp.com/products1/itanium/chipset/4_way_block.html (note that it lists 4 GB/s aggregate IO bandwidth; in short, the 12.8 is simply false; 10.4 is theoretically possible, but in reality, the CPUs will sustain 4ish and IO will probably total less than 1.) the real flames go out to the marketing pinheads who claim 12.8! 
in fairness, I should note that HP's rx2600 stream scores (3.5 3.5 4.0) are quite excellent. not nearly as good as Marvel, but competitive with a number of traditional vector supers. quite a bit better than an Altix, too (1.7 1.7 1.9). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Tue Apr 15 05:49:52 2003 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Tue, 15 Apr 2003 11:49:52 +0200 Subject: Power supply problems (was: Machine Check Exception) In-Reply-To: <3E9AF3DD.1070002@pgs.com> References: <3E9AF3DD.1070002@pgs.com> Message-ID: <20030415114952S.hanzl@unknown-domain> > Does anyone know if a power supply can cause a machine check exception Power supply can probably cause all sorts of weird problems, with variety similar to RAM problems - problem can surface just anywhere and resemble other hardware or software problem to such an extent that usual diagnostic steps like replacing hardware and software components clearly indicate that something else is faulty, replacing that 'faulty' component seems to fix it but later on problems reoccur. In another words, if your power supply works near the limits, problems may be observed just on few nodes in the cluster (or on a single node) and there may be just certain hardware/software/environmental circumstances which trigger the problem. I've seen certain indications that these days power supply can cause more headaches than before: - note from Abit tech staff saying that certain power supplies send POWER_GOOD 'too early', meaning before on-board power conversion circuits had time to stabilise - overclockers are starting to take power supply more seriously. (However stupid overclocking is, their web sites gives good indication which parts of hardware work near to the limits.) - note that certain Abit BIOS upgrade fixes PSU problems - many problems with Abit IT7 MAX2 and 300W PSU I've encountered myself - fact that modern PSU is under software control and hardware control of that part of mainboard which gets standby power, opening new possibilities for intermixing PSU/hardware/software problems. Most of my experience is Abit-related but may be general I am affraid... Regards Vaclav _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Andrew.Cannon at nnc.co.uk Tue Apr 15 06:41:49 2003 From: Andrew.Cannon at nnc.co.uk (Cannon, Andrew) Date: Tue, 15 Apr 2003 11:41:49 +0100 Subject: UK Cluster hardware suppliers? Message-ID: Hi All, Does anyone know of a supplier (or suppliers) of clustering hardware in the UK? I need to get some quotes for a 16 node cluster. Thanks Andrew Andrew Cannon, Nuclear Technology (J2), NNC Ltd, Booths Hall, Knutsford, Cheshire, WA16 8QZ. Telephone; +44 (0) 1565 843768 email: mailto:andrew.cannon at nnc.co.uk NNC website: http://www.nnc.co.uk *********************************************************************************** NNC Limited Booths Hall Chelford Road Knutsford Cheshire WA16 8QZ Country of Registration: United Kingdom Registered Number: 1120437 This e-mail and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. 
If you have received this e-mail in error please notify the NNC system manager by e-mail at eadm at nnc.co.uk. *********************************************************************************** _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jhearns at freesolutions.net Tue Apr 15 07:20:03 2003 From: jhearns at freesolutions.net (John Hearns) Date: 15 Apr 2003 12:20:03 +0100 Subject: UK Cluster hardware suppliers? In-Reply-To: References: Message-ID: <1050405611.10673.16.camel@harwood.home> On Tue, 2003-04-15 at 11:41, Cannon, Andrew wrote: > Hi All, > > Does anyone know of a supplier (or suppliers) of clustering hardware in the > UK? I need to get some quotes for a 16 node cluster. > Off the top of my head, in no particular order, Streamline Computing http://www.streamline-computing.com Workstations UK http://www.workstationsuk.co.uk OCF http://www.ocf.co.uk Clustervision http://www.clustervision.com Compusys http://www.compusys.co.uk Max Black http://www.maxblack.co.uk Quadrics http://www.quadrics.com for fast interconnects SGI IBM HP Dell Oh, and if anyone from these companies spots their name, I'm looking for a job. Apologies to anybody I've missed. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ajt at rri.sari.ac.uk Tue Apr 15 08:44:59 2003 From: ajt at rri.sari.ac.uk (Tony Travis) Date: Tue, 15 Apr 2003 13:44:59 +0100 Subject: UK Cluster hardware suppliers? In-Reply-To: <1050405611.10673.16.camel@harwood.home> References: <1050405611.10673.16.camel@harwood.home> Message-ID: <3E9BFECB.3050802@rri.sari.ac.uk> John Hearns wrote: > On Tue, 2003-04-15 at 11:41, Cannon, Andrew wrote: > >> Hi All, >> >> Does anyone know of a supplier (or suppliers) of clustering hardware in the >> UK? I need to get some quotes for a 16 node cluster. Hello, Andrew. I've just bought 24 Athlon XP 2400+ nodes for a beowulf cluster from Eclipse Computing (mailto:sales at eclipsecomputing.co.uk). They also sell complete Beowulf systems. Tony. -- Dr. A.J.Travis, | mailto:ajt at rri.sari.ac.uk Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751 Aberdeen AB2 9SB, Scotland, UK. | fax:+44 (0)1224 716687 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Tue Apr 15 09:04:42 2003 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Tue, 15 Apr 2003 15:04:42 +0200 Subject: UK Cluster hardware suppliers? In-Reply-To: References: Message-ID: <200304151504.42313.joachim@ccrl-nece.de> Cannon, Andrew: > Does anyone know of a supplier (or suppliers) of clustering hardware in the > UK? I need to get some quotes for a 16 node cluster. Take a look at http://www.workstationsuk.co.uk . Have no experience with them, though. 
Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From shaeffer at neuralscape.com Mon Apr 14 20:48:40 2003 From: shaeffer at neuralscape.com (Karen Shaeffer) Date: Mon, 14 Apr 2003 17:48:40 -0700 Subject: beowulf in space In-Reply-To: <5.1.0.14.2.20030414164248.03049040@mailhost4.jpl.nasa.gov> References: <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> <20030414194058.MOSJ7247.imf56bis.bellsouth.net@mail.bellsouth.net> <5.1.0.14.2.20030414164248.03049040@mailhost4.jpl.nasa.gov> Message-ID: <20030415004840.GA28478@synapse.neuralscape.com> On Mon, Apr 14, 2003 at 04:57:52PM -0700, Jim Lux wrote: ...snip... > data (hence my earlier posts about MIPS/Watt being important). Inherent in > the fact that one CAN do processing is that the raw data must contain some > redundancy, and the processing, in an information theoretic sense, consists > of removing the redundancy (consider it as "lossless compression" if you > will). Hello Jim, Sure. But you are going to perform lossless compression with a DSP chip built into the pipeline. It was assumed you would remove redundancy prior to transmitting from space. Lossless compression with a DSP core is not even remotely comparable to hoisting a Beowulf cluster into space to computationally exploit raw data from some exotic remote event. (smiles ;) cheers, Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer at neuralscape.com http://www.neuralscape.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Mon Apr 14 19:23:26 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 15 Apr 2003 09:23:26 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <1050358151.6451.226.camel@woody> References: <200304141425.SAA07170@nocserv.free.net> <3E9B17AA.1000806@octopus.com.au> <1050358151.6451.226.camel@woody> Message-ID: <3E9B42EE.7020509@octopus.com.au> Craig Tierney wrote: > On Mon, 2003-04-14 at 14:18, Duraid Madina wrote: >>Mikhail Kuzminsky wrote: >>> Taking into account that Itanium 2 has much more high performance, >>>the price from HP looks reasonable. >> >>On FP-heavy workloads, perhaps. On integer workloads, hardly. And that's >>only comparing it against HP's own PA-8700 hardware! Compare it to more >>mainstream hardware and you'll see just how laughable Itanium 2 prices >>are. The Itanium 2 doesn't have significantly higher performance than >>today's Xeons. Opteron, at least for the time being, performs >>significantly better again. > > Er, really? What are your comparisons? SPECint2000. > My comparisons show that > Itanium 2's operate about 100% faster on my codes than my 2.2 Ghz Xeons > (400 Mhz FSB). That's nice. > This is without going in and trying to tweak the code. > I don't know if it is running as fast as it should be. Why am I not surprised. > No, this still doesn't justify the price difference, but the performance > isn't as bad as you are implying. I was mistaken. 
Since Itanium 2 is in the habit of routinely performing integer codes at double the speed of 2.2GHz Xeons, it's > The I2 does have integer math performance problems, but is supposed to > be corrected with the next generation chip (Madison). Madison will be called Itanium 2, and for good reason. Isn't it an Itanium 2, respun on a 130nm process and with (potentially) more cache? If there are any other differences, I'd be grateful if you could tell me about those. As far as I can see, this year's ISSCC papers still aren't online :( > Unless we want start a flame war on NDA hardware, I think should be > adding the phrase 'It depends' to any Opteron benchmarks, because it > does. Lets get to arguing numbers in about 2 weeks when we can. One week to go! ;) > And no, for MY CODES, Opteron does not perform significantly better than > the Itanium 2 in all cases. Does your code fit in cache? I can't think of any other reason why you'd perform so badly on an Opteron. Which stepping of Opteron have you been using? > However, when we start to talk > price/performance the Opteron will be the right choice for many > applications. However, not necessarily all of them. Nope, just most of them. > I am not trying to be negative about a platform that does not exist > (yet). I am trying to be negative about a platform that is overpriced. ;) > Personally I want 8 to 1000 Opteron nodes to do some real work. Personally, I want two Itanium 2s to stick under my desk. But I can't afford them! Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hunting at ix.netcom.com Tue Apr 15 02:01:22 2003 From: hunting at ix.netcom.com (Michael Huntingdon) Date: Mon, 14 Apr 2003 23:01:22 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: References: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> Message-ID: <3.0.3.32.20030414230122.01d2cfa0@popd.ix.netcom.com> I'm absolutely surprised at the notion that we (you/me/and those in this group) seem to have such advanced knowledge. Forget it. You can, at each avenue press for what you want in terms of compute performance, and perhaps compress an expectation into price/performance that meets an immediate need; however, at some point in time it becomes obvious to each of us that's it's not the chip set. If what we wanted/needed was an advanced CPU, Alpha would have been the mantra ten years ago, Itanium 2 would be embraced now. Hardware waits for advancements around software applications. Let's be clear about what's really expected. While so many members of this group expect answers around: - electrical requirements for super computing - cooling requirements for super computing - advancements in CPU rates - advancements in memory bandwidth - advancements along PCI paths - mainstream development for PCI-x - advancements in storage to system bandwidth - advancements along the network path - better effiencies within operating systems - drivers that allow inter operability with any number of options Any number of groups can come up with inexpensive solutions to the above once the industry standards are developed and the engineering is in place. It's an inexpensive process to piggy-back on original efforts do to investment in engineer and design. Dell, Gateway and "grey-box" manufactures provide excellent templates. 
Yet when research, in its purest form, requires advancements, these are not the groups any of us look to. In addition to this, along with the grants that so many receive and the free layered products and support available, should there somehow be this ongoing discussion about a push for more hardware technology at x86 technology pricing? If there is truly a dedicated and compelling requirement for advanced technologies, perhaps some consideration should be given to what's being asked for within research, what's needed, who is expected to deliver, what's being delivered, and yes....some acumen specific to research and the development of technology required to support advanced technology needs. Let's get over ourselves, folks. There are only three companies that are going to drive technology in the foreseeable future and allow the advancements we all expect. Over the past several weeks I've followed speculative threads regarding "Super Computing Environments". And although RGB has authored a great deal of solid data, when it comes to creating and maintaining really large environments, you'll want a single "throat to choke" and it won't be Robert's. Who currently has the technology to construct huge environments and guarantee results? I can testify that the "authorities" in the field could not, recently, during the design of one of the QB3 facilities. My advice: look to an engineering organization that not only knows today's technology, but can also advise you on advanced technologies that take you 3-5-7 years out. In this, you should only have to decide upon one of two product vendors; however, you can assume either of the two can advise you around what's possible today and perhaps as many as seven years out. Think this might save you a (budget) dollar or two? YOU BET! I've seen it to the tune of $500,000. It's not just the additional cost, but delays, and the cost associated with the academic professionals who relied on the (so-called) consultants. With so few in the commercial space investing in the future of advanced research, I sometimes have to question how the view of those within academic research could possibly be so narrow. In academia of all places, look around at who is trying to work with you, what each hopes to accomplish, and how each will reinvest each dollar you spend with them. Does your investment go to research in new technologies or marketing? This one is a "no brainer". In a separate thread there is a topic of "beowulf in space", the feasibility, the cost etc. I can't imagine this becoming a reality without the dedication and investment of a very few select manufacturers, but I'm certain it's something I'll see in my lifetime (and I'm an old guy). It will be smaller, lighter, cheaper, faster, more reliable.....and it won't come from a "one-off". In another thread I recall reading about those who might have "more money than brains". To that I would suggest that their investment in both money and brains will trickle down, and be a benefit to all. Behind those (implied) financial investments we will find the driving force for future technologies that each of us will see in our data centers and research labs. Let's see just how interesting things become. cheers ~m At 07:53 PM 4/14/2003 -0400, you wrote: >> about 1100 to 800 in favor of Pentium 4. Bandwidth to memory as measured >> by stream triad is 50% better on the Itanium implying that you will get >> a larger percentage of peak for out-of-cache workloads.
Then there is > >until the next-gen P4 chipsets arrive (and they have). > >> I would be interested in SPECFP and Stream Triad numbers for the >> Opteron if you have them. > >me too . but if I understand AMD's marketing "plan", >we won't see the interesting Opteron systems at launch. >that is, since Opteron bandwidth scales with ncpus, it's >really the 4-8-way systems that will look dramatically >more attractive than any competitors (cept maybe Marvel). > >it is sort of interesting that much of It2's rep rests on >fairly single-threaded benchmarks (cfp2000, stream). but I don't >see a lot of people buying uniprocessor It2's, and all It2 >systems use a shared 6.4 GB/s FSB. by comparison, a dual-opt >has 10.8 GB/s aggregate, which starts to be interesting. > >I'm hoping AMD will get pumped and support PC3200 on apr 22. >I fear that 4x and 8x systems will be late as usual. > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Mon Apr 14 19:28:33 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 15 Apr 2003 09:28:33 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E9B42EE.7020509@octopus.com.au> References: <200304141425.SAA07170@nocserv.free.net> <3E9B17AA.1000806@octopus.com.au> <1050358151.6451.226.camel@woody> <3E9B42EE.7020509@octopus.com.au> Message-ID: <3E9B4421.9060307@octopus.com.au> I wrote: > I was mistaken. Since Itanium 2 is in the habit of routinely performing > integer codes at double the speed of 2.2GHz Xeons, it's I meant to write: I was mistaken. Since Itanium 2 is in the habit of routinely performing integer codes at double the speed of 2.2GHz Xeons, it's obviously good value. Damn office distractions! ;) Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Tue Apr 15 09:57:55 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Tue, 15 Apr 2003 08:57:55 -0500 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <200304151357.h3FDvtC09517@mycroft.ahpcrc.org> Duraid Madina wrote: >SPECfp2000 is ~1170 for a 2GHz 1MB L2 Opteron. Not too bad. The SPECint >figure is fantastic though (~1200). I hate x86 as much as the next guy, >but it looks like this is what I'm going to be working with for some >time, _thanks to Intel and their incredibly uninspired pricing strategy_. Thanks for the numbers :-). Looks like Opteron comes in at slightly better than the 2.8 GHz Pentium 4 in both cases (1100 FP, 1100 INT). So the marginal additional price that AMD charges for Opteron will be for 64-bit addresses and its SMP capability ... whether/when they try to sell it as a one-chip-fits- all product will depend on how quickly they wish to destroy their x86-only markets ... that would seem to be their best strategy though ... otherwise Intel makes them into a sandwich by lowering the I2's price and raising the P4's clock ... a kind of adiabatic squeeze play. It will be interesting to watch ... from the sidelines. 
rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jcownie at etnus.com Tue Apr 15 05:12:46 2003 From: jcownie at etnus.com (James Cownie) Date: Tue, 15 Apr 2003 10:12:46 +0100 Subject: beowulf in space In-Reply-To: Message from "Robert G. Brown" of "Mon, 14 Apr 2003 13:53:07 EDT." Message-ID: <195MUg-178-00@etnus.com> > b) communications latency (bandwidth actually can be as big as you > like or are likely to ever need, since you ARE a satellite, after > all...:-) Well, 18 months ago ESA were getting 50Mb optically between satellites http://www.esa.int/export/esaCP/ESASGBZ84UC_Improving_0.html since that is 1) a Moore generation ago :-) (though I think development times are longer in space technology) 2) public information I expect that the people on the "dark side" who do this can indeed get an awful lot of bandwidth... (ISTR Chaisson in "Hubble Wars" mentioning that they were borrowing the 10Mb link to space that the NSA folks had back in 1991). -- Jim James Cownie Etnus, LLC. +44 117 9071438 http://www.etnus.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rtomek at cis.com.pl Mon Apr 14 19:29:17 2003 From: rtomek at cis.com.pl (Tomasz Rola) Date: Tue, 15 Apr 2003 01:29:17 +0200 (CEST) Subject: beowulf in space In-Reply-To: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Mon, 14 Apr 2003, Joel Jaeggli wrote: > On Mon, 14 Apr 2003 chettri at gst.com wrote: > > > Has anybody considered the theoretical aspects of placing beowulfs on a > > cluster of satellites? I understand that communication will be slower AND > > unreliable, > > Communication to satellites needs to be neither slow nor unreliable, it is > generally fairly high latency... It can be quite expensive. > > There are clusters of computers in space. they generally aren't what you > would consider heavy computation platforms... Correct. However, I think that since this question really belongs to s-f (at least today) one can put some s-f behind the answer... > The biggest issues with with computer resources in space are: > > mass - a large sattellite such as the hughes galaxy 4r bird is around [...] I think mass is an issue when you have to export everything up from the Earth. It won't be if you start to get materials from celestial sources. The cost of launch from the Moon should be ca. 6 times less than from the Earth. Even less when you start to explore asteroids. Kuiper belt should have plenty of materials. Of course, the cost of building facilities there is so high that it will take a long time to become feasible and next to pay off. > power - solar power and long life cadmium batteries mean your whole I think there is plenty of solar power in space. At least within some specified orbit. It's only that you can't get enough of solar grids there to use it in a practical way. BTW, some people are reconsidering the use of atomic power up there. http://www.spacedaily.com/news/oped-03i.html (There are some links at the bottom of the page too). > radiation hardening - without 50 miles of atmosphere overhead we're kinda Yes it is an issue. 
Perhaps it could help if you buried a cluster under the surface of the Moon or put it on the dark side of Mercury (you would need to move slowly your cluster there to avoid being rotated into the very hot sunlight - not very practical, I think). It seems that magnetic field helps but this page: http://isaac.exploratorium.edu/~pauld/activities/magnetism/magnetismofplanets.html shows that Earth-like field is scarce in Solar System. Placing such systems, especially built from off the shelf components, on orbit is probably not very bright idea unless you can protect them. > thermal management - air cooling doesn't work given no atomosphere... even I'm not a specialist but I think you can force the (air | water) flow in space. Otherwise, astronauts would have very dangerous time sleeping in one place for few hours, with no ventilation at all (CO2 bubble growing around their heads). > expected service life - if you plan on go to the expense of putting it in Today, the longer you can use orbital device the better but nobody applies this kind of measures to clusters. So you are right it would not be worth to expedition units from Earth. On the other hand, the use of automated production facilities, maybe on the Moon, would make the project possible. When connected with some inexpensive transport system (who knows, electromagnetic cargo ejectors or orbital lift) it could provide upgrades and replacements (provided that you solve the radiation problem). It is also quite possible that some time from now the Moore's law will no longer hold. If so, the computing unit longevity would be measured in tens of years. So even without cheap transportation it may be ok to hold it on orbit for 20 years and still have fun (but not if you use today's cpus). technology vs automation issues - - From what I know the technology for all this is right now very primitive and/or requires human attention to work properly. Maybe you should ask the question again about 10-20 years from now. Frankly, I don't see much sense in putting cluster on orbit and than paying lots of money for sending human operators there too. So automatic operation is probably a must for this kind of projects. mental sanity and business issues - (sorry, I just couldn't stop myself :-) ) BTW I can't understand WHY anybody would like to place a cluster on orbit? For the control of some weapon system with sofisticated AI? For autonomic management of exploration mission? Do you have any concept of computational device that would work better on the orbit, by chance? Environmental issues? Nah. I doubt if we could build such big clusters anytime soon. Milions of units in one place? What for... You know, the idea is nice but if what I know is correct, one can do the same job on the surface, under the surface and even under the sea for a fraction of cost and without waiting for better tech. > > and it would restrict the set of problems that could be solved. I'm looking > > for papers/tech reps etc on the subject. > > > > Regards, > > > > Samir Chettro Probably you would suffer from signal propagation times. The longest path you have to deal on the Earth's surface is some 20 000 km. In case of a geostationary cluster the diameter is about 70 000 km and even longer if you want to send via the neighbours. Ok, I expect that you want to have more than one cluster up there. So I think you are restricted to the tasks with high processing / communication ratio. [...] 
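To put rough numbers on that propagation argument, here is a back-of-the-envelope sketch using the two distances quoted above (20,000 km for the longest terrestrial path, ~70,000 km across a geostationary constellation) and assuming free-space speed of light with no routing or processing delays:

#include <stdio.h>

int main(void)
{
    const double c_km_s = 299792.458;    /* speed of light in vacuum, km/s */
    const double paths_km[] = { 20000.0, 70000.0 };
    const char  *label[]   = { "half-way around the Earth",
                               "across a geostationary constellation" };

    for (int i = 0; i < 2; i++)
        printf("%-40s one-way latency >= %.0f ms\n",
               label[i], 1000.0 * paths_km[i] / c_km_s);
    /* prints roughly 67 ms and 233 ms: tightly coupled message passing is
       out, which is why only jobs with a high compute-to-communication
       ratio make sense on such a "cluster" */
    return 0;
}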
> -- > -------------------------------------------------------------------------- > Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu > -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- bye T. - -- ** A C programmer asked whether computer had Buddha's nature. ** ** As the answer, master did "rm -rif" on the programmer's home ** ** directory. And then the C programmer became enlightened... ** ** ** ** Tomasz Rola mailto:tomasz_rola at bigfoot.com ** -----BEGIN PGP SIGNATURE----- Version: PGPfreeware 5.0i for non-commercial use Charset: noconv iQA/AwUBPptEWBETUsyL9vbiEQIwDQCfUZtUICa+ecU5SsAjOHGLHg8yL9wAnRGT meGiR1y1vvZYrho51MkLqY+e =fwa/ -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From duraid at octopus.com.au Mon Apr 14 19:14:26 2003 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 15 Apr 2003 09:14:26 +1000 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> References: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> Message-ID: <3E9B40D2.9010400@octopus.com.au> Richard Walsh wrote: > Duraid Madina wrote: > >>On FP-heavy workloads, perhaps. On integer workloads, hardly. And that's >>only comparing it against HP's own PA-8700 hardware! Compare it to more >>mainstream hardware and you'll see just how laughable Itanium 2 prices >>are. The Itanium 2 doesn't have significantly higher performance than >>today's Xeons. Opteron, at least for the time being, performs >>significantly better again. > > The SPECFP numbers rate the Itanium 2 at about 1425 (relative to the > base Sun) and the Pentium 4 at about 1100. That's about a 20% advantage > on floating point (PA-RISC rates a 600 I think). The integer ratio is > about 1100 to 800 in favor of Pentium 4. You think a 20% advantage justifies the price difference, or qualifies as significantly higher performance? > Bandwidth to memory as measured > by stream triad is 50% better on the Itanium implying that you will get > a larger percentage of peak for out-of-cache workloads. That's pretty pathetic in the light of Opteron and even today's Intel 875 desktop chipset. Is Madison going to bring Itanium 2 a new FSB? Nope. > Then there is the 64-bit address space, What a great reason to charge through the roof. > EPIC compiler technology, Even better!! > etc. ... but ... > > Itanium 2 prices seem high to me. However, the questions is really one for > Intel and HP ... is the current price generating enough volume to hit the > revenue sweet spot. They could care less whether I, you, or any random individual > buyer likes the price ;-). The price is right if they are maximizing the > time-integrated return on the product. Initial pricing should err high ... > you can always lower it, but can never raise it. Until Opteron is available, > the only, long-lived, direct competition is the Power 4 (is it available in 1 > and 2 processor configurations?). Well if we believe AMD, Opteron arrives next week. You go buy up your Itanium 2s (or POWER 4s). > Plus, why should Intel compete with their > own price-performance Pentium 4 systems by lowering Itanium 2 prices? Because if they lowered Itanium 2 prices, Opteron wouldn't have a market. It's too late now, for Itanium 2. Itanium 2.5/3 may be a different story (we can only hope). 
They > are serving two markets segments those with more money than brains and those > with more brains than money ... ;-). The market is quantized ... each product > has its own quantum number. Itanium 2 certainly seems to be in a superposition of "fantastic" and "worthless" that the computer market hasn't seen for quite some time. > I would be interested in SPECFP and Stream Triad numbers for the > Opteron if you have them. SPECfp2000 is ~1170 for a 2GHz 1MB L2 Opteron. Not too bad. The SPECint figure is fantastic though (~1200). I hate x86 as much as the next guy, but it looks like this is what I'm going to be working with for some time, _thanks to Intel and their incredibly uninspired pricing strategy_. Duraid _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From anand at novaglobal.com.sg Mon Apr 14 23:12:32 2003 From: anand at novaglobal.com.sg (Anand Vaidya) Date: Tue, 15 Apr 2003 11:12:32 +0800 Subject: Question regarding M-VIA & Linux Message-ID: <200304151112.38782.anand@novaglobal.com.sg> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I am attempting to setup a cluster with Linux (x86) & M-VIA (virtual interface). I find that the project has been abandoned last year. Also, most of the VIA drivers (eepro100, e1000, tulip etc) do not compile or if they compile, do not work as expected. They hang (tulip) or don't even recognise the ethernet card (e1000), or fail in vnettest (eepro100). Unfortunately, I don't have access to hamachi or syskonnect hardware. I would like to know whether any of you have production clusters running on MVIA, especially Intel GB NICs, since that is what I have (easy) access to. I would be grateful if you have any patches or related documents. Regards, Anand -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQE+m3ilQR28l/pNhTkRAhc1AKCjIGv8Nf2pexnUt6+X6OiV+Hu2xQCfb4Mg vqvnADWhSrysUuJqtB8cFIA= =pzwQ -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Tue Apr 15 10:40:29 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Tue, 15 Apr 2003 09:40:29 -0500 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <200304151440.h3FEeTY10399@mycroft.ahpcrc.org> On Mon Apr 14 19:13:40 2003, Mark Hahn wrote: >> about 1100 to 800 in favor of Pentium 4. Bandwidth to memory as measured >> by stream triad is 50% better on the Itanium implying that you will get >> a larger percentage of peak for out-of-cache workloads. Then there is > >until the next-gen P4 chipsets arrive (and they have). Have you see any P4 stream numbers that break the 3 GB/s level yet? What chipsets/boards? > >> I would be interested in SPECFP and Stream Triad numbers for the >> Opteron if you have them. > >me too . but if I understand AMD's marketing "plan", >we won't see the interesting Opteron systems at launch. >that is, since Opteron bandwidth scales with ncpus, it's >really the 4-8-way systems that will look dramatically >more attractive than any competitors (cept maybe Marvel). I agree on the SMP play. The Opteron's inter-chip interconnect capability resembles the EV7's ... on the other hand, HP can use the EV7 (Marvel) do defend I2's flank while Intel does something about its weak shared bus, 4-way SMP design. 
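For keeping the bandwidth claims in this thread straight, the peak figures are just bus-width times transfer-rate products. The sketch below assumes the commonly cited configurations (a 128-bit, 400 MT/s shared front-side bus for Itanium 2, and two independent dual-channel DDR333 memory controllers for a two-way Opteron); it reproduces the 6.4 GB/s and roughly 10.8 GB/s numbers being quoted, and is not taken from any vendor-published spec sheet.

#include <stdio.h>

/* peak = bus width (bytes) * transfer rate (million transfers/s) */
static double peak_gb_s(double width_bytes, double mt_per_s)
{
    return width_bytes * mt_per_s / 1000.0;   /* decimal GB/s */
}

int main(void)
{
    /* Itanium 2: one 128-bit FSB at 400 MT/s shared by all CPUs on the bus */
    printf("It2 shared FSB      : %.1f GB/s\n", peak_gb_s(16, 400));

    /* One Opteron: two 64-bit DDR333 channels on the CPU itself */
    double per_cpu = 2 * peak_gb_s(8, 333.33);
    printf("Opteron, per CPU    : %.1f GB/s\n", per_cpu);

    /* Two Opterons: the controllers are independent, so the peaks add.
       (PC2700 marketing rounds each channel to 2.7 GB/s, which is where
       the "10.8 GB/s aggregate" figure quoted above comes from.) */
    printf("dual Opteron, total : %.1f GB/s\n", 2 * per_cpu);
    return 0;
}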
>it is sort of interesting that much of It2's rep rests on >fairly single-threaded benchmarks (cfp2000, stream). but I don't >see a lot of people buying uniprocessor It2's, and all It2 >systems use a shared 6.4 GB/s FSB. by comparison, a dual-opt >has 10.8 GB/s aggregate, which starts to be interesting. Good points, but if the price on an I2 with 3 MB L2 cache comes down and you place it into a larger cluster context where people care less about a system's SMP content it could be a winner ... that was/is what PNNL is thinking I guess ... but theirs is a traditional supercomputer budget really. rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Tue Apr 15 11:08:20 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Tue, 15 Apr 2003 10:08:20 -0500 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> On Tue Apr 15 09:45:03 2003, Joseph Landman wrote: >I remember the Trace Multiflow in 1991 or so, where compiling my >molecular dynamics code took the better part of a day. Made debugging >interesting, as the "bug" only appeared in the optimized code. Don't think EPIC compile times compare to those of the MultiFlow, but I have no direct experience. I do think EPIC is valuable on several scores. First, it frees real estate on the chip by reducing/eliminatin out-of-order execution hardware allowing for larger caches (3 MB on chip today) and future additional functional unit parallelism or additional cores on the same chip. Second, it allows generated code to be tuned to the width (number of simultaneous instructions alowed) of the processor. Finally, its predicate/nat analysis can completely remove traditional stall points where the CPU must wait for data from memory or conditions to be computed before proceeding to execution. The last advantage is a useful way of using increasingly redundant core hardware to speed results through the processor. "Micro- threads/paths" are simultaneously computed using hardware that for the moment would be idle anyway and results that are later proven to to un-needed are be discarded. The benefits are hard to quantify, but I believe significant part of the I2's SpecFP score is EPIC derived. I am guessing others disagree ... ;-) ... Oui? rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Tue Apr 15 10:44:48 2003 From: landman at scalableinformatics.com (Joe Landman) Date: 15 Apr 2003 10:44:48 -0400 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <3E9B40D2.9010400@octopus.com.au> References: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> <3E9B40D2.9010400@octopus.com.au> Message-ID: <1050417888.3474.6.camel@squash.scalableinformatics.com> On Mon, 2003-04-14 at 19:14, Duraid Madina wrote: > What a great reason to charge through the roof. > > > EPIC compiler technology, (hauling out an old VLIW story) I remember the Trace Multiflow in 1991 or so, where compiling my molecular dynamics code took the better part of a day. Made debugging interesting, as the "bug" only appeared in the optimized code. 
Out of curiosity, is all the good compiler technology for IA64 going to be retained in the Intel (and other commercial) compilers? Someone had posted a link to a machine to play with and I had blown it away (the link that is)... could we get a quick repost of that? Thanks. -- Joseph Landman, Ph.D Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dsarvis at zcorum.com Tue Apr 15 10:08:40 2003 From: dsarvis at zcorum.com (Dennis Sarvis, II) Date: 15 Apr 2003 10:08:40 -0400 Subject: task sharing Message-ID: <1050415720.2122.12.camel@skull.america.net> I attempted to build a 2 node cluster, simply because my workstations are slow and irritating whilst developing web applications and graphics, with a crossover cable I made. They are running Redhat 9 (a P2 and a Celron550). The machines can see and talk to each other, but do not share tasks. My question is, what is the "best" software to control the cluster and share tasks for X-windows on Redhat 9, and is this even a suitable use? -- Web Applications Designer/Developer, Project Manager, Graphic Designer, Commercial Web Site Designer, Research & Development, Systems Administrator, etc... Dennis Sarvis, II _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Apr 15 12:36:59 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 15 Apr 2003 12:36:59 -0400 (EDT) Subject: beowulf in space In-Reply-To: <195MUg-178-00@etnus.com> Message-ID: On Tue, 15 Apr 2003, James Cownie wrote: > > > b) communications latency (bandwidth actually can be as big as you > > like or are likely to ever need, since you ARE a satellite, after > > all...:-) > > Well, 18 months ago ESA were getting 50Mb optically between satellites > > http://www.esa.int/export/esaCP/ESASGBZ84UC_Improving_0.html > > since that is > > 1) a Moore generation ago :-) (though I think development times are > longer in space technology) > 2) public information > > I expect that the people on the "dark side" who do this can indeed get > an awful lot of bandwidth... I think the relevant numbers that indicate the limits of what technology CAN do from satellites are more likely to come from looking at the humble satellite dish attached to many homes. Order of 100 channels in the television range, some of them HDTV, at a guess order of 100 MB/sec per second (assume order of a MB/sec per channel). Or look at phone satellites. And that is using only a small part of the spectrum, and ignores the possibility of multiple channels reusing the same spectrum with directional links. I would guess that one could, with some effort downlink some orders of magnitude more than gigabytes per second, and I meant bytes. How many orders (and I meant plural there, too) probably does depend on a lot of things including distance from earth, ambient atmospheric conditions in the intervening space between transmitter and receiver, what frequencies one is using, the number of directional-parallel channels one can maintain. A single visible-light laser link, for example, could likely carry many gigabits per second even allowing for atmospheric distortion. 
However, one of the NASA guys on the list probably knows at least the comsat or tvsat numbers (Jim?). And as you say, the military probably has lots of bandwidth down from theirs although how much they aren't likely to say. However, they take HIGH resolution pictures in a pretty much steady stream... I think we should just accept Jim's statement that near the earth we can get a "lot" of bandwidth on demand, but that things get dicier for obvious reasons when you get far away. Less power, harder to hold a tight beam, less signal to noise on both ends' receivers, more retransmissions. It's pretty astounding that we were able to get the incredible flyby pictures of Jupiter and the outer planets at all that we got, given the minute size of the spacecraft and their power supplies, their extreme distance from the earth, and the decades they were in space. Nasa does literally incredible engineering. Expensive, sure, but the REALLY expensive missions are the ones where something breaks and the whole investment (human and otherwise) is wasted. Let's also not forget who "invented" the beowulf, as well (tip of the hat to Nasa Goddard, Don and Tom and all the rest:-). I'm quite certain that they use beowulfish concepts all the time in their engineering, and Jim did an excellent job of indicating some of the reasons why. This isn't even inconsistent with the original beowulf idea -- sure, one would be silly to throw a general purpose cluster up into space to do e.g. my computations, but real optimizing beowulf engineering matches the design to the task. Of course they're going to engineer a "cluster" that matches their precise needs and specifications. It just isn't going to be doing work "for earth" -- it will be doing signal processing and so forth, and even there only when the economics of the available data bandwidth and/or robust engineering requirements dictate. If I were engineering a space vehicle, I'd make even the onboard navigation computer redundant. This might be more of a "high availability" model than high performance, but an ideal design might mix both. Lots of processors reduces the time required for a parallelizeable navigation computation AND can make the computer more robust against the failure of one or more processors -- as long as you have at least one left, you can complete key computations, just more slowly. Heck, from one point of view every compute node is already a specialized "parallel cluster" -- the system has a CPU and a variety of bridges, dedicated special purpose processors, and so forth all on board, so how could NASA NOT make "parallel" environments for spacecraft. The one thing they won't do is use off-the-shelf parts, and I can't blame them. The damn things break all the time here on earth. I'm typing away on a computer that has a dying hard drive, while waiting for it to rsync a final time with my fingers crossed. For me it is no biggie. A trip to Intrex, a hundred bucks (or more likely warranty replacement). In space that's kinda hard. SO although I'm certain that they use clusters on spacecraft in at least one sense of the word, I'm equally certain that they are NOT beowulfs, according to the standard definition. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mn216 at columbia.edu Tue Apr 15 11:12:36 2003 From: mn216 at columbia.edu (Murad Nayal) Date: Tue, 15 Apr 2003 11:12:36 -0400 Subject: beginner help References: <200304141425.SAA07170@nocserv.free.net> <3E9B17AA.1000806@octopus.com.au> <1050358151.6451.226.camel@woody> <3E9B42EE.7020509@octopus.com.au> <3E9B4421.9060307@octopus.com.au> Message-ID: <3E9C2164.7AB1843D@columbia.edu> Hello, I am new to the beowulf system. our cluster has been having problems mostly with bpsh where for example 'bpsh 0 ls' returns either nothing or errors: bpsh 0 ls ls: error while loading shared libraries: /lib/libtermcap.so.2: cannot read file data: Error 116 this sounds like /lib is not accessible on node 0. I wrote a small program to print the /lib directory contents to a file, and another program that uses bproc_execmove to run the previous program on node 0. both programs linked static as not to need dynamic linking. and in fact I do obtain a listing for /lib on node 0. as I said I am a novice and have no idea where to go from here. any suggestions. the problem seems to go away after reboot. many thanks in advance. Murad _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kdunder at sandia.gov Tue Apr 15 11:41:32 2003 From: kdunder at sandia.gov (Keith D. Underwood) Date: 15 Apr 2003 09:41:32 -0600 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> References: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> Message-ID: <1050421292.27085.9.camel@sadl16603.sandia.gov> > Don't think EPIC compile times compare to those of the MultiFlow, but > I have no direct experience. >From what I hear, compile times are not particularly good for EPIC... > The benefits are hard to quantify, but I believe significant part > of the I2's SpecFP score is EPIC derived. > > I am guessing others disagree ... ;-) ... Oui? You should actually look at those numbers. See here: http://www.spec.org/cpu2000/results/res2002q4/cpu2000-20021119-01859.html The only way you get graphs like that is when a couple of your benchmarks actually fit in cache. Benchmarks running from cache are not terribly representative of most real applications. Keith -- Keith D. Underwood _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Tue Apr 15 12:45:22 2003 From: landman at scalableinformatics.com (Joseph Landman) Date: Tue, 15 Apr 2003 12:45:22 -0400 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <1050421292.27085.9.camel@sadl16603.sandia.gov> References: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> <1050421292.27085.9.camel@sadl16603.sandia.gov> Message-ID: <3E9C3722.4070900@scalableinformatics.com> Keith D. Underwood wrote: > The only way you get graphs like that is when a couple of your > benchmarks actually fit in cache. Benchmarks running from cache are not > terribly representative of most real applications. 
I seem to remember that being one of my major complaints about SPEC in general. I would much prefer to see small, medium, large, huge, and I-cant-beleive-you-expect-results-from-something-that-size type runs than the old "fit-in-the-cache" variety. My runs, and my customers runs are quite a bit larger than the 3MB caches, so we tend to take SPEC with a kg or two of salt. -- Joseph Landman, Ph.D Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From derek.richardson at pgs.com Tue Apr 15 15:03:33 2003 From: derek.richardson at pgs.com (Derek Richardson) Date: Tue, 15 Apr 2003 14:03:33 -0500 Subject: Power supply problems (was: Machine Check Exception) In-Reply-To: <20030415114952S.hanzl@unknown-domain> References: <3E9AF3DD.1070002@pgs.com> <20030415114952S.hanzl@unknown-domain> Message-ID: <3E9C5785.5090006@pgs.com> Vaclav, Thanks for the info, I ending up finding the problem : CPU1. Which of course was the last thing I checked...it took a while for it to occur to me that a machine check exception might not be particular to the CPU that generates it. Thanks, Derek R. hanzl at noel.feld.cvut.cz wrote: >>Does anyone know if a power supply can cause a machine check exception >> >> > >Power supply can probably cause all sorts of weird problems, with >variety similar to RAM problems - problem can surface just anywhere >and resemble other hardware or software problem to such an extent that >usual diagnostic steps like replacing hardware and software components >clearly indicate that something else is faulty, replacing that >'faulty' component seems to fix it but later on problems reoccur. > >In another words, if your power supply works near the limits, problems >may be observed just on few nodes in the cluster (or on a single node) >and there may be just certain hardware/software/environmental >circumstances which trigger the problem. > >I've seen certain indications that these days power supply can cause >more headaches than before: > >- note from Abit tech staff saying that certain power supplies send >POWER_GOOD 'too early', meaning before on-board power conversion >circuits had time to stabilise > >- overclockers are starting to take power supply more seriously. >(However stupid overclocking is, their web sites gives good >indication which parts of hardware work near to the limits.) > >- note that certain Abit BIOS upgrade fixes PSU problems > >- many problems with Abit IT7 MAX2 and 300W PSU I've encountered >myself > >- fact that modern PSU is under software control and hardware control >of that part of mainboard which gets standby power, opening new >possibilities for intermixing PSU/hardware/software problems. > > >Most of my experience is Abit-related but may be general I am >affraid... > >Regards > >Vaclav > > -- Linux Administrator derek.richardson at pgs.com derek.richardson at ieee.org Office 713-781-4000 Cell 713-817-1197 bureaucracy, n: A method for transforming energy into solid waste. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alangrimes at starpower.net Tue Apr 15 17:27:49 2003 From: alangrimes at starpower.net (Alan Grimes) Date: Tue, 15 Apr 2003 14:27:49 -0700 Subject: beowulf in space References: Message-ID: <3E9C7955.AC911463@starpower.net> To the best of my limited understanding of space applications, the biggest problem of computing in space is not the actual computing but the communications. The biggest problem in sending 20 probes out into the universe is the issue of _TRACKING_ those 20 probes. NASA has found that its Deep Space Network is stretched thin tracking all the probes it has sent up over the years (in addition to its other astronomical duties)... The solution is to design a version of the internet so that the various probes can communicate among themselves, reducing the workload of the DSN to only a few targets (or eliminating the need for elaborate ground tracking altogether..) Some interesting projects could be: --- Establishing relay stations on the moon... There are no stable orbits around the moon, so any communications relay station would need to be ground based. Apparently there are certain points where the various gravitational forces balance out, called "Lagrange points". One would place relay satellites at these locations and then build comms towers on the ground to relay local traffic up to the satellites. This is tricky work because there would need to be a direct line of sight from your mission to the nearest comm tower and from there to one of the stationary satellites... The technologies for this have already been developed... The big challenge is, of course, establishing this network for operations on the so-called dark side of the moon... -- Establishing relays to distant planets: We would like to have networks going to Venus, Mars, and Jupiter... This would require at least one satellite around each of the remote planets and one in orbit around earth. A satellite in high orbit would have a clear line of sight to its target planet for 22+ hours a day and would only require a single ground station each. Such a network would drastically reduce the maintenance costs of all future missions... =) -- Having never read a manual, it takes less effort to hack something together with www.squeak.org than it does with C++ and five books. http://users.rcn.com/alangrimes/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From amitvyas_cse at hotmail.com Wed Apr 16 00:06:16 2003 From: amitvyas_cse at hotmail.com (amit vyas) Date: Wed, 16 Apr 2003 09:36:16 +0530 Subject: network(cluster) load balancing Message-ID: hi all, we are working on an experimental 3+1 node beowulf cluster (OSCAR 2.1 + Linux 7.3), and we have lately been thinking about how to use this cluster to provide various services for the college campus, mainly: 1. diskless clients (network booting) 2. X-terminals (XDMCP) 3. parallel programs. That is, we want to know how to modify the cluster so that it handles NETWORK LOAD BALANCING (NLB). I think I have made myself clear. Can anyone help us on how to deploy this mainframe-terminal model so as to demonstrate supercomputing power?
Lastly, RGB (rgb at phy.duke.edu) provided us with valuable information for our problem that helped us a lot. Thank you, RGB. Thanks in advance. Amit vyas RIET, JAIPUR INDIA CSE deptt. _________________________________________________________________ Find old batchmates. Renew lost friendship. http://www.batchmates.com/msn.asp Right here! _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eric at fnordsystems.com Wed Apr 16 01:02:16 2003 From: eric at fnordsystems.com (Eric Kuhnke) Date: Tue, 15 Apr 2003 22:02:16 -0700 Subject: Electricity Bill Message-ID: <5.2.0.9.2.20030415215707.02a62c70@66.250.215.18> I have a quick survey regarding electricity use, kWh rates, etc: 1) How much did you pay last year for the electricity and HVAC consumption of your cluster? 2) How big is it? What sort of CPUs? etc. 3) What are you paying in kilowatt-hour rates to the power company? 4) Would you have built a much larger cluster if the projected yearly electrical bill was significantly lower? E.g., if you were located in an area such as Vancouver or Winnipeg, which have the lowest electricity rates in North America. See http://www.bchydro.com/policies/rates/rates759.html for kWh rates (in Canadian currency). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Apr 16 10:51:11 2003 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 16 Apr 2003 07:51:11 -0700 (PDT) Subject: new to linux clustering In-Reply-To: Message-ID: <20030416145111.29937.qmail@web11404.mail.yahoo.com> MPI stuff ========= MPICH: http://www-unix.mcs.anl.gov/mpi/mpich/ LAM-MPI: http://www.lam-mpi.org (LAM has a mailing list) For batch systems, take a look at GridEngine: http://gridengine.sunsource.net Rayson --- Mohammad Tina wrote: > Hi, > what I am trying to do is for my class project: we want to set up a > cluster of 3 machines. I think we will run MPI applications, then if > it is possible I will try to set up a grid with another cluster. > > thanks > __________________________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. Bingo http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Tue Apr 15 19:13:34 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Wed, 16 Apr 2003 02:13:34 +0300 Subject: beowulf in space In-Reply-To: References: Message-ID: <200304160213.34165.exa@kablonet.com.tr> On Tuesday 15 April 2003 19:36, Robert G. Brown wrote: > SO although I'm certain that they use clusters on spacecraft in at least > one sense of the word, I'm equally certain that they are NOT beowulfs, > according to the standard definition. In my mind, the real difficulty comes not from the costs but from the exotic and diverse nature of the hardware. This might even more conceivably link up with the "grid computing" idea. If I had such a heterogeneous network and I needed to run some high-performance computation, could I simply submit it to a grid system that would automatically reconfigure the computation for optimal performance, etc.?
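One small ingredient of such a system is easy to sketch: weight the work handed to each node by its measured speed. The C fragment below does only that; the node names and relative benchmark scores are invented for illustration, and a real grid scheduler would also have to discover those scores itself and account for network topology, current load, and node failures.

#include <stdio.h>

struct node { const char *name; double score; };   /* higher = faster */

int main(void)
{
    /* Invented, illustrative node list -- not real benchmark data. */
    struct node nodes[] = {
        { "lab-p2-350",  1.0 },
        { "lab-p3-1000", 2.8 },
        { "onboard-dsp", 0.6 },
        { "xeon-2400",   6.5 },
    };
    int  n          = sizeof nodes / sizeof nodes[0];
    long total_work = 1000000;          /* e.g. loop iterations to split */

    double total_score = 0.0;
    for (int i = 0; i < n; i++)
        total_score += nodes[i].score;

    long assigned = 0;
    for (int i = 0; i < n; i++) {
        long share = (long)(total_work * nodes[i].score / total_score);
        if (i == n - 1)                 /* give any remainder to the last node */
            share = total_work - assigned;
        assigned += share;
        printf("%-12s gets %8ld units (%.1f%%)\n",
               nodes[i].name, share, 100.0 * share / total_work);
    }
    return 0;
}

The proportional split is the easy part; the hard part being pointed at here is discovering the scores, the topology, and the failure behaviour automatically.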
I'm thinking something like a space vessel being able to take advantage of all general-purpose processors on-board and in sufficiently close proximity (like say a space station). The problem abstractly not too different from a university campus with a lot of motley computer labs and unpredictable network setups (ie. dynamic nodes, network topology, so forth). Thus, I think it's more of a software problem. Can you really build "the" high performance platform that successfully and completely abstracts the OS/network/CPU? What would be needed for such a thing? (I'm not thinking 'Java', that's slow, thank you) Such software is surely in line with Beowulf thinking. Thanks, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Apr 16 12:34:10 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 16 Apr 2003 12:34:10 -0400 (EDT) Subject: beowulf in space In-Reply-To: <200304160213.34165.exa@kablonet.com.tr> Message-ID: On Wed, 16 Apr 2003, Eray Ozkural wrote: > On Tuesday 15 April 2003 19:36, Robert G. Brown wrote: > > SO although I'm certain that they use clusters on spacecraft in at least > > one sense of the word, I'm equally certain that they are NOT beowulfs, > > according to the standard definition. ... > Thus, I think it's more of a software problem. Can you really build "the" high > performance platform that successfully and completely abstracts the > OS/network/CPU? What would be needed for such a thing? (I'm not thinking > 'Java', that's slow, thank you) Such software is surely in line with Beowulf > thinking. Absolutely, although I'd refer to it more precisely as "cluster computing" thinking and not beowulfs per se. To be picky, a beowulf is "single machine" supercomputer built out of COTS (commodity, off the shelf) components, running an open source operating system, traditionally linux, and possibly some software such as Scyld or bproc that flattens PID space or is otherwise designed to promote that image of "a beowulf" as being a single machine. All beowulfs are clusters, not all clusters are beowulfs, see interminable discussions in years past in the archives and Kragen's FAQ. A spacecraft cluster will simply never be built with real COTS parts -- they are too unreliable and not nearly expensive enough:-). They might conceivably be built with "customized" parts that have a COTS origin -- a system homologous to or derived from a COTS design but subjected to a far more rigorous manufacture and testing regimen. It might also be built with one of the beowulf networks since COTS (in the usual sense of the term) or not in some cases they are the only game in town. So I wouldn't be incredibly surprised to see a spacecraft containing a bunch of "intel" or "amd" nodes, interconnected with e.g. SCI (because it is switchless and hence arguably more robust). Those nodes, however, will be built on motherboards and CPUs custom engineered for low power, radiation hardness, fault tolerance, redundancy, and tested ad nauseam before ever leaving the earth. 
It is cheaper to spend $100K or even more on each on those nodes ("identical" in function to a $2000 board+network interface here on earth) and be almost certain that they won't fail than it is to deal with the roughly 10% failure rate per year observed for at least one component in a lot of COTS systems. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at plogic.com Wed Apr 16 15:25:53 2003 From: deadline at plogic.com (Douglas Eadline) Date: Wed, 16 Apr 2003 14:25:53 -0500 (CDT) Subject: SMP and Network Connections Message-ID: Just posted some more SMP tests on www.cluster-rant.com. This time, I tested the interconnects and asked the question "What if a dual SMP used two Ethernet connections instead of one?" Seems to help! Take a look at: http://www.cluster-rant.com/article.pl?sid=03/04/16/1815257 to get the full report. Doug ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kjm31 at cu-genome.org Wed Apr 16 15:57:50 2003 From: kjm31 at cu-genome.org (Kristen J. McFadden) Date: Wed, 16 Apr 2003 15:57:50 -0400 Subject: Running perl scripts and non-mpi programs on scyld Message-ID: Hi, We have a Scyld Beowulf cluster currently running on 28cz-4 (we are getting -5 soon). We have been running into a lot of problems with users that are trying to run scripts on the child nodes. To start with, what is the best way to run serial (non-MPI) programs? Here is the current issue I'm trying to tackle. Say I have a perl script. (I NFS mount /usr /lib etc. on the child nodes) I want to run this perl script on N nodes with N DIFFERENT arguments. Right now, even when I write up a small file with "mpprun my_program arg1 arg2 | batch now" in 100 lines or something for all different arguments, bbq does NOT properly distribute these programs. It overloads some nodes and behaves essentially unpredictably. Is there any tools or info anyone has about running Perl scripts and the like safely on a Scyld implementation? Thanks, Kristen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Apr 16 17:10:03 2003 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 16 Apr 2003 14:10:03 -0700 (PDT) Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: Message-ID: <20030416211003.95964.qmail@web11404.mail.yahoo.com> What is the ratio of parallel programs and serial programs ran on your cluster?? Rayson --- "Kristen J. McFadden" wrote: > Hi, > > We have a Scyld Beowulf cluster currently running on 28cz-4 (we are > getting -5 soon). 
We have been running into a lot of problems with > users that are trying to run scripts on the child nodes. To start > with, what is the best way to run serial (non-MPI) programs? > > Here is the current issue I'm trying to tackle. > > Say I have a perl script. (I NFS mount /usr /lib etc. on the child > nodes) > > I want to run this perl script on N nodes with N DIFFERENT arguments. > > Right now, even when I write up a small file with "mpprun my_program > arg1 arg2 | batch now" in 100 lines or something for all different > arguments, bbq does NOT properly distribute these programs. It > overloads some nodes and behaves essentially unpredictably. > > > > Is there any tools or info anyone has about running Perl scripts and > the > like safely on a Scyld implementation? > > > > Thanks, Kristen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. Bingo http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Wed Apr 16 16:38:11 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Wed, 16 Apr 2003 13:38:11 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <1050417888.3474.6.camel@squash.scalableinformatics.com> References: <200304142215.h3EMFYI32263@mycroft.ahpcrc.org> <3E9B40D2.9010400@octopus.com.au> <1050417888.3474.6.camel@squash.scalableinformatics.com> Message-ID: <20030416203811.GB1149@greglaptop.internal.keyresearch.com> On Tue, Apr 15, 2003 at 10:44:48AM -0400, Joe Landman wrote: > Out of curiosity, is all the good compiler technology for IA64 going to > be retained in the Intel (and other commercial) compilers? Open64 has a GPLed IA64 backend. While it's unfortunate that SGI has stopped GPLing new work on it, it's still a pretty good compiler. It is currently being used by several companies for new cpus, and by some research groups, both compiling for the IA-64 and using it as a source-to-source tool. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Wed Apr 16 17:13:17 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Wed, 16 Apr 2003 14:13:17 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> References: <200304151508.h3FF8KR10851@mycroft.ahpcrc.org> Message-ID: <20030416211317.GC1149@greglaptop.internal.keyresearch.com> On Tue, Apr 15, 2003 at 10:08:20AM -0500, Richard Walsh wrote: > I do think EPIC is valuable on several scores. First, it frees real > estate on the chip by reducing/eliminatin out-of-order execution hardware > allowing for larger caches (3 MB on chip today) and future additional > functional unit parallelism or additional cores on the same chip. Nope. You can look up the size of the EPIC core; it's not small. It only can have 3 MB on chip cache today because it's the largest possible chip you can build. That cache is much larger than the processor core. 
> Second, it allows generated code to be tuned to the width (number > of simultaneous instructions allowed) of the processor. Good compilers have instruction scheduling that does this on other chips. While it's easier to understand what's going on when the parallelism is explicit, you'll find that scientific codes get a pretty amazing number of instructions per cycle on quite a few cpus and compilers. The promise of EPIC was that it would be easier to do this. You'll have to talk to some compiler people to find out if they think it was easier. The ones I know hate EPIC with a passion. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Wed Apr 16 17:53:50 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed, 16 Apr 2003 14:53:50 -0700 Subject: beowulf in space In-Reply-To: References: <200304160213.34165.exa@kablonet.com.tr> Message-ID: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> A >So I wouldn't be incredibly surprised to see a spacecraft containing a >bunch of "intel" or "amd" nodes, interconnected with e.g. SCI (because >it is switchless and hence arguably more robust). Those nodes, however, >will be built on motherboards and CPUs custom engineered for low power, >radiation hardness, fault tolerance, redundancy, and tested ad nauseam >before ever leaving the earth. It is cheaper to spend $100K or even >more on each on those nodes ("identical" in function to a $2000 >board+network interface here on earth) and be almost certain that they >won't fail than it is to deal with the roughly 10% failure rate per year >observed for at least one component in a lot of COTS systems. Interestingly, they needn't cost $100K... There are several firms that sell (flight qualified) processor cards with interfaces for less. This would generally be in a 6U form factor, conduction cooled, with some degree of radiation tolerance, and with "flight quality" parts. You can, for about $30-40K, buy a nifty hybrid package about 2.5x3.5 inches with a 21020 DSP, a bunch of RAM, various and sundry peripheral glue logic (timers, serial ports, etc.) and 3 high speed IEEE-1355 serial ports. There's also a SPARC version in the same package. Sandia is developing a rad hard Pentium, for those preferring an x86 processor. There's also a rad hard/tolerant PowerPC (133 MHz, I think) available from BAE. I'm pretty sure there's a '386 or '486 available as well. One of the appeals of a Beowulf kind of concept is the idea of using a bunch of commodity processors ganged together to get more processing resources. For space, the difference is that commodity means something a bit different. However, anytime you can spread the NRE cost across a system composed of a bunch of identical parts, it's a good thing. This is because you're always buying spares, redundant strings, engineering models, etc., and those can help to spread the development cost, so the "flight article" cost is less. There's also a non-negligible cost of having more items on the "bill of materials": each different kind of part needs drawings, documentation, test procedures, etc., a lot of which is what makes space stuff so expensive compared to the commercial parts (for which the primary cost driver is that of sand (raw materials) and marketing) so again, systems comprised of many identical parts have advantages. >James Lux, P.E.
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Wed Apr 16 19:41:36 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Wed, 16 Apr 2003 16:41:36 -0700 Subject: beowulf in space In-Reply-To: References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> Message-ID: <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> A > > There's also a non-negligble cost of having more items on the "bill of > > materials": each different kind of part needs drawings, documentation, > test > > procedures, etc., a lot of which is what makes space stuff so expensive > > compared to the commercial parts (for which the primary cost driver is > that > > of sand (raw materials) and marketing) so again, systems comprised of many > > identical parts have advantages. > >Hmmm, so the primary cost determinant of VLSIC's is the cost of sand...? > >Verrry Eeenteresting... > >Now marketing, that I'd believe;-) Say it costs a billion dollars to set up the fab (which can be spread over 2-3 years, probably), and maybe another half billion to design the processor (I don't know... 2500 work years seems like a lot, but...?)... How many Pentiums does Intel make? It's kind of hard to figure out just how many chips Intel makes in a given time (such being a critical aspect of their profitibility), but... consider that Intel Revenue for 2002 was about $27B.... As for marketing... in an article about P4s from April of 2001: Intel has told news sources that it plans to spend roughly $500 million to promote the new technology among software makers, and another $300 million on general advertising. Such enormous volumes are why commodity computing even works..The NRE for truly high performance computing devices is spread over so many units... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Apr 16 19:08:21 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 16 Apr 2003 19:08:21 -0400 (EDT) Subject: beowulf in space In-Reply-To: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> Message-ID: On Wed, 16 Apr 2003, Jim Lux wrote: > Interestingly, they needn't cost $100K... There are several firms that sell > (flight qualified) processor cards with interfaces for less. This would > generally be in a 6U form factor, conduction cooled, with some degree of > radiation tolerance, and with "flight quality" parts. I stand corrected. Perhaps general aviation and military creates a market large enough to be considered COTS in its own, somewhat elevated right. Cool. Seems useful to know. Perhaps I'll have to write a chapter on "Beowulfs in Space", or "Beowulfs in Super Secret Weapons Systems" (kidding!) in my online book. > > You can, for about $30-40K, buy a nifty hybrid package about 2.5x3.5 inches > with a 21020DSP, a bunch of RAM, various and sundry peripheral glue logic > (timers, serial ports, etc.) and 3 high speed IEEE-1355 serial ports. > > There's also a SPARC version in the same package. > > Sandia is developing a rad hard Pentium, for those preferring a x86 > processor. 
There's also a rad hard/tolerant PowerPC (133 MHz, I think) > available from BAE. I'm pretty sure there's a '386 or '486 available as well. > > One of the appeals of a Beowulf kind of concept is the idea of using a > bunch of commodity processors ganged together to get more processing > resources. For space, the difference is that commodity means something a > bit different. However, anytime you can spread the NRE cost across a > system composed of a bunch of identical parts, it's a good thing. This is > because you're always buying spares, redundant strings, engineering models, > etc., and those can help to spread the development cost, so the "flight > article" cost is less. > > There's also a non-negligble cost of having more items on the "bill of > materials": each different kind of part needs drawings, documentation, test > procedures, etc., a lot of which is what makes space stuff so expensive > compared to the commercial parts (for which the primary cost driver is that > of sand (raw materials) and marketing) so again, systems comprised of many > identical parts have advantages. Hmmm, so the primary cost determinant of VLSIC's is the cost of sand...? Verrry Eeenteresting... Now marketing, that I'd believe;-) rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Wed Apr 16 20:30:37 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Wed, 16 Apr 2003 17:30:37 -0700 (PDT) Subject: Electricity Bill In-Reply-To: <5.2.0.9.2.20030415215707.02a62c70@66.250.215.18> Message-ID: If we were located in Winnipeg, our cooling costs would be way lower in the winter... joelja On Tue, 15 Apr 2003, Eric Kuhnke wrote: > I have a quick survey regarding electricity use, kW/H rates, etc: > > 1) How much did you pay last year for the electricity and HVAC consumption > of your cluster? > > 2) How big is it? What sort of CPUs? etc > > 3) What are you paying in kilowatt-hour rates to the power company? > > 4) Would you have built a much larger cluster if the projected yearly > electrical bill was significantly lower? Ex: if you were located in an > area such as Vancouver or Winnipeg, lowest electricity rates in North > America. See http://www.bchydro.com/policies/rates/rates759.html for kwH > rates (in Canadian currency). > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. 
-- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From young_yuen at yahoo.com Wed Apr 16 23:22:59 2003 From: young_yuen at yahoo.com (Young Yuen) Date: Wed, 16 Apr 2003 20:22:59 -0700 (PDT) Subject: problem with ANA-6911A/TX under kernel 2.4.18 In-Reply-To: <20030413163641.31713.qmail@web41303.mail.yahoo.com> Message-ID: <20030417032259.18109.qmail@web41303.mail.yahoo.com> Sorry but is this the right place for questions for problems with the tulip (DEC chip based NIC) driver? --- Young Yuen wrote: > Hi, > > The Tulip driver doesn't seem to detect the RJ45 > port. > My kernel ver is 2.4.18 and Tulip driver ver is > 0.9.15. > > Linux Tulip driver version 0.9.15-pre11 (May 11, > 2002) > tulip0: EEPROM default media type Autosense. > tulip0: Index #0 - Media MII (#11) described by a > 21142 MII PHY (3) block. > tulip0: Index #1 - Media 10base2 (#1) described by > a > 21142 Serial PHY (2) block. > tulip0: ***WARNING***: No MII transceiver found! > divert: allocating divert_blk for eth0 > eth0: Digital DS21143 Tulip rev 33 at 0xc6855000, > 00:00:D1:00:0B:4B, IRQ 11. > > Somtimes after a reboot the warning message is gone. > > tulip0: MII transceiver #1 config 3100 status 7809 > advertising 0101. > divert: allocating divert_blk for eth0 > eth0: Digital DS21143 Tulip rev 33 at 0xc6855000, > 00:00:D1:00:0B:4B, IRQ 11. > > But in either cases, it fails to ping any nodes on > the > network besides its own. ANA-6911A/TX is a > 100BaseT/10Base2 combo card, RJ45 port is > connected to LAN. Windows dual boot from the same > machine works fine shows no problem with the network > configuration or hardware. > > Can you please kindly advise. > > Thx & Rgds, > Young > > __________________________________________________ > Do you Yahoo!? > Yahoo! Tax Center - File online, calculators, forms, > and more > http://tax.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. Bingo http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Wed Apr 16 23:21:51 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Wed, 16 Apr 2003 21:21:51 -0600 Subject: beowulf in space In-Reply-To: <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> Message-ID: <20030417032151.GA13826@plk.af.mil> I think I'm jumping into the middle of a conversation here, but our branch is the shop through which most of the DoD processor programs are managed. For real space applications there are radiation issues like total dose hardness and single even upset that require special design and, still, special processing. That is, you can't make these parts at any foundry (yet). There are currently two hardened foundries through which the most tolerant parts are fabricated. 
Where the commercial market is ~100's of Billions/year, the space electronics industry is ~200million/year. So parts are expensive, as Jim Lux says. But more importantly, the current state-of-the-art for space processors is several generations back. Now, with a 200 million market/year, who is going to spend the money to build a new foundry? (anyone?) It's a huge problem, and beowulfs in space will not give the economies of scale necessary to move us forward. I don't know if this has been discussed here, but have you thought about launch costs? They're huge. Weight, power, and mission lifetime are the crucial factors for space. These are the reasons that so much R&D goes into space electronics. I apologize if I have gone over old ground. Art Edwards On Wed, Apr 16, 2003 at 04:41:36PM -0700, Jim Lux wrote: > A > >> There's also a non-negligble cost of having more items on the "bill of > >> materials": each different kind of part needs drawings, documentation, > >test > >> procedures, etc., a lot of which is what makes space stuff so expensive > >> compared to the commercial parts (for which the primary cost driver is > >that > >> of sand (raw materials) and marketing) so again, systems comprised of > >many > >> identical parts have advantages. > > > >Hmmm, so the primary cost determinant of VLSIC's is the cost of sand...? > > > >Verrry Eeenteresting... > > > >Now marketing, that I'd believe;-) > > Say it costs a billion dollars to set up the fab (which can be spread over > 2-3 years, probably), and maybe another half billion to design the > processor (I don't know... 2500 work years seems like a lot, but...?)... > How many Pentiums does Intel make? It's kind of hard to figure out just how > many chips Intel makes in a given time (such being a critical aspect of > their profitibility), but... > > consider that Intel Revenue for 2002 was about $27B.... > > As for marketing... in an article about P4s from April of 2001: > Intel has told news sources that it plans to spend roughly $500 million to > promote the new technology among software makers, and another $300 million > on general advertising. > > > Such enormous volumes are why commodity computing even works..The NRE for > truly high performance computing devices is spread over so many units... > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Thu Apr 17 03:24:17 2003 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Thu, 17 Apr 2003 09:24:17 +0200 Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: References: Message-ID: <20030417092417S.hanzl@unknown-domain> > We have a Scyld Beowulf cluster currently running on 28cz-4 (we are > getting -5 soon). We have been running into a lot of problems with > users that are trying to run scripts on the child nodes. To start > with, what is the best way to run serial (non-MPI) programs? > ... > Say I have a perl script. (I NFS mount /usr /lib etc. 
on the child > nodes) > > I want to run this perl script on N nodes with N DIFFERENT arguments. For these types of jobs, we are using SGE on a Scyld-like cluster (we are using HDDCS, which is a variant of Clustermatic, which is similar to Scyld, but this should not matter here). SGE is a quite nice, open-source batch spooling system. Using it with a Scyld-like cluster for this type of job is a bit tricky but quite easy. We just create one 'queue' for every slave node and use the node number as the queue name. Then we use a 'starter method' script like this:

file /usr/local/bin/sge-bproc-starter-method:

#!/bin/sh
bpsh $QUEUE $*

All these queues are defined as running on the master node, but the starter method in fact moves the perl scripts onto the individual slave nodes. To run scripts on N nodes with N DIFFERENT arguments, you may use 'array jobs' or submit many individual jobs. (And there is much more you can do with SGE, I highly recommend it.) Regards Vaclav Hanzl _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From scheinin at crs4.it Thu Apr 17 03:56:48 2003 From: scheinin at crs4.it (Alan Scheinine) Date: Thu, 17 Apr 2003 09:56:48 +0200 Subject: [Linux-ia64] Itanium gets supercomputing software Message-ID: <200304170756.h3H7umB02357@dali.crs4.it> Greg Lindahl wrote: Good compilers have instruction scheduling that does this on other chips. While it's easier to understand what's going on when the parallelism is explicit, you'll find that scientific codes get a pretty amazing number of instructions per cycle on quite a few cpus and compilers. The promise of EPIC was that it would be easier to do this. You'll have to talk to some compiler people to find out if they think it was easier. The ones I know hate EPIC with a passion. ================================================= I do not think there was a promise that getting efficiency would be easier with EPIC. My understanding of the situation is that the logic of dynamic allocation of resources, that is, the various tricks done in silicon, could not scale to a large number of processing units on a chip. That is, the complexity grows faster than linear, much faster. If you take that as a postulate, then it is logical to conclude that optimization must move to the compiler. The problem is that writing a compiler to maximize efficiency is difficult. Fifteen years ago I heard a talk in which it was claimed that compiler advances developed at universities arrive in commercial compilers after a delay of ten years. More recently people tell me that the development cycle is shorter, but nonetheless, writing optimizing compilers is a very difficult task. Greg Lindahl wrote that "The ones [compiler people] I know hate EPIC with a passion". Why? Do they say that the concept is wrong, or is the problem that they cannot meet their deadlines because of the quantity of analysis that has been moved from the silicon to the compiler writer? This is not a rhetorical question; it would be interesting to learn more details from the "compiler people". By the way, it may be a good idea to develop more packages like Atlas and FFTW which optimize themselves based on the actual computer, since memory latency and other factors are variable. But then, optimizing through experimentation takes a long time.
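As a toy illustration of that optimize-by-experiment idea (enormously simpler than what Atlas or FFTW actually do), the C sketch below times two traversal orders of the same array on the machine at hand and keeps whichever is faster; the matrix size and the kernel itself are arbitrary choices made only for the sketch.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 1500

static double a[N][N];

static double sum_rowwise(void)
{
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

static double sum_colwise(void)
{
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}

static double time_it(double (*f)(void), double *result)
{
    clock_t t0 = clock();
    *result = f();
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = (double)rand() / RAND_MAX;

    double r1, r2;
    double t_row = time_it(sum_rowwise, &r1);
    double t_col = time_it(sum_colwise, &r2);

    printf("row-wise: %.3f s   column-wise: %.3f s   (sums %.3f / %.3f)\n",
           t_row, t_col, r1, r2);
    printf("chosen variant: %s\n", t_row <= t_col ? "row-wise" : "column-wise");
    return 0;
}

Real auto-tuners search a much larger space of block sizes and code variants, which is exactly why the search takes as long as it does.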
-- Alan Scheinine _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 17 08:14:38 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 17 Apr 2003 08:14:38 -0400 (EDT) Subject: beowulf in space In-Reply-To: <20030417032151.GA13826@plk.af.mil> Message-ID: On Wed, 16 Apr 2003, Art Edwards wrote: > I think I'm jumping into the middle of a conversation here, but our > branch is the shop through which most of the DoD processor programs are > managed. For real space applications there are radiation issues like > total dose hardness and single even upset that require special design > and, still, special processing. That is, you can't make these parts at > any foundry (yet). There are currently two hardened foundries through > which the most tolerant parts are fabricated. Where the commercial > market is ~100's of Billions/year, the space electronics industry is > ~200million/year. So parts are expensive, as Jim Lux says. But more > importantly, the current state-of-the-art for space processors is > several generations back. Now, with a 200 million market/year, who is > going to spend the money to build a new foundry? (anyone?) It's a huge > problem, and beowulfs in space will not give the economies of scale > necessary to move us forward. > > I don't know if this has been discussed here, but have you thought about > launch costs? They're huge. Weight, power, and mission lifetime are the > crucial factors for space. These are the reasons that so much R&D goes > into space electronics. I apologize if I have gone over old ground. Actually, this is the sort of thing that makes (as Eray pointed out) the idea of a cluster (leaving aside the COTS issue, the single-headed issue, and whether or not it could be a true "beowulf" cluster) attractive in space applications. What you (and Gerry) are saying is that the space and DoD market is stuck using specially engineered, radiation hard, not-so-bleeding-VLSI processors from what amounts to several VLSI generations ago. The parts are expensive, but the cost of building a newer better foundry for such a small and inelastic market are prohibitive, so they are the only game in town. If you have an orbital project or application that needs considerably more speed than the undoubtedly pedestrian clock of these devices can provide, you have a HUGE cost barrier to developing a faster processor, and that barrier is largely out of your (DoD) or Nasa's control -- you can only ask/hope for an industrial partner to make the investment required to up the chip generation in hardened technology with the promise of at least some guaranteed sales. You also have a known per kilogram per liter cost for lifting stuff into space, and this is at least modestly under your own control. So (presuming an efficiently parallelizable task) instead of effectively financing a couple of billion dollars in developing the nextgen hard chips to get a speedup of ten or so, you can engineer twelve systems based on the current, relatively cheap chips into a robust and fault tolerant cluster and pay the known immediate costs of lifting those twelve systems into orbit. 
Again presuming that it is for some reason not feasible to simply establish a link to earth and do the processing here -- an application for which the latency would be bad, an application that requires immediate response in a changing environment when downlink communications may not be robust. A question that you or Gerry or Jim may or may not be able to answer (with which Chip started this discussion): Are there any specific non-classified instances that you know of where an actual "cluster" (defined loosely as multiple identical CPUs interconnected with some sort of communications bus or network and running a specific parallel numerical task, not e.g. task-specific processors in several parts of a military jet) has been engineered, built, and shot into space? This has been interesting enough that if there are any, I may indeed add a chapter to the book, if/when I next actually work on it. I got dem end of semester blues, at the moment...:-) rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Thu Apr 17 01:29:58 2003 From: astroguy at bellsouth.net (c.clary) Date: Thu, 17 Apr 2003 01:29:58 -0400 Subject: beowulf in space References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> <20030417032151.GA13826@plk.af.mil> Message-ID: <3E9E3BD6.8030706@bellsouth.net> Art Edwards wrote: >I think I'm jumping into the middle of a conversation here, but our >branch is the shop through which most of the DoD processor programs are >managed. For real space applications there are radiation issues like >total dose hardness and single even upset that require special design >and, still, special processing. That is, you can't make these parts at >any foundry (yet). There are currently two hardened foundries through >which the most tolerant parts are fabricated. Where the commercial >market is ~100's of Billions/year, the space electronics industry is >~200million/year. So parts are expensive, as Jim Lux says. But more >importantly, the current state-of-the-art for space processors is >several generations back. Now, with a 200 million market/year, who is >going to spend the money to build a new foundry? (anyone?) It's a huge >problem, and beowulfs in space will not give the economies of scale >necessary to move us forward. > >I don't know if this has been discussed here, but have you thought about >launch costs? They're huge. Weight, power, and mission lifetime are the >crucial factors for space. These are the reasons that so much R&D goes >into space electronics. I apologize if I have gone over old ground. > >Art Edwards > >On Wed, Apr 16, 2003 at 04:41:36PM -0700, Jim Lux wrote: > > >>A >> >> >>>>There's also a non-negligble cost of having more items on the "bill of >>>>materials": each different kind of part needs drawings, documentation, >>>> >>>> >>>test >>> >>> >>>>procedures, etc., a lot of which is what makes space stuff so expensive >>>>compared to the commercial parts (for which the primary cost driver is >>>> >>>> >>>that >>> >>> >>>>of sand (raw materials) and marketing) so again, systems comprised of >>>> >>>> >>>many >>> >>> >>>>identical parts have advantages. 
>>>> >>>> >>>Hmmm, so the primary cost determinant of VLSIC's is the cost of sand...? >>> >>>Verrry Eeenteresting... >>> >>>Now marketing, that I'd believe;-) >>> >>> >>Say it costs a billion dollars to set up the fab (which can be spread over >>2-3 years, probably), and maybe another half billion to design the >>processor (I don't know... 2500 work years seems like a lot, but...?)... >>How many Pentiums does Intel make? It's kind of hard to figure out just how >>many chips Intel makes in a given time (such being a critical aspect of >>their profitibility), but... >> >>consider that Intel Revenue for 2002 was about $27B.... >> >>As for marketing... in an article about P4s from April of 2001: >>Intel has told news sources that it plans to spend roughly $500 million to >>promote the new technology among software makers, and another $300 million >>on general advertising. >> >> >>Such enormous volumes are why commodity computing even works..The NRE for >>truly high performance computing devices is spread over so many units... >> >>_______________________________________________ >>Beowulf mailing list, Beowulf at beowulf.org >>To change your subscription (digest mode or unsubscribe) visit >>http://www.beowulf.org/mailman/listinfo/beowulf >> >> > > > Dear sir, Plz feel free to jump right in, nice to have you posting on this most exceptional list,( for the most part one of the best on the web, IMHO)... But you do bring to mind an excellent point.. One of endless debate since I can recall in my early days of high school science club and launching rockets and modeling ballistic scenario's at the local Wofford College computer lab time that Dr. Olds was so generous and kind to provide... What we concluded then and applies equally as to the current discussion is that cost of access to space could be greatly reduce if we changed the launch platform to that of the earliest days of high speed space research... such as the X-15 project... Some of us went on to working world married a gypsy princes and so locked into a certain destiny... Others in our class went on to places like M.I.T where they continued to pursue their space dreams... Like David Thompson founder of Orbital Research and the launch of the first commercial space rocket called Project Pegasus ... Which was, in fact, first carried into space by the same B-52 used to launch the X-15... I think recent events clearly demonstrate that there is certainly a need to re visit this equation.... Everything old is new again... "Generations come and generations go... and they have no memory." Thanks again Art, nice to have your post C.Clary Spartan sys. analyst PO 1515 Spartanburg, SC 29304-0243 Fax# (801) 858-2722 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mack.joseph at epa.gov Thu Apr 17 10:45:51 2003 From: mack.joseph at epa.gov (Joseph Mack) Date: Thu, 17 Apr 2003 10:45:51 -0400 Subject: SMP and Network Connections References: Message-ID: <3E9EBE1F.3E769C13@epa.gov> Douglas Eadline wrote: > > Just posted some more SMP tests on www.cluster-rant.com. > This time, I tested the interconnects and asked the > question "What if a dual SMP used two Ethernet connections > instead of one?" Seems to help! Take a look at: Thanks for your work and write up. I did some performance tests a few years ago on a router, using multiple copies of netpipe through a single interface to a set of nodes, to determine the effect of multiple streams on throughput on the router (this was 100Mbps ethernet). 
I found that as I increased the number of nodes connecting to the router, the throughput increased, rising above 100Mbps (when I totalled the throughput from each netpipe job). Looking at the netpipe code I saw that netpipe waits for a quiet time on the network before entering the next round of the test. Thus for 4 connections, if each instance of netpipe waited for a quiet time to run the test on the next packet size, I could (in principle) get the result of 4 connections of 100Mbps for a total of 400Mbps. I contacted the netpipe author, who sent me a preliminary version of a multi-netpipe, where multiple connections are synchronised and stepped through the range of packet sizes together. He said that it wasn't ready to use and I didn't have time to work on it myself. I never solved the multiple connection problem and wound up doing tests with a single connection. Do you know if this problem is affecting your measurements? (The report is at http://www.linuxvirtualserver.org/Joseph.Mack/performance/single_realserver_performance.html) Joe -- Joseph Mack PhD, Senior Systems Engineer, SAIC contractor to the National Environmental Supercomputer Center, ph# 919-541-0007, RTP, NC, USA. mailto:mack.joseph at epa.gov _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mof at labf.org Thu Apr 17 11:47:34 2003 From: mof at labf.org (Mof) Date: Fri, 18 Apr 2003 01:17:34 +0930 Subject: beowulf in space In-Reply-To: <3E9DFC75.50504@tamu.edu> References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> <3E9DFC75.50504@tamu.edu> Message-ID: <200304180117.35195.mof@labf.org> OK, excuse my ignorance, but what is involved in rad-hardening hardware? Is the cost really necessary, in that couldn't you put the unprotected hardware into some sort of shielded container? Or am I just being silly? :-) Mof. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Thu Apr 17 12:33:12 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Thu, 17 Apr 2003 09:33:12 -0700 (PDT) Subject: beowulf in space In-Reply-To: <200304180117.35195.mof@labf.org> Message-ID: High levels of ionizing radiation can induce shorts in semiconductor junctions... if you short a junction that has a voltage source behind it you can do serious damage to whatever is on the other side. Then you have issues like higher instances of single-bit errors, the need for all-ceramic chip packages, and probably a couple of other things I've already forgotten. Taken as a whole they make substantial redesign necessary for components that were not designed to work in this environment from the outset. The other thing to keep in mind is that hardening systems against nuclear attacks is a substantially different exercise, given the short-term nature of that particular radiation exposure... joelja On Fri, 18 Apr 2003, Mof wrote: > OK, excuse my ignorance, but what is involved in rad-hardening hardware? > Is the cost really necessary, in that couldn't you put the unprotected > hardware into some sort of shielded container? > Or am I just being silly? :-) > > Mof.
> _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Wed Apr 16 20:59:33 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Wed, 16 Apr 2003 19:59:33 -0500 Subject: beowulf in space References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> Message-ID: <3E9DFC75.50504@tamu.edu> We can consider a 10e2 (or more) cost multiplier for space-qualified hardware, excluding the design work someone like Harris does to radiation harden the processors... and memory... and glue-logic. Intel doesn't tend to make space-qualified hardware, or rad-hard hardware. They license that out to Harris and some of the research labs. Now: Using industrial-grade devices is more cost effective, and loses some of the paperwork burden (the 2 are tied intimately). But nothing's been done about radiation hardening. Which is an issue. Let's talk about radiation hardening and single-event upsets. Radiation hardening refers, generally to resistance to the effects of transient bit resets due to hits by heavy particles. (Is that the sound of RGB winding up?) Transient bit flips are one thing: You have to do error detection (and correction?) but the device recovers. In spacecraft memory, one runs almost continuous housecleaning code to detect permanent holes and remap the memory around them. This is a very important aspect of planning. If we're talking about losing enough cycles to housecleaning to drag our processing power down, are we really gaining much in "flying" a cluster? Ah, yes... speed. It's generally accepted in building flight processors, that the faster they go, the easier they are to upset. Thus , that 3GHz Pentium.... Oh. Sorry. The 2.4GHz (non-vaporware) device is significantly more prone to SEU than the Pentium I/166. Trace/mask sizing makes a difference. The finer the lines, the more prone to failure. So, once again, the old stuff (especially CMOS) outlasts the new x-ray lithography chips. OKAY. Pretty pessimistic. The real world of space-qualified processors _IS_ conservative, as changing a CPU requires a service call of a couple of hundred miles (vertical) plus the delta-v and guidance to manage to match orbits... So you get your industrial-grade devices, burn 'em in on the ground, in higher-than-expected temperatures ("accelerated life testing") and qualify your systems that way. You review the literature (Sandia National Labs has some great stuff) and decide the break-points for memory, processor and bus speeds. Overclocking is _right_out_. You design your spaceframe to accommodate adequate cooling (remember those heat-pipes for the new processors? 
Ever wonder where the technology came from? Thank the USAF.) You add some layered polyethylene and gold layers to improve hardening, and you rewrite your code to accomplish memory and (processor) register housecleaning. It's not impossible but it's not quite the same as building a 256 node COTS cluster, either. gerry Jim Lux wrote: > A > >> > There's also a non-negligble cost of having more items on the "bill of >> > materials": each different kind of part needs drawings, >> documentation, test >> > procedures, etc., a lot of which is what makes space stuff so expensive >> > compared to the commercial parts (for which the primary cost driver >> is that >> > of sand (raw materials) and marketing) so again, systems comprised >> of many >> > identical parts have advantages. >> >> Hmmm, so the primary cost determinant of VLSIC's is the cost of sand...? >> >> Verrry Eeenteresting... >> >> Now marketing, that I'd believe;-) > > > Say it costs a billion dollars to set up the fab (which can be spread > over 2-3 years, probably), and maybe another half billion to design the > processor (I don't know... 2500 work years seems like a lot, but...?)... > How many Pentiums does Intel make? It's kind of hard to figure out just > how many chips Intel makes in a given time (such being a critical aspect > of their profitibility), but... > > consider that Intel Revenue for 2002 was about $27B.... > > As for marketing... in an article about P4s from April of 2001: > Intel has told news sources that it plans to spend roughly $500 million > to promote the new technology among software makers, and another $300 > million on general advertising. > > > Such enormous volumes are why commodity computing even works..The NRE > for truly high performance computing devices is spread over so many > units... > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Thu Apr 17 12:22:47 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Thu, 17 Apr 2003 12:22:47 -0400 (EDT) Subject: beowulf in space In-Reply-To: <200304180117.35195.mof@labf.org> Message-ID: > Is the cost really necessary, in that couldn't you put the unprotected > hardware into some sort of shielded container ? that would clearly work, but would be very heavy. I'm definitely not in the field, though. and to me, a 100x multiplier makes the whole idea very dubious - why not just use fast hardware and run every task 3 times and vote for the results? obviously, there are some places where software/temporal redundancy can't be used. 
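For readers unfamiliar with the idea, the "run every task 3 times and vote" scheme mentioned above amounts to software triple-modular redundancy. A minimal sketch in C follows; compute_checksum() and run_with_tmr() are hypothetical stand-ins for whatever deterministic task is being protected, not code from any real flight system.

/* Triple-run majority voting, as in "run every task 3 times and vote".
 * compute_checksum() stands in for any deterministic computation.
 */
#include <stdio.h>
#include <stdint.h>

static uint32_t compute_checksum(const uint32_t *data, int n)
{
    uint32_t sum = 0;
    int i;
    for (i = 0; i < n; i++)
        sum = sum * 31 + data[i];   /* deterministic, so reruns should agree */
    return sum;
}

/* Returns 0 and stores the voted result if at least two runs agree,
 * -1 if all three disagree (uncorrectable upset: rerun or flag). */
static int run_with_tmr(const uint32_t *data, int n, uint32_t *voted)
{
    uint32_t a = compute_checksum(data, n);
    uint32_t b = compute_checksum(data, n);
    uint32_t c = compute_checksum(data, n);

    if (a == b || a == c) { *voted = a; return 0; }
    if (b == c)           { *voted = b; return 0; }
    return -1;
}

int main(void)
{
    uint32_t data[4] = { 1, 2, 3, 4 };
    uint32_t result;

    if (run_with_tmr(data, 4, &result) == 0)
        printf("voted result: %lu\n", (unsigned long)result);
    else
        printf("all three runs disagreed -- rerun the task\n");
    return 0;
}

On the ground all three runs of a pure function will trivially agree; the point is that on upset-prone hardware a single flipped bit in one run is outvoted by the other two, and a three-way disagreement is at least detected rather than silently propagated.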
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu Apr 17 12:41:39 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 17 Apr 2003 09:41:39 -0700 Subject: beowulf in space In-Reply-To: References: <20030417032151.GA13826@plk.af.mil> Message-ID: <5.1.0.14.2.20030417092926.030d5c80@mailhost4.jpl.nasa.gov> >If you have an orbital project or application that needs considerably >more speed than the undoubtedly pedestrian clock of these devices can >provide, you have a HUGE cost barrier to developing a faster processor, >and that barrier is largely out of your (DoD) or Nasa's control -- you >can only ask/hope for an industrial partner to make the investment >required to up the chip generation in hardened technology with the >promise of at least some guaranteed sales. You also have a known per >kilogram per liter cost for lifting stuff into space, and this is at >least modestly under your own control. So (presuming an efficiently >parallelizable task) instead of effectively financing a couple of >billion dollars in developing the nextgen hard chips to get a speedup of >ten or so, you can engineer twelve systems based on the current, >relatively cheap chips into a robust and fault tolerant cluster and pay >the known immediate costs of lifting those twelve systems into orbit. > >A question that you or Gerry or Jim may or may not be able to answer >(with which Chip started this discussion): Are there any specific >non-classified instances that you know of where an actual "cluster" >(defined loosely as multiple identical CPUs interconnected with some >sort of communications bus or network and running a specific parallel >numerical task, not e.g. task-specific processors in several parts of a >military jet) has been engineered, built, and shot into space? I was involved with development of a breadboard scatterometer ( a specialized type of radar that measures the radar reflectivity of the target (the ocean surface, in this case)) using multiple off the shelf space qualified DSP processors to get the numerical processing crunch needed. It was more a proof of concept or feasibility demonstration than a flight instrument, and designed to provide a reasonable basis for cost estimates for an eventual flight instrument. It was specifically the concept you address above: You're not going to get one special processor built custom for you at a reasonable price, but you can get a bunch of generic ones and gang em together. The "going in constraint" was that the approach had to use existing off the shelf flight qualified technology, which in this case is the rad tolerant ADSP21020 clone funded by ESA, made by Atmel/Temic. We used SpaceWire as the interconnect (it's a routable high speed serial link, based on IEEE 1355), wrote drivers that implement a subset of MPI, and did all the fancy stuff in fairly vanilla C doing the interprocessor comms with calls to the MPI-like API. The breadboard illustrated scalability (i.e. you could add and drop identical processors to achieve any desired performance; manifested as either "amount of signal processing required" or "max pulse repetition frequency handled") Interestingly, mass wasn't a big design driver (adding a processor to the cluster only adds <1kg to an instrument that already is on the order of 100kg). 
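To make the programming model concrete, the sketch below shows, in ordinary C with standard MPI calls, the kind of add-or-drop-identical-processors pattern Jim Lux describes above. It is purely illustrative: the instrument's actual MPI-like driver API, data formats, and per-pulse signal processing are not reproduced here, and names such as process_pulse, PULSE_LEN, NUM_PULSES, TAG_WORK and TAG_DONE are made up for the example. Rank 0 plays the front-end handing out pulses; the remaining ranks are interchangeable workers, so capacity scales with the number of ranks launched.

#include <mpi.h>
#include <stdio.h>

#define PULSE_LEN  256   /* samples per pulse (illustrative) */
#define NUM_PULSES 64
#define TAG_WORK   1
#define TAG_DONE   2

static double process_pulse(const double *x, int n)
{
    double acc = 0.0;
    int i;
    for (i = 0; i < n; i++)
        acc += x[i] * x[i];   /* stand-in for the real per-pulse processing */
    return acc;
}

int main(int argc, char **argv)
{
    int rank, size, i;
    double pulse[PULSE_LEN];
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "need at least one worker rank\n");
        MPI_Finalize();
        return 1;
    }

    if (rank == 0) {                 /* front-end: self-scheduling master */
        int sent = 0, done = 0, w;
        double result, total = 0.0;

        /* prime each worker with one pulse of fake sensor data */
        for (w = 1; w < size && sent < NUM_PULSES; w++, sent++) {
            for (i = 0; i < PULSE_LEN; i++) pulse[i] = (double)(sent + i);
            MPI_Send(pulse, PULSE_LEN, MPI_DOUBLE, w, TAG_WORK, MPI_COMM_WORLD);
        }
        /* collect results; hand out remaining pulses as workers free up */
        while (done < sent) {
            MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            total += result;
            done++;
            if (sent < NUM_PULSES) {
                for (i = 0; i < PULSE_LEN; i++) pulse[i] = (double)(sent + i);
                MPI_Send(pulse, PULSE_LEN, MPI_DOUBLE, st.MPI_SOURCE,
                         TAG_WORK, MPI_COMM_WORLD);
                sent++;
            }
        }
        for (w = 1; w < size; w++)   /* tell every worker to shut down */
            MPI_Send(pulse, 0, MPI_DOUBLE, w, TAG_DONE, MPI_COMM_WORLD);
        printf("processed %d pulses on %d workers, checksum %g\n",
               done, size - 1, total);
    } else {                         /* worker: interchangeable processing node */
        double result;
        while (1) {
            MPI_Recv(pulse, PULSE_LEN, MPI_DOUBLE, 0, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_DONE)
                break;
            result = process_pulse(pulse, PULSE_LEN);
            MPI_Send(&result, 1, MPI_DOUBLE, 0, TAG_WORK, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}

The self-scheduling master/worker structure is what makes adding or dropping a processor a launch-time decision rather than a code change, which is the scalability property the breadboard was built to demonstrate.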
Power was a bit of a concern (mostly because it hadn't ever been built), but the real hurdle for the review boards was just the unfamiliarity with the concept of accepting inefficiency in exchange for use of generic parts. Most spacecraft systems are very purpose designed and highly customized. >This has been interesting enough that if there are any, I may indeed add >a chapter to the book, if/when I next actually work on it. I got dem >end of semester blues, at the moment...:-) > > rgb > >-- >Robert G. Brown http://www.phy.duke.edu/~rgb/ >Duke University Dept. of Physics, Box 90305 >Durham, N.C. 27708-0305 >Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at plogic.com Thu Apr 17 14:04:03 2003 From: deadline at plogic.com (Douglas Eadline) Date: Thu, 17 Apr 2003 13:04:03 -0500 (CDT) Subject: SMP and Network Connections In-Reply-To: <3E9EBE1F.3E769C13@epa.gov> Message-ID: On Thu, 17 Apr 2003, Joseph Mack wrote: > Douglas Eadline wrote: > > > > Just posted some more SMP tests on www.cluster-rant.com. > > This time, I tested the interconnects and asked the > > question "What if a dual SMP used two Ethernet connections > > instead of one?" Seems to help! Take a look at: > > Thanks for your work and write up. > > I did some performance tests a few years ago on a router, > using multiple copies of netpipe through a single interface > to a set of nodes, to determine the effect of multiple streams > on throughput on the router (this was 100Mbps ethernet). > > I found that as I increased the number of nodes connecting > to the router, that the throughput increased, rising above > 100Mbps (when I totalled the throughput from each netpipe > job). > > Looking at the netpipe code I saw that netpipe waits for a > quiet time on the network before entering the next round of > the test. Thus for 4 connections, if each instance of > netpipe waited for a quiet time to run the test on the next > packet size, I could (in principle) get the result of 4 connections > of 100Mpbs for a total of 400Mpbs. > > I contacted the netpipe author, who sent me a preliminary version > of a multi-netpipe, where multiple connections are synchronised > and stepped through the range of packet sizes together. He > said that it wasn't ready to use and I didn't have time to > work on it myself. I never solved the multiple connection > problem and wound up doing tests with a single connection. > > Do you know if this problem is affecting your measurements? I was not aware of this, however, the netpipe data seems to indicate that when two netpipes are using the same interface, there is some degradation when compared to a single run. I'll have a look at the code as well. Thanks for the information. Doug > > (The report is at > http://www.linuxvirtualserver.org/Joseph.Mack/performance/single_realserver_performance.html) > > Joe > > -- ------------------------------------------------------------------- Paralogic, Inc. 
| PEAK | Voice:+610.814.2800 130 Webster Street | PARALLEL | Fax:+610.814.5844 Bethlehem, PA 18015 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From shewa at inel.gov Thu Apr 17 13:20:03 2003 From: shewa at inel.gov (Andrew Shewmaker) Date: Thu, 17 Apr 2003 11:20:03 -0600 Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: <20030417092417S.hanzl@unknown-domain> References: <20030417092417S.hanzl@unknown-domain> Message-ID: <200304171120.03430.shewa@inel.gov> On Thursday 17 April 2003 01:24 am, hanzl at noel.feld.cvut.cz wrote: > For these types of jobs, we are using SGE on scyld-like cluster (we > are using HDDCS which is a variant of Clustermatic which is similar to > Scyld but this should not matter here). > > SGE is quite nice and opensource batch spooling. Using it with > scyld-like cluster for this type of jobs is a bit tricky but quite > easy. We just create one 'queue' for every slave node and use node > number as a queue name. Then we use 'starter method' script like this: > > file /usr/local/bin/sge-bproc-starter-method: > > #/bin/sh > bpsh $QUEUE $* > > All these queues are defined as running on master node but starter > method in fact moves perl scripts on individual slave nodes. > > To run scripts on N nodes with N DIFFERENT arguments, you may use > 'array jobs' or submit many individual jobs. > > (And there is much more you can do with SGE, I highly recommend it.) So does SGE see the load on the slave node queues? Does it show the number of processors and total memory in qhost? I didn't realize it was so easy to use SGE on top of a bproc based system. I suppose it would take quite a bit more work to get it integrated to the point where you only needed one queue for the entire cluster. Andrew -- Andrew Shewmaker Associate Engineer, INEEL Phone: 1-208-526-1415 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 17 16:54:17 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 17 Apr 2003 16:54:17 -0400 (EDT) Subject: beowulf in space In-Reply-To: <200304180117.35195.mof@labf.org> Message-ID: On Fri, 18 Apr 2003, Mof wrote: > Ok excuse my ignorance, but what is involved in rad harding hardware ? > Is the cost really necessary, in that couldn't you put the unprotected > hardware into some sort of shielded container ? > > Or am I just being silly ? :-) Not really silly, but IIRC shielding is both difficult and expensive and sometimes actively counterproductive in space. I'm sure the NASA guys will have even more detail, but: a) Difficult, because there is a very wide range of KINDS and ENERGIES of radiation out there. Some are easy to stop, but some (like massive, very high energy nucleii or very high energy gamma rays) are not. b) Expensive, because to stop radiation you basically have to interpolate matter in sufficient density to absorb and disperse the energy via single and multiple scattering events. Some radiation has a relatively high cross section with matter and low energy and is easily stopped, but the most destructive sort requires quite a lot of shielding, which is dense and thick. 
This means heavy and occupying lots of volume, which means expensive in terms of lifting it out of the gravity well. I don't know what it costs to lift a kilogram of mass to geosynchronous orbit, but I'll bet it is a LOT. c) Counterproductive, because SOME of the kinds of radiation present are by themselves not horribly dangerous -- they have a lot of energy but are relatively unlikely to hit anything. So when they hit they kill a cell or a chromosome or a bit or something, but in a fairly localized way. However, when they hit the right densities of matter in shielding they can produce a literal shower or shotgun blast of secondary particles that ARE the right particles at the right energies to do a lot of damage (to humans or hardware). So either you need enough shielding to stop these particles and all their secondary byproducts, or you can be better off just letting those particles (probably) pass right on through, hopefully without hitting anything. Basically, we are pretty fortunate to live way down here at the bottom of several miles of atmosphere, where most of the dangerous crap hits and showers its secondary stuff miles overhead and is absorbed before it becomes a hazard. Our computer hardware is similarly fortunate. Even a mile up the radiation levels are significantly higher -- even growing up in subtropical India I was NEVER sunburned as badly as I was in a mere two hours of late afternoon exposure in Taxco, Mexico, just one mile up. A single six hour cross-country plane ride exposes you to 1/8 of the rems you'd receive, on average, in an entire year spent at ground level. God only knows what astronauts get. Maybe they bank gametes before leaving, dunno... So definitely not silly, but things are more complex than they might seem. I'm sure that if a cost-effective solution were as easy as just "more shielding" the rocket scientists (literally:-) at NASA would have already thunk it. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Thu Apr 17 16:03:21 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Thu, 17 Apr 2003 14:03:21 -0600 Subject: beowulf in space In-Reply-To: References: <20030417032151.GA13826@plk.af.mil> Message-ID: <20030417200321.GB15077@plk.af.mil> Just so you don't think that the space program is run by a bunch of out-of-the-loop dopes, we have been doing clustering, althought these are by no means beowulfs. I sent a message to one of the brightest architectural designers, who is in our branch, and I paste his reply. Please copy to him any posts/responses to this. >From Jim Lyke Pretty cool. Sure, there has been publications on SAFE, and I have submitted a longer paper for publication. Sensor and Fusion Engine (SAFE) in its best case is 96 processors, broken into 12 bussed groups (the bus a customized Futurebus+) with a Myrinet bridge. The system is small enough in scale to be serviced by a single, 16-port duplex non-blocking Myrinet crossbar. So, 12 of the hubs are occupied with the 96 processors, which are of a special design (microprogrammable with IEEE 754(?) double-precision floating point support). 
Two of the remaining four hubs are equipped with FPGA-based front-end processors, to massage real-time sensor data into the packeted formats amenable to the 96-nodes. One of the remaining two hubs is occupied by a boot processor, which distributes program loads over the network and kicks off processor groups. The final port is a user/telemetry port, which could be a simple Linux box equipped with a Myrinet card. Everything above (except Linux box) is designed to be crammed into a conduction-cooled 5x5x8 inch parallelopiped container. The Myrinet protocol was gutted and replaced with a lower latency protocol with a one sigma latency of about 2uS on messages based on the statistics of our problem. The max sustainable peak is about 12 GFLOPS, which is because the chips were built on 0.5um. The theoretic density of the system (even so) is slightly over one TFLOP/cu.ft. We are moving forward to modernize the system, but are funding limited. The ultimate barrier will be thermal. Even though we use carbon-matrix composite materials that have 5X better heat conduction than aluminum, the ultimate power densities as we encroach on >10TFLOPS/cu.ft. will overtake the ability of that material to draw heat away. There is discussion of trying to create a new type of thermal management material based on either carbon or boron nanotubes, which are claimed to beat natural diamond by about 2X. I wouldn't mind being copied on the posts/replies either. END OF LYKE Art Edwards On Thu, Apr 17, 2003 at 08:14:38AM -0400, Robert G. Brown wrote: > On Wed, 16 Apr 2003, Art Edwards wrote: > > > I think I'm jumping into the middle of a conversation here, but our > > branch is the shop through which most of the DoD processor programs are > > managed. For real space applications there are radiation issues like > > total dose hardness and single even upset that require special design > > and, still, special processing. That is, you can't make these parts at > > any foundry (yet). There are currently two hardened foundries through > > which the most tolerant parts are fabricated. Where the commercial > > market is ~100's of Billions/year, the space electronics industry is > > ~200million/year. So parts are expensive, as Jim Lux says. But more > > importantly, the current state-of-the-art for space processors is > > several generations back. Now, with a 200 million market/year, who is > > going to spend the money to build a new foundry? (anyone?) It's a huge > > problem, and beowulfs in space will not give the economies of scale > > necessary to move us forward. > > > > I don't know if this has been discussed here, but have you thought about > > launch costs? They're huge. Weight, power, and mission lifetime are the > > crucial factors for space. These are the reasons that so much R&D goes > > into space electronics. I apologize if I have gone over old ground. > > Actually, this is the sort of thing that makes (as Eray pointed out) the > idea of a cluster (leaving aside the COTS issue, the single-headed > issue, and whether or not it could be a true "beowulf" cluster) > attractive in space applications. What you (and Gerry) are saying is > that the space and DoD market is stuck using specially engineered, > radiation hard, not-so-bleeding-VLSI processors from what amounts to > several VLSI generations ago. The parts are expensive, but the cost of > building a newer better foundry for such a small and inelastic market > are prohibitive, so they are the only game in town. 
> > If you have an orbital project or application that needs considerably > more speed than the undoubtedly pedestrian clock of these devices can > provide, you have a HUGE cost barrier to developing a faster processor, > and that barrier is largely out of your (DoD) or Nasa's control -- you > can only ask/hope for an industrial partner to make the investment > required to up the chip generation in hardened technology with the > promise of at least some guaranteed sales. You also have a known per > kilogram per liter cost for lifting stuff into space, and this is at > least modestly under your own control. So (presuming an efficiently > parallelizable task) instead of effectively financing a couple of > billion dollars in developing the nextgen hard chips to get a speedup of > ten or so, you can engineer twelve systems based on the current, > relatively cheap chips into a robust and fault tolerant cluster and pay > the known immediate costs of lifting those twelve systems into orbit. > > Again presuming that it is for some reason not feasible to simply > establish a link to earth and do the processing here -- an application > for which the latency would be bad, an application that requires > immediate response in a changing environment when downlink > communications may not be robust. > > A question that you or Gerry or Jim may or may not be able to answer > (with which Chip started this discussion): Are there any specific > non-classified instances that you know of where an actual "cluster" > (defined loosely as multiple identical CPUs interconnected with some > sort of communications bus or network and running a specific parallel > numerical task, not e.g. task-specific processors in several parts of a > military jet) has been engineered, built, and shot into space? > > This has been interesting enough that if there are any, I may indeed add > a chapter to the book, if/when I next actually work on it. I got dem > end of semester blues, at the moment...:-) > > rgb > > -- > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > > -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu Apr 17 16:33:00 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 17 Apr 2003 13:33:00 -0700 Subject: [Linux-ia64] Itanium gets supercomputing software In-Reply-To: <200304170756.h3H7umB02357@dali.crs4.it> References: <200304170756.h3H7umB02357@dali.crs4.it> Message-ID: <20030417203300.GG1345@greglaptop.internal.keyresearch.com> On Thu, Apr 17, 2003 at 09:56:48AM +0200, Alan Scheinine wrote: > I do not think there was a promise that getting efficiency would > be easier with EPIC. My understanding of the situation is that > the logic of dynamic allocation of resources, that is, the various > tricks done in silicon, could not scale to a large number of > processing units on a chip. That's not what I said. I said that getting more instructions per cycle was what was supposed to be easier, and indeed, that means more compiler complexity. 
> Fifteen years ago I heard a talk in which > it was claimed that compiler advances developed at universities > arrive in commercial compilers after a delay of ten years. That's an over-generalization. For example, a lot of compiler research is done on the framework provided by Open64, which is SGI's compiler. You can get research frameworks in which you can play with a particular optimization idea, but if you want a research framework which is already a really great compiler, Open64 is the only choice. > Greg Lindahl wrote that "The ones [compiler people] I know hate EPIC > with a passion". Why? They think it's a pig with lipstick. > Do they say that the concept is wrong or > is that problem that they cannot meet their deadlines because of > the quantity of analysis that has been moved from the silicon to > the compiler writer? The lack of uptake in the marketplace meant that they had a couple of extra years to do the compiler work, so deadlines weren't a problem. Now in comparison, x86 chips are not very good compilation targets either: trying to figure out how x86 instructions actually work after they are translated into some unknown micro-ops isn't exactly easy. But I suspect that a poll of compiler people would vote for x86 over ia-64. > By the way, it may be a good idea to develop more packages like > Atlas and FFTW which optimize themselves based on the actual computer, > since memory latency and other factors are variable. But then, > optimizing through experimentation takes a long time. It is a good idea, but it's worth pointing out that Atlas and FFTW work best on machines which are either out of order, where a bad compiler isn't a problem, or on in-order machines with great compilers. I'd love to hear about how well gcc does on ia-64 with Atlas or FFTW. greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 17 16:56:34 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 17 Apr 2003 16:56:34 -0400 (EDT) Subject: beowulf in space In-Reply-To: <5.1.0.14.2.20030417092926.030d5c80@mailhost4.jpl.nasa.gov> Message-ID: On Thu, 17 Apr 2003, Jim Lux wrote: Thanks! rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 17 17:14:15 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 17 Apr 2003 17:14:15 -0400 (EDT) Subject: beowulf in space In-Reply-To: <20030417200321.GB15077@plk.af.mil> Message-ID: On Thu, 17 Apr 2003, Art Edwards wrote: > Just so you don't think that the space program is run by a bunch of > out-of-the-loop dopes, we have been doing clustering, althought these > are by no means beowulfs. I sent a message to one of the brightest > architectural designers, who is in our branch, and I paste his reply. > Please copy to him any posts/responses to this. 
Who could possibly think that, given that the first beowulf was built and named by a NASA program and that CESDIS for years housed both the list and beowulf.org (with only one short hiatus when high level program overseers became inexplicably stricken with some sort of mental disease:-)? However, this TOO looks very cool, sort of at the edge of the possible with clustering technology altogether. It is obvious that clusters are indeed making their way into space, or will be soon. I won't even ask what a "Sensor and Fusion Engine" might be -- it would be too much to hope that it would be a thermonuclear fusion engine that cannot AFAIK exist with current technology, existing anyway and preparing to really change the way we do space..:-) rgb > > >From Jim Lyke > > Pretty cool. Sure, there has been publications on SAFE, and I have > submitted a longer paper for publication. > > Sensor and Fusion Engine (SAFE) in its best case is 96 processors, > broken > into 12 bussed groups (the bus a customized Futurebus+) with a Myrinet > bridge. The system is small enough in scale to be serviced by a single, > 16-port duplex non-blocking Myrinet crossbar. So, 12 of the hubs are > occupied with the 96 processors, which are of a special design > (microprogrammable with IEEE 754(?) double-precision floating point > support). Two of the remaining four hubs are equipped with FPGA-based > front-end processors, to massage real-time sensor data into the packeted > formats amenable to the 96-nodes. One of the remaining two hubs is > occupied > by a boot processor, which distributes program loads over the network > and > kicks off processor groups. The final port is a user/telemetry port, > which > could be a simple Linux box equipped with a Myrinet card. > > Everything above (except Linux box) is designed to be crammed into a > conduction-cooled 5x5x8 inch parallelopiped container. The Myrinet > protocol > was gutted and replaced with a lower latency protocol with a one sigma > latency of about 2uS on messages based on the statistics of our problem. > The max sustainable peak is about 12 GFLOPS, which is because the chips > were > built on 0.5um. The theoretic density of the system (even so) is > slightly > over one TFLOP/cu.ft. We are moving forward to modernize the system, > but > are funding limited. The ultimate barrier will be thermal. Even though > we > use carbon-matrix composite materials that have 5X better heat > conduction > than aluminum, the ultimate power densities as we encroach on > >10TFLOPS/cu.ft. will overtake the ability of that material to draw heat > away. There is discussion of trying to create a new type of thermal > management material based on either carbon or boron nanotubes, which are > claimed to beat natural diamond by about 2X. > > I wouldn't mind being copied on the posts/replies either. > > END OF LYKE > > Art Edwards > > On Thu, Apr 17, 2003 at 08:14:38AM -0400, Robert G. Brown wrote: > > On Wed, 16 Apr 2003, Art Edwards wrote: > > > > > I think I'm jumping into the middle of a conversation here, but our > > > branch is the shop through which most of the DoD processor programs are > > > managed. For real space applications there are radiation issues like > > > total dose hardness and single even upset that require special design > > > and, still, special processing. That is, you can't make these parts at > > > any foundry (yet). There are currently two hardened foundries through > > > which the most tolerant parts are fabricated. 
Where the commercial > > > market is ~100's of Billions/year, the space electronics industry is > > > ~200million/year. So parts are expensive, as Jim Lux says. But more > > > importantly, the current state-of-the-art for space processors is > > > several generations back. Now, with a 200 million market/year, who is > > > going to spend the money to build a new foundry? (anyone?) It's a huge > > > problem, and beowulfs in space will not give the economies of scale > > > necessary to move us forward. > > > > > > I don't know if this has been discussed here, but have you thought about > > > launch costs? They're huge. Weight, power, and mission lifetime are the > > > crucial factors for space. These are the reasons that so much R&D goes > > > into space electronics. I apologize if I have gone over old ground. > > > > Actually, this is the sort of thing that makes (as Eray pointed out) the > > idea of a cluster (leaving aside the COTS issue, the single-headed > > issue, and whether or not it could be a true "beowulf" cluster) > > attractive in space applications. What you (and Gerry) are saying is > > that the space and DoD market is stuck using specially engineered, > > radiation hard, not-so-bleeding-VLSI processors from what amounts to > > several VLSI generations ago. The parts are expensive, but the cost of > > building a newer better foundry for such a small and inelastic market > > are prohibitive, so they are the only game in town. > > > > If you have an orbital project or application that needs considerably > > more speed than the undoubtedly pedestrian clock of these devices can > > provide, you have a HUGE cost barrier to developing a faster processor, > > and that barrier is largely out of your (DoD) or Nasa's control -- you > > can only ask/hope for an industrial partner to make the investment > > required to up the chip generation in hardened technology with the > > promise of at least some guaranteed sales. You also have a known per > > kilogram per liter cost for lifting stuff into space, and this is at > > least modestly under your own control. So (presuming an efficiently > > parallelizable task) instead of effectively financing a couple of > > billion dollars in developing the nextgen hard chips to get a speedup of > > ten or so, you can engineer twelve systems based on the current, > > relatively cheap chips into a robust and fault tolerant cluster and pay > > the known immediate costs of lifting those twelve systems into orbit. > > > > Again presuming that it is for some reason not feasible to simply > > establish a link to earth and do the processing here -- an application > > for which the latency would be bad, an application that requires > > immediate response in a changing environment when downlink > > communications may not be robust. > > > > A question that you or Gerry or Jim may or may not be able to answer > > (with which Chip started this discussion): Are there any specific > > non-classified instances that you know of where an actual "cluster" > > (defined loosely as multiple identical CPUs interconnected with some > > sort of communications bus or network and running a specific parallel > > numerical task, not e.g. task-specific processors in several parts of a > > military jet) has been engineered, built, and shot into space? > > > > This has been interesting enough that if there are any, I may indeed add > > a chapter to the book, if/when I next actually work on it. 
I got dem > > end of semester blues, at the moment...:-) > > > > rgb > > > > -- > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > > Duke University Dept. of Physics, Box 90305 > > Durham, N.C. 27708-0305 > > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > > > > > > > > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Apr 17 17:35:13 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 17 Apr 2003 17:35:13 -0400 (EDT) Subject: beowulf in space In-Reply-To: <20030417205423.GA15534@plk.af.mil> Message-ID: On Thu, 17 Apr 2003, Art Edwards wrote: > In this case fusion just refers to data-fusion from sensors. Data > integration and processing might capture what is meant by fusion. Signal > processing is a biggee for the Air Force. Awww, rats. Sexy name though -- "Fusion Engine"... rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wade.hampton at nsc1.net Thu Apr 17 12:57:34 2003 From: wade.hampton at nsc1.net (Wade Hampton) Date: Thu, 17 Apr 2003 12:57:34 -0400 Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: References: Message-ID: <3E9EDCFE.5060100@nsc1.net> Kristen J. McFadden wrote: >Hi, > >We have a Scyld Beowulf cluster currently running on 28cz-4 (we are >getting -5 soon). We have been running into a lot of problems with >users that are trying to run scripts on the child nodes. To start >with, what is the best way to run serial (non-MPI) programs? > >Here is the current issue I'm trying to tackle. > >Say I have a perl script. (I NFS mount /usr /lib etc. on the child >nodes) > >I want to run this perl script on N nodes with N DIFFERENT arguments. > This is sort of what we are doing. Our solution currently is: 1. custom scheduler using bproc_rfork to fork our processing jobs (up to 2 per SMP node). 2. local disk on each node: hda1 beoboot hda2 swap hda3 /tmp hda4 /usr/local 3. cache of common tools to /usr/local including: some tools from /bin, /usr/bin, /usr/local/bin /usr/lib/perl - we are not currently rsync'ing this, but we could in the future.... 4. special scripts to run our collection of software as an independent run on each node Hope this helps, -- Wade Hampton _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Thu Apr 17 16:54:23 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Thu, 17 Apr 2003 14:54:23 -0600 Subject: beowulf in space In-Reply-To: References: <20030417200321.GB15077@plk.af.mil> Message-ID: <20030417205423.GA15534@plk.af.mil> On Thu, Apr 17, 2003 at 05:14:15PM -0400, Robert G. 
Brown wrote: > On Thu, 17 Apr 2003, Art Edwards wrote: > > > Just so you don't think that the space program is run by a bunch of > > out-of-the-loop dopes, we have been doing clustering, althought these > > are by no means beowulfs. I sent a message to one of the brightest > > architectural designers, who is in our branch, and I paste his reply. > > Please copy to him any posts/responses to this. > > Who could possibly think that, given that the first beowulf was built > and named by a NASA program and that CESDIS for years housed both the > list and beowulf.org (with only one short hiatus when high level program > overseers became inexplicably stricken with some sort of mental > disease:-)? > > However, this TOO looks very cool, sort of at the edge of the possible > with clustering technology altogether. > > It is obvious that clusters are indeed making their way into space, or > will be soon. > > I won't even ask what a "Sensor and Fusion Engine" might be -- it would > be too much to hope that it would be a thermonuclear fusion engine that > cannot AFAIK exist with current technology, existing anyway and > preparing to really change the way we do space..:-) In this case fusion just refers to data-fusion from sensors. Data integration and processing might capture what is meant by fusion. Signal processing is a biggee for the Air Force. Art Edwards > > rgb > > > > > >From Jim Lyke > > > > Pretty cool. Sure, there has been publications on SAFE, and I have > > submitted a longer paper for publication. > > > > Sensor and Fusion Engine (SAFE) in its best case is 96 processors, > > broken > > into 12 bussed groups (the bus a customized Futurebus+) with a Myrinet > > bridge. The system is small enough in scale to be serviced by a single, > > 16-port duplex non-blocking Myrinet crossbar. So, 12 of the hubs are > > occupied with the 96 processors, which are of a special design > > (microprogrammable with IEEE 754(?) double-precision floating point > > support). Two of the remaining four hubs are equipped with FPGA-based > > front-end processors, to massage real-time sensor data into the packeted > > formats amenable to the 96-nodes. One of the remaining two hubs is > > occupied > > by a boot processor, which distributes program loads over the network > > and > > kicks off processor groups. The final port is a user/telemetry port, > > which > > could be a simple Linux box equipped with a Myrinet card. > > > > Everything above (except Linux box) is designed to be crammed into a > > conduction-cooled 5x5x8 inch parallelopiped container. The Myrinet > > protocol > > was gutted and replaced with a lower latency protocol with a one sigma > > latency of about 2uS on messages based on the statistics of our problem. > > The max sustainable peak is about 12 GFLOPS, which is because the chips > > were > > built on 0.5um. The theoretic density of the system (even so) is > > slightly > > over one TFLOP/cu.ft. We are moving forward to modernize the system, > > but > > are funding limited. The ultimate barrier will be thermal. Even though > > we > > use carbon-matrix composite materials that have 5X better heat > > conduction > > than aluminum, the ultimate power densities as we encroach on > > >10TFLOPS/cu.ft. will overtake the ability of that material to draw heat > > away. There is discussion of trying to create a new type of thermal > > management material based on either carbon or boron nanotubes, which are > > claimed to beat natural diamond by about 2X. 
> > > > I wouldn't mind being copied on the posts/replies either. > > > > END OF LYKE > > > > Art Edwards > > > > On Thu, Apr 17, 2003 at 08:14:38AM -0400, Robert G. Brown wrote: > > > On Wed, 16 Apr 2003, Art Edwards wrote: > > > > > > > I think I'm jumping into the middle of a conversation here, but our > > > > branch is the shop through which most of the DoD processor programs are > > > > managed. For real space applications there are radiation issues like > > > > total dose hardness and single even upset that require special design > > > > and, still, special processing. That is, you can't make these parts at > > > > any foundry (yet). There are currently two hardened foundries through > > > > which the most tolerant parts are fabricated. Where the commercial > > > > market is ~100's of Billions/year, the space electronics industry is > > > > ~200million/year. So parts are expensive, as Jim Lux says. But more > > > > importantly, the current state-of-the-art for space processors is > > > > several generations back. Now, with a 200 million market/year, who is > > > > going to spend the money to build a new foundry? (anyone?) It's a huge > > > > problem, and beowulfs in space will not give the economies of scale > > > > necessary to move us forward. > > > > > > > > I don't know if this has been discussed here, but have you thought about > > > > launch costs? They're huge. Weight, power, and mission lifetime are the > > > > crucial factors for space. These are the reasons that so much R&D goes > > > > into space electronics. I apologize if I have gone over old ground. > > > > > > Actually, this is the sort of thing that makes (as Eray pointed out) the > > > idea of a cluster (leaving aside the COTS issue, the single-headed > > > issue, and whether or not it could be a true "beowulf" cluster) > > > attractive in space applications. What you (and Gerry) are saying is > > > that the space and DoD market is stuck using specially engineered, > > > radiation hard, not-so-bleeding-VLSI processors from what amounts to > > > several VLSI generations ago. The parts are expensive, but the cost of > > > building a newer better foundry for such a small and inelastic market > > > are prohibitive, so they are the only game in town. > > > > > > If you have an orbital project or application that needs considerably > > > more speed than the undoubtedly pedestrian clock of these devices can > > > provide, you have a HUGE cost barrier to developing a faster processor, > > > and that barrier is largely out of your (DoD) or Nasa's control -- you > > > can only ask/hope for an industrial partner to make the investment > > > required to up the chip generation in hardened technology with the > > > promise of at least some guaranteed sales. You also have a known per > > > kilogram per liter cost for lifting stuff into space, and this is at > > > least modestly under your own control. So (presuming an efficiently > > > parallelizable task) instead of effectively financing a couple of > > > billion dollars in developing the nextgen hard chips to get a speedup of > > > ten or so, you can engineer twelve systems based on the current, > > > relatively cheap chips into a robust and fault tolerant cluster and pay > > > the known immediate costs of lifting those twelve systems into orbit. 
> > > > > > Again presuming that it is for some reason not feasible to simply > > > establish a link to earth and do the processing here -- an application > > > for which the latency would be bad, an application that requires > > > immediate response in a changing environment when downlink > > > communications may not be robust. > > > > > > A question that you or Gerry or Jim may or may not be able to answer > > > (with which Chip started this discussion): Are there any specific > > > non-classified instances that you know of where an actual "cluster" > > > (defined loosely as multiple identical CPUs interconnected with some > > > sort of communications bus or network and running a specific parallel > > > numerical task, not e.g. task-specific processors in several parts of a > > > military jet) has been engineered, built, and shot into space? > > > > > > This has been interesting enough that if there are any, I may indeed add > > > a chapter to the book, if/when I next actually work on it. I got dem > > > end of semester blues, at the moment...:-) > > > > > > rgb > > > > > > -- > > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > > > Duke University Dept. of Physics, Box 90305 > > > Durham, N.C. 27708-0305 > > > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > > > > > > > > > > > > > > > > -- > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Thu Apr 17 21:33:19 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Thu, 17 Apr 2003 19:33:19 -0600 Subject: beowulf in space In-Reply-To: References: <200304180117.35195.mof@labf.org> Message-ID: <20030418013319.GA15875@plk.af.mil> There are two basic strategies for hardening: Design and process. Processing involves special anneals, implants and oxide recipes that are outside standard processing and so cannot be fabbed in standard foundaries. Designs are rather old and infolve specially shaped transistors. This is an active and promising area of pursuit. If you are really interested, you can look at past December issues of IEEE Transactions on Nuclear Science. You can also attend the Nuclear and Space Radiation Effects Conference this July in Monterrey Ca. You can, in fact, shield against alot of threats. The question is whether you want to launch shielding or active electronics. Art Edwards On Thu, Apr 17, 2003 at 04:54:17PM -0400, Robert G. Brown wrote: > On Fri, 18 Apr 2003, Mof wrote: > > > Ok excuse my ignorance, but what is involved in rad harding hardware ? > > Is the cost really necessary, in that couldn't you put the unprotected > > hardware into some sort of shielded container ? > > > > Or am I just being silly ? :-) > > Not really silly, but IIRC shielding is both difficult and expensive and > sometimes actively counterproductive in space. I'm sure the NASA guys > will have even more detail, but: > > a) Difficult, because there is a very wide range of KINDS and ENERGIES > of radiation out there. 
Some are easy to stop, but some (like massive, > very high energy nucleii or very high energy gamma rays) are not. > > b) Expensive, because to stop radiation you basically have to > interpolate matter in sufficient density to absorb and disperse the > energy via single and multiple scattering events. Some radiation has a > relatively high cross section with matter and low energy and is easily > stopped, but the most destructive sort requires quite a lot of > shielding, which is dense and thick. This means heavy and occupying > lots of volume, which means expensive in terms of lifting it out of the > gravity well. I don't know what it costs to lift a kilogram of mass to > geosynchronous orbit, but I'll bet it is a LOT. > > c) Counterproductive, because SOME of the kinds of radiation present > are by themselves not horribly dangerous -- they have a lot of energy > but are relatively unlikely to hit anything. So when they hit they kill > a cell or a chromosome or a bit or something, but in a fairly localized > way. However, when they hit the right densities of matter in shielding > they can produce a literal shower or shotgun blast of secondary > particles that ARE the right particles at the right energies to do a lot > of damage (to humans or hardware). So either you need enough shielding > to stop these particles and all their secondary byproducts, or you can > be better off just letting those particles (probably) pass right on > through, hopefully without hitting anything. > > Basically, we are pretty fortunate to live way down here at the bottom > of several miles of atmosphere, where most of the dangerous crap hits > and showers its secondary stuff miles overhead and is absorbed before it > becomes a hazard. Our computer hardware is similarly fortunate. Even a > mile up the radiation levels are significantly higher -- even growing up > in subtropical India I was NEVER sunburned as badly as I was in a mere > two hours of late afternoon exposure in Taxco, Mexico, just one mile up. > A single six hour cross-country plane ride exposes you to 1/8 of the > rems you'd receive, on average, in an entire year spent at ground > level. God only knows what astronauts get. Maybe they bank gametes > before leaving, dunno... > > So definitely not silly, but things are more complex than they might > seem. I'm sure that if a cost-effective solution were as easy as just > "more shielding" the rocket scientists (literally:-) at NASA would have > already thunk it. > > rgb > > -- > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 
27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jdc at uwo.ca Thu Apr 17 22:57:40 2003 From: jdc at uwo.ca (Dan Christensen) Date: Thu, 17 Apr 2003 22:57:40 -0400 Subject: Can't run NAS Benchmark In-Reply-To: (Douglas Eadline's message of "Mon, 14 Apr 2003 12:13:41 -0500 (CDT)") References: Message-ID: <87el40cy1n.fsf@uwo.ca> Douglas Eadline writes: > You may wish to look at the BPS (Beowulf Performance Suite): > > http://www.cluster-rant.com/article.pl?sid=03/03/17/1838236 > > and: > > http://www.hpc-design.com/reports/bps1/index.html I've been trying the download links on the page http://www.plogic.com/bps for a couple of days, without success. Any idea what's up? E.g. the link to ftp://ftp.plogic.com/pub/software/bps/RPMS/bps-1.2-11.i386.rpm doesn't work. Anonymous login and "cd" work, but "dir" and "get" just freeze up. I tried it with several ftp clients and browsers, and from two hosts. Dan _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Fri Apr 18 06:17:41 2003 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Fri, 18 Apr 2003 12:17:41 +0200 Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: <3E9EDCFE.5060100@nsc1.net> References: <3E9EDCFE.5060100@nsc1.net> Message-ID: <20030418121741S.hanzl@unknown-domain> > 3. cache of common tools to /usr/local including: > some tools from /bin, /usr/bin, /usr/local/bin > /usr/lib/perl > - we are not currently rsync'ing this, but we > could in the future.... This is where cachefs could do really great job. Unfortunately there is no cachefs for linux as far as I know. Regards Vaclav Hanzl _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Fri Apr 18 07:35:04 2003 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Fri, 18 Apr 2003 13:35:04 +0200 Subject: Running perl scripts and non-mpi programs on scyld In-Reply-To: <200304171120.03430.shewa@inel.gov> References: <20030417092417S.hanzl@unknown-domain> <200304171120.03430.shewa@inel.gov> Message-ID: <20030418133504V.hanzl@unknown-domain> > > SGE is quite nice and opensource batch spooling. Using it with > > scyld-like cluster for this type of jobs is a bit tricky but quite > > easy. We just create one 'queue' for every slave node and use node > > number as a queue name. Then we use 'starter method' script like this: > > > > #/bin/sh > > bpsh $QUEUE $* > > I didn't realize it was so easy to use SGE on top of a bproc based system. It was super-easy to meet our particular requirements, there might be little more work if somebody needs more. 
> I suppose it would take quite a bit more work to get it integrated to > the point where you only needed one queue for the entire cluster. The term "queue" in SGE is rather misleading. In fact there is no queue - there is just a global set of jobs to be done (this set behaves as a queue if there is nothing else to order jobs) and a set of places where jobs execute (these places are called "queues" in SGE). In our solution, we have "one queue for the entire cluster" in the sense of a set of jobs to be done. We have however N queues in the latter sense, but all these "queues" have identical definitions and one can just copy them using a script or the qmon GUI. There is however just one sge_execd daemon running (and one 'execution host' to install - the head host). ps -auxf on the head node gives something like this:

sge_execd
 \_ sge_shepherd-6889 -bg
 |   \_ /bin/sh sge-bproc-starter-method job_scripts/6889
 |       \_ bpsh 3 job_scripts/6889
 |           \_ [6889]
 |               \_ [te]
 |                   \_ [sh]
 |                       \_ [HERest]
 \_ sge_shepherd-6888 -bg
 |   \_ /bin/sh sge-bproc-starter-method job_scripts/6888
 |       \_ bpsh 8 job_scripts/6888
 |           \_ [6888]
 |               \_ [Ser]
 |                   \_ [sh]
 |                       \_ [HVite]
...

> So does SGE see the load on the slave node queues? We blindly schedule a fixed number of jobs (1 or 2) per node. It is however possible to write a so-called "load sensor script" and see and use real loads. Creating load sensors is well documented and these scripts can use any commands like "bpsh -ap w" or "supermon". > Does it show the number of processors and total memory in qhost? No. Maybe one could trick it somehow, we did not care. We use qstat to get rough per-box information on what is going on. Load sensors could make this more exact and could probably also provide memory information. But I am happy with just "bpsh -ap free". In general, SGE is a bit confused because we used just one sge_execd instead of N. Information collected by sge_execd is incorrect as it does not run on the real execution box. The rest of the SGE setup could compensate for this if somebody cares. (Or better yet, SGE can be changed to work even better with bproc. I think bproc systems are important enough and the SGE team is nice and responsive enough for this to happen if we can demonstrate interest and propose sensible improvements.) Regards Vaclav _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Apr 18 09:04:21 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 18 Apr 2003 08:04:21 -0500 Subject: beowulf in space In-Reply-To: <200304180117.35195.mof@labf.org> References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> <3E9DFC75.50504@tamu.edu> <200304180117.35195.mof@labf.org> Message-ID: <3E9FF7D5.8030809@tamu.edu> Generally, hardening takes a lot more than a sealed box, although packaging of the processor is part of it. It involves considerations of trace sizing on the die, deposition thickness, and a number of other things. Generally, it takes a lot of work to test and certify a lot of processors as to their rad-hardness. There's really a lot of effort that goes into it. gerry Mof wrote: > Ok excuse my ignorance, but what is involved in rad harding hardware ? > Is the cost really necessary, in that couldn't you put the unprotected > hardware into some sort of shielded container ? > Or am I just being silly ? :-) > > Mof.
> _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Thu Apr 17 19:53:41 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Fri, 18 Apr 2003 02:53:41 +0300 Subject: beowulf in space In-Reply-To: <20030417200321.GB15077@plk.af.mil> References: <20030417032151.GA13826@plk.af.mil> <20030417200321.GB15077@plk.af.mil> Message-ID: <200304180253.41114.exa@kablonet.com.tr> On Thursday 17 April 2003 23:03, Art Edwards wrote: > Sensor and Fusion Engine (SAFE) in its best case is 96 processors, > broken > into 12 bussed groups (the bus a customized Futurebus+) with a Myrinet > bridge. The system is small enough in scale to be serviced by a single, > 16-port duplex non-blocking Myrinet crossbar. So, 12 of the hubs are > occupied with the 96 processors, which are of a special design > (microprogrammable with IEEE 754(?) double-precision floating point > support). !!! My! Maybe I can go to space as a parallel programming researcher after all! (^_^) Seriously, this is pretty cool stuff :) We know what to say when they ask "are there supercomputers in space?" Cheers, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ron_chen_123 at yahoo.com Thu Apr 17 23:27:13 2003 From: ron_chen_123 at yahoo.com (Ron Chen) Date: Thu, 17 Apr 2003 20:27:13 -0700 (PDT) Subject: rocks cluster -- SGE preconfigured Message-ID: <20030418032713.69875.qmail@web41311.mail.yahoo.com> SGE support for IA64 and IA32 on rocks cluster. -Ron --- "Matthew C.H. Lee" wrote: > Are you trying to build a beowulf type of cluster or > just want to have a > load managing software for your site to manage the > collection of > heterogeneous workstations? If you are building a > beowulf cluster, you > might also want to check out Rocks > > rocks.npaci.edu > > The latest version already has SGE preconfigured and > ready to run out of the > box. People in my lab without prior cluster or sys > admin experience were > able to build a function cluster using Rocks within > ~ 1 hr. Very cool > stuff. > > -- Matt > > ----- Original Message ----- > From: "Ron Chen" > To: "Benjamin Goldsteen" > > Cc: "Matthew C. H. Lee" ; > ; > > Sent: Thursday, April 17, 2003 9:26 PM > Subject: Re: [PBS-USERS] cost for educational sites > > > > Some users told me that PBSPro is still "free", > but > > now they charge for support. > > > > However, if you want to get PBSPro, you *must* pay > for > > support. PBS developers, please correct me if that > is > > wrong. 
> > > > GridEngine is much, much better than OpenPBS: > > > > 1) it has job arrays > > 2) Better fault tolerance features such as shadow > > master and automatically job rerun. > > 3) better scheduler performance and scheduling > > policies. > > 4) Better platform support, including AIX, > FreeBSD, > > HP-UX, MacOSX, Tru64, Solaris, Linux, Cray, NEC > > SX-5/6, IRIX, and initial support for Win2K. > > > > Even with 70,000 submitted jobs, SGE is able to > handle > > that easily, and some sites even tried with as > many as > > 600,000-task job array, and further, it can handle > > 1,300 hosts. > > > > Notes that those are in production environments, > and > > the numbers user reported are not actual limits. > > > > And you can get commercial support: > > > http://wwws.sun.com/software/gridware/partners/index.html > > > > And you can see the list is very popular: > > > http://gridengine.sunsource.net/servlets/SummarizeList?listName=users > > > > And people are switching from L$F to GridEngine: > > http://www.veus.hr/linux/gemonitor.html > > > > -Ron > > > > > > --- Benjamin Goldsteen wrote: > > > Hi Ron and Matthew, > > > Any update on the new policy? If they now plan > to > > > charge .edu for what should > > > be free and open-source under the original PBS > terms > > > then I plan to support > > > another company or product. If I am going to > pay > > > money, I will pay that money > > > to a company like LSF rather than this company > which > > > takes a government > > > supported project and violates its original > terms. > > > > > > Otherwise, HPC and .edu should put its efforts > into > > > enhancing OpenPBS, SGE, > > > etc. I don't know why a company would drive the > > > .edu/HPC market into supporting > > > competing free products, but I think that is > what's > > > going to happen here. > > > -- > > > Benjamin Z. Goldsteen > > > Physiology & Biophysics > > > Mount Sinai School of Medicine > > > 212-241-1614 / 212-860-3369 (FAX) > > > > > > > > > __________________________________________________ > > Do you Yahoo!? > > The New Yahoo! Search - Faster. Easier. Bingo > > http://search.yahoo.com > > > __________________________________________________________________________ > > To unsubscribe: email majordomo at OpenPBS.org with > body "unsubscribe > pbs-users" > > For message archives: > http://www.OpenPBS.org/UserArea/pbs-users.html > > - - - - - - - - - - > - - - - > > OpenPBS and the pbs-users mailing list are > sponsored by Altair. > > > __________________________________________________________________________ > __________________________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. Bingo http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dtj at uberh4x0r.org Fri Apr 18 11:42:39 2003 From: dtj at uberh4x0r.org (Dean Johnson) Date: 18 Apr 2003 10:42:39 -0500 Subject: beowulf in space In-Reply-To: <3E9FF7D5.8030809@tamu.edu> References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> <3E9DFC75.50504@tamu.edu> <200304180117.35195.mof@labf.org> <3E9FF7D5.8030809@tamu.edu> Message-ID: <1050680559.25053.15.camel@terra> Howdy all, Is it just me, or does anybody else have the problem when reading the "beowulf in space" subject line, you hear that echoey 60's sci-fi movie sound effect? 
Okay, maybe I need some sleep or something. ;-) -Dean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From chip at chip.bellsouth.net Fri Apr 18 13:17:47 2003 From: chip at chip.bellsouth.net (chip) Date: Fri, 18 Apr 2003 13:17:47 -0400 (EDT) Subject: beowulf in space Message-ID: Dean Johnson wrote: Howdy all, Is it just me, or does anybody else have the problem when reading the "beowulf in space" subject line, you hear that echoey 60's sci-fi movie sound effect? Okay, maybe I need some sleep or something. ;-) -Dean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf Hi Dean, No, I don't think it's you... I've got that same sound coming thru my box and I've had plenty of time to sleep on it. But you would actually have had to live in the 1960's to understand what the sound was like. A flight to the moon was considered reckless not to mention impossible and at the time considering the state of technology, it was certainly risky business. Flying a mission thru the Sun's outer atmosphere certainly presses our technology to the extreme of theoretical limits... Not just the processing power of the on board beowulfy type cluster processors necessary for such a mission to succeed but the velocity that would be required is on the order of 10 times what we have yet to achieve, but theoretically possible based on a space tested Ion design. The shielding would have to incorporate a combination of our most advanced composites, not to mention the electrostatic field that would have to be generated to protect the sensitive sensor array would perhaps come close to what Dr. Brown half jokingly describes as a fusion engine... since it would require an inverted plasma bubble... Yep I would say it is a purdy near impossible mission... but not outside our theoretical limits... It's enticing in its appeal... And in the same manner as the missions to the moon in the 60's... It's got that mythic quality about it that tends to capture the imagination... sends a chill of excitement in the confrontation of overwhelming and impossible odds... From our not so distant past it has a sound that is hauntingly familiar... something the thunder of the sound barrier perhaps booming from the past... It rings history heroic... It's a great sound, I still love that sixties music :-) C.Clary Spartan sys. analyst PO 1515 Spartanburg, SC 29304-0243 Fax# (801) 858-2722 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbbernard at eng.uab.edu Fri Apr 18 13:57:44 2003 From: jbbernard at eng.uab.edu (jbbernard at eng.uab.edu) Date: Fri, 18 Apr 2003 12:57:44 -0500 Subject: Google's cluster Message-ID: <836A226C5200104C8A4AFDB31F8529BD2316A7@engem0.eng.uab.edu> In the past there's been some discussion on the list about the hardware required for Google to work its magic. I recently came across this talk by Urs Hoelzle of Google, given last year at the Univ of Washington.
http://www.cs.washington.edu/info/videos/asx/colloq/UHoelzle_2002_11_05.asx Jon _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Fri Apr 18 13:56:31 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Fri, 18 Apr 2003 10:56:31 -0700 Subject: beowulf in space In-Reply-To: References: <200304180117.35195.mof@labf.org> Message-ID: <5.1.0.14.2.20030418104052.0303c7d8@mailhost4.jpl.nasa.gov> I have to say that while this discussion is straying a bit from the usual enjoyable stuff on the list about switches, interconnects, and how fast one processor or another is, I find it gratifying that folks are coming up with creative ideas, and thinking about other applications for Beowulves than just computer rooms full of rackmounted computers. Look back on the growth of cluster computing in the overall supercomputing business. Did anyone think, back in 1995, that there would be the penetration there is today? My hope is that the same can occur for space applications, which, while different in details, have many of the same drivers. And on to my comments interspersed within... At 04:54 PM 4/17/2003 -0400, Robert G. Brown wrote: >On Fri, 18 Apr 2003, Mof wrote: > > > Ok excuse my ignorance, but what is involved in rad harding hardware ? > > Is the cost really necessary, in that couldn't you put the unprotected > > hardware into some sort of shielded container ? > > > > Or am I just being silly ? :-) > >Not really silly, but IIRC shielding is both difficult and expensive and >sometimes actively counterproductive in space. I'm sure the NASA guys >will have even more detail, but: > > a) Difficult, because there is a very wide range of KINDS and ENERGIES >of radiation out there. Some are easy to stop, but some (like massive, >very high energy nucleii or very high energy gamma rays) are not. Precisely the case. There's two aspects: total dose, which gradually degrades the components, and single event effects (SEE), which come from the "one big fast (high energy) particle" kind of things. SEEs can be either transient (upsets) or permanent (Gate rupture). Total dose is talked about in terms of kiloRads or MegaRads (Yeah, I know the real units are Grays and Sieverts, but we work in rads for historical reasons). And, of course, taking dose and trying to collapse it into a single number ignores important things like the energy spectrum and dose rate effects (some degradation processes are enhanced and others reduced at low dose rates) Single events are usually talked about in terms of Linear Energy Transfer (LET).. typically some number of MeV per cm, etc. A neutrino may have high energy, but because it won't hit anything, it doesn't transfer any energy to the victim circuit, hence have low LET. On the other hand, a big old heavy ion, moving slow, has a very high collision cross section, so the LET might be quite high. LET is sort of a way to represent a combination of particle energy and reaction cross-section. > b) Expensive, because to stop radiation you basically have to >interpolate matter in sufficient density to absorb and disperse the >energy via single and multiple scattering events. Some radiation has a >relatively high cross section with matter and low energy and is easily >stopped, but the most destructive sort requires quite a lot of >shielding, which is dense and thick. 
This means heavy and occupying >lots of volume, which means expensive in terms of lifting it out of the >gravity well. I don't know what it costs to lift a kilogram of mass to >geosynchronous orbit, but I'll bet it is a LOT. $100K/kg is a nice round number... Of more significance is that launch capability comes in chunks. You might have 300kg of lift, and if your box winds up being 320kg, you have to buy the next bigger rocket, at a substantial cost increase. At an early stage, your mass budget gets set according to your dollar budget. The mission designer divvies up the mass budget among all the folks clamoring for it (so many kg for attitude control, so many kg for thermal management, so many kg for instruments, etc.) holding a bit back in reserve (because systems ALWAYS get heavier), so that when the inevitable happens, they can still buy the cheaper rocket. > c) Counterproductive, because SOME of the kinds of radiation present >are by themselves not horribly dangerous -- they have a lot of energy >but are relatively unlikely to hit anything. So when they hit they kill >a cell or a chromosome or a bit or something, but in a fairly localized >way. However, when they hit the right densities of matter in shielding >they can produce a literal shower or shotgun blast of secondary >particles that ARE the right particles at the right energies to do a lot >of damage (to humans or hardware). So either you need enough shielding >to stop these particles and all their secondary byproducts, or you can >be better off just letting those particles (probably) pass right on >through, hopefully without hitting anything. Scattering is one of those horrible things.. adding shielding might make things worse. And the real problem is that it is very, very difficult to model accurately. So we make approximations (spherical shells, etc.), add a bit of margin, and go from there. Think of this.. high energy neutrons are actually safer than thermalized neutrons, for human exposure (looking at RBE numbers), because the cross section is much higher for thermalized neutrons... they're slower. Same kinds of things apply to electronics. >Basically, we are pretty fortunate to live way down here at the bottom >of several miles of atmosphere, where most of the dangerous crap hits >and showers its secondary stuff miles overhead and is absorbed before it >becomes a hazard. Our computer hardware is similarly fortunate. Even a >mile up the radiation levels are significantly higher -- even growing up >in subtropical India I was NEVER sunburned as badly as I was in a mere >two hours of late afternoon exposure in Taxco, Mexico, just one mile up. >A single six hour cross-country plane ride exposes you to 1/8 of the >rems you'd receive, on average, in an entire year spent at ground >level. God only knows what astronauts get. Maybe they bank gametes >before leaving, dunno... IBM did a bunch of studies a while back comparing DRAM error rates at computers installed at sea level and in Colorado and in a mine in Colorado, and found significant (in a statistical sense) differences. >So definitely not silly, but things are more complex than they might >seem. I'm sure that if a cost-effective solution were as easy as just >"more shielding" the rocket scientists (literally:-) at NASA would have >already thunk it. All it takes is money... James Lux, P.E. 
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Apr 18 19:41:59 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 18 Apr 2003 18:41:59 -0500 Subject: beowulf in space Warning: 'WAY OT In-Reply-To: <1050680559.25053.15.camel@terra> References: <5.1.0.14.2.20030416144219.03045948@mailhost4.jpl.nasa.gov> <5.1.0.14.2.20030416163146.01979a28@mailhost4.jpl.nasa.gov> <3E9DFC75.50504@tamu.edu> <200304180117.35195.mof@labf.org> <3E9FF7D5.8030809@tamu.edu> <1050680559.25053.15.camel@terra> Message-ID: <3EA08D47.2040304@tamu.edu> You asked for it. about 8 years ago, in response to a need to better and more accurately track cattle for a USDA-sponsored entomology research project, Cows In Space was born (or borne, as the case may be). There were lots of 68030's moooving around the pasture, all reporting back to a head node made of a Pentium-I. All of the primary data was provided by direct sequence spread-spectrum signalling at L-Band. Biasing and additional input data was provided by a low-speed VHF datalink. ALthough the general application was NUMA, in fact, all of the processors had a uniform architecture and memory distribution. These were GPS receivers. On cows. With differential corrections data provided. For the record, we were able to determine the animals' location on a 30-sec interval with sub-bovine accuracy. Sorry. It just HAD to be told. gerry Dean Johnson wrote: > Howdy all, > Is it just me, or does anybody else have the problem when reading the "beowulf in space" > subject line, you hear that echoey 60's sci-fi movie sound effect? Okay, maybe I need some > sleep or something. ;-) > > -Dean > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sat Apr 19 14:54:59 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sat, 19 Apr 2003 14:54:59 -0400 (EDT) Subject: beowulf in space Warning: 'WAY OT In-Reply-To: <3EA08D47.2040304@tamu.edu> Message-ID: On Fri, 18 Apr 2003, Gerry Creager N5JXS wrote: > You asked for it. > > about 8 years ago, in response to a need to better and more accurately > track cattle for a USDA-sponsored entomology research project, Cows In > Space was born (or borne, as the case may be). There were lots of > 68030's moooving around the pasture, all reporting back to a head node > made of a Pentium-I. All of the primary data was provided by direct > sequence spread-spectrum signalling at L-Band. Biasing and additional > input data was provided by a low-speed VHF datalink. 
ALthough the > general application was NUMA, in fact, all of the processors had a > uniform architecture and memory distribution. > > These were GPS receivers. On cows. With differential corrections data > provided. For the record, we were able to determine the animals' > location on a 30-sec interval with sub-bovine accuracy. > > Sorry. It just HAD to be told. You'll be sorry. I'm recording all of this in my Src/beowulf_book/List_Ideas/Space file. One day Cows In Space (as told by a certain GC:-) will be on the tongues of all humans on the planet...or at least all of those interested in building clusters. It's pretty clear that there is a chapter in all this, but I've spent some 24 out of the last 36 hours rewriting the brahma website in php. That (by the way) is almost done -- people who have used http://www.phy.duke.edu/brahma in the past might check it out again at http://www.phy.duke.edu/brahma/index.php and comment if they so desire. When my energy banks recharge, perhaps I'll tackle the Cows. So to speak... rgb > > gerry > > Dean Johnson wrote: > > Howdy all, > > Is it just me, or does anybody else have the problem when reading the "beowulf in space" > > subject line, you hear that echoey 60's sci-fi movie sound effect? Okay, maybe I need some > > sleep or something. ;-) > > > > -Dean > > > > _______________________________________________ > > Beowulf mailing list, Beowulf at beowulf.org > > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sat Apr 19 19:49:39 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sat, 19 Apr 2003 19:49:39 -0400 (EDT) Subject: Brahma Site Officially Rebuilt Message-ID: Dear All, The brahma website has just been completely redone in php so that it looks much prettier and is much easier to navigate. I have worked very hard to validate all the links, although I'm sure that some links are still broken or missing from the previous site. I have also written a moderately detailed description of the various clusters that are part of the "brahma" project. The beowulf engineering book is still there (in three forms), and any old bookmarks to it should be forwarded. The links and vendors lists are updated and augmented with new entries. Software, talks and papers, and other resources are much better organized. Nearly everything has meta tags that should make the associated resource more visible to search engines on campus or off. The old site is even there, preserved in its entirety, in case there are things you rely on or find in a search engine that failed to get moved (yet). Beowulf list persons who have used the site in the past are invited to revisit it, browse it, and update their bookmarks or URL's. Beowulf-associated managers or vendors with sites of their own are invited to check it out and send me links or update requests if your site is missing or broken. DBUG persons on campus are invited to visit it and the DBUG site and to REGISTER THEIR CLUSTERS on the DBUG site if they have not already done so. 
DULUG persons on campus who are interested in cluster computing or interested in learning about a bit of what is being done with linux at the high performance computing edge are invited to visit it just to check it out for fun. Finally, physics department persons who use the cluster or operate one of the subclusters in brahma are invited to send me comments, entries for the various (woefully incomplete) tables crossreferencing personnel that use the cluster or research group pages that detail some of what is being done with the cluster. Look especially under the Users, Clusters and Research pages and let me know if you find egregious errors or want to send me updated information for the tables. Thank you, rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Sun Apr 20 13:05:31 2003 From: astroguy at bellsouth.net (astroguy at bellsouth.net) Date: Sun, 20 Apr 2003 13:05:31 -0400 Subject: beowulf in space Message-ID: <20030420170531.BUSW1506.imf45bis.bellsouth.net@mail.bellsouth.net> PS. I'm not sure if this got out, as I haven't been getting mail out cause I've been fiddling with my pop and IMAP system settings... forever tinkering. But I have to agree this topic has spilled over into areas that this beowulfy seldom visits but if we are going to fly the cluster into the hostile environment of space all these areas must be considered... the thermal as well as a combination of electrostatic and magnetic shielding .... and sure, as Jim points out it is going to take money... but perhaps even more than money... Is the passion (in the 60's it was electric and contagious)... the commitment and enthusiastic support of another generation will have to have their imaginations stirred as to the complexity and daunting challenge of the goals of space exploration and research, not just in terms of gold and treasure but in the blood, sweat, and as recent events again revisit the tears of it all.. As space exploration is forever to remain the most risky business... But it is also the most exciting and noble of all human adventures. > > From: chip > Date: 2003/04/18 Fri PM 01:17:47 EDT > To: Dean Johnson > CC: Beowulf at beowulf.org > Subject: Re: beowulf in space > > Dean Johnson wrote: > > Howdy all, > Is it just me, or does anybody else have the problem when reading the > "beowulf in space" > subject line, you hear that echoey 60's sci-fi movie sound effect? Okay, > maybe I need some > sleep or something. ;-) > > -Dean > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > Hi Dean, > No, I don't think its you... I've got that same sound coming thru my box > and I've had plenty of time to sleep on it. But you would have actually > have had to live in the 1960's to understand what the sound was like. A > flight to the moon was considered reckless not to mention impossible and > at the time considering the state of technology, it was certainly risky > business. Flying a mission thru the Sun's outer atmosphere certainly > presses our technology to the extreme of theoretical limits...
Not just > the processing power of the on board beowulfy type cluster processors > necessary for such a mission to succeed but the velocity that would > required is on the order of 10 times of what we have yet to achieve, but > theoretically possible based on a space tested Ion design. The shielding > would have incorporate a combination of our most advanced composites and > not to mention the electrostatic field that would have to be generated to > protect the sensitive sensor array would perhaps come close to what Dr. > Brown half jokingly describes as a fusion engine... since it would > required an inverted plasma bubble... Yep I would say it is a purdy near > impossible mission... but not outside our theoretical limits... It's > enticing in its appeal... And in the same manner as the missions to the > moon in the 60's... It's got that mythic quality about it that tends to > capture the imagination... sends a chill of excitement in the > confrontation of overwhelming and impossible odds... From our not so > distant past it has a sound that is hauntingly familiar... something the > thunder of the sound barrier perhaps booming from the past... It rings > history heroic... Its a great sound, I still love that sixties music :-) > > C.Clary > Spartan sys. analyst > PO 1515 > Spartanburg, SC 29304-0243 > > Fax# (801) 858-2722 > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Mon Apr 21 16:46:34 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Mon, 21 Apr 2003 15:46:34 -0500 Subject: back to the issue of cooling Message-ID: <3EA82302@itsnt5.its.uiowa.edu> Sorry to keep kicking a dead horse guys, but the issue of increasing thermal efficiency in large clusters and data centers has been keeping me awake at nights. Has anyone tried to use a stirling engine or other system for instance: http://www.stmpower.com/Technology/Technology.asp that can take as input pure heat, not just potential in the form of btus, in order to recover some of the heat energy that would simply be wasted at a large facility. Could this be economically viable? Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Apr 21 18:23:10 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 21 Apr 2003 18:23:10 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: <3EA82302@itsnt5.its.uiowa.edu> Message-ID: > Sorry to keep kicking a dead horse guys, but the issue of increasing > thermal efficiency in large clusters and data centers has been keeping > me awake at nights. wow. you need to get a cluster of your own to worry about ;) > Has anyone tried to use a stirling engine or other afaikt, this sort of thing depends on the presence of a substantial temperature differential, not just a lot of energy. 
my machineroom dissipates around 35 kW, but the return air isn't supposed to get above about 30C (unlike today, when we hit 35.8 :( ) I have a vague recollection that the efficiency of heat engines is strongly dependent on the temp differential, which would be only about 20C assuming our chilled water stayed chilled... > order to recover some of the heat energy that would simply be wasted at a > large facility. Could this be economically viable? I think it would be more effective to reduce at the source. for instance, my ES40's have 2-of-3 redundant power supplies, which seem to dissipate a lot more heat than another cluster which has 1-of-2. of course, the mere fact that we're using Alphas is a declaration of war on coolness ;) I hear those Opterons are pretty cool... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at math.ucdavis.edu Tue Apr 22 01:35:51 2003 From: bill at math.ucdavis.edu (Bill Broadley) Date: Mon, 21 Apr 2003 22:35:51 -0700 Subject: Opteron announcement Message-ID: <20030422053550.GA6923@sphere.math.ucdavis.edu> Apparently the link to http://www.amd.com/opteronservers just went live. Tons of cool docs/benchmarks.

SPECfp rate 2000 (dual cpu)
================
it2-1.0    30.7
amd-244    26.7
amd-242    25.1
amd-240    22.7
Xeon-2.8   14.7

SPECfp_peak 2000 (single cpu)
================
it2-1.0    1431
amd-144    1219
Xeon-3.06  1103

SPECint_peak 2000 (single cpu)
=================
Amd-144    1170
Xeon-3.06  1130
IT2-1.0     719

SPECint_rate 2000 (dual cpu)
=================
amd-244    26.8
amd-242    24.0
amd-240    21.2
Xeon-2.8   19.6
it2-900    15.5

SPECint_rate2000 (windows) 4p
=============================
amd-844      48.5
amd-852      45.1
amd-840      40
Xeon MP 2.0  34.7
it2-1.0      32.9

SPECfp_rate2000 4P
===================
it2-1.0    49.3
amd-844    49.2
amd-842    45.0
amd-840    40.7
Xeon-2.0   20.2

Oh and one more interesting link: Software Optimization Guide for AMD athlon 64 and AMD Opteron Processors http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_739_7203,00.html Amusingly, all the submissions that I looked at the full reports for use the Intel compiler. So the Opteron's extra registers are ignored. Time will tell if 3rd party compilers that fully utilize the additional registers can win benchmarks against Intel's compiler. Based on the preliminary pricing I have, the Opterons look to make for very nice beowulf nodes. -- Bill Broadley Mathematics UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Apr 22 09:36:52 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 22 Apr 2003 09:36:52 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: <3EA82302@itsnt5.its.uiowa.edu> Message-ID: On Mon, 21 Apr 2003, jbassett wrote: > Sorry to keep kicking a dead horse guys, but the issue of increasing > thermal efficiency in large clusters and data centers has been keeping > me awake at nights. Has anyone tried to use a stirling engine or other > system > for instance: > > http://www.stmpower.com/Technology/Technology.asp > > that can take as input pure heat, not just potential in the form of btus, in > order to recover some of the heat energy that would simply be wasted at a > large facility. Could this be economically viable? In almost all cases, no.
It's the problem with heat -- you have to pay for it to get it where you want it, then you have to pay for it again to get rid of it when it is where you DON'T want it. I won't inflict a full review of the laws of thermodynamics on the list, but the relevant one (second) here says that you can only extract work when running e.g. a heat engine between two reservoirs, one "hot", one "cold(er)". Even then, one can only extract strictly less than \eta = \frac{T_h - T_c}{T_h} (degrees kelvin only) of the energy in the heat that flows from hot to cold through your engine. Even to start with, this makes it hardly worth it. You're trying NOT to allow T_h to exceed 340K at the CPU (the room itself would need to be far colder or you'd be in deep trouble out of the gate); you'd have to work very hard (and spend a lot of energy and money!) to come up with a "free" cold reservoir at T_c = 290K (got a glacier or springfed lake handy?). So you could recover at most about 15% of the heat energy from the CPUs themselves (\eta = (340 - 290)/340 \approx 0.15), probably not enough to run the pumps from your "free" cold reservoir. The only way to recover any fraction at all is to use your A/C as a "heat pump" in the wintertime and pump the heat elsewhere in your building where it could be of use. A modern new building facility might well do that, if the architect designed things appropriately for that purpose from the beginning. It would be quite difficult to cost-effectively retrofit in most other environments. This would still cost money to operate, but you'd get a gain on the investment as the coefficient of performance of your heat pump/AC could be 3-5 (giving you a solid gain on the energy used). The BEST way to remember the second law is that it says that "There ain't no such thing as a free lunch" (tanstaafl). So any clever scheme to get something for nothing will almost certainly fail. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jcownie at etnus.com Tue Apr 22 07:12:28 2003 From: jcownie at etnus.com (James Cownie) Date: Tue, 22 Apr 2003 12:12:28 +0100 Subject: beowulf in space In-Reply-To: Message from Jim Lux of "Fri, 18 Apr 2003 10:56:31 PDT." <5.1.0.14.2.20030418104052.0303c7d8@mailhost4.jpl.nasa.gov> Message-ID: <197vhM-3Yi-00@etnus.com> As you no doubt already know, some people have used standard (non-rad-hardened) CPUs in satellite applications. For instance Clementine used a MIPS R3081 for its sensor interface processor. (See table 9 in http://www.google.com/search?q=cache:UWVj-wMWHLUC:www.pxi.com/praxis_publicpages/pdfs/Lun_Orb_Alabama.pdf+clementine+MIPS+computer+&hl=en&ie=UTF-8 ) Of course lunar orbit is likely a lower radiation environment than low-earth orbit, and there were two lower level processors for basic control which _were_ rad-hardened. -- Jim James Cownie Etnus, LLC.
+44 117 9071438 http://www.etnus.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Tue Apr 22 00:45:23 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 21 Apr 2003 21:45:23 -0700 Subject: back to the issue of cooling References: Message-ID: <000c01c30889$fce650c0$02a8a8c0@office1> It always takes energy to move the heat against temperature differential. (one of those pesky laws of thermodynamics) So, to use the waste heat from your cluster to move that heat outside would require the addition of extra energy. ----- Original Message ----- From: "Mark Hahn" To: "jbassett" Cc: Sent: Monday, April 21, 2003 3:23 PM Subject: Re: back to the issue of cooling > > Sorry to keep kicking a dead horse guys, but the issue of increasing > > thermal efficiency in large clusters and data centers has been keeping > > me awake at nights. > > wow. you need to get a cluster of your own to worry about ;) > > > Has anyone tried to use a stirling engine or other > > afaikt, this sort of thing depends on the presence of a substantial > temperature differential, not just a lot of energy. my machineroom > dissipates around 35 KW, but the return air isn't supposed to get > above about 30C (unlike today, when we hit 35.8 :( ) > > I have a vague recollection that the efficiency of heat engines is > strongly dependent on the temp differential, which would be only about 20C > assuming our chilled water stayed chilled... > > > order to recover some of the heat energy that would simply be wasted at a > > large facility. Could this be economically viable? > > I think it would be more effective to reduce at the source. for instance, > my ES40's have 2-of-3 redundant power supplies, which seem to dissipate > a lot more heat than another cluster which has 1-of-2. of course, the mere > fact that we're using Alphas is a declaration of war on coolness ;) > > I hear those Opterons are pretty cool... > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wharman at prism.net Tue Apr 22 10:57:36 2003 From: wharman at prism.net (William Harman) Date: Tue, 22 Apr 2003 08:57:36 -0600 Subject: Opteron announcement In-Reply-To: <20030422053550.GA6923@sphere.math.ucdavis.edu> Message-ID: <002701c308df$98af7270$318a010a@WHARMAN> Bill; If you need a demo unit, let me know, I can supply. Bill Harman, High Performance Cluster Systems Toll Free 866-883-4689 Ext 203 Salt Lake City Office (801) 572-9252 wharman at prism.net wharman at einux.com www.einux.com -----Original Message----- From: beowulf-admin at beowulf.org [mailto:beowulf-admin at beowulf.org] On Behalf Of Bill Broadley Sent: Monday, April 21, 2003 11:36 PM To: beowulf at beowulf.org Subject: Opteron announcement Apparently the link to http://www.amd.com/opteronservers just went live. Tons of cool docs/benchmarks. 
SPECfp rate 2000 (dual cpu) ================ it2-1.0 30.7 amd-244 26.7 amd-242 25.1 amd-240 22.7 Xeon-2.8 14.7 SPECfp_peak 2000 (single cpu) ================ it2-1.0 1431 amd-144 1219 Xeon-3.06 1103 SPECint_peak 2000 (single cpu) ================= Amd-144 1170 Xeon-3.06 1130 IT2-1.0 719 SPECint_rate 2000 (dual cpu) ================= amd-244 26.8 amd-242 24.0 amd-240 21.2 Xeon-2.8 19.6 it2-900 15.5 SPECint_rate2000 (windows) 4p ============================= amd-844 48.5 amd-852 45.1 amd-840 40 Xeon MP 2.0 34.7 it2-1.0 32.9 SPECfp_rate20000 4P =================== it2-1.0 49.3 amd-844 49.2 amd-842 45.0 amd-840 40.7 Xeon-2.0 20.2 Oh and one more interesting link: Software Optimization Guide for AMD athlon 64 and AMD Opteron Processors http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_739_720 3,00.html Amusingly all the submissions that I looked at the full reports for use the Intel compiler. So the Opterons extra registers are ignored. Time will tell if 3rd party compilers that fully utilize the additional registers can win benchmarks against Intel's compiler. Based on the preliminary pricing I have the Opterons look to make for very nice beowulf nodes. -- Bill Broadley Mathematics UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From brian.dobbins at yale.edu Tue Apr 22 11:54:09 2003 From: brian.dobbins at yale.edu (Brian Dobbins) Date: Tue, 22 Apr 2003 11:54:09 -0400 (EDT) Subject: PGI v5.0 [Beta] avail.. (was: Opteron announcement) Message-ID: [Mikhail Kuzminksy said:] > PGI (Portland Group) 5.0 will have Opteron support. The product >will be available at summer (June, if I remember correctly). >It'll be very interesting to compare ! You can download a beta now, though it doesn't support cross-compiling, so you need an Opteron system. I recently used it to benchmark a code and was thoroughly impressed. I haven't yet run that same code on an Opteron via the Intel compilers, but I should have a system arriving soon and will certainly try that out to see the difference. PGI v5.0 Beta: http://www.pgroup.com/AMD64.htm Cheers, - Brian Brian Dobbins Yale Mechanical Engineering _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Tue Apr 22 12:05:31 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Tue, 22 Apr 2003 09:05:31 -0700 Subject: Opteron announcement In-Reply-To: <20030422053550.GA6923@sphere.math.ucdavis.edu> References: <20030422053550.GA6923@sphere.math.ucdavis.edu> Message-ID: <20030422160531.GA1299@greglaptop.attbi.com> > SPECfp_peak 2000 (single cpu) > ================ > it2-1.0 1431 > amd-144 1219 > Xeon-3.06 1103 By the way, if you get rid of art (which gets a major cache benefit at 3 MB) and swim (main memory bound), the Opteron is faster than Itanium on the rest. Pretty amazing. 
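If anyone wants to redo that kind of comparison themselves, the SPEC aggregate is just the geometric mean of the per-benchmark ratios, so dropping a couple of components and recomputing is a one-liner. A rough sketch -- the two-column input (benchmark name, ratio) and the file names are made up for the example, they are not files from the actual SPEC reports:

#!/bin/sh
# usage: geomean.sh ratios.txt [benchmarks to exclude ...]
# e.g.:  geomean.sh opteron_fp.txt art swim
# Prints the geometric mean of column 2 over the lines whose first
# field is not in the exclusion list.
file=$1; shift
awk -v skip="$*" '
  BEGIN { n = split(skip, s, " "); for (i = 1; i <= n; i++) drop[s[i]] = 1 }
  !($1 in drop) { sum += log($2); count++ }
  END { if (count) printf "geomean over %d benchmarks: %.1f\n", count, exp(sum / count) }
' "$file"

Run it once per machine on the published per-benchmark ratios and the head-to-head comparison falls out directly.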
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Tue Apr 22 13:51:54 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Tue, 22 Apr 2003 12:51:54 -0500 Subject: back to the issue of cooling Message-ID: <3EAA34FF@itsnt5.its.uiowa.edu> Yes , yes I haven't forgotten that yellow stat phys book that I enjoyed a couple of semesters ago. It depresses me that there is not an easy solution. I suppose it is better to make the engine more efficient than to try to trap unburned fuel. Cluster o' Transmeta. On another note, I am now the proud owner of a Sun Ultra 10 workstation. I have a very heterogeneous cluster in my apartment, including x86 and DEC alpha. I cannot seem to get the Sun (running FreeBSD 5.0, fortran compiler broken) to shake hands correctly with the rest of the team. But if I run a PVM code, he plays ball. Is there something peculiar to Sparc that would keep it from integrating well into a hetero mpich cluster. Is this a 512/1024 problem? Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From timm at fnal.gov Tue Apr 22 14:14:18 2003 From: timm at fnal.gov (Steven Timm) Date: Tue, 22 Apr 2003 13:14:18 -0500 (CDT) Subject: Opteron announcement In-Reply-To: <20030422160531.GA1299@greglaptop.attbi.com> Message-ID: How do the specint_peak 2000 numbers compare? Steve Timm ------------------------------------------------------------------ Steven C. Timm (630) 840-8525 timm at fnal.gov http://home.fnal.gov/~timm/ Fermilab Computing Division/Core Support Services Dept. Assistant Group Leader, Scientific Computing Support Group Lead of Computing Farms Team On Tue, 22 Apr 2003, Greg Lindahl wrote: > > SPECfp_peak 2000 (single cpu) > > ================ > > it2-1.0 1431 > > amd-144 1219 > > Xeon-3.06 1103 > > By the way, if you get rid of art (which gets a major cache benefit at > 3 MB) and swim (main memory bound), the Opteron is faster than Itanium > on the rest. Pretty amazing. > > -- greg > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Tue Apr 22 14:33:17 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Tue, 22 Apr 2003 11:33:17 -0700 Subject: PGI v5.0 [Beta] avail.. (was: Opteron announcement) In-Reply-To: References: Message-ID: <20030422183316.GA1631@greglaptop.internal.keyresearch.com> On Tue, Apr 22, 2003 at 11:54:09AM -0400, Brian Dobbins wrote: > I recently used it to benchmark a code and > was thoroughly impressed. I'd give my opinions, but: BETA LICENSE CONDITIONS iv. In the absense of explicit permission from STMicroelectronics, The Portland Group Compiler Technology, performance results obtained using this Software will not be published or presented in a public forum This is fairly typical for beta releases... 
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jeff at aslab.com Tue Apr 22 14:27:00 2003 From: jeff at aslab.com (Jeff Nguyen) Date: Tue, 22 Apr 2003 11:27:00 -0700 Subject: Opteron announcement References: <20030422053550.GA6923@sphere.math.ucdavis.edu> Message-ID: <056a01c308fc$c390e6a0$6502a8c0@jeff> Hi Bill, Here is an interesting benchmark result of the Opteron platforms running on the combination of 32-bit/64-bit operating system and applications. For this benchmark, Povray 3D ray tracer is used.

Platform             Render Time (smaller is faster)
-----------------------------------------------------------------
Opteron Model 242    41m 44s
  1.6ghz, 1MB L2, 32-bit OS (RH 8.0), 32-bit Povray binary
Opteron Model 242    41m 44s
  1.6ghz, 1MB L2, 64-bit OS (UnitedLinux x86-64 v1.0), 32-bit Povray binary
Opteron Model 242    30m 12s
  1.6ghz, 1MB L2, 64-bit OS (UnitedLinux x86-64 v1.0), 64-bit Povray binary
Intel Xeon 3.06ghz   31m 11s
  32-bit OS (RH 8.0), 32-bit Povray binary

Jeff ASL Inc. ----- Original Message ----- From: "Bill Broadley" To: Sent: Monday, April 21, 2003 10:35 PM Subject: Opteron announcement > > Apparently the link to http://www.amd.com/opteronservers just went > live. Tons of cool docs/benchmarks. > > SPECfp rate 2000 (dual cpu) > ================ > it2-1.0 30.7 > amd-244 26.7 > amd-242 25.1 > amd-240 22.7 > Xeon-2.8 14.7 > > SPECfp_peak 2000 (single cpu) > ================ > it2-1.0 1431 > amd-144 1219 > Xeon-3.06 1103 > > SPECint_peak 2000 (single cpu) > ================= > Amd-144 1170 > Xeon-3.06 1130 > IT2-1.0 719 > > SPECint_rate 2000 (dual cpu) > ================= > amd-244 26.8 > amd-242 24.0 > amd-240 21.2 > Xeon-2.8 19.6 > it2-900 15.5 > > SPECint_rate2000 (windows) 4p > ============================= > amd-844 48.5 > amd-852 45.1 > amd-840 40 > Xeon MP 2.0 34.7 > it2-1.0 32.9 > > SPECfp_rate20000 4P > =================== > it2-1.0 49.3 > amd-844 49.2 > amd-842 45.0 > amd-840 40.7 > Xeon-2.0 20.2 > > Oh and one more interesting link: > Software Optimization Guide for AMD athlon 64 and AMD Opteron Processors > http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_739_7203,00 .html > > Amusingly all the submissions that I looked at the full reports for > use the Intel compiler. So the Opterons extra registers are ignored. > > Time will tell if 3rd party compilers that fully utilize the additional > registers can win benchmarks against Intel's compiler. > > Based on the preliminary pricing I have the Opterons look to make for > very nice beowulf nodes.
> > > -- > Bill Broadley > Mathematics > UC Davis > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From erwan at mandrakesoft.com Tue Apr 22 18:12:58 2003 From: erwan at mandrakesoft.com (Erwan Velu) Date: Wed, 23 Apr 2003 00:12:58 +0200 (CEST) Subject: Opteron announcement In-Reply-To: <20030422053550.GA6923@sphere.math.ucdavis.edu> References: <20030422053550.GA6923@sphere.math.ucdavis.edu> Message-ID: <32813.81.56.219.165.1051049578.squirrel@webmail.mandrakesoft.com> As some may have noticed, Mandrakesoft is one of the AMD launch partners. The Mandrake Linux products are ready-to-run under this platform except the clustering side (June). - http://www.mandrakesoft.com/company/press/briefs?n=/mandrakesoft/news/2414 [...] Later in June 2003, MandrakeSoft will release 'MandrakeClustering' for Opteron, an easy-to-use clustering solution designed to answer needs in the intensive calculation area that will greatly benefit from the power of AMD 64-bit technology. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Apr 22 17:08:49 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 22 Apr 2003 17:08:49 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: <3EAA34FF@itsnt5.its.uiowa.edu> Message-ID: On Tue, 22 Apr 2003, jbassett wrote: > Yes , yes I haven't forgotten that yellow stat phys book that I enjoyed a > couple of semesters ago. It depresses me that there is not an easy solution. I > suppose it is better to make the engine more efficient than to try to trap > unburned fuel. Cluster o' Transmeta. No, not even this does it in the HPC market where there are few idle cycles, unless my back-of-the-envelope computations are wrong (entirely possible as I suck at arithmetic:-). IIRC there is an energy cost per switching operation in VLSI that provides a basic, physical limitation on the energy efficiency per "flop". Beyond that, it is the battle of the chip maskers. How to lay out the chip at a given fabrication scale so that the switching operations are reliable, so that pathways are minimized, so that energy isn't radiated away. If you work out the actual energy cost per average "instruction" for the different silicon foundries, you don't get all that profound a difference between them -- well within a factor of two in most cases. So you can get more slower, cooler chips, or fewer faster, hotter chips, but the net amount of energy you consume doing a GFLOPS-year of mixed computation isn't likely to vary tremendously, from at least the seat of the pants computations I've done. Don't forget the auxiliary costs, as well -- one case, motherboard, memory, disk for a dual 2.5 GHz CPU (5 aggregate GHz of instructions) vs five cases for single 1 GHz CPUs means that even if the 1 GHz CPU runs more than 2.5x cooler (often it won't) you may be spending an extra 200 Watts running the extra cases and peripherals.
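To put rough dollars on that last point, here is a sketch with invented wattages -- only the ~200 Watt overhead figure above and the roughly $1 per watt per year power-and-cooling rule of thumb used elsewhere in this thread are taken from the discussion, the per-box numbers are made-up round values:

#!/bin/sh
# Back-of-the-envelope: five single 1 GHz boxes vs one dual 2.5 GHz box
# for the same aggregate clock.  All wattages are assumed, not measured.
DUAL_WATTS=250        # assumed draw of one dual-CPU node
SINGLE_WATTS=90       # assumed draw of one single-CPU node
N_SINGLE=5
COST_PER_WATT_YEAR=1  # ~$1/watt/year for power plus cooling
EXTRA=$(( N_SINGLE * SINGLE_WATTS - DUAL_WATTS ))
echo "extra draw of the five-box setup: ${EXTRA} W"
echo "extra power and cooling cost: roughly \$$(( EXTRA * COST_PER_WATT_YEAR )) per year"

Over a three-year cluster lifetime that adds up, though it is still modest next to the purchase price of the extra boxes.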
You might save 20% or 30% of your energy costs PER UNIT OF WORK ACCOMPLISHED shopping for energy-efficient processors, but I would be surprised if you did much better than that. So -- tanstaafl. Barring real technical breakthroughs at the silicon level -- teensy switches that switch, reliably, just as fast, at lower voltage with lower energy, the best you are dealing with is rearrangements of the same scaling laws at each level of VLSI masking. Not that there aren't real differences that appear over time -- my palm pilot is about as fast as my original IBM PC was, but runs MUCH cooler:-) but there is a pretty significant lag in performance to where cpu masking rearrangements and implementation in different technologies starts making them happen. Believe me, if it were possible to run silicon cooler (and over time it is), chip designers would "immediately" implement the cooler technology to increase switch densities and make more powerful chips, since heat dissipation is a major limitation on chip design as it is. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From exa at kablonet.com.tr Tue Apr 22 16:26:27 2003 From: exa at kablonet.com.tr (Eray Ozkural) Date: Tue, 22 Apr 2003 23:26:27 +0300 Subject: back to the issue of cooling In-Reply-To: <3EAA34FF@itsnt5.its.uiowa.edu> References: <3EAA34FF@itsnt5.its.uiowa.edu> Message-ID: <200304222326.27878.exa@kablonet.com.tr> On Tuesday 22 April 2003 20:51, jbassett wrote: > Yes , yes I haven't forgotten that yellow stat phys book that I enjoyed a > couple of semesters ago. It depresses me that there is not an easy > solution. I suppose it is better to make the engine more efficient than to > try to trap unburned fuel. Cluster o' Transmeta. Has anybody calculated if the operation of a low-power cluster can amortize the actual price of the system in a couple of years? I'm thinking something like an apple cluster or something, might actually be viable! What about total cost/processing power of the cluster ? Cheers, -- Eray Ozkural (exa) Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Apr 22 17:47:36 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 22 Apr 2003 17:47:36 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: <200304222326.27878.exa@kablonet.com.tr> Message-ID: On Tue, 22 Apr 2003, Eray Ozkural wrote: > Has anybody calculated if the operation of a low-power cluster can amortize > the actual price of the system in a couple of years? I'm thinking something > like an apple cluster or something, might actually be viable! What about > total cost/processing power of the cluster ? This is the relevant measure, to be sure. Total cost of ownership with a vengeance, per unit of work done, amortized over the life of a cluster. 
$1 per watt per year for heating and cooling, add cost of systems themselves, correct for SPEED of systems (ideally including your Amdahl's law scaling hit for using more slower processors!). I predict that the sweet spot is probably Athlon 2400's or possibly 2.4 GHz P4's (depending on your code), with fine grained people shifted toward the even higher end processors and with some room for Celerons or Durons on the low end. I think the main reason to get transmetas is likely to be to get the processing densities, not to save power or money. For things like apples, you may have to factor in increased sysadmin costs, if you're not careful. Intel or Athlon clusters are pretty much plug'n'play with multiple distributions and techniques, if you shop your hardware at all carefully. rgb > > Cheers, > > -- > Eray Ozkural (exa) > Comp. Sci. Dept., Bilkent University, Ankara KDE Project: http://www.kde.org > www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza > GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From toon at moene.indiv.nluug.nl Tue Apr 22 18:15:29 2003 From: toon at moene.indiv.nluug.nl (Toon Moene) Date: Wed, 23 Apr 2003 00:15:29 +0200 Subject: Opteron announcement References: <20030422053550.GA6923@sphere.math.ucdavis.edu> Message-ID: <3EA5BF01.7090308@moene.indiv.nluug.nl> Bill Broadley wrote: > Amusingly all the submissions that I looked at the full reports for > use the Intel compiler. So the Opterons extra registers are ignored. Yes, I'd hoped AMD would use g77 for the Fortran 77 parts of SPECfp2000 :-) > Time will tell if 3rd party compilers that fully utilize the additional > registers can win benchmarks against Intel's compiler. You want to look at this page of the (15 !) page report by Aces Hardware (the results are mixed): http://www.aceshardware.com/read.jsp?id=55000265 [ the very last table on that page ] -- Toon Moene - mailto:toon at moene.indiv.nluug.nl - phoneto: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html GNU Fortran 95: http://gcc-g95.sourceforge.net/ (under construction) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Tue Apr 22 20:09:20 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Tue, 22 Apr 2003 20:09:20 -0400 (EDT) Subject: Opteron announcement In-Reply-To: <3EA5BF01.7090308@moene.indiv.nluug.nl> Message-ID: > You want to look at this page of the (15 !) page report by Aces Hardware > (the results are mixed): I think the results are pretty clear: AMD has successfully transformed the well-regarded Athlon core into a serious competitor to anything Intel has to offer. 
the core isn't really changed that much: it's got SSE2 now (which helps blas a lot), seems to gain somewhat from extra regs in 64b mode (but it's not clear how you go about using them, since until now, Intel's compiler has been by far the best). I'm still a little puzzled about the 1MB onchip L2, since low-latency ram at least partially obviates the need for it. at the system level, AMD can now compete against Intel's aggressive bandwidth scaling (where the Athlon recently lagged) and has a clearly superior SMP architecture (especially for >2-way). in my opinion, AMD will quickly realize that their CPU price ($794 for opt/244(1800)) has to roughly match that of the Xeon/2.8 (pricewatch: $425), since they're comparable in performance. there's no reason AMD can't play $794 as an opening bid, to capitalize on the buzz and make a point about being serious players. I don't see any reason for a serious difference in prices of motherboards: Intel's i7xxx Xeon chipsets are well-understood and perform well, but if anything, Opteron boards should be cheaper, since the chipset has fewer responsibilities. so for those of us looking at dual-CPU cluster bricks, $740 difference due to CPU price is a serious issue. AMD can either let the prices slide or bump up the clocks (for which AMD has already paid the SOI price, as well as the cost of a couple extra pipestages.) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From math at velocet.ca Tue Apr 22 23:46:23 2003 From: math at velocet.ca (Ken Chase) Date: Tue, 22 Apr 2003 23:46:23 -0400 Subject: back to the issue of cooling In-Reply-To: ; from rgb@phy.duke.edu on Tue, Apr 22, 2003 at 05:47:36PM -0400 References: <200304222326.27878.exa@kablonet.com.tr> Message-ID: <20030422234623.Z25523@velocet.ca> On Tue, Apr 22, 2003 at 05:47:36PM -0400, Robert G. Brown's all... >On Tue, 22 Apr 2003, Eray Ozkural wrote: > >> Has anybody calculated if the operation of a low-power cluster can amortize >> the actual price of the system in a couple of years? I'm thinking something >> like an apple cluster or something, might actually be viable! What about >> total cost/processing power of the cluster ? Depends on where you live. Canada is a pretty cheap place to waste power, tho Seattle is better than Toronto: http://www.bchydro.com/policies/rates/rates759.html (But then you might factor in real estate pricing >For things like apples, you may have to factor in increased sysadmin >costs, if you're not careful. Intel or Athlon clusters are pretty much >plug'n'play with multiple distributions and techniques, if you shop your >hardware at all carefully. Suddenly diskless vs non-diskless isn't just a management issue too - 15krpm drives can eat a fair bit of power. (We have power supplies in smaller servers that can't handle 2x 15Krpm + Dual p3). Start adding up to 30-40W+ per drive across 1000 nodes and you have a fair chunk of power. /kc > > rgb > >> >> Cheers, >> >> -- >> Eray Ozkural (exa) >> Comp. Sci.
Dept., Bilkent University, Ankara KDE Project: http://www.kde.org >> www: http://www.cs.bilkent.edu.tr/~erayo Malfunction: http://mp3.com/ariza >> GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C >> >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org >> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf >> > >Robert G. Brown http://www.phy.duke.edu/~rgb/ >Duke University Dept. of Physics, Box 90305 >Durham, N.C. 27708-0305 >Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Ken Chase, math at velocet.ca * Velocet Communications Inc. * Toronto, Canada Wiznet Velocet DSL.ca Datavaults 24/7: 416-967-4414 tollfree: 1-866-353-0363 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From timm at fnal.gov Wed Apr 23 09:26:38 2003 From: timm at fnal.gov (Steven Timm) Date: Wed, 23 Apr 2003 08:26:38 -0500 (CDT) Subject: back to the issue of cooling In-Reply-To: Message-ID: > On Tue, 22 Apr 2003, Eray Ozkural wrote: > > > Has anybody calculated if the operation of a low-power cluster can amortize > > the actual price of the system in a couple of years? I'm thinking something > > like an apple cluster or something, might actually be viable! What about > > total cost/processing power of the cluster ? > > This is the relevant measure, to be sure. Total cost of ownership with > a vengeance, per unit of work done, amortized over the life of a > cluster. $1 per watt per year for heating and cooling, add cost of > systems themselves, correct for SPEED of systems (ideally including your > Amdahl's law scaling hit for using more slower processors!). I predict > that the sweet spot is probably Athlon 2400's or possibly 2.4 GHz P4's > (depending on your code), with fine grained people shifted toward the > even higher end processors and with some room for Celerons or Durons on > the low end. I think the main reason to get transmetas is likely to be > to get the processing densities, not to save power or money. > The problem with the above calculation is that oftentimes the cost to get the electrical infrastructure into your facility in the first place is much, much greater than the cost of the electricity it delivers. We are spending $560K here at Fermilab to add 250 kVA of electrical capacity to our floor. We calculate that the cost of the electricity to run the machines over 3 years will only be $50K. We considered whether to actually put a weighting factor into our bids so that more electrically-efficient machines would be preferred, but then when we thought about it, we figured that (1) these machines are usually slower so you need more of them (2) they also use up more floor space which isn't free, and (3) within the same CPU speed class, those machines which use up the most electricity are also likely to be the ones with the biggest fans which are the best cooled internally. 
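Just to put those two numbers side by side, a quick sketch (Python; the ten-year amortization period is an assumption, the dollar figures are the ones quoted above):

  # Infrastructure vs. electricity for the buildout described above.
  infrastructure_cost = 560000.0   # $ to add 250 kVA of capacity
  capacity_kva = 250.0
  electricity_3yr = 50000.0        # $ of electricity over 3 years
  amortization_years = 10.0        # assumed useful life of the buildout
  print("infrastructure: $%.0f per kVA, ~$%.0f/year amortized"
        % (infrastructure_cost / capacity_kva,
           infrastructure_cost / amortization_years))
  print("electricity:    ~$%.0f/year" % (electricity_3yr / 3.0))

That works out to about $2240 per kVA of capacity, roughly $56K per year of buildout against roughly $17K per year of electricity at our current loading.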
Steve Timm _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Apr 23 09:51:41 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 23 Apr 2003 09:51:41 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: Message-ID: On Wed, 23 Apr 2003, Steven Timm wrote: > The problem with the above calculation is that oftentimes the cost > to get the electrical infrastructure into your facility in the > first place is much, much greater than the cost of the electricity > it delivers. We are spending $560K here at Fermilab to add 250 kVA > of electrical capacity to our floor. We calculate that the > cost of the electricity to run the machines over 3 years will only > be $50K. Indeed true. However, the infrastructure cost is amortized over a longer time, as well, and it varies strongly and nonlinearly from site to site, depending on how much capacity you already have "handy" (or how far they have to go to find a transformer with the capacity to deliver it). If you have 250 kVA worth of machines running in your machine room, and spend 8 cents or so a kVA-hour, then power and cooling combined for your room would cost about $250K a year (so that's a fairer measure of the running capacity you are purchasing, even if you don't use it right away). Amortized over ten years the cost of the renovation would be roughly $60-65K per year, including interest. That's still high -- you obviously had to install some BIG transformers to get the capacity you need or something -- but not insanely high. > We considered whether to actually put a weighting factor into our > bids so that more electrically-efficient machines would be preferred, > but then when we thought about it, we figured that (1) these > machines are usually slower so you need more of them (2) they > also use up more floor space which isn't free, and (3) within > the same CPU speed class, those machines which use up the most > electricity are also likely to be the ones with the biggest fans > which are the best cooled internally. Indeed and agreed. If you like, what matters is getting the most work done per dollar spent, regardless of how you get the work done. rgb > > Steve Timm > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From adamgood at linux-mag.com Wed Apr 23 12:19:32 2003 From: adamgood at linux-mag.com (Adam Goodman) Date: Wed, 23 Apr 2003 09:19:32 -0700 (PDT) Subject: ClusterWorld Conference & Expo Announcement Message-ID: Hello Everyone, We would really be thrilled to have any and all of you involved in our conference! We're excited to inform you that ClusterWorld Conference & Expo San Jose 2003 Registration is now open! You can register today for your FREE Exhibits Pass, or for one of our in-depth conference passes! Please use your SPECIAL PRIORITY CODE - BELOW when registering.
Just go to http://www.clusterworldexpo.com and click on "REGISTER NOW!" to sign up today!

ClusterWorld Conference and Expo
June 23 - 26, 2003
San Jose Convention Center
San Jose, CA
http://www.clusterworldexpo.com

Cluster technology is changing everything - from supercomputing to high-availability - and it's growing faster than any other segment of the market. ClusterWorld Conference & Expo stands at the very center of this amazing movement. At ClusterWorld Conference & Expo, you can:
* LEARN from top clustering experts from all industries in our extensive conference programs.
* EXPERIENCE the latest cluster technology from all the top vendors on our awesome expo floor.
* MEET AND NETWORK with colleagues from across the world of clustering at our numerous sponsored social events and parties.

*** Keynote Speakers ***
(Keynotes are open to all attendees)
John Picklo, Manager, High Performance Computing, DaimlerChrysler
John Reynders, Vice-President, Informatics, Celera Therapeutics
Jacobus N. Buur, Principal Research Physicist, Shell International Exploration and Production B.V.
Dr. Tilak Agerwala, Vice President, Systems, IBM Research

The ClusterWorld conference program was created in conjunction with the Linux Clusters Institute (LCI) and offers something for everyone interested in cluster technology. If you work with clusters in any capacity, ClusterWorld Conference & Expo is the one event you cannot afford to miss this year. Learn more at http://www.clusterworldexpo.com.

*** ClusterWorld Conference & Expo Sponsors ***
Platinum: HP and Intel
Gold: AMD, Dell, MSC Software, Myricom, RackSaver, RLX Technologies
Silver: APC - American Power Conversion, Appro International, Linux Networx, Microway, Penguin Computing, Promicro Systems, and Western Scientific
Media Sponsors: Dr. Dobbs Journal, GridToday, HPCwire, Linux Magazine, and Sys Admin

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jbassett at blue.weeg.uiowa.edu Wed Apr 23 12:31:59 2003 From: jbassett at blue.weeg.uiowa.edu (jbassett) Date: Wed, 23 Apr 2003 11:31:59 -0500 Subject: back to the issue of cooling Message-ID: <3EA854CE@itsnt5.its.uiowa.edu> Transmeta quotes a TDP for their 1-GHz Crusoe of 7.5 watts. An Athlon XP at around twice the clock speed is around 10x that, at 75 watts, but at $0.05/kWh I agree that it is unlikely that you could ever find an operating cost that would be able to offset the greater cost and slower performance of the Crusoe. But the density that you could pack them would be incredible. If you were running so much cooler that less of a cooling system investment were required, that might change the equation. Or if there was ever a need for a highly mobile cluster system. You could pack a great number into a single box and carry it about, and because in theory 10 Crusoes would dissipate the heat of a single Athlon, you could perhaps easily cool many of them. Joseph Bassett _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gary at umsl.edu Wed Apr 23 14:23:19 2003 From: gary at umsl.edu (Gary Stiehr) Date: Wed, 23 Apr 2003 13:23:19 -0500 Subject: Serial Port Concentrators vs.
KVMs Message-ID: <3EA6DA17.6040100@umsl.edu> Hi, Having used both KVMs and serial port concentrators, I have my own opinions about the advantages and disadvantages of each. I was hoping that list members might share their opinions as well. My experience is with Belkin 8-port KVMs and with a Computone RAS2000 serial port concentrator. Here are some of my opinions; please feel free to add to the list or correct me if I'm wrong. In particular, any comments on scalability or some price comparisons would be interesting.

KVM Advantages
--------------
* Ease of Setup: usually you just run the keyboard/video/mouse cables to the KVM and then a set of keyboard/video/mouse cables from the KVM to some other node from which you can access the console for all of the nodes attached to the KVM. There usually is nothing that needs to be done with the OS (although I've heard of some BIOSes having problems but I've never experienced this). There is also usually nothing to set up with the KVM itself--just hook up the cables.

KVM Disadvantages
-----------------
* Lots of cables: Even if you do not use a mouse cable, you still have two cables running from the back of each node. I have heard of some KVMs lately that use an adapter to combine all three KVM cables into one. I have not actually seen or used one but that would certainly help.

* No remote access: The only KVM switches that I have seen with remote access are "enterprise" KVM switches that have a high price tag. I have no experience with this type of KVM switch but I would imagine it would be like a hybrid KVM/serial port concentrator.

Serial Port Concentrator (SPC) Advantages
-----------------------------------------
* Remote access: Most SPCs that I looked at listed remote access as a feature. And some, including the Computone RAS2000 that I use, allow you to access them via ssh.

* Fewer cables: You only need to run one cable from the back of each node (from the serial port) to the SPC.

* Multiple access methods: As noted above, you can access a lot of SPCs via the network. But if that is down, you can also access the SPC via a node that is attached via serial port to a special port on the SPC.

Serial Port Concentrator (SPC) Disadvantages
--------------------------------------------
* Need to set up the SPC itself: In my case, this wasn't too bad. Unfortunately, I would think that each vendor would have its own set of procedures to follow for the setup of its own SPC.

* Somewhat of a learning curve: If you have not had experience with serial ports (i.e., you know what they are but you've never done anything with them), there will be a lot of terms that are unfamiliar. You will also need to find out a lot of information about your hardware, OS and BIOS. For instance, what speed do they support (9600 baud, 115200 baud, etc.)? What terminal emulation do they support (vt100, vt102, ansi)? Is my serial port enabled in the BIOS? Which serial port is which (For Linux: /dev/ttyS0, /dev/ttyS1, etc.)? And so on.

* A significant number of small changes to OS: There are a number of changes that you need to make to the OS (in my case Linux) in order for the console messages to be sent to the serial port. Thanks to various how-tos and other docs, I was able to make all of the appropriate changes but a lot of them were not very obvious (although once you read about them you can see why it would be necessary).
* Must access the BIOS on each system: Unless your BIOS has serial port redirection enabled by default (if it has this feature at all), you will need to access each BIOS as you set the systems up (if you want to see console messages generated by the BIOS).

Thanks for reading this somewhat lengthy e-mail. I would appreciate your comments. Thank you, Gary Stiehr gary at umsl.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From henken at seas.upenn.edu Wed Apr 23 15:04:15 2003 From: henken at seas.upenn.edu (Nicholas Henke) Date: 23 Apr 2003 15:04:15 -0400 Subject: Serial Port Concentrators vs. KVMs In-Reply-To: <3EA6DA17.6040100@umsl.edu> References: <3EA6DA17.6040100@umsl.edu> Message-ID: <1051124655.7370.45.camel@roughneck.liniac.upenn.edu> On Wed, 2003-04-23 at 14:23, Gary Stiehr wrote: > > KVM Disadvantages > ----------------- > * Lots of cables: Even if you do not use a mouse cable, you still have > two cables running from the back of each node. I have heard of some > KVMs lately that use an adapter to combine all three kvm cables into > one. I have not actually seen or used one but that would certainly help. This gets to be a price issue too -- good KVM cables are darned expensive. > > * No remote access: The only KVM switches that I have seen with remote > access are "enterprise" KVM switches that have a high price tag. I have > no experience with this type of KVM switch but I would imagine it would > be like a hybrid KVM/serial port concentrator. > > Serial Port Concentrator (SPC) Advantages > ----------------------------------------- > * Remote access: Most SPCs that I looked at listed remote access as a > feature. And some, including the Computone RAS2000 that I use, allow > you to access the them via ssh. Best part -- never leave the comfort of your own desk :) Generally cheaper than KVM -- I am sure there is some knee point, but for clusters in the >24 node size, remote serial is the way to go. You will need some sort of KVM to access the nodes in the machine room, for those problems where hardware is !#@$-ed, or if BIOS redirection is not an option, as we have seen on some of our machines. Also, when using a nice package like conserver (conserver.com (free :)), you get logs of the console output -- absolutely critical for debugging oops output -- unless you like transcribing oops to notepad, and then back to a text file for ksymoops. Nic -- Nicholas Henke Penguin Herder & Linux Cluster System Programmer Liniac Project - Univ. of Pennsylvania _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Apr 23 17:15:18 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 23 Apr 2003 17:15:18 -0400 (EDT) Subject: back to the issue of cooling In-Reply-To: <3EA854CE@itsnt5.its.uiowa.edu> Message-ID: On Wed, 23 Apr 2003, jbassett wrote: > Transmeta quotes a TDP for their 1-Ghz Crusoe as 7.5 watts > > An Athlon XP at around twice the clock-speed is around 10* that at 75 watts > > but at .05$/kw*h I agree that it is unlikely that you could ever find an > operating cost that would be able to offset the greater cost and slower > performance of the Crusoe. But the density that you could pack them would be
If you were running so much cooler that less of a cooling system > investment were required that might change the equation. (People bored with blades can skip the following:-) Right, this is really their niche at the moment, especially in environments where installing them in high densities saves you from REALLY costly infrastructure investments or where space itself is just plain tight. Be careful about comparing raw clocks, though -- they are different architectures, with the transmeta according to Feng's own paper (Feng is the Green Destiny cluster guy) only delivering 1/3 to 1/2 the performance at equivalent clock to Athlons. I don't think he came CLOSE to systematically exploring system performance to get those numbers, but that's just me -- maybe I'm misreading. I'd like to see systematic e.g. specmarks, systematic lmbench's, stream, netpipes, and more, not just sqrt's; less emphasis on "fraction of peak" and more on wall-clock completion times. To put it another way, I don't think Feng's paper is a sound basis for would be cluster engineers trying to guestimate the performance of a bladed system on a given problem. This makes it very difficult to compare "theoretically" with competing designs (but of course that won't stop me below -- just take it with a grain of salt:-). Also be careful about comparing CPU-only numbers for e.g. power. The CPU is mounted on a card with memory, a hard disk (or two), a network interface, and a bus/backplane interface. All of these draw power. The power itself comes from a chassis with a power supply that gets hot while operating. What I looked for, but failed to find in Feng's paper, is the actual wall-plug power load of a 24 blade chassis running code flat out. If the chassis power supply capacity is any indication, it is probably more like 20 watts per blade (and maybe more, as some fraction of the "blade load" goes to running chassis electronics and heat dissipated by the chassis power supply). The only good way to find out is to stick a kill-a-watt between a blade chassis and the wall and read out its draw under a mix of loads. I don't think Feng did that (hard to tell from the paper, at any rate). I suspect he used published numbers for the CPU draw or the blade draw instead of measuring it himself but if he told WHAT he did I missed it. If we assume 20 W and compare it to the 85 full chassis load watts (or so) burned (per CPU) in a loaded dual Athlon at 1.6 GHz, then the transmeta gets 0.3 to 0.5 "Athlon GHz" (AGHz) (or 1/5 to 1/3 the performance) for 1/4 the power draw. Hmmm. Where is the big win here? Even if my power numbers are off by a power of two and a fully loaded blade burns only 10 W -- a number I'd doubt since NICs alone tend to burn 5 W and a blade has a NIC -- I'm not impressed, given the cost differential, because we have NOT YET CONSIDERED the scaling laws associated with parallelizing tasks themselves, which often strongly favor faster processors (i.e. faster processors on systems with faster busses can often be used to make clusters that scale near-linearly to far higher total performance and to far more processors). Nor have we considered micro-determinants of performance. How expensive is a context switch? How well does it manage cache and dataflow? How smoothly does it process interrupts so it can USE the NIC or disk? Is there an all-things-equal network latency hit of 3x or more relative to an Athlon? There might be (or not)...but barring a published measurement we won't know. 
I think a far more sophisticated analysis is called for to determine what the real performance/power scaling is PER NODE since the crux of the argument is whether more slower cooler processors are going to perform as well as fewer faster hotter processors. This is a problem dependent question, as has been a focus of the list forever, and might well be TRUE for one problem and FALSE for another. I was REALLY unimpressed by Feng's TCO argument, and especially by his analysis of the processor scaling laws that are limiting processor speedup and leading to an increase in power draw as Moore's law cranks along. First of all, those things are well known -- on chip or off chip, parallelism is a way to get better usage of chip real estate, as Ian Foster points out in a lot more convincing detail in his book on parallel program design. Second of all, Feng's proposed "solution": "quit using the `increasing frequency = increasing performance' marketing ploy" -- isn't a solution at all, it isn't even an argument -- it is raw polemic. What marketing ploy, and what does marketing ploy have to do with chip scaling laws? Increasing frequency DOES, visibly and obviously, increase performance on CPU bound problems, including mine, in a marvelously linear fashion. On the transmeta too, at a guess, just as it has for generations of in-family CPUs. Quantum jumps (relative to clock) occur when the chip is rearchitectured with more parallelism and finer scale, e.g. changing from 8 to 16 to 32 bit architectures, from no pipelines to several pipelines. These are the realities of CPU design, not marketing ploys. Second of all, he offers no argument at all, convincing or otherwise, for how using lots of cooler slower chips is going to actually beat the scaling laws he himself introduces (and the ones he omits). Foster does, in the explicit context of parallel task execution (so I'm familiar with them in a fair bit of detail) but Feng doesn't. A good argument would require him to account for various kinds of overhead, account for parallel scaling on tasks (where his argument OBVIOUSLY fails for a task that will run, fast, from memory on a single CPU with no IPC's but require lots of slow IPC's to run in parallel on two or more transmetas) and would inevitably restrict the classes of task that can be distributed cost-efficiently on the bladed architecture. It is NOT a "substitute" for the increased clock single CPU cycle, it is something different for solving different problems. And finally, there is the good old tanstaafl, which makes me suspicious of the whole line of argument from the beginning. Chip designers at Intel and AMD (and at Transmeta, for that matter) are not idiots. They are REALLY familiar with the chip real estate, clockspeed, parallelism scaling laws and have introduced a LOT of on-chip parallelism in part because of them. They are real experts on this and don't do stupid things and are all working with the same microscopic "components" on their VLSI layouts, trying to optimize a highly nonlinear cost-benefit function in truly creative ways. Their chip designs are genius, not dumb, expensive genius at that (up to order $billion per CPU generation foundry at this point?). RISC itself is something of a response to these laws, and Transmeta's architecture seems almost like "super-RISC" with a lot of code translation and pre-processing to conserve chip real estate. 
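To make the parallel-scaling point above slightly more concrete, a toy Amdahl's law comparison (Python; the 5% serial fraction and the per-CPU speeds are pure assumptions, chosen only to show the shape of the effect):

  # Toy Amdahl's law: few fast CPUs vs. many slow ones at equal aggregate "AGHz".
  def speedup(n, serial):
      return 1.0 / (serial + (1.0 - serial) / n)

  serial = 0.05                      # assumed: 5% of the work will not parallelize
  fast_n, fast_aghz = 16, 2.0        # e.g. 16 Athlon-class CPUs at ~2 AGHz each
  slow_n, slow_aghz = 64, 0.5        # e.g. 64 slow blades at ~0.5 AGHz each
  print("fast: %.1f AGHz delivered" % (fast_aghz * speedup(fast_n, serial)))
  print("slow: %.1f AGHz delivered" % (slow_aghz * speedup(slow_n, serial)))

Both configurations have the same 32 AGHz of aggregate raw clock, but on this assumed workload the many-slow-CPU farm delivers well under half as much (roughly 7.7 versus 18.3), and that is before any communication overhead is counted.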
Ultimately, winning the performance war requires either finding a really fundamentally different design that has different scaling laws or finding a niche market where the design you have (which may be a different emphasis or design focus of existing designs) can be successful. So far I don't see it, although I've seen some intriguing ideas kicked around. I'd be at least intrigued, for example, by an "8 processor motherboard" where 8 transmeta's were slotted right up on a very fast memory bus with a standard peripheral (PCI-x) interconnect. That would give you e.g. "8 transmeta GHz" on a system that drew roughly the same power as a P4 or Athlon in the 2 GHz range. Multiply by 0.4 (say) and perhaps it is competitive, and gets you out to decent performance without an ethernet interconnect, giving you better parallelism for certain classes of task. Transmeta's in PDA's are also very intriguing -- building a handheld device that can run for hours at high speed on a small battery is very cool indeed. THAT kind of (SMP) design in a mainstream mass market delivery would require a new kernel and a fundamentally parallel approach to programming, to make "happen". It might not make it -- lots of stuff on PC's is single threaded and CPU or memory I/O bound, and lots of CPUs competing for memory or trying to deliver a threaded task are a known headache. It would be interesting, though, especially if the design was modular and could be scaled up to 24, 48, 1024 processors. > Or if there was ever a need for a highly mobile cluster system. You could pack > a great number into a single box and carry it about and perhaps because in > theory 10 Crusoes would dissipate the heat of a single Athlon you could easily > cool many of them. Joseph Bassett Well, yes, unless you needed the single-threaded PERFORMANCE of a single Athlon. And remember, until that 10 way SMP motherboard for the Crusoe comes along, you're feeding the CPU, its own memory, its own disk, and a network (ten of each), and suddenly it isn't anything like 10 for one to the Athlon, more like four for one or even five for one, and when you multiply by the speed differential per clock, suddenly you're back dangerously close to where you started in BOTH FLOPS/Watt AND in absolute FLOPS, with now 10 CPUs to care for, feed, network, and program. The single AMD will run ANY application over the counter, no parallel programming required. Lower TCO? I think that's obvious. I'm not down on blades -- I think they have their niche and power/cooling/space starved environments are it. I don't think that they are even close to a cost/benefit win in most other environments, and not because of "marketing hype". I'm not selling anything; if anything I'm buying. Should I spend my (say) $15K on Crusoes in a blade configuration or on dual Athlons? Hmmm, I can afford just about 8 dual Athlon 2400+'s or just about 12 Crusoes (presuming $1K each by the time a chassis and so forth is thrown in). 16 Athlon CPUs buys me some 32 "Athlon GHz" (and costs me about $1300 a year in utility bills). The alternative gives me 12 "Transmeta GHz", where a TGHz is "worth" perhaps 0.5 AGHz in FLOPS, according to Feng's incomplete measurements. So it buys me roughly 6 AGHz, five times fewer, and costs me (heck, I'll GIVE you 10x less power) $150 year to run. I'd still need to spend $75,000 on transmetas, assuming my application scaled linearly to 60 transmetas at all, to equal the power of my (8 dual FF) 16 Athlons CPUs (assuming I'm still scaling linearly there, as well). 
My three-year power bill would be maybe $2000 less, but my overall bill would be $53,000 more for the Crusoes. In a lot of environments, I could buy brand new wiring, a dedicated air conditioner, and STILL get back enough change to travel business class to Australia going with the Athlons, especially if my goal is to feed 8 whole dual Athlons (ballpark of 170W each, 1400 watts to perhaps 2000 watts total consumption under load, one to two 20 Amp circuits, installable in most locations that have a bit of surplus capacity at the box for maybe $1000 tops even if they have to pull wire). If there is something wrong with this analysis, I'd be interested in hearing it. At $200/blade, blades would be a great deal from a TCO or cost/benefit perspective. At $400/blade they would be "interesting" and often competitive. At $1000/blade, they are a niche market only item, as I see it -- people who have $100K in renovation required otherwise to build a cluster, people who have an uncooled broom closet available as a "cluster room" and who inexplicably can STILL afford a Transmeta cluster in the first place. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Wed Apr 23 19:00:39 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Wed, 23 Apr 2003 18:00:39 -0500 Subject: Serial Port Concentrators vs. KVMs In-Reply-To: <3EA6DA17.6040100@umsl.edu> References: <3EA6DA17.6040100@umsl.edu> Message-ID: <3EA71B17.2010206@tamu.edu> Although in a different context, we use both. And an additional approach. We use KVM switches in my virtual network engineering lab environment when we're working in the lab and doing "console" access locally: Resetting/rebuilding systems, adding hardware and fine-tuning, etc. We use Concentrators for our remote access, although our application is a little different from most: Our Cyclades systems are isolated in a private network, and we have front-end systems to access them. This system has worked very well for our lab exercises. We can keep the number of connections and logins relatively low, which is one place the terminal concentrators fall down on: too long to log in, IMHO. We also use the generic equivalent of a Head node, which we refer to as a "Bastion" box. This allows ssh access to the sandbox area for terminal-type connections. The Bastion Host is accessible on ssh through our campus firewall. We use a series of restrictive rules, especially during our security classes (which are real, hacking, attack-defend classes) to prevent students from accessing a system not their own from the Bastion Host, and the Rules of the Game preclude use of the Bastion Host from attacking. In summary, we like all three approaches. There are times where the Bastion approach is best and we try to utilize it there. We like the Belkin 16 port KVMs when we've got to be in the room. We don't like the 8- or 4-ports as they are not cost effective. We like the serial concentrators when doing "serial console" tasks. If you reboot a system, the Bastion Host won't maintain the connection during the process, while the serial system will. Series of tradeoffs and we've tried to ascertain what works best for us...
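As an aside for anyone doing the "serial console" setup for the first time, the Linux-side changes Gary alludes to usually come down to a handful of lines. A rough sketch only -- device names, speeds and boot-loader details are assumptions here and vary with distribution, kernel and BIOS:

  # Kernel messages to the first serial port (lilo.conf shown; for GRUB,
  # put the console= options on the kernel line instead):
  append="console=ttyS0,9600n8 console=tty0"

  # A login on the serial line, in /etc/inittab:
  S0:2345:respawn:/sbin/agetty -L 9600 ttyS0 vt100

  # Let root log in there:
  echo ttyS0 >> /etc/securetty

BIOS-level redirection, where the board supports it at all, still has to be enabled in each machine's BIOS setup by hand.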
Gerry Gary Stiehr wrote: > Hi, > Having used both KVMs and serial port concentrators, I have my own > opinions about the advantages and disadvantes of each. I was hoping > that list members might share their opinions as well. My experience is > with Belkin 8-port KVMs and with a Computone RAS2000 serial port > concentrator. Here are some of my opinions, please feel free to add to > the list or correct me if I'm wrong. In particular, any comments on > scalability or some price comparisons would be interesting. > > KVM Advantages > -------------- > * Ease of Setup: usually you just run the keyboard/video/mouse cables to > the KVM and then a set of keyboard/video/mouse cables from the KVM to > some other node from which you can access the console for all of the > nodes attached to the KVM. There usually is nothing that needs to be > done with the OS (although I've heard of some BIOSes having problems but > I've never experienced this). There is also usually nothing to set up > with the KVM itself--just hook up the cables. > > KVM Disadvantages > ----------------- > * Lots of cables: Even if you do not use a mouse cable, you still have > two cables running from the back of each node. I have heard of some > KVMs lately that use an adapter to combine all three kvm cables into > one. I have not actually seen or used one but that would certainly help. > > * No remote access: The only KVM switches that I have seen with remote > access are "enterprise" KVM switches that have a high price tag. I have > no experience with this type of KVM switch but I would imagine it would > be like a hybrid KVM/serial port concentrator. > > Serial Port Concentrator (SPC) Advantages > ----------------------------------------- > * Remote access: Most SPCs that I looked at listed remote access as a > feature. And some, including the Computone RAS2000 that I use, allow > you to access the them via ssh. > > * Less cables: You only need to run one cable from the back of each node > (from the serial port) to the SPC. > > * Multiple access methods: As noted above, you can access a lot of SPCs > via the network. But if that is down, you can also access the SPC via a > node that is attached via serial port to a special port on the SPC. > > Serial Port Concentrator (SPC) Disadvantages > -------------------------------------------- > * Need to set up the SPC itself: In my case, this wasn't too bad. > Unfortunately, I would think that each vendor would have its own set of > procedures to follow for the setup of its own SPC. > > * Somewhat of a learning curve: If you have not had experience with > serial ports (i.e., you know what they are but you've never done > anything with them), there will be a lot of terms that are unfamiliar. > You will also need to find out a lot of information about your hardware, > OS and BIOS. For instance, what speed do they support (9600 baud, > 115200 baud, etc.)? What terminal emulation do they support (vt100, > vt102, ansi)? Is my serial port enabled in the BIOS? Which serial port > is which (For Linux: /dev/ttyS0, /dev/ttyS1, etc.)? And so on. > > * A significant number of small changes to OS: There are a number of > changes that you need to make to the OS (in my case Linux) in order for > the console messages to be sent to the serial port. Thanks to various > how-tos and other docs, I was able to make all of the appropriate > changes but a lot of them were not very obvious (although once you read > about them you can see why it would be necessary). 
> > * Must access the BIOS on each system: Unless your BIOS has serial port > redirection enabled by default (if it has this feature at all), you will > need to access each BIOS as you set the systems up (if you want to see > console messages generated by the BIOS). > > > Thanks for reading this somewhat lengthy e-mail. I would appreciate > your comments. > > Thank you, > Gary Stiehr > gary at umsl.edu > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From award at andorra.ad Thu Apr 24 01:08:42 2003 From: award at andorra.ad (Alan Ward) Date: Thu, 24 Apr 2003 07:08:42 +0200 Subject: back to the issue of cooling References: Message-ID: <3EA7715A.44EFAA24@andorra.ad> I tend to think Transmeta and other low-power CPUs belong on the desktop, so you can run them without the noisy fans (and they don't heat up the air). Alan Ward Robert G. Brown ha escrit: (big snip) > I'm not down on blades -- I think they have their niche and > power/cooling/space starved environments are it. I don't think that (other big snip) > rgb > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gmpc at sanger.ac.uk Thu Apr 24 05:03:41 2003 From: gmpc at sanger.ac.uk (Guy Coates) Date: Thu, 24 Apr 2003 10:03:41 +0100 (BST) Subject: Serial Port Concentrators vs. KVMs In-Reply-To: <200304231901.h3NJ1Vs17842@NewBlue.Scyld.com> References: <200304231901.h3NJ1Vs17842@NewBlue.Scyld.com> Message-ID: The other important aspects are logging and automation. If you use serial port concentrators then you can use the wonders of Conserver (http://www.conserver.com/) to manage and log the console output, allowing you to capture your kernel panics in their full glory. You can script against serial consoles, something that you can't do with KVM. The case we like to taunt our prospective hardware vendors with is: "How do I change the bios settings of all of the machines in a 200 node cluster?" Assuming serial access to the bios, you can automate that process with expect. With KVM (or the horribly broken VNC "remote management" solutions some manufacturers seem so keen on) you have to do it by hand.
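For the curious, the sort of script I mean looks roughly like this -- everything in it is a placeholder (terminal-server host name, TCP port, banner text and keystrokes all depend on your particular kit), so treat it purely as a sketch:

  #!/usr/bin/expect -f
  # Drive one node's redirected BIOS console through a terminal-server port.
  # Run it once per node from a shell loop; the first argument is the TCP port.
  set port [lindex $argv 0]
  set timeout 120
  spawn telnet console-server $port    ;# placeholder host name
  expect "to enter SETUP"              ;# assumed banner from the BIOS redirect
  send "\x1b\[12~"                     ;# e.g. an F2 escape sequence -- board specific
  expect "Exit Saving Changes"         ;# assumed menu text
  send "\r"
  expect eof

The real value is that the same dialogue, with the right prompts filled in, runs unattended across all 200 nodes.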
Cheers, Guy Coates -- Guy Coates, Informatics System Group The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK Tel: +44 (0)1223 834244 ex 7199 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jahia at mail.umesd.k12.or.us Thu Apr 24 05:34:41 2003 From: jahia at mail.umesd.k12.or.us (Jim Ahia) Date: Thu, 24 Apr 2003 02:34:41 -0700 Subject: beowulf in space Message-ID: As I was reading this thread, some things came to mind that might add to the discussion: 1 ) although Dells and Gateways are too heavy to lift into orbit, pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class motherboards with a single 5v power requirement make things much smaller. It is completely possible to have each node fit into the space of a half-height CD-ROM drive. Can anyone say "cluster in one box"? 2 ) Has anyone yet mentioned the possibility of mesh networks using 802.11 for robotics clustering? Such networks of robots might make site construction, ship construction, and mining feasible. Mining the surface of the moon is well documented to provide hydrogen, oxygen, aluminum, silica, and titanium. Launching fuel & materials for spacecraft to an orbital construction facility might make more sense than the billions we are spending now, if the mine, transport, and construction are largely carried out by robotics under the oversight of a resident cluster with ground-based monitoring. Using a similar swarm of robots for site construction on mars prior to human arrival can have a major impact on mission success. If all robots use identical motion base and cpu, then 2 broken bots can be cannibalized to return one working bot to service. If all of the robots that are currently recharging batteries are added to the cluster as mains-connected nodes, then a cluster of sorts is in effect to speed control processing of the 'hive'. This is assuming that the central site has the main power supply system online, be it solar, nuc, whatever. -Jim Ahia -makenamicro at charter.net _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jhearns at freesolutions.net Thu Apr 24 10:44:42 2003 From: jhearns at freesolutions.net (John Hearns) Date: 24 Apr 2003 15:44:42 +0100 Subject: beowulf in space In-Reply-To: References: Message-ID: <1051195484.16295.22.camel@harwood.home> On Thu, 2003-04-24 at 10:34, Jim Ahia wrote: > As I was reading this thread, some things came to mind that might add to > the discussion: > > 1 ) although Dells and Gateways are too heavy to lift into orbit, > pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class > motherboards with a single 5v power requirement make things much > smaller. It is completely possible to have each node fit into the space > of a half-height CD-ROM drive. Can anyone say "cluster in one box"? A PC-104 cluster has been constructed at Sandia: http://eri.ca.sandia.gov/eri/howto.html BTW, one of the UK Sunday newspapers recently carried a magazine article on Sandia, and the projects going on there. What an interesting place. I think we've all heard about the fun computing things at Sandia, but this article talked about other things like the very high powered laser they have. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Thu Apr 24 11:28:12 2003 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Thu, 24 Apr 2003 16:28:12 +0100 Subject: Serial Port Concentrators vs. KVMs References: <3EA6DA17.6040100@umsl.edu> <3EA71B17.2010206@tamu.edu> Message-ID: <012d01c30a76$1ebc6e30$04a8a8c0@spot> > > Having used both KVMs and serial port concentrators, I have my own > > opinions about the advantages and disadvantes of each. I was hoping > > that list members might share their opinions as well. Traditionally we have used serial lines for all large clusters, often with a small (say 4- or 8-way) KVM. The KVM gives us access to the management node(s) (and hence various X11 based tools), other servers (e.g. raid controllers, QsNet switches), together with a spare connection or two. The spare connections are used only when we have odd 'problem nodes' or when setting the BIOS to serial for the first time. IMHO Linux based serial line concentrators like Cyclades et al. are particularly easy to set up and maintain. However the current trend is for all new rack-mount nodes to offer some sort of BMC (baseboard management controller) with an ethernet connection. As well as giving remote power cycling, this should allow 'Serial over LAN' - the headnode in the cluster acts as a server and you can telnet/ssh to it on a certain port and hence reach the console on any compute node. As a result, the latest cluster we have got has neither a KVM nor a serial port concentrator. Yours, Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com -------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From cozzi at nd.edu Thu Apr 24 12:41:15 2003 From: cozzi at nd.edu (Marc Cozzi) Date: Thu, 24 Apr 2003 11:41:15 -0500 Subject: Cluster World Expo Message-ID: Someone posted http://www.clusterworldexpo.com here a day or so ago. Have (m)any of you been to the Cluster World Expo before? Any comments on the value of this meeting versus other similar conferences would be appreciated. I've built a few Intel clusters and maintain a few. I would be primarily interested in support tools and cluster software installation tools. It looks like they have BOF sessions that I would most likely benefit from. thanks --marc _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bari at onelabs.com Thu Apr 24 12:57:20 2003 From: bari at onelabs.com (Bari Ari) Date: Thu, 24 Apr 2003 11:57:20 -0500 Subject: back to the issue of cooling References: <3EA854CE@itsnt5.its.uiowa.edu> Message-ID: <3EA81770.9000802@onelabs.com> jbassett wrote: >Transmeta quotes a TDP for their 1-Ghz Crusoe as 7.5 watts > >An Athlon XP at around twice the clock-speed is around 10* that at 75 watts > >but at .05$/kw*h I agree that it is unlikely that you could ever find an >operating cost that would be able to offset the greater cost and slower >performance of the Crusoe.
But the density that you could pack them would be >incredible. If you were running so much cooler that less of a cooling system >investment were required that might change the equation. > >Or if there was ever a need for a highly mobile cluster system. You could pack >a great number into a single box and carry it about and perhaps because in >theory 10 Crusoes would dissipate the heat of a single Athlon you could easily >cool many of them. Joseph Bassett > > > Density is not a problem using 75W Athlon's or Xeon's. You can stuff 8-16 Athlon's or Xeon's into a 16.5" x 25" x 1.7" 1U box. Heat is transferred away from them (cpu, memory, chipset) using a combination of conduction cooling techniques and heat pipes in the case tied to a "heatbus" outside the case. The heatbus can be (depending on the heat generated) a large highly profiled heatsink requiring forced air convection, heat exchange coil (evaporator) or combination of the two. What's more of a limiting factor in tightly packing cpu's is the distance required between cpu and chipset and also chipset to memory that eats up board space. There are very tight PCB routing rules that limit how closely devices can be spaced. 1" - 1.5" min. is common. There's lots of talk about very dense systems but nobody ever really wants them. Everyone wants clusters with COTS motherboards and enclosures. They then rely on forced air cooling through slots in the enclosures and then cool the room air down with A/C. --Bari Ari _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu Apr 24 16:43:49 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 24 Apr 2003 13:43:49 -0700 Subject: beowulf in space In-Reply-To: Message-ID: <5.1.0.14.2.20030424133401.030443d8@mailhost4.jpl.nasa.gov> At 02:34 AM 4/24/2003 -0700, Jim Ahia wrote: >As I was reading this thread, some things came to mind that might add to >the discussion: > >1 ) although Dells and Gateways are too heavy to lift into orbit, >pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class >motherboards with a single 5v power requirement make things much >smaller. It is completely possible to have each node fit into the space >of a half-height CD-ROM drive. Can anyone say "cluster in one box"? I have seen PC104 stuff being used in prototypes, but for space applications, they prefer a more robust packaging. cPCI is showing some signs of popularity, as is the venerable VME. ESA has funded and is flying quite a lot of stuff that is essentially single board computers interconnected with high speed serial links. >2 ) Has anyone yet mentioned the possibility of mesh networks using >802.11 for robotics clustering? Such networks of robots might make site >construction, ship construction, and mining feasible. There is a huge amount of this kind of work going on at JPL: cooperative robotics. Take a look at the JPL planetary robotics web site http://prl.jpl.nasa.gov/ However, to my knowledge, they're not doing much cluster computing. >Mining the surface of the moon is well documented to provide hydrogen, >oxygen, aluminum, silica, and titanium. Uhhhhh... yes, in the sense that the moon is made of rock, which is made of hydrogen, oxygen, aluminum, etc. Turning rock into metal is a non-trivial process, even on Earth where there are literally millenia of history for the process. 
>Using a similar swarm of robots for site construction on mars prior to >human arrival can have a major impact on mission success. > >If all robots use identical motion base and cpu, then 2 broken bots can >be cannibalized to return one working bot to service. Of course, this means that the robots have to be a lot smarter and more capable because not only do they have to do their primary job, they also have to be "dismantleable", and have the ability to dismantle things. While this is feasible, in an abstract sense, it might not be worth it; you might be able to spend the resources you'd spend on providing that additional capability on just buying more simpler robots in the first place.. Hmmm.. kind of like buying a bunch of generic commodity computers instead of one big specialized computer to do a particular job... I'll also point out that it's a pretty big job just to design and build rovers that drive and make a few measurements, much less do construction work, smelt metal, do scrap recovery, etc. The Mars Exploration Rovers are quite capable as far as spacecraft go, but weren't particularly easy or cheap to develop, and are hardly a production line item, nor are they likely to be anytime in the next couple decades. James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Thu Apr 24 17:07:52 2003 From: becker at scyld.com (Donald Becker) Date: Thu, 24 Apr 2003 17:07:52 -0400 (EDT) Subject: Cluster World Expo In-Reply-To: Message-ID: On Thu, 24 Apr 2003, Marc Cozzi wrote: > Someone posted http://www.clusterworldexpo.com here a day > or so ago. > Have (m)any of you been to the Cluster World Expo before? It's a new show combined with an older conference. The focus is more on deployed and end-to-end use of clusters and cluster applications, rather than algorithm or theory oriented conferences. Thanks to the hard work of Adam Goodman, it's shaping up to be a really good event. (Adam is the well-known editor of Linux Magazine -- if you have been to a Linux conference in the U.S., you have seen Adam.) > Any comments on the value of this meeting versus other similar > conferences would be appreciated. I've built a few Intel > clusters and maintain a few. I would be primarily interested > in support tools and cluster software installation tools. > It looks like they have BOF sessions that I would most > likely benefit from. -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 410 Severn Ave. Suite 210 Scyld Beowulf cluster system Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rodmur at maybe.org Thu Apr 24 18:24:44 2003 From: rodmur at maybe.org (Dale Harris) Date: Thu, 24 Apr 2003 15:24:44 -0700 Subject: Serial Port Concentrators vs. 
KVMs In-Reply-To: <012d01c30a76$1ebc6e30$04a8a8c0@spot> References: <3EA6DA17.6040100@umsl.edu> <3EA71B17.2010206@tamu.edu> <012d01c30a76$1ebc6e30$04a8a8c0@spot> Message-ID: <20030424222444.GY9122@maybe.org> On Thu, Apr 24, 2003 at 04:28:12PM +0100, Dan Kidger elucidated: > However the current trend is for all new rack-mount nodes to offer some sort > of BMC (baseboard management controller) with an ethernet connection. As > well as giving remote power cycling, this should allow 'Serial_over_Lan" - Course the thing I wonder about that is then would seem to loose some redundancy, unless you have a separate network setup to run the serial over LAN. Basically stuff like Cyclades is out of band from your administrative network. Course I guess if a switch blows up, there may not be any particular need to access a node. -- Dale Harris rodmur at maybe.org /.-) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Duclam80 at gmx.net Thu Apr 24 19:38:04 2003 From: Duclam80 at gmx.net (Vu Duc Lam) Date: Fri, 25 Apr 2003 06:38:04 +0700 Subject: Mom Config Message-ID: <004f01c30abb$46e300f0$1a3afea9@conan> Hi, I want to install OpenPBS in a cluster with 16 nodes IBM PC and 1 front-end machine HP Server. I want to use FIFO scheduler. So could any one can give me the detail of mom config file, scheduler config file which I can use to configure the cluster. I try serveral times to config these file according to PBS admin administration document but when I submit a Job, I receive a message error:"Job execeeds queue resources limit" although i set a resources resquest for a job as small as posible. Thanks for yours help. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Thu Apr 24 20:35:32 2003 From: raysonlogin at yahoo.com (Rayson Ho) Date: Thu, 24 Apr 2003 17:35:32 -0700 (PDT) Subject: Mom Config In-Reply-To: <004f01c30abb$46e300f0$1a3afea9@conan> Message-ID: <20030425003532.34945.qmail@web11407.mail.yahoo.com> I would strongly suggest any new batch system installations to start with GridEngine. It is "more opensource" than OpenPBS, easier to install, easier to use, more features, and more friendly developers. http://gridengine.sunsource.net/ If you are in doubt, read the thread "sun grid engine?": http://www.beowulf.org/pipermail/beowulf/2003-March/date.html Rayson --- Vu Duc Lam wrote: > Hi, > > I want to install OpenPBS in a cluster with 16 nodes IBM PC and 1 > front-end > machine HP Server. I want to use FIFO scheduler. So could any one can > give > me the detail of mom config file, scheduler config file which I can > use to > configure the cluster. I try serveral times to config these file > according > to PBS admin administration document but when I submit a Job, I > receive a > message error:"Job execeeds queue resources limit" although i set a > resources resquest for a job as small as posible. Thanks for yours > help. > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. 
Bingo http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From m053546 at usna.edu Thu Apr 24 20:48:42 2003 From: m053546 at usna.edu (MIDN Sean Jones) Date: 24 Apr 2003 20:48:42 -0400 Subject: beowulf in space In-Reply-To: References: Message-ID: <1051231726.1859.9.camel@Eagle.mid4.usna.edu> For reference the United States Naval Academy is putting up a PowerPC 405 SoC in PC/104 form factor up as the Command and Data Handling System of the MidSTAR I satellite slated for launch in March 2006. Sean Jones MIDN USN MidSTAR C&DH Lead Armada Cluster Asst. Admin On Thu, 2003-04-24 at 05:34, Jim Ahia wrote: > As I was reading this thread, some things came to mind that might add to > the discussion: > > 1 ) although Dells and Gateways are too heavy to lift into orbit, > pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class > motherboards with a single 5v power requirement make things much > smaller. It is completely possible to have each node fit into the space > of a half-height CD-ROM drive. Can anyone say "cluster in one box"? > > 2 ) Has anyone yet mentioned the possibility of mesh networks using > 802.11 for robotics clustering? Such networks of robots might make site > construction, ship construction, and mining feasible. > > Mining the surface of the moon is well documented to provide hydrogen, > oxygen, aluminum, silica, and titanium. Launching fuel & materials for > spacecraft to an orbital construction facility might make more sense > than the billions we are spending now, if the mine, transport, and > construction are largely carried out by robotics under the oversight of > a resident cluster with ground-based monitoring. > > Using a similar swarm of robots for site construction on mars prior to > human arrival can have a major impact on mission success. > > If all robots use identical motion base and cpu, then 2 broken bots can > be cannibalized to return one working bot to service. > > If all of the robots that are currently recharging batteries are added > to the cluster as mains-connected nodes, then a cluster of sorts is in > effect to speed control processing of the 'hive'. This is assuming that > the central site has the main power supply system online, be it solar, > nuc, whatever. 
> > -Jim Ahia > -makenamicro at charter.net > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- ============================================================================== /\ | Sean Jones / \ _ __ __| __ MIDN USN /====\ |/ \ /\/\ / | / | / | m053546 at usna.edu / \ | | | \_/| \_/| \_/| United States Naval Academy Annapolis, MD 21412 ============================================================================== _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dvancon at alineos.com Fri Apr 25 03:28:01 2003 From: dvancon at alineos.com (=?ISO-8859-1?Q?Dominique_Van=E7on?=) Date: Fri, 25 Apr 2003 09:28:01 +0200 Subject: AMD Opteron benchmarks Message-ID: <3EA8E381.6020202@alineos.com> Hi All, we performed some benchmarks on AMD Opteron 1400 (also Intel XEON, Itanium2 900, AMD MP, Apple and Alpha processors) : http://www.alineos.com/benchs_eng.html We also could make some comments about these tests, so feel free to contact. -- Dominique Van?on | http://www.alineos.com mailto:dvancon at alineos.com | tel/fax +33 1 64 78 57 65/66 ALINEOS SA, 14 bis rue du Mar?chal Foch, F-77780 Bourron Marlotte France _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Fri Apr 25 03:59:59 2003 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Fri, 25 Apr 2003 08:59:59 +0100 Subject: Serial Port Concentrators vs. KVMs References: <3EA6DA17.6040100@umsl.edu> <3EA71B17.2010206@tamu.edu> <012d01c30a76$1ebc6e30$04a8a8c0@spot> <20030424222444.GY9122@maybe.org> Message-ID: <018201c30b01$65998aa0$04a8a8c0@spot> > On Thu, Apr 24, 2003 at 04:28:12PM +0100, Dan Kidger elucidated: > > However the current trend is for all new rack-mount nodes to offer some sort > > of BMC (baseboard management controller) with an ethernet connection. As > > well as giving remote power cycling, this should allow 'Serial_over_Lan" - > > Course the thing I wonder about that is then would seem to loose some > redundancy, unless you have a separate network setup to run the serial > over LAN. Basically stuff like Cyclades is out of band from your > administrative network. Course I guess if a switch blows up, there may > not be any particular need to access a node. You do not necessirly lose any reduncancy.. Compaq nodes have a extra ethernet socket for the BMC (Hence serial_over_lan). You then can use a seperate ethernet hub in place of the Cyclades, which is of course much cheaper (you could even recycle an old 10Mbit hub from the cupboard). Alternatively the Intel MoBo's like E7501 hijack the same physical ethernet socket for the BMC. This may lose a little redundancy (if for example a cable falls out), but halves the amount of cat5 cabling needed Yours, Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd. 
daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com -------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jahia at mail.umesd.k12.or.us Fri Apr 25 01:31:21 2003 From: jahia at mail.umesd.k12.or.us (Jim Ahia) Date: Thu, 24 Apr 2003 22:31:21 -0700 Subject: beowulf in space Message-ID: So the radiation concerns with rad-hardened computer equipment are not as much of a problem once clear of the Van Allen Radiation Belt? How does this affect the space station and the planned missions to mars? What about the lunar environment? I admit to having a lot of ignorance on this subject, but I am concerned because part of my college project is for robotic teams to do excavation / mining using a "hive" concept. The end result is to get more information on the challenges that will be faced by the robotic workers that are eventually sent to the moon first, and to mars second. I am not speaking about the exploration missions by NASA, but rather the much-farther-down-the-road commercial mining interests that will want to build a foundry on the moon and a spacedock in earth orbit prior to the big colonization push into our solar system. I believe it is going to happen someday, because we already know that eventually our sun will go nova and earth will be no longer habitable. Sooner or later mankind, if it is to survive, will need to undergo some kind of diaspora and migrate out into space. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Apr 25 00:32:13 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Thu, 24 Apr 2003 23:32:13 -0500 Subject: beowulf in space In-Reply-To: <1051231726.1859.9.camel@Eagle.mid4.usna.edu> References: <1051231726.1859.9.camel@Eagle.mid4.usna.edu> Message-ID: <3EA8BA4D.9070104@tamu.edu> One of the things I established when I was working on the old Space Station Freedon, in the early '90s, is that the space-rated CPUs, less the issues with radiation hardening and single-event upset recovery, were hardly different from good CPUs. What we discovered was that the MIL-SPEC components differed little from the "industrial-grade" components, save in the degree of paperwork delivered with the device. And the costs. Thus, we drove toward the use of the lower-cost, similar quality Industrial-Grade devices. Now, for low- and mid-earth-orbit altitudes, the radiation environment is pretty harsh. One should be cognizant of that environment, and model the potential for radiation induced transient problems. If you're not ready for transient failures, and at that, failures that may or may not heal (aneal), you shouldn't use non-radiation hardened, commercial, processors. I've not looked at the specs for rad-hardening and SEU performance. If it's a commercial- as opposed to an industrial-grade processor, I'd not be too sure of reliability, either, although those specs have come up markedly over the last 10 years. gerry MIDN Sean Jones wrote: > For reference the United States Naval Academy is putting up a PowerPC > 405 SoC in PC/104 form factor up as the Command and Data Handling System > of the MidSTAR I satellite slated for launch in March 2006. 
> > Sean Jones > MIDN USN > > MidSTAR C&DH Lead > Armada Cluster Asst. Admin > > On Thu, 2003-04-24 at 05:34, Jim Ahia wrote: > >>As I was reading this thread, some things came to mind that might add to >>the discussion: >> >>1 ) although Dells and Gateways are too heavy to lift into orbit, >>pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class >>motherboards with a single 5v power requirement make things much >>smaller. It is completely possible to have each node fit into the space >>of a half-height CD-ROM drive. Can anyone say "cluster in one box"? >> >>2 ) Has anyone yet mentioned the possibility of mesh networks using >>802.11 for robotics clustering? Such networks of robots might make site >>construction, ship construction, and mining feasible. >> >>Mining the surface of the moon is well documented to provide hydrogen, >>oxygen, aluminum, silica, and titanium. Launching fuel & materials for >>spacecraft to an orbital construction facility might make more sense >>than the billions we are spending now, if the mine, transport, and >>construction are largely carried out by robotics under the oversight of >>a resident cluster with ground-based monitoring. >> >>Using a similar swarm of robots for site construction on mars prior to >>human arrival can have a major impact on mission success. >> >>If all robots use identical motion base and cpu, then 2 broken bots can >>be cannibalized to return one working bot to service. >> >>If all of the robots that are currently recharging batteries are added >>to the cluster as mains-connected nodes, then a cluster of sorts is in >>effect to speed control processing of the 'hive'. This is assuming that >>the central site has the main power supply system online, be it solar, >>nuc, whatever. >> >>-Jim Ahia >>-makenamicro at charter.net >>_______________________________________________ >>Beowulf mailing list, Beowulf at beowulf.org >>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf >> -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 25 10:14:30 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 25 Apr 2003 10:14:30 -0400 (EDT) Subject: beowulf in space In-Reply-To: Message-ID: On Thu, 24 Apr 2003, Jim Ahia wrote: > So the radiation concerns with rad-hardened computer equipment are not > as much of a problem once clear of the Van Allen Radiation Belt? How >From what I've read, the main concern is solar activity. The sun can relatively suddenly decide to spew significantly higher levels of radiation our way. When this happens it least appears that space can be quite dangerous, and it can actually mess up the ionosphere and radiotransmission all the way down here. We lack an adequate baseline for proper measurement and comparison or prediction, but it wouldn't horribly surprise me if at least some events get to the point where radiation levels on the surface reach mutogenetic levels. An "interesting" possibility that might explain the relatively sudden emergence of new species, for example. However, the ones we've seen are enough to confirm the potential risk. 
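Gerry's suggestion a few messages back to "model the potential for radiation induced transient problems" can be made roughly quantitative with a back-of-the-envelope Poisson estimate. A minimal sketch in Python; the per-node upset rate below is a placeholder invented purely for illustration, since real rates depend on orbit, shielding, process and solar activity:

    # Toy single-event-upset (SEU) model for a small unhardened cluster,
    # treating upsets as a Poisson process.  The rate is hypothetical.
    import math

    upsets_per_node_day = 0.01   # ASSUMED: one upset per node per 100 days
    nodes = 16
    mission_days = 365

    expected = upsets_per_node_day * nodes * mission_days  # Poisson mean
    p_clean = math.exp(-expected)   # probability of zero upsets all mission

    print("expected upsets over the mission: %.1f" % expected)
    print("probability of zero upsets      : %.2e" % p_clean)
    # Even a modest per-node rate makes cluster-wide upsets a near
    # certainty, so checkpointing and task retry matter at least as
    # much as the hardening of any single node.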
> I am not speaking about the exploration missions by NASA, but rather > the much-farther-down-the-road commercial mining interests that will > want to build a foundry on the moon and a spacedock in earth orbit prior > to the big colonization push into our solar system. I believe it is > going to happen someday, because we already know that eventually our sun > will go nova and earth will be no longer habitable. Sooner or later > mankind, if it is to survive, will need to undergo some kind of diaspora > and migrate out into space. Ah, a far thinking person, I see. Time to start planning for the big implosion already? Mankind will not survive. Whatever it is that is around when the sun goes nova (if anything) will resemble man as man resembles small furry rodents, at least if they are our descendants. The event isn't due for a rather long time;-) rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dtj at uberh4x0r.org Fri Apr 25 11:05:07 2003 From: dtj at uberh4x0r.org (Dean Johnson) Date: 25 Apr 2003 10:05:07 -0500 Subject: beowulf in space In-Reply-To: References: Message-ID: <1051283107.27185.12.camel@terra> On Fri, 2003-04-25 at 00:31, Jim Ahia wrote: > So the radiation concerns with rad-hardened computer equipment are not > as much of a problem once clear of the Van Allen Radiation Belt? How > does this affect the space station and the planned missions to mars? > What about the lunar environment? I admit to having a lot of ignorance > on this subject, but I am concerned because part of my college project > is for robotic teams to do excavation / mining using a "hive" concept. > The end result is to get more information on the challenges that will be > faced by the robotic workers that are eventually sent to the moon first, > and to mars second. > > I am not speaking about the exploration missions by NASA, but rather > the much-farther-down-the-road commercial mining interests that will > want to build a foundry on the moon and a spacedock in earth orbit prior > to the big colonization push into our solar system. I believe it is > going to happen someday, because we already know that eventually our sun > will go nova and earth will be no longer habitable. Sooner or later > mankind, if it is to survive, will need to undergo some kind of diaspora > and migrate out into space. > In terms of setting up mining operations on other celestial bodies, cpu stability and radiation protection are amongst the least of your worries. What will be needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical issues are largely tractable, one way or another, but the institutional and international issues will be like herding cats on crack. -Dean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From dtj at uberh4x0r.org Fri Apr 25 11:07:30 2003 From: dtj at uberh4x0r.org (Dean Johnson) Date: 25 Apr 2003 10:07:30 -0500 Subject: beowulf in space In-Reply-To: References: Message-ID: <1051283250.27174.14.camel@terra> On Fri, 2003-04-25 at 09:14, Robert G. Brown wrote: > Mankind will not survive. 
Whatever it is that is around when the sun > goes nova (if anything) will resemble man as man resembles small furry > rodents, at least if they are our descendants. I resemble that remark! Okay, maybe a big furry rodent. -- -Dean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From edwardsa at plk.af.mil Fri Apr 25 11:03:05 2003 From: edwardsa at plk.af.mil (Art Edwards) Date: Fri, 25 Apr 2003 09:03:05 -0600 Subject: beowulf in space In-Reply-To: <3EA8BA4D.9070104@tamu.edu> References: <1051231726.1859.9.camel@Eagle.mid4.usna.edu> <3EA8BA4D.9070104@tamu.edu> Message-ID: <20030425150305.GB21431@plk.af.mil> I should also mention that there is a very large drive for radiation hardening by design. There is work in the literature indicating that device redesign (annular transistors, for example, to handle total dose effects, and circuit redesign for SEU, SET) can lead to strategic hardening from commercial foundries. This is for digital logic circuits. The disadvantage is that these design techniques always lead to degradation of circuit density and of performance. However, cost should be dramatically improved. An issue brought up by Jim Lyke but not addressed elsewhere is cooling. Recall that clusters generate lots of heat and that we use convection to transfer it. In space there is either radiation or conduction. This has to be a major focus for compact clusters that will go in space. Art Edwards On Thu, Apr 24, 2003 at 11:32:13PM -0500, Gerry Creager N5JXS wrote: > One of the things I established when I was working on the old Space > Station Freedon, in the early '90s, is that the space-rated CPUs, less > the issues with radiation hardening and single-event upset recovery, > were hardly different from good CPUs. What we discovered was that the > MIL-SPEC components differed little from the "industrial-grade" > components, save in the degree of paperwork delivered with the device. > And the costs. Thus, we drove toward the use of the lower-cost, similar > quality Industrial-Grade devices. > > Now, for low- and mid-earth-orbit altitudes, the radiation environment > is pretty harsh. One should be cognizant of that environment, and model > the potential for radiation induced transient problems. If you're not > ready for transient failures, and at that, failures that may or may not > heal (aneal), you shouldn't use non-radiation hardened, commercial, > processors. > > I've not looked at the specs for rad-hardening and SEU performance. If > it's a commercial- as opposed to an industrial-grade processor, I'd not > be too sure of reliability, either, although those specs have come up > markedly over the last 10 years. > > gerry > > MIDN Sean Jones wrote: > >For reference the United States Naval Academy is putting up a PowerPC > >405 SoC in PC/104 form factor up as the Command and Data Handling System > >of the MidSTAR I satellite slated for launch in March 2006. > > > >Sean Jones > >MIDN USN > > > >MidSTAR C&DH Lead > >Armada Cluster Asst. Admin > > > >On Thu, 2003-04-24 at 05:34, Jim Ahia wrote: > > > >>As I was reading this thread, some things came to mind that might add to > >>the discussion: > >> > >>1 ) although Dells and Gateways are too heavy to lift into orbit, > >>pc-104 systems might be the solution. 3.6 x 3.8 inch pentium-class > >>motherboards with a single 5v power requirement make things much > >>smaller. 
It is completely possible to have each node fit into the space > >>of a half-height CD-ROM drive. Can anyone say "cluster in one box"? > >> > >>2 ) Has anyone yet mentioned the possibility of mesh networks using > >>802.11 for robotics clustering? Such networks of robots might make site > >>construction, ship construction, and mining feasible. > >> > >>Mining the surface of the moon is well documented to provide hydrogen, > >>oxygen, aluminum, silica, and titanium. Launching fuel & materials for > >>spacecraft to an orbital construction facility might make more sense > >>than the billions we are spending now, if the mine, transport, and > >>construction are largely carried out by robotics under the oversight of > >>a resident cluster with ground-based monitoring. > >> > >>Using a similar swarm of robots for site construction on mars prior to > >>human arrival can have a major impact on mission success. > >> > >>If all robots use identical motion base and cpu, then 2 broken bots can > >>be cannibalized to return one working bot to service. > >> > >>If all of the robots that are currently recharging batteries are added > >>to the cluster as mains-connected nodes, then a cluster of sorts is in > >>effect to speed control processing of the 'hive'. This is assuming that > >>the central site has the main power supply system online, be it solar, > >>nuc, whatever. > >> > >>-Jim Ahia > >>-makenamicro at charter.net > >>_______________________________________________ > >>Beowulf mailing list, Beowulf at beowulf.org > >>To change your subscription (digest mode or unsubscribe) visit > >>http://www.beowulf.org/mailman/listinfo/beowulf > >> > > -- > Gerry Creager -- gerry.creager at tamu.edu > Network Engineering -- AATLT, Texas A&M University > Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 > Page: 979.228.0173 > Office: 903A Eller Bldg, TAMU, College Station, TX 77843 > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Art Edwards Senior Research Physicist Air Force Research Laboratory Electronics Foundations Branch KAFB, New Mexico (505) 853-6042 (v) (505) 846-2290 (f) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Apr 25 12:21:27 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 25 Apr 2003 12:21:27 -0400 (EDT) Subject: beowulf in space In-Reply-To: <1051283107.27185.12.camel@terra> Message-ID: On 25 Apr 2003, Dean Johnson wrote: > In terms of setting up mining operations on other celestial bodies, cpu stability > and radiation protection are amongst the least of your worries. What will be > needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical > issues are largely tractable, one way or another, but the institutional and > international issues will be like herding cats on crack. Well, there are also the umm, "economic" issues as well. As in no matter what you do, no matter what you say, the theoretical minimum cost of lifting something out of the earth's gravity well is on the order of a 100 megajoules per kilogram (mgR_earth is "escape energy"). Ignoring things like the 2nd law, call it a first-law cost of a buck purchased as raw electricity at commercial rates. 
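The arithmetic behind that "cost of a buck" figure is easy to reproduce. A minimal sketch in Python; the $0.05/kWh electricity price is my own reading of "commercial rates", borrowed from the cooling thread earlier in this digest:

    # First-law cost of escape energy per kilogram, E = m * g * R_earth.
    g = 9.81              # m/s^2
    R_earth = 6.371e6     # m
    price_per_kwh = 0.05  # ASSUMED commercial electricity rate, $/kWh

    energy_j = g * R_earth         # J per kg, ~6.2e7 J ("order of 100 MJ")
    energy_kwh = energy_j / 3.6e6  # 1 kWh = 3.6e6 J
    cost = energy_kwh * price_per_kwh

    print("escape energy : %.0f MJ/kg" % (energy_j / 1e6))
    print("              = %.1f kWh/kg" % energy_kwh)
    print("first-law cost: $%.2f/kg" % cost)
    # ~62 MJ/kg, about 17 kWh, roughly $0.87/kg -- "a buck" -- before any
    # of the second-law and engineering losses discussed next.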
However, using rockets to provide lift, the net 2nd law efficiency is some appallingly low number, as one has to lift the fuel to lift the fuel to lift the fuel to lift the payload, and then there are things like drag forces and the fact that failure has a very high cost so everything is overengineered, and the fact that you have to build a REALLY BIG vehicle to deliver a REALLY SMALL payload, which kicks in several orders of magnitude in cost (like 5?). As in, we are never going to "explore space" on chemical rockets more than slowly and infrequently, period, at $100K/kg. We can't afford it. Sorry, that's just a fact and I don't see it changing, not even if we figure out how to make electricity from fusion reactors and drop the fuel costs. Electromagnet mass drivers (a la Heinlein) would get rid of the lifting of reaction mass/fuel problem (which would make a BIG difference) but leaves you with lots of OTHER problems (like accelerating something to order of 10 km/sec against drag forces and without exceeding (say) 3-4 g's or cooking the contents with eddy currents, punching it through the thicker lower atmosphere against nonlinear turbulence that makes the stuff that ripped up the space shuttle seem like kid's stuff, and more). This approach would require a huge capital investment, new technologies galore (and maybe a bit of new physics), and might not ever work. We could spend a significant fraction of a terabuck just finding out. IF it worked, though, it could reduce the cost (ignoring the amortization of the initial investment, which was a mostly-ignored hundreds of gigabucks for chemical rockets and NASA as well, truth be told) to perhaps $100/kg, which is at least in the not-completely-insane range (assuming that one could achieve 1% efficiency, which is open to doubt). I see nobody designing earth-orbit mass drivers. I see little serious investment in the entire concept (although my physics students love the idea and regularly do exam questions on it:-). Until somebody does, the space program will be restricted to rare manned big ticket "exploration of space" trips and lots of unmanned earth orbit flights with a predictable economic payoff (weather and comm and military satellites). This goes for the indefinite future. Just buying the fuel to fill a shuttle mission has to cost a literally insane sum compared to the weight of the orbital payload, and the cost of that energy (viewed as energy) literally defines the value of money and cannot ever become "cheap". So sorry, although I >>love<< space exploration and have read a signficant fraction of all science fiction, the tragic thing about being a physicist is one has to really work to suspend that disbelief thing when one can do the math. Look on the bright side. With mass drivers it at least >>is<< feasible to contemplate exploiting here to the moon, maybe even near solar system (although the cost issue starts creeping in again when you get outside the moon, as do lots of other things like time of travel). Until you hear of work being done on them (with some degree of success) then forget mining the moon. They may well have to wait on other breakthroughs, as well (like viable hi-T superconductors) -- in addition to atmospheric drag forces and friction, there are eddy currents to consider, where resistance in the lifted shell is a "bad thing". rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Apr 25 08:16:38 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 25 Apr 2003 07:16:38 -0500 Subject: beowulf in space In-Reply-To: References: Message-ID: <3EA92726.801@tamu.edu> The Van Allen belts are not a continuum of radiation that you see as a "shell", but rather, are modulated by the solar environment and the earth's own geomagnetic environment. One might look at the variations in geomagnetic potential and gravetic potential for the earth at its surface, and some projections (and measurements) in low earth orbit. The possibility of an increased ion/particle based event is higher when passing through the "belts" than when outside of them. Thus, there is concern. There tends to be a higher concentration of radiation environments ("belts") in the mid-earth-orbit range (1500-5000km) than at lower altitudes, but the concentration of particles is higher in the lower orbits within the belts. So, it's a catch-22. I might add, it's hard to explain this, as there's not a board handy I can drawon, and I can't be seen waving my hands. Then there's the issue of adequate coffee levels... In the ISS, there are "safe haven" areas with more shielding than other parts of the station. In the event of a strong solar event, the crew could be ordered into the safe haven area for a period of time. Or, they could be ordered to prepare for evacuation via Soyuz. Medical planning for a Mars mission was problemmatic when I was at NASA. The concept of a safe haven has to be considered as the potential for a solar storm is non-trivial during a mission transit of the necessary duration, and theweight penalty for such a safe haven area is very great. Addition of multiple layers of heavy metal, which might mitigate some of the ion transitions has its own drawbacks, as mentioned earlier. Layering of heavy metals and differing dense materials is one path that's been evaluated. Let's add to the discussion: There are conditions where the human machine will self-repair better than silicon or germanium... or silicon-on-saphire, or, pick your substrate. In these cases, we have to protect the computers more. However, the reverse can also be true, if the hit is a soft, but fairly frequent set of radiation hits: The machine sees these as single event upsets, while the human could well see them as enough ionizing radiation to modify the immune system, the neurological system or synapses. So protecting hardware _and_ liveware becomes an important, and difficult task. And there should be no separation of NASA and follow-on commercial concerns: These concerns should be echoed all down the chain, because the concept of protecting the hardware and the liveware isn't something NASA does to ramp up the costs. I feel obligated to note that there was a Shuttle mission several years ago, where the crew were exposed to a sudden and unanticipated solar event. They were placed in safe-haven in the airlock for a period of a couple of hours until the majority of the event had passed. THere was inadequate time to de-orbet and protect them via the atmosphere. All dosimeters were over normal exposure limits. 
Certain post-flight medical recommendations were made with regard to the potential for reproductive health, and they were subjected to more, and longer follow-up medically than other crews. I'm not aware of any lasting consequences... but then, if I were, I'd probably have violated some of the confidentiality tenets. The story here is simple, though: Solar events happen, sometimes unexpectedly. When they do, some contingency planning must be on hand to rapidly (or by design) protect the liveware and hardware, or something may get damaged. It's really hard to make a service call to a broken device when it's vertical offset is 250 or so km, and its delta-V is over 10m/s... gerry Jim Ahia wrote: > So the radiation concerns with rad-hardened computer equipment are not > as much of a problem once clear of the Van Allen Radiation Belt? How > does this affect the space station and the planned missions to mars? > What about the lunar environment? I admit to having a lot of ignorance > on this subject, but I am concerned because part of my college project > is for robotic teams to do excavation / mining using a "hive" concept. > The end result is to get more information on the challenges that will be > faced by the robotic workers that are eventually sent to the moon first, > and to mars second. > > I am not speaking about the exploration missions by NASA, but rather > the much-farther-down-the-road commercial mining interests that will > want to build a foundry on the moon and a spacedock in earth orbit prior > to the big colonization push into our solar system. I believe it is > going to happen someday, because we already know that eventually our sun > will go nova and earth will be no longer habitable. Sooner or later > mankind, if it is to survive, will need to undergo some kind of diaspora > and migrate out into space. -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Fri Apr 25 12:43:35 2003 From: becker at scyld.com (Donald Becker) Date: Fri, 25 Apr 2003 12:43:35 -0400 (EDT) Subject: Serial Port Concentrators vs. KVMs In-Reply-To: <018201c30b01$65998aa0$04a8a8c0@spot> Message-ID: On Fri, 25 Apr 2003, Dan Kidger wrote: > > On Thu, Apr 24, 2003 at 04:28:12PM +0100, Dan Kidger elucidated: > > > However the current trend is for all new rack-mount nodes to offer some > sort > > > of BMC (baseboard management controller) with an ethernet connection. As > > > well as giving remote power cycling, this should allow > 'Serial_over_Lan" - > > > > Course the thing I wonder about that is then would seem to loose some > > redundancy ... > > Course I guess if a switch blows up, there may > > not be any particular need to access a node. That's the key idea: by reducing the cable count, you reduce the number of things that can wrong. And if the communications to the node is down, there is little point in having anything else working. > You do not necessirly lose any reduncancy.. > Compaq nodes have a extra ethernet socket for the BMC (Hence > serial_over_lan). ... > Alternatively the Intel MoBo's like E7501 hijack the same physical ethernet > socket for the BMC. 
The different approach is determined by the Ethernet chip is in use. A special NIC design is required to transparently piggyback management traffic on the main network channel, and to continue to do so when the main system is powered off. The selection is pretty much limited to a few 10/100 chips from 3Com and Intel. If the system board uses a gigabit NIC, the BMC has to have its own network connection. An second Ethernet network is far less expensive, more reliable and easier to diagnose than a KVM or serial setup. While I prefer having only two cables, power and one Cat5, rather than three, even three is no comparison to the complexity of other solutions. > This may lose a little redundancy (if for example a > cable falls out), but halves the amount of cat5 cabling needed This is a good example of how fewer cables doesn't lose redundancy, it decreases the points of failure and increases the reliability. -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 410 Severn Ave. Suite 210 Scyld Beowulf cluster system Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From math at velocet.ca Fri Apr 25 15:18:00 2003 From: math at velocet.ca (Ken Chase) Date: Fri, 25 Apr 2003 15:18:00 -0400 Subject: back to the issue of cooling In-Reply-To: <3EA7715A.44EFAA24@andorra.ad>; from award@andorra.ad on Thu, Apr 24, 2003 at 07:08:42AM +0200 References: <3EA7715A.44EFAA24@andorra.ad> Message-ID: <20030425151800.A69860@velocet.ca> On Thu, Apr 24, 2003 at 07:08:42AM +0200, Alan Ward's all... >I tend to think Transmeta and other low-power CPUs belong on the >desktop, so you can run them without the noisy fans (and they don't >heat up the air). make them diskless and you have no moving parts - great for remote machines that are mission critical but you cant get to them quick for repair. Im gonna stick my firewall on one of these, hide it in the closet, and boot it off my desktop cuz the $WIFE "doesnt want to see anymore bloody computers in the house!" /kc _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at math.ucdavis.edu Fri Apr 25 17:07:29 2003 From: bill at math.ucdavis.edu (Bill Broadley) Date: Fri, 25 Apr 2003 14:07:29 -0700 Subject: AMD Opteron benchmarks In-Reply-To: <3EA8E381.6020202@alineos.com> References: <3EA8E381.6020202@alineos.com> Message-ID: <20030425210729.GA12550@sphere.math.ucdavis.edu> On Fri, Apr 25, 2003 at 09:28:01AM +0200, Dominique Van?on wrote: > Hi All, > we performed some benchmarks on AMD Opteron 1400 (also Intel XEON, > Itanium2 900, AMD MP, Apple and Alpha processors) : > http://www.alineos.com/benchs_eng.html > We also could make some comments about these tests, so feel free to > contact. Interesting. What were the exact configurations of the hardware? Unlike most other hardware the opterons can be significantly slower based on the configuration. Apparently there is a shortage of PC2700 ECC Registered memory. Did your test opterons have PC2100 or PC2700? Did each opteron have 2 matched dimms (so 4 for a dual cpu)? I've seen duals with only 1 of the 2 memory banks populated. Why did you use Intel's compiler for the Xeons, but gcc for the Opterons? 
-- Bill Broadley Mathematics UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Apr 25 17:27:29 2003 From: gerry.creager at tamu.edu (Gerry Creager) Date: Fri, 25 Apr 2003 16:27:29 -0500 Subject: beowulf in space References: <1051283107.27185.12.camel@terra> Message-ID: <3EA9A841.9030302@tamu.edu> Layers 8 & 9 (fiscal/political) of the ISO model? gerry Dean Johnson wrote: > On Fri, 2003-04-25 at 00:31, Jim Ahia wrote: > > In terms of setting up mining operations on other celestial bodies, cpu stability > and radiation protection are amongst the least of your worries. What will be > needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical > issues are largely tractable, one way or another, but the institutional and > international issues will be like herding cats on crack. -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Office: 979.458.4020 FAX: 979.847.8578 Cell: 979.229.5301 Pager: 979.228.0173 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Fri Apr 25 18:15:44 2003 From: landman at scalableinformatics.com (Joseph Landman) Date: 25 Apr 2003 18:15:44 -0400 Subject: AMD Opteron benchmarks In-Reply-To: <20030425210729.GA12550@sphere.math.ucdavis.edu> References: <3EA8E381.6020202@alineos.com> <20030425210729.GA12550@sphere.math.ucdavis.edu> Message-ID: <1051308944.2898.15.camel@protein.scalableinformatics.com> I'd be curious to see the Intel compiled code run on the Opteron, and the gcc compiled code run on the Xeon. The BLAST results were quite suprising, so I would like to see if that is the identical binary on both systems, and if so, what is the config of each. On Fri, 2003-04-25 at 17:07, Bill Broadley wrote: > On Fri, Apr 25, 2003 at 09:28:01AM +0200, Dominique Van?on wrote: > > Hi All, > > we performed some benchmarks on AMD Opteron 1400 (also Intel XEON, > > Itanium2 900, AMD MP, Apple and Alpha processors) : > > http://www.alineos.com/benchs_eng.html > > We also could make some comments about these tests, so feel free to > > contact. > > Interesting. What were the exact configurations of the hardware? > Unlike most other hardware the opterons can be significantly slower > based on the configuration. > > Apparently there is a shortage of PC2700 ECC Registered memory. Did your > test opterons have PC2100 or PC2700? > > Did each opteron have 2 matched dimms (so 4 for a dual cpu)? I've seen > duals with only 1 of the 2 memory banks populated. > > Why did you use Intel's compiler for the Xeons, but gcc for the Opterons? -- Joseph Landman _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Sat Apr 26 07:40:19 2003 From: astroguy at bellsouth.net (astroguy at bellsouth.net) Date: Sat, 26 Apr 2003 7:40:19 -0400 Subject: OT warning...Re: beowulf in space, now wandering far afield... 
Message-ID: <20030426114019.YYXZ1247.imf54bis.bellsouth.net@mail.bellsouth.net> > > From: Gerry Creager N5JXS > Date: 2003/04/26 Sat AM 01:55:02 EDT > To: astroguy > CC: beowulf at beowulf.org > Subject: OT warning...Re: beowulf in space, now wandering far afield... > > I can't conceive of a reason to locate a cluster in the vicinity of > CHernobyl. As you note, the death toll continues to climb, and the > mutation rate is non-trivial among the retilian population. The cancer > rate is considerably higher than background, as well. > > However, parking a cluster under the sarcophogus doesn't strike me as > adding anything to the mass of knowledge, nor is there too much I can > think of, from a research perspective, that'd require or benefit from > on-site cluster computations. > > I suspect one reason no one is wanting to talk about Chernobyl is that > it occurred so long ago, at least in American terms, that it's ancient > history. We know what caused it (carelessness) and we know a lot of > damage was wrought. Right now, aside from documenting mutation and > cancer rates, neither of which requires massively parallel applications, > there's some interesting structural engineering data to be gleaned from > the concrete sarcophygus (overshroud of reinforced concrete that's > decaying at a pretty amazing rate. But, I think I'd to my assessments > on-site, then go home to decontaminate and run my datasets. > > gerry > > astroguy wrote: > > Gerry Creager wrote: > > > > > >>Layers 8 & 9 (fiscal/political) of the ISO model? > >> > >>gerry > >> > >>Dean Johnson wrote: > >> > >>>On Fri, 2003-04-25 at 00:31, Jim Ahia wrote: > >>> > >>>In terms of setting up mining operations on other celestial bodies, cpu stability > >>>and radiation protection are amongst the least of your worries. What will be > >>>needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical > >>>issues are largely tractable, one way or another, but the institutional and > >>>international issues will be like herding cats on crack. > >> > >>-- > >>Gerry Creager -- gerry.creager at tamu.edu > >>Network Engineering -- AATLT, Texas A&M University > >>Office: 979.458.4020 FAX: 979.847.8578 > >>Cell: 979.229.5301 Pager: 979.228.0173 > >> > >>_______________________________________________ > >>Beowulf mailing list, Beowulf at beowulf.org > >>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > Hi Gerry, > > Since you asked... not sure what distro of ISO we may need but there is a lot > > of work yet here on Terra fir ma... one area of convergence that the Beowolf > > in space might hold some measure of promise in synergistic high radiation > > environment would be, at least in my mind, the site of Chernobyl that no one > > is very keen to talk about but all are certanly aware it is a site that must be revisited... in the death toll > > even greater that our 9/11... and their Russian firefighters are still adding their heroic numbers to the list... > > Who are we to ask them to sacrifice more than they have already... But I think all agree we have a daunting and > > serious job to do in a very difficult almost space like environment. > > . Crazy Russians a little nuts but ya just got to love'em > > Just posting, thanks for list indulgence > > c.clary > > spartan sys. 
> > po box 1515 > > spartanburg, sc 29304-0243 > > > > fax# (801) 858-2722 > > -- > Gerry Creager -- gerry.creager at tamu.edu > Network Engineering -- AATLT, Texas A&M University > Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 As I recall we were speaking of diplomacy and high radiation... it just seems that a bot that could function would because of mission demand to function in an actual worker labor intensive multi functional capacity... visual demands alone of simply walking or rolling are considerable in any independent fashion... basic demands beyond the simple robotic independent of a tether leash and logic demands of simply picking up a piece of pipe... even to see the pipe and understand and distinguish a difference from a wooden broom on the floor are considerable and daunting task that have eluded our top engineers to date... as from the inception from this debate I hold to Dr.Browns position.... Lots of work to be done yet on earth before we might place pie in the sky theoretical magnetic space drives and magic space monkey's into space... some call proof of concept or failure analysis... It to me is basic foundation groundwork that we build the steps before we just rocket into heaven! or hell. c.clary spartan sys po box 1515 spartanburg, sc 29304-0243 PS sorry to poop on the party... but beyond this whist of some H.G.Wells there is a ton of real work attached to not only theory but test... test and more test. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Sat Apr 26 01:55:02 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Sat, 26 Apr 2003 00:55:02 -0500 Subject: OT warning...Re: beowulf in space, now wandering far afield... In-Reply-To: <3EAA0A33.D7D7BD0A@bellsouth.net> References: <1051283107.27185.12.camel@terra> <3EA9A841.9030302@tamu.edu> <3EAA0A33.D7D7BD0A@bellsouth.net> Message-ID: <3EAA1F36.5010100@tamu.edu> I can't conceive of a reason to locate a cluster in the vicinity of CHernobyl. As you note, the death toll continues to climb, and the mutation rate is non-trivial among the retilian population. The cancer rate is considerably higher than background, as well. However, parking a cluster under the sarcophogus doesn't strike me as adding anything to the mass of knowledge, nor is there too much I can think of, from a research perspective, that'd require or benefit from on-site cluster computations. I suspect one reason no one is wanting to talk about Chernobyl is that it occurred so long ago, at least in American terms, that it's ancient history. We know what caused it (carelessness) and we know a lot of damage was wrought. Right now, aside from documenting mutation and cancer rates, neither of which requires massively parallel applications, there's some interesting structural engineering data to be gleaned from the concrete sarcophygus (overshroud of reinforced concrete that's decaying at a pretty amazing rate. But, I think I'd to my assessments on-site, then go home to decontaminate and run my datasets. gerry astroguy wrote: > Gerry Creager wrote: > > >>Layers 8 & 9 (fiscal/political) of the ISO model? >> >>gerry >> >>Dean Johnson wrote: >> >>>On Fri, 2003-04-25 at 00:31, Jim Ahia wrote: >>> >>>In terms of setting up mining operations on other celestial bodies, cpu stability >>>and radiation protection are amongst the least of your worries. 
What will be >>>needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical >>>issues are largely tractable, one way or another, but the institutional and >>>international issues will be like herding cats on crack. >> >>-- >>Gerry Creager -- gerry.creager at tamu.edu >>Network Engineering -- AATLT, Texas A&M University >>Office: 979.458.4020 FAX: 979.847.8578 >>Cell: 979.229.5301 Pager: 979.228.0173 >> >>_______________________________________________ >>Beowulf mailing list, Beowulf at beowulf.org >>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > > Hi Gerry, > Since you asked... not sure what distro of ISO we may need but there is a lot > of work yet here on Terra fir ma... one area of convergence that the Beowolf > in space might hold some measure of promise in synergistic high radiation > environment would be, at least in my mind, the site of Chernobyl that no one > is very keen to talk about but all are certanly aware it is a site that must be revisited... in the death toll > even greater that our 9/11... and their Russian firefighters are still adding their heroic numbers to the list... > Who are we to ask them to sacrifice more than they have already... But I think all agree we have a daunting and > serious job to do in a very difficult almost space like environment. > . Crazy Russians a little nuts but ya just got to love'em > Just posting, thanks for list indulgence > c.clary > spartan sys. > po box 1515 > spartanburg, sc 29304-0243 > > fax# (801) 858-2722 -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From astroguy at bellsouth.net Sat Apr 26 00:25:23 2003 From: astroguy at bellsouth.net (astroguy) Date: Sat, 26 Apr 2003 00:25:23 -0400 Subject: beowulf in space References: <1051283107.27185.12.camel@terra> <3EA9A841.9030302@tamu.edu> Message-ID: <3EAA0A33.D7D7BD0A@bellsouth.net> Gerry Creager wrote: > Layers 8 & 9 (fiscal/political) of the ISO model? > > gerry > > Dean Johnson wrote: > > On Fri, 2003-04-25 at 00:31, Jim Ahia wrote: > > > > In terms of setting up mining operations on other celestial bodies, cpu stability > > and radiation protection are amongst the least of your worries. What will be > > needed is "diplomacy hardened" and "bureaucracy proofed" processes. Technical > > issues are largely tractable, one way or another, but the institutional and > > international issues will be like herding cats on crack. > > -- > Gerry Creager -- gerry.creager at tamu.edu > Network Engineering -- AATLT, Texas A&M University > Office: 979.458.4020 FAX: 979.847.8578 > Cell: 979.229.5301 Pager: 979.228.0173 > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf Hi Gerry, Since you asked... not sure what distro of ISO we may need but there is a lot of work yet here on Terra fir ma... 
one area of convergence where the Beowulf-in-space idea might hold some measure of promise in a synergistic high-radiation environment would be, at least in my mind, the site of Chernobyl, which no one is very keen to talk about but all are certainly aware is a site that must be revisited... with a death toll even greater than our 9/11... and the Russian firefighters are still adding their heroic numbers to the list... Who are we to ask them to sacrifice more than they have already... But I think all agree we have a daunting and serious job to do in a very difficult, almost space-like environment. Crazy Russians, a little nuts, but ya just got to love 'em.
Just posting, thanks for list indulgence
c.clary
spartan sys.
po box 1515
spartanburg, sc 29304-0243

fax# (801) 858-2722

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf