From pegu at dolphinics.no Mon Mar 1 03:45:32 2004 From: pegu at dolphinics.no (Petter Gustad) Date: Mon, 01 Mar 2004 09:45:32 +0100 (CET) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds Message-ID: <20040301.094532.17863925.pegu@dolphinics.no> Taken from: http://www.dolphinics.no/news/2004/2_25.html Dolphin SCI Socket Software Delivers Record Breaking Latency New evaluation kit available at special pricing Clinton, MA and Oslo, Norway, Feb 26, 2004 Dolphin Interconnect today announced that the SCI Socket version 1.0 software library is now available to customers for high-performance computing applications interconnected with Dolphin SCI adapters. SCI Socket enables standard Berkeley sockets to use the Scalable Coherent Interface (SCI) as a transport medium with its high bandwidth and extremely low latency. "This is the lowest latency socket solution available today," said Hugo Kohmann, CTO for Dolphin Interconnect. "SCI Socket opens new high-performance possibilities for a broad range of networking applications." Dolphin has benchmarked a complete one-byte socket send/socket receive latency at 2.27 microseconds, which corresponds to more than 203,800 roundtrip transactions per second. Benchmarks using Netperf also show more than 255 MBytes/s (2,035 Megabits/s) sustained throughput using standard TCP STREAM sockets. The SCI Socket software uses Dolphin's SISCI API as its transport and most of the communication takes place in user space, avoiding time-consuming system calls and networking protocols. SCI remote memory access provides a fast and reliable connection. "These record-setting performance benchmarks underscore the capabilities of the SCI standard as a high-performance interconnect," said Kare Lochsen, CEO of Dolphin Interconnect. "Dolphin has extensive expertise in this technology having developed the first SCI-based interconnect soon after it became an IEEE standard in 1992, and we remain committed to keeping SCI at the most competitive performance levels in the future." SCI Socket requires no operating system patches or application modifications to run the software. SCI Socket is open source software available under LGPL/GPL and supports all popular Linux distributions for x86 and x86/Opteron. In Dolphin testing, the lowest latency was achieved using AMD Opteron (X86_64) processors. Support for UDP and Microsoft Windows is planned. Dolphin SCI adapters are used to build server clusters for high-performance computing and in a wide range of embedded real-time computing applications including reflective memory, simulation and visualization systems, and systems requiring high availability and fast failover. For a limited time, an evaluation kit consisting of two PCI-SCI adapter cards and cables is available directly from Dolphin Interconnect at a substantial discount. When installed in a user's application platform, the evaluation kit enables effective testing of the SCI Socket software. The software and documentation are included at no charge. Please visit the Dolphin web site for more information at www.dolphinics.com/eval. foobar GmbH (www.foobar-cpa.de), a software development and consulting firm with particular expertise in SCI and located in Chemnitz, Germany, assisted Dolphin in the development of SCI Socket.
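As a point of reference for what a "one byte socket send/socket receive latency" test actually measures, here is a minimal sketch of the usual ping-pong technique over plain Berkeley TCP sockets. It is not Dolphin's benchmark code; the peer is assumed to run a trivial one-byte echo loop, and the host and port are placeholder arguments.

    /* pingpong_client.c -- rough sketch of a one-byte socket latency test.
     * The peer is assumed to echo each byte back (recv 1 byte, send it back).
     * The reported number is half the average round trip, i.e. the kind of
     * "send/receive latency" figure quoted in the press release.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/time.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <arpa/inet.h>

    int main(int argc, char **argv)
    {
        const char *host = (argc > 1) ? argv[1] : "127.0.0.1";
        int port = (argc > 2) ? atoi(argv[2]) : 5000;
        int iters = 100000;
        int one = 1;
        char byte = 'x';
        struct sockaddr_in addr;
        struct timeval t0, t1;
        double usec;
        int s, i;

        s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0) { perror("socket"); return 1; }
        /* disable Nagle so each one-byte write goes out immediately */
        setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        addr.sin_addr.s_addr = inet_addr(host);
        if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }

        gettimeofday(&t0, NULL);
        for (i = 0; i < iters; i++) {
            if (write(s, &byte, 1) != 1 || read(s, &byte, 1) != 1) {
                perror("ping-pong");
                return 1;
            }
        }
        gettimeofday(&t1, NULL);

        usec = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
        printf("one-way latency: %.2f usec\n", usec / (2.0 * iters));
        close(s);
        return 0;
    }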
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Mon Mar 1 08:16:28 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Mon, 1 Mar 2004 21:16:28 +0800 (CST) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: <20040301131628.90307.qmail@web16813.mail.tpe.yahoo.com> > In Dolphin testing, the > lowest latency was > achieved using AMD Opteron (X86_64) processors. No wonder Intel killed IA64 and released 64-bit x86 (aka IA32e) a week or two ago... Andrew. ----------------------------------------------------------------- http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mathiasbrito at yahoo.com.br Mon Mar 1 08:59:19 2004 From: mathiasbrito at yahoo.com.br (=?iso-8859-1?q?Mathias=20Brito?=) Date: Mon, 1 Mar 2004 10:59:19 -0300 (ART) Subject: [Beowulf] Mpirun error Message-ID: <20040301135919.45861.qmail@web12202.mail.yahoo.com> I installed the latest version of mpich on my personal computer to simulate my parallel programs. I can compile my programs without problem, but when I try to run them I receive the following error message: p0_6941: p4_error: Path to program is invalid while starting /home/mathias/mpi/bubble with RSHCOMMAND on linux: -1 p4_error: latest msg from perror: No such file or directory What can I do? Thanks ===== Mathias Brito Universidade Estadual de Santa Cruz - UESC Departamento de Ciências Exatas e Tecnológicas Estudante do Curso de Ciência da Computação ______________________________________________________________________ Yahoo! Mail - O melhor e-mail do Brasil! Abra sua conta agora: http://br.yahoo.com/info/mail.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bogdan.costescu at iwr.uni-heidelberg.de Mon Mar 1 10:49:39 2004 From: bogdan.costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Mon, 1 Mar 2004 16:49:39 +0100 (CET) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: On Mon, 1 Mar 2004, Petter Gustad wrote: > Dolphin has benchmarked a completed one byte socket send/socket > receive latency at 2.27 microseconds, Is this in polling mode or interrupt-driven ? I'm interested to see if I can do something useful (like computation) _and_ get such low latency. > Benchmarks using Netperf also show more than 255 MBytes (2,035 > Megabits/s) sustained throughput using standard TCP STREAM sockets. What is the CPU usage for this throughput ?
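A quick way to answer both of Bogdan's questions on one's own hardware is netperf itself. If memory serves (treat the exact option names as a sketch rather than gospel), the -c and -C switches report local and remote CPU utilisation next to the throughput figure, and the TCP_RR test gives a one-byte transaction rate directly comparable to the roundtrip numbers in the press release; "peer" below is just a placeholder for the remote hostname:

    netperf -H peer -t TCP_STREAM -l 30 -c -C      (bulk throughput plus CPU usage at both ends)
    netperf -H peer -t TCP_RR -l 30 -- -r 1,1      (one-byte request/response transactions per second)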
-- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kfarmer at linuxhpc.org Mon Mar 1 09:51:39 2004 From: kfarmer at linuxhpc.org (Kenneth Farmer) Date: Mon, 1 Mar 2004 09:51:39 -0500 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds References: <20040301131628.90307.qmail@web16813.mail.tpe.yahoo.com> Message-ID: <097701c3ff9c$b465fe30$1601a8c0@deskpro> ----- Original Message ----- From: "Andrew Wang" To: Sent: Monday, March 01, 2004 8:16 AM Subject: Re: [Beowulf] SCI Socket latency: 2.27 microseconds > > In Dolphin testing, the > > lowest latency was > > achieved using AMD Opteron (X86_64) processors. > > No wonder Intel killed IA64 and released 64-bit x86 > (aka IA32e) a week or two ago... > > Andrew. Intel killed IA64? Where did you come up with that? -- Kenneth Farmer <>< LinuxHPC.org _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Mar 1 11:35:56 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 1 Mar 2004 11:35:56 -0500 (EST) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: well, this is interesting. it appears that AMD has given all interconnect vendors a boost, since Myri and Quadrics seem to like Opterons as well ;) > "This is the lowest latency socket solution available today," said > Hugo Kohmann, CTO for Dolphin Interconnect. "SCI Socket opens new well, Quadrics now claims 1.8 us MPI latency: http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD it's interesting that SCI is still on 64x66 PCI - it would be very interesting to know how many and what kinds of codes really require higher bandwidth. yes, some vendors (esp IB) are pushing PCI-express as bandwith salvation, but afaikt, none of my users need even >500 MB/s today. it doesn't seem like PCI-express will be any kind of major win in small-packet latency... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From csamuel at vpac.org Mon Mar 1 17:09:55 2004 From: csamuel at vpac.org (Chris Samuel) Date: Tue, 2 Mar 2004 09:09:55 +1100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <097701c3ff9c$b465fe30$1601a8c0@deskpro> References: <20040301131628.90307.qmail@web16813.mail.tpe.yahoo.com> <097701c3ff9c$b465fe30$1601a8c0@deskpro> Message-ID: <200403020910.02925.csamuel@vpac.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Tue, 2 Mar 2004 01:51 am, Kenneth Farmer wrote: > From: "Andrew Wang" > > > No wonder Intel killed IA64 and released 64-bit x86 > > (aka IA32e) a week or two ago... > > Intel killed IA64? Where did you come up with that? Intel certainly haven't announced the death of Itanium, but you've got to wonder about its long term future when Intel start producing 64-bit AMD compatible chips. Also see [1] below. 
This is more the question of what will the market do when choosing between them, especially as HPC is only really a niche (though a fairly high spending one) compared to the general computing market. The big advantage AMD have is that "legacy" 32-bit apps will be around for a long long time to come (look at the mass clamour for MS to continue supporting Win98, something they'd hoped would be dead a long time ago) and that gives the hybrids a big advantage in the general market. I guess it comes down to a business decision on Intel as to whether they feel the demand for Itanium is enough to justify its continued development. Note that I'm not saying the demand per se isn't there, I've got absolutely no idea on the matter! cheers, Chris [1] - for those who haven't seen it, here's Linus's response to the launch: http://kerneltrap.org/node/view/2466 - -- Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin Victorian Partnership for Advanced Computing http://www.vpac.org/ Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQFAQ7S2O2KABBYQAh8RAlA/AJ4yzNxJcXZc3e8I8CtYjgScQOCpUwCfdVzF lpG7iEOXSo3+xAK73kNb9c0= =eYRs -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Mon Mar 1 16:38:52 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Mon, 1 Mar 2004 13:38:52 -0800 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: <20040301213852.GA28803@cse.ucdavis.edu> > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD Note the title says "sub 2us" and the body says "close to" 1.8us. Of more interest (to me) is that further down they say: In the next quarter, Quadrics will announce a series of highly competitive switch configurations making QsNetII more cost-effective for medium sized cluster configuration deployment. Sounds like more competition for IB, Myrinet and Dolphin. Hopefully anyways. Cool, found a quadrics price list: http://doc.quadrics.com/Quadrics/QuadricsHome.nsf/DisplayPages/A3EE4AED738B6E2480256DD30057B227 http://tinyurl.com/2sn2b Looks like $3k per node or so for 64, and $4k per node for 1024, I'm guessing that is list price and is somewhat negotiable. According to my sc2003 notes the Quadrics latency was: 100ns for the sending elan4 300ns for the 128 node switch and 20 meters of cable 130ns for the receiving card. 2420ns for two trips across the PCI-X bus and a main memory write ================ 2950ns for an mpi message between 2 nodes. Anyone know what changes to get this number down to 1.8us - 2.0us? > higher bandwidth. yes, some vendors (esp IB) are pushing PCI-express > as bandwith salvation, but afaikt, none of my users need even >500 MB/s > today. it doesn't seem like PCI-express will be any kind of major win > in small-packet latency... Anyone have an expected timetable for PCI-express connected interconnect cards? Anyone have projected PCI-express latencies vs PCI-X (133 MHz/64 bit)? 
-- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From patrick at myri.com Mon Mar 1 17:40:55 2004 From: patrick at myri.com (Patrick Geoffray) Date: Mon, 01 Mar 2004 17:40:55 -0500 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: Message-ID: <4043BBF7.9090706@myri.com> Mark Hahn wrote: > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD Hum, this one claims "under 3us": http://doc.quadrics.com/quadrics/QuadricsHome.nsf/PageSectionsByName/F6E4FE91508A319580256D5900447E40/$File/QsNetII+Performance+Evaluation+ltr.pdf Maybe the 1.8us is a one-sided MPI latency, aka a PUT ? Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Mon Mar 1 18:47:33 2004 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Mon, 1 Mar 2004 23:47:33 +0000 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301213852.GA28803@cse.ucdavis.edu> References: <20040301.094532.17863925.pegu@dolphinics.no> <20040301213852.GA28803@cse.ucdavis.edu> Message-ID: <200403012347.33322.daniel.kidger@quadrics.com> On Monday 01 March 2004 9:38 pm, Bill Broadley wrote: > > well, Quadrics now claims 1.8 us MPI latency: > > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B > >398280256E44005A31DD > > Note the title says "sub 2us" and the body says "close to" 1.8us. Just ran this: [dan at opteron0]$ mpicc mping.c -o mping; prun -N2 ./mping 1: 0 bytes 1.80 uSec 0.00 MB/s This is a simple bit of MPI: proc 1 posts an MPI_Recv, proc0 then does a MPI_Send, then proc1 does MPI_Send and proc0 an MPI_Recv. The latency printed is half the round trip, averaged over say 1000 passes. This is for Opteron - it seems to have the best PCI-X implementation we have seen. Latency on IA64 is a little higher - say 2.61uSec on one platform I have just tried. MPI performance has also improved over time as we have tuned the DMA/PIO writes, etc. in the device drivers. > Of more interest (to me) is that further down they say: > In the next quarter, Quadrics will announce a series of highly competitive > switch configurations making QsNetII more cost-effective for medium > sized cluster configuration deployment. yep - yet to be announced officially - but as you might expect this revolves around introducing a wider range of smaller switch chassis and configurations. -- Yours, Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com --------------------
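For anyone who wants to reproduce this kind of measurement on their own interconnect, a minimal ping-pong along the lines Dan describes looks roughly like the following. This is only a sketch of the technique, not the actual mping.c (it bounces one byte rather than zero and hard-codes 1000 iterations):

    /* pingpong.c -- minimal MPI latency sketch (not the real mping.c).
     * Run with two ranks; rank 0 reports half the averaged round trip.
     */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, i, iters = 1000;
        char buf = 0;
        MPI_Status st;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &st);
            } else if (rank == 1) {
                MPI_Recv(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &st);
                MPI_Send(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        if (rank == 0)   /* half the round trip, averaged over all passes */
            printf("1 byte: %.2f usec one-way\n",
                   (t1 - t0) * 1e6 / (2.0 * iters));

        MPI_Finalize();
        return 0;
    }

Build and launch it on two nodes in the usual way, e.g. mpicc pingpong.c -o pingpong and then mpirun -np 2 ./pingpong (or prun -N2 on a Quadrics machine).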
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Mar 1 19:37:11 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 1 Mar 2004 19:37:11 -0500 (EST) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <200403020910.02925.csamuel@vpac.org> Message-ID: > > > No wonder Intel killed IA64 and released 64-bit x86 > > > (aka IA32e) a week or two ago... > > > > Intel killed IA64? Where did you come up with that? > > Intel certainly haven't announced the death of Itanium, but you've got to > wonder about its long term future when Intel start producing 64-bit AMD > compatible chips. Also see [1] below. bah. buying chips based on their address register width makes about as much sense as buying based on clock. yes, some people have good reason to be excited about 64b hitting the mass market. but that number is quite small - how many machines do you have with >4 GB per cpu? remember, Intel has always said that 64b wasn't terribly important for anything except the "enterprise" (mauve has more ram) market (mainframe recidivists). I think they're right, but should have also adopted AMD's cpu-integrated memory controller. > I guess it comes down to a business decision on Intel as to whether they feel > the demand for Itanium is enough to justify its continued development. maybe instead of a bazillion bytes of cache on the next it2, Intel will just drop in a P4 or two ;) regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From pegu at dolphinics.no Tue Mar 2 02:59:04 2004 From: pegu at dolphinics.no (Petter Gustad) Date: Tue, 02 Mar 2004 08:59:04 +0100 (CET) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: <20040302.085904.68044976.pegu@dolphinics.no> From: Mark Hahn Subject: Re: [Beowulf] SCI Socket latency: 2.27 microseconds Date: Mon, 1 Mar 2004 11:35:56 -0500 (EST) > > "This is the lowest latency socket solution available today," said > > Hugo Kohmann, CTO for Dolphin Interconnect. "SCI Socket opens new > > well, Quadrics now claims 1.8 us MPI latency: This is excellent MPI latency. However, the quoted 2.27 µs latency was for the *socket* library. Latency using the Dolphin SISCI library is 1.4 µs. See also: http://www.dolphinics.no/products/benchmarks.html Petter _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Tue Mar 2 03:36:42 2004 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Tue, 2 Mar 2004 09:36:42 +0100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: Message-ID: <200403020936.42553.joachim@ccrl-nece.de> Mark Hahn: > > "This is the lowest latency socket solution available today," said > > Hugo Kohmann, CTO for Dolphin Interconnect.
"SCI Socket opens new > > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B39 >8280256E44005A31DD > > it's interesting that SCI is still on 64x66 PCI - it would be very > interesting to know how many and what kinds of codes really require [..] A large fraction of the latency does indeed stem from the two PCI-buses that need to be crossed. For that reason, Dolphin would certainly get an additional latency decrease when running on a 133MHz bus. I guess they have this in the pipeline. Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sfr at foobar-cpa.de Tue Mar 2 04:40:46 2004 From: sfr at foobar-cpa.de (Friedrich Seifert) Date: Tue, 02 Mar 2004 10:40:46 +0100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds Message-ID: <4044569E.9010803@foobar-cpa.de> Bogdan Costescu wrote: > On Mon, 1 Mar 2004, Petter Gustad wrote: > > >>Dolphin has benchmarked a completed one byte socket send/socket >>receive latency at 2.27 microseconds, > > > Is this in polling mode or interrupt-driven ? I'm interested to see if > I can do something useful (like computation) _and_ get such low > latency. Actually, SCI SOCKET uses a combination of both, it polls for a configurable amount of time, and if nothing arrives meanwhile, waits for an interrupt. Something like that is necessary since the current Linux interrupt processing and wake up mechanism is quite slow and unpredictable. There is a promising project going on to provide real time interrupt capability, but it is still in an early stage (http://lwn.net/Articles/65710/) >>Benchmarks using Netperf also show more than 255 MBytes (2,035 >>Megabits/s) sustained throughput using standard TCP STREAM sockets. > > > What is the CPU usage for this throughput ? SCI SOCKET was run in PIO mode for this test, so one CPU is needed to transfer the data. Current DMA performance is lower, but is subject to optimization in future revisions. CPU usage for DMA is 8%/29% at sender/receiver. Regards, Friedrich -- Dipl.-Inf. Friedrich Seifert - foobar GmbH Phone: +49-371-5221-157 Email: sfr at foobar-cpa.de Mobil: +49-172-3740089 Web: http://www.foobar-cpa.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Mon Mar 1 20:57:48 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Mon, 1 Mar 2004 17:57:48 -0800 Subject: [Beowulf] advantages of this particular 64-bit chip In-Reply-To: References: <200403020910.02925.csamuel@vpac.org> Message-ID: <20040302015748.GA6730@greglaptop.internal.keyresearch.com> On Mon, Mar 01, 2004 at 07:37:11PM -0500, Mark Hahn wrote: > bah. buying chips based on their address register width makes > about as much sense as buying based on clock. yes, some people have > good reason to be excited about 64b hitting the mass market. but > that number is quite small - how many machines do you have with > >4 GB per cpu? Don't forget that "64 bits", in this case means "wider GPRs, and twice as many, plus a better ABI." These are substantial wins on many codes, even on machines with small memories. 
Bignums are a well known example, but there are far more general-purpose examples. For example, with the PathScale compilers on the Opteron, we find that only 1 of the SPECfp benchmarks and 3 of the SPECint benchmarks run faster in 32-bit mode than 64-bit mode -- keeping in mind that 64-bit mode features longer instructions and bigger pointers and longs. (This is our alpha 32-bit mode vs. our beta 64-bit mode, so this answer will change a little by the time both are production quality.) So yes, there's a reason to buy Opteron and IA32e chips beyond the address width: more bang for the buck. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Mar 2 08:59:55 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 2 Mar 2004 08:59:55 -0500 (EST) Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: <20040302132333.GA3957@mikee.ath.cx> Message-ID: On Tue, 2 Mar 2004, Mike Eggleston wrote: > I realize this question is not specific to beowulf clusters... however, > at 9a I'm meeting with an upset user about a bunch of workstations > using serial termainals. Things don't happen as quickly as he wants: > setup, problem diagnosis, throughput, etc. What solutions can I present > for these problems (I realize this is just a quick summary!). Also, > the serial terminals are running at 9600 baud over sometimes 50 meters. > One table I found shows 60 meters is 2400 baud and 30 meters is 4800 > baud. I think this is part of the problem. It really shouldn't be, if the wiring is decent quality TP. Back in the old days, when our department was basically NOTHING but serial terminals running over TP down to a Sun 110 with a serial port expansion, we had lots of runs over 50 meters (probably some close to 100) without difficulty at 9600 baud. Keep the wires away from e.g. fluorescent lights (BIG problem), major power cables, or other sources of low frequency noise. Running parallel to a noise source over a long distance is where most crosstalk occurs -- try to cross wires at right angles. Conduit can help as it shields, as well, but our wires were basically thrown up in a drop ceiling haphazardly by "trained professionals" a.k.a. graduate students, faculty, and sometimes a shop/maintenance guy. > Possible solutions I have thought of: > > - user stops complaining and deals with the situation Always a popular one. To accomplish it you had better be prepared to use force. Bring duct tape to the meeting... > - put ethernet->serial converts at the terminals so the terminals are > on the network Sounds expensive. Of course, terminal servers themselves are typically pretty expensive, although we used to use them in the old days when we finally had more terminals than our server could manage even with expansions. And then workstations started getting cheaper and we converted over to workstations and ethernet and never looked back. How is it that you're still using terminals? I didn't know that terminals were still a viable option -- a cheap PC is less than what, $500 these days, and by the time you compare the cost of the terminal itself, the serial port terminal server, the serial wiring, and the incredible loss of productivity associated with using what amounts to a single, slow, tty interface they just don't sound cost effective. Not to mention maintenance, user complaints, and your time... 
> - put small VIA type boards whose image is loaded through tftp and > the serial terminals actually run from the via boards > - what else? Give the terminals to somebody you don't like, replace them with cheap diskless second hand PCs on ethernet running a stripped linux that basically provides either the standard set of Alt-Fx tty's in non-graphical mode or basic X and as many xterms as memory permits. Problem solved. In fact, depending on the applications being accessed and whether they CAN run locally, problem solved even better by running them locally and reducing demand on the network and servers. rgb > > Mike > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mikee at mikee.ath.cx Tue Mar 2 08:23:33 2004 From: mikee at mikee.ath.cx (Mike Eggleston) Date: Tue, 2 Mar 2004 07:23:33 -0600 Subject: [Beowulf] help! need thoughts/recommendations quickly Message-ID: <20040302132333.GA3957@mikee.ath.cx> I realize this question is not specific to beowulf clusters... however, at 9a I'm meeting with an upset user about a bunch of workstations using serial terminals. Things don't happen as quickly as he wants: setup, problem diagnosis, throughput, etc. What solutions can I present for these problems (I realize this is just a quick summary!). Also, the serial terminals are running at 9600 baud over sometimes 50 meters. One table I found shows 60 meters is 2400 baud and 30 meters is 4800 baud. I think this is part of the problem. Possible solutions I have thought of: - user stops complaining and deals with the situation - put ethernet->serial converters at the terminals so the terminals are on the network - put small VIA type boards whose image is loaded through tftp and the serial terminals actually run from the via boards - what else? Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mikee at mikee.ath.cx Tue Mar 2 09:08:56 2004 From: mikee at mikee.ath.cx (Mike Eggleston) Date: Tue, 2 Mar 2004 08:08:56 -0600 Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: References: <20040302132333.GA3957@mikee.ath.cx> Message-ID: <20040302140856.GA4615@mikee.ath.cx> On Tue, 02 Mar 2004, Robert G. Brown wrote: > On Tue, 2 Mar 2004, Mike Eggleston wrote: > > > I realize this question is not specific to beowulf clusters... however, > > at 9a I'm meeting with an upset user about a bunch of workstations > > using serial termainals. Things don't happen as quickly as he wants: > > setup, problem diagnosis, throughput, etc. What solutions can I present > > for these problems (I realize this is just a quick summary!). Also, > > the serial terminals are running at 9600 baud over sometimes 50 meters. > > One table I found shows 60 meters is 2400 baud and 30 meters is 4800 > > baud. I think this is part of the problem. > > It really shouldn't be, if the wiring is decent quality TP.
Back in the > old days, when our department was basically NOTHING but serial terminals > running over TP down to a Sun 110 with a serial port expansion, we had > lots of runs over 50 meters (probably some close to 100) without > difficulty at 9600 baud. Keep the wires away from e.g. fluorescent > lights (BIG problem), major power cables, or other sources of low > frequency noise. Running parallel to a noise source over a long > distance is where most crosstalk occurs -- try to cross wires at right > angles. Conduit can help as it shields, as well, but our wires were > basically thrown up in a drop ceiling haphazardly by "trained > professionals" a.k.a. graduate students, faculty, and sometimes a > shop/maintenance guy. I know it should work and the old way it does work, but I've always seen problems with serial and printers. I much prefer getting away from them to full ethernet. > > Possible solutions I have thought of: > > > > - user stops complaining and deals with the situation > > Always a popular one. To accomplish it you had better be prepared to > use force. Bring duct tape to the meeting... This problem is happening in the warehouse, so there is lots of packing material and tape around. :) > > - put ethernet->serial converts at the terminals so the terminals are > > on the network > > Sounds expensive. Of course, terminal servers themselves are typically > pretty expensive, although we used to use them in the old days when we > finally had more terminals than our server could manage even with > expansions. And then workstations started getting cheaper and we > converted over to workstations and ethernet and never looked back. > > How is it that you're still using terminals? I didn't know that > terminals were still a viable option -- a cheap PC is less than what, > $500 these days, and by the time you compare the cost of the terminal > itself, the serial port terminal server, the serial wiring, and the > incredible loss of productivity associated with using what amounts to a > single, slow, tty interface they just don't sound cost effective. Not > to mention maintenance, user complaints, and your time... This is an application in the warehouse. We have many serial (dumb) terminals and printers. We are using 'Dorio's(?). Similiar to the Wyse 60. I've not used a dorio before, but wyse terminals lots. The application is all curses based and doesn't require much. The users are not even concerned about the speed of the application (display, etc.) just that the terminals are quick to setup and work all the time. > > - put small VIA type boards whose image is loaded through tftp and > > the serial terminals actually run from the via boards > > - what else? > > Give the terminals to somebody you don't like, replace them with cheap > diskless second hand PCs on ethernet running a stripped linux that > basically provides either the standard set of Alt-Fx tty's in > non-graphical mode or basic X and as many xterms as memory permits. > Problem solved. > > In fact, depending on the applications being accessed and whether they > CAN run locally, problem solved even better by running them locally and > reducing demand on the network and servers. I can use the terminals on the via boards and not have to replace them with crt monitors and keyboards, until they all start failing. I'd prefer to use the crt monitors through vga (fewer problems with linux and getty). Do you (anyone) know of a cheap motherboard that would do this? 
Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Tue Mar 2 10:44:47 2004 From: john.hearns at clustervision.com (John Hearns) Date: Tue, 2 Mar 2004 16:44:47 +0100 (CET) Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: <20040302140856.GA4615@mikee.ath.cx> Message-ID: On Tue, 2 Mar 2004, Mike Eggleston wrote: > > Do you (anyone) know of a cheap motherboard that would do this? Sorry to sound like a Cyclades salesman, but from their webpages the Cyclades TS-100 would fit the bill. Plus lots of packing tape. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at demec.ufpe.br Tue Mar 2 15:51:05 2004 From: rbw at demec.ufpe.br (Ramiro Brito Willmersdorf) Date: Tue, 2 Mar 2004 17:51:05 -0300 Subject: [Beowulf] Invitation to Conference Message-ID: <20040302205105.GA30141@demec.ufpe.br> Dear Colleagues, The XXV CILAMCE (Iberian Latin American Congress on Computational Methods for Engineering) will be held from November 10th to the 12th in Recife, Brazil. This Congress will encompass more than 30 mini-symposia over a very wide range of multidisciplinary methods in engineering and applied sciences. Please check the congress home page (http://www.demec.ufpe.br/cilamce2004/) for more specific details. We would like to invite you to participate in the High Performance Computing mini-symposium. If you are interested, you should submit an abstract by March 29th, 2004. This is one of the most important conferences on this subject in South America, and top researchers from here and abroad will attend. On a personal note, we would like to tell you that Recife is one of the top tourist destinations in Brazil, with very pleasant weather and very nice beaches. We are grateful for your attention and ask that this information be passed along to other people in your institution who may be interested. Many Thanks, A. L. G. Coutinho, COPPE/UFRJ, alvaro at nacad.ufrj.br R. B. Willmersdorf, DEMEC/UFPE, rbw at demec.ufpe.br -- Ramiro Brito Willmersdorf rbw at demec.ufpe.br GPG key: http://www.demec.ufpe.br/~rbw/GPG/gpg_key.txt _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Wed Mar 3 12:52:30 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Wed, 03 Mar 2004 09:52:30 -0800 Subject: [Beowulf] mpich program segfaults Message-ID: <40461B5E.6010003@cert.ucr.edu> Hi, Sorry if this is off topic. Anyway, I've got an mpich Fortran program I'm trying to get going, which produces a segmentation fault right at a subroutine call. I put a print statement right before and right after the call and when I run the program, I'm only seeing the one before. I've also put a print statement right at the beginning of the subroutine which is being called and never see that either. The real strange part is that when I run this under a debugger, the program runs fine. So would anyone happen to have any insight into what's going on here? I'd really appreciate it.
Thanks, Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sdutta at deas.harvard.edu Wed Mar 3 14:42:13 2004 From: sdutta at deas.harvard.edu (Suvendra Nath Dutta) Date: Wed, 3 Mar 2004 14:42:13 -0500 Subject: [Beowulf] mpich program segfaults In-Reply-To: <40461B5E.6010003@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> Message-ID: Glen, Does your program seg fault when compiled with debugging off or on? Sometimes compilers will initialize arrays when compiling for debugging, but not waste time doing that when compiled without debugging. Also if you compile with optimization which line follows which one isn't always clear. You want to make sure you aren't over-running memory. Because what you say sounds suspiciously like that. Also you want to be sure its nothing to do with MPICH. Try calling the subroutine from a serial program if possible. Suvendra. On Mar 3, 2004, at 12:52 PM, Glen Kaukola wrote: > Hi, > > Sorry if this is off topic. Anyway, I've got an mpich Fortran program > I'm trying to get going, which produces a segmentation fault right at > a subroutine call. I put a print statement right before and right > after the call and when I run the program, I'm only seeing the one > before. I've also put a print statement right at the beginning of the > subroutine which is being called and never see that either. The real > strange part is when I run this under a debugger, the program runs > fine. So would anyone happen to have any insight to what's going on > here? I'd really appriciate it. > > Thanks, > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Wed Mar 3 15:46:36 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Wed, 03 Mar 2004 12:46:36 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: <40461B5E.6010003@cert.ucr.edu> Message-ID: <4046442C.4090704@cert.ucr.edu> Suvendra Nath Dutta wrote: > Glen, > Does your program seg fault when compiled with debugging off or on? Either way. > Sometimes compilers will initialize arrays when compiling for > debugging, but not waste time doing that when compiled without debugging. The arguments being passed to the subroutine are two arrays of real numbers and a few integers. Nothing being passed to the subroutine has been dynamically allocated. The compiler, IBM's XLF compiler, initializes the array to 0. At least I'm pretty sure it does, since I can print things before the subroutine call. > Also if you compile with optimization which line follows which one > isn't always clear. I don't have any optimizations turned on. > You want to make sure you aren't over-running memory. The machine has 2 gigs of memory, which should be plenty. The same program runs on an x86 machine with 1 gig of memory just fine (I'm trying to get the program working on an Apple G5 by the way). > Also you want to be sure its nothing to do with MPICH. Try calling the > subroutine from a serial program if possible. 
I've tried telling mpirun to only use one cpu and I get the same results. I've also tried running the program all by itself and it still crashes. Like I said though, it runs just fine under a debugger. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sdutta at deas.harvard.edu Thu Mar 4 06:26:49 2004 From: sdutta at deas.harvard.edu (Suvendra Nath Dutta) Date: Thu, 4 Mar 2004 06:26:49 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: Glen, I am sorry, I meant buffer overrun instead of memory overrun. It is of course impossible to say, but you are describing a classic symptom of a buffer overrun: the program seg-faults somewhere there shouldn't be a problem. This is usually because you've overrun the array limits and are writing on the program space. Suvendra. On Wed, 3 Mar 2004, Glen Kaukola wrote: > Suvendra Nath Dutta wrote: > > > Glen, > > Does your program seg fault when compiled with debugging off or on? > > > Either way. > > > Sometimes compilers will initialize arrays when compiling for > > debugging, but not waste time doing that when compiled without debugging. > > > The arguments being passed to the subroutine are two arrays of real > numbers and a few integers. Nothing being passed to the subroutine has > been dynamically allocated. The compiler, IBM's XLF compiler, > initializes the array to 0. At least I'm pretty sure it does, since I > can print things before the subroutine call. > > > Also if you compile with optimization which line follows which one > > isn't always clear. > > > I don't have any optimizations turned on. > > > You want to make sure you aren't over-running memory. > > > The machine has 2 gigs of memory, which should be plenty. The same > program runs on an x86 machine with 1 gig of memory just fine (I'm > trying to get the program working on an Apple G5 by the way). > > > Also you want to be sure its nothing to do with MPICH. Try calling the > > subroutine from a serial program if possible. > > > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wseas at canada.com Thu Mar 4 09:34:46 2004 From: wseas at canada.com (WSEAS Newsletter on MECHANICAL ENGINEERING) Date: Thu, 4 Mar 2004 16:34:46 +0200 Subject: [Beowulf] WSEAS NEWSLETTER in MECHANICAL ENGINEERING Message-ID: <3FE20F40001FB40E@fesscrpp1.tellas.gr> (added by postmaster@fesscrpp1.tellas.gr) If you want to contact us, the Subject of your email must contains the code: WSEAS CALL FOR PAPERS -- CALL FOR REVIEWERS -- CALL FOR SPECIAL SESSIONS wseas at canada.com http://wseas.freeservers.com **************************************************************** Udine, Italy, March 25-27, 2004: IASME/WSEAS 2004 Int.Conf.
on MECHANICS and MECHATRONICS **************************************************************** Miami, Florida, USA, April 21-23, 2004 5th WSEAS International Conference on APPLIED MATHEMATICS (SYMPOSIA on: Linear Algebra and Applications, Numerical Analysis and Applications, Differential Equations and Applications, Probabilities, Statistics, Operational Research, Optimization, Algorithms, Discrete Mathematics, Systems, Communications, Control, Computers, Education) **************************************************************** Corfu Island, Greece, August 17-19, 2004 WSEAS/IASME Int.Conf. on FLUID MECHANICS WSEAS/IASME Int.Conf. on HEAT and MASS TRANSFER ********************************************************** Vouliagmeni, Athens, Greece, July 12-13, 2004 WSEAS ELECTROSCIENCE AND TECHNOLOGY FOR NAVAL ENGINEERING and ALL-ELECTRIC SHIP ********************************************************** Copacabana, Rio de Janeiro, Brazil, October 12-15, 2004 3rd WSEAS Int.Conf. on INFORMATION SECURITY, HARDWARE/SOFTWARE CODESIGN and COMPUTER NETWORKS (ISCOCO 2004) 3rd WSEAS Int. Conf. on APPLIED MATHEMATICS and COMPUTER SCIENCE (AMCOS 2004) 3rd WSEAS Int.Conf. on SYSTEM SCIENCE and ENGINEERING (ICOSSE 2004) 4th WSEAS Int.Conf. on POWER ENGINEERING SYSTEMS (ICOPES 2004) **************************************************************** Cancun, Mexico, May 12-15, 2004 6th WSEAS Int.Conf. on ALGORITHMS, SCIENTIFIC COMPUTING, MODELLING AND SIMULATION (ASCOMS '04) ********************************************************** NOTE THAT IN WSEAS CONFERENCES YOU CAN HAVE PROCEEDINGS 1) HARD COPY 2) CD-ROM and 3) Web Publishing SELECTED PAPERS are also published (after further review) * as regular papers in WSEAS TRANSACTIONS (Journals) or * as Chapters in WSEAS Book Series. WSEAS Books, Journals, Proceedings participate now in all major science citation indexes. ISI, ELSEVIER, CSA, AMS. Mathematical Reviews, ELP, NLG, Engineering Index Directory of Published Proceedings, INSPEC (IEE) Thanks Alexis Espen WSEAS NEWSLETTER in MECHANICAL ENGINEERING wseas at canada.com http://wseas.freeservers.com ##### HOW TO UNSUBSCRIBE #### You receive this newsletter from your email address: beowulf at beowulf.org If you want to unsubscribe, send an email to: wseas at canada.com The Subject of your message must be exactly: REMOVE beowulf at beowulf.org WSEAS If you want to unsubscribe more than one email addresses, send a message to nata at wseas.org with Subject: REMOVE [email1, emal2, ...., emailn] WSEAS _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From robl at mcs.anl.gov Thu Mar 4 13:46:26 2004 From: robl at mcs.anl.gov (Robert Latham) Date: Thu, 4 Mar 2004 12:46:26 -0600 Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: <20040304184626.GA2746@mcs.anl.gov> On Wed, Mar 03, 2004 at 12:46:36PM -0800, Glen Kaukola wrote: > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. since you see this crash when the program runs by itself, try running under a memory checker (valgrind is good and free, also purify, insure++...).
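To make the buffer-overrun theory concrete: the usual reason a program dies "at" an innocent-looking call is that an earlier out-of-bounds store has already corrupted the stack or heap, and the damage only surfaces later. A memory checker reports the bad store where it actually happens. A tiny, purely illustrative C example of the kind of bug valgrind's default memcheck tool will flag (this is hypothetical code, not Glen's program):

    /* overrun.c -- the kind of bug a memory checker reports at the point of
     * the bad store, long before the program actually crashes somewhere else.
     * Build with -g and run it under the checker, e.g.:  valgrind ./overrun
     */
    #include <stdio.h>
    #include <stdlib.h>

    static void fill(double *a, int n)
    {
        int i;
        for (i = 0; i <= n; i++)   /* off-by-one: writes one element past the end */
            a[i] = 0.0;
    }

    int main(void)
    {
        double *a = malloc(100 * sizeof(double));
        fill(a, 100);              /* corrupts whatever happens to follow the array */
        printf("still running -- the damage shows up later\n");
        free(a);                   /* a checker may also complain here */
        return 0;
    }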
==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Thu Mar 4 14:32:12 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Thu, 4 Mar 2004 11:32:12 -0800 (PST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> Message-ID: <20040304193213.411.qmail@web11407.mail.yahoo.com> Then run the program by hand, and attach a debugger... Rayson --- Glen Kaukola wrote: > Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Thu Mar 4 13:45:25 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Thu, 04 Mar 2004 10:45:25 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: <40477945.9090808@cert.ucr.edu> Suvendra Nath Dutta wrote: >Glen, > I am sorry, I meant buffer-overrun instead of memory overrun. It >is of course impossible to say, but you are describing a classic >description of buffer overrun. Program seg-faulting, some where there >shouldn't be a problem. This is usually because you've over run the array >limits and are writing on the program space. > > Ok, but simply calling a subroutine shouldn't cause a buffer overrun should it? Especially when none of the arguments being passed to the subroutine are dynamically allocated. I'm beginning to suspect it's a problem with the compiler actually. Maybe the stack that holds subroutine arguments isn't big enough. And when my problematic subroutine call is 4 levels deep or so like it is, then there isn't enough room on the stack for it's arguments. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at linux-mag.com Thu Mar 4 17:34:29 2004 From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine) Date: Thu, 4 Mar 2004 17:34:29 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> Message-ID: What type of machine is this? Doug On Wed, 3 Mar 2004, Glen Kaukola wrote: > Suvendra Nath Dutta wrote: > > > Glen, > > Does your program seg fault when compiled with debugging off or on? > > > Either way. > > > Sometimes compilers will initialize arrays when compiling for > > debugging, but not waste time doing that when compiled without debugging. > > > The arguments being passed to the subroutine are two arrays of real > numbers and a few integers. Nothing being passed to the subroutine has > been dynamically allocated. The compiler, IBM's XLF compiler, > initializes the array to 0. 
At least I'm pretty sure it does, since I > can print things before the subroutine call. > > > Also if you compile with optimization which line follows which one > > isn't always clear. > > > I don't have any optimizations turned on. > > > You want to make sure you aren't over-running memory. > > > The machine has 2 gigs of memory, which should be plenty. The same > program runs on an x86 machine with 1 gig of memory just fine (I'm > trying to get the program working on an Apple G5 by the way). > > > Also you want to be sure its nothing to do with MPICH. Try calling the > > subroutine from a serial program if possible. > > > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- ---------------------------------------------------------------- Editor-in-chief ClusterWorld Magazine Desk: 610.865.6061 Cell: 610.390.7765 Redefining High Performance Computing Fax: 610.865.6618 www.clusterworld.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From smcdaniel at kciinc.net Thu Mar 4 13:59:37 2004 From: smcdaniel at kciinc.net (smcdaniel) Date: Thu, 4 Mar 2004 12:59:37 -0600 Subject: [Beowulf] mpich program segfaults (Glen Kaukola) Message-ID: <002501c4021a$d77830c0$2a01010a@kciinc.local> Physical memory errors could be the problem if they occur between the pointer and offset of your array location in the stack. Other than that I would suspect a buffer overrun that Suvendra Nath Dutta mentioned. Sam McDaniel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Thu Mar 4 19:48:21 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Thu, 04 Mar 2004 16:48:21 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: Message-ID: <4047CE55.6010300@cert.ucr.edu> Douglas Eadline, Cluster World Magazine wrote: >What type of machine is this? > > An Apple G5. And actually I've figured out what's wrong. Sorta. =) I replaced my problematic subroutine with a dummy subroutine that contains nothing but variable declarations and a print statement. This still caused a segmentation fault. So I commented pretty much everything out. No segmentation fault. Alright then. I slowly added it all back in, checking each time to see if I got a segmentation fault. And now I'm down to 4 variable declarations that are causing a problem: REAL ZFGLURG ( NCOLS,NROWS,0:NLAYS ) INTEGER ICASE( NCOLS,NROWS,0:NLAYS ) REAL THETAV( NCOLS,NROWS,NLAYS ) REAL ZINT ( NCOLS,NROWS,NLAYS ) If I uncomment any one of those, I get a segmentation fault again. But it still doesn't make any sense. First of all, there are variable declarations almost exactly like the ones I listed and those don't cause a problem. I also made a small test case that called my dummy subroutine and that worked just fine. I then commented out everything but the problematic variable declarations I listed above and that worked just fine. 
I tried changing the variable names but that didn't seem to make a difference, as I still got a segmentation fault. So I have no idea what the heck is going on. I think I need to tell my boss we need to give up on G5's. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Thu Mar 4 20:05:28 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Fri, 5 Mar 2004 09:05:28 +0800 (CST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <40477945.9090808@cert.ucr.edu> Message-ID: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> The default stack size on OS X is 512 KB; try increasing it to 64 MB. I encountered this problem before. Andrew. --- Glen Kaukola wrote: > Suvendra Nath Dutta wrote: > > >Glen, > > I am sorry, I meant buffer-overrun instead of > memory overrun. It > >is of course impossible to say, but you are > describing a classic > >description of buffer overrun. Program > seg-faulting, some where there > >shouldn't be a problem. This is usually because > you've over run the array > >limits and are writing on the program space. > > > > > > Ok, but simply calling a subroutine shouldn't cause > a buffer overrun > should it? Especially when none of the arguments > being passed to the > subroutine are dynamically allocated. I'm beginning > to suspect it's a > problem with the compiler actually. Maybe the stack > that holds > subroutine arguments isn't big enough. And when my > problematic > subroutine call is 4 levels deep or so like it is, > then there isn't > enough room on the stack for it's arguments. > > Glen > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf ----------------------------------------------------------------- http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at linux-mag.com Thu Mar 4 21:46:16 2004 From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine) Date: Thu, 4 Mar 2004 21:46:16 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4047CE55.6010300@cert.ucr.edu> Message-ID: Don't give up on the G5 just yet. Sounds to me like you may be stepping on some memory somehow. Which means the crash occurs at that particular spot in the code, but the cause of the crash probably is occurring somewhere else in the program. There are several "simple" things you can do to collect evidence that may help you solve this "crime". (This is detective work, by the way.) First, this sounds like the kind of thing that happens in C programs. Is it pure Fortran? What version of MPICH? 1) try another compiler, if you are lucky it will find the problem. It may also work, in which case you will want to blame the first compiler; don't, because that is probably not the case. The new compiler probably lays out the memory differently than the first one and you just got lucky. 2) run your code on another architecture. 3) try another MPI (LAM?) I am sure there are more, but not knowing the particulars, I can not suggest anything else. Doug
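Andrew's stack-size suggestion fits the symptoms above: local arrays like ZFGLURG(NCOLS,NROWS,0:NLAYS) typically live on the stack, so a 512 KB default limit can be exhausted the moment the subroutine is entered. As a sketch (a hypothetical little launcher, equivalent to running "ulimit -s" in the shell or job script that starts the program), one way to inspect and raise the limit before running the real binary:

    /* stacklimit.c -- query and (try to) raise the stack size limit, then
     * exec the real program.  Sketch only; "ulimit -s" in the startup shell
     * achieves the same thing.
     */
    #include <stdio.h>
    #include <sys/resource.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_STACK, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        printf("stack soft limit: %ld KB (hard %ld KB)\n",
               (long)(rl.rlim_cur / 1024), (long)(rl.rlim_max / 1024));

        rl.rlim_cur = 64 * 1024 * 1024;      /* ask for 64 MB, as Andrew suggests */
        if (rl.rlim_cur > rl.rlim_max)
            rl.rlim_cur = rl.rlim_max;       /* cannot exceed the hard limit */
        if (setrlimit(RLIMIT_STACK, &rl) != 0)
            perror("setrlimit");

        if (argc > 1)
            execvp(argv[1], &argv[1]);       /* run the real program with the new limit */
        return 0;
    }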
Doug On Thu, 4 Mar 2004, Glen Kaukola wrote: > Douglas Eadline, Cluster World Magazine wrote: > > >What type of machine is this? > > > > > > An Apple G5. > > And actually I've figured out what's wrong. Sorta. =) > > I replaced my problematic subroutine with a dummy subroutine that > contains nothing but variable declarations and a print statement. This > still caused a segmentation fault. So I commented pretty much > everything out. No segmentation fault. Alright then. I slowly added > it all back in, checking each time to see if I got a segmentation fault. > > And now I'm down to 4 variable declarations that are causing a problem: > REAL ZFGLURG ( NCOLS,NROWS,0:NLAYS ) > INTEGER ICASE( NCOLS,NROWS,0:NLAYS ) > REAL THETAV( NCOLS,NROWS,NLAYS ) > REAL ZINT ( NCOLS,NROWS,NLAYS ) > > If I uncomment any one of those, I get a segmentation fault again. > > But it still doesn't make any sense. First of all, there are variable > declarations almost exactly like the ones I listed and those don't cause > a problem. I also made a small test case that called my dummy > subroutine and that worked just fine. I then commented out everything > but the problematic variable declarations I listed above and that worked > just fine. I tried changing the variable names but that didn't seem to > make a difference, as I still got a segmentation fault. So I have no > idea what the heck is going on. I think I need to tell my boss we need > to give up on G5's. > > > Glen > -- ---------------------------------------------------------------- Editor-in-chief ClusterWorld Magazine Desk: 610.865.6061 Cell: 610.390.7765 Redefining High Performance Computing Fax: 610.865.6618 www.clusterworld.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mathiasbrito at yahoo.com.br Fri Mar 5 08:43:33 2004 From: mathiasbrito at yahoo.com.br (=?iso-8859-1?q?Mathias=20Brito?=) Date: Fri, 5 Mar 2004 10:43:33 -0300 (ART) Subject: [Beowulf] Benchmarking with HPL Message-ID: <20040305134333.90538.qmail@web12201.mail.yahoo.com> Hello, I'm benchmarking my cluster with HPL, the cluster have 16 nodes, 8 nodes athlon 1600+ with 512MB RAM and 20GB H.D. , and 8 nodes athlan 1700+ with 512MB RAM and 20GB, all with a 100Mbit fast ethernet linked in a switch. Well, the problem is, what the best setup for the HPL.dat, to obtain the maximum performance of the cluster? Mathias ===== Mathias Brito Universidade Estadual de Santa Cruz - UESC Departamento de Ci?ncias Exatas e Tecnol?gicas Estudante do Curso de Ci?ncia da Computa??o ______________________________________________________________________ Yahoo! Mail - O melhor e-mail do Brasil! 
Abra sua conta agora: http://br.yahoo.com/info/mail.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Sebastien.Georget at sophia.inria.fr Fri Mar 5 10:10:10 2004 From: Sebastien.Georget at sophia.inria.fr (=?ISO-8859-1?Q?S=E9bastien_Georget?=) Date: Fri, 05 Mar 2004 16:10:10 +0100 Subject: [Beowulf] Benchmarking with HPL In-Reply-To: <20040305134333.90538.qmail@web12201.mail.yahoo.com> References: <20040305134333.90538.qmail@web12201.mail.yahoo.com> Message-ID: <40489852.3050206@sophia.inria.fr> Mathias Brito wrote: > Hello, > > I'm benchmarking my cluster with HPL, the cluster have > 16 nodes, 8 nodes athlon 1600+ with 512MB RAM and 20GB > H.D. , and 8 nodes athlan 1700+ with 512MB RAM and > 20GB, all with a 100Mbit fast ethernet linked in a > switch. Well, the problem is, what the best setup for > the HPL.dat, to obtain the maximum performance of the > cluster? > > Mathias Hi, starting points for HPL tuning here: http://www.netlib.org/benchmark/hpl/faqs.html http://www.netlib.org/benchmark/hpl/tuning.html ++ -- S?bastien Georget INRIA Sophia-Antipolis, Service DREAM, B.P. 93 06902 Sophia-Antipolis Cedex, FRANCE E-mail:sebastien.georget at sophia.inria.fr _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Mar 5 12:28:36 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 5 Mar 2004 12:28:36 -0500 (EST) Subject: [Beowulf] Newbie on beowulf clustering In-Reply-To: <20040305171757.15481.qmail@web20730.mail.yahoo.com> Message-ID: On Fri, 5 Mar 2004, khurram b wrote: > hi! > i am newbie to beowulf clustering, have done some work > in MOSIX linux clustering and got interested in > beowulf clustering, please guide me where to start , > tutorials, documents. http://www.phy.duke.edu/brahma Has many resources and links to many more. Also think about subscribing to Cluster World magazine. rgb > > Thanks! > > __________________________________ > Do you Yahoo!? > Yahoo! Search - Find what you?re looking for faster > http://search.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From myaoha at yahoo.com Fri Mar 5 12:17:57 2004 From: myaoha at yahoo.com (khurram b) Date: Fri, 5 Mar 2004 09:17:57 -0800 (PST) Subject: [Beowulf] Newbie on beowulf clustering Message-ID: <20040305171757.15481.qmail@web20730.mail.yahoo.com> hi! i am newbie to beowulf clustering, have done some work in MOSIX linux clustering and got interested in beowulf clustering, please guide me where to start , tutorials, documents. Thanks! __________________________________ Do you Yahoo!? Yahoo! 
Search - Find what you're looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mprinkey at aeolusresearch.com Fri Mar 5 14:02:13 2004 From: mprinkey at aeolusresearch.com (Michael T. Prinkey) Date: Fri, 5 Mar 2004 14:02:13 -0500 (EST) Subject: [Beowulf] "noht" in 2.4.24? Message-ID: Hi Everyone, I installed 2.4.24 on a dual Xeon system with a Tyan 7501-chipset motherboard and the noht option seems to be ignored. The RH9 kernel (2.4.20?) respected noht. Has this been changed or is there a patch that I missed? I can't think that it is a BIOS issue or otherwise hardware related as I can shut it off with RH9 kernel. Thanks, Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hartner at cs.utah.edu Fri Mar 5 16:22:37 2004 From: hartner at cs.utah.edu (Mark Hartner) Date: Fri, 5 Mar 2004 14:22:37 -0700 (MST) Subject: [Beowulf] "noht" in 2.4.24? In-Reply-To: Message-ID: > I installed 2.4.24 on a dual Xeon system with a Tyan 7501-chipset > motherboard and the noht option seems to be ignored. The RH9 kernel > (2.4.20?) respected noht. Has this been changed or is there a patch that I think that option was removed around 2.4.21. If you look at Documentation/kernel-parameters.txt in the kernel source it will give you a list of options for the 2.4.24 kernel. > missed? I can't think that it is a BIOS issue or otherwise hardware > related as I can shut it off with RH9 kernel. 'acpi=off' will disable ht'ing (and a bunch of other stuff). The other option is to disable it in your BIOS. Mark _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rdn at uchicago.edu Fri Mar 5 18:27:34 2004 From: rdn at uchicago.edu (Russell Nordquist) Date: Fri, 5 Mar 2004 17:27:34 -0600 (CST) Subject: [Beowulf] good 24 port gige switch Message-ID: Does anyone have a recommendation for a good 24 port gige switch for clustering? I know this issue has been discussed, but I didn't find any actual manufacturer/models people like. We're not really looking at the very high end models from Cisco, but I am wary of the many low end switches on the market with regard to bisection bandwidth issues. Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches and found one to be better than the other. There are a bunch of 24 port gige switches for <$2000, but are they any good? are some better than others (likely so i'd guess)? thanks and have a good weekend.
russell - - - - - - - - - - - - Russell Nordquist UNIX Systems Administrator Geophysical Sciences Computing http://geosci.uchicago.edu/computing NSIT, University of Chicago - - - - - - - - - - - _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 5 20:24:55 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 09:24:55 +0800 (CST) Subject: [Beowulf] SGEEE free and more platform offically supported Message-ID: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> I used to think that SGE is free, but SGEEE (with more advanced scheduling algorithms) is not. But it is not true, both are free and open source. In SGE 6.0, there will be no "SGEEE mode", but the default mode will have all the SGEEE functionality! And Sun is adding more support too, instead of looking at the source or finding other people to support non-Sun OSes: "Sun will also support non Sun platforms beginning with Grid Engine 6 (HP, IBM, SGI, MAC)." http://gridengine.sunsource.net/servlets/ReadMsg?msgId=16510&listName=users Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 5 20:04:33 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 09:04:33 +0800 (CST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <40491AA7.6050703@cert.ucr.edu> Message-ID: <20040306010433.99259.qmail@web16812.mail.tpe.yahoo.com> It's not your code, I think there is a compiler flag to not allocate variables from the stack, but I need to look at the XLF manuals again. BTW, there are several OSX settings that you can do to tune the performance of your fortran on the G5. I said fortran since it has to do with the hardware prefetching on the Power4 and the G5, if you have c programs with a lot of vector computation, you can set those too. Andrew. --- Glen Kaukola > >the default stack size on OSX is 512 KB, try to > >increase it to 64MB, I encountered this problem > >before. > Yep, that did the trick. Thanks a bunch! > > I'm wondering though, does this indicate there's > some sort of problem > with the code? > > > Glen ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Fri Mar 5 19:26:15 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Fri, 05 Mar 2004 16:26:15 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> References: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> Message-ID: <40491AA7.6050703@cert.ucr.edu> Andrew Wang wrote: >the default stack size on OSX is 512 KB, try to >increase it to 64MB, I encountered this problem >before. > > Yep, that did the trick. Thanks a bunch! 
I'm wondering though, does this indicate there's some sort of problem with the code? Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Fri Mar 5 19:34:05 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Fri, 5 Mar 2004 19:34:05 -0500 (EST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: Message-ID: > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other. There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? I've had good luck with SMC 8624t's, and know of one quite large cluster that uses a lot of them of them (mckenzie, #140). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Sat Mar 6 04:55:22 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: 06 Mar 2004 09:55:22 +0000 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: Message-ID: <1078566922.2547.6.camel@fermi> On Fri, 2004-03-05 at 23:27, Russell Nordquist wrote: > Does anyone have a recommendation for a good 24 port gige switch for > clustering? > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other. There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? We mostly use HP2724's for this size of clusters. We have found them to perform ok and they are stable under heavy load - and they are priced at around $2000 (in Denmark, that is, might be cheaper in the US) best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Gr?br?drestr?de 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Sat Mar 6 09:01:49 2004 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Sat, 6 Mar 2004 06:01:49 -0800 (PST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: <1078566922.2547.6.camel@fermi> Message-ID: On 6 Mar 2004, Lars Henriksen wrote: > On Fri, 2004-03-05 at 23:27, Russell Nordquist wrote: > > Does anyone have a recommendation for a good 24 port gige switch for > > clustering? > > > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > > and found one to be better than the other. There are a bunch of 24 port > > gige switches for <$2000, but are they any good? are some better than > > others (likely so i'd guess)? > > We mostly use HP2724's for this size of clusters. We have found them to > perform ok and they are stable under heavy load - and they are priced at > around $2000 (in Denmark, that is, might be cheaper in the US) hp doesn't do jumbo frames on anything other than their top of the line l3 switch products which may or may not be an issue for certain applications. 
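Whether the missing jumbo-frame support matters is easy to check per node, since the interface MTU tells you what frame size the NIC driver is currently using. Below is a small sketch, assuming Linux and its SIOCGIFMTU ioctl (the interface name is only an example, pass your own on the command line); anything above 1500 only helps if every NIC and the switch in the path accept it.

/*
 * Illustrative sketch (Linux-specific): read an interface's MTU with
 * the SIOCGIFMTU ioctl.  "eth0" is just a default example name.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

int main(int argc, char **argv)
{
    const char *ifname = (argc > 1) ? argv[1] : "eth0";
    struct ifreq ifr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0) {
        perror("socket");
        return 1;
    }
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

    if (ioctl(fd, SIOCGIFMTU, &ifr) < 0) {
        perror("SIOCGIFMTU");
        close(fd);
        return 1;
    }
    printf("%s MTU is %d bytes%s\n", ifname, ifr.ifr_mtu,
           ifr.ifr_mtu > 1500 ? " (jumbo frames enabled)" : "");
    close(fd);
    return 0;
}

Raising the MTU itself is normally done as root from the command line (for example with ifconfig's mtu option) rather than from application code, and it has to be done consistently on every node and on the switch.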
> best regards > Lars > -- > Lars Henriksen | MESH-Technologies A/S > Systems Manager & Consultant | Lille Gr?br?drestr?de 1 > www.meshtechnologies.com | DK-5000 Odense C, Denmark > lars at meshtechnologies.com | mobile: +45 2291 2904 > direct: +45 6311 1187 | fax: +45 6311 1189 > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Unix Consulting joelja at darkwing.uoregon.edu GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Sat Mar 6 10:02:37 2004 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Sat, 06 Mar 2004 16:02:37 +0100 Subject: [Beowulf] SGEEE free and more platform offically supported In-Reply-To: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> References: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> Message-ID: <20040306160237D.hanzl@unknown-domain> > I used to think that SGE is free, but SGEEE (with more > advanced scheduling algorithms) is not. But it is not > true, both are free and open source. SGEEE is free and opensource but many many people did not know this. I thing this confusion made big harm to SGE project and I invested a lot of effort in clarifying this (Google "hanzl SGEEE" to see all that). > In SGE 6.0, there will be no "SGEEE mode", but the > default mode will have all the SGEEE functionality! Great, hope this will stop the confusion once for ever. Regards Vaclav Hanzl _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sat Mar 6 10:00:35 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 23:00:35 +0800 (CST) Subject: [Beowulf] SGEEE free and more platform offically supported In-Reply-To: <20040306160237D.hanzl@unknown-domain> Message-ID: <20040306150035.75079.qmail@web16806.mail.tpe.yahoo.com> --- hanzl at noel.feld.cvut.cz ????> > SGEEE is free and opensource but many many people > did not know this. I > thing this confusion made big harm to SGE project > and I invested a lot > of effort in clarifying this (Google "hanzl SGEEE" > to see all that). I think it is because Sun called it "Enterprise Edition" (EE), and when people think of Enterprise, they think of $$$. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From atp at piskorski.com Sat Mar 6 15:43:28 2004 From: atp at piskorski.com (Andrew Piskorski) Date: Sat, 6 Mar 2004 15:43:28 -0500 Subject: [Beowulf] DC powered clusters? Message-ID: <20040306204328.GA49615@piskorski.com> Some rackmount vendors now offer systems with a small DC-to-DC power supply for each node, with separate AC-DC rectifiers feeding power. 
I imagine the DC is probably at 48 V rather than 12 V or whatever, but often they don't even seem to ay that, e.g.: http://rackable.com/products/dcpower.htm Has anyone OTHER than commercial rackmount vendors designed and built a cluster using such DC-to-DC power supplies? Is there detailed info on such anywhere on the web? Anybody have any idea exactly what components those vendors are using for their power systems, where they can be purchased (in small quantities), and/or how much they cost? I'm curious how the purchase and operating costs compare to the normal "stick a standard desktop AC-to-DC PUSE in each node" approach, or even the hackish "wire on extra connectors and use one high qualtiy desktop PSU to power 2 or 3 nodes" approach. The only DC-to-DC supplies I've seen on the web seem quite expensive, e.g.: http://www.rackmountpro.com/productsearch.cfm?catid=118 http://www.mini-box.com/power-faq.htm So I suspect the DC-to-DC approach would only ever make economic sense for large high-end clusters, those with unusual space or heat constraints, or the like. But I'm still curious about the details... -- Andrew Piskorski http://www.piskorski.com/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Mar 5 23:41:07 2004 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 05 Mar 2004 22:41:07 -0600 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: Message-ID: <40495663.7010507@tamu.edu> Caveats: 1. It's been arough week. 2. I've got some specific opinions about 3Com hardware these days. I just ordered a 16 node cluster. I'm using the Foundry EdgeIron 24G as the basic switch. More than adequate backplane, pretty good small and large packet performance as tested with an Anritsu MD1230. Cost is expected to be about $3000, for the 24 port model. I'm getting 2, and have dual nics on the nodes, for some playing with channel bonding, and so that I've got a failover hot spare if/when one dies. Remember: Murphy was an optimist. For the record I don't expect the EdgeIron to die, but conversely (perversely?) I expect any and all network devices to die at the least opportune time! I didn't even consider 3Com. Didn't test it. The 3Com "gigabit" hardware I've seen recently in the LAN-space was usually capable of gig uplinks, but had trouble with congestion when gig and 100BaseT were mixed on the switch. HP had been OEM'ing Foundry. I'm not sure if that's still the case or if they went recently to someone else; my Foundry rep won't say, and I don't have a close HP rep. We have programmatically stayed away from Asante in our LAN operations here. That translates to no experience an dno contacts. Sorry. Cluster should be in within a month, and so should the switches. I'll do some latency runs and report objective data. gerry Russell Nordquist wrote: > Does anyone have a recommendation for a good 24 port gige switch for > clustering? I know this issue has been discussed, but I didn't find any > actual manufacturer/models people like. Were not really looking at the > very high end models from Cisco, but I am wary of the many low end > switches on the market with regard to bisectional bandwidth issues. > > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other. 
There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? > > thanks and have a good weekend. > russell > > > - - - - - - - - - - - - > Russell Nordquist > UNIX Systems Administrator > Geophysical Sciences Computing > http://geosci.uchicago.edu/computing > NSIT, University of Chicago > - - - - - - - - - - - > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Mail.Linux-Consulting.com Sun Mar 7 03:00:56 2004 From: alvin at Mail.Linux-Consulting.com (Alvin Oga) Date: Sun, 7 Mar 2004 00:00:56 -0800 (PST) Subject: [Beowulf] DC powered clusters? - fun In-Reply-To: <20040306204328.GA49615@piskorski.com> Message-ID: hi ya andrew fun stuff ... :-) good techie vitamins ;-) - lots of thinking of why it is the way it is vs what the real measure power consumption is On Sat, 6 Mar 2004, Andrew Piskorski wrote: > Some rackmount vendors now offer systems with a small DC-to-DC power > supply for each node, with separate AC-DC rectifiers feeding power. I > imagine the DC is probably at 48 V rather than 12 V or whatever, but > often they don't even seem to ay that, e.g.: > > http://rackable.com/products/dcpower.htm i don't like that they claim "back-to-back rackmounts" is their "patented technology" ... geez ... - anybody can mount a generic 1U in the rack .. one in the front and one in the back ( other side ) ... ( obviously the 1U chassis cannot be too deep ) > Has anyone OTHER than commercial rackmount vendors designed and built > a cluster using such DC-to-DC power supplies? Is there detailed info > on such anywhere on the web? dc-dc power supplies are made literally and figuratively by the million various combination of voltage, current capacity and footprint http://www.Linux-1U.net/PowerSupp ( see the list of various power supply manufacturers ) > Anybody have any idea exactly what components those vendors are using > for their power systems, where they can be purchased (in small > quantities), and/or how much they cost? you can buy any size dc-dc power supplies from $1.oo to the thousands if you want the dc-dc power supply to have atx output capabilities, than you have 2 or 3 choice of dc-atx output power supplies: - mini-box.com ( and they have a few resellers ) - there's a power supply company that also did a variation of mini-box.com's design ... i cant find the orig url at this time http://www.dc2dc.com is a resller of the "other option" - probably a bunch of power supp working on dc-atx convertors > The only DC-to-DC supplies I've seen on the web seem quite expensive, > e.g.: > > http://www.rackmountpro.com/productsearch.cfm?catid=118 99% of the rackmount vendors are just reselling (adding $$$ to ) a power supply manufacturer's power supply ... 
- you can save a good chunk of change by buying direct from the generic power supply OEM distributors - somtimes as much or mroe than 50% cost savings of the cost of the power supply > http://www.mini-box.com/power-faq.htm most of their data are measured data per their test setups and more info about dc-dc stuff http://www.via.com.tw/en/VInternet/power.pdf see the rest of the +12v DC input "atx power supply" vendors http://www.Linux-1U.net/PowerSupp/DC/ http://www.Linux-1U.net/PowerSupp/12v/ ( +12v at up to 500A or more ) > So I suspect the DC-to-DC approach would only ever make economic sense > for large high-end clusters, those with unusual space or heat > constraints, or the like. But I'm still curious about the details... dc-atx power supply makes sense when: - power supply heat and airflow is a problem or you dont like having too many power cords ( 400 cords vs 40 in a rack ) - simple cabling is a big problem ( rats nest ) - you want to reduce the costs of the system by throwing away un-used power supply capacity that is available with the traditional one power supply per 1 motherboard and peripherals - most power supplies used are used for maximum supported load (NOT a motherboard + cpu + disk + mem only) - you have a huge airconditioning bill problem - that should motivate you to find and test a system with "less heat generated solutions" - your cluster only needs to have enough power for the cpu + 1disk - you have a space consideration problems - dc-atx power supply allows 420 cpus per 42U rack and up to 840 cpus for front and back loaded cluster - on and on ... for a typical 4U-8U height blade clusters ( 10 blades ) - you only need one 600-800W atx power supply to drive the 10 mini-itx or flex-atx blades - cpu is 25W ?? motherboard is 25W ... - disks need 1A at 12v to spin up.. normal operation current is 80ma at 12v ... etc .. per disk specs - how you want to do power calculations is the trick 10 full-tower system with a 450W power does NOT imply you';re using 4500W of power for 10 systems :-) have fun alvin http://www.1U-ITX.net 100TB - 200TB of disks per 42U racks ?? -- even more fun http://www.itx-blades.net/1U-Blades ( blades are with mini-box.com's dc-dc atx power supply ) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mayank_kaushik at vsnl.net Sun Mar 7 03:29:42 2004 From: mayank_kaushik at vsnl.net (mayank_kaushik at vsnl.net) Date: Sun, 07 Mar 2004 13:29:42 +0500 Subject: [Beowulf] PVM says `PVM_ROOT not set..cant start pvmd` on remote computer Message-ID: <69bcf6b69beb90.69beb9069bcf6b@vsnl.net> hi... im trying to make a two-machine PVM virtual machine. but im having problems with PVM. the names of the two machines are "mayank" and "manish".."mayank" runs fedora core 1, "manish" runs red hat linux -9..both are part of a simple 100mbps LAN, connected by a 100mbps switch. iv *disabled* the firewall on both machines. iv installed pvm-3.4.4-14 on both machines. 
the problem is: when i try to add "mayank" to the virtual machine from "manish" using "add mayank", pvm is unable to do so..gives an error message "cant start pvmd"..then it tries to diagnose what went wrong..it passes all tests but one -- says "PVM_ROOT" is set to "" on the target machine ("mayank")...but thats ABSURD..iv checked a mill-yun times, the said variable is correctly set..when i ssh to mayank from manish, and then echo $PVM_ROOT , i get the correct answer... plz note that im using ssh instead of rsh, by changing the variable PVM_RSH=/usr/bin/ssh..since im more comfortable with ssh... but when i try the opposite--adding "manish" to the virtual machine from "mayank" runnnig fedora..it works! furthermore....before i installed fedora core 1 on mayank, it too had red hat 9..and then i was getting the same problem from BOTH machines..but after installing fedora on mayank, things began to work from that end. what going on??? (apart from me whos going nuts) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Sun Mar 7 11:10:20 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Sun, 7 Mar 2004 08:10:20 -0800 (PST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: <40495663.7010507@tamu.edu> Message-ID: Does anyone have experience with Dell's new 2624 unmanaged 24 port gigE switch? It's only about $330, around a 1/10 the cost of the managed switches. >From what I've read, the Dell/Linksys 5224 managed gigE switch is good. It could be that the unmanaged switch uses the exact same Broadcom switch chips, but just doesn't have management. On Fri, 5 Mar 2004, Gerry Creager N5JXS wrote: > expected to be about $3000, for the 24 port model. I'm getting 2, and > have dual nics on the nodes, for some playing with channel bonding, and Last I heard, the interrupt mitigation on gigE cards messes up channel bonding for extra bandwidth. The packets arrive in batches out of order, and Linux's TCP/IP stack doesn't like this, so you get less bandwidth with two cards than you would with just one. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sun Mar 7 17:13:26 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sun, 7 Mar 2004 17:13:26 -0500 (EST) Subject: [Beowulf] PVM says `PVM_ROOT not set..cant start pvmd` on remote computer In-Reply-To: <69bcf6b69beb90.69beb9069bcf6b@vsnl.net> Message-ID: On Sun, 7 Mar 2004 mayank_kaushik at vsnl.net wrote: > hi... > > > im trying to make a two-machine PVM virtual machine. but im having problems with PVM. > the names of the two machines are "mayank" and "manish".."mayank" runs fedora core 1, "manish" runs red hat linux -9..both are part of a simple 100mbps LAN, connected by a 100mbps switch. > iv *disabled* the firewall on both machines. > > iv installed pvm-3.4.4-14 on both machines. 
> the problem is: > when i try to add "mayank" to the virtual machine from "manish" using > "add mayank", pvm is unable to do so..gives an error message "cant start > pvmd"..then it tries to diagnose what went wrong..it passes all tests > but one -- says "PVM_ROOT" is set to "" on the target machine > ("mayank")...but thats ABSURD..iv checked a mill-yun times, the said > variable is correctly set..when i ssh to mayank from manish, and then > echo $PVM_ROOT , i get the correct answer... This COULD be associated with the order things like .bash_profile and so forth are run for interactive shells vs login shells. If you are setting PVM_ROOT in .bash_profile (so it would be correct on a login) be sure to ALSO set it in .bashrc so that it is set for the remote shell likely used to start PVM. I haven't looked at the fedora RPM so I don't know if /usr/bin/pvm is still a script that sets this variable for you anyway. > plz note that im using ssh instead of rsh, by changing the variable > PVM_RSH=/usr/bin/ssh..since im more comfortable with ssh... Me too. ssh also has a very nice feature that permits an environment to be set on the remote machine for non-interactive remote commands that CAN be useful for PVM, although I think the stuff above might fix it. > but when i try the opposite--adding "manish" to the virtual machine > from "mayank" runnnig fedora..it works! > furthermore....before i installed fedora core 1 on mayank, it too had > red hat 9..and then i was getting the same problem from BOTH > machines..but after installing fedora on mayank, things began to work > from that end. I've encountered a similar problem only once, trying to add nodes FROM a wireless laptop. Didn't work. Adding the wireless laptop from anywhere else worked fine, all systems RH 9 and clean (new) installs from RPM of pvm, I explicitly set PVM_ROOT and PVM_RSH when logging in. PVM_ROOT is additionally set (correctly) by the /usr/bin/pvm command, which is really a shell. > what going on??? (apart from me whos going nuts) Try checking your environment to make sure it is set for both a remote command: ssh mayank echo "\$PVM_ROOT" and in a remote login: ssh mayank $ echo "$PVM_ROOT" rgb > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sunyy_2004 at hotmail.com Mon Mar 8 11:33:18 2004 From: sunyy_2004 at hotmail.com (Yiyang Sun) Date: Tue, 09 Mar 2004 00:33:18 +0800 Subject: [Beowulf] Relation between Marvell Yukon Controller and SysKonnect GbE Adapters Message-ID: Hi, Beowulf users, We're going to setup a small cluster. The motherboard we ordered is the newly released Gigabyte GA-8IPE1000-G which integrates Marvell's Yukon 8001 GbE Controller. I tried to find the Linux driver for this controller on Google and was directed to SysKonnect's website http://www.syskonnect.com/syskonnect/support/driver/d0102_driver.html which provides a driver for Marvell Yukon/SysKonnect SK-98xx Gigabit Ethernet Adapters. 
However, there is no explicit indication on this website that SysKonnect's adapters use Marvell's chips. Does any here have experience using Marvell's controllers? Is it easy to install Yukon 8001 on Linux? Thanks! Yiyang _________________________________________________________________ Get MSN Hotmail alerts on your mobile. http://en-asiasms.mobile.msn.com/ac.aspx?cid=1002 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Mon Mar 8 14:44:50 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 8 Mar 2004 14:44:50 -0500 (EST) Subject: [Beowulf] Re: beowulf In-Reply-To: <20040308184024.955.qmail@web21501.mail.yahoo.com> Message-ID: On Mon, 8 Mar 2004, prakash borade wrote: > how should i proceed for a client which takes dta from 5 servers > reoetadly after every 15 seconds > i get the data but it prints the garbage value > > what can be the problem i am usiung sockets on redhat 9 > > i am creting new sockets for it every time on clien side Dear Prakash, There is such a dazzling array of possible problems with your code that (not being psychic) I cannot possibly help you. For example -- You could be printing an integer as a float without a cast (purely misusing printf). Or vice versa. I do this all the time; it is a common mistake. You could be sending the data on a bigendian system, receiving it and trying to print it on a littleendian system. You could have a trivial offset wrong in your receive buffers -- printing an integer (for example) starting a byte in and overlapping some other data in your stack would yield garbage. You could have a serious problem with your read algorithm. Reading reliably from a socket is not trivial. I use a routine that I developed over a fairly long time and it STILL has bugs that surface. The reading/writing are fundamentally asynchronous, and a read can easily leave data behind in the socket buffer (so that what IS read is garbage). ...and this is the tip of an immense iceberg of possible programming errors. The best way to proceed to write network code is to a) start with a working template of networking/socket code. There are examples in a number of texts, for example, as well as lots of socket-based applications. Pick a template, get it working. b) SLOWLY and GENTLY change your working template into your application, ensuring that the networking component never breaks at intermediary revisions. or c) learn, slowly, surely, and by making many mistakes, to write socket code from scratch without using a template. Me, I use a template. rgb P.S. to get more help, you're really going to have to provide a LOT more detail than this. Possibly including the actual source code. -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Mon Mar 8 14:54:40 2004 From: beowulf at studio26.be (Maikel Punie) Date: Mon, 8 Mar 2004 20:54:40 +0100 Subject: [Beowulf] Cluster school project Message-ID: hi, I need to make a smaal beowulf cluster for a school project i have like 2 months for this stuff, but i need to make my own task asignment. 
So basicly what do you guys think that would be possible to realize in 2 months time? The only thing they told me, is that the nodes must be discless systems. any ideas about what could be donne in 2 months. Maikel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Mon Mar 8 16:03:41 2004 From: beowulf at studio26.be (Maikel Punie) Date: Mon, 8 Mar 2004 22:03:41 +0100 Subject: [Beowulf] Cluster school project In-Reply-To: Message-ID: hmm, ok, maybe i explained badly, at the moment i just need to create a project discryption on what would be possible to realize in 2 months, and off course i could use the cluster knoppix, but then its not a real project anymore, then its just an install task. also the openmosix structure is it using diskless nodes? or what because i can't find a lot off info about it. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. Well its the whole other part off the country, but yeah it was a great conference i was there to :) Thanks Miakle -----Oorspronkelijk bericht----- Van: John Hearns [mailto:john.hearns at clustervision.com] Verzonden: maandag 8 maart 2004 21:52 Aan: Maikel Punie CC: Beowul-f Mailing lists Onderwerp: Re: [Beowulf] Cluster school project On Mon, 8 Mar 2004, Maikel Punie wrote: > > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > Maikel, first you need the computers! Then you should first look at ClusterKnoppix http://bofh.be/clusterknoppix/ Once you have that running, come back and tell us how you got on. We'll help you do more then. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Mon Mar 8 15:51:58 2004 From: john.hearns at clustervision.com (John Hearns) Date: Mon, 8 Mar 2004 21:51:58 +0100 (CET) Subject: [Beowulf] Cluster school project In-Reply-To: Message-ID: On Mon, 8 Mar 2004, Maikel Punie wrote: > > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > Maikel, first you need the computers! Then you should first look at ClusterKnoppix http://bofh.be/clusterknoppix/ Once you have that running, come back and tell us how you got on. We'll help you do more then. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. 
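As a concrete answer to "what could be done in 2 months": once the diskless nodes boot (with ClusterKnoppix or a hand-rolled NFS root), getting one small MPI program to run across them is a realistic, demonstrable goal, and a parallel pi calculation, which also comes up later in this thread, is the textbook first example. A minimal sketch, assuming an MPI implementation such as MPICH or LAM (both discussed elsewhere in this digest) with mpicc and mpirun available:

/*
 * Classic "first cluster program": estimate pi by numerically
 * integrating 4/(1+x^2) over [0,1], splitting the intervals across
 * MPI ranks and combining the partial sums with MPI_Reduce.
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    long i, n = 10000000;              /* number of intervals */
    double h, x, local = 0.0, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    h = 1.0 / (double)n;
    /* Each rank sums every size-th interval, starting at its own rank. */
    for (i = rank; i < n; i += size) {
        x = h * ((double)i + 0.5);
        local += 4.0 / (1.0 + x * x);
    }
    local *= h;

    /* Collect the partial sums on rank 0. */
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi is approximately %.12f with %d processes\n", pi, size);

    MPI_Finalize();
    return 0;
}

Building it with mpicc and launching it with mpirun across one, two, four and eight nodes, then plotting the run times, already makes a presentable two-month project write-up on top of the diskless setup work itself.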
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mprinkey at aeolusresearch.com Mon Mar 8 14:39:44 2004 From: mprinkey at aeolusresearch.com (Michael T. Prinkey) Date: Mon, 8 Mar 2004 14:39:44 -0500 (EST) Subject: [Beowulf] e1000 performance Message-ID: Hello everyone, I am building a small cluster that uses Tyan S2723GNN motherboards that include an integrated Intel e1000 gigabit NIC. I have installed two Netgear 302T gigabit cards in the 66 MHz slots as well. With point-to-point links, I can get a very respectable 890 Mbps with the tg3 cards, but the e1000 lags significantly at 300 to 450 Mbps. I am using the NAPI e1000 driver in the 2.4.24 kernel. I have tried the following measures without any improvement: - changed the tcp_mem,_wmem,_rmem to larger values. - increased the MTU to values >1500. - reniced the ksoftirq processes to 0. The 2.4.24 kernel contains the 4.x version of the e1000. I plan to try the 5.x version this evening. Also, want to try increasing the Txqueuelen as well. Has anyone had similar experience with these embedded e1000s? Googling leads me to several sites like this one: http://www.hep.ucl.ac.uk/~ytl/tcpip/tuning/ that seem to indicate that I should expect much more from the e1000. Any help here is welcome? Thanks, Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Mon Mar 8 16:59:59 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Mon, 8 Mar 2004 13:59:59 -0800 (PST) Subject: [Beowulf] e1000 performance In-Reply-To: Message-ID: On Mon, 8 Mar 2004, Michael T. Prinkey wrote: > I am building a small cluster that uses Tyan S2723GNN motherboards that > include an integrated Intel e1000 gigabit NIC. I have installed two >From a supermicro X5DPL-iGM (E7501 chipset) with onboard e1000 to supermicro E7500 board with an e1000 PCI-X gigabit card, via a dell 5224 switch. The E7501 board has a 3ware 8506 card on the same PCI-X bus as the e1000 chip, so it's running at 64/66. The PCI-X card is running at 133 MHz. TCP STREAM TEST to duet Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131070 131070 1472 9.99 940.86 Kernel versions are 2.4.20 (PCI-X card) and 2.4.22-pre2 (the onboard chip). 2.4.20 has driver 4.4.12-k1, while 2.4.22-pre2 has driver 5.1.11-k1. The old e1000 driver has a very useful proc file in /proc/net/PRO_LAN_Adapters that gives all kind of information. I have RX checksum on and flow control turned on. The newer driver doesn't have this information. > the NAPI e1000 driver in the 2.4.24 kernel. I have tried the following NAPI? > measures without any improvement: I've done nothing wrt gigabit performance, other than turn on flow control. I found that without flowcontrol, tcp connections to 100 mbit hosts would hang. 
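One knob that goes hand in hand with the tcp_rmem/tcp_wmem sysctls Michael mentions is the per-socket buffer size the application itself requests; the 131070-byte sockets in the netperf numbers above are exactly that. A minimal sketch, assuming Linux (the 256 KB request is only an example figure), of asking for larger buffers with setsockopt() and reading back what the kernel actually granted:

/*
 * Illustrative sketch: request larger per-socket buffers with
 * setsockopt(SO_SNDBUF/SO_RCVBUF) and read back the granted size.
 * Linux caps the request at net.core.wmem_max / net.core.rmem_max,
 * which is why the sysctl side of the tuning matters too.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int want = 256 * 1024;
    int got;
    socklen_t len = sizeof(got);

    if (fd < 0) {
        perror("socket");
        return 1;
    }
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &want, sizeof(want)) < 0)
        perror("SO_SNDBUF");
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want)) < 0)
        perror("SO_RCVBUF");

    /* Linux reports roughly twice the granted value here, since its
       own bookkeeping overhead is included in the number. */
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &got, &len) == 0)
        printf("send buffer: asked %d, kernel reports %d\n", want, got);
    len = sizeof(got);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) == 0)
        printf("recv buffer: asked %d, kernel reports %d\n", want, got);

    close(fd);
    return 0;
}

If the reported value stays small no matter what is requested, the net.core.rmem_max/wmem_max caps are what need raising; that is the sysctl side of the tuning pages Michael linked to.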
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mnerren at paracel.com Mon Mar 8 17:31:55 2004 From: mnerren at paracel.com (micah nerren) Date: Mon, 08 Mar 2004 14:31:55 -0800 Subject: [Beowulf] Cluster school project In-Reply-To: References: Message-ID: <1078785115.30523.89.camel@angmar> On Mon, 2004-03-08 at 11:54, Maikel Punie wrote: > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > > any ideas about what could be donne in 2 months. > > Maikel To actually build a small (or large!) beowulf of discless systems is pretty easy, I guess the hardest part will be determining what the purpose of the cluster will be. What type of code will be running on it? They will basically be network booting a kernel and mounting an nfs filesystem. Research these aspects, and research what kind of tools you want to have on the cluster, ie. distributed shell, monitoring, mpi, etc. 2 months should be plenty, you should be able to get a basic small beowulf up and running in 2 hours once you know what to do and how to set it up. Time to fire up google and start researching beowulf's and diskless booting. There is a lot of good info out there. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Mon Mar 8 17:15:46 2004 From: becker at scyld.com (Donald Becker) Date: Mon, 8 Mar 2004 17:15:46 -0500 (EST) Subject: [Beowulf] BWBUG Greenbelt: Intel HPC and Grid, Beowulf Clusters Message-ID: Special notes: This month's meeting is in Greenbelt Maryland, not Virginia! From pre-registration we expect a full room, so please register on line at http://bwbug.org and show up at least 15 minutes early. Title: Intel's Perspective on Beowulf's Clusters Speaker: Stephen Wheat Ph.D This talk will review Intel's perspective on technology trends and transitions in this decade. The focus will be on bringing the latest technology to the scientists' labs in the shortest amount of time. The technologies reviewed will include processors, chipsets, I/O, systems management, and software tools. Come with your questions; the presentation is designed to be interactive. Date: March 9, 2004 Time: 3:00 PM (doors open at 2:30) Location: Northrop Grumman IT 7501 Greenway Center Drive (Intersection of BW Parkway and DC beltway) Suite 1200 (12th floor) Greenbelt Maryland Need to be a member?: No ( guests are welcome ) Parking: Free As usual there will be door prizes, food and refreshments. From: "Fitzmaurice, Michael" Dr. Wheat from Intel must be a popular speaker we have a big turn out expected. If you have not registered yet please do so. We may need to plan for extra chairs and we need to predict how many pizzas to order. This would be great meeting to invite a friend or your boss. It may be crowded, therefore, getting there a little early is recommended. This event is sponsored by the Baltimore-Washington Beowulf Users Group (BWBUG) Please register on line at http://bwbug.org As usual there will be door prizes, food and refreshments. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From nathan at iwantka.com Mon Mar 8 18:21:15 2004 From: nathan at iwantka.com (Nathan Littlepage) Date: Mon, 8 Mar 2004 17:21:15 -0600 Subject: [Beowulf] SCTP Message-ID: <00d701c40564$1d21a830$6c45a8c0@ntbrt.bigrivertelephone.com> Has anyone looked into incorporating SCTP in the cluster environment? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Mon Mar 8 20:44:52 2004 From: becker at scyld.com (Donald Becker) Date: Mon, 8 Mar 2004 20:44:52 -0500 (EST) Subject: [Beowulf] SCTP In-Reply-To: <00d701c40564$1d21a830$6c45a8c0@ntbrt.bigrivertelephone.com> Message-ID: On Mon, 8 Mar 2004, Nathan Littlepage wrote: > Has anyone looked into incorporating SCTP in the cluster environment? What advantage would it provide for a SAN- or LAN-based cluster? Not that TCP is especially light-weight. TCP implementations are WAN-oriented and have increasingly costly features (look at the CPU cost of iptables/ipchains) and defenses against spoofing (TCP stream start-up is much more costly than the early BSD implementations). The only reason SCTP would be a better cluster protocol is that it hasn't yet accumulated the cruft ("features") of a typical TCP stack. But if it became popular, that would change pretty much instantly. -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 914 Bay Ridge Road, Suite 220 Scyld Beowulf cluster systems Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rdn at uchicago.edu Mon Mar 8 23:40:01 2004 From: rdn at uchicago.edu (Russell Nordquist) Date: Mon, 8 Mar 2004 22:40:01 -0600 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: <1078566922.2547.6.camel@fermi> Message-ID: <20040308224001.50f2f728@vitalstatistix> thanks for all the good info. it got me to thinking....i have resources for comparing most components of a cluster excepts network switches. it would be nice to have a source of information for this as well. something like: *bandwidth/latency between 2 hosts *bandwidth/latency at 25%/50%/75%/100% port usage *short vs long message comparisons great so far, but what about the issues: *what SW to use for the benchmark. perhaps netpipe? *the NICS used will make a difference. how does one account for the difference between a realtec and syskonnect chipset, bus speeds, etc? *do we have enough variation of cluster sizes and HW to make a useful repository? *and i'm sure there's more Is this feasible? Is it a case where any info is useful even if it is not very reliable/accurate? With more MB's coming with decent gige on board there will be a greater chance the the difference between to setups will only be the switch. so, is this a worthwhile are useful project for the community? or are there to many variables to make the results useful? 
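For the two-host bandwidth/latency numbers Russell lists first, netpipe and netperf are the obvious candidates, and it may help to spell out how little is involved in the latency half of such a measurement. Below is a bare-bones TCP ping-pong sketch, assuming Linux/POSIX sockets (the port number, message size and iteration count are arbitrary example values, and error handling is trimmed for brevity): run it with no arguments on one node as the echo server, then with the server's IP on a second node, and the client prints the average round trip for small messages. The readn()/writen() loops are the important detail; as rgb pointed out earlier in this digest, a single read() on a stream socket is not guaranteed to return a whole message.

/*
 * Bare-bones TCP ping-pong latency sketch (netpipe/netperf are the
 * real tools).  Server: ./pingpong    Client: ./pingpong <server-ip>
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>

#define PORT 5678          /* arbitrary example port   */
#define MSG  64            /* message size in bytes    */
#define ITER 10000         /* round trips to average   */

static int readn(int fd, char *buf, int n)      /* read exactly n bytes */
{
    int done = 0, r;
    while (done < n) {
        r = read(fd, buf + done, n - done);
        if (r <= 0) return -1;
        done += r;
    }
    return n;
}

static int writen(int fd, const char *buf, int n)  /* write exactly n bytes */
{
    int done = 0, w;
    while (done < n) {
        w = write(fd, buf + done, n - done);
        if (w <= 0) return -1;
        done += w;
    }
    return n;
}

int main(int argc, char **argv)
{
    char buf[MSG] = {0};
    int one = 1, fd, i;
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(PORT);

    if (argc < 2) {                               /* server: echo back  */
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
        bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
        listen(lfd, 1);
        fd = accept(lfd, NULL, NULL);
        if (fd < 0) { perror("accept"); return 1; }
        setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
        while (readn(fd, buf, MSG) == MSG)
            writen(fd, buf, MSG);
    } else {                                      /* client: time RTTs  */
        struct timeval t0, t1;
        fd = socket(AF_INET, SOCK_STREAM, 0);
        inet_pton(AF_INET, argv[1], &addr.sin_addr);
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }
        setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
        gettimeofday(&t0, NULL);
        for (i = 0; i < ITER; i++) {
            writen(fd, buf, MSG);
            readn(fd, buf, MSG);
        }
        gettimeofday(&t1, NULL);
        printf("avg round trip: %.2f microseconds (%d-byte messages)\n",
               ((t1.tv_sec - t0.tv_sec) * 1e6 +
                (t1.tv_usec - t0.tv_usec)) / ITER, MSG);
    }
    close(fd);
    return 0;
}

Results from a toy like this are only comparable if the NIC, driver and kernel are held fixed across tests, which is exactly the difficulty Russell raises about building a public switch-comparison repository.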
russell -- - - - - - - - - - - - - Russell Nordquist UNIX Systems Administrator Geophysical Sciences Computing http://geosci.uchicago.edu/computing NSIT, University of Chicago - - - - - - - - - - - _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Tue Mar 9 12:45:47 2004 From: beowulf at studio26.be (Maikel Punie) Date: Tue, 9 Mar 2004 18:45:47 +0100 Subject: [Beowulf] Cluster school project In-Reply-To: <644D9337A02FC24689647BF9E48EC39E08ABB797@drm556> Message-ID: >> ok, maybe i explained badly, at the moment i just need to create a project >> discryption on what would be possible to realize in 2 months, and off course >Do you mean a computing/programming project could you do, >like calculating pi to some large number of digits? yeah something like that, i realy have no idea what is possible. if there are any suggestions, they are always welcome. Maikel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From paulojjs at bragatel.pt Tue Mar 9 04:15:05 2004 From: paulojjs at bragatel.pt (Paulo Silva) Date: Tue, 09 Mar 2004 09:15:05 +0000 Subject: [Beowulf] How to choose an UPS for a Beowulf cluster Message-ID: <1078823704.1882.33.camel@blackTiger> Hi, I'm building a small Beowulf cluster for HPC (about 16 nodes) and I need some advices on choosing the right UPS. The UPS should be able to signal the central node when the battery reaches some level (I think this is common usage) and it should be able to turn itself off before running out of battery (I was told that this extends the life of the battery). 10 minutes of runtime sould be enough. I was looking in the APC site but I was rather confused by all the models available. Can anyone give me some advice on the type of device to choose? Thanks for any tip -- Paulo Jorge Jesus Silva perl -we 'print "paulojjs".reverse "\ntp.letagarb@"' If a guru falls in the forest with no one to hear him, was he really a guru at all? -- Strange de Jim, "The Metasexuals" -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Esta ? uma parte de mensagem assinada digitalmente URL: From brichard at clusterworldexpo.com Tue Mar 9 13:45:15 2004 From: brichard at clusterworldexpo.com (Bryan Richard) Date: Tue, 9 Mar 2004 13:45:15 -0500 Subject: [Beowulf] Join Don Becker and Thomas Sterling at ClusterWorld Conference & Expo Message-ID: <20040309184515.GB47601@clusterworldexpo.com> ClusterWorld Conference & Expo welcomes Scyld's Don Becker and Keynote Thomas Sterling to the program! If you work in Beowulf and clusters, you can't miss the following program events: - Donald Becker, Scyld Computing Corporation: "Scyld Beowulf Introductory Workshop" - Donald Becker, Scyld Computing Corporation: "Scyld Beowulf Advanced Workshop" - Thomas Sterling, California Institute of Technology: "Beowulf Cluster Computing a Decade of Accomplishment, a Decade of Challenge" PLUS, ClusterWorld's exciting program of intensive tutorials, special events, and expert presentations in 8 vertical industry tracks: Applications, Automotive & Aerospace Engineering, Bioinformatics, Digital Content Creation, Grid, Finance, Petroleum & Geophysical Exploration, and Systems. 
A Special Offer for Beowulf Members =================================== Beowulf.org members get 20% off registration prices when registering online! You MUST use your special Priority Code - BEOW -- when registering online to receive your 20% discount! Online registration ends March 31, 2004 so don't delay! Just go to http://www.clusterworldexpo.com and click on "REGISTER NOW!" to fill out our quick enrollment form. Associations, Universities and Labs Get 50% off Registration ============================================================ Students and employees of universities, associations, and government labs are eligible for 50% off ClusterWorld registration! This offer is only available via fax or mail. Please log on to www.clusterworldexpo.com and click on "Register Now" to download registration PDFs. Or call 415-321-3062 for more information A TERRIFIC PROGRAM ================== At ClusterWorld Conference & Expo, you will: * LEARN from top clustering experts in our extensive conference program. * EXPERIENCE the latest cluster technology from the top vendors on our expo floor. * MEET AND NETWORK with colleagues from across the world of clustering at our social events and parties. Keynotes: - Ian Foster, Argonne National Laboratory, University of Chicago, Globus Alliance, and co-author of "The Grid: Blueprint for a New Computing Infrastructure", - Thomas Sterling, California Institute of Technology, author of "How to Build a Beowulf," and co-author of "Enabling Technologies for Petaflops Computing". - Andrew Mendelsohn, Senior Vice President, Database & Application Server Technology, Oracle Corporation - David Kuck, Intel Fellow, Manager, Software and Solutions Group, Intel Corporation Want to know which sessions are getting the biggest buzz? Click on http://www.clusterworldexpo.com/SessionSpotlight for a list of highlights by Technical Session Track. REGISTER TODAY! ClusterWorld Conference and Expo April 5 - 8, 2004 San Jose Convention Center San Jose, California http://www.clusterworldexpo.com ClusterWorld Conference & Expo Sponsors ======================================= Platinum: Oracle Corporation, Intel Corporation Gold: AMD, Dell, Hewlett Packard, Linux Networx, Mountain View Data, Panasas, Penguin Computing, and RLX Technologies Silver: Appro, Engineered Intelligence, Microway, NEC, Platform Computing, and PolyServe Media & Association Sponsors: Bioinformatics.org, ClusterWorld Magazine, Distributed Systems Online, Dr. 
Dobbs Journal, Gelato Federation, Global Grid Forum, GlobusWorld, LinuxHPC, Linux Magazine, PR Newswire, Storage Management, and SysAdmin Magazine _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wseas at canada.com Tue Mar 9 12:25:56 2004 From: wseas at canada.com (WSEAS newsletter in mechanical engineering) Date: Tue, 9 Mar 2004 19:25:56 +0200 Subject: [Beowulf] WSEAS and IASME newsletter in mechanical engineering, March 9, 2004 Message-ID: <3FE20F4000220BB2@fesscrpp1.tellas.gr> (added by postmaster@fesscrpp1.tellas.gr) If you want to contact us, the Subject of your email must contains the code: WSEAS CALL FOR PAPERS -- CALL FOR REVIEWERS -- CALL FOR SPECIAL SESSIONS http://www.wseas.org IASME / WSEAS International Conference on "FLUID MECHANICS" (FLUIDS 2004) August 17-19, Corfu Island, Greece The papers of this conference will be published: (a) as regular papers in the IASME/WSEAS conference proceedings (b) regular papers in the IASME TRANSACTIONS ON MECHANICAL ENGINEERING http://www.wseas.org REGISTRATION FEES: 250 EUR DEADLINE: APRIL 10, 2004 ACCOMODATION: Incredible low prices in a 5 Star Sea Resort (former HILTON of Corfu Island), Greece, 5 Star Sea resort where the multiconference of WSEAS will take place in August 2004: 51 EUR in double room and 81 EUR in single room. (in August 2004, in the Capital of Greece, Athens, the 2004 Olympic Games will take place) ---> Sponsored by IASME <---- Topics of FLUIDS 2004 Mathematical Modelling in fluid mechanics Simulation in fluid mechanics Numerical methods in fluid mechanics Convection, heat and mass transfer Experimental Methodologies in fluid mechanics Thin film technologies Multiphase flow Boundary layer flow Material properties Fluid structure interaction Hydrotechnology Hydrodynamics Coastal and estuarial modelling Wave modelling Industrial applications Environmental Problems Air Pollution Problems Fluid Mechanics for Civil Engineering Fluid Mechanics in Geosciences Flow visualisation Biofluids Meteorology Waste Management Environmental protection Management of living resources Mathematical models Management of Rivers and Lakes Underwater Ecology Hydrology Oceanology Ocean Engineering Others INTERNATIONAL SCIENTIFIC COMMITTEE Andrei Fedorov (USA) A. C. Baytas (Turkey) Albert R. George (USA) Alexander I. Leontiev (Russia) Andreas Dillmann (Germany) Bruce Caswell (USA) Chris Swan (UK) David A. Caughey (USA) Derek B Ingham (UK) Donatien Njomo (CM) Dong Chen (Australia) Dong-Ryul Lee (Korea) Edward E. Anderson (USA) G. Gaiser (Germany) G.D. Raithby (Canada) Gad Hetsroni (Israel) H. Beir?o da Veiga (Italy) Ingegerd Sjfholm (Sweden) Jerry R. Dunn (USA) Joseph T. C. Liu (USA) Karl B?hler (Germany) Kenneth S. Breuer (USA) Kumar K. Tamma (USA) Kyungkeun Kang (USA) M. A. Hossain (UK) M. F. El-Amin (USA) M.-Y. Wen (Taiwan) Michiel Nijemeisland (USA) Ming-C. Chyu (USA) Naoto Tanaka (Japan) Natalia V. Medvetskaya (Russia) O. Liungman (Sweden) Philip Marcus (USA) Pradip Majumdar (USA) Rama Subba Reddy Gorla (USA) Robert Nerem (USA) Rod Sobey (UK) Ruairi Maciver (UK) S.M.Ghiaasiaan (USA) Stanley Berger (USA) Tak?o Takahashi (France) Vassilis Gekas (Sweden) Yinping Zhang (China) Yoshitaka Watanabe (Japan) NOTE THAT IN WSEAS CONFERENCES YOU CAN HAVE PROCEEDINGS 1) HARD COPY 2) CD-ROM and 3) Web Publishing WSEAS Books, Journals, Proceedings participate now in all major science citation indexes. 
ISI, ELSEVIER, CSA, AMS. Mathematical Reviews, ELP, NLG, Engineering Index Directory of Published Proceedings, INSPEC (IEE) More Details: http://www.wseas.org Thanks Alexis Espen ##### HOW TO UNSUBSCRIBE #### You receive this newsletter from your email address: beowulf at beowulf.org If you want to unsubscribe, send an email to: wseas at canada.com The Subject of your message must be exactly: REMOVE beowulf at beowulf.org WSEAS If you want to unsubscribe more than one email addresses, send a message to nata at wseas.org with Subject: REMOVE [email1, emal2, ...., emailn] WSEAS _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From michael.worsham at mci.com Tue Mar 9 13:32:14 2004 From: michael.worsham at mci.com (Michael Worsham) Date: Tue, 09 Mar 2004 13:32:14 -0500 Subject: [Beowulf] Cluster school project Message-ID: <000f01c40604$d8ef6520$987a32a6@Wcomnet.com> I would say also check out the Bootable Cluster CD (http://bccd.cs.uni.edu/) as well. It is very easy to use and was specifically designed so you could cluster an entire network lab, without having to worry about the hard drives being written to. -- Michael _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Tue Mar 9 16:13:24 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Tue, 9 Mar 2004 13:13:24 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? Message-ID: Has anyone with dual opteron machines and a kill-a-watt measured how much power they consume? I measured the dual P3 and xeons we have here, but no dual opterons yet. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Tue Mar 9 17:36:05 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Tue, 9 Mar 2004 14:36:05 -0800 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <20040309223605.GA29912@cse.ucdavis.edu> On Tue, Mar 09, 2004 at 01:13:24PM -0800, Trent Piepho wrote: > Has anyone with dual opteron machines and a kill-a-watt measured how much > power they consume? > > I measured the dual P3 and xeons we have here, but no dual opterons yet. I recently measured a Sunfire V20z (dual 2.2 GHz) opteron, I believe it had 2 scsi disks, 4 GB ram. watts VA Idle 237-249 260-281 Pstream 1 thread 260-277 290-311 Pstream 2 threads 265-280 303-313 Pstream is very much like McCalpin's stream, except it uses pthreads 2 run parallel threads in sync, and it runs over a range of array sizes. It's the most power intensive application I've found, anything with heave disk usage tends to decrease the power usage. It's also great for showing memory system parallelism, say for a dual p4 vs opteron. I also find it useful for finding misconfigured dual opterons. 
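The heart of such a test is nothing more than a McCalpin-style triad run in one pthread per CPU. A stripped-down sketch (illustrative only; far simpler than the real pstream.c linked below, and the array size, repeat count and MB/s accounting here are arbitrary choices) looks roughly like:

/* tinystream.c -- minimal pthreads memory bandwidth loop (sketch only).
 * Build: gcc -O2 -o tinystream tinystream.c -lpthread
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

#define N        (4*1024*1024)   /* doubles per array, well beyond cache */
#define REPS     20
#define NTHREADS 2               /* one per CPU */

static volatile double sink;     /* keeps the compiler from removing the loop */

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + 1e-6 * tv.tv_usec;
}

static void *triad(void *arg)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    long i, r;

    (void)arg;
    for (i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }
    for (r = 0; r < REPS; r++)            /* STREAM-style triad: a = b + s*c */
        for (i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i];
    sink = a[N/2];
    free(a); free(b); free(c);
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];
    int i;
    double t0 = now();

    for (i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, triad, NULL);
    for (i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);

    /* three arrays touched per triad iteration, 8 bytes each */
    printf("aggregate ~%.0f MB/s\n",
           3.0 * 8.0 * N * REPS * NTHREADS / (now() - t0) / 1e6);
    return 0;
}

Watching the kill-a-watt while NTHREADS goes from 1 to 2 gives the same idle/one-thread/two-thread comparison as the table above.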
For those interested: http://cse.ucdavis.edu/bill/pstream.c -- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Tue Mar 9 17:49:14 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Tue, 9 Mar 2004 14:49:14 -0800 Subject: [Beowulf] good 24 port gige switch In-Reply-To: <20040308224001.50f2f728@vitalstatistix> References: <1078566922.2547.6.camel@fermi> <20040308224001.50f2f728@vitalstatistix> Message-ID: <20040309224914.GB29912@cse.ucdavis.edu> On Mon, Mar 08, 2004 at 10:40:01PM -0600, Russell Nordquist wrote: > > thanks for all the good info. it got me to thinking....i have resources > for comparing most components of a cluster excepts network switches. it > would be nice to have a source of information for this as well. > something like: > > *bandwidth/latency between 2 hosts > *bandwidth/latency at 25%/50%/75%/100% port usage > *short vs long message comparisons I use nrelay.c a small simple program I wrote that will MPI_Send MPI_send very size packets between sets of nodes. So I do something like the following to find best base latency and bandwidth: mpirun -np 2 ./nrelay 1 # then run with 10 100 1000 10000 size = 1, 2 nodes in 2.97 sec ( 5.7 us/hop) 690 KB/sec size= 10, 524288 hops, 2 nodes in 3.06 sec ( 5.8 us/hop) 6688 KB/sec size= 100, 524288 hops, 2 nodes in 4.19 sec ( 8.0 us/hop) 48868 KB/sec size= 1000, 524288 hops, 2 nodes in 15.37 sec ( 29.3 us/hop) 133267 KB/sec size= 10000, 524288 hops, 2 nodes in 40.72 sec ( 77.7 us/hop) 502908 KB/sec So we have an interconnect that manages 5.8 us for small messages and 500 MB/sec or so for large (10000 MPI_INTs). Then I run: mpirun -np 2,4,8,16,32,64 ./nrelay 10000 size= 10000, 524288 hops, 2 nodes in 40.72 sec ( 77.7 us/hop) 502908 KB/sec size= 10000, 524288 hops, 4 nodes in 39.79 sec ( 75.9 us/hop) 514698 KB/sec size= 10000, 524288 hops, 8 nodes in 39.21 sec ( 74.8 us/hop) 522253 KB/sec size= 10000, 524288 hops, 16 nodes in 45.53 sec ( 86.8 us/hop) 449772 KB/sec size= 10000, 524288 hops, 32 nodes in 49.25 sec ( 93.9 us/hop) 415876 KB/sec size= 10000, 524288 hops, 64 nodes in 52.90 sec (100.9 us/hop) 387111 KB/sec So in this case it looks like the switch is becoming saturated. The source is at: http://cse.ucdavis.edu/bill/nrelay.c I'd love to see numbers posted for various GigE, Myrinet, Dolphin and IB configurations -- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Mar 9 19:32:49 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 9 Mar 2004 19:32:49 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Tue, 9 Mar 2004, Trent Piepho wrote: > Has anyone with dual opteron machines and a kill-a-watt measured how much > power they consume? > > I measured the dual P3 and xeons we have here, but no dual opterons yet. By strange chance yes. An astoundingly low 154 watts (IIRC -- I'm home, the kill-a-watt is at Duke -- but it was definitely ballpark of 150W) under load. That's a load average of 2, one task per processor, without testing under a variety of KINDS of load. Around 75W per loaded CPU. 
That's a bit less than the draw of an >>idle<< dual Athlon (165W). I'm actually racking six more boxes tomorrow and will recheck the draw and verify that it really is under load, but I was with Seth when I measured it and we remarked back and forth about it, really pleased, so I'm pretty sure I'm right. It has several very positive implications and seems believable. They are 1U cases (Penguin Altus 1000's) but the air coming out of the back is not that hot, really, again compared to the E-Z Bake Oven 2U 2466 dual Athlons (something like 260W under load). So we gain significantly in CPU, get access to larger memory if/when we care, get 64 bit memory bus, and drop power and cooling requirements (per CPU, but very nearly per rack U). It just don't get any better than this. I think they are 242's, FWIW. YMMV. I could be wrong, mistaken, deaf, dumb, blind, and stupid. My kill-a-watt could be on drugs. I could be on drugs. Maybe I dropped a decimal and they really draw 1500W. Perhaps the beer I spilled in my kill-a-watt confused it. I was up to 3:30 am finishing a month-late column for deadline himself (leaving me only days late on the CURRENT column) and my brain doesn't work very well any more. Caveat auditor. rgb > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Tue Mar 9 20:41:45 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Wed, 10 Mar 2004 09:41:45 +0800 (CST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040309223605.GA29912@cse.ucdavis.edu> Message-ID: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> --- Bill Broadley ??? > I recently measured a Sunfire V20z (dual 2.2 GHz) > opteron, I believe it had 2 scsi disks, 4 GB ram. > > watts VA > Idle 237-249 260-281 > Pstream 1 thread 260-277 290-311 > Pstream 2 threads 265-280 303-313 But that is with the disks, RAM, and other hardware you have. Anyone with similar configurations but have P4s instead? It just looks too good to believe the numbers... consider that the similar performance one IA64 processor ALONE draws over 120W. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Tue Mar 9 21:08:45 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Tue, 9 Mar 2004 18:08:45 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Tue, 9 Mar 2004, C J Kenneth Tan -- Heuchera Technologies wrote: > What is the power consumption that you measured for your dual P3 and > Xeons? 
System #1: Dual P3-500 Katmai, BX motherbaord, 512 MB PC100 ECC RAM, two tulip NICs, cheap graphics card, 5400 RPM IDE drive, floppy drive, one case fan, and a normal 250W ATX PS with a fan: System #2: Nearly the same as system #1 more or less, but with dual P3-850 Coppermines and no case fan. System #3: Dual Xeon 2.4 GHz 533FSB, E7501 chipset, 1 GB PC2100 ECC memory, two 3Ware 8506-8 cards, a firewire card, onboard intel GB and FE, one Maxtor 6Y200P0 drive, 6 high speed case fans (rated 4.44W each), floppy drive, CD-ROM drive, 550W PS with power factor correction (rated minimum 63% efficient), SATA backplane, and 16 Maxtor 6Y200M0 SATA drives (rated 7.4W idle each) in hotswap carriers. I measured system #3 with the SATA drives both installed and removed. Unfortunately I don't have a dual Xeon with minimal extra hardware to test. #1 Idle 42W 72 VA (.58 PF) #1 Loaded 103W 157 VA (.66 PF) #2 Idle 39W 67 VA (.58 PF) #2 Loaded 96W 148 VA (.65 PF) #3 Idle w/o RAID 162W 168 VA (.96 PF) #3 Loaded w/o RAID 283W 289 VA (.98 PF) #3 Idle w/ RAID 375W (stays at .98) #3 Loaded w/ RAID 510W (stays at .98) #3 Loaded w/RAID/bonnie 534W (stays at .98) For the load, I used two processes of burnP6, part of cpuburn at http://users.ev1.net/~redelm/ For a load breakdown by load type for system 1: 1 process 2 processes burnP5 65W burnP6 72WA 103W (exactly 30W per CPU over idle) burnMMX 64W burnK6 69W burnK7 67W burnBX 87W 90W stream 84W 85W The stream and burnBX memory loaders use more power than a single CPU load program, but two at once and the CPU loaders use more power. To load system #3 with the disks on, I ran bonnie++ on all 16 drives. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Mar 10 00:48:45 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 00:48:45 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> Message-ID: > > I recently measured a Sunfire V20z (dual 2.2 GHz) > > opteron, I believe it had 2 scsi disks, 4 GB ram. > > > > watts VA > > Idle 237-249 260-281 > > Pstream 1 thread 260-277 290-311 > > Pstream 2 threads 265-280 303-313 that's about right. my dual 240's peak at about 250 running two copies of stream and one bonnie (2GB, 40G 7200rpm IDE). > But that is with the disks, RAM, and other hardware > you have. nothing else counts for much. for instance, dimms are a couple watts apiece (makes you wonder about the heatspreaders that gamers/overclockers love so much), nics and disks are ~10W, etc. > Anyone with similar configurations but have > P4s instead? iirc my dual xeon/2.4's peak at around 190W (1-2GB, otherwise same). > It just looks too good to believe the numbers... > consider that the similar performance one IA64 > processor ALONE draws over 120W. hey, to marketing planners, massive power dissipation is probably a *good* thing. 
serious "enterprise" computers must have an impressive dissipation to set them apart from those piddly little game/surfing boxes ;) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From burcu at ulakbim.gov.tr Wed Mar 10 02:30:47 2004 From: burcu at ulakbim.gov.tr (Burcu Akcan) Date: Wed, 10 Mar 2004 09:30:47 +0200 Subject: [Beowulf] SPBS problem Message-ID: <404EC427.7070200@ulakbim.gov.tr> Hi, We have built a beowulf Debian cluster that contains 128 PIV nodes and one dual xeon server. I need some help about SPBS (Storm). We have already installed SPBS on the server and nodes and all daemons seem to work regularly. When any job is given to the system by using pbs scripting, the job can be seen on defined queue by running status and related nodes are allocated for the job. On the other hand there is no cpu or memory consumption on the nodes, the job does not run exactly and at the end of estimated cpu time there is no output file. Can anyone give some advice on my problem about SPBS. Thank you... Burcu Akcan ULAKBIM High Performance Computing Center _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Wed Mar 10 09:56:49 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Wed, 10 Mar 2004 06:56:49 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> Message-ID: On Wed, 10 Mar 2004, [big5] Andrew Wang wrote: > --- Bill Broadley > > I recently measured a Sunfire V20z (dual 2.2 GHz) > > opteron, I believe it had 2 scsi disks, 4 GB ram. > > > > watts VA > > Idle 237-249 260-281 > > Pstream 1 thread 260-277 290-311 > > Pstream 2 threads 265-280 303-313 > > It just looks too good to believe the numbers... > consider that the similar performance one IA64 > processor ALONE draws over 120W. You also have to consider that the typical computer power supply is only around 60% to 80% efficient. If the CPU draws 120W, then that's going to be something like 150 to 200 watts measured with a power meter, and really, that's what matters. It makes no difference to the AC and circuit breakers if the power is dissipated in the CPU or in the power supply. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Mar 10 10:14:18 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 10 Mar 2004 07:14:18 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: <20040310151418.51414.qmail@web11413.mail.yahoo.com> But the Itanium 2 is using so much energy that Intel couldn't rise the frequency... or else the machine would melt :( See the online lecture: "Things CPU Architects Need To Think About" http://www.stanford.edu/class/ee380/ BTW, that guy used to work for Intel, and he also mentioned about the compiler guys tuned the IA-64 compiler for the benchmarks... Rayson > hey, to marketing planners, massive power dissipation is > probably a *good* thing. 
serious "enterprise" computers must > have an impressive dissipation to set them apart from those > piddly little game/surfing boxes ;) > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Mar 10 10:13:58 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 10 Mar 2004 07:13:58 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: <20040310151358.43826.qmail@web11407.mail.yahoo.com> But the Itanium 2 is using so much energy that Intel couldn't rise the frequency... or else the machine would melt :( See the online lecture: "Things CPU Architects Need To Think About" http://www.stanford.edu/class/ee380/ BTW, that guy used to work for Intel, and he also mentioned about the compiler guys tuned the IA-64 compiler for the benchmarks... Rayson > hey, to marketing planners, massive power dissipation is > probably a *good* thing. serious "enterprise" computers must > have an impressive dissipation to set them apart from those > piddly little game/surfing boxes ;) > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgoornaden at intnet.mu Wed Mar 10 11:42:39 2004 From: rgoornaden at intnet.mu (roudy) Date: Wed, 10 Mar 2004 20:42:39 +0400 Subject: [Beowulf] Writing a parallel program References: <200403101448.i2AEmIA22804@NewBlue.scyld.com> Message-ID: <003701c406bf$085f25b0$590b7bca@roudy> Hello everybody, I completed to build my beowulf cluster. Now I am writing a parallel program using MPICH2. Can someone give me a help. Because, the program that I wrote take more time to run on several nodes compare when it is run on one node. If there is a small program that someone can send me about distributing data among nodes, then each node process the data, and the information is sent back to the master node for printing. This will be a real help for me. Thanks Roud _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Mar 10 12:28:54 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 10 Mar 2004 12:28:54 -0500 (EST) Subject: [Beowulf] Writing a parallel program In-Reply-To: <003701c406bf$085f25b0$590b7bca@roudy> Message-ID: On Wed, 10 Mar 2004, roudy wrote: > Hello everybody, > I completed to build my beowulf cluster. Now I am writing a parallel program > using MPICH2. Can someone give me a help. 
Because, the program that I wrote > take more time to run on several nodes compare when it is run on one node. > If there is a small program that someone can send me about distributing data > among nodes, then each node process the data, and the information is sent > back to the master node for printing. This will be a real help for me. > Thanks > Roud I can't help you much with MPI but I can help you understand the problems you might encounter with ANY message passing system or library in terms of parallel task scaling. There is a ready-to-run PVM program I just posted in tarball form on my personal website that will be featured in the May issue of Cluster World Magazine. http:www.phy.duke.edu/~rgb/General/random_pvm.php It is designed to give you direct control over the most important parameters that affect task scaling so that you can learn just how it works. The task itself consists of a "master" program and a "slave" program. The master parses several parameters from the command line: -n number of slaves -d delay (to vary the amount of simulated work per communication) -r number of rands (to vary the number of communications per run and work burdent per slave) -b a flag to control whether the slaves send back EACH number as it is generated (lots of small messags) or "bundles" all the numbers they generate into a single message. This makes a visible, rather huge difference in task scaling, as it should. The task itself is trivial -- generating random numbers. The master starts by computing a trivial task partitioning among the n nodes. It spawns n slave tasks, sending each one the delay on the command line. It then sends each slave the number of rands to generate and a trivially unique seed as messages. Each slave generates a rand, waits delay (in nanoseconds, with a high-precision polling loop), and either sends it back as a message immediately (the default) or saves it in a large vector until the task is finished and sends the whole buffer as a single message (if the -b flag was set). This serves two valuable purposes for the novice. First, it gives you a ready-to-build working master/slave program to use as a template for a pretty much any problem for which the paradigm is a good fit. Second, by simply playing with it, you can learn LOTS of things about parallel programs and clusters. If delay is small (order of the packet latency, 100 usec or less) the program is in a latency dominated scaling regime where communications per number actually takes longer than generating the numbers and its parallel scaling is lousy (if slowing a task down relative to serial can be called merely lousy). If delay is large, so that it takes a long time to compute and a short time to send back the results, parallel scaling is excellent with near linear speedup. Turning on the -b flag for certain ranges of the delay can "instantly" shift one from latency bounded to bandwidth bounded parallel scaling regimes, and restore decent scaling. Even if you don't use it because it is based on PVM, if you clone it for MPI you'll learn the same lessons there, as they are universal and part of the theoretical basis for understanding parallel scaling. Eventually I'll do an MPI version myself for the column, but the mag HAS an MPI column and my focus would be more for the novice learning about parallel computing in general. BTW, obviously I think that subscribing to CWM is a good idea for novices. Among its many other virtues (such as articles by lots of the luminaries of this vary list:-), you can read my columns. 
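(For the MPI side of the original question, the same master/slave shape can be sketched in a few dozen lines. This is purely illustrative, NOT random_pvm or a port of it; the rand count, message tags and the single bundled reply are assumptions chosen just to show the pattern:

/* mpi_randslave.c -- illustrative MPI master/worker skeleton, loosely
 * analogous to (but not the same program as) the PVM example above.
 * Build: mpicc -O2 -o mpi_randslave mpi_randslave.c
 * Run:   mpirun -np 5 ./mpi_randslave
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NRANDS 100000                     /* rands generated per worker */

int main(int argc, char **argv)
{
    int rank, size, i;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                      /* master: seed workers, collect */
        double *buf = malloc(NRANDS * sizeof(double));
        int src, seed;
        MPI_Status st;
        for (src = 1; src < size; src++) {
            seed = 12345 + src;           /* trivially unique seed per worker */
            MPI_Send(&seed, 1, MPI_INT, src, 0, MPI_COMM_WORLD);
        }
        for (src = 1; src < size; src++) {
            MPI_Recv(buf, NRANDS, MPI_DOUBLE, MPI_ANY_SOURCE, 1,
                     MPI_COMM_WORLD, &st);
            printf("worker %d: first rand %f\n", st.MPI_SOURCE, buf[0]);
        }
        free(buf);
    } else {                              /* worker: compute, reply once */
        int seed;
        double *buf = malloc(NRANDS * sizeof(double));
        MPI_Status st;
        MPI_Recv(&seed, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &st);
        srand(seed);
        for (i = 0; i < NRANDS; i++)
            buf[i] = rand() / (double)RAND_MAX;
        /* one big bundled send amortizes the per-message latency -- the
         * "-b" lesson described above */
        MPI_Send(buf, NRANDS, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD);
        free(buf);
    }
    MPI_Finalize();
    return 0;
}

Timing the bundled reply against a loop of NRANDS one-double sends reproduces the latency-bounded versus bandwidth-bounded regimes discussed above.)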
In fact, from what I've seen from the first few issues, ALL the columns are pretty damn good and getting back issues to the beginning wouldn't hurt, if it is still possible. If you (or anybody) DO grab random_pvm and give it a try, please send me feedback, preferrably before the actual column comes out in May, so that I can fix it before then. It is moderately well documented in the tarball, but of course there is more "documentation" and explanation in the column itself. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From hahn at physics.mcmaster.ca Wed Mar 10 12:07:10 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 12:07:10 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310151358.43826.qmail@web11407.mail.yahoo.com> Message-ID: > See the online lecture: "Things CPU Architects Need To Think About" > http://www.stanford.edu/class/ee380/ does anyone have a lead on an open-source player for these .asx files? or at least something not tied to windows? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sp at scali.com Wed Mar 10 13:41:59 2004 From: sp at scali.com (Steffen Persvold) Date: Wed, 10 Mar 2004 19:41:59 +0100 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <404F6177.8050108@scali.com> Mark Hahn wrote: >>See the online lecture: "Things CPU Architects Need To Think About" >>http://www.stanford.edu/class/ee380/ > > > does anyone have a lead on an open-source player for these .asx files? > or at least something not tied to windows? > The .asx file is just a link to a .wmv (Windows Media) file, which again just contains a streaming media reference.
I haven't tried, but I think you could use mplayer to play them : http://www.mplayerhq.hu Best regards, Steffen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Mar 10 16:11:07 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 16:11:07 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <404F7BE0.6040900@nada.kth.se> Message-ID: > Seems to be running fine with xine. wow, you're right! thanks... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Mar 10 18:56:06 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 10 Mar 2004 18:56:06 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Wed, 10 Mar 2004, Mark Hahn wrote: > > Seems to be running fine with xine. > > wow, you're right! thanks... (sorry to jump back on the thread this way, but it is easier than scrolling back through mail to find the original:-) I went downstairs again today and really paid attention to the kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 (I don't know why but they are running three jobs instead of two at the moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over 120 V line voltage). This seems lower than a lot of the other numbers being reported (although it is a bit higher than my memory recalled yesterday -- I TOLD you not to trust me:-). It is still considerably better than a dual Athlon at much higher clock as well. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ddw at dreamscape.com Wed Mar 10 20:36:13 2004 From: ddw at dreamscape.com (Daniel Williams) Date: Wed, 10 Mar 2004 20:36:13 -0500 Subject: [Beowulf] Cluster school project References: <200403101446.i2AEknA22660@NewBlue.scyld.com> Message-ID: <404FC28A.7607EF77@dreamscape.com> > From: "Maikel Punie" > Subject: RE: [Beowulf] Cluster school project > Date: Tue, 9 Mar 2004 18:45:47 +0100 > [snip...] > >>Do you mean a computing/programming project could you do, >>like calculating pi to some large number of digits? > >yeah something like that, i realy have no idea what is possible. >if there are any suggestions, they are always welcome. Here's what I want to do once I get enough junk 500mhz machines together: Make a model of the spread of genetic diseases in a population of a few hundred million. I've been wanting to do that for years, but it would probably take a few months to run on any single machine I own. I figure it should run in a few weeks as soon as I get a 16 node cluster together to run it. Is that something you could maybe use? 
DDW _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Thu Mar 11 04:56:24 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Thu, 11 Mar 2004 09:56:24 +0000 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> On Wednesday 10 March 2004 11:56 pm, Robert G. Brown wrote: > > I went downstairs again today and really paid attention to the > kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 > (I don't know why but they are running three jobs instead of two at the > moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over > 120 V line voltage). > > This seems lower than a lot of the other numbers being reported > (although it is a bit higher than my memory recalled yesterday -- I TOLD > you not to trust me:-). It is still considerably better than a dual > Athlon at much higher clock as well. > > rgb I find you numbers a bit surprising still As part of our latest procurement I looked up the power consumption in the INTEL/AMD documention for the various processors under consideration: Athlon model 6 2200MP 58.9 W model 8 2400MP 54.5 W model 11 2800MP (Barton) 47.2 W Opteron 240-244 82.1 W 246-248 89.0 W Xeon 2.8 GHz 77 W (512K Cache) 3.06 GHz 87 W I think these numbers are meant to be maximum? -- ------------------------------------------------------------------------------ | | | Dr. Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Thu Mar 11 07:47:57 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Thu, 11 Mar 2004 12:47:57 +0000 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <200403111247.i2BClv215026@heppcb.ph.qmw.ac.uk> On Thursday 11 March 2004 12:35 pm, Bogdan Costescu wrote: > On Thu, 11 Mar 2004, Alex Martin wrote: > > I find you numbers a bit surprising still > > I don't :-) I was suprised that rgb's opteron numbers were so low! > While I can't remember what was the exact figure for the dual Opteron > 246 (2 GHz) system, I'm sure that it was over 200W. > > > Athlon model 11 2800MP (Barton) 47.2 W > > dual Athlon 2800MP (2133MHz) under load from 2 cpuburn ~ 230W > > > Xeon (512K Cache) 3.06 GHz 87 W > > dual Xeon 3.06GHz under load from 2 cpuburn ~ 275W your system numbers are pretty consistent with what I've measured. ( ~230 W for Athlon 2200MP and ~250W for Xeon 2.8GHz ) -- ------------------------------------------------------------------------------ | | | Dr. 
Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bogdan.costescu at iwr.uni-heidelberg.de Thu Mar 11 07:35:30 2004 From: bogdan.costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Thu, 11 Mar 2004 13:35:30 +0100 (CET) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: On Thu, 11 Mar 2004, Alex Martin wrote: > I find you numbers a bit surprising still I don't :-) While I can't remember what was the exact figure for the dual Opteron 246 (2 GHz) system, I'm sure that it was over 200W. > Athlon model 11 2800MP (Barton) 47.2 W dual Athlon 2800MP (2133MHz) under load from 2 cpuburn ~ 230W > Xeon (512K Cache) 3.06 GHz 87 W dual Xeon 3.06GHz under load from 2 cpuburn ~ 275W -- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Mar 11 08:39:02 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 11 Mar 2004 08:39:02 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: On Thu, 11 Mar 2004, Alex Martin wrote: > I find you numbers a bit surprising still As part of our latest procurement > I looked up the power consumption in the INTEL/AMD documention for the > various processors under consideration: ... > Opteron 240-244 82.1 W > 246-248 89.0 W > I think these numbers are meant to be maximum? You've got me -- dunno. I can post a digital photo of the kill-a-watt reading if you like (I was going to take a camera down there anyway to add a new rack photo to the brahma tour). I can also take the kill-a-watt and plug in an electric light bulb or something with a fairly predictable draw and see if it is broken somehow. Right now a system in production work is plugged into it -- I'll try to retrieve it soon and plug one of my new systems into it so that I can run more detailed tests under more controlled loads. I don't know exactly what kind of work is being done in the current jobs being run. One advantage may be that the cases are apparently equipped with a PFC power supply. The power factor appears to be very good -- close to 1. This may make the power supplies themselves run cooler, so that the power draw of the rest of the system IS only 20 or so more watts. The systems also have a bare minimum of peripherals -- a hard disk (sitting idle), onboard dual gig NICs (one idle) and video (idle). Will post newer/better tests as I have time and make them, although others may beat me to it...;-) rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu Mar 11 11:10:16 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 11 Mar 2004 08:10:16 -0800 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: <5.2.0.9.2.20040311080304.017d8008@mailhost4.jpl.nasa.gov> At 08:39 AM 3/11/2004 -0500, Robert G. Brown wrote: >On Thu, 11 Mar 2004, Alex Martin wrote: > > > I find you numbers a bit surprising still As part of our latest > procurement > > I looked up the power consumption in the INTEL/AMD documention for the > > various processors under consideration: >... > > Opteron 240-244 82.1 W > > 246-248 89.0 W > > I think these numbers are meant to be maximum? > >You've got me -- dunno. I can post a digital photo of the kill-a-watt >reading if you like (I was going to take a camera down there anyway to >add a new rack photo to the brahma tour). I can also take the >kill-a-watt and plug in an electric light bulb or something with a >fairly predictable draw and see if it is broken somehow. > >Right now a system in production work is plugged into it -- I'll try to >retrieve it soon and plug one of my new systems into it so that I can >run more detailed tests under more controlled loads. I don't know >exactly what kind of work is being done in the current jobs being run. > >One advantage may be that the cases are apparently equipped with a PFC >power supply. The power factor appears to be very good -- close to 1. >This may make the power supplies themselves run cooler, so that the >power draw of the rest of the system IS only 20 or so more watts. The >systems also have a bare minimum of peripherals -- a hard disk (sitting >idle), onboard dual gig NICs (one idle) and video (idle). Those power supplies are impressive PFC wise.. I'd venture to say, though, that the rated powers are peak over some fairly short time. The Kill-A-Watt averages over some reasonable time (a second or two?), so you could actually have an average that's half the peak. Everytime there's a pipeline stall, or a cache miss, etc, the current's going to change. We used processor current to debug DSP code, because you could actually see interrupts come in during the other steps(FFT = very high power, sudden drop for a few microseconds while ISR is running). You could also accurately time how long each "pass" in the FFT took, since the CPU power dropped while setting up the parameters for the next set of butterflies. To really track this kind of thing down, you'd want to hook a DC current probe around the wires from the Power supply to the motherboard. Then, write some benchmark program with a fairly repeatable computational resource requirement pattern. Look at the current on an oscilloscope. I suspect that onboard filtering will get rid of variations that last less than, say, 1-10 mSec, so a program that has a basic cyclical nature lasting 10 times that would be nice. Ideally, you'd probe the current going to the CPU, vs the rest of the mobo, but that's probably a bit of a challenge. Another experiment would be to write a small program that you KNOW will stay in cache and never go off chip and measure the current draw when running it. James Lux, P.E. 
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tegner at nada.kth.se Wed Mar 10 15:34:40 2004 From: tegner at nada.kth.se (Jon Tegner) Date: Wed, 10 Mar 2004 21:34:40 +0100 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <404F7BE0.6040900@nada.kth.se> Seems to be running fine with xine. /jon Mark Hahn wrote: >>See the online lecture: "Things CPU Architects Need To Think About" >>http://www.stanford.edu/class/ee380/ >> >> > >does anyone have a lead on an open-source player for these .asx files? >or at least something not tied to windows? > > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jimlux at earthlink.net Thu Mar 11 09:07:09 2004 From: jimlux at earthlink.net (Jim Lux) Date: Thu, 11 Mar 2004 06:07:09 -0800 Subject: [Beowulf] Power consumption for opterons? References: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: <000e01c40772$2611bf60$36a8a8c0@LAPTOP152422> ----- Original Message ----- From: "Alex Martin" To: "Robert G. Brown" ; "Mark Hahn" Cc: "Jon Tegner" ; Sent: Thursday, March 11, 2004 1:56 AM Subject: Re: [Beowulf] Power consumption for opterons? > On Wednesday 10 March 2004 11:56 pm, Robert G. Brown wrote: > > > > > I went downstairs again today and really paid attention to the > > kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 > > (I don't know why but they are running three jobs instead of two at the > > moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over > > 120 V line voltage). > > I find you numbers a bit surprising still As part of our latest procurement > I looked up the power consumption in the INTEL/AMD documention for the > various processors under consideration: > surprising high or surprising low? You're comparing DC power to just the processor vs wall plug power to the whole system (including cooling fans, RAM, PCI bridge chips, etc.) I think that the databook numbers of ca 50-80 W per CPU (probably the highest continuous average power) is nicely matched with 180 W from the wall for a dual CPU... The databook number is probably a bit on the high side... 180W from the wall probably equates to about 140W DC. 
There's probably 10W or so in fans and glue, maybe 100W for both processors, and 30W for the rest of the logic and RAM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mathiasbrito at yahoo.com.br Fri Mar 12 08:51:22 2004 From: mathiasbrito at yahoo.com.br (=?iso-8859-1?q?Mathias=20Brito?=) Date: Fri, 12 Mar 2004 10:51:22 -0300 (ART) Subject: [Beowulf] Strange Behavior Message-ID: <20040312135122.92643.qmail@web12208.mail.yahoo.com> Hi, I'm benchmarking my 16-node cluster with HPL and I get a strange result, different from anything I have seen before. When I run a larger problem with a big N, the performance is worse than with small values of N. I used N=5000 with NB=20 and the performance was 3.3 Gflops; when I run N=10000 with NB=20 I get only 2.1 Gflops. I don't like the result; the nodes are Athlon XP 1600+ with 512MB RAM, and I think the cluster is very slow. Has someone had the same problem and could help me? Mathias ===== Mathias Brito Universidade Estadual de Santa Cruz - UESC Departamento de Ciências Exatas e Tecnológicas Estudante do Curso de Ciência da Computação ______________________________________________________________________ Yahoo! Mail - O melhor e-mail do Brasil! Abra sua conta agora: http://br.yahoo.com/info/mail.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Fri Mar 12 11:43:47 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: Fri, 12 Mar 2004 16:43:47 +0000 Subject: [Beowulf] Strange Behavior In-Reply-To: <20040312135122.92643.qmail@web12208.mail.yahoo.com> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> Message-ID: <1079109827.3745.7.camel@tp1.mesh-hq> On Fri, 2004-03-12 at 13:51, Mathias Brito wrote: > I'm benchmarking my 16-node cluster with HPL and I > get a strange result, different from anything I have seen > before. When I run a larger problem with a big N, the > performance is worse than with small values of N. I > used N=5000 with NB=20 and the performance was 3.3 Gflops; > when I run N=10000 with NB=20 I get only 2.1 Gflops. I > don't like the result; the nodes are Athlon XP 1600+ > with 512MB RAM, and I think the cluster is very slow. > Has someone had the same problem and could help me? Please correct me, anybody, if I'm wrong: it seems to me that the best results are achieved with approximately 85-90% memory utilization (leaving something for the rest of the system). (0.85*16*512*1024*1024/8)^0.5 ~= 30200, so that would be close to the best N value. Isn't NB=20 very low? I currently use around 145 for P4 CPUs. What performance do you get from a setup like the one above?
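The same rule of thumb is easy to keep as a tiny helper (a sketch only; the 0.85 is the 85% memory fraction assumed above, and the program name and arguments are made up):

/* hpl_n.c -- rough HPL problem size: use ~85% of total cluster RAM.
 * Build: gcc -O2 -o hpl_n hpl_n.c -lm
 * Use:   ./hpl_n <nodes> <MB of RAM per node>
 */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    double nodes, mb, bytes, n;

    if (argc != 3) {
        fprintf(stderr, "usage: %s nodes MB_per_node\n", argv[0]);
        return 1;
    }
    nodes = atof(argv[1]);
    mb    = atof(argv[2]);
    bytes = 0.85 * nodes * mb * 1024 * 1024;  /* memory available to HPL */
    n     = sqrt(bytes / 8.0);                /* N*N doubles, 8 bytes each */
    printf("suggested N ~ %.0f\n", n);
    return 0;
}

./hpl_n 16 512 prints an N of roughly 30200, the figure above.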
best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Graabroedrestraede 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From M.Arndt at science-computing.de Fri Mar 12 07:06:36 2004 From: M.Arndt at science-computing.de (Michael Arndt) Date: Fri, 12 Mar 2004 13:06:36 +0100 Subject: [Beowulf] Cluster Uplink via Wireless Message-ID: <20040312130636.D49119@blnsrv1.science-computing.de> Hello * has anyone done a wireless technology uplink to a compute cluster that is in real use ? If so, i would be interested to know how and how is the experinece in transferring "greater" (e.g. 2 GB ++ ) Result files? explanation: We have a cluster with gigabit interconnect where it would make life cheaper, if there is a possibility to upload input data and download output data via wireless link, since connecting twisted pair between WS and CLuster would be expensive. TIA Micha _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Fri Mar 12 17:22:58 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Fri, 12 Mar 2004 14:22:58 -0800 Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <20040312130636.D49119@blnsrv1.science-computing.de> Message-ID: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> At 01:06 PM 3/12/2004 +0100, Michael Arndt wrote: >Hello * > >has anyone done a wireless technology uplink to a compute cluster >that is in real use ? >If so, i would be interested to know how and how is the experinece in >transferring "greater" (e.g. 2 GB ++ ) Result files? > >explanation: >We have a cluster with gigabit interconnect >where it would make life cheaper, if there is a possibility to upload >input data and download output data via wireless link, since connecting >twisted pair between WS and CLuster would be expensive. > I have a very small cluster that is using wireless interconnect for everything, and based upon my early observations, I'd be real, real leery of contemplating transferring Gigabytes in any practical time. For instance, loading a 25 MB compressed ram file system using tftp during PXE boot takes about a minute. This is on a very non-optimized configuration using 802.11a, through a variety of devices. Yes, indeed, the ad literature claims 54 Mbps, but that's not the actual data rate, but more the "bit rate" of the over the air signal. Wireless LANs are NOT full duplex, and there are synchronization preambles, etc. that make the throughput much lower. On a standard "11 Mbps" 802.11b type network, the "real data throughput" in a unidirectional transfer is probably closer to 3-5 Mbps. Say you get that wireless link really humming at 20 Mbps real data rate. Transferring 16,000 Mbit is still going to take 10-15 minutes. Your situation might be a bit better, especially if you can use a point to point wireless link. James Lux, P.E. 
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Fri Mar 12 19:29:41 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Fri, 12 Mar 2004 16:29:41 -0800 Subject: [Beowulf] Cluster Uplink via Wireless References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> Message-ID: <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> At 06:04 PM 3/12/2004 -0500, Mark Hahn wrote: > > Say you get that wireless link really humming at 20 Mbps real data > > rate. Transferring 16,000 Mbit is still going to take 10-15 minutes. > >out of truely morbid curiosity, what's the latency like? I'll have some numbers next week. The configuration is sort of weird.. diskless node booting w/PXE D-Link Wireless AP in multi AP connect mode over the air D-Link wireless AP in multi AP connect mode network w/NFS and DHCP server The D-Link boxes try to be smart and not push packets across the air link that are for MACs they know are on the wired side, and that whole process is "tricky"... James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From clwang at csis.hku.hk Fri Mar 12 21:29:43 2004 From: clwang at csis.hku.hk (Cho Li Wang) Date: Sat, 13 Mar 2004 10:29:43 +0800 Subject: [Beowulf] NPC2004 CFP : Deadline Extended to March 22, 2004 Message-ID: <40527217.92D67387@csis.hku.hk> ******************************************************************* NPC2004 IFIP International Conference on Network and Parallel Computing October 18-20, 2004 Wuhan, China http://grid.hust.edu.cn/npc04 ------------------------------------------------------------------- Important Dates Paper Submission March 22, 2004 (extended) Author Notification May 1, 2004 Final Camera Ready Manuscript June 1, 2004 ******************************************************************* Call For Papers The goal of IFIP International Conference on Network and Parallel Computing (NPC 2004) is to establish an international forum for engineers and scientists to present their excellent ideas and experiences in all system fields of network and parallel computing. NPC 2004, hosted by the Huazhong University of Science and Technology, will be held in the city of Wuhan, China - the "Homeland of White Clouds and the Yellow Crane." Topics of interest include, but are not limited to: - Parallel & Distributed Architectures - Parallel & Distributed Applications/Algorithms - Parallel Programming Environments & Tools - Network & Interconnect Architecture - Network Security - Network Storage - Advanced Web and Proxy Services - Middleware Frameworks & Toolkits - Cluster and Grid Computing - Ubiquitous Computing - Peer-to-peer Computing - Multimedia Streaming Services - Performance Modeling & Evaluation Submitted papers may not have appeared in or be considered for another conference. Papers must be written in English and must be in PDF format. 
Detailed electronic submission instructions will be posted on the conference web site. The conference proceedings will be published by Springer Verlag in the Lecture Notes in Computer Science Series (cited by SCI). Best papers from NPC 2004 will be published in a special issue of International Journal of High Performance Computing and Networking (IJHPCN) after conference. ************************************************************************** Committee General Co-Chairs: H. J. Siegel Colorado State University, USA Guojie Li The Institute of Computing Technology, CAS, China Steering Committee Chair: Kemal Ebcioglu IBM T.J. Watson Research Center, USA Program Co-Chairs: Guangrong Gao University of Delaware, USA Zhiwei Xu Chinese Academy of Sciences, China Program Vice-Chairs: Victor K. Prasanna University of Southern California, USA Albert Y. Zomaya University of Sydney, Australia Hai Jin Huazhong University of Science and Technology, China Publicity Co-Chairs: Cho-Li Wang The University of Hong Kong, Hong Kong Chris Jesshope The University of Hull, UK Local Arrangement Chair: Song Wu Huazhong University of Science and Technology, China Steering Committee Members: Jack Dongarra University of Tennessee, USA Guangrong Gao University of Delaware, USA Jean-Luc Gaudiot University of California, Irvine, USA Guojie Li The Institute of Computing Technology, CAS, China Yoichi Muraoka Waseda University, Japan Daniel Reed University of North Carolina, USA Program Committee Members: Ishfaq Ahmad University of Texas at Arlington, USA Shoukat Ali University of Missouri-Rolla, USA Makoto Amamiya Kyushu University, Japan David Bader University of New Mexico, USA Luc Bouge IRISA/ENS Cachan, France Pascal Bouvry University of Luxembourg, Luxembourg Ralph Castain Los Alamos National Laboratory, USA Guoliang Chen University of Science and Technology of China, China Alain Darte CNRS, ENS-Lyon, France Chen Ding University of Rochester, USA Jianping Fan Institute of Computing Technology, CAS, China Xiaobing Feng Institute of Computing Technology, CAS, China Jean-Luc Gaudiot University of California, Irvine, USA Minyi Guo University of Aizu, Japan Mary Hall University of Southern California, USA Salim Hariri University of Arizona, USA Kai Hwang University of Southern California, USA Anura Jayasumana Colorado State Univeristy, USA Chris R. Jesshop The University of Hull, UK Ricky Kwok The University of Hong Kong, Hong Kong Francis Lau The University of Hong Kong, Hong Kong Chuang Lin Tsinghua University, China John Morrison University College Cork, Ireland Lionel Ni Hong Kong University of Science and Technology, Hong Kong Stephan Olariu Old Dominion University, USA Yi Pan Georgia State University, USA Depei Qian Xi'an Jiaotong University, China Daniel A. Reed University of North Carolina at Chapel Hill, USA Jose Rolim University of Geneva, Switzerland Arnold Rosenberg University of Massachusetts at Amherst, USA Sartaj Sahni University of Florida, USA Selvakennedy Selvadurai University of Sydney, Australia Franciszek Seredynski Polish Academy of Sciences, Poland Hong Shen Japan Advanced Institute of Science and Technology, Japan Xiaowei Shen IBM T. J. Watson Research Center, USA Gabby Silberman IBM Centers for Advanced Studies, USA Per Stenstrom Chalmers University of Technology, Sweden Ivan Stojmenovic University of Ottawa, Canada Ninghui Sun Institute of Computing Technology, CAS, China El-Ghazali Talbi University of Lille, France Domenico Talia University of Calabria, Italy Mitchell D. 
Theys University of Illinois at Chicago, USA Xinmin Tian Intel Corporation, USA Dean Tullsen University of California, San Diego, USA Cho-Li Wang The University of Hong Kong, Hong Kong Qing Yang University of Rhode Island, USA Yuanyuan Yang State University of New York at Stony Brook, USA Xiaodong Zhang College of William and Mary, USA Weimin Zheng Tsinghua University, China Bingbing Zhou University of Sydney, Australia Chuanqi Zhu Fudan University, China ------------------------------------------------------------------------ For more information, please contact the program vice-chair at the address below: Dr. Hai Jin, Professor Director, Cluster and Grid Computing Lab Vice-Dean, School of Computer Huazhong University of Science and Technology Wuhan, 430074, China Tel: +86-27-87543529 Fax: +86-27-87557354 e-fax: +1-425-920-8937 e-mail: hjin at hust.edu.cn _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mayank_kaushik at vsnl.net Sat Mar 13 05:24:13 2004 From: mayank_kaushik at vsnl.net (mayank_kaushik at vsnl.net) Date: Sat, 13 Mar 2004 15:24:13 +0500 Subject: [Beowulf] Benchmarking with PVM Message-ID: <74070c77404a91.7404a9174070c7@vsnl.net> hi everyone, first of all, I'd like to thank Robert G. Brown for his help in solving my PVM problem and getting my cluster running! Now that it's running, I've been trying to run tests on it to see how fast it really is, so I ran PVMPOV, and the results were pretty impressive: I had two P4s clustered, and the rendering time was reduced by half. That may sound trivial to you guys, but to a first-timer like me, it looks great! :-) Okay, so here's the deal: we've got lots of idle computers in the college computer lab, an eclectic mix of P2 350s and P3 733s, which everyone has abandoned in favour of flashy new Compaq Evo P4 2.4 GHz machines, so along comes me the evangelist and turns all the outcasts into cluster nodes (we've got a gigabit LAN too). Now I'd like to run benchmarking tests on the cluster so as to outline the increase in performance as individual nodes are added, and also the increase in the load on the network. Are there tools available that would let me do all this and, say, get graphs etc. too? Tools that are compatible with PVM? Could anyone provide links to places where they can be downloaded? (I'm running Red Hat 9.0 on all systems.) Thanks in anticipation, Mayank PS. Those proud Compaq Evos are giving me trouble: they've got WinXP with an NTFS filesystem, and I'm trying to use Partition Magic to make a partition so that I can set up a dual-boot system and install Linux, but Partition Magic always exits with an error on all the systems. fips won't work with NTFS. Has anyone ever done this? The quick-restore CD says it would remove all partitions and make just one NTFS partition, so I didn't try that.
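Back on the benchmarking question: one low-tech way to get time-versus-nodes numbers for a plot is to time a fixed workload from a PVM master and rerun it with 1, 2, 3, ... spawned workers, feeding the printed times to gnuplot. The following is only a sketch under stated assumptions: it presumes a separate worker binary (here called "worker", a hypothetical name) that unpacks one int of work with message tag 1, computes, and sends back one double with tag 2.

/* timing_master.c - minimal PVM timing master (a sketch, not a tool).
 * Assumes a separate PVM worker binary named "worker" (hypothetical):
 * it should unpack one int (tag 1), do a fixed amount of work, and
 * send back one double (tag 2).
 * Typical build: cc timing_master.c -o timing_master -lpvm3
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include "pvm3.h"

#define MAXWORKERS 64

int main(int argc, char **argv)
{
    int nworkers = (argc > 1) ? atoi(argv[1]) : 2;
    int tids[MAXWORKERS];
    int work = 1000000;                 /* arbitrary fixed problem size */
    double result, total = 0.0;
    struct timeval t0, t1;
    int i, spawned;

    if (nworkers > MAXWORKERS) nworkers = MAXWORKERS;
    pvm_mytid();                        /* enrol this task in PVM */
    gettimeofday(&t0, NULL);

    spawned = pvm_spawn("worker", (char **)0, PvmTaskDefault, "",
                        nworkers, tids);
    if (spawned < nworkers) {
        fprintf(stderr, "only spawned %d of %d workers\n", spawned, nworkers);
        pvm_exit();
        return 1;
    }
    for (i = 0; i < nworkers; i++) {    /* hand out the work */
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&work, 1, 1);
        pvm_send(tids[i], 1);
    }
    for (i = 0; i < nworkers; i++) {    /* collect one result per worker */
        pvm_recv(-1, 2);
        pvm_upkdouble(&result, 1, 1);
        total += result;
    }
    gettimeofday(&t1, NULL);
    printf("%d workers: %.3f s (checksum %g)\n", nworkers,
           (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6, total);
    pvm_exit();
    return 0;
}

The same wall-clock-and-rerun approach works around PVMPOV itself, if timing the real application is preferred over a synthetic workload.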
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Sat Mar 13 17:00:40 2004 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Sat, 13 Mar 2004 22:00:40 +0000 Subject: [Beowulf] Strange Behavior In-Reply-To: <1079109827.3745.7.camel@tp1.mesh-hq> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> <1079109827.3745.7.camel@tp1.mesh-hq> Message-ID: <200403132200.40877.daniel.kidger@quadrics.com> On Friday 12 March 2004 4:43 pm, Lars Henriksen wrote: > On Fri, 2004-03-12 at 13:51, Mathias Brito wrote: > > I'm benchmarking my 16 nodes cluster with HPL and I > > obtain a estrange result, different of all I ever seen > > before. When I send more data with a big N, the > > performance is worse than with small values of N. I > > used N=5000 with NB=20 and the performance was 3.3GB, > > when I send N=10000 with NB=20 i get only 2.1GB. I > > don't liked the result, the nodes are athlon xp 1600+ > > with 512MB RAM, and I think the cluster very slow. > > Someone had the same problem and could help me? > > Please correct me anybody, if im wrong: > It seems to me, that the best results are acheived with approx 85-90% > memory utilization (leaving something to the rest of the system). > > (16*512*1024*1024/8)^0.5 ~= 30200, that would close to the best N value Your target should be say 75% of theoretical peak performance 0.75 * 16nodes * 1 cpupernode * 1.4Ghz * 1 floppertick = 16.8 Gflops/s So figures like '3.1' Gflops/s (14% peak) are much lower than what you should be achieving (Only vendors like IBM post figures on the top500 with %peak figures as low as this (Nov2003) ) Linpack figures are dominated by the choice of maths library - you do not say which one you are using (MKL, libgoto, Atlas, ACML) ? > isn't Nb=20 very low? I currently use arround 145 for P4 cpu's Remember choice of NB depends on which maths library you use rather than simply on the platform - but in general the best values lie between 80 to 256; 20x20 is far too small for a matrix multiply. Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com -------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ratscus at hotmail.com Sat Mar 13 20:55:17 2004 From: ratscus at hotmail.com (Joe Manning) Date: Sat, 13 Mar 2004 18:55:17 -0700 Subject: [Beowulf] project Message-ID: Does anyone know of a good non-profit that posts data to be processed? Kind of like how SETI dispenses its data, but for cancer or something? I have a whole school to my disposal and am just going to run a diskless system pushed down from a server. I can't really do much about the network, but will use it as a working model for some personal curiosities. (hopefully I will be able to contribute to this group at some point) Also, if anyone does know of a good place to get this type of data, can they please point me in the right direction of the type of process said sight uses, so I can decide what version I want to use to implement the process. 
Thanks, Joe Manning _________________________________________________________________ Get a FREE online computer virus scan from McAfee when you click here. http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From patrick at myri.com Sat Mar 13 21:57:40 2004 From: patrick at myri.com (Patrick Geoffray) Date: Sat, 13 Mar 2004 21:57:40 -0500 Subject: [Beowulf] Strange Behavior In-Reply-To: <200403132200.40877.daniel.kidger@quadrics.com> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> <1079109827.3745.7.camel@tp1.mesh-hq> <200403132200.40877.daniel.kidger@quadrics.com> Message-ID: <4053CA24.1020901@myri.com> Hi Dan. Dan Kidger wrote: > Your target should be say 75% of theoretical peak performance He is likely using IP over Ethernet, so 50% would be a more reasonable expectation. > So figures like '3.1' Gflops/s (14% peak) are much lower than what you should > be achieving (Only vendors like IBM post figures on the top500 with %peak > figures as low as this (Nov2003) ) Which ones ? Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From unix_no_win at yahoo.com Sun Mar 14 11:49:17 2004 From: unix_no_win at yahoo.com (unix_no_win) Date: Sun, 14 Mar 2004 08:49:17 -0800 (PST) Subject: [Beowulf] project In-Reply-To: Message-ID: <20040314164917.45310.qmail@web40412.mail.yahoo.com> You might want to check out: www.distributedfolding.org --- Joe Manning wrote: > Does anyone know of a good non-profit that posts > data to be processed? Kind > of like how SETI dispenses its data, but for cancer > or something? I have a > whole school to my disposal and am just going to run > a diskless system > pushed down from a server. I can't really do much > about the network, but > will use it as a working model for some personal > curiosities. (hopefully I > will be able to contribute to this group at some > point) Also, if anyone > does know of a good place to get this type of data, > can they please point me > in the right direction of the type of process said > sight uses, so I can > decide what version I want to use to implement the > process. > > Thanks, > > > Joe Manning > > _________________________________________________________________ > Get a FREE online computer virus scan from McAfee > when you click here. > http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963 > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! 
Mail - More reliable, more storage, less spam http://mail.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From peter at cs.usfca.edu Sun Mar 14 13:57:22 2004 From: peter at cs.usfca.edu (Peter Pacheco) Date: Sun, 14 Mar 2004 10:57:22 -0800 Subject: [Beowulf] Flashmob Supercomputer Message-ID: <20040314185722.GB14301@cs.usfca.edu> The University of San Francisco is sponsoring the first FlashMob Supercomputer on - Saturday, April 3, from 8 am to 6 pm, in the - Koret Center of the University of San Francisco. We're planning to network 1200-1400 laptops with Myrinet and Foundry Switches. We'll be running High-Performance Linpack, and we're hoping to achieve 600 GFLOPS, which is faster than some of the Top500 fastest supercomputers. We need volunteers to - Bring their laptops: Pentium III or IV or AMD, minimum requirements 1.3 GHz with 256 MBytes of RAM - Be table captains: help people set up laptops before running the benchmark - Speak on subjects related to high-performance computing For further information, please visit our website http://flashmobcomputing.org Peter Pacheco Department of Computer Science University of San Francisco San Francisco, CA 94117 peter at cs.usfca.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sun Mar 14 21:17:50 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Mon, 15 Mar 2004 10:17:50 +0800 (CST) Subject: [Beowulf] Oh MyGrid Message-ID: <20040315021750.49880.qmail@web16813.mail.tpe.yahoo.com> http://mygrid.sourceforge.net/ "MyGrid is designed with the modern concepts in mind, simple naming and transparent class hierarchy." It's targeting DataSynapse, licensed under GPL, and more features. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgoornaden at intnet.mu Sun Mar 14 22:12:55 2004 From: rgoornaden at intnet.mu (roudy) Date: Mon, 15 Mar 2004 07:12:55 +0400 Subject: [Beowulf] Re: Writing a parallel program Message-ID: <000701c40a3b$9a415e60$2b007bca@roudy> Hello, I don't know if it will be here that I can get a solution to my problem. Well, I have an array of elements and I would like to divide the array by the number of processors and then each processor process parts of the whole array. Below is the source code of how I am proceeding, can someone tell me what is wrong? Assume that the I have an array allval[tdegree] void share_data(void) { double nleft; int i, k, j, nmin; nmin = tdegree/size; /* Number of degrees to be handled by each processor */ nleft = tdegree%size; for(i=0;i References: <404EC427.7070200@ulakbim.gov.tr> Message-ID: <40556EA3.60400@ulakbim.gov.tr> Hi, We have built a beowulf Debian cluster that contains 128 PIV nodes and one dual xeon server. I need some help about SPBS (Storm). We have already installed SPBS on the server and nodes and all daemons seem to work regularly. 
When any job is given to the system by using pbs scripting, the job can be seen on defined queue by running status and related nodes are allocated for the job. On the other hand there is no cpu or memory consumption on the nodes, the job does not run exactly and at the end of estimated cpu time there is no output file. Can anyone give some advice on my problem about SPBS. Thank you... Burcu Akcan ULAKBIM High Performance Computing Center _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From klamman.gard at telia.com Mon Mar 15 13:42:43 2004 From: klamman.gard at telia.com (Per Lindstrom) Date: Mon, 15 Mar 2004 19:42:43 +0100 Subject: [Beowulf] MOSIX cluster Message-ID: <4055F923.70203@telia.com> Hi, I wonder if some of you have experience of MOSIX? (www.mosix.org) What do you think about that solution for FEA-simulations? Can MOSIX be regarded as a form of a Beowulf cluster? Best regards Per Lindstrom Per.Lindstrom at me.chalmers.se , klamman.gard at telia.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john4482 at umn.edu Mon Mar 15 15:02:40 2004 From: john4482 at umn.edu (Eric R Johnson) Date: Mon, 15 Mar 2004 14:02:40 -0600 Subject: [Beowulf] Scyld system mysteriously locks up Message-ID: <40560BE0.1090808@umn.edu> Hello, I purchased a 4 node, 8 processor Scyld (version 28) cluster approximately 6 months ago. About 5 days ago, it started mysteriously locking up on me. Once it is locked up, I can't do anything except physically reboot the machine. Unfortunately, I am rather new to Linux clusters and, since it worked "right out of the box", I have had no experience in troubleshooting. Can someone give me an idea of where I should start? I have the BIOS on all machines set to do a full memory check on startup and the /var/log/message file shows nothing. Thanks, Eric -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Mon Mar 15 16:23:34 2004 From: agrajag at dragaera.net (Jag) Date: Mon, 15 Mar 2004 16:23:34 -0500 Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <20040312130636.D49119@blnsrv1.science-computing.de> References: <20040312130636.D49119@blnsrv1.science-computing.de> Message-ID: <1079385814.4352.86.camel@pel> On Fri, 2004-03-12 at 07:06, Michael Arndt wrote: > Hello * > > has anyone done a wireless technology uplink to a compute cluster > that is in real use ? > If so, i would be interested to know how and how is the experinece in > transferring "greater" (e.g. 
2 GB ++ ) Result files? > > explanation: > We have a cluster with gigabit interconnect > where it would make life cheaper, if there is a possibility to upload > input data and download output data via wireless link, since connecting > twisted pair between WS and CLuster would be expensive. Depending on your setup, some kind of "wireless" besides 802.11[bg] may be worth considering. I'm assuming the expense in wiring the WS to the cluster isn't wire costs so much as where you'd have to put the cable. One thing you might consider is IR uplink. I don't remember what speed they get, but a few years back I saw a college use IR to get connectivity to a building, that otherwise would have required digging up a busy public street to wire. In the long run it was a lot cheaper. If your expense in wiring is something similar, you may want to look into IR or similar technologies. (The IR guns weren't cheap by any means, except when compared to digging up a city street) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mnerren at paracel.com Mon Mar 15 18:21:04 2004 From: mnerren at paracel.com (micah nerren) Date: Mon, 15 Mar 2004 15:21:04 -0800 Subject: [Beowulf] Scyld system mysteriously locks up In-Reply-To: <40560BE0.1090808@umn.edu> References: <40560BE0.1090808@umn.edu> Message-ID: <1079392863.27739.25.camel@angmar> On Mon, 2004-03-15 at 12:02, Eric R Johnson wrote: > Hello, > > I purchased a 4 node, 8 processor Scyld (version 28) cluster > approximately 6 months ago. About 5 days ago, it started mysteriously > locking up on me. Once it is locked up, I can't do anything except > physically reboot the machine. I would check heating issues. Has the ventilation changed, does the machine feel hot? How long between lockups? Micah _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Tue Mar 16 04:42:30 2004 From: john.hearns at clustervision.com (John Hearns) Date: Tue, 16 Mar 2004 10:42:30 +0100 (CET) Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <1079385814.4352.86.camel@pel> Message-ID: On Mon, 15 Mar 2004, Jag wrote: > be worth considering. I'm assuming the expense in wiring the WS to the > cluster isn't wire costs so much as where you'd have to put the cable. > One thing you might consider is IR uplink. I don't remember what speed > they get, but a few years back I saw a college use IR to get > connectivity to a building, that otherwise would have required digging > up a busy public street to wire. In the long run it was a lot cheaper. When I worked in Soho, we had a laser link over the rooftops of London. At the time a 155Mbps ATM link, which we later used for 100Mbps Ethernet. Main problem was cleaning the lenses every so often, in the lovely London air conditions. We later put in a gigabit laser from Nbase to another building. We needed much more bandwidth than 100Mbps in the end, and had our own trench dug and put in dark fibre. 
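To put numbers on the thread's main concern, the arithmetic in Jim Lux's earlier reply is easy to wrap in a few lines of C: pick an effective (not advertised) link rate and see how long a 2 GB result file takes. The rates below are illustrative assumptions, not measurements.

/* xfer_time.c - the transfer-time arithmetic from the wireless thread,
 * for a 2 GB result file at a few assumed *effective* link rates
 * (the rates are illustrative guesses, not measurements).
 */
#include <stdio.h>

int main(void)
{
    double file_gbytes = 2.0;
    double rate_mbps[] = { 4.0, 20.0, 54.0, 100.0, 1000.0 };
    const char *label[] = { "802.11b, realistic", "802.11a/g, optimistic",
                            "802.11a/g, nominal", "Fast Ethernet",
                            "Gigabit Ethernet" };
    int i;

    for (i = 0; i < 5; i++) {
        double secs = file_gbytes * 8.0 * 1024.0 / rate_mbps[i];  /* GB -> Mbit */
        printf("%-24s %7.0f Mbit/s: %7.0f s (%5.1f min)\n",
               label[i], rate_mbps[i], secs, secs / 60.0);
    }
    return 0;
}

At an optimistic 20 Mbit/s effective rate the 2 GB file comes out at roughly 13-14 minutes, consistent with the 10-15 minute estimate earlier in the thread.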
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Tue Mar 16 04:58:23 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Tue, 16 Mar 2004 17:58:23 +0800 (CST) Subject: [Beowulf] MOSIX cluster In-Reply-To: <4055F923.70203@telia.com> Message-ID: <20040316095823.57806.qmail@web16813.mail.tpe.yahoo.com> Since you know the number of tasks your simulations use, I think using a batch system would make it easier to management - MOSIX is usually for jobs which are very dynamic. You can take a look at the common batch systems such as SGE or SPBS. http://gridengine.sunsource.net http://www.supercluster.org/projects/torque/ Andrew. --- Per Lindstrom ????> Hi, > > I wonder if some of you have experience of MOSIX? > (www.mosix.org) > > What do you think about that solution for > FEA-simulations? > > Can MOSIX be regarded as a form of a Beowulf > cluster? > > Best regards > Per Lindstrom > > Per.Lindstrom at me.chalmers.se , > klamman.gard at telia.com > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From prml at na.chalmers.se Mon Mar 15 13:39:23 2004 From: prml at na.chalmers.se (Per R M Lindstrom) Date: Mon, 15 Mar 2004 19:39:23 +0100 (CET) Subject: [Beowulf] (no subject) Message-ID: Hi, I wonder if some of you have experience of MOSIX? (www.mosix.org) What do you think about that solution for FEA-simulations? Can MOSIX be regarded as a form of a Beowulf cluster? Best regards Per Lindstrom _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bioinformaticist at mn.rr.com Mon Mar 15 14:49:36 2004 From: bioinformaticist at mn.rr.com (Eric R Johnson) Date: Mon, 15 Mar 2004 13:49:36 -0600 Subject: [Beowulf] Scyld system mysteriously locks up Message-ID: <405608D0.60501@mn.rr.com> Hello, I purchased a 4 node, 8 processor Scyld (version 28) cluster approximately 6 months ago. About 5 days ago, it started mysteriously locking up on me. Once it is locked up, I can't do anything except physically reboot the machine. Unfortunately, I am rather new to Linux clusters and, since it worked "right out of the box", I have had no experience in troubleshooting. Can someone give me an idea of where I should start? I have the BIOS on all machines set to do a full memory check on startup and the /var/log/message file shows nothing. Thanks, Eric -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. 
of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From br66 at HPCL.CSE.MsState.Edu Mon Mar 15 18:09:37 2004 From: br66 at HPCL.CSE.MsState.Edu (Balaji Rangasamy) Date: Mon, 15 Mar 2004 17:09:37 -0600 (CST) Subject: [Beowulf] MPICH Exporting environment variables. Message-ID: Hi, Has anyone successfully exported any environment variables (specifically LD_PRELOAD) in MPICH? There is an easy way to do this with LAM/MPI; there is this -x switch in mpirun command that comes with LAM/MPI that will export the environment variable you specify to all the child processes. Is there any easy way to do this in MPICH? Thanks for your reply, Balaji. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tsariysk at craft-tech.com Tue Mar 16 14:12:46 2004 From: tsariysk at craft-tech.com (Ted Sariyski) Date: Tue, 16 Mar 2004 14:12:46 -0500 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> Message-ID: <405751AE.2040806@craft-tech.com> Hi, I am about to configure a 16 node dual xeon cluster based on Supermicro X5DPA-TGM motherboard. The cluster may grow so I am looking for a manageable, nonblocking 24 or 32 port gigabit switch. Any comments or recommendations will be highly appreciated. Thanks, Ted _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Tue Mar 16 13:49:02 2004 From: agrajag at dragaera.net (Sean Dilda) Date: Tue, 16 Mar 2004 13:49:02 -0500 Subject: [Beowulf] Scyld system mysteriously locks up In-Reply-To: <405608D0.60501@mn.rr.com> References: <405608D0.60501@mn.rr.com> Message-ID: <1079462942.4354.49.camel@pel> On Mon, 2004-03-15 at 14:49, Eric R Johnson wrote: > Hello, > > I purchased a 4 node, 8 processor Scyld (version 28) cluster > approximately 6 months ago. About 5 days ago, it started mysteriously > locking up on me. Once it is locked up, I can't do anything except > physically reboot the machine. > Unfortunately, I am rather new to Linux clusters and, since it worked > "right out of the box", I have had no experience in troubleshooting. > Can someone give me an idea of where I should start? > I have the BIOS on all machines set to do a full memory check on startup > and the /var/log/message file shows nothing. It might be useful to try to figure out what is locking up. Is it just the head node that's locking? Have you made any recent changes that might account for it? Or are you running any new programs that might be stressing the machine in a way it wasn't stressed before? If its completely locking (if you can no longer toggle the numlock light on your keyboard, then its completely locked), then its either a kernel hang, or a hardware issue. 
If the kernel is the same and the usage pattern hasn't changed, then it might be a hardware issue. Hardware can degrade over time and dying hardware can be unpredictable. You may also consider contacting Scyld, and possibly the hardware manufacturer for help diagnosing the problem. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From david.n.lombard at intel.com Tue Mar 16 16:19:12 2004 From: david.n.lombard at intel.com (Lombard, David N) Date: Tue, 16 Mar 2004 13:19:12 -0800 Subject: [Beowulf] MOSIX for FEA (was: no subject) Message-ID: <187D3A7CAB42A54DB61F1D05F0125722025F5662@orsmsx402.jf.intel.com> From: Per R M Lindstrom; Monday, March 15, 2004 10:39 AM > > Hi, > > I wonder if some of you have experience of MOSIX? (www.mosix.org) > > What do you think about that solution for FEA-simulations? As with all things, "it depends." More specifically, it depends on the characteristics of the FEA app. For the FEA app that I have intimate familiarity with, MOSIX would not work well at all. The reason is the app is highly sensitive to sustained memory bandwidth and sustained disk I/O bandwidth. While memory bandwidth is not an issue with MOSIX, disk I/O bandwidth will become an issue once MOSIX migrates a process to balance CPU load. The (local scratch) disk I/O will then be forced through both the current and original nodes, severely impacting the bandwidth. Having said that, I can imagine an in-memory FEA app that could work quite well on MOSIX. More specifically, the hypothetical app would read its data from disk, crunch for a while, and then write its results to disk. -- David N. Lombard My comments represent my opinions, not those of Intel Corporation _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gropp at mcs.anl.gov Tue Mar 16 15:14:58 2004 From: gropp at mcs.anl.gov (William Gropp) Date: Tue, 16 Mar 2004 14:14:58 -0600 Subject: [Beowulf] MPICH Exporting environment variables. In-Reply-To: References: Message-ID: <6.0.0.22.2.20040316141246.025e4f48@localhost> At 05:09 PM 3/15/2004, Balaji Rangasamy wrote: >Hi, >Has anyone successfully exported any environment variables (specifically >LD_PRELOAD) in MPICH? There is an easy way to do this with LAM/MPI; there >is this -x switch in mpirun command that comes with LAM/MPI that will >export the environment variable you specify to all the child processes. Is >there any easy way to do this in MPICH? It depends on the process manager/startup system that you are using with MPICH. With the "p4 secure server", environment variables can be exported. With the default ch_p4 device, environment variables are not exported. Under MPICH2, most process managers export the environment to the user processes. Bill _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From csamuel at vpac.org Tue Mar 16 22:09:44 2004 From: csamuel at vpac.org (Chris Samuel) Date: Wed, 17 Mar 2004 14:09:44 +1100 Subject: [Beowulf] cfengine users ? 
Message-ID: <200403171409.45273.csamuel@vpac.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi folks, Anyone out there using cfengine to manage clusters, or who's tried and failed? Just curious as to whether it's worth looking at.. cheers! Chris - -- Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin Victorian Partnership for Advanced Computing http://www.vpac.org/ Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQFAV8F4O2KABBYQAh8RAth7AJ9NkRhIUqcykX1zWGZyi/vZcB7JhwCgkVej uX5R/EcQrBPX+/Pyew55FC0= =tRe+ -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Wed Mar 17 05:09:28 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Wed, 17 Mar 2004 10:09:28 +0000 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <405751AE.2040806@craft-tech.com> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> <405751AE.2040806@craft-tech.com> Message-ID: <200403171009.i2HA9S314735@heppcb.ph.qmw.ac.uk> On Tuesday 16 March 2004 7:12 pm, Ted Sariyski wrote: > Hi, > I am about to configure a 16 node dual xeon cluster based on Supermicro > X5DPA-TGM motherboard. The cluster may grow so I am looking for a > manageable, nonblocking 24 or 32 port gigabit switch. Any comments or > recommendations will be highly appreciated. > Thanks, > Ted > You might want to look at the HP ProCurve 2824 or 2848 series. We choose the latter, because it means we only need one switch per (logical) rack and the cost/port is pretty low. I can't yet comment on performance. cheers, Alex -- ------------------------------------------------------------------------------ | | | Dr. Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bogdan.costescu at iwr.uni-heidelberg.de Wed Mar 17 07:17:06 2004 From: bogdan.costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed, 17 Mar 2004 13:17:06 +0100 (CET) Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <200403171009.i2HA9S314735@heppcb.ph.qmw.ac.uk> Message-ID: On Wed, 17 Mar 2004, Alex Martin wrote: > You might want to look at the HP ProCurve 2824 or 2848 series. We > choose the latter, because it means we only need one switch per > (logical) rack and the cost/port is pretty low. I can't yet comment > on performance. I'm interested in buying a 48 port Gigabit switch as well, and I was looking at the 2848 as it has the advantage of 48 ports in only 1U. One thing that is not clear from the descriptions that I find on the net is if it has support for Jumbo frames. Does the documentation that come with it mention something like this or, even better, have you tried using Jumbo frames ? I'm also interested in hearing opinions about other 48 ports Gigabit switches. 
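On the host side, using jumbo frames (once a switch that passes them is in place) is just a matter of raising the interface MTU, e.g. with "ifconfig eth0 mtu 9000". For the curious, a rough C sketch of the ioctl underneath follows; the interface name and MTU value are placeholders, and both the NIC driver and every switch in the path have to cooperate.

/* set_mtu.c - what "ifconfig eth0 mtu 9000" does underneath, as a sketch.
 * The interface name and MTU below are placeholders; needs root, a
 * jumbo-capable NIC/driver, and switches that actually pass jumbo frames.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

int main(int argc, char **argv)
{
    const char *ifname = (argc > 1) ? argv[1] : "eth0";   /* placeholder */
    int mtu = (argc > 2) ? atoi(argv[2]) : 9000;
    struct ifreq ifr;
    int s = socket(AF_INET, SOCK_DGRAM, 0);  /* any socket will do for the ioctl */

    if (s < 0) { perror("socket"); return 1; }
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    ifr.ifr_mtu = mtu;
    if (ioctl(s, SIOCSIFMTU, &ifr) < 0) {    /* set the interface MTU */
        perror("SIOCSIFMTU");
        close(s);
        return 1;
    }
    printf("%s MTU set to %d\n", ifname, mtu);
    close(s);
    return 0;
}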
-- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Wed Mar 17 08:04:10 2004 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Wed, 17 Mar 2004 05:04:10 -0800 (PST) Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: Message-ID: On Wed, 17 Mar 2004, Bogdan Costescu wrote: > On Wed, 17 Mar 2004, Alex Martin wrote: > > > You might want to look at the HP ProCurve 2824 or 2848 series. We > > choose the latter, because it means we only need one switch per > > (logical) rack and the cost/port is pretty low. I can't yet comment > > on performance. > > I'm interested in buying a 48 port Gigabit switch as well, and I was > looking at the 2848 as it has the advantage of 48 ports in only 1U. > One thing that is not clear from the descriptions that I find on the > net is if it has support for Jumbo frames. Does the documentation that > come with it mention something like this or, even better, have you > tried using Jumbo frames ? hp does not support jumbo frames on anything except their high-end l3 products... > I'm also interested in hearing opinions about other 48 ports Gigabit > switches. > > -- -------------------------------------------------------------------------- Joel Jaeggli Unix Consulting joelja at darkwing.uoregon.edu GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tsariysk at craft-tech.com Wed Mar 17 07:28:34 2004 From: tsariysk at craft-tech.com (Ted Sariyski) Date: Wed, 17 Mar 2004 07:28:34 -0500 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: References: Message-ID: <40584472.1050600@craft-tech.com> If jumboframes are important you may look at Foundry EdgeIron 24G or 48G. Ted Bogdan Costescu wrote: > On Wed, 17 Mar 2004, Alex Martin wrote: > > >>You might want to look at the HP ProCurve 2824 or 2848 series. We >>choose the latter, because it means we only need one switch per >>(logical) rack and the cost/port is pretty low. I can't yet comment >>on performance. > > > I'm interested in buying a 48 port Gigabit switch as well, and I was > looking at the 2848 as it has the advantage of 48 ports in only 1U. > One thing that is not clear from the descriptions that I find on the > net is if it has support for Jumbo frames. Does the documentation that > come with it mention something like this or, even better, have you > tried using Jumbo frames ? > > I'm also interested in hearing opinions about other 48 ports Gigabit > switches. > -- Ted Sariyski ------------ Combustion Research and Flow Technology, Inc. 6210 Keller's Church Road Pipersville, PA 18947 Tel: 215-766-1520 Fax: 215-766-1524 www.craft-tech.com tsariysk at craft-tech.com ----------------------- "Our experiment is perfect and is not limited by fundamental principles." 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From canon at nersc.gov Wed Mar 17 10:26:28 2004 From: canon at nersc.gov (canon at nersc.gov) Date: Wed, 17 Mar 2004 07:26:28 -0800 Subject: [Beowulf] cfengine users ? In-Reply-To: Message from Chris Samuel of "Wed, 17 Mar 2004 14:09:44 +1100." <200403171409.45273.csamuel@vpac.org> Message-ID: <200403171526.i2HFQSni004735@pookie.nersc.gov> Chris, We use cfengine to help manage our ~400 node linux cluster and 416 nodes (6656 processor) SP system. I highly recommend it. We typically use an rpm update script (we are moving to yum now) to manage the binaries and use cfengine to manage config files and scripts. There are some aspects of cfengine that can be a little convoluted, but it is very flexible. --Shane ------------------------------------------------------------------------ Shane Canon PSDF Project Lead National Energy Research Scientific Computing Center 1 Cyclotron Road Mailstop 943-256 Berkeley, CA 94720 ------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From anandv at singnet.com.sg Wed Mar 17 00:40:39 2004 From: anandv at singnet.com.sg (Anand Vaidya) Date: Wed, 17 Mar 2004 13:40:39 +0800 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <405751AE.2040806@craft-tech.com> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> <405751AE.2040806@craft-tech.com> Message-ID: <200403171340.39601.anandv@singnet.com.sg> You can try Foundry Networks EIF24G or EIF48G, offers full BW, 1U, we like it. -Anand On Wednesday 17 March 2004 03:12, Ted Sariyski wrote: > Hi, > I am about to configure a 16 node dual xeon cluster based on Supermicro > X5DPA-TGM motherboard. The cluster may grow so I am looking for a > manageable, nonblocking 24 or 32 port gigabit switch. Any comments or > recommendations will be highly appreciated. > Thanks, > Ted > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Thu Mar 18 10:26:51 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Thu, 18 Mar 2004 10:26:51 -0500 (EST) Subject: [Beowulf] Intel CSA performance? Message-ID: Intel added a special connection on their chipset to connect gigabit on some chipsets (CSA). I've been wondering whether this would offer a latency advantage, since it's conventional wisdom that PCI latency is a noticable part of MPI latency. this article: http://tinyurl.com/2vlez claims that CSA actually hurts latency, which is a bit puzzling. it is, admittedly, "gamepc.com", so perhaps they are unaware of tuning issues like interrupt-coalescing/mitigation. do any of you have CSA-based networks and have done performance tests? thanks, mark hahn. 
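For anyone wanting to answer this on their own hardware, a crude userspace test is enough to see whether CSA (or different interrupt-coalescing settings) moves socket latency at all: bounce one byte back and forth over TCP and average the round trip. The sketch below deliberately has almost no error handling; the port number and iteration count are arbitrary choices.

/* pingpong.c - crude one-byte TCP round-trip timer, the kind of number
 * that shows whether CSA or interrupt-coalescing settings change socket
 * latency. Port and iteration count are arbitrary; error handling is
 * mostly omitted on purpose.
 * Usage: "pingpong server" on one node, "pingpong client <server-ip>"
 * on another.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>

#define PORT  5678
#define ITERS 10000

int main(int argc, char **argv)
{
    int one = 1, s, c, i;
    char byte = 'x';
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(PORT);

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        addr.sin_addr.s_addr = INADDR_ANY;
        s = socket(AF_INET, SOCK_STREAM, 0);
        setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
        bind(s, (struct sockaddr *)&addr, sizeof(addr));
        listen(s, 1);
        c = accept(s, NULL, NULL);
        setsockopt(c, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
        for (i = 0; i < ITERS; i++) {         /* echo every byte straight back */
            if (read(c, &byte, 1) != 1) break;
            write(c, &byte, 1);
        }
    } else if (argc > 2 && strcmp(argv[1], "client") == 0) {
        struct timeval t0, t1;
        addr.sin_addr.s_addr = inet_addr(argv[2]);
        c = socket(AF_INET, SOCK_STREAM, 0);
        setsockopt(c, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
        if (connect(c, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }
        gettimeofday(&t0, NULL);
        for (i = 0; i < ITERS; i++) {         /* one byte out, one byte back */
            write(c, &byte, 1);
            read(c, &byte, 1);
        }
        gettimeofday(&t1, NULL);
        printf("average round trip: %.2f usec over %d iterations\n",
               ((t1.tv_sec - t0.tv_sec) * 1e6 +
                (t1.tv_usec - t0.tv_usec)) / ITERS, ITERS);
    } else {
        fprintf(stderr, "usage: %s server | %s client <server-ip>\n",
                argv[0], argv[0]);
        return 1;
    }
    return 0;
}

Comparing runs with a CSA-attached port against a PCI NIC, and with different coalescing settings on the NIC, would give the apples-to-apples numbers the article apparently did not.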
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From venkatraman at programmer.net Thu Mar 18 07:03:57 2004 From: venkatraman at programmer.net (Venkatraman Madurai Venkatasubramanyam) Date: Thu, 18 Mar 2004 07:03:57 -0500 Subject: [Beowulf] Suggest me on my attempt!! Message-ID: <20040318120357.A52A91D435B@ws1-12.us4.outblaze.com> Hello ppl! I am a Computer Science and Engineering student of India. I am planning to build a Beowulf Cluster for my Project as a part of my curriculum. Resource I have are four laptops with Intel Celeron 2 GHz, 18 GB HDD, HP Compaq Presario 2100 series, 192 MB RAM and I dont know what else shud I specify here. I have RedHat Linux 9 running on it. So I seek your help here to suggest me on how to build a Cluster. Please show me a way, as I am new to the Linux Platform. If you can personally help me, I will be really appreciated. MOkShAA. -- ___________________________________________________________ Sign-up for Ads Free at Mail.com http://promo.mail.com/adsfreejump.htm _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Mar 18 15:01:10 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 18 Mar 2004 15:01:10 -0500 (EST) Subject: [Beowulf] Suggest me on my attempt!! In-Reply-To: <20040318120357.A52A91D435B@ws1-12.us4.outblaze.com> Message-ID: On Thu, 18 Mar 2004, Venkatraman Madurai Venkatasubramanyam wrote: > Hello ppl! > I am a Computer Science and Engineering student of India. I am > planning to build a Beowulf Cluster for my Project as a part of my > curriculum. Resource I have are four laptops with Intel Celeron 2 GHz, > 18 GB HDD, HP Compaq Presario 2100 series, 192 MB RAM and I dont know > what else shud I specify here. I have RedHat Linux 9 running on it. So I > seek your help here to suggest me on how to build a Cluster. Please show > me a way, as I am new to the Linux Platform. If you can personally help > me, I will be really appreciated. a) Visit http://www.phy.duke.edu/brahma Among other things on this site is an online book on building clusters. Read/skim it. b) In your case the recipe is almost certainly going to be: i) Put laptops on a common switched network (cheap 100 Mbps switch). ii) Install PVM, MPI (lam and/or mpich), programming tools and support if you haven't already on all nodes. iii) Set them up with a common home directory space NFS exported from one to the rest, and with common accounts to match. You can distribute account information on so small a cluster by just copying e.g. /etc/passwd and /etc/group and so on or by using NIS (or other ways). iv) Set up a remote shell so that you can freely login from any node to any other node without a password. I recommend ssh (openssh rpms) but rsh is OK if your network is otherwise isolated and secure. v) Obtain, write, build parallel applications to explore what your cluster can do. There are demo programs for both PVM and MPI that come with the distributions and more are available on the web. There is a PVM program template and an example PVM application suitable for demonstrating scaling (also a potential template for master/slave code) on: http://www.phy.duke.edu/~rgb under "General". vi) Proceed from there as your skills increase. 
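As a concrete starting point for step (v) in the recipe above, the classic numerical-integration pi program is about as small as a real MPI application gets, scales with the number of processes, and needs nothing beyond MPI itself. A minimal version (a sketch, not tuned for anything) follows; build it with mpicc and run it with, e.g., "mpirun -np 4 ./pi_mpi 100000000".

/* pi_mpi.c - classic numerical-integration pi estimate; a minimal
 * "first application" for a small cluster. The interval count on the
 * command line is arbitrary.
 */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    long n, i;
    int rank, size;
    double h, x, local = 0.0, pi = 0.0, t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    n = (argc > 1) ? atol(argv[1]) : 10000000L;   /* number of intervals */
    MPI_Bcast(&n, 1, MPI_LONG, 0, MPI_COMM_WORLD);

    t0 = MPI_Wtime();
    h = 1.0 / (double)n;
    for (i = rank; i < n; i += size) {            /* each rank takes a stride */
        x = h * ((double)i + 0.5);
        local += 4.0 / (1.0 + x * x);
    }
    local *= h;
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("pi ~= %.12f  (%ld intervals, %d procs, %.3f s)\n",
               pi, n, size, t1 - t0);
    MPI_Finalize();
    return 0;
}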
I think that you'll find that after this you'll be in pretty good shape for further progress, guided as you think necessary by this list. There are also books out there that can help, but they cost money. Finally, I'd strongly suggest subscribing to Cluster World Magazine, where there are both articles and monthly columns that cover how to do all of the above and much more. rgb > MOkShAA. > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rouds at servihoo.com Fri Mar 19 06:48:38 2004 From: rouds at servihoo.com (RoUdY) Date: Fri, 19 Mar 2004 15:48:38 +0400 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: <200310011901.h91J1LY06826@NewBlue.Scyld.com> Message-ID: Hello I really need a very big hand from you... I have to run a program on my cluster for the final year project, which require a lot of computation power... Can someone sent me a program (the source code) or a site where i can download a big program PLEASE ... Using MPI.... Hope to hear from you Roud -------------------------------------------------- Get your free email address from Servihoo.com! http://www.servihoo.com The Portal of Mauritius _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Fri Mar 19 09:31:25 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: Fri, 19 Mar 2004 14:31:25 +0000 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: References: Message-ID: <1079706684.2520.1.camel@tp1.mesh-hq> On Fri, 2004-03-19 at 11:48, RoUdY wrote: > I have to run a program on my cluster for the final year > project, which require a lot of computation power... > Can someone sent me a program (the source code) or a site > where i can download a big program PLEASE ... > Using MPI.... Try HPL (High-Performance Linpack): http://www.netlib.org/benchmark/hpl/ best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Graabroedrestraede 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gropp at mcs.anl.gov Fri Mar 19 08:43:43 2004 From: gropp at mcs.anl.gov (William Gropp) Date: Fri, 19 Mar 2004 07:43:43 -0600 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: References: <200310011901.h91J1LY06826@NewBlue.Scyld.com> Message-ID: <6.0.0.22.2.20040319074111.02505e60@localhost> At 05:48 AM 3/19/2004, RoUdY wrote: >Hello >I really need a very big hand from you... >I have to run a program on my cluster for the final year project, which >require a lot of computation power... >Can someone sent me a program (the source code) or a site where i can >download a big program PLEASE ... >Using MPI.... >Hope to hear from you Roud There are many examples included with PETSc (www.mcs.anl.gov/petsc) that can be sized to use as much power as you have. 
HPLinpack will also use as much computational power as you have and allows you to compare your cluster to the Top500 list. Both use MPI for communication. Bill _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gharinarayana at yahoo.com Fri Mar 19 11:34:57 2004 From: gharinarayana at yahoo.com (HARINARAYANA G) Date: Fri, 19 Mar 2004 08:34:57 -0800 (PST) Subject: [Beowulf] Give an application to PARALLELIZE Message-ID: <20040319163457.3051.qmail@web11306.mail.yahoo.com> Dear friends, Please give me a very good application which uses pda(algorithms) and MPI to the maximum extent and which is POSSIBLE to do in 2 months(It's OK even if you have done it already, just send the NAME of the topic and the problem requirements). I am doing my Bachelor of Engineering in Comp. Science at RNSIT,Bangalore,INDIA. I am with a team of 4 people. With regards, Sivaram. __________________________________ Do you Yahoo!? Yahoo! Mail - More reliable, more storage, less spam http://mail.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 19 21:18:31 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 20 Mar 2004 10:18:31 +0800 (CST) Subject: [Beowulf] GridEngine 6.0 beta is ready! Message-ID: <20040320021831.65847.qmail@web16811.mail.tpe.yahoo.com> It's finally available, follow this link to download the binary packages or source: http://gridengine.sunsource.net/project/gridengine/news/SGE60beta-announce.html Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Fri Mar 19 21:51:56 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Fri, 19 Mar 2004 18:51:56 -0800 Subject: [Beowulf] Intel CSA performance? In-Reply-To: References: Message-ID: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> On Thu, Mar 18, 2004 at 10:26:51AM -0500, Mark Hahn wrote: > Intel added a special connection on their chipset to connect > gigabit on some chipsets (CSA). I've been wondering whether > this would offer a latency advantage, since it's conventional wisdom > that PCI latency is a noticable part of MPI latency. Eh? PCI latency can be noticable when you have a low latency network, but gigE latency isn't nearly that low, especially once you've gone through a switch. The only reference to gigabit latency in the article didn't say what they measured. I'd assume that it was using the normal drivers, which means the kernel networking stack, which means you're looking through the telescope from the wrong end. 
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From desi_star786 at yahoo.com Sat Mar 20 13:38:10 2004 From: desi_star786 at yahoo.com (desi star) Date: Sat, 20 Mar 2004 10:38:10 -0800 (PST) Subject: [Beowulf] Problem running Jaguar on Scyld-Beowulf in parallel mode. Message-ID: <20040320183810.94267.qmail@web40812.mail.yahoo.com> Hi.. I have installed a molecular modeling software Jaguar by Schrodinger Inc. on my scyld-beowulf 16 node cluster. The software runs perfectly fine on the master node but gives an error when I try to run the program on more than one CPU. User manual of the program suggests following steps to run Jaguar in parallel mode: 1. Install MPICH and configure with option: --with-comm=shared --with-device=ch_p4 2. Edit the machine.LINUX file in the MPICH directory and list the name of the host and number of processors on that host. 3. Test that 'rsh' is working 4. Launch the secure server ch4p_servs We already have the MPICH installed on the cluster using package 'mpich-p4-inter-1.3.2-5_scyld.i368.rpm'. I do not know whether package installation was done with specific configure options in step#1. Do I need to re-install the MPICH? I know that MPICH works perfectly fine for the FORTRAN 90 programs on different nodes. Also, Is it really important to enable 'rsh' on scyld? The cluster is not protected by firewall so I want to use the more secure 'ssh' but then do I need to install the MPICH again telling it to use ssh rather than rsh for communication? I am also wondering if the reason I am not been able to run program on more than one CPU has to do with the fact that Jaguar is not linked to MPICH libraries? This is my first experience with MPICH and running programs in parallel. I would really appreciate quick tips and suggestions as to why I am not been to make Jaguar run in the parallel mode. Thanks in advance. Eagerly waiting for a response. -- Pratap Singh. Graduate Student, The Chemical and Biomolecular Eng. Johns Hopkins Univ. __________________________________ Do you Yahoo!? Yahoo! Finance Tax Center - File online. File on time. http://taxes.yahoo.com/filing.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rmyers1400 at comcast.net Fri Mar 19 22:58:53 2004 From: rmyers1400 at comcast.net (Robert Myers) Date: Fri, 19 Mar 2004 22:58:53 -0500 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> Message-ID: <405BC17D.3010504@comcast.net> Greg Lindahl wrote: >On Thu, Mar 18, 2004 at 10:26:51AM -0500, Mark Hahn wrote: > > > >>Intel added a special connection on their chipset to connect >>gigabit on some chipsets (CSA). I've been wondering whether >>this would offer a latency advantage, since it's conventional wisdom >>that PCI latency is a noticable part of MPI latency. >> >> > >Eh? PCI latency can be noticable when you have a low latency network, >but gigE latency isn't nearly that low, especially once you've gone >through a switch. > >The only reference to gigabit latency in the article didn't say what >they measured. 
I'd assume that it was using the normal drivers, which >means the kernel networking stack, which means you're looking through >the telescope from the wrong end. > > > I had thought it might be interesting to fool around with trying to use CSA for hyperscsi, but I think you're saying if you're going to use a switched network, don't bother, if you're trying to win on latency. When Intel abandoned infiniband and the memory controller hub sprouted this ethernet link, I figured that was their opening shot in stomping what's left of infiniband. Maybe it is, and they just don't care about latency, but it sounds like nobody's got any reliable information as to what the latency effects of CSA may be, anyway. Every indication I can find is that Intel has all its bets on ethernet, and I don't know that there is any technological obstacle to building a low-latency ethernet. RM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jimlux at earthlink.net Sat Mar 20 17:42:54 2004 From: jimlux at earthlink.net (Jim Lux) Date: Sat, 20 Mar 2004 14:42:54 -0800 Subject: [Beowulf] Wireless network speed for clusters Message-ID: <002b01c40ecc$cd7cec50$32a8a8c0@LAPTOP152422> Some preliminary results for those of you wondering just how slow it actually is... Configuration is basically this: node (Via EPIA C3 533MHz) running freevix kernel (ramdisk filesystem) wired connection through Dlink 5 port hub DWL-7000AP set up for point to multipoint 802.11a (5GHz band) luminiferous aether DWL-7000AP ancient 10Mbps hub Clunky PPro running Knoppix/debian Maxtor NAS with a NFS mount Pings with default 63 byte packets give 1.2-2.0 ms both ways... Compare to <0.1 ms with a wired connection (i.e. plugging a cable from the Dlink hub to the ancient hub) DHCP/PXE booting sort of works (not exhaustively tested) For some reason, the nodes can't see the NAS so NFS doesn't mount There are a lot of "issues" with the DWL-7000AP... I think it's trying to be clever about not routing traffic to MACs on the local side over the air, but then, it doesn't know to route the traffic to the NFS server. The DWL-7000's also don't like to be powered up with no live (as in responding to packets) device hooked up to them, so there's sort of a potential power sequencing thing with the EPIA boards and the DWL-7000AP. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Sat Mar 20 23:01:36 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Sat, 20 Mar 2004 20:01:36 -0800 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <405BC17D.3010504@comcast.net> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> <405BC17D.3010504@comcast.net> Message-ID: <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> On Fri, Mar 19, 2004 at 10:58:53PM -0500, Robert Myers wrote: > I had thought it might be interesting to fool around with trying to use > CSA for hyperscsi, but I think you're saying if you're going to use a > switched network, don't bother, if you're trying to win on latency. I've never heard of "hyperscsi", and I am not saying what you think I'm saying. 
What I am saying is that if you're going to use 1 gigabit Ethernet, which has high latency in the switches, AND go through the kernel, don't bother. I was pretty clear, so I don't see how you missed it. There are certainly many examples of switched networks that are low latency, such as Myrinet, IB, Quadrics, SCI, and so forth. > When Intel abandoned infiniband Intel has not abandoned Infiniband. They discontinued a 1X interface that was going to get stomped in the market that was developing more slowly than expected. Just like you drew the wrong lesson from what I said, don't draw the wrong lesson from what Intel did. > Every indication I can find is that Intel has all its bets on ethernet, This contradicts what Intel says. They are not betting against ethernet, but they are certainly encouraging FC and IB where FC and IB make sense. However, this is straying beyond beowulf, and I hope that this mailing list can avoid being the cesspool that comp.arch has been for many years. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rmyers1400 at comcast.net Sun Mar 21 01:40:30 2004 From: rmyers1400 at comcast.net (Robert Myers) Date: Sun, 21 Mar 2004 01:40:30 -0500 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> <405BC17D.3010504@comcast.net> <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> Message-ID: <405D38DE.1010409@comcast.net> Greg Lindahl wrote: >On Fri, Mar 19, 2004 at 10:58:53PM -0500, Robert Myers wrote: > > > >>I had thought it might be interesting to fool around with trying to use >>CSA for hyperscsi, but I think you're saying if you're going to use a >>switched network, don't bother, if you're trying to win on latency. >> >> > >I've never heard of "hyperscsi", and I am not saying what you think >I'm saying. What I am saying is that if you're going to use 1 gigabit >Ethernet, which has high latency in the switches, AND go through the >kernel, don't bother. I was pretty clear, so I don't see how you >missed it. There are certainly many examples of switched networks that >are low latency, such as Myrinet, IB, Quadrics, SCI, and so forth. > I should have been explicit. "If you are going through a switched _ethernet_ connection." If you do the groups.google.com search low-latency infiniband group:comp.arch author:Robert author:Myers you will find that you really don't need to educate me about the existence of low-latency interconnects. As to hyperscsi, I gather that it is incumbent only on others to check google. Hyperscsi is a way to pass raw data over ethernet without going through the TCP/IP stack: http://www.linuxdevices.com/files/misc/hyperscsi.pdf so it doesn't consume nearly the CPU resources that TCP/IP does without hardware offload, and I don't think CSA allows you to use separate hardware TCP/IP offload. It looks potentially interesting as a low-cost clustering interconnect, especially if, as I expect, Intel continues to push ethernet. 
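To make the idea of pushing raw data over Ethernet without the TCP/IP stack a bit more concrete, a minimal Linux sketch of sending a payload in a hand-built Ethernet frame looks roughly like the following. This is not HyperSCSI itself, just the underlying AF_PACKET mechanism it (and schemes like it) sit on; the interface name, broadcast destination, and experimental EtherType are placeholder choices, and it has to run as root.

/* rawsend.c -- send one raw Ethernet frame, bypassing the TCP/IP stack (sketch only).
 * Build: cc -O2 -o rawsend rawsend.c      Run (as root): ./rawsend eth0
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <net/ethernet.h>
#include <netpacket/packet.h>
#include <sys/socket.h>

int main(int argc, char **argv)
{
    const char *ifname = argc > 1 ? argv[1] : "eth0";        /* placeholder interface name */
    unsigned char dst[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff}; /* broadcast, for demo only */
    unsigned char frame[ETH_FRAME_LEN];
    struct ether_header *eh = (struct ether_header *)frame;
    struct sockaddr_ll sll;
    const char *payload = "hello, raw ethernet";
    int s, len;

    s = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (s < 0) { perror("socket (root required)"); return 1; }

    memset(&sll, 0, sizeof sll);
    sll.sll_family  = AF_PACKET;
    sll.sll_ifindex = if_nametoindex(ifname);
    sll.sll_halen   = ETH_ALEN;
    memcpy(sll.sll_addr, dst, ETH_ALEN);

    /* Build the frame by hand: destination MAC, source MAC (zeros for the demo),
     * and the IEEE "local experimental" EtherType 0x88b5 so it is clearly not IP. */
    memcpy(eh->ether_dhost, dst, ETH_ALEN);
    memset(eh->ether_shost, 0, ETH_ALEN);
    eh->ether_type = htons(0x88b5);
    len = sizeof(*eh) + strlen(payload);
    memcpy(frame + sizeof(*eh), payload, strlen(payload));

    if (sendto(s, frame, len, 0, (struct sockaddr *)&sll, sizeof sll) < 0)
        perror("sendto");
    close(s);
    return 0;
}

The point of the sketch is only that nothing above the driver touches the data, so the per-message cost is the NIC, the wire, and one system call, with no TCP/IP processing on either side.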
RM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sun Mar 21 09:46:36 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sun, 21 Mar 2004 22:46:36 +0800 (CST) Subject: [Beowulf] Re: GridEngine 6.0 beta is ready! In-Reply-To: Message-ID: <20040321144636.49074.qmail@web16808.mail.tpe.yahoo.com> SGE 6.1 will be available at the end of the year, so when the newer version of Rocks Cluster picks up SGE 6.0, SGE 6.1 will be available at around the same time. Andrew. --- "Mason J. Katz" wrote: > Thanks for the update. We're not going to include > this in our April > release, but we will update to the official Opteron > port and remove our > version of this port. We hope to build experience > with SGE 6.0 in the > coming months and include it as part of our November > release as 6.0 > goes from beta to release. Thanks. > > -mjk > > On Mar 19, 2004, at 6:18 PM, Andrew Wang wrote: > > > It's finally available, follow this link to > download > > the binary packages or source: > > > > > http://gridengine.sunsource.net/project/gridengine/news/SGE60beta- > > > announce.html > > > > Andrew. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Mar 22 12:33:15 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 22 Mar 2004 09:33:15 -0800 Subject: [Beowulf] Give an application to PARALLELIZE In-Reply-To: <20040319163457.3051.qmail@web11306.mail.yahoo.com> Message-ID: <5.2.0.9.2.20040322093203.017e1000@mailhost4.jpl.nasa.gov> At 08:34 AM 3/19/2004 -0800, HARINARAYANA G wrote: >Dear friends, > >Please give me a very good application which uses >pda(algorithms) and MPI to the maximum extent and >which is POSSIBLE to do in 2 months(It's OK even if >you have done it already, just send the NAME of the >topic and the problem requirements). > > I am doing my Bachelor of Engineering in Comp. >Science at RNSIT,Bangalore,INDIA. > > I am with a team of 4 people. > >With regards, >Sivaram. A couple of issues back in the IEEE Proceedings, there were several papers describing doing acoustic source localization with a bunch of iPAQs. I don't know if they were doing MPI for node/node communication, but there's fairly extensive literature out there, and the papers describe the algorithms used. James Lux, P.E.
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From clwang at csis.hku.hk Sun Mar 21 23:55:00 2004 From: clwang at csis.hku.hk (Cho Li Wang) Date: Mon, 22 Mar 2004 12:55:00 +0800 Subject: [Beowulf] Final Call : NPC2004 (Deadline: March 22, 2004) Message-ID: <405E71A4.1556E651@csis.hku.hk> ******************************************************************* NPC2004 IFIP International Conference on Network and Parallel Computing October 18-20, 2004 Wuhan, China http://grid.hust.edu.cn/npc04 - ------------------------------------------------------------------- Important Dates Paper Submission March 22, 2004 (extended) Author Notification May 1, 2004 Final Camera Ready Manuscript June 1, 2004 ******************************************************************* Call For Papers The goal of IFIP International Conference on Network and Parallel Computing (NPC 2004) is to establish an international forum for engineers and scientists to present their excellent ideas and experiences in all system fields of network and parallel computing. NPC 2004, hosted by the Huazhong University of Science and Technology, will be held in the city of Wuhan, China - the "Homeland of White Clouds and the Yellow Crane." Topics of interest include, but are not limited to: - Parallel & Distributed Architectures - Parallel & Distributed Applications/Algorithms - Parallel Programming Environments & Tools - Network & Interconnect Architecture - Network Security - Network Storage - Advanced Web and Proxy Services - Middleware Frameworks & Toolkits - Cluster and Grid Computing - Ubiquitous Computing - Peer-to-peer Computing - Multimedia Streaming Services - Performance Modeling & Evaluation Submitted papers may not have appeared in or be considered for another conference. Papers must be written in English and must be in PDF format. Detailed electronic submission instructions will be posted on the conference web site. The conference proceedings will be published by Springer Verlag in the Lecture Notes in Computer Science Series (cited by SCI). Best papers from NPC 2004 will be published in a special issue of International Journal of High Performance Computing and Networking (IJHPCN) after conference. ************************************************************************** Committee General Co-Chairs: H. J. Siegel Colorado State University, USA Guojie Li The Institute of Computing Technology, CAS, China Steering Committee Chair: Kemal Ebcioglu IBM T.J. Watson Research Center, USA Program Co-Chairs: Guangrong Gao University of Delaware, USA Zhiwei Xu Chinese Academy of Sciences, China Program Vice-Chairs: Victor K. Prasanna University of Southern California, USA Albert Y. 
Zomaya University of Sydney, Australia Hai Jin Huazhong University of Science and Technology, China Publicity Co-Chairs: Cho-Li Wang The University of Hong Kong, Hong Kong Chris Jesshope The University of Hull, UK Local Arrangement Chair: Song Wu Huazhong University of Science and Technology, China Steering Committee Members: Jack Dongarra University of Tennessee, USA Guangrong Gao University of Delaware, USA Jean-Luc Gaudiot University of California, Irvine, USA Guojie Li The Institute of Computing Technology, CAS, China Yoichi Muraoka Waseda University, Japan Daniel Reed University of North Carolina, USA Program Committee Members: Ishfaq Ahmad University of Texas at Arlington, USA Shoukat Ali University of Missouri-Rolla, USA Makoto Amamiya Kyushu University, Japan David Bader University of New Mexico, USA Luc Bouge IRISA/ENS Cachan, France Pascal Bouvry University of Luxembourg, Luxembourg Ralph Castain Los Alamos National Laboratory, USA Guoliang Chen University of Science and Technology of China, China Alain Darte CNRS, ENS-Lyon, France Chen Ding University of Rochester, USA Jianping Fan Institute of Computing Technology, CAS, China Xiaobing Feng Institute of Computing Technology, CAS, China Jean-Luc Gaudiot University of California, Irvine, USA Minyi Guo University of Aizu, Japan Mary Hall University of Southern California, USA Salim Hariri University of Arizona, USA Kai Hwang University of Southern California, USA Anura Jayasumana Colorado State Univeristy, USA Chris R. Jesshop The University of Hull, UK Ricky Kwok The University of Hong Kong, Hong Kong Francis Lau The University of Hong Kong, Hong Kong Chuang Lin Tsinghua University, China John Morrison University College Cork, Ireland Lionel Ni Hong Kong University of Science and Technology, Hong Kong Stephan Olariu Old Dominion University, USA Yi Pan Georgia State University, USA Depei Qian Xi'an Jiaotong University, China Daniel A. Reed University of North Carolina at Chapel Hill, USA Jose Rolim University of Geneva, Switzerland Arnold Rosenberg University of Massachusetts at Amherst, USA Sartaj Sahni University of Florida, USA Selvakennedy Selvadurai University of Sydney, Australia Franciszek Seredynski Polish Academy of Sciences, Poland Hong Shen Japan Advanced Institute of Science and Technology, Japan Xiaowei Shen IBM T. J. Watson Research Center, USA Gabby Silberman IBM Centers for Advanced Studies, USA Per Stenstrom Chalmers University of Technology, Sweden Ivan Stojmenovic University of Ottawa, Canada Ninghui Sun Institute of Computing Technology, CAS, China El-Ghazali Talbi University of Lille, France Domenico Talia University of Calabria, Italy Mitchell D. 
Theys University of Illinois at Chicago, USA Xinmin Tian Intel Corporation, USA Dean Tullsen University of California, San Diego, USA Cho-Li Wang The University of Hong Kong, Hong Kong Qing Yang University of Rhode Island, USA Yuanyuan Yang State University of New York at Stony Brook, USA Xiaodong Zhang College of William and Mary, USA Weimin Zheng Tsinghua University, China Bingbing Zhou University of Sydney, Australia Chuanqi Zhu Fudan University, China _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 12:20:47 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 09:20:47 -0800 Subject: [Beowulf] Re: scyld and jaguar Message-ID: <200403220920.47878.mwill@penguincomputing.com> Hi, I saw your email on the beowulf list, and have a few comments: 1. MPICH on Scyld does not require rsh or ssh but rather it will take advantage of the bproc features of Scyld to achieve the same faster. 2. If your fortran programs work fine, so should the c programs. Unless you have an executable that is statically linked with its own mpich implementation. You can test that by using 'ldd' on the executable, it will list which libraries it is loading. If there are no mpich libs mentioned, you might have a statically linked program. Let me know how it goes. Michael Will -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jeffrey.b.layton at lmco.com Mon Mar 22 15:03:35 2004 From: jeffrey.b.layton at lmco.com (Jeff Layton) Date: Mon, 22 Mar 2004 15:03:35 -0500 Subject: [Beowulf] NUMA Patches for AMD64 in 2.4? Message-ID: <405F4697.9070507@lmco.com> Good Afternoon! Does anyone know if the latest stock 2.4 kernel has the NUMA patches in it? If not, where can I get NUMA patches that will work for AMD64? TIA! Jeff -- Dr. Jeff Layton Aerodynamics and CFD Lockheed-Martin Aeronautical Company - Marietta _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Mon Mar 22 16:30:04 2004 From: landman at scalableinformatics.com (Joe Landman) Date: Mon, 22 Mar 2004 16:30:04 -0500 Subject: [Beowulf] NUMA Patches for AMD64 in 2.4? In-Reply-To: <405F4697.9070507@lmco.com> References: <405F4697.9070507@lmco.com> Message-ID: <405F5ADC.2080101@scalableinformatics.com> You can pull x86_64 patches from ftp://ftp.x86-64.org/pub/linux/v2.6/ . The 2.4 kernels would need backports in some cases (RedHat is doing this, and I think SUSE might be as well). Not sure if Fedora is doing this as well (no /proc/numa in it or in the SUSE 9.0 AMD64). Joe Jeff Layton wrote: > Good Afternoon! > > Does anyone know if the latest stock 2.4 kernel has the > NUMA patches in it? If not, where can I get NUMA patches > that will work for AMD64? > > TIA! 
> > Jeff > -- Joseph Landman, Ph.D Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From desi_star786 at yahoo.com Mon Mar 22 15:15:27 2004 From: desi_star786 at yahoo.com (desi star) Date: Mon, 22 Mar 2004 12:15:27 -0800 (PST) Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <200403220920.47878.mwill@penguincomputing.com> Message-ID: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Hi Mike, Thanks much for responding. Jaguar is indeed staticaly linked to the MPICH libraries as per manuals. When I ran the ldd commands as you suggested: -- $ ldd Jaguar not a dynamic executable $ -- Thats why the very first step sugested in the Jaguar installation is to build and configure MPICH from the start. Where do I go from here? I also worked on Alan's suggestion and created a dynamic link between the ssh and rsh. I am now stuck in making ssh passwordless. Using 'ssh-keygen -t' I generated public and private keys and then copied public key to the authorised_keys2 in ~/.ssh/. I am not sure if thats all I need to make ssh passwordless. I was wondering if I will have to copy public keys on each node using bpcp command. I would appreciate suggestions in this matter. Thanks. Pratap. --- Michael Will wrote: > Hi, > > I saw your email on the beowulf list, and have a few > comments: > > 1. MPICH on Scyld does not require rsh or ssh but > rather it will take > advantage of the bproc features of Scyld to achieve > the same faster. > > > 2. If your fortran programs work fine, so should the > c programs. Unless you > have an executable that is statically linked with > its own mpich > implementation. You can test that by using 'ldd' on > the executable, it will > list which libraries it is loading. If there are no > mpich libs mentioned, you > might have a statically linked program. > > Let me know how it goes. > > Michael Will > -- > Michael Will, Linux Sales Engineer > NEWS: We have moved to a larger iceberg :-) > NEWS: 300 California St., San Francisco, CA. > Tel: 415-954-2822 Toll Free: 888-PENGUIN > Fax: 415-954-2899 > www.penguincomputing.com > __________________________________ Do you Yahoo!? Yahoo! Finance Tax Center - File online. File on time. http://taxes.yahoo.com/filing.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 17:01:31 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 14:01:31 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <200403221401.31370.mwill@penguincomputing.com> The problem is that a statically linked executable will not be able to use the Scyld infrastructure. It won't take advantage of your Infinidband or Myrinet, it won't use bproc, etcpp. You might set up the compute nodes to look like general unix nodes in order to run that particular implementation, but then you loose all the advantages of Scyld. > I also worked on Alan's suggestion and created a > dynamic link between the ssh and rsh. 
AFAIK you would be better off to set the enviroment variable to force it to use rsh or ssh. I think its P4_RSHCOMMAND="ssh" . The best way would be to ask your vendor to provide you with a dymanically linked executable, or even the source code and compile it yourself. > I am now stuck > in making ssh passwordless. Using 'ssh-keygen -t' I > generated public and private keys and then copied > public key to the authorised_keys2 in ~/.ssh/. I am > not sure if thats all I need to make ssh passwordless. Does it work with localhost? It sometimes is tricky to get it right. then it could also work remotely, given that you 1) have sshd running 2) have your home NFS mounted 3) have made /dev/random accessible, at least for ssh I believe thats necessary > I was wondering if I will have to copy public keys on > each node using bpcp command. You could do that too if you do not want to NFS mount the home. That you could easily do by editing /etc/exports to export /home and /etc/beowulf/fstab to mount $MASTER, after that rebooting your compute node. (might be possible without rebooting, but I don't know off of the top of my head) Michael -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From m.dierks at skynet.be Mon Mar 22 18:39:30 2004 From: m.dierks at skynet.be (Michel Dierks) Date: Tue, 23 Mar 2004 00:39:30 +0100 Subject: [Beowulf] Minimal OS Message-ID: <405F7932.20404@skynet.be> Hello, I?m a beginner in the Beowulf world. To achieve my school graduate I choose to make a Beowulf cluster. My cluster: 8 slaves: pc IBM 166 Mhz, 96 Mb ram, HD 2 Giga. 1 master: Dell PowerEdge 2200 bi processor 233 Mhz, 320 Mb ram, 3 SCSI HD (9.1, 2.1 and 2.1 Giga). 1 switch 10/100 Ethernet. The application must calculate a mesh 2D for a research over stream in fluid mechanics. I must use the MPI library for communication and PARMS for the calculation. This application will be developed in C. The operating system is the Red Hat distribution 9.0. My question is: for the slave pc?s , which is the minimal operating system to install. (Kernell + which package?). Thank you. Michel D. Belgium _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 18:01:00 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 15:01:00 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <1079996184.4352.14.camel@pel> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> <1079996184.4352.14.camel@pel> Message-ID: <200403221501.00766.mwill@penguincomputing.com> I agree that rather than compiling your own MPICH you should try to make it work with the existing one. However 1) there is no source 2) the binary is statically linked. 3) Scyld does have an mpirun which should set the enviromentvariables right. The right attempt is to make it use bpsh instead of rsh or ssh. I saw that some of the calls are done with shell scripts, which might be a way to fix it as well if the enviroment variables don't help. 
Michael On Monday 22 March 2004 02:56 pm, Sean Dilda wrote: > On Mon, 2004-03-22 at 15:15, desi star wrote: > > Hi Mike, > > > > Thanks much for responding. Jaguar is indeed staticaly > > linked to the MPICH libraries as per manuals. When I > > ran the ldd commands as you suggested: > > > > -- > > $ ldd Jaguar > > not a dynamic executable > > $ > > -- > > > > Thats why the very first step sugested in the Jaguar > > installation is to build and configure MPICH from the > > start. Where do I go from here? > > I'm not familiar with Jaguar, but I am somewhat familiar with Scyld. I > believe you are taking the wrong approach with this. > > Even though Jaguar says you should start with building mpich, I don't > think that's what you want to do. You almost certainly want to stick > with the MPICH binaries that were provided by Scyld. First make sure > there is no confusion and remove the copy of mpich that you built. Next > make sure the mpich and mpich-devel packages are installed on your > system. 'rpm -q mpich ; rpm -q mpich-devel' should tell you this. If > they're not, 'rpm -i mpich-XXXXX.rpm' should install the package. You > can find the packages on your Scyld cd(s). > > Once you have those packages installed, then attempt to compile jaguar. > It should link against Scyld's copy of mpich and just work. I suggest > following Scyld's instructions for running mpich jobs, not Jaguars. > Scyld has made adjustments to their copy of MPICH that make it work > right on their system. In the process they also change the way jobs are > launched. So Scyld may not have 'mpirun', but has a better way to start > the job. > > As Michael pointed out, Scyld's version of MPICH doesn't require rsh, > ssh, or anything like it. So your questions along those lines are > somewhat moot. > > > Sean -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 17:11:42 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 14:11:42 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <200403221411.42975.mwill@penguincomputing.com> Another idea - make it use bpsh by setting export P4_RSHCOMMAND="bpsh" or set it to use some shell script of yours that massages its parameters into the format bpsh expects. bpsh will start a process without requiring rsh or ssh, using Scylds bproc support. Michael. -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. 
Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Mon Mar 22 17:56:24 2004 From: agrajag at dragaera.net (Sean Dilda) Date: Mon, 22 Mar 2004 17:56:24 -0500 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <1079996184.4352.14.camel@pel> On Mon, 2004-03-22 at 15:15, desi star wrote: > Hi Mike, > > Thanks much for responding. Jaguar is indeed staticaly > linked to the MPICH libraries as per manuals. When I > ran the ldd commands as you suggested: > > -- > $ ldd Jaguar > not a dynamic executable > $ > -- > > Thats why the very first step sugested in the Jaguar > installation is to build and configure MPICH from the > start. Where do I go from here? > I'm not familiar with Jaguar, but I am somewhat familiar with Scyld. I believe you are taking the wrong approach with this. Even though Jaguar says you should start with building mpich, I don't think that's what you want to do. You almost certainly want to stick with the MPICH binaries that were provided by Scyld. First make sure there is no confusion and remove the copy of mpich that you built. Next make sure the mpich and mpich-devel packages are installed on your system. 'rpm -q mpich ; rpm -q mpich-devel' should tell you this. If they're not, 'rpm -i mpich-XXXXX.rpm' should install the package. You can find the packages on your Scyld cd(s). Once you have those packages installed, then attempt to compile jaguar. It should link against Scyld's copy of mpich and just work. I suggest following Scyld's instructions for running mpich jobs, not Jaguars. Scyld has made adjustments to their copy of MPICH that make it work right on their system. In the process they also change the way jobs are launched. So Scyld may not have 'mpirun', but has a better way to start the job. As Michael pointed out, Scyld's version of MPICH doesn't require rsh, ssh, or anything like it. So your questions along those lines are somewhat moot. Sean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Mon Mar 22 21:24:16 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Mon, 22 Mar 2004 18:24:16 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <000e01c40772$2611bf60$36a8a8c0@LAPTOP152422> Message-ID: Two weeks ago, I asked about power consumption for dual opteron systems. This is summary of the numbers I saw posted here. 237 idle to 280 loaded for a dual 248 with two SCSI drives from Bill Broadley 250 loaded for a dual 240 from Mark Hahn 182 loaded for a dual 242 from Robert G. Brown The 182 numbers seems to be too low, but it would be nice to have some other data points. Combine fewer fans, less memory, lower power or no harddrive, more efficient power supply, and less load on the CPU, and you could see 182 vs 250 watts I think. 
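As a rough back-of-the-envelope use of these figures, the arithmetic for a small rack looks like the sketch below; the 16-node count and the 208 V circuit are assumptions chosen for illustration, not numbers reported in the thread.

/* powercalc.c -- rough cluster power/cooling arithmetic (illustrative assumptions only). */
#include <stdio.h>

int main(void)
{
    const int    nodes        = 16;     /* assumed rack of dual-Opteron nodes */
    const double watts_loaded = 250.0;  /* per-node draw, from the loaded figures above */
    const double volts        = 208.0;  /* assumed circuit voltage */
    const double btu_per_watt = 3.412;  /* 1 W is about 3.412 BTU/hr of heat to remove */

    double total_w = nodes * watts_loaded;
    printf("total draw : %.0f W (about %.1f A at %.0f V)\n", total_w, total_w / volts, volts);
    printf("cooling    : about %.0f BTU/hr\n", total_w * btu_per_watt);
    return 0;
}

With those assumptions the 182 W versus 250 W spread is roughly a kilowatt per rack, which is why the low outlier is worth pinning down.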
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kfarmer at linuxhpc.org Mon Mar 1 09:51:39 2004 From: kfarmer at linuxhpc.org (Kenneth Farmer) Date: Mon, 1 Mar 2004 09:51:39 -0500 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds References: <20040301131628.90307.qmail@web16813.mail.tpe.yahoo.com> Message-ID: <097701c3ff9c$b465fe30$1601a8c0@deskpro> ----- Original Message ----- From: "Andrew Wang" To: Sent: Monday, March 01, 2004 8:16 AM Subject: Re: [Beowulf] SCI Socket latency: 2.27 microseconds > > In Dolphin testing, the > > lowest latency was > > achieved using AMD Opteron (X86_64) processors. > > No wonder Intel killed IA64 and released 64-bit x86 > (aka IA32e) a week or two ago... > > Andrew. Intel killed IA64? Where did you come up with that? -- Kenneth Farmer <>< LinuxHPC.org _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Mar 1 11:35:56 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 1 Mar 2004 11:35:56 -0500 (EST) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: well, this is interesting. it appears that AMD has given all interconnect vendors a boost, since Myri and Quadrics seem to like Opterons as well ;) > "This is the lowest latency socket solution available today," said > Hugo Kohmann, CTO for Dolphin Interconnect. "SCI Socket opens new well, Quadrics now claims 1.8 us MPI latency: http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD it's interesting that SCI is still on 64x66 PCI - it would be very interesting to know how many and what kinds of codes really require higher bandwidth. yes, some vendors (esp IB) are pushing PCI-express as bandwith salvation, but afaikt, none of my users need even >500 MB/s today. it doesn't seem like PCI-express will be any kind of major win in small-packet latency... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From csamuel at vpac.org Mon Mar 1 17:09:55 2004 From: csamuel at vpac.org (Chris Samuel) Date: Tue, 2 Mar 2004 09:09:55 +1100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <097701c3ff9c$b465fe30$1601a8c0@deskpro> References: <20040301131628.90307.qmail@web16813.mail.tpe.yahoo.com> <097701c3ff9c$b465fe30$1601a8c0@deskpro> Message-ID: <200403020910.02925.csamuel@vpac.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Tue, 2 Mar 2004 01:51 am, Kenneth Farmer wrote: > From: "Andrew Wang" > > > No wonder Intel killed IA64 and released 64-bit x86 > > (aka IA32e) a week or two ago... > > Intel killed IA64? Where did you come up with that? Intel certainly haven't announced the death of Itanium, but you've got to wonder about its long term future when Intel start producing 64-bit AMD compatible chips. Also see [1] below.
This is more the question of what will the market do when choosing between them, especially as HPC is only really a niche (though a fairly high spending one) compared to the general computing market. The big advantage AMD have is that "legacy" 32-bit apps will be around for a long long time to come (look at the mass clamour for MS to continue supporting Win98, something they'd hoped would be dead a long time ago) and that gives the hybrids a big advantage in the general market. I guess it comes down to a business decision on Intel as to whether they feel the demand for Itanium is enough to justify its continued development. Note that I'm not saying the demand per se isn't there, I've got absolutely no idea on the matter! cheers, Chris [1] - for those who haven't seen it, here's Linus's response to the launch: http://kerneltrap.org/node/view/2466 - -- Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin Victorian Partnership for Advanced Computing http://www.vpac.org/ Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQFAQ7S2O2KABBYQAh8RAlA/AJ4yzNxJcXZc3e8I8CtYjgScQOCpUwCfdVzF lpG7iEOXSo3+xAK73kNb9c0= =eYRs -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Mon Mar 1 16:38:52 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Mon, 1 Mar 2004 13:38:52 -0800 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: <20040301213852.GA28803@cse.ucdavis.edu> > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD Note the title says "sub 2us" and the body says "close to" 1.8us. Of more interest (to me) is that further down they say: In the next quarter, Quadrics will announce a series of highly competitive switch configurations making QsNetII more cost-effective for medium sized cluster configuration deployment. Sounds like more competition for IB, Myrinet and Dolphin. Hopefully anyways. Cool, found a quadrics price list: http://doc.quadrics.com/Quadrics/QuadricsHome.nsf/DisplayPages/A3EE4AED738B6E2480256DD30057B227 http://tinyurl.com/2sn2b Looks like $3k per node or so for 64, and $4k per node for 1024, I'm guessing that is list price and is somewhat negotiable. According to my sc2003 notes the Quadrics latency was: 100ns for the sending elan4 300ns for the 128 node switch and 20 meters of cable 130ns for the receiving card. 2420ns for two trips across the PCI-X bus and a main memory write ================ 2950ns for an mpi message between 2 nodes. Anyone know what changes to get this number down to 1.8us - 2.0us? > higher bandwidth. yes, some vendors (esp IB) are pushing PCI-express > as bandwith salvation, but afaikt, none of my users need even >500 MB/s > today. it doesn't seem like PCI-express will be any kind of major win > in small-packet latency... Anyone have an expected timetable for PCI-express connected interconnect cards? Anyone have projected PCI-express latencies vs PCI-X (133 MHz/64 bit)? 
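For anyone wanting to reproduce numbers of this kind on their own interconnect, a minimal MPI ping-pong in the spirit of the mping run quoted a little later in this thread by Dan Kidger might look like the sketch below. It is illustrative only, not the actual mping source; it reports half the averaged round-trip time for a one-byte message, the same convention described there.

/* pingpong_mpi.c -- one-byte MPI ping-pong latency (illustrative sketch only).
 * Build: mpicc -O2 -o pingpong_mpi pingpong_mpi.c
 * Run:   mpirun -np 2 ./pingpong_mpi   (or prun -N2 on Quadrics systems)
 */
#include <mpi.h>
#include <stdio.h>

#define REPS 1000

int main(int argc, char **argv)
{
    int rank, i;
    char byte = 0;
    double t0, t1;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < REPS; i++) {
        if (rank == 0) {                       /* send the ping, wait for the echo */
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &st);
        } else if (rank == 1) {                /* wait for the ping, echo it back */
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("1 byte: %.2f usec one-way (avg over %d round trips)\n",
               (t1 - t0) * 1e6 / (2.0 * REPS), REPS);

    MPI_Finalize();
    return 0;
}

On a commodity gigabit Ethernet cluster a sketch like this typically reports tens of microseconds, which is the gap the SCI, Myrinet, Quadrics and InfiniBand numbers in this thread are measured against.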
-- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From patrick at myri.com Mon Mar 1 17:40:55 2004 From: patrick at myri.com (Patrick Geoffray) Date: Mon, 01 Mar 2004 17:40:55 -0500 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: Message-ID: <4043BBF7.9090706@myri.com> Mark Hahn wrote: > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B398280256E44005A31DD Hum, this one claims "under 3us": http://doc.quadrics.com/quadrics/QuadricsHome.nsf/PageSectionsByName/F6E4FE91508A319580256D5900447E40/$File/QsNetII+Performance+Evaluation+ltr.pdf Maybe the 1.8us is a one-sided MPI latency, aka a PUT ? Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Mon Mar 1 18:47:33 2004 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Mon, 1 Mar 2004 23:47:33 +0000 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <20040301213852.GA28803@cse.ucdavis.edu> References: <20040301.094532.17863925.pegu@dolphinics.no> <20040301213852.GA28803@cse.ucdavis.edu> Message-ID: <200403012347.33322.daniel.kidger@quadrics.com> On Monday 01 March 2004 9:38 pm, Bill Broadley wrote: > > well, Quadrics now claims 1.8 us MPI latency: > > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B > >398280256E44005A31DD > > Note the title says "sub 2us" and the body says "close to" 1.8us. Just ran this: [dan at opteron0]$ mpicc mping.c -o mping; prun -N2 ./mping 1: 0 bytes 1.80 uSec 0.00 MB/s This is a simple bit of MPI: proc 1 posts an MPI_Recv, proc 0 then does an MPI_Send, then proc 1 does an MPI_Send and proc 0 an MPI_Recv. The latency printed is half the round trip averaged over, say, 1000 passes. This is for Opteron - it seems to have the best PCI-X implementation we have seen. Latency on IA64 is a little higher - say 2.61uSec on one platform I have just tried. MPI performance has also improved over time as we have tuned the DMA/PIO writes, etc. in the device drivers. > Of more interest (to me) is that further down they say: > In the next quarter, Quadrics will announce a series of highly competitive > switch configurations making QsNetII more cost-effective for medium > sized cluster configuration deployment. yep - yet to be announced officially - but as you might expect this revolves around introducing a wider range of smaller switch chassis and configurations. -- Yours, Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd.
daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com -------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Mon Mar 1 19:37:11 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Mon, 1 Mar 2004 19:37:11 -0500 (EST) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: <200403020910.02925.csamuel@vpac.org> Message-ID: > > > No wonder Intel killed IA64 and released 64-bit x86 > > > (aka IA32e) a week or two ago... > > > > Intel killed IA64? Where did you come up with that? > > Intel certainly haven't announced the death of Itanium, but you've got to > wonder about its long term future when Intel start producing 64-bit AMD > compatible chips. Also see [1] below. bah. buying chips based on their address register width makes about as much sense as buying based on clock. yes, some people have good reason to be excited about 64b hitting the mass market. but that number is quite small - how many machines do you have with >4 GB per cpu? remember, Intel has always said that 64b wasn't terribly important for anything except the "enterprise" (mauve has more ram) market (mainframe recidivists). I think they're right, but should have also adopted AMD's cpu-integrated memory controller. > I guess it comes down to a business decision on Intel as to whether they feel > the demand for Itanium is enough to justify its continued development. maybe instead of a bazillion bytes of cache on the next it2, Intel will just drop in a P4 or two ;) regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From pegu at dolphinics.no Tue Mar 2 02:59:04 2004 From: pegu at dolphinics.no (Petter Gustad) Date: Tue, 02 Mar 2004 08:59:04 +0100 (CET) Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: <20040301.094532.17863925.pegu@dolphinics.no> Message-ID: <20040302.085904.68044976.pegu@dolphinics.no> From: Mark Hahn Subject: Re: [Beowulf] SCI Socket latency: 2.27 microseconds Date: Mon, 1 Mar 2004 11:35:56 -0500 (EST) > > "This is the lowest latency socket solution available today," said > > Hugo Kohmann, CTO for Dolphin Interconnect. "SCI Socket opens new > > well, Quadrics now claims 1.8 us MPI latency: This is excellent MPI latency. However, the quoted 2.27 ?s latency was for the *socket* library. Latency using the Dolphin SISCI library is 1.4 ?s. See also: http://www.dolphinics.no/products/benchmarks.html Petter _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joachim at ccrl-nece.de Tue Mar 2 03:36:42 2004 From: joachim at ccrl-nece.de (Joachim Worringen) Date: Tue, 2 Mar 2004 09:36:42 +0100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds In-Reply-To: References: Message-ID: <200403020936.42553.joachim@ccrl-nece.de> Mark Hahn: > > "This is the lowest latency socket solution available today," said > > Hugo Kohmann, CTO for Dolphin Interconnect. 
"SCI Socket opens new > > well, Quadrics now claims 1.8 us MPI latency: > http://doc.quadrics.com/quadrics/QuadricsHome.nsf/NewsByDate/5D7D7444A42B39 >8280256E44005A31DD > > it's interesting that SCI is still on 64x66 PCI - it would be very > interesting to know how many and what kinds of codes really require [..] A large fraction of the latency does indeed stem from the two PCI-buses that need to be crossed. For that reason, Dolphin would certainly get an additional latency decrease when running on a 133MHz bus. I guess they have this in the pipeline. Joachim -- Joachim Worringen - NEC C&C research lab St.Augustin fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sfr at foobar-cpa.de Tue Mar 2 04:40:46 2004 From: sfr at foobar-cpa.de (Friedrich Seifert) Date: Tue, 02 Mar 2004 10:40:46 +0100 Subject: [Beowulf] SCI Socket latency: 2.27 microseconds Message-ID: <4044569E.9010803@foobar-cpa.de> Bogdan Costescu wrote: > On Mon, 1 Mar 2004, Petter Gustad wrote: > > >>Dolphin has benchmarked a completed one byte socket send/socket >>receive latency at 2.27 microseconds, > > > Is this in polling mode or interrupt-driven ? I'm interested to see if > I can do something useful (like computation) _and_ get such low > latency. Actually, SCI SOCKET uses a combination of both, it polls for a configurable amount of time, and if nothing arrives meanwhile, waits for an interrupt. Something like that is necessary since the current Linux interrupt processing and wake up mechanism is quite slow and unpredictable. There is a promising project going on to provide real time interrupt capability, but it is still in an early stage (http://lwn.net/Articles/65710/) >>Benchmarks using Netperf also show more than 255 MBytes (2,035 >>Megabits/s) sustained throughput using standard TCP STREAM sockets. > > > What is the CPU usage for this throughput ? SCI SOCKET was run in PIO mode for this test, so one CPU is needed to transfer the data. Current DMA performance is lower, but is subject to optimization in future revisions. CPU usage for DMA is 8%/29% at sender/receiver. Regards, Friedrich -- Dipl.-Inf. Friedrich Seifert - foobar GmbH Phone: +49-371-5221-157 Email: sfr at foobar-cpa.de Mobil: +49-172-3740089 Web: http://www.foobar-cpa.de _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Mon Mar 1 20:57:48 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Mon, 1 Mar 2004 17:57:48 -0800 Subject: [Beowulf] advantages of this particular 64-bit chip In-Reply-To: References: <200403020910.02925.csamuel@vpac.org> Message-ID: <20040302015748.GA6730@greglaptop.internal.keyresearch.com> On Mon, Mar 01, 2004 at 07:37:11PM -0500, Mark Hahn wrote: > bah. buying chips based on their address register width makes > about as much sense as buying based on clock. yes, some people have > good reason to be excited about 64b hitting the mass market. but > that number is quite small - how many machines do you have with > >4 GB per cpu? Don't forget that "64 bits", in this case means "wider GPRs, and twice as many, plus a better ABI." These are substantial wins on many codes, even on machines with small memories. 
Bignums are a well known example, but there are far more general-purpose examples. For example, with the PathScale compilers on the Opteron, we find that only 1 of the SPECfp benchmarks and 3 of the SPECint benchmarks run faster in 32-bit mode than 64-bit mode -- keeping in mind that 64-bit mode features longer instructions and bigger pointers and longs. (This is our alpha 32-bit mode vs. our beta 64-bit mode, so this answer will change a little by the time both are production quality.) So yes, there's a reason to buy Opteron and IA32e chips beyond the address width: more bang for the buck. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Mar 2 08:59:55 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 2 Mar 2004 08:59:55 -0500 (EST) Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: <20040302132333.GA3957@mikee.ath.cx> Message-ID: On Tue, 2 Mar 2004, Mike Eggleston wrote: > I realize this question is not specific to beowulf clusters... however, > at 9a I'm meeting with an upset user about a bunch of workstations > using serial termainals. Things don't happen as quickly as he wants: > setup, problem diagnosis, throughput, etc. What solutions can I present > for these problems (I realize this is just a quick summary!). Also, > the serial terminals are running at 9600 baud over sometimes 50 meters. > One table I found shows 60 meters is 2400 baud and 30 meters is 4800 > baud. I think this is part of the problem. It really shouldn't be, if the wiring is decent quality TP. Back in the old days, when our department was basically NOTHING but serial terminals running over TP down to a Sun 110 with a serial port expansion, we had lots of runs over 50 meters (probably some close to 100) without difficulty at 9600 baud. Keep the wires away from e.g. fluorescent lights (BIG problem), major power cables, or other sources of low frequency noise. Running parallel to a noise source over a long distance is where most crosstalk occurs -- try to cross wires at right angles. Conduit can help as it shields, as well, but our wires were basically thrown up in a drop ceiling haphazardly by "trained professionals" a.k.a. graduate students, faculty, and sometimes a shop/maintenance guy. > Possible solutions I have thought of: > > - user stops complaining and deals with the situation Always a popular one. To accomplish it you had better be prepared to use force. Bring duct tape to the meeting... > - put ethernet->serial converts at the terminals so the terminals are > on the network Sounds expensive. Of course, terminal servers themselves are typically pretty expensive, although we used to use them in the old days when we finally had more terminals than our server could manage even with expansions. And then workstations started getting cheaper and we converted over to workstations and ethernet and never looked back. How is it that you're still using terminals? I didn't know that terminals were still a viable option -- a cheap PC is less than what, $500 these days, and by the time you compare the cost of the terminal itself, the serial port terminal server, the serial wiring, and the incredible loss of productivity associated with using what amounts to a single, slow, tty interface they just don't sound cost effective. Not to mention maintenance, user complaints, and your time... 
> - put small VIA type boards whose image is loaded through tftp and > the serial terminals actually run from the via boards > - what else? Give the terminals to somebody you don't like, replace them with cheap diskless second hand PCs on ethernet running a stripped linux that basically provides either the standard set of Alt-Fx tty's in non-graphical mode or basic X and as many xterms as memory permits. Problem solved. In fact, depending on the applications being accessed and whether they CAN run locally, problem solved even better by running them locally and reducing demand on the network and servers. rgb > > Mike > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mikee at mikee.ath.cx Tue Mar 2 08:23:33 2004 From: mikee at mikee.ath.cx (Mike Eggleston) Date: Tue, 2 Mar 2004 07:23:33 -0600 Subject: [Beowulf] help! need thoughts/recommendations quickly Message-ID: <20040302132333.GA3957@mikee.ath.cx> I realize this question is not specific to beowulf clusters... however, at 9a I'm meeting with an upset user about a bunch of workstations using serial termainals. Things don't happen as quickly as he wants: setup, problem diagnosis, throughput, etc. What solutions can I present for these problems (I realize this is just a quick summary!). Also, the serial terminals are running at 9600 baud over sometimes 50 meters. One table I found shows 60 meters is 2400 baud and 30 meters is 4800 baud. I think this is part of the problem. Possible solutions I have thought of: - user stops complaining and deals with the situation - put ethernet->serial converts at the terminals so the terminals are on the network - put small VIA type boards whose image is loaded through tftp and the serial terminals actually run from the via boards - what else? Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mikee at mikee.ath.cx Tue Mar 2 09:08:56 2004 From: mikee at mikee.ath.cx (Mike Eggleston) Date: Tue, 2 Mar 2004 08:08:56 -0600 Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: References: <20040302132333.GA3957@mikee.ath.cx> Message-ID: <20040302140856.GA4615@mikee.ath.cx> On Tue, 02 Mar 2004, Robert G. Brown wrote: > On Tue, 2 Mar 2004, Mike Eggleston wrote: > > > I realize this question is not specific to beowulf clusters... however, > > at 9a I'm meeting with an upset user about a bunch of workstations > > using serial termainals. Things don't happen as quickly as he wants: > > setup, problem diagnosis, throughput, etc. What solutions can I present > > for these problems (I realize this is just a quick summary!). Also, > > the serial terminals are running at 9600 baud over sometimes 50 meters. > > One table I found shows 60 meters is 2400 baud and 30 meters is 4800 > > baud. I think this is part of the problem. > > It really shouldn't be, if the wiring is decent quality TP. 
Back in the > old days, when our department was basically NOTHING but serial terminals > running over TP down to a Sun 110 with a serial port expansion, we had > lots of runs over 50 meters (probably some close to 100) without > difficulty at 9600 baud. Keep the wires away from e.g. fluorescent > lights (BIG problem), major power cables, or other sources of low > frequency noise. Running parallel to a noise source over a long > distance is where most crosstalk occurs -- try to cross wires at right > angles. Conduit can help as it shields, as well, but our wires were > basically thrown up in a drop ceiling haphazardly by "trained > professionals" a.k.a. graduate students, faculty, and sometimes a > shop/maintenance guy. I know it should work and the old way it does work, but I've always seen problems with serial and printers. I much prefer getting away from them to full ethernet. > > Possible solutions I have thought of: > > > > - user stops complaining and deals with the situation > > Always a popular one. To accomplish it you had better be prepared to > use force. Bring duct tape to the meeting... This problem is happening in the warehouse, so there is lots of packing material and tape around. :) > > - put ethernet->serial converts at the terminals so the terminals are > > on the network > > Sounds expensive. Of course, terminal servers themselves are typically > pretty expensive, although we used to use them in the old days when we > finally had more terminals than our server could manage even with > expansions. And then workstations started getting cheaper and we > converted over to workstations and ethernet and never looked back. > > How is it that you're still using terminals? I didn't know that > terminals were still a viable option -- a cheap PC is less than what, > $500 these days, and by the time you compare the cost of the terminal > itself, the serial port terminal server, the serial wiring, and the > incredible loss of productivity associated with using what amounts to a > single, slow, tty interface they just don't sound cost effective. Not > to mention maintenance, user complaints, and your time... This is an application in the warehouse. We have many serial (dumb) terminals and printers. We are using 'Dorio's(?). Similiar to the Wyse 60. I've not used a dorio before, but wyse terminals lots. The application is all curses based and doesn't require much. The users are not even concerned about the speed of the application (display, etc.) just that the terminals are quick to setup and work all the time. > > - put small VIA type boards whose image is loaded through tftp and > > the serial terminals actually run from the via boards > > - what else? > > Give the terminals to somebody you don't like, replace them with cheap > diskless second hand PCs on ethernet running a stripped linux that > basically provides either the standard set of Alt-Fx tty's in > non-graphical mode or basic X and as many xterms as memory permits. > Problem solved. > > In fact, depending on the applications being accessed and whether they > CAN run locally, problem solved even better by running them locally and > reducing demand on the network and servers. I can use the terminals on the via boards and not have to replace them with crt monitors and keyboards, until they all start failing. I'd prefer to use the crt monitors through vga (fewer problems with linux and getty). Do you (anyone) know of a cheap motherboard that would do this? 
Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Tue Mar 2 10:44:47 2004 From: john.hearns at clustervision.com (John Hearns) Date: Tue, 2 Mar 2004 16:44:47 +0100 (CET) Subject: [Beowulf] help! need thoughts/recommendations quickly In-Reply-To: <20040302140856.GA4615@mikee.ath.cx> Message-ID: On Tue, 2 Mar 2004, Mike Eggleston wrote: > > Do you (anyone) know of a cheap motherboard that would do this? Sorry to sound like a Cyclades salesman, but from their webpages the Cyclades TS-100 would fit the bill. Plus lots of packing tape. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at demec.ufpe.br Tue Mar 2 15:51:05 2004 From: rbw at demec.ufpe.br (Ramiro Brito Willmersdorf) Date: Tue, 2 Mar 2004 17:51:05 -0300 Subject: [Beowulf] Invitation to Conference Message-ID: <20040302205105.GA30141@demec.ufpe.br> Dear Colleagues, The XXV CILAMCE (Iberian Latin American Congress on Computational Methods for Engineering) will be held from November 10th to the 12th at Recife, Brazil. This Congress will encompass more than 30 mini-symposia over a very wide range of multidisciplinary methods in engineering and applied sciences. Please check the congress home page (http://www.demec.ufpe.br/cilamce2004/) for more specific details. We would like to invite you to participate in the High Performance Computing mini-symposium. If you are interested, you should submit an abstract by March 29th, 2004. This is one of the most important conferences on this subject in South America, and top researchers from here and abroad will attend. On a personal note, we would like to tell you that Recife is one of the top touristic destinations in Brazil, with a very pleasant weather and very nice beaches. We are grateful for you attention are ask that this information be passed along to other people in your institution that may be interested. Many Thanks, A. L. G. Coutinho, COPPE/UFRJ, alvaro at nacad.ufrj.br R. B. Willmersdorf, DEMEC/UFPE, rbw at demec.ufpe.br -- Ramiro Brito Willmersdorf rbw at demec.ufpe.br GPG key: http://www.demec.ufpe.br/~rbw/GPG/gpg_key.txt _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Wed Mar 3 12:52:30 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Wed, 03 Mar 2004 09:52:30 -0800 Subject: [Beowulf] mpich program segfaults Message-ID: <40461B5E.6010003@cert.ucr.edu> Hi, Sorry if this is off topic. Anyway, I've got an mpich Fortran program I'm trying to get going, which produces a segmentation fault right at a subroutine call. I put a print statement right before and right after the call and when I run the program, I'm only seeing the one before. I've also put a print statement right at the beginning of the subroutine which is being called and never see that either. The real strange part is when I run this under a debugger, the program runs fine. So would anyone happen to have any insight to what's going on here? I'd really appriciate it. 
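One generic way to see where a crash like that lands when it only happens outside the debugger is to dump a backtrace from a SIGSEGV handler. A minimal sketch using glibc's execinfo interface (Linux; other systems have their own equivalents), with an alternate signal stack so the handler still runs if the fault is a blown stack:

  #include <execinfo.h>
  #include <signal.h>
  #include <string.h>
  #include <stdlib.h>
  #include <unistd.h>

  static char altstack[64 * 1024];

  /* Not strictly async-signal-safe, but good enough as a debugging
   * aid: print the call chain that led to the fault, then give up. */
  static void on_segv(int sig)
  {
      void *frames[64];
      int n;
      (void)sig;
      n = backtrace(frames, 64);
      backtrace_symbols_fd(frames, n, STDERR_FILENO);
      _exit(1);
  }

  static void install_segv_handler(void)
  {
      stack_t ss;
      struct sigaction sa;

      ss.ss_sp = altstack;
      ss.ss_size = sizeof altstack;
      ss.ss_flags = 0;
      sigaltstack(&ss, NULL);

      memset(&sa, 0, sizeof sa);
      sa.sa_handler = on_segv;
      sa.sa_flags = SA_ONSTACK;      /* run the handler on the alternate stack */
      sigaction(SIGSEGV, &sa, NULL);
  }

  int main(void)
  {
      install_segv_handler();
      volatile int *p = 0;
      *p = 42;                       /* deliberate fault to show the trace */
      return 0;
  }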
Thanks, Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sdutta at deas.harvard.edu Wed Mar 3 14:42:13 2004 From: sdutta at deas.harvard.edu (Suvendra Nath Dutta) Date: Wed, 3 Mar 2004 14:42:13 -0500 Subject: [Beowulf] mpich program segfaults In-Reply-To: <40461B5E.6010003@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> Message-ID: Glen, Does your program seg fault when compiled with debugging off or on? Sometimes compilers will initialize arrays when compiling for debugging, but not waste time doing that when compiled without debugging. Also if you compile with optimization which line follows which one isn't always clear. You want to make sure you aren't over-running memory. Because what you say sounds suspiciously like that. Also you want to be sure its nothing to do with MPICH. Try calling the subroutine from a serial program if possible. Suvendra. On Mar 3, 2004, at 12:52 PM, Glen Kaukola wrote: > Hi, > > Sorry if this is off topic. Anyway, I've got an mpich Fortran program > I'm trying to get going, which produces a segmentation fault right at > a subroutine call. I put a print statement right before and right > after the call and when I run the program, I'm only seeing the one > before. I've also put a print statement right at the beginning of the > subroutine which is being called and never see that either. The real > strange part is when I run this under a debugger, the program runs > fine. So would anyone happen to have any insight to what's going on > here? I'd really appriciate it. > > Thanks, > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Wed Mar 3 15:46:36 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Wed, 03 Mar 2004 12:46:36 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: <40461B5E.6010003@cert.ucr.edu> Message-ID: <4046442C.4090704@cert.ucr.edu> Suvendra Nath Dutta wrote: > Glen, > Does your program seg fault when compiled with debugging off or on? Either way. > Sometimes compilers will initialize arrays when compiling for > debugging, but not waste time doing that when compiled without debugging. The arguments being passed to the subroutine are two arrays of real numbers and a few integers. Nothing being passed to the subroutine has been dynamically allocated. The compiler, IBM's XLF compiler, initializes the array to 0. At least I'm pretty sure it does, since I can print things before the subroutine call. > Also if you compile with optimization which line follows which one > isn't always clear. I don't have any optimizations turned on. > You want to make sure you aren't over-running memory. The machine has 2 gigs of memory, which should be plenty. The same program runs on an x86 machine with 1 gig of memory just fine (I'm trying to get the program working on an Apple G5 by the way). > Also you want to be sure its nothing to do with MPICH. Try calling the > subroutine from a serial program if possible. 
I've tried telling mpirun to only use one cpu and I get the same results. I've also tried running the program all by itself and it still crashes. Like I said though, it runs just fine under the a debugger. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sdutta at deas.harvard.edu Thu Mar 4 06:26:49 2004 From: sdutta at deas.harvard.edu (Suvendra Nath Dutta) Date: Thu, 4 Mar 2004 06:26:49 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: Glen, I am sorry, I meant buffer-overrun instead of memory overrun. It is of course impossible to say, but you are describing a classic description of buffer overrun. Program seg-faulting, some where there shouldn't be a problem. This is usually because you've over run the array limits and are writing on the program space. Suvendra. On Wed, 3 Mar 2004, Glen Kaukola wrote: > Suvendra Nath Dutta wrote: > > > Glen, > > Does your program seg fault when compiled with debugging off or on? > > > Either way. > > > Sometimes compilers will initialize arrays when compiling for > > debugging, but not waste time doing that when compiled without debugging. > > > The arguments being passed to the subroutine are two arrays of real > numbers and a few integers. Nothing being passed to the subroutine has > been dynamically allocated. The compiler, IBM's XLF compiler, > initializes the array to 0. At least I'm pretty sure it does, since I > can print things before the subroutine call. > > > Also if you compile with optimization which line follows which one > > isn't always clear. > > > I don't have any optimizations turned on. > > > You want to make sure you aren't over-running memory. > > > The machine has 2 gigs of memory, which should be plenty. The same > program runs on an x86 machine with 1 gig of memory just fine (I'm > trying to get the program working on an Apple G5 by the way). > > > Also you want to be sure its nothing to do with MPICH. Try calling the > > subroutine from a serial program if possible. > > > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wseas at canada.com Thu Mar 4 09:34:46 2004 From: wseas at canada.com (WSEAS Newsletter on MECHANICAL ENGINEERING) Date: Thu, 4 Mar 2004 16:34:46 +0200 Subject: [Beowulf] WSEAS NEWSLETTER in MECHANICAL ENGINEERING Message-ID: <3FE20F40001FB40E@fesscrpp1.tellas.gr> (added by postmaster@fesscrpp1.tellas.gr) If you want to contact us, the Subject of your email must contains the code: WSEAS CALL FOR PAPERS -- CALL FOR REVIEWERS -- CALL FOR SPECIAL SESSIONS wseas at canada.com http://wseas.freeservers.com **************************************************************** Udine, Italy, March 25-27, 2004: IASME/WSEAS 2004 Int.Conf. 
on MECHANICS and MECHATRONICS **************************************************************** Miami, Florida, USA, April 21-23, 2004 5th WSEAS International Conference on APPLIED MATHEMATICS (SYMPOSIA on: Linear Algebra and Applications, Numerical Analysis and Applications, Differential Equations and Applications, Probabilities, Statistics, Operational Research, Optimization, Algorithms, Discrete Mathematics, Systems, Communications, Control, Computers, Education) **************************************************************** Corfu Island, Greece, August 17-19, 2004 WSEAS/IASME Int.Conf. on FLUID MECHANICS WSEAS/IASME Int.Conf. on HEAT and MASS TRANSFER ********************************************************** Vouliagmeni, Athens, Greece, July 12-13, 2004 WSEAS ELECTROSCIENCE AND TECHNOLOGY FOR NAVAL ENGINEERING and ALL-ELECTRIC SHIP ********************************************************** Copacabana, Rio de Janeiro, Brazil, October 12-15, 2004 3rd WSEAS Int.Conf. on INFORMATION SECURITY, HARDWARE/SOFTWARE CODESIGN and COMPUTER NETWORKS (ISCOCO 2004) 3rd WSEAS Int. Conf. on APPLIED MATHEMATICS and COMPUTER SCIENCE (AMCOS 2004) 3rd WSEAS Int.Conf. on SYSTEM SCIENCE and ENGINEERING (ICOSSE 2004) 4th WSEAS Int.Conf. on POWER ENGINEERING SYSTEMS (ICOPES 2004) **************************************************************** Cancun, Mexico, May 12-15, 2004 6th WSEAS Int.Conf. on ALGORITHMS, SCIENTIFIC COMPUTING, MODELLING AND SIMULATION (ASCOMS '04) ********************************************************** NOTE THAT IN WSEAS CONFERENCES YOU CAN HAVE PROCEEDINGS 1) HARD COPY 2) CD-ROM and 3) Web Publishing SELECTED PAPERS are also published (after further review) * as regular papers in WSEAS TRANSACTIONS (Journals) or * as Chapters in WSEAS Book Series. WSEAS Books, Journals, Proceedings participate now in all major science citation indexes. ISI, ELSEVIER, CSA, AMS. Mathematical Reviews, ELP, NLG, Engineering Index Directory of Published Proceedings, INSPEC (IEE) Thanks Alexis Espen WSEAS NEWSLETTER in MECHANICAL ENGINEERING wseas at canada.com http://wseas.freeservers.com ##### HOW TO UNSUBSCRIBE #### You receive this newsletter from your email address: beowulf at beowulf.org If you want to unsubscribe, send an email to: wseas at canada.com The Subject of your message must be exactly: REMOVE beowulf at beowulf.org WSEAS If you want to unsubscribe more than one email addresses, send a message to nata at wseas.org with Subject: REMOVE [email1, emal2, ...., emailn] WSEAS _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From robl at mcs.anl.gov Thu Mar 4 13:46:26 2004 From: robl at mcs.anl.gov (Robert Latham) Date: Thu, 4 Mar 2004 12:46:26 -0600 Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: <20040304184626.GA2746@mcs.anl.gov> On Wed, Mar 03, 2004 at 12:46:36PM -0800, Glen Kaukola wrote: > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. since you see this crash when the program runs by itself, try running under a memory checker (valgrid is good and free, also purify, insure++...). 
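For anyone who has not tried one of those tools: the kind of bug they pinpoint immediately is an out-of-bounds write that silently corrupts a neighbouring allocation. A contrived C sketch, nothing to do with the code being discussed:

  #include <stdlib.h>

  int main(void)
  {
      double *a = malloc(100 * sizeof *a);
      /* off-by-one: valid indices are 0..99, the last iteration writes
       * one element past the end of the allocation */
      for (int i = 0; i <= 100; i++)
          a[i] = 0.0;
      free(a);
      return 0;
  }

  /* gcc -g overrun.c -o overrun && valgrind ./overrun
   * reports an "Invalid write of size 8" with the offending line. */

The report points at the bad write itself, even when the visible crash happens somewhere else entirely.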
==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Thu Mar 4 14:32:12 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Thu, 4 Mar 2004 11:32:12 -0800 (PST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> Message-ID: <20040304193213.411.qmail@web11407.mail.yahoo.com> Then run the program by hand, and attach a debugger... Rayson --- Glen Kaukola wrote: > Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Thu Mar 4 13:45:25 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Thu, 04 Mar 2004 10:45:25 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: <40461B5E.6010003@cert.ucr.edu> <4046442C.4090704@cert.ucr.edu> Message-ID: <40477945.9090808@cert.ucr.edu> Suvendra Nath Dutta wrote: >Glen, > I am sorry, I meant buffer-overrun instead of memory overrun. It >is of course impossible to say, but you are describing a classic >description of buffer overrun. Program seg-faulting, some where there >shouldn't be a problem. This is usually because you've over run the array >limits and are writing on the program space. > > Ok, but simply calling a subroutine shouldn't cause a buffer overrun should it? Especially when none of the arguments being passed to the subroutine are dynamically allocated. I'm beginning to suspect it's a problem with the compiler actually. Maybe the stack that holds subroutine arguments isn't big enough. And when my problematic subroutine call is 4 levels deep or so like it is, then there isn't enough room on the stack for it's arguments. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at linux-mag.com Thu Mar 4 17:34:29 2004 From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine) Date: Thu, 4 Mar 2004 17:34:29 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4046442C.4090704@cert.ucr.edu> Message-ID: What type of machine is this? Doug On Wed, 3 Mar 2004, Glen Kaukola wrote: > Suvendra Nath Dutta wrote: > > > Glen, > > Does your program seg fault when compiled with debugging off or on? > > > Either way. > > > Sometimes compilers will initialize arrays when compiling for > > debugging, but not waste time doing that when compiled without debugging. > > > The arguments being passed to the subroutine are two arrays of real > numbers and a few integers. Nothing being passed to the subroutine has > been dynamically allocated. The compiler, IBM's XLF compiler, > initializes the array to 0. 
At least I'm pretty sure it does, since I > can print things before the subroutine call. > > > Also if you compile with optimization which line follows which one > > isn't always clear. > > > I don't have any optimizations turned on. > > > You want to make sure you aren't over-running memory. > > > The machine has 2 gigs of memory, which should be plenty. The same > program runs on an x86 machine with 1 gig of memory just fine (I'm > trying to get the program working on an Apple G5 by the way). > > > Also you want to be sure its nothing to do with MPICH. Try calling the > > subroutine from a serial program if possible. > > > I've tried telling mpirun to only use one cpu and I get the same > results. I've also tried running the program all by itself and it still > crashes. Like I said though, it runs just fine under the a debugger. > > > Glen > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- ---------------------------------------------------------------- Editor-in-chief ClusterWorld Magazine Desk: 610.865.6061 Cell: 610.390.7765 Redefining High Performance Computing Fax: 610.865.6618 www.clusterworld.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From smcdaniel at kciinc.net Thu Mar 4 13:59:37 2004 From: smcdaniel at kciinc.net (smcdaniel) Date: Thu, 4 Mar 2004 12:59:37 -0600 Subject: [Beowulf] mpich program segfaults (Glen Kaukola) Message-ID: <002501c4021a$d77830c0$2a01010a@kciinc.local> Physical memory errors could be the problem if they occur between the pointer and offset of your array location in the stack. Other than that I would suspect a buffer overrun that Suvendra Nath Dutta mentioned. Sam McDaniel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Thu Mar 4 19:48:21 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Thu, 04 Mar 2004 16:48:21 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: References: Message-ID: <4047CE55.6010300@cert.ucr.edu> Douglas Eadline, Cluster World Magazine wrote: >What type of machine is this? > > An Apple G5. And actually I've figured out what's wrong. Sorta. =) I replaced my problematic subroutine with a dummy subroutine that contains nothing but variable declarations and a print statement. This still caused a segmentation fault. So I commented pretty much everything out. No segmentation fault. Alright then. I slowly added it all back in, checking each time to see if I got a segmentation fault. And now I'm down to 4 variable declarations that are causing a problem: REAL ZFGLURG ( NCOLS,NROWS,0:NLAYS ) INTEGER ICASE( NCOLS,NROWS,0:NLAYS ) REAL THETAV( NCOLS,NROWS,NLAYS ) REAL ZINT ( NCOLS,NROWS,NLAYS ) If I uncomment any one of those, I get a segmentation fault again. But it still doesn't make any sense. First of all, there are variable declarations almost exactly like the ones I listed and those don't cause a problem. I also made a small test case that called my dummy subroutine and that worked just fine. I then commented out everything but the problematic variable declarations I listed above and that worked just fine. 
I tried changing the variable names but that didn't seem to make a difference, as I still got a segmentation fault. So I have no idea what the heck is going on. I think I need to tell my boss we need to give up on G5's. Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Thu Mar 4 20:05:28 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Fri, 5 Mar 2004 09:05:28 +0800 (CST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <40477945.9090808@cert.ucr.edu> Message-ID: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> the default stack size on OSX is 512 KB, try to increase it to 64MB, I encountered this problem before. Andrew. --- Glen Kaukola ????> Suvendra Nath Dutta wrote: > > >Glen, > > I am sorry, I meant buffer-overrun instead of > memory overrun. It > >is of course impossible to say, but you are > describing a classic > >description of buffer overrun. Program > seg-faulting, some where there > >shouldn't be a problem. This is usually because > you've over run the array > >limits and are writing on the program space. > > > > > > Ok, but simply calling a subroutine shouldn't cause > a buffer overrun > should it? Especially when none of the arguments > being passed to the > subroutine are dynamically allocated. I'm beginning > to suspect it's a > problem with the compiler actually. Maybe the stack > that holds > subroutine arguments isn't big enough. And when my > problematic > subroutine call is 4 levels deep or so like it is, > then there isn't > enough room on the stack for it's arguments. > > Glen > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at linux-mag.com Thu Mar 4 21:46:16 2004 From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine) Date: Thu, 4 Mar 2004 21:46:16 -0500 (EST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <4047CE55.6010300@cert.ucr.edu> Message-ID: Don't give up on the G5 just yet. Sounds like to me you may be stepping on some memory somehow. Which means the crash occurs at that particular spot in the code, but the cause of the crash probably is occurring somewhere else in the program. There are "simple" several you can do to collect evidence that may help you solve this "crime". (this is detective work by the way) First, this sounds like the kind of thing that happens in C programs. Is it pure Fortran? What version of MPICH? 1) try another compiler, if you are lucky it will find the problem. It may also work, in which case you will want to blame the first compiler, don't, because that is probably not the case. The new compiler probably lays out the memory different than the first one and you just got lucky. 2) run your code on another architecture. 3) try another MPI (LAM?) I am sure there are more, but not knowing the particulars, I can not suggest anything else. 
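Worth adding to that list: Andrew's stack-size suggestion above fits Glen's four declarations exactly. They are local to the subroutine, and with many compilers' defaults large local arrays live on the stack, so the routine can fault the instant it is entered, before any executable statement runs. A minimal C analogue (the dimensions and the 512 KB figure are only illustrative):

  #include <stdio.h>

  /* Rough C analogue of a Fortran routine with big local arrays: the
   * frame for work[] is automatic (stack) storage, so with a small
   * stack limit -- 512 KB by default on OS X, per Andrew -- the fault
   * happens on entry to the routine.  That is why a print right before
   * the CALL appears and a print inside the routine never does. */
  static void big_locals(void)
  {
      double work[200 * 200 * 30];   /* ~9.6 MB of automatic storage */
      work[0] = 1.0;
      printf("entered: %f\n", work[0]);
  }

  int main(void)
  {
      printf("about to call\n");
      big_locals();                  /* dies here if the limit is too small */
      printf("returned\n");
      return 0;
  }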
Doug On Thu, 4 Mar 2004, Glen Kaukola wrote: > Douglas Eadline, Cluster World Magazine wrote: > > >What type of machine is this? > > > > > > An Apple G5. > > And actually I've figured out what's wrong. Sorta. =) > > I replaced my problematic subroutine with a dummy subroutine that > contains nothing but variable declarations and a print statement. This > still caused a segmentation fault. So I commented pretty much > everything out. No segmentation fault. Alright then. I slowly added > it all back in, checking each time to see if I got a segmentation fault. > > And now I'm down to 4 variable declarations that are causing a problem: > REAL ZFGLURG ( NCOLS,NROWS,0:NLAYS ) > INTEGER ICASE( NCOLS,NROWS,0:NLAYS ) > REAL THETAV( NCOLS,NROWS,NLAYS ) > REAL ZINT ( NCOLS,NROWS,NLAYS ) > > If I uncomment any one of those, I get a segmentation fault again. > > But it still doesn't make any sense. First of all, there are variable > declarations almost exactly like the ones I listed and those don't cause > a problem. I also made a small test case that called my dummy > subroutine and that worked just fine. I then commented out everything > but the problematic variable declarations I listed above and that worked > just fine. I tried changing the variable names but that didn't seem to > make a difference, as I still got a segmentation fault. So I have no > idea what the heck is going on. I think I need to tell my boss we need > to give up on G5's. > > > Glen > -- ---------------------------------------------------------------- Editor-in-chief ClusterWorld Magazine Desk: 610.865.6061 Cell: 610.390.7765 Redefining High Performance Computing Fax: 610.865.6618 www.clusterworld.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mathiasbrito at yahoo.com.br Fri Mar 5 08:43:33 2004 From: mathiasbrito at yahoo.com.br (=?iso-8859-1?q?Mathias=20Brito?=) Date: Fri, 5 Mar 2004 10:43:33 -0300 (ART) Subject: [Beowulf] Benchmarking with HPL Message-ID: <20040305134333.90538.qmail@web12201.mail.yahoo.com> Hello, I'm benchmarking my cluster with HPL, the cluster have 16 nodes, 8 nodes athlon 1600+ with 512MB RAM and 20GB H.D. , and 8 nodes athlan 1700+ with 512MB RAM and 20GB, all with a 100Mbit fast ethernet linked in a switch. Well, the problem is, what the best setup for the HPL.dat, to obtain the maximum performance of the cluster? Mathias ===== Mathias Brito Universidade Estadual de Santa Cruz - UESC Departamento de Ci?ncias Exatas e Tecnol?gicas Estudante do Curso de Ci?ncia da Computa??o ______________________________________________________________________ Yahoo! Mail - O melhor e-mail do Brasil! 
Abra sua conta agora: http://br.yahoo.com/info/mail.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Sebastien.Georget at sophia.inria.fr Fri Mar 5 10:10:10 2004 From: Sebastien.Georget at sophia.inria.fr (=?ISO-8859-1?Q?S=E9bastien_Georget?=) Date: Fri, 05 Mar 2004 16:10:10 +0100 Subject: [Beowulf] Benchmarking with HPL In-Reply-To: <20040305134333.90538.qmail@web12201.mail.yahoo.com> References: <20040305134333.90538.qmail@web12201.mail.yahoo.com> Message-ID: <40489852.3050206@sophia.inria.fr> Mathias Brito wrote: > Hello, > > I'm benchmarking my cluster with HPL, the cluster have > 16 nodes, 8 nodes athlon 1600+ with 512MB RAM and 20GB > H.D. , and 8 nodes athlan 1700+ with 512MB RAM and > 20GB, all with a 100Mbit fast ethernet linked in a > switch. Well, the problem is, what the best setup for > the HPL.dat, to obtain the maximum performance of the > cluster? > > Mathias Hi, starting points for HPL tuning here: http://www.netlib.org/benchmark/hpl/faqs.html http://www.netlib.org/benchmark/hpl/tuning.html ++ -- S?bastien Georget INRIA Sophia-Antipolis, Service DREAM, B.P. 93 06902 Sophia-Antipolis Cedex, FRANCE E-mail:sebastien.georget at sophia.inria.fr _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Fri Mar 5 12:28:36 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 5 Mar 2004 12:28:36 -0500 (EST) Subject: [Beowulf] Newbie on beowulf clustering In-Reply-To: <20040305171757.15481.qmail@web20730.mail.yahoo.com> Message-ID: On Fri, 5 Mar 2004, khurram b wrote: > hi! > i am newbie to beowulf clustering, have done some work > in MOSIX linux clustering and got interested in > beowulf clustering, please guide me where to start , > tutorials, documents. http://www.phy.duke.edu/brahma Has many resources and links to many more. Also think about subscribing to Cluster World magazine. rgb > > Thanks! > > __________________________________ > Do you Yahoo!? > Yahoo! Search - Find what you?re looking for faster > http://search.yahoo.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From myaoha at yahoo.com Fri Mar 5 12:17:57 2004 From: myaoha at yahoo.com (khurram b) Date: Fri, 5 Mar 2004 09:17:57 -0800 (PST) Subject: [Beowulf] Newbie on beowulf clustering Message-ID: <20040305171757.15481.qmail@web20730.mail.yahoo.com> hi! i am newbie to beowulf clustering, have done some work in MOSIX linux clustering and got interested in beowulf clustering, please guide me where to start , tutorials, documents. Thanks! __________________________________ Do you Yahoo!? Yahoo! 
Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mprinkey at aeolusresearch.com Fri Mar 5 14:02:13 2004 From: mprinkey at aeolusresearch.com (Michael T. Prinkey) Date: Fri, 5 Mar 2004 14:02:13 -0500 (EST) Subject: [Beowulf] "noht" in 2.4.24? Message-ID: Hi Everyone, I installed 2.4.24 on a dual Xeon system with a Tyan 7501-chipset motherboard and the noht option seems to be ignored. The RH9 kernel (2.4.20?) repected noht. Has this been changed or is there a patch that I missed? I can't think that it is a BIOS issue or otherwise hardware related as I can shut it off with RH9 kernel. Thanks, Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hartner at cs.utah.edu Fri Mar 5 16:22:37 2004 From: hartner at cs.utah.edu (Mark Hartner) Date: Fri, 5 Mar 2004 14:22:37 -0700 (MST) Subject: [Beowulf] "noht" in 2.4.24? In-Reply-To: Message-ID: > I installed 2.4.24 on a dual Xeon system with a Tyan 7501-chipset > motherboard and the noht option seems to be ignored. The RH9 kernel > (2.4.20?) repected noht. Has this been changed or is there a patch that I think that option was removed around 2.4.21 If you look at Documentation/kernel-parameters.txt in the kernel source it will give you a list of options for the 2.4.24 kernel. > missed? I can't think that it is a BIOS issue or otherwise hardware > related as I can shut it off with RH9 kernel. 'acpi=off' will disable ht'ing (and a bunch of other stuff) The other option is to disable it in your BIOS. Mark _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rdn at uchicago.edu Fri Mar 5 18:27:34 2004 From: rdn at uchicago.edu (Russell Nordquist) Date: Fri, 5 Mar 2004 17:27:34 -0600 (CST) Subject: [Beowulf] good 24 port gige switch Message-ID: Does anyone have a recommendation for a good 24 port gige switch for clustering? I know this issue has been discussed, but I didn't find any actual manufacturer/models people like. Were not really looking at the very high end models from Cisco, but I am wary of the many low end switches on the market with regard to bisectional bandwidth issues. Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches and found one to be better than the other. There are a bunch of 24 port gige switches for <$2000, but are they any good? are some better than others (likely so i'd guess)? thanks and have a good weekend. 
russell - - - - - - - - - - - - Russell Nordquist UNIX Systems Administrator Geophysical Sciences Computing http://geosci.uchicago.edu/computing NSIT, University of Chicago - - - - - - - - - - - _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 5 20:24:55 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 09:24:55 +0800 (CST) Subject: [Beowulf] SGEEE free and more platform offically supported Message-ID: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> I used to think that SGE is free, but SGEEE (with more advanced scheduling algorithms) is not. But it is not true, both are free and open source. In SGE 6.0, there will be no "SGEEE mode", but the default mode will have all the SGEEE functionality! And Sun is adding more support too, instead of looking at the source or finding other people to support non-Sun OSes: "Sun will also support non Sun platforms beginning with Grid Engine 6 (HP, IBM, SGI, MAC)." http://gridengine.sunsource.net/servlets/ReadMsg?msgId=16510&listName=users Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 5 20:04:33 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 09:04:33 +0800 (CST) Subject: [Beowulf] mpich program segfaults In-Reply-To: <40491AA7.6050703@cert.ucr.edu> Message-ID: <20040306010433.99259.qmail@web16812.mail.tpe.yahoo.com> It's not your code, I think there is a compiler flag to not allocate variables from the stack, but I need to look at the XLF manuals again. BTW, there are several OSX settings that you can do to tune the performance of your fortran on the G5. I said fortran since it has to do with the hardware prefetching on the Power4 and the G5, if you have c programs with a lot of vector computation, you can set those too. Andrew. --- Glen Kaukola > >the default stack size on OSX is 512 KB, try to > >increase it to 64MB, I encountered this problem > >before. > Yep, that did the trick. Thanks a bunch! > > I'm wondering though, does this indicate there's > some sort of problem > with the code? > > > Glen ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From glen at mail.cert.ucr.edu Fri Mar 5 19:26:15 2004 From: glen at mail.cert.ucr.edu (Glen Kaukola) Date: Fri, 05 Mar 2004 16:26:15 -0800 Subject: [Beowulf] mpich program segfaults In-Reply-To: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> References: <20040305010528.68835.qmail@web16812.mail.tpe.yahoo.com> Message-ID: <40491AA7.6050703@cert.ucr.edu> Andrew Wang wrote: >the default stack size on OSX is 512 KB, try to >increase it to 64MB, I encountered this problem >before. > > Yep, that did the trick. Thanks a bunch! 
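For reference, the limit in question is the per-process stack rlimit: "ulimit -s 65536" (sh/bash) or "limit stacksize 65536" (csh) raises it for jobs started from that shell, and a program can also inspect or raise its own soft limit early in main(), before the big routine is entered. A small sketch using the standard getrlimit/setrlimit calls:

  #include <stdio.h>
  #include <sys/resource.h>

  /* Show the current stack limit, then raise the soft limit to 64 MB
   * (an unprivileged process may only go up to the hard limit).  The
   * new soft limit also applies to anything exec'd afterwards. */
  int main(void)
  {
      struct rlimit rl;

      getrlimit(RLIMIT_STACK, &rl);
      printf("stack soft limit: %ld KB, hard limit: %ld KB\n",
             (long)(rl.rlim_cur / 1024), (long)(rl.rlim_max / 1024));

      rl.rlim_cur = 64L * 1024 * 1024;
      if (rl.rlim_max != RLIM_INFINITY && rl.rlim_cur > rl.rlim_max)
          rl.rlim_cur = rl.rlim_max;
      if (setrlimit(RLIMIT_STACK, &rl) != 0)
          perror("setrlimit");
      return 0;
  }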
I'm wondering though, does this indicate there's some sort of problem with the code? Glen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Fri Mar 5 19:34:05 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Fri, 5 Mar 2004 19:34:05 -0500 (EST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: Message-ID: > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other. There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? I've had good luck with SMC 8624t's, and know of one quite large cluster that uses a lot of them of them (mckenzie, #140). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Sat Mar 6 04:55:22 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: 06 Mar 2004 09:55:22 +0000 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: Message-ID: <1078566922.2547.6.camel@fermi> On Fri, 2004-03-05 at 23:27, Russell Nordquist wrote: > Does anyone have a recommendation for a good 24 port gige switch for > clustering? > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other. There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? We mostly use HP2724's for this size of clusters. We have found them to perform ok and they are stable under heavy load - and they are priced at around $2000 (in Denmark, that is, might be cheaper in the US) best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Gr?br?drestr?de 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Sat Mar 6 09:01:49 2004 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Sat, 6 Mar 2004 06:01:49 -0800 (PST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: <1078566922.2547.6.camel@fermi> Message-ID: On 6 Mar 2004, Lars Henriksen wrote: > On Fri, 2004-03-05 at 23:27, Russell Nordquist wrote: > > Does anyone have a recommendation for a good 24 port gige switch for > > clustering? > > > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > > and found one to be better than the other. There are a bunch of 24 port > > gige switches for <$2000, but are they any good? are some better than > > others (likely so i'd guess)? > > We mostly use HP2724's for this size of clusters. We have found them to > perform ok and they are stable under heavy load - and they are priced at > around $2000 (in Denmark, that is, might be cheaper in the US) hp doesn't do jumbo frames on anything other than their top of the line l3 switch products which may or may not be an issue for certain applications. 
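Whatever the brand, it is worth measuring a candidate switch yourself with a one-byte TCP ping-pong between two nodes; netperf or NetPIPE do this properly, but even a bare-bones sketch like the one below (arbitrary port, minimal error handling) shows latency differences clearly. Run "pingpong server" on one node and "pingpong <other-node>" on the second.

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <netinet/tcp.h>
  #include <netdb.h>
  #include <sys/time.h>
  #include <unistd.h>

  #define PORT  5001                 /* arbitrary */
  #define ITERS 10000

  int main(int argc, char **argv)
  {
      char byte = 0;
      int one = 1;

      if (argc > 1 && strcmp(argv[1], "server") == 0) {
          int ls = socket(AF_INET, SOCK_STREAM, 0), s;
          struct sockaddr_in a;
          memset(&a, 0, sizeof a);
          a.sin_family = AF_INET;
          a.sin_port = htons(PORT);
          a.sin_addr.s_addr = INADDR_ANY;
          bind(ls, (struct sockaddr *)&a, sizeof a);
          listen(ls, 1);
          s = accept(ls, NULL, NULL);
          setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
          while (read(s, &byte, 1) == 1)   /* echo single bytes back */
              write(s, &byte, 1);
      } else if (argc > 1) {
          struct hostent *h = gethostbyname(argv[1]);
          int s = socket(AF_INET, SOCK_STREAM, 0);
          struct sockaddr_in a;
          struct timeval t0, t1;
          double us;
          if (h == NULL) { fprintf(stderr, "unknown host %s\n", argv[1]); return 1; }
          memset(&a, 0, sizeof a);
          a.sin_family = AF_INET;
          a.sin_port = htons(PORT);
          memcpy(&a.sin_addr, h->h_addr, h->h_length);
          connect(s, (struct sockaddr *)&a, sizeof a);
          setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
          gettimeofday(&t0, NULL);
          for (int i = 0; i < ITERS; i++) {   /* 1-byte round trips */
              write(s, &byte, 1);
              read(s, &byte, 1);
          }
          gettimeofday(&t1, NULL);
          us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
          printf("avg round trip: %.1f us\n", us / ITERS);
      } else {
          fprintf(stderr, "usage: %s server | %s <serverhost>\n", argv[0], argv[0]);
      }
      return 0;
  }

Numbers straight out of a toy like this move around with the NICs' interrupt coalescing settings, so treat them as relative comparisons between switches rather than absolute latencies.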
> best regards > Lars > -- > Lars Henriksen | MESH-Technologies A/S > Systems Manager & Consultant | Lille Gr?br?drestr?de 1 > www.meshtechnologies.com | DK-5000 Odense C, Denmark > lars at meshtechnologies.com | mobile: +45 2291 2904 > direct: +45 6311 1187 | fax: +45 6311 1189 > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Unix Consulting joelja at darkwing.uoregon.edu GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hanzl at noel.feld.cvut.cz Sat Mar 6 10:02:37 2004 From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz) Date: Sat, 06 Mar 2004 16:02:37 +0100 Subject: [Beowulf] SGEEE free and more platform offically supported In-Reply-To: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> References: <20040306012455.3473.qmail@web16808.mail.tpe.yahoo.com> Message-ID: <20040306160237D.hanzl@unknown-domain> > I used to think that SGE is free, but SGEEE (with more > advanced scheduling algorithms) is not. But it is not > true, both are free and open source. SGEEE is free and opensource but many many people did not know this. I thing this confusion made big harm to SGE project and I invested a lot of effort in clarifying this (Google "hanzl SGEEE" to see all that). > In SGE 6.0, there will be no "SGEEE mode", but the > default mode will have all the SGEEE functionality! Great, hope this will stop the confusion once for ever. Regards Vaclav Hanzl _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sat Mar 6 10:00:35 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 6 Mar 2004 23:00:35 +0800 (CST) Subject: [Beowulf] SGEEE free and more platform offically supported In-Reply-To: <20040306160237D.hanzl@unknown-domain> Message-ID: <20040306150035.75079.qmail@web16806.mail.tpe.yahoo.com> --- hanzl at noel.feld.cvut.cz ????> > SGEEE is free and opensource but many many people > did not know this. I > thing this confusion made big harm to SGE project > and I invested a lot > of effort in clarifying this (Google "hanzl SGEEE" > to see all that). I think it is because Sun called it "Enterprise Edition" (EE), and when people think of Enterprise, they think of $$$. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From atp at piskorski.com Sat Mar 6 15:43:28 2004 From: atp at piskorski.com (Andrew Piskorski) Date: Sat, 6 Mar 2004 15:43:28 -0500 Subject: [Beowulf] DC powered clusters? Message-ID: <20040306204328.GA49615@piskorski.com> Some rackmount vendors now offer systems with a small DC-to-DC power supply for each node, with separate AC-DC rectifiers feeding power. 
I imagine the DC is probably at 48 V rather than 12 V or whatever, but often they don't even seem to say that, e.g.: http://rackable.com/products/dcpower.htm Has anyone OTHER than commercial rackmount vendors designed and built a cluster using such DC-to-DC power supplies? Is there detailed info on such anywhere on the web? Anybody have any idea exactly what components those vendors are using for their power systems, where they can be purchased (in small quantities), and/or how much they cost? I'm curious how the purchase and operating costs compare to the normal "stick a standard desktop AC-to-DC PSU in each node" approach, or even the hackish "wire on extra connectors and use one high quality desktop PSU to power 2 or 3 nodes" approach. The only DC-to-DC supplies I've seen on the web seem quite expensive, e.g.: http://www.rackmountpro.com/productsearch.cfm?catid=118 http://www.mini-box.com/power-faq.htm So I suspect the DC-to-DC approach would only ever make economic sense for large high-end clusters, those with unusual space or heat constraints, or the like. But I'm still curious about the details... -- Andrew Piskorski http://www.piskorski.com/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri Mar 5 23:41:07 2004 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 05 Mar 2004 22:41:07 -0600 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: Message-ID: <40495663.7010507@tamu.edu> Caveats: 1. It's been a rough week. 2. I've got some specific opinions about 3Com hardware these days. I just ordered a 16 node cluster. I'm using the Foundry EdgeIron 24G as the basic switch. More than adequate backplane, pretty good small and large packet performance as tested with an Anritsu MD1230. Cost is expected to be about $3000, for the 24 port model. I'm getting 2, and have dual nics on the nodes, for some playing with channel bonding, and so that I've got a failover hot spare if/when one dies. Remember: Murphy was an optimist. For the record I don't expect the EdgeIron to die, but conversely (perversely?) I expect any and all network devices to die at the least opportune time! I didn't even consider 3Com. Didn't test it. The 3Com "gigabit" hardware I've seen recently in the LAN-space was usually capable of gig uplinks, but had trouble with congestion when gig and 100BaseT were mixed on the switch. HP had been OEM'ing Foundry. I'm not sure if that's still the case or if they went recently to someone else; my Foundry rep won't say, and I don't have a close HP rep. We have programmatically stayed away from Asante in our LAN operations here. That translates to no experience and no contacts. Sorry. Cluster should be in within a month, and so should the switches. I'll do some latency runs and report objective data. gerry Russell Nordquist wrote: > Does anyone have a recommendation for a good 24 port gige switch for > clustering? I know this issue has been discussed, but I didn't find any > actual manufacturer/models people like. Were not really looking at the > very high end models from Cisco, but I am wary of the many low end > switches on the market with regard to bisectional bandwidth issues. > > Has anyone tested/used the HP, Asante, Foundry, 3com, or Extreme switches > and found one to be better than the other.
There are a bunch of 24 port > gige switches for <$2000, but are they any good? are some better than > others (likely so i'd guess)? > > thanks and have a good weekend. > russell > > > - - - - - - - - - - - - > Russell Nordquist > UNIX Systems Administrator > Geophysical Sciences Computing > http://geosci.uchicago.edu/computing > NSIT, University of Chicago > - - - - - - - - - - - > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Mail.Linux-Consulting.com Sun Mar 7 03:00:56 2004 From: alvin at Mail.Linux-Consulting.com (Alvin Oga) Date: Sun, 7 Mar 2004 00:00:56 -0800 (PST) Subject: [Beowulf] DC powered clusters? - fun In-Reply-To: <20040306204328.GA49615@piskorski.com> Message-ID: hi ya andrew fun stuff ... :-) good techie vitamins ;-) - lots of thinking of why it is the way it is vs what the real measure power consumption is On Sat, 6 Mar 2004, Andrew Piskorski wrote: > Some rackmount vendors now offer systems with a small DC-to-DC power > supply for each node, with separate AC-DC rectifiers feeding power. I > imagine the DC is probably at 48 V rather than 12 V or whatever, but > often they don't even seem to ay that, e.g.: > > http://rackable.com/products/dcpower.htm i don't like that they claim "back-to-back rackmounts" is their "patented technology" ... geez ... - anybody can mount a generic 1U in the rack .. one in the front and one in the back ( other side ) ... ( obviously the 1U chassis cannot be too deep ) > Has anyone OTHER than commercial rackmount vendors designed and built > a cluster using such DC-to-DC power supplies? Is there detailed info > on such anywhere on the web? dc-dc power supplies are made literally and figuratively by the million various combination of voltage, current capacity and footprint http://www.Linux-1U.net/PowerSupp ( see the list of various power supply manufacturers ) > Anybody have any idea exactly what components those vendors are using > for their power systems, where they can be purchased (in small > quantities), and/or how much they cost? you can buy any size dc-dc power supplies from $1.oo to the thousands if you want the dc-dc power supply to have atx output capabilities, than you have 2 or 3 choice of dc-atx output power supplies: - mini-box.com ( and they have a few resellers ) - there's a power supply company that also did a variation of mini-box.com's design ... i cant find the orig url at this time http://www.dc2dc.com is a resller of the "other option" - probably a bunch of power supp working on dc-atx convertors > The only DC-to-DC supplies I've seen on the web seem quite expensive, > e.g.: > > http://www.rackmountpro.com/productsearch.cfm?catid=118 99% of the rackmount vendors are just reselling (adding $$$ to ) a power supply manufacturer's power supply ... 
- you can save a good chunk of change by buying direct from the generic power supply OEM distributors - somtimes as much or mroe than 50% cost savings of the cost of the power supply > http://www.mini-box.com/power-faq.htm most of their data are measured data per their test setups and more info about dc-dc stuff http://www.via.com.tw/en/VInternet/power.pdf see the rest of the +12v DC input "atx power supply" vendors http://www.Linux-1U.net/PowerSupp/DC/ http://www.Linux-1U.net/PowerSupp/12v/ ( +12v at up to 500A or more ) > So I suspect the DC-to-DC approach would only ever make economic sense > for large high-end clusters, those with unusual space or heat > constraints, or the like. But I'm still curious about the details... dc-atx power supply makes sense when: - power supply heat and airflow is a problem or you dont like having too many power cords ( 400 cords vs 40 in a rack ) - simple cabling is a big problem ( rats nest ) - you want to reduce the costs of the system by throwing away un-used power supply capacity that is available with the traditional one power supply per 1 motherboard and peripherals - most power supplies used are used for maximum supported load (NOT a motherboard + cpu + disk + mem only) - you have a huge airconditioning bill problem - that should motivate you to find and test a system with "less heat generated solutions" - your cluster only needs to have enough power for the cpu + 1disk - you have a space consideration problems - dc-atx power supply allows 420 cpus per 42U rack and up to 840 cpus for front and back loaded cluster - on and on ... for a typical 4U-8U height blade clusters ( 10 blades ) - you only need one 600-800W atx power supply to drive the 10 mini-itx or flex-atx blades - cpu is 25W ?? motherboard is 25W ... - disks need 1A at 12v to spin up.. normal operation current is 80ma at 12v ... etc .. per disk specs - how you want to do power calculations is the trick 10 full-tower system with a 450W power does NOT imply you';re using 4500W of power for 10 systems :-) have fun alvin http://www.1U-ITX.net 100TB - 200TB of disks per 42U racks ?? -- even more fun http://www.itx-blades.net/1U-Blades ( blades are with mini-box.com's dc-dc atx power supply ) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mayank_kaushik at vsnl.net Sun Mar 7 03:29:42 2004 From: mayank_kaushik at vsnl.net (mayank_kaushik at vsnl.net) Date: Sun, 07 Mar 2004 13:29:42 +0500 Subject: [Beowulf] PVM says `PVM_ROOT not set..cant start pvmd` on remote computer Message-ID: <69bcf6b69beb90.69beb9069bcf6b@vsnl.net> hi... im trying to make a two-machine PVM virtual machine. but im having problems with PVM. the names of the two machines are "mayank" and "manish".."mayank" runs fedora core 1, "manish" runs red hat linux -9..both are part of a simple 100mbps LAN, connected by a 100mbps switch. iv *disabled* the firewall on both machines. iv installed pvm-3.4.4-14 on both machines. 
the problem is: when i try to add "mayank" to the virtual machine from "manish" using "add mayank", pvm is unable to do so..gives an error message "cant start pvmd"..then it tries to diagnose what went wrong..it passes all tests but one -- says "PVM_ROOT" is set to "" on the target machine ("mayank")...but thats ABSURD..iv checked a mill-yun times, the said variable is correctly set..when i ssh to mayank from manish, and then echo $PVM_ROOT , i get the correct answer... plz note that im using ssh instead of rsh, by changing the variable PVM_RSH=/usr/bin/ssh..since im more comfortable with ssh... but when i try the opposite--adding "manish" to the virtual machine from "mayank" runnnig fedora..it works! furthermore....before i installed fedora core 1 on mayank, it too had red hat 9..and then i was getting the same problem from BOTH machines..but after installing fedora on mayank, things began to work from that end. what going on??? (apart from me whos going nuts) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Sun Mar 7 11:10:20 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Sun, 7 Mar 2004 08:10:20 -0800 (PST) Subject: [Beowulf] good 24 port gige switch In-Reply-To: <40495663.7010507@tamu.edu> Message-ID: Does anyone have experience with Dell's new 2624 unmanaged 24 port gigE switch? It's only about $330, around a 1/10 the cost of the managed switches. >From what I've read, the Dell/Linksys 5224 managed gigE switch is good. It could be that the unmanaged switch uses the exact same Broadcom switch chips, but just doesn't have management. On Fri, 5 Mar 2004, Gerry Creager N5JXS wrote: > expected to be about $3000, for the 24 port model. I'm getting 2, and > have dual nics on the nodes, for some playing with channel bonding, and Last I heard, the interrupt mitigation on gigE cards messes up channel bonding for extra bandwidth. The packets arrive in batches out of order, and Linux's TCP/IP stack doesn't like this, so you get less bandwidth with two cards than you would with just one. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Sun Mar 7 17:13:26 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Sun, 7 Mar 2004 17:13:26 -0500 (EST) Subject: [Beowulf] PVM says `PVM_ROOT not set..cant start pvmd` on remote computer In-Reply-To: <69bcf6b69beb90.69beb9069bcf6b@vsnl.net> Message-ID: On Sun, 7 Mar 2004 mayank_kaushik at vsnl.net wrote: > hi... > > > im trying to make a two-machine PVM virtual machine. but im having problems with PVM. > the names of the two machines are "mayank" and "manish".."mayank" runs fedora core 1, "manish" runs red hat linux -9..both are part of a simple 100mbps LAN, connected by a 100mbps switch. > iv *disabled* the firewall on both machines. > > iv installed pvm-3.4.4-14 on both machines. 
> the problem is: > when i try to add "mayank" to the virtual machine from "manish" using > "add mayank", pvm is unable to do so..gives an error message "cant start > pvmd"..then it tries to diagnose what went wrong..it passes all tests > but one -- says "PVM_ROOT" is set to "" on the target machine > ("mayank")...but thats ABSURD..iv checked a mill-yun times, the said > variable is correctly set..when i ssh to mayank from manish, and then > echo $PVM_ROOT , i get the correct answer... This COULD be associated with the order things like .bash_profile and so forth are run for interactive shells vs login shells. If you are setting PVM_ROOT in .bash_profile (so it would be correct on a login) be sure to ALSO set it in .bashrc so that it is set for the remote shell likely used to start PVM. I haven't looked at the fedora RPM so I don't know if /usr/bin/pvm is still a script that sets this variable for you anyway. > plz note that im using ssh instead of rsh, by changing the variable > PVM_RSH=/usr/bin/ssh..since im more comfortable with ssh... Me too. ssh also has a very nice feature that permits an environment to be set on the remote machine for non-interactive remote commands that CAN be useful for PVM, although I think the stuff above might fix it. > but when i try the opposite--adding "manish" to the virtual machine > from "mayank" runnnig fedora..it works! > furthermore....before i installed fedora core 1 on mayank, it too had > red hat 9..and then i was getting the same problem from BOTH > machines..but after installing fedora on mayank, things began to work > from that end. I've encountered a similar problem only once, trying to add nodes FROM a wireless laptop. Didn't work. Adding the wireless laptop from anywhere else worked fine, all systems RH 9 and clean (new) installs from RPM of pvm, I explicitly set PVM_ROOT and PVM_RSH when logging in. PVM_ROOT is additionally set (correctly) by the /usr/bin/pvm command, which is really a shell. > what going on??? (apart from me whos going nuts) Try checking your environment to make sure it is set for both a remote command: ssh mayank echo "\$PVM_ROOT" and in a remote login: ssh mayank $ echo "$PVM_ROOT" rgb > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sunyy_2004 at hotmail.com Mon Mar 8 11:33:18 2004 From: sunyy_2004 at hotmail.com (Yiyang Sun) Date: Tue, 09 Mar 2004 00:33:18 +0800 Subject: [Beowulf] Relation between Marvell Yukon Controller and SysKonnect GbE Adapters Message-ID: Hi, Beowulf users, We're going to setup a small cluster. The motherboard we ordered is the newly released Gigabyte GA-8IPE1000-G which integrates Marvell's Yukon 8001 GbE Controller. I tried to find the Linux driver for this controller on Google and was directed to SysKonnect's website http://www.syskonnect.com/syskonnect/support/driver/d0102_driver.html which provides a driver for Marvell Yukon/SysKonnect SK-98xx Gigabit Ethernet Adapters. 
However, there is no explicit indication on this website that SysKonnect's adapters use Marvell's chips. Does any here have experience using Marvell's controllers? Is it easy to install Yukon 8001 on Linux? Thanks! Yiyang _________________________________________________________________ Get MSN Hotmail alerts on your mobile. http://en-asiasms.mobile.msn.com/ac.aspx?cid=1002 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Mon Mar 8 14:44:50 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 8 Mar 2004 14:44:50 -0500 (EST) Subject: [Beowulf] Re: beowulf In-Reply-To: <20040308184024.955.qmail@web21501.mail.yahoo.com> Message-ID: On Mon, 8 Mar 2004, prakash borade wrote: > how should i proceed for a client which takes dta from 5 servers > reoetadly after every 15 seconds > i get the data but it prints the garbage value > > what can be the problem i am usiung sockets on redhat 9 > > i am creting new sockets for it every time on clien side Dear Prakash, There is such a dazzling array of possible problems with your code that (not being psychic) I cannot possibly help you. For example -- You could be printing an integer as a float without a cast (purely misusing printf). Or vice versa. I do this all the time; it is a common mistake. You could be sending the data on a bigendian system, receiving it and trying to print it on a littleendian system. You could have a trivial offset wrong in your receive buffers -- printing an integer (for example) starting a byte in and overlapping some other data in your stack would yield garbage. You could have a serious problem with your read algorithm. Reading reliably from a socket is not trivial. I use a routine that I developed over a fairly long time and it STILL has bugs that surface. The reading/writing are fundamentally asynchronous, and a read can easily leave data behind in the socket buffer (so that what IS read is garbage). ...and this is the tip of an immense iceberg of possible programming errors. The best way to proceed to write network code is to a) start with a working template of networking/socket code. There are examples in a number of texts, for example, as well as lots of socket-based applications. Pick a template, get it working. b) SLOWLY and GENTLY change your working template into your application, ensuring that the networking component never breaks at intermediary revisions. or c) learn, slowly, surely, and by making many mistakes, to write socket code from scratch without using a template. Me, I use a template. rgb P.S. to get more help, you're really going to have to provide a LOT more detail than this. Possibly including the actual source code. -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Mon Mar 8 14:54:40 2004 From: beowulf at studio26.be (Maikel Punie) Date: Mon, 8 Mar 2004 20:54:40 +0100 Subject: [Beowulf] Cluster school project Message-ID: hi, I need to make a smaal beowulf cluster for a school project i have like 2 months for this stuff, but i need to make my own task asignment. 
So basicly what do you guys think that would be possible to realize in 2 months time? The only thing they told me, is that the nodes must be discless systems. any ideas about what could be donne in 2 months. Maikel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Mon Mar 8 16:03:41 2004 From: beowulf at studio26.be (Maikel Punie) Date: Mon, 8 Mar 2004 22:03:41 +0100 Subject: [Beowulf] Cluster school project In-Reply-To: Message-ID: hmm, ok, maybe i explained badly, at the moment i just need to create a project discryption on what would be possible to realize in 2 months, and off course i could use the cluster knoppix, but then its not a real project anymore, then its just an install task. also the openmosix structure is it using diskless nodes? or what because i can't find a lot off info about it. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. Well its the whole other part off the country, but yeah it was a great conference i was there to :) Thanks Miakle -----Oorspronkelijk bericht----- Van: John Hearns [mailto:john.hearns at clustervision.com] Verzonden: maandag 8 maart 2004 21:52 Aan: Maikel Punie CC: Beowul-f Mailing lists Onderwerp: Re: [Beowulf] Cluster school project On Mon, 8 Mar 2004, Maikel Punie wrote: > > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > Maikel, first you need the computers! Then you should first look at ClusterKnoppix http://bofh.be/clusterknoppix/ Once you have that running, come back and tell us how you got on. We'll help you do more then. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Mon Mar 8 15:51:58 2004 From: john.hearns at clustervision.com (John Hearns) Date: Mon, 8 Mar 2004 21:51:58 +0100 (CET) Subject: [Beowulf] Cluster school project In-Reply-To: Message-ID: On Mon, 8 Mar 2004, Maikel Punie wrote: > > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > Maikel, first you need the computers! Then you should first look at ClusterKnoppix http://bofh.be/clusterknoppix/ Once you have that running, come back and tell us how you got on. We'll help you do more then. By the way, which part of Belgium are you from? I recently attended the FOSDEM conference at the ULB in Bruxelles. Great conference. 
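For what it's worth, a classic first program for a school cluster of this kind is the "estimate pi" toy that comes up later in this thread. Below is a minimal sketch in C with MPI (the file name and interval count are made up for illustration; any MPI implementation, such as the MPICH mentioned elsewhere on the list, should build it with its mpicc wrapper and run it with mpirun):

/* pi_mpi.c -- tiny example of the kind of program a two-month diskless
 * cluster project could start from (hypothetical name; any MPI such as
 * MPICH or LAM should work).  Estimates pi by integrating 4/(1+x^2) over
 * [0,1], with the intervals divided among the processes.
 * Build/run: mpicc pi_mpi.c -o pi_mpi ; mpirun -np 4 ./pi_mpi
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    long i, n = 10000000;           /* number of integration intervals */
    double h, x, local = 0.0, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Rank 0 owns the parameter; everyone else gets it in one broadcast. */
    MPI_Bcast(&n, 1, MPI_LONG, 0, MPI_COMM_WORLD);

    h = 1.0 / (double)n;
    /* Each rank sums every size-th interval; no other communication
     * happens until the single reduction at the end. */
    for (i = rank; i < n; i += size) {
        x = h * ((double)i + 0.5);
        local += 4.0 / (1.0 + x * x);
    }
    local *= h;

    /* Combine the partial sums on rank 0. */
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi is approximately %.16f with %ld intervals on %d processes\n",
               pi, n, size);

    MPI_Finalize();
    return 0;
}

The single MPI_Bcast and single MPI_Reduce are the only communication, so it scales reasonably even over plain fast Ethernet; treat it as a starting point for a project description, not a finished project.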
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mprinkey at aeolusresearch.com Mon Mar 8 14:39:44 2004 From: mprinkey at aeolusresearch.com (Michael T. Prinkey) Date: Mon, 8 Mar 2004 14:39:44 -0500 (EST) Subject: [Beowulf] e1000 performance Message-ID: Hello everyone, I am building a small cluster that uses Tyan S2723GNN motherboards that include an integrated Intel e1000 gigabit NIC. I have installed two Netgear 302T gigabit cards in the 66 MHz slots as well. With point-to-point links, I can get a very respectable 890 Mbps with the tg3 cards, but the e1000 lags significantly at 300 to 450 Mbps. I am using the NAPI e1000 driver in the 2.4.24 kernel. I have tried the following measures without any improvement: - changed the tcp_mem,_wmem,_rmem to larger values. - increased the MTU to values >1500. - reniced the ksoftirq processes to 0. The 2.4.24 kernel contains the 4.x version of the e1000. I plan to try the 5.x version this evening. Also, want to try increasing the Txqueuelen as well. Has anyone had similar experience with these embedded e1000s? Googling leads me to several sites like this one: http://www.hep.ucl.ac.uk/~ytl/tcpip/tuning/ that seem to indicate that I should expect much more from the e1000. Any help here is welcome? Thanks, Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Mon Mar 8 16:59:59 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Mon, 8 Mar 2004 13:59:59 -0800 (PST) Subject: [Beowulf] e1000 performance In-Reply-To: Message-ID: On Mon, 8 Mar 2004, Michael T. Prinkey wrote: > I am building a small cluster that uses Tyan S2723GNN motherboards that > include an integrated Intel e1000 gigabit NIC. I have installed two >From a supermicro X5DPL-iGM (E7501 chipset) with onboard e1000 to supermicro E7500 board with an e1000 PCI-X gigabit card, via a dell 5224 switch. The E7501 board has a 3ware 8506 card on the same PCI-X bus as the e1000 chip, so it's running at 64/66. The PCI-X card is running at 133 MHz. TCP STREAM TEST to duet Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 131070 131070 1472 9.99 940.86 Kernel versions are 2.4.20 (PCI-X card) and 2.4.22-pre2 (the onboard chip). 2.4.20 has driver 4.4.12-k1, while 2.4.22-pre2 has driver 5.1.11-k1. The old e1000 driver has a very useful proc file in /proc/net/PRO_LAN_Adapters that gives all kind of information. I have RX checksum on and flow control turned on. The newer driver doesn't have this information. > the NAPI e1000 driver in the 2.4.24 kernel. I have tried the following NAPI? > measures without any improvement: I've done nothing wrt gigabit performance, other than turn on flow control. I found that without flowcontrol, tcp connections to 100 mbit hosts would hang. 
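A related knob on the application side: the 131070-byte "Socket Size" numbers in the netperf output above are per-socket buffer sizes, and a test program can ask for larger ones itself rather than relying only on the tcp_rmem/tcp_wmem sysctls. A generic sketch follows (not taken from netperf or the e1000 driver); note that Linux silently caps the request at net.core.rmem_max / net.core.wmem_max, and getsockopt() then reports back roughly twice the value granted, a kernel bookkeeping quirk:

/* sockbuf.c -- generic sketch of requesting larger per-socket buffers from
 * an application, the per-socket counterpart of the tcp_rmem/tcp_wmem
 * sysctl tuning mentioned above.  Hypothetical example, not library code.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int requested = 512 * 1024;          /* ask for 512 KB, for example */
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (fd < 0) { perror("socket"); return 1; }

    /* Ask for bigger send and receive buffers before connect()/listen(). */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof(requested)) < 0)
        perror("SO_RCVBUF");
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &requested, sizeof(requested)) < 0)
        perror("SO_SNDBUF");

    /* Read back what the kernel actually granted. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) == 0)
        printf("asked for %d bytes of receive buffer, got %d\n",
               requested, actual);

    close(fd);
    return 0;
}

Whether bigger buffers help at gigabit speeds depends on the latency-bandwidth product of the path; on a short LAN hop the defaults are often already adequate, which is why the driver-level issues discussed above usually matter more.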
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mnerren at paracel.com Mon Mar 8 17:31:55 2004 From: mnerren at paracel.com (micah nerren) Date: Mon, 08 Mar 2004 14:31:55 -0800 Subject: [Beowulf] Cluster school project In-Reply-To: References: Message-ID: <1078785115.30523.89.camel@angmar> On Mon, 2004-03-08 at 11:54, Maikel Punie wrote: > hi, > > I need to make a smaal beowulf cluster for a school project i have like 2 > months for this stuff, but i need to make my own task asignment. > So basicly what do you guys think that would be possible to realize in 2 > months time? The only thing they told me, is that the nodes must be discless > systems. > > any ideas about what could be donne in 2 months. > > Maikel To actually build a small (or large!) beowulf of discless systems is pretty easy, I guess the hardest part will be determining what the purpose of the cluster will be. What type of code will be running on it? They will basically be network booting a kernel and mounting an nfs filesystem. Research these aspects, and research what kind of tools you want to have on the cluster, ie. distributed shell, monitoring, mpi, etc. 2 months should be plenty, you should be able to get a basic small beowulf up and running in 2 hours once you know what to do and how to set it up. Time to fire up google and start researching beowulf's and diskless booting. There is a lot of good info out there. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Mon Mar 8 17:15:46 2004 From: becker at scyld.com (Donald Becker) Date: Mon, 8 Mar 2004 17:15:46 -0500 (EST) Subject: [Beowulf] BWBUG Greenbelt: Intel HPC and Grid, Beowulf Clusters Message-ID: Special notes: This month's meeting is in Greenbelt Maryland, not Virginia! From pre-registration we expect a full room, so please register on line at http://bwbug.org and show up at least 15 minutes early. Title: Intel's Perspective on Beowulf's Clusters Speaker: Stephen Wheat Ph.D This talk will review Intel's perspective on technology trends and transitions in this decade. The focus will be on bringing the latest technology to the scientists' labs in the shortest amount of time. The technologies reviewed will include processors, chipsets, I/O, systems management, and software tools. Come with your questions; the presentation is designed to be interactive. Date: March 9, 2004 Time: 3:00 PM (doors open at 2:30) Location: Northrop Grumman IT 7501 Greenway Center Drive (Intersection of BW Parkway and DC beltway) Suite 1200 (12th floor) Greenbelt Maryland Need to be a member?: No ( guests are welcome ) Parking: Free As usual there will be door prizes, food and refreshments. From: "Fitzmaurice, Michael" Dr. Wheat from Intel must be a popular speaker we have a big turn out expected. If you have not registered yet please do so. We may need to plan for extra chairs and we need to predict how many pizzas to order. This would be great meeting to invite a friend or your boss. It may be crowded, therefore, getting there a little early is recommended. This event is sponsored by the Baltimore-Washington Beowulf Users Group (BWBUG) Please register on line at http://bwbug.org As usual there will be door prizes, food and refreshments. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From nathan at iwantka.com Mon Mar 8 18:21:15 2004 From: nathan at iwantka.com (Nathan Littlepage) Date: Mon, 8 Mar 2004 17:21:15 -0600 Subject: [Beowulf] SCTP Message-ID: <00d701c40564$1d21a830$6c45a8c0@ntbrt.bigrivertelephone.com> Has anyone looked into incorporating SCTP in the cluster environment? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From becker at scyld.com Mon Mar 8 20:44:52 2004 From: becker at scyld.com (Donald Becker) Date: Mon, 8 Mar 2004 20:44:52 -0500 (EST) Subject: [Beowulf] SCTP In-Reply-To: <00d701c40564$1d21a830$6c45a8c0@ntbrt.bigrivertelephone.com> Message-ID: On Mon, 8 Mar 2004, Nathan Littlepage wrote: > Has anyone looked into incorporating SCTP in the cluster environment? What advantage would it provide for a SAN- or LAN-based cluster? Not that TCP is especially light-weight. TCP implementations are WAN-oriented and have increasingly costly features (look at the CPU cost of iptables/ipchains) and defenses against spoofing (TCP stream start-up is much more costly than the early BSD implementations). The only reason SCTP would be a better cluster protocol is that it hasn't yet accumulated the cruft ("features") of a typical TCP stack. But if it became popular, that would change pretty much instantly. -- Donald Becker becker at scyld.com Scyld Computing Corporation http://www.scyld.com 914 Bay Ridge Road, Suite 220 Scyld Beowulf cluster systems Annapolis MD 21403 410-990-9993 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rdn at uchicago.edu Mon Mar 8 23:40:01 2004 From: rdn at uchicago.edu (Russell Nordquist) Date: Mon, 8 Mar 2004 22:40:01 -0600 Subject: [Beowulf] good 24 port gige switch In-Reply-To: References: <1078566922.2547.6.camel@fermi> Message-ID: <20040308224001.50f2f728@vitalstatistix> thanks for all the good info. it got me to thinking....i have resources for comparing most components of a cluster excepts network switches. it would be nice to have a source of information for this as well. something like: *bandwidth/latency between 2 hosts *bandwidth/latency at 25%/50%/75%/100% port usage *short vs long message comparisons great so far, but what about the issues: *what SW to use for the benchmark. perhaps netpipe? *the NICS used will make a difference. how does one account for the difference between a realtec and syskonnect chipset, bus speeds, etc? *do we have enough variation of cluster sizes and HW to make a useful repository? *and i'm sure there's more Is this feasible? Is it a case where any info is useful even if it is not very reliable/accurate? With more MB's coming with decent gige on board there will be a greater chance the the difference between to setups will only be the switch. so, is this a worthwhile are useful project for the community? or are there to many variables to make the results useful? 
russell -- - - - - - - - - - - - - Russell Nordquist UNIX Systems Administrator Geophysical Sciences Computing http://geosci.uchicago.edu/computing NSIT, University of Chicago - - - - - - - - - - - _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From beowulf at studio26.be Tue Mar 9 12:45:47 2004 From: beowulf at studio26.be (Maikel Punie) Date: Tue, 9 Mar 2004 18:45:47 +0100 Subject: [Beowulf] Cluster school project In-Reply-To: <644D9337A02FC24689647BF9E48EC39E08ABB797@drm556> Message-ID: >> ok, maybe i explained badly, at the moment i just need to create a project >> discryption on what would be possible to realize in 2 months, and off course >Do you mean a computing/programming project could you do, >like calculating pi to some large number of digits? yeah something like that, i realy have no idea what is possible. if there are any suggestions, they are always welcome. Maikel _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From paulojjs at bragatel.pt Tue Mar 9 04:15:05 2004 From: paulojjs at bragatel.pt (Paulo Silva) Date: Tue, 09 Mar 2004 09:15:05 +0000 Subject: [Beowulf] How to choose an UPS for a Beowulf cluster Message-ID: <1078823704.1882.33.camel@blackTiger> Hi, I'm building a small Beowulf cluster for HPC (about 16 nodes) and I need some advices on choosing the right UPS. The UPS should be able to signal the central node when the battery reaches some level (I think this is common usage) and it should be able to turn itself off before running out of battery (I was told that this extends the life of the battery). 10 minutes of runtime sould be enough. I was looking in the APC site but I was rather confused by all the models available. Can anyone give me some advice on the type of device to choose? Thanks for any tip -- Paulo Jorge Jesus Silva perl -we 'print "paulojjs".reverse "\ntp.letagarb@"' If a guru falls in the forest with no one to hear him, was he really a guru at all? -- Strange de Jim, "The Metasexuals" -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Esta ? uma parte de mensagem assinada digitalmente URL: From brichard at clusterworldexpo.com Tue Mar 9 13:45:15 2004 From: brichard at clusterworldexpo.com (Bryan Richard) Date: Tue, 9 Mar 2004 13:45:15 -0500 Subject: [Beowulf] Join Don Becker and Thomas Sterling at ClusterWorld Conference & Expo Message-ID: <20040309184515.GB47601@clusterworldexpo.com> ClusterWorld Conference & Expo welcomes Scyld's Don Becker and Keynote Thomas Sterling to the program! If you work in Beowulf and clusters, you can't miss the following program events: - Donald Becker, Scyld Computing Corporation: "Scyld Beowulf Introductory Workshop" - Donald Becker, Scyld Computing Corporation: "Scyld Beowulf Advanced Workshop" - Thomas Sterling, California Institute of Technology: "Beowulf Cluster Computing a Decade of Accomplishment, a Decade of Challenge" PLUS, ClusterWorld's exciting program of intensive tutorials, special events, and expert presentations in 8 vertical industry tracks: Applications, Automotive & Aerospace Engineering, Bioinformatics, Digital Content Creation, Grid, Finance, Petroleum & Geophysical Exploration, and Systems. 
A Special Offer for Beowulf Members =================================== Beowulf.org members get 20% off registration prices when registering online! You MUST use your special Priority Code - BEOW -- when registering online to receive your 20% discount! Online registration ends March 31, 2004 so don't delay! Just go to http://www.clusterworldexpo.com and click on "REGISTER NOW!" to fill out our quick enrollment form. Associations, Universities and Labs Get 50% off Registration ============================================================ Students and employees of universities, associations, and government labs are eligible for 50% off ClusterWorld registration! This offer is only available via fax or mail. Please log on to www.clusterworldexpo.com and click on "Register Now" to download registration PDFs. Or call 415-321-3062 for more information A TERRIFIC PROGRAM ================== At ClusterWorld Conference & Expo, you will: * LEARN from top clustering experts in our extensive conference program. * EXPERIENCE the latest cluster technology from the top vendors on our expo floor. * MEET AND NETWORK with colleagues from across the world of clustering at our social events and parties. Keynotes: - Ian Foster, Argonne National Laboratory, University of Chicago, Globus Alliance, and co-author of "The Grid: Blueprint for a New Computing Infrastructure", - Thomas Sterling, California Institute of Technology, author of "How to Build a Beowulf," and co-author of "Enabling Technologies for Petaflops Computing". - Andrew Mendelsohn, Senior Vice President, Database & Application Server Technology, Oracle Corporation - David Kuck, Intel Fellow, Manager, Software and Solutions Group, Intel Corporation Want to know which sessions are getting the biggest buzz? Click on http://www.clusterworldexpo.com/SessionSpotlight for a list of highlights by Technical Session Track. REGISTER TODAY! ClusterWorld Conference and Expo April 5 - 8, 2004 San Jose Convention Center San Jose, California http://www.clusterworldexpo.com ClusterWorld Conference & Expo Sponsors ======================================= Platinum: Oracle Corporation, Intel Corporation Gold: AMD, Dell, Hewlett Packard, Linux Networx, Mountain View Data, Panasas, Penguin Computing, and RLX Technologies Silver: Appro, Engineered Intelligence, Microway, NEC, Platform Computing, and PolyServe Media & Association Sponsors: Bioinformatics.org, ClusterWorld Magazine, Distributed Systems Online, Dr. 
Dobbs Journal, Gelato Federation, Global Grid Forum, GlobusWorld, LinuxHPC, Linux Magazine, PR Newswire, Storage Management, and SysAdmin Magazine _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From wseas at canada.com Tue Mar 9 12:25:56 2004 From: wseas at canada.com (WSEAS newsletter in mechanical engineering) Date: Tue, 9 Mar 2004 19:25:56 +0200 Subject: [Beowulf] WSEAS and IASME newsletter in mechanical engineering, March 9, 2004 Message-ID: <3FE20F4000220BB2@fesscrpp1.tellas.gr> (added by postmaster@fesscrpp1.tellas.gr) If you want to contact us, the Subject of your email must contains the code: WSEAS CALL FOR PAPERS -- CALL FOR REVIEWERS -- CALL FOR SPECIAL SESSIONS http://www.wseas.org IASME / WSEAS International Conference on "FLUID MECHANICS" (FLUIDS 2004) August 17-19, Corfu Island, Greece The papers of this conference will be published: (a) as regular papers in the IASME/WSEAS conference proceedings (b) regular papers in the IASME TRANSACTIONS ON MECHANICAL ENGINEERING http://www.wseas.org REGISTRATION FEES: 250 EUR DEADLINE: APRIL 10, 2004 ACCOMODATION: Incredible low prices in a 5 Star Sea Resort (former HILTON of Corfu Island), Greece, 5 Star Sea resort where the multiconference of WSEAS will take place in August 2004: 51 EUR in double room and 81 EUR in single room. (in August 2004, in the Capital of Greece, Athens, the 2004 Olympic Games will take place) ---> Sponsored by IASME <---- Topics of FLUIDS 2004 Mathematical Modelling in fluid mechanics Simulation in fluid mechanics Numerical methods in fluid mechanics Convection, heat and mass transfer Experimental Methodologies in fluid mechanics Thin film technologies Multiphase flow Boundary layer flow Material properties Fluid structure interaction Hydrotechnology Hydrodynamics Coastal and estuarial modelling Wave modelling Industrial applications Environmental Problems Air Pollution Problems Fluid Mechanics for Civil Engineering Fluid Mechanics in Geosciences Flow visualisation Biofluids Meteorology Waste Management Environmental protection Management of living resources Mathematical models Management of Rivers and Lakes Underwater Ecology Hydrology Oceanology Ocean Engineering Others INTERNATIONAL SCIENTIFIC COMMITTEE Andrei Fedorov (USA) A. C. Baytas (Turkey) Albert R. George (USA) Alexander I. Leontiev (Russia) Andreas Dillmann (Germany) Bruce Caswell (USA) Chris Swan (UK) David A. Caughey (USA) Derek B Ingham (UK) Donatien Njomo (CM) Dong Chen (Australia) Dong-Ryul Lee (Korea) Edward E. Anderson (USA) G. Gaiser (Germany) G.D. Raithby (Canada) Gad Hetsroni (Israel) H. Beir?o da Veiga (Italy) Ingegerd Sjfholm (Sweden) Jerry R. Dunn (USA) Joseph T. C. Liu (USA) Karl B?hler (Germany) Kenneth S. Breuer (USA) Kumar K. Tamma (USA) Kyungkeun Kang (USA) M. A. Hossain (UK) M. F. El-Amin (USA) M.-Y. Wen (Taiwan) Michiel Nijemeisland (USA) Ming-C. Chyu (USA) Naoto Tanaka (Japan) Natalia V. Medvetskaya (Russia) O. Liungman (Sweden) Philip Marcus (USA) Pradip Majumdar (USA) Rama Subba Reddy Gorla (USA) Robert Nerem (USA) Rod Sobey (UK) Ruairi Maciver (UK) S.M.Ghiaasiaan (USA) Stanley Berger (USA) Tak?o Takahashi (France) Vassilis Gekas (Sweden) Yinping Zhang (China) Yoshitaka Watanabe (Japan) NOTE THAT IN WSEAS CONFERENCES YOU CAN HAVE PROCEEDINGS 1) HARD COPY 2) CD-ROM and 3) Web Publishing WSEAS Books, Journals, Proceedings participate now in all major science citation indexes. 
ISI, ELSEVIER, CSA, AMS. Mathematical Reviews, ELP, NLG, Engineering Index Directory of Published Proceedings, INSPEC (IEE) More Details: http://www.wseas.org Thanks Alexis Espen ##### HOW TO UNSUBSCRIBE #### You receive this newsletter from your email address: beowulf at beowulf.org If you want to unsubscribe, send an email to: wseas at canada.com The Subject of your message must be exactly: REMOVE beowulf at beowulf.org WSEAS If you want to unsubscribe more than one email addresses, send a message to nata at wseas.org with Subject: REMOVE [email1, emal2, ...., emailn] WSEAS _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From michael.worsham at mci.com Tue Mar 9 13:32:14 2004 From: michael.worsham at mci.com (Michael Worsham) Date: Tue, 09 Mar 2004 13:32:14 -0500 Subject: [Beowulf] Cluster school project Message-ID: <000f01c40604$d8ef6520$987a32a6@Wcomnet.com> I would say also check out the Bootable Cluster CD (http://bccd.cs.uni.edu/) as well. It is very easy to use and was specifically designed so you could cluster an entire network lab, without having to worry about the hard drives being written to. -- Michael _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Tue Mar 9 16:13:24 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Tue, 9 Mar 2004 13:13:24 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? Message-ID: Has anyone with dual opteron machines and a kill-a-watt measured how much power they consume? I measured the dual P3 and xeons we have here, but no dual opterons yet. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Tue Mar 9 17:36:05 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Tue, 9 Mar 2004 14:36:05 -0800 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <20040309223605.GA29912@cse.ucdavis.edu> On Tue, Mar 09, 2004 at 01:13:24PM -0800, Trent Piepho wrote: > Has anyone with dual opteron machines and a kill-a-watt measured how much > power they consume? > > I measured the dual P3 and xeons we have here, but no dual opterons yet. I recently measured a Sunfire V20z (dual 2.2 GHz) opteron, I believe it had 2 scsi disks, 4 GB ram. watts VA Idle 237-249 260-281 Pstream 1 thread 260-277 290-311 Pstream 2 threads 265-280 303-313 Pstream is very much like McCalpin's stream, except it uses pthreads 2 run parallel threads in sync, and it runs over a range of array sizes. It's the most power intensive application I've found, anything with heave disk usage tends to decrease the power usage. It's also great for showing memory system parallelism, say for a dual p4 vs opteron. I also find it useful for finding misconfigured dual opterons. 
For those interested: http://cse.ucdavis.edu/bill/pstream.c -- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bill at cse.ucdavis.edu Tue Mar 9 17:49:14 2004 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Tue, 9 Mar 2004 14:49:14 -0800 Subject: [Beowulf] good 24 port gige switch In-Reply-To: <20040308224001.50f2f728@vitalstatistix> References: <1078566922.2547.6.camel@fermi> <20040308224001.50f2f728@vitalstatistix> Message-ID: <20040309224914.GB29912@cse.ucdavis.edu> On Mon, Mar 08, 2004 at 10:40:01PM -0600, Russell Nordquist wrote: > > thanks for all the good info. it got me to thinking....i have resources > for comparing most components of a cluster excepts network switches. it > would be nice to have a source of information for this as well. > something like: > > *bandwidth/latency between 2 hosts > *bandwidth/latency at 25%/50%/75%/100% port usage > *short vs long message comparisons I use nrelay.c a small simple program I wrote that will MPI_Send MPI_send very size packets between sets of nodes. So I do something like the following to find best base latency and bandwidth: mpirun -np 2 ./nrelay 1 # then run with 10 100 1000 10000 size = 1, 2 nodes in 2.97 sec ( 5.7 us/hop) 690 KB/sec size= 10, 524288 hops, 2 nodes in 3.06 sec ( 5.8 us/hop) 6688 KB/sec size= 100, 524288 hops, 2 nodes in 4.19 sec ( 8.0 us/hop) 48868 KB/sec size= 1000, 524288 hops, 2 nodes in 15.37 sec ( 29.3 us/hop) 133267 KB/sec size= 10000, 524288 hops, 2 nodes in 40.72 sec ( 77.7 us/hop) 502908 KB/sec So we have an interconnect that manages 5.8 us for small messages and 500 MB/sec or so for large (10000 MPI_INTs). Then I run: mpirun -np 2,4,8,16,32,64 ./nrelay 10000 size= 10000, 524288 hops, 2 nodes in 40.72 sec ( 77.7 us/hop) 502908 KB/sec size= 10000, 524288 hops, 4 nodes in 39.79 sec ( 75.9 us/hop) 514698 KB/sec size= 10000, 524288 hops, 8 nodes in 39.21 sec ( 74.8 us/hop) 522253 KB/sec size= 10000, 524288 hops, 16 nodes in 45.53 sec ( 86.8 us/hop) 449772 KB/sec size= 10000, 524288 hops, 32 nodes in 49.25 sec ( 93.9 us/hop) 415876 KB/sec size= 10000, 524288 hops, 64 nodes in 52.90 sec (100.9 us/hop) 387111 KB/sec So in this case it looks like the switch is becoming saturated. The source is at: http://cse.ucdavis.edu/bill/nrelay.c I'd love to see numbers posted for various GigE, Myrinet, Dolphin and IB configurations -- Bill Broadley Computational Science and Engineering UC Davis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Tue Mar 9 19:32:49 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue, 9 Mar 2004 19:32:49 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Tue, 9 Mar 2004, Trent Piepho wrote: > Has anyone with dual opteron machines and a kill-a-watt measured how much > power they consume? > > I measured the dual P3 and xeons we have here, but no dual opterons yet. By strange chance yes. An astoundingly low 154 watts (IIRC -- I'm home, the kill-a-watt is at Duke -- but it was definitely ballpark of 150W) under load. That's a load average of 2, one task per processor, without testing under a variety of KINDS of load. Around 75W per loaded CPU. 
That's a bit less than the draw of an >>idle<< dual Athlon (165W). I'm actually racking six more boxes tomorrow and will recheck the draw and verify that it really is under load, but I was with Seth when I measured it and we remarked back and forth about it, really pleased, so I'm pretty sure I'm right. It has several very positive implications and seems believable. They are 1U cases (Penguin Altus 1000's) but the air coming out of the back is not that hot, really, again compared to the E-Z Bake Oven 2U 2466 dual Athlons (something like 260W under load). So we gain significantly in CPU, get access to larger memory if/when we care, get 64 bit memory bus, and drop power and cooling requirements (per CPU, but very nearly per rack U). It just don't get any better than this. I think they are 242's, FWIW. YMMV. I could be wrong, mistaken, deaf, dumb, blind, and stupid. My kill-a-watt could be on drugs. I could be on drugs. Maybe I dropped a decimal and they really draw 1500W. Perhaps the beer I spilled in my kill-a-watt confused it. I was up to 3:30 am finishing a month-late column for deadline himself (leaving me only days late on the CURRENT column) and my brain doesn't work very well any more. Caveat auditor. rgb > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Tue Mar 9 20:41:45 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Wed, 10 Mar 2004 09:41:45 +0800 (CST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040309223605.GA29912@cse.ucdavis.edu> Message-ID: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> --- Bill Broadley ??? > I recently measured a Sunfire V20z (dual 2.2 GHz) > opteron, I believe it had 2 scsi disks, 4 GB ram. > > watts VA > Idle 237-249 260-281 > Pstream 1 thread 260-277 290-311 > Pstream 2 threads 265-280 303-313 But that is with the disks, RAM, and other hardware you have. Anyone with similar configurations but have P4s instead? It just looks too good to believe the numbers... consider that the similar performance one IA64 processor ALONE draws over 120W. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Tue Mar 9 21:08:45 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Tue, 9 Mar 2004 18:08:45 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Tue, 9 Mar 2004, C J Kenneth Tan -- Heuchera Technologies wrote: > What is the power consumption that you measured for your dual P3 and > Xeons? 
System #1: Dual P3-500 Katmai, BX motherbaord, 512 MB PC100 ECC RAM, two tulip NICs, cheap graphics card, 5400 RPM IDE drive, floppy drive, one case fan, and a normal 250W ATX PS with a fan: System #2: Nearly the same as system #1 more or less, but with dual P3-850 Coppermines and no case fan. System #3: Dual Xeon 2.4 GHz 533FSB, E7501 chipset, 1 GB PC2100 ECC memory, two 3Ware 8506-8 cards, a firewire card, onboard intel GB and FE, one Maxtor 6Y200P0 drive, 6 high speed case fans (rated 4.44W each), floppy drive, CD-ROM drive, 550W PS with power factor correction (rated minimum 63% efficient), SATA backplane, and 16 Maxtor 6Y200M0 SATA drives (rated 7.4W idle each) in hotswap carriers. I measured system #3 with the SATA drives both installed and removed. Unfortunately I don't have a dual Xeon with minimal extra hardware to test. #1 Idle 42W 72 VA (.58 PF) #1 Loaded 103W 157 VA (.66 PF) #2 Idle 39W 67 VA (.58 PF) #2 Loaded 96W 148 VA (.65 PF) #3 Idle w/o RAID 162W 168 VA (.96 PF) #3 Loaded w/o RAID 283W 289 VA (.98 PF) #3 Idle w/ RAID 375W (stays at .98) #3 Loaded w/ RAID 510W (stays at .98) #3 Loaded w/RAID/bonnie 534W (stays at .98) For the load, I used two processes of burnP6, part of cpuburn at http://users.ev1.net/~redelm/ For a load breakdown by load type for system 1: 1 process 2 processes burnP5 65W burnP6 72WA 103W (exactly 30W per CPU over idle) burnMMX 64W burnK6 69W burnK7 67W burnBX 87W 90W stream 84W 85W The stream and burnBX memory loaders use more power than a single CPU load program, but two at once and the CPU loaders use more power. To load system #3 with the disks on, I ran bonnie++ on all 16 drives. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Mar 10 00:48:45 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 00:48:45 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> Message-ID: > > I recently measured a Sunfire V20z (dual 2.2 GHz) > > opteron, I believe it had 2 scsi disks, 4 GB ram. > > > > watts VA > > Idle 237-249 260-281 > > Pstream 1 thread 260-277 290-311 > > Pstream 2 threads 265-280 303-313 that's about right. my dual 240's peak at about 250 running two copies of stream and one bonnie (2GB, 40G 7200rpm IDE). > But that is with the disks, RAM, and other hardware > you have. nothing else counts for much. for instance, dimms are a couple watts apiece (makes you wonder about the heatspreaders that gamers/overclockers love so much), nics and disks are ~10W, etc. > Anyone with similar configurations but have > P4s instead? iirc my dual xeon/2.4's peak at around 190W (1-2GB, otherwise same). > It just looks too good to believe the numbers... > consider that the similar performance one IA64 > processor ALONE draws over 120W. hey, to marketing planners, massive power dissipation is probably a *good* thing. 
serious "enterprise" computers must have an impressive dissipation to set them apart from those piddly little game/surfing boxes ;) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From burcu at ulakbim.gov.tr Wed Mar 10 02:30:47 2004 From: burcu at ulakbim.gov.tr (Burcu Akcan) Date: Wed, 10 Mar 2004 09:30:47 +0200 Subject: [Beowulf] SPBS problem Message-ID: <404EC427.7070200@ulakbim.gov.tr> Hi, We have built a beowulf Debian cluster that contains 128 PIV nodes and one dual xeon server. I need some help about SPBS (Storm). We have already installed SPBS on the server and nodes and all daemons seem to work regularly. When any job is given to the system by using pbs scripting, the job can be seen on defined queue by running status and related nodes are allocated for the job. On the other hand there is no cpu or memory consumption on the nodes, the job does not run exactly and at the end of estimated cpu time there is no output file. Can anyone give some advice on my problem about SPBS. Thank you... Burcu Akcan ULAKBIM High Performance Computing Center _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Wed Mar 10 09:56:49 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Wed, 10 Mar 2004 06:56:49 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310014145.8092.qmail@web16807.mail.tpe.yahoo.com> Message-ID: On Wed, 10 Mar 2004, [big5] Andrew Wang wrote: > --- Bill Broadley > > I recently measured a Sunfire V20z (dual 2.2 GHz) > > opteron, I believe it had 2 scsi disks, 4 GB ram. > > > > watts VA > > Idle 237-249 260-281 > > Pstream 1 thread 260-277 290-311 > > Pstream 2 threads 265-280 303-313 > > It just looks too good to believe the numbers... > consider that the similar performance one IA64 > processor ALONE draws over 120W. You also have to consider that the typical computer power supply is only around 60% to 80% efficient. If the CPU draws 120W, then that's going to be something like 150 to 200 watts measured with a power meter, and really, that's what matters. It makes no difference to the AC and circuit breakers if the power is dissipated in the CPU or in the power supply. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Mar 10 10:14:18 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 10 Mar 2004 07:14:18 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: <20040310151418.51414.qmail@web11413.mail.yahoo.com> But the Itanium 2 is using so much energy that Intel couldn't rise the frequency... or else the machine would melt :( See the online lecture: "Things CPU Architects Need To Think About" http://www.stanford.edu/class/ee380/ BTW, that guy used to work for Intel, and he also mentioned about the compiler guys tuned the IA-64 compiler for the benchmarks... Rayson > hey, to marketing planners, massive power dissipation is > probably a *good* thing. 
serious "enterprise" computers must > have an impressive dissipation to set them apart from those > piddly little game/surfing boxes ;) > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at yahoo.com Wed Mar 10 10:13:58 2004 From: raysonlogin at yahoo.com (Rayson Ho) Date: Wed, 10 Mar 2004 07:13:58 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: <20040310151358.43826.qmail@web11407.mail.yahoo.com> But the Itanium 2 is using so much energy that Intel couldn't rise the frequency... or else the machine would melt :( See the online lecture: "Things CPU Architects Need To Think About" http://www.stanford.edu/class/ee380/ BTW, that guy used to work for Intel, and he also mentioned about the compiler guys tuned the IA-64 compiler for the benchmarks... Rayson > hey, to marketing planners, massive power dissipation is > probably a *good* thing. serious "enterprise" computers must > have an impressive dissipation to set them apart from those > piddly little game/surfing boxes ;) > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! Search - Find what you?re looking for faster http://search.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgoornaden at intnet.mu Wed Mar 10 11:42:39 2004 From: rgoornaden at intnet.mu (roudy) Date: Wed, 10 Mar 2004 20:42:39 +0400 Subject: [Beowulf] Writing a parallel program References: <200403101448.i2AEmIA22804@NewBlue.scyld.com> Message-ID: <003701c406bf$085f25b0$590b7bca@roudy> Hello everybody, I completed to build my beowulf cluster. Now I am writing a parallel program using MPICH2. Can someone give me a help. Because, the program that I wrote take more time to run on several nodes compare when it is run on one node. If there is a small program that someone can send me about distributing data among nodes, then each node process the data, and the information is sent back to the master node for printing. This will be a real help for me. Thanks Roud _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Mar 10 12:28:54 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 10 Mar 2004 12:28:54 -0500 (EST) Subject: [Beowulf] Writing a parallel program In-Reply-To: <003701c406bf$085f25b0$590b7bca@roudy> Message-ID: On Wed, 10 Mar 2004, roudy wrote: > Hello everybody, > I completed to build my beowulf cluster. Now I am writing a parallel program > using MPICH2. Can someone give me a help. 
Because, the program that I wrote > take more time to run on several nodes compare when it is run on one node. > If there is a small program that someone can send me about distributing data > among nodes, then each node process the data, and the information is sent > back to the master node for printing. This will be a real help for me. > Thanks > Roud I can't help you much with MPI but I can help you understand the problems you might encounter with ANY message passing system or library in terms of parallel task scaling. There is a ready-to-run PVM program I just posted in tarball form on my personal website that will be featured in the May issue of Cluster World Magazine. http:www.phy.duke.edu/~rgb/General/random_pvm.php It is designed to give you direct control over the most important parameters that affect task scaling so that you can learn just how it works. The task itself consists of a "master" program and a "slave" program. The master parses several parameters from the command line: -n number of slaves -d delay (to vary the amount of simulated work per communication) -r number of rands (to vary the number of communications per run and work burdent per slave) -b a flag to control whether the slaves send back EACH number as it is generated (lots of small messags) or "bundles" all the numbers they generate into a single message. This makes a visible, rather huge difference in task scaling, as it should. The task itself is trivial -- generating random numbers. The master starts by computing a trivial task partitioning among the n nodes. It spawns n slave tasks, sending each one the delay on the command line. It then sends each slave the number of rands to generate and a trivially unique seed as messages. Each slave generates a rand, waits delay (in nanoseconds, with a high-precision polling loop), and either sends it back as a message immediately (the default) or saves it in a large vector until the task is finished and sends the whole buffer as a single message (if the -b flag was set). This serves two valuable purposes for the novice. First, it gives you a ready-to-build working master/slave program to use as a template for a pretty much any problem for which the paradigm is a good fit. Second, by simply playing with it, you can learn LOTS of things about parallel programs and clusters. If delay is small (order of the packet latency, 100 usec or less) the program is in a latency dominated scaling regime where communications per number actually takes longer than generating the numbers and its parallel scaling is lousy (if slowing a task down relative to serial can be called merely lousy). If delay is large, so that it takes a long time to compute and a short time to send back the results, parallel scaling is excellent with near linear speedup. Turning on the -b flag for certain ranges of the delay can "instantly" shift one from latency bounded to bandwidth bounded parallel scaling regimes, and restore decent scaling. Even if you don't use it because it is based on PVM, if you clone it for MPI you'll learn the same lessons there, as they are universal and part of the theoretical basis for understanding parallel scaling. Eventually I'll do an MPI version myself for the column, but the mag HAS an MPI column and my focus would be more for the novice learning about parallel computing in general. BTW, obviously I think that subscribing to CWM is a good idea for novices. Among its many other virtues (such as articles by lots of the luminaries of this vary list:-), you can read my columns. 
In fact, from what I've seen from the first few issues, ALL the columns are pretty damn good and getting back issues to the beginning wouldn't hurt, if it is still possible. If you (or anybody) DO grab random_pvm and give it a try, please send me feedback, preferably before the actual column comes out in May, so that I can fix it before then. It is moderately well documented in the tarball, but of course there is more "documentation" and explanation in the column itself. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Mar 10 12:07:10 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 12:07:10 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <20040310151358.43826.qmail@web11407.mail.yahoo.com> Message-ID: > See the online lecture: "Things CPU Architects Need To Think About" > http://www.stanford.edu/class/ee380/ does anyone have a lead on an open-source player for these .asx files? or at least something not tied to windows? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sp at scali.com Wed Mar 10 13:41:59 2004 From: sp at scali.com (Steffen Persvold) Date: Wed, 10 Mar 2004 19:41:59 +0100 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <404F6177.8050108@scali.com> Mark Hahn wrote: >>See the online lecture: "Things CPU Architects Need To Think About" >>http://www.stanford.edu/class/ee380/ > > > does anyone have a lead on an open-source player for these .asx files? > or at least something not tied to windows? > The .asx file is just a link to a .wmv (Windows Media) file, which again just contains a streaming media reference.
I haven't tried, but I think you could use mplayer to play them : http://www.mplayerhq.hu Best regards, Steffen _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed Mar 10 16:11:07 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 10 Mar 2004 16:11:07 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <404F7BE0.6040900@nada.kth.se> Message-ID: > Seems to be running fine with xine. wow, you're right! thanks... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed Mar 10 18:56:06 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 10 Mar 2004 18:56:06 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: Message-ID: On Wed, 10 Mar 2004, Mark Hahn wrote: > > Seems to be running fine with xine. > > wow, you're right! thanks... (sorry to jump back on the thread this way, but it is easier than scrolling back through mail to find the original:-) I went downstairs again today and really paid attention to the kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 (I don't know why but they are running three jobs instead of two at the moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over 120 V line voltage). This seems lower than a lot of the other numbers being reported (although it is a bit higher than my memory recalled yesterday -- I TOLD you not to trust me:-). It is still considerably better than a dual Athlon at much higher clock as well. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ddw at dreamscape.com Wed Mar 10 20:36:13 2004 From: ddw at dreamscape.com (Daniel Williams) Date: Wed, 10 Mar 2004 20:36:13 -0500 Subject: [Beowulf] Cluster school project References: <200403101446.i2AEknA22660@NewBlue.scyld.com> Message-ID: <404FC28A.7607EF77@dreamscape.com> > From: "Maikel Punie" > Subject: RE: [Beowulf] Cluster school project > Date: Tue, 9 Mar 2004 18:45:47 +0100 > [snip...] > >>Do you mean a computing/programming project could you do, >>like calculating pi to some large number of digits? > >yeah something like that, i realy have no idea what is possible. >if there are any suggestions, they are always welcome. Here's what I want to do once I get enough junk 500mhz machines together: Make a model of the spread of genetic diseases in a population of a few hundred million. I've been wanting to do that for years, but it would probably take a few months to run on any single machine I own. I figure it should run in a few weeks as soon as I get a 16 node cluster together to run it. Is that something you could maybe use? 
DDW _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Thu Mar 11 04:56:24 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Thu, 11 Mar 2004 09:56:24 +0000 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> On Wednesday 10 March 2004 11:56 pm, Robert G. Brown wrote: > > I went downstairs again today and really paid attention to the > kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 > (I don't know why but they are running three jobs instead of two at the > moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over > 120 V line voltage). > > This seems lower than a lot of the other numbers being reported > (although it is a bit higher than my memory recalled yesterday -- I TOLD > you not to trust me:-). It is still considerably better than a dual > Athlon at much higher clock as well. > > rgb I find your numbers a bit surprising still. As part of our latest procurement I looked up the power consumption in the INTEL/AMD documentation for the various processors under consideration:

Athlon   model 6   2200MP            58.9 W
         model 8   2400MP            54.5 W
         model 11  2800MP (Barton)   47.2 W
Opteron  240-244                     82.1 W
         246-248                     89.0 W
Xeon     2.8 GHz                     77 W (512K Cache)
         3.06 GHz                    87 W

I think these numbers are meant to be maximum? -- ------------------------------------------------------------------------------ | | | Dr. Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Thu Mar 11 07:47:57 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Thu, 11 Mar 2004 12:47:57 +0000 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <200403111247.i2BClv215026@heppcb.ph.qmw.ac.uk> On Thursday 11 March 2004 12:35 pm, Bogdan Costescu wrote: > On Thu, 11 Mar 2004, Alex Martin wrote: > > I find your numbers a bit surprising still > > I don't :-) I was surprised that rgb's opteron numbers were so low! > While I can't remember what was the exact figure for the dual Opteron > 246 (2 GHz) system, I'm sure that it was over 200W. > > > Athlon model 11 2800MP (Barton) 47.2 W > > dual Athlon 2800MP (2133MHz) under load from 2 cpuburn ~ 230W > > > Xeon (512K Cache) 3.06 GHz 87 W > > dual Xeon 3.06GHz under load from 2 cpuburn ~ 275W your system numbers are pretty consistent with what I've measured. ( ~230 W for Athlon 2200MP and ~250W for Xeon 2.8GHz ) -- ------------------------------------------------------------------------------ | | | Dr.
Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bogdan.costescu at iwr.uni-heidelberg.de Thu Mar 11 07:35:30 2004 From: bogdan.costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Thu, 11 Mar 2004 13:35:30 +0100 (CET) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: On Thu, 11 Mar 2004, Alex Martin wrote: > I find you numbers a bit surprising still I don't :-) While I can't remember what was the exact figure for the dual Opteron 246 (2 GHz) system, I'm sure that it was over 200W. > Athlon model 11 2800MP (Barton) 47.2 W dual Athlon 2800MP (2133MHz) under load from 2 cpuburn ~ 230W > Xeon (512K Cache) 3.06 GHz 87 W dual Xeon 3.06GHz under load from 2 cpuburn ~ 275W -- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Mar 11 08:39:02 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 11 Mar 2004 08:39:02 -0500 (EST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: On Thu, 11 Mar 2004, Alex Martin wrote: > I find you numbers a bit surprising still As part of our latest procurement > I looked up the power consumption in the INTEL/AMD documention for the > various processors under consideration: ... > Opteron 240-244 82.1 W > 246-248 89.0 W > I think these numbers are meant to be maximum? You've got me -- dunno. I can post a digital photo of the kill-a-watt reading if you like (I was going to take a camera down there anyway to add a new rack photo to the brahma tour). I can also take the kill-a-watt and plug in an electric light bulb or something with a fairly predictable draw and see if it is broken somehow. Right now a system in production work is plugged into it -- I'll try to retrieve it soon and plug one of my new systems into it so that I can run more detailed tests under more controlled loads. I don't know exactly what kind of work is being done in the current jobs being run. One advantage may be that the cases are apparently equipped with a PFC power supply. The power factor appears to be very good -- close to 1. This may make the power supplies themselves run cooler, so that the power draw of the rest of the system IS only 20 or so more watts. The systems also have a bare minimum of peripherals -- a hard disk (sitting idle), onboard dual gig NICs (one idle) and video (idle). Will post newer/better tests as I have time and make them, although others may beat me to it...;-) rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu Mar 11 11:10:16 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 11 Mar 2004 08:10:16 -0800 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: <5.2.0.9.2.20040311080304.017d8008@mailhost4.jpl.nasa.gov> At 08:39 AM 3/11/2004 -0500, Robert G. Brown wrote: >On Thu, 11 Mar 2004, Alex Martin wrote: > > > I find you numbers a bit surprising still As part of our latest > procurement > > I looked up the power consumption in the INTEL/AMD documention for the > > various processors under consideration: >... > > Opteron 240-244 82.1 W > > 246-248 89.0 W > > I think these numbers are meant to be maximum? > >You've got me -- dunno. I can post a digital photo of the kill-a-watt >reading if you like (I was going to take a camera down there anyway to >add a new rack photo to the brahma tour). I can also take the >kill-a-watt and plug in an electric light bulb or something with a >fairly predictable draw and see if it is broken somehow. > >Right now a system in production work is plugged into it -- I'll try to >retrieve it soon and plug one of my new systems into it so that I can >run more detailed tests under more controlled loads. I don't know >exactly what kind of work is being done in the current jobs being run. > >One advantage may be that the cases are apparently equipped with a PFC >power supply. The power factor appears to be very good -- close to 1. >This may make the power supplies themselves run cooler, so that the >power draw of the rest of the system IS only 20 or so more watts. The >systems also have a bare minimum of peripherals -- a hard disk (sitting >idle), onboard dual gig NICs (one idle) and video (idle). Those power supplies are impressive PFC wise.. I'd venture to say, though, that the rated powers are peak over some fairly short time. The Kill-A-Watt averages over some reasonable time (a second or two?), so you could actually have an average that's half the peak. Everytime there's a pipeline stall, or a cache miss, etc, the current's going to change. We used processor current to debug DSP code, because you could actually see interrupts come in during the other steps(FFT = very high power, sudden drop for a few microseconds while ISR is running). You could also accurately time how long each "pass" in the FFT took, since the CPU power dropped while setting up the parameters for the next set of butterflies. To really track this kind of thing down, you'd want to hook a DC current probe around the wires from the Power supply to the motherboard. Then, write some benchmark program with a fairly repeatable computational resource requirement pattern. Look at the current on an oscilloscope. I suspect that onboard filtering will get rid of variations that last less than, say, 1-10 mSec, so a program that has a basic cyclical nature lasting 10 times that would be nice. Ideally, you'd probe the current going to the CPU, vs the rest of the mobo, but that's probably a bit of a challenge. Another experiment would be to write a small program that you KNOW will stay in cache and never go off chip and measure the current draw when running it. James Lux, P.E. 
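(A trivial load generator is enough to try the cyclical-draw experiment Jim describes above. This is just a sketch in plain C, nothing JPL- or vendor-specific; the 500 ms busy/idle phases are an arbitrary assumption chosen so the cycle is slow enough to show up on a current probe or even a kill-a-watt.)

/* Toy cyclical load generator for power-draw experiments (sketch only).
 * Alternates roughly 500 ms of floating-point work with roughly 500 ms
 * of sleep, so the supply current should rise and fall with a ~1 s period.
 * Build: gcc -O2 -o loadcycle loadcycle.c
 */
#include <sys/time.h>
#include <unistd.h>

static double now(void)                     /* wall clock, seconds */
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec * 1e-6;
}

int main(void)
{
    volatile double x = 1.0;                /* volatile so the work isn't optimized away */
    for (;;) {
        double t0 = now();
        while (now() - t0 < 0.5) {          /* busy phase: bursts of FP work between clock checks */
            int i;
            for (i = 0; i < 100000; i++)
                x = x * 1.0000001 + 1e-9;
        }
        usleep(500000);                     /* idle phase: ~500 ms */
    }
    return 0;                               /* not reached */
}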
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tegner at nada.kth.se Wed Mar 10 15:34:40 2004 From: tegner at nada.kth.se (Jon Tegner) Date: Wed, 10 Mar 2004 21:34:40 +0100 Subject: [Beowulf] Power consumption for opterons? In-Reply-To: References: Message-ID: <404F7BE0.6040900@nada.kth.se> Seems to be running fine with xine. /jon Mark Hahn wrote: >>See the online lecture: "Things CPU Architects Need To Think About" >>http://www.stanford.edu/class/ee380/ >> >> > >does anyone have a lead on an open-source player for these .asx files? >or at least something not tied to windows? > > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jimlux at earthlink.net Thu Mar 11 09:07:09 2004 From: jimlux at earthlink.net (Jim Lux) Date: Thu, 11 Mar 2004 06:07:09 -0800 Subject: [Beowulf] Power consumption for opterons? References: <200403110956.i2B9uOu14719@heppcb.ph.qmw.ac.uk> Message-ID: <000e01c40772$2611bf60$36a8a8c0@LAPTOP152422> ----- Original Message ----- From: "Alex Martin" To: "Robert G. Brown" ; "Mark Hahn" Cc: "Jon Tegner" ; Sent: Thursday, March 11, 2004 1:56 AM Subject: Re: [Beowulf] Power consumption for opterons? > On Wednesday 10 March 2004 11:56 pm, Robert G. Brown wrote: > > > > > I went downstairs again today and really paid attention to the > > kill-a-watt. Dual 1600 MHz Opteron, 1 GB of memory, load average of 3 > > (I don't know why but they are running three jobs instead of two at the > > moment) was drawing 182 Watts, 186 VA (roughly 1.5 Amps at a bit over > > 120 V line voltage). > > I find you numbers a bit surprising still As part of our latest procurement > I looked up the power consumption in the INTEL/AMD documention for the > various processors under consideration: > surprising high or surprising low? You're comparing DC power to just the processor vs wall plug power to the whole system (including cooling fans, RAM, PCI bridge chips, etc.) I think that the databook numbers of ca 50-80 W per CPU (probably the highest continuous average power) is nicely matched with 180 W from the wall for a dual CPU... The databook number is probably a bit on the high side... 180W from the wall probably equates to about 140W DC. 
There's probably 10W or so in fans and glue, maybe 100W for both procesors, and 30W for the rest of the logic and RAM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mathiasbrito at yahoo.com.br Fri Mar 12 08:51:22 2004 From: mathiasbrito at yahoo.com.br (=?iso-8859-1?q?Mathias=20Brito?=) Date: Fri, 12 Mar 2004 10:51:22 -0300 (ART) Subject: [Beowulf] Strange Behavior Message-ID: <20040312135122.92643.qmail@web12208.mail.yahoo.com> Hi, I'm benchmarking my 16 nodes cluster with HPL and I obtain a estrange result, different of all I ever seen before. When I send more data with a big N, the performance is worse than with small values of N. I used N=5000 with NB=20 and the performance was 3.3GB, when I send N=10000 with NB=20 i get only 2.1GB. I don't liked the result, the nodes are athlon xp 1600+ with 512MB RAM, and I think the cluster very slow. Someone had the same problem and could help me? Mathias ===== Mathias Brito Universidade Estadual de Santa Cruz - UESC Departamento de Ci?ncias Exatas e Tecnol?gicas Estudante do Curso de Ci?ncia da Computa??o ______________________________________________________________________ Yahoo! Mail - O melhor e-mail do Brasil! Abra sua conta agora: http://br.yahoo.com/info/mail.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Fri Mar 12 11:43:47 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: Fri, 12 Mar 2004 16:43:47 +0000 Subject: [Beowulf] Strange Behavior In-Reply-To: <20040312135122.92643.qmail@web12208.mail.yahoo.com> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> Message-ID: <1079109827.3745.7.camel@tp1.mesh-hq> On Fri, 2004-03-12 at 13:51, Mathias Brito wrote: > I'm benchmarking my 16 nodes cluster with HPL and I > obtain a estrange result, different of all I ever seen > before. When I send more data with a big N, the > performance is worse than with small values of N. I > used N=5000 with NB=20 and the performance was 3.3GB, > when I send N=10000 with NB=20 i get only 2.1GB. I > don't liked the result, the nodes are athlon xp 1600+ > with 512MB RAM, and I think the cluster very slow. > Someone had the same problem and could help me? Please correct me anybody, if im wrong: It seems to me, that the best results are acheived with approx 85-90% memory utilization (leaving something to the rest of the system). (16*512*1024*1024/8)^0.5 ~= 30200, that would close to the best N value isn't Nb=20 very low? I currently use arround 145 for P4 cpu's What performance du you get from a setup like the one above? 
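(A throwaway helper for the sizing rule of thumb above; my own sketch, not part of HPL or anything Lars posted. The 85% memory fraction and NB=128 are assumptions taken from the discussion, not measured values; for 16 nodes with 512 MB each it lands near the ~30200 figure quoted.)

/* Quick HPL problem-size estimator (sketch, not part of HPL).
 * Picks N so the N x N double-precision matrix fills roughly `frac`
 * of the cluster's total RAM, then rounds N down to a multiple of NB.
 *
 *   gcc -o hplsize hplsize.c -lm   &&   ./hplsize 16 512
 */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char **argv)
{
    int    nodes   = (argc > 1) ? atoi(argv[1]) : 16;    /* number of nodes        */
    double mb_node = (argc > 2) ? atof(argv[2]) : 512.0; /* MB of RAM per node     */
    double frac    = 0.85;                               /* assumed usable fraction */
    int    nb      = 128;                                /* assumed block size NB   */

    double bytes = nodes * mb_node * 1024.0 * 1024.0 * frac;
    long   N     = (long)sqrt(bytes / 8.0);              /* 8 bytes per double      */

    N -= N % nb;                                         /* keep N a multiple of NB */
    printf("suggested N ~ %ld (NB = %d)\n", N, nb);
    return 0;
}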
best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Graabroedrestraede 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From M.Arndt at science-computing.de Fri Mar 12 07:06:36 2004 From: M.Arndt at science-computing.de (Michael Arndt) Date: Fri, 12 Mar 2004 13:06:36 +0100 Subject: [Beowulf] Cluster Uplink via Wireless Message-ID: <20040312130636.D49119@blnsrv1.science-computing.de> Hello * has anyone done a wireless technology uplink to a compute cluster that is in real use ? If so, i would be interested to know how and how is the experinece in transferring "greater" (e.g. 2 GB ++ ) Result files? explanation: We have a cluster with gigabit interconnect where it would make life cheaper, if there is a possibility to upload input data and download output data via wireless link, since connecting twisted pair between WS and CLuster would be expensive. TIA Micha _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Fri Mar 12 17:22:58 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Fri, 12 Mar 2004 14:22:58 -0800 Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <20040312130636.D49119@blnsrv1.science-computing.de> Message-ID: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> At 01:06 PM 3/12/2004 +0100, Michael Arndt wrote: >Hello * > >has anyone done a wireless technology uplink to a compute cluster >that is in real use ? >If so, i would be interested to know how and how is the experinece in >transferring "greater" (e.g. 2 GB ++ ) Result files? > >explanation: >We have a cluster with gigabit interconnect >where it would make life cheaper, if there is a possibility to upload >input data and download output data via wireless link, since connecting >twisted pair between WS and CLuster would be expensive. > I have a very small cluster that is using wireless interconnect for everything, and based upon my early observations, I'd be real, real leery of contemplating transferring Gigabytes in any practical time. For instance, loading a 25 MB compressed ram file system using tftp during PXE boot takes about a minute. This is on a very non-optimized configuration using 802.11a, through a variety of devices. Yes, indeed, the ad literature claims 54 Mbps, but that's not the actual data rate, but more the "bit rate" of the over the air signal. Wireless LANs are NOT full duplex, and there are synchronization preambles, etc. that make the throughput much lower. On a standard "11 Mbps" 802.11b type network, the "real data throughput" in a unidirectional transfer is probably closer to 3-5 Mbps. Say you get that wireless link really humming at 20 Mbps real data rate. Transferring 16,000 Mbit is still going to take 10-15 minutes. Your situation might be a bit better, especially if you can use a point to point wireless link. James Lux, P.E. 
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Fri Mar 12 19:29:41 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Fri, 12 Mar 2004 16:29:41 -0800 Subject: [Beowulf] Cluster Uplink via Wireless References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> Message-ID: <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> At 06:04 PM 3/12/2004 -0500, Mark Hahn wrote: > > Say you get that wireless link really humming at 20 Mbps real data > > rate. Transferring 16,000 Mbit is still going to take 10-15 minutes. > >out of truely morbid curiosity, what's the latency like? I'll have some numbers next week. The configuration is sort of weird.. diskless node booting w/PXE D-Link Wireless AP in multi AP connect mode over the air D-Link wireless AP in multi AP connect mode network w/NFS and DHCP server The D-Link boxes try to be smart and not push packets across the air link that are for MACs they know are on the wired side, and that whole process is "tricky"... James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From clwang at csis.hku.hk Fri Mar 12 21:29:43 2004 From: clwang at csis.hku.hk (Cho Li Wang) Date: Sat, 13 Mar 2004 10:29:43 +0800 Subject: [Beowulf] NPC2004 CFP : Deadline Extended to March 22, 2004 Message-ID: <40527217.92D67387@csis.hku.hk> ******************************************************************* NPC2004 IFIP International Conference on Network and Parallel Computing October 18-20, 2004 Wuhan, China http://grid.hust.edu.cn/npc04 ------------------------------------------------------------------- Important Dates Paper Submission March 22, 2004 (extended) Author Notification May 1, 2004 Final Camera Ready Manuscript June 1, 2004 ******************************************************************* Call For Papers The goal of IFIP International Conference on Network and Parallel Computing (NPC 2004) is to establish an international forum for engineers and scientists to present their excellent ideas and experiences in all system fields of network and parallel computing. NPC 2004, hosted by the Huazhong University of Science and Technology, will be held in the city of Wuhan, China - the "Homeland of White Clouds and the Yellow Crane." Topics of interest include, but are not limited to: - Parallel & Distributed Architectures - Parallel & Distributed Applications/Algorithms - Parallel Programming Environments & Tools - Network & Interconnect Architecture - Network Security - Network Storage - Advanced Web and Proxy Services - Middleware Frameworks & Toolkits - Cluster and Grid Computing - Ubiquitous Computing - Peer-to-peer Computing - Multimedia Streaming Services - Performance Modeling & Evaluation Submitted papers may not have appeared in or be considered for another conference. Papers must be written in English and must be in PDF format. 
Detailed electronic submission instructions will be posted on the conference web site. The conference proceedings will be published by Springer Verlag in the Lecture Notes in Computer Science Series (cited by SCI). Best papers from NPC 2004 will be published in a special issue of International Journal of High Performance Computing and Networking (IJHPCN) after conference. ************************************************************************** Committee General Co-Chairs: H. J. Siegel Colorado State University, USA Guojie Li The Institute of Computing Technology, CAS, China Steering Committee Chair: Kemal Ebcioglu IBM T.J. Watson Research Center, USA Program Co-Chairs: Guangrong Gao University of Delaware, USA Zhiwei Xu Chinese Academy of Sciences, China Program Vice-Chairs: Victor K. Prasanna University of Southern California, USA Albert Y. Zomaya University of Sydney, Australia Hai Jin Huazhong University of Science and Technology, China Publicity Co-Chairs: Cho-Li Wang The University of Hong Kong, Hong Kong Chris Jesshope The University of Hull, UK Local Arrangement Chair: Song Wu Huazhong University of Science and Technology, China Steering Committee Members: Jack Dongarra University of Tennessee, USA Guangrong Gao University of Delaware, USA Jean-Luc Gaudiot University of California, Irvine, USA Guojie Li The Institute of Computing Technology, CAS, China Yoichi Muraoka Waseda University, Japan Daniel Reed University of North Carolina, USA Program Committee Members: Ishfaq Ahmad University of Texas at Arlington, USA Shoukat Ali University of Missouri-Rolla, USA Makoto Amamiya Kyushu University, Japan David Bader University of New Mexico, USA Luc Bouge IRISA/ENS Cachan, France Pascal Bouvry University of Luxembourg, Luxembourg Ralph Castain Los Alamos National Laboratory, USA Guoliang Chen University of Science and Technology of China, China Alain Darte CNRS, ENS-Lyon, France Chen Ding University of Rochester, USA Jianping Fan Institute of Computing Technology, CAS, China Xiaobing Feng Institute of Computing Technology, CAS, China Jean-Luc Gaudiot University of California, Irvine, USA Minyi Guo University of Aizu, Japan Mary Hall University of Southern California, USA Salim Hariri University of Arizona, USA Kai Hwang University of Southern California, USA Anura Jayasumana Colorado State Univeristy, USA Chris R. Jesshop The University of Hull, UK Ricky Kwok The University of Hong Kong, Hong Kong Francis Lau The University of Hong Kong, Hong Kong Chuang Lin Tsinghua University, China John Morrison University College Cork, Ireland Lionel Ni Hong Kong University of Science and Technology, Hong Kong Stephan Olariu Old Dominion University, USA Yi Pan Georgia State University, USA Depei Qian Xi'an Jiaotong University, China Daniel A. Reed University of North Carolina at Chapel Hill, USA Jose Rolim University of Geneva, Switzerland Arnold Rosenberg University of Massachusetts at Amherst, USA Sartaj Sahni University of Florida, USA Selvakennedy Selvadurai University of Sydney, Australia Franciszek Seredynski Polish Academy of Sciences, Poland Hong Shen Japan Advanced Institute of Science and Technology, Japan Xiaowei Shen IBM T. J. Watson Research Center, USA Gabby Silberman IBM Centers for Advanced Studies, USA Per Stenstrom Chalmers University of Technology, Sweden Ivan Stojmenovic University of Ottawa, Canada Ninghui Sun Institute of Computing Technology, CAS, China El-Ghazali Talbi University of Lille, France Domenico Talia University of Calabria, Italy Mitchell D. 
Theys University of Illinois at Chicago, USA Xinmin Tian Intel Corporation, USA Dean Tullsen University of California, San Diego, USA Cho-Li Wang The University of Hong Kong, Hong Kong Qing Yang University of Rhode Island, USA Yuanyuan Yang State University of New York at Stony Brook, USA Xiaodong Zhang College of William and Mary, USA Weimin Zheng Tsinghua University, China Bingbing Zhou University of Sydney, Australia Chuanqi Zhu Fudan University, China ------------------------------------------------------------------------ For more information, please contact the program vice-chair at the address below: Dr. Hai Jin, Professor Director, Cluster and Grid Computing Lab Vice-Dean, School of Computer Huazhong University of Science and Technology Wuhan, 430074, China Tel: +86-27-87543529 Fax: +86-27-87557354 e-fax: +1-425-920-8937 e-mail: hjin at hust.edu.cn _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mayank_kaushik at vsnl.net Sat Mar 13 05:24:13 2004 From: mayank_kaushik at vsnl.net (mayank_kaushik at vsnl.net) Date: Sat, 13 Mar 2004 15:24:13 +0500 Subject: [Beowulf] Benchmarking with PVM Message-ID: <74070c77404a91.7404a9174070c7@vsnl.net> hi everyone first of all, id like to thank Robert G. Brown for his help in solving my PVM problem, and getting my cluster running! now that its running, iv been trying to run tests on it to see how fast it really is..so i ran PVMPOV, and the results were pretty impressive- i had two P4s clustered, and the rendering time was reduced by half..may sound trivial to you guys, but to a first-timer like me, it looks great! :-) okay, so heres the deal- we`v got lots of idle computers in the college computer lab..an eclectic mix of P2 350s and P3 733s, which everyone has abandoned in favour of flashy new compaq evo P4 2.4ghzs, so along comes me the evangelist and turns all the outcasts into cluster nodes.. (wev got a gigabit LAN too) now,id like to run benchmarking tests on the cluster so as to outline the increase in performance as individual nodes are added..and also the increase in the load on the network.. are there tools available that would let me do all this..and, say, get graphs etc too? tools that are compatible with PVM? could anyone provide links to places where they can be downloaded? (im running red-hat 9.0 on all systems) thanx in anticipation Mayank PS. those proud compaq evos are giving me trouble..thev got winXP with an NTFS filesystem, n im trying to use partition magic to make a pratition so that i can make a dual boot system and install linux...but partition magic always exits with an error, on all the systems..fips wont work with NTFS..has anyone ever done this? the quick-restor cd says it would remove all partitions and make just one NTFS partition, so i didnt try that. 
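(On the benchmarking question just above: XPVM will display message traffic and task activity for a PVM virtual machine, and Ganglia is a common choice for watching node and network load across a cluster. For the speedup-versus-nodes curve itself, a tiny fixed-work probe is enough; the sketch below happens to use MPI rather than PVM, and the loop count is an arbitrary assumption. Run it with 1, 2, 4, ... processes and plot T(1)/T(p).)

/* Fixed-work scaling probe (sketch).  The total amount of "work" is the
 * same no matter how many processes run it, so the elapsed times give a
 * speedup curve for this toy problem -- not for PVMPOV or any real app.
 *
 *   mpicc -o probe probe.c   &&   mpirun -np 4 ./probe
 */
#include <stdio.h>
#include <mpi.h>

#define TOTAL_WORK 400000000L   /* total loop iterations, fixed regardless of -np */

int main(int argc, char **argv)
{
    int rank, size;
    long i, chunk;
    double local = 0.0, global = 0.0, t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    chunk = TOTAL_WORK / size;             /* each rank gets an equal slice of the work */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();

    for (i = 0; i < chunk; i++)            /* the fixed "work": a partial harmonic sum */
        local += 1.0 / (double)(rank * chunk + i + 1);

    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("np=%d  elapsed=%.3f s  (checksum %.6f)\n", size, t1 - t0, global);

    MPI_Finalize();
    return 0;
}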
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From daniel.kidger at quadrics.com Sat Mar 13 17:00:40 2004 From: daniel.kidger at quadrics.com (Dan Kidger) Date: Sat, 13 Mar 2004 22:00:40 +0000 Subject: [Beowulf] Strange Behavior In-Reply-To: <1079109827.3745.7.camel@tp1.mesh-hq> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> <1079109827.3745.7.camel@tp1.mesh-hq> Message-ID: <200403132200.40877.daniel.kidger@quadrics.com> On Friday 12 March 2004 4:43 pm, Lars Henriksen wrote: > On Fri, 2004-03-12 at 13:51, Mathias Brito wrote: > > I'm benchmarking my 16 nodes cluster with HPL and I > > obtain a estrange result, different of all I ever seen > > before. When I send more data with a big N, the > > performance is worse than with small values of N. I > > used N=5000 with NB=20 and the performance was 3.3GB, > > when I send N=10000 with NB=20 i get only 2.1GB. I > > don't liked the result, the nodes are athlon xp 1600+ > > with 512MB RAM, and I think the cluster very slow. > > Someone had the same problem and could help me? > > Please correct me anybody, if im wrong: > It seems to me, that the best results are acheived with approx 85-90% > memory utilization (leaving something to the rest of the system). > > (16*512*1024*1024/8)^0.5 ~= 30200, that would close to the best N value Your target should be say 75% of theoretical peak performance 0.75 * 16nodes * 1 cpupernode * 1.4Ghz * 1 floppertick = 16.8 Gflops/s So figures like '3.1' Gflops/s (14% peak) are much lower than what you should be achieving (Only vendors like IBM post figures on the top500 with %peak figures as low as this (Nov2003) ) Linpack figures are dominated by the choice of maths library - you do not say which one you are using (MKL, libgoto, Atlas, ACML) ? > isn't Nb=20 very low? I currently use arround 145 for P4 cpu's Remember choice of NB depends on which maths library you use rather than simply on the platform - but in general the best values lie between 80 to 256; 20x20 is far too small for a matrix multiply. Daniel. -------------------------------------------------------------- Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505 ----------------------- www.quadrics.com -------------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ratscus at hotmail.com Sat Mar 13 20:55:17 2004 From: ratscus at hotmail.com (Joe Manning) Date: Sat, 13 Mar 2004 18:55:17 -0700 Subject: [Beowulf] project Message-ID: Does anyone know of a good non-profit that posts data to be processed? Kind of like how SETI dispenses its data, but for cancer or something? I have a whole school to my disposal and am just going to run a diskless system pushed down from a server. I can't really do much about the network, but will use it as a working model for some personal curiosities. (hopefully I will be able to contribute to this group at some point) Also, if anyone does know of a good place to get this type of data, can they please point me in the right direction of the type of process said sight uses, so I can decide what version I want to use to implement the process. 
Thanks, Joe Manning _________________________________________________________________ Get a FREE online computer virus scan from McAfee when you click here. http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From patrick at myri.com Sat Mar 13 21:57:40 2004 From: patrick at myri.com (Patrick Geoffray) Date: Sat, 13 Mar 2004 21:57:40 -0500 Subject: [Beowulf] Strange Behavior In-Reply-To: <200403132200.40877.daniel.kidger@quadrics.com> References: <20040312135122.92643.qmail@web12208.mail.yahoo.com> <1079109827.3745.7.camel@tp1.mesh-hq> <200403132200.40877.daniel.kidger@quadrics.com> Message-ID: <4053CA24.1020901@myri.com> Hi Dan. Dan Kidger wrote: > Your target should be say 75% of theoretical peak performance He is likely using IP over Ethernet, so 50% would be a more reasonable expectation. > So figures like '3.1' Gflops/s (14% peak) are much lower than what you should > be achieving (Only vendors like IBM post figures on the top500 with %peak > figures as low as this (Nov2003) ) Which ones ? Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From unix_no_win at yahoo.com Sun Mar 14 11:49:17 2004 From: unix_no_win at yahoo.com (unix_no_win) Date: Sun, 14 Mar 2004 08:49:17 -0800 (PST) Subject: [Beowulf] project In-Reply-To: Message-ID: <20040314164917.45310.qmail@web40412.mail.yahoo.com> You might want to check out: www.distributedfolding.org --- Joe Manning wrote: > Does anyone know of a good non-profit that posts > data to be processed? Kind > of like how SETI dispenses its data, but for cancer > or something? I have a > whole school to my disposal and am just going to run > a diskless system > pushed down from a server. I can't really do much > about the network, but > will use it as a working model for some personal > curiosities. (hopefully I > will be able to contribute to this group at some > point) Also, if anyone > does know of a good place to get this type of data, > can they please point me > in the right direction of the type of process said > sight uses, so I can > decide what version I want to use to implement the > process. > > Thanks, > > > Joe Manning > > _________________________________________________________________ > Get a FREE online computer virus scan from McAfee > when you click here. > http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963 > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf __________________________________ Do you Yahoo!? Yahoo! 
Mail - More reliable, more storage, less spam http://mail.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From peter at cs.usfca.edu Sun Mar 14 13:57:22 2004 From: peter at cs.usfca.edu (Peter Pacheco) Date: Sun, 14 Mar 2004 10:57:22 -0800 Subject: [Beowulf] Flashmob Supercomputer Message-ID: <20040314185722.GB14301@cs.usfca.edu> The University of San Francisco is sponsoring the first FlashMob Supercomputer on - Saturday, April 3, from 8 am to 6 pm, in the - Koret Center of the University of San Francisco. We're planning to network 1200-1400 laptops with Myrinet and Foundry Switches. We'll be running High-Performance Linpack, and we're hoping to achieve 600 GFLOPS, which is faster than some of the Top500 fastest supercomputers. We need volunteers to - Bring their laptops: Pentium III or IV or AMD, minimum requirements 1.3 GHz with 256 MBytes of RAM - Be table captains: help people set up laptops before running the benchmark - Speak on subjects related to high-performance computing For further information, please visit our website http://flashmobcomputing.org Peter Pacheco Department of Computer Science University of San Francisco San Francisco, CA 94117 peter at cs.usfca.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sun Mar 14 21:17:50 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Mon, 15 Mar 2004 10:17:50 +0800 (CST) Subject: [Beowulf] Oh MyGrid Message-ID: <20040315021750.49880.qmail@web16813.mail.tpe.yahoo.com> http://mygrid.sourceforge.net/ "MyGrid is designed with the modern concepts in mind, simple naming and transparent class hierarchy." It's targeting DataSynapse, licensed under GPL, and more features. Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgoornaden at intnet.mu Sun Mar 14 22:12:55 2004 From: rgoornaden at intnet.mu (roudy) Date: Mon, 15 Mar 2004 07:12:55 +0400 Subject: [Beowulf] Re: Writing a parallel program Message-ID: <000701c40a3b$9a415e60$2b007bca@roudy> Hello, I don't know if it will be here that I can get a solution to my problem. Well, I have an array of elements and I would like to divide the array by the number of processors and then each processor process parts of the whole array. Below is the source code of how I am proceeding, can someone tell me what is wrong? Assume that the I have an array allval[tdegree] void share_data(void) { double nleft; int i, k, j, nmin; nmin = tdegree/size; /* Number of degrees to be handled by each processor */ nleft = tdegree%size; for(i=0;i References: <404EC427.7070200@ulakbim.gov.tr> Message-ID: <40556EA3.60400@ulakbim.gov.tr> Hi, We have built a beowulf Debian cluster that contains 128 PIV nodes and one dual xeon server. I need some help about SPBS (Storm). We have already installed SPBS on the server and nodes and all daemons seem to work regularly. 
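(Back to the array-splitting question from roudy a few messages up, whose share_data() is cut off in the archive: the usual MPI way to do that block split, remainder included, is MPI_Scatterv/MPI_Gatherv. The sketch below is not a reconstruction of his missing code; the names allval and tdegree are borrowed from his fragment, everything else is assumed.)

/* Block-distributing an array with MPI_Scatterv/MPI_Gatherv (sketch).
 * Each rank gets tdegree/size elements, and the first tdegree%size ranks
 * get one extra -- the bookkeeping the hand-rolled nmin/nleft loop attempts.
 */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define TDEGREE 1000                     /* assumed array length */

int main(int argc, char **argv)
{
    int rank, size, i;
    double *allval = NULL, *myval;
    int *counts, *displs;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    counts = malloc(size * sizeof(int));
    displs = malloc(size * sizeof(int));
    for (i = 0; i < size; i++) {         /* per-rank element counts and offsets */
        counts[i] = TDEGREE / size + (i < TDEGREE % size ? 1 : 0);
        displs[i] = (i == 0) ? 0 : displs[i - 1] + counts[i - 1];
    }

    if (rank == 0) {                     /* master owns the full array */
        allval = malloc(TDEGREE * sizeof(double));
        for (i = 0; i < TDEGREE; i++)
            allval[i] = (double)i;
    }
    myval = malloc(counts[rank] * sizeof(double));

    MPI_Scatterv(allval, counts, displs, MPI_DOUBLE,
                 myval, counts[rank], MPI_DOUBLE, 0, MPI_COMM_WORLD);

    for (i = 0; i < counts[rank]; i++)   /* each node processes its own piece */
        myval[i] = myval[i] * myval[i];

    MPI_Gatherv(myval, counts[rank], MPI_DOUBLE,
                allval, counts, displs, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0) {                     /* master prints the collected result */
        printf("allval[0]=%g  allval[%d]=%g\n", allval[0], TDEGREE - 1, allval[TDEGREE - 1]);
        free(allval);
    }
    free(myval); free(counts); free(displs);
    MPI_Finalize();
    return 0;
}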
When any job is given to the system by using pbs scripting, the job can be seen on defined queue by running status and related nodes are allocated for the job. On the other hand there is no cpu or memory consumption on the nodes, the job does not run exactly and at the end of estimated cpu time there is no output file. Can anyone give some advice on my problem about SPBS. Thank you... Burcu Akcan ULAKBIM High Performance Computing Center _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From klamman.gard at telia.com Mon Mar 15 13:42:43 2004 From: klamman.gard at telia.com (Per Lindstrom) Date: Mon, 15 Mar 2004 19:42:43 +0100 Subject: [Beowulf] MOSIX cluster Message-ID: <4055F923.70203@telia.com> Hi, I wonder if some of you have experience of MOSIX? (www.mosix.org) What do you think about that solution for FEA-simulations? Can MOSIX be regarded as a form of a Beowulf cluster? Best regards Per Lindstrom Per.Lindstrom at me.chalmers.se , klamman.gard at telia.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john4482 at umn.edu Mon Mar 15 15:02:40 2004 From: john4482 at umn.edu (Eric R Johnson) Date: Mon, 15 Mar 2004 14:02:40 -0600 Subject: [Beowulf] Scyld system mysteriously locks up Message-ID: <40560BE0.1090808@umn.edu> Hello, I purchased a 4 node, 8 processor Scyld (version 28) cluster approximately 6 months ago. About 5 days ago, it started mysteriously locking up on me. Once it is locked up, I can't do anything except physically reboot the machine. Unfortunately, I am rather new to Linux clusters and, since it worked "right out of the box", I have had no experience in troubleshooting. Can someone give me an idea of where I should start? I have the BIOS on all machines set to do a full memory check on startup and the /var/log/message file shows nothing. Thanks, Eric -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Mon Mar 15 16:23:34 2004 From: agrajag at dragaera.net (Jag) Date: Mon, 15 Mar 2004 16:23:34 -0500 Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <20040312130636.D49119@blnsrv1.science-computing.de> References: <20040312130636.D49119@blnsrv1.science-computing.de> Message-ID: <1079385814.4352.86.camel@pel> On Fri, 2004-03-12 at 07:06, Michael Arndt wrote: > Hello * > > has anyone done a wireless technology uplink to a compute cluster > that is in real use ? > If so, i would be interested to know how and how is the experinece in > transferring "greater" (e.g. 
2 GB ++ ) Result files? > > explanation: > We have a cluster with gigabit interconnect > where it would make life cheaper, if there is a possibility to upload > input data and download output data via wireless link, since connecting > twisted pair between WS and CLuster would be expensive. Depending on your setup, some kind of "wireless" besides 802.11[bg] may be worth considering. I'm assuming the expense in wiring the WS to the cluster isn't wire costs so much as where you'd have to put the cable. One thing you might consider is IR uplink. I don't remember what speed they get, but a few years back I saw a college use IR to get connectivity to a building, that otherwise would have required digging up a busy public street to wire. In the long run it was a lot cheaper. If your expense in wiring is something similar, you may want to look into IR or similar technologies. (The IR guns weren't cheap by any means, except when compared to digging up a city street) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mnerren at paracel.com Mon Mar 15 18:21:04 2004 From: mnerren at paracel.com (micah nerren) Date: Mon, 15 Mar 2004 15:21:04 -0800 Subject: [Beowulf] Scyld system mysteriously locks up In-Reply-To: <40560BE0.1090808@umn.edu> References: <40560BE0.1090808@umn.edu> Message-ID: <1079392863.27739.25.camel@angmar> On Mon, 2004-03-15 at 12:02, Eric R Johnson wrote: > Hello, > > I purchased a 4 node, 8 processor Scyld (version 28) cluster > approximately 6 months ago. About 5 days ago, it started mysteriously > locking up on me. Once it is locked up, I can't do anything except > physically reboot the machine. I would check heating issues. Has the ventilation changed, does the machine feel hot? How long between lockups? Micah _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From john.hearns at clustervision.com Tue Mar 16 04:42:30 2004 From: john.hearns at clustervision.com (John Hearns) Date: Tue, 16 Mar 2004 10:42:30 +0100 (CET) Subject: [Beowulf] Cluster Uplink via Wireless In-Reply-To: <1079385814.4352.86.camel@pel> Message-ID: On Mon, 15 Mar 2004, Jag wrote: > be worth considering. I'm assuming the expense in wiring the WS to the > cluster isn't wire costs so much as where you'd have to put the cable. > One thing you might consider is IR uplink. I don't remember what speed > they get, but a few years back I saw a college use IR to get > connectivity to a building, that otherwise would have required digging > up a busy public street to wire. In the long run it was a lot cheaper. When I worked in Soho, we had a laser link over the rooftops of London. At the time a 155Mbps ATM link, which we later used for 100Mbps Ethernet. Main problem was cleaning the lenses every so often, in the lovely London air conditions. We later put in a gigabit laser from Nbase to another building. We needed much more bandwidth than 100Mbps in the end, and had our own trench dug and put in dark fibre. 
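(For anyone sizing up such an uplink, the transfer-time arithmetic Jim Lux used earlier in this thread is easy to reproduce; the 2 GB file and the 20 Mbit/s effective rate below are the same assumed numbers, not measurements.)

/* Back-of-envelope transfer time: file size / effective throughput.
 * Note the 20 Mbit/s is the optimistic *effective* rate discussed above,
 * not the advertised 54 Mbit/s signalling rate.
 */
#include <stdio.h>

int main(void)
{
    double file_gbytes = 2.0;    /* the 2 GB result file */
    double eff_mbps    = 20.0;   /* assumed effective throughput */
    double seconds     = file_gbytes * 8.0 * 1024.0 / eff_mbps;  /* GB -> Mbit -> s */

    printf("%.1f GB at %.0f Mbit/s effective: about %.0f minutes\n",
           file_gbytes, eff_mbps, seconds / 60.0);
    return 0;
}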
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Tue Mar 16 04:58:23 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Tue, 16 Mar 2004 17:58:23 +0800 (CST) Subject: [Beowulf] MOSIX cluster In-Reply-To: <4055F923.70203@telia.com> Message-ID: <20040316095823.57806.qmail@web16813.mail.tpe.yahoo.com> Since you know the number of tasks your simulations use, I think using a batch system would make it easier to management - MOSIX is usually for jobs which are very dynamic. You can take a look at the common batch systems such as SGE or SPBS. http://gridengine.sunsource.net http://www.supercluster.org/projects/torque/ Andrew. --- Per Lindstrom ????> Hi, > > I wonder if some of you have experience of MOSIX? > (www.mosix.org) > > What do you think about that solution for > FEA-simulations? > > Can MOSIX be regarded as a form of a Beowulf > cluster? > > Best regards > Per Lindstrom > > Per.Lindstrom at me.chalmers.se , > klamman.gard at telia.com > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From prml at na.chalmers.se Mon Mar 15 13:39:23 2004 From: prml at na.chalmers.se (Per R M Lindstrom) Date: Mon, 15 Mar 2004 19:39:23 +0100 (CET) Subject: [Beowulf] (no subject) Message-ID: Hi, I wonder if some of you have experience of MOSIX? (www.mosix.org) What do you think about that solution for FEA-simulations? Can MOSIX be regarded as a form of a Beowulf cluster? Best regards Per Lindstrom _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bioinformaticist at mn.rr.com Mon Mar 15 14:49:36 2004 From: bioinformaticist at mn.rr.com (Eric R Johnson) Date: Mon, 15 Mar 2004 13:49:36 -0600 Subject: [Beowulf] Scyld system mysteriously locks up Message-ID: <405608D0.60501@mn.rr.com> Hello, I purchased a 4 node, 8 processor Scyld (version 28) cluster approximately 6 months ago. About 5 days ago, it started mysteriously locking up on me. Once it is locked up, I can't do anything except physically reboot the machine. Unfortunately, I am rather new to Linux clusters and, since it worked "right out of the box", I have had no experience in troubleshooting. Can someone give me an idea of where I should start? I have the BIOS on all machines set to do a full memory check on startup and the /var/log/message file shows nothing. Thanks, Eric -- ******************************************************************** Eric R A Johnson University Of Minnesota tel: (612) 626 5115 Dept. 
of Laboratory Medicine & Pathology fax: (612) 625 1121 7-230 BSBE e-mail: john4482 at umn.edu 312 Church Street web: www.eric-r-johnson.com Minneapolis, MN 55455 USA _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From br66 at HPCL.CSE.MsState.Edu Mon Mar 15 18:09:37 2004 From: br66 at HPCL.CSE.MsState.Edu (Balaji Rangasamy) Date: Mon, 15 Mar 2004 17:09:37 -0600 (CST) Subject: [Beowulf] MPICH Exporting environment variables. Message-ID: Hi, Has anyone successfully exported any environment variables (specifically LD_PRELOAD) in MPICH? There is an easy way to do this with LAM/MPI; there is this -x switch in mpirun command that comes with LAM/MPI that will export the environment variable you specify to all the child processes. Is there any easy way to do this in MPICH? Thanks for your reply, Balaji. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tsariysk at craft-tech.com Tue Mar 16 14:12:46 2004 From: tsariysk at craft-tech.com (Ted Sariyski) Date: Tue, 16 Mar 2004 14:12:46 -0500 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> Message-ID: <405751AE.2040806@craft-tech.com> Hi, I am about to configure a 16 node dual xeon cluster based on Supermicro X5DPA-TGM motherboard. The cluster may grow so I am looking for a manageable, nonblocking 24 or 32 port gigabit switch. Any comments or recommendations will be highly appreciated. Thanks, Ted _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Tue Mar 16 13:49:02 2004 From: agrajag at dragaera.net (Sean Dilda) Date: Tue, 16 Mar 2004 13:49:02 -0500 Subject: [Beowulf] Scyld system mysteriously locks up In-Reply-To: <405608D0.60501@mn.rr.com> References: <405608D0.60501@mn.rr.com> Message-ID: <1079462942.4354.49.camel@pel> On Mon, 2004-03-15 at 14:49, Eric R Johnson wrote: > Hello, > > I purchased a 4 node, 8 processor Scyld (version 28) cluster > approximately 6 months ago. About 5 days ago, it started mysteriously > locking up on me. Once it is locked up, I can't do anything except > physically reboot the machine. > Unfortunately, I am rather new to Linux clusters and, since it worked > "right out of the box", I have had no experience in troubleshooting. > Can someone give me an idea of where I should start? > I have the BIOS on all machines set to do a full memory check on startup > and the /var/log/message file shows nothing. It might be useful to try to figure out what is locking up. Is it just the head node that's locking? Have you made any recent changes that might account for it? Or are you running any new programs that might be stressing the machine in a way it wasn't stressed before? If its completely locking (if you can no longer toggle the numlock light on your keyboard, then its completely locked), then its either a kernel hang, or a hardware issue. 
If the kernel is the same and the usage pattern hasn't changed, then it might be a hardware issue. Hardware can degrade over time and dying hardware can be unpredictable. You may also consider contacting Scyld, and possibly the hardware manufacturer for help diagnosing the problem. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From david.n.lombard at intel.com Tue Mar 16 16:19:12 2004 From: david.n.lombard at intel.com (Lombard, David N) Date: Tue, 16 Mar 2004 13:19:12 -0800 Subject: [Beowulf] MOSIX for FEA (was: no subject) Message-ID: <187D3A7CAB42A54DB61F1D05F0125722025F5662@orsmsx402.jf.intel.com> From: Per R M Lindstrom; Monday, March 15, 2004 10:39 AM > > Hi, > > I wonder if some of you have experience of MOSIX? (www.mosix.org) > > What do you think about that solution for FEA-simulations? As with all things, "it depends." More specifically, it depends on the characteristics of the FEA app. For the FEA app that I have intimate familiarity with, MOSIX would not work well at all. The reason is the app is highly sensitive to sustained memory bandwidth and sustained disk I/O bandwidth. While memory bandwidth is not an issue with MOSIX, disk I/O bandwidth will become an issue once MOSIX migrates a process to balance CPU load. The (local scratch) disk I/O will then be forced through both the current and original nodes, severely impacting the bandwidth. Having said that, I can imagine an in-memory FEA app that could work quite well on MOSIX. More specifically, the hypothetical app would read its data from disk, crunch for a while, and then write its results to disk. -- David N. Lombard My comments represent my opinions, not those of Intel Corporation _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gropp at mcs.anl.gov Tue Mar 16 15:14:58 2004 From: gropp at mcs.anl.gov (William Gropp) Date: Tue, 16 Mar 2004 14:14:58 -0600 Subject: [Beowulf] MPICH Exporting environment variables. In-Reply-To: References: Message-ID: <6.0.0.22.2.20040316141246.025e4f48@localhost> At 05:09 PM 3/15/2004, Balaji Rangasamy wrote: >Hi, >Has anyone successfully exported any environment variables (specifically >LD_PRELOAD) in MPICH? There is an easy way to do this with LAM/MPI; there >is this -x switch in mpirun command that comes with LAM/MPI that will >export the environment variable you specify to all the child processes. Is >there any easy way to do this in MPICH? It depends on the process manager/startup system that you are using with MPICH. With the "p4 secure server", environment variables can be exported. With the default ch_p4 device, environment variables are not exported. Under MPICH2, most process managers export the environment to the user processes. Bill _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From csamuel at vpac.org Tue Mar 16 22:09:44 2004 From: csamuel at vpac.org (Chris Samuel) Date: Wed, 17 Mar 2004 14:09:44 +1100 Subject: [Beowulf] cfengine users ? 
Message-ID: <200403171409.45273.csamuel@vpac.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi folks, Anyone out there using cfengine to manage clusters, or who's tried and failed? Just curious as to whether it's worth looking at.. cheers! Chris - -- Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin Victorian Partnership for Advanced Computing http://www.vpac.org/ Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQFAV8F4O2KABBYQAh8RAth7AJ9NkRhIUqcykX1zWGZyi/vZcB7JhwCgkVej uX5R/EcQrBPX+/Pyew55FC0= =tRe+ -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From a.j.martin at qmul.ac.uk Wed Mar 17 05:09:28 2004 From: a.j.martin at qmul.ac.uk (Alex Martin) Date: Wed, 17 Mar 2004 10:09:28 +0000 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <405751AE.2040806@craft-tech.com> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> <405751AE.2040806@craft-tech.com> Message-ID: <200403171009.i2HA9S314735@heppcb.ph.qmw.ac.uk> On Tuesday 16 March 2004 7:12 pm, Ted Sariyski wrote: > Hi, > I am about to configure a 16 node dual xeon cluster based on Supermicro > X5DPA-TGM motherboard. The cluster may grow so I am looking for a > manageable, nonblocking 24 or 32 port gigabit switch. Any comments or > recommendations will be highly appreciated. > Thanks, > Ted > You might want to look at the HP ProCurve 2824 or 2848 series. We choose the latter, because it means we only need one switch per (logical) rack and the cost/port is pretty low. I can't yet comment on performance. cheers, Alex -- ------------------------------------------------------------------------------ | | | Dr. Alex Martin | | e-Mail: a.j.martin at qmul.ac.uk Queen Mary, University of London, | | Phone : +44-(0)20-7882-5033 Mile End Road, | | Fax : +44-(0)20-8981-9465 London, UK E1 4NS | | | ------------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From bogdan.costescu at iwr.uni-heidelberg.de Wed Mar 17 07:17:06 2004 From: bogdan.costescu at iwr.uni-heidelberg.de (Bogdan Costescu) Date: Wed, 17 Mar 2004 13:17:06 +0100 (CET) Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <200403171009.i2HA9S314735@heppcb.ph.qmw.ac.uk> Message-ID: On Wed, 17 Mar 2004, Alex Martin wrote: > You might want to look at the HP ProCurve 2824 or 2848 series. We > choose the latter, because it means we only need one switch per > (logical) rack and the cost/port is pretty low. I can't yet comment > on performance. I'm interested in buying a 48 port Gigabit switch as well, and I was looking at the 2848 as it has the advantage of 48 ports in only 1U. One thing that is not clear from the descriptions that I find on the net is if it has support for Jumbo frames. Does the documentation that come with it mention something like this or, even better, have you tried using Jumbo frames ? I'm also interested in hearing opinions about other 48 ports Gigabit switches. 
-- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Wed Mar 17 08:04:10 2004 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Wed, 17 Mar 2004 05:04:10 -0800 (PST) Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: Message-ID: On Wed, 17 Mar 2004, Bogdan Costescu wrote: > On Wed, 17 Mar 2004, Alex Martin wrote: > > > You might want to look at the HP ProCurve 2824 or 2848 series. We > > choose the latter, because it means we only need one switch per > > (logical) rack and the cost/port is pretty low. I can't yet comment > > on performance. > > I'm interested in buying a 48 port Gigabit switch as well, and I was > looking at the 2848 as it has the advantage of 48 ports in only 1U. > One thing that is not clear from the descriptions that I find on the > net is if it has support for Jumbo frames. Does the documentation that > come with it mention something like this or, even better, have you > tried using Jumbo frames ? hp does not support jumbo frames on anything except their high-end l3 products... > I'm also interested in hearing opinions about other 48 ports Gigabit > switches. > > -- -------------------------------------------------------------------------- Joel Jaeggli Unix Consulting joelja at darkwing.uoregon.edu GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tsariysk at craft-tech.com Wed Mar 17 07:28:34 2004 From: tsariysk at craft-tech.com (Ted Sariyski) Date: Wed, 17 Mar 2004 07:28:34 -0500 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: References: Message-ID: <40584472.1050600@craft-tech.com> If jumboframes are important you may look at Foundry EdgeIron 24G or 48G. Ted Bogdan Costescu wrote: > On Wed, 17 Mar 2004, Alex Martin wrote: > > >>You might want to look at the HP ProCurve 2824 or 2848 series. We >>choose the latter, because it means we only need one switch per >>(logical) rack and the cost/port is pretty low. I can't yet comment >>on performance. > > > I'm interested in buying a 48 port Gigabit switch as well, and I was > looking at the 2848 as it has the advantage of 48 ports in only 1U. > One thing that is not clear from the descriptions that I find on the > net is if it has support for Jumbo frames. Does the documentation that > come with it mention something like this or, even better, have you > tried using Jumbo frames ? > > I'm also interested in hearing opinions about other 48 ports Gigabit > switches. > -- Ted Sariyski ------------ Combustion Research and Flow Technology, Inc. 6210 Keller's Church Road Pipersville, PA 18947 Tel: 215-766-1520 Fax: 215-766-1524 www.craft-tech.com tsariysk at craft-tech.com ----------------------- "Our experiment is perfect and is not limited by fundamental principles." 
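Whichever switch ends up in the rack, it is worth confirming that jumbo frames are actually in effect end to end, since a single 1500-byte hop silently caps the whole path. The following is only a minimal sketch of a node-side check, assuming a Linux host and an interface name such as eth0 (adjust to whatever your nodes use); it reads the interface MTU with the standard SIOCGIFMTU ioctl.

/* mtucheck.c - print the MTU of a network interface (name is arbitrary).
 * Build: gcc -O2 mtucheck.c -o mtucheck    Run: ./mtucheck eth0
 */
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct ifreq ifr;
    const char *dev = (argc > 1) ? argv[1] : "eth0";
    int fd = socket(AF_INET, SOCK_DGRAM, 0);   /* any socket will do for the ioctl */

    if (fd < 0) {
        perror("socket");
        return 1;
    }
    strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);
    ifr.ifr_name[IFNAMSIZ - 1] = '\0';
    if (ioctl(fd, SIOCGIFMTU, &ifr) < 0) {     /* fetch the current MTU */
        perror("SIOCGIFMTU");
        close(fd);
        return 1;
    }
    printf("%s MTU = %d\n", dev, ifr.ifr_mtu);
    close(fd);
    return 0;
}

Run it on each node after raising the MTU (typically something like ifconfig eth0 mtu 9000, or the equivalent in your distribution's network scripts) and compare with what the switch claims to support; a mismatch anywhere on the path means jumbo frames are not really being used.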
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From canon at nersc.gov Wed Mar 17 10:26:28 2004 From: canon at nersc.gov (canon at nersc.gov) Date: Wed, 17 Mar 2004 07:26:28 -0800 Subject: [Beowulf] cfengine users ? In-Reply-To: Message from Chris Samuel of "Wed, 17 Mar 2004 14:09:44 +1100." <200403171409.45273.csamuel@vpac.org> Message-ID: <200403171526.i2HFQSni004735@pookie.nersc.gov> Chris, We use cfengine to help manage our ~400 node linux cluster and 416 nodes (6656 processor) SP system. I highly recommend it. We typically use an rpm update script (we are moving to yum now) to manage the binaries and use cfengine to manage config files and scripts. There are some aspects of cfengine that can be a little convoluted, but it is very flexible. --Shane ------------------------------------------------------------------------ Shane Canon PSDF Project Lead National Energy Research Scientific Computing Center 1 Cyclotron Road Mailstop 943-256 Berkeley, CA 94720 ------------------------------------------------------------------------ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From anandv at singnet.com.sg Wed Mar 17 00:40:39 2004 From: anandv at singnet.com.sg (Anand Vaidya) Date: Wed, 17 Mar 2004 13:40:39 +0800 Subject: [Beowulf] manageable, nonblocking 24 port gigabit switch In-Reply-To: <405751AE.2040806@craft-tech.com> References: <5.2.0.9.2.20040312141443.017da008@mailhost4.jpl.nasa.gov> <5.2.0.9.2.20040312161915.017da9b0@mailhost4.jpl.nasa.gov> <405751AE.2040806@craft-tech.com> Message-ID: <200403171340.39601.anandv@singnet.com.sg> You can try Foundry Networks EIF24G or EIF48G, offers full BW, 1U, we like it. -Anand On Wednesday 17 March 2004 03:12, Ted Sariyski wrote: > Hi, > I am about to configure a 16 node dual xeon cluster based on Supermicro > X5DPA-TGM motherboard. The cluster may grow so I am looking for a > manageable, nonblocking 24 or 32 port gigabit switch. Any comments or > recommendations will be highly appreciated. > Thanks, > Ted > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Thu Mar 18 10:26:51 2004 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Thu, 18 Mar 2004 10:26:51 -0500 (EST) Subject: [Beowulf] Intel CSA performance? Message-ID: Intel added a special connection on their chipset to connect gigabit on some chipsets (CSA). I've been wondering whether this would offer a latency advantage, since it's conventional wisdom that PCI latency is a noticable part of MPI latency. this article: http://tinyurl.com/2vlez claims that CSA actually hurts latency, which is a bit puzzling. it is, admittedly, "gamepc.com", so perhaps they are unaware of tuning issues like interrupt-coalescing/mitigation. do any of you have CSA-based networks and have done performance tests? thanks, mark hahn. 
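For anyone wanting to run such a test themselves, the usual way to put a number on this is a small ping-pong benchmark (NetPIPE and netperf do the same thing more thoroughly). What follows is only a minimal sketch at the MPI level, assuming an MPI implementation such as MPICH or LAM is installed and that mpicc/mpirun are on the path (the file name is arbitrary); over plain gigabit Ethernet the result will be dominated by the kernel TCP stack and the switch rather than by the PCI-vs-CSA attachment, so treat it as an application-visible number, not a measurement of the chipset itself.

/* pingpong.c - half round-trip latency for 1-byte messages between ranks 0 and 1.
 * Build: mpicc -O2 pingpong.c -o pingpong
 * Run:   mpirun -np 2 -machinefile hosts ./pingpong   (one process per node)
 */
#include <stdio.h>
#include <mpi.h>

#define NWARM 1000    /* warm-up round trips, not timed */
#define NITER 10000   /* timed round trips */

int main(int argc, char **argv)
{
    int rank, size, i;
    char buf = 0;
    double t0 = 0.0, t1 = 0.0;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "needs at least 2 processes\n");
        MPI_Finalize();
        return 1;
    }

    for (i = 0; i < NWARM + NITER; i++) {
        if (i == NWARM) {                 /* start the clock after warm-up */
            MPI_Barrier(MPI_COMM_WORLD);
            t0 = MPI_Wtime();
        }
        if (rank == 0) {
            MPI_Send(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &st);
        } else if (rank == 1) {
            MPI_Recv(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("1-byte half round-trip latency: %.2f microseconds\n",
               (t1 - t0) / NITER / 2.0 * 1e6);

    MPI_Finalize();
    return 0;
}

Comparing the same run over a directly wired port and over whatever CSA-attached or tuned (interrupt-coalescing) configuration is in question should show whether the difference is even visible above the rest of the path.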
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From venkatraman at programmer.net Thu Mar 18 07:03:57 2004 From: venkatraman at programmer.net (Venkatraman Madurai Venkatasubramanyam) Date: Thu, 18 Mar 2004 07:03:57 -0500 Subject: [Beowulf] Suggest me on my attempt!! Message-ID: <20040318120357.A52A91D435B@ws1-12.us4.outblaze.com> Hello people! I am a Computer Science and Engineering student from India. I am planning to build a Beowulf cluster for my project as a part of my curriculum. The resources I have are four laptops (HP Compaq Presario 2100 series) with an Intel Celeron 2 GHz, an 18 GB HDD and 192 MB RAM each, and I don't know what else I should specify here. I have Red Hat Linux 9 running on them. So I am seeking your help on how to build a cluster. Please show me a way, as I am new to the Linux platform. If you can personally help me, I would really appreciate it. MOkShAA. -- ___________________________________________________________ Sign-up for Ads Free at Mail.com http://promo.mail.com/adsfreejump.htm _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu Mar 18 15:01:10 2004 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 18 Mar 2004 15:01:10 -0500 (EST) Subject: [Beowulf] Suggest me on my attempt!! In-Reply-To: <20040318120357.A52A91D435B@ws1-12.us4.outblaze.com> Message-ID: On Thu, 18 Mar 2004, Venkatraman Madurai Venkatasubramanyam wrote: > Hello people! > I am a Computer Science and Engineering student from India. I am > planning to build a Beowulf cluster for my project as a part of my > curriculum. The resources I have are four laptops (HP Compaq Presario > 2100 series) with an Intel Celeron 2 GHz, an 18 GB HDD and 192 MB RAM > each, and I don't know what else I should specify here. I have Red Hat > Linux 9 running on them. So I am seeking your help on how to build a > cluster. Please show me a way, as I am new to the Linux platform. If you > can personally help me, I would really appreciate it. a) Visit http://www.phy.duke.edu/brahma Among other things on this site is an online book on building clusters. Read/skim it. b) In your case the recipe is almost certainly going to be: i) Put laptops on a common switched network (cheap 100 Mbps switch). ii) Install PVM, MPI (lam and/or mpich), programming tools and support if you haven't already on all nodes. iii) Set them up with a common home directory space NFS exported from one to the rest, and with common accounts to match. You can distribute account information on so small a cluster by just copying e.g. /etc/passwd and /etc/group and so on or by using NIS (or other ways). iv) Set up a remote shell so that you can freely login from any node to any other node without a password. I recommend ssh (openssh rpms) but rsh is OK if your network is otherwise isolated and secure. v) Obtain, write, build parallel applications to explore what your cluster can do. There are demo programs for both PVM and MPI that come with the distributions and more are available on the web (a minimal MPI example is sketched just after this list). There is a PVM program template and an example PVM application suitable for demonstrating scaling (also a potential template for master/slave code) on: http://www.phy.duke.edu/~rgb under "General". vi) Proceed from there as your skills increase.
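As a concrete starting point for step v), a first smoke test can be nothing more than the classic MPI hello-world sketched below. It only assumes the mpicc and mpirun wrappers that come with the MPICH or LAM packages from step ii) (the file name is arbitrary); running it with one process per laptop confirms in one go that the network, the common accounts and the password-less remote shell are all working.

/* hello.c - trivial MPI smoke test for a new cluster.
 * Build:  mpicc hello.c -o hello
 * Run:    mpirun -np 4 ./hello      (one process per laptop)
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes in total */
    MPI_Get_processor_name(host, &len);     /* which node am I running on */

    printf("hello from rank %d of %d on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}

Once this prints one line per node, the same build-and-run cycle carries over to the PVM and MPI demo programs mentioned above.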
I think that you'll find that after this you'll be in pretty good shape for further progress, guided as you think necessary by this list. There are also books out there that can help, but they cost money. Finally, I'd strongly suggest subscribing to Cluster World Magazine, where there are both articles and monthly columns that cover how to do all of the above and much more. rgb > MOkShAA. > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rouds at servihoo.com Fri Mar 19 06:48:38 2004 From: rouds at servihoo.com (RoUdY) Date: Fri, 19 Mar 2004 15:48:38 +0400 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: <200310011901.h91J1LY06826@NewBlue.Scyld.com> Message-ID: Hello I really need a very big hand from you... I have to run a program on my cluster for the final year project, which require a lot of computation power... Can someone sent me a program (the source code) or a site where i can download a big program PLEASE ... Using MPI.... Hope to hear from you Roud -------------------------------------------------- Get your free email address from Servihoo.com! http://www.servihoo.com The Portal of Mauritius _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lars at meshtechnologies.com Fri Mar 19 09:31:25 2004 From: lars at meshtechnologies.com (Lars Henriksen) Date: Fri, 19 Mar 2004 14:31:25 +0000 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: References: Message-ID: <1079706684.2520.1.camel@tp1.mesh-hq> On Fri, 2004-03-19 at 11:48, RoUdY wrote: > I have to run a program on my cluster for the final year > project, which require a lot of computation power... > Can someone sent me a program (the source code) or a site > where i can download a big program PLEASE ... > Using MPI.... Try HPL (High-Performance Linpack): http://www.netlib.org/benchmark/hpl/ best regards Lars -- Lars Henriksen | MESH-Technologies A/S Systems Manager & Consultant | Lille Graabroedrestraede 1 www.meshtechnologies.com | DK-5000 Odense C, Denmark lars at meshtechnologies.com | mobile: +45 2291 2904 direct: +45 6311 1187 | fax: +45 6311 1189 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gropp at mcs.anl.gov Fri Mar 19 08:43:43 2004 From: gropp at mcs.anl.gov (William Gropp) Date: Fri, 19 Mar 2004 07:43:43 -0600 Subject: [Beowulf] HELP! MPI PROGRAM In-Reply-To: References: <200310011901.h91J1LY06826@NewBlue.Scyld.com> Message-ID: <6.0.0.22.2.20040319074111.02505e60@localhost> At 05:48 AM 3/19/2004, RoUdY wrote: >Hello >I really need a very big hand from you... >I have to run a program on my cluster for the final year project, which >require a lot of computation power... >Can someone sent me a program (the source code) or a site where i can >download a big program PLEASE ... >Using MPI.... >Hope to hear from you Roud There are many examples included with PETSc (www.mcs.anl.gov/petsc) that can be sized to use as much power as you have. 
HPLinpack will also use as much computational power as you have and allows you to compare your cluster to the Top500 list. Both use MPI for communication. Bill _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gharinarayana at yahoo.com Fri Mar 19 11:34:57 2004 From: gharinarayana at yahoo.com (HARINARAYANA G) Date: Fri, 19 Mar 2004 08:34:57 -0800 (PST) Subject: [Beowulf] Give an application to PARALLELIZE Message-ID: <20040319163457.3051.qmail@web11306.mail.yahoo.com> Dear friends, Please give me a very good application which uses pda(algorithms) and MPI to the maximum extent and which is POSSIBLE to do in 2 months(It's OK even if you have done it already, just send the NAME of the topic and the problem requirements). I am doing my Bachelor of Engineering in Comp. Science at RNSIT,Bangalore,INDIA. I am with a team of 4 people. With regards, Sivaram. __________________________________ Do you Yahoo!? Yahoo! Mail - More reliable, more storage, less spam http://mail.yahoo.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri Mar 19 21:18:31 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sat, 20 Mar 2004 10:18:31 +0800 (CST) Subject: [Beowulf] GridEngine 6.0 beta is ready! Message-ID: <20040320021831.65847.qmail@web16811.mail.tpe.yahoo.com> It's finally available, follow this link to download the binary packages or source: http://gridengine.sunsource.net/project/gridengine/news/SGE60beta-announce.html Andrew. ----------------------------------------------------------------- ??? Yahoo!?? ?????????????????????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Fri Mar 19 21:51:56 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Fri, 19 Mar 2004 18:51:56 -0800 Subject: [Beowulf] Intel CSA performance? In-Reply-To: References: Message-ID: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> On Thu, Mar 18, 2004 at 10:26:51AM -0500, Mark Hahn wrote: > Intel added a special connection on their chipset to connect > gigabit on some chipsets (CSA). I've been wondering whether > this would offer a latency advantage, since it's conventional wisdom > that PCI latency is a noticable part of MPI latency. Eh? PCI latency can be noticable when you have a low latency network, but gigE latency isn't nearly that low, especially once you've gone through a switch. The only reference to gigabit latency in the article didn't say what they measured. I'd assume that it was using the normal drivers, which means the kernel networking stack, which means you're looking through the telescope from the wrong end. 
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From desi_star786 at yahoo.com Sat Mar 20 13:38:10 2004 From: desi_star786 at yahoo.com (desi star) Date: Sat, 20 Mar 2004 10:38:10 -0800 (PST) Subject: [Beowulf] Problem running Jaguar on Scyld-Beowulf in parallel mode. Message-ID: <20040320183810.94267.qmail@web40812.mail.yahoo.com> Hi.. I have installed a molecular modeling software Jaguar by Schrodinger Inc. on my scyld-beowulf 16 node cluster. The software runs perfectly fine on the master node but gives an error when I try to run the program on more than one CPU. User manual of the program suggests following steps to run Jaguar in parallel mode: 1. Install MPICH and configure with option: --with-comm=shared --with-device=ch_p4 2. Edit the machine.LINUX file in the MPICH directory and list the name of the host and number of processors on that host. 3. Test that 'rsh' is working 4. Launch the secure server ch4p_servs We already have the MPICH installed on the cluster using package 'mpich-p4-inter-1.3.2-5_scyld.i368.rpm'. I do not know whether package installation was done with specific configure options in step#1. Do I need to re-install the MPICH? I know that MPICH works perfectly fine for the FORTRAN 90 programs on different nodes. Also, Is it really important to enable 'rsh' on scyld? The cluster is not protected by firewall so I want to use the more secure 'ssh' but then do I need to install the MPICH again telling it to use ssh rather than rsh for communication? I am also wondering if the reason I am not been able to run program on more than one CPU has to do with the fact that Jaguar is not linked to MPICH libraries? This is my first experience with MPICH and running programs in parallel. I would really appreciate quick tips and suggestions as to why I am not been to make Jaguar run in the parallel mode. Thanks in advance. Eagerly waiting for a response. -- Pratap Singh. Graduate Student, The Chemical and Biomolecular Eng. Johns Hopkins Univ. __________________________________ Do you Yahoo!? Yahoo! Finance Tax Center - File online. File on time. http://taxes.yahoo.com/filing.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rmyers1400 at comcast.net Fri Mar 19 22:58:53 2004 From: rmyers1400 at comcast.net (Robert Myers) Date: Fri, 19 Mar 2004 22:58:53 -0500 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> Message-ID: <405BC17D.3010504@comcast.net> Greg Lindahl wrote: >On Thu, Mar 18, 2004 at 10:26:51AM -0500, Mark Hahn wrote: > > > >>Intel added a special connection on their chipset to connect >>gigabit on some chipsets (CSA). I've been wondering whether >>this would offer a latency advantage, since it's conventional wisdom >>that PCI latency is a noticable part of MPI latency. >> >> > >Eh? PCI latency can be noticable when you have a low latency network, >but gigE latency isn't nearly that low, especially once you've gone >through a switch. > >The only reference to gigabit latency in the article didn't say what >they measured. 
I'd assume that it was using the normal drivers, which >means the kernel networking stack, which means you're looking through >the telescope from the wrong end. > > > I had thought it might be interesting to fool around with trying to use CSA for hyperscsi, but I think you're saying if you're going to use a switched network, don't bother, if you're trying to win on latency. When Intel abandoned infiniband and the memory controller hub sprouted this ethernet link, I figured that was their opening shot in stomping what's left of infiniband. Maybe it is, and they just don't care about latency, but it sounds like nobody's got any reliable information as to what the latency effects of CSA may be, anyway. Every indication I can find is that Intel has all its bets on ethernet, and I don't know that there is any technological obstacle to building a low-latency ethernet. RM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jimlux at earthlink.net Sat Mar 20 17:42:54 2004 From: jimlux at earthlink.net (Jim Lux) Date: Sat, 20 Mar 2004 14:42:54 -0800 Subject: [Beowulf] Wireless network speed for clusters Message-ID: <002b01c40ecc$cd7cec50$32a8a8c0@LAPTOP152422> Some preliminary results for those of you wondering just how slow it actually is... Configuration is basically this: node (Via EPIA C3 533MHz) running freevix kernel (ramdisk filesystem) wired connection through Dlink 5 port hub DWL-7000AP set up for point to multipoint 802.11a (5GHz band) luminiferous aether DWL-7000AP ancient 10Mbps hub Clunky PPro running Knoppix/debian Maxtor NAS with a NFS mount Pings with default 63 byte packets give 1.2-2.0 ms both ways... Compare to <0.1 ms with a wired connection (i.e. plugging a cable from the Dlink hub to the ancient hub) DHCP/PXE booting sort of works (not exhaustively tested) For some reason, the nodes can't see the NAS so NFS doesn't mount There are a lot of "issues" with the DWL-7000AP... I think it's trying to be clever about not routing traffic to MACs on the local side over the air, but then, it doesn't know to route the traffic to the NFS server. The DWL-7000's also don't like to be powered up with no live (as in responding to packets) device hooked up to them, so there's sort of a potential power sequencing thing with the EPIA boards and the DWL-7000AP. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at pathscale.com Sat Mar 20 23:01:36 2004 From: lindahl at pathscale.com (Greg Lindahl) Date: Sat, 20 Mar 2004 20:01:36 -0800 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <405BC17D.3010504@comcast.net> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> <405BC17D.3010504@comcast.net> Message-ID: <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> On Fri, Mar 19, 2004 at 10:58:53PM -0500, Robert Myers wrote: > I had thought it might be interesting to fool around with trying to use > CSA for hyperscsi, but I think you're saying if you're going to use a > switched network, don't bother, if you're trying to win on latency. I've never heard of "hyperscsi", and I am not saying what you think I'm saying. 
What I am saying is that if you're going to use 1 gigabit Ethernet, which has high latency in the switches, AND go through the kernel, don't bother. I was pretty clear, so I don't see how you missed it. There are certainly many examples of switched networks that are low latency, such as Myrinet, IB, Quadrics, SCI, and so forth. > When Intel abandoned infiniband Intel has not abandoned Infiniband. They discontinued a 1X interface that was going to get stomped in the market that was developing more slowly than expected. Just like you drew the wrong lesson from what I said, don't draw the wrong lesson from what Intel did. > Every indication I can find is that Intel has all its bets on ethernet, This contradicts what Intel says. They are not betting against ethernet, but they are certainly encouraging FC and IB where FC and IB make sense. However, this is straying beyond beowulf, and I hope that this mailing list can avoid being the cesspool that comp.arch has been for many years. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rmyers1400 at comcast.net Sun Mar 21 01:40:30 2004 From: rmyers1400 at comcast.net (Robert Myers) Date: Sun, 21 Mar 2004 01:40:30 -0500 Subject: [Beowulf] Intel CSA performance? In-Reply-To: <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> References: <20040320025156.GB7761@greglaptop.internal.keyresearch.com> <405BC17D.3010504@comcast.net> <20040321040136.GA1977@greglaptop.greghome.keyresearch.com> Message-ID: <405D38DE.1010409@comcast.net> Greg Lindahl wrote: >On Fri, Mar 19, 2004 at 10:58:53PM -0500, Robert Myers wrote: > > > >>I had thought it might be interesting to fool around with trying to use >>CSA for hyperscsi, but I think you're saying if you're going to use a >>switched network, don't bother, if you're trying to win on latency. >> >> > >I've never heard of "hyperscsi", and I am not saying what you think >I'm saying. What I am saying is that if you're going to use 1 gigabit >Ethernet, which has high latency in the switches, AND go through the >kernel, don't bother. I was pretty clear, so I don't see how you >missed it. There are certainly many examples of switched networks that >are low latency, such as Myrinet, IB, Quadrics, SCI, and so forth. > I should have been explicit. "If you are going through a switched _ethernet_ connection." If you do the groups.google.com search low-latency infiniband group:comp.arch author:Robert author:Myers you will find that you really don't need to educate me about the existence of low-latency interconnects. As to hyperscsi, I gather that it is incumbent only on others to check google. Hyperscsi is a way to pass raw data over ethernet without going through the TCP/IP stack: http://www.linuxdevices.com/files/misc/hyperscsi.pdf so it doesn't consume nearly the CPU resources that TCP/IP does without hardware offload, and I don't think CSA allows you to use separate hardware TCP/IP offload. It looks potentially interesting as a low-cost clustering interconnect, especially if, as I expect, Intel continues to push ethernet. 
RM _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sun Mar 21 09:46:36 2004 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sun, 21 Mar 2004 22:46:36 +0800 (CST) Subject: [Beowulf] Re: GridEngine 6.0 beta is ready! In-Reply-To: Message-ID: <20040321144636.49074.qmail@web16808.mail.tpe.yahoo.com> SGE 6.1 will be avaiable at the end of the year, so when the newer version of Rocks Cluster picks up SGE 6.0, SGE 6.1 will be available at around the same time. Andrew. --- "Mason J. Katz" ???T???G> Thanks for the update. We're not going to include > this in our April > release, but we will update to the official Opteron > port and remove our > version of this port. We hope to build experience > with SGE 6.0 in the > coming months and include it as part of our November > release as 6.0 > goes from beta to release. Thanks. > > -mjk > > On Mar 19, 2004, at 6:18 PM, Andrew Wang wrote: > > > It's finally available, follow this link to > download > > the binary packages or source: > > > > > http://gridengine.sunsource.net/project/gridengine/news/SGE60beta- > > > announce.html > > > > Andrew. > > > > > > > ----------------------------------------------------------------- > > ????????Yahoo!?????? > > > ?????????????????????????????????????????????????????????> > > http://tw.promo.yahoo.com/mail_premium/stationery.html > ----------------------------------------------------------------- ?C???? Yahoo!?_?? ?????C???B?????????B?R?A???????A???b?H?????? http://tw.promo.yahoo.com/mail_premium/stationery.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Mon Mar 22 12:33:15 2004 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Mon, 22 Mar 2004 09:33:15 -0800 Subject: [Beowulf] Give an application to PARALLELIZE In-Reply-To: <20040319163457.3051.qmail@web11306.mail.yahoo.com> Message-ID: <5.2.0.9.2.20040322093203.017e1000@mailhost4.jpl.nasa.gov> At 08:34 AM 3/19/2004 -0800, HARINARAYANA G wrote: >Dear friends, > >Please give me a very good application which uses >pda(algorithms) and MPI to the maximum extent and >which is POSSIBLE to do in 2 months(It's OK even if >you have done it already, just send the NAME of the >topic and the problem requirements). > > I am doing my Bachelor of Engineering in Comp. >Science at RNSIT,Bangalore,INDIA. > > I am with a team of 4 people. > >With regards, >Sivaram. A couple issues back of IEEE Proceedings, there were several papers describing doing acoustic source localization with a bunch of iPAQs. I don't know if they were doing MPI for node/node communication, but there's fairly extensive literature out there, and the papers describe the algorithms used. James Lux, P.E. 
Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From clwang at csis.hku.hk Sun Mar 21 23:55:00 2004 From: clwang at csis.hku.hk (Cho Li Wang) Date: Mon, 22 Mar 2004 12:55:00 +0800 Subject: [Beowulf] Final Call : NPC2004 (Deadline: March 22, 2004) Message-ID: <405E71A4.1556E651@csis.hku.hk> ******************************************************************* NPC2004 IFIP International Conference on Network and Parallel Computing October 18-20, 2004 Wuhan, China http://grid.hust.edu.cn/npc04 - ------------------------------------------------------------------- Important Dates Paper Submission March 22, 2004 (extended) Author Notification May 1, 2004 Final Camera Ready Manuscript June 1, 2004 ******************************************************************* Call For Papers The goal of IFIP International Conference on Network and Parallel Computing (NPC 2004) is to establish an international forum for engineers and scientists to present their excellent ideas and experiences in all system fields of network and parallel computing. NPC 2004, hosted by the Huazhong University of Science and Technology, will be held in the city of Wuhan, China - the "Homeland of White Clouds and the Yellow Crane." Topics of interest include, but are not limited to: - Parallel & Distributed Architectures - Parallel & Distributed Applications/Algorithms - Parallel Programming Environments & Tools - Network & Interconnect Architecture - Network Security - Network Storage - Advanced Web and Proxy Services - Middleware Frameworks & Toolkits - Cluster and Grid Computing - Ubiquitous Computing - Peer-to-peer Computing - Multimedia Streaming Services - Performance Modeling & Evaluation Submitted papers may not have appeared in or be considered for another conference. Papers must be written in English and must be in PDF format. Detailed electronic submission instructions will be posted on the conference web site. The conference proceedings will be published by Springer Verlag in the Lecture Notes in Computer Science Series (cited by SCI). Best papers from NPC 2004 will be published in a special issue of International Journal of High Performance Computing and Networking (IJHPCN) after conference. ************************************************************************** Committee General Co-Chairs: H. J. Siegel Colorado State University, USA Guojie Li The Institute of Computing Technology, CAS, China Steering Committee Chair: Kemal Ebcioglu IBM T.J. Watson Research Center, USA Program Co-Chairs: Guangrong Gao University of Delaware, USA Zhiwei Xu Chinese Academy of Sciences, China Program Vice-Chairs: Victor K. Prasanna University of Southern California, USA Albert Y. 
Zomaya University of Sydney, Australia Hai Jin Huazhong University of Science and Technology, China Publicity Co-Chairs: Cho-Li Wang The University of Hong Kong, Hong Kong Chris Jesshope The University of Hull, UK Local Arrangement Chair: Song Wu Huazhong University of Science and Technology, China Steering Committee Members: Jack Dongarra University of Tennessee, USA Guangrong Gao University of Delaware, USA Jean-Luc Gaudiot University of California, Irvine, USA Guojie Li The Institute of Computing Technology, CAS, China Yoichi Muraoka Waseda University, Japan Daniel Reed University of North Carolina, USA Program Committee Members: Ishfaq Ahmad University of Texas at Arlington, USA Shoukat Ali University of Missouri-Rolla, USA Makoto Amamiya Kyushu University, Japan David Bader University of New Mexico, USA Luc Bouge IRISA/ENS Cachan, France Pascal Bouvry University of Luxembourg, Luxembourg Ralph Castain Los Alamos National Laboratory, USA Guoliang Chen University of Science and Technology of China, China Alain Darte CNRS, ENS-Lyon, France Chen Ding University of Rochester, USA Jianping Fan Institute of Computing Technology, CAS, China Xiaobing Feng Institute of Computing Technology, CAS, China Jean-Luc Gaudiot University of California, Irvine, USA Minyi Guo University of Aizu, Japan Mary Hall University of Southern California, USA Salim Hariri University of Arizona, USA Kai Hwang University of Southern California, USA Anura Jayasumana Colorado State Univeristy, USA Chris R. Jesshop The University of Hull, UK Ricky Kwok The University of Hong Kong, Hong Kong Francis Lau The University of Hong Kong, Hong Kong Chuang Lin Tsinghua University, China John Morrison University College Cork, Ireland Lionel Ni Hong Kong University of Science and Technology, Hong Kong Stephan Olariu Old Dominion University, USA Yi Pan Georgia State University, USA Depei Qian Xi'an Jiaotong University, China Daniel A. Reed University of North Carolina at Chapel Hill, USA Jose Rolim University of Geneva, Switzerland Arnold Rosenberg University of Massachusetts at Amherst, USA Sartaj Sahni University of Florida, USA Selvakennedy Selvadurai University of Sydney, Australia Franciszek Seredynski Polish Academy of Sciences, Poland Hong Shen Japan Advanced Institute of Science and Technology, Japan Xiaowei Shen IBM T. J. Watson Research Center, USA Gabby Silberman IBM Centers for Advanced Studies, USA Per Stenstrom Chalmers University of Technology, Sweden Ivan Stojmenovic University of Ottawa, Canada Ninghui Sun Institute of Computing Technology, CAS, China El-Ghazali Talbi University of Lille, France Domenico Talia University of Calabria, Italy Mitchell D. 
Theys University of Illinois at Chicago, USA Xinmin Tian Intel Corporation, USA Dean Tullsen University of California, San Diego, USA Cho-Li Wang The University of Hong Kong, Hong Kong Qing Yang University of Rhode Island, USA Yuanyuan Yang State University of New York at Stony Brook, USA Xiaodong Zhang College of William and Mary, USA Weimin Zheng Tsinghua University, China Bingbing Zhou University of Sydney, Australia Chuanqi Zhu Fudan University, China _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 12:20:47 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 09:20:47 -0800 Subject: [Beowulf] Re: scyld and jaguar Message-ID: <200403220920.47878.mwill@penguincomputing.com> Hi, I saw your email on the beowulf list, and have a few comments: 1. MPICH on Scyld does not require rsh or ssh but rather it will take advantage of the bproc features of Scyld to achieve the same faster. 2. If your fortran programs work fine, so should the c programs. Unless you have an executable that is statically linked with its own mpich implementation. You can test that by using 'ldd' on the executable, it will list which libraries it is loading. If there are no mpich libs mentioned, you might have a statically linked program. Let me know how it goes. Michael Will -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jeffrey.b.layton at lmco.com Mon Mar 22 15:03:35 2004 From: jeffrey.b.layton at lmco.com (Jeff Layton) Date: Mon, 22 Mar 2004 15:03:35 -0500 Subject: [Beowulf] NUMA Patches for AMD64 in 2.4? Message-ID: <405F4697.9070507@lmco.com> Good Afternoon! Does anyone know if the latest stock 2.4 kernel has the NUMA patches in it? If not, where can I get NUMA patches that will work for AMD64? TIA! Jeff -- Dr. Jeff Layton Aerodynamics and CFD Lockheed-Martin Aeronautical Company - Marietta _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Mon Mar 22 16:30:04 2004 From: landman at scalableinformatics.com (Joe Landman) Date: Mon, 22 Mar 2004 16:30:04 -0500 Subject: [Beowulf] NUMA Patches for AMD64 in 2.4? In-Reply-To: <405F4697.9070507@lmco.com> References: <405F4697.9070507@lmco.com> Message-ID: <405F5ADC.2080101@scalableinformatics.com> You can pull x86_64 patches from ftp://ftp.x86-64.org/pub/linux/v2.6/ . The 2.4 kernels would need backports in some cases (RedHat is doing this, and I think SUSE might be as well). Not sure if Fedora is doing this as well (no /proc/numa in it or in the SUSE 9.0 AMD64). Joe Jeff Layton wrote: > Good Afternoon! > > Does anyone know if the latest stock 2.4 kernel has the > NUMA patches in it? If not, where can I get NUMA patches > that will work for AMD64? > > TIA! 
> > Jeff > -- Joseph Landman, Ph.D Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://scalableinformatics.com phone: +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From desi_star786 at yahoo.com Mon Mar 22 15:15:27 2004 From: desi_star786 at yahoo.com (desi star) Date: Mon, 22 Mar 2004 12:15:27 -0800 (PST) Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <200403220920.47878.mwill@penguincomputing.com> Message-ID: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Hi Mike, Thanks much for responding. Jaguar is indeed staticaly linked to the MPICH libraries as per manuals. When I ran the ldd commands as you suggested: -- $ ldd Jaguar not a dynamic executable $ -- Thats why the very first step sugested in the Jaguar installation is to build and configure MPICH from the start. Where do I go from here? I also worked on Alan's suggestion and created a dynamic link between the ssh and rsh. I am now stuck in making ssh passwordless. Using 'ssh-keygen -t' I generated public and private keys and then copied public key to the authorised_keys2 in ~/.ssh/. I am not sure if thats all I need to make ssh passwordless. I was wondering if I will have to copy public keys on each node using bpcp command. I would appreciate suggestions in this matter. Thanks. Pratap. --- Michael Will wrote: > Hi, > > I saw your email on the beowulf list, and have a few > comments: > > 1. MPICH on Scyld does not require rsh or ssh but > rather it will take > advantage of the bproc features of Scyld to achieve > the same faster. > > > 2. If your fortran programs work fine, so should the > c programs. Unless you > have an executable that is statically linked with > its own mpich > implementation. You can test that by using 'ldd' on > the executable, it will > list which libraries it is loading. If there are no > mpich libs mentioned, you > might have a statically linked program. > > Let me know how it goes. > > Michael Will > -- > Michael Will, Linux Sales Engineer > NEWS: We have moved to a larger iceberg :-) > NEWS: 300 California St., San Francisco, CA. > Tel: 415-954-2822 Toll Free: 888-PENGUIN > Fax: 415-954-2899 > www.penguincomputing.com > __________________________________ Do you Yahoo!? Yahoo! Finance Tax Center - File online. File on time. http://taxes.yahoo.com/filing.html _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 17:01:31 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 14:01:31 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <200403221401.31370.mwill@penguincomputing.com> The problem is that a statically linked executable will not be able to use the Scyld infrastructure. It won't take advantage of your Infinidband or Myrinet, it won't use bproc, etcpp. You might set up the compute nodes to look like general unix nodes in order to run that particular implementation, but then you loose all the advantages of Scyld. > I also worked on Alan's suggestion and created a > dynamic link between the ssh and rsh. 
AFAIK you would be better off to set the environment variable to force it to use rsh or ssh. I think it's P4_RSHCOMMAND="ssh". The best way would be to ask your vendor to provide you with a dynamically linked executable, or even the source code and compile it yourself. > I am now stuck > in making ssh passwordless. Using 'ssh-keygen -t' I > generated public and private keys and then copied > public key to the authorised_keys2 in ~/.ssh/. I am > not sure if thats all I need to make ssh passwordless. Does it work with localhost? It sometimes is tricky to get it right. Then it could also work remotely, given that you 1) have sshd running 2) have your home NFS mounted 3) have made /dev/random accessible, at least for ssh I believe that's necessary > I was wondering if I will have to copy public keys on > each node using bpcp command. You could do that too if you do not want to NFS mount the home. That you could easily do by editing /etc/exports to export /home and /etc/beowulf/fstab to mount $MASTER, after that rebooting your compute node. (might be possible without rebooting, but I don't know off of the top of my head) Michael -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From m.dierks at skynet.be Mon Mar 22 18:39:30 2004 From: m.dierks at skynet.be (Michel Dierks) Date: Tue, 23 Mar 2004 00:39:30 +0100 Subject: [Beowulf] Minimal OS Message-ID: <405F7932.20404@skynet.be> Hello, I'm a beginner in the Beowulf world. To complete my degree I chose to build a Beowulf cluster. My cluster: 8 slaves: IBM PCs, 166 MHz, 96 MB RAM, 2 GB HD. 1 master: Dell PowerEdge 2200, dual 233 MHz processors, 320 MB RAM, 3 SCSI disks (9.1, 2.1 and 2.1 GB). 1 10/100 Ethernet switch. The application must compute a 2D mesh for research on flows in fluid mechanics. I must use the MPI library for communication and PARMS for the calculation. This application will be developed in C. The operating system is the Red Hat 9.0 distribution. My question is: for the slave PCs, what is the minimal operating system to install (kernel + which packages)? Thank you. Michel D. Belgium _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 18:01:00 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 15:01:00 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <1079996184.4352.14.camel@pel> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> <1079996184.4352.14.camel@pel> Message-ID: <200403221501.00766.mwill@penguincomputing.com> I agree that rather than compiling your own MPICH you should try to make it work with the existing one. However 1) there is no source 2) the binary is statically linked. 3) Scyld does have an mpirun which should set the environment variables right. The right approach is to make it use bpsh instead of rsh or ssh. I saw that some of the calls are done with shell scripts, which might be a way to fix it as well if the environment variables don't help.
Michael On Monday 22 March 2004 02:56 pm, Sean Dilda wrote: > On Mon, 2004-03-22 at 15:15, desi star wrote: > > Hi Mike, > > > > Thanks much for responding. Jaguar is indeed staticaly > > linked to the MPICH libraries as per manuals. When I > > ran the ldd commands as you suggested: > > > > -- > > $ ldd Jaguar > > not a dynamic executable > > $ > > -- > > > > Thats why the very first step sugested in the Jaguar > > installation is to build and configure MPICH from the > > start. Where do I go from here? > > I'm not familiar with Jaguar, but I am somewhat familiar with Scyld. I > believe you are taking the wrong approach with this. > > Even though Jaguar says you should start with building mpich, I don't > think that's what you want to do. You almost certainly want to stick > with the MPICH binaries that were provided by Scyld. First make sure > there is no confusion and remove the copy of mpich that you built. Next > make sure the mpich and mpich-devel packages are installed on your > system. 'rpm -q mpich ; rpm -q mpich-devel' should tell you this. If > they're not, 'rpm -i mpich-XXXXX.rpm' should install the package. You > can find the packages on your Scyld cd(s). > > Once you have those packages installed, then attempt to compile jaguar. > It should link against Scyld's copy of mpich and just work. I suggest > following Scyld's instructions for running mpich jobs, not Jaguars. > Scyld has made adjustments to their copy of MPICH that make it work > right on their system. In the process they also change the way jobs are > launched. So Scyld may not have 'mpirun', but has a better way to start > the job. > > As Michael pointed out, Scyld's version of MPICH doesn't require rsh, > ssh, or anything like it. So your questions along those lines are > somewhat moot. > > > Sean -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mwill at penguincomputing.com Mon Mar 22 17:11:42 2004 From: mwill at penguincomputing.com (Michael Will) Date: Mon, 22 Mar 2004 14:11:42 -0800 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <200403221411.42975.mwill@penguincomputing.com> Another idea - make it use bpsh by setting export P4_RSHCOMMAND="bpsh" or set it to use some shell script of yours that massages its parameters into the format bpsh expects. bpsh will start a process without requiring rsh or ssh, using Scylds bproc support. Michael. -- Michael Will, Linux Sales Engineer NEWS: We have moved to a larger iceberg :-) NEWS: 300 California St., San Francisco, CA. 
Tel: 415-954-2822 Toll Free: 888-PENGUIN Fax: 415-954-2899 www.penguincomputing.com _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From agrajag at dragaera.net Mon Mar 22 17:56:24 2004 From: agrajag at dragaera.net (Sean Dilda) Date: Mon, 22 Mar 2004 17:56:24 -0500 Subject: [Beowulf] Re: scyld and jaguar In-Reply-To: <20040322201527.98403.qmail@web40809.mail.yahoo.com> References: <20040322201527.98403.qmail@web40809.mail.yahoo.com> Message-ID: <1079996184.4352.14.camel@pel> On Mon, 2004-03-22 at 15:15, desi star wrote: > Hi Mike, > > Thanks much for responding. Jaguar is indeed staticaly > linked to the MPICH libraries as per manuals. When I > ran the ldd commands as you suggested: > > -- > $ ldd Jaguar > not a dynamic executable > $ > -- > > Thats why the very first step sugested in the Jaguar > installation is to build and configure MPICH from the > start. Where do I go from here? > I'm not familiar with Jaguar, but I am somewhat familiar with Scyld. I believe you are taking the wrong approach with this. Even though Jaguar says you should start with building mpich, I don't think that's what you want to do. You almost certainly want to stick with the MPICH binaries that were provided by Scyld. First make sure there is no confusion and remove the copy of mpich that you built. Next make sure the mpich and mpich-devel packages are installed on your system. 'rpm -q mpich ; rpm -q mpich-devel' should tell you this. If they're not, 'rpm -i mpich-XXXXX.rpm' should install the package. You can find the packages on your Scyld cd(s). Once you have those packages installed, then attempt to compile jaguar. It should link against Scyld's copy of mpich and just work. I suggest following Scyld's instructions for running mpich jobs, not Jaguars. Scyld has made adjustments to their copy of MPICH that make it work right on their system. In the process they also change the way jobs are launched. So Scyld may not have 'mpirun', but has a better way to start the job. As Michael pointed out, Scyld's version of MPICH doesn't require rsh, ssh, or anything like it. So your questions along those lines are somewhat moot. Sean _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From xyzzy at speakeasy.org Mon Mar 22 21:24:16 2004 From: xyzzy at speakeasy.org (Trent Piepho) Date: Mon, 22 Mar 2004 18:24:16 -0800 (PST) Subject: [Beowulf] Power consumption for opterons? In-Reply-To: <000e01c40772$2611bf60$36a8a8c0@LAPTOP152422> Message-ID: Two weeks ago, I asked about power consumption for dual opteron systems. This is summary of the numbers I saw posted here. 237 idle to 280 loaded for a dual 248 with two SCSI drives from Bill Broadley 250 loaded for a dual 240 from Mark Hahn 182 loaded for a dual 242 from Robert G. Brown The 182 numbers seems to be too low, but it would be nice to have some other data points. Combine fewer fans, less memory, lower power or no harddrive, more efficient power supply, and less load on the CPU, and you could see 182 vs 250 watts I think. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
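To put those wattages in money terms, a rough operating-cost estimate (assuming an electricity price of US$0.10 per kWh, which is purely an illustrative figure): a node drawing 250 W around the clock uses 0.250 kW x 8760 h = 2190 kWh per year, or about $219, while one drawing 182 W uses about 1594 kWh, or roughly $159. That is on the order of $60 per node per year between the highest and lowest loaded figures quoted above, before counting the extra cooling needed to remove the heat.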