From bob at drzyzgula.org Wed May 21 14:14:38 2003
From: bob at drzyzgula.org (Bob Drzyzgula)
Date: Wed, 21 May 2003 14:14:38 -0400
Subject: (Opens can of worms..) What is the best linux distro for a cluster?
In-Reply-To: 
References: <20030515150646.A7727@www2>
Message-ID: <20030521141438.C17173@www2>

Interesting bit of musing in The Register today:

  Red Hat, Linux, consumers, money - do they mix?
  http://www.theregister.co.uk/content/4/30805.html

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From bill at math.ucdavis.edu Wed May 21 18:52:50 2003
From: bill at math.ucdavis.edu (Bill Broadley)
Date: Wed, 21 May 2003 15:52:50 -0700
Subject: Like to hear about opteron experiences
In-Reply-To: <3EC8FC72.4020507@cora.nwra.com>
References: <3EC8FC72.4020507@cora.nwra.com>
Message-ID: <20030521225250.GC32102@sphere.math.ucdavis.edu>

I wrote a double-precision memory benchmark much like John McCalpin's
STREAM, except that I use a variety of array sizes and pthreads to manage
multiple parallel threads. I found a dramatic difference in performance
when running 2 or 4 threads.

For instance, a dual P4 @ 2.4 GHz (80% the clock of today's fastest P4):
  http://www.math.ucdavis.edu/~bill/dual-p4-2.4-icc.png
Note that 2 threads get slower throughput than 1 to main memory.

Versus a dual Opteron 240 (77% as fast as today's fastest Opteron):
  http://www.math.ucdavis.edu/~bill/dual-240-4xpc2700-icc.png
2 threads get approximately 1.66x the throughput of 1 to main memory.

As an additional data point I made a run on a quad 842:
  http://www.math.ucdavis.edu/~bill/4x842-icc.png
4 threads get approximately 1.66x the throughput of 2 to main memory.

Not directly related to generalized double FP performance, but I thought
it might be an interesting data point.
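The array-size sweep described above can be illustrated with a rough, hypothetical sketch. This is not Bill's pthreads benchmark (CPython's GIL would serialize the threads, so this version stays single-threaded); it only shows the shape of the measurement: time repeated copies of buffers ranging from cache-resident to main-memory-resident and report MB/s.

```python
import time

def copy_bandwidth(nbytes, repeats=20):
    """Time repeated whole-buffer copies and return MB/s."""
    src = bytearray(nbytes)
    t0 = time.perf_counter()
    for _ in range(repeats):
        dst = bytes(src)  # C-level copy of the whole buffer
    elapsed = time.perf_counter() - t0
    return nbytes * repeats / elapsed / 1e6

# Sweep array sizes from cache-resident to main-memory-resident,
# as in the plots linked above.
for size in (32 * 1024, 1024 * 1024, 32 * 1024 * 1024):
    print(f"{size:>10d} bytes: {copy_bandwidth(size):10.0f} MB/s")
```

On most machines the small, cache-resident sizes report noticeably higher throughput than the largest size, which is the effect the linked plots are showing.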
These were all test accounts; I actually have 2 Opteron CPUs in hand, but
alas I'm awaiting delivery of an Opteron motherboard.

-- 
Bill Broadley
Mathematics
UC Davis

From andrewxwang at yahoo.com.tw Wed May 21 20:49:22 2003
From: andrewxwang at yahoo.com.tw (Andrew Wang)
Date: Thu, 22 May 2003 08:49:22 +0800 (CST)
Subject: [xcat-user] 5/21/2003, 13:55 - Need info ! Re: [PBS-USERS] LSF vs its "Closest Competitor"
In-Reply-To: <00b601c31f5d$c23049f0$673c6e8c@dt.nchc.gov.tw>
Message-ID: <20030522004922.50461.qmail@web16804.mail.tpe.yahoo.com>

I am also interested in knowing how Platform Computing got the numbers,
or whether they just made them up.

Andrew.

--- c00dcw00 wrote:
> To whom it may concern:
>
> I have the following questions; I hope someone can comment on them. Thanks!
>
> 1. Who is the Closest Competitor?
> 2. Who is the 200,000+ CPU user?
> 3. What is the definition of Fairshare Utilization?
> 4. Are the 100+ clusters homogeneous or heterogeneous?
>
> Sincerely yours,
> David C. Wan
> Tel: 886-3-577-085 ext. 326
> 5/21/2003, 13:55
>
> ----- Original Message -----
> From: "Andrew Wang"
> To: ;
> Sent: Wednesday, May 21, 2003 8:47 AM
> Subject: Fwd: [PBS-USERS] LSF vs its "Closest Competitor"
>
> > A message from the PBS mailing list. Anyone want to comment?
> >
> > Andrew.
> >
> > Ron Chen wrote:
> > > Someone sent me a chat from the Platform "web event",
> > > I would like to share it with PBS developers/users.
> > > =====================================================
> > > Performance, Scalability, Robustness
> > >
> > >                          LSF 5             Closest Competitor
> > >
> > > Clusters                 100+              1
> > > CPUs                     200000+           300
> > > Jobs (active             500000+           ~10000+
> > >   across clusters)
> > > Fairshare Utilization    ~100%             ~50%
> > > Query Time               20% better than   40% slower than
> > >                          LSF 4.2           LSF 5
> > > Scheduler Usage          4K/job            28K/job
> > > ========================================================
> > >
> > > I would love to hear from the people here; at least
> > > a number of the things above are not true.
> > >
> > > I know that PBS with Globus, Silver, or other meta
> > > schedulers can support 100+ clusters too.
> > >
> > > For CPUs supported, I am sure I've heard of people
> > > using PBS with over 500 processors.
> > >
> > > It would be interesting to see how Platform came up
> > > with the numbers!
> > >
> > > -Ron

From jtochoi at hotmail.com Tue May 20 13:45:36 2003
From: jtochoi at hotmail.com (Jung Jin Choi)
Date: Tue, 20 May 2003 13:45:36 -0400
Subject: connecting 200pc
Message-ID: 

Hi all,

I just learned what a Beowulf system is, and found out that I can connect
as many PCs as there are ports in a switch.
Let's say I have a 16-port switch; I can connect up to 16 PCs. Now, my
question is: if I have 200 PCs, how do I connect them? Should I connect 48
PCs to each 48-port switch, then connect these four 48-port switches to
another switch? Please teach me some ways to connect many PCs...

Thank you

Jung Choi

From c00dcw00 at nchc.org.tw Wed May 21 01:56:44 2003
From: c00dcw00 at nchc.org.tw (c00dcw00)
Date: Wed, 21 May 2003 13:56:44 +0800
Subject: 5/21/2003, 13:55 - Need info ! Re: [PBS-USERS] LSF vs its "Closest Competitor"
References: <20030521004740.79916.qmail@web16806.mail.tpe.yahoo.com>
Message-ID: <00b601c31f5d$c23049f0$673c6e8c@dt.nchc.gov.tw>

To whom it may concern:

I have the following questions; I hope someone can comment on them. Thanks!

1. Who is the Closest Competitor?
2. Who is the 200,000+ CPU user?
3. What is the definition of Fairshare Utilization?
4. Are the 100+ clusters homogeneous or heterogeneous?

Sincerely yours,
David C. Wan
Tel: 886-3-577-085 ext. 326
5/21/2003, 13:55

----- Original Message -----
From: "Andrew Wang"
To: ;
Sent: Wednesday, May 21, 2003 8:47 AM
Subject: Fwd: [PBS-USERS] LSF vs its "Closest Competitor"

> A message from the PBS mailing list. Anyone want to comment?
>
> Andrew.
>
> Ron Chen wrote:
> > Someone sent me a chat from the Platform "web event",
> > I would like to share it with PBS developers/users.
> > =====================================================
> > Performance, Scalability, Robustness
> >
> >                          LSF 5             Closest Competitor
> >
> > Clusters                 100+              1
> > CPUs                     200000+           300
> > Jobs (active             500000+           ~10000+
> >   across clusters)
> > Fairshare Utilization    ~100%             ~50%
> > Query Time               20% better than   40% slower than
> >                          LSF 4.2           LSF 5
> > Scheduler Usage          4K/job            28K/job
> > ========================================================
> >
> > I would love to hear from the people here; at least
> > a number of the things above are not true.
> > I know that PBS with Globus, Silver, or other meta
> > schedulers can support 100+ clusters too.
> >
> > For CPUs supported, I am sure I've heard of people
> > using PBS with over 500 processors.
> >
> > It would be interesting to see how Platform came up
> > with the numbers!
> >
> > -Ron

From amitvyas_cse at hotmail.com Thu May 22 14:03:58 2003
From: amitvyas_cse at hotmail.com (Amit vyas)
Date: Thu, 22 May 2003 23:33:58 +0530
Subject: connecting 200pc
In-Reply-To: 
Message-ID: <001501c3208c$84b60ee0$e822030a@amit>

Well, you have two options.

1. Use a 16-port switch for each set of 16 computers, and connect 12 such
switches; that gives you 12x16=192 PCs connected. If you prefer, you can
also save on wires by placing each switch near its set of PCs. Then
connect those 12 switches into a central switch. Only 8 PCs will be left
unconnected, for which you can add one more switch and connect it to the
central switch as well. The same approach works with 48-port switches.

2. This option will be less manageable, I guess: keep connecting switches
in a hierarchy and connect your PCs at the end points. But this adds
management overhead, since a one-level hierarchy is easier to manage than
two or three levels.
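The arithmetic behind option 1 can be checked with a quick sketch. Note one assumption not spelled out in the thread: each leaf switch needs one port reserved as an uplink to the central switch, which is why these counts come out slightly higher than the plain 12x16=192 figure above.

```python
import math

def leaf_switches(n_pcs, ports_per_switch):
    """Leaf switches needed when one port per switch is reserved as an uplink."""
    usable = ports_per_switch - 1  # one port goes to the central switch
    return math.ceil(n_pcs / usable)

print(leaf_switches(200, 16))  # 14 leaf switches (15 usable ports each)
print(leaf_switches(200, 48))  # 5 leaf switches (47 usable ports each)
```

The 14 leaf switches would themselves fill a 16-port central switch almost exactly; with 48-port leaves the central switch only needs 5 ports.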
Hope this helps.

-----Original Message-----
From: beowulf-admin at scyld.com [mailto:beowulf-admin at scyld.com] On Behalf Of Jung Jin Choi
Sent: Tuesday, May 20, 2003 11:16 PM
To: Beowulf at beowulf.org
Subject: connecting 200pc

Hi all,

I just learned what a Beowulf system is, and found out that I can connect
as many PCs as there are ports in a switch. Let's say I have a 16-port
switch; I can connect up to 16 PCs. Now, my question is: if I have 200
PCs, how do I connect them? Should I connect 48 PCs to each 48-port
switch, then connect these four 48-port switches to another switch? Please
teach me some ways to connect many PCs...

Thank you

Jung Choi

From rgb at phy.duke.edu Thu May 22 14:14:20 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu, 22 May 2003 14:14:20 -0400 (EDT)
Subject: connecting 200pc
In-Reply-To: 
Message-ID: 

On Tue, 20 May 2003, Jung Jin Choi wrote:

> Hi all,
>
> I just learned what a Beowulf system is, and found out that I can
> connect as many PCs as there are ports in a switch. Let's say I have a
> 16-port switch; I can connect up to 16 PCs. Now, my question is: if I
> have 200 PCs, how do I connect them? Should I connect 48 PCs to each
> 48-port switch, then connect these four 48-port switches to another
> switch? Please teach me some ways to connect many PCs...

You can proceed several ways. Which one is right for you depends on your
application needs and budget.

If your cluster application is embarrassingly parallel (EP), or does a LOT
of work computationally for a LITTLE interprocessor communication, then
connecting a stack of switches together is fine and is also by far the
cheapest alternative. The best way to interconnect them is almost
certainly going to be what you describe: buy a "master" switch with e.g.
16 ports (m) and plug switches A and B and C and D and ... into its ports
one at a time. All traffic from A to any non-A port thus goes one hop
through m:

  A1 -> Am || mA -> mB || Bm -> B2

for port 1 on switch A to get to port 2 on switch B (the ->'s are within a
switch, the ||'s are between switches). This is fairly symmetrical, not
TOO expensive, and can manage even "real parallel" applications as long as
they aren't too fine grained.

If it IS too fine grained, then your next choice is to bump your budget.
How much you have to bump it depends on your needs and the topology of
your problem. Switch cost per port is absurdly low for switches with 32 or
fewer ports. Note the following snapshot from pricewatch.com:

  $2359 - Switch 64port
  $586  - Switch 48port
  $129  - Switch 32port
  $76   - Switch 24port
  $31   - Switch 16port
  $48   - Switch 12port
  $22   - Switch 8port
  $19   - Switch 5port
  $19   - Switch 4port

In quantities of 32 or fewer ports per switch, the price per port is in
the ballpark of only $2-4 (don't ask me to explain the anomaly at e.g. 16
ports vs 12:-). At 48 it jumps to over $10 per port. At 64 it jumps again
to close to $40 (and an e.g. $1800 HP Procurve 4000M, times two for 80
ports on a single backplane, holds at around $50/port). Clearly it gets
really expensive to pack lots of ports on a single backplane, especially
planes that attempt to deliver full bisection bandwidth between all pairs
of ports.

Compare a wimpy ~200 ports made up of three filled 4000M chassis with gig
uplinks at a ballpark $6000 vs, hmmm, $31x17 = maybe $600 including the
cabling to get to 256 ports with 16 16-port switches interconnected via a
16-port switch. Of course performance is better if you go up a notch and
get 16-port stackable switches with a gigabit uplink and put them on a
16-port gigabit switch.
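The per-port economics in the snapshot above can be tabulated directly. The prices below are copied from the pricewatch.com list in the message; the 12-port vs 16-port anomaly is in the source data, not a typo here.

```python
# Prices ($) by port count, copied from the pricewatch.com snapshot above.
prices = {64: 2359, 48: 586, 32: 129, 24: 76, 16: 31, 12: 48, 8: 22, 5: 19, 4: 19}

for ports in sorted(prices, reverse=True):
    dollars = prices[ports]
    print(f"{ports:2d}-port switch: ${dollars:5d}  ->  ${dollars / ports:6.2f}/port")
```

Running this confirms the claims in the text: roughly $2-4/port at 32 ports and below, over $12/port at 48, and close to $37/port at 64.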
Prices, however, also go up to the ballpark of $2-3K (I think) -- clearly
you can span a range of anywhere from $5 to $50 per port to get to 200+
ports with various topologies and bottlenecks.

If you feel REALLY rich, you can look into Foundry switches, e.g. their
BigIron switches and the like. These are enterprise-class switching
chassis and you will have to bleed money from every pore to buy one, but
you can get hundreds of ports on a common backplane with state-of-the-art
switching technology delivering a large fraction of full bisection
bandwidth and symmetry.

Most users who want to build high performance networks that require full
bisection bandwidth between all pairs of hosts, to run fine grained code
on a cluster containing hundreds of nodes and up, eschew 100BT or even
1000BT and ethernet altogether, and choose either Myrinet or SCI. Both of
those have their own ways of managing very large clusters. In both cases
you will also bleed money from every pore, but it actually might end up
being LESS money than a really big ethernet switch and will have far
better performance (latencies down in the <5 microsecond range instead of
the >100 microsecond range).

This is really only a short review of the options (you may hear more from
some of the networking experts on the list) but this might get you
started. To summarize: a) profile your task and determine its
communication requirements; b) match your task and its expected scaling to
your budget, your node architecture, and a network simultaneously. That
is, to get the most work done per dollar spent, figure out whether you
have to spend relatively much on a network or relatively little. If EP or
coarse grained, spend little on the network, and shift more and more over
to the network at the expense of nodes as the task becomes fine grained
and the ratio of communication to computation increases.
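The network-vs-nodes tradeoff described above can be illustrated with a toy model: perfectly divisible work plus a per-node communication cost. The constants here are invented for illustration only, not measurements; a cheap (slow) network corresponds to a high communication cost, a premium network to a low one.

```python
def parallel_time(n_nodes, work=1000.0, comm_per_node=1.0):
    """Naive runtime model: divisible work plus a per-node communication cost."""
    return work / n_nodes + comm_per_node * n_nodes

# A cheap network (high comm cost) stops paying off at a much smaller
# node count than a fast one (low comm cost).
for comm in (5.0, 0.5):
    times = {n: parallel_time(n, comm_per_node=comm) for n in (8, 16, 32, 64, 128)}
    best = min(times, key=times.get)
    print(f"comm cost {comm}: fastest at {best} of the node counts tried")
```

In this made-up model the cheap network is fastest at 16 nodes and actually slows down beyond that, while the fast network keeps scaling to 32, which is exactly the "fewer nodes on a better network" argument made below.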
Don't worry about "spending nodes" on faster communications but fewer
(maybe a LOT fewer) nodes than you expected/hoped to get -- if you're
doing a really fine grained real parallel task, it won't scale out to LOTS
of nodes anyway, certainly not without a premiere network interconnecting
them. Nodes+network prices range from a ballpark of $750 each for a
cost-benefit-optimal processor with maybe 512 MB of memory on a "cheapest"
network to $2000 or even more for a high-end network. However, if your
parallel task only scales linearly to 16 nodes with a cheap network (and
maybe even slows DOWN after 32 nodes, so buying 200 doesn't actually
help:-) you may be better off using your 200-cheapest-node budget to buy
only 64 nodes on a network that permits the task to scale nearly linearly
up to all 64.

Hope this helps,

   rgb

> Thank you
>
> Jung Choi

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525  email: rgb at phy.duke.edu

From lindahl at keyresearch.com Thu May 22 16:01:05 2003
From: lindahl at keyresearch.com (Greg Lindahl)
Date: Thu, 22 May 2003 13:01:05 -0700
Subject: Question on RHL 7.2, 8.0 and 9.0
In-Reply-To: <3EC6597F.60507@earthlink.net>
References: <1052862721.13407.22.camel@roughneck.liniac.upenn.edu> <3EC24666.2070701@yahoo.com.sg> <3EC6597F.60507@earthlink.net>
Message-ID: <20030522200105.GA1574@greglaptop.internal.keyresearch.com>

On Sat, May 17, 2003 at 12:47:11PM -0300, Sam Daniel wrote:

> There's a company out in Lodi, CA called CheapBytes (www.cheapbytes.com)
> that sells a "workalike" version of RedHat called "Pink Tie". I ordered
> a set of version 9 CDs to compare against the RH9 I bought yesterday.
> According to CheapBytes, "Pink Tie 9.0 has been modified to comply with
> the Red Hat 9.0 EULA (End User License Agreement) regarding the use of
> the Red Hat Logo and Trademark and therefore is freely distributable."

The only thing that's different about Pink Tie is that the redhat-logos
package was replaced, and a few instances of the word "RedHat" got
changed.

The main gotcha is that these days, when you do an upgrade of RedHat, it
looks at /etc/issue to confirm that you're updating RedHat and not another
distro of Linux. Well, Pink Tie changes /etc/issue... so the installer
doesn't show you upgrade as an option. So, if you later upgrade, you need
to add "upgradeany" when you boot.

-- greg

From derek.richardson at pgs.com Thu May 22 16:31:58 2003
From: derek.richardson at pgs.com (Derek Richardson)
Date: Thu, 22 May 2003 15:31:58 -0500
Subject: Opteron issues
In-Reply-To: <200305190743.44605.shewa@inel.gov>
References: <3EC56EC7.4080803@pgs.com> <200305190743.44605.shewa@inel.gov>
Message-ID: <3ECD33BE.9070001@pgs.com>

Andrew,

Thanks for the info. Just out of curiosity, what clock speed are these
CPUs (both crashed and non)?

Thanks to others who replied, BTW!

Regards,
Derek R.

Andrew Shewmaker wrote:

> On Friday 16 May 2003 05:05 pm, Derek Richardson wrote:
>
> > All,
> > I've heard rumors of an Opteron stability issue, but can't seem to
> > find anything concrete on the web yet. Has anyone heard about this?
> > Opinions, experiences?
> > Thanks,
> > Derek R.
>
> I tested an Opteron 240 system with 8GB RAM running SuSE. The gcc 3.3
> (prerelease at the time) that SuSE provided was good... I didn't
> compare it to an earlier version of gcc. The PG beta worked well for an
> f90 code I wanted to test.
> This code runs as a single process and will allocate between 5 and 6 GB
> of RAM (that's the resident set size) when running our big models. The
> big model I tested completed in about 46 hours on the Opteron, compared
> to around 168 hours for a 750MHz UltraSPARC III.
>
> The system I tested used a Newisys motherboard and I was able to crash
> it, but that was before it had its production CPUs. I wasn't able to
> crash it after they were installed.
>
> Andrew

-- 
Linux Administrator
derek.richardson at pgs.com
derek.richardson at ieee.org
Office 713-781-4000
Cell 713-817-1197

One of your most ancient writers, a historian named Herodotus, tells of a
thief who was to be executed. As he was taken away he made a bargain with
the king: in one year he would teach the king's favorite horse to sing
hymns. The other prisoners watched the thief singing to the horse and
laughed. "You will not succeed," they told him. "No one can." To which the
thief replied, "I have a year, and who knows what might happen in that
time. The king might die. The horse might die. I might die. And perhaps
the horse will learn to sing."
-- "The Mote in God's Eye", Niven and Pournelle

From jhearns at freesolutions.net Thu May 22 17:12:52 2003
From: jhearns at freesolutions.net (John Hearns)
Date: Thu, 22 May 2003 22:12:52 +0100
Subject: Question on RHL 7.2, 8.0 and 9.0
References: <1052862721.13407.22.camel@roughneck.liniac.upenn.edu> <3EC24666.2070701@yahoo.com.sg> <3EC6597F.60507@earthlink.net> <20030522200105.GA1574@greglaptop.internal.keyresearch.com>
Message-ID: <000f01c320a6$e8c15470$5c259fd4@mypc>

> > According to CheapBytes, "Pink Tie 9.0 has been modified to comply
> > with the Red Hat 9.0 EULA (End User License Agreement) regarding the
> > use of the Red Hat Logo and Trademark and therefore is freely
> > distributable."
>
> So, if you later upgrade, you need to add "upgradeany" when you boot.
>
> -- greg

I run Pink Tie 8.0 at home, and it is just fine. Thanks to Greg for the
upgrade tip, which I will try when I get 9 soon.

For anyone in the UK, I can heartily recommend John Winters at the Linux
Emporium for all Linux distros, including Pink Tie:
http://www.linuxemporium.co.uk

John also produces update CDs of the latest update RPMs, for the
bandwidth challenged.
From sliu at pipeline.com Fri May 23 09:29:29 2003
From: sliu at pipeline.com (Shaohui Liu)
Date: Fri, 23 May 2003 09:29:29 -0400 (EDT)
Subject: question on beostat
Message-ID: <3882993.1053695718020.JavaMail.nobody@wamui06.slb.atl.earthlink.net>

Hi,

I am a Beowulf newbie running a basic version of Scyld Beowulf on a 9-node
cluster. Here is the OS:

  Linux beowulf1 2.2.19-12.beo #1 Tue Jul 17 17:10:45 EDT 2001 i686 unknown

The slave nodes were started with the default CDs. I was able to run a few
applications with parallel processing, but I could not see any changes in
the monitoring program. If I run the beostat command, I only see
information for nodes -1, 1 and 3, while all other nodes are inaccessible
(even with the -N option). Does anyone know why? How can I see the
resources on each node?
Thanks in advance

Shaohui

From shashank at evl.uic.edu Sat May 24 15:29:08 2003
From: shashank at evl.uic.edu (Shashank Khanvilkar)
Date: Sat, 24 May 2003 14:29:08 -0500
Subject: Buying a Beowulf Cluster (Help)
Message-ID: <008401c3222a$bf5c3d90$44acf880@SHASHANK>

Hi,

We have already put in an order for a Beowulf cluster and it is scheduled
to arrive soon. The only problem is that we are very inexperienced with
such clusters and would really appreciate some help. There is a lot of
documentation, and I have yet to start reading. I will really appreciate
it if someone can point out answers to the following questions (I will be
googling for them also):

1. Opinions on the OS to be installed on the cluster: we have decided on
Red Hat 7.3, but suggestions are welcome. (We are not going for RH 8/9
because of some known compiler (Intel and Portland) problems. If anyone
has knowledge about this, please let me know.)

2. Any documentation on the software that needs to be installed (MPI, PVM,
admin tools, etc.) that will help us in the long run.

3. Any documentation on to-dos, or things that we need to check/do before
working on the cluster.

Any help is greatly appreciated.
Shashank

http://mia.ece.uic.edu/~papers

From andrewxwang at yahoo.com.tw Sun May 25 04:17:14 2003
From: andrewxwang at yahoo.com.tw (Andrew Wang)
Date: Sun, 25 May 2003 16:17:14 +0800 (CST)
Subject: Buying a Beowulf Cluster (Help)
In-Reply-To: <008401c3222a$bf5c3d90$44acf880@SHASHANK>
Message-ID: <20030525081714.16945.qmail@web16801.mail.tpe.yahoo.com>

I just installed Gridengine on our newly installed cluster and got it
running smoothly. If you have more than several users sharing the cluster,
a batch system is essential.

Some people choose to use PBS; I suggest you use PBSPro instead of
OpenPBS. OpenPBS is free, but PBSPro is much nicer, with better features.
You may want to apply the several patches floating around the OpenPBS
public home page. (Even with those, Gridengine or PBSPro is still much
nicer.)

Gridengine: http://gridengine.sunsource.net/
OpenPBS: http://www.openpbs.org/
         http://www-unix.mcs.anl.gov/openpbs/

If you run parallel code, you can take a look at LAM-MPI:
http://www.lam-mpi.org

Andrew.

--- Shashank Khanvilkar wrote:
> Hi,
> We have already put in an order for a Beowulf cluster and it is
> scheduled to arrive soon. The only problem is that we are very
> inexperienced with such clusters and would really appreciate some help.
> There is a lot of documentation, and I have yet to start reading. I
> will really appreciate it if someone can point out answers to the
> following questions (I will be googling for them also):
>
> 1. Opinions on the OS to be installed on the cluster: we have decided
> on Red Hat 7.3, but suggestions are welcome. (We are not going for RH
> 8/9 because of some known compiler (Intel and Portland) problems. If
> anyone has knowledge about this, please let me know.)
>
> 2.
> Any documentation on the software that needs to be installed (MPI,
> PVM, admin tools, etc.) that will help us in the long run.
>
> 3. Any documentation on to-dos, or things that we need to check/do
> before working on the cluster.
>
> Any help is greatly appreciated.
>
> Shashank
>
> http://mia.ece.uic.edu/~papers

From jhearns at freesolutions.net Sun May 25 04:53:57 2003
From: jhearns at freesolutions.net (John Hearns)
Date: Sun, 25 May 2003 09:53:57 +0100
Subject: Buying a Beowulf Cluster (Help)
References: <008401c3222a$bf5c3d90$44acf880@SHASHANK>
Message-ID: <001501c3229b$2e3d9950$a1249fd4@mypc>

----- Original Message -----
From: "Shashank Khanvilkar"
To: 
Sent: Saturday, May 24, 2003 8:29 PM
Subject: Buying a Beowulf Cluster (Help)

> Hi,
> We have already put in an order for a Beowulf cluster and it is
> scheduled to arrive soon. The only problem is that we are very
> inexperienced with such clusters and would really appreciate some help.

You could do a lot worse than getting some books on clustering. Two which
I recommend are:

"Linux Clustering - Building and Maintaining Linux Clusters"
by Charles Bookman, ISBN 1578702747

"Beowulf Cluster Computing with Linux"
by Thomas Sterling et al.,
ISBN 0262692740

From atp at piskorski.com Sun May 25 21:22:28 2003
From: atp at piskorski.com (Andrew Piskorski)
Date: Sun, 25 May 2003 21:22:28 -0400
Subject: Debian FAI diskless?
In-Reply-To: <200305160944.h4G9iYW03794@NewBlue.Scyld.com>
References: <200305160944.h4G9iYW03794@NewBlue.Scyld.com>
Message-ID: <20030526012225.GA31443@piskorski.com>

On Fri, May 16, 2003 at 05:44:34AM -0400, beowulf-request at scyld.com wrote:

> From: Thomas Lange
> Date: Fri, 16 May 2003 10:42:18 +0200
> Subject: Re: (Opens can of worms..) What is the best linux distro for a cluster?
>
> Debian has a very nice tool for installing a beowulf cluster
> unattended. It's called Fully Automatic Installation (FAI) and can be
> found at http://www.informatik.uni-koeln.de/fai/. The manual also has
> a chapter on how to set up a Beowulf cluster with little work. I did
> several installations of Beowulf clusters with this tool (I'm the
> author), but I also know of other people using it for clusters. Have a

Can FAI be used to set up a cluster where only the head node has a local
disk, and all the rest are diskless? I looked in the docs but didn't see
any reference to diskless.

-- 
Andrew Piskorski
http://www.piskorski.com

From lange at informatik.Uni-Koeln.DE Mon May 26 06:19:49 2003
From: lange at informatik.Uni-Koeln.DE (Thomas Lange)
Date: Mon, 26 May 2003 12:19:49 +0200
Subject: Debian FAI diskless?
In-Reply-To: <20030526012225.GA31443@piskorski.com>
References: <200305160944.h4G9iYW03794@NewBlue.Scyld.com> <20030526012225.GA31443@piskorski.com>
Message-ID: <16081.59973.341483.254274@informatik.uni-koeln.de>

>>>>> On Sun, 25 May 2003 21:22:28 -0400, Andrew Piskorski said:

> Can FAI be used to set up a cluster where only the head node has a
> local disk, and all the rest are diskless? I looked in the docs but
> didn't see any reference to diskless.

Surely! There's an example configuration in FAI that shows how to install
a diskless machine. The only part that differs from the default is
"partition my local hard disks"; the rest remains the same. Have a look at
the file templates/hooks/partition.DISKLESS

-- 
regards Thomas

From 8nrf at qlink.queensu.ca Mon May 26 10:48:03 2003
From: 8nrf at qlink.queensu.ca (Nathan Fredrickson)
Date: Mon, 26 May 2003 10:48:03 -0400
Subject: mpich on cluster of SMPs
Message-ID: 

Hi,

I have a cluster of SMP machines running linux-2.4.20 and mpich-1.2.5. I
configured mpich with device=ch_p4 and comm=shared to allow shared-memory
communication between processes on the same node. This seemed to be
working fine, but I was not confident that shared memory was actually
being used on-node, so I built and installed a second instance of mpich
with device=shmem.

Using a simple two-process test program that measures how long it takes to
send an integer back and forth 10000 times, I compared the three setups:

  device=ch_p4, comm=shared, off-node: 1.88 seconds
  device=ch_p4, comm=shared, on-node:  1.25 seconds
  device=ch_shmem:                     0.288314 seconds

The ch_p4 device does not seem to be using shared memory when both
processes are on the same node. Am I misinterpreting what comm=shared is
supposed to do?
Is there additional configuration required to make the ch_p4 device use shared memory on-node? I expected on-node performance similar to device=shmem. Any insights would be appreciated. Thanks, Nathan
From cjereme at ucla.edu Mon May 26 12:21:27 2003 From: cjereme at ucla.edu (Tintin J Marapao) Date: Mon, 26 May 2003 09:21:27 -0700 Subject: myrinet fault detection References: Message-ID: <000b01c323a2$dd0014b0$a400a8c0@Laptopchristine> Hello all, I am looking into Myrinet for our cluster, and I was wondering if anyone knows anything about (or resources to look these up):
- whether Myrinet has built-in fault/error detection
- whether there is a built-in fault recovery system
- the power management system
Thanks much, Christine
From cdcunh01 at cs.fiu.edu Tue May 27 13:49:05 2003 From: cdcunh01 at cs.fiu.edu (cassian d'cunha) Date: Tue, 27 May 2003 13:49:05 -0400 (EDT) Subject: enabling rlogin Message-ID: <4815.131.94.130.190.1054057745.squirrel@www.cs.fiu.edu> Hi, I am quite a newbie as far as clusters are concerned. I can't get rlogin to work on a cluster that runs Scyld Beowulf based on Red Hat 6.2. I can rlogin and telnet to other machines, but not to the host from any other machine. It also doesn't have the /etc/inetd.conf file where I would normally enable (server) rlogin, telnet, etc. Any help on enabling rlogin would be greatly appreciated. Thanks, Cassian.
From patrick at myri.com Tue May 27 15:15:39 2003 From: patrick at myri.com (Patrick Geoffray) Date: 27 May 2003 15:15:39 -0400 Subject: myrinet fault detection In-Reply-To: <000b01c323a2$dd0014b0$a400a8c0@Laptopchristine> References: <000b01c323a2$dd0014b0$a400a8c0@Laptopchristine> Message-ID: <1054062940.542.20.camel@asterix> Hi Christine, On Mon, 2003-05-26 at 12:21, Tintin J Marapao wrote: > -Myrinet has a built in fault /error detection The hardware provides: 1) CRC8 and CRC32 to detect bit corruption on the link. 2) SRAM parity to detect memory corruption on the NIC. 3) the PCIDMA chipset checks for PCI parity on DMA Reads (when the NIC is the PCI target). > -If there is a built in fault recovery system 2) and 3) are fatal errors; they should not happen in your lifetime unless the hardware is faulty. 1) and other cases are recoverable if the firmware you are using is reliable. GM is reliable, does segmentation/reassembly in the NIC, and ACKs each fragment. It retransmits the data if a packet is lost or corrupted. You can find more information on BER here: http://www.myri.com/cgi-bin/fom?file=245 > -Power management system What do you mean? Patrick -- Patrick Geoffray, PhD Myricom, Inc. http://www.myri.com
From bushnell at chem.ucsb.edu Tue May 27 16:26:59 2003 From: bushnell at chem.ucsb.edu (John Bushnell) Date: Tue, 27 May 2003 13:26:59 -0700 (PDT) Subject: Buying a Beowulf Cluster (Help) In-Reply-To: <008401c3222a$bf5c3d90$44acf880@SHASHANK> Message-ID: I would strongly suggest knocking on a few doors where you're at and finding some local cluster folks.
There have got to be some people maintaining clusters at your University. If nothing else, you will have company during your misery. And they can probably clear up a lot of your questions over a couple mugs of coffee. As soon as the word "Beowulf" came out of my mouth while talking to someone who had built one, I immediately discovered some fundamental misconceptions I had and modified my plans. - John On Sat, 24 May 2003, Shashank Khanvilkar wrote: > Hi, > We have already put an order for a Beowulf cluster and it is scheduled to > come soon. The only problem is that we are very inexperienced with such > clusters and would really appreciate some help. There is a lot of > documentation, and I have yet to start reading. > Will really appreciate it if someone can point out answers to the following > questions (I will be googling for them also). > > > 1. Opinions on the OS to be installed on the cluster: We have decided on Red > Hat 7.3, however suggestions are welcome. (We are not going for RH 8/9 > because of some known compiler (Intel and Portland) problems. If anyone has > knowledge about this, please let me know.) > > 2. Any documentation on the software that needs to be installed (MPI, PVM, > admin stuff, etc.) that will help us in the long run. > > 3. Any documentation on TO-DOs, or things that we need to check/do before > working on the cluster. > > Any help is greatly appreciated.
> > Shashank > > > > > http://mia.ece.uic.edu/~papers
From alvin at Maggie.Linux-Consulting.com Tue May 27 17:04:51 2003 From: alvin at Maggie.Linux-Consulting.com (Alvin Oga) Date: Tue, 27 May 2003 14:04:51 -0700 (PDT) Subject: Buying a Beowulf Cluster (Help) In-Reply-To: Message-ID: hi ya - 2 replies in 1 On Tue, 27 May 2003, John Bushnell wrote: > I would strongly suggest knocking on a few doors where you're > at and finding some local cluster folks. There have got to be > some people maintaining clusters at your University. If nothing > else, you will have company during your misery. then one can look outside the area - you will get 10x better hw support if they were local - biggest problem .. to keep machines up ... ( something dies, and you need it fixed "now" ) - remote admin and howto support can be done remotely... by people that can keep it up .. - i think "hands-on support" vs "uptime/reliability support" should be split up ... > On Sat, 24 May 2003, Shashank Khanvilkar wrote: > > > 1. Opinions on the OS to be installed on the cluster: We have decided on Red > > Hat 7.3, however suggestions are welcome. (We are not going for RH 8/9 > > because of some known compiler (Intel and Portland) problems. If anyone has > > knowledge about this, please let me know.) gcc problems will be across the board ..
- old gcc on new hw or new gcc on new hw - old gcc on old hw or new gcc on old hw - you will have problems ( glibc + gcc-x.y problems ) - there's probably more open source support for new gcc w/ new hw i think to build new boxes based on old distro is a bad idea, since it'd run into old known bugs that have since been fixed in the newer distro - yes, you might get the new bugs in the new systems/distro .. but you will also get old bugs in old distro and a lot smaller group of open source folks addressing those older issues - there's usually work arounds for most bugs/problems .. - typical/usual work around for older bugs is to upgrade - new bugs/problems -- simply means you need more time to do more detailed testing before deciding - i have yet to see newer distro NOT be able to run an older app that "claimed to require an old foo-x.y version by the 3rd party vendor" .. it works even on newer version unless they did something unique to their code to lock it to that linux distro - given known bugs and features requirements, i'd build on the latest/greatest stuff > > 2. Any documentation on the software that needs to be installed (MPI, PVM, > > admin stuff etc) that will help us in the long run. random collection of stuff http://www.Linux-Consulting.com/Cluster > > 3. Any documentation on TO-DO's..or things that we need to check/do before > working on the cluster. pricing and support ... - simulate the "my node just died, how do you (vendor) plan to fix it ??" and our deadline was yesterday have fun alvin
From kr4m17 at aol.com Tue May 27 20:15:42 2003 From: kr4m17 at aol.com (kr4m17 at aol.com) Date: Tue, 27 May 2003 20:15:42 -0400 Subject: Process Migration in Scyld Beowulf Message-ID: <7E0EC9F9.03151EEF.000655AE@aol.com> Hi, I am new to the clustering scene.
I have just installed Scyld Beowulf, and have successfully added nodes. I cannot get any of the nodes to process ANY information. Their CPU status stays on 0% along with their memory... I would very much like to run a program called Ubench (www.phystech.com/download) just to test the performance increase of the cluster. If anyone can tell me how to run any program over all nodes of the cluster, or inform me of how to run Ubench specifically, I would greatly appreciate it. I have tried mpirun and it basically does not work at all; I am not sure why. If there is any information that anyone has or knows of any documentation that can help me out, please respond to the mailing list or email me at kr4m17 at aol.com. kr4m17 (J. Simon)
From mphil39 at hotmail.com Wed May 28 10:47:58 2003 From: mphil39 at hotmail.com (Matt Phillips) Date: Wed, 28 May 2003 10:47:58 -0400 Subject: Network RAM revisited Message-ID: Hello, I am a student about to start work on using Network RAMs as swap space in a cluster environment, as part of a semester project. I need to convince myself first that this is indeed relevant in today's situation. I have read the earlier discussions in the archives, but they are two years old, and things might have changed since then that make Network RAM more plausible (gigE becoming ubiquitous, network latency reducing, seek time in disks not getting better, etc.). I would like to hear the opinion of the members in the group. I guess the main argument against it is why not simply put in more memory sticks and avoid swap altogether. I was told there are applications out there that would still always need swap. To make the case more convincing, I would also like to test performance with real-world application traces instead of probability distributions.
Does anyone know of applications (preferably in widespread use) for which swap is unavoidable? Another question that bothers me is that network latency deteriorates severely after packet size goes beyond 1-1.5 KB. Assuming page size is 4KB, wouldn't this affect the network RAM performance in a big way? Any way around this problem? Thanks in advance, Matt
From j.c.burton at gats-inc.com Wed May 28 11:46:36 2003 From: j.c.burton at gats-inc.com (John Burton) Date: Wed, 28 May 2003 11:46:36 -0400 Subject: Network RAM revisited References: Message-ID: <3ED4D9DC.60005@gats-inc.com> Matt Phillips wrote: > Hello, > > I guess the main argument against it is why not simply put in more > memory sticks and avoid swap altogether. I was told there are > applications out there that would still always need swap. To make the > case more convincing, I would also like to test performance with real > world application traces instead of probablity distributions. Does > anyone know of applications (preferably used widespread) for which swap > is unavoidable? In my line of work (atmospheric remote sensing), it's not so much a matter of "applications for which swap is unavoidable", but "build it and they will use it and more" - I don't care how "big" a machine / cluster I build, the scientists will find a way to use all its resources and ask for more. As I give them more powerful setups, their mathematical models increase in size and complexity. It's a constant battle to break even. Their models fill the existing memory and start hitting swap, which slows their processing down.
They need to run faster so they can process a day's worth of data in one day, I give them more memory. They fill up the additional memory and start hitting swap again, but they need it to run faster so... (think you get the picture). Swap allows them to continually improve / enhance their models. Additional memory just makes it go faster. John _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed May 28 12:13:39 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 28 May 2003 12:13:39 -0400 (EDT) Subject: Network RAM revisited In-Reply-To: Message-ID: > I am a student about to start work on using Network RAMs as swap space in a > cluster environment, as a part of semester project. I need to convince sigh. > plausible (gigE becoming ubiquitous, network latency reducing, seek time in > disks not getting better etc). I would like to hear the opinion of the seek time improves slowly, it's true, but it does improve. perhaps more importantly, striping makes it a non-issue, or at least a solvable one. network latency hasn't improved much, in spite of gbe: I see around 80 us /bin/ping latency for 100bT, and about half that for gbe. in both cases, bandwidth has improved dramatically, though. networks are still pretty pathetic compared to dram interfaces: in other words, dram hasn't stood still, either. > I guess the main argument against it is why not simply put in more memory > sticks and avoid swap altogether. I was told there are applications out why do you think swap is an issue? > there that would still always need swap. I don't believe that's true. there are workloads which always result in some dirty data which is mostly idle; those pages would be perfect for swapping out, since it would free up the physical page for hotter use. 
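For scale, the economics of paging to another node's RAM instead of to disk come down to a back-of-envelope comparison. Every figure in this sketch is an illustrative round number, loosely based on the latencies mentioned above (~40 us gbe ping, a nominal 8 ms disk seek), not a measurement:

```python
# Back-of-envelope cost of one 4 KB page-in: network RAM vs. local disk.
# All constants are illustrative round numbers, not measurements:
# ~40 us gbe round-trip latency (half the 80 us 100bT figure above),
# 1 Gb/s wire rate, 8 ms average disk seek, 40 MB/s sustained disk rate.

PAGE = 4096          # bytes
NET_LAT = 40e-6      # seconds, network round-trip latency
NET_RATE = 125e6     # bytes/sec (1 Gb/s)
SEEK = 8e-3          # seconds, average disk seek + rotational delay
DISK_RATE = 40e6     # bytes/sec, sustained disk transfer

net_page = NET_LAT + PAGE / NET_RATE    # latency + transfer time
disk_page = SEEK + PAGE / DISK_RATE

print(f"network page-in: {net_page * 1e6:8.1f} us")
print(f"disk page-in:    {disk_page * 1e6:8.1f} us")
```

With these numbers a remote page costs on the order of 70-80 us against roughly 8 ms for a seek-bound disk page — the gap the network-RAM idea tries to exploit. Striping across spindles, as noted above, narrows it for sequential access but not for random faults.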
it would seem very strange to me if an app created a *lot* of these pages, since that would more-or-less imply an app design flaw. > Another question that bothers me is network latency deteriorates severely > after packet size goes beyond 1-1.5 KB. I don't see that, unless by "severe" you mean latency=bandwidth/size ;) fragmenting a packet should definitely not cause a big decrease in throughput. also, support for jumbo MTU's is not that uncommon. > Assuming page size is 4KB, wouldn't > this affect the network RAM performance in a big way? Any way around this > problem? no. MMU hardware is inimically hostile to this sort of too-clever-ness. not only are pages large, but manipulating them (TLB's actually) is expensive, especially in a multiprocessor environment. my take on swap is this:
- a little bit of swapout is a very good thing, since it means that idle pages are being put to better use.
- more than a trivial amount of swap *in* is very bad, since it means someone is waiting on those pages. worse is when a page appears to be swapped out and back in quickly. that's really just a kernel bug.
- swap-outs are also nice in that they are often async, so no one is waiting for them to complete. they can also be lazy-written and even optimistically pre-written.
- swap-ins are painful, but at least you can scale bandwidth and latency by adding spindles.
- the ultimate solution is, of course, to add more ram! for ia32, this is pretty iffy above 2-6 GB, partly because it's 32b hardware, and partly because the hardware has dampened demand for big memory. but ram is at historically low prices: http://sharcnet.mcmaster.ca/~hahn/ram.png (OK, there was a brief period when crappy old PC133 SDR was cheaper than PC2100 is today, but...)
in general, if you're concerned about ram, I'd look seriously at Opteron machines, since there simply is no other platform that's quite as clean: 64b goodness, scales ram capacity with cpus, not crippled by a shared FSB.
it's true that you can put together some pretty big-ram ia64 machines, but they tend to wind up being rather expensive ;) in summary: I believe network shared memory is simply not a great computing model. if I was supervising a thesis project, I'd probably try to steer the student towards something vaguely like Linda... regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Wed May 28 14:24:14 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Wed, 28 May 2003 11:24:14 -0700 Subject: Network RAM revisited In-Reply-To: References: Message-ID: <20030528182414.GA1913@greglaptop.internal.keyresearch.com> On Wed, May 28, 2003 at 10:47:58AM -0400, Matt Phillips wrote: > I am a student about to start work on using Network RAMs as swap space in a > cluster environment, as a part of semester project. This is not as simple as it looks, because of deadlock situations. If you have to frantically start swapping because you're low on memory, how are you going to be sure that the process or kernel thread which is going to send memory to another node's ram (swap out) isn't itself going to have to allocate memory? There are 2 extensions beyond this project which are Really Cool. The first is a shared database cache, like Oracle's Cache Fusion. The second is a global cache for a shared filesystem. If you make the filesystem read-only, then this might make a better semester project than swap. For your application, you could use BLAST. You could even hack up BLAST's I/O to go through a library instead of through the kernel. That would significantly reduce the complexity of the project. 
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From fryman at cc.gatech.edu Wed May 28 17:30:08 2003 From: fryman at cc.gatech.edu (Josh Fryman) Date: Wed, 28 May 2003 17:30:08 -0400 Subject: Network RAM revisited In-Reply-To: References: Message-ID: <20030528173008.40e39cdc.fryman@cc.gatech.edu> there are several research papers that explore the issues in doing this. try a lit review. some that randomly pop to the top of my stack: Using network memory to improve performance in transaction-based systems Using remote memory to avoid disk thrashing Memory servers for multicomputers Implementation of a reliable remote memory pager The network RAMdisk (etc) -josh _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed May 28 19:29:52 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 28 May 2003 19:29:52 -0400 (EDT) Subject: Network RAM revisited In-Reply-To: Message-ID: On Wed, 28 May 2003, Mark Hahn wrote: > > Another question that bothers me is network latency deteriorates severely > > after packet size goes beyond 1-1.5 KB. > > I don't see that, unless by "severe" you mean latency=bandwidth/size ;) > fragmenting a packet should definitely not cause a big decrease in > throughput. also, support for jumbo MTU's is not that uncommon. Mark is dead right here. In fact, there are two regimes of bottleneck in networking. Small packets are latency dominated -- the interface cranks out packets as fast as it can, and typically bandwidth (such as it is) increases linearly with packet SIZE as R_l * P_s (max rate for latency bounded packets times packet size). 
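That latency-bounded regime, and the wirespeed ceiling it crosses into, fits in a few lines of arithmetic. The packet rate and wire speed below are invented round numbers chosen so the crossover lands near 1 KB, not measurements of any real NIC:

```python
# Two-regime throughput model: small packets are latency (packet-rate)
# bound, throughput = R_l * P_s; large packets are wire bound. The
# 12,000 pps and 100 Mbps figures are illustrative round numbers.

def throughput(packet_size, max_pps=12_000, wire_bps=100e6):
    """Effective bytes/sec for a given packet size in bytes."""
    latency_bound = max_pps * packet_size   # R_l * P_s
    wire_bound = wire_bps / 8               # wire speed in bytes/sec
    return min(latency_bound, wire_bound)

if __name__ == "__main__":
    for size in (64, 256, 1024, 1500, 4096):
        print(f"{size:5d} B packets -> {throughput(size) / 1e6:5.2f} MB/s")
```

Below the crossover, doubling the packet size doubles the effective bandwidth; above it, throughput is pinned at the wire rate no matter how the packets are sized.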
Double the packet size, double the "bandwidth", but you just can't get any more pps through the interface/switch/interface combo (often with TCP stack on the side dominating the whole thing). What you're seeing around P_s = 1 KB is a crossover from latency dominated to bandwidth dominated (bottlenecked) traffic. You are approaching wire speed. As soon as this occurs you CAN'T continue to spit out packets at the maximum rate, so speed ceases to increase linearly with packet size, as the wire simply won't hold any more bps during the time you are using it. This causes latency to "increase" (or rather, the packet rate to decrease) as packet delivery starts to be delayed by the sheer time required to load the packet onto the wire at the maximum rate, not the time required to put the packet together and initiate transmission. The result is a near-exponential saturation curve exhibiting linear growth saturating at wirespeed less all sorts of cumulative overhead and retransmissions and other inefficiencies, typically about 10 MBps data transmission (around 90% of the theoretical limit after allowing for mandatory headers) for 100BT TCP/IP, although this varies a LOT with the quality of your NIC and switch and wiring and protocol/stack. At one time fragmenting a packet stream of messages each just larger than the MTU caused one to fall off to a performance region that was once again latency dominated and cost one a "jump" down in bandwidth, but in recent years this jump has been small to absent as NIC latencies have dropped, so even fragmented/split packets are still in the bandwidth dominated region where bandwidth is nearly saturated and slowly varying. > in summary: I believe network shared memory is simply not a great computing > model. if I was supervising a thesis project, I'd probably try to steer > the student towards something vaguely like Linda... I'm not sure I agree with this.
There have certainly been major CS projects (like Duke's Trapeze) that have been devoted to creating a reasonably transparent network-based large memory model because there ARE problems (or at least have been problems) where there is a need for very large virtual memory spaces but disk-based swap is simply too slow. A FAST network (not ethernet, and not TCP/IP) with 5 usec or so latency and 100 MBps scale bandwidth, large B, can still beat up disk swap for certain (fairly random or at least nonlocal) memory access patterns. You assert that these patterns can be avoided by careful design and could be right. However, there is some virtue in having a general purpose magic-wand level tool where accessing more memory than you have kicks in a transparent mechanism for distributing the memory and runtime-optimizing its access -- basically creating an additional level of memory speed in the memory speed hierarchy -- so users don't HAVE to code for a particular size or architecture. I do think that simply providing "networked swap" through the existing VM is unlikely to be a great solution (although it might "work" with tools already in the kernel for at least testing purposes). The VM is almost certainly tuned for a single, highly expensive level of memory outside of physical DRAM, and there are too many orders of magnitude between DRAM and disk latencies and bandwidths for the tuning to "work" correctly for an intermediary network layer with very different advantages and nonlinearities. Then there is the page issue which if nothing else might require tuning or might favor certain hardware (capable of jumbo packets that can hold a full page) or both (different tunings for different hardware). What I would recommend is that the student talk to somebody like Jeff Chase at Duke and look over the literatures on existing and past projects that have addressed the issue. 
They'll need to quote Jeff's and the other people's work anyway in any sort of a sane dissertation, and they are by far the best people to tell them if the idea is still extant and worthy of further work (perhaps built on top of their base, perhaps not) or not. It is also wise to (as they are apparently doing) find a few people with applications that would "use a huge VM tomorrow" if it existed as a mainstream option in (say) an OTC distribution or even a specialized scyld or homebrew kernel. Weather, cosmology, there are a few "enormous" problems where researchers ALWAYS want to work bigger than current physical limits and budgets permit and can still get useful results even with the penalties imposed by disk or other VM extensions. For those workers, a "cluster" might be a single compute node and a farm of VM extension nodes providing some sort of CC-NUMA across the aggregate memory space for just the one core processor. If you have the problem in hand, it makes developing and testing a meaningful solution a whole lot easier... Or, as you say Mark, 64 bit systems may rapidly make the issue at least technically moot. COST nonlinearities, though, might still make a distributed/cluster DRAM VM attractive, as it might well be cheaper to buy a farm of 16 2 GB systems even with myrinet or sci interconnects to get to ballpark of 32 GB VM than it could be to buy a motherboard and all the memory capable of running 32 GB native on a single board. They won't sell a lot of the latter (at least at first), and they'll charge through the nose for developing them... and then there are the folks that would say fine, how about a network of 32 32 GB nodes to let me get to a TB of transparent VM? Indeed, one COULD argue that this is an idea that ONLY makes sense (again) now that there are 64 bit systems (and address spaces, kernels, compiler support) proliferating. 
Kernels that can use (mostly) all the memory one can put NOW on a 32 bit board have only recently come along, and grafting a de facto 64 bit address space onto a 32 bit architecture to permit accessing more than 4 GB of VM with no existing compiler or kernel support would be immensely painful and/or special library ("message passing" to slaves that do nothing but lookup or store data) evil (which may be why trapeze more or less stopped). 64 bit architectures could revive it again, especially while the aforementioned nonlinear price break between 64 bit (but still 2-4 GB limited on the cheap end) motherboards and 64 bit (but capable of holding 32 GB or more, expensive) motherboards holds. An Opteron is hardly more expensive than a plain old high end Athlon at similar levels of memory filling, and even a kilodollar scale premium for low-latency interconnects could still keep the price well (an order of magnitude?) below a large memory box for quite a while. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
From rburns at cs.ucsb.edu Wed May 28 20:18:10 2003 From: rburns at cs.ucsb.edu (Ryan Burns) Date: Wed, 28 May 2003 17:18:10 -0700 Subject: DSM solution? Message-ID: <20030528171810.40420b18.rburns@cs.ucsb.edu> Hello, I'm looking for a transparent DSM solution. I'm trying to create a suite of software that allows applications designed for multiprocessor computers to be run on a cluster. My problem is a little bit unusual, so let me state what I need to do: I need to be able to specify which node each thread/process runs on. Each thread/process needs to be able to access specific hardware on each node.
For example, each thread might need to display to its own video card. Each thread/process needs to read shared memory from 1 specific thread/process which is part of the user application. So imagine I take a user application, that I know nothing about, fork it a few times onto other nodes, then each node needs to read memory from that original process. So another example would be: I have a program that gets input from a user in some form or another, and then it displays a photo or something for each input it receives, and I want each copy of this program on each node to read that same input. I can't access that input because I don't know what it is, but the copy applications do. So my situation isn't quite as drastic as it seems; I do know a lot about the user app, I just don't know anything about its data types. So far all the solutions I've looked into don't seem to do what I want. I'm thinking when openmosix is able to merge shared memory apps, I will be able to use some of their code to solve my problem. Basically I'm looking for an easy way out. I don't want to write any custom kernel level code to solve the problem if I don't have to. If anyone knows of any transparent DSM solutions, please let me know. Thanks, Ryan Burns
From lindahl at keyresearch.com Thu May 29 03:15:12 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 29 May 2003 00:15:12 -0700 Subject: DSM solution? In-Reply-To: <20030528171810.40420b18.rburns@cs.ucsb.edu> References: <20030528171810.40420b18.rburns@cs.ucsb.edu> Message-ID: <20030529071512.GB1496@greglaptop.greghome.keyresearch.com> On Wed, May 28, 2003 at 05:18:10PM -0700, Ryan Burns wrote: > I'm looking for a transparent DSM solution.
If you look around, you'll find 30+ projects that have produced DSM with varying degrees of transparency. You'll find that performance is quite low for a very transparent solution -- so it works OK only for programs which are nearly embarrassingly parallel. > Each thread/process needs to read shared memory from 1 specific > thread/process which is part of the user application. So imagine I take a > user application, that I know nothing about, fork it a few times onto > other nodes, then each node needs to read memory from that original > process. If your user application wrote and read files (or named pipes) instead of using shared memory, you could produce something probably much higher in performance, with much less work. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu May 29 03:12:26 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 29 May 2003 00:12:26 -0700 Subject: Network RAM revisited In-Reply-To: References: Message-ID: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> On Wed, May 28, 2003 at 07:29:52PM -0400, Robert G. Brown wrote: > Weather, cosmology, there are a few > "enormous" problems where researchers ALWAYS want to work bigger than > current physical limits and budgets permit and can still get useful > results even with the penalties imposed by disk or other VM extensions. Yes, but those are bad examples. The more memory you use in a 72-hour weather forecast (more memory means more input data, higher resolution output), the more cpu time it takes to make the computation -- the cpu time increases as memory**(4/3). In reality, weather forecasting is actually limited by our ability to insert fudge factors for local physics, i.e. things that aren't resolved on the grid. You need ~ 7 cells in 1 km to resolve a thunder head. 
Current production runs are at 10-20km cells. Those pesky weather satellites produce plenty of high-resolution input. So memory size is not a problem, and the data is big enough that you can use 100-200 cpus with Myrinet. Not too shabby. Now if you have a serial code, sure, you might be able to use network RAM. But parallelizing a weather code is old hat. By the way, Opteron mobos make your arguments about big memory motherboards a bit obsolete... there are 2 cpu motherboards with 2x4 = 8 total DIMM slots. I'm not sure what 4 cpu motherboards will do, but it might be 4x4 = 16. These 2 cpu motherboards are not that expensive... the Khapri is gold-plated, but its competition is not. Bigger DIMMs are expensive, but they are always falling. I always thought it was kind of a shame that motherboards used to have such low limits on memory. While I was at UVa I wanted to build a small cluster with 1 Tbyte of memory for big-memory apps. The figure of merit was that the cluster was going to have a total cost less than 2x the price of just the memory. If Digital had made a low end Alpha motherboard that could address big memory, I could have done it, but their low end chipset didn't carry out enough address pins. Grrr. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rburns at cs.ucsb.edu Thu May 29 03:37:03 2003 From: rburns at cs.ucsb.edu (Ryan Burns) Date: Thu, 29 May 2003 00:37:03 -0700 Subject: DSM solution? In-Reply-To: <20030529071512.GB1496@greglaptop.greghome.keyresearch.com> References: <20030528171810.40420b18.rburns@cs.ucsb.edu> <20030529071512.GB1496@greglaptop.greghome.keyresearch.com> Message-ID: <20030529003703.79c516ab.rburns@cs.ucsb.edu> Thanks for the info, Performance was one of my concerns, I should have figured something as transparent as I was looking for wouldn't be very fast. 
Looks like my only path is going to lead me to hacking more libraries. If I can get the useful information another way, I can then send it without using DSM. Thanks again, Ryan On Thu, 29 May 2003 00:15:12 -0700 "Greg Lindahl" wrote: > On Wed, May 28, 2003 at 05:18:10PM -0700, Ryan Burns wrote: > > > I'm looking for a transparent DSM solution. > > If you look around, you'll find 30+ projects that have produced DSM > with varying degrees of transparency. You'll find that performance is > quite low for a very transparent solution -- so it works OK only for > programs which are nearly embarrassingly parallel. > > > Each thread/process needs to read shared memory from 1 specific > > thread/process which is part of the user application. So imagine I > > take a user application, that I know nothing about, fork it a few > > times onto other nodes, then each node needs to read memory from that > > original process. > > If your user application wrote and read files (or named pipes) instead > of using shared memory, you could produce something probably much > higher in performance, with much less work. 
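Greg's files-or-named-pipes alternative is easy to prototype. Below is a minimal sketch, assuming a POSIX system; the producer/consumer roles and the "user input" payload are made up for illustration, not Ryan's actual application:

```python
import os
import tempfile

# A named pipe (FIFO) lets the original process publish data through the
# filesystem instead of shared memory; copies just open and read it.
fifo = os.path.join(tempfile.mkdtemp(), "shared.fifo")
os.mkfifo(fifo)

pid = os.fork()
if pid == 0:
    # Child: plays the role of the original "user application",
    # writing out whatever input it received.
    with open(fifo, "w") as w:
        w.write("user input: 42\n")
    os._exit(0)

# Parent: plays the role of a forked copy that needs the original
# process's data; open() blocks until the writer side is opened.
with open(fifo) as r:
    data = r.read()
os.waitpid(pid, 0)
print(data.strip())  # -> user input: 42
```

Across nodes the same pattern works with ordinary files on a shared filesystem such as NFS; the point is that explicit reads and writes avoid the page-fault machinery that makes very transparent DSM slow.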
> > -- greg > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu May 29 03:58:26 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 29 May 2003 00:58:26 -0700 Subject: Network RAM revisited In-Reply-To: References: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> Message-ID: <20030529075826.GA1740@greglaptop.greghome.keyresearch.com> On Thu, May 29, 2003 at 12:43:52AM -0700, Joel Jaeggli wrote: > 2GB dimms reg ecc dimms seem to still be a factor of 8 or so more > expensive than 1gb dimms. but spending ~$13000 or so per node for memory > makes spending $600 for a mainboard and $400ea for cpu's seem like a > comparitvly minor part of the cost. even with niceish cases to protect > your expensive ram, and a blown out myrinet setup you could probably slip > in under 1.25-1.33x the cost of the ram. I wasn't talking about 1.33x the cost of expensive ram, the object was 1.2x the cost of reasonable ram. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Thu May 29 03:43:52 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Thu, 29 May 2003 00:43:52 -0700 (PDT) Subject: Network RAM revisited In-Reply-To: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> Message-ID: 2GB dimms reg ecc dimms seem to still be a factor of 8 or so more expensive than 1gb dimms. 
but spending ~$13000 or so per node for memory makes spending $600 for a mainboard and $400ea for cpu's seem like a comparitvly minor part of the cost. even with niceish cases to protect your expensive ram, and a blown out myrinet setup you could probably slip in under 1.25-1.33x the cost of the ram. joelja On Thu, 29 May 2003, Greg Lindahl wrote: > On Wed, May 28, 2003 at 07:29:52PM -0400, Robert G. Brown wrote: > > > Weather, cosmology, there are a few > > "enormous" problems where researchers ALWAYS want to work bigger than > > current physical limits and budgets permit and can still get useful > > results even with the penalties imposed by disk or other VM extensions. > > Yes, but those are bad examples. The more memory you use in a 72-hour > weather forecast (more memory means more input data, higher resolution > output), the more cpu time it takes to make the computation -- the cpu > time increases as memory**(4/3). In reality, weather forecasting is > actually limited by our ability to insert fudge factors for local > physics, i.e. things that aren't resolved on the grid. You need ~ 7 > cells in 1 km to resolve a thunder head. Current production runs are > at 10-20km cells. Those pesky weather satellites produce plenty of > high-resolution input. So memory size is not a problem, and the data > is big enough that you can use 100-200 cpus with Myrinet. Not too > shabby. > > Now if you have a serial code, sure, you might be able to use network > RAM. But parallelizing a weather code is old hat. > > By the way, Opteron mobos make your arguments about big memory > motherboards a bit obsolete... there are 2 cpu motherboards with 2x4 = > 8 total DIMM slots. I'm not sure what 4 cpu motherboards will do, but > it might be 4x4 = 16. These 2 cpu motherboards are not that > expensive... the Khapri is gold-plated, but its competition is not. > Bigger DIMMs are expensive, but they are always falling. 
> > I always thought it was kind of a shame that motherboards used to have > such low limits on memory. While I was at UVa I wanted to build a > small cluster with 1 Tbyte of memory for big-memory apps. The figure > of merit was that the cluster was going to have a total cost less than > 2x the price of just the memory. If Digital had made a low end Alpha > motherboard that could address big memory, I could have done it, but > their low end chipset didn't carry out enough address pins. Grrr. > > -- greg > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kim.branson at csiro.au Thu May 29 08:55:26 2003 From: kim.branson at csiro.au (Kim Branson) Date: Thu, 29 May 2003 22:55:26 +1000 Subject: Network RAM revisited In-Reply-To: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> Message-ID: > . Current production runs are > at 10-20km cells. Those pesky weather satellites produce plenty of > high-resolution input. So memory size is not a problem, and the data > is big enough that you can use 100-200 cpus with Myrinet. Not too > shabby. > Interesting, is weather forecasting limited by computational resources and not data availability? 
So if an organization had better resources with the same set of input data, could they perform simulations that would be better at forecasting than a less well-endowed resource? Or is it quantity and quality of data? Are there machines capable of simulating 1km cells, and is there data for these? Presumably this data varies from region to region, with some areas having more sensor data than others. cheers Kim > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From d.l.farley at larc.nasa.gov Thu May 29 09:35:54 2003 From: d.l.farley at larc.nasa.gov (Doug Farley) Date: Thu, 29 May 2003 09:35:54 -0400 Subject: Gigabit performance issues and NFS Message-ID: <5.0.2.1.2.20030529092240.02aece48@pop.larc.nasa.gov> Fellow Wulfers, I know this isn't 100% wulf related, although it is part of my wulf's setup, but this is the best forum, where everyone has a lot of good experience. Here's the deal: I have a nice 2TB Linux file server with an Intel e1000-based NIC in it, and I have an SGI O3 (master node) that is dumping to it with a Tigon-series gigabit card. I've tuned both, and my ttcp and netpipe performance averages ~ 80-95MB/s, which is more than reasonable for me. Both the fibre channel on my SGI and the RAID (3ware) on my Linux box can write at 40MB/s sustained; read is a little faster for both, maybe ~ 50MB/s sustained. I can get ftp/http transfers between the two to go at 39-40MB/s, which again I'm reasonably happy with. BUT, the part that is killing me is NFS and scp. Both crawl in at around 8-11MB/s with no other devices on the network. Any exports from the SGI I've exported with the 32bitclients flag, I've pumped my rsize/wsize windows up to 32K, and I've forced NFS v3 on both Linux and IRIX. 
After spending a week scouring the web I've found nothing that has worked, and SGI support thinks it's a Linux NFS problem, which could be, but I'd like to get the opinion of this crowd in hopes of some light! Thanks! Doug Farley ============================== Doug Farley Data Analysis and Imaging Branch Systems Engineering Competency NASA Langley Research Center < D.L.FARLEY at LaRC.NASA.GOV > < Phone +1 757 864-8141 > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From erwan at mandrakesoft.com Mon May 26 04:29:32 2003 From: erwan at mandrakesoft.com (Erwan Velu) Date: 26 May 2003 10:29:32 +0200 Subject: Opteron O/S Support In-Reply-To: <3.0.3.32.20030516134703.012e5988@popd.ix.netcom.com> References: <3.0.3.32.20030516134703.012e5988@popd.ix.netcom.com> Message-ID: <1053937771.1430.24.camel@revolution.mandrakesoft.com> On Fri, 16/05/2003 at 22:47, Michael Huntingdon wrote: > Just wondering whether anyone could tell me if there has been any public Redhat support statement for 64-bit Opteron? A target date perhaps? If you need a ready-to-run Opteron Linux distro, MandrakeSoft has had its 9.0 available since March 14th. 
http://www.mandrakesoft.com/company/community/mandrakesoftnews/news?n=/mandrakesoft/news/2414 -- Erwan Velu Linux Cluster Distribution Project Manager MandrakeSoft 43 rue d'aboukir 75002 Paris Phone Number : +33 (0) 1 40 41 17 94 Fax Number : +33 (0) 1 40 41 92 00 Web site : http://www.mandrakesoft.com OpenPGP key : http://www.mandrakesecure.net/cks/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From victor_ms at bol.com.br Tue May 27 17:43:13 2003 From: victor_ms at bol.com.br (Victor Lima) Date: Tue, 27 May 2003 17:43:13 -0400 Subject: Beowulf digest, Vol 1 #1311 - 1 msg In-Reply-To: <200305271901.h4RJ1gd00794@NewBlue.Scyld.com> References: <200305271901.h4RJ1gd00794@NewBlue.Scyld.com> Message-ID: <3ED3DBF1.2060704@bol.com.br> Why don't you try ssh on 22 port??? []'s Victor beowulf-request at scyld.com wrote: >Send Beowulf mailing list submissions to > beowulf at beowulf.org > >To subscribe or unsubscribe via the World Wide Web, visit > http://www.beowulf.org/mailman/listinfo/beowulf >or, via email, send a message with subject or body 'help' to > beowulf-request at beowulf.org > >You can reach the person managing the list at > beowulf-admin at beowulf.org > >When replying, please edit your Subject line so it is more specific >than "Re: Contents of Beowulf digest..." > > >Today's Topics: > > 1. enabling rlogin (cassian d'cunha) > >--__--__-- > >Message: 1 >From: "cassian d'cunha" >Date: Tue, 27 May 2003 13:49:05 -0400 (EDT) >Subject: enabling rlogin >To: > >Hi, > >I am quite a newbie as far as clusters are concerned. I can't get rlogin >to work on a cluster that has scyld Beowulf based on redhat 6.2. I can >rlogin and telnet to other machines, but not to the host from any other >machine. It also doesn't have the /etc/inetd.conf file where I would >normally enable (server) rlogin, telnet, etc. 
> >Any help on enabling rlogin would be greatly appreciated. > >Thanks, >Cassian. > > > > > >--__--__-- > >_______________________________________________ >Beowulf mailing list >Beowulf at beowulf.org >http://www.beowulf.org/mailman/listinfo/beowulf > > >End of Beowulf Digest > > > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sean.hubbell at sed.redstone.army.mil Tue May 27 08:17:37 2003 From: sean.hubbell at sed.redstone.army.mil (Hubbell, Sean (GDIS)) Date: Tue, 27 May 2003 07:17:37 -0500 Subject: Question for a beginner Message-ID: <8A293F96C86DD51193B00008C716A4FE03C1BA0E@sedexch2.sed.redstone.army.mil> Hello, I am fairly new to building a Beowulf cluster. I was wondering if someone could point me to a reliable place to research the requirements for a single computer to use in a cluster (i.e. 128 MB DRAM, 60 GB ...). Thanks for your time, Sean Sean C. Hubbell Senior Software Engineer General Dynamics C4 Systems Software Engineering Directorate _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sureamsam1 at yahoo.com Sun May 25 17:08:09 2003 From: sureamsam1 at yahoo.com (SAM) Date: Sun, 25 May 2003 14:08:09 -0700 (PDT) Subject: Starting out Message-ID: <20030525210809.68489.qmail@web14710.mail.yahoo.com> I want to make a small cluster with SuSE Linux 8.2... I don't know much about it, and would like to know if any of you could help, or if there are any books or websites that would help? Thanks a lot. BTW: Is there any way to do it with Windows without spending thousands? Thank you. Sam Seifert --------------------------------- Do you Yahoo!? The New Yahoo! Search - Faster. Easier. Bingo. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From joelja at darkwing.uoregon.edu Thu May 29 12:00:59 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Thu, 29 May 2003 09:00:59 -0700 (PDT) Subject: Network RAM revisited In-Reply-To: Message-ID: On Thu, 29 May 2003, Kim Branson wrote: > > > . Current production runs are > > at 10-20km cells. Those pesky weather satellites produce plenty of > > high-resolution input. So memory size is not a problem, and the data > > is big enough that you can use 100-200 cpus with Myrinet. Not too > > shabby. > > > > Interesting, is weather forecasting limited by computational resources > and not data availability? So if an organization > had better resources with the same set of input data they could perform > simulations that would be better at forcasting than a less well > endowed resource?, or is it quantity and quality of data? Are there > machines capable of simulating 1km cells, and is there data for these, > presumably this data varies from region to region, with some areas > having more sensor data than others? Actually it doesn't, because the data is collected by satellite and distributed by NOAA... > cheers > > > Kim > > > > > > > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. 
-- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Thu May 29 13:23:11 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Thu, 29 May 2003 10:23:11 -0700 (PDT) Subject: Network RAM revisited In-Reply-To: <20030529075826.GA1740@greglaptop.greghome.keyresearch.com> Message-ID: On Thu, 29 May 2003, Greg Lindahl wrote: > On Thu, May 29, 2003 at 12:43:52AM -0700, Joel Jaeggli wrote: > > > 2GB dimms reg ecc dimms seem to still be a factor of 8 or so more > > expensive than 1gb dimms. but spending ~$13000 or so per node for memory > > makes spending $600 for a mainboard and $400ea for cpu's seem like a > > comparitvly minor part of the cost. even with niceish cases to protect > > your expensive ram, and a blown out myrinet setup you could probably slip > > in under 1.25-1.33x the cost of the ram. > > I wasn't talking about 1.33x the cost of expensive ram, the object was > 1.2x the cost of reasonable ram. Yeah, but what was reasonable RAM costing when you targeted doing it with Alphas... if you put only 8GB in, then you have to buy twice as many machines. Some off-the-cuff calculations I did last night peg the price of a 64-node, 1TB dual-Opteron cluster at right around a million bucks... which looks like kind of a fantastic deal, in light of how much it would have cost to do that with IA-64 three months ago. 
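Joel's million-dollar figure can be re-derived from the numbers floated earlier in the thread. A quick sketch -- the case and Myrinet prices below are illustrative guesses, not quotes:

```python
nodes = 64
mem_per_node_gb = 16            # 8 DIMM slots x 2GB on a dual-Opteron board
assert nodes * mem_per_node_gb == 1024  # 1 TB total, as Joel says

memory = 13_000                 # Joel's ~$13k/node for 2GB reg ECC DIMMs
board = 600                     # mainboard figure from the thread
cpus = 2 * 400                  # two Opterons at $400 each
case, myrinet = 300, 1_500      # hypothetical case and per-port Myrinet prices

per_node = memory + board + cpus + case + myrinet
total = nodes * per_node
print(total)                    # 1036800 -- "right around a million bucks"
```

Note that memory dominates: roughly 80% of the total, which is why Greg's 1.2x-the-cost-of-RAM target was plausible.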
joelja > -- greg > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From josip at lanl.gov Thu May 29 13:16:04 2003 From: josip at lanl.gov (Josip Loncaric) Date: Thu, 29 May 2003 11:16:04 -0600 Subject: Network RAM revisited In-Reply-To: References: Message-ID: <3ED64054.6030401@lanl.gov> > On Thu, 29 May 2003, Kim Branson wrote: >> >>Interesting, is weather forecasting limited by computational resources >>and not data availability? So if an organization >>had better resources with the same set of input data they could perform >>simulations that would be better at forcasting than a less well >>endowed resource?, or is it quantity and quality of data? Are there >>machines capable of simulating 1km cells, and is there data for these, >>presumably this data varies from region to region, with some areas >>having more sensor data than others? Regional-scale modeling is being done in various places. 
For example, see the work of Lloyd Treinish and his group at IBM -- their "Deep Thunder" project predicts local weather using mesoscale high resolution (1 km) simulations, which get initialized and updated based on synoptic scale (40 km resolution) data from the National Weather Service. http://www.research.ibm.com/weather/DT.html Of course, IBM uses their own hardware (RS/6000) for this... Sincerely, Josip P.S. Weather model initialization from diverse sources of collected data is an interesting problem. See the Data Assimilation Office at NASA Goddard (http://polar.gsfc.nasa.gov/) for more detail. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From msnitzer at lnxi.com Thu May 29 14:09:02 2003 From: msnitzer at lnxi.com (Mike Snitzer) Date: Thu, 29 May 2003 12:09:02 -0600 Subject: Buying a Beowulf Cluster (Help) In-Reply-To: ; from alvin@Maggie.Linux-Consulting.com on Tue, May 27, 2003 at 02:04:51PM -0700 References: Message-ID: <20030529120902.A6946@lnxi.com> On Tue, May 27 2003 at 15:04, Alvin Oga wrote: > > On Sat, 24 May 2003, Shashank Khanvilkar wrote: > > > > > 1. Opinions on the OS to be installed on the cluster: We have decided on REd > > > Hat 7.3, however suggestions are welcome. (We are not going for RH 8/9 > > > because of some known compiler (Intel and portland) problems.. If anyone has > > > knowledge abt this, please let me know). > > gcc problems will be across the board .. > - old gcc on new hw or new gcc on new hw > - old gcc on old hw or new gcc on old hw > - you will have problems ( glibc + gcc-x.y problems ) > > - there's probably more open source support for new gcc w/ new hw Upgrading rh7.3 to gcc 3.2 (e.g. get rh8.0's gcc 3.2-7 to work under rh7.3) is not hard. RH7.3 is known to be quite stable; any issues that rh7.3 does have can be worked around. 
> i think to build new boxes based on old distro is a bad idea, > since it'd run into old known bugs that has since been fixed > in the newer distro > - given known bugs and features requirements, i'd build on the > latest/greatest stuff A well maintained Redhat7.3, with a custom kernel and updates/fixes, is very well suited for production clusters. The same cannot be said of rh8.0 or rh9.0 (even though it is possible, it requires more effort and there are more HPC software incompatibilities). > - yes, you might get the new bugs in the new systems/distro .. > but you will also get old bugs in old distro and a lot smaller > group of open source folks addressing those older issues This is true for specific packages; but in the case of rh9.0 there are some serious issues associated with the transition to NPTL. Given RedHat's new policy on the standard redhat linux (bleeding edge churn); standard RH will continue to be a source of varying degrees of instability (by design; RH wants the $$$ for RHEL). Not to mention the extremely short window of time RedHat will support the standard RH. SO all that said; something needs to give: either you stay current with the RedHat churn or you get creative with a different Linux solution. For a do it yourself cluster guru sticking with the RH churn may be acceptable, but for a production cluster that isn't _really_ an option. 
Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu May 29 14:43:51 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 29 May 2003 11:43:51 -0700 Subject: Network RAM revisited In-Reply-To: References: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> Message-ID: <20030529184351.GD1326@greglaptop.internal.keyresearch.com> On Thu, May 29, 2003 at 10:55:26PM +1000, Kim Branson wrote: > Interesting, is weather forecasting limited by computational resources > and not data availability? Yes and no. Groups share datda with each other. But there are 2 things needed to scale the computation: the first is raw compute power, and the second is better representation of sub-cell physics. The second is not a matter of computer power, it's a matter of of a lot of R&D, which is helped by more compute power (more test runs) but still takes a lot of time. Another way that raw compute power can help out is that you can run ensemble forecasts, which is a bunch of separate runs with slightly perturbed input data. By the way, I goofed up my scaling in an earlier post: the amount of compute power used goes up by the resolution**4 (3 spatial dimensions, 1 time). So going from 40km to 20km uses 16x the computation, roughly. And no, there is no current machine capable of doing 1km forecasts. 
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu May 29 14:43:51 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 29 May 2003 11:43:51 -0700 Subject: Network RAM revisited In-Reply-To: References: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> Message-ID: <20030529184351.GD1326@greglaptop.internal.keyresearch.com> On Thu, May 29, 2003 at 10:55:26PM +1000, Kim Branson wrote: > Interesting, is weather forecasting limited by computational resources > and not data availability? Yes and no. Groups share data with each other. But there are 2 things needed to scale the computation: the first is raw compute power, and the second is better representation of sub-cell physics. The second is not a matter of computer power, it's a matter of a lot of R&D, which is helped by more compute power (more test runs) but still takes a lot of time. Another way that raw compute power can help out is that you can run ensemble forecasts, which is a bunch of separate runs with slightly perturbed input data. By the way, I goofed up my scaling in an earlier post: the amount of compute power used goes up as resolution**4 (3 spatial dimensions, 1 time). So going from 40km to 20km uses 16x the computation, roughly. And no, there is no current machine capable of doing 1km forecasts. 
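Greg's corrected scaling is easy to sanity-check: halving the cell size doubles the work in each of three spatial dimensions and, via the shorter timestep, in time as well. A quick sketch of the resolution**4 rule:

```python
def relative_cost(old_km, new_km):
    """Compute-cost ratio when refining grid resolution: (old/new)**4,
    i.e. 3 spatial dimensions plus 1 time dimension."""
    return (old_km / new_km) ** 4

print(relative_cost(40, 20))  # 16.0 -- 40km -> 20km is ~16x the computation
print(relative_cost(20, 1))   # 160000.0 -- why 1km runs were out of reach
```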
Tim _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From josip at lanl.gov Thu May 29 16:56:23 2003 From: josip at lanl.gov (Josip Loncaric) Date: Thu, 29 May 2003 14:56:23 -0600 Subject: Network RAM revisited In-Reply-To: <20030529184351.GD1326@greglaptop.internal.keyresearch.com> References: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> <20030529184351.GD1326@greglaptop.internal.keyresearch.com> Message-ID: <3ED673F7.2090203@lanl.gov> Greg Lindahl wrote: > > And no, there is no current machine capable of doing 1km forecasts. Not on the global level, nor the national level, nor even at any sizable state level. However, this has been demonstrated at city level (e.g. Atlanta during the 1996 Olympic Games and more recently NYC were/are covered by IBM's "Deep Thunder" project, which can quite reliably do short term forecasts with 1km resolution). Sincerely, Josip _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu May 29 18:34:22 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 29 May 2003 15:34:22 -0700 Subject: Cheap PCs from Wal-Mart Message-ID: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov> For those of you looking to build a cluster on the (real) cheap, Walmart has mailorder PCs, with Lindows (a Linux variant) installed for $200 (plus sales tax and shipping, of course). I just bought one of these for my daughter (with WinXP, for $300.. 
I guess the MS license is $100) and while it's no ball of fire, and the keyboard and mouse are what you'd expect for a $200 computer, it DOES work ok..at least WinXP hasn't crashed in the last week except when my daughter tried to play a Win95 (only) Disney game on it. The exact configuration seems to be changing.. last fall it was a Via C3 800MHz, now it's an Athlon 1.1GHz.. anyway, 128MB, 20GB disk, CD-ROM (read only), Realtek on board 10/100 ethernet, S3 video, no floppy,... The fan is quite loud in the power supply. For someone looking to "play" with Beowulfery and set up, say 3 or 4 nodes, this seems to be a pretty low budget way to do it. I haven't checked, but I'll bet you can buy a 5 port Linksys switch from Walmart as well as the plug strips and network cables. A 4 node system for under $1000 is a very real possibility. I ordered mine online on Tuesday and UPS tried to deliver it on Friday, so it's even an "impulse buy" candidate. For what it's worth, they are made by MicroTelPC http://www.microtelpc.com/ I'd just try shoving a Linux CDROM in to see if it boots, but I'm afraid that WinXP will die... I might just get another one, bare, to try it out. James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kim.branson at csiro.au Thu May 29 19:11:04 2003 From: kim.branson at csiro.au (Kim Branson) Date: Fri, 30 May 2003 09:11:04 +1000 Subject: Network RAM revisited In-Reply-To: <3ED64054.6030401@lanl.gov> Message-ID: On Friday, May 30, 2003, at 03:16 AM, Josip Loncaric wrote: > > Josip > > > P.S. Weather model initialization from diverse sources of collected > data is an interesting problem. 
See the Data Assimilation Office at > NASA Goddard (http://polar.gsfc.nasa.gov/) for more detail. > Great, thanks for that link. I've recently started to move into this area; my problem (computational drug design) is beginning to combine different data from a range of different scoring (protein-ligand affinity estimation) functions, drug-likeness estimations, toxicology estimations, etc. After discovering I've reinvented the wheel, since I didn't think of the obvious keyword search for Data Fusion, I can see this sort of problem arises in many areas. It took an undergrad to point me in the direction of the journal of data fusion... cheers Kim _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From msnitzer at lnxi.com Thu May 29 19:02:55 2003 From: msnitzer at lnxi.com (Mike Snitzer) Date: Thu, 29 May 2003 17:02:55 -0600 Subject: Cheap PCs from Wal-Mart In-Reply-To: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov>; from James.P.Lux@jpl.nasa.gov on Thu, May 29, 2003 at 03:34:22PM -0700 References: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov> Message-ID: <20030529170255.A5293@lnxi.com> On Thu, May 29 2003 at 16:34, Jim Lux wrote: > I'd just try shoving a Linux CDROM in to see if it boots, but I'm afraid > that WinXP will die... I might just get another one, bare, to try it out. Jim, You could always boot a Knoppix CD; it lets you run Linux out of a ramdisk and has no need to touch the hard disk. 
Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Maggie.Linux-Consulting.com Thu May 29 19:19:39 2003 From: alvin at Maggie.Linux-Consulting.com (Alvin Oga) Date: Thu, 29 May 2003 16:19:39 -0700 (PDT) Subject: Cheap PCs from Wal-Mart In-Reply-To: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov> Message-ID: hi ya cheap PCs can be gotten almost anywhere ??? doesnt have to be walmart/circuit city/emachines/etc $ 30 cheap pc case ( that makes the PC their widget ) $ 70 generic motherboard w/ onboard nic, onboard svga $ 70 Celeron-1.7G 478pin fsb400 cpu $ 25 128MB pc-133 $ 25 50x cdrom $ 60 20GB ide disk ---- ------------- $ 280 grand total $ 25 oem ms license mb, cpu, disks can be a lot lower in $$ if you use p3 and pc-133 memory via series mb w/ p3-800 is about $85 total ( subtract ~ $60 from above ) same cost estimates for amd duron/athlon based systems you can save the shipping by buying locally... and might be zero sales tax in some states too stuff all that into a 1U chassis and add $100 - $250 extra ... and take out the cost of the "generic midtower case" and if there's a problem w/ the pc, i'd hate to worry about how to return it and get a better box back or is it, as typically the case, that they'd simply send out a different returned PC .. since its a warranty replacement, they dont have to send you a brand new pc like they would have to with a new order On Thu, 29 May 2003, Jim Lux wrote: > For those of you looking to build a cluster on the (real) cheap, Walmart > has mailorder PCs, with Lindows (a Linux variant) installed for $200 (plus > sales tax and shipping, of course). > > I just bought one of these for my daughter (with WinXP, for $300.. 
I guess > the MS license is $100) and while it's no ball of fire, and the keyboard > and mouse are what you'd expect for a $200 computer, it DOES work ok..at magic ! have fun alvin _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kim.branson at csiro.au Thu May 29 19:13:49 2003 From: kim.branson at csiro.au (Kim Branson) Date: Fri, 30 May 2003 09:13:49 +1000 Subject: Network RAM revisited In-Reply-To: <3ED66AA4.2020508@saic.com> Message-ID: <348954E2-922B-11D7-B274-000A9579AE94@csiro.au> On Friday, May 30, 2003, at 06:16 AM, Tim Wait wrote: > Greg Lindahl wrote: >> And no, there is no current machine capable of doing 1km forecasts. > > For traditional hydrostatic models on rectilinear meshes, maybe. > Non-hydrostatic models on adaptive unstructred meshes are a whole > 'nother story. Of course, that means implementing different physics > for cells less than about 5-10 km. But certainly doable. > So by 'adaptive unstructured' you mean the size of the grid cell in the region varies depending on how much data you have? lots of data -> smaller cells. So how does one deal with the edges of the cells? Sounds a little bit like the spatial decomposition approach in molecular dynamics. kim _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu May 29 19:46:21 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 29 May 2003 16:46:21 -0700 Subject: Cheap PCs from Wal-Mart References: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov> Message-ID: <5.2.0.9.2.20030529162915.02e8f250@mailhost4.jpl.nasa.gov> Alvin's right, there are other approaches... 
However, allow me to point out the following: - $200 for a Linux machine is cheaper than $280, albeit for less performance. I have shopped around the local stores looking for a similar pricepoint, and nobody is interested in stocking components that provide the low level of performance at the low price point Wal-Mart is selling at. - The Walmart special requires ZERO labor to assemble and test, and is guaranteed good out of the box (they'll pay return shipping)...As far as DOA rates go... I'll bet Walmart wouldn't tolerate a 5% DOA rate, since their corporate reputation rests on "non-sophisticated users" who "take it out of the box and plug it in" - One probably cannot buy a *LEGAL* copy of WinXP for $25 as an end user. One might be able to negotiate such a price from a dealer, but I'll bet the OEM License agreement *REQUIRES* the dealer to install it on a working system. Microsoft is no fool when it comes to extracting money, which is probably why WalMart is charging $100+ for the WinXP - Buying locally over the counter merely moves the shipping cost from the vendor to you. How much is your time worth? FWIW, Walmart charged me $14 for shipping, which is quite reasonable. - Legally, you have to pay sales tax (or, alternately use tax) in California regardless. Sure, some out of state vendors may be remiss in collecting it, but, then, you're then *legally responsible* for paying the use tax. Likewise, if you use your "resale permit" to claim you're buying for resale as an OEM (probably also how you'd finagle the oem Windows license), you're responsible for the tax. When it comes to tax, the government is amazingly tenacious... And, of course, if you were buying a low-buck Beowulf for an educational purpose (i.e. to give a local middle school a chance to work with a cluster), they're not going to hassle the shady tax and/or shipping and/or licensing issues. 
If you're just hacking a cluster in your living room or garage, have no money, and lots of time, then, by all means, meet the guy in the alley, buy the bare motherboards for cash, hot glue them to a piece of plywood and lash them up.... Clearly, if one had $1000 to spend on raw computing, there are better ways to invest it than buying 4 WalMart specials and a switch. I was looking at this as an example of a very inexpensive way to put together a cluster without having to build your own PCs. At 04:19 PM 5/29/2003 -0700, you wrote: >hi ya > >cheap PCs can be gotten almost anywhere ??? doesnt have to >be walmart/circuit city/emachines/etc > >$ 30 cheap pc case ( that makes the PC their widget ) >$ 70 generic motherboard w/ onboard nic, onboard svga >$ 70 Celeron-1.7G 478pin fsb400 cpu >$ 25 128MB pc-133 >$ 25 50x cdrom >$ 60 20GB ide disk >---- ------------- >$ 280 grand total > >$ 25 oem ms license > >mb, cpu, disks can be lot lower in $$ if you use p3 and pc-133 meory > >via series mb w/ p3-800 is about $85 total ( subtract ~ $60 from above ) > >same cost estimates for amd duron/athlon based systems > >you can save the shipping by bying locally... >and might be zero sales tax in some states too > >stuff all that into a 1U chassis and add $100 - $250 extra ... >and take out the cost of the "generic midtower case" > >and if there's a problem w/ the pc, i'd hate to worry about how >to return it and get a better box back or is it, as typically the case, >that they'd simply send out a different returned PC .. since its >a warranty replacement, they dont have to send you a brand new pc >like they would have to with a new order > >On Thu, 29 May 2003, Jim Lux wrote: > > > For those of you looking to build a cluster on the (real) cheap, Walmart > > has mailorder PCs, with Lindows (a Linux variant) installed for $200 (plus > > sales tax and shipping, of course). > > > > I just bought one of these for my daughter (with WinXP, for $300.. 
I guess > > the MS license is $100) and while it's no ball of fire, and the keyboard > > and mouse are what you'd expect for a $200 computer, it DOES work ok..at > >magic ! > >have fun >alvin James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Maggie.Linux-Consulting.com Thu May 29 20:23:09 2003 From: alvin at Maggie.Linux-Consulting.com (Alvin Oga) Date: Thu, 29 May 2003 17:23:09 -0700 (PDT) Subject: Cheap PCs from Wal-Mart In-Reply-To: <5.2.0.9.2.20030529162915.02e8f250@mailhost4.jpl.nasa.gov> Message-ID: hi ya jim On Thu, 29 May 2003, Jim Lux wrote: > However, allow me to point out the following: > - $200 for a Linux machine is cheaper than $280, albeit for less > performance. I have shopped around the local stores looking for a similar > pricepoint, and nobody is interested in stocking components that provide > the low level of performance at the low price point Wal-Mart is selling at. some local pc stores carry the parts .. off the shelf ... - you will have a hard time finding a $30 pc case though and rest of the components are easy to find ( even at frys(fries) ) its probably better to buy a cpu/mb/disk/memory/ps that you know and like vs experimenting with "cheap parts" ... i always get burnt trying to save $5 - $10 by buying generic but buying "bad" name-brand stuff doesnt help either > - The Walmart special requires ZERO labor to assemble and test, and is > guaranteed good out of the box (they'll pay return shipping)...As far as > DOA rates go... I'll bet Walmart wouldn't tolerate a 5% DOA rate, since > their corporate reputation rests on "non-sophisticated users" who "take it > out of the box and plug it in" yes... 
a good thing about buying a complete box > - One probably cannot buy a *LEGAL* copy of WinXP for $25 as an end user. > One might be able to negotiate such a price from a dealer, but I'll bet the > OEM License agreement *REQUIRES* the dealer to install it on a working > system. Microsoft is no fool when it comes to extracting money, which is > probably why WalMart is charging $100+ for the WinXP those ms OEM deals are tightly controlled based on number of systems one sells vs number of ms cdrom only sold, etc...etc.. - most of the pc stores selling w/ windoze xp preinstalled are just paying for the rights to sell n-machines ... and their ms license fee is significantly discounted... ( ask for a copy of ms xp cdrom and watch them squirm and ( turn red ... same for laptops > - Buying locally over the counter merely moves the shipping cost from the > vendor to you. How much is your time worth? FWIW, Walmart charged me $14 > for shipping, which is quite reasonable. shipping for large entities is cheap ... shipping for 1 system from small 1-z and 2-z companies is "regular pricing" .. roughly $5/lb of shipping .. ( ground shipping is little less about 1/2 but takes 7 days in transit ) > - Legally, you have to pay sales tax (or, alternately use tax) in > California regardless. Sure, some out of state vendors may be remiss in > collecting it, but, then, you're then *legally responsible* for paying the > use tax. Likewise, if you use your "resale permit" to claim you're buying > for resale as an OEM (probably also how you'd finagle the oem Windows > license), you're responsible for the tax. When it comes to tax, the > government is amazingly tenacious... nasa.gov should be tax exempt, even if a cali corp sells a bunch of pc to nasa.gov some groups of stanford.edu is tax exempt also and yes, collecting and paying on sales tax is important.. we say resale permit vs paperwork needed is not worth the hassle.. 
its 10x cheaper to pay the 8.25% cali sales tax for 1-z 2-z orders > And, of course, if you were buying a low-buck Beowulf for an educational > purpose (i.e. to give a local middle school a chance to work with a > cluster), they're not going to hassle the shady tax and/or shipping and/or > licensing issues. those can probably all be tax exempt except for the MS part of it have fun alvin _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From waitt at saic.com Thu May 29 20:42:25 2003 From: waitt at saic.com (Tim Wait) Date: Thu, 29 May 2003 20:42:25 -0400 Subject: Network RAM revisited References: <348954E2-922B-11D7-B274-000A9579AE94@csiro.au> Message-ID: <3ED6A8F1.2080507@saic.com> >> For traditional hydrostatic models on rectilinear meshes, maybe. >> Non-hydrostatic models on adaptive unstructured meshes are a whole >> 'nother story. Of course, that means implementing different physics >> for cells less than about 5-10 km. But certainly doable. >> > So by 'adaptive unstructured' you mean the size of the grid cell in the > region varies depending on how much data you have? > lots of data -> smaller cells. So how does one deal with the edges of > the cells? Sounds a little bit like the spatial decomposition approach > in molecular dynamics. No, the resolution is refined over areas of interest. For met, this would be areas with significant pressure gradients, vertical motion, etc. The drawback to higher resolution in atmospheric modelling is that the timesteps are based on the edge length (to keep it stable), so you get a big computational hit. The advantage to using adaptive unstructured grids is that you can have low resolution over regions that don't have any significant processes occurring, yet refine over areas that do - or retain high res over a particular region of interest... 
i.e. flow over a specific area. Another big plus with atmospheric modelling is that you can easily do global runs with local high res and avoid boundary condition discontinuities. See http://vortex.atgteam.com for more info. The global grid on there at present is from before we had global adaptation working in parallel. I'll try to get a global adaptation case up there tomorrow, done for the OK tornado outbreak a few weeks ago. Mostly dated stuff there, and it's on a 10 Mb network segment, so be patient. The Hurricane Floyd run will give you an idea of what I'm talking about. Higher resolution -> better track, but don't waste cycles on areas that don't affect the solution significantly. The problem with traditional models is that they use fixed rectilinear grids. They do nesting to get the resolution up, but that wastes a lot of cycles where nothing is happening. Sure, you can make a nice 1 km nest over a city (Josip) but that isn't going to help much if the supercell or whatever was formed outside of that region and advects in. Tim _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu May 29 23:12:58 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 29 May 2003 20:12:58 -0700 Subject: Network RAM revisited References: <348954E2-922B-11D7-B274-000A9579AE94@csiro.au> Message-ID: <000a01c32659$5fcfbae0$02a8a8c0@office1> There's also a similar thing being done in FDTD (finite difference time domain) simulation in electromagnetics. You have unstructured grids of varying densities, over which you explicitly solve Maxwell's equations. 
----- Original Message ----- From: "Kim Branson" To: "Tim Wait" Cc: Sent: Thursday, May 29, 2003 4:13 PM Subject: Re: Network RAM revisited > > On Friday, May 30, 2003, at 06:16 AM, Tim Wait wrote: > > > Greg Lindahl wrote: > >> And no, there is no current machine capable of doing 1km forecasts. > > > > For traditional hydrostatic models on rectilinear meshes, maybe. > > Non-hydrostatic models on adaptive unstructred meshes are a whole > > 'nother story. Of course, that means implementing different physics > > for cells less than about 5-10 km. But certainly doable. > > > So by 'adaptive unstructured' you mean the size of the grid cell in the > region varies depending on how much data you have? > lots of data -> smaller cells. So how does one deal with the edges of > the cells? Sounds a little bit like the spatial decomposition approach > in molecular dynamics. > > kim > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From brian.dobbins at yale.edu Thu May 29 22:57:37 2003 From: brian.dobbins at yale.edu (Brian Dobbins) Date: Thu, 29 May 2003 22:57:37 -0400 (EDT) Subject: Questions on x86-64/32 kernel w/ large arrays.. Message-ID: Hi guys, Could anyone with a lot more kernel knowledge than myself fill me in...? Recalling that with 32-bit systems, the default linux behaviour was to load system stuff around the 1GB mark, making it impossible to (statically) allocate arrays larger than that without modifying the kernel source. (See: http://www.pgroup.com/faq/execute.htm#2GB_mem ) .. 
Now with the x86-64 kernel, as supplied by GinGin64, in 'include/asm-x86-64/processor.h', I see the following: #define TASK_UNMAPPED_32 0xa0000000 #define TASK_UNMAPPED_64 (TASK_SIZE/3) #define TASK_UNMAPPED_BASE \ ((current->thread.flags & THREAD_IA32) ? TASK_UNMAPPED_32 : TASK_UNMAPPED_64) .. Does this mean that in 32-bit mode on the Opteron, I automatically get bumped up from the 1GB limit to nearly 2.5GB (0xa0000000)? And, more importantly, since the OS itself is in 64-bit mode, can I alter this setting to allow myself to have very nearly (or all!) 4GB of space for a static allocation for a 32-bit executable? (Ordinarily I'd run in 64-bit mode, but someone I know is looking to port their code from the old Digital FORTRAN to the Intel compiler, which limits us to a 32-bit executable.) Any ideas? Thoughts? Am I totally off base here? (I don't poke around the kernel much!) I'll most likely send this to the kernel lists if I don't get any luck here, but since it's not urgent, I thought I'd ask the very knowledgeable people here first. :-) Cheers, - Brian _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu May 29 23:23:27 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 29 May 2003 20:23:27 -0700 Subject: Cheap PCs from Wal-Mart References: Message-ID: <001501c3265a$d61ad260$02a8a8c0@office1> > > the low level of performance at the low price point Wal-Mart is selling at. > > some local pc stores carry the parts .. off the shelf ... > - you will have a hard time finding a $30 pc case though > and rest of the components are easy to find > ( even at frys(fries) ) > > its probably better to buy a cpu/mb/disk/memory/ps that you know and like > vs experimenting with "cheap parts" ... 
i always get burnt trying to save > $5 - $10 by buying generic but buying "bad" name-brand stuff doesnt help > either That's the deal with Walmart.. someone else (Microtel, in this case) has done the assembly and checkout. As for Fry's... don't get me started.. just suffice it to say that major manufacturers (i.e. HP/Compaq/etc) do not provide Fry's with original mfr shipping materials to replace packaging that gets dinged in transit. Most retail outlets get a supply of original mfr packaging so that they can deliver a "clean looking box" with the new computer, even if somebody in receiving spills a soda on the pallet on the loading dock (e.g.) > > - Buying locally over the counter merely moves the shipping cost from the > > vendor to you. How much is your time worth? FWIW, Walmart charged me $14 > > for shipping, which is quite reasonable. > > shipping for large entities is cheap ... > > shipping for 1 system from small 1-z and 2-z companies is "regular > pricing" .. roughly $5/lb of shipping .. ( ground shipping is little > less about 1/2 but takes 7 days in transit ) That was the default cheap ground shipping.. in my case, the mfr is reasonably local, so UPS ground will usually get here in 2 days. > > > - Legally, you have to pay sales tax (or, alternately use tax) in > > California regardless. Sure, some out of state vendors may be remiss in > > collecting it, but, then, you're then *legally responsible* for paying the > > use tax. Likewise, if you use your "resale permit" to claim you're buying > > for resale as an OEM (probably also how you'd finagle the oem Windows > > license), you're responsible for the tax. When it comes to tax, the > > government is amazingly tenacious... > > nasa.gov should be tax exempt, even if a cali corp sells a bunch of pc to > nasa.gov JPL, and NASA, do have an exemption, for some things (but not as much as you might think... tax laws aren't "logical"). However, I bought this little box for my daughter, so I'm on the hook. 
> > some groups of stanford.edu is tax exempt also Maybe, maybe not.. California is quite enthusiastic about sales tax, particularly since Prop 13 made property taxes a much smaller source of revenue for local government. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Maggie.Linux-Consulting.com Fri May 30 00:14:12 2003 From: alvin at Maggie.Linux-Consulting.com (Alvin Oga) Date: Thu, 29 May 2003 21:14:12 -0700 (PDT) Subject: Cheap PCs from Wal-Mart In-Reply-To: <001501c3265a$d61ad260$02a8a8c0@office1> Message-ID: hi ya jim On Thu, 29 May 2003, Jim Lux wrote: > That's the deal with Walmart.. someone else (Microtel, in this case) has > done the assembly and checkout. sometimes its worth it to buy pre-assembled systems even if one has inhouse staff > As for Fry's... don't get me started.. i see you're familiar with "the fun of buying at Fries" :-) > just suffice it to say that major > manufacturers (i.e. HP/Compaq/etc) do not provide Fry's with original mfr > shipping materials to replace packaging that gets dinged in transit. Most > retail outlets get a supply of original mfr packaging so that they can > deliver a "clean looking box" with the new computer, even if somebody in > receiving spills a soda on the pallet on the loading dock (e.g.) yes.. when buying from distributors... one does NOT get to return items even if it's "slightly" broken ... thats part of the "we sell new equipment at wholesale and no support" ... works out nicely .. usually... ( know what it is you want .. manufacturer and precise model# ) yes, good for hp and all manufacturers to NOT receive parts back after it left their doors... if you want their 3 year intel/amd warranty, it should straight back to intel/amd ... 
and they can decide to send a replacement ( new or used ) with the same original warranty period - it's not cheap to honour those warranty policies cpus are rated at 30,000 hrs MTBF ( ~ 5yrs ) http://www.linux-1u.net/CPU/ disks are rated at 1,000,000 hrs MTBF ( ~ 150 yrs ) ( must be a marketing based support/warranty plan ?? ) yes .. recycled parts .... one just inherits someone else's pc that didn't work right and also got returned only to be repackaged and sent back out to another ... > That was the default cheap ground shipping.. in my case, the mfr is > reasonably local, so UPS ground will usually get here in 2 days. hand delivery is good .... here's your boxes ... may i, please have the check too, pretty please ... :-) have fun alvin _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri May 30 01:10:00 2003 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Fri, 30 May 2003 13:10:00 +0800 (CST) Subject: Scalable PBS (beta) now available for download. Message-ID: <20030530051000.38178.qmail@web16805.mail.tpe.yahoo.com> http://www.supercluster.org/projects/pbs Andrew. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Fri May 30 10:02:14 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Fri, 30 May 2003 09:02:14 -0500 Subject: Network RAM revisited Message-ID: <200305301402.h4UE2EA10722@mycroft.ahpcrc.org> Greg Lindahl wrote: >By the way, I goofed up my scaling in an earlier post: the amount of >compute power used goes up by the resolution**4 (3 spatial dimensions, >1 time). So going from 40km to 20km uses 16x the computation, roughly. >And no, there is no current machine capable of doing 1km forecasts. We have recently run 5km MM5 models of the entire US on our Cray X1, and after the upgrades later this year we will be trying higher resolutions. At full scale (1024 MSPs or more) do you think the Cray X1 will be able to deliver 1km forecasts in less than 24 hours? I am not much of a weather guy, but would like to know what you think. rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Fri May 30 11:23:14 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Fri, 30 May 2003 11:23:14 -0400 (EDT) Subject: Questions on x86-64/32 kernel w/ large arrays.. In-Reply-To: Message-ID: > Recalling that with 32-bit systems, the default linux behaviour was to > load system stuff around the 1GB mark, making it impossible to I'm being persnickety, but the details are somewhat illuminating: it's not "system stuff", but rather just the mmap arena. 
there are, after all, three arenas: the traditional sbrk/malloc heap, growing up from the end of program text; the stack, growing down from 3G, and the mmap arena, which has to fit in there somewhere (and grows up by default). you might think of it as "system stuff" just because you probably notice shared libraries, which are mmaped, in /proc//maps. yes, it's true: a totally static program can avoid any use of mmap, and therefore get the whole region for heap or stack! caveats: the last time I tried this, static libc stdio mmaped a single page. also, there exist patches to tweak this in two ways: you can change TASK_UNMAPPED_BASE (one even makes it a sysctl), and you can make the mmap arena grow down (if you only ever need an 8M stack, this makes very good sense). another alternative would probably be to abandon the sbrk heap, and use a malloc implementation that was entirely based on allocating arenas using mmap. actually, this sounds like a pretty good idea - glibc could probably be hacked to do this, since it already uses mmap for large allocations... > (See: http://www.pgroup.com/faq/execute.htm#2GB_mem ) technically, the kernel merely starts mmaps at TASK_UNMAPPED_BASE - it's ld.so (userspace) which uses that for shlibs. actually, it occurs to me that you could probably do some trickery wherein you did say a 1.5 GB mmap *before* ld.so starts grubbing around, then munmap it when ld.so's done. that would let the heap expand all the way up to ~2.5GB, I think. > .. Now with the x86-64 kernel, as supplied by GinGin64, in > 'include/asm-x86-64/processor.h', I see the following: > > #define TASK_UNMAPPED_32 0xa0000000 > #define TASK_UNMAPPED_64 (TASK_SIZE/3) > #define TASK_UNMAPPED_BASE \ > ((current->thread.flags & THREAD_IA32) ? TASK_UNMAPPED_32 : > TASK_UNMAPPED_64) > > .. Does this mean that in 32-bit mode on the Opteron, I automatically > get bumped up from the 1GB limit to nearly 2.5GB (0xa0000000)? And, more that's the way I read it. 
> importantly, since the OS itself is in 64-bit mode, can I alter this > setting to allow myself to have very nearly (or all!) 4GB of space for a > static allocation for a 32-bit executable? hmm, good question. just for background, the 3G limit (TASK_SIZE) is also not a hard limit - you can set it to 3.5, for instance. the area above TASK_SIZE is an "aperture" used by the kernel so that it can avoid switching address spaces. if you make it small, you'll probably run into problems on big-memory machines (page tables need to live in there, I think), and possibly IO to certain kinds of devices. offhand, I'd guess the x86-64 people are sensible enough to have figured out a way to avoid this, which is indeed a pretty cool advantage even for 32b tasks... I hope to have my own opteron to play with next week ;) regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Robert at jaspreet.org Thu May 29 19:51:38 2003 From: Robert at jaspreet.org (Robert) Date: Thu, 29 May 2003 16:51:38 -0700 Subject: Cheap PCs from Wal-Mart In-Reply-To: References: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov> Message-ID: <5.2.1.1.0.20030529163946.01d00a58@mail.toaster.net> Hello, these guys make small-footprint 1U rackmountable low-powered units, and they can also be wall mounted; real cool units for clusters, for starters. http://www.ironsystems.com/products/iservers/aclass/a110_low_power.htm for $499.. Rob At 04:19 PM 5/29/2003 -0700, Alvin Oga wrote: >hi ya > >cheap PCs can be gotten almost anywhere ??? 
doesnt have to be >walmart/circuit city/emachines/etc > >$ 30 cheap pc case ( that makes the PC their widget ) >$ 70 generic motherboard w/ onboard nic, onboard svga >$ 70 Celeron-1.7G 478pin fsb400 cpu >$ 25 128MB pc-133 >$ 25 50x cdrom >$ 60 20GB ide disk >---- ------------- >$ 280 grand total > >$ 25 oem ms license > >mb, cpu, disks can be lot lower in $$ if you use p3 and pc-133 meory > >via series mb w/ p3-800 is about $85 total ( subtract ~ $60 from above ) > >same cost estimates for amd duron/athlon based systems > >you can save the shipping by bying locally... >and might be zero sales tax in some states too > >stuff all that into a 1U chassis and add $100 - $250 extra ... >and take out the cost of the "generic midtower case" > >and if there's a problem w/ the pc, i'd hate to worry about how to return >it and get a better box back or is it, as typically the case, >that they'd simply send out a different returned PC .. since its a >warranty replacement, they dont have to send you a brand new pc >like they would have to with a new order > >On Thu, 29 May 2003, Jim Lux wrote: > > > For those of you looking to build a cluster on the (real) cheap, Walmart > > has mailorder PCs, with Lindows (a Linux variant) installed for $200 (plus > > sales tax and shipping, of course). > > > > I just bought one of these for my daughter (with WinXP, for $300.. I guess > > the MS license is $100) and while it's no ball of fire, and the keyboard > > and mouse are what you'd expect for a $200 computer, it DOES work ok..at > >magic ! 
> >have fun >alvin > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From gerry.creager at tamu.edu Fri May 30 08:31:30 2003 From: gerry.creager at tamu.edu (Gerry Creager N5JXS) Date: Fri, 30 May 2003 07:31:30 -0500 Subject: Cheap PCs from Wal-Mart In-Reply-To: References: Message-ID: <3ED74F22.5070704@tamu.edu> If you're "blessed" with a Fry's Electronics in your area, the $30 cases are available. So are the $400 cases... If you're astute, Fry's usually has the toys to do the $200 computer with only a time investment. However, I agree with Jim: My time's worth something, and I'm running out of grad students. If we get the proposed funding I'm looking at, I'll hire a couple of kids to play node acolyte, but until then, it's not likely we're going to have spare cycles to build and test hardware. And tell me again, why would I want to get a M$ license? Especially on my cluster? As for the sales tax issue, if I don't present the appropriate paperwork, Texas requires sales tax collection from in-state vendors. If I don't provide the paperwork and tax is collected, the University will extract that tax from ME(!) and not acknowledge the paperwork reduction as cost-savings... Gerry Alvin Oga wrote: > hi ya jim > > On Thu, 29 May 2003, Jim Lux wrote: > > >>However, allow me to point out the following: >>- $200 for a Linux machine is cheaper than $280, albeit for less >>performance. I have shopped around the local stores looking for a similar >>pricepoint, and nobody is interested in stocking components that provide >>the low level of performance at the low price point Wal-Mart is selling at. 
> > > some local pc stores carry the parts .. off the shelf ... > - you will have a hard time finding a $30 pc case though > and rest of the components are easy to find > ( even at frys(fries) ) > > its probably better to buy a cpu/mb/disk/memory/ps that you know and like > vs experimenting with "cheap parts" ... i always get burnt trying to save > $5 - $10 by buying generic but buying "bad" name-brand stuff doesnt help > either > > >>- The Walmart special requires ZERO labor to assemble and test, and is >>guaranteed good out of the box (they'll pay return shipping)...As far as >>DOA rates go... I'll bet Walmart wouldn't tolerate a 5% DOA rate, since >>their corporate reputation rests on "non-sophisticated users" who "take it >>out of the box and plug it in" > > > yes... a good thing about buying a complete box > > >>- One probably cannot buy a *LEGAL* copy of WinXP for $25 as an end user. >>One might be able to negotiate such a price from a dealer, but I'll bet the >>OEM License agreement *REQUIRES* the dealer to install it on a working >>system. Microsoft is no fool when it comes to extracting money, which is >>probably why WalMart is charging $100+ for the WinXP > > > those ms OEM deals are tightly controlled based on number of systems > one sells vs number of ms cdrom only sold, etc...etc.. > - most of the pc stores selling w/ windoze xp preinstalled are > just paying for the rights to sell n-machines ... and their > ms license fee is significantly discounted... > ( ask for a copy of ms xp cdrom and watch them squirm and > ( turn red ... same for laptops > > >>- Buying locally over the counter merely moves the shipping cost from the >>vendor to you. How much is your time worth? FWIW, Walmart charged me $14 >>for shipping, which is quite reasonable. > > > shipping for large entities is cheap ... > > shipping for 1 system from small 1-z and 2-z companies is "regular > pricing" .. roughly $5/lb of shipping .. 
( ground shipping is little > less about 1/2 but takes 7 days in transit ) > > >>- Legally, you have to pay sales tax (or, alternately use tax) in >>California regardless. Sure, some out of state vendors may be remiss in >>collecting it, but, then, you're then *legally responsible* for paying the >>use tax. Likewise, if you use your "resale permit" to claim you're buying >>for resale as an OEM (probably also how you'd finagle the oem Windows >>license), you're responsible for the tax. When it comes to tax, the >>government is amazingly tenacious... > > > nasa.gov should be tax exempt, even if a cali corp sells a bunch of pc to > nasa.gov > > some groups of stanford.edu is tax exempt also > > and yes, collecting and paying on sales tax is important.. > > we say resale permit vs paperwork needed is not worth the hassle.. > its 10x cheaper to pay the 8.25% cali sales tax for 1-z 2-z orders > > >>And, of course, if you were buying a low-buck Beowulf for an educational >>purpose (i.e. to give a local middle school a chance to work with a >>cluster), they're not going to hassle the shady tax and/or shipping and/or >>licensing issues. 
> > > those are probably can all be tax exempt except for MS part of it > > have fun > alvin > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Network Engineering -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578 Page: 979.228.0173 Office: 903A Eller Bldg, TAMU, College Station, TX 77843 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jtochoi at hotmail.com Tue May 20 13:45:36 2003 From: jtochoi at hotmail.com (Jung Jin Choi) Date: Tue, 20 May 2003 13:45:36 -0400 Subject: connecting 200pc Message-ID: Hi all, I just got to know what beowulf system is and found out that I can connect the pc's up to the number of ports in a switch. Let's say, if I have a 16 port switch, I can connect up to 16 pc's. Now, my question is if I have 200pc's, how do I connect them? Should I connect 48 pc's to a 48 ports switch, then connect these four 48 ports switch to another switch? Please teach me some ways to connect many pc's... Thank you Jung Choi -------------- next part -------------- An HTML attachment was scrubbed... URL: From c00dcw00 at nchc.org.tw Wed May 21 01:56:44 2003 From: c00dcw00 at nchc.org.tw (c00dcw00) Date: Wed, 21 May 2003 13:56:44 +0800 Subject: 5/21/2003, 13:55 - Need info ! Re: [PBS-USERS] LSF vs its "Closest Competitor" References: <20030521004740.79916.qmail@web16806.mail.tpe.yahoo.com> Message-ID: <00b601c31f5d$c23049f0$673c6e8c@dt.nchc.gov.tw> To whom it may concern : I got the following questions, hope someone may echo on them, TKS !! 1. Who's the Closest Competitor ? 2. Who's the 200,000+ CPU user ? 3. What's the definition of Fairshare Utilization ? 4. Are the 100+ clusters homegeneous or hetrogeneous ? Sincerely yours, David C. Wan Tel : 886-3-577-085 ext. 326 5/21/2003, 13:55 ----- Original Message ----- From: "Andrew Wang" To: ; Sent: Wednesday, May 21, 2003 8:47 AM Subject: Fwd: [PBS-USERS] LSF vs its "Closest Competitor" > A message from the PBS mailing-list. Anyone want to > comment? > > Andrew. > > Ron Chen wrote: > > Someone sent me a chat from the Platform "web > > event", > > I would like to share with PBS developers/users.
> > =====================================================
> > Performance, Scalability, Robustness
> >
> >                          LSF 5             Closest Competitor
> >
> > Clusters                 100+              1
> > CPUs                     200000+           300
> > Jobs (active             500000+           ~10000+
> >   across clusters)
> > Fairshare Utilization    ~100%             ~50%
> > Query Time               20% better than   40% slower than
> >                          LSF 4.2           LSF 5
> > Scheduler Usage          4K/job            28K/job
> > ========================================================
> >
> > I would love to hear from the people here, at least
> > a number of things above are not true.
> >
> > I know that PBS with Globus, Silver, or other meta
> > schedulers can support over 100+ clusters too.
> >
> > For CPUs supported, I am sure I've heard people
> > using PBS with over 500+ processors.
> >
> > It would be interesting to see how Platform came up
> > with the numbers!
> >
> > -Ron
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From amitvyas_cse at hotmail.com Thu May 22 14:03:58 2003 From: amitvyas_cse at hotmail.com (Amit vyas) Date: Thu, 22 May 2003 23:33:58 +0530 Subject: connecting 200pc In-Reply-To: Message-ID: <001501c3208c$84b60ee0$e822030a@amit> Well, you have two options. 1.
Connect 16 PCs to each 16-port switch; with 12 such switches you have 12x16 = 192 PCs connected. If you prefer, you can also save on cabling by placing each switch near its group of PCs, then uplinking those 12 switches into a central switch. That leaves 8 PCs unconnected, so add one more switch for them and uplink it to the central switch as well. The same approach works with 48-port switches. 2. The second option, which I guess will be less manageable, is to keep connecting switches in a deeper hierarchy and attach your PCs at the end points. This adds management overhead, since a one-level hierarchy is much easier to manage than a two- or three-level one. Hope this helps -----Original Message----- From: beowulf-admin at scyld.com [mailto:beowulf-admin at scyld.com] On Behalf Of Jung Jin Choi Sent: Tuesday, May 20, 2003 11:16 PM To: Beowulf at beowulf.org Subject: connecting 200pc Hi all, I just got to know what beowulf system is and found out that I can connect the pc's up to the number of ports in a switch. Let's say, if I have a 16 port switch, I can connect up to 16 pc's. Now, my question is if I have 200pc's, how do I connect them? Should I connect 48 pc's to a 48 ports switch, then connect these four 48 ports switch to another switch? Please teach me some ways to connect many pc's... Thank you Jung Choi _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Thu May 22 14:14:20 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 22 May 2003 14:14:20 -0400 (EDT) Subject: connecting 200pc In-Reply-To: Message-ID: On Tue, 20 May 2003, Jung Jin Choi wrote: > Hi all, > > I just got to know what beowulf system is and found out that > I can connect the pc's up to the number of ports in a switch.
> Let's say, if I have a 16 port switch, I can connect up to 16 pc's. > Now, my question is if I have 200pc's, how do I connect them? > Should I connect 48 pc's to a 48 ports switch, then connect these > four 48 ports switch to another switch? > Please teach me some ways to connect many pc's... You can proceed several ways. Which one is right for you depends on your application needs and budget. If your cluster application is embarrassingly parallel (EP), or does a LOT of work computationally for a LITTLE interprocessor communication, then connecting a stack of switches together is fine and is also by far the cheapest alternative. The best way to interconnect them is almost certainly going to be what you describe: you buy a "master" switch with e.g. 16 ports (m) and plug A and B and C and D and ... into its ports one at a time. All traffic from A to any non-A port thus goes one hop through m: A1 -> Am || mA -> mB || Bm -> B2 for port 1 on switch A to get to port 2 on switch B (the ->'s are within the switch, the ||'s are between switches). This is fairly symmetrical, not TOO expensive, and can manage even "real parallel" applications as long as they aren't too fine grained. If it IS too fine grained, then your next choice is to bump your budget. How much you have to bump it depends on your needs and the topology of your problem. Switch cost per port is absurdly low for switches with less than or equal to 32 ports. Note the following snapshot from pricewatch.com:

$2359 - Switch 64port
$ 586 - Switch 48port
$ 129 - Switch 32port
$  76 - Switch 24port
$  31 - Switch 16port
$  48 - Switch 12port
$  22 - Switch 8port
$  19 - Switch 5port
$  19 - Switch 4port

In quantities of 32 or fewer ports per switch, the price per port is in the ballpark of only $2-4 (don't ask me to explain the anomaly at e.g. 16 ports vs 12:-). At 48 it jumps to over $10 per port. At 64 it jumps again to close to $40 (and an e.g.
$1800 HP Procurve 4000M, times two for 80 ports on a single backplane, holds at around $50/port ). Clearly it gets really expensive to pack lots of ports on a single backplane, especially planes that attempt to deliver full bisection bandwidth between all pairs of ports. Compare a wimpy ~200 ports made up of only three filled 4000M chassis with gig uplinks at ballpark $6000 vs hmmm, $31x17 = maybe $600 including the cabling to get to 256 ports with 16 16 port switches interconnected via a 16 port switch. Of course performance is better if you go up a notch and get 16 port stackable switches with a gigabit uplink and put them on a 16 port gigabit switch. Prices, however, also go up to the ballpark of $2-3K (I think) -- clearly you can span a range of anywhere from $5 to $50 per port to get to 200+ ports with various topologies and bottlenecks. If you feel REALLY rich, you can look into Foundry switches, e.g. their bigiron switches and the like. These are enterprise-class switching chassis and you will have to bleed money from every pore to buy one, but you can get hundreds of ports on a common backplane with state of the art switching technology delivering a large fraction of full bisection bandwidth and symmetry. Most users who want to build high performance networks that require full bisection bandwidth between all pairs of hosts to run fine grained code on a cluster containing hundreds of nodes and up eschew 100BT or even 1000BT and ethernet altogether, and choose either myrinet or SCI. Both of those have their own ways of managing very large clusters. In both cases you will also bleed money from every pore, but it actually might end up being LESS money than a really big ethernet switch and will have far better performance (latencies down in the <5 microsecond range instead of in the >100 microsecond range). This is really only a short review of the options (you may hear more from some of the networking experts on the list) but this might get you started.
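To put rough numbers on the stacked-switch layout, here is a quick sketch of the arithmetic. It is a hypothetical model of my own: the prices are the pricewatch snapshot quoted above, and unlike the raw 256-port count it charges each leaf switch one port for its uplink to the master.

```python
# Cost arithmetic for the stacked-switch layouts discussed above.
# Prices are the May-2003 pricewatch snapshot quoted in the message.
prices = {64: 2359, 48: 586, 32: 129, 24: 76, 16: 31, 12: 48, 8: 22, 5: 19, 4: 19}

def cost_per_port(ports):
    """Dollars per port for a single switch of the given size."""
    return prices[ports] / ports

def two_tier(leaf_ports, master_ports):
    """Usable host ports and total switch cost when master_ports leaf
    switches each give up one port for an uplink to one master switch."""
    hosts = master_ports * (leaf_ports - 1)
    cost = master_ports * prices[leaf_ports] + prices[master_ports]
    return hosts, cost

if __name__ == "__main__":
    for n in (16, 32, 48, 64):
        print(f"{n:2d}-port switch: ${cost_per_port(n):6.2f}/port")
    hosts, cost = two_tier(16, 16)
    print(f"two-tier 16x16: {hosts} host ports for ${cost} total")
```

The 16x16 stack comes out to 240 usable host ports for $527 in switches, in line with the ballpark figures above.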
To summarize, a) profile your task and determine its communication requirements; b) match your task and its expected scaling to your budget, your node architecture, and a network simultaneously. That is, to get the most work done per dollar spent, figure out whether you have to spend relatively much on a network or relatively little. If EP or coarse grained, spend little on the network, and shift more and more over to the network at the expense of nodes as the task becomes fine grained and the ratio of communication to computation increases. Don't worry about "spending nodes" on faster communications but fewer (maybe a LOT fewer) nodes than you expected/hoped to get -- if you're doing a really fine grained real parallel task, it won't scale out to LOTS of nodes anyway, certainly not without a premiere network interconnecting them. As in, nodes+network prices range from ballpark of $750 for a cost-benefit optimal processor with maybe 512 MB of memory each on a "cheapest" network to $2000 or even more for a high end network. However, if your parallel task only scales linearly to 16 nodes with a cheap network (and maybe even slows DOWN after 32 nodes, so buying 200 doesn't actually help:-) you may be better off using your 200-cheapest-node-budget to buy only 64 nodes that use a network that permits the task to scale nearly linearly up to all 64. Hope this helps, rgb > Thank you > > Jung Choi > > > > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C.
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu May 22 16:01:05 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 22 May 2003 13:01:05 -0700 Subject: Question on RHL 7.2, 8.0 and 9.0 In-Reply-To: <3EC6597F.60507@earthlink.net> References: <1052862721.13407.22.camel@roughneck.liniac.upenn.edu> <3EC24666.2070701@yahoo.com.sg> <3EC6597F.60507@earthlink.net> Message-ID: <20030522200105.GA1574@greglaptop.internal.keyresearch.com> On Sat, May 17, 2003 at 12:47:11PM -0300, Sam Daniel wrote: > There's a company out in Lodi, CA called CheapBytes (www.cheapbytes.com) > that sells a "workalike" version of RedHat called "Pink Tie". I ordered > a set of version 9 CDs to compare against the RH9 I bought yesterday. > > According to CheapBytes, "Pink Tie 9.0 has been modified to comply with > the Red Hat 9.0 EULA (End User License Agreement) regarding the use of > the Red Hat Logo and Trademark and therefore is freely distributable." The only thing that's different about Pink Tie is that the redhat-logos package was replaced, and a few instances of the word "RedHat" got changed. The main gotcha is that these days, when you do an upgrade of RedHat, it looks at /etc/issue to confirm that you're updating RedHat and not another distro of Linux. Well, Pink Tie changes /etc/issue... so the installer doesn't show you the upgrade option. So, if you later upgrade, you need to add "upgradeany" when you boot.
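A toy model of that gotcha, for illustration only: the installer offers an upgrade only when the existing /etc/issue looks like Red Hat, unless "upgradeany" was given at boot. This is not anaconda's actual code, and the release strings below are invented examples.

```python
# Simplified model of the installer check described above: upgrade is
# offered only if /etc/issue looks like Red Hat, unless the user booted
# with the "upgradeany" option. Illustrative only; not anaconda's code.

def offers_upgrade(issue_text, upgradeany=False):
    if upgradeany:
        return True  # "upgradeany" skips the /etc/issue check entirely
    return "Red Hat" in issue_text

# A rebranded distro like Pink Tie ships a different /etc/issue, so the
# upgrade option silently disappears until you boot with "upgradeany".
print(offers_upgrade("Red Hat Linux release 9 (Shrike)"))           # True
print(offers_upgrade("Pink Tie Linux release 9"))                   # False
print(offers_upgrade("Pink Tie Linux release 9", upgradeany=True))  # True
```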
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From derek.richardson at pgs.com Thu May 22 16:31:58 2003 From: derek.richardson at pgs.com (Derek Richardson) Date: Thu, 22 May 2003 15:31:58 -0500 Subject: Opteron issues In-Reply-To: <200305190743.44605.shewa@inel.gov> References: <3EC56EC7.4080803@pgs.com> <200305190743.44605.shewa@inel.gov> Message-ID: <3ECD33BE.9070001@pgs.com> Andrew, Thanks for the info. Just out of curiosity, what clock speed are these CPUs ( both crashed and non )? Thanks to others who replied, BTW! Regards, Derek R. Andrew Shewmaker wrote: >On Friday 16 May 2003 05:05 pm, Derek Richardson wrote: > > >>All, >>I've heard rumors of an Opteron stability issue, but can't seem to find >>anything concrete on the web yet. Has anyone heard about this? >>Opinions, experiences? >>Thanks, >>Derek R. >> >> > >I tested an Opteron 240 system with 8GB RAM running SuSE. >The gcc 3.3 (prerelease at the time) that SuSE provided was good... >I didn't compare it to an earlier version of gcc. The >PG beta worked well for an f90 code I wanted to test. This code >runs as a single process and will allocate between 5 and 6 GB of >RAM (that's the resident set size) when running our big models. >The big model I tested completed in about 46 hours on the Opteron >compared to around 168 hours for a 750MHz UltraSPARC III. > >The system I tested used a Newisys motherboard and I was able >to crash it, but that was before it had its production cpus. I >wasn't able to crash it after they were installed. > >Andrew > > > -- Linux Administrator derek.richardson at pgs.com derek.richardson at ieee.org Office 713-781-4000 Cell 713-817-1197 One of your most ancient writers, a historian named Herodotus, tells of a thief who was to be executed.
As he was taken away he made a bargain with the king: in one year he would teach the king's favorite horse to sing hymns. The other prisoners watched the thief singing to the horse and laughed. "You will not succeed," they told him. "No one can." To which the thief replied, "I have a year, and who knows what might happen in that time. The king might die. The horse might die. I might die. And perhaps the horse will learn to sing. -- "The Mote in God's Eye", Niven and Pournelle _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jhearns at freesolutions.net Thu May 22 17:12:52 2003 From: jhearns at freesolutions.net (John Hearns) Date: Thu, 22 May 2003 22:12:52 +0100 Subject: Question on RHL 7.2, 8.0 and 9.0 References: <1052862721.13407.22.camel@roughneck.liniac.upenn.edu> <3EC24666.2070701@yahoo.com.sg> <3EC6597F.60507@earthlink.net> <20030522200105.GA1574@greglaptop.internal.keyresearch.com> Message-ID: <000f01c320a6$e8c15470$5c259fd4@mypc> > > > > According to CheapBytes, "Pink Tie 9.0 has been modified to comply with > > the Red Hat 9.0 EULA (End User License Agreement) regarding the use of > > the Red Hat Logo and Trademark and therefore is freely distributable." > > > So, if you later upgrade, you need to add "upgradeany" when you boot. > > -- greg > I run Pink Tie 8.0 at home, and it is just fine. Thanks to Greg about the upgrade tip - which I will try when I get 9 soon. For anyone in the UK, I can heartily recommend John Winters at the Linux Emporium for all Linux distros, including Pink Tie http://www.linuxemporium.co.uk John produces update CDs of the latest update RPMs also, for the bandwidth challenged. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sliu at pipeline.com Fri May 23 09:29:29 2003 From: sliu at pipeline.com (Shaohui Liu) Date: Fri, 23 May 2003 09:29:29 -0400 (EDT) Subject: question on beostat Message-ID: <3882993.1053695718020.JavaMail.nobody@wamui06.slb.atl.earthlink.net> Hi, I am a beowulf newbie running a basic version of Scyld Beowulf on a 9 node cluster. Here is the OS: Linux beowulf1 2.2.19-12.beo #1 Tue Jul 17 17:10:45 EDT 2001 i686 unknown The slave nodes were started with default CDs. I was able to run a few applications with parallel processing, but I could not see any changes in the monitoring program. If I run the beostat command, I only see information for nodes -1, 1 and 3, while all other nodes were inaccessible (even with the -N option). Does anyone know why? How can I see the resources on each node?
Thanks in advance Shaohui _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From shashank at evl.uic.edu Sat May 24 15:29:08 2003 From: shashank at evl.uic.edu (Shashank Khanvilkar) Date: Sat, 24 May 2003 14:29:08 -0500 Subject: Buying a Beowulf Cluster (Help) Message-ID: <008401c3222a$bf5c3d90$44acf880@SHASHANK> Hi, We have already put an order for a beowulf cluster and it is scheduled to come soon. The only problem is that we are very inexperienced with such clusters and would really appreciate some help. There is a lot of documentation, and I have yet to start reading. Will really appreciate it if someone can point out answers to the following questions: (I will be google'ing for it also). 1. Opinions on the OS to be installed on the cluster: We have decided on Red Hat 7.3, however suggestions are welcome. (We are not going for RH 8/9 because of some known compiler (Intel and Portland) problems.. If anyone has knowledge about this, please let me know). 2. Any documentation on the software that needs to be installed (MPI, PVM, admin stuff etc) that will help us in the long run. 3. Any documentation on TO-DO's..or things that we need to check/do before working on the cluster. Any help is greatly appreciated.
Shashank http://mia.ece.uic.edu/~papers _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Sun May 25 04:17:14 2003 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Sun, 25 May 2003 16:17:14 +0800 (CST) Subject: Buying a Beowulf Cluster (Help) In-Reply-To: <008401c3222a$bf5c3d90$44acf880@SHASHANK> Message-ID: <20030525081714.16945.qmail@web16801.mail.tpe.yahoo.com> I just installed Gridengine on our newly installed cluster, and got it running smoothly. If you have more than several users sharing the cluster, a batch system is essential. Some people choose to use PBS; I suggest you use PBSPro instead of OpenPBS. OpenPBS is free, but PBSPro is much nicer, with better features. You may want to apply the several patches floating around the OpenPBS home page. (Even with those, Gridengine or PBSPro is still much nicer.) Gridengine: http://gridengine.sunsource.net/ OpenPBS: http://www.openpbs.org/ http://www-unix.mcs.anl.gov/openpbs/ If you run parallel code, you can take a look at LAM-MPI: http://www.lam-mpi.org Andrew. --- Shashank Khanvilkar wrote: > Hi, > We have already put an order for a beowulf cluster > and it is scheduled to > come soon. The only problem is that we are very > inexperienced with such > clusters and would really appreciate some help. > There is a lot of > documentation, and I have yet to start reading. > Will really appreciate it if someone can point out > answers to the following > questions: (I will be google'ing for it also). > > > 1. Opinions on the OS to be installed on the > cluster: We have decided on Red > Hat 7.3, however suggestions are welcome. (We are > not going for RH 8/9 > because of some known compiler (Intel and Portland) > problems.. If anyone has > knowledge about this, please let me know). > > 2.
Any documentation on the software that needs to > be installed (MPI, PVM, > admin stuff etc) that will help us in the long run. > > 3. Any documentation on TO-DO's..or things that we > need to check/do before > working on the cluster. > > Any help is greatly appreciated. > > Shashank > > > > > http://mia.ece.uic.edu/~papers > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or > unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From jhearns at freesolutions.net Sun May 25 04:53:57 2003 From: jhearns at freesolutions.net (John Hearns) Date: Sun, 25 May 2003 09:53:57 +0100 Subject: Buying a Beowulf Cluster (Help) References: <008401c3222a$bf5c3d90$44acf880@SHASHANK> Message-ID: <001501c3229b$2e3d9950$a1249fd4@mypc> ----- Original Message ----- From: "Shashank Khanvilkar" To: Sent: Saturday, May 24, 2003 8:29 PM Subject: Buying a Beowulf Cluster (Help) > Hi, > We have already put an order for a beowulf cluster and it is scheduled to > come soon. The only problem is that we are very in-experienced with such > clusters and would really appreciate some help. There is a lot of You could do a lot worse than getting some books on clustering. Two which I recommend are: "Linux Clustering - Building and Maintaining Linux Clusters" by Charles Bookman ISBN 1578702747 "Beowulf Cluster Computing with Linux" by Thomas Sterling et.al.
ISBN 0262692740

From atp at piskorski.com Sun May 25 21:22:28 2003
From: atp at piskorski.com (Andrew Piskorski)
Date: Sun, 25 May 2003 21:22:28 -0400
Subject: Debian FAI diskless?
In-Reply-To: <200305160944.h4G9iYW03794@NewBlue.Scyld.com>
References: <200305160944.h4G9iYW03794@NewBlue.Scyld.com>
Message-ID: <20030526012225.GA31443@piskorski.com>

On Fri, May 16, 2003 at 05:44:34AM -0400, beowulf-request at scyld.com wrote:
> From: Thomas Lange
> Date: Fri, 16 May 2003 10:42:18 +0200
> Subject: Re: (Opens can of worms..) What is the best linux distro for a cluster?
> Debian has a very nice tool for installing a beowulf cluster
> unattended. It's called Fully Automatic Installation (FAI) and can be
> found at http://www.informatik.uni-koeln.de/fai/. The manual also has
> a chapter on how to set up a Beowulf cluster with little work. I did
> several installations of Beowulf clusters with this tool (I'm the
> author), but I also know other people using it for clusters. Have a

Can FAI be used to set up a cluster where only the head node has a
local disk, and all the rest are diskless? I looked in the docs but
didn't see any reference to diskless.

--
Andrew Piskorski
http://www.piskorski.com

From lange at informatik.Uni-Koeln.DE Mon May 26 06:19:49 2003
From: lange at informatik.Uni-Koeln.DE (Thomas Lange)
Date: Mon, 26 May 2003 12:19:49 +0200
Subject: Debian FAI diskless?
In-Reply-To: <20030526012225.GA31443@piskorski.com>
References: <200305160944.h4G9iYW03794@NewBlue.Scyld.com> <20030526012225.GA31443@piskorski.com>
Message-ID: <16081.59973.341483.254274@informatik.uni-koeln.de>

>>>>> On Sun, 25 May 2003 21:22:28 -0400, Andrew Piskorski said:
> Can FAI be used to set up a cluster where only the head node has a
> local disk, and all the rest are diskless? I looked in the docs but
> didn't see any reference to diskless.

Surely! There's an example configuration in FAI that shows how to
install a diskless machine. The only part that differs from the default
is "partition my local hard disks"; the rest remains the same. Have a
look at the file templates/hooks/partition.DISKLESS

--
regards, Thomas

From 8nrf at qlink.queensu.ca Mon May 26 10:48:03 2003
From: 8nrf at qlink.queensu.ca (Nathan Fredrickson)
Date: Mon, 26 May 2003 10:48:03 -0400
Subject: mpich on cluster of SMPs
Message-ID:

Hi,

I have a cluster of SMP machines running linux-2.4.20 and mpich-1.2.5.
I configured mpich with device=ch_p4 and comm=shared to allow
shared-memory communication between processes on the same node. This
seemed to be working fine, but I was not confident that shared memory
was actually being used on-node, so I built and installed a second
instance of mpich with device=ch_shmem.

Using a simple two-process test program that measures how long it takes
to send an integer back and forth 10000 times, I compared the three
setups:

device=ch_p4, comm=shared, off-node: 1.88 seconds
device=ch_p4, comm=shared, on-node:  1.25 seconds
device=ch_shmem:                     0.288314 seconds

The ch_p4 device does not seem to be using shared memory when both
processes are on the same node. Am I misinterpreting what comm=shared
is supposed to do?
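For comparison, the shape of the test Nathan describes can be reproduced outside mpich. This is a hypothetical Python sketch (my own names and structure, not his program) that bounces an integer back and forth 10000 times between two processes over a `multiprocessing.Pipe` and reports the elapsed wall time:

```python
import time
from multiprocessing import Pipe, Process

ROUND_TRIPS = 10_000

def echo(conn):
    # Child process: receive an integer and send it straight back,
    # until told to stop with a None sentinel.
    while True:
        n = conn.recv()
        if n is None:
            break
        conn.send(n)

def time_round_trips(n_trips=ROUND_TRIPS):
    parent, child = Pipe()
    p = Process(target=echo, args=(child,))
    p.start()
    start = time.perf_counter()
    for i in range(n_trips):
        parent.send(i)      # one "ping"
        assert parent.recv() == i  # and the matching "pong"
    elapsed = time.perf_counter() - start
    parent.send(None)  # tell the child to exit
    p.join()
    return elapsed

if __name__ == "__main__":
    secs = time_round_trips()
    print(f"{ROUND_TRIPS} round trips: {secs:.3f} s")
```

On an SMP node this measures an OS-level IPC round trip, in the same spirit as the ch_p4 vs. ch_shmem comparison above, though through a pipe rather than an MPI transport.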
Is there additional configuration required to make the ch_p4 device use
shared memory on-node? I expected on-node performance similar to
device=ch_shmem. Any insights would be appreciated.

Thanks,
Nathan

From cjereme at ucla.edu Mon May 26 12:21:27 2003
From: cjereme at ucla.edu (Tintin J Marapao)
Date: Mon, 26 May 2003 09:21:27 -0700
Subject: myrinet fault detection
Message-ID: <000b01c323a2$dd0014b0$a400a8c0@Laptopchristine>

Hello all,

I am looking into Myrinet for our cluster, and I was wondering if
anyone knows anything about (or can point me to resources on):

- whether Myrinet has built-in fault/error detection
- whether there is a built-in fault recovery system
- its power management system

Thanks much,
Christine

From cdcunh01 at cs.fiu.edu Tue May 27 13:49:05 2003
From: cdcunh01 at cs.fiu.edu (cassian d'cunha)
Date: Tue, 27 May 2003 13:49:05 -0400 (EDT)
Subject: enabling rlogin
Message-ID: <4815.131.94.130.190.1054057745.squirrel@www.cs.fiu.edu>

Hi,

I am quite a newbie as far as clusters are concerned. I can't get
rlogin to work on a cluster that runs Scyld Beowulf based on Red Hat
6.2. I can rlogin and telnet to other machines, but not to the host
from any other machine. It also doesn't have the /etc/inetd.conf file
where I would normally enable (server) rlogin, telnet, etc. Any help on
enabling rlogin would be greatly appreciated.

Thanks,
Cassian.
From patrick at myri.com Tue May 27 15:15:39 2003
From: patrick at myri.com (Patrick Geoffray)
Date: 27 May 2003 15:15:39 -0400
Subject: myrinet fault detection
In-Reply-To: <000b01c323a2$dd0014b0$a400a8c0@Laptopchristine>
References: <000b01c323a2$dd0014b0$a400a8c0@Laptopchristine>
Message-ID: <1054062940.542.20.camel@asterix>

Hi Christine,

On Mon, 2003-05-26 at 12:21, Tintin J Marapao wrote:
> -Myrinet has a built in fault /error detection

The hardware provides:
1) CRC8 and CRC32 to detect bit corruption on the link.
2) SRAM parity to detect memory corruption on the NIC.
3) the PCIDMA chipset checks for PCI parity on DMA reads (when the NIC
   is the PCI target).

> -If there is a built in fault recovery system

2) and 3) are fatal errors; they should not happen in your lifetime
unless the hardware is faulty. 1) and other cases are recoverable if
the firmware you are using is reliable. GM is reliable: it does
segmentation/reassembly in the NIC and ACKs each fragment, and it
retransmits the data if a packet is lost or corrupted. You can find
more information on BER here:
http://www.myri.com/cgi-bin/fom?file=245

> -Power management system

What do you mean?

Patrick

--
Patrick Geoffray, PhD
Myricom, Inc.
http://www.myri.com

From bushnell at chem.ucsb.edu Tue May 27 16:26:59 2003
From: bushnell at chem.ucsb.edu (John Bushnell)
Date: Tue, 27 May 2003 13:26:59 -0700 (PDT)
Subject: Buying a Beowulf Cluster (Help)
In-Reply-To: <008401c3222a$bf5c3d90$44acf880@SHASHANK>
Message-ID:

I would strongly suggest knocking on a few doors where you're at and
finding some local cluster folks.
There have got to be some people maintaining clusters at your University. If nothing else, you will have company during your misery. And they can probably clear up a lot of your questions over a couple mugs of coffee. As soon as the word "Beowulf" came out of my mouth while talking to someone who had built one, I immediately discovered some fundamental misconceptions I had and modified my plans. - John On Sat, 24 May 2003, Shashank Khanvilkar wrote: > Hi, > We have already put an order for a beowulf cluster and it is scheduled to > come soon. The only problem is that we are very in-experienced with such > clusters and would really appreciate some help. There is a lot of > documentation, and i have yet to start reading. > WIll really appreciate if someone can point out ans's to the following > questions: (I will be google'ing for it also). > > > 1. Opinions on the OS to be installed on the cluster: We have decided on REd > Hat 7.3, however suggestions are welcome. (We are not going for RH 8/9 > because of some known compiler (Intel and portland) problems.. If anyone has > knowledge abt this, please let me know). > > 2. Any documentation on the software that needs to be installed (MPI, PVM, > admin stuff etc) that will help us in the long run. > > 3. Any documentation on TO-DO's..or things that we need to check/do before > working on the cluster. > > Any help is greatly appreciated. 
> > Shashank
> >
> > http://mia.ece.uic.edu/~papers

From alvin at Maggie.Linux-Consulting.com Tue May 27 17:04:51 2003
From: alvin at Maggie.Linux-Consulting.com (Alvin Oga)
Date: Tue, 27 May 2003 14:04:51 -0700 (PDT)
Subject: Buying a Beowulf Cluster (Help)
In-Reply-To:
Message-ID:

hi ya - 2 replies in 1

On Tue, 27 May 2003, John Bushnell wrote:
> I would strongly suggest knocking on a few doors where you're
> at and finding some local cluster folks. There have got to be
> some people maintaining clusters at your University. If nothing
> else, you will have company during your misery.

then one can look outside the area
- you will get 10x better hw support if they were local
- biggest problem: keeping machines up (something dies, and you need it
  fixed "now")
- remote admin and howto support can be done remotely, by people that
  can keep it up
- i think "hands-on support" vs "uptime/reliability support" should be
  split up

On Sat, 24 May 2003, Shashank Khanvilkar wrote:
> > 1. Opinions on the OS to be installed on the cluster: We have decided
> > on Red Hat 7.3, however suggestions are welcome. (We are not going for
> > RH 8/9 because of some known compiler (Intel and Portland) problems.
> > If anyone has knowledge about this, please let me know.)

gcc problems will be across the board ..
- old gcc on new hw or new gcc on new hw
- old gcc on old hw or new gcc on old hw
- you will have problems (glibc + gcc-x.y problems)
- there's probably more open source support for new gcc w/ new hw

i think building new boxes based on an old distro is a bad idea, since
it'd run into old known bugs that have since been fixed in the newer
distro
- yes, you might get new bugs in the new systems/distro .. but you will
  also get old bugs in an old distro, and a lot smaller group of open
  source folks addressing those older issues
- there's usually a workaround for most bugs/problems ..
  - the typical/usual workaround for older bugs is to upgrade
  - new bugs/problems simply means you need more time to do more
    detailed testing before deciding
- i have yet to see a newer distro NOT be able to run an older app that
  "claimed to require an old foo-x.y version by the 3rd party vendor" ..
  it works even on the newer version unless they did something unique
  to their code to lock it to that linux distro
- given known bugs and feature requirements, i'd build on the
  latest/greatest stuff

> > 2. Any documentation on the software that needs to be installed (MPI,
> > PVM, admin stuff etc) that will help us in the long run.

random collection of stuff
http://www.Linux-Consulting.com/Cluster

> > 3. Any documentation on TO-DO's..or things that we need to check/do
> > before working on the cluster.

pricing and support ...
- simulate the "my node just died, how do you (vendor) plan to fix it?"
  scenario, when your deadline was yesterday

have fun
alvin

From kr4m17 at aol.com Tue May 27 20:15:42 2003
From: kr4m17 at aol.com (kr4m17 at aol.com)
Date: Tue, 27 May 2003 20:15:42 -0400
Subject: Process Migration in Scyld Beowulf
Message-ID: <7E0EC9F9.03151EEF.000655AE@aol.com>

Hi,

I am new to the clustering scene.
I have just installed Scyld Beowulf and have successfully added nodes,
but I cannot get any of the nodes to process ANY information. Their CPU
status stays on 0%, along with their memory... I would very much like
to run a program called Ubench (www.phystech.com/download) just to test
the performance increase of the cluster. If anyone can tell me how to
run any program over all nodes of the cluster, or inform me of how to
run Ubench specifically, I would greatly appreciate it. I have tried
mpirun and it basically does not work at all; I am not sure why. If
anyone has any information, or knows of any documentation that can help
me out, please respond to the mailing list or email me at kr4m17 at
aol.com.

kr4m17 (J. Simon)

From mphil39 at hotmail.com Wed May 28 10:47:58 2003
From: mphil39 at hotmail.com (Matt Phillips)
Date: Wed, 28 May 2003 10:47:58 -0400
Subject: Network RAM revisited
Message-ID:

Hello,

I am a student about to start work on using network RAM as swap space
in a cluster environment, as part of a semester project. I need to
convince myself first that this is indeed relevant in today's
situation. I have read the earlier discussions in the archives, but
they are two years old, and things might have changed since then that
make network RAM more plausible (GigE becoming ubiquitous, network
latency dropping, seek time in disks not getting better, etc.). I would
like to hear the opinions of the members of the group.

I guess the main argument against it is: why not simply put in more
memory sticks and avoid swap altogether? I was told there are
applications out there that would still always need swap. To make the
case more convincing, I would also like to test performance with
real-world application traces instead of probability distributions.
Does anyone know of applications (preferably in widespread use) for
which swap is unavoidable?

Another question that bothers me is that network latency deteriorates
severely after packet size goes beyond 1-1.5 KB. Assuming the page size
is 4 KB, wouldn't this affect network RAM performance in a big way? Any
way around this problem?

Thanks in advance,
Matt

From j.c.burton at gats-inc.com Wed May 28 11:46:36 2003
From: j.c.burton at gats-inc.com (John Burton)
Date: Wed, 28 May 2003 11:46:36 -0400
Subject: Network RAM revisited
References:
Message-ID: <3ED4D9DC.60005@gats-inc.com>

Matt Phillips wrote:
> Hello,
>
> I guess the main argument against it is why not simply put in more
> memory sticks and avoid swap altogether. I was told there are
> applications out there that would still always need swap. To make the
> case more convincing, I would also like to test performance with real
> world application traces instead of probablity distributions. Does
> anyone know of applications (preferably used widespread) for which swap
> is unavoidable?

In my line of work (atmospheric remote sensing), it's not so much a
matter of "applications for which swap is unavoidable" as "build it and
they will use it, and more" - I don't care how "big" a machine /
cluster I build, the scientists will find a way to use all its
resources and ask for more. As I give them more powerful setups, their
mathematical models increase in size and complexity. It's a constant
battle to break even. Their models fill the existing memory and start
hitting swap, which slows their processing down.
They need to run faster so they can process a day's worth of data in one day, I give them more memory. They fill up the additional memory and start hitting swap again, but they need it to run faster so... (think you get the picture). Swap allows them to continually improve / enhance their models. Additional memory just makes it go faster. John _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Wed May 28 12:13:39 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Wed, 28 May 2003 12:13:39 -0400 (EDT) Subject: Network RAM revisited In-Reply-To: Message-ID: > I am a student about to start work on using Network RAMs as swap space in a > cluster environment, as a part of semester project. I need to convince sigh. > plausible (gigE becoming ubiquitous, network latency reducing, seek time in > disks not getting better etc). I would like to hear the opinion of the seek time improves slowly, it's true, but it does improve. perhaps more importantly, striping makes it a non-issue, or at least a solvable one. network latency hasn't improved much, in spite of gbe: I see around 80 us /bin/ping latency for 100bT, and about half that for gbe. in both cases, bandwidth has improved dramatically, though. networks are still pretty pathetic compared to dram interfaces: in other words, dram hasn't stood still, either. > I guess the main argument against it is why not simply put in more memory > sticks and avoid swap altogether. I was told there are applications out why do you think swap is an issue? > there that would still always need swap. I don't believe that's true. there are workloads which always result in some dirty data which is mostly idle; those pages would be perfect for swapping out, since it would free up the physical page for hotter use. 
it would seem very strange to me if an app created a *lot* of these
pages, since that would more-or-less imply an app design flaw.

> Another question that bothers me is network latency deteriorates severely
> after packet size goes beyond 1-1.5 KB.

I don't see that, unless by "severe" you mean latency=size/bandwidth ;)
fragmenting a packet should definitely not cause a big decrease in
throughput. also, support for jumbo MTU's is not that uncommon.

> Assuming page size is 4KB, wouldn't
> this affect the network RAM performance in a big way? Any way around this
> problem?

no. MMU hardware is inimically hostile to this sort of too-clever-ness.
not only are pages large, but manipulating them (TLB's, actually) is
expensive, especially in a multiprocessor environment.

my take on swap is this:
- a little bit of swapout is a very good thing, since it means that
  idle pages are being put to better use.
- more than a trivial amount of swap *in* is very bad, since it means
  someone is waiting on those pages. worse is when a page appears to be
  swapped out and back in quickly; that's really just a kernel bug.
- swap-outs are also nice in that they are often async, so no one is
  waiting for them to complete. they can also be lazy-written and even
  optimistically pre-written.
- swap-ins are painful, but at least you can scale bandwidth and
  latency by adding spindles.
- the ultimate solution is, of course, to add more ram! for ia32, this
  is pretty iffy above 2-6 GB, partly because it's 32b hardware, and
  partly because the hardware has dampened demand for big memory. but
  ram is at historically low prices:
  http://sharcnet.mcmaster.ca/~hahn/ram.png
  (OK, there was a brief period when crappy old PC133 SDR was cheaper
  than PC2100 is today, but...)

in general, if you're concerned about ram, I'd look seriously at
Opteron machines, since there simply is no other platform that's quite
as clean: 64b goodness, scales ram capacity with cpus, not crippled by
a shared FSB.
it's true that you can put together some pretty big-ram ia64 machines, but they tend to wind up being rather expensive ;) in summary: I believe network shared memory is simply not a great computing model. if I was supervising a thesis project, I'd probably try to steer the student towards something vaguely like Linda... regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Wed May 28 14:24:14 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Wed, 28 May 2003 11:24:14 -0700 Subject: Network RAM revisited In-Reply-To: References: Message-ID: <20030528182414.GA1913@greglaptop.internal.keyresearch.com> On Wed, May 28, 2003 at 10:47:58AM -0400, Matt Phillips wrote: > I am a student about to start work on using Network RAMs as swap space in a > cluster environment, as a part of semester project. This is not as simple as it looks, because of deadlock situations. If you have to frantically start swapping because you're low on memory, how are you going to be sure that the process or kernel thread which is going to send memory to another node's ram (swap out) isn't itself going to have to allocate memory? There are 2 extensions beyond this project which are Really Cool. The first is a shared database cache, like Oracle's Cache Fusion. The second is a global cache for a shared filesystem. If you make the filesystem read-only, then this might make a better semester project than swap. For your application, you could use BLAST. You could even hack up BLAST's I/O to go through a library instead of through the kernel. That would significantly reduce the complexity of the project. 
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From fryman at cc.gatech.edu Wed May 28 17:30:08 2003 From: fryman at cc.gatech.edu (Josh Fryman) Date: Wed, 28 May 2003 17:30:08 -0400 Subject: Network RAM revisited In-Reply-To: References: Message-ID: <20030528173008.40e39cdc.fryman@cc.gatech.edu> there are several research papers that explore the issues in doing this. try a lit review. some that randomly pop to the top of my stack: Using network memory to improve performance in transaction-based systems Using remote memory to avoid disk thrashing Memory servers for multicomputers Implementation of a reliable remote memory pager The network RAMdisk (etc) -josh _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rgb at phy.duke.edu Wed May 28 19:29:52 2003 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 28 May 2003 19:29:52 -0400 (EDT) Subject: Network RAM revisited In-Reply-To: Message-ID: On Wed, 28 May 2003, Mark Hahn wrote: > > Another question that bothers me is network latency deteriorates severely > > after packet size goes beyond 1-1.5 KB. > > I don't see that, unless by "severe" you mean latency=bandwidth/size ;) > fragmenting a packet should definitely not cause a big decrease in > throughput. also, support for jumbo MTU's is not that uncommon. Mark is dead right here. In fact, there are two regimes of bottleneck in networking. Small packets are latency dominated -- the interface cranks out packets as fast as it can, and typically bandwidth (such as it is) increases linearly with packet SIZE as R_l * P_s (max rate for latency bounded packets times packet size). 
Double the packet size, double the "bandwidth", but you just can't get any more pps through the interface/switch/interface combo (often with TCP stack on the side dominating the whole thing). What you're seeing around P_s = 1 KB is a crossover from latency dominated to bandwidth dominated (bottlenecked) traffic. You are approaching wire speed. As soon as this occurs you CAN'T continue spit out packets at the maximum rate so speed continues to increase linearly with packet size as the wire simply won't hold any more bps during the time you are using it. This causes latency to "increase" (or rather, the packet rate to decrease) as packet delivery starts to be delayed by the sheer time required to load the packet onto the wire at the maximum rate, not the time required to put the packet together and initiate transmission. The result is a near-exponential saturation curve exhibiting linear growth saturating at wirespeed less all sorts of cumulative overhead and retransmissions and other inefficiencies, typically about 10 MBps data transmission (around 90% of the theoretical limit after allowing for mandatory headers) for 100BT TCP/IP although this varies a LOT with the quality of your NIC and switch and wiring and protocol/stack. At one time fragmenting a packet stream of messages each just larger than the MTU caused one to fall off to a performance region that was once again latency dominated and cost one a "jump" down in bandwidth, but in recent years this jump has been small to absent as NIC latencies have dropped so even fragmented/split packets are still in the bandwidth dominated region where bandwidth is nearly saturated and slowly varying. > in summary: I believe network shared memory is simply not a great computing > model. if I was supervising a thesis project, I'd probably try to steer > the student towards something vaguely like Linda... I'm not sure I agree with this. 
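The two regimes described above, linear latency-bound growth for small packets and saturation at wire speed for large ones, can be captured in a toy model. This is my own sketch with illustrative numbers (80 us per-packet overhead, 100 Mbit/s wire), not measurements from the post:

```python
def throughput_bytes_per_sec(packet_bytes, latency_sec=80e-6, wire_bps=100e6):
    """Toy model: each packet costs a fixed per-packet overhead (latency)
    plus its serialization time on the wire."""
    wire_bytes_per_sec = wire_bps / 8.0
    time_per_packet = latency_sec + packet_bytes / wire_bytes_per_sec
    return packet_bytes / time_per_packet

if __name__ == "__main__":
    for size in (64, 512, 1024, 4096, 65536):
        mbps = throughput_bytes_per_sec(size) * 8 / 1e6
        print(f"{size:6d} B packets: {mbps:6.1f} Mbit/s effective")
```

Doubling a small packet nearly doubles effective throughput, while large packets approach (but never reach) the wire speed, reproducing the near-exponential saturation curve sketched in the discussion.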
There have certainly been major CS projects (like Duke's Trapeze) that have been devoted to creating a reasonably transparent network-based large memory model because there ARE problems (or at least have been problems) where there is a need for very large virtual memory spaces but disk-based swap is simply too slow. A FAST network (not ethernet, and not TCP/IP) with 5 usec or so latency and 100 MBps scale bandwidth, large B, can still beat up disk swap for certain (fairly random or at least nonlocal) memory access patterns. You assert that these patterns can be avoided by careful design and could be right. However, there is some virtue in having a general purpose magic-wand level tool where accessing more memory than you have kicks in a transparent mechanism for distributing the memory and runtime-optimizing its access -- basically creating an additional level of memory speed in the memory speed hierarchy -- so users don't HAVE to code for a particular size or architecture. I do think that simply providing "networked swap" through the existing VM is unlikely to be a great solution (although it might "work" with tools already in the kernel for at least testing purposes). The VM is almost certainly tuned for a single, highly expensive level of memory outside of physical DRAM, and there are too many orders of magnitude between DRAM and disk latencies and bandwidths for the tuning to "work" correctly for an intermediary network layer with very different advantages and nonlinearities. Then there is the page issue which if nothing else might require tuning or might favor certain hardware (capable of jumbo packets that can hold a full page) or both (different tunings for different hardware). What I would recommend is that the student talk to somebody like Jeff Chase at Duke and look over the literatures on existing and past projects that have addressed the issue. 
They'll need to quote Jeff's and the other people's work anyway in any sort of a sane dissertation, and they are by far the best people to tell them if the idea is still extant and worthy of further work (perhaps built on top of their base, perhaps not) or not. It is also wise to (as they are apparently doing) find a few people with applications that would "use a huge VM tomorrow" if it existed as a mainstream option in (say) an OTC distribution or even a specialized scyld or homebrew kernel. Weather, cosmology, there are a few "enormous" problems where researchers ALWAYS want to work bigger than current physical limits and budgets permit and can still get useful results even with the penalties imposed by disk or other VM extensions. For those workers, a "cluster" might be a single compute node and a farm of VM extension nodes providing some sort of CC-NUMA across the aggregate memory space for just the one core processor. If you have the problem in hand, it makes developing and testing a meaningful solution a whole lot easier... Or, as you say Mark, 64 bit systems may rapidly make the issue at least technically moot. COST nonlinearities, though, might still make a distributed/cluster DRAM VM attractive, as it might well be cheaper to buy a farm of 16 2 GB systems even with myrinet or sci interconnects to get to ballpark of 32 GB VM than it could be to buy a motherboard and all the memory capable of running 32 GB native on a single board. They won't sell a lot of the latter (at least at first), and they'll charge through the nose for developing them... and then there are the folks that would say fine, how about a network of 32 32 GB nodes to let me get to a TB of transparent VM? Indeed, one COULD argue that this is an idea that ONLY makes sense (again) now that there are 64 bit systems (and address spaces, kernels, compiler support) proliferating. 
Kernels that can use (mostly) all the memory one can put NOW on a 32 bit board have only recently come along, and grafting a de facto 64 bit address space onto a 32 bit architecture to permit accessing more than 4 GB of VM with no existing compiler or kernel support would be immensely painful and/or special library ("message passing" to slaves that do nothing but lookup or store data) evil (which may be why trapeze more or less stopped). 64 bit architectures could revive it again, especially while the aforementioned nonlinear price break between 64 bit (but still 2-4 GB limited on the cheap end) motherboards and 64 bit (but capable of holding 32 GB or more, expensive) motherboards holds. An Opteron is hardly more expensive than a plain old high end Athlon at similar levels of memory filling, and even a kilodollar scale premium for low-latency interconnects could still keep the price well (an order of magnitude?) below a large memory box for quite a while. rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rburns at cs.ucsb.edu Wed May 28 20:18:10 2003 From: rburns at cs.ucsb.edu (Ryan Burns) Date: Wed, 28 May 2003 17:18:10 -0700 Subject: DSM solution? Message-ID: <20030528171810.40420b18.rburns@cs.ucsb.edu> Hello, I'm looking for a transparent DSM solution. I'm trying to create a suite of software that allows users to run applications designed for multiprocessor computers to be run on a cluster. My problem is a little bit unusual, so let me state what I need to do: I need to be able to specify which node each thread/process runs on. Each thread/process needs to be able to access specific hardware on each node. 
For example, each thread might need to display to its own video card.
Each thread/process needs to read shared memory from one specific
thread/process which is part of the user application. So imagine I take
a user application that I know nothing about, fork it a few times onto
other nodes, and then each node needs to read memory from that original
process.

Another example would be: I have a program that gets input from a user
by some form or another and then displays a photo or something for each
input it receives, and I want each copy of this program on each node to
read that same input. I can't access that input because I don't know
what it is, but the copy applications do.

So my situation isn't quite as drastic as it seems: I do know a lot
about the user app, I just don't know anything about its data types. So
far none of the solutions I've looked into seem to do what I want. I'm
thinking that when openMosix is able to migrate shared-memory apps, I
will be able to use some of their code to solve my problem.

Basically I'm looking for an easy way out. I don't want to write any
custom kernel-level code to solve the problem if I don't have to. If
anyone knows of any transparent DSM solutions, please let me know.

Thanks,
Ryan Burns

From lindahl at keyresearch.com Thu May 29 03:15:12 2003
From: lindahl at keyresearch.com (Greg Lindahl)
Date: Thu, 29 May 2003 00:15:12 -0700
Subject: DSM solution?
In-Reply-To: <20030528171810.40420b18.rburns@cs.ucsb.edu>
References: <20030528171810.40420b18.rburns@cs.ucsb.edu>
Message-ID: <20030529071512.GB1496@greglaptop.greghome.keyresearch.com>

On Wed, May 28, 2003 at 05:18:10PM -0700, Ryan Burns wrote:
> I'm looking for a transparent DSM solution.
If you look around, you'll find 30+ projects that have produced DSM with varying degrees of transparency. You'll find that performance is quite low for a very transparent solution -- so it works OK only for programs which are nearly embarrassingly parallel. > Each thread/process needs to read shared memory from 1 specific > thread/process which is part of the user application. So imagine I take a > user application, that I know nothing about, fork it a few times onto > other nodes, then each node needs to read memory from that original > process. If your user application wrote and read files (or named pipes) instead of using shared memory, you could produce something probably much higher in performance, with much less work. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu May 29 03:12:26 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 29 May 2003 00:12:26 -0700 Subject: Network RAM revisited In-Reply-To: References: Message-ID: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> On Wed, May 28, 2003 at 07:29:52PM -0400, Robert G. Brown wrote: > Weather, cosmology, there are a few > "enormous" problems where researchers ALWAYS want to work bigger than > current physical limits and budgets permit and can still get useful > results even with the penalties imposed by disk or other VM extensions. Yes, but those are bad examples. The more memory you use in a 72-hour weather forecast (more memory means more input data, higher resolution output), the more cpu time it takes to make the computation -- the cpu time increases as memory**(4/3). In reality, weather forecasting is actually limited by our ability to insert fudge factors for local physics, i.e. things that aren't resolved on the grid. You need ~ 7 cells in 1 km to resolve a thunder head. 
Current production runs are at 10-20km cells. Those pesky weather satellites produce plenty of high-resolution input. So memory size is not a problem, and the data is big enough that you can use 100-200 cpus with Myrinet. Not too shabby. Now if you have a serial code, sure, you might be able to use network RAM. But parallelizing a weather code is old hat. By the way, Opteron mobos make your arguments about big memory motherboards a bit obsolete... there are 2 cpu motherboards with 2x4 = 8 total DIMM slots. I'm not sure what 4 cpu motherboards will do, but it might be 4x4 = 16. These 2 cpu motherboards are not that expensive... the Khapri is gold-plated, but its competition is not. Bigger DIMMs are expensive, but their prices are always falling. I always thought it was kind of a shame that motherboards used to have such low limits on memory. While I was at UVa I wanted to build a small cluster with 1 Tbyte of memory for big-memory apps. The figure of merit was that the cluster was going to have a total cost less than 2x the price of just the memory. If Digital had made a low end Alpha motherboard that could address big memory, I could have done it, but their low end chipset didn't carry out enough address pins. Grrr. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rburns at cs.ucsb.edu Thu May 29 03:37:03 2003 From: rburns at cs.ucsb.edu (Ryan Burns) Date: Thu, 29 May 2003 00:37:03 -0700 Subject: DSM solution? In-Reply-To: <20030529071512.GB1496@greglaptop.greghome.keyresearch.com> References: <20030528171810.40420b18.rburns@cs.ucsb.edu> <20030529071512.GB1496@greglaptop.greghome.keyresearch.com> Message-ID: <20030529003703.79c516ab.rburns@cs.ucsb.edu> Thanks for the info. Performance was one of my concerns, I should have figured something as transparent as I was looking for wouldn't be very fast.
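Greg's files-or-named-pipes alternative is easy to prototype. The sketch below is a hypothetical single-host Python illustration of the pattern (all names are invented, and it is not Ryan's actual application): a chunk of application state is passed through a FIFO instead of distributed shared memory.

```python
import os
import tempfile
import threading

# Pass a chunk of application state through a named pipe (FIFO)
# instead of shared memory. All names here are illustrative.
fifo_path = os.path.join(tempfile.mkdtemp(), "appdata.fifo")
os.mkfifo(fifo_path)

received = []

def consumer():
    # open() blocks until the producer opens the write end.
    with open(fifo_path, "rb") as f:
        received.append(f.read())

reader = threading.Thread(target=consumer)
reader.start()

# The "user application" side: write a state snapshot and close,
# which delivers EOF to the reader.
with open(fifo_path, "wb") as f:
    f.write(b"state snapshot")

reader.join()
print(received[0])
```

Across a cluster the FIFO would be replaced by a socket or a file on a shared filesystem, but the producer/consumer shape stays the same.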
Looks like my only path is going to lead me to hacking more libraries. If I can get the useful information another way, I can then send it without using DSM. Thanks again, Ryan On Thu, 29 May 2003 00:15:12 -0700 "Greg Lindahl" wrote: > On Wed, May 28, 2003 at 05:18:10PM -0700, Ryan Burns wrote: > > > I'm looking for a transparent DSM solution. > > If you look around, you'll find 30+ projects that have produced DSM > with varying degrees of transparency. You'll find that performance is > quite low for a very transparent solution -- so it works OK only for > programs which are nearly embarrassingly parallel. > > > Each thread/process needs to read shared memory from 1 specific > > thread/process which is part of the user application. So imagine I > > take a user application, that I know nothing about, fork it a few > > times onto other nodes, then each node needs to read memory from that > > original process. > > If your user application wrote and read files (or named pipes) instead > of using shared memory, you could produce something probably much > higher in performance, with much less work. 
> > -- greg > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu May 29 03:58:26 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 29 May 2003 00:58:26 -0700 Subject: Network RAM revisited In-Reply-To: References: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> Message-ID: <20030529075826.GA1740@greglaptop.greghome.keyresearch.com> On Thu, May 29, 2003 at 12:43:52AM -0700, Joel Jaeggli wrote: > 2GB dimms reg ecc dimms seem to still be a factor of 8 or so more > expensive than 1gb dimms. but spending ~$13000 or so per node for memory > makes spending $600 for a mainboard and $400ea for cpu's seem like a > comparitvly minor part of the cost. even with niceish cases to protect > your expensive ram, and a blown out myrinet setup you could probably slip > in under 1.25-1.33x the cost of the ram. I wasn't talking about 1.33x the cost of expensive ram, the object was 1.2x the cost of reasonable ram. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Thu May 29 03:43:52 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Thu, 29 May 2003 00:43:52 -0700 (PDT) Subject: Network RAM revisited In-Reply-To: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> Message-ID: 2GB dimms reg ecc dimms seem to still be a factor of 8 or so more expensive than 1gb dimms. 
but spending ~$13000 or so per node for memory makes spending $600 for a mainboard and $400ea for cpu's seem like a comparatively minor part of the cost. even with niceish cases to protect your expensive ram, and a blown out myrinet setup you could probably slip in under 1.25-1.33x the cost of the ram. joelja On Thu, 29 May 2003, Greg Lindahl wrote: > On Wed, May 28, 2003 at 07:29:52PM -0400, Robert G. Brown wrote: > > > Weather, cosmology, there are a few > > "enormous" problems where researchers ALWAYS want to work bigger than > > current physical limits and budgets permit and can still get useful > > results even with the penalties imposed by disk or other VM extensions. > > Yes, but those are bad examples. The more memory you use in a 72-hour > weather forecast (more memory means more input data, higher resolution > output), the more cpu time it takes to make the computation -- the cpu > time increases as memory**(4/3). In reality, weather forecasting is > actually limited by our ability to insert fudge factors for local > physics, i.e. things that aren't resolved on the grid. You need ~ 7 > cells in 1 km to resolve a thunder head. Current production runs are > at 10-20km cells. Those pesky weather satellites produce plenty of > high-resolution input. So memory size is not a problem, and the data > is big enough that you can use 100-200 cpus with Myrinet. Not too > shabby. > > Now if you have a serial code, sure, you might be able to use network > RAM. But parallelizing a weather code is old hat. > > By the way, Opteron mobos make your arguments about big memory > motherboards a bit obsolete... there are 2 cpu motherboards with 2x4 = > 8 total DIMM slots. I'm not sure what 4 cpu motherboards will do, but > it might be 4x4 = 16. These 2 cpu motherboards are not that > expensive... the Khapri is gold-plated, but its competition is not. > Bigger DIMMs are expensive, but they are always falling.
> > I always thought it was kind of a shame that motherboards used to have > such low limits on memory. While I was at UVa I wanted to build a > small cluster with 1 Tbyte of memory for big-memory apps. The figure > of merit was that the cluster was going to have a total cost less than > 2x the price of just the memory. If Digital had made a low end Alpha > motherboard that could address big memory, I could have done it, but > their low end chipset didn't carry out enough address pins. Grrr. > > -- greg > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kim.branson at csiro.au Thu May 29 08:55:26 2003 From: kim.branson at csiro.au (Kim Branson) Date: Thu, 29 May 2003 22:55:26 +1000 Subject: Network RAM revisited In-Reply-To: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> Message-ID: > . Current production runs are > at 10-20km cells. Those pesky weather satellites produce plenty of > high-resolution input. So memory size is not a problem, and the data > is big enough that you can use 100-200 cpus with Myrinet. Not too > shabby. > Interesting, is weather forecasting limited by computational resources and not data availability? 
So if an organization had better resources with the same set of input data, could they perform simulations that would be better at forecasting than a less well endowed one? Or is it quantity and quality of data? Are there machines capable of simulating 1km cells, and is there data for these? Presumably this data varies from region to region, with some areas having more sensor data than others? cheers Kim > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From d.l.farley at larc.nasa.gov Thu May 29 09:35:54 2003 From: d.l.farley at larc.nasa.gov (Doug Farley) Date: Thu, 29 May 2003 09:35:54 -0400 Subject: Gigabit performance issues and NFS Message-ID: <5.0.2.1.2.20030529092240.02aece48@pop.larc.nasa.gov> Fellow Wulfers, I know this isn't 100% wulf related, although it is part of my wulf's setup, but this is the best forum where everyone has a lot of good experience. Well, here's the deal: I have a nice 2TB Linux file server with an Intel e1000 based nic in it. And I have an SGI O3 (master node) that is dumping to it with a tigon series gigabit card. I've tuned both, and my ttcp and netpipe performance average ~ 80-95MB/s which is more than reasonable for me. Both the fibre channel on my SGI and the raid (3ware) on my Linux box can write at 40MB/s sustained, read is a little faster for both maybe ~ 50MB/s sustained. I can get ftp/http transfers between the two to go at 39-40MB/s, which again I'm reasonably happy with. BUT, the part that is killing me is nfs and scp. Both crawl in at around 8-11MB/s with no other devices on the network. Any exports from the SGI I've exported with the 32bitclients flag, and I've pumped my r&wsize windows up to 32K, and forced nfs v3 on both Linux and Irix.
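Spelled out, that client-side tuning amounts to mount options along these lines (a sketch using standard Linux NFS client mount options of the era; the server name and export path are made up, and 32768-byte windows are the 32K figure above):

```
# /etc/fstab on the Linux client -- hypothetical host and paths
fileserver:/export/data  /mnt/data  nfs  nfsvers=3,rsize=32768,wsize=32768,hard,intr  0  0
```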
After spending a week scouring the web I've found nothing that has worked, and SGI support thinks it's a Linux NFS problem, which could be, but I'd like to get the opinion of this crowd in hopes of some light! Thanks! Doug Farley ============================== Doug Farley Data Analysis and Imaging Branch Systems Engineering Competency NASA Langley Research Center < D.L.FARLEY at LaRC.NASA.GOV > < Phone +1 757 864-8141 > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From erwan at mandrakesoft.com Mon May 26 04:29:32 2003 From: erwan at mandrakesoft.com (Erwan Velu) Date: 26 May 2003 10:29:32 +0200 Subject: Opteron O/S Support In-Reply-To: <3.0.3.32.20030516134703.012e5988@popd.ix.netcom.com> References: <3.0.3.32.20030516134703.012e5988@popd.ix.netcom.com> Message-ID: <1053937771.1430.24.camel@revolution.mandrakesoft.com> Le ven 16/05/2003 à 22:47, Michael Huntingdon a écrit : > Just wondering whether anyone could tell me if there has been any public Redhat support statement for 64-bit Opteron? A target date perhaps? If you need a ready-to-run Opteron Linux Distro, MandrakeSoft released its 9.0 on March 14th.
http://www.mandrakesoft.com/company/community/mandrakesoftnews/news?n=/mandrakesoft/news/2414 -- Erwan Velu Linux Cluster Distribution Project Manager MandrakeSoft 43 rue d'aboukir 75002 Paris Phone Number : +33 (0) 1 40 41 17 94 Fax Number : +33 (0) 1 40 41 92 00 Web site : http://www.mandrakesoft.com OpenPGP key : http://www.mandrakesecure.net/cks/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From victor_ms at bol.com.br Tue May 27 17:43:13 2003 From: victor_ms at bol.com.br (Victor Lima) Date: Tue, 27 May 2003 17:43:13 -0400 Subject: Beowulf digest, Vol 1 #1311 - 1 msg In-Reply-To: <200305271901.h4RJ1gd00794@NewBlue.Scyld.com> References: <200305271901.h4RJ1gd00794@NewBlue.Scyld.com> Message-ID: <3ED3DBF1.2060704@bol.com.br> Why don't you try ssh on port 22? []'s Victor beowulf-request at scyld.com wrote: >Send Beowulf mailing list submissions to > beowulf at beowulf.org >To subscribe or unsubscribe via the World Wide Web, visit > http://www.beowulf.org/mailman/listinfo/beowulf >or, via email, send a message with subject or body 'help' to > beowulf-request at beowulf.org >You can reach the person managing the list at > beowulf-admin at beowulf.org >When replying, please edit your Subject line so it is more specific >than "Re: Contents of Beowulf digest..." > > >Today's Topics: > > 1. enabling rlogin (cassian d'cunha) > >--__--__-- > >Message: 1 >From: "cassian d'cunha" >Date: Tue, 27 May 2003 13:49:05 -0400 (EDT) >Subject: enabling rlogin >To: > >Hi, > >I am quite a newbie as far as clusters are concerned. I can't get rlogin >to work on a cluster that has scyld Beowulf based on redhat 6.2. I can >rlogin and telnet to other machines, but not to the host from any other >machine. It also doesn't have the /etc/inetd.conf file where I would >normally enable (server) rlogin, telnet, etc.
> >Any help on enabling rlogin would be greatly appreciated. > >Thanks, >Cassian. > > > > > >--__--__-- > >_______________________________________________ >Beowulf mailing list >Beowulf at beowulf.org >http://www.beowulf.org/mailman/listinfo/beowulf > > >End of Beowulf Digest > > > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sean.hubbell at sed.redstone.army.mil Tue May 27 08:17:37 2003 From: sean.hubbell at sed.redstone.army.mil (Hubbell, Sean (GDIS)) Date: Tue, 27 May 2003 07:17:37 -0500 Subject: Question for a beginner Message-ID: <8A293F96C86DD51193B00008C716A4FE03C1BA0E@sedexch2.sed.redstone.army.mil> Hello, I am fairly new to building a Beowulf cluster. I was wondering if someone could point me to a reliable place to research the requirements for a single computer to use in a cluster (i.e. 128 MB DRAM, 60 GB ...). Thanks for your time, Sean Sean C. Hubbell Senior Software Engineer General Dynamics C4 Systems Software Engineering Directorate _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From sureamsam1 at yahoo.com Sun May 25 17:08:09 2003 From: sureamsam1 at yahoo.com (SAM) Date: Sun, 25 May 2003 14:08:09 -0700 (PDT) Subject: Starting out Message-ID: <20030525210809.68489.qmail@web14710.mail.yahoo.com> I want to make a small cluster, with Linux SuSE 8.2... I don't know much about it, and would like to know if any of you could help, or if there are any books or websites that would help? Thanks a lot. BTW: Is there any way to do it with windows without spending thousands? Thank you. Sam Seifert
From joelja at darkwing.uoregon.edu Thu May 29 12:00:59 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Thu, 29 May 2003 09:00:59 -0700 (PDT) Subject: Network RAM revisited In-Reply-To: Message-ID: On Thu, 29 May 2003, Kim Branson wrote: > > > . Current production runs are > > at 10-20km cells. Those pesky weather satellites produce plenty of > > high-resolution input. So memory size is not a problem, and the data > > is big enough that you can use 100-200 cpus with Myrinet. Not too > > shabby. > > > > Interesting, is weather forecasting limited by computational resources > and not data availability? So if an organization > had better resources with the same set of input data they could perform > simulations that would be better at forecasting than a less well > endowed resource?, or is it quantity and quality of data? Are there > machines capable of simulating 1km cells, and is there data for these, > presumably this data varies from region to region, with some areas > having more sensor data than others? actually it doesn't because the data is collected by satellite, and distributed by NOAA... > cheers > > > Kim > > > > > > > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first.
-- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From joelja at darkwing.uoregon.edu Thu May 29 13:23:11 2003 From: joelja at darkwing.uoregon.edu (Joel Jaeggli) Date: Thu, 29 May 2003 10:23:11 -0700 (PDT) Subject: Network RAM revisited In-Reply-To: <20030529075826.GA1740@greglaptop.greghome.keyresearch.com> Message-ID: On Thu, 29 May 2003, Greg Lindahl wrote: > On Thu, May 29, 2003 at 12:43:52AM -0700, Joel Jaeggli wrote: > > > 2GB dimms reg ecc dimms seem to still be a factor of 8 or so more > > expensive than 1gb dimms. but spending ~$13000 or so per node for memory > > makes spending $600 for a mainboard and $400ea for cpu's seem like a > > comparitvly minor part of the cost. even with niceish cases to protect > > your expensive ram, and a blown out myrinet setup you could probably slip > > in under 1.25-1.33x the cost of the ram. > > I wasn't talking about 1.33x the cost of expensive ram, the object was > 1.2x the cost of reasonable ram. yeah but what was reasonable ram costing when you targeted doing it with alphas... if you put only 8GB in them, then you have to buy twice as many machines... some off-the-cuff calculations I did last night peg the price of a 64-node 1TB dual opteron cluster at right around a million bucks... which looks like kind of a fantastic deal, in light of how much it would have cost to do that with ia64 three months ago.
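Joel's off-the-cuff number checks out against the per-part figures quoted earlier in the thread (all of them rough 2003 guesses by the posters, not vendor quotes):

```python
# 64 dual-Opteron nodes, 16 GB each (8 x 2 GB registered ECC DIMMs)
# for 1 TB total. Prices are the thread's rough figures.
nodes = 64
memory_per_node = 13000   # Joel's ~$13k/node for the 2 GB DIMMs
mainboard = 600
cpus = 2 * 400            # two Opterons at ~$400 each

base_cost = nodes * (memory_per_node + mainboard + cpus)
print(base_cost)  # 921600 -- cases and interconnect push it toward $1M
```

Memory alone is ~90% of the total, which is why Greg's "total cost under some small multiple of the memory price" figure of merit was achievable here.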
joelja > -- greg > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------------- Joel Jaeggli Academic User Services joelja at darkwing.uoregon.edu -- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -- In Dr. Johnson's famous dictionary patriotism is defined as the last resort of the scoundrel. With all due respect to an enlightened but inferior lexicographer I beg to submit that it is the first. -- Ambrose Bierce, "The Devil's Dictionary" _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From josip at lanl.gov Thu May 29 13:16:04 2003 From: josip at lanl.gov (Josip Loncaric) Date: Thu, 29 May 2003 11:16:04 -0600 Subject: Network RAM revisited In-Reply-To: References: Message-ID: <3ED64054.6030401@lanl.gov> > On Thu, 29 May 2003, Kim Branson wrote: >> >>Interesting, is weather forecasting limited by computational resources >>and not data availability? So if an organization >>had better resources with the same set of input data they could perform >>simulations that would be better at forcasting than a less well >>endowed resource?, or is it quantity and quality of data? Are there >>machines capable of simulating 1km cells, and is there data for these, >>presumably this data varies from region to region, with some areas >>having more sensor data than others? Regional-scale modeling is being done in various places. 
For example, see the work of Lloyd Treinish and his group at IBM -- their "Deep Thunder" project predicts local weather using mesoscale high resolution (1 km) simulations, which get initialized and updated based on synoptic scale (40 km resolution) data from the National Weather Service. http://www.research.ibm.com/weather/DT.html Of course, IBM uses their own hardware (RS/6000) for this... Sincerely, Josip P.S. Weather model initialization from diverse sources of collected data is an interesting problem. See the Data Assimilation Office at NASA Goddard (http://polar.gsfc.nasa.gov/) for more detail. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From msnitzer at lnxi.com Thu May 29 14:09:02 2003 From: msnitzer at lnxi.com (Mike Snitzer) Date: Thu, 29 May 2003 12:09:02 -0600 Subject: Buying a Beowulf Cluster (Help) In-Reply-To: ; from alvin@Maggie.Linux-Consulting.com on Tue, May 27, 2003 at 02:04:51PM -0700 References: Message-ID: <20030529120902.A6946@lnxi.com> On Tue, May 27 2003 at 15:04, Alvin Oga wrote: > > On Sat, 24 May 2003, Shashank Khanvilkar wrote: > > > > > 1. Opinions on the OS to be installed on the cluster: We have decided on REd > > > Hat 7.3, however suggestions are welcome. (We are not going for RH 8/9 > > > because of some known compiler (Intel and portland) problems.. If anyone has > > > knowledge abt this, please let me know). > > gcc problems will be across the board .. > - old gcc on new hw or new gcc on new hw > - old gcc on old hw or new gcc on old hw > - you will have problems ( glibc + gcc-x.y problems ) > > - there's probably more open source support for new gcc w/ new hw Upgrading rh7.3 to gcc 3.2 (e.g. get rh8.0's gcc 3.2-7 to work under rh7.3) is not hard. RH7.3 is known to be quite stable; any issues that rh7.3 does have can be worked around. 
> i think to build new boxes based on old distro is a bad idea, > since it'd run into old known bugs that has since been fixed > in the newer distro > - given known bugs and features requirements, i'd build on the > latest/greatest stuff A well-maintained RedHat 7.3, with a custom kernel and updates/fixes, is very well suited for production clusters. The same cannot be said of rh8.0 or rh9.0 (even though it is possible, it requires more effort and there are more HPC software incompatibilities). > - yes, you might get the new bugs in the new systems/distro .. > but you will also get old bugs in old distro and a lot smaller > group of open source folks addressing those older issues This is true for specific packages; but in the case of rh9.0 there are some serious issues associated with the transition to NPTL. Given RedHat's new policy on the standard redhat linux (bleeding edge churn), standard RH will continue to be a source of varying degrees of instability (by design; RH wants the $$$ for RHEL). Not to mention the extremely short window of time RedHat will support the standard RH. So, all that said, something needs to give: either you stay current with the RedHat churn or you get creative with a different Linux solution. For a do-it-yourself cluster guru sticking with the RH churn may be acceptable, but for a production cluster that isn't _really_ an option.
Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu May 29 14:43:51 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 29 May 2003 11:43:51 -0700 Subject: Network RAM revisited In-Reply-To: References: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> Message-ID: <20030529184351.GD1326@greglaptop.internal.keyresearch.com> On Thu, May 29, 2003 at 10:55:26PM +1000, Kim Branson wrote: > Interesting, is weather forecasting limited by computational resources > and not data availability? Yes and no. Groups share data with each other. But there are 2 things needed to scale the computation: the first is raw compute power, and the second is better representation of sub-cell physics. The second is not a matter of computer power, it's a matter of a lot of R&D, which is helped by more compute power (more test runs) but still takes a lot of time. Another way that raw compute power can help out is that you can run ensemble forecasts, which is a bunch of separate runs with slightly perturbed input data. By the way, I goofed up my scaling in an earlier post: the amount of compute power used goes up by the resolution**4 (3 spatial dimensions, 1 time). So going from 40km to 20km uses 16x the computation, roughly. And no, there is no current machine capable of doing 1km forecasts.
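The corrected scaling is easy to sanity-check (the function name is mine; the exponent is Greg's resolution**4):

```python
# Compute cost grows as (old_cell / new_cell)**4: three spatial
# dimensions, plus a time step that shrinks with the cell size.
def relative_cost(old_km, new_km):
    return (old_km / new_km) ** 4

print(relative_cost(40, 20))  # 16.0 -- halving the cell size costs 16x
print(relative_cost(20, 1))   # 160000.0 -- why 1 km runs were out of reach
```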
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From lindahl at keyresearch.com Thu May 29 14:20:12 2003 From: lindahl at keyresearch.com (Greg Lindahl) Date: Thu, 29 May 2003 11:20:12 -0700 Subject: Network RAM revisited In-Reply-To: References: <20030529075826.GA1740@greglaptop.greghome.keyresearch.com> Message-ID: <20030529182012.GA1326@greglaptop.internal.keyresearch.com> On Thu, May 29, 2003 at 10:23:11AM -0700, Joel Jaeggli wrote: > yeah but what was reasonable ram costing when you targeted doing it with > alphas... This was in the EV56 era, so Alphas used normal DRAM. Even with EV6 machines, if the chipset addressing hadn't _again_ been crippled, purchasing large amounts of 200 pin DIMMs gives a price much closer to commodity 168 pin DIMMs -- once I bought 256 Gbytes in one shot, and that wasn't so bad. The main number that had to be right was memory per motherboard. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From waitt at saic.com Thu May 29 16:16:36 2003 From: waitt at saic.com (Tim Wait) Date: Thu, 29 May 2003 16:16:36 -0400 Subject: Network RAM revisited References: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> <20030529184351.GD1326@greglaptop.internal.keyresearch.com> Message-ID: <3ED66AA4.2020508@saic.com> Greg Lindahl wrote: > And no, there is no current machine capable of doing 1km forecasts. For traditional hydrostatic models on rectilinear meshes, maybe. Non-hydrostatic models on adaptive unstructured meshes are a whole 'nother story. Of course, that means implementing different physics for cells less than about 5-10 km. But certainly doable.
Tim _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From josip at lanl.gov Thu May 29 16:56:23 2003 From: josip at lanl.gov (Josip Loncaric) Date: Thu, 29 May 2003 14:56:23 -0600 Subject: Network RAM revisited In-Reply-To: <20030529184351.GD1326@greglaptop.internal.keyresearch.com> References: <20030529071226.GA1496@greglaptop.greghome.keyresearch.com> <20030529184351.GD1326@greglaptop.internal.keyresearch.com> Message-ID: <3ED673F7.2090203@lanl.gov> Greg Lindahl wrote: > > And no, there is no current machine capable of doing 1km forecasts. Not on the global level, nor the national level, nor even at any sizable state level. However, this has been demonstrated at city level (e.g. Atlanta during the 1996 Olympic Games and more recently NYC were/are covered by IBM's "Deep Thunder" project, which can quite reliably do short term forecasts with 1km resolution). Sincerely, Josip _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu May 29 18:34:22 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 29 May 2003 15:34:22 -0700 Subject: Cheap PCs from Wal-Mart Message-ID: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov> For those of you looking to build a cluster on the (real) cheap, Walmart has mailorder PCs, with Lindows (a Linux variant) installed for $200 (plus sales tax and shipping, of course). I just bought one of these for my daughter (with WinXP, for $300.. 
I guess the MS license is $100) and while it's no ball of fire, and the keyboard and mouse are what you'd expect for a $200 computer, it DOES work ok..at least WinXP hasn't crashed in the last week except when my daughter tried to play a Win95 (only) Disney game on it. The exact configuration seems to be changing.. last fall it was a Via C3 800MHz, now it's an Athlon 1.1GHz.. anyway, 128MB, 20GB disk, CD-ROM (read only), Realtek on board 10/100 ethernet, S3 video, no floppy,... The fan is quite loud in the power supply. For someone looking to "play" with Beowulfery and set up, say 3 or 4 nodes, this seems to be a pretty low budget way to do it. I haven't checked, but I'll bet you can buy a 5 port Linksys switch from Walmart as well as the plug strips and network cables. A 4 node system for under $1000 is a very real possibility. I ordered mine online on Tuesday and UPS tried to deliver it on Friday, so it's even an "impulse buy" candidate. For what it's worth, they are made by MicroTelPC http://www.microtelpc.com/ I'd just try shoving a Linux CDROM in to see if it boots, but I'm afraid that WinXP will die... I might just get another one, bare, to try it out. James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kim.branson at csiro.au Thu May 29 19:11:04 2003 From: kim.branson at csiro.au (Kim Branson) Date: Fri, 30 May 2003 09:11:04 +1000 Subject: Network RAM revisited In-Reply-To: <3ED64054.6030401@lanl.gov> Message-ID: On Friday, May 30, 2003, at 03:16 AM, Josip Loncaric wrote: > > Josip > > > P.S. Weather model initialization from diverse sources of collected > data is an interesting problem. 
See the Data Assimilation Office at > NASA Goddard (http://polar.gsfc.nasa.gov/) for more detail. > Great thanks for that link. I've recently started to move into this area; my problem (computational drug design) is beginning to combine different data from a range of different scoring (protein-ligand affinity estimation) functions, drug-likeness, toxicology and other estimations. After discovering I'd reinvented the wheel, since I didn't think of the obvious keyword search for Data Fusion, I can see this sort of problem arises in many areas. It took an undergrad to point me in the direction of the journal of data fusion... cheers Kim _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From msnitzer at lnxi.com Thu May 29 19:02:55 2003 From: msnitzer at lnxi.com (Mike Snitzer) Date: Thu, 29 May 2003 17:02:55 -0600 Subject: Cheap PCs from Wal-Mart In-Reply-To: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov>; from James.P.Lux@jpl.nasa.gov on Thu, May 29, 2003 at 03:34:22PM -0700 References: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov> Message-ID: <20030529170255.A5293@lnxi.com> On Thu, May 29 2003 at 16:34, Jim Lux wrote: > I'd just try shoving a Linux CDROM in to see if it boots, but I'm afraid > that WinXP will die... I might just get another one, bare, to try it out. Jim, You could always boot a Knoppix cd; it will enable you to run linux out of a ramdisk and has no need to touch the harddisk.
Mike _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Maggie.Linux-Consulting.com Thu May 29 19:19:39 2003 From: alvin at Maggie.Linux-Consulting.com (Alvin Oga) Date: Thu, 29 May 2003 16:19:39 -0700 (PDT) Subject: Cheap PCs from Wal-Mart In-Reply-To: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov> Message-ID: hi ya cheap PCs can be gotten almost anywhere ??? doesnt have to be walmart/circuit city/emachines/etc

$ 30   cheap pc case ( that makes the PC their widget )
$ 70   generic motherboard w/ onboard nic, onboard svga
$ 70   Celeron-1.7G 478pin fsb400 cpu
$ 25   128MB pc-133
$ 25   50x cdrom
$ 60   20GB ide disk
----   -------------
$ 280  grand total

$ 25   oem ms license

mb, cpu, disks can be lot lower in $$ if you use p3 and pc-133 memory via series mb w/ p3-800 is about $85 total ( subtract ~ $60 from above ) same cost estimates for amd duron/athlon based systems you can save the shipping by buying locally... and might be zero sales tax in some states too stuff all that into a 1U chassis and add $100 - $250 extra ... and take out the cost of the "generic midtower case" and if there's a problem w/ the pc, i'd hate to worry about how to return it and get a better box back or is it, as typically the case, that they'd simply send out a different returned PC .. since its a warranty replacement, they dont have to send you a brand new pc like they would have to with a new order On Thu, 29 May 2003, Jim Lux wrote: > For those of you looking to build a cluster on the (real) cheap, Walmart > has mailorder PCs, with Lindows (a Linux variant) installed for $200 (plus > sales tax and shipping, of course). > > I just bought one of these for my daughter (with WinXP, for $300..
I guess > the MS license is $100) and while it's no ball of fire, and the keyboard > and mouse are what you'd expect for a $200 computer, it DOES work ok..at magic ! have fun alvin _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From kim.branson at csiro.au Thu May 29 19:13:49 2003 From: kim.branson at csiro.au (Kim Branson) Date: Fri, 30 May 2003 09:13:49 +1000 Subject: Network RAM revisited In-Reply-To: <3ED66AA4.2020508@saic.com> Message-ID: <348954E2-922B-11D7-B274-000A9579AE94@csiro.au> On Friday, May 30, 2003, at 06:16 AM, Tim Wait wrote: > Greg Lindahl wrote: >> And no, there is no current machine capable of doing 1km forecasts. > > For traditional hydrostatic models on rectilinear meshes, maybe. > Non-hydrostatic models on adaptive unstructred meshes are a whole > 'nother story. Of course, that means implementing different physics > for cells less than about 5-10 km. But certainly doable. > So by 'adaptive unstructured' you mean the size of the grid cell in the region varies depending on how much data you have? lots of data -> smaller cells. So how does one deal with the edges of the cells? Sounds a little bit like the spatial decomposition approach in molecular dynamics. kim _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu May 29 19:46:21 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 29 May 2003 16:46:21 -0700 Subject: Cheap PCs from Wal-Mart References: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov> Message-ID: <5.2.0.9.2.20030529162915.02e8f250@mailhost4.jpl.nasa.gov> Alvin's right, there are other approaches... 
However, allow me to point out the following: - $200 for a Linux machine is cheaper than $280, albeit for less performance. I have shopped around the local stores looking for a similar pricepoint, and nobody is interested in stocking components that provide the low level of performance at the low price point Wal-Mart is selling at. - The Walmart special requires ZERO labor to assemble and test, and is guaranteed good out of the box (they'll pay return shipping)...As far as DOA rates go... I'll bet Walmart wouldn't tolerate a 5% DOA rate, since their corporate reputation rests on "non-sophisticated users" who "take it out of the box and plug it in" - One probably cannot buy a *LEGAL* copy of WinXP for $25 as an end user. One might be able to negotiate such a price from a dealer, but I'll bet the OEM License agreement *REQUIRES* the dealer to install it on a working system. Microsoft is no fool when it comes to extracting money, which is probably why WalMart is charging $100+ for the WinXP - Buying locally over the counter merely moves the shipping cost from the vendor to you. How much is your time worth? FWIW, Walmart charged me $14 for shipping, which is quite reasonable. - Legally, you have to pay sales tax (or, alternately use tax) in California regardless. Sure, some out of state vendors may be remiss in collecting it, but, then, you're then *legally responsible* for paying the use tax. Likewise, if you use your "resale permit" to claim you're buying for resale as an OEM (probably also how you'd finagle the oem Windows license), you're responsible for the tax. When it comes to tax, the government is amazingly tenacious... And, of course, if you were buying a low-buck Beowulf for an educational purpose (i.e. to give a local middle school a chance to work with a cluster), they're not going to hassle the shady tax and/or shipping and/or licensing issues. 
If you're just hacking a cluster in your living room or garage, have no money, and lots of time, then, by all means, meet the guy in the alley, buy the bare motherboards for cash, hot glue them to a piece of plywood and lash them up.... Clearly, if one had $1000 to spend on raw computing, there are better ways to invest it than buying 4 WalMart specials and a switch. I was looking at this as an example of a very inexpensive way to put together a cluster without having to build your own PCs. At 04:19 PM 5/29/2003 -0700, you wrote: >hi ya > >cheap PCs can be gotten almost anywhere ??? doesnt have to >be walmart/circuit city/emachines/etc > >$ 30 cheap pc case ( that makes the PC their widget ) >$ 70 generic motherboard w/ onboard nic, onboard svga >$ 70 Celeron-1.7G 478pin fsb400 cpu >$ 25 128MB pc-133 >$ 25 50x cdrom >$ 60 20GB ide disk >---- ------------- >$ 280 grand total > >$ 25 oem ms license > >mb, cpu, disks can be lot lower in $$ if you use p3 and pc-133 meory > >via series mb w/ p3-800 is about $85 total ( subtract ~ $60 from above ) > >same cost estimates for amd duron/athlon based systems > >you can save the shipping by bying locally... >and might be zero sales tax in some states too > >stuff all that into a 1U chassis and add $100 - $250 extra ... >and take out the cost of the "generic midtower case" > >and if there's a problem w/ the pc, i'd hate to worry about how >to return it and get a better box back or is it, as typically the case, >that they'd simply send out a different returned PC .. since its >a warranty replacement, they dont have to send you a brand new pc >like they would have to with a new order > >On Thu, 29 May 2003, Jim Lux wrote: > > > For those of you looking to build a cluster on the (real) cheap, Walmart > > has mailorder PCs, with Lindows (a Linux variant) installed for $200 (plus > > sales tax and shipping, of course). > > > > I just bought one of these for my daughter (with WinXP, for $300.. 
I guess > > the MS license is $100) and while it's no ball of fire, and the keyboard > > and mouse are what you'd expect for a $200 computer, it DOES work ok..at > >magic ! > >have fun >alvin James Lux, P.E. Spacecraft Telecommunications Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Maggie.Linux-Consulting.com Thu May 29 20:23:09 2003 From: alvin at Maggie.Linux-Consulting.com (Alvin Oga) Date: Thu, 29 May 2003 17:23:09 -0700 (PDT) Subject: Cheap PCs from Wal-Mart In-Reply-To: <5.2.0.9.2.20030529162915.02e8f250@mailhost4.jpl.nasa.gov> Message-ID: hi ya jim On Thu, 29 May 2003, Jim Lux wrote: > However, allow me to point out the following: > - $200 for a Linux machine is cheaper than $280, albeit for less > performance. I have shopped around the local stores looking for a similar > pricepoint, and nobody is interested in stocking components that provide > the low level of performance at the low price point Wal-Mart is selling at. some local pc stores carry the parts .. off the shelf ... - you will have a hard time finding a $30 pc case though and rest of the components are easy to find ( even at frys(fries) ) its probably better to buy a cpu/mb/disk/memory/ps that you know and like vs experimenting with "cheap parts" ... i always get burnt trying to save $5 - $10 by buying generic but buying "bad" name-brand stuff doesnt help either > - The Walmart special requires ZERO labor to assemble and test, and is > guaranteed good out of the box (they'll pay return shipping)...As far as > DOA rates go... I'll bet Walmart wouldn't tolerate a 5% DOA rate, since > their corporate reputation rests on "non-sophisticated users" who "take it > out of the box and plug it in" yes... 
a good thing about buying a complete box > - One probably cannot buy a *LEGAL* copy of WinXP for $25 as an end user. > One might be able to negotiate such a price from a dealer, but I'll bet the > OEM License agreement *REQUIRES* the dealer to install it on a working > system. Microsoft is no fool when it comes to extracting money, which is > probably why WalMart is charging $100+ for the WinXP those ms OEM deals are tightly controlled based on number of systems one sells vs number of ms cdrom only sold, etc...etc.. - most of the pc stores selling w/ windoze xp preinstalled are just paying for the rights to sell n-machines ... and their ms license fee is significantly discounted... ( ask for a copy of ms xp cdrom and watch them squirm and ( turn red ... same for laptops > - Buying locally over the counter merely moves the shipping cost from the > vendor to you. How much is your time worth? FWIW, Walmart charged me $14 > for shipping, which is quite reasonable. shipping for large entities is cheap ... shipping for 1 system from small 1-z and 2-z companies is "regular pricing" .. roughly $5/lb of shipping .. ( ground shipping is little less about 1/2 but takes 7 days in transit ) > - Legally, you have to pay sales tax (or, alternately use tax) in > California regardless. Sure, some out of state vendors may be remiss in > collecting it, but, then, you're then *legally responsible* for paying the > use tax. Likewise, if you use your "resale permit" to claim you're buying > for resale as an OEM (probably also how you'd finagle the oem Windows > license), you're responsible for the tax. When it comes to tax, the > government is amazingly tenacious... nasa.gov should be tax exempt, even if a cali corp sells a bunch of pc to nasa.gov some groups of stanford.edu is tax exempt also and yes, collecting and paying on sales tax is important.. we say resale permit vs paperwork needed is not worth the hassle.. 
its 10x cheaper to pay the 8.25% cali sales tax for 1-z 2-z orders > And, of course, if you were buying a low-buck Beowulf for an educational > purpose (i.e. to give a local middle school a chance to work with a > cluster), they're not going to hassle the shady tax and/or shipping and/or > licensing issues. those are probably can all be tax exempt except for MS part of it have fun alvin _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From waitt at saic.com Thu May 29 20:42:25 2003 From: waitt at saic.com (Tim Wait) Date: Thu, 29 May 2003 20:42:25 -0400 Subject: Network RAM revisited References: <348954E2-922B-11D7-B274-000A9579AE94@csiro.au> Message-ID: <3ED6A8F1.2080507@saic.com> >> For traditional hydrostatic models on rectilinear meshes, maybe. >> Non-hydrostatic models on adaptive unstructred meshes are a whole >> 'nother story. Of course, that means implementing different physics >> for cells less than about 5-10 km. But certainly doable. >> > So by 'adaptive unstructured' you mean the size of the grid cell in the > region varies depending on how much data you have? > lots of data -> smaller cells. So how does one deal with the edges of > the cells? Sounds a little bit like the spatial decomposition approach > in molecular dynamics. No, the resolution is refined over areas of interest. For met, this would be areas with significant pressure gradients, vertical motion etc. The drawback to higher resolution in atmospheric modelling is that the timesteps are based on the edge length (to keep it stable) so you get a big computational hit. The advantage to using adaptive unstructured grids is that you can have low resolution over regions that don't have any significant processes occuring, yet refine over areas that do - or retain high res over areas over a particular region of interest... 
ie flow over a specific area. Another big plus with atmospheric modelling is that you can easily do global runs with local high res and avoid boundary condition discontinuities. See http://vortex.atgteam.com for more info. The global grid on there at present is from before we had global adaptation working in parallel. I'll try to get a global adaptation case up there tomorrow, done for the OK tornado outbreak a few weeks ago. Mostly dated stuff there, and it's on a 10 Mb network segment, so be patient. The Hurricane Floyd run will give you an idea of what I'm talking about. Higher resolution -> better track, but don't waste cycles on areas that don't affect the solution significantly. The problem with traditional models is that they used fixed rectilinear grids. They do nesting to get the resolution up, but that wastes a lot of cycles where nothing is happening. Sure, you can make a nice 1 km nest over a city (Josip) but that isn't going to help much if the supercell or whatever was formed outside of that region and advects in. Tim _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu May 29 23:12:58 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 29 May 2003 20:12:58 -0700 Subject: Network RAM revisited References: <348954E2-922B-11D7-B274-000A9579AE94@csiro.au> Message-ID: <000a01c32659$5fcfbae0$02a8a8c0@office1> There's also a similar thing being done in FDTD (finite difference time domain) simulation in electromagnetics. You have unstructured grids of varying densities, over which you explicitly solve Maxwell's equations.
----- Original Message ----- From: "Kim Branson" To: "Tim Wait" Cc: Sent: Thursday, May 29, 2003 4:13 PM Subject: Re: Network RAM revisited > > On Friday, May 30, 2003, at 06:16 AM, Tim Wait wrote: > > > Greg Lindahl wrote: > >> And no, there is no current machine capable of doing 1km forecasts. > > > > For traditional hydrostatic models on rectilinear meshes, maybe. > > Non-hydrostatic models on adaptive unstructred meshes are a whole > > 'nother story. Of course, that means implementing different physics > > for cells less than about 5-10 km. But certainly doable. > > > So by 'adaptive unstructured' you mean the size of the grid cell in the > region varies depending on how much data you have? > lots of data -> smaller cells. So how does one deal with the edges of > the cells? Sounds a little bit like the spatial decomposition approach > in molecular dynamics. > > kim > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From brian.dobbins at yale.edu Thu May 29 22:57:37 2003 From: brian.dobbins at yale.edu (Brian Dobbins) Date: Thu, 29 May 2003 22:57:37 -0400 (EDT) Subject: Questions on x86-64/32 kernel w/ large arrays.. Message-ID: Hi guys, Could anyone with a lot more kernel knowledge than myself fill me in...? Recalling that with 32-bit systems, the default linux behaviour was to load system stuff around the 1GB mark, making it impossible to (statically) allocate arrays larger than that without modifying the kernel source. (See: http://www.pgroup.com/faq/execute.htm#2GB_mem ) .. 
Now with the x86-64 kernel, as supplied by GinGin64, in 'include/asm-x86-64/processor.h', I see the following: #define TASK_UNMAPPED_32 0xa0000000 #define TASK_UNMAPPED_64 (TASK_SIZE/3) #define TASK_UNMAPPED_BASE \ ((current->thread.flags & THREAD_IA32) ? TASK_UNMAPPED_32 : TASK_UNMAPPED_64) .. Does this mean that in 32-bit mode on the Opteron, I automatically get bumped up from the 1GB limit to nearly 2.5GB (0xa0000000)? And, more importantly, since the OS itself is in 64-bit mode, can I alter this setting to allow myself to have very nearly (or all!) 4GB of space for a static allocation for a 32-bit executable? (Ordinarily I'd run in 64-bit mode, but someone I know is looking to port their code from the old Digital FORTRAN to the Intel compiler, which limits us to a 32-bit executable.) Any ideas? Thoughts? Am I totally off base here? (I don't poke around the kernel much!) I'll most likely send this to the kernel lists if I don't get any luck here, but since it's not urgent, I thought I'd ask the very knowledgeable people here first. :-) Cheers, - Brian _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From James.P.Lux at jpl.nasa.gov Thu May 29 23:23:27 2003 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Thu, 29 May 2003 20:23:27 -0700 Subject: Cheap PCs from Wal-Mart References: Message-ID: <001501c3265a$d61ad260$02a8a8c0@office1> > > the low level of performance at the low price point Wal-Mart is selling at. > > some local pc stores carry the parts .. off the shelf ... > - you will have a hard time finding a $30 pc case though > and rest of the components are easy to find > ( even at frys(fries) ) > > its probably better to buy a cpu/mb/disk/memory/ps that you know and like > vs experimenting with "cheap parts" ... 
i always get burnt trying to save > $5 - $10 by buying generic but buying "bad" name-brand stuff doesnt help > either That's the deal with Walmart.. someone else (Microtel, in this case) has done the assembly and checkout. As for Fry's... don't get me started.. just suffice it to say that major manufacturers (i.e. HP/Compaq/etc) do not provide Fry's with original mfr shipping materials to replace packaging that gets dinged in transit. Most retail outlets get a supply of original mfr packaging so that they can deliver a "clean looking box" with the new computer, even if somebody in receiving spills a soda on the pallet on the loading dock (e.g.) > > - Buying locally over the counter merely moves the shipping cost from the > > vendor to you. How much is your time worth? FWIW, Walmart charged me $14 > > for shipping, which is quite reasonable. > > shipping for large entities is cheap ... > > shipping for 1 system from small 1-z and 2-z companies is "regular > pricing" .. roughly $5/lb of shipping .. ( ground shipping is little > less about 1/2 but takes 7 days in transit ) That was the default cheap ground shipping.. in my case, the mfr is reasonably local, so UPS ground will usually get here in 2 days. > > > - Legally, you have to pay sales tax (or, alternately use tax) in > > California regardless. Sure, some out of state vendors may be remiss in > > collecting it, but, then, you're then *legally responsible* for paying the > > use tax. Likewise, if you use your "resale permit" to claim you're buying > > for resale as an OEM (probably also how you'd finagle the oem Windows > > license), you're responsible for the tax. When it comes to tax, the > > government is amazingly tenacious... > > nasa.gov should be tax exempt, even if a cali corp sells a bunch of pc to > nasa.gov JPL, and NASA, do have an exemption, for some things (but not as much as you might think... tax laws aren't "logical"). However, I bought this little box for my daughter, so I'm on the hook. 
> > some groups of stanford.edu is tax exempt also Maybe, maybe not.. California is quite enthusiastic about sales tax, particularly since Prop 13 made property taxes a much smaller source of revenue for local government. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From alvin at Maggie.Linux-Consulting.com Fri May 30 00:14:12 2003 From: alvin at Maggie.Linux-Consulting.com (Alvin Oga) Date: Thu, 29 May 2003 21:14:12 -0700 (PDT) Subject: Cheap PCs from Wal-Mart In-Reply-To: <001501c3265a$d61ad260$02a8a8c0@office1> Message-ID: hi ya jim On Thu, 29 May 2003, Jim Lux wrote: > That's the deal with Walmart.. someone else (Microtel, in this case) has > done the assembly and checkout. sometimes its worth it to buy pre-assembled systems even if one has inhouse staff > As for Fry's... don't get me started.. i see you're familiar with "the fun of buying at Fries" :-) > just suffice it to say that major > manufacturers (i.e. HP/Compaq/etc) do not provide Fry's with original mfr > shipping materials to replace packaging that gets dinged in transit. Most > retail outlets get a supply of original mfr packaging so that they can > deliver a "clean looking box" with the new computer, even if somebody in > receiving spills a soda on the pallet on the loading dock (e.g.) yes.. when buying from distributors... one does NOT get to return items even if it's "slightly" broken ... thats part of the "we sell new equipment at wholesale and no support" ... works out nicely .. usually... ( know what it is you want .. manufacturer and precise model# ) yes, good for hp and all manufacturers to NOT receive parts back after it left their doors... if you want their 3 year intel/amd warranty, it should straight back to intel/amd ... 
and they can decide to send a replacement ( new or used ) with the same original warranty period - it's not cheap to honour those warranty policies cpus are rated at 30,000 hrs MTBF ( ~ 5yrs ) http://www.linux-1u.net/CPU/ disks are rated at 1,000,000 hrs MTBF ( ~ 150 yrs ) ( must be a marketing based support/warranty plan ?? ) yes .. recycled parts .... one just inherits someone elses pc that didn't work right and also got returned only to be repackaged and sent back out to another ... > That was the default cheap ground shipping.. in my case, the mfr is > reasonably local, so UPS ground will usually get here in 2 days. hand delivery is good .... here's your boxes ... may i, please have the check too, pretty please ... :-) have fun alvin _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From andrewxwang at yahoo.com.tw Fri May 30 01:10:00 2003 From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=) Date: Fri, 30 May 2003 13:10:00 +0800 (CST) Subject: Scalable PBS (beta) now available for download. Message-ID: <20030530051000.38178.qmail@web16805.mail.tpe.yahoo.com> http://www.supercluster.org/projects/pbs Andrew.
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rbw at ahpcrc.org Fri May 30 10:02:14 2003 From: rbw at ahpcrc.org (Richard Walsh) Date: Fri, 30 May 2003 09:02:14 -0500 Subject: Network RAM revisited Message-ID: <200305301402.h4UE2EA10722@mycroft.ahpcrc.org> Greg Lindahl wrote: >By the way, I goofed up my scaling in an earlier post: the amount of >compute power used goes up by the resolution**4 (3 spatial dimensions, >1 time). So going from 40km to 20km uses 16x the computation, roughly. >And no, there is no current machine capable of doing 1km forecasts. We have recently run 5km MM5 models of the entire US on our Cray X1 and after the upgrades later this year will be trying higher resolutions. At full scale (1024 MSPs or more) do you think the Cray X1 will be able to deliver 1km forecasts in less than 24 hours? I am not much of a weather guy, but would like to know what you think. rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at physics.mcmaster.ca Fri May 30 11:23:14 2003 From: hahn at physics.mcmaster.ca (Mark Hahn) Date: Fri, 30 May 2003 11:23:14 -0400 (EDT) Subject: Questions on x86-64/32 kernel w/ large arrays.. In-Reply-To: Message-ID: > Recalling that with 32-bit systems, the default linux behaviour was to > load system stuff around the 1GB mark, making it impossible to I'm being persnickety, but the details are somewhat illuminating: it's not "system stuff", but rather just the mmap arena.
there are, after all, three arenas: the traditional sbrk/malloc heap, growing up from the end of program text; the stack, growing down from 3G, and the mmap arena, which has to fit in there somewhere (and grows up by default). you might think of it as "system stuff" just because you probably notice shared libraries, which are mmaped, in /proc/<pid>/maps. yes, it's true: a totally static program can avoid any use of mmap, and therefore get the whole region for heap or stack! caveats: the last time I tried this, static libc stdio mmaped a single page. also, there exist patches to tweak this in two ways: you can change TASK_UNMAPPED_BASE (one even makes it a sysctl), and you can make the mmap arena grow down (if you only ever need an 8M stack, this makes very good sense). another alternative would probably be to abandon the sbrk heap, and use a malloc implementation that was entirely based on allocating arenas using mmap. actually, this sounds like a pretty good idea - glibc could probably be hacked to do this, since it already uses mmap for large allocations... > (See: http://www.pgroup.com/faq/execute.htm#2GB_mem ) technically, the kernel merely starts mmaps at TASK_UNMAPPED_BASE - it's ld.so (userspace) which uses that for shlibs. actually, it occurs to me that you could probably do some trickery wherein you did say a 1.5 GB mmap *before* ld.so starts grubbing around, then munmap it when ld.so's done. that would let the heap expand all the way up to ~2.5GB, I think. > .. Now with the x86-64 kernel, as supplied by GinGin64, in > 'include/asm-x86-64/processor.h', I see the following: > > #define TASK_UNMAPPED_32 0xa0000000 > #define TASK_UNMAPPED_64 (TASK_SIZE/3) > #define TASK_UNMAPPED_BASE \ > ((current->thread.flags & THREAD_IA32) ? TASK_UNMAPPED_32 : > TASK_UNMAPPED_64) > > .. Does this mean that in 32-bit mode on the Opteron, I automatically > get bumped up from the 1GB limit to nearly 2.5GB (0xa0000000)? And, more that's the way I read it.
> importantly, since the OS itself is in 64-bit mode, can I alter this > setting to allow myself to have very nearly (or all!) 4GB of space for a > static allocation for a 32-bit executable? hmm, good question. just for background, the 3G limit (TASK_SIZE) is also not a hard limit - you can set it to 3.5, for instance. the area above TASK_SIZE is an "aperture" used by the kernel so that it can avoid switching address spaces. if you make it small, you'll probably run into problems on big-memory machines (page tables need to live in there, I think), and possibly IO to certain kinds of devices. offhand, I'd guess the x86-64 people are sensible enough to have figured out a way to avoid this, which is indeed a pretty cool advantage even for 32b tasks... I hope to have my own opteron to play with next week ;) regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Robert at jaspreet.org Thu May 29 19:51:38 2003 From: Robert at jaspreet.org (Robert) Date: Thu, 29 May 2003 16:51:38 -0700 Subject: Cheap PCs from Wal-Mart In-Reply-To: References: <5.2.0.9.2.20030529152525.02e8cb68@mailhost4.jpl.nasa.gov> Message-ID: <5.2.1.1.0.20030529163946.01d00a58@mail.toaster.net> Hello, these guys make small footprint 1U rackmountable low powered units and they can also be wall mounted, real cool units for clusters for starters. http://www.ironsystems.com/products/iservers/aclass/a110_low_power.htm for $499.. Rob At 04:19 PM 5/29/2003 -0700, Alvin Oga wrote: >hi ya > >cheap PCs can be gotten almost anywhere ???
doesnt have to be >walmart/circuit city/emachines/etc > >$ 30 cheap pc case ( that makes the PC their widget ) >$ 70 generic motherboard w/ onboard nic, onboard svga >$ 70 Celeron-1.7G 478pin fsb400 cpu >$ 25 128MB pc-133 >$ 25 50x cdrom >$ 60 20GB ide disk >---- ------------- >$ 280 grand total > >$ 25 oem ms license > >mb, cpu, disks can be lot lower in $$ if you use p3 and pc-133 meory > >via series mb w/ p3-800 is about $85 total ( subtract ~ $60 from above ) > >same cost estimates for amd duron/athlon based systems > >you can save the shipping by bying locally... >and might be zero sales tax in some states too > >stuff all that into a 1U chassis and add $100 - $250 extra ... >and take out the cost of the "generic midtower case" > >and if there's a problem w/ the pc, i'd hate to worry about how to return >it and get a better box back or is it, as typically the case, >that they'd simply send out a different returned PC .. since its a >warranty replacement, they dont have to send you a brand new pc >like they would have to with a new order > >On Thu, 29 May 2003, Jim Lux wrote: > > > For those of you looking to build a cluster on the (real) cheap, Walmart > > has mailorder PCs, with Lindows (a Linux variant) installed for $200 (plus > > sales tax and shipping, of course). > > > > I just bought one of these for my daughter (with WinXP, for $300.. I guess > > the MS license is $100) and while it's no ball of fire, and the keyboard > > and mouse are what you'd expect for a $200 computer, it DOES work ok..at > >magic ! 
> 
>have fun
>alvin
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
>http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From gerry.creager at tamu.edu Fri May 30 08:31:30 2003
From: gerry.creager at tamu.edu (Gerry Creager N5JXS)
Date: Fri, 30 May 2003 07:31:30 -0500
Subject: Cheap PCs from Wal-Mart
In-Reply-To: 
References: 
Message-ID: <3ED74F22.5070704@tamu.edu>

If you're "blessed" with a Fry's Electronics in your area, the $30 cases
are available. So are the $400 cases... If you're astute, Fry's usually
has the toys to do the $200 computer with only a time investment.
However, I agree with Jim: my time's worth something, and I'm running out
of grad students. If we get the proposed funding I'm looking at, I'll
hire a couple of kids to play node acolyte, but until then, it's not
likely we're going to have spare cycles to build and test hardware.

And tell me again, why would I want to get a M$ license? Especially on
my cluster?

As for the sales tax issue, if I don't present the appropriate paperwork,
Texas requires sales tax collection from in-state vendors. If I don't
provide the paperwork and tax is collected, the University will extract
that tax from ME(!) and not acknowledge the paperwork reduction as
cost-savings...

Gerry

Alvin Oga wrote:
> hi ya jim
> 
> On Thu, 29 May 2003, Jim Lux wrote:
> 
>>However, allow me to point out the following:
>>- $200 for a Linux machine is cheaper than $280, albeit for less
>>performance. I have shopped around the local stores looking for a similar
>>pricepoint, and nobody is interested in stocking components that provide
>>the low level of performance at the low price point Wal-Mart is selling at.
> 
> 
> some local pc stores carry the parts .. off the shelf ...
> - you will have a hard time finding a $30 pc case though
> and the rest of the components are easy to find
> ( even at frys(fries) )
> 
> its probably better to buy a cpu/mb/disk/memory/ps that you know and like
> vs experimenting with "cheap parts" ... i always get burnt trying to save
> $5 - $10 by buying generic, but buying "bad" name-brand stuff doesnt help
> either
> 
> 
>>- The Walmart special requires ZERO labor to assemble and test, and is
>>guaranteed good out of the box (they'll pay return shipping)... As far as
>>DOA rates go... I'll bet Walmart wouldn't tolerate a 5% DOA rate, since
>>their corporate reputation rests on "non-sophisticated users" who "take it
>>out of the box and plug it in"
> 
> 
> yes... a good thing about buying a complete box
> 
> 
>>- One probably cannot buy a *LEGAL* copy of WinXP for $25 as an end user.
>>One might be able to negotiate such a price from a dealer, but I'll bet the
>>OEM License agreement *REQUIRES* the dealer to install it on a working
>>system. Microsoft is no fool when it comes to extracting money, which is
>>probably why WalMart is charging $100+ for the WinXP
> 
> 
> those ms OEM deals are tightly controlled based on the number of systems
> one sells vs the number of ms cdrom-only copies sold, etc...etc..
> - most of the pc stores selling w/ windoze xp preinstalled are
> just paying for the rights to sell n-machines ... and their
> ms license fee is significantly discounted...
> ( ask for a copy of the ms xp cdrom and watch them squirm and
> turn red ... same for laptops )
> 
> 
>>- Buying locally over the counter merely moves the shipping cost from the
>>vendor to you. How much is your time worth? FWIW, Walmart charged me $14
>>for shipping, which is quite reasonable.
> 
> 
> shipping for large entities is cheap ...
> 
> shipping for 1 system from small 1-z and 2-z companies is "regular
> pricing" .. roughly $5/lb of shipping ..
> ( ground shipping is a little
> less, about 1/2, but takes 7 days in transit )
> 
> 
>>- Legally, you have to pay sales tax (or, alternately, use tax) in
>>California regardless. Sure, some out of state vendors may be remiss in
>>collecting it, but, then, you're *legally responsible* for paying the
>>use tax. Likewise, if you use your "resale permit" to claim you're buying
>>for resale as an OEM (probably also how you'd finagle the oem Windows
>>license), you're responsible for the tax. When it comes to tax, the
>>government is amazingly tenacious...
> 
> 
> nasa.gov should be tax exempt, even if a cali corp sells a bunch of pcs to
> nasa.gov
> 
> some groups at stanford.edu are tax exempt also
> 
> and yes, collecting and paying on sales tax is important..
> 
> we say the resale permit vs the paperwork needed is not worth the hassle..
> its 10x cheaper to pay the 8.25% cali sales tax for 1-z 2-z orders
> 
> 
>>And, of course, if you were buying a low-buck Beowulf for an educational
>>purpose (i.e. to give a local middle school a chance to work with a
>>cluster), they're not going to hassle the shady tax and/or shipping and/or
>>licensing issues.
> 
> 
> those probably can all be tax exempt except for the MS part of it
> 
> have fun
> alvin
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Gerry Creager -- gerry.creager at tamu.edu
Network Engineering -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578
Page: 979.228.0173
Office: 903A Eller Bldg, TAMU, College Station, TX 77843

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf