From diep at xs4all.nl Mon Apr 9 20:14:52 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 10 Apr 2012 02:14:52 +0200
Subject: [Beowulf] Infiniband Advice which functions to use for what purpose
Message-ID:

hi,

I'm trying to build a new InfiniBand communication model for Diep, and I need some advice on which function calls/libraries to use for the fastest possible communication over InfiniBand (Mellanox QDR) from one node to another. There are a lot of possibilities; which of them communicate fastest? I need two different types of communication, possibly three or more. I can still set up the model any way I want, so let's test the water:

a) Each node has a 1.5 GB cache, so that's 1.5 GB * n in total. Each core of each node randomly needs 192 bytes, and I don't know in advance which node holds them, nor where in the gigabytes of cache (hashtable) they sit. Which library and which function call is best for this? Realize that all 8 cores are busy; if I need to keep one core free just to handle requests from the other nodes, each machine slows down significantly because I lose a core.

b) For starting and stopping the different cores (on all nodes) in a decentralized manner, some variables are hard to keep decentralized; you want them broadcast to all nodes, somehow updating shared memory at the remote nodes so that the Mellanox card writes into RAM without interrupting the (probably 8) running cores and without needing any of them to handle the update. Is that possible somehow? If so, can all n-1 other nodes be updated with a single function call?

c) Memory migration: which possibilities are there? I probably need to build manual memory migration for when a specific job gets taken over by another node. Which function calls would you advise there, and is there documentation on how to implement memory migration efficiently? I need to migrate roughly 2 kilobytes at a time. This doesn't happen often, but the algorithms are so complex that I can't avoid it if I want the utmost performance, as I worked out on paper. And yes, I know there is software that already has this built in, but it is possibly too slow for what I need.

d) Atomic reads/writes/spinlocks over InfiniBand. There is probably a function to set a lock at a remote memory address; which one is it? Is there also a call that sets a lock and, once the lock succeeds, directly returns a bunch of bytes from a specific address (near the lock)? That would save me the usual procedure: first set the lock, then sit and wait until the lock is held, then issue the read. That means we ship a request from node A to B, then once the lock is set at B a confirmation goes back to A, and only then can A finally read its bytes at B, as it holds the lock. Is there a combined function that is faster than this, one that returns those bytes to A directly after it gets the lock at B?

e) When doing the spinlock from A, does the core on A that tries to set the lock at node B actually spin? My experience is that some implementations, instead of letting your core spin for a bunch of microseconds, put it to idle, which means it has to be woken up by the run queue again, to put it simply, and that in turn means a 10-30 millisecond delay before it has the data. Do cores get put in prison for up to 30 years when trying to set a lock with the function call from (d), or do I get both options, or am I that lucky?
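For (a), a minimal sketch of what this could look like with MPI-2.2 one-sided (passive-target RMA) operations, where the HCA can in principle satisfy the read without a core on the target node servicing it. The window size, entry layout and the random target/offset choice are illustrative assumptions, not Diep's actual layout:

#include <mpi.h>
#include <stdlib.h>

#define ENTRY_BYTES 192                        /* one hashtable probe (assumption)      */
#define TABLE_BYTES (1536UL * 1024 * 1024)     /* 1.5 GB slice per node; shrink to test */

int main(int argc, char **argv)
{
    int rank, nprocs;
    char *table, entry[ENTRY_BYTES];
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Every rank exposes its slice of the distributed hashtable as an RMA window. */
    MPI_Alloc_mem(TABLE_BYTES, MPI_INFO_NULL, &table);
    MPI_Win_create(table, TABLE_BYTES, 1 /* displacements in bytes */,
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* One probe: fetch 192 bytes from a pseudo-random node at a pseudo-random
       offset.  With a one-sided MPI_Get the transfer can be done by the HCA via
       RDMA, without a core on the target handling it (implementation permitting). */
    int target = rand() % nprocs;
    MPI_Aint offset = (MPI_Aint)(rand() % (int)(TABLE_BYTES / ENTRY_BYTES)) * ENTRY_BYTES;

    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);   /* open passive-target epoch */
    MPI_Get(entry, ENTRY_BYTES, MPI_BYTE,
            target, offset, ENTRY_BYTES, MPI_BYTE, win);
    MPI_Win_unlock(target, win);                     /* data valid after unlock   */

    MPI_Win_free(&win);
    MPI_Free_mem(table);
    MPI_Finalize();
    return 0;
}

MPI_Win_lock/MPI_Win_unlock is also the MPI-2.2 primitive closest to the remote lock asked about in (d). Whether a given library truly offloads these operations to the HCA or burns a host core on progress differs between implementations, so it is worth benchmarking against raw ibverbs (an ibv_post_send work request with opcode IBV_WR_RDMA_READ) before settling on either.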
Many thanks for taking a look at my questions, and even more to those responding!

Kind Regards,
Vincent

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Sun Apr 15 23:26:51 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Mon, 16 Apr 2012 05:26:51 +0200
Subject: [Beowulf] openmpi 2.2 standards and infiniband cards
Message-ID:

hi,

I was reading the MPI 2.2 standard and my eye fell on something amazing.

http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report.pdf

Chapter 11, "One-Sided Communications", page 339: "it is erroneous to have concurrent conflicting accesses to the same memory location in a window"

Does this mean that each update, either read or write, is in itself atomic over InfiniBand? In computer chess it can happen that we simply write and read to the same locations. This can of course result in garbled data. Most don't care; some, like me, store a CRC and care even less. The odds of it happening are relatively small, but it happens. About once every 200 billion operations, as I measured on an Origin 3800 (200 CPUs, 120 GB RAM), two writes coincide on the same location, resulting in garbage written at that specific cacheline, or at two consecutive cachelines sharing 20 bytes of data (usually this last case happens; on PC hardware only that case can occur, with entries garbled within one cacheline).

Now the actual reads are around 160 bytes, of which only 20 bytes get used, so the statistical odds are a lot larger than 1 in 200 billion that overlapping parts of RAM get requested by two or more cores at the same time, randomly somewhere in the cluster, and/or that writes of 20 bytes fall within that range.

What's actually happening in hardware here? The standard says further: "if a location is updated by a put or accumulate operation, then this location cannot be accessed by a load or another RMA operation until the updating operation has completed."

Well, it's going to happen; not much, but sometimes. Of course I don't care if there is some slowdown in that once-in-a-billion case where two or more cores write/read the same memory within the window, but I do care if normal operations get slowed down by this spec as given in MPI 2.2 :)

If remote cores read/write RAM (usually as different, non-overlapping RMA requests) by put/get of a random 20-160 bytes scattered through, say, a gigabyte of RAM of the receiving node, can the receiving node then issue those, say, half a dozen random lookups/writes to the gigabyte RAM buffer concurrently?

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
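To make the quoted restriction concrete: MPI-2.2 does not promise that concurrent conflicting puts/gets to the same window location come out intact, it simply declares them erroneous. Below is a minimal sketch of the two patterns the standard does sanction, assuming a window created with byte displacements as in the earlier sketch, with 'bucket' standing in for whatever offset two nodes might collide on:

#include <mpi.h>

/* Option 1: serialize the writers with an exclusive passive-target epoch.
   Conflicting MPI_Put/MPI_Get calls from other ranks are kept out until we
   unlock, at the cost of real lock traffic on the wire.                    */
void store_entry_locked(MPI_Win win, int target, MPI_Aint bucket,
                        long *val, int nvals)
{
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, target, 0, win);
    MPI_Put(val, nvals, MPI_LONG,
            target, bucket, nvals, MPI_LONG, win);
    MPI_Win_unlock(target, win);
}

/* Option 2: use MPI_Accumulate with MPI_REPLACE instead of MPI_Put.
   MPI-2.2 allows concurrent accumulates to the same location as long as
   they use the same predefined op and datatype, with the outcome as if the
   updates were applied in some serial order, element by element.  Two
   simultaneous writers can then interleave whole elements but never garble
   a single one, which is close to the "I store a CRC and care even less"
   behaviour described above.                                               */
void store_entry_atomic(MPI_Win win, int target, MPI_Aint bucket,
                        long *val, int nvals)
{
    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    MPI_Accumulate(val, nvals, MPI_LONG,
                   target, bucket, nvals, MPI_LONG, MPI_REPLACE, win);
    MPI_Win_unlock(target, win);
}

How much either pattern costs on a given fabric is implementation-specific; the standard only pins down which outcomes are defined, not how the HCA or the MPI library serializes the accesses, so non-conflicting put/get traffic should not be slowed down by the rule itself.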
From john.hearns at mclaren.com Tue Apr 17 11:26:23 2012
From: john.hearns at mclaren.com (Hearns, John)
Date: Tue, 17 Apr 2012 16:26:23 +0100
Subject: [Beowulf] Ubuntu MAAS
Message-ID: <207BB2F60743C34496BE41039233A8090D09C8DC@MRL-PWEXCHMB02.mil.tagmclarengroup.com>

I read a ZDNet article on Ubuntu LTS pitching to be your cloud and data centre distribution of choice. It mentions Ubuntu Metal-as-a-Service:

http://www.markshuttleworth.com/archives/1103
https://wiki.ubuntu.com/ServerTeam/MAAS/

I guess this is what clustering types have been doing for a long time with various cluster deployment and management suites.

Also note Mark Shuttleworth's comment about the cost of the OS per node:

"As we enter an era in which ATOM is as important in the data centre as XEON, an operating system like Ubuntu makes even more sense"

I guess this chimes with the original Beowulfery spirit: when you have low-cost nodes, why use an OS (whether Windows, Solaris, etc.) that is a significant fraction of the node's cost?

John Hearns | CFD Hardware Specialist | McLaren Racing Limited
McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK
T: +44 (0) 1483 262000  D: +44 (0) 1483 262352  F: +44 (0) 1483 261928
E: john.hearns at mclaren.com  W: www.mclaren.com

The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eagles051387 at gmail.com Tue Apr 17 11:43:59 2012
From: eagles051387 at gmail.com (Jonathan Aquilina)
Date: Tue, 17 Apr 2012 17:43:59 +0200
Subject: [Beowulf] Ubuntu MAAS
In-Reply-To: <207BB2F60743C34496BE41039233A8090D09C8DC@MRL-PWEXCHMB02.mil.tagmclarengroup.com>
References: <207BB2F60743C34496BE41039233A8090D09C8DC@MRL-PWEXCHMB02.mil.tagmclarengroup.com>
Message-ID: <4F8D8FBF.3040500@gmail.com>

Just an FYI: I know a developer, and he said this is still something new that Canonical are working on to replace Orchestra for provisioning. I don't think it's going to be ready for prime time until at least the next LTS. I would like to do some testing on this come summer, as it's a feature that interests me greatly.

On 4/17/12 5:26 PM, Hearns, John wrote:
>
> I read a ZDNet article on Ubuntu LTS pitching to be your cloud and
> data centre distribution of choice. It mentions Ubuntu Metal-as-a-Service:
>
> http://www.markshuttleworth.com/archives/1103
> https://wiki.ubuntu.com/ServerTeam/MAAS/
>
> I guess this is what clustering types have been doing for a long time
> with various cluster deployment and management suites.
>
> Also note Mark Shuttleworth's comment about the cost of the OS per node:
>
> "As we enter an era in which ATOM is as important in the data centre
> as XEON, an operating system like Ubuntu makes even more sense"
>
> I guess this chimes with the original Beowulfery spirit: when you have
> low-cost nodes, why use an OS (whether Windows, Solaris, etc.) that is
> a significant fraction of the node's cost?
>
> John Hearns | CFD Hardware Specialist | McLaren Racing Limited
> McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK
> T: +44 (0) 1483 262000  D: +44 (0) 1483 262352  F: +44 (0) 1483 261928
> E: john.hearns at mclaren.com  W: www.mclaren.com
>
> The contents of this email are confidential and for the exclusive use
> of the intended recipient. If you receive this email in error you
> should not copy it, retransmit it, use it or disclose its contents but
> should return it to the sender immediately and delete your copy.
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From raysonlogin at gmail.com Tue Apr 17 20:06:23 2012
From: raysonlogin at gmail.com (Rayson Ho)
Date: Tue, 17 Apr 2012 20:06:23 -0400
Subject: [Beowulf] 2 Security bugs fixed in Grid Engine
Message-ID:

Two security-related bugs were fixed and released in Grid Engine today:

- Code injection via LD_* environment variables
- sgepasswd buffer overflow

Oracle fixed both of them in their CPU (Critical Patch Update) release for Oracle Grid Engine this afternoon. For Sun Grid Engine (6.2u5) and Open Grid Scheduler/Grid Engine, visit:

http://gridscheduler.sourceforge.net/security.html

The first one was found by William Hay back in November 2011, and the second was reported to Oracle by an outside security researcher. The details of the bug were passed on to me, and we (all the Grid Engine forks) decided that we should share any security-related information instead of putting it in marketing slides.

Download patches and pre-compiled binaries for:

- SGE 6.2u5, 6.2u5p1, 6.2u5p2
- Open Grid Scheduler/Grid Engine 2011.11

from the URL above. To apply the patches, just replace the older versions of the binaries with the newer versions.

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From prentice at ias.edu Wed Apr 18 11:05:05 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Wed, 18 Apr 2012 11:05:05 -0400
Subject: [Beowulf] Questions about upgrading InfiniBand
Message-ID: <4F8ED821.5000204@ias.edu>

Beowulfers,

I'm planning on adding some upgrades to my existing cluster, which has 66 compute nodes plus the head node. Networking consists of a Cisco 7012 IB switch with 6 of its 12 line cards installed, giving me a capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet switches that have only six spare ports between them.

I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, and then begin adding/replacing nodes in the cluster. Obviously, I'll need to increase the capacity of both my IB and ethernet networks. The questions I have are about upgrading my InfiniBand.

1. It looks like QLogic is out of the InfiniBand business. Is Mellanox the only game in town these days?

2. Due to the size of my cluster, it looks like buying just a core/enterprise IB switch with capacity for ~100 ports is the best option (I don't expect my cluster to grow much bigger than this in the next 4-5 years). Based on that criterion, it looks like the Mellanox IS5100 is my only option. Am I overlooking other options?

http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=71&menu_section=49

3. In my searching yesterday, I didn't find any FDR core/enterprise switches with more than 36 ports, other than the Mellanox SX6536. At 648 ports, the SX6536 is too big for my needs. I've got to be overlooking other products, right?

http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=122&menu_section=49

4. Adding an additional line card to my existing switch looks like it will cost me only ~$5,000 and give me the additional capacity I'll need for the next 1-2 years. I'm thinking it makes sense to do that, wait for affordable FDR switches with the port count I'm looking for instead of upgrading to QDR right now, and start buying hardware with FDR HCAs in preparation. Please feel free to agree/disagree. This brings me to my next question...

5. FDR and QDR should be backwards compatible with my existing DDR hardware, but how exactly does that work? If I have, say, an FDR switch with a mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to the lowest common denominator, or will the slowdown be based only on the two nodes involved in the communication? When I googled for an answer, all I found were marketing documents that guaranteed backwards compatibility but didn't go into this level of detail. I searched the standard spec (v1.2.1) and didn't find an obvious answer either.

6. I see some Mellanox docs saying their FDR switches are compliant with v1.3 of the standard, but the latest version available for download is 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is that correct?

--
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From Shainer at Mellanox.com Wed Apr 18 11:27:21 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Wed, 18 Apr 2012 15:27:21 +0000 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: <4F8ED821.5000204@ias.edu> References: <4F8ED821.5000204@ias.edu> Message-ID: > Beowulfers, > > I'm planning on adding some upgrades to my existing cluster, which has > 66 compute nodes pluss the head node. Networking consists of a Cisco > 7012 IB switch with 6 out of 12 line cards installed, giving me a capacity of 72 > DDR ports, expandable to 144, and two 40-port ethernet switches that have > only six extra ports between them. > > I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, and then > begin adding/replacing nodes in the cluster. Obviously, I'll need to increase > capacity of both my IB and ethernet networks. The questions I have are > about upgrading my InifiniBand. > > 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox the only > game in town these days? Intel bought the QLogic InfiniBand business so this is a second option > 2. Due to the size of my cluster, it looks like buying a just a core/enterprise IB > switch with capacity for ~100 ports is the best option (I don't expect my > cluster to go much bigger than this in the next 4-5 years). Based on that > criteria, it looks like the Mellanox > IS5100 is my only option. Am I over looking other options? You can also take 36 port switches, few more cables, and build the desired network size (for example for Fat Tree topology). It is easy to do, might be more cost effective. If you need help to design the topology (which ports connects to which port, I can send you a description). With this option, you can also do any kind of oversubscription if you want to. > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ > family=71&menu_section=49 > > 3. In my searching yesterday, I didn't find any FDR core/enterprise switches > with > 36 ports, other than the Mellanox SX6536. At 648 ports, the SX6536is > too big for my needs. I've got to be over looking other products, right? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ > family=122&menu_section=49 More options are getting out now. 324-port version will be available in a week, and the 216 few weeks after. Before the summer that 108 will be released. > 4. Adding an additional line card to my existing switch looks like it will cost > me only ~$5,000, and give me the additional capacity I'll need for the next 1- > 2 years. I'm thinking it makes sense to do that, and wait for affordable FDR > switches to come out with the port count I'm looking for instead of > upgrading to QDR right now, and start buying hardware with FDR HCAs in > preparation for that. Please feel free to agree/disagree. This brings me to my > next question... Depends what you want to build. You can take FDR today, build 2:1 oversubscription to get "QDR" throughput and this will be cheaper than using QDR switches. In any case, if you need any help on the negotiation side, let me know. > 5. FDR and QDR should be backwards compatible with my existing DDR > hardware, but how exactly does work? If I have, say an FDR switch with a > mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to the > lowest-common denominator, or will the slow-down be based on the two > nodes involved in the communication only? 
When I googled for an answer, > all I found were marketing documents that guaranteed backwards > compatibility, but didn't go to this level of detail, I searched the standard > spec (v1.2.1), and didn't find an obvious answer to this question. You can mix and match anything on the InfiniBand side. You can connect SDR, DDR, QDR and FDR and it all will work. When you do that, a direct connection between 2 ports will be run at the common denominator. So if you have FDR port connected to FDR port directly, it will run FDR. If you have DDR port connected directly to FDR port, that connection will run DDR. In your case, part of the fabric will run FDR, part will run DDR. > 6. I see some Mellanox docs saying their FDR switches are compliant with > v1.3 of the standard, but the latest version available for download is 1.2.1. I > take it the final version of 1.3 hasn't been ratified yet. Is that correct? 1.3 is the IBTA spec that includes FDR and EDR. The spec is completed, but not on the web site yet. > -- > Prentice > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Wed Apr 18 14:59:25 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 18 Apr 2012 14:59:25 -0400 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: References: <4F8ED821.5000204@ias.edu> Message-ID: <4F8F0F0D.5000500@ias.edu> Gilad, Thanks for the quick, helpful responses. See my in-line comments below. On 04/18/2012 11:27 AM, Gilad Shainer wrote: >> Beowulfers, >> >> I'm planning on adding some upgrades to my existing cluster, which has >> 66 compute nodes pluss the head node. Networking consists of a Cisco >> 7012 IB switch with 6 out of 12 line cards installed, giving me a capacity of 72 >> DDR ports, expandable to 144, and two 40-port ethernet switches that have >> only six extra ports between them. >> >> I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, and then >> begin adding/replacing nodes in the cluster. Obviously, I'll need to increase >> capacity of both my IB and ethernet networks. The questions I have are >> about upgrading my InifiniBand. >> >> 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox the only >> game in town these days? > Intel bought the QLogic InfiniBand business so this is a second option I searched both the QLogic and Intel websites for 'InfiniBand", and neither returned any hits yesterday. It makes sense that you can't find any IB info on QLogic's site anymore. Today, I was able to find the Link for Intel TrueScale InfiniBand products. Intel did a good job of hiding/burying the link under "More Products" on their Products pull-down menu. No idea why I couldn't find it by searching yesterday. Typo in search box, maybe? > >> 2. Due to the size of my cluster, it looks like buying a just a core/enterprise IB >> switch with capacity for ~100 ports is the best option (I don't expect my >> cluster to go much bigger than this in the next 4-5 years). 
Based on that >> criteria, it looks like the Mellanox >> IS5100 is my only option. Am I over looking other options? > You can also take 36 port switches, few more cables, and build the desired network size (for example for Fat Tree topology). It is easy to do, might be more cost effective. If you need help to design the topology (which ports connects to which port, I can send you a description). With this option, you can also do any kind of oversubscription if you want to. I was looking into a fat-tree topology yesterday. Considering the number of additional switches needed, and the cabling costs, I'm not sure it will really be cost effective. Just to stay at the same capacity I'm at now, 72 ports, I'd need to by 6 switches + cables. > >> http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ >> family=71&menu_section=49 >> >> 3. In my searching yesterday, I didn't find any FDR core/enterprise switches >> with > 36 ports, other than the Mellanox SX6536. At 648 ports, the SX6536is >> too big for my needs. I've got to be over looking other products, right? >> >> http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ >> family=122&menu_section=49 > More options are getting out now. 324-port version will be available in a week, and the 216 few weeks after. Before the summer that 108 will be released. That's in my timeframe, so I'll keep an eye on the Mellanox website. > >> 4. Adding an additional line card to my existing switch looks like it will cost >> me only ~$5,000, and give me the additional capacity I'll need for the next 1- >> 2 years. I'm thinking it makes sense to do that, and wait for affordable FDR >> switches to come out with the port count I'm looking for instead of >> upgrading to QDR right now, and start buying hardware with FDR HCAs in >> preparation for that. Please feel free to agree/disagree. This brings me to my >> next question... > Depends what you want to build. You can take FDR today, build 2:1 oversubscription to get "QDR" throughput and this will be cheaper than using QDR switches. In any case, if you need any help on the negotiation side, let me know. Thanks for the offer. If I decide to buy new switches instead of expanding my DDR switch, i'll e-mail you off-list. > >> 5. FDR and QDR should be backwards compatible with my existing DDR >> hardware, but how exactly does work? If I have, say an FDR switch with a >> mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to the >> lowest-common denominator, or will the slow-down be based on the two >> nodes involved in the communication only? When I googled for an answer, >> all I found were marketing documents that guaranteed backwards >> compatibility, but didn't go to this level of detail, I searched the standard >> spec (v1.2.1), and didn't find an obvious answer to this question. > You can mix and match anything on the InfiniBand side. You can connect SDR, DDR, QDR and FDR and it all will work. When you do that, a direct connection between 2 ports will be run at the common denominator. So if you have FDR port connected to FDR port directly, it will run FDR. If you have DDR port connected directly to FDR port, that connection will run DDR. In your case, part of the fabric will run FDR, part will run DDR. That's what I suspected. Thanks for the confirmation. > > >> 6. I see some Mellanox docs saying their FDR switches are compliant with >> v1.3 of the standard, but the latest version available for download is 1.2.1. I >> take it the final version of 1.3 hasn't been ratified yet. 
Is that correct? > > 1.3 is the IBTA spec that includes FDR and EDR. The spec is completed, but not on the web site yet. Ditto. -- Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Wed Apr 18 15:02:26 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 18 Apr 2012 15:02:26 -0400 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: References: <4F8ED821.5000204@ias.edu> Message-ID: <4F8F0FC2.8000001@ias.edu> Aggregation spine? Can you tell me more about that? Can you give me a part/model number? Prentice On 04/18/2012 11:22 AM, Andrew Howard wrote: > I would talk to Mellanox about your options for switch topology. We > opted not to go with the single 648-port FDR director switch, but > instead use top-of-rack leaf switches (the 36-port guys) and then an > aggregation spine to connect those. It performs beautifully. It also > means we don't have to worry about buying longer (more expensive) > cables to run to the director switch, we can buy the shorter cables to > run to the rack switch and then only have to buy a few 10M cables to > run to the spine. > > -- > Andrew Howard > HPC Systems Engineer > Purdue University > (765) 889-2523 > > > > On Wed, Apr 18, 2012 at 11:05 AM, Prentice Bisbal > wrote: > > Beowulfers, > > I'm planning on adding some upgrades to my existing cluster, which has > 66 compute nodes pluss the head node. Networking consists of a Cisco > 7012 IB switch with 6 out of 12 line cards installed, giving me a > capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet > switches that have only six extra ports between them. > > I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, > and then begin adding/replacing nodes in the cluster. Obviously, I'll > need to increase capacity of both my IB and ethernet networks. The > questions I have are about upgrading my InifiniBand. > > 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox > the only game in town these days? > > 2. Due to the size of my cluster, it looks like buying a just a > core/enterprise IB switch with capacity for ~100 ports is the best > option (I don't expect my cluster to go much bigger than this in the > next 4-5 years). Based on that criteria, it looks like the Mellanox > IS5100 is my only option. Am I over looking other options? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=71&menu_section=49 > > > 3. In my searching yesterday, I didn't find any FDR core/enterprise > switches with > 36 ports, other than the Mellanox SX6536. At 648 > ports, > the SX6536is too big for my needs. I've got to be over looking other > products, right? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=122&menu_section=49 > > > 4. Adding an additional line card to my existing switch looks like it > will cost me only ~$5,000, and give me the additional capacity > I'll need > for the next 1-2 years. I'm thinking it makes sense to do that, > and wait > for affordable FDR switches to come out with the port count I'm > looking > for instead of upgrading to QDR right now, and start buying hardware > with FDR HCAs in preparation for that. Please feel free to > agree/disagree. 
This brings me to my next question... > > 5. FDR and QDR should be backwards compatible with my existing DDR > hardware, but how exactly does work? If I have, say an FDR switch > with a > mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to > the lowest-common denominator, or will the slow-down be based on > the two > nodes involved in the communication only? When I googled for an > answer, > all I found were marketing documents that guaranteed backwards > compatibility, but didn't go to this level of detail, I searched the > standard spec (v1.2.1), and didn't find an obvious answer to this > question. > > 6. I see some Mellanox docs saying their FDR switches are > compliant with > v1.3 of the standard, but the latest version available for download is > 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is > that correct? > > -- > Prentice > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Shainer at Mellanox.com Wed Apr 18 15:07:33 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Wed, 18 Apr 2012 19:07:33 +0000 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: <4F8F0F0D.5000500@ias.edu> References: <4F8ED821.5000204@ias.edu> <4F8F0F0D.5000500@ias.edu> Message-ID: > Gilad, > > Thanks for the quick, helpful responses. See my in-line comments below. Thanks for the comments. > On 04/18/2012 11:27 AM, Gilad Shainer wrote: > >> Beowulfers, > >> > >> I'm planning on adding some upgrades to my existing cluster, which > >> has > >> 66 compute nodes pluss the head node. Networking consists of a Cisco > >> 7012 IB switch with 6 out of 12 line cards installed, giving me a > >> capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet > >> switches that have only six extra ports between them. > >> > >> I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, > >> and then begin adding/replacing nodes in the cluster. Obviously, I'll > >> need to increase capacity of both my IB and ethernet networks. The > >> questions I have are about upgrading my InifiniBand. > >> > >> 1. It looks like QLogic is out of the InfiniBand business. Is > >> Mellanox the only game in town these days? > > Intel bought the QLogic InfiniBand business so this is a second option > > I searched both the QLogic and Intel websites for 'InfiniBand", and neither > returned any hits yesterday. It makes sense that you can't find any IB info on > QLogic's site anymore. Today, I was able to find the Link for Intel TrueScale > InfiniBand products. Intel did a good job of hiding/burying the link under > "More Products" on their Products pull-down menu. No idea why I couldn't > find it by searching yesterday. > Typo in search box, maybe? > > > > >> 2. Due to the size of my cluster, it looks like buying a just a > >> core/enterprise IB switch with capacity for ~100 ports is the best > >> option (I don't expect my cluster to go much bigger than this in the > >> next 4-5 years). 
Based on that criteria, it looks like the Mellanox > >> IS5100 is my only option. Am I over looking other options? > > You can also take 36 port switches, few more cables, and build the desired > network size (for example for Fat Tree topology). It is easy to do, might be > more cost effective. If you need help to design the topology (which ports > connects to which port, I can send you a description). With this option, you > can also do any kind of oversubscription if you want to. > > I was looking into a fat-tree topology yesterday. Considering the number of > additional switches needed, and the cabling costs, I'm not sure it will really > be cost effective. Just to stay at the same capacity I'm at now, 72 ports, I'd > need to by 6 switches + cables. It depends on the system that you have, cable distance etc. In most cases it can be more cost effective, but it is easier to use one large switch. In any case, if you need help to find the best option, email me the topology > > > >> > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ > >> family=71&menu_section=49 > >> > >> 3. In my searching yesterday, I didn't find any FDR core/enterprise > >> switches with > 36 ports, other than the Mellanox SX6536. At 648 > >> ports, the SX6536is too big for my needs. I've got to be over looking other > products, right? > >> > >> > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ > >> family=122&menu_section=49 > > More options are getting out now. 324-port version will be available in a > week, and the 216 few weeks after. Before the summer that 108 will be > released. > > That's in my timeframe, so I'll keep an eye on the Mellanox website. Sure. Feel free to email me directly, and I can connect you to the folks that can help > > > >> 4. Adding an additional line card to my existing switch looks like it > >> will cost me only ~$5,000, and give me the additional capacity I'll > >> need for the next 1- > >> 2 years. I'm thinking it makes sense to do that, and wait for > >> affordable FDR switches to come out with the port count I'm looking > >> for instead of upgrading to QDR right now, and start buying hardware > >> with FDR HCAs in preparation for that. Please feel free to > >> agree/disagree. This brings me to my next question... > > Depends what you want to build. You can take FDR today, build 2:1 > oversubscription to get "QDR" throughput and this will be cheaper than > using QDR switches. In any case, if you need any help on the negotiation > side, let me know. > > Thanks for the offer. If I decide to buy new switches instead of expanding my > DDR switch, i'll e-mail you off-list. > > > > >> 5. FDR and QDR should be backwards compatible with my existing DDR > >> hardware, but how exactly does work? If I have, say an FDR switch > >> with a mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow > >> down to the lowest-common denominator, or will the slow-down be > based > >> on the two nodes involved in the communication only? When I googled > >> for an answer, all I found were marketing documents that guaranteed > >> backwards compatibility, but didn't go to this level of detail, I > >> searched the standard spec (v1.2.1), and didn't find an obvious answer to > this question. > > You can mix and match anything on the InfiniBand side. You can connect > SDR, DDR, QDR and FDR and it all will work. When you do that, a direct > connection between 2 ports will be run at the common denominator. 
So if > you have FDR port connected to FDR port directly, it will run FDR. If you have > DDR port connected directly to FDR port, that connection will run DDR. In > your case, part of the fabric will run FDR, part will run DDR. > > That's what I suspected. Thanks for the confirmation. > > > > > >> 6. I see some Mellanox docs saying their FDR switches are compliant > >> with > >> v1.3 of the standard, but the latest version available for download > >> is 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is that > correct? > > > > 1.3 is the IBTA spec that includes FDR and EDR. The spec is completed, but > not on the web site yet. > > Ditto. > > -- > Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Wed Apr 18 15:45:50 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 18 Apr 2012 15:45:50 -0400 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: <4F8ED821.5000204@ias.edu> References: <4F8ED821.5000204@ias.edu> Message-ID: <4F8F19EE.20301@ias.edu> I just thought of something else... All of my current IB devices (switch, HCAs) are copper with CX4 connectors. It looks like all the Mellanox QDR and FDR cards use QSFP connectors, so that's something else I'll have to consider with my upgrade plans. -- Prentice On 04/18/2012 11:05 AM, Prentice Bisbal wrote: > Beowulfers, > > I'm planning on adding some upgrades to my existing cluster, which has > 66 compute nodes pluss the head node. Networking consists of a Cisco > 7012 IB switch with 6 out of 12 line cards installed, giving me a > capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet > switches that have only six extra ports between them. > > I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, > and then begin adding/replacing nodes in the cluster. Obviously, I'll > need to increase capacity of both my IB and ethernet networks. The > questions I have are about upgrading my InifiniBand. > > 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox > the only game in town these days? > > 2. Due to the size of my cluster, it looks like buying a just a > core/enterprise IB switch with capacity for ~100 ports is the best > option (I don't expect my cluster to go much bigger than this in the > next 4-5 years). Based on that criteria, it looks like the Mellanox > IS5100 is my only option. Am I over looking other options? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=71&menu_section=49 > > 3. In my searching yesterday, I didn't find any FDR core/enterprise > switches with > 36 ports, other than the Mellanox SX6536. At 648 ports, > the SX6536is too big for my needs. I've got to be over looking other > products, right? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=122&menu_section=49 > > 4. Adding an additional line card to my existing switch looks like it > will cost me only ~$5,000, and give me the additional capacity I'll need > for the next 1-2 years. 
I'm thinking it makes sense to do that, and wait > for affordable FDR switches to come out with the port count I'm looking > for instead of upgrading to QDR right now, and start buying hardware > with FDR HCAs in preparation for that. Please feel free to > agree/disagree. This brings me to my next question... > > 5. FDR and QDR should be backwards compatible with my existing DDR > hardware, but how exactly does work? If I have, say an FDR switch with a > mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to > the lowest-common denominator, or will the slow-down be based on the two > nodes involved in the communication only? When I googled for an answer, > all I found were marketing documents that guaranteed backwards > compatibility, but didn't go to this level of detail, I searched the > standard spec (v1.2.1), and didn't find an obvious answer to this question. > > 6. I see some Mellanox docs saying their FDR switches are compliant with > v1.3 of the standard, but the latest version available for download is > 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is > that correct? > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Wed Apr 18 15:42:12 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 18 Apr 2012 15:42:12 -0400 (EDT) Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: <4F8F0FC2.8000001@ias.edu> References: <4F8ED821.5000204@ias.edu> <4F8F0FC2.8000001@ias.edu> Message-ID: > Aggregation spine? Can you tell me more about that? Can you give me a > part/model number? spine is just the term for the trunk of a fat tree. usually the per-rack switches are called leaves since if nothing else, they may not be at the top of the rack, or there may be more than one per rack... the good thing about the leaf/spine approach is that it's modular, and possibly less vendor-locked-in. (not that IB is really multi-vendor anyway). cable-wise, I'm not sure leaf-spine really wins, since you can think of a chassis switch as a leaf-spine with FR4 rather than CX4. AFAIKT, the same radix-36 switch is used in each. spine/leave can be distributed so that there's no one place where you get too many cables. "less than fully fat" fabrics seem to be pretty common when taking the modular approach. for instance, if a 36x switch is split into 24 down (node) and 12 up, you can put two back-to-back, or have three going into a single spine switch, or more going into multiple spines. you could even have some racks with more spineward links. and in your case, you could vary the number of links going to your existing chassis switch (though it probably shouldn't be the spine since all its links are slower...) -mark _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From Shainer at Mellanox.com Wed Apr 18 16:08:20 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Wed, 18 Apr 2012 20:08:20 +0000 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: <4F8F19EE.20301@ias.edu> References: <4F8ED821.5000204@ias.edu> <4F8F19EE.20301@ias.edu> Message-ID: All QDR and FDR is QSFP - not only Mellanox. There are QSFP to CX-4 cables if you need. Gilad -----Original Message----- From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Prentice Bisbal Sent: Wednesday, April 18, 2012 12:47 PM To: Beowulf Mailing List Subject: Re: [Beowulf] Questions about upgrading InfiniBand I just thought of something else... All of my current IB devices (switch, HCAs) are copper with CX4 connectors. It looks like all the Mellanox QDR and FDR cards use QSFP connectors, so that's something else I'll have to consider with my upgrade plans. -- Prentice On 04/18/2012 11:05 AM, Prentice Bisbal wrote: > Beowulfers, > > I'm planning on adding some upgrades to my existing cluster, which has > 66 compute nodes pluss the head node. Networking consists of a Cisco > 7012 IB switch with 6 out of 12 line cards installed, giving me a > capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet > switches that have only six extra ports between them. > > I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, > and then begin adding/replacing nodes in the cluster. Obviously, I'll > need to increase capacity of both my IB and ethernet networks. The > questions I have are about upgrading my InifiniBand. > > 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox > the only game in town these days? > > 2. Due to the size of my cluster, it looks like buying a just a > core/enterprise IB switch with capacity for ~100 ports is the best > option (I don't expect my cluster to go much bigger than this in the > next 4-5 years). Based on that criteria, it looks like the Mellanox > IS5100 is my only option. Am I over looking other options? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_fami > ly=71&menu_section=49 > > 3. In my searching yesterday, I didn't find any FDR core/enterprise > switches with > 36 ports, other than the Mellanox SX6536. At 648 > ports, the SX6536is too big for my needs. I've got to be over looking > other products, right? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_fami > ly=122&menu_section=49 > > 4. Adding an additional line card to my existing switch looks like it > will cost me only ~$5,000, and give me the additional capacity I'll > need for the next 1-2 years. I'm thinking it makes sense to do that, > and wait for affordable FDR switches to come out with the port count > I'm looking for instead of upgrading to QDR right now, and start > buying hardware with FDR HCAs in preparation for that. Please feel > free to agree/disagree. This brings me to my next question... > > 5. FDR and QDR should be backwards compatible with my existing DDR > hardware, but how exactly does work? If I have, say an FDR switch with > a mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down > to the lowest-common denominator, or will the slow-down be based on > the two nodes involved in the communication only? When I googled for > an answer, all I found were marketing documents that guaranteed > backwards compatibility, but didn't go to this level of detail, I > searched the standard spec (v1.2.1), and didn't find an obvious answer to this question. 
> > 6. I see some Mellanox docs saying their FDR switches are compliant > with > v1.3 of the standard, but the latest version available for download is > 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is > that correct? > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Wed Apr 18 16:16:47 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 18 Apr 2012 16:16:47 -0400 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: References: <4F8ED821.5000204@ias.edu> <4F8F0FC2.8000001@ias.edu> Message-ID: <4F8F212F.7080205@ias.edu> On 04/18/2012 03:42 PM, Mark Hahn wrote: >> Aggregation spine? Can you tell me more about that? Can you give me a >> part/model number? > > spine is just the term for the trunk of a fat tree. usually the > per-rack switches are called leaves since if nothing else, they may > not be at the top of the rack, or there may be more than one per rack... Ahh... gotcha. In the previous e-mail, it sounded like a special line card or something from the context. Tripped up by terminology. > > the good thing about the leaf/spine approach is that it's modular, > and possibly less vendor-locked-in. (not that IB is really > multi-vendor anyway). > > cable-wise, I'm not sure leaf-spine really wins, since you can think > of a chassis switch as a leaf-spine with FR4 rather than CX4. AFAIKT, > the same radix-36 switch is used in each. > > spine/leave can be distributed so that there's no one place where you > get too many cables. My cluster is only 3 racks, with the head-node and IB switch in the middle rack, so the cable don't have too far to go, so switching to top-of-rack switches isn't that big of a deal for me. Of course, my cluster might expand as part of this upgrade. > > "less than fully fat" fabrics seem to be pretty common when taking the > modular approach. for instance, if a 36x switch is split into 24 down > (node) and 12 up, you can put two back-to-back, or have three going > into a single spine switch, or more going into multiple > spines. you could even have some racks with more spineward links. > and in your case, you could vary the number of links going to your > existing chassis switch (though it probably shouldn't be the spine > since all its links are slower...) > > -mark > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From samuel at unimelb.edu.au Wed Apr 18 21:53:51 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Thu, 19 Apr 2012 11:53:51 +1000 Subject: [Beowulf] Migrating from IB datagram mode to connected mode live ? 
Message-ID: <4F8F702F.7070208@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi folks, For hysterical raisins we have an IBM iDataPlex system which is running QDR IB in datagram mode. To that IB network we'll be adding another QDR system which can only run in connected mode. The kicker is that our IB network is used for GPFS over IPoIB and so our NSD's will need to move to connected mode for the new system. I've been Googling without success to find out if you can do such a migration live (i.e. change the servers to connected mode, increase their MTUs and then migrate clients to connected mode (we have enough redundancy in servers to do this) or whether we'll need to schedule an outage and take the whole system down and bring it back up in connected mode. Any thoughts? cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk+PcC8ACgkQO2KABBYQAh8wrwCghA14T85C0WIegdURbFtW5Spb mDMAn0k/HTHFEi1avoJlSidrWa5qNCjP =DBuj -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From h-bugge at online.no Thu Apr 19 03:04:53 2012 From: h-bugge at online.no (=?iso-8859-1?Q?H=E5kon_Bugge?=) Date: Thu, 19 Apr 2012 09:04:53 +0200 Subject: [Beowulf] Migrating from IB datagram mode to connected mode live ? In-Reply-To: <4F8F702F.7070208@unimelb.edu.au> References: <4F8F702F.7070208@unimelb.edu.au> Message-ID: echo connected > /sys/class/net/ib0/mode -h On 19. apr. 2012, at 03.53, Christopher Samuel wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi folks, > > For hysterical raisins we have an IBM iDataPlex system which is > running QDR IB in datagram mode. To that IB network we'll be adding > another QDR system which can only run in connected mode. > > The kicker is that our IB network is used for GPFS over IPoIB and so > our NSD's will need to move to connected mode for the new system. > > I've been Googling without success to find out if you can do such a > migration live (i.e. change the servers to connected mode, increase > their MTUs and then migrate clients to connected mode (we have enough > redundancy in servers to do this) or whether we'll need to schedule an > outage and take the whole system down and bring it back up in > connected mode. > > Any thoughts? 
> > cheers, > Chris > - -- > Christopher Samuel - Senior Systems Administrator > VLSCI - Victorian Life Sciences Computation Initiative > Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 > http://www.vlsci.unimelb.edu.au/ > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iEYEARECAAYFAk+PcC8ACgkQO2KABBYQAh8wrwCghA14T85C0WIegdURbFtW5Spb > mDMAn0k/HTHFEi1avoJlSidrWa5qNCjP > =DBuj > -----END PGP SIGNATURE----- > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From samuel at unimelb.edu.au Thu Apr 19 04:10:57 2012 From: samuel at unimelb.edu.au (Chris Samuel) Date: Thu, 19 Apr 2012 18:10:57 +1000 Subject: [Beowulf] Migrating from IB datagram mode to connected mode live ? In-Reply-To: References: <4F8F702F.7070208@unimelb.edu.au> Message-ID: <201204191810.57977.samuel@unimelb.edu.au> On Thursday 19 April 2012 17:04:53 H?kon Bugge wrote: > echo connected > /sys/class/net/ib0/mode Umm, yes, we know how to set it; the question is whether introducing nodes that have it set onto an IB fabric with nodes in datagram mode will cause issues and/or instability ? For instance can nodes in connected mode talk to nodes in datagram mode, and vice versa ? cheers, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From h-bugge at online.no Thu Apr 19 05:06:06 2012 From: h-bugge at online.no (=?iso-8859-1?Q?H=E5kon_Bugge?=) Date: Thu, 19 Apr 2012 11:06:06 +0200 Subject: [Beowulf] Migrating from IB datagram mode to connected mode live ? In-Reply-To: <201204191810.57977.samuel@unimelb.edu.au> References: <4F8F702F.7070208@unimelb.edu.au> <201204191810.57977.samuel@unimelb.edu.au> Message-ID: On 19. apr. 2012, at 10.10, Chris Samuel wrote: > On Thursday 19 April 2012 17:04:53 H?kon Bugge wrote: > >> echo connected > /sys/class/net/ib0/mode > > Umm, yes, we know how to set it; the question is whether introducing > nodes that have it set onto an IB fabric with nodes in datagram mode > will cause issues and/or instability ? > > For instance can nodes in connected mode talk to nodes in datagram > mode, and vice versa ? Datagram mode is required. For connected mode, the node is required to establish both an UD QP (for nodes not capable of connected mode and multicast) and an RC QP. During address resolution, the requester capability is included as part of the L2 address. So yes, it works. 
-h _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From raysonlogin at gmail.com Thu Apr 19 10:26:19 2012 From: raysonlogin at gmail.com (Rayson Ho) Date: Thu, 19 Apr 2012 10:26:19 -0400 Subject: [Beowulf] 2 Security bugs fixed in Grid Engine In-Reply-To: <4F90058A.3000900@brightcomputing.com> References: <4F90058A.3000900@brightcomputing.com> Message-ID: Right, the GE2011.11p1.patch diff is against GE2011.11. GE2011.11p1 (ie. trunk) is compatible with GE2011.11, and GE2011.11 is also compatible with SGE 6.2u5. I can quickly create a diff for GE2011.11 during lunch time today - will let you know when it is done. Rayson ================================= Open Grid Scheduler / Grid Engine http://gridscheduler.sourceforge.net/ Scalable Grid Engine Support Program http://www.scalablelogic.com/ On Thu, Apr 19, 2012 at 8:31 AM, Taras Shapovalov wrote: > Hi, > > I am trying to apply GE2011.11p1.patch for GE2011.11 and it fails. It seems, > the developers of GE have created this patch for the trunk version of GE > (which is not the same as the stable version). Is it correct? > > -- > Best regards, > Taras > -- ================================================== Open Grid Scheduler - The Official Open Source Grid Engine http://gridscheduler.sourceforge.net/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From raysonlogin at gmail.com Thu Apr 19 13:22:05 2012 From: raysonlogin at gmail.com (Rayson Ho) Date: Thu, 19 Apr 2012 13:22:05 -0400 Subject: [Beowulf] 2 Security bugs fixed in Grid Engine In-Reply-To: References: <4F90058A.3000900@brightcomputing.com> Message-ID: Taras, Updated for GE2011.11: http://gridscheduler.sourceforge.net/security.html Note that with this patch, users won't be able to pass dangerous env. vars into the environment of epilog or prolog (and SGE's rshd, sshd, etc) via qsub -v or qsub -V . However, the user job environment is not affected. Also, any of those "dangerous" env. vars can be inherited from the execution daemon's original start environment (so if LD_LIBRARY_PATH is really needed, set it in the execution daemon's environment). Compare to other implementations, we think our fix is not intrusive at all. We have never seen any sites running epilog or prolog that needs users' LD_LIBRARY_PATH to function. Rayson ================================= Open Grid Scheduler / Grid Engine http://gridscheduler.sourceforge.net/ Scalable Grid Engine Support Program http://www.scalablelogic.com/ On Thu, Apr 19, 2012 at 10:26 AM, Rayson Ho wrote: > Right, the GE2011.11p1.patch diff is against GE2011.11. GE2011.11p1 > (ie. trunk) is compatible with GE2011.11, and GE2011.11 is also > compatible with SGE 6.2u5. > > I can quickly create a diff for GE2011.11 during lunch time today - > will let you know when it is done. 
> > Rayson > > ================================= > Open Grid Scheduler / Grid Engine > http://gridscheduler.sourceforge.net/ > > Scalable Grid Engine Support Program > http://www.scalablelogic.com/ > > > On Thu, Apr 19, 2012 at 8:31 AM, Taras Shapovalov > wrote: >> Hi, >> >> I am trying to apply GE2011.11p1.patch for GE2011.11 and it fails. It seems, >> the developers of GE have created this patch for the trunk version of GE >> (which is not the same as the stable version). Is it correct? >> >> -- >> Best regards, >> Taras >> > > > > -- > ================================================== > Open Grid Scheduler - The Official Open Source Grid Engine > http://gridscheduler.sourceforge.net/ -- ================================================== Open Grid Scheduler - The Official Open Source Grid Engine http://gridscheduler.sourceforge.net/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From raysonlogin at gmail.com Thu Apr 19 14:34:08 2012 From: raysonlogin at gmail.com (Rayson Ho) Date: Thu, 19 Apr 2012 14:34:08 -0400 Subject: [Beowulf] Next release of Open Grid Scheduler & the Gompute User Group Meeting Message-ID: The next release of Open Grid Scheduler/Grid Engine will be released at the Gompute User Group Meeting. The Gompute User Group Meeting is a free, 2-day, HPC event in Gothenburg, Sweden. Register for the event at: http://www.simdi.se/ ** Please let me know if you are interested in a Grid Engine track. Gridcore/Gompute contributed booth space at SC11 for the Grid Engine 2011.11 release (the first major release of open-source Grid Engine after separation from Oracle), and joined the Open Grid Scheduler project in April 2012. Rayson ================================= Open Grid Scheduler / Grid Engine http://gridscheduler.sourceforge.net/ Scalable Grid Engine Support Program http://www.scalablelogic.com/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Fri Apr 20 09:37:34 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Fri, 20 Apr 2012 09:37:34 -0400 Subject: [Beowulf] New industry for Iceland? Message-ID: <4F91669E.5050901@ias.edu> Combine this article: "A Cool Place for Cheap Flops" http://www.hpcwire.com/hpcwire/2012-04-11/a_cool_place_for_cheap_flops.html With this paper: "Relativistic Statistical Arbitrage" dspace.mit.edu/openaccess-disseminate/1721.1/62859 And it's looks like Iceland has a new industry: Datacenters for the high-frequency trading (HFT) gang. Just remember - you heard it here first, folks! ;) -- Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From samuel at unimelb.edu.au Fri Apr 20 03:47:46 2012 From: samuel at unimelb.edu.au (Chris Samuel) Date: Fri, 20 Apr 2012 17:47:46 +1000 Subject: [Beowulf] Migrating from IB datagram mode to connected mode live ? In-Reply-To: References: <4F8F702F.7070208@unimelb.edu.au> <201204191810.57977.samuel@unimelb.edu.au> Message-ID: <201204201747.46531.samuel@unimelb.edu.au> On Thursday 19 April 2012 19:06:06 H?kon Bugge wrote: > So yes, it works. Thanks, and also thanks to Gilad from Mellanox who put me in contact with another person who was able to answer this and other questions. One valuable thing I learnt was that IPoIB includes neighbor MTU information and so a system sending IP packets from a connected mode host to a datagram mode host will already know the destinations MTU. cheers, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Fri Apr 20 20:02:40 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 21 Apr 2012 02:02:40 +0200 Subject: [Beowulf] New industry for Iceland? In-Reply-To: <4F91669E.5050901@ias.edu> References: <4F91669E.5050901@ias.edu> Message-ID: When i was in iceland some years ago for a long term meeting (no not the start of wikileaks - i was there for a less harmful reason) the thing happened i had feared - for some days when i was there half of the internet adresses in Europe mainland were impossible to reach from iceland. This happens regurarly from there. I had taken the server lucky with me, thanks to www.hotels.nl for sponsoring that. Usually Icelanders have 7 jobs, have more chessgrandmasters per 100k inhabitants than any other nation; actually tomorrow i might play an icelander who emigrated to Europe. So the guy who's heading the new datacenter there is probably a busy man. Just read on... He'll first need to build a construction against the 3000+ small earthquakes a year Iceland has or so, then every component needed he needs to import of course; a fuse broken? In Iceland that's BAD news - they might not be in store in a cirlce of 1000 kilometer around you :) Then when something arrives at the airport, your datacenter equipment can travel over the only road the iceland has. Now if the datacenter is on that road that's rather good news. If not then probably you need to be so lucky it was very well packaged as at the rocky surface there everything trembles to pieces; probably that's why so many cars over there are using those massive wheels - when driving over small rocks you feel that a tad less; but even these cars have problems with rocky surfaces with say a 5CM rocks. Only some bigger trucks which they do not really have over there, can handle that - we have 1 such truck in Netherlands - it joins the big races. This for sure is gonna be the lowest reliability type datacenter, yet it would be typically icelandic for the guy with the 7 jobs putting the datacenter together to get something up and running there :) Yet for vulcanologists it's an interesting island. 
Maybe one of them is interested in visiting Iceland and pay 15 euro for a hamburger meal. Vincent On Apr 20, 2012, at 3:37 PM, Prentice Bisbal wrote: > Combine this article: > > "A Cool Place for Cheap Flops" > http://www.hpcwire.com/hpcwire/2012-04-11/ > a_cool_place_for_cheap_flops.html > > With this paper: > > "Relativistic Statistical Arbitrage" > dspace.mit.edu/openaccess-disseminate/1721.1/62859 > > And it's looks like Iceland has a new industry: Datacenters for the > high-frequency trading (HFT) gang. > > Just remember - you heard it here first, folks! ;) > > -- > Prentice > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Tue Apr 24 22:58:39 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Tue, 24 Apr 2012 22:58:39 -0400 (EDT) Subject: [Beowulf] yikes: intel buys cray's spine Message-ID: http://www.eetimes.com/electronics-news/4371639/Cray-sells-interconnect-hardware-unit-to-Intel that's one market where AMD no longer plays eh? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From lindahl at pbm.com Wed Apr 25 02:52:20 2012 From: lindahl at pbm.com (Greg Lindahl) Date: Tue, 24 Apr 2012 23:52:20 -0700 Subject: [Beowulf] yikes: intel buys cray's spine In-Reply-To: References: Message-ID: <20120425065220.GB14230@bx9.net> > http://www.eetimes.com/electronics-news/4371639/Cray-sells-interconnect-hardware-unit-to-Intel This is a real surprise. Intel said then that the IB stuff they bought from QLogic/PathScale was intended for exoscale computing. For this buy Intel says: > "This deal does not affect our current Infiniband product plans and at > this moment we don't disclose future product plans related to acquired > assets," he added. And a Cray guy said, this time: > "If interconnects are being incorporated into processors, we want to > look at other areas where we can differentiate," Very interesting. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Wed Apr 25 04:52:11 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed, 25 Apr 2012 10:52:11 +0200 Subject: [Beowulf] yikes: intel buys cray's spine In-Reply-To: <20120425065220.GB14230@bx9.net> References: <20120425065220.GB14230@bx9.net> Message-ID: On Apr 25, 2012, at 8:52 AM, Greg Lindahl wrote: >> http://www.eetimes.com/electronics-news/4371639/Cray-sells- >> interconnect-hardware-unit-to-Intel > > This is a real surprise. 
Intel said then that the IB stuff they bought > from QLogic/PathScale was intended for exoscale computing. For this > buy Intel says: > >> "This deal does not affect our current Infiniband product plans >> and at >> this moment we don't disclose future product plans related to >> acquired >> assets," he added. > > And a Cray guy said, this time: > >> "If interconnects are being incorporated into processors, we want to >> look at other areas where we can differentiate," > > Very interesting. > > -- greg Though the statements seem contradictory, for a very large company it is not a problem to own two different product lines. Being 'director' of a business like that is more like being a small manager within Intel. Yet managing these newly acquired product lines requires a totally different sort of leadership, if I may say so. It's not babysitting a product - in the long term it requires a genuinely innovative way of thinking. Such people usually aren't working as managers at a huge company. These huge companies are organised totally differently and keep their creative people at totally different spots in the organisation, requiring far more overhead in terms of number of employees to get the same thing done. That means they need more turnover out of this business than Cray and QLogic had, and meanwhile the manager who eventually takes the current CEO's spot will want to grow further and deeper into the giant company's business, so it's always a gamble what will come back at that spot to manage. > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From samuel at unimelb.edu.au Thu Apr 26 21:34:37 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Fri, 27 Apr 2012 11:34:37 +1000 Subject: [Beowulf] yikes: intel buys cray's spine In-Reply-To: <20120425065220.GB14230@bx9.net> References: <20120425065220.GB14230@bx9.net> Message-ID: <4F99F7AD.5010701@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 25/04/12 16:52, Greg Lindahl wrote: > And a Cray guy said, this time: > >>> "If interconnects are being incorporated into processors, we >>> want to look at other areas where we can differentiate," > > Very interesting. Yeah, I'd guess that AMD would be a little worried, perhaps they should look at buying Gnodal, where some of the Quadrics people ended up, to get in on that act..
:-) - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk+Z960ACgkQO2KABBYQAh/TqQCfS4V3sNu3pf7cOIOJbSgRrmPB KEAAoIzjmfcz9J+3ot1TYNhbC2DIOTy4 =ndpG -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From deadline at eadline.org Thu Apr 26 21:48:35 2012 From: deadline at eadline.org (Douglas Eadline) Date: Thu, 26 Apr 2012 21:48:35 -0400 Subject: [Beowulf] yikes: intel buys cray's spine In-Reply-To: <4F99F7AD.5010701@unimelb.edu.au> References: <20120425065220.GB14230@bx9.net> <4F99F7AD.5010701@unimelb.edu.au> Message-ID: <718e2c97ebdad95beba1e4c602c6848d.squirrel@mail.eadline.org> > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 25/04/12 16:52, Greg Lindahl wrote: > >> And a Cray guy said, this time: >> >>>> "If interconnects are being incorporated into processors, we >>>> want to look at other areas where we can differentiate," >> >> Very interesting. > > Yeah, I'd guess that AMD would be a little worried, perhaps they > should look at buying Gnodal, where some of the Quadrics people ended > up, to get in on that act.. :-) Some of things I read suggested this is about server fabrics. AMD's purchase of SeaMicro out from under Intel's arm may have had something to do with it. Intel needed a fabric for dense server boxes. I would think there may be a "back license" in there for Cray somewhere. Not sure this is an HPC play. -- Doug > > - -- > Christopher Samuel - Senior Systems Administrator > VLSCI - Victorian Life Sciences Computation Initiative > Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 > http://www.vlsci.unimelb.edu.au/ > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iEYEARECAAYFAk+Z960ACgkQO2KABBYQAh/TqQCfS4V3sNu3pf7cOIOJbSgRrmPB > KEAAoIzjmfcz9J+3ot1TYNhbC2DIOTy4 > =ndpG > -----END PGP SIGNATURE----- > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From john.hearns at mclaren.com Tue Apr 17 11:26:23 2012 From: john.hearns at mclaren.com (Hearns, John) Date: Tue, 17 Apr 2012 16:26:23 +0100 Subject: [Beowulf] Ubuntu MAAS Message-ID: <207BB2F60743C34496BE41039233A8090D09C8DC@MRL-PWEXCHMB02.mil.tagmclarengroup.com> I read a ZDnet article on Ubuntu LTS pitching to be your cloud and data centre distribution of choice.
It mentions Ubunti Metal-As-A-Service http://www.markshuttleworth.com/archives/1103 https://wiki.ubuntu.com/ServerTeam/MAAS/ I guess this is what clustering types have been doing for a long time with various cluster deployment and management suites. Also note Mark Shuttleworths comment about the cost of the OS per node : "As we enter an era in which ATOM is as important in the data centre as XEON, an operating system like Ubuntu makes even more sense" I guess this chimes with the initial Beowulfery spirit - when you have low-cost nodes, why use an OS (whether it is Windows, Solaris etc) Which is a significant fraction of the nodes cost. John Hearns | CFD Hardware Specialist | McLaren Racing Limited McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK T: +44 (0) 1483 262000 D: +44 (0) 1483 262352 F: +44 (0) 1483 261928 E: john.hearns at mclaren.com W: www.mclaren.com The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eagles051387 at gmail.com Tue Apr 17 11:43:59 2012 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Tue, 17 Apr 2012 17:43:59 +0200 Subject: [Beowulf] Ubuntu MAAS In-Reply-To: <207BB2F60743C34496BE41039233A8090D09C8DC@MRL-PWEXCHMB02.mil.tagmclarengroup.com> References: <207BB2F60743C34496BE41039233A8090D09C8DC@MRL-PWEXCHMB02.mil.tagmclarengroup.com> Message-ID: <4F8D8FBF.3040500@gmail.com> This is just an FYI I know a developer and he said this is still something new canonical are working on to replace orchestra in regards to provisioning, I dont think its going ot be ready for prime time at least until the next LTS. I woudl like to do some testing come summer about this as this is a feature that interests me greatly. On 4/17/12 5:26 PM, Hearns, John wrote: > > I read a ZDnet article on Ubuntu LTS pitching to be your cloud and > data centre distribution on choice. > > It mentions Ubunti Metal-As-A-Service > > http://www.markshuttleworth.com/archives/1103 > > https://wiki.ubuntu.com/ServerTeam/MAAS/ > > I guess this is what clustering types have been doing for a long time > with various cluster deployment and management suites. > > Also note Mark Shuttleworths comment about the cost of the OS per node : > > "As we enter an era in which ATOM is as important in the data centre > as XEON, an operating system like Ubuntu makes even more sense" > > I guess this chimes with the initial Beowulfery spirit -- when you > have low-cost nodes, why use an OS (whether it is Windows, Solaris etc) > > Which is a significant fraction of the nodes cost. 
> > *John Hearns**| CFD Hardware Specialist |**McLaren Racing Limited* > McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK > > > *T: * +44 (0) 1483 262000 > > *D: *+44 (0) 1483 262352 > > *F:* +44 (0) 1483 261928 > *E:*john.hearns at mclaren.com > > *W: *www.mclaren.com > > The contents of this email are confidential and for the exclusive use > of the intended recipient. If you receive this email in error you > should not copy it, retransmit it, use it or disclose its contents but > should return it to the sender immediately and delete your copy. > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From raysonlogin at gmail.com Tue Apr 17 20:06:23 2012 From: raysonlogin at gmail.com (Rayson Ho) Date: Tue, 17 Apr 2012 20:06:23 -0400 Subject: [Beowulf] 2 Security bugs fixed in Grid Engine Message-ID: There were 2 security related bugs fixed and released in Grid Engine today: - Code injection via LD_* environment variables - sgepasswd buffer overflow Oracle fixed both of them in their CPU (Critical Patch Update) release for Oracle Grid Engine this afternoon. For Sun Grid Engine (6.2u5) and Open Grid Scheduler/Grid Engine, visit: http://gridscheduler.sourceforge.net/security.html The first one was found by William Hay back in Nov 2011. And the second one was reported by an outside security researcher to Oracle. The details of the bug were passed onto me, and we (all the Grid Engine forks) decided that we should share any security related information instead of putting it in marketing slides. Download patches and pre-compiled binaries for: - SGE 6.2u5, 6.2u5p1, 6.2u5p2 - Open Grid Scheduler/Grid Engine 2011.11 from the URL above. To apply the patches, just replace the older version of the binaries with the newer version. Rayson ================================= Open Grid Scheduler / Grid Engine http://gridscheduler.sourceforge.net/ Scalable Grid Engine Support Program http://www.scalablelogic.com/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Wed Apr 18 11:05:05 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 18 Apr 2012 11:05:05 -0400 Subject: [Beowulf] Questions about upgrading InfiniBand Message-ID: <4F8ED821.5000204@ias.edu> Beowulfers, I'm planning on adding some upgrades to my existing cluster, which has 66 compute nodes pluss the head node. Networking consists of a Cisco 7012 IB switch with 6 out of 12 line cards installed, giving me a capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet switches that have only six extra ports between them. 
I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, and then begin adding/replacing nodes in the cluster. Obviously, I'll need to increase capacity of both my IB and ethernet networks. The questions I have are about upgrading my InifiniBand. 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox the only game in town these days? 2. Due to the size of my cluster, it looks like buying a just a core/enterprise IB switch with capacity for ~100 ports is the best option (I don't expect my cluster to go much bigger than this in the next 4-5 years). Based on that criteria, it looks like the Mellanox IS5100 is my only option. Am I over looking other options? http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=71&menu_section=49 3. In my searching yesterday, I didn't find any FDR core/enterprise switches with > 36 ports, other than the Mellanox SX6536. At 648 ports, the SX6536is too big for my needs. I've got to be over looking other products, right? http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=122&menu_section=49 4. Adding an additional line card to my existing switch looks like it will cost me only ~$5,000, and give me the additional capacity I'll need for the next 1-2 years. I'm thinking it makes sense to do that, and wait for affordable FDR switches to come out with the port count I'm looking for instead of upgrading to QDR right now, and start buying hardware with FDR HCAs in preparation for that. Please feel free to agree/disagree. This brings me to my next question... 5. FDR and QDR should be backwards compatible with my existing DDR hardware, but how exactly does work? If I have, say an FDR switch with a mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to the lowest-common denominator, or will the slow-down be based on the two nodes involved in the communication only? When I googled for an answer, all I found were marketing documents that guaranteed backwards compatibility, but didn't go to this level of detail, I searched the standard spec (v1.2.1), and didn't find an obvious answer to this question. 6. I see some Mellanox docs saying their FDR switches are compliant with v1.3 of the standard, but the latest version available for download is 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is that correct? -- Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Shainer at Mellanox.com Wed Apr 18 11:27:21 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Wed, 18 Apr 2012 15:27:21 +0000 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: <4F8ED821.5000204@ias.edu> References: <4F8ED821.5000204@ias.edu> Message-ID: > Beowulfers, > > I'm planning on adding some upgrades to my existing cluster, which has > 66 compute nodes pluss the head node. Networking consists of a Cisco > 7012 IB switch with 6 out of 12 line cards installed, giving me a capacity of 72 > DDR ports, expandable to 144, and two 40-port ethernet switches that have > only six extra ports between them. > > I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, and then > begin adding/replacing nodes in the cluster. 
Obviously, I'll need to increase > capacity of both my IB and ethernet networks. The questions I have are > about upgrading my InifiniBand. > > 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox the only > game in town these days? Intel bought the QLogic InfiniBand business so this is a second option > 2. Due to the size of my cluster, it looks like buying a just a core/enterprise IB > switch with capacity for ~100 ports is the best option (I don't expect my > cluster to go much bigger than this in the next 4-5 years). Based on that > criteria, it looks like the Mellanox > IS5100 is my only option. Am I over looking other options? You can also take 36 port switches, few more cables, and build the desired network size (for example for Fat Tree topology). It is easy to do, might be more cost effective. If you need help to design the topology (which ports connects to which port, I can send you a description). With this option, you can also do any kind of oversubscription if you want to. > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ > family=71&menu_section=49 > > 3. In my searching yesterday, I didn't find any FDR core/enterprise switches > with > 36 ports, other than the Mellanox SX6536. At 648 ports, the SX6536is > too big for my needs. I've got to be over looking other products, right? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ > family=122&menu_section=49 More options are getting out now. 324-port version will be available in a week, and the 216 few weeks after. Before the summer that 108 will be released. > 4. Adding an additional line card to my existing switch looks like it will cost > me only ~$5,000, and give me the additional capacity I'll need for the next 1- > 2 years. I'm thinking it makes sense to do that, and wait for affordable FDR > switches to come out with the port count I'm looking for instead of > upgrading to QDR right now, and start buying hardware with FDR HCAs in > preparation for that. Please feel free to agree/disagree. This brings me to my > next question... Depends what you want to build. You can take FDR today, build 2:1 oversubscription to get "QDR" throughput and this will be cheaper than using QDR switches. In any case, if you need any help on the negotiation side, let me know. > 5. FDR and QDR should be backwards compatible with my existing DDR > hardware, but how exactly does work? If I have, say an FDR switch with a > mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to the > lowest-common denominator, or will the slow-down be based on the two > nodes involved in the communication only? When I googled for an answer, > all I found were marketing documents that guaranteed backwards > compatibility, but didn't go to this level of detail, I searched the standard > spec (v1.2.1), and didn't find an obvious answer to this question. You can mix and match anything on the InfiniBand side. You can connect SDR, DDR, QDR and FDR and it all will work. When you do that, a direct connection between 2 ports will be run at the common denominator. So if you have FDR port connected to FDR port directly, it will run FDR. If you have DDR port connected directly to FDR port, that connection will run DDR. In your case, part of the fabric will run FDR, part will run DDR. > 6. I see some Mellanox docs saying their FDR switches are compliant with > v1.3 of the standard, but the latest version available for download is 1.2.1. I > take it the final version of 1.3 hasn't been ratified yet. 
Is that correct? 1.3 is the IBTA spec that includes FDR and EDR. The spec is completed, but not on the web site yet. > -- > Prentice > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Wed Apr 18 14:59:25 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 18 Apr 2012 14:59:25 -0400 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: References: <4F8ED821.5000204@ias.edu> Message-ID: <4F8F0F0D.5000500@ias.edu> Gilad, Thanks for the quick, helpful responses. See my in-line comments below. On 04/18/2012 11:27 AM, Gilad Shainer wrote: >> Beowulfers, >> >> I'm planning on adding some upgrades to my existing cluster, which has >> 66 compute nodes pluss the head node. Networking consists of a Cisco >> 7012 IB switch with 6 out of 12 line cards installed, giving me a capacity of 72 >> DDR ports, expandable to 144, and two 40-port ethernet switches that have >> only six extra ports between them. >> >> I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, and then >> begin adding/replacing nodes in the cluster. Obviously, I'll need to increase >> capacity of both my IB and ethernet networks. The questions I have are >> about upgrading my InifiniBand. >> >> 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox the only >> game in town these days? > Intel bought the QLogic InfiniBand business so this is a second option I searched both the QLogic and Intel websites for 'InfiniBand", and neither returned any hits yesterday. It makes sense that you can't find any IB info on QLogic's site anymore. Today, I was able to find the Link for Intel TrueScale InfiniBand products. Intel did a good job of hiding/burying the link under "More Products" on their Products pull-down menu. No idea why I couldn't find it by searching yesterday. Typo in search box, maybe? > >> 2. Due to the size of my cluster, it looks like buying a just a core/enterprise IB >> switch with capacity for ~100 ports is the best option (I don't expect my >> cluster to go much bigger than this in the next 4-5 years). Based on that >> criteria, it looks like the Mellanox >> IS5100 is my only option. Am I over looking other options? > You can also take 36 port switches, few more cables, and build the desired network size (for example for Fat Tree topology). It is easy to do, might be more cost effective. If you need help to design the topology (which ports connects to which port, I can send you a description). With this option, you can also do any kind of oversubscription if you want to. I was looking into a fat-tree topology yesterday. Considering the number of additional switches needed, and the cabling costs, I'm not sure it will really be cost effective. Just to stay at the same capacity I'm at now, 72 ports, I'd need to by 6 switches + cables. > >> http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ >> family=71&menu_section=49 >> >> 3. 
In my searching yesterday, I didn't find any FDR core/enterprise switches >> with > 36 ports, other than the Mellanox SX6536. At 648 ports, the SX6536is >> too big for my needs. I've got to be over looking other products, right? >> >> http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ >> family=122&menu_section=49 > More options are getting out now. 324-port version will be available in a week, and the 216 few weeks after. Before the summer that 108 will be released. That's in my timeframe, so I'll keep an eye on the Mellanox website. > >> 4. Adding an additional line card to my existing switch looks like it will cost >> me only ~$5,000, and give me the additional capacity I'll need for the next 1- >> 2 years. I'm thinking it makes sense to do that, and wait for affordable FDR >> switches to come out with the port count I'm looking for instead of >> upgrading to QDR right now, and start buying hardware with FDR HCAs in >> preparation for that. Please feel free to agree/disagree. This brings me to my >> next question... > Depends what you want to build. You can take FDR today, build 2:1 oversubscription to get "QDR" throughput and this will be cheaper than using QDR switches. In any case, if you need any help on the negotiation side, let me know. Thanks for the offer. If I decide to buy new switches instead of expanding my DDR switch, i'll e-mail you off-list. > >> 5. FDR and QDR should be backwards compatible with my existing DDR >> hardware, but how exactly does work? If I have, say an FDR switch with a >> mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to the >> lowest-common denominator, or will the slow-down be based on the two >> nodes involved in the communication only? When I googled for an answer, >> all I found were marketing documents that guaranteed backwards >> compatibility, but didn't go to this level of detail, I searched the standard >> spec (v1.2.1), and didn't find an obvious answer to this question. > You can mix and match anything on the InfiniBand side. You can connect SDR, DDR, QDR and FDR and it all will work. When you do that, a direct connection between 2 ports will be run at the common denominator. So if you have FDR port connected to FDR port directly, it will run FDR. If you have DDR port connected directly to FDR port, that connection will run DDR. In your case, part of the fabric will run FDR, part will run DDR. That's what I suspected. Thanks for the confirmation. > > >> 6. I see some Mellanox docs saying their FDR switches are compliant with >> v1.3 of the standard, but the latest version available for download is 1.2.1. I >> take it the final version of 1.3 hasn't been ratified yet. Is that correct? > > 1.3 is the IBTA spec that includes FDR and EDR. The spec is completed, but not on the web site yet. Ditto. -- Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Wed Apr 18 15:02:26 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 18 Apr 2012 15:02:26 -0400 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: References: <4F8ED821.5000204@ias.edu> Message-ID: <4F8F0FC2.8000001@ias.edu> Aggregation spine? Can you tell me more about that? Can you give me a part/model number? 
Prentice On 04/18/2012 11:22 AM, Andrew Howard wrote: > I would talk to Mellanox about your options for switch topology. We > opted not to go with the single 648-port FDR director switch, but > instead use top-of-rack leaf switches (the 36-port guys) and then an > aggregation spine to connect those. It performs beautifully. It also > means we don't have to worry about buying longer (more expensive) > cables to run to the director switch, we can buy the shorter cables to > run to the rack switch and then only have to buy a few 10M cables to > run to the spine. > > -- > Andrew Howard > HPC Systems Engineer > Purdue University > (765) 889-2523 > > > > On Wed, Apr 18, 2012 at 11:05 AM, Prentice Bisbal > wrote: > > Beowulfers, > > I'm planning on adding some upgrades to my existing cluster, which has > 66 compute nodes pluss the head node. Networking consists of a Cisco > 7012 IB switch with 6 out of 12 line cards installed, giving me a > capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet > switches that have only six extra ports between them. > > I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, > and then begin adding/replacing nodes in the cluster. Obviously, I'll > need to increase capacity of both my IB and ethernet networks. The > questions I have are about upgrading my InifiniBand. > > 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox > the only game in town these days? > > 2. Due to the size of my cluster, it looks like buying a just a > core/enterprise IB switch with capacity for ~100 ports is the best > option (I don't expect my cluster to go much bigger than this in the > next 4-5 years). Based on that criteria, it looks like the Mellanox > IS5100 is my only option. Am I over looking other options? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=71&menu_section=49 > > > 3. In my searching yesterday, I didn't find any FDR core/enterprise > switches with > 36 ports, other than the Mellanox SX6536. At 648 > ports, > the SX6536is too big for my needs. I've got to be over looking other > products, right? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=122&menu_section=49 > > > 4. Adding an additional line card to my existing switch looks like it > will cost me only ~$5,000, and give me the additional capacity > I'll need > for the next 1-2 years. I'm thinking it makes sense to do that, > and wait > for affordable FDR switches to come out with the port count I'm > looking > for instead of upgrading to QDR right now, and start buying hardware > with FDR HCAs in preparation for that. Please feel free to > agree/disagree. This brings me to my next question... > > 5. FDR and QDR should be backwards compatible with my existing DDR > hardware, but how exactly does work? If I have, say an FDR switch > with a > mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to > the lowest-common denominator, or will the slow-down be based on > the two > nodes involved in the communication only? When I googled for an > answer, > all I found were marketing documents that guaranteed backwards > compatibility, but didn't go to this level of detail, I searched the > standard spec (v1.2.1), and didn't find an obvious answer to this > question. > > 6. I see some Mellanox docs saying their FDR switches are > compliant with > v1.3 of the standard, but the latest version available for download is > 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is > that correct? 
> > -- > Prentice > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org > sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Shainer at Mellanox.com Wed Apr 18 15:07:33 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Wed, 18 Apr 2012 19:07:33 +0000 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: <4F8F0F0D.5000500@ias.edu> References: <4F8ED821.5000204@ias.edu> <4F8F0F0D.5000500@ias.edu> Message-ID: > Gilad, > > Thanks for the quick, helpful responses. See my in-line comments below. Thanks for the comments. > On 04/18/2012 11:27 AM, Gilad Shainer wrote: > >> Beowulfers, > >> > >> I'm planning on adding some upgrades to my existing cluster, which > >> has > >> 66 compute nodes pluss the head node. Networking consists of a Cisco > >> 7012 IB switch with 6 out of 12 line cards installed, giving me a > >> capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet > >> switches that have only six extra ports between them. > >> > >> I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, > >> and then begin adding/replacing nodes in the cluster. Obviously, I'll > >> need to increase capacity of both my IB and ethernet networks. The > >> questions I have are about upgrading my InifiniBand. > >> > >> 1. It looks like QLogic is out of the InfiniBand business. Is > >> Mellanox the only game in town these days? > > Intel bought the QLogic InfiniBand business so this is a second option > > I searched both the QLogic and Intel websites for 'InfiniBand", and neither > returned any hits yesterday. It makes sense that you can't find any IB info on > QLogic's site anymore. Today, I was able to find the Link for Intel TrueScale > InfiniBand products. Intel did a good job of hiding/burying the link under > "More Products" on their Products pull-down menu. No idea why I couldn't > find it by searching yesterday. > Typo in search box, maybe? > > > > >> 2. Due to the size of my cluster, it looks like buying a just a > >> core/enterprise IB switch with capacity for ~100 ports is the best > >> option (I don't expect my cluster to go much bigger than this in the > >> next 4-5 years). Based on that criteria, it looks like the Mellanox > >> IS5100 is my only option. Am I over looking other options? > > You can also take 36 port switches, few more cables, and build the desired > network size (for example for Fat Tree topology). It is easy to do, might be > more cost effective. If you need help to design the topology (which ports > connects to which port, I can send you a description). With this option, you > can also do any kind of oversubscription if you want to. > > I was looking into a fat-tree topology yesterday. Considering the number of > additional switches needed, and the cabling costs, I'm not sure it will really > be cost effective. Just to stay at the same capacity I'm at now, 72 ports, I'd > need to by 6 switches + cables. It depends on the system that you have, cable distance etc. In most cases it can be more cost effective, but it is easier to use one large switch. 
In any case, if you need help to find the best option, email me the topology > > > >> > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ > >> family=71&menu_section=49 > >> > >> 3. In my searching yesterday, I didn't find any FDR core/enterprise > >> switches with > 36 ports, other than the Mellanox SX6536. At 648 > >> ports, the SX6536is too big for my needs. I've got to be over looking other > products, right? > >> > >> > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_ > >> family=122&menu_section=49 > > More options are getting out now. 324-port version will be available in a > week, and the 216 few weeks after. Before the summer that 108 will be > released. > > That's in my timeframe, so I'll keep an eye on the Mellanox website. Sure. Feel free to email me directly, and I can connect you to the folks that can help > > > >> 4. Adding an additional line card to my existing switch looks like it > >> will cost me only ~$5,000, and give me the additional capacity I'll > >> need for the next 1- > >> 2 years. I'm thinking it makes sense to do that, and wait for > >> affordable FDR switches to come out with the port count I'm looking > >> for instead of upgrading to QDR right now, and start buying hardware > >> with FDR HCAs in preparation for that. Please feel free to > >> agree/disagree. This brings me to my next question... > > Depends what you want to build. You can take FDR today, build 2:1 > oversubscription to get "QDR" throughput and this will be cheaper than > using QDR switches. In any case, if you need any help on the negotiation > side, let me know. > > Thanks for the offer. If I decide to buy new switches instead of expanding my > DDR switch, i'll e-mail you off-list. > > > > >> 5. FDR and QDR should be backwards compatible with my existing DDR > >> hardware, but how exactly does work? If I have, say an FDR switch > >> with a mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow > >> down to the lowest-common denominator, or will the slow-down be > based > >> on the two nodes involved in the communication only? When I googled > >> for an answer, all I found were marketing documents that guaranteed > >> backwards compatibility, but didn't go to this level of detail, I > >> searched the standard spec (v1.2.1), and didn't find an obvious answer to > this question. > > You can mix and match anything on the InfiniBand side. You can connect > SDR, DDR, QDR and FDR and it all will work. When you do that, a direct > connection between 2 ports will be run at the common denominator. So if > you have FDR port connected to FDR port directly, it will run FDR. If you have > DDR port connected directly to FDR port, that connection will run DDR. In > your case, part of the fabric will run FDR, part will run DDR. > > That's what I suspected. Thanks for the confirmation. > > > > > >> 6. I see some Mellanox docs saying their FDR switches are compliant > >> with > >> v1.3 of the standard, but the latest version available for download > >> is 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is that > correct? > > > > 1.3 is the IBTA spec that includes FDR and EDR. The spec is completed, but > not on the web site yet. > > Ditto. 
> > -- > Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Wed Apr 18 15:45:50 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 18 Apr 2012 15:45:50 -0400 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: <4F8ED821.5000204@ias.edu> References: <4F8ED821.5000204@ias.edu> Message-ID: <4F8F19EE.20301@ias.edu> I just thought of something else... All of my current IB devices (switch, HCAs) are copper with CX4 connectors. It looks like all the Mellanox QDR and FDR cards use QSFP connectors, so that's something else I'll have to consider with my upgrade plans. -- Prentice On 04/18/2012 11:05 AM, Prentice Bisbal wrote: > Beowulfers, > > I'm planning on adding some upgrades to my existing cluster, which has > 66 compute nodes pluss the head node. Networking consists of a Cisco > 7012 IB switch with 6 out of 12 line cards installed, giving me a > capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet > switches that have only six extra ports between them. > > I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, > and then begin adding/replacing nodes in the cluster. Obviously, I'll > need to increase capacity of both my IB and ethernet networks. The > questions I have are about upgrading my InifiniBand. > > 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox > the only game in town these days? > > 2. Due to the size of my cluster, it looks like buying a just a > core/enterprise IB switch with capacity for ~100 ports is the best > option (I don't expect my cluster to go much bigger than this in the > next 4-5 years). Based on that criteria, it looks like the Mellanox > IS5100 is my only option. Am I over looking other options? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=71&menu_section=49 > > 3. In my searching yesterday, I didn't find any FDR core/enterprise > switches with > 36 ports, other than the Mellanox SX6536. At 648 ports, > the SX6536is too big for my needs. I've got to be over looking other > products, right? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=122&menu_section=49 > > 4. Adding an additional line card to my existing switch looks like it > will cost me only ~$5,000, and give me the additional capacity I'll need > for the next 1-2 years. I'm thinking it makes sense to do that, and wait > for affordable FDR switches to come out with the port count I'm looking > for instead of upgrading to QDR right now, and start buying hardware > with FDR HCAs in preparation for that. Please feel free to > agree/disagree. This brings me to my next question... > > 5. FDR and QDR should be backwards compatible with my existing DDR > hardware, but how exactly does work? If I have, say an FDR switch with a > mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to > the lowest-common denominator, or will the slow-down be based on the two > nodes involved in the communication only? When I googled for an answer, > all I found were marketing documents that guaranteed backwards > compatibility, but didn't go to this level of detail, I searched the > standard spec (v1.2.1), and didn't find an obvious answer to this question. 
> > 6. I see some Mellanox docs saying their FDR switches are compliant with > v1.3 of the standard, but the latest version available for download is > 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is > that correct? > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Wed Apr 18 15:42:12 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 18 Apr 2012 15:42:12 -0400 (EDT) Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: <4F8F0FC2.8000001@ias.edu> References: <4F8ED821.5000204@ias.edu> <4F8F0FC2.8000001@ias.edu> Message-ID: > Aggregation spine? Can you tell me more about that? Can you give me a > part/model number? spine is just the term for the trunk of a fat tree. usually the per-rack switches are called leaves since if nothing else, they may not be at the top of the rack, or there may be more than one per rack... the good thing about the leaf/spine approach is that it's modular, and possibly less vendor-locked-in. (not that IB is really multi-vendor anyway). cable-wise, I'm not sure leaf-spine really wins, since you can think of a chassis switch as a leaf-spine with FR4 rather than CX4. AFAIKT, the same radix-36 switch is used in each. spine/leave can be distributed so that there's no one place where you get too many cables. "less than fully fat" fabrics seem to be pretty common when taking the modular approach. for instance, if a 36x switch is split into 24 down (node) and 12 up, you can put two back-to-back, or have three going into a single spine switch, or more going into multiple spines. you could even have some racks with more spineward links. and in your case, you could vary the number of links going to your existing chassis switch (though it probably shouldn't be the spine since all its links are slower...) -mark _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Shainer at Mellanox.com Wed Apr 18 16:08:20 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Wed, 18 Apr 2012 20:08:20 +0000 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: <4F8F19EE.20301@ias.edu> References: <4F8ED821.5000204@ias.edu> <4F8F19EE.20301@ias.edu> Message-ID: All QDR and FDR is QSFP - not only Mellanox. There are QSFP to CX-4 cables if you need. Gilad -----Original Message----- From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Prentice Bisbal Sent: Wednesday, April 18, 2012 12:47 PM To: Beowulf Mailing List Subject: Re: [Beowulf] Questions about upgrading InfiniBand I just thought of something else... All of my current IB devices (switch, HCAs) are copper with CX4 connectors. It looks like all the Mellanox QDR and FDR cards use QSFP connectors, so that's something else I'll have to consider with my upgrade plans. 
-- Prentice On 04/18/2012 11:05 AM, Prentice Bisbal wrote: > Beowulfers, > > I'm planning on adding some upgrades to my existing cluster, which has > 66 compute nodes pluss the head node. Networking consists of a Cisco > 7012 IB switch with 6 out of 12 line cards installed, giving me a > capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet > switches that have only six extra ports between them. > > I'd like to add a Lustre filesystem (over InfiniBand) to my cluster, > and then begin adding/replacing nodes in the cluster. Obviously, I'll > need to increase capacity of both my IB and ethernet networks. The > questions I have are about upgrading my InifiniBand. > > 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox > the only game in town these days? > > 2. Due to the size of my cluster, it looks like buying a just a > core/enterprise IB switch with capacity for ~100 ports is the best > option (I don't expect my cluster to go much bigger than this in the > next 4-5 years). Based on that criteria, it looks like the Mellanox > IS5100 is my only option. Am I over looking other options? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_fami > ly=71&menu_section=49 > > 3. In my searching yesterday, I didn't find any FDR core/enterprise > switches with > 36 ports, other than the Mellanox SX6536. At 648 > ports, the SX6536is too big for my needs. I've got to be over looking > other products, right? > > http://www.mellanox.com/content/pages.php?pg=products_dyn&product_fami > ly=122&menu_section=49 > > 4. Adding an additional line card to my existing switch looks like it > will cost me only ~$5,000, and give me the additional capacity I'll > need for the next 1-2 years. I'm thinking it makes sense to do that, > and wait for affordable FDR switches to come out with the port count > I'm looking for instead of upgrading to QDR right now, and start > buying hardware with FDR HCAs in preparation for that. Please feel > free to agree/disagree. This brings me to my next question... > > 5. FDR and QDR should be backwards compatible with my existing DDR > hardware, but how exactly does work? If I have, say an FDR switch with > a mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down > to the lowest-common denominator, or will the slow-down be based on > the two nodes involved in the communication only? When I googled for > an answer, all I found were marketing documents that guaranteed > backwards compatibility, but didn't go to this level of detail, I > searched the standard spec (v1.2.1), and didn't find an obvious answer to this question. > > 6. I see some Mellanox docs saying their FDR switches are compliant > with > v1.3 of the standard, but the latest version available for download is > 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is > that correct? > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
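To make the mixed-speed answer above concrete: the rate is negotiated per physical link, so only links that terminate on a slower port drop down. A small illustrative model follows (Python); the per-lane rates are approximate signalling speeds in Gb/s and the example fabric is invented:

# Each link runs at the fastest rate both of its endpoints support; a DDR
# HCA somewhere in the fabric does not slow down an FDR-to-FDR link elsewhere.
RATE_GBPS = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0, "FDR": 14.0}

def negotiated(a, b):
    return a if RATE_GBPS[a] <= RATE_GBPS[b] else b

# Hypothetical FDR switch with a mix of HCAs attached:
for hca in ("FDR", "QDR", "DDR"):
    print("switch (FDR) <-> HCA (%s): link runs %s" % (hca, negotiated("FDR", hca)))
# -> FDR, QDR and DDR respectively; the rest of the fabric is unaffected.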
From prentice at ias.edu Wed Apr 18 16:16:47 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 18 Apr 2012 16:16:47 -0400 Subject: [Beowulf] Questions about upgrading InfiniBand In-Reply-To: References: <4F8ED821.5000204@ias.edu> <4F8F0FC2.8000001@ias.edu> Message-ID: <4F8F212F.7080205@ias.edu> On 04/18/2012 03:42 PM, Mark Hahn wrote: >> Aggregation spine? Can you tell me more about that? Can you give me a >> part/model number? > > spine is just the term for the trunk of a fat tree. usually the > per-rack switches are called leaves since if nothing else, they may > not be at the top of the rack, or there may be more than one per rack... Ahh... gotcha. In the previous e-mail, it sounded like a special line card or something from the context. Tripped up by terminology. > > the good thing about the leaf/spine approach is that it's modular, > and possibly less vendor-locked-in. (not that IB is really > multi-vendor anyway). > > cable-wise, I'm not sure leaf-spine really wins, since you can think > of a chassis switch as a leaf-spine with FR4 rather than CX4. AFAIKT, > the same radix-36 switch is used in each. > > spine/leave can be distributed so that there's no one place where you > get too many cables. My cluster is only 3 racks, with the head-node and IB switch in the middle rack, so the cable don't have too far to go, so switching to top-of-rack switches isn't that big of a deal for me. Of course, my cluster might expand as part of this upgrade. > > "less than fully fat" fabrics seem to be pretty common when taking the > modular approach. for instance, if a 36x switch is split into 24 down > (node) and 12 up, you can put two back-to-back, or have three going > into a single spine switch, or more going into multiple > spines. you could even have some racks with more spineward links. > and in your case, you could vary the number of links going to your > existing chassis switch (though it probably shouldn't be the spine > since all its links are slower...) > > -mark > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From samuel at unimelb.edu.au Wed Apr 18 21:53:51 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Thu, 19 Apr 2012 11:53:51 +1000 Subject: [Beowulf] Migrating from IB datagram mode to connected mode live ? Message-ID: <4F8F702F.7070208@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi folks, For hysterical raisins we have an IBM iDataPlex system which is running QDR IB in datagram mode. To that IB network we'll be adding another QDR system which can only run in connected mode. The kicker is that our IB network is used for GPFS over IPoIB and so our NSD's will need to move to connected mode for the new system. I've been Googling without success to find out if you can do such a migration live (i.e. change the servers to connected mode, increase their MTUs and then migrate clients to connected mode (we have enough redundancy in servers to do this) or whether we'll need to schedule an outage and take the whole system down and bring it back up in connected mode. Any thoughts? 
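For reference while reading the replies that follow: the knobs involved are per-interface sysfs attributes, so a staged migration is usually scripted node by node. A minimal sketch (Python, run as root), assuming the standard /sys/class/net/ibX/mode and /sys/class/net/ibX/mtu files and 65520 as the desired connected-mode MTU; whether flipping nodes live under GPFS is safe is exactly the open question in this thread, so treat it as an illustration rather than a recommendation:

# Flip one IPoIB interface to connected mode and raise its MTU, i.e. the
# scripted equivalent of: echo connected > /sys/class/net/ib0/mode
import pathlib
import sys

def set_ipoib(ifname="ib0", mode="connected", mtu=65520):
    base = pathlib.Path("/sys/class/net") / ifname
    if (base / "mode").read_text().strip() != mode:
        (base / "mode").write_text(mode + "\n")
    (base / "mtu").write_text("%d\n" % mtu)   # raise the MTU after the mode change
    return (base / "mode").read_text().strip(), int((base / "mtu").read_text())

if __name__ == "__main__":
    iface = sys.argv[1] if len(sys.argv) > 1 else "ib0"
    print(set_ipoib(iface))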
cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk+PcC8ACgkQO2KABBYQAh8wrwCghA14T85C0WIegdURbFtW5Spb mDMAn0k/HTHFEi1avoJlSidrWa5qNCjP =DBuj -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From h-bugge at online.no Thu Apr 19 03:04:53 2012 From: h-bugge at online.no (=?iso-8859-1?Q?H=E5kon_Bugge?=) Date: Thu, 19 Apr 2012 09:04:53 +0200 Subject: [Beowulf] Migrating from IB datagram mode to connected mode live ? In-Reply-To: <4F8F702F.7070208@unimelb.edu.au> References: <4F8F702F.7070208@unimelb.edu.au> Message-ID: echo connected > /sys/class/net/ib0/mode -h On 19. apr. 2012, at 03.53, Christopher Samuel wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi folks, > > For hysterical raisins we have an IBM iDataPlex system which is > running QDR IB in datagram mode. To that IB network we'll be adding > another QDR system which can only run in connected mode. > > The kicker is that our IB network is used for GPFS over IPoIB and so > our NSD's will need to move to connected mode for the new system. > > I've been Googling without success to find out if you can do such a > migration live (i.e. change the servers to connected mode, increase > their MTUs and then migrate clients to connected mode (we have enough > redundancy in servers to do this) or whether we'll need to schedule an > outage and take the whole system down and bring it back up in > connected mode. > > Any thoughts? > > cheers, > Chris > - -- > Christopher Samuel - Senior Systems Administrator > VLSCI - Victorian Life Sciences Computation Initiative > Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 > http://www.vlsci.unimelb.edu.au/ > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iEYEARECAAYFAk+PcC8ACgkQO2KABBYQAh8wrwCghA14T85C0WIegdURbFtW5Spb > mDMAn0k/HTHFEi1avoJlSidrWa5qNCjP > =DBuj > -----END PGP SIGNATURE----- > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From samuel at unimelb.edu.au Thu Apr 19 04:10:57 2012 From: samuel at unimelb.edu.au (Chris Samuel) Date: Thu, 19 Apr 2012 18:10:57 +1000 Subject: [Beowulf] Migrating from IB datagram mode to connected mode live ? 
In-Reply-To: References: <4F8F702F.7070208@unimelb.edu.au> Message-ID: <201204191810.57977.samuel@unimelb.edu.au> On Thursday 19 April 2012 17:04:53 H?kon Bugge wrote: > echo connected > /sys/class/net/ib0/mode Umm, yes, we know how to set it; the question is whether introducing nodes that have it set onto an IB fabric with nodes in datagram mode will cause issues and/or instability ? For instance can nodes in connected mode talk to nodes in datagram mode, and vice versa ? cheers, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From h-bugge at online.no Thu Apr 19 05:06:06 2012 From: h-bugge at online.no (=?iso-8859-1?Q?H=E5kon_Bugge?=) Date: Thu, 19 Apr 2012 11:06:06 +0200 Subject: [Beowulf] Migrating from IB datagram mode to connected mode live ? In-Reply-To: <201204191810.57977.samuel@unimelb.edu.au> References: <4F8F702F.7070208@unimelb.edu.au> <201204191810.57977.samuel@unimelb.edu.au> Message-ID: On 19. apr. 2012, at 10.10, Chris Samuel wrote: > On Thursday 19 April 2012 17:04:53 H?kon Bugge wrote: > >> echo connected > /sys/class/net/ib0/mode > > Umm, yes, we know how to set it; the question is whether introducing > nodes that have it set onto an IB fabric with nodes in datagram mode > will cause issues and/or instability ? > > For instance can nodes in connected mode talk to nodes in datagram > mode, and vice versa ? Datagram mode is required. For connected mode, the node is required to establish both an UD QP (for nodes not capable of connected mode and multicast) and an RC QP. During address resolution, the requester capability is included as part of the L2 address. So yes, it works. -h _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From raysonlogin at gmail.com Thu Apr 19 10:26:19 2012 From: raysonlogin at gmail.com (Rayson Ho) Date: Thu, 19 Apr 2012 10:26:19 -0400 Subject: [Beowulf] 2 Security bugs fixed in Grid Engine In-Reply-To: <4F90058A.3000900@brightcomputing.com> References: <4F90058A.3000900@brightcomputing.com> Message-ID: Right, the GE2011.11p1.patch diff is against GE2011.11. GE2011.11p1 (ie. trunk) is compatible with GE2011.11, and GE2011.11 is also compatible with SGE 6.2u5. I can quickly create a diff for GE2011.11 during lunch time today - will let you know when it is done. Rayson ================================= Open Grid Scheduler / Grid Engine http://gridscheduler.sourceforge.net/ Scalable Grid Engine Support Program http://www.scalablelogic.com/ On Thu, Apr 19, 2012 at 8:31 AM, Taras Shapovalov wrote: > Hi, > > I am trying to apply GE2011.11p1.patch for GE2011.11 and it fails. It seems, > the developers of GE have created this patch for the trunk version of GE > (which is not the same as the stable version). Is it correct? 
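A quick way to tell whether a diff was generated against the tree you actually have, before touching anything, is a dry run. A minimal sketch (Python wrapped around GNU patch); the patch and directory names are placeholders for whatever was downloaded and unpacked:

# Report whether a patch applies cleanly to a source tree without modifying it.
import os
import subprocess

def patch_applies(patch_file, src_dir, strip=1):
    result = subprocess.run(
        ["patch", "-p%d" % strip, "--dry-run", "-d", src_dir,
         "-i", os.path.abspath(patch_file)],
        capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

ok, log = patch_applies("GE2011.11p1.patch", "GE2011.11")   # placeholder paths
print("applies cleanly" if ok else "does not apply:\n" + log)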
> > -- > Best regards, > Taras > -- ================================================== Open Grid Scheduler - The Official Open Source Grid Engine http://gridscheduler.sourceforge.net/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From raysonlogin at gmail.com Thu Apr 19 13:22:05 2012 From: raysonlogin at gmail.com (Rayson Ho) Date: Thu, 19 Apr 2012 13:22:05 -0400 Subject: [Beowulf] 2 Security bugs fixed in Grid Engine In-Reply-To: References: <4F90058A.3000900@brightcomputing.com> Message-ID: Taras, Updated for GE2011.11: http://gridscheduler.sourceforge.net/security.html Note that with this patch, users won't be able to pass dangerous env. vars into the environment of epilog or prolog (and SGE's rshd, sshd, etc) via qsub -v or qsub -V . However, the user job environment is not affected. Also, any of those "dangerous" env. vars can be inherited from the execution daemon's original start environment (so if LD_LIBRARY_PATH is really needed, set it in the execution daemon's environment). Compare to other implementations, we think our fix is not intrusive at all. We have never seen any sites running epilog or prolog that needs users' LD_LIBRARY_PATH to function. Rayson ================================= Open Grid Scheduler / Grid Engine http://gridscheduler.sourceforge.net/ Scalable Grid Engine Support Program http://www.scalablelogic.com/ On Thu, Apr 19, 2012 at 10:26 AM, Rayson Ho wrote: > Right, the GE2011.11p1.patch diff is against GE2011.11. GE2011.11p1 > (ie. trunk) is compatible with GE2011.11, and GE2011.11 is also > compatible with SGE 6.2u5. > > I can quickly create a diff for GE2011.11 during lunch time today - > will let you know when it is done. > > Rayson > > ================================= > Open Grid Scheduler / Grid Engine > http://gridscheduler.sourceforge.net/ > > Scalable Grid Engine Support Program > http://www.scalablelogic.com/ > > > On Thu, Apr 19, 2012 at 8:31 AM, Taras Shapovalov > wrote: >> Hi, >> >> I am trying to apply GE2011.11p1.patch for GE2011.11 and it fails. It seems, >> the developers of GE have created this patch for the trunk version of GE >> (which is not the same as the stable version). Is it correct? >> >> -- >> Best regards, >> Taras >> > > > > -- > ================================================== > Open Grid Scheduler - The Official Open Source Grid Engine > http://gridscheduler.sourceforge.net/ -- ================================================== Open Grid Scheduler - The Official Open Source Grid Engine http://gridscheduler.sourceforge.net/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From raysonlogin at gmail.com Thu Apr 19 14:34:08 2012 From: raysonlogin at gmail.com (Rayson Ho) Date: Thu, 19 Apr 2012 14:34:08 -0400 Subject: [Beowulf] Next release of Open Grid Scheduler & the Gompute User Group Meeting Message-ID: The next release of Open Grid Scheduler/Grid Engine will be released at the Gompute User Group Meeting. 
The Gompute User Group Meeting is a free, 2-day, HPC event in Gothenburg, Sweden. Register for the event at: http://www.simdi.se/ ** Please let me know if you are interested in a Grid Engine track. Gridcore/Gompute contributed booth space at SC11 for the Grid Engine 2011.11 release (the first major release of open-source Grid Engine after separation from Oracle), and joined the Open Grid Scheduler project in April 2012. Rayson ================================= Open Grid Scheduler / Grid Engine http://gridscheduler.sourceforge.net/ Scalable Grid Engine Support Program http://www.scalablelogic.com/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Fri Apr 20 09:37:34 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Fri, 20 Apr 2012 09:37:34 -0400 Subject: [Beowulf] New industry for Iceland? Message-ID: <4F91669E.5050901@ias.edu> Combine this article: "A Cool Place for Cheap Flops" http://www.hpcwire.com/hpcwire/2012-04-11/a_cool_place_for_cheap_flops.html With this paper: "Relativistic Statistical Arbitrage" dspace.mit.edu/openaccess-disseminate/1721.1/62859 And it's looks like Iceland has a new industry: Datacenters for the high-frequency trading (HFT) gang. Just remember - you heard it here first, folks! ;) -- Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From samuel at unimelb.edu.au Fri Apr 20 03:47:46 2012 From: samuel at unimelb.edu.au (Chris Samuel) Date: Fri, 20 Apr 2012 17:47:46 +1000 Subject: [Beowulf] Migrating from IB datagram mode to connected mode live ? In-Reply-To: References: <4F8F702F.7070208@unimelb.edu.au> <201204191810.57977.samuel@unimelb.edu.au> Message-ID: <201204201747.46531.samuel@unimelb.edu.au> On Thursday 19 April 2012 19:06:06 H?kon Bugge wrote: > So yes, it works. Thanks, and also thanks to Gilad from Mellanox who put me in contact with another person who was able to answer this and other questions. One valuable thing I learnt was that IPoIB includes neighbor MTU information and so a system sending IP packets from a connected mode host to a datagram mode host will already know the destinations MTU. cheers, Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Fri Apr 20 20:02:40 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 21 Apr 2012 02:02:40 +0200 Subject: [Beowulf] New industry for Iceland? 
In-Reply-To: <4F91669E.5050901@ias.edu> References: <4F91669E.5050901@ias.edu> Message-ID: When I was in Iceland some years ago for a long-term meeting (no, not the start of Wikileaks - I was there for a less harmful reason) the thing I had feared happened - for some days while I was there, half of the internet addresses in mainland Europe were impossible to reach from Iceland. This happens regularly there. Luckily I had taken the server with me, thanks to www.hotels.nl for sponsoring that. Icelanders usually have 7 jobs and have more chess grandmasters per 100k inhabitants than any other nation; actually, tomorrow I might play an Icelander who emigrated to Europe. So the guy who's heading the new datacenter there is probably a busy man. Just read on... He'll first need to build a structure that can withstand the 3000+ small earthquakes a year or so that Iceland has, and then every component needed has to be imported, of course. A broken fuse? In Iceland that's BAD news - there might not be one in stock within a circle of 1000 kilometers around you :) Then, when something arrives at the airport, your datacenter equipment has to travel over the only road Iceland has. Now, if the datacenter is on that road, that's rather good news. If not, you'd better hope it was very well packaged, because on the rocky surface there everything trembles to pieces; that's probably why so many cars over there use those massive wheels - when driving over small rocks you feel it a tad less - but even those cars have problems with rocky surfaces with, say, 5 cm rocks. Only some bigger trucks, which they do not really have over there, can handle that - we have 1 such truck in the Netherlands; it joins the big races. This is for sure going to be the lowest-reliability type of datacenter, yet it would be typically Icelandic for the guy with the 7 jobs putting the datacenter together to get something up and running there :) Still, for volcanologists it's an interesting island. Maybe one of them is interested in visiting Iceland and paying 15 euros for a hamburger meal. Vincent On Apr 20, 2012, at 3:37 PM, Prentice Bisbal wrote: > Combine this article: > > "A Cool Place for Cheap Flops" > http://www.hpcwire.com/hpcwire/2012-04-11/ > a_cool_place_for_cheap_flops.html > > With this paper: > > "Relativistic Statistical Arbitrage" > dspace.mit.edu/openaccess-disseminate/1721.1/62859 > > And it's looks like Iceland has a new industry: Datacenters for the > high-frequency trading (HFT) gang. > > Just remember - you heard it here first, folks! ;) > > -- > Prentice > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Tue Apr 24 22:58:39 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Tue, 24 Apr 2012 22:58:39 -0400 (EDT) Subject: [Beowulf] yikes: intel buys cray's spine Message-ID: http://www.eetimes.com/electronics-news/4371639/Cray-sells-interconnect-hardware-unit-to-Intel that's one market where AMD no longer plays, eh? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From lindahl at pbm.com Wed Apr 25 02:52:20 2012 From: lindahl at pbm.com (Greg Lindahl) Date: Tue, 24 Apr 2012 23:52:20 -0700 Subject: [Beowulf] yikes: intel buys cray's spine In-Reply-To: References: Message-ID: <20120425065220.GB14230@bx9.net> > http://www.eetimes.com/electronics-news/4371639/Cray-sells-interconnect-hardware-unit-to-Intel This is a real surprise. Intel said then that the IB stuff they bought from QLogic/PathScale was intended for exascale computing. For this buy, Intel says: > "This deal does not affect our current Infiniband product plans and at > this moment we don't disclose future product plans related to acquired > assets," he added. And a Cray guy said, this time: > "If interconnects are being incorporated into processors, we want to > look at other areas where we can differentiate," Very interesting. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Wed Apr 25 04:52:11 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed, 25 Apr 2012 10:52:11 +0200 Subject: [Beowulf] yikes: intel buys cray's spine In-Reply-To: <20120425065220.GB14230@bx9.net> References: <20120425065220.GB14230@bx9.net> Message-ID: On Apr 25, 2012, at 8:52 AM, Greg Lindahl wrote: >> http://www.eetimes.com/electronics-news/4371639/Cray-sells- >> interconnect-hardware-unit-to-Intel > > This is a real surprise. Intel said then that the IB stuff they bought > from QLogic/PathScale was intended for exascale computing. For this > buy Intel says: > >> "This deal does not affect our current Infiniband product plans >> and at >> this moment we don't disclose future product plans related to >> acquired >> assets," he added. > > And a Cray guy said, this time: > >> "If interconnects are being incorporated into processors, we want to >> look at other areas where we can differentiate," > > Very interesting. > > -- greg Though these seem like contradictory statements, for a very huge company it's not a problem to own 2 different product lines. Being 'director' of a company like that is kind of like being a small manager within Intel. Yet managing these newly acquired Intel product lines requires a totally different sort of leadership, if I may say so. It's not babysitting a product - in the long term it requires a really innovative form of thinking. Such persons don't usually work as managers at a huge company. These huge companies are organized totally differently and have such creative persons at totally different spots in the organisation, requiring far more overhead in terms of the number of employees to get the same thing done. That means they need more turnover out of this business than Cray and QLogic had; meanwhile, the manager who after a while takes the spot of the current CEO will want to grow further and deeper into the giant company's business, so it's always a gamble what will end up at that spot to manage.
> > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From samuel at unimelb.edu.au Thu Apr 26 21:34:37 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Fri, 27 Apr 2012 11:34:37 +1000 Subject: [Beowulf] yikes: intel buys cray's spine In-Reply-To: <20120425065220.GB14230@bx9.net> References: <20120425065220.GB14230@bx9.net> Message-ID: <4F99F7AD.5010701@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 25/04/12 16:52, Greg Lindahl wrote: > And a Cray guy said, this time: > >>> "If interconnects are being incorporated into processors, we >>> want to look at other areas where we can differentiate," > > Very interesting. Yeah, I'd guess that AMD would be a little worried, perhaps they should look at buying Gnodal, where some of the Quadrics people ended up, to get in on that act.. :-) - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk+Z960ACgkQO2KABBYQAh/TqQCfS4V3sNu3pf7cOIOJbSgRrmPB KEAAoIzjmfcz9J+3ot1TYNhbC2DIOTy4 =ndpG -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From deadline at eadline.org Thu Apr 26 21:48:35 2012 From: deadline at eadline.org (Douglas Eadline) Date: Thu, 26 Apr 2012 21:48:35 -0400 Subject: [Beowulf] yikes: intel buys cray's spine In-Reply-To: <4F99F7AD.5010701@unimelb.edu.au> References: <20120425065220.GB14230@bx9.net> <4F99F7AD.5010701@unimelb.edu.au> Message-ID: <718e2c97ebdad95beba1e4c602c6848d.squirrel@mail.eadline.org> > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 25/04/12 16:52, Greg Lindahl wrote: > >> And a Cray guy said, this time: >> >>>> "If interconnects are being incorporated into processors, we >>>> want to look at other areas where we can differentiate," >> >> Very interesting. > > Yeah, I'd guess that AMD would be a little worried, perhaps they > should look at buying Gnodal, where some of the Quadrics people ended > up, to get in on that act.. :-) Some of things I read suggested this is about server fabrics. AMD's purchase of SeaMicro out from under Intel's arm may have had something to do with it. Intel needed a fabric for dense server boxes. I would think there may be a "back license" in there for Cray somewhere. Not sure this is an HPC play. 
-- Doug > > - -- > Christopher Samuel - Senior Systems Administrator > VLSCI - Victorian Life Sciences Computation Initiative > Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 > http://www.vlsci.unimelb.edu.au/ > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iEYEARECAAYFAk+Z960ACgkQO2KABBYQAh/TqQCfS4V3sNu3pf7cOIOJbSgRrmPB > KEAAoIzjmfcz9J+3ot1TYNhbC2DIOTy4 > =ndpG > -----END PGP SIGNATURE----- > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf