From kilian.cavalotti.work at gmail.com Wed Feb 1 03:57:47 2012 From: kilian.cavalotti.work at gmail.com (Kilian Cavalotti) Date: Wed, 1 Feb 2012 09:57:47 +0100 Subject: [Beowulf] rear door heat exchangers In-Reply-To: References: Message-ID: Hi Michael, On Tue, Jan 31, 2012 at 9:55 PM, Michael Di Domenico wrote: > i'm looking for, but have not found yet, a rear door heat exchanger > with fans. the door should be able to support up to 35kw using > chilled water. has anyone seen such an animal? Yep. Bull provides such doors, which can cool up to 40kW per rack. See http://www.bull.com/extreme-computing/cool-cabinet-door.html Cheers, -- Kilian _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From kilian.cavalotti.work at gmail.com Wed Feb 1 04:04:56 2012 From: kilian.cavalotti.work at gmail.com (Kilian Cavalotti) Date: Wed, 1 Feb 2012 10:04:56 +0100 Subject: [Beowulf] moderation - was cpu's versus gpu's - was Intel buys QLogic In-Reply-To: <97E75B730E3076479EB136E2BC5D410054874491@uos-dun-mbx1> References: <97E75B730E3076479EB136E2BC5D410054874491@uos-dun-mbx1> Message-ID: On Wed, Feb 1, 2012 at 1:18 AM, Herbert Fruchtl wrote: > 2) You need a moderator. It's quite some work, so it will only be done by somebody who gets some satisfaction out of it. This means that the job will attract exactly the kind of people who will not moderate neutrally and dispassionately. Even if they try, there's the fact that power corrupts. You're tempted to censor views that are too far from your own ("ludicrous" is the word you would use), and in the end you have an in-crowd confirming each other's views. I so agree.
> If you really find somebody's views (and their presentation) objectionable, just killfile them (it's called "filter" in the 21st century). And if certain people think ad hominem attacks help their case, ignore them instead of thinking you can look dignified in taking them on in their own game. You won't. Right. Simply ignoring posts from people you don't want to read is not so taxing, and it's also the best way to keep trolling attacks at a reasonable level. There's probably a dozen ways to automatically filter them, the easiest one being the old faithful eyeball grep, which can match a sender's name way before your conscious brain can realize it. Cheers, -- Kilian From john.hearns at mclaren.com Wed Feb 1 04:39:05 2012 From: john.hearns at mclaren.com (Hearns, John) Date: Wed, 1 Feb 2012 09:39:05 -0000 Subject: [Beowulf] rear door heat exchangers References: Message-ID: <207BB2F60743C34496BE41039233A8090AF320C0@MRL-PWEXCHMB02.mil.tagmclarengroup.com> > > i'm looking for, but have not found yet, a rear door heat exchanger > with fans. the door should be able to support up to 35kw using > chilled water. has anyone seen such an animal? > > most of the ones i've seen utilize a side car that sits beside the > rack. unfortunately, i'm space limited and i need something that will > hang on the back of the rack. SGI ICE clusters have chilled water rear doors just like that. Four radiator sections which hinge out so you can access the rear of the rack, and you can keep the rack running while you hinge out one door at a time.
No 'side car' cooling unit - you just couple it up to chilled water feed and return via flexible pipes. Also check out CO2 cooling from Trox http://www.troxaitcs.com/aitcs/products/CO2OLrac/index.html The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From mdidomenico4 at gmail.com Wed Feb 1 08:20:56 2012 From: mdidomenico4 at gmail.com (Michael Di Domenico) Date: Wed, 1 Feb 2012 08:20:56 -0500 Subject: [Beowulf] rear door heat exchangers In-Reply-To: References: Message-ID: On Tue, Jan 31, 2012 at 5:23 PM, wrote: > Hi, > > We have installed a lot of racks with rear door heat exchangers but these > are without fans instead using the in-server fans to push the air through > the element. We are doing this with ~20kW per rack. > > How the hell are you drinking 35kW in a rack? start working with GPU's... you'll find out real fast...
From mdidomenico4 at gmail.com Wed Feb 1 08:23:06 2012 From: mdidomenico4 at gmail.com (Michael Di Domenico) Date: Wed, 1 Feb 2012 08:23:06 -0500 Subject: [Beowulf] rear door heat exchangers In-Reply-To: References: Message-ID: On Tue, Jan 31, 2012 at 6:47 PM, Lux, Jim (337C) wrote: > Maybe there's an issue with the weight and/or flexible tubing on a swinging door? > > The Hoffman products in Andrew's email, I think, aren't the kind that hang on a door, more hang on the side of a large box/cabinet (Type 4, 12, 3R enclosure) or wall. > > They're also air/air heat exchangers or air conditioners (and vortex coolers.. but you don't want one of those unless you have a LOT of compressed air available) > > http://www.42u.com/cooling/liquid-cooling/liquid-cooling.htm > shows "in-row liquid cooling" but I think that's sort of in parallel > > They do mention, lower down on the page, "Rear Door Liquid Cooling" > But I notice that the Liebert XDF-5, which is basically a rack and chiller deck in one, only pulls out 14kW. > > From DoE: > http://www1.eere.energy.gov/femp/pdfs/rdhe_cr.pdf > > They refer to the ones installed at LBNL as RDHx units, but carefully avoid telling you the brand or any decent data. They do say they cost $6k/door, and suck up 10-11kW/rack with 9 gal/min flow of 72F water. > > Googling RDHx turns up "CoolCentric.com" > http://www.coolcentric.com/resources/data_sheets/Coolcentric-Rear-Door-Heat-Exchanger-Data-Sheet.pdf > > 33kW is as good as they can do. > > I also note that they have no fans in them. Yes, these are the doors we have now. I was trying to remain vendor agnostic on the list. We have them running well passively up to 25kw now.
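The DoE numbers quoted above (10-11 kW removed with 9 gal/min of 72F water) are easy to sanity-check with the basic heat-transfer relation Q = m_dot * c_p * dT. The sketch below is mine (the unit conversions and constants are not from the report); it also shows why a 35 kW rack at the same flow would need a much bigger water temperature rise, colder supply water, or more flow.

```python
# Sanity check: water temperature rise for a given heat load and flow.
# Q = m_dot * c_p * dT  =>  dT = Q / (m_dot * c_p)
GAL_TO_L = 3.785          # US gallons to liters
CP_WATER = 4186.0         # J/(kg*K), specific heat of water
RHO_WATER = 1.0           # kg/L, close enough near room temperature

def water_delta_t(load_watts, flow_gpm):
    """Temperature rise (K) of the cooling water for a given load."""
    m_dot = flow_gpm * GAL_TO_L * RHO_WATER / 60.0  # mass flow in kg/s
    return load_watts / (m_dot * CP_WATER)

print(round(water_delta_t(10_500, 9), 1))   # ~4.4 K for the DoE door
print(round(water_delta_t(35_000, 9), 1))   # ~14.7 K for a 35 kW rack
```

So at the DoE-reported flow, 35 kW implies roughly a 15 K water temperature rise, which is why doors rated that high typically spec higher flow rates or add fans to improve the air-side heat transfer.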
From eugen at leitl.org Wed Feb 1 08:33:04 2012 From: eugen at leitl.org (Eugen Leitl) Date: Wed, 1 Feb 2012 14:33:04 +0100 Subject: [Beowulf] moderation - was cpu's versus gpu's - was Intel buys QLogic In-Reply-To: References: <97E75B730E3076479EB136E2BC5D410054874491@uos-dun-mbx1> Message-ID: <20120201133304.GG7343@leitl.org> On Wed, Feb 01, 2012 at 10:04:56AM +0100, Kilian Cavalotti wrote: > Right. Simply ignoring posts from people you don't want to read about > is not so taxing, and it's also the best way to keep trolling attacks Utterly wrong. Empirically, key contributors will be the first to jump ship. Walking is easier than participating in a poorly managed forum. > at a reasonable level. There's probably a dozen ways to automatically > filter them, the easier one being the old faithful eyeball grep, which The point is that most people won't bother, and just leave. > can match a sender's name way before your conscious brain can realize > it. We don't seem to share the same reality.
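The killfile approach under discussion is trivial to automate. Here is a minimal sketch in Python (the addresses and messages are placeholders, not real list traffic); in practice this is what a procmail recipe or a mail client filter rule does for you.

```python
# Minimal killfile: drop messages from listed senders, keep the rest.
from email import message_from_string

KILLFILE = {"troll@example.org"}  # placeholder addresses

def keep(raw_message: str) -> bool:
    """Return True unless the From: header matches the killfile."""
    sender = message_from_string(raw_message).get("From", "")
    return not any(addr in sender for addr in KILLFILE)

msgs = [
    "From: troll@example.org\nSubject: flame\n\nyou are all wrong",
    "From: kilian@example.com\nSubject: rear doors\n\nBull has 40kW doors",
]
print([m.splitlines()[1] for m in msgs if keep(m)])
# ['Subject: rear doors']
```

The point of the "eyeball grep" remark stands, though: for low traffic, a human skimming sender names is faster to deploy than any of this.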
From landman at scalableinformatics.com Wed Feb 1 08:42:30 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Wed, 01 Feb 2012 08:42:30 -0500 Subject: [Beowulf] On filtering In-Reply-To: <20120201133304.GG7343@leitl.org> References: <97E75B730E3076479EB136E2BC5D410054874491@uos-dun-mbx1> <20120201133304.GG7343@leitl.org> Message-ID: <4F294146.4030500@scalableinformatics.com> We seem to have morphed from a technical/business discussion into a meta discussion on filtering. In an ironic development, I am seriously considering filtering this discussion. For those who want/demand a strong moderation hand, I simply don't see this happening. Eugen's point about the strong contributors leaving first doesn't appear to be the case here (or on any list I have ever been on over the past ... 20-ish years). Likewise, a strong moderation queue will do what it's done to other mailing lists, with moderators who have day jobs, and that's to pretty much kill the discussion. I can point to a number of lists where the moderation queue (used mostly for spam filtering) has worked against free-form discussion (as we have here). Some of the lists on bioinformatics.org specifically demonstrate that moderation isn't conducive to discussion. For those who don't want moderation and prefer local filtering ... procmail based, eyeball grep based (not egrep but close), the system will continue to function. Now that this is said, can we please .... PLEASE .... go back to our regularly scheduled cluster(s)? Please? -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc.
email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From eugen at leitl.org Wed Feb 1 09:04:43 2012 From: eugen at leitl.org (Eugen Leitl) Date: Wed, 1 Feb 2012 15:04:43 +0100 Subject: [Beowulf] Seamicro switches Atoms with Xeons in SM10000-XE; 64x Sandy Bridge in 10" Message-ID: <20120201140443.GI7343@leitl.org> http://www.seamicro.com/sm10000xe uses custom "Freedom" fabric more coverage in German at http://www.heise.de/newsticker/meldung/Server-packt-256-Xeon-Kerne-in-10-Hoeheneinheiten-1425949.html From hahn at mcmaster.ca Wed Feb 1 10:08:24 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 1 Feb 2012 10:08:24 -0500 (EST) Subject: [Beowulf] cloud: ho hum? Message-ID: in hopes of leaving the moderation discussion behind, here's a more interesting topic: cloud wrt beowulf/hpc. when I meet cloud-enthused people, normally I just explain how HPC clustering has been doing PaaS cloud all along. there are some people who run with it though: bioinformatics people mostly, who take personal affront to the concept of their jobs being queued.
(they don't seem to understand that queueing is a function of how efficiently utilized a cluster is, and since a cloud is indeed a cluster, you get queueing in a cloud as well.) part of the issue here seems to be that people buy into a couple fallacies that they apply to cloud: - private sector is inherently more efficient. this is a bit of a mystery to me, but I guess this is one of the great rhetorical successes of the neocon movement. I've looked at Amazon prices, and they are remarkably high - depending on purchasing model, about 20x higher than an academic-run research cluster. why is there not more skepticism of outsourcing, since it always means your cost includes one or more corporate profit margins? - economies of scale: people seem to think that a datacenter at the scale of google/amazon/facebook is going to be dramatically cheaper. while I'm sure they get a good deal from their suppliers, I also doubt it's game-changing. power, for instance, is a relatively modest portion of costs, ~10% per year of a server's purchase price. machineroom cost is pretty linear with number of nodes (power); people overhead is very small (say, > 1000 servers per FTE.) most of all, I just don't see how cloud changes the HPC picture at all. HPC is already based on shared resources handling burstiness of demand - if anything, cloud is simply slower. certainly I can't submit a job to EC2 that uses half the Virginia zone and expect it to run immediately. it's not clear to me whether cloud-pushers are getting real traction with the funding agencies (gov is neocon here in Canada.) it worries me that cloud might be framed as "better computing than HPC". I'm curious: what kind of cloudiness are you seeing? thanks, mark hahn.
From dag at sonsorol.org Wed Feb 1 10:37:30 2012 From: dag at sonsorol.org (Chris Dagdigian) Date: Wed, 01 Feb 2012 10:37:30 -0500 Subject: [Beowulf] cloud: ho hum? In-Reply-To: References: Message-ID: <4F295C3A.4030606@sonsorol.org> My $.02 from what I see in industry (life sciences) - The ability to transform capital expense money into OpEx money alone is pushing some cloud interest at high levels. No joke. Possibly a very large cloud interest driver in the larger organizations. This is also attractive for tiny startups and companies just leaving the VC incubation phase. - Deployment speed. We have customers who wait weeks after making an IT helpdesk request for a new VM to be created. Other customers take 1+ years to design, RFP and choose their HPC solution and another 4 months to deploy it. If you can do in minutes (via good DevOps techniques) what the IT organization normally takes weeks or months to do then you've got some good arguments for targeting cloud environments for quick dev, test and one-off scientific computing environments - Quick capability gains - in some cases it's quicker and easier to get quick access to GPUs, servers with 10GbE interconnects and well-built systems for running MapReduce style big data workflows on cloud platforms - Data exchange. Cloud is a good place for collaborators to meet and work together without punching massive holes in local firewalls. It's also a good place to either put data or get data from an outsourced provider or collaborator/partner. Many Genome Sequencing outsourcing companies can deliver your genomes directly to an EBS or AWS S3 bucket these days.
- I'm a believer in the pricing and economies of scale argument in some cases. For pricing take AWS S3 as an example - internal IT people who snipe at the pricing willfully (or not) seem to ignore the inconvenient fact that S3 does not acknowledge a successful object PUT request until the data has landed in 3 datacenters. If you want an honest cost comparison for cloud-based object storage then you have to start with legit fully-loaded cost estimates for deploying and running an internal petascale-capable system that spans three separate facilities. That ain't cheap. - Truthfully though I don't use or push cloud economic arguments all that much these days. It's incredibly easy to distort the numbers any way you want so it's rare to have a - Ability to do work that was not considered viable at home. The 90,000 core AWS Top500 cluster that was in the news is a good example. Some organizations have HPC or other problems of such scale that running them internally is not even on the radar. In rare cases spinning up something massive and exotic for a few days is a viable option. - Cyclical needs. Some of my customers have big compute needs that come about only every 3-4 years; most are looking at cloud now rather than buying local gear and seeing it depreciate or be under-utilized most of the time I agree that the cloud is overhyped and we certainly don't see a ton of HPC migrating entirely to the cloud. What we see in the trenches and out in the real world is significant interest in leveraging the cloud for Speed, Capability, Cost or "weird" use cases. -Chris
From hahn at mcmaster.ca Wed Feb 1 10:41:38 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 1 Feb 2012 10:41:38 -0500 (EST) Subject: [Beowulf] cloud: ho hum? In-Reply-To: <3E91C69ADC46C4408C85991183A275F4096A548F68@FRGOCMSXMB04.EAME.SYNGENTA.ORG> References: <3E91C69ADC46C4408C85991183A275F4096A548F68@FRGOCMSXMB04.EAME.SYNGENTA.ORG> Message-ID: > My take on it is if we've got a large, steady scientific HPC load then we'd >want in-house capacity to cover that. you mean "production", basically. mostly fixed in size/length. > But if we had a project that had small bursts of intense computation we >might prefer to find a larger pool of compute resource - cloud could be one >of the options. In fact cloud could well be the most straightforward. The >alternative might mean slowing down a key piece of R&D project work. you seem to be comparing to small HPC, sized to meet production demand. I'm not talking about that at all: I'm assuming, perhaps unwarrantedly, that most large HPC facilities are like ours, with some modest production demand, but with most of the workload already comprised of the interleaved bursts from thousands of researchers. > I'm not ignoring your points, I'm flagging up that our unusual burst of >demand might be someone else's minor blip. It then becomes worth our while >to offload that to an outsourced resource, whether cloud or not. afaict, you're just saying "bursts and production don't mix". that's true, but isn't it very small-scale? handling burstiness just means finding a deep enough pool. efficient use of that pool just means getting enough (hopefully independently timed) bursters. this is not an argument for outsourcing per se, or for private-sector somehow being more efficient. there also seems to be a bit of class-warfare surrounding this issue: the claim that "new" disciplines ("disciplines") like bioinformatics and big-data are poorly served by traditional HPC clusters. they seem to resent spending on interconnect, for instance.
to me, this seems like novices being obliviously ignorant - sure, QDR to each node seems like a waste of money, but once you get 12, 24, 32 cores per node, you're going to want to have something faster than Gb or 10Gb, even if you only ever use it for files, not MPI. (for that matter, I think there's a natural progression toward more complex processing as a field matures, which will lead fields that currently do serial farming towards "real" parallelism...) there's a nasty "don't give them money because they don't do it right" thing going on in the guise of cloud and (mostly) bioIT. regards, mark hahn. From landman at scalableinformatics.com Wed Feb 1 10:45:42 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Wed, 01 Feb 2012 10:45:42 -0500 Subject: [Beowulf] cloud: ho hum? In-Reply-To: References: Message-ID: <4F295E26.1050700@scalableinformatics.com> On 02/01/2012 10:08 AM, Mark Hahn wrote: > in hopes of leaving the moderation discussion behind, > here's a more interesting topic: cloud wrt beowulf/hpc. > > when I meet cloud-enthused people, normally I just explain how > HPC clustering has been doing PaaS cloud all along. there are some > people who run with it though: bioinformatics people mostly, who > take personal affront to the concept of their jobs being queued. Heh ... to put it mildly, this subset of HPC users tend to be more prone to fads than a fair number of others. As often as not, we have to work to solve the real problem in part by helping to unmask the real problem (and move past the perceptions of what some CS person told them the problems were).
> (they don't seem to understand that queueing is a function of how > efficiently utilized a cluster is, and since a cloud is indeed a > cluster, you get queueing in a cloud as well.) Sort of, but the illusion in a cloud is that it's all theirs, regardless of whether or not it's emulated/virtualized/bare metal. > > part of the issue here seems to be that people buy into a couple > fallacies that they apply to cloud: > - private sector is inherently more efficient. this is a bit > of a mystery to me, but I guess this is one of the great rhetorical > successes of the neocon movement. I've looked at Amazon prices, I'll ignore the obvious (and profoundly incorrect) political stance here, and focus upon the (failed) economic argument. Yes, the competitive private sector is *always* more efficient at delivering goods and services than the non-competitive government sector. The only time the private sector is less efficient is when there is no meaningful competition, then the consumers of a good or service will pay market pricing set, not by competitive forces, but by the preference of the dominant vendor which does not need to compete to win the business. For example, in desktop software environments, for the better part of 20 years, Microsoft has been the dominant player, and has had complete freedom to set whatever pricing it wishes. Now that it faces competitive pressure on several fronts, you are seeing pricing starting to react accordingly to market forces. Economics 101 applies: Competitive market forces enable efficient markets. Non-competitive market forces don't. > and they are remarkably high - depending on purchasing model, > about 20x higher than an academic-run research cluster. why is there Hmmm ... I don't think you are taking everything into account, and more to the point, you are comparing apples to oranges. Compare Amazon to CRL to Joyent to Sabalcore to ... . You will find competitive pricing among these for similar use cases.
In all cases, your up-front costs and recurring costs are capped. You want to use 10k nodes for 1 hour, you can. And it won't cost you 10k nodes of capital + infrastructure, power, cooling, ... to make it happen. You want 10k nodes for one hour at an academic site? Get in line, and someone has to have laid out the capex for all of this. Just because you don't see this direct cost, or the chargeback to you as an end user doesn't reflect a cost recovery and a profit (latter being irrelevant for most academic sites) doesn't mean it "costs 1/20 as much". It means you haven't accounted for the real costs correctly. > not more skepticism of outsourcing, since it always means your cost > includes one or more corporate profit margins? ... and is corporate profit a bad thing? Seriously? There is a cost associated with you not taking the capital charge for the systems you use, or for the OPEX of using them. Or for the other indirect costs surrounding the rest of this. You are paying for the privilege of keeping your costs down. So, for an academic user that has to obtain 10k CPU hours on 1000 CPUs, in order to solve their problem, they can a) sign on to and get a grant for SHARCNET and others, which involve some sort of chargeback mechanism (because SHARCNET and others have to pay for their power, cooling, data, people) b) build their own cluster (which makes sense only if you do many runs), c) buy it from Amazon/CRL/Sabalcore/... and only pay for what they use and start running right away. So which one makes the most sense? Rhetorical question to a degree as it depends strongly upon the use case, the grant needs, etc. > > - economies of scale: people seem to think that a datacenter at the > scale of google/amazon/facebook is going to be dramatically cheaper. It generally is. > while I'm sure they get a good deal from their suppliers, I also > doubt it's game-changing. power, for instance, is a relatively > modest portion of costs, ~10% per year of a server's purchase price.
Then why do Google et al colocate their data centers near cheap power if power is only a modest/minute fraction of the total cost? TCO matters, and if you have to pay for power 24x7 during the life of the system, you want to minimize this cost. Multiply the cost of power for 1 server by 100k, add in other bits, and this modest fraction starts adding up to significant amounts (and fractions of the total cost), very quickly. It can be game changing. Which is why they locate their data centers where there is an optimum (minimizing total lifetime costs of power, taxes, etc.) as compared with the nearby data center where you pay a premium for convenience. > machineroom cost is pretty linear with number of nodes (power); > people overhead is very small (say,> 1000 servers per fte.) > > most of all, I just don't see how cloud changes the HPC picture at all. > HPC is already based on shared resources handling burstiness of demand - Not all HPC is this way. Actually most isn't. > if anything, cloud is simply slower. certainly I can't submit a job to > EC2 that uses half the Virgina zone and expect it to run immediately. > it's not clear to me whether cloud-pushers are getting real traction with > the funding agencies (gov is neocon here in Canada.) it worries me that > cloud might be framed as "better computing than HPC". Hmmm. > > I'm curious: what kind of cloudiness are you seeing? Quite a bit. People are looking at clouds for private use with trivial extension to public usage for computing. We are seeing huge amounts of private storage cloud builds. Cloud is ASP v3 (or v4 if you count clusters). In ASPs, large external high cost gear was centralized. Economics simply didn't work for it and this model died. Clusters started around then. Grid/Utility Computing started around then, and Amazon launched their offering at the notional end of this market. Grid was largely a bust from a commercial view, as it again had bad economics. Clusters were in full blossom then.
Economics favored them. If you like to look at Clusters as ASP v3, you can, though they've been running alongside the fads. Cloud is ASP v3 or v4 (if you say clusters were v3). Natural evolution of taking a cluster, putting a VM on demand on it, or running something bare metal on it. Where it's located matters to a degree, and data motion is still the hardest problem, and it's getting harder. This is why private data clouds (and computing clouds) are getting more popular. This said, like all other fads/trends, Cloud is (massively over-)hyped. It has value, it has staying power (unlike grid, ASP, ...). It solves a specific set of problems, and does so well, and you pay a premium for solving that set of problems in that manner. We see more folks building private clouds (e.g. clusters with more intelligent allocation/provisioning) than we do see people run exclusively on the cloud. In financial services, we've had customers tell us how wonderful it was (from a convenience view) and how awful it was (from a performance view). It matters more to people who care about getting cycles than for people who care about getting really good single CPU performance. Cloud is a throughput engine, and this mode of operation is becoming more important over time. Even in HPC. Especially with BigData (hey, wanna talk about a massively over-hyped term? There's one for ya ... the hype masks the real issues, and this is a shame, but such is life). And for what it's worth, VC's are positively throwing money at cloud/big data companies. This doesn't make it better. Probably worse. But that's a whole other discussion. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc.
email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From hahn at mcmaster.ca Wed Feb 1 10:59:33 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 1 Feb 2012 10:59:33 -0500 (EST) Subject: [Beowulf] cloud: ho hum? In-Reply-To: <4F295C3A.4030606@sonsorol.org> References: <4F295C3A.4030606@sonsorol.org> Message-ID: > - The ability to transform capital expense money into OpEx money alone > is pushing some cloud interest at high levels. No joke. Possibly a very > large cloud interest driver in the larger organizations. This is also > attractive for tiny startups and companies just leaving the VC > incubation phase. why is that? in a simple example, EC2 m1.small on-demand costs $745 per ecu-year; a $3k server gets you about 18 ecu-years and you can run it for at least 3 years. going EC2 means buying the server over again about four times during its life. obviously a workload with steep, narrow and sparse demand will prefer to rent - is that it? (I'm not clear on why workloads would be like that...) > - Deployment speed. We have customers who wait weeks after making an IT > helpdesk request for a new VM to be created. Other customers take 1+ no. there's nothing technical here: dysfunctional IT orgs should simply be fixed. outsourcing as a workaround for BOFHishness is stupid...
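The rent-vs-buy arithmetic in Mark's example can be made explicit. The sketch below uses his figures ($745 per ECU-year for m1.small on demand, a $3k server delivering ~18 ECU-years over a 3-year life); the assumption that the owned server runs flat out for its whole life is implicit in those numbers.

```python
# Back-of-envelope rent-vs-buy, using the numbers from the post above.
EC2_USD_PER_ECU_YEAR = 745.0   # m1.small on-demand (2012 pricing)
SERVER_USD = 3000.0            # purchase price of a commodity server
SERVER_ECU_YEARS = 18.0        # ~6 ECU over a 3-year service life

ec2_equiv_cost = EC2_USD_PER_ECU_YEAR * SERVER_ECU_YEARS
ratio = ec2_equiv_cost / SERVER_USD

print(ec2_equiv_cost)    # 13410.0 -- EC2 cost for the same capacity
print(round(ratio, 1))   # 4.5 -- you buy the server ~4.5 times over
```

This ignores power, cooling and admin on the buy side, which narrows the gap, but the underlying point survives: for a steadily utilized machine, on-demand pricing carries a multiple-of-hardware-cost premium.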
> - Quick capability gains - in some cases it's quicker and easier to get > quick access to GPUs, servers with 10Gbe interconnects and well-built > systems for running MapReduce style big data workflows on cloud platforms again, this only makes sense if your demand is impulse-like. is that actually the case? > - Data exchange. Cloud is a good place for collaborators to meet and > work together without punching massive holes in local firewalls. It's again, fire your BOFHish IT. > provider or collaborator/partner. Many Genome Sequencing outsourcing > companies can deliver your genomes directly to an EBS or AWS S3 bucket > these days. interesting, but is this really a concern? how big is a genome "these days"? > - I'm a believer in the pricing and economies of scale argument in some > cases. For pricing take AWS S3 as an example - internal IT people who > snipe at the pricing willfully (or not) seem to ignore the inconvenient > fact that S3 does not acknowledge a successful object PUT request until > the data has landed in 3 datacenters. you lost me there. do you mean your in-house IT can't do reliable storage? > If you want an honest cost comparison for cloud-based object storage > then you have to start with legit fully-loaded cost estimates for > deploying and running an internal petascale-capable system that spans > three separate facilities. That ain't cheap. you mean "granularity is large", I guess. obviously, it _is_ cheap: anything above ~10 racks is linear (I claim). > - Truthfully though I don't use or push cloud economic arguments all > that much these days. It's incredibly easy to distort the numbers anyway > you want so it's rare to have a it's just that I can't figure out any way to make costs of running EC2 more than about $80 per ecu-year (vs even spot pricing which is $237). are you suggesting that ec2 compute costs are subsidizing the storage and transfer facilities (in spite of Amazon having separate prices for store/transfer)?
> - Ability to do work that was not considered viable at home. The 90,000 > core AWS Top500 cluster that was in the news is a good example. Some > organizations have HPC or other problems of such scale that running them > internally is not even on the radar. In rare cases spinning up something > massive and exotic for a few days is a viable option. I'd love to hear of a case that wasn't a PR stunt... > - Cyclical needs. Some of my customers have big compute needs that come > about only every 3-4 years; most are looking at cloud now rather than > buying local gear and seeing it depreciate or be under-utilized most of > the time seems weird to me. thanks! From james.p.lux at jpl.nasa.gov Wed Feb 1 11:03:25 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 1 Feb 2012 08:03:25 -0800 Subject: [Beowulf] cloud: ho hum? In-Reply-To: Message-ID: On 2/1/12 7:08 AM, "Mark Hahn" wrote: >in hopes of leaving the moderation discussion behind, >here's a more interesting topic: cloud wrt beowulf/hpc. > >when I meet cloud-enthused people, normally I just explain how >HPC clustering has been doing PaaS cloud all along. there are some >people who run with it though: bioinformatics people mostly, who >take personal affront to the concept of their jobs being queued. >(they don't seem to understand that queueing is a function of how >efficiently utilized a cluster is, and since a cloud is indeed a >cluster, you get queueing in a cloud as well.) > >part of the issue here seems to be that people buy into a couple >fallacies that they apply to cloud: > - private sector is inherently more efficient.
this is a bit > of a mystery to me, but I guess this is one of the great rhetorical > successes of the neocon movement. I've looked at Amazon prices, > and they are remarkably high - depending on purchasing model, > about 20x higher than an academic-run research cluster. why is there > not more skepticism of outsourcing, since it always means your cost > includes one or more corporate profit margins? 'twas ever thus, I suspect. We get the same thing at JPL. Whatever potential inefficiencies there are with academically oriented or government toilers, the fact that we are non-profit means instantly that we have a 10% advantage. But we have an overhead of proving we're not ripping off the taxpayer, and that probably eats up the advantage. That said, private industry does have some advantages in some circumstances: They are probably more nimble when it comes to ramping up manufacturing. There are definitely inefficiencies in government work, because of the increased scrutiny that expenditures of tax dollars get. We bear a heavier burden of proving that we're getting what we paid for, that the procurement was free and unbiased, etc. Those $1000 hammer stories are a case in point. There are numerous common business-to-business practices that are outright illegal when done in a business-to-government context. You can argue about whether the practices are moral or ethical, but the fact of the matter is that things like finder's fees, profit as a fixed percentage of job cost, etc. are all perfectly legal and common in business. There are probably some aspects of this that allow business to perform some task cheaper than government can, at least in the short run. That is, business can externalize some of the costs, while government cannot. These days, though, industry is paying more for software talent than the government is (you won't see JPL or civil service offering fresh-out CS majors $100k/yr + 50k hire bonus + 100k RSU like facebook is).
I think that when all is said and done, it's about the same. After all, everyone is buying the same sand and the same people to do the work. Any differences are really small scale arbitrage opportunities. > > - economies of scale: people seem to think that a datacenter at the > scale of google/amazon/facebook is going to be dramatically cheaper. > while I'm sure they get a good deal from their suppliers, I also > doubt it's game-changing. power, for instance, is a relatively > modest portion of costs, ~10% per year of a server's purchase price. > machineroom cost is pretty linear with number of nodes (power); > people overhead is very small (say, > 1000 servers per fte.) I suspect that there's a sort of middle ground where "clouding" or "co-lo hosting" or "rent a rack" is cheaper. Someone who has a need for say, 10-50 machines. That's really not enough to justify a built-in infrastructure, but it's too big to "have the receptionist manage it". The folks running 1000s of servers, they've got the economy of scale built in, so they'll be making their choice upon small optimizations (cheaper to buy Amazon time because our electricity rates happen to be high right now) or because they have a wildly fluctuating need (we need 10,000 CPUs this week, but none for the next 3 weeks after that). But there are thousands and thousands of medium sized organizations that could probably benefit from "someone else" providing the computing infrastructure. Think of some manufacturing and design company that makes widgets, but needs some server horsepower to do whatever it is. Their business isn't doing sys admin, backups, etc. They can usefully outsource that to "the cloud" and focus their efforts on their core competencies. They can work a deal where someone else does the off-site backups, etc. and they don't have to worry about it.
(yes, they could also hire a consulting company to do much of this as well, but for "commodity computing" maybe a generic provider "the cloud" is a better solution.) > >most of all, I just don't see how cloud changes the HPC picture at all. >HPC is already based on shared resources handling burstiness of demand - >if anything, cloud is simply slower. certainly I can't submit a job to >EC2 that uses half the Virgina zone and expect it to run immediately. >it's not clear to me whether cloud-pushers are getting real traction with >the funding agencies (gov is neocon here in Canada.) it worries me that >cloud might be framed as "better computing than HPC". > >I'm curious: what kind of cloudiness are you seeing? We've got a big "use the cloud" thing going on at JPL (and within NASA as well). To a certain extent, I think (personal opinion here, not JPL's or NASA's) it's a "everyone is talking about cloud, so we better do something with it, so at least we can comment intelligently". But it's also useful for bursty load. We have a real problem with physical space for more computers amid our aging infrastructure (most of our buildings are 40-50 years old) and the need for "I must have my hands on the physical box" is going away, as the interface mechanisms get smoother and cleaner. It's all about control, after all.. I, who strongly advocate personal supercomputers under your desk, because nobody is looking over your shoulder trying to optimize their utilization, find that the concept of smoothly divisible and scalable compute power available with a network connection is pretty close to what you want. 
The "external control and optimization" aspects that prompt my desire for personal computing come about when the cost granularity of the system is sufficiently coarse that a bureaucracy springs up to manage the system, which inevitably means that the "transaction cost" to get an increment of computation goes up, and they impose a minimum transaction size that is substantially larger than my "incremental need". Example using test equipment. I might want to use a $100,000 spectrum analyzer for half a day. That's a $5,000/month kind of rental, with a 2 month minimum. I'd happily pay the $50-100 for a half day's use, but because the system doesn't accommodate short usage, I'm stuck for $10,000 to do my measurement which is worth $100. And there's no effective way for me to resell the extra 60 days worth of spectrum analyzer availability. This is because my need patterns are mismatched to the supply patterns. The cloud concept has definitely worked to reduce the "transaction cost". You can buy an hour's time on 100 CPUs, pretty easily. Nobody is coming after you to help chip in for the capital cost on the machine room, or asking you to buy a month's worth of time. And, I think that in the HPC world in general, this sort of model has already existed (and heck, it goes way back to when IBM didn't sell computers, they sold CPU seconds and Core seconds and I/O Channel seconds). But it is totally unfamiliar to a lot of current IT people, who have never worked with "timesharing" systems. Their conceptual models are built around "buy a PC or three or hundred" and then scaled to "buy racks of servers and put them in a room" or, "get a loan to buy 1000 servers and put them in a room", or, perhaps "lease 1000 servers and hire a room"... All of those are really based upon "buying" (in some sense) a physical box and providing for it's care and feeding. The big difference in cloud is that you are buying "service" on a fine scale. 
And the term cloud is just a wonderful sexy marketing description that no-doubt came from someone looking at network diagrams. There has been that "cloud" bubble around for decades to represent "stuff over which we don't have control nor do we care, it's just there and outside our domain" From james.p.lux at jpl.nasa.gov Wed Feb 1 11:23:16 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 1 Feb 2012 08:23:16 -0800 Subject: [Beowulf] cloud: ho hum? In-Reply-To: Message-ID: On 2/1/12 7:59 AM, "Mark Hahn" wrote: > >> - Deployment speed. We have customers who wait weeks after making an IT >> helpdesk request for a new VM to be created. Other customers take 1+ > >no. there's nothing technical here: dysfunctional IT orgs should simply >be fixed. outsourcing as a workaround for BOFHishness is stupid... > The IT org in this situation isn't necessarily dysfunctional. Say you're an R&D group of 100 people in a company with 200k employees. Their IT org is optimized for the 200k, not for the 100. Outsourcing is a logical choice here. (this is the specialization, vertical vs horizontal integration, etc. discussion). Yes, there are inefficient service organizations everywhere, and there always will be. The hardest thing for project managers to learn is that you MUST plan for average, not above average, performance. The fact that sometimes you get above average helps counteract the unknowable problems that result in below average. Example from NASA.. Pathfinder put a rover on Mars for (ostensibly) $25M and set a mind bendingly aggressively low bar for future missions.
That's not because Pathfinder was particularly well managed (it was well managed, but that's not why the cost was low).. It's more because of a happy coincidence of lots of circumstances that made something that realistically should have cost around $150M cost 1/6 of that. They got lucky with people to work on it, they got lucky with spare parts from other missions, they got lucky in being small, so avoiding a lot of oversight costs. Next Mars missions in 1998.. Hey Faster, Better, Cheaper, we can do it again. We'll put TWO probes at Mars for the cost of one $100M mission. Oops, one crashed into the surface, the other missed orbit injection and probably burned up. Much soul searching and reflection.. Next Mars mission (MER 2003) costs over $1B for two rovers. (and you can bet there was a LOT more reviews and oversight) MER got unlucky, in a lot of ways. Original estimates of costs (from Pathfinder) turned out to be inappropriate (some examples below). But the real story is that Pathfinder happened to be out on the tail of the probability distribution of cost, and MER was more in the middle. Pathfinder's probability of failure was MUCH higher than MERs. - You can't just scale up airbags and parachutes - The fast, low documentation approach of Pathfinder means you don't actually have drawings from which you can build stuff with no changes. - Parts that survived for Pathfinder, when actually tested for environments, had a high probability of failing, so Pathfinder "got lucky" and the parts had to get redesigned. - MER was a lot bigger, so the "average" performance of the team inevitably showed the applicability of the central limit theorem. - MER was a lot bigger, so the N^k, where k>1, communications costs rose faster than the job size. - As the job costs more, it gets more attention, so more management controls and reviews are put into place. 
There's a big difference between a failure of a mission flying one or two instruments on a cheap and cheerful rover assembled from commercial parts and flying a dozen instruments on a $400M rover. From Greg at Keller.net Wed Feb 1 13:27:20 2012 From: Greg at Keller.net (Greg Keller) Date: Wed, 1 Feb 2012 12:27:20 -0600 Subject: [Beowulf] cloud: ho hum? Message-ID: I Sell HPC Cycles over the Internet > Date: Wed, 1 Feb 2012 10:08:24 -0500 (EST) > From: Mark Hahn > Subject: [Beowulf] cloud: ho hum? > - private sector is inherently more efficient. this is a bit > of a mystery to me, but I guess this is one of the great rhetorical > successes of the neocon movement. It depends on who's accounting for what. Businesses typically have to include Power, Tax and Higher markups from Vendors for HW and SW than Gov't and Academics. Also NSF and others, at least on paper, require them to charge no more than "cost" when selling cycles, as I understand it. So they should always be cheaper if the system is 100% utilized over the life of the system. > I've looked at Amazon prices, > and they are remarkably high - depending on purchasing model, > about 20x higher than an academic-run research cluster. why is there > not more skepticism of outsourcing, since it always means your cost > includes one or more corporate profit margins? Please don't consider Amazon pricing the baseline or norm for "HPC in the Cloud", especially on price/performance. They do a great job at what they do well, but in this instance they actually poison the market because the price/performance is so bad for many workloads. > > - economies of scale: people seem to think that a datacenter at the > scale of google/amazon/facebook is going to be dramatically cheaper. > while I'm sure they get a good deal from their suppliers, I also > doubt it's game-changing. power, for instance, is a relatively > modest portion of costs, ~10% per year of a server's purchase price. > machineroom cost is pretty linear with number of nodes (power); > people overhead is very small (say, > 1000 servers per fte.) There is also a significant penalty for any provider that builds first and sells second, which negates much of the "economy of scale". It's like buying hard drive space 2 years in advance: if you wait and buy in smaller chunks as needed, you will end up with a lot more space over the 2 years for the same spend. > > most of all, I just don't see how cloud changes the HPC picture at all. > HPC is already based on shared resources handling burstiness of demand - > if anything, cloud is simply slower. certainly I can't submit a job to > EC2 that uses half the Virginia zone and expect it to run immediately. > it's not clear to me whether cloud-pushers are getting real traction with > the funding agencies (gov is neocon here in Canada.) it worries me that > cloud might be framed as "better computing than HPC". When done well, it's a continuation of a longer-term trend: researchers build their own cluster and don't share... then they put them together in departments and team-share... then those get pulled into enterprise-scale systems... then enterprises "share" discretely/blindly through time-sharing at some external provider. > > I'm curious: what kind of cloudiness are you seeing? Most organizations that have a large enough continuous baseline load can probably save money doing the baseline in-house and "bursting" to providers for special projects and anything that's speculative enough that they may not be doing it for 3-5 years.
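[Editor's note] Greg's build-first-vs-buy-as-needed point is easy to quantify. A sketch assuming a steady decline in disk $/TB; the decline rate, starting price, and budget are illustrative assumptions, not figures from the thread.

```python
# Same budget, two purchasing strategies: all up front vs. spread over
# two years while $/TB falls. All numbers below are assumptions.
ANNUAL_PRICE_DECLINE = 0.30     # assumed: $/TB drops ~30%/year
START_PRICE_PER_TB = 100.0      # assumed starting price, USD/TB
BUDGET = 8000.0                 # total spend over two years
QUARTERS = 8                    # buy in quarterly chunks

upfront_tb = BUDGET / START_PRICE_PER_TB
as_needed_tb = 0.0
for q in range(QUARTERS):
    # price at the start of quarter q, with smooth exponential decline
    price = START_PRICE_PER_TB * (1 - ANNUAL_PRICE_DECLINE) ** (q / 4)
    as_needed_tb += (BUDGET / QUARTERS) / price

print(f"Up front: {upfront_tb:.0f} TB; quarterly: {as_needed_tb:.0f} TB "
      f"({as_needed_tb / upfront_tb:.2f}x)")
```

Under these assumptions the incremental buyer ends up with roughly 40% more capacity for the same spend, which is the penalty a build-first provider has to recover in its pricing.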
If your admin costs as much as your 16-node cluster (because they read this list and are awesome), you may be better off outsourcing even the baseline. Ridiculous overhead, space, delay, or power costs can also help make outsourcing HPC a better use of budget. If you outsource everything and don't have redundant providers, you may be adding risk to your organization to save a little money. Once a few independent providers support similar access and control systems, you could have redundant providers and shift workloads. If your competitor buys your current provider and shuts it down Oracle-style, you still have a system to run on while you set up at a new redundant provider. This is IMHO the key limiter of HPC Outsourcing growth, and for good reasons. Some of our early adopters have no choice but to go external because the budget doesn't allow for a purchase big enough to meet a short-term project's needs, so the redundancy risk of outsourcing is negated. > > thanks, mark hahn. > Also, a few points in reply to Chris... > Date: Wed, 01 Feb 2012 10:37:30 -0500 > From: Chris Dagdigian > Subject: Re: [Beowulf] cloud: ho hum? > My $.02 from what I see in industry (life sciences) > > - The ability to transform capital expense money into OpEx money alone > is pushing some cloud interest at high levels. No joke. Possibly a very > large cloud interest driver in the larger organizations. This is also > attractive for tiny startups and companies just leaving the VC > incubation phase. Very proven in our business experience. Not just VC, even fortune 10 companies have projects that look like internal startups that may fail in months. > > - Deployment speed. We have customers who wait weeks after making an IT > helpdesk request for a new VM to be created. Other customers take 1+ > years to design, RFP and choose their HPC solution and another 4 months > to deploy it. If you can do in minutes (via good DevOps techniques) > what the IT organization normally takes weeks or months to do then > you've got some good arguments for targeting cloud environments for > quick dev, test and one-off scientific computing environments We call this "Corporate Inertia". Risk aversion by internal staff makes obvious decisions committee decisions, so no one person gets blamed if there are complaints. > > - Quick capability gains - in some cases it's quicker and easier to get > quick access to GPUs, servers with 10Gbe interconnects and well-built > systems for running MapReduce style big data workflows on cloud platforms > > I agree that the cloud is overhyped and we certainly don't see a ton of > HPC migrating entirely to the cloud. What we see in the trenches and out > in the real world is significant interest in leveraging the cloud for > Speed, Capability, Cost or "weird" use cases. Agreed. "Enterprise" cloud and "HPC" cloud aim at opposite purposes. One subdivides a system for higher utilization (value); one combines multiple systems for performance. HPC benefits greatly from a transparent understanding of the actual hardware and configurations; Enterprise benefits from not caring or needing to. So "good" HPC clouds are transparent, whereas Enterprise clouds should only be translucent. You can't win a NASCAR race with $xM worth of convertible Geo Metros; you need a single $xM NASCAR car and a good pit crew (admins). > > > -Chris > Cheers! Greg I Sell HPC Cycles over the Internet
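[Editor's note] The baseline-in-house-plus-bursting argument in this thread comes down to a utilization threshold. A hedged sketch with purely illustrative prices (none of these are quotes from any provider):

```python
# Utilization break-even between owning a node and renting the
# equivalent per hour. All prices here are illustrative assumptions.
HOURS_PER_YEAR = 8760

owned_capex = 3000.0          # USD per node (assumed)
owned_years = 3               # depreciation period (assumed)
owned_opex_per_hour = 0.05    # power/space/admin share per busy hour (assumed)
cloud_per_hour = 0.50         # on-demand rate for a comparable instance (assumed)

capex_per_hour = owned_capex / (owned_years * HOURS_PER_YEAR)
owned_per_hour_full = capex_per_hour + owned_opex_per_hour  # at 100% utilization

# Owning wins once the node is busy enough that its amortized cost per
# busy hour (capex_per_hour / utilization + opex) drops below the cloud rate.
breakeven_utilization = capex_per_hour / (cloud_per_hour - owned_opex_per_hour)

print(f"Owned cost at 100% utilization: ${owned_per_hour_full:.3f}/hr")
print(f"Break-even utilization: {breakeven_utilization:.0%}")
```

With these assumed numbers, owning wins above roughly 25% utilization; below that, bursting to a provider is the cheaper side of Greg's trade-off.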
From skylar.thompson at gmail.com Wed Feb 1 20:09:27 2012 From: skylar.thompson at gmail.com (Skylar Thompson) Date: Wed, 01 Feb 2012 17:09:27 -0800 Subject: [Beowulf] rear door heat exchangers In-Reply-To: References: Message-ID: <4F29E247.1070807@gmail.com> On 2/1/2012 5:20 AM, Michael Di Domenico wrote: > On Tue, Jan 31, 2012 at 5:23 PM, wrote: >> Hi, >> >> We have installed a lot of racks with rear door heat exchangers but these >> are without fans instead using the in-server fans to push the air through >> the element. We are doing this with ~20kW per rack. >> >> How the hell are you drinking 35kW in a rack? > > start working with GPU's... you'll find out real fast... You don't even necessarily need GPUs --- our latest blade chassis suck up 7500W in 7U going at full bore. It's pretty unpleasant standing behind them, though. -- -- Skylar Thompson (skylar.thompson at gmail.com) -- http://www.cs.earlham.edu/~skylar/ From DIEP at xs4all.nl Fri Feb 3 17:24:17 2012 From: DIEP at xs4all.nl (Vincent Diepeveen) Date: Fri, 3 Feb 2012 23:24:17 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap Message-ID: http://www.anandtech.com/show/5503/understanding-amds-roadmap-new-direction/2 AMD's new roadmap basically says they stop high-performance CPU development. The GPU line will continue, also integrated inside CPUs. Total monopoly for Intel for applications needing CPUs, as it seems. That might mean that companies that need some more CPU crunching no longer can build cheap 4-socket machines that have good performance.
Note that the top-end AMD 4-socket system usually used to be just above or under $10k, with double the core count of what you'd normally expect at a 4-socket system, making it to some extent similar to an 8-socket system (be it with very low-clocked CPUs); that will no longer perform well. Intel's 8-socket solution at Oracle, the latest one, usually is around $200k a machine. So this means that clustering is the only cheap choice then. Comments? From DIEP at xs4all.nl Sat Feb 4 00:09:53 2012 From: DIEP at xs4all.nl (Vincent Diepeveen) Date: Sat, 4 Feb 2012 06:09:53 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: Message-ID: On Feb 4, 2012, at 5:09 AM, Mark Hahn wrote: >> http://www.anandtech.com/show/5503/understanding-amds-roadmap-new- >> direction/2 >> >> AMD's new roadmap basically says they stop high performance CPU >> development. > > well, they won't pursue P4-netburst-ish damn-the-torpedoes style > "high" _desktop_ performance. their server roadmap is pretty > solid, though they've dropped the 10/20c chips. (which might not > have made sense in terms of power envelope. or, for that matter, > the fact that even cache-friendly code probably wants more than 2 > memory channels for 10 cores...) > > I think AMD is absolutely right: the market is for mobile devices, > for power-efficient servers, and for media-intensive desktops. > >> GPU line will continue >> also inside cpu's integrated. Total monopoly for intel for >> applications needing CPU's as it seems.
> > you seem to have missed AMD's main point, which is HSA: the concept of > pushing x86 and GPU together to enable something higher-performing. > it's not a crazy idea, though pretty ambitious. > >> That might mean that companies that need some more CPU crunching no longer >> can build cheap 4 socket > > 4s is still on the roadmap; I don't see why you'd expect it to > disappear. > it costs them very little to support, and does serve a modest market. AMD is moving to 28 nm years after Intel has reached 22 nm. So for anything that has to perform, it's over, of course. Add to that we see how Indian engineering works - using 2 billion transistors for something Intel did 3 years earlier in the same process technology using far under 1 billion. The mobile market is very competitive, with many players, not just Intel and AMD. Price matters a lot there. Intel, having the same quality of engineers, would easily beat AMD there. 22 nm versus 28 nm: it's not a contest. It's game over, normally spoken. Yet for the mobile market a lot is possible; it's a competitive market. As for 4-socket servers - their roadmap basically shows the same CPUs they have now. They can clock them maybe a tad higher, win 1% here, win 1% there. That's about it. So AMD doesn't compete on number of cores, so they can't be on par with Intel 2-socket machines, basically, if this roadmap presents reality. Basically AMD just mastered 32 nm, 3 years after Intel released CPUs for it. Just to get Bulldozer on par with the i7-965 from years ago, if we speak about integers, they needed to have Bulldozer consume A LOT more power. If their new design team is so bad at producing equipment that can use little power, how do you guess they can compete, in an older process technology, against matured Intel 22 nm products in the 'highend' mobile market? CPU-wise they're total history. Moving your R&D to the third world has become a total disaster for AMD.
It's true that their GPU line is doing well, but let's ask around a bit - how many run GPGPU on AMD GPUs in OpenCL? They're not supporting their OpenCL very well, to put it politely. I can give examples if you're interested. Yet the most important point to make there is that one of the biggest competitive aspects of AMD was getting a lot of CPU cores for a cheap price. Those days are definitely over. I don't see how their design team even remotely has any clue about building low-power products in the coming years, if we see the massive mess-ups. Basically AMD has 1 great chip from a few years ago still playing in the marketplace, if we speak about the CPU division. It won't be long until everyone has forgotten about that as well. As for crunching, which is what most people do on this list, the AMD CPUs aren't interesting anymore. Sure, their GPUs are, for the floating-point guys (not for integers, as they support OpenCL not very well - so not all hardware instructions, some crucial ones, are available in the OpenCL of that GPU, which is a major reason to choose Nvidia if you have to do integer crunching). But one very big division called CPU sales - forget it. They have a big problem. It seems they get some sort of Asian company now, if we also study the code names very well, which can just deliver crap CPUs for a cheap price; CPUs eating too much power for Western standards. Forget low-power design in India - won't happen. > >> So this means that clustering is the only cheap choice then. > > clustering has always been the cheap solution. hence this list!
From hahn at mcmaster.ca Fri Feb 3 23:09:34 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Fri, 3 Feb 2012 23:09:34 -0500 (EST) Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: Message-ID: > http://www.anandtech.com/show/5503/understanding-amds-roadmap-new- > direction/2 > > AMD's new roadmap basically says they stop high performance CPU > development. well, they won't pursue P4-netburst-ish damn-the-torpedoes style "high" _desktop_ performance. their server roadmap is pretty solid, though they've dropped the 10/20c chips. (which might not have made sense in terms of power envelope. or, for that matter, the fact that even cache-friendly code probably wants more than 2 memory channels for 10 cores...) I think AMD is absolutely right: the market is for mobile devices, for power-efficient servers, and for media-intensive desktops. > GPU line will continue > also inside cpu's integrated. Total monopoly for intel for > applications needing CPU's as it seems. you seem to have missed AMD's main point, which is HSA: the concept of pushing x86 and GPU together to enable something higher-performing. it's not a crazy idea, though pretty ambitious. > That might mean that companies that need some more CPU crunching no longer > can build cheap 4 socket 4s is still on the roadmap; I don't see why you'd expect it to disappear. it costs them very little to support, and does serve a modest market. > So this means that clustering is the only cheap choice then. clustering has always been the cheap solution. hence this list!
From cap at nsc.liu.se Wed Feb 8 08:13:49 2012 From: cap at nsc.liu.se (Peter Kjellström) Date: Wed, 8 Feb 2012 14:13:49 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: Message-ID: <201202081413.53665.cap@nsc.liu.se> > > GPU line will continue > > also inside cpu's integrated. Total monopoly for intel for > > applications needing CPU's as it seems. > > you seem to have missed AMD's main point, which is HSA: the concept of > pushing x86 and GPU together to enable something higher-performing. > it's not a crazy idea, though pretty ambitious. The APU concept has a few interesting points but certainly also a few major problems (when comparing it to a cpu + stand alone gpu setup): * Memory bandwidth to all those FPUs * Power (CPUs in servers today max out around 120W with GPUs at >250W) Either way we're in for an interesting future (as usual) :-) /Peter From diep at xs4all.nl Wed Feb 8 08:27:31 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed, 8 Feb 2012 14:27:31 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <201202081413.53665.cap@nsc.liu.se> References: <201202081413.53665.cap@nsc.liu.se> Message-ID: On Feb 8, 2012, at 2:13 PM, Peter Kjellström wrote: >>> GPU line will continue >>> also inside cpu's integrated. Total monopoly for intel for >>> applications needing CPU's as it seems.
>>
>> you seem to have missed AMD's main point, which is HSA: the concept of
>> pushing x86 and GPU together to enable something higher-performing.
>> it's not a crazy idea, though pretty ambitious.
>
> The APU concept has a few interesting points but certainly also a few major
> problems (when comparing it to a cpu + stand alone gpu setup):
>
> * Memory bandwidth to all those FPUs
> * Power (CPUs in servers today max out around 120W with GPUs at >250W)

And the GPUs in those CPUs probably won't do double precision at all. Maybe they run at 16% of the speed of a comparable high-end GPU; AMD announced exactly that for its cheap line of GPUs, which are already a lot better than what's going to be put into the CPUs, it seems. Add to that that the integrated GPU will probably have a very limited number of cores. So it could be a factor of 200 slower in double precision than a 7990 GPU, which should deliver nearly a teraflop or two of double precision and will probably draw around 450 watts when crunching double-precision code on all Processing Elements, as the stream cores are called in OpenCL.

I'm not sure why people on this mailing list are excited about integrating GPUs into CPUs. It's great for mobile phones, netbooks, cheap laptops and so on. From a performance viewpoint, ignore it.

> Either way we're in for an interesting future (as usual) :-)
>
> /Peter
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Wed Feb 8 08:34:12 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Wed, 8 Feb 2012 14:34:12 +0100
Subject: [Beowulf] Clusters just got more important - AMD's roadmap
In-Reply-To: <201202081413.53665.cap@nsc.liu.se>
References: <201202081413.53665.cap@nsc.liu.se>
Message-ID: <20120208133412.GN7343@leitl.org>

On Wed, Feb 08, 2012 at 02:13:49PM +0100, Peter Kjellström wrote:

> * Memory bandwidth to all those FPUs

Memory stacking via TSV is coming. APUs with their very apparent memory bottlenecks will accelerate it.

> * Power (CPUs in servers today max out around 120W with GPUs at >250W)

I don't see why you can't integrate APU+memory+heatsink in a watercooled module that is plugged into the backplane which contains the switched signalling fabric.

> Either way we're in for an interesting future (as usual) :-)

I don't see how x86 should make it to exascale. It's too bad MRAM/FeRAM/whatever isn't ready for SoC yet. Also, Moore should end by around 2020 or earlier, and architecture only pushes you one or two generations further at most. Don't see how 3D integration should be ready by then, and 2.5D only buys you another one or two doublings at best. (TSV stacking is obviously off-Moore).

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Wed Feb 8 09:01:30 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 8 Feb 2012 15:01:30 +0100
Subject: [Beowulf] Clusters just got more important - AMD's roadmap
In-Reply-To: <20120208133412.GN7343@leitl.org>
References: <201202081413.53665.cap@nsc.liu.se> <20120208133412.GN7343@leitl.org>
Message-ID: <71EF5B74-3187-4BA6-B8F4-D27E6EE95A7F@xs4all.nl>

On Feb 8, 2012, at 2:34 PM, Eugen Leitl wrote:

> On Wed, Feb 08, 2012 at 02:13:49PM +0100, Peter Kjellström wrote:
>
>> * Memory bandwidth to all those FPUs
>
> Memory stacking via TSV is coming. APUs with their very apparent
> memory bottlenecks will accelerate it.
>
>> * Power (CPUs in servers today max out around 120W with GPUs at
>> >250W)
>
> I don't see why you can't integrate APU+memory+heatsink in a
> watercooled module that is plugged into the backplane which
> contains the switched signalling fabric.

Because even for the upcoming new Xbox, the built-in GPU has the same power envelope as it does in the 'high-end' CPUs, and we're not even talking about laptop CPUs yet, where it will be less. They have at most 18 watts for the built-in GPU. So the first thing they do is kill all of its double precision.

Even if they didn't: the 6990, the high-end Nvidia, and the 7990 are all 375-watt TDP on paper (450+ watts in reality). So whatever your 'opinion' on how they design stuff, it will always be a factor of 375 / 18 = 20.8 slower than a discrete GPU. And discrete GPUs can easily exceed their TDP, as the PCIe connectors will happily pump in more watts; the built-in GPUs can't, as their power doesn't come from PCIe but is governed by stricter specs.

But now let's look at the design. They cannot 'turn off' AVX in the CPU, as then it wouldn't support the latest games, and CPUs nowadays are only about doing better at the latest game; nothing else matters, whatever fairy tale they tell you.

CPUs are an exorbitantly expensive part of the computer.
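That 375 / 18 ratio is easy to sanity-check. A minimal sketch, crudely assuming throughput scales linearly with power budget; the 375 W and 18 W figures are the ones used in this thread, not vendor specs:

```python
# Rough upper bound on how much faster a discrete GPU can be than an
# integrated one, assuming performance scales roughly with power
# budget (a crude first-order assumption; real scaling is worse).

def throughput_ratio(discrete_tdp_w, integrated_budget_w):
    """How many times faster the discrete part could be, power-wise."""
    return discrete_tdp_w / integrated_budget_w

# Thread's figures: ~375 W TDP for a dual-GPU card vs. ~18 W left
# over for the GPU inside a console/desktop APU.
print(round(throughput_ratio(375, 18), 1))
```

On those assumptions the discrete card keeps a roughly 21x headroom regardless of architectural details, which is the point being made above.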
Those x64 CPUs are so expensive because of the 'blessing' of patents. Only two companies can release x64 CPUs right now, and probably soon only one, as we'll have to see whether AMD survives this. One of those companies isn't even in a hurry to release its 8-core Xeons at 32 nm; maybe it wants to make more profit with a higher-yield CPU at 22 nm.

If we already know the GPU is crap at double precision because it gets just 18 watts, and we also know that the CPU has AVX, it's pretty useless to let the GPU do the double-precision calculations. So the obvious optimization is to kick all double-precision logic out of the GPU. That doesn't save transistors, as some will tell you, since it's usually all the same chip; they just turn the transistors off, which gives higher yields and thus a cheaper production price. That's what they'll do if they want to make a profit, and I bet their owners will be very unhappy if they don't.

So yes, in a nerd world it would be possible to include just a two-core, 32-bit x86 chip of 10 watts or so and give the majority of the power envelope to a double-precision-optimized GPU, maybe even 50 watts, which would make it 'only' a factor of 8 slower, in theory, than a GPU card. Yet that's not very likely to happen.

>> Either way we're in for an interesting future (as usual) :-)
>
> I don't see how x86 should make it to exascale. It's too
> bad MRAM/FeRAM/whatever isn't ready for SoC yet. Also, Moore
> should end by around 2020 or earlier, and architecture only
> pushes you one or two generations further at most. Don't see
> how 3D integration should be ready by then, and 2.5 D only
> buys you another one or two doublings at best. (TSV stacking
> is obviously off-Moore).
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From deadline at eadline.org Wed Feb 8 09:06:31 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Wed, 8 Feb 2012 09:06:31 -0500
Subject: [Beowulf] Clusters just got more important - AMD's roadmap
In-Reply-To: <201202081413.53665.cap@nsc.liu.se>
References: <201202081413.53665.cap@nsc.liu.se>
Message-ID: <9923a899e0c2735b549b108ad889cf32.squirrel@mail.eadline.org>

>> > GPU line will continue
>> > also inside cpu's integrated. Total monopoly for intel for
>> > applications needing CPU's as it seems.
>>
>> you seem to have missed AMD's main point, which is HSA: the concept of
>> pushing x86 and GPU together to enable something higher-performing.
>> it's not a crazy idea, though pretty ambitious.
>
> The APU concept has a few interesting points but certainly also a few major
> problems (when comparing it to a cpu + stand alone gpu setup):
>
> * Memory bandwidth to all those FPUs

I thought that was part of the issue, removing the PCI bus from the CPU/GPU connection. Of course, the APU has a lower memory bandwidth than the pure GPU, but in theory now the PCI bottleneck is gone.

> * Power (CPUs in servers today max out around 120W with GPUs at >250W)

I see this as more of a smearing out of the GPU (SIMD unit). Instead of one big GPU sitting on the PCI bus shared by 2-4 sockets, now each socket has its own GPU on the same memory bus.
Unless I'm not following the APU design correctly.

> Either way we're in for an interesting future (as usual) :-)

Indeed.

> /Peter
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

--
Doug

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Wed Feb 8 09:18:50 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 8 Feb 2012 06:18:50 -0800
Subject: [Beowulf] Clusters just got more important - AMD's roadmap
In-Reply-To: <20120208133412.GN7343@leitl.org>
Message-ID:

On 2/8/12 5:34 AM, "Eugen Leitl" wrote:

> On Wed, Feb 08, 2012 at 02:13:49PM +0100, Peter Kjellström wrote:
>
>> * Memory bandwidth to all those FPUs
>
> Memory stacking via TSV is coming. APUs with their very apparent
> memory bottlenecks will accelerate it.
>
>> * Power (CPUs in servers today max out around 120W with GPUs at >250W)
>
> I don't see why you can't integrate APU+memory+heatsink in a
> watercooled module that is plugged into the backplane which
> contains the switched signalling fabric.

I don't know about that.. I don't see the semiconductor companies making such an integrated widget, so it's basically some sort of integrator that would do it: like a mobo manufacturer. But I don't think the volume is there for the traditional mobo types to find it interesting. So now you're talking about small volume specialized mfrs, like the ones who sell into the conduction cooled MIL/AERO market. And those are *expensive*...
Not just because of the plethora of requirements and documentation that the customer wants in that market.. It's all about mfr volume. The whole idea of "plugging in" a liquid cooled thing to a backplane is also sort of unusual. A connector that can carry both high speed digital signals, power, AND liquid without leaking would be weird. And even if it's not "one connector", logically, that whole mating surface of the module is a connector. Reliable liquid connectors usually need some sort of latching or positive action: a collar that snaps in place (think air hose) or turns or does something to put a clamping force on an O-ring or other gasket. It can be done (and probably has), but it's going to be "exotic" and expensive. > >> Either way we're in for an interesting future (as usual) :-) > >I don't see how x86 should make it to exascale. It's too >bad MRAM/FeRAM/whatever isn't ready for SoC yet. Even if you put the memory on the chip, you still have the interconnect scaling problem. Light speed and distance, if nothing else. Putting everything on a chip just shrinks the problem, but it's just like 15 years ago with PC tower cases on shelving and Ethernet interconnects. > Also, Moore >should end by around 2020 or earlier, and architecture only >pushes you one or two generations further at most. Don't see >how 3D integration should be ready by then, and 2.5 D only >buys you another one or two doublings at best. (TSV stacking >is obviously off-Moore). > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From eugen at leitl.org Wed Feb 8 11:39:24 2012 From: eugen at leitl.org (Eugen Leitl) Date: Wed, 8 Feb 2012 17:39:24 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: <20120208133412.GN7343@leitl.org> Message-ID: <20120208163924.GQ7343@leitl.org> On Wed, Feb 08, 2012 at 06:18:50AM -0800, Lux, Jim (337C) wrote: > It can be done (and probably has), but it's going to be "exotic" and > expensive. There are some liquid metal (gallium alloy, strangely enough not sodium/potassium eutectic which would be plenty cheaper albeit a fire hazard if exposed to air) cooled GPUs for the gamer market. I might have also read about one which uses a metal pump with no movable parts which utilizes MHD, though I don't remember where I've read that. There are also plenty of watercooled systems among enthusiasts, including some that include CPU and GPU coolers in the same circuit. I could see how gamers could push watercooled systems into COTS mainstream, it wouldn't be the first time. Multi-GPU settings are reasonably common there, so the PCIe would seem like a good initial fabric candidate. > >I don't see how x86 should make it to exascale. It's too > >bad MRAM/FeRAM/whatever isn't ready for SoC yet. > > > Even if you put the memory on the chip, you still have the interconnect > scaling problem. Light speed and distance, if nothing else. Putting You can eventually put that on WSI (in fact, somebody wanted to do that with ARM-based nodes and DRAM wafers bonded on top, with redundant routing around dead dies -- I presume this would also take care of graceful degradation if you can do it at runtime, or at least reconfigure after failure, and go back to last snapshot). Worst-case distances would be then ~400 mm within the wafer, and possibly shorter if you interconnect these with fiber. The only other way to reduce average signalling distance is real 3D integration. 
> everything on a chip just shrinks the problem, but it's just like 15 years > ago with PC tower cases on shelving and Ethernet interconnects. Sooner or later you run into questions like "what's within the lightcone of a 1 nm device", at which point you've reached the limits of classical computation, nevermind that I don't see how you can cool anything with effective >THz refresh rate. I'm extremely sceptical about QC feasibility, though there's some work with nitrogen vacancies in diamond which could produce qubit entanglement in solid state, perhaps even at room temperature. I just don't think it would scale well enough, and since Scott Aaronson also appears dubious I'm in good company. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Wed Feb 8 12:15:01 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 8 Feb 2012 12:15:01 -0500 (EST) Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <201202081413.53665.cap@nsc.liu.se> References: <201202081413.53665.cap@nsc.liu.se> Message-ID: > The APU concept has a few interesting points but certainly also a few major > problems (when comparing it to a cpu + stand alone gpu setup): > > * Memory bandwidth to all those FPUs well, sorta. my experience with GP-GPU programming today is that your first goal is to avoid touching anything offchip anyway (spilling, etc), so I'm not sure this is a big problem. obviously, the integrated GPU is a small slice of a "real" add-in GPU, so needs proportionately less bandwidth. 
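The avoid-offchip point can be put in roofline terms: attainable throughput is the lesser of the compute peak and memory bandwidth times arithmetic intensity (flops performed per byte moved off-chip). A minimal sketch with purely illustrative numbers; the 500 GFLOPS and 150 GB/s below are assumptions, not any particular part:

```python
# Roofline-style check of whether a kernel is limited by off-chip
# bandwidth or by raw compute. Illustrative numbers only.

def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Attainable rate = min(compute roof, bandwidth * intensity)."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

# A kernel doing 1 flop per off-chip byte is bandwidth-bound on a
# hypothetical 500 GFLOPS / 150 GB/s device...
print(attainable_gflops(500, 150, 1))
# ...while heavy on-chip reuse (say 50 flops/byte) hits the compute roof.
print(attainable_gflops(500, 150, 50))
```

This is why an integrated GPU with a fraction of the bandwidth can still do fine on kernels with high on-chip reuse, and poorly on streaming ones.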
> * Power (CPUs in servers today max out around 120W with GPUs at >250W)

sure, though the other way to think of this is that you have 250W or so of power overhead hanging off your GPU cards. you can amortize the "host overhead" by adding several GPUs, but...

think of it this way: an APU is just a low-mid-end add-in GPU with the host integrated onto it ;)

I think the real question is whether someone will produce a minimalist APU node. since Llano has on-die PCIE, it seems like you'd need only APU, 2-4 dimms and a network chip or two. that's going to add up to very little beyond the APU's 65 or 100W TDP... (I figure 150/node including PSU overhead.)

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Wed Feb 8 12:52:55 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 8 Feb 2012 09:52:55 -0800
Subject: [Beowulf] Clusters just got more important - AMD's roadmap
In-Reply-To: <20120208163924.GQ7343@leitl.org>
References: <20120208133412.GN7343@leitl.org> <20120208163924.GQ7343@leitl.org>
Message-ID:

Odd threading/quoting behavior in my mail client.. Comments below with **

-----Original Message-----
From: Eugen Leitl [mailto:eugen at leitl.org]
Sent: Wednesday, February 08, 2012 8:39 AM
To: Lux, Jim (337C); Beowulf at beowulf.org
Subject: Re: [Beowulf] Clusters just got more important - AMD's roadmap

On Wed, Feb 08, 2012 at 06:18:50AM -0800, Lux, Jim (337C) wrote:

> It can be done (and probably has), but it's going to be "exotic" and
> expensive.
There are some liquid metal (gallium alloy, strangely enough not sodium/potassium eutectic which would be plenty cheaper albeit a fire hazard if exposed to air) cooled GPUs for the gamer market. I might have also read about one which uses a metal pump with no movable parts which utilizes MHD, though I don't remember where I've read that. There are also plenty of watercooled systems among enthusiasts, including some that include CPU and GPU coolers in the same circuit. I could see how gamers could push watercooled systems into COTS mainstream, it wouldn't be the first time. Multi-GPU settings are reasonably common there, so the PCIe would seem like a good initial fabric candidate. **Those aren't plug into a backplane type configurations, though. They can use permanent connections, or something that is a pain to do, but you only have to do it once. I could see something using heat pipes, too, which possibly mates with some sort of thermal transfer socket, but again, we're talking exotic, and not amenable to large volumes to bring the price down. > >I don't see how x86 should make it to exascale. It's too bad > >MRAM/FeRAM/whatever isn't ready for SoC yet. > > > Even if you put the memory on the chip, you still have the > interconnect scaling problem. Light speed and distance, if nothing > else. Putting You can eventually put that on WSI (in fact, somebody wanted to do that with ARM-based nodes and DRAM wafers bonded on top, with redundant routing around dead dies -- I presume this would also take care of graceful degradation if you can do it at runtime, or at least reconfigure after failure, and go back to last snapshot). Worst-case distances would be then ~400 mm within the wafer, and possibly shorter if you interconnect these with fiber. ** Even better is free space optical interconnect, but that's pretty speculative today. 
And even if you went to WSI (or thick film hybrids or something similar), you're still limited in scalability by the light time delay between nodes. If you had just the bare silicon (no packages) for all the memory and CPU chips in a 1000 node cluster, that's still a pretty big ball o'silicon. And a big ball o'silicon that is dissipating a fair amount of heat. The only other way to reduce average signalling distance is real 3D integration. ** I agree. This has been done numerous times in the history of computing. IBM had that cryogenically cooled stack a few decades ago. The "round" Cray is another example. > everything on a chip just shrinks the problem, but it's just like 15 > years ago with PC tower cases on shelving and Ethernet interconnects. Sooner or later you run into questions like "what's within the lightcone of a 1 nm device", at which point you've reached the limits of classical computation, nevermind that I don't see how you can cool anything with effective >THz refresh rate. I'm extremely sceptical about QC feasibility, though there's some work with nitrogen vacancies in diamond which could produce qubit entanglement in solid state, perhaps even at room temperature. I just don't think it would scale well enough, and since Scott Aaronson also appears dubious I'm in good company. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
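The light-time argument above is easy to quantify. A small sketch, assuming signals propagate at roughly half the vacuum speed of light (a common rule of thumb for electrical interconnect, not a measured figure for any process); the ~400 mm worst-case wafer distance is the one quoted earlier in the thread:

```python
# Back-of-envelope one-way propagation delays for the light-speed
# scaling argument. Propagation at 0.5c is an assumed rule of thumb.

C = 299_792_458.0  # speed of light in vacuum, m/s

def propagation_delay_ns(distance_m, fraction_of_c=0.5):
    """One-way signal delay in nanoseconds over the given distance."""
    return distance_m / (C * fraction_of_c) * 1e9

# ~400 mm worst case across a wafer (figure quoted in-thread):
print(round(propagation_delay_ns(0.400), 2))
# vs. a 10 m cable run between racks:
print(round(propagation_delay_ns(10.0), 1))
```

A few nanoseconds across a wafer versus tens of nanoseconds between racks: shrinking the packaging shrinks the problem, but the same wall is still there, just closer.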
From james.p.lux at jpl.nasa.gov Wed Feb 8 12:55:23 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 8 Feb 2012 09:55:23 -0800 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: <201202081413.53665.cap@nsc.liu.se> Message-ID: We can probably look back to the history of non-integrated floating point for this kind of thing. 8087/8086, etc. I used to work with a guy who was a key mover at Floating Point Systems, probably one of the first applications of "attached special purpose processor", and ALL of the issues we're talking about here came up in that connection, just as with coprocessors since time immemorial. I think the real question is: "does the fact we're doing this at a different scale, change any of the fundamental limitations or make something easier than it was the last time" -----Original Message----- From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Mark Hahn Sent: Wednesday, February 08, 2012 9:15 AM To: Beowulf Mailing List Subject: Re: [Beowulf] Clusters just got more important - AMD's roadmap > The APU concept has a few interesting points but certainly also a few > major problems (when comparing it to a cpu + stand alone gpu setup): > > * Memory bandwidth to all those FPUs well, sorta. my experience with GP-GPU programming today is that your first goal is to avoid touching anything offchip anyway (spilling, etc), so I'm not sure this is a big problem. obviously, the integrated GPU is a small slice of a "real" add-in GPU, so needs proportionately less bandwidth. > * Power (CPUs in servers today max out around 120W with GPUs at >250W) sure, though the other way to think of this is that you have 250W or so of power overhead hanging off your GPU cards. you can amortize the "host overhead" by adding several GPUs, but... 
think of it this way: an APU is just a low-mid-end add-in GPU with the host integrated onto it ;) I think the real question is whether someone will produce a minimalist APU node. since Llano has on-die PCIE, it seems like you'd need only APU, 2-4 dimms and a network chip or two. that's going to add up to very little beyond the the APU's 65 or 100W TDP... (I figure 150/node including PSU overhead.) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Wed Feb 8 13:41:34 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed, 8 Feb 2012 19:41:34 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: <201202081413.53665.cap@nsc.liu.se> Message-ID: On Feb 8, 2012, at 6:15 PM, Mark Hahn wrote: >> The APU concept has a few interesting points but certainly also a >> few major >> problems (when comparing it to a cpu + stand alone gpu setup): >> >> * Memory bandwidth to all those FPUs > > well, sorta. my experience with GP-GPU programming today is that your > first goal is to avoid touching anything offchip anyway (spilling, > etc), > so I'm not sure this is a big problem. obviously, the integrated GPU > is a small slice of a "real" add-in GPU, so needs proportionately > less bandwidth. Most of the code that's real fast on gpgpu simply doesn't leave the compute units at all. 
For outsiders: a compute unit is basically one vector core (or SIMD) of a GPU, with its own registers and its own shared memory (64 KB or so on Nvidia, plus a register file which is quite a tad; 32 KB of shared memory on AMD, plus a big multiple of that in local registers). So that's 64 PEs (processing elements) per compute unit on newer-generation AMDs (6000 and 7000 series), or 32 on Nvidia. In total, Nvidia has 512 PEs and the latest AMD has 2048 PEs.

You really don't want to touch RAM much in GPGPU computing; RAM slows you down. There is zero difference in programming model there between AMD and Nvidia GPUs. Anything that does work outside a single compute unit of 32 or 64 'cores' is not going to scale well.

A good example is the trial factorisation for Mersenne numbers, which works very well on Nvidia in CUDA. Basically, candidates get generated on the CPUs and shipped in batches to the GPU; then all calculations for a batch of candidates occur within a compute unit. The problem you stumble upon there is not so much the bandwidth from CPU to GPU; it's simply that the CPUs are not fast enough to generate candidates for the GPU, as the GPU is 200x or so faster than a CPU core. The CPUs just can't feed the GPU: they're too slow at generating factor candidates to keep it busy. Remember this is just a single GPU, and a relatively cheap one.

As for games, one would guess it's easier to scale well for graphics, yet they do not. Call it clumsy programming, call it badly paid coders, call it 'not worth fixing since we'll buy a faster GPU soon'; the result is that GPGPU programs that scale well typically make the GPU draw a lot more power than any game.

>
>> * Power (CPUs in servers today max out around 120W with GPUs at
>> >250W)
>
> sure, though the other way to think of this is that you have 250W
> or so of power overhead hanging off your GPU cards. you can amortize
> the "host overhead" by adding several GPUs, but...
> > think of it this way: an APU is just a low-mid-end add-in GPU > with the host integrated onto it ;) > > I think the real question is whether someone will produce a minimalist > APU node. since Llano has on-die PCIE, it seems like you'd need only > APU, 2-4 dimms and a network chip or two. that's going to add up to > very little beyond the the APU's 65 or 100W TDP... (I figure 150/node > including PSU overhead.) > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From deadline at eadline.org Wed Feb 8 14:05:29 2012 From: deadline at eadline.org (Douglas Eadline) Date: Wed, 8 Feb 2012 14:05:29 -0500 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: <201202081413.53665.cap@nsc.liu.se> Message-ID: <1673fe4c73cb087ec4b98a15de5b1d29.squirrel@mail.eadline.org> snip > think of it this way: an APU is just a low-mid-end add-in GPU > with the host integrated onto it ;) > > I think the real question is whether someone will produce a minimalist > APU node. since Llano has on-die PCIE, it seems like you'd need only > APU, 2-4 dimms and a network chip or two. that's going to add up to > very little beyond the the APU's 65 or 100W TDP... (I figure 150/node > including PSU overhead.) I plan on looking at these for my Limulus systems. There are a bunch of microATX and miniITX boards for these APUs, if you use the A8 it has 4 x86 cores and 400 SIMD cores. 
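For a rough sense of scale, peak single-precision throughput for such an APU can be estimated from the 400 SIMD cores mentioned above; the 600 MHz GPU clock and 2 flops per cycle (multiply-add) used below are illustrative assumptions, not datasheet values:

```python
# Hypothetical peak single-precision estimate for an APU's GPU half.
# 400 SIMD cores comes from the thread; clock and flops/cycle are
# assumed purely for illustration.

def peak_sp_gflops(simd_cores, clock_ghz, flops_per_cycle=2):
    """Theoretical peak = cores * clock * flops issued per core-cycle."""
    return simd_cores * clock_ghz * flops_per_cycle

print(peak_sp_gflops(400, 0.6))  # GFLOPS on these assumptions
```

Even under generous assumptions that is well short of a contemporary discrete card, which is the trade the minimalist-node idea accepts in exchange for a much smaller power and cost footprint.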
--
Doug

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From cap at nsc.liu.se Wed Feb 8 14:27:49 2012
From: cap at nsc.liu.se (Peter Kjellström)
Date: Wed, 8 Feb 2012 20:27:49 +0100
Subject: [Beowulf] Clusters just got more important - AMD's roadmap
In-Reply-To:
References: <201202081413.53665.cap@nsc.liu.se>
Message-ID: <201202082027.53627.cap@nsc.liu.se>

On Wednesday, February 08, 2012 06:15:01 PM Mark Hahn wrote:

> > The APU concept has a few interesting points but certainly also a few
> > major problems (when comparing it to a cpu + stand alone gpu setup):
> >
> > * Memory bandwidth to all those FPUs
>
> well, sorta. my experience with GP-GPU programming today is that your
> first goal is to avoid touching anything offchip anyway (spilling, etc),
> so I'm not sure this is a big problem. obviously, the integrated GPU
> is a small slice of a "real" add-in GPU, so needs proportionately
> less bandwidth.

Well yes, you want to avoid touching memory on a GPU (just as you do on a CPU). But just as you can't completely avoid it on a CPU, nor can you on a GPU. On a current socket (CPU) you see maybe 20 GB/s and 50 GF, and the flop-wise much faster GPU is also a lot faster in memory access (>200 GB/s).

Now I admit I'm not a GPU programmer, but are you saying those 200 GB/s aren't needed? My assumption was that the fact that CPU codes depend on cache for performance but still need good memory bandwidth holds true even on GPUs.

Anyway, my point I guess was mostly that it's a lot easier to sort out hundreds of gigs per second to memory on a device with RAM directly on the PCB than on a server socket.
Also, if the APU is a "small slice of a real GPU" then I question the point (not much GPU power per classic core or total system foot-print). ... > I think the real question is whether someone will produce a minimalist > APU node. since Llano has on-die PCIE, it seems like you'd need only > APU, 2-4 dimms and a network chip or two. that's going to add up to > very little beyond the the APU's 65 or 100W TDP... (I figure 150/node > including PSU overhead.) I think anything beyond early testing is a fair bit into the future. For the APU to become interesting I think we need a few (or all of): * Memory shared with the CPU in some useable way (did not say the c-word..) * A proper number crunching version (ecc...) * A fairly high tdp part on a socket with good memory bw * Noticeably better "host to device" bandwidth and even more, latency And don't get me wrong, I'm not saying the above is particularly unlikely... /Peter -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. 
URL: -------------- next part -------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From diep at xs4all.nl Wed Feb 8 16:01:08 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed, 8 Feb 2012 22:01:08 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <201202082027.53627.cap@nsc.liu.se> References: <201202081413.53665.cap@nsc.liu.se> <201202082027.53627.cap@nsc.liu.se> Message-ID: <47D69765-C40D-4A78-815C-E5EDFA4881E4@xs4all.nl> On Feb 8, 2012, at 8:27 PM, Peter Kjellström wrote: > On Wednesday, February 08, 2012 06:15:01 PM Mark Hahn wrote: >>> The APU concept has a few interesting points but certainly also a >>> few >>> major problems (when comparing it to a cpu + stand alone gpu setup): >>> >>> * Memory bandwidth to all those FPUs >> >> well, sorta. my experience with GP-GPU programming today is that >> your >> first goal is to avoid touching anything offchip anyway (spilling, >> etc), >> so I'm not sure this is a big problem. obviously, the integrated GPU >> is a small slice of a "real" add-in GPU, so needs proportionately >> less bandwidth. > > Well yes you want to avoid touching memory on a GPU (just as you do > on a CPU). > But just as you cant completely avoid it on a CPU nor can you on a > GPU. On a > current socket (CPU) you see maybe 20 GB/s and 50 GF and the flop- > wise much 50 gflop on a cpu - first of all, very little software actually gets 50 gflop out of a CPU. It might execute 2 instructions a cycle in SIMD, yet not when you multiply. To start with it has just 1 multiplication unit, so you already start with losing factor 2. So the effective output that the CPU delivers isn't much more than its bandwidth and caches can handle.
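Vincent's throughput-versus-bandwidth argument reduces to a short back-of-envelope calculation. A sketch under his stated assumptions (0.5 Tflop/s single precision with multiply-add not counted, every flop reading two fresh 4-byte operands from DRAM, ~200 GB/s of device memory bandwidth):

```python
# Back-of-envelope check of the "factor 20" claim in this thread, using
# the post's own assumptions (not measured figures).
flops = 0.5e12                       # 0.5 Tflop/s, single precision
bytes_needed = flops * 2 * 4         # two 4-byte inputs per flop -> 4 TB/s
dram_bw = 200e9                      # ~200 GB/s device memory bandwidth

shortfall = bytes_needed / dram_bw   # factor by which demand exceeds supply
print(f"demand {bytes_needed / 1e12:.0f} TB/s vs supply "
      f"{dram_bw / 1e9:.0f} GB/s -> factor {shortfall:.0f}")
```

So if every operand really came from DRAM the GPU would run at roughly 1/20th of peak; registers and on-chip reuse are what close the gap, which is exactly the point being argued.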
Now let's skip the multiply-add further in this discussion; AFAIK most fully optimized codes can't use it, and for gpu's the discussion is the same there. But not for the output bandwidth. In a GPU on the other hand you do achieve the throughput it can deliver. It's delivering, multiply-add not counted, 0.5 Tflop per second, that's 4 Tbytes/s. Or factor 20 above its maximum bandwidth to the RAM. RAM can get prefetched, yet there are no clever caches on the GPU. Some read L2 cache, that's about it. Writes to the local shared cache also are not advised, as its bandwidth is a lot slower than what the compute units can deliver. So basically if you read and/or write at full speed to the RAM, you slow down factor 20 or so, a slowdown a CPU does *not* have, as basically that CPU is so slow that its RAM can keep up with it. > faster GPU is also alot faster in memory access (>200 GB/s). > > Now I admit I'm not a GPU programmer but are you saying those 200 > GB/s aren't > needed? My assumption was that the fact that CPU-codes depend on > cache for > performance but still need good memory bandwidth held true even on > GPUs. > > Anyway, my point I guess was mostly that it's a lot easier to sort out > hundreds of gigs per second to memory on a device with RAM directly > on the PCB > than on a server socket. > > Also, if the APU is a "small slice of a real GPU" then I question > the point > (not much GPU power per classic core or total system foot-print). > > ... >> I think the real question is whether someone will produce a >> minimalist >> APU node. since Llano has on-die PCIE, it seems like you'd need only >> APU, 2-4 dimms and a network chip or two. that's going to add up to >> very little beyond the the APU's 65 or 100W TDP... (I figure 150/ >> node >> including PSU overhead.) > > I think anything beyond early testing is a fair bit into the > future.
For the > APU to become interesting I think we need a few (or all of): > > * Memory shared with the CPU in some useable way (did not say the > c-word..) > * A proper number crunching version (ecc...) > * A fairly high tdp part on a socket with good memory bw > * Noticeably better "host to device" bandwidth and even more, latency > > And don't get me wrong, I'm not saying the above is particularly > unlikely... > > /Peter > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Thu Feb 9 06:12:12 2012 From: eugen at leitl.org (Eugen Leitl) Date: Thu, 9 Feb 2012 12:12:12 +0100 Subject: [Beowulf] =?utf-8?q?The_death_of_CPU_scaling=3A_From_one_core_to_?= =?utf-8?b?bWFueSDigJQgYW5kIHdoeSB3ZeKAmXJlIHN0aWxsCXN0dWNr?= Message-ID: <20120209111212.GX7343@leitl.org> http://www.extremetech.com/computing/116561-the-death-of-cpu-scaling-from-one-core-to-many-and-why-were-still-stuck?print The death of CPU scaling: From one core to many -- and why we're still stuck By Joel Hruska on February 1, 2012 at 2:31 pm It's been nearly eight years since Intel canceled Tejas and announced its plans for a new multi-core architecture. The press wasted little time in declaring conventional CPU scaling dead -- and while the media has a tendency to bury products, trends, and occasionally people well before their expiration date, this is one declaration that's stood the test of time.
To understand the magnitude of what happened in 2004 it may help to consult the following chart. It shows transistor counts, clock speeds, power consumption, and instruction-level parallelism (ILP). The doubling of transistor counts every two years is known as Moore's law, but over time, assumptions about performance and power consumption were also made and shown to advance along similar lines. Moore got all the credit, but he wasn't the only visionary at work. For decades, microprocessors followed what's known as Dennard scaling. Dennard predicted that oxide thickness, transistor length, and transistor width could all be scaled by a constant factor. Dennard scaling is what gave Moore's law its teeth; it's the reason the general-purpose microprocessor was able to overtake and dominate other types of computers. CPU Scaling [1] CPU scaling showing transistor density, power consumption, and efficiency. Chart originally from The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software [2] The original 8086 drew ~1.84W and the P3 1GHz drew 33W, meaning that CPU power consumption increased by 17.9x while CPU frequency improved by 125x. Note that this doesn't include the other advances that occurred over the same time period, such as the adoption of L1/L2 caches, the invention of out-of-order execution, or the use of superscaling and pipelining to improve processor efficiency. It's for this reason that the 1990s are sometimes referred to as the golden age of scaling. This expanded version of Moore's law held true into the mid-2000s, at which point the power consumption and clock speed improvements collapsed. The problem at 90nm was that transistor gates became too thin to prevent current from leaking out into the substrate. Intel and other semiconductor manufacturers have fought back with innovations [3] like strained silicon, hi-k metal gate, FinFET, and FD-SOI -- but none of these has re-enabled anything like the scaling we once enjoyed.
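The article's 8086-to-P3 ratios are easy to sanity-check. A sketch assuming the 8 MHz 8086 part (the chip shipped at 5-10 MHz, and 1 GHz / 8 MHz is what yields the quoted 125x):

```python
# Sanity-checking the quoted scaling ratios. Power figures are the
# article's; the 8 MHz 8086 clock is an assumption consistent with the
# quoted 125x frequency gain.
p_8086, p_p3 = 1.84, 33.0   # watts
f_8086, f_p3 = 8e6, 1e9     # hertz

power_ratio = p_p3 / p_8086           # ~17.9x more power
freq_ratio = f_p3 / f_8086            # 125x more clock
efficiency_gain = freq_ratio / power_ratio  # ~7x more clock per watt

print(f"power x{power_ratio:.1f}, frequency x{freq_ratio:.0f}, "
      f"clock-per-watt x{efficiency_gain:.1f}")
```

That roughly 7x improvement in clock per watt (before counting IPC gains from caches and out-of-order execution) is Dennard scaling at work, and it is exactly the quantity that stopped improving after 90nm.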
From 2007 to 2011, maximum CPU clock speed (with Turbo Mode enabled) rose from 2.93GHz to 3.9GHz, an increase of 33%. From 1994 to 1998, CPU clock speeds rose by 300%. Next page: The multi-core swerve [4] The multi-core swerve For the past seven years, Intel and AMD have emphasized multi-core CPUs as the answer to scaling system performance, but there are multiple reasons to think the trend towards rising core counts is largely over. First and foremost, there's the fact that adding more CPU cores never results in perfect scaling. In any parallelized program, performance is ultimately limited by the amount of serial code (code that can only be executed on one processor). This is known as Amdahl's law. Other factors, such as the difficulty of maintaining concurrency across a large number of cores, also limit the practical scaling of multi-core solutions. Amdahl's Law [5] AMD's Bulldozer is a further example of how bolting more cores together can result in a slower end product [6]. Bulldozer was designed to share logic and caches in order to reduce die size and allow for more cores per processor, but the chip's power consumption badly limits its clock speed while slow caches hamstring instructions per cycle (IPC). Even if Bulldozer had been a significantly better chip, it wouldn't change the long-term trend towards diminishing marginal returns. The more cores per die, the lower the chip's overall clock speed. This leaves the CPU ever more reliant on parallelism to extract acceptable performance. AMD isn't the only company to run into this problem; Oracle's new T4 processor is the first Niagara-class chip to focus on improving single-thread performance rather than pushing up the total number of threads per CPU. Rage Jobs [7] The difficulty of software optimization is a further reason why adding more CPU cores doesn't help much. Game developers have made progress in using multi-core systems, but the rate of advance has been slow. Games like Rage [8] and Battlefield 3 --
two high-profile titles that use multiple cores -- both utilized new engines designed from the ground-up with multi-core scaling as a primary goal. The bottom line is that it's been easier for Intel and AMD to add cores than it is for software to take advantage of them. Seven years after the multi-core era began, it's already morphing into something different. Next page: The rise (and limit) of Many-Core [9] The rise (and limit) of Many-Core In this context, we're using the term "many-core" to refer to a wide range of programmable hardware. GPUs from AMD and Nvidia are both "many-core" products, as are chips from companies like Tilera. Intel's Knights Corner [10] is a many-core chip. The death of conventional scaling has sparked a sharp increase in the number of companies researching various types of specialized CPU cores. Prior to that point, general-purpose CPU architectures, exemplified by Intel's x86, had eaten through the high-end domains of add-in boards and co-processors at a ferocious rate. Once that trend slammed into the brick wall of physics, more specialist architectures began to appear. Many-core Scaling [11] Note: Three exclamation points doesn't actually mean anything, despite the fondest wishes of AMD's marketing department Despite what some companies like to claim, specialized many-core chips don't "break" Moore's law in any way and are not exempt from the realities of semiconductor manufacturing. What they offer is a tradeoff -- a less general, more specialized architecture that's capable of superior performance on a narrower range of problems. They're also less encumbered by socket power constraints -- Intel's CPUs top out at 140W TDP; Nvidia's upper-range GPUs are in the 250W range. Intel's upcoming Many Integrated Core (MIC) architecture is partly an attempt to capitalize on the benefits of having a separate interface and giant PCB for specialized, ultra-parallel data crunching.
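The Amdahl's law limit the article leans on is worth seeing in numbers. A minimal sketch (the 10% serial fraction is an illustrative assumption, not a figure from the article):

```python
# Amdahl's law: with serial fraction s, speedup on n cores is
# 1 / (s + (1 - s)/n), capped at 1/s no matter how many cores you add.
def amdahl(serial_fraction, cores):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

for n in (2, 4, 8, 64, 1024):
    print(f"{n:5d} cores -> {amdahl(0.10, n):.2f}x (10% serial)")
# Even at 1024 cores a 10%-serial program never reaches 10x --
# the diminishing marginal returns the article describes.
```

This is why "just add cores" stops paying off so quickly: the curve flattens toward 1/s almost immediately, and only shrinking the serial fraction moves the ceiling.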
AMD, meanwhile, has focused on consumer-side applications and the integration of CPU and GPU via what it calls Graphics Core Next [12]. Regardless of market segmentation, all three companies are talking about integrating specialized co-processors that excel at specific tasks, one of which happens to be graphics. AMD's many-core strategy [13] Unfortunately, this isn't a solution. Incorporating a specialized many-core processor on-die or relying on a discrete solution to boost performance is a bid to improve efficiency per watt, but it does nothing to address the underlying problem that transistors can no longer be counted on to scale the way they used to. The fact that transistor density continues to scale while power consumption and clock speed do not has given rise to a new term: dark silicon. It refers to the percentage of silicon on a processor that can't be powered up simultaneously without breaching the chip's TDP. A recent report on dark silicon and the future of multi-core devices describes the future in stark terms. The researchers considered both transistor scaling as forecast by the International Technology Roadmap for Semiconductors (ITRS) and by a more conservative amount; they factored in the use of APU-style combinations, the rise of so-called "wimpy" cores [14], and the future scaling of general-purpose multiprocessors. They concluded: Regardless of chip organization and topology, multicore scaling is power limited to a degree not widely appreciated by the computing community... Given the low performance returns... adding more cores will not provide sufficient benefit to justify continued process scaling. Given the time-frame of this problem and its scale, radical or even incremental ideas simply cannot be developed along typical academic research and industry product cycles...
A new driver of transistor utility must be found, or the economics of process scaling will break and Moore's Law will end well before we hit final manufacturing limits. Over the next few years scaling will continue to slowly improve. Intel will likely meander up to 6-8 cores for mainstream desktop users at some point, quad-cores will become standard at every product level, and we'll see much tighter integration of CPU and GPU. Past that, it's unclear what happens next. The gap between present-day systems and DARPA's exascale computing initiative [15] will diminish only marginally with each successive node; there's no clear understanding of how -- or if -- classic Dennard scaling can be re-initiated. This is part one of a two-part story. Part two will deal with how Intel is addressing the problem through what it calls the "More than Moore" approach and its impact on the mobile market. Endnotes : http://www.extremetech.com/wp-content/uploads/2012/02/CPU-Scaling.jpg The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software: http://www.gotw.ca/publications/concurrency-ddj.htm fought back with innovations: http://www.extremetech.com/extreme/106899-beyond-22nm-applied-materials-the-unsung-silicon-hero The multi-core swerve: http://www.extremetech.com/computing/116561-the-death-of-cpu-scaling-from-one-core-to-many-and-why-were-still-stuck/2 : http://www.extremetech.com/wp-content/uploads/2012/02/Amdahl.png a slower end product: http://www.extremetech.com/computing/100583-analyzing-bulldozers-scaling-single-thread-performance : http://www.extremetech.com/wp-content/uploads/2012/02/Rage-Jobs.jpg Rage: http://www.extremetech.com/gaming/99729-deconstructing-rage-what-went-wrong-and-how-to-fix-it The rise (and limit) of Many-Core: http://www.extremetech.com/computing/116561-the-death-of-cpu-scaling-from-one-core-to-many-and-why-were-still-stuck/3 Knights Corner: http://www.extremetech.com/extreme/73426-intel-plans-specialized-50core-chip :
http://www.extremetech.com/wp-content/uploads/2012/02/Scaling1.jpg Graphics Core Next: http://www.extremetech.com/computing/110133-radeon-hd-7970-one-gpu-to-rule-them-all : http://www.extremetech.com/wp-content/uploads/2012/02/ManyCoreAMD.jpg "wimpy" cores: http://www.extremetech.com/computing/112319-creative-announces-100-core-system-on-a-chip DARPA's exascale computing initiative: http://www.extremetech.com/computing/116081-darpa-summons-researchers-to-reinvent-computing From eugen at leitl.org Thu Feb 9 06:42:55 2012 From: eugen at leitl.org (Eugen Leitl) Date: Thu, 9 Feb 2012 12:42:55 +0100 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking Message-ID: <20120209114255.GC7343@leitl.org> http://www.extremetech.com/computing/117377-engineers-boost-amd-cpu-performance-by-20-without-overclocking Engineers boost AMD CPU performance by 20% without overclocking By Sebastian Anthony on February 7, 2012 at 12:44 pm AMD Llano APU die (GPU on the right) Engineers at North Carolina State University have used a novel technique to boost the performance of an AMD Fusion APU by more than 20%. This speed-up was achieved purely through software and using commercial (probably Llano) silicon. No overclocking was used. In an AMD APU there is both a CPU and GPU, both on the same piece of silicon. In conventional applications -- in a Llano-powered laptop, for example -- the CPU and GPU hardly talk to each other; the CPU does its thing, and the GPU pushes polygons. What the researchers have done is to marry the CPU and GPU together to take advantage of each core's strengths.
To achieve the 20% boost, the researchers reduce the CPU to a fetch/decode unit, and the GPU becomes the primary computation unit. This works out well because CPUs are generally very strong at fetching data from memory, and GPUs are essentially just monstrous floating point units. In practice, this means the CPU is focused on working out what data the GPU needs (pre-fetching), the GPU's pipes stay full, and a 20% performance boost arises. Now, unfortunately we don't have the exact details of how the North Carolina researchers achieved this speed-up. We know it's in software, but that's about it. The team probably wrote a very specific piece of code (or a compiler) that uses the AMD APU in this way. The press release doesn't say "Windows ran 20% faster" or "Crysis 2 ran 20% faster," which suggests we're probably looking at a synthetic, hand-coded benchmark. We will know more when the team presents its research on February 27 at the International Symposium on High Performance Computer Architecture. For what it's worth, this kind of CPU/GPU integration is exactly what AMD is angling for with its Heterogeneous System Architecture (formerly known as Fusion System Architecture). AMD has a huge advantage over Intel when it comes to GPUs, but that means nothing if the software chain (compilers, libraries, developers) isn't in place. The good news is that Intel doesn't have anything even remotely close to AMD's APU coming down the pipeline, which means AMD has a few years to see where this HSA path leads. If the 20% speed boost can be brought to market in the next year or two, AMD might actually have a chance. Updated @ 17:54: The co-author of the paper, Huiyang Zhou, was kind enough to send us the research paper. It seems production silicon wasn't actually used; instead, the software tweaks were carried out on a simulated future AMD APU with shared L3 cache (probably Trinity). It's also worth noting that AMD sponsored and co-authored this paper.
Updated @ 04:11 Some further clarification: Basically, the research paper is a bit cryptic. It seems the engineers wrote some real code, but executed it on a simulated AMD CPU with L3 cache (i.e. probably Trinity). It does seem like their working is correct. In other words, this is still a good example of the speed-ups that heterogeneous systems will bring... in a year or two. Read more at North Carolina State University From hahn at mcmaster.ca Thu Feb 9 11:20:32 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Thu, 9 Feb 2012 11:20:32 -0500 (EST) Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: <20120209114255.GC7343@leitl.org> References: <20120209114255.GC7343@leitl.org> Message-ID: > http://www.extremetech.com/computing/117377-engineers-boost-amd-cpu-performance-by-20-without-overclocking afaict, they discovered that using the cpu to prefetch for the gpu is a win. this is either obvious or quite strange - the latter because one of the basic principles of gpu programming is to have several times more threads than cores in order to let the scheduler hide latency.
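The division of labor the article describes -- one unit staging the next chunk of data while another computes on the current one -- is a classic double-buffered pipeline. A minimal sketch using plain Python threads as stand-ins for the CPU (fetcher) and GPU (compute); all names are invented for illustration, and this is not the paper's actual mechanism:

```python
import threading
import queue

def fetcher(chunks, q):
    """'CPU' role: stage data ahead of the consumer."""
    for chunk in chunks:
        q.put(chunk)   # blocks when the in-flight buffer is full
    q.put(None)        # end-of-stream marker

def compute(q, results):
    """'GPU' role: crunch whatever has already been staged."""
    while (chunk := q.get()) is not None:
        results.append(sum(x * x for x in chunk))

data = [list(range(i, i + 4)) for i in range(0, 12, 4)]
buf = queue.Queue(maxsize=2)   # at most two chunks in flight
out = []

t = threading.Thread(target=fetcher, args=(data, buf))
t.start()
compute(buf, out)
t.join()
print(out)  # -> [14, 126, 366]
```

The bounded queue is the key design choice: the fetcher runs just far enough ahead to hide fetch latency without unbounded buffering, which is the same effect the CPU-as-prefetcher scheme (and Mark's thread-oversubscription point) is after.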
From hahn at mcmaster.ca Thu Feb 9 11:44:18 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Thu, 9 Feb 2012 11:44:18 -0500 (EST) Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: References: <20120209114255.GC7343@leitl.org> Message-ID: > I am afraid to use again AMD CPUs. choosing chips should not be about fear. > I already have 3 Intel 7 2600 and One > Laptop i7 2630. I do algoritms that needs much powerfull processor. > I bought one AMD 8120, after 24 hours , I have switched for other 2600 K. > Unless AMD will make one powerfull Processor. I will continue using Intel. I'm curious whether you have any insight into what about your workload fared poorly on the AMD chip. these particular models are similar in their cache and clocks; I wonder whether your experience could be due to, for instance, code that spends all its time in cache misses (where temporal interleaving with HT might work well.) or whether the AMD chip was not adequately cooled, preventing it from scaling its clock. or whether your test was hurt by the now well-known module/L1 scheduling issue. specrateFP results seem to indicate AMD is not doing badly. of course, the Intel system is also more expensive.
From diep at xs4all.nl Fri Feb 10 08:58:21 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 10 Feb 2012 14:58:21 +0100 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: <20120209114255.GC7343@leitl.org> References: <20120209114255.GC7343@leitl.org> Message-ID: <4A633EAC-CD78-44A7-9C82-455CAD09B89F@xs4all.nl> On Feb 9, 2012, at 12:42 PM, Eugen Leitl wrote: > > http://www.extremetech.com/computing/117377-engineers-boost-amd-cpu- > performance-by-20-without-overclocking Seems that they used a GPGPU application and had the cpu help speed up the gpgpu by also helping to calculate. So the gpu doesn't help the cpu. So the article title is wrong. It should be: engineers boost AMD gpu performance by 20% by having the CPU give a hand From james.p.lux at jpl.nasa.gov Fri Feb 10 12:00:02 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 10 Feb 2012 09:00:02 -0800 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: <4A633EAC-CD78-44A7-9C82-455CAD09B89F@xs4all.nl> Message-ID: Expecting headlines to be accurate is a fool's errand... Be glad it actually said AMD. It's bad enough that the article writers often erroneously summarize, but a still different person writes the headline. Back in print days, the headline had to "look nice." Today, it's probably more about SEO to drive traffic. But both lead to interesting headlines that don't necessarily reflect the content of the article.
On 2/10/12 5:58 AM, "Vincent Diepeveen" wrote: > >On Feb 9, 2012, at 12:42 PM, Eugen Leitl wrote: > >> >> http://www.extremetech.com/computing/117377-engineers-boost-amd-cpu- >> performance-by-20-without-overclocking > >Seems that they used a GPGPU application and had the cpu help speedup >the gpgpu by also helping to calculate. > >So the gpu doesn't help the cpu. So the article title is wrong. > >It should be : engineers boost AMD gpu performance by 20% by having >the CPU give a hand > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Fri Feb 10 12:08:54 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Fri, 10 Feb 2012 12:08:54 -0500 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: References: Message-ID: <4F354F26.2040103@scalableinformatics.com> On 02/10/2012 12:00 PM, Lux, Jim (337C) wrote: > Expecting headlines to be accurate is a fool's errand... > Be glad it actually said AMD. Expecting article contents to reflect in any reasonable way upon reality may be a similar problem. There are a few, precious few writers who really grok the technology because they live it: Doug Eadline, Jeff Layton, Henry Newman, Chris Mellor, Dan Olds, Rich Brueckner, ... . The vast majority of articles I've had some contact with the authors on (not in the above group) have been erroneous to the point of being completely non-informational.
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From diep at xs4all.nl Fri Feb 10 12:48:08 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 10 Feb 2012 18:48:08 +0100 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: <4F354F26.2040103@scalableinformatics.com> References: <4F354F26.2040103@scalableinformatics.com> Message-ID: Another interesting question is how a few CPU cores would be able to speed up a typical single precision gpgpu application by 20%. That would mean that the gpu is really slow, especially if we realize this is just 1 or 2 CPU cores or so. Your gpgpu code really has to kind of be not so very professional to have 2 cpu cores already contribute some 20% to that. For most gpgpu codes here, on a modern GPU you need about 200+ cpu cores, and that's usually codes which do not run optimally at gpu's, as it has to do with huge prime numbers, so simulating that at a 64 bits cpu is more efficient than a 32 bits gpu. So in their case the claim is that for their experiments, assuming 2 cpu cores, that would be 20%. Means we have a gpu that's 20x slower or so than a fermi at 512 cores/HD6970 @ 1536. 1536 / 20 = 76.8 gpu streamcores. That's AMD Processing Element count. for nvidia this is similar to 76.8 / 4 = 19.2 cores This laptop is from 2007, sure it is a macbookpro 17'' apple, has a core2 duo 2.4Ghz and has a Nvidia GT 8600M with 32 CUDA cores.
So if we extrapolate back, the built in gpu is gonna kick that new AMD chip, right? Vincent On Feb 10, 2012, at 6:08 PM, Joe Landman wrote: > On 02/10/2012 12:00 PM, Lux, Jim (337C) wrote: >> Expecting headlines to be accurate is a fool's errand... >> Be glad it actually said AMD. > > Expecting articles contents to reflect in any reasonable way upon > reality may be a similar problem. There are a few, precious few > writers > who really grok the technology because they live it: Doug Eadline, > Jeff > Layton, Henry Newman, Chris Mellor, Dan Olds, Rich Brueckner, ... . > > The vast majority of articles I've had some contact with the > authors on > (not in the above group) have been erroneous to the point of being > completely non-informational. > > > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics Inc. > email: landman at scalableinformatics.com > web : http://scalableinformatics.com > http://scalableinformatics.com/sicluster > phone: +1 734 786 8423 x121 > fax : +1 866 888 3112 > cell : +1 734 612 4615 > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
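Vincent's extrapolation is easy to re-run. A sketch using only the post's own assumptions (HD6970 = 1536 AMD processing elements, roughly 4 AMD PEs per Nvidia CUDA core, and a hypothetical GPU "20x slower" than those parts):

```python
# Reproducing the back-of-envelope numbers from the post above
# (its assumptions, not measured data).
hd6970_pes = 1536        # AMD processing elements in an HD6970
slowdown = 20            # "20x slower" GPU implied by the 20% claim
amd_pe_per_cuda = 4      # rough AMD-PE to CUDA-core equivalence

amd_pes = hd6970_pes / slowdown       # 76.8 AMD processing elements
cuda_equiv = amd_pes / amd_pe_per_cuda  # 19.2 CUDA-core equivalents
gt8600m_cores = 32       # the 2007 MacBook Pro GPU cited in the post

print(f"{amd_pes} AMD PEs ~ {cuda_equiv} CUDA cores "
      f"(vs {gt8600m_cores} CUDA cores in a GT 8600M)")
```

Under those assumptions the implied GPU lands below a 2007 laptop part, which is the absurdity the post is pointing at; the weak link, of course, is the assumption that the 20% contribution scales linearly with core counts.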
From Shainer at Mellanox.com Fri Feb 10 19:36:45 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Sat, 11 Feb 2012 00:36:45 +0000 Subject: [Beowulf] HPC Advisory Council and Swiss Supercomputing Centre to host HPC Switzerland Conference 2012 Message-ID: Sending on behalf of the HPC Advisory Council and the Swiss Supercomputing Center Date: March 13-15, 2012 Location: Palazzo dei Congressi, Lugano, Switzerland The HPC Advisory Council and the Swiss Supercomputing Centre will host the HPC Advisory Council Switzerland Conference 2012 in the Lugano Convention Centre, Lugano, Switzerland, from March 13-15, 2012. The conference will focus on High-Performance Computing (HPC) education, hands-on and classroom training and overviews of new important HPC developments and trends. The conference will include comprehensive education for topics such as high-performance and parallel I/O, communication libraries (such as MPI, SHMEM and PGAS), GPU and accelerations, Big Data, high-performance cloud computing, high-speed interconnects, and will include advanced topics and development for upcoming HPC technologies. In addition, attendees will receive hands-on training for topics on clustering, network, troubleshooting, tuning, and optimizations. For the complete agenda and schedule, please refer to the conference website - http://www.hpcadvisorycouncil.com/events/2012/Switzerland-Workshop/index.php. The 3-day conference fee is CHF 80.00. Registration is required and can be made at the HPC Advisory Council Switzerland Conference registration page. Media sponsorship and coverage is being provided by HPC-CH, insideHPC and Scientific Computing World. Thanks, Gilad -------------- next part -------------- An HTML attachment was scrubbed...
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Thu Feb 16 10:26:53 2012 From: eugen at leitl.org (Eugen Leitl) Date: Thu, 16 Feb 2012 16:26:53 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right Message-ID: <20120216152653.GQ7343@leitl.org> http://www.fragmentationneeded.net/2011/12/pricing-and-trading-networks-down-is-up.html Pricing and Trading Networks: Down is Up, Left is Right My introduction to enterprise networking was a little backward. I started out supporting trading floors, backend pricing systems, low-latency algorithmic trading systems, etc... I got there because I'd been responsible for UNIX systems producing and consuming multicast data at several large financial firms. Inevitably, the firm's network admin folks weren't up to speed on matters of performance tuning, multicast configuration and QoS, so that's where I focused my attention. One of these firms offered me a job with the word "network" in the title, and I was off to the races. It amazes me how little I knew in those days. I was doing PIM and MSDP designs before the phrases "link state" and "distance vector" were in my vocabulary! I had no idea what was populating the unicast routing table of my switches, but I knew that the table was populated, and I knew what PIM was going to do with that data. More incredible is how my ignorance of "normal" ways of doing things (AVVID, SONA, Cisco Enterprise Architecture, multi-tier designs, etc...) gave me an advantage over folks who had been properly indoctrinated. My designs worked well for these applications, but looked crazy to the rest of the network staff (whose underperforming traditional designs I was replacing).
The trading floor is a weird place, with funny requirements. In this post I'm going to go over some of the things that make trading floor networking... Interesting. Redundant Application Flows The first thing to know about pricing systems is that you generally have two copies of any pricing data flowing through the environment at any time. Ideally, these two sets originate from different head-end systems, get transit from different wide area service providers, ride different physical infrastructure into opposite sides of your data center, and terminate on different NICs in the receiving servers. If you're getting data directly from an exchange, that data will probably be arriving as multicast flows. Redundant multicast flows. The same data arrives at your edge from two different sources, using two different multicast groups. If you're buying data from a value-add aggregator (Reuters, Bloomberg, etc...), then it probably arrives via TCP from at least two different sources. The data may be duplicate copies (redundancy), or be distributed among the flows with an N+1 load-sharing scheme. Losing One Packet Is Bad Most application flows have no problem with packet loss. High performance trading systems are not in this category. Think of the state of the pricing data like a spreadsheet. Each row represents a security -- something that traders buy and sell. The columns represent attributes of that security: bid price, ask price, daily high and low, last trade price, last trade exchange, etc... Our spreadsheet has around 100 columns and 200,000 rows. That's 20 million cells. Every message that rolls in from a multicast feed updates one of those cells. You just lost a packet. Which cell is wrong? Easy answer: All of them. If a trader can't trust his data, he can't trade. These applications have repair mechanisms, but they're generally slow and/or clunky. Some of them even involve touch tone.
Really: The Securities Industry Automation Corporation (SIAC) provides a retransmission capability for the output data from host systems. As part of this service, SIAC provides the AutoLink facility to assist vendors with requesting retransmissions by submitting requests over a touch-tone telephone set. Reconvergence Is Bad Because we've got two copies of the data coming in, there's no reason to fix a single failure. If something breaks, you can let it stay broken until the end of the day. What's that? You think it's worth fixing things with a dynamic routing protocol? Okay cool, route around the problem. Just so long as you can guarantee that "flow A" and "flow B" never traverse the same core router. Why am I paying for two copies of this data if you're going to push it through a single device? You just told me that the device is so fragile that you feel compelled to route around failures! Don't Cluster the Firewalls The same reason we don't let routing reconverge applies here. If there are two pricing firewalls, don't tell them about each other. Run them as standalone units. Put them in separate rooms, even. We can afford to lose half of a redundant feed. We cannot afford to lose both feeds, even for the few milliseconds required for the standby firewall to take over. Two clusters (four firewalls) would be okay, just keep the "A" and "B" feeds separate! Don't team the server NICs The flow-splitting logic applies all the way down to the servers. If they've got two NICs available for incoming pricing data, these NICs should be dedicated per-flow. Even if there are NICs-a-plenty, the teaming schemes are all bad news because like flows, application components are also disposable. It's okay to lose one. Getting one back? That's sometimes worse. Keep reading... Recovery Can Kill You Most of these pricing systems include a mechanism for data receivers to request retransmission of lost data, but the recovery can be a problem.
With few exceptions, the network applications in use on the trading floor don't do any sort of flow control. It's like they're trying to hurt you. Imagine a university lecture where a sleeping student wakes up, asks the lecturer to repeat the last 30 minutes, and the lecturer complies. That's kind of how these systems work. Except that the lecturer complies at wire speed, and the whole lecture hall full of students is compelled to continue taking notes. Why should every other receiver be penalized because one system screwed up? I've got trades to clear! The following snapshot is from the Cisco CVD for trading systems. It shows how aggressive these systems can be. A nominal 5Mb/s trading application regularly hits wire-speed (100Mb/s) in this case. The graph shows a small network when things are working right. A big trading backend at a large financial services firm can easily push that green line into the multi-gigabit range. Make things interesting by breaking stuff and you'll over-run even your best 10Gb/s switch buffers (6716 cards have 90MB per port) easily. Slow Servers Are Good Lots of networks run with clients deliberately connected at slower speeds than their server. Maybe you have 10/100 ports in the wiring closet and gigabit-attached servers. Pricing networks require exactly the opposite. The lecturer in my analogy isn't just a single lecturer. It's a team of lecturers. They all go into wire-speed mode when the sleeping student wakes up. How will you deliver multiple simultaneous gigabit-ish multicast streams to your access ports? You can't. I've fixed more than one trading system by setting server interfaces down to 100Mb/s or even 10Mb/s. Fast clients, slow servers is where you want to be. Slowing down the servers can turn N*1Gb/s worth of data into N*100Mb/s -- something we can actually handle. Bad Apple Syndrome The sleeping student example is actually pretty common.
It's amazing to see the impact that can arise from things like: a clock update on a workstation ripping a CD with iTunes briefly closing the lid on a laptop The trading floor is usually a population of Windows machines with users sitting behind them. Keeping these things from killing each other is a daunting task. One bad apple will truly spoil the bunch. How Fast Is It? System performance is usually measured in terms of stuff per interval. That's meaningless on the trading floor. The opening bell at NYSE is like turning on a fire hose. The only metric that matters is the answer to this question: Did you spill even one drop of water? How close were you to the limit? Will you make it through tomorrow's trading day too? I read on twitter that Ben Bernanke got a bad piece of fish for dinner. How confident are you now? Performance of these systems is binary. You either survived or you did not. There is no "system is running slow" in this world. Routing Is Upside Down While not unique to trading floors, we do lots of multicast here. Multicast is funny because it relies on routing traffic away from the source, rather than routing it toward the destination. Getting into and staying in this mindset can be a challenge. I started out with no idea how routing worked, so had no problem getting into the multicast mindset :-) NACK not ACK Almost every network protocol relies on data receivers ACKnowledging their receipt of data. But not here. Pricing systems only speak up when something goes missing. QoS Isn't The Answer QoS might seem like the answer to make sure that we get through the day smoothly, but it's not. In fact, it can be counterproductive. QoS is about managed un-fairness... Choosing which packets to drop. But pricing systems are usually deployed on dedicated systems with dedicated switches. Every packet is critical, and there's probably more of them than we can handle. There's nothing we can drop. 
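The NACK-only discipline described above (receivers stay silent until a sequence gap shows that something was lost) can be sketched as follows. This is an illustrative model, not any vendor's actual API; a real feed handler would also rate-limit its NACKs, precisely because of the retransmission storms the article warns about:

```python
class NackReceiver:
    """Receiver side of a NACK-based feed: say nothing while data
    flows, and request retransmission only when a gap in the
    sequence numbers shows that one or more updates were lost."""

    def __init__(self):
        self.expected = 1      # next sequence number we should see
        self.nacks_sent = []   # sequence numbers we asked to be resent

    def on_message(self, seq):
        if seq > self.expected:
            # Gap detected: NACK everything between the last update we
            # saw and this one. Unthrottled, this is what triggers the
            # wire-speed replay that punishes every other receiver.
            self.nacks_sent.extend(range(self.expected, seq))
        if seq >= self.expected:
            self.expected = seq + 1
```

For example, receiving sequence numbers 1, 2, 5 makes the receiver NACK 3 and 4; a late retransmission of 3 is then consumed without moving the expected counter backward.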
Making matters worse, enabling QoS on many switching platforms reduces the buffers available to our critical pricing flows, because the buffers necessarily get carved so that they can be allocated to different kinds of traffic. It's counterintuitive, but 'no mls qos' is sometimes the right thing to do. Load Balancing Ain't All It's Cracked Up To Be By default, CEF doesn't load balance multicast flows. CEF load balancing of multicast can be enabled and enhanced, but doesn't happen out of the box. We can get screwed on EtherChannel links too: Sometimes these quirky applications intermingle unicast data with the multicast stream. Perhaps a latecomer to the trading floor wants to start watching Cisco's stock price. Before he can begin, he needs all 100 cells associated with CSCO. This is sometimes called the "Initial Image." He ignores updates for CSCO until he's got that starting point loaded up. CSCO has updated 9000 times today, so the server unicasts the initial image: "Here are all 100 cells for CSCO as of update #9000: blah blah blah...". Then the price changes, and the server multicasts update #9001 to all receivers. If there's a load balanced path (either CEF or an aggregate link) between the server and client, then our new client could get update 9001 (multicast) before the initial image (unicast) shows up. The client will discard update 9001 because he's expecting a full record, not an update to a single cell. Next, the initial image shows up, and the client knows he's got everything through update #9000. Then update #9002 arrives. Hey, what happened to #9001? Post-mortem analysis of these kinds of incidents will boil down to the software folks saying: We put the messages on the wire in the correct order. They were delivered by the network in the wrong order. ARP Times Out NACK-based applications sit quietly until there's a problem. So quietly that they might forget the hardware address associated with their gateway or with a neighbor. No problem, right?
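One common client-side defense against the initial-image race described above (not taken from the article; the class and field names are illustrative) is to buffer live multicast updates until the unicast snapshot arrives, then replay only the buffered updates whose sequence numbers postdate the snapshot:

```python
class LateJoiner:
    """Buffer multicast updates until the unicast initial image
    arrives, then apply only the updates newer than the snapshot.
    This tolerates update #9001 arriving before image #9000."""

    def __init__(self):
        self.image_seq = None  # sequence number of the initial image
        self.pending = []      # updates seen before the image arrived
        self.cells = {}        # field -> value, e.g. 'bid' -> 18.25

    def on_update(self, seq, field, value):
        if self.image_seq is None:
            self.pending.append((seq, field, value))  # can't apply yet
        elif seq > self.image_seq:
            self.cells[field] = value

    def on_initial_image(self, seq, image):
        self.cells = dict(image)
        self.image_seq = seq
        # Replay buffered updates that postdate the snapshot; drop the
        # rest, since the image already reflects them.
        for s, f, v in sorted(self.pending):
            if s > seq:
                self.cells[f] = v
        self.pending.clear()
```

With this scheme, the reordered delivery the post-mortem argues about simply doesn't matter: update #9001 waits in the buffer until image #9000 lands, then gets applied on top of it.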
ARP will figure it out... Eventually. Because these are generally UDP-based applications without flow control, the system doesn't fire off a single packet, then sit and wait like it might when talking TCP. No, these systems can suddenly kick off a whole bunch of UDP datagrams destined for a system it hasn't talked to in hours. The lower layers in the IP stack need to hold onto these packets until the ARP resolution process is complete. But the packets keep rolling down the stack! The outstanding ARP queue is only 1 packet deep in many implementations. The queue overflows and data is lost. It's not strictly a network problem, but don't worry. Your phone will ring. Losing Data Causes You to Lose Data There's a nasty failure mode underlying the NACK-based scheme. Lost data will be retransmitted. If you couldn't handle the data flow the first time around, why expect to handle wire speed retransmission of that data on top of the data that's coming in the next instant? If the data loss was caused by a Bad Apple receiver, then all his peers suffer the consequences. You may have many bad apples in a moment. One Bad Apple will spoil the bunch. If the data loss was caused by an overloaded network component, then you're rewarded by compounding increases in packet rate. The exchanges don't stop trading, and the data sources have a large queue of data to re-send. TCP applications slow down in the face of congestion. Pricing applications speed up. Packet Decodes Aren't Available Some of the wire formats you'll be dealing with are closed-source secrets. Others are published standards for which no WireShark decodes are publicly available. Either way, you're pretty much on your own when it comes to analysis. Updates Responding to Will's question about data sources: The streams come from the various exchanges (NASDAQ, NYSE, FTSE, etc...) 
Because each of these exchanges use their own data format, there's usually some layers of processing required to get them into a common format for application consumption. This processing can happen at a value-add data distributor (Reuters, Bloomberg, Activ), or it can be done in-house by the end user. Local processing has the advantage of lower latency because you don't have to have the data shipped from the exchange to a middleman before you see it. Other streams come from application components within the company. There are usually some layers of processing (between 2 and 12) between a pricing update first hitting your equipment, and when that update is consumed by a trader. The processing can include format changes, addition of custom fields, delay engines (delayed data can be given away for free), vendor-switch systems (I don't trust data vendor "A", switch me to "B"), etc... Most of those layers are going to be multicast, and they're going to be the really dangerous ones, because the sources can clobber you with LAN speeds, rather than WAN speeds. As far as getting the data goes, you can move your servers into the exchange's facility for low-latency access (some exchanges actually provision the same length of fiber to each colocated customer, so that nobody can claim a latency disadvantage), you can provision your own point-to-point circuit for data access, you can buy a fat local loop from a financial network provider like BT/Radianz (probably MPLS on the back end so that one local loop can get you to all your pricing and clearing partners), or you can buy the data from a value-add aggregator like Reuters or Bloomberg. Responding to Will's question about SSM: I've never seen an SSM pricing component. They may be out there, but they might not be a super good fit. Here's why: Everything in these setups is redundant, all the way down to software components. It's redundant in ways we're not used to seeing in enterprises. No load-balancer required here. 
The software components collaborate and share workload dynamically. If one ticker plant fails, his partner knows what update was successfully transmitted by the dead peer, and takes over from that point. Consuming systems don't know who the servers are, and don't care. A server could be replaced at any moment. In fact, it's not just downstream pricing data that's multicast. Many of these systems use a model where the clients don't know who the data sources are. Instead of sending requests to a server, they multicast their requests for data, and the servers multicast the replies back. Instead of: hello server, nice to meet you. I'd like such-and-such. it's actually: hello? servers? I'd like such-and-such! I'm ready, so go ahead and send it whenever... Not knowing who your server is kind of runs counter to the SSM ideal. It could be done with a pool of servers, I've just never seen it. The exchanges are particularly slow-moving when it comes to changing things. The modern exchange feed, particularly ones like the "touch tone" example I cited are literally ticker-tape punch signals wrapped up in an IP multicast header. The old school scheme was to have a ticker tape machine hooked to a "line" from the exchange. Maybe you'd have two of them (A and B again). There would be a third one for retransmit. Ticker machine run out of paper? Call the exchange, and here's more-or-less what happens: Cut the chunk of paper containing the updates you missed out of their spool of tape. Scissors are involved here. Grab a bit of header tape that says: "this is retransmit data for XYZ Bank". Tape these two pieces of paper together, and feed them through a reader that's attached to the "retransmit line" Every bank in New York will get the retransmits, but they'll know to ignore them. XYZ Bank clips the retransmit data out of the retransmit ticker machine, and pastes it into place on the end where the machine ran out of paper. These terms "tick" "line" and "retransmit", etc... 
all still apply with modern IP based systems. I've read the developer guides for these systems (to write wireshark decodes), and it's like a trip back in time. Some of these systems are still so closely coupled to the paper-punch system that you get chads all over the floor and paper cuts all over your hands just from reading the API guide :-) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From diep at xs4all.nl Thu Feb 16 11:26:08 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Thu, 16 Feb 2012 17:26:08 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <20120216152653.GQ7343@leitl.org> References: <20120216152653.GQ7343@leitl.org> Message-ID: Yes, very good article. In fact it's even clumsier than most would guess. For those who didn't get the problem described in the article: there are two data feeds, A and B, that ship 'market data' at most exchanges. Basically all big exchanges work similarly there, especially the derivatives exchanges; most of them have a very similar protocol, and that's the only spot where you can really still make money based upon speed. The market data is lists of what price you can get something for (say, a future on Mellanox; MLNX is its ticker), and what price it sells for. It's an incremental update, however, and the feeds are raw UDP, not TCP. TCP, as we know, is about the only protocol with loss recovery well fixed, so the raw format poses a big problem there. On paper it is indeed possible to ask for retransmission on a different channel, but in case of market surges that's going to be too slow, of course. The other feed you also cannot fall back on, as one of the two feeds A and B is going to be a lot faster than the other.
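The way consumers normally handle the redundant A and B feeds Vincent describes is line arbitration: deliver each sequence number from whichever feed gets it there first, and drop the late duplicate from the slower feed. A minimal sketch (illustrative names, not any exchange's actual API):

```python
class FeedArbiter:
    """Merge the redundant A and B market-data feeds: deliver each
    sequence number exactly once, from whichever feed wins the race."""

    def __init__(self):
        self.delivered = 0   # highest sequence number passed upstream

    def on_packet(self, feed, seq, payload):
        # Which feed ("A" or "B") carried the packet is irrelevant to
        # the decision: the first copy of a given seq wins, and the
        # late duplicate from the other line is silently dropped.
        if seq <= self.delivered:
            return None
        self.delivered = seq
        return payload
```

Note that if feed A stalls, the B copies simply start winning; no failover event occurs, which is exactly why the article insists the two paths share no hardware.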
Even the very slow IBM software, which you can buy publicly if you google for it, claims on their Twitter feed a total 'market data' processing time of around 7-11 us (microseconds); the trading itself (buying and selling) usually happens over a TCP connection. Of course you can forget trading from a platform or computer that's not at the exchange itself. Just the latency of receiving data from a datacenter with an ocean in between is measured in milliseconds, a factor of 1000 slower than microseconds. So trading like that won't make you much of a profit, except if you use it to compare two different exchanges with each other and try to profit from the difference. That, however, basically requires you to be a billion-dollar company, as you need quite some infrastructure for that to be fast. Speaking of big cash being an advantage: some exchanges offer, if you pay really big cash, a faster connection (10 gigabit, versus 1 gigabit for cheap dollars). Most traders won't be able to pay for that big connection. So it's funny that different exchanges get mentioned here, as a small trader with a local machine is only fast enough for one exchange. But now the weirdest thing: I offered at different places to write sub-microsecond software to parse that market data, but no one wants to hire for that, it seems; they just look for Java coders. An example is Morgan Chase: of their job offers, which I receive daily, 99% are Java jobs. Java is of course TOO SLOW for working with trading data. Much better is C/C++, and assembler after you have already achieved that sub-microsecond in C. Note that the FPGAs that are advertised mostly cost millions, and the only latency quote I saw there is 2 microseconds, which also sounds a bit slow to me, but well. Furthermore, they usually hire FINANCIALS who happen to be able to program.
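For a sense of scale, the parsing work Vincent is talking about is small and fixed-layout. Here is a decode of an entirely made-up incremental-update wire format (the field layout is a hypothetical illustration, not any real exchange's protocol); a sub-microsecond handler would of course do this in C or on an FPGA rather than in Python:

```python
import struct

# Hypothetical fixed-layout update message: sequence number (u64),
# instrument id (u32), field id (u16), price in ten-thousandths (i64),
# all big-endian. Real exchange formats differ; this is only a sketch.
UPDATE = struct.Struct('>QIHq')

def parse_update(buf):
    """Decode one incremental update from a raw UDP payload."""
    seq, instrument, field, price_e4 = UPDATE.unpack_from(buf)
    return {'seq': seq, 'instrument': instrument,
            'field': field, 'price': price_e4 / 10_000}

# Build a sample message: update #9001, bid (field 1) on instrument 42.
msg = UPDATE.pack(9001, 42, 1, 182_500)
```

The point is that the hot path is a handful of fixed-offset loads and an integer scale; the engineering difficulty is doing that within a cache-friendly, allocation-free loop, not the decode logic itself.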
They are so far behind in this world - not seldom because many of the IT managers I spoke with at financial companies hardly have a high school degree - and they don't take any risk. What's the risk of paying one programmer to make a faster framework for you? We're speaking of hundreds-of-millions to billion-dollar companies here which don't take that risk. Speed is of course everything. It's an NSA-style game now, and I keep wondering why only a few hedge funds hire such persons; the majority of the traders over here really run so far behind that you would be shocked if you realized how much profit they miss because of this. Most importantly - and now I'm going to say something most on this list understand, but which is nearly impossible to explain to financial guys - being very fast removes a worst case. The odds you go bankrupt are A LOT SMALLER during market surges, and you lose less during surges your algorithms did not predict. Speaking of algorithms: the word 'algorithm' is too heavy a word for the financial world; 'gibberish' is a better wording. But then I say that as someone who has made pretty complex algorithms for patterns in computer chess - it also took me 15 years to learn how to do that. Please note that it's wishful thinking to guess anything would change in how trading happens at exchanges: if one nation modified something, traders would just move to a different exchange. Right now CME (Chicago) is the biggest derivatives market; it seems they're trying to create a bigger one in Europe. I'm pretty sure I don't betray any banking secrecy code if I call them very clever - if they learn one day what a computer is, that is. And as none of the traders will *ever* be busy 'improving' the system, of course no politicians have any clue what happens over there, and what sort of an NSA-style race for speed it has become. Though I'm very good at that, I'm not sure whether I like it.
During a congressional hearing in the US a year or 2 ago or so, one academic who clearly realized the problem, stated that he wanted to introduce a penalty directly after trading - as he had figured out that some traders trade like 200 times a second in the same future Mellanox (same instrument this is called in traders terminology), to keep using the same example. He wanted to 'solve' that problem by introducing a rule that after trading in an instrument one would need to wait for another 100 milliseconds to trade again in that instrument. A guy from tradeworx hammered that away, as that it would hurt liquidity at the market. Note such academic solutions do not solve the fundamental problem that if someone goes first and is 1 picosecond faster, that he's the one allowed to buy that instrument against that price it was offered for. That there is a delay afterwards doesn't solve that fundamental problem. Furthermore you kind of tease away traders and exchanges don't like that. In the meantime exchanges are upgrading their hardware and moving to new datacenters. Some already migrated past years. So any discussion in politics here already is total outdated as the datacenters got way faster. FTSE for example announced that their total processing time has been reduced to somewhat just above a 100 microseconds and migrated to infiniband. More are too follow there. On Feb 16, 2012, at 4:26 PM, Eugen Leitl wrote: > > http://www.fragmentationneeded.net/2011/12/pricing-and-trading- > networks-down-is-up.html > > Pricing and Trading Networks: Down is Up, Left is Right > > My introduction to enterprise networking was a little backward. I > started out > supporting trading floors, backend pricing systems, low-latency > algorithmic > trading systems, etc... I got there because I'd been responsible > for UNIX > systems producing and consuming multicast data at several large > financial > firms. 
> > Inevitably, the firm's network admin folks weren't up to speed on > matters of > performance tuning, multicast configuration and QoS, so that's where I > focused my attention. One of these firms offered me a job with the > word > "network" in the title, and I was off to the races. > > It amazes me how little I knew in those days. I was doing PIM and MSDP > designs before the phrases "link state" and "distance vector" were > in my > vocabulary! I had no idea what was populating the unicast routing > table of my > switches, but I knew that the table was populated, and I knew what > PIM was > going to do with that data. > > More incredible is how my ignorance of "normal" ways of doing > things (AVVID, > SONA, Cisco Enterprise Architecture, multi-tier designs, etc...) > gave me an > advantage over folks who had been properly indoctrinated. My > designs worked > well for these applications, but looked crazy to the rest of the > network > staff (whose underperforming traditional designs I was replacing). > > The trading floor is a weird place, with funny requirements. In > this post I'm > going to go over some of the things that make trading floor > networking... > Interesting. > > Redundant Application Flows > > The first thing to know about pricing systems is that you generally > have two > copies of any pricing data flowing through the environment at any > time. > Ideally, these two sets originate from different head-end systems, get > transit from different wide area service providers, ride different > physical > infrastructure into opposite sides of your data center, and > terminate on > different NICs in the receiving servers. > > If you're getting data directly from an exchange, that data will > probably be > arriving as multicast flows. Redundant multicast flows. The same > data arrives > at your edge from two different sources, using two different multicast > groups. 
> > If you're buying data from a value-add aggregator (Reuters, Bloomberg, > etc...), then it probably arrives via TCP from at least two different > sources. The data may be duplicate copies (redundancy), or be > distributed > among the flows with an N+1 load-sharing scheme. > > Losing One Packet Is Bad > > Most application flows have no problem with packet loss. High > performance > trading systems are not in this category. > > Think of the state of the pricing data like a spreadsheet. The rows > represents a securities -- something that traders buy and sell. The > columns > represent attributes of that security: bid price, ask price, daily > high and > low, last trade price, last trade exchange, etc... > > Our spreadsheet has around 100 columns and 200,000 rows. That's 20 > million > cells. Every message that rolls in from a multicast feed updates > one of those > cells. You just lost a packet. Which cell is wrong? Easy answer: > All of them. > If a trader can't trust his data, he can't trade. > > These applications have repair mechanisms, but they're generally > slow and/or > clunky. Some of them even involve touch tone. Really: > > The Securities Industry Automation Corporation (SIAC) provides a > retransmission capability for the output data from host systems. > As part of > this service, SIAC provides the AutoLink facility to assist vendors > with > requesting retransmissions by submitting requests over a touch-tone > telephone > set > > Reconvergence Is Bad > > Because we've got two copies of the data coming in. There's no > reason to fix > a single failure. If something breaks, you can let it stay broken > until the > end of the day. > > What's that? You think it's worth fixing things with a dynamic routing > protocol? Okay cool, route around the problem. Just so long as you can > guarantee that "flow A" and "flow B" never traverse the same core > router. Why > am I paying for two copies of this data if you're going to push it > through a > single device? 
You just told me that the device is so fragile that > you feel > compelled to route around failures! > > Don't Cluster the Firewalls > > The same reason we don't let routing reconverge applies here. If > there are > two pricing firewalls, don't tell them about each other. Run them as > standalone units. Put them in separate rooms, even. We can afford > to lose > half of a redundant feed. We cannot afford to lose both feeds, even > for the > few milliseconds required for the standby firewall take over. Two > clusters > (four firewalls) would be okay, just keep the "A" and "B" feeds > separate! > > Don't team the server NICs > > The flow-splitting logic applies all the way down to the servers. > If they've > got two NICs available for incoming pricing data, these NICs should be > dedicated per-flow. Even if there are NICs-a-plenty, the teaming > schemes are > all bad news because like flows, application components are also > disposable. > It's okay to lose one. Getting one back? That's sometimes worse. Keep > reading... > > Recovery Can Kill You > > Most of these pricing systems include a mechanism for data > receivers to > request retransmission of lost data, but the recovery can be a > problem. With > few exceptions, the network applications in use on the trading > floor don't do > any sort of flow control. It's like they're trying to hurt you. > > Imagine a university lecture where a sleeping student wakes up, > asks the > lecturer to repeat the last 30 minutes, and the lecturer complies. > That's > kind of how these systems work. > > Except that the lecturer complies at wire speed, and the whole > lecture hall > full of students is compelled to continue taking notes. Why should > the every > other receiver be penalized because one system screwed up? I've got > trades to > clear! > > The following snapshot is from the Cisco CVD for trading systems. > it shows > how aggressive these systems can be. 
A nominal 5Mb/s trading > application > regularly hits wire-speed (100Mb/s) in this case. > > The graph shows a small network when things are working right. A > big trading > backend at a large financial services firm can easily push that > green line > into the multi-gigabit range. Make things interesting by breaking > stuff and > you'll over-run even your best 10Gb/s switch buffers (6716 cards > have 90MB > per port) easily. > > Slow Servers Are Good > > Lots of networks run with clients deliberately connected at slower > speeds > than their server. Maybe you have 10/100 ports in the wiring closet > and > gigabit-attached servers. Pricing networks require exactly the > opposite. The > lecturer in my analogy isn't just a single lecturer. It's a team of > lecturers. They all go into wire-speed mode when the sleeping > student wakes > up. > > How will you deliver multiple simultaneous gigabit-ish multicast > streams to > your access ports? You can't. I've fixed more than one trading > system by > setting server interfaces down to 100Mb/s or even 10Mb/s. Fast > clients, slow > servers is where you want to be. > > Slowing down the servers can turn N*1Gb/s worth of data into > N*100Mb/s -- > something we can actually handle. > > Bad Apple Syndrome > > The sleeping student example is actually pretty common. It's > amazing to see > the impact that can arise from things like: > > a clock update on a workstation > > ripping a CD with iTunes > > briefly closing the lid on a laptop > > The trading floor is usually a population of Windows machines with > users > sitting behind them. Keeping these things from killing each other is a > daunting task. One bad apple will truly spoil the bunch. > > How Fast Is It? > > System performance is usually measured in terms of stuff per > interval. That's > meaningless on the trading floor. The opening bell at NYSE is like > turning on > a fire hose. 
The only metric that matters is the answer to this question: Did you spill even one drop of water?

How close were you to the limit? Will you make it through tomorrow's trading day too?

I read on Twitter that Ben Bernanke got a bad piece of fish for dinner. How confident are you now? Performance of these systems is binary. You either survived or you did not. There is no "system is running slow" in this world.

Routing Is Upside Down

While not unique to trading floors, we do lots of multicast here. Multicast is funny because it relies on routing traffic away from the source, rather than routing it toward the destination. Getting into and staying in this mindset can be a challenge. I started out with no idea how routing worked, so had no problem getting into the multicast mindset :-)

NACK not ACK

Almost every network protocol relies on data receivers ACKnowledging their receipt of data. But not here. Pricing systems only speak up when something goes missing.

QoS Isn't The Answer

QoS might seem like the answer to make sure that we get through the day smoothly, but it's not. In fact, it can be counterproductive.

QoS is about managed un-fairness... Choosing which packets to drop. But pricing systems are usually deployed on dedicated systems with dedicated switches. Every packet is critical, and there are probably more of them than we can handle. There's nothing we can drop.

Making matters worse, enabling QoS on many switching platforms reduces the buffers available to our critical pricing flows, because the buffers necessarily get carved so that they can be allocated to different kinds of traffic. It's counterintuitive, but 'no mls qos' is sometimes the right thing to do.

Load Balancing Ain't All It's Cracked Up To Be

By default, CEF doesn't load balance multicast flows.
CEF load balancing of multicast can be enabled and enhanced, but doesn't happen out of the box.

We can get screwed on EtherChannel links too: sometimes these quirky applications intermingle unicast data with the multicast stream. Perhaps a latecomer to the trading floor wants to start watching Cisco's stock price. Before he can begin, he needs all 100 cells associated with CSCO. This is sometimes called the "Initial Image." He ignores updates for CSCO until he's got that starting point loaded up.

CSCO has updated 9000 times today, so the server unicasts the initial image: "Here are all 100 cells for CSCO as of update #9000: blah blah blah...". Then the price changes, and the server multicasts update #9001 to all receivers.

If there's a load-balanced path (either CEF or an aggregate link) between the server and client, then our new client could get update #9001 (multicast) before the initial image (unicast) shows up. The client will discard update #9001 because he's expecting a full record, not an update to a single cell.

Next, the initial image shows up, and the client knows he's got everything through update #9000. Then update #9002 arrives. Hey, what happened to #9001?

Post-mortem analysis of these kinds of incidents will boil down to the software folks saying:

We put the messages on the wire in the correct order. They were delivered by the network in the wrong order.

ARP Times Out

NACK-based applications sit quietly until there's a problem. So quietly that they might forget the hardware address associated with their gateway or with a neighbor.

No problem, right? ARP will figure it out... Eventually. Because these are generally UDP-based applications without flow control, the system doesn't fire off a single packet, then sit and wait like it might when talking TCP.
No, these systems can suddenly kick off a whole bunch of UDP datagrams destined for a system they haven't talked to in hours.

The lower layers in the IP stack need to hold onto these packets until the ARP resolution process is complete. But the packets keep rolling down the stack! The outstanding ARP queue is only 1 packet deep in many implementations. The queue overflows and data is lost. It's not strictly a network problem, but don't worry. Your phone will ring.

Losing Data Causes You to Lose Data

There's a nasty failure mode underlying the NACK-based scheme. Lost data will be retransmitted. If you couldn't handle the data flow the first time around, why expect to handle wire-speed retransmission of that data on top of the data that's coming in the next instant?

If the data loss was caused by a Bad Apple receiver, then all his peers suffer the consequences. You may have many Bad Apples at once. One Bad Apple will spoil the bunch.

If the data loss was caused by an overloaded network component, then you're rewarded by compounding increases in packet rate. The exchanges don't stop trading, and the data sources have a large queue of data to re-send.

TCP applications slow down in the face of congestion. Pricing applications speed up.

Packet Decodes Aren't Available

Some of the wire formats you'll be dealing with are closed-source secrets. Others are published standards for which no Wireshark decodes are publicly available. Either way, you're pretty much on your own when it comes to analysis.

Updates

Responding to Will's question about data sources: The streams come from the various exchanges (NASDAQ, NYSE, FTSE, etc...). Because each of these exchanges uses its own data format, there are usually some layers of processing required to get them into a common format for application consumption.
This processing can happen at a value-add data distributor (Reuters, Bloomberg, Activ), or it can be done in-house by the end user. Local processing has the advantage of lower latency, because you don't have to have the data shipped from the exchange to a middleman before you see it.

Other streams come from application components within the company. There are usually some layers of processing (between 2 and 12) between a pricing update first hitting your equipment and when that update is consumed by a trader. The processing can include format changes, addition of custom fields, delay engines (delayed data can be given away for free), vendor-switch systems (I don't trust data vendor "A", switch me to "B"), etc...

Most of those layers are going to be multicast, and they're going to be the really dangerous ones, because the sources can clobber you with LAN speeds, rather than WAN speeds.

As far as getting the data goes, you can move your servers into the exchange's facility for low-latency access (some exchanges actually provision the same length of fiber to each colocated customer, so that nobody can claim a latency disadvantage), you can provision your own point-to-point circuit for data access, you can buy a fat local loop from a financial network provider like BT/Radianz (probably MPLS on the back end, so that one local loop can get you to all your pricing and clearing partners), or you can buy the data from a value-add aggregator like Reuters or Bloomberg.

Responding to Will's question about SSM: I've never seen an SSM pricing component. They may be out there, but they might not be a super good fit. Here's why: Everything in these setups is redundant, all the way down to software components. It's redundant in ways we're not used to seeing in enterprises. No load-balancer required here. The software components collaborate and share workload dynamically.
If one ticker plant fails, his partner knows what update was successfully transmitted by the dead peer, and takes over from that point. Consuming systems don't know who the servers are, and don't care. A server could be replaced at any moment.

In fact, it's not just downstream pricing data that's multicast. Many of these systems use a model where the clients don't know who the data sources are. Instead of sending requests to a server, they multicast their requests for data, and the servers multicast the replies back. Instead of:

hello server, nice to meet you. I'd like such-and-such.

it's actually:

hello? servers? I'd like such-and-such! I'm ready, so go ahead and send it whenever...

Not knowing who your server is kind of runs counter to the SSM ideal. It could be done with a pool of servers; I've just never seen it.

The exchanges are particularly slow-moving when it comes to changing things. Modern exchange feeds, particularly ones like the "touch tone" example I cited, are literally ticker-tape punch signals wrapped up in an IP multicast header.

The old school scheme was to have a ticker tape machine hooked to a "line" from the exchange. Maybe you'd have two of them (A and B again). There would be a third one for retransmit. Ticker machine run out of paper? Call the exchange, and here's more-or-less what happens:

Cut the chunk of paper containing the updates you missed out of their spool of tape. Scissors are involved here.

Grab a bit of header tape that says: "this is retransmit data for XYZ Bank".

Tape these two pieces of paper together, and feed them through a reader that's attached to the "retransmit line".

Every bank in New York will get the retransmits, but they'll know to ignore them.

XYZ Bank clips the retransmit data out of the retransmit ticker machine, and pastes it into place on the end where the machine ran out of paper.
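The receiver-side behavior the article walks through (speak up only on a sequence gap, and don't discard updates that race ahead of the unicast initial image) can be sketched in a few lines. This is a toy illustration only - the class, method names, and message shape are all invented here, not any real feed handler's API:

```python
# Hypothetical sketch of a NACK-based feed receiver, as described above.
# All names and the message format are invented for illustration; real
# feed handlers (and their wire formats) differ.

class FeedReceiver:
    def __init__(self):
        self.last_seq = None      # last sequence number applied
                                  # (real systems also track open gaps)
        self.have_image = False   # initial image not yet loaded
        self.pending = []         # updates that raced ahead of the image
        self.cells = {}

    def on_initial_image(self, image_seq, cells):
        # Unicast snapshot: "all cells as of update #image_seq".
        self.cells = dict(cells)
        self.last_seq = image_seq
        self.have_image = True
        # Replay updates that arrived before the snapshot instead of
        # discarding them (discarding causes the "#9001 mystery" above).
        for seq, cell, value in sorted(self.pending):
            if seq > image_seq:
                self.on_update(seq, cell, value)
        self.pending = []

    def on_update(self, seq, cell, value):
        if not self.have_image:
            self.pending.append((seq, cell, value))
            return
        if seq <= self.last_seq:
            return  # duplicate, or already covered by the image
        if seq > self.last_seq + 1:
            # Gap detected: the only time a receiver speaks up (NACK).
            self.send_nack(self.last_seq + 1, seq - 1)
        self.cells[cell] = value
        self.last_seq = seq

    def send_nack(self, first, last):
        # Placeholder: a real system would multicast a retransmit
        # request for updates first..last here.
        print(f"NACK: retransmit {first}..{last}")
```

With the buffering in place, an update #9001 that beats the "as of #9000" image to the client is absorbed instead of producing a spurious gap; without it, you get exactly the post-mortem described above.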
> These terms "tick", "line", "retransmit", etc... all still apply with modern IP based systems. I've read the developer guides for these systems (to write Wireshark decodes), and it's like a trip back in time. Some of these systems are still so closely coupled to the paper-punch system that you get chads all over the floor and paper cuts all over your hands just from reading the API guide :-)

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Thu Feb 16 16:29:22 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 16 Feb 2012 13:29:22 -0800
Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right
In-Reply-To: References: <20120216152653.GQ7343@leitl.org>
Message-ID:

They don't hire their High Frequency Trading software people from ads. Personal recommendations, more likely. The ads for Java coders are for run of the mill back end banking stuff. Most banks are doing their enterprise scale work in Java (which is replacing PowerBuilder, RPG, and COBOL). Or Java interfaces to a SQL backend.

I don't know of any million dollar FPGAs. Even space qualified big Xilinx parts are about a tenth of that.

Modern FPGAs could do substantially better than 1 microsecond latency. They have multiGbps interfaces on chip (e.g. Rocket I/O) and external clocks in the hundreds of MHz range.
Now, if you're doing store and forward routing of 1000 bit packets on a 1Gbps wire, of course you're going to have 1 microsecond latency.

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Vincent Diepeveen
Sent: Thursday, February 16, 2012 8:26 AM
To: Eugen Leitl
Cc: tt at postbiota.org; Beowulf Mailing List; forkit!
Subject: Re: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right

Yes very good article.

1) But now the weirdest thing - i offered myself at different spots to write a < 1 microsecond software to parse that marketdata, but no one wants to hire you it seems - they just look for JAVA coders.

Example is also Morgan Chase. All their job offers, which i receive daily, 99% is JAVA jobs.

----------------
Java is of course TOO SLOW for working with trading data. Much better is C/C++ and assembler after you already have achieved that < 1 microseconds in C.

Note that the 'FPGA'S" that are advertized costs millions most of them and the only latency quote i saw there is 2 microseconds, which also sounds a bit slow to me, but well.

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Fri Feb 17 04:11:29 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 17 Feb 2012 10:11:29 +0100
Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right
In-Reply-To: References: <20120216152653.GQ7343@leitl.org>
Message-ID: <63FF25A2-52FD-42FC-9C12-04188E68E2C8@xs4all.nl>

On Feb 16, 2012, at 10:29 PM, Lux, Jim (337C) wrote:

> They don't hire their High Frequency Trading software people from ads. Personal recommendations, more likely.
> The ads for Java coders are for run of the mill back end banking stuff. Most banks are doing their enterprise scale work in Java (which is replacing PowerBuilder, RPG, and COBOL). Or Java interfaces to a SQL backend.

Most platforms have been written entirely in Java. There are many platforms, and I recently read an estimate that the platforms account for 80% of total trading volume - and what happens inside the platforms didn't even get counted. If your entire code base is already in Java, it's the logical choice for a newcomer building a platform as well. Yet it directly makes you a loser. A good example of such a platform in the Netherlands (there are many more) is Flowtraders. They hire Java coders exclusively.

Nearly all the NSA-level programmers avoid object orientation (it's dead slow, of course; and even if you avoid object-oriented style, enterprise code is much slower because of the many layers of indirection that no compiler can make sense of).

I bet those using these platforms just lose money nowadays. It must be very exceptional to find someone who made a profit there - certainly not in a systematic manner. Yet of course the platform owners make money from the fixed fee on each transaction, so they cheer loudly. In fact, nearly all big financial institutions have such a platform, which practically guarantees you lose money, because the only way to win some is long-term trading. "Long-term" here means trading that takes longer than a few seconds - say, buy something in the morning and sell it in the afternoon.

> I don't know of any million dollar FPGAs. Even space qualified big Xilinx parts are about a tenth of that.

Most software has a price of a couple of tens of thousands of dollars a month. The FPGAs are a multiple of that. The IBM WebSphere bla bla that can be used to parse market data and trade runs around $100k a year. Add some tens of thousands for additional functionality.
Getting a slightly higher-clocked FPGA from Xilinx, say the first sample at 22 nm, is probably not cheap either. Being 1 MHz higher clocked than the competition is worth tens of millions, of course. Price doesn't matter in this area; the hedge funds that are convinced invest serious cash into being faster.

> Modern FPGAs could do substantially better than 1 microsecond latency. They have multiGbps interfaces on chip (e.g. Rocket I/O) and external clocks in the hundreds of MHz range. Now, if you're doing store and forward routing of 1000 bit packets on a 1Gbps wire, of course you're going to have 1 microsecond latency.

Actually, most trading software you can license for tens of thousands a month has latencies closer to 500 microseconds to parse some data and then issue the order to trade. Only recently have they tried to speed that up. The majority trades at 100+ microseconds. NIC latencies are not counted here - this is just the latency you suffer inside the computer, and that on very highly clocked Nehalem processors. By now I'd guess many have moved to IBM software, as that's mass-market software offered for just $100k a year or so, and it claimed, last time I checked, around 7 to 11 microseconds. That was tested by IBM themselves, so not during a surge. Whatever headline latency you claim is useless anyway; it's all about the surge latency, of course. The exchanges measure surges in 50-millisecond intervals.

To give an example: Bernanke coughs loudly near the words 'US overspending', which is currently approaching 50% of the US government's total income (income projected 2644 billion, spending far over 3600 billion). No way to fix that. Obama's trick of expressing it as a percentage of GDP is, I'm sure, totally ignored by the financial world. This makes the markets extremely volatile, though from a trader's viewpoint that's a long-term consideration which today's exchanges won't reflect, of course.
Yet it means that if one sentence gets said, for a second you'll suddenly see a huge surge in the market. The external platforms usually get blocked out, so your order can take up to 15 minutes to reach the market if you're on an external platform, as some analyses show; any ticker or reflection of what's going on that you see will be minutes behind - we saw this clearly during the flash crash. Only the exchange's own datacenter was accurate, and anyone far away from it had to wait minutes because of all the traffic jams caused by massive trading volumes. In short, trading from an external platform during a crash is the silliest thing possible. You need a box inside the exchange - or you'd better be prepared to be a loser when trading derivatives. So simply to avoid losing all your money, investing big cash into being the fastest is worth it.

Some of you may be a bit surprised that I speak only about futures here and not other tradeables. The reason is simple: the derivatives market has expanded enormously and is probably the only market where, with fast hardware and fast software, you can still make good money. Hope this doesn't come as a shock, but if you had bought the Dow Jones Industrial 70 or so years ago (the same goes for other indices, by the way) and simply done nothing for 70 years, then by 2005 or so you would have averaged roughly 12% profit a year - of which well over 7% is indexation and nearly 5% dividend. Most financials, however, found that 12% not good enough, as they wanted to perform above the market average (indexation). It was common that nobody got hired who couldn't bring home 20%+; that's what they wanted. The stock market is simply moving sideways now, so they can no longer make that 12%; stocks are also much harder to sell, whereas futures are easy to buy and sell.
There are other derivatives as well, and on TV I see financials advertising other derivatives all the time, yet in reality the markets mainly trade futures; things like spreads and swaps are less popular. Again, note the difference between what the public is told to buy and what they do themselves. Huge differences there. Not even funny.

> -----Original Message-----
> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Vincent Diepeveen
> Sent: Thursday, February 16, 2012 8:26 AM
> To: Eugen Leitl
> Cc: tt at postbiota.org; Beowulf Mailing List; forkit!
> Subject: Re: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right
>
> Yes very good article.
>
> 1) But now the weirdest thing - i offered myself at different spots to write a < 1 microsecond software to parse that marketdata, but no one wants to hire you it seems - they just look for JAVA coders.
>
> Example is also Morgan Chase. All their job offers, which i receive daily, 99% is JAVA jobs.
>
> ----------------
> Java is of course TOO SLOW for working with trading data. Much better is C/C++ and assembler after you already have achieved that < 1 microseconds in C.
>
> Note that the 'FPGA'S" that are advertized costs millions most of them and the only latency quote i saw there is 2 microseconds, which also sounds a bit slow to me, but well.

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From james.p.lux at jpl.nasa.gov Fri Feb 17 09:12:13 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Fri, 17 Feb 2012 06:12:13 -0800
Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right
In-Reply-To: <63FF25A2-52FD-42FC-9C12-04188E68E2C8@xs4all.nl>
Message-ID:

On 2/17/12 1:11 AM, "Vincent Diepeveen" wrote:

>> I don't know of any million dollar FPGAs. Even space qualified big Xilinx parts are about a tenth of that.
>
> Most software has a price of a couple of tens of thousands of dollars a month. The FPGA's are a multiple of that. the IBM websphere bla bla that can be used to parse market data and trade, it's around a $100k a year. Add some tens of thousands for additional functionality.

Are you talking about the software cost, not the hardware platform cost? If so, I'd go for that.. The population of FPGA developers is probably 1/100 the number of conventional Von Neumann machine developers (in whatever language).

Interestingly, such a scarcity does not translate to 100x higher pay. Most of the surveys show that in terms of median compensation, FPGA designers get maybe 30-40% more than software developers. I guess there's much more than 100x the demand for generalized software developers.

I wonder if the same is true of GPU developers. A slight premium in pay, but not much.

> Getting a FPGA a tad higher clocked from Xilinx, say the first sample at 22 nm, is probably not cheap either.

The biggest, baddest Virtex 7 (XC7V2000TL2FLG1925E) is a mere $132k, qty 1 (drops to $127k in qty 1000) (16 week lead time): 2 million logic cells, 70 Mb onchip block RAM, 3600 DSP slices, 28Gb/s transceivers, etc. etc. etc.

To put it in a box with power supply and interfaces probably would set you back a good chunk of a million dollars.
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Fri Feb 17 10:42:35 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 17 Feb 2012 16:42:35 +0100
Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right
In-Reply-To: References: Message-ID: <550AB79F-4939-4D56-AF2E-61E14792D071@xs4all.nl>

On Feb 17, 2012, at 3:12 PM, Lux, Jim (337C) wrote:

>>> I don't know of any million dollar FPGAs. Even space qualified big Xilinx parts are about a tenth of that.
>>
>> Most software has a price of a couple of tens of thousands of dollars a month. The FPGA's are a multiple of that. the IBM websphere bla bla that can be used to parse market data and trade, it's around a $100k a year. Add some tens of thousands for additional functionality.
>
> Are you talking about the software cost, not the hardware platform cost?

I thought it was obvious from what I wrote that these are solution costs. Traders usually buy ready-made solutions - a few hedge funds excepted.

> If so, I'd go for that.. The population of FPGA developers is probably 1/100 the number of conventional Von Neumann machine developers (in whatever language).
>
> Interestingly, such a scarcity does not translate to 100x higher pay.

I'm not a magnificent FPGA developer - I'd say speed up the software first; many traders are really slow there. But they aren't even interested in that.

> Most of the surveys show that in terms of median compensation FPGA designers get maybe 30-40% more than software developers.

The rate for development is $1 an hour in India.
If you deal with major companies in India it's $2.50 an hour including everything.

> I guess there's much more than 100x the demand for generalized software developers.

How good are the 'generalized software developers' they are hiring actually at this kind of development? Suppose you just hire Java guys and girls. Ignore the quants, of course; what language they develop in is not so important, as you can easily port that to your own solution. Besides, I'd guess - though I have no information there - that most trade decisions are actually dead simple.

> I wonder if the same is true of GPU developers. A slight premium in pay, but not much.
>
>> Getting a FPGA a tad higher clocked from Xilinx, say the first sample at 22 nm, is probably not cheap either.
>
> The biggest, baddest Virtex 7 (XC7V2000TL2FLG1925E) is a mere $132k, qty 1 (drops to $127k in qty 1000) (16 week lead time)
>
> 2 million logic cells 70 Mb onchip block ram, 3600 dsp slices, 28Gb/s transceivers, etc.etc.etc.

How high would that clock? Note you also need to integrate the fastest 10 gigabit NIC - right now that's Solarflare.

> To put it in a box with power supply and interfaces probably would set you back a good chunk of a million dollars.

If you do it in a simple manner, sure - but if you're already on that path and your main income is based upon speed, you want something faster than what's out there, of course. You phone Xilinx and Intel and demand they print FPGAs before printing CPUs in a new process technology, and clock them as high as possible. I would guess that's not a beginners' team, and it has several members; you soon have an expensive team there - in terms of salary pressure I'd guess around 4 million a year minimum. That's not a problem for the hedge funds to pay, actually. If you buy things in, it's also a seven-digit number a year.
I do believe, however, that there is enough room for a good software implementation to make great cash based on speed. The rule is that you need to be among the 10% fastest traders to make a profit; that may or may not hold when it's purely about speed. Many trader groups focus on one specific part of the market - oil companies, for example. Most are experts in just one tiny market. For them, I'd suspect a software implementation that gets them near the top would already make them very effective traders. There are limits on what you are allowed to sell in quantity, while the total trading volume in derivatives keeps growing, so you sure can make great cash if you're among the fastest, without necessarily beating the top trading hedge funds.

To me it's incredible that there are so few trading jobs for imperative languages; basically 99.9% of all jobs there are Java. I simply would never do business with JPMorgan Chase: they seem to hire zero people working in imperative languages, or even in C++. That basically means the entire market of genius guys who know how to beat you at game tree search - which covers most artificial intelligence experts - is shut out of the majority of these companies. The way to get a job is to be young and have a degree in economics. Ten or so years ago that might have been the right profile for the trading world, but things have changed. The derivatives market back then was ultra tiny; now it's of gigantic proportions and trade volumes have exploded, so those traders simply didn't see how some others who make a profit innovate - and those people sure keep it a BIG secret.

From my viewpoint, however, what happens there is very dangerous. Military-level secrets they at least try to keep secret - yet there's no commercial money involved there like in the trading world, where knowing even one sentence can make you major cash. Government is even further behind. You first have to prove that speed is everything.
Well, I do know - I come from computer chess. If the majority of decision-taking in trading is that dead simple (pattern-wise: compare two things, then take a decision based on that), it obviously means speed is everything. Looking at IBM, I guess it took them until 2008 to really put an improved trading solution on the market. I'd take that and build a dedicated trading framework that's 10x faster (meaning parsing the market data and taking the trading decisions in an integrated way). Though there are hundreds of thousands of trading groups worldwide - Google around a bit - you'll just find Java jobs. 99% of the traders are just clueless about whom to hire, I'd guess.

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Fri Feb 17 13:05:15 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Fri, 17 Feb 2012 10:05:15 -0800
Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right
In-Reply-To: <550AB79F-4939-4D56-AF2E-61E14792D071@xs4all.nl>
Message-ID:

On 2/17/12 7:42 AM, "Vincent Diepeveen" wrote:

> On Feb 17, 2012, at 3:12 PM, Lux, Jim (337C) wrote:
>
>> If so, I'd go for that.. The population of FPGA developers is probably 1/100 the number of conventional Von Neumann machine developers (in whatever language).
>>
>> Interestingly, such a scarcity does not translate to 100x higher pay.
>
> I'm not a magnificent FPGA developer - I'd say speed up the software first; many traders are really slow there. But they aren't even interested in that.
>
>> Most of the surveys show that in terms of median compensation FPGA designers get maybe 30-40% more than software developers.
> The rate for development is $1 an hour in India. If you deal with major companies in India it's $2.50 an hour including everything.

I think you're a bit low there. People I know who are contracting off-shore development say that the net cost (to the US firm) is between 1/4 and 1/3 of what the equivalent person would cost in the US (and the price is rising). You can hire very low level people quite inexpensively on a bare contract, but you spend more managing them, and compensating for the incredible defect density. And I doubt that good FPGA folks are as thick on the ground in low cost places as they are in the US or Europe.

>>> Getting a FPGA a tad higher clocked from Xilinx, say the first sample at 22 nm, is probably not cheap either.
>>
>> The biggest, baddest Virtex 7 (XC7V2000TL2FLG1925E) is a mere $132k, qty 1 (drops to $127k in qty 1000) (16 week lead time)
>>
>> 2 million logic cells 70 Mb onchip block ram, 3600 dsp slices, 28Gb/s transceivers, etc.etc.etc.
>
> How high would that clock?

It's a bit tricky when talking clock rates in FPGAs.. Most designs are only regionally synchronous, so you need to take into account propagation delays across the chip if you want max performance. You feed a low speed clock (a few hundred MHz) into the chip and it gets multiplied up in onchip DPLLs/DCMs. The MMCM takes 1066 MHz max input. There are also different clocks for "entirely on chip" and "going off chip", particularly for things like the GigE or PCI-X interfaces (which have their own clocks). Likewise these things have built-in interfaces to RAM (so you can get something like 1900 Mb/s to DDR RAM, running the core at 2V). It's more like doing logic design.. Propagation delays from D to O through combinatorial logic look like they're in the 0.09 ns range.
CLB flipflops have a setup time of 0.04 ns and hold time of 0.13 ns, so that kind of looks like you could toggle at around 2 GHz. There's plenty of documentation out there, but it's not as simple as "I'm running the CPU at 3 GHz". > >note you also need to integrate the fastest 10 gigabit nic - right >now that's solarflare. Do you? Why not use some other interconnect? > >> >> To put it in a box with power supply and interfaces probably would >> set you >> back a good chunk of a million dollars. >> > >if you'd do it in a simple manner sure - but if you're already on >that path of speed and your main income is based upon speed you >want something faster than what's online there of course. > >You phone xilinx and intel and demand they print fpga's before >printing cpu's in a new proces technology, and clock 'em as high as >possible. I seriously doubt that even a huge customer is going to change Xilinx's plans. They basically push the technology to what they can do, constrained by manufacturability. Then you have all the thermal issues to worry about. These big parts can dissipate more heat than you can get out through the package/pins, or even spread across the die. >guess it's around a 4 million a year minimum. > >That's not a problem for the hedgefunds to pay actually. I suspect that money isn't the scarce resource; people are. There aren't many people in the world who can effectively do this kind of thing. (For the kind of thing you're talking about, it's probably in the 100s, maybe 1000s, total, worldwide.) > >For me it's incredible that there is so little jobs actually for >trading in imperative languages and basically 99.9% of all jobs are >Java there. Why is this amazing? The vast majority of money spent on software is spent on run-of-the-mill, mundane chores like payroll accounting, inventory control, processing consumer transactions, etc. So there's a huge population of people to draw from.
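(A footnote on the timing figures earlier in this message: the 0.09 ns combinatorial delay and 0.04 ns setup time can be folded into the usual back-of-envelope register-to-register f_max estimate. The clock-to-out and routing delays in the sketch below are assumed values for illustration, not datasheet numbers:)

```python
# Back-of-envelope FPGA max-clock estimate: one register-to-register path
# must fit clock-to-out + logic + routing + setup into a single period.
# t_logic (0.09 ns) and t_setup (0.04 ns) are the figures quoted above;
# t_co (0.20 ns) and t_routing (0.17 ns) are assumptions for illustration.

def fmax_ghz(t_co_ns, t_logic_ns, t_routing_ns, t_setup_ns):
    """Return the maximum clock frequency in GHz for one pipeline stage."""
    period_ns = t_co_ns + t_logic_ns + t_routing_ns + t_setup_ns
    return 1.0 / period_ns

# One LUT level between registers with the assumed delays:
print(round(fmax_ghz(0.20, 0.09, 0.17, 0.04), 2))  # about 2 GHz
```

With these assumed values the register-to-register period is 0.5 ns, which lands on the "toggle at around 2 GHz" estimate; every extra logic level or longer route between registers drops f_max further.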
If you're looking for top people, and you need some number of them, you're better off taking the 3 or 4 sigma from the mean people from a huge population, than taking the 1 sigma people from a small population. > >I just simply would never do business with jpmorgan chase. they seem >to hire 0 persons who are busy imperative or even in C++. So what? That's just a personal preference on your part. How do you know they aren't hiring those people through some other channel? You don't see a lot of ads out there for FORTRAN programmers, but here at JPL, about 25% of the software work that's being done is in FORTRAN. And of the people doing software at JPL, more than half do NOT have degrees in CS or even EE, and I'd venture that they were not hired as "software developers": more likely they were hired for their domain specific knowledge. >Basically means that the entire market of genius guys who know how to >beat you in game tree search, which is about all artificial >intelligence experts, >they're shut out from majority of companies to get a job there. Definitely not. If you're in that tippy top 0.1%, you're not getting jobs by throwing your resume over the transom. You're getting a job because you know someone or someone came to know of you through other means. I didn't get my job at JPL by submitting a resume, and I think that's true of the vast majority of people here. It was also true of my last job, doing special effects work. And in fact, now that I think back, I think I have had only one job which was a resume in response to an ad. > >The way to get a job is to be young and have a degree in economics. Depends on what job you want. A good fraction of the technical degree grads at MIT are being hired by the finance industry. This concerns people like us at JPL, because we can't offer competitive pay and benefits, and on a personal note, I think it's a shame that they're probably not going to be using their skills in the field in which they were actually trained. 
But the way to get a good job has always been, and will always be, to know someone. There have been *numerous* well controlled studies that looked at hiring behavior, and regardless of what the recruiting people say, in reality managers make their decisions on the same few factors, and "recommendation of coworker or industry colleague" is right up there in the top few. From diep at xs4all.nl Fri Feb 17 14:54:53 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 17 Feb 2012 20:54:53 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: Message-ID: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> On Feb 17, 2012, at 7:05 PM, Lux, Jim (337C) wrote: > > > On 2/17/12 7:42 AM, "Vincent Diepeveen" wrote: > >> >> On Feb 17, 2012, at 3:12 PM, Lux, Jim (337C) wrote: >> >> >>> If so, I'd go for that.. The population of FPGA developers is >>> probably >>> 1/100 the number of conventional Von Neuman machine developers (in >>> whatever language). >>> >>> Interestingly, such a scarcity does not translate to 100x higher >>> pay. >> >> I'm not a magnificent FPGA developer - i'd say first speedup the >> software - there is so many traders >> real slow there. But they don't even are interested IN THAT. >> >>> Most of the surveys show that in terms of median compensation FPGA >>> designers get maybe 30-40% more than software developers. >> >> The rate for development is $1 an hour in india. If you deal with >> major companies in India it's $2.50 an hour >> including everything. > > > I think you're a bit low there.
People I know who are contracting > off-shore development say that the net cost (to the US firm) is > about 1/4 > and 1/3 what the equivalent person would cost in the US. (and the > price > is rising) Well, they are wrong then. Usually they also count in the costs at the local site - staff on the spot, so not in India. The real cost you can get it for in India is $2.50 an hour. That is with *big* consultancy companies. Consultants who are independent usually have rates of around $1 - $2 an hour. The Philippines sits around $1.11 an hour for consultants. If I look at the actual work produced in India, you see they're very good at trying to get more work, like additional pay for extra features - so most projects budgeted at X hours usually end up at 2X; of course you and I know that doesn't mean it took 2X :) The actual developers working for such companies usually complain loudly about their salary. Yet for them, too, getting work as a consultant is very difficult. They sit around $150 a month. This is *normal* development. As you know, I'm more involved in mass-market products, which usually means the better developers on this planet; nothing is as complicated as producing a good mass-market product, as it has to work everywhere. In India, rates for that are an open market and also go up rapidly. I've even heard of someone who got $1000 a month there in India. But that is really 1 in 100 developers. > > You can hire very low level people quite inexpensively on a bare > contract, > but you spend more managing them, and compensating for the incredible > defect density. It was indeed the habit to have the managers over here. However, times have changed. Nowadays management also sits in Asia. > > And I doubt that good FPGA folks are as thick on the ground in low > cost > places as they are in the US or Europe.
We were discussing normal development, and I noted that normal development happens in India for $1 an hour for independents, and for $2.50 an hour including management overhead and everything from the major consultancy companies. If I see how much power Bulldozer CPUs eat, then I can assure you that India is about the last spot on the planet where I'd have an FPGA for trading developed :) To start with, you're not going to ship a development board to India - it's going to disappear without anyone knowing and without anyone who can be blamed. I remember how I shipped some stuff to India. 0% arrived. Tracking codes - forget it - that's just a number from the West - utterly useless in India. > > >>>> >>>> Getting a FPGA a tad higher clocked from Xilinx, say the first >>>> sample >>>> at 22 nm, is probably not cheap either. >>> >>> The biggest, baddest Virtex 7 (XC7V2000TL2FLG1925E) is a mere >>> $132k, qty 1 >>> (drops to $127k in qty 1000) >>> (16 week lead time) >>> >>> 2 million logic cells 70 Mb onchip block ram, 3600 dsp slices, >>> 28Gb/s >>> transceivers, etc.etc.etc. >>> >> >> how high that would clock? > It's a bit tricky when talking clock rates in FPGAs.. Most designs are > only regionally synchronous, so you need to take into account > propagation > delays across the chip if you want max performance. > > You feed a low speed clock (few hundred MHz) into the chip and it gets > multiplied up in onchip DPLLs/DCMs. > The MMCM takes 1066 Mhz max input. I'm guessing around 2.5 GHz. We know they expected some boards to be able to clock around 1.7 GHz. > > There's also different clocks for "entirely on chip" and "going off > chip", > particularly for things like the GigE or PCI-X interfaces (which have > their own clocks) Likewise these things have built in interfaces > to RAM > (so you can get like 1900 Mb/s to DDR ram, running the core at 2V). > > It's more like doing logic designs..
Propagation delays from D to O > through combinatorial logic look like they're in the 0.09 ns > range. CLB > flipflops have a setup time of 0.04ns and hold time of 0.13 ns, so > that > kind of looks like you could toggle at around 2 Ghz > > There's plenty of documentation out there, but it's not as simple > as "I'm > running the CPU at 3 GHz" > I've not seen any actual designs - they are utterly top secret - no military secret is as secret as what they use in the datacenters in their own designs. Besides, the biggest military secrets, like when a war is going to happen, you can roughly predict quite well if you watch the news. Forget DDR RAM - DDR RAM is really too slow in latency - and also forget about PCI-X. I'd say also forget about PCIe - think of custom mainboards. On the FPGA card there's going to be massive SRAM besides a Solarflare NIC. You need quite a lot, actually, as the entire data feed has to get streamed to the rest of the mainboard for storage and more complicated trading analysis, I'd suppose. I would guess the chip can do the simple trading decisions in a pretty simple manner. This keeps the chip design relatively simple, and you can focus on clocking it high. The rest you want to do in software. Losing the PCIe latencies then is crucial, as that's a big bottleneck. Also forget normal RAM. No DDR3 at all. Just SRAM, and many dozens of gigabytes of it, as we want to keep the entire day preferably in RAM, and our analysis is nonstop hammering on the SRAM, so basically the bandwidth to the SRAM determines how fast our analysis will be. The simple trading decisions already get done by the FPGA, of course. But, well, I guess they probably tried to get ALL trading decisions inside the FPGA - who knows. As for the mainboards, to quote someone: "there are some very custom designs out there". Yet of course you can do really well in software already for a fraction of that budget. > >> >> note you also need to integrate the fastest 10 gigabit nic - right >> now that's solarflare.
> > > Do you? Why not use some other interconnect. Because the datacenter of the exchange dictates to you what sort of protocol they use. Several are now exclusively Solarflare. The FTSE has gone to InfiniBand - but I'm not sure whether that's for their internal machines only. The FTSE is of limited interest of course - Chicago is more interesting :) Realize also, by the way, that you have many connections. Really a lot. Market A comes from IP address X, Market B from IP address Y, and so on; we're speaking of hundreds of IP addresses you have to connect to simultaneously. You get those addresses in a dynamic manner, say in an XML-type manner. It's a total mess they created, and they love to create a mess, as that means more work, and work means you can make money from rich traders. They really overcomplicated it all. Also realize that at any moment they can change the messages - what the messages look like is something you get in a dynamic manner in XML. Nothing is really hard-defined. That's why all that generic software is trading at such slow speeds! Nothing hardcoded at all. Basically the whole protocol is NOT designed for speed. Calling it a spaghetti design would be a compliment. The financials LOVE to just add to the mess and never 'fix' things. See it as adding components to the space shuttle and then praying it still works. Actually, to get back to the factual criticism in that article - the space shuttle was intended to work correctly - the exchanges do not give the guarantee that you get the information at all :) Only the trading is TCP; the crucial stream of what bids and offers there are comes in RAW format, and just one channel is going to be fast enough for you to try to follow it. If that channel doesn't have the correct info - bad luck for you. That bad luck of course happens during big surges - we all realize that.
This lossy flow of information is the underlying method upon which decisions with major implications get taken :) There are actually regular FIX/FAST community meetings on the protocol. Just to show your company name you have to pay $25k. To also display it at the conference costs a big multiple of that. The next one is scheduled for London on the 13th of March 2012: www.fixprotocol.org >> >>> >>> To put it in a box with power supply and interfaces probably would >>> set you >>> back a good chunk of a million dollars. >>> >> >> if you'd do it in a simple manner sure - but if you're already on >> that path of speed and your main income is based upon speed you >> want something faster than what's online there of course. >> >> You phone xilinx and intel and demand they print fpga's before >> printing cpu's in a new proces technology, and clock 'em as high as >> possible. > > > I seriously doubt that even a huge customer is going to change > Xilinx's > plans. They basically push the technology to what they can do > constrained > by manufacturability. > > Then you have all the thermal issues to worry about. These big > parts can > dissipate more heat than you can get out through the package/pins, > or even > spread across the die. > It has crossed my mind for just a second that if one government would put one team together and fund it, and have them produce a magnificent trading solution - one which, for example, plugs flawlessly into IBM WebSphere - and have their own traders buy this, especially small ones with ties just to their own nation, then in theory things would still be totally legal. A team, of course under a company name, produces a product, and traders buy it from this company. The only thing is they don't tell anyone they do business with this company - only their accountant sees that name - still perfectly legal.
Now some of those traders will make a profit and others will lose some; statistically they will, however, perform a lot better than they used to. That means that a big flow of cash moves towards one nation. Of course you can also do this just in software - all you have to beat is IBM, which is freaking peanuts for a good programmer. The only important thing is that they're within that fastest 10%. No need to have the fastest solution there - let some hedge funds design that (they already did). It's just a thought experiment, but it sure would win a lot of money from abroad for your own nation; in the end a big part of that flows into your own nation and pays for things. Do not think in small numbers here - they trade so much money daily that just a small statistical advantage, because your software is faster than it would otherwise have been, has a huge impact. > >> guess it's around a 4 million a year minimum. >> >> That's not a problem for the hedgefunds to pay actually. > > I suspect that money isn't the scarce resource, people are. There > aren't > many people in the world who can effectively do this kind of thing. > (For > the kind of thing you're talking about, it's probably in the 100s, > maybe > 1000s, total, worldwide) Well, I'm sure there aren't many, but I'm very sure they aren't trying to get the best programmers - they mainly hire Java coders everywhere. > >> >> For me it's incredible that there is so little jobs actually for >> trading in imperative languages and basically 99.9% of all jobs are >> Java there. > > > Why is this amazing? Suppose you only hire people who are left-handed. That's basically what they're doing. Because speed is everything at the exchanges now, and you aren't going to get the best people this way, as they basically shut out the biggest experts, with a few exceptions.
> The vast majority of money spent on software is > spent on run of the mill, mundane chores like payroll accounting, > inventory control, processing consumer transactions, etc. > > So there's a huge population of people to draw from. > > If you're looking for top people, and you need some number of them, > you're > better off taking the 3 or 4 sigma from the mean people from a huge > population, than taking the 1 sigma people from a small population. > They just draw from a small population of usually very bad programmers who studied finance. How do you get the best software engineers for your trading application then? > > >> >> I just simply would never do business with jpmorgan chase. they seem >> to hire 0 persons who are busy imperative or even in C++. > > > So what? That's just a personal preference on your part. How do > you know > they aren't hiring those people through some other channel? And what would that 'other channel' be then? The vast majority simply isn't doing this. > > You don't see a lot of ads out there for FORTRAN programmers, but > here at > JPL, about 25% of the software work that's being done is in > FORTRAN. And No matter how much of a genius you are, if all you do is write FORTRAN and you didn't study finance, then you are not allowed to write a trading application, AS YOU WON'T GET HIRED :) > of the people doing software at JPL, more than half do NOT have > degrees in > CS or even EE, and I'd venture that they were not hired as "software > developers": more likely they were hired for their domain specific > knowledge. > >> Basically means that the entire market of genius guys who know how to >> beat you in game tree search, which is about all artificial >> intelligence experts, >> they're shut out from majority of companies to get a job there. > > Definitely not. If you're in that tippy top 0.1%, you're not > getting jobs > by throwing your resume over the transom.
You're getting a job > because > you know someone or someone came to know of you through other means. > You seem to be the expert in hiring people :) > I didn't get my job at JPL by submitting a resume, and I think > that's true > of the vast majority of people here. It was also true of my last job, > doing special effects work. And in fact, now that I think back, I > think I > have had only one job which was a resume in response to an ad. > When you got there, there was a circle on your resume around the word 'NASA', and nothing else mattered, I bet :) > > >> >> The way to get a job is to be young and have a degree in economics. > > Depends on what job you want. > > A good fraction of the technical degree grads at MIT are being > hired by > the finance industry. This concerns people like us at JPL, because we > can't offer competitive pay and benefits, and on a personal note, I > think > it's a shame that they're probably not going to be using their > skills in > the field in which they were actually trained. > > > But the way to get a good job has always been, and will always be, > to know > someone. > > There have been *numerous* well controlled studies that looked at > hiring > behavior, and regardless of what the recruiting people say, in reality > managers make their decisions on the same few factors, and > "recommendation > of coworker or industry colleague" is right up there in the top few.
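(A footnote on the raw market-data feed described earlier in the thread: since the crucial bid/offer stream is lossy, feed handlers conventionally detect missed messages by tracking sequence numbers. A minimal sketch - the framing here, one integer sequence number per message, is a hypothetical simplification:)

```python
# Gap detection on a lossy, sequence-numbered market-data stream, as in
# the raw (non-TCP) feed described in the thread: track the next expected
# sequence number and record any jump, which marks messages lost during
# a surge. The message format (a bare sequence number) is hypothetical.

def find_gaps(seq_numbers):
    """Return (start, end) ranges of missing sequence numbers."""
    gaps = []
    expected = None
    for seq in seq_numbers:
        if expected is not None and seq > expected:
            gaps.append((expected, seq - 1))  # messages lost on the channel
        expected = seq + 1
    return gaps

# A surge drops messages 4-6 and 9:
print(find_gaps([1, 2, 3, 7, 8, 10]))  # -> [(4, 6), (9, 9)]
```

In a real handler the gap would trigger a retransmission request or a snapshot recovery; here it just shows why "if that channel doesn't have the correct info - bad luck for you" is a sequence-number problem.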
From worringen at googlemail.com Fri Feb 17 18:39:19 2012 From: worringen at googlemail.com (Joachim Worringen) Date: Sat, 18 Feb 2012 00:39:19 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> References: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> Message-ID: Vincent, I haven't read all zillion lines of your posts, but as I'm heading software engineering at a very successful "prop shop" (proprietary trading company), I might add some real-world comments: - Execution speed is important, but it's not everything. Only the simplest strategies rely purely on speed for success. - Even more important than execution speed is time-to-market. It's of no use to have the superfast thing ready when the market moved in a different direction nine months ago. - Equally important are reliability and maintainability. Our in-house development is based on C++, but we make very good money with Java-based third-party solutions as well. FPGAs are not the silver-bullet solution either. Finding people to program them is hard, development is complex and takes time, verifying them takes even more time, and that is required for every little change. Think about time-to-market. Therefore, they are mainly used in limited scenarios (risk checks), or are used with high-level-compiler support, giving away a significant fraction of the potential performance. Oh, btw, we are always looking for bright, but also socially compliant, developers. Joachim
From james.p.lux at jpl.nasa.gov Fri Feb 17 23:45:17 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 17 Feb 2012 20:45:17 -0800 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> Message-ID: > > > >>>> The biggest, baddest Virtex 7 (XC7V2000TL2FLG1925E) is a mere >>>> $132k, qty 1 >>>> (drops to $127k in qty 1000) >>>> (16 week lead time) >>>> >>>> 2 million logic cells 70 Mb onchip block ram, 3600 dsp slices, >>>> 28Gb/s >>>> transceivers, etc.etc.etc. >>>> >>> >>> how high that would clock? >> It's a bit tricky when talking clock rates in FPGAs.. Most designs are >> only regionally synchronous, so you need to take into account >> propagation >> delays across the chip if you want max performance. >> >> You feed a low speed clock (few hundred MHz) into the chip and it gets >> multiplied up in onchip DPLLs/DCMs. >> The MMCM takes 1066 Mhz max input. > >I'm guessing around a 2.5Ghz. Maybe, maybe not. Most big FPGA designs I've seen aren't one big synchronous blob in any case. If you have a pipelined process spread across the chip, you might clock individual chunks at X MHz, but there are N cycles to get through the pipeline. It kind of depends on whether in-to-out latency is important or bits-per-second throughput, I suppose. The old station wagon full of tapes vs. the hot stuff ASIC. > >On the FPGA card there's gonna be massive SRAM besides a solarflare NIC. Maybe, maybe not. If you're shoving buffers around, parallelism might be more important than raw memory access time. > >I would guess the chip can do the simple trading decisions in a >pretty simple manner. >This keeps the chipdesign relative simple and you can focus upon >clocking it high. When you're talking designs with 10s of millions of gates, you can do pretty complex things. > >Also forget normal RAM. No nothing DDR3.
Just SRAM and many dozens of >gigabytes of it as we want to keep the entire day preferably in RAM >and our analysis are nonstop hammering onto the SRAM so basically the >bandwidth to the SRAM determines how fast our analysis will be. I just cited the DDR3 as an example out of the datasheet. I'm sure that if you're interested you'll go download the Virtex 7 data sheet and study it. > >As for the mainboards, to quote someone : "there are some very custom >designs out there". I doubt there is anything such as a "standard" board using a $100k FPGA. I'd bet a fair number of cold frosty beverages that ALL boards using this kind of thing fit the "custom" category. >> >> >> I seriously doubt that even a huge customer is going to change >> Xilinx's >> plans. They basically push the technology to what they can do >> constrained >> by manufacturability. >> >> Then you have all the thermal issues to worry about. These big >> parts can >> dissipate more heat than you can get out through the package/pins, >> or even >> spread across the die. >> >It has crossed my mind for just a second that if 1 government would >put 1 team together and fund it, >and have them produce a magnificent trading solution, >which for example plugs in flawless into the IBM websphere, >and have their own traders buy this, especially small ones with just >ties to their own nation, that in theory things still are total legal. There was a famous challenge about breaking DES, where it was done with FPGAs. But why would a government fund such a thing (except perhaps as an economic weapon... humorous thoughts of dyed-in-the-wool Marxists cackling at the thought of destroying the capitalists with their own trading tools, developed by a centrally planned "trading machine establishment #3"). > >How do you get the best software engineers then for your trading >application? By asking your other software developers? As we used to call it in the entertainment industry: "Neportunity".
> >And what would that 'other channel' be then? Personal contacts. > >Vaste majority simply isn't doing this. Let's see.. Unemployment in the software industry is down around 3% these days (viz 8-12% in general, and 20-25% in certain demographics and areas). They're finding jobs somehow. 10:1 or 50:1 resume to open position isn't uncommon. Somehow they find the 1 in 50, and a fair number of studies show that it's not done by some HR person carefully reviewing the 50 resumes to find the one shining diamond. >No matter how genius you are, if all you do is write fortran, and >didn't study finance, >then you are not allowed to write a trading application, AS YOU WON'T >GET HIRED :) More an example that different industries hire for different skill sets and backgrounds. You're right, I can't imagine FORTRAN being very useful in trading. But hey, I don't write trading apps.. For all I know it's mostly matrix math and FORTRAN is pretty good for that. Maybe they want people who write FORTH or LISP or PROLOG. The point is, it's a very niche market, looking for a very niche programmer that is probably not remotely representative of software developers at large. > >> I didn't get my job at JPL by submitting a resume, and I think >> > >When you got there, there was a circle at your resume around the word >'NASA' Actually not.. I hadn't worked at JPL then. The circle was around "microscan compressive receiver", purely by chance.
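(A footnote on the pipelining point earlier in this message - "you might clock individual chunks at X MHz, but there are N cycles to get through the pipeline": the latency/throughput trade-off is easy to put into numbers. The figures below are illustrative assumptions, not from any real design:)

```python
# Pipelining trades latency for throughput: each of N stages adds one
# cycle of latency before the first result appears, but a new input can
# enter every cycle thereafter. The numbers used are illustrative
# assumptions, not taken from any actual FPGA design.

def pipeline_stats(clock_mhz, n_stages):
    """Return (latency in ns for one item, throughput in M items/s)."""
    cycle_ns = 1000.0 / clock_mhz      # one clock period in ns
    latency_ns = cycle_ns * n_stages   # first result after N cycles
    throughput_m = clock_mhz           # one result per cycle afterwards
    return latency_ns, throughput_m

# A 10-stage pipeline at 500 MHz: 20 ns to the first result, then
# 500 M results/s - the "stationwagon full of tapes" trade-off in miniature.
print(pipeline_stats(500, 10))  # -> (20.0, 500)
```

This is why "in-to-out latency" and "bits-per-second throughput" pick different designs: a trader who cares about the first byte out wants fewer stages, even at a lower clock.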
From diep at xs4all.nl Sat Feb 18 03:23:59 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 09:23:59 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> Message-ID: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> On Feb 18, 2012, at 12:39 AM, Joachim Worringen wrote: > Vincent, > > I haven't read all zillion lines of your posts, but as I'm heading > software engineering of a very successful "prop shop" (proprietary > trading company), I might add some real-world comments: > - Execution speed is important, but it's not everything. Only the > simplest strategies purely rely on speed for success. Which is 90% of all strategies of all traders. So the statistics totally refute you, in a very, very hard way. The markets move totally horizontally now - speed is the only thing that can make you good money in the derivatives markets now. > - Even more important than execution speed is time-to-market. It's of > no use to have the superfast thing ready when the market has moved > into a different direction nine months ago. > - Equally important is reliability and maintainability. > > Our inhouse development is based on C++, but we make very good money > with Java-based third-party solutions as well. > > FPGAs are not the silver-bullet-solution either. Finding people to > program them is hard, development is complex and takes time, verifying > them takes even more, and is required for every little change. Think > about time-to-market. Therefore, they are mainly used in limited > scenarios (risk-checks), or are used with high-level-compiler support, > giving away significant fraction of the potential performance. > I don't see FPGAs as the silver bullet either, because of the huge costs - as I described before, the costs are much higher than a normal FPGA development would be. Your time-to-market argument is total nonsense.
Only a civil servant would show up with such a statement. If you quickly tape out a trading product that is slow again - say 100 microseconds latency - no one will buy it. FIX/FAST was used 10 years ago and will be used 10 years from now. And if some politician in nation A says "we limit the exchanges", then they move to nation B. > Oh, btw, we are always looking for bright, but also socially compliant > developers. > Yes, all those social people you hire to work in the financial industry - they'd never sell a bad product to someone else - making money is not the main concern - so, so, so social :) > Joachim From diep at xs4all.nl Sat Feb 18 03:36:03 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 09:36:03 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> Message-ID: <42D8407A-9716-48BD-8381-14DC483B5909@xs4all.nl> Out of interest, if I google: Date: 2009 "Joachim Worringen is a software architect at Dolphin Interconnect " Is that what you did before moving to keeping a few guys busy in your prop shop?
Kind Regards, Vincent On Feb 18, 2012, at 12:39 AM, Joachim Worringen wrote: > Vincent, > > I haven't read all zillion lines of your posts, but as I'm heading > software engineering of a very successful "prop shop" (proprietary > trading company), I might add some real-world comments: > - Execution speed is important, but it's not everything. Only the > simplest strategies purely rely on speed for success. > - Even more important than execution speed is time-to-market. It's of > no use to have the superfast thing ready when the market has moved > into a different direction nine months ago. > - Equally important is reliability and maintainability. > > Our inhouse development is based on C++, but we make very good money > with Java-based third-party solutions as well. > > FPGAs are not the silver-bullet-solution either. Finding people to > program them is hard, development is complex and takes time, verifying > them takes even more, and is required for every little change. Think > about time-to-market. Therefore, they are mainly used in limited > scenarios (risk-checks), or are used with high-level-compiler support, > giving away significant fraction of the potential performance. > > Oh, btw, we are always looking for bright, but also socially compliant > developers. > > Joachim
From worringen at googlemail.com Sat Feb 18 04:13:03 2012 From: worringen at googlemail.com (Joachim Worringen) Date: Sat, 18 Feb 2012 10:13:03 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> References: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> Message-ID: On Sat, Feb 18, 2012 at 9:23 AM, Vincent Diepeveen wrote: > On Feb 18, 2012, at 12:39 AM, Joachim Worringen wrote: >> - Execution speed is important, but it's not everything. Only the >> simplest strategies purely rely on speed for success. > > Which is 90% of all strategies of all traders. > > So the statistics totally refute you, > in a very, very hard way. Our daily P&L statistics give us a different impression, but we are probably just too stupid to read them correctly. Joachim From diep at xs4all.nl Sat Feb 18 04:31:26 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 10:31:26 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> Message-ID: On Feb 18, 2012, at 10:13 AM, Joachim Worringen wrote: > On Sat, Feb 18, 2012 at 9:23 AM, Vincent Diepeveen > wrote: >> On Feb 18, 2012, at 12:39 AM, Joachim Worringen wrote: >>> - Execution speed is important, but it's not everything. Only the >>> simplest strategies purely rely on speed for success. >> >> Which is 90% of all strategies of all traders. >> >> So the statistics totally refute you, >> in a very, very hard way.
> > Our daily P&L statistics give us a different impression, but we are > probably just too stupid to read them correctly. > If your software is duck slow, I bet you have problems doing simple trades. It reminds me big time of this discussion with a few government guys who produce sucky software keeping dudes busy; they complained that 90% of the trades they tried to submit were refused. Of course it's easy to show then that this is because they're way, way too slow with their software - not seldom still software with hundreds of microseconds of latency, and hardware latencies are not counted here. Nor the extremely slow nature of built-in NICs. In fact, in a congressional hearing this fact was also mentioned - again someone using that same software, like the first guy who complained to me - that 90% was getting refused; that obviously means you're too slow. Maybe hire better software engineers and buy a decent network card? > Joachim
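The "90% refused" claim above can be illustrated with a toy race model (all numbers are hypothetical placeholders, not real exchange data): an order only fills if it reaches the book before a faster competitor takes the quote, so your fill rate is simply the fraction of races your end-to-end latency wins.

```python
def fill_rate(my_latency_us, competitor_latencies_us):
    """Fraction of races won: you get the fill only when your
    end-to-end latency beats the competing order's latency."""
    wins = sum(1 for c in competitor_latencies_us if my_latency_us < c)
    return wins / len(competitor_latencies_us)

# Hypothetical field of competitors, latencies in microseconds.
competitors = [5, 10, 15, 20, 30, 40, 60, 80, 90, 1000]

print(fill_rate(100, competitors))  # 0.1 -> 90% of submissions refused
print(fill_rate(4, competitors))    # 1.0 -> every race won
```

Under this toy model a 100-microsecond stack loses 9 races out of 10 against single-digit-microsecond competitors, which is the shape of the rejection statistic being argued about.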
From diep at xs4all.nl Sat Feb 18 06:04:28 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 12:04:28 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: Message-ID: On Feb 18, 2012, at 5:45 AM, Lux, Jim (337C) wrote: >> >> It has crossed my mind for just a second that if 1 government would >> put 1 team together and fund it, >> and have them produce a magnificent trading solution, >> which for example plugs in flawless into the IBM websphere, >> and have their own traders buy this, especially small ones with just >> ties to their own nation, that in theory things still are total >> legal. > > There was a famous challenge about breaking DES, where it was done > with > FPGAs. > But why would a government fund such a thing (except perhaps as an > economic weapon.. Humorous thoughts of dyed-in-the-wool Marxists > cackling > at the thought of destroying the capitalists with their own trading > tools, > developed by a centrally planned "trading machine establishment #3". In the first place, government already indirectly runs so many companies (pays directly or indirectly for the jobs created there), a sick habit especially in Europe, that the plan I sketched there would be peanuts to execute. They do have the people for it; they already pay the companies to keep dudes busy and spoil years of their lives, which IMHO is a very evil thing; basically anyone above IQ 120 is gonna get hammered down by the government in a merciless manner; that means that the vast majority, 95% or so, will shut up the rest of their lives, give 0 criticism publicly anymore and go the selfish way - whereas he or she otherwise would give that criticism, which is so crucial for democracies to self-correct; so that really is a problem now for democracies, as a democracy cannot function if the clever get hammered down. Fools take over then.
Right now the statistics here in the Netherlands are that 90% of politicians are there just with the intention of getting a job. Very self-explanatory statistics. Now on countries - most nations, and Germany and the USA are no different there - work pretty mechanically in how they do business. If they see an opportunity to make big cash for their country they will be tempted to do it. The financial industry has seen huge changes the past few years: from an industry dominated by traders who had a financial background, to the current NSA-type struggle for speed, hardware and game-tree-search-type manners of making money there. In the past, what was very common was some guy X who owned a bunch of houses/buildings, who was trading. He talked to someone, the CEO of some company. The CEO said nothing useful, but our guy X concluded he blinked a lot and based upon that tried to sell his interest in that company. That game has changed. What has come in its place is that new datacenters will allow trading up to thousands of times per second in the same instrument - say a future on Mellanox, which in the long term we expect to go up. Yet we trade thousands of times per second in this same instrument. So when it drops 1 cent, we sell a little again; the slower traders will then take another few milliseconds or so to have sold, it'll go down, so we can really make a big profit based upon 1 expectation, just by trading nonstop. This is actually what happens. This is not an 'example'; this behaviour is what happens at the exchanges. At most exchanges it's limited right now to 200 times per second in the same instrument, yet where the datacenters have faster networks this 200 already goes up by a lot. These are *measured* statistics; so the hedge funds making great profits the past few years, they in fact trade up to 200 times per second in the same instrument during surges. Sure, the rest of the day nothing happens there - but each day there are of course at least 2 surges, sometimes 3 or more.
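The surge behaviour described above - selling a small slice on every one-cent downtick, ahead of slower traders - can be sketched in a few lines. This is a deliberately naive illustration, not a real strategy or exchange API; prices are kept in integer cents so the one-cent comparison is exact.

```python
def sell_on_downticks(ticks_cents, slice_size=10):
    """Emit a small sell each time the observed price drops by
    at least one cent from the previously seen price."""
    fills = []
    last = ticks_cents[0]
    for price in ticks_cents[1:]:
        if last - price >= 1:                  # one-cent (or bigger) drop
            fills.append((price, slice_size))  # sell a little, immediately
        last = price
    return fills

# A toy tape in cents: 10.00, 9.99, 9.99, 9.98, 9.99, 9.97.
print(sell_on_downticks([1000, 999, 999, 998, 999, 997]))
# [(999, 10), (998, 10), (997, 10)] -> three small sells during the surge
```

Run at market data rates, a loop like this fires hundreds of times per second during a surge, which is the trading-frequency pattern being described.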
>> >> How do you get the best software engineers then for your trading >> application? > > By asking your other software developers? As we used to call it in > the > entertainment industry "Neportunity". >> >> And what would that 'other channel' be then? > Personal contacts. > Wasn't the idea of building a resume that you could get hired based upon NOT having a friend somewhere? >> >> Vast majority simply isn't doing this. > > Let's see.. Unemployment in the software industry is down around 3% > these > days (viz 8-12% in general, and 20-25% in certain demographics and > areas). > They're finding jobs somehow. 10:1 or 50:1 resume to open > position isn't > uncommon. Somehow they find the 1 in 50, and a fair number of > studies show > that it's not done by some HR person carefully reviewing the 50 > resumes to > find the one shining diamond. Not sure about the States, but in Europe the statistics here get manipulated big time, of course. A good example is that one day we had big unemployment; they then moved lots of folks from the unemployment statistic to the disabled statistics. Suddenly lots of folks still got the same amount of money, yet politicians cried victory that unemployment statistics went down. Well, this nation, the Netherlands, is a bad example, as out of the 7.8 million who are of working age, 5.5 million of them work at (semi-)government. Not really following elections in the US, but it seems Obama wants to adapt to that model as well? - let's take that off the mailing list though > > >> No matter how genius you are, if all you do is write fortran, and >> didn't study finance, >> then you are not allowed to write a trading application, AS YOU WON'T >> GET HIRED :) > > More an example that different industries hire for different skill > sets > and backgrounds. You're right, I can't imagine FORTRAN being very > useful > in trading. But hey, I don't write trading apps.. For all I know it's > mostly matrix math and FORTRAN is pretty good for that.
> Maybe they want people who write FORTH or LISP or PROLOG. > > The point is, it's a very niche market, looking for a very niche > programmer that is probably not remotely representative of software > developers at large. > My point was that I would want to hire that genius guy. Fortran or C, no big deal. If he can write Fortran he can write C, or he will have enough skills to get down to the utmost details, speeding me up somewhere or fixing another problem, or finding another problem we have to avoid that we didn't notice yet. The financial industry is so spoiled, as they pay such high salaries, that the thing that best describes it is 'the old schoolboys network'. It's so lucrative to work there in higher positions that everyone of course wants it. I remember speaking to some very influential people in past years, and sometimes I wondered how such mediocre guys or ladies managed to get where they are. But in the end I always concluded that some very mediocre guys sometimes have 1 big talent which nearly 0 geniuses have - and that's getting hired. > >> >>> I didn't get my job at JPL by submitting a resume, and I think >>> >> >> When you got there, there was a circle at your resume around the word >> 'NASA' > > Actually not.. I hadn't worked at JPL then. The circle was around > "microscan compressive receiver", purely by chance. Ah, the truth is always painful, isn't it? Over here most managers aren't very impressed by their own HR departments. Most HR departments over here get manned - actually, get 'girled' - by 22-30 year olds. Usually ladies, sometimes men, with a simplistic college degree at most. Most actually are well informed, yet it's questionable whether they also are capable of thinking at that level. Selection of resumes/CVs always happens in the same manner. In this nation a job that involves technology, say making software for a network card, means they would require that you come from a technical university.
So normal universities, say for example Utrecht (in the top 50 of the planet), where I studied - they throw such CVs away. Not a single look gets taken. It's just 'elimination time'. I heard many reports of guys who were allowed to come speak for a job based upon 1 company they worked at and nothing else on their CV. One of them literally reported that when he asked who had put that circle around that specific company he worked for - which for him was a minor job, just like 1% of what he had achieved in life - the explanation from the manager was that he got the CV like that from HR and that they had encircled that company for him - he didn't do it. Usually these ladies and men at HR here are paid a salary that's well below what the actual software engineers make here. The entire IT average here over the entire nation is around 52k euro a year. HR is nearly half of that. HR management is hardly over 40k euro a year. I remember talking to some major Chinese factories, and after having a good look over there, I was a tad amazed by the differences in payment there. Workers and software engineers really made little. Total peanuts. Also they worked 6 days a week and lived on site in a small room, not seldom also shared with others (depending upon position). On the other hand, the HR manager made 50k dollars a year. A royal salary over there - over a factor of 10 more than the workers there. They have far more capable HR over there than any HR department in this entire nation.
From atp at piskorski.com Sat Feb 18 11:12:15 2012 From: atp at piskorski.com (Andrew Piskorski) Date: Sat, 18 Feb 2012 11:12:15 -0500 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> Message-ID: <20120218161215.GA48861@piskorski.com> On Sat, Feb 18, 2012 at 09:23:59AM +0100, Vincent Diepeveen wrote: > On Feb 18, 2012, at 12:39 AM, Joachim Worringen wrote: >> - Execution speed is important, but it's not everything. Only the >> simplest strategies purely rely on speed for success. > > Which is 90% of all strategies of all traders. And you know that how, Vincent? You've worked in the financial trading world so long and have talked with so many different (successful) traders, that a guess based on your own experience might well be reasonable? Oh wait, you've never done anything remotely like that. The only plausible way you could know that is if you've been reading lots of comprehensive academic and/or industry surveys (if they exist) of traders. Sounds interesting. So how about you point us to the best summary of all that data on what sort of strategies traders say they use? I'm sure there must be one, because you wouldn't just be making wild assertions unsupported by any evidence whatsoever, right? Joachim, thanks for chiming in with observations based on your real-world experience. It's nice to see some piece of Vincent's rants occasionally inspire worthwhile content. Jim Lux, you too, even more so; I've learned interesting tidbits about FPGAs, etc. from your recent posts. 
-- Andrew Piskorski From landman at scalableinformatics.com Sat Feb 18 12:02:08 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Sat, 18 Feb 2012 12:02:08 -0500 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <20120218161215.GA48861@piskorski.com> References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> <20120218161215.GA48861@piskorski.com> Message-ID: <4F3FD990.7000503@scalableinformatics.com> On 02/18/2012 11:12 AM, Andrew Piskorski wrote: > You've worked in the financial trading world so long and have talked > with so many different (successful) traders, that a guess based on > your own experience might well be reasonable? Oh wait, you've never > done anything remotely like that. FWIW: our customers in this market all say the same thing in terms of "Time To Market" and correctness/maintainability. It sucks if your code is write once. It's bad if it takes you N months to deploy something that a competitor can deploy in N weeks (or N days). What we are seeing (from customers) are mixes of C/C++ and a number of domain specific languages (DSL), as well as "scripting" languages which have JIT compilers or JIT->VM execution paths. Java is one of these, as well as a few others. Oddly, I haven't seen so much Python in this, a little Perl, and zero Ruby. The languages that are in use are very interesting (the well known ones), and the ones that aren't as well known or are private DSLs are pretty darn cool.
There is much to be said for a language that enables an ease of expression of an algorithm, and doesn't get in your way with housekeeping and language bureaucracy crap. Understand that I am a fan of more terse languages, and the ones that force you into massive over-keyboarding (cough cough ... where's my coffee) should just have a nice '#include "boring_stuff.inc"' to simplify them. > The only plausible way you could know that is if you've been reading > lots of comprehensive academic and/or industry surveys (if they exist) > of traders. Sounds interesting. So how about you point us to the > best summary of all that data on what sort of strategies traders say > they use? I'm sure there must be one, because you wouldn't just be > making wild assertions unsupported by any evidence whatsoever, right? Heh ... /stands up to give a good takedown ovation ... There's lots of (mis)information out there on what HF* (multiple different types of high frequency trading ... not just equities) implies. The naive view is that the only thing that matters is speed of execution. As we service this market, we aren't directly involved in day to day elements of the participants coding, but we are aware of (some) issues that impact it. > Joachim, thanks for chiming in with observations based on your > real-world experience. It's nice to see some piece of Vincent's rants > occasionally inspire worthwhile content. Jim Lux, you too, even more > so; I've learned interesting tidbits about FPGAs, etc. from your > recent posts. Seconded. Nice to see some real end users speak up here (and HF* is most definitely a big data/HPC problem ... big HPC?). And I always enjoy Jim's posts. On silver bullets, there aren't any. Ever. Anyone trying to convince you of this is selling you something. FPGAs are very good at some subset of problems, but they are extremely hard to 'program'. Unless you get one of the "compilers" which use a virtual CPU of some sort to execute the code ... 
in which case you are giving up a majority of your usable performance anyway. And if someone from Convey or Mitrionics v2 wants to jump in and call BS (and even better, say something interesting on how you can avoid giving up the performance), I'd love to see/hear this. FPGAs have become something of a "red headed stepchild" of accelerators. The tasks they are good for, they are very good for. But getting near optimal performance is hard (based upon my past experience/knowledge ... more than 1 year old), and usually violates the "minimize time to market" criterion. If you have a problem which will change infrequently, and doesn't involve too much DP floating point, and lots of integer ops ... FPGAs might be a great fit technologically, though the other aspects have to be taken into account. Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615
From diep at xs4all.nl Sat Feb 18 12:30:49 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 18:30:49 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <4F3FD990.7000503@scalableinformatics.com> References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> <20120218161215.GA48861@piskorski.com> <4F3FD990.7000503@scalableinformatics.com> Message-ID: <7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl> On Feb 18, 2012, at 6:02 PM, Joe Landman wrote: > On 02/18/2012 11:12 AM, Andrew Piskorski wrote: > >> You've worked in the financial trading world so long and have talked >> with so many different (successful) traders, that a guess based on >> your own experience might well be reasonable? Oh wait, you've never >> done anything remotely like that. > > FWIW: our customers in this market all say the same thing in terms of > "Time To Market" and correctness/maintainability. It sucks if your > code > is write once. It's bad if it takes you N months to deploy something > that a competitor can deploy in N weeks (or N days). Say you produce a CPU now within 2 weeks. It's 200 watts, it's 1 gflop, and it's $100 production price including R&D overhead - including everything except shipping to customers. Good idea to sell? Time to market matters for simple software engineering - not for stuff that has to perform, right? > > What we are seeing (from customers) are mixes of C/C++ and a number of > domain specific languages (DSL), as well as "scripting" languages > which > have JIT compilers or JIT->VM execution paths. Java is one of > these, as > well as a few others. > > Oddly, I haven't seen so much Python in this, a little Perl, and zero > Ruby. The languages that are in use are very interesting (the well > known ones), and the ones that aren't as well known or are private > DSLs > are pretty darn cool.
> There is much to be said for a language that > enables an ease of expression of an algorithm, and doesn't get in your > way with housekeeping and language bureaucracy crap. > > Understand that I am a fan of more terse languages, and the ones that > force you into massive over-keyboarding (cough cough ... where's my > coffee) should just have a nice '#include "boring_stuff.inc"' to > simplify them. > >> The only plausible way you could know that is if you've been reading >> lots of comprehensive academic and/or industry surveys (if they >> exist) >> of traders. Sounds interesting. So how about you point us to the >> best summary of all that data on what sort of strategies traders say >> they use? I'm sure there must be one, because you wouldn't just be >> making wild assertions unsupported by any evidence whatsoever, right? > > Heh ... > > /stands up to give a good takedown ovation ... Yeah, I bet some dudes want to know at which age I wrote my first mortgage calculator and at which age I wrote my first trading application - but I won't. Had you read what I posted - which you obviously never do - it would be pretty obvious to you that I had. Also shows how totally ignorant you are about performance. But let me ask you the next 3 questions: a) When are you gonna buy a Bulldozer CPU with your own cash? If so, why? And if not, why not? b) You drink of course a cola from a local supermarket, don't you - so no Pepsi nor Coca-Cola - price matters most, doesn't it? Expensive wines? Time to market, huh? c) In 2003 AMD was first to market with an x64 CPU, Intel following with Core 2 some years later. Did you buy this yourself or advise others to buy it? - I bet like everyone on this list you get daily requests on what to buy, so be honest: what did you advise in 2003 and 2004 to those surrounding you? > > There's lots of (mis)information out there on what HF* (multiple > different types of high frequency trading ... not just equities) > implies.
The naive view is that the only thing that matters is > speed of > execution. As we service this market, we aren't directly involved in > day to day elements of the participants coding, but we are aware of > (some) issues that impact it. > >> Joachim, thanks for chiming in with observations based on your >> real-world experience. It's nice to see some piece of Vincent's >> rants >> occasionally inspire worthwhile content. Jim Lux, you too, even more >> so; I've learned interesting tidbits about FPGAs, etc. from your >> recent posts. > > Seconded. Nice to see some real end users speak up here (and HF* is > most definitely a big data/HPC problem ... big HPC?). And I always > enjoy Jim's posts. > > On silver bullets, there aren't any. Ever. Anyone trying to convince > you of this is selling you something. > > FPGAs are very good at some subset of problems, but they are extremely > hard to 'program'. Unless you get one of the "compilers" which use a > virtual CPU of some sort to execute the code ... in which case you are > giving up a majority of your usable performance anyway. And if > someone > from Convey or Mitrionics v2 wants to jump in and call BS (and even > better, say something interesting on how you can avoid giving up the > performance), I'd love to see/hear this. FPGAs have become > something of > a "red headed stepchild" of accelerators. The tasks they are good > for, > they are very good for. But getting near optimal performance is hard > (based upon my past experience/knowledge ... more than 1 year old), > and > usually violates the "minimize time to market" criterion. > > If you have a problem which will change infrequently, and doesn't > involve too much DP floating point, and lots of integer ops ... FPGAs > might be a great fit technologically, though the other aspects have to > be taken into account. > > Joe > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics Inc. 
> email: landman at scalableinformatics.com > web : http://scalableinformatics.com > http://scalableinformatics.com/sicluster > phone: +1 734 786 8423 x121 > fax : +1 866 888 3112 > cell : +1 734 612 4615 > From landman at scalableinformatics.com Sat Feb 18 13:26:20 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Sat, 18 Feb 2012 13:26:20 -0500 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl> References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> <20120218161215.GA48861@piskorski.com> <4F3FD990.7000503@scalableinformatics.com> <7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl> Message-ID: <4F3FED4C.3010903@scalableinformatics.com> On 02/18/2012 12:30 PM, Vincent Diepeveen wrote: > > On Feb 18, 2012, at 6:02 PM, Joe Landman wrote: > >> On 02/18/2012 11:12 AM, Andrew Piskorski wrote: >> >>> You've worked in the financial trading world so long and have talked >>> with so many different (successful) traders, that a guess based on >>> your own experience might well be reasonable? Oh wait, you've never >>> done anything remotely like that. >> >> FWIW: our customers in this market all say the same thing in terms of >> "Time To Market" and correctness/maintainability. It sucks if your code >> is write once.
>> It's bad if it takes you N months to deploy something >> that a competitor can deploy in N weeks (or N days). > > Say you produce a CPU now within 2 weeks. It's 200 watts, it's 1 gflop > and it's $100 production price > including R&D overhead, including everything except shipping to customers. > > Good idea to sell? ... from which I take it you didn't comprehend Joachim's point. Time to market is a way to say how quickly a code or computing platform (software side) can be put into production. At least this is what we are told. Time to market has *nothing* whatsoever to do with what you are talking about in this context. Others feel free to jump in and correct/dispute/etc. this. > > time to market matters for simple software engineering - not for stuff > that has to perform ok? See above. [...] > Yeah, I bet some dudes want to know at which age I wrote my first > mortgage calculator > and at which age I wrote my first trading application - but I won't. > > Had you read what I posted - which you obviously never do - it would be > pretty obvious to you I had. Hmmm ... I used to read what you wrote (note the past tense) though I skim it for ... er ... nuggets ... these days. > > Also shows how totally ignorant you are about performance. But let me ask > you the next 3 questions: I. am. ignorant. about. performance. This is either the most insulting or amusing thing I've read in a really long time. Years. I'll take it as amusing. Yes Vincent. We, who push our gear that hits more than 5 GB/s sustained to and from spinning rust, in a single box, with less than 50 disks required to get there ... we who push our flash arrays and SSD arrays that hit millions of IOPs ... Yes Vincent, we are ignorant of performance. We obviously don't get or understand performance. We have no interest in it. (the preceding should be read aloud with a voice positively dripping in sarcasm) I am laughing now. No, really. That's laughter you hear.
> > a) when are you gonna buy a bulldozer cpu from your own cash? If so why? > And if not why not? I bought (with my own cash) Opteron and Xeon. Clearspeed, FGPAs, GPUS etc. And you? Won't buy bulldozer for us for a while. Looks like it needs some performance tweaks, and the compilers have to do a better job for it. > b) you drink of course a cola from a local supermarket isn't it, so no > pepsi nor coca cola - price matters most isn't it? Er ... I don't drink soda (sound of an opaque metaphor shattering). > expensive wines? Time to market huh? Ahh ... an obtuse path to discuss time to market as a function of cost. So if your development cost requires you pay expensive programmers for 6 months working on very expensive hardware with very expensive tools for an application that will have a usable lifetime of 3-6 months (or whatever window is relevant) ... ... versus ... you pay your very expensive programmers for 1-2 weeks working on less expensive hardware with well designed and inexpensive tools for an application that will have a usable lifetime of 3-6 months (or whatever window is relevant) ... which of these will a) cost less to build/test/deploy b) have a longer time in market to make you money? And your expensive programmers get to work on the next task, therefore increasing your ability to have a collection of tools actively engaged on the market. Seriously, if you don't get why this matters, well ... > c) in 2003 AMD was first to market a x64 cpu, intel following with core2 > some years later. Did you buy this yourself or advice others to buy it? > - i bet like everyone on this list you get daily requests on what to buy > isn't it, so be honest > - what did you advice in 2003 and 2004 to those surrounding you? Oh. My. Vincent ... um ... How do I say this ... Why don't you google me, with the phrase ... I dunno ... "AMD whitepaper opteron" or similar things. 
Here's one you might find: http://developer.amd.com/assets/Computational_Chemistry_Paper.pdf I can send you others in PDF form if you like. If you read some of the white papers we wrote for them, you might even find where they got the APU expression from ... Heh. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Sat Feb 18 13:48:01 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 19:48:01 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <4F3FED4C.3010903@scalableinformatics.com> References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> <20120218161215.GA48861@piskorski.com> <4F3FD990.7000503@scalableinformatics.com> <7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl> <4F3FED4C.3010903@scalableinformatics.com> Message-ID: On Feb 18, 2012, at 7:26 PM, Joe Landman wrote: > > Ahh ... an obtuse path to discuss time to market as a function of > cost. > > So if your development cost requires you pay expensive programmers > for 6 months working on very expensive hardware with very expensive > tools for an application that will have a usable lifetime of 3-6 > months (or whatever window is relevant) ... > > ... versus ... 
> > you pay your very expensive programmers for 1-2 weeks working on > less expensive hardware with well designed and inexpensive tools > for an application that will have a usable lifetime of 3-6 months > (or whatever window is relevant) > > ... which of these will > > a) cost less to build/test/deploy > b) have a longer time in market to make you money? > > > And your expensive programmers get to work on the next task, > therefore increasing your ability to have a collection of tools > actively engaged on the market. > > Seriously, if you don't get why this matters, well ... > Why is quadrics bankrupt? Why has everyone forgotten about myri as high performance network? Why is itanium end of line? Simple - they don't perform well and/or were too expensive. In all your naivity you forgot the most important thing; PERFORMANCE and RELIABILITY. Your fast $2.50 an hour software engineers are a) not producing reliable code b) no performance both crucial for trading software. The entire discussion here and the entire mailing list is interested in reliability and performance. They care about the big clusters the traders have in their backyard that do all the calculations - and i can't say anything sensible there either except that there is A FEW hedgefunds that publicly admit they do have big clusters with highend networks and reliable ones and we do know it's all in big bunkers - so we can be sure that no one, especially the government, doesn't know what happens there. So that's why we aren't discussing the most interesting stuff they got. From just 1 hedgefund we know they have a couple of thousands of nodes; that's a rather big one though. The rest we can just guess. What we can discuss is the highperformance part which is the trading engine. Either in fpga or in software or a hybrid. Now you're telling that 'time to market' matters there? What world are you from. THIS IS ABOUT PERFORMANCE. if you can't deliver the performance, you don't even need to START the project. 
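The quoted cost comparison above amounts to a two-line revenue model: for a fixed market window, every week spent in development is both a week of payroll and a week of sales you never collect. As a back-of-envelope sketch in Python, where every dollar figure and date is a made-up placeholder rather than anything from the thread:

```python
# Back-of-envelope model of the time-to-market argument quoted above.
# All numbers are hypothetical placeholders, not data from the thread.

def net_value(dev_weeks, weekly_dev_cost, market_window_weeks, weekly_revenue):
    """Revenue earned during what is left of the market window, minus
    development cost. The window starts ticking at project kickoff, so
    every week spent developing is a week of revenue lost."""
    selling_weeks = max(0, market_window_weeks - dev_weeks)
    return selling_weeks * weekly_revenue - dev_weeks * weekly_dev_cost

window = 26          # ~6-month usable lifetime for the application
revenue = 50_000     # hypothetical revenue per week in market

slow = net_value(dev_weeks=26, weekly_dev_cost=40_000,
                 market_window_weeks=window, weekly_revenue=revenue)
fast = net_value(dev_weeks=2, weekly_dev_cost=20_000,
                 market_window_weeks=window, weekly_revenue=revenue)

print(slow)  # the 6-month build consumes the whole window: deep in the red
print(fast)  # the 2-week build sells for 24 weeks
```

Performance still matters, of course; the model only says that when the window is fixed, development time is revenue never collected.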
I really doubt you know anything about selling a performance product
if all you care about is time to market :)

Vincent

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From landman at scalableinformatics.com Sat Feb 18 14:40:09 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Sat, 18 Feb 2012 14:40:09 -0500
Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right
In-Reply-To:
References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl>
	<20120218161215.GA48861@piskorski.com>
	<4F3FD990.7000503@scalableinformatics.com>
	<7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl>
	<4F3FED4C.3010903@scalableinformatics.com>
Message-ID: <4F3FFE99.1040305@scalableinformatics.com>

On 02/18/2012 01:48 PM, Vincent Diepeveen wrote:
>>
>> Seriously, if you don't get why this matters, well ...
>>
>
> Why is quadrics bankrupt?

Because ... they ... ran ... out ... of ... money?

> Why has everyone forgotten about myri as high performance network?

Because ... other ... competitors ... have ... emerged ... with ...
better ... more standardized ... stuff?

> Why is itanium end of line?

Umm ... anyone who knows anything about business would be able to answer
this one for you.

> Simple - they don't perform well and/or were too expensive.

Not so simple. Quadrics ran out of money. Myri was surpassed with other
tech. Itanium was never a good business idea.

> In all your naivity you forgot the most important thing; PERFORMANCE and
> RELIABILITY.

Ahh ... all of my naiveté. And on that note, I'll close my participation
in this amusing thread. A conversation where you have not only a failure
in communication, but a profound ... seemingly unbounded ... seemingly
willful ...
failure in comprehension ... yeah, not so much of a good conversation.

Really Vincent, it's been entertaining. Naiveté ... /chuckles

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From hahn at mcmaster.ca Sat Feb 18 15:33:33 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Sat, 18 Feb 2012 15:33:33 -0500 (EST)
Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right
In-Reply-To:
References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl>
	<20120218161215.GA48861@piskorski.com>
	<4F3FD990.7000503@scalableinformatics.com>
	<7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl>
	<4F3FED4C.3010903@scalableinformatics.com>
Message-ID:

> Why is quadrics bankrupt?
> Why has everyone forgotten about myri as high performance network?
> Why is itanium end of line?
>
> Simple - they don't perform well and/or were too expensive.

false. they didn't execute well enough. quadrics owned HPC for a while,
but didn't execute properly in transitioning past qsnet2. they could have
nipped IB in the bud. imagine if quadrics had managed to ship 10Gb
VM-aware adapters as well as intelligent switching fabrics, and quickly
moved to 40Gb and DCE-like wire-level features.

myrinet also lost its place in HPC, probably for similar reasons, though
it isn't quite gone. its current product line seems focused on high-freq
trading, though I have no idea how well they succeed. ironically, MX,
with 3-4 us latency, is still reasonably attractive.
I have no idea why quadrics and myricom didn't manage to win by following the ethernet(ish) path. perhaps it was just the drag induced by the rest of the eth world, since they would have needed to make 40Gb prevalent several years ago to compete with IB. itanium, as well, succeeded in a limited domain, but suffered because Intel wasn't willing to commit to it instead of facing the threat of AMDs x86_64 head-on. was it the right approach? I still think VLIW is a mistake ISA-wise, but perhaps if Intel had put as much ingenuity into it as they did into >=nehalem, it might have succeeded. all three cases are failures of execution. > The entire discussion here and the entire mailing list is interested > in reliability and performance. HPC is about performance; reliability is only of interest when it becomes a threat to performance. > I really doubt you know anything about selling a performance product > if all you care about it time to market :) poor execution becomes a failure precisely because of time-to-market. specifically, the three examples have failed because poor execution let alternatives arrive in the market ahead of them. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From james.p.lux at jpl.nasa.gov Sat Feb 18 17:17:53 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Sat, 18 Feb 2012 14:17:53 -0800 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <4F3FD990.7000503@scalableinformatics.com> Message-ID: On 2/18/12 9:02 AM, "Joe Landman" wrote: > >FPGAs are very good at some subset of problems, but they are extremely >hard to 'program'. 
Unless you get one of the "compilers" which use a >virtual CPU of some sort to execute the code ... in which case you are >giving up a majority of your usable performance anyway. And if someone >from Convey or Mitrionics v2 wants to jump in and call BS (and even >better, say something interesting on how you can avoid giving up the >performance), I'd love to see/hear this. FPGAs have become something of >a "red headed stepchild" of accelerators. The tasks they are good for, >they are very good for. But getting near optimal performance is hard >(based upon my past experience/knowledge ... more than 1 year old), and >usually violates the "minimize time to market" criterion. > >If you have a problem which will change infrequently, and doesn't >involve too much DP floating point, and lots of integer ops ... FPGAs >might be a great fit technologically, though the other aspects have to >be taken into account. Reprogrammable FPGAs (tiny ones) were available in the mid 80s, so you could say that they're about 25 years old now. Compare that to more conventional computers, say, mid 40s.. Think about how mature compilers and such were in 1965, especially in terms of optimizers, etc. And think about how many software developers there were back then (in comparison to the general technical professional population). FPGAs will get there. (of course, conventional CPUs are always going to be ahead).. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From samuel at unimelb.edu.au Sun Feb 19 22:24:13 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Mon, 20 Feb 2012 14:24:13 +1100 Subject: [Beowulf] PCPro: AMD: what went wrong? 
Message-ID: <4F41BCDD.4080408@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Interesting article on why AMD has been on the back foot.

http://www.pcpro.co.uk/features/372859/amd-what-went-wrong/print

[...]

Yet comparison is inevitable, and not very complimentary. Our review
concluded that "Intel still holds all the cards", with pricier AMD FX
processors delivering benchmark scores synonymous with Intel's mid-range
Core i5s. The verdict was unanimous; our sister title bit-tech dubbed the
FX-8150 a "stinker".

Light was shed on Bulldozer's problems when ex-AMD engineer Cliff Maier
spoke out about manufacturing issues during the earliest stages of design.
"Management decided there should be cross-engineering [between AMD and
ATI], which meant we had to stop hand-crafting CPU designs," he said.
Production switched to faster automated methods, but Maier says the change
meant AMD's chips lost "performance and efficiency" as crucial parts were
designed by machines, rather than experienced engineers.

AMD's latest chips haven't stoked the fires of consumers, either. Martin
Sawyer, technical director at Chillblast, reports that "demand for AMD has
been quite slow", and there's no rush to buy Bulldozer. "With no AMD
solutions competitive with an Intel Core i5-2500K", he says, "AMD is a
tough sell in the mid- and high-end market." Another British PC supplier
told us off-the-record that sales are partly propped up by die-hards who
only buy AMD "because they don't like Intel".

[...]
- -- 
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au  Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9BvN0ACgkQO2KABBYQAh81OACfU+Lzu7NANVdGm8BJ1+mwuEp+
Z1wAnRKgNOby5Jn56W0LCSeVsn88bpih
=c54h
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From hahn at mcmaster.ca Mon Feb 20 13:10:22 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Mon, 20 Feb 2012 13:10:22 -0500 (EST)
Subject: [Beowulf] PCPro: AMD: what went wrong?
In-Reply-To: <4F41BCDD.4080408@unimelb.edu.au>
References: <4F41BCDD.4080408@unimelb.edu.au>
Message-ID:

> mid-range Core i5s. The verdict was unanimous; our sister title
> bit-tech dubbed the FX-8150 a "stinker".

well, for desktops. specFPrate scores are pretty competitive
(though sandybridge xeons are reportedly quite a bit better.)

> Light was shed on Bulldozer's problems when ex-AMD engineer Cliff
> Maier spoke out about manufacturing issues during the earliest stages
> of design. "Management decided there should be cross-engineering
> [between AMD and ATI], which meant we had to stop hand-crafting CPU
> designs," he said.

I'm purely armchair when it comes to low-level chip design, but to me,
this makes it sound like there are problems with their tools. what's
the nature of the magic that slower/human design makes, as opposed to
the magic-less automatic design? is this a tooling-up issue that would
only affect the first rev of auto-designed CPUs?
does this also imply
that having humans tweak the design would make the GPU/APU chips faster,
smaller or more power-efficient?

presumably this change from semi-manual to automatic design (layout?)
was motivated by a desire to improve time-to-market. or perhaps improve
consistency/predictability of development? have any such improvements
resulted? from here, it looks like BD was a bit of a stinker and that
the market is to some extent waiting to see whether Piledriver is the
chip that BD should have been. if PD had followed BD by a few months,
this discussion would have a different tone.

then again, GPUs were once claimed to have a rapid innovation cycle,
but afaict that was a result of immaturity. current GPU cycles are
pretty long, seemingly as long as, say, Intel's tick-tock. Fermi
has been out for a long while with no significant successor. ATI
chips seem to rev a high-order digit about once a year, but I'm not
sure I'd really call 5xxx a whole different generation than 6xxx.
(actually, 4xxx (2008) was pretty similar as well...)

> Production switched to faster automated methods, but Maier says the
> change meant AMD's chips lost "performance and efficiency" as crucial
> parts were designed by machines, rather than experienced engineers.

were these experienced engineers sitting on their hands during this time?

> AMD's latest chips haven't stoked the fires of consumers, either.
> Martin Sawyer, technical director at Chillblast, reports that "demand
> for AMD has been quite slow", and there's no rush to buy Bulldozer.

well, APU demand seems OK, though not very exciting because the CPU
cores in these chips are largely what AMD has been shipping for years.

> "With no AMD solutions competitive with an Intel Core i5-2500K", he
> says, "AMD is a tough sell in the mid- and high-end market." Another
> British PC supplier told us off-the-record that sales are partly
> propped up by die-hards who only buy AMD "because they don't like Intel".

to some extent.
certainly AMD has at various times in the past been able to claim the crown in: - 64b ISA and performance - memory bandwidth and/or cpu:mem balance - power efficiency - integrated CPU-GPU price/performance. - specrate-type throughput/price efficiency but Intel has executed remarkably well to take these away. for instance, although AMD's APUs are quite nice, Intel systems are power efficient enough that you can build a system with an add-in-card and still match or beat the APU power envelope. Intel seems to extract more stream-type memory bandwidth from the same dimms. and Intel has what seems like a pipeline already loaded with promising chips (SB Xeons, and presumably ivybridge improvements after that). MIC seems promising, but then again with GCN, GPUs are becoming less of an obstacle course for masochists. from the outside, we have very little visibility into what's going on with AMD. they seem to be making some changes, which is good, since there have been serious problems. whether they're the right changes, I donno. it's a little surprising to me how slowly they're moving, since being near-death would seem to encourage urgency. in some sense, the current state is near market equilibrium, though: Intel has the performance lead and is clearly charging a premium, with AMD trailing but arguably offering decent value with cheaper chips. this doesn't seem like a way for AMD to grow market share, though. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From james.p.lux at jpl.nasa.gov Mon Feb 20 15:29:43 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Mon, 20 Feb 2012 12:29:43 -0800 Subject: [Beowulf] PCPro: AMD: what went wrong? 
In-Reply-To:
Message-ID:

Comments below about automated vs manual design..

On 2/20/12 10:10 AM, "Mark Hahn" wrote:

>> mid-range Core i5s. The verdict was unanimous; our sister title
>> bit-tech dubbed the FX-8150 a "stinker".
>
> well, for desktops. specFPrate scores are pretty competitive
> (though sandybridge xeons are reportedly quite a bit better.)
>
>> Light was shed on Bulldozer's problems when ex-AMD engineer Cliff
>> Maier spoke out about manufacturing issues during the earliest stages
>> of design. "Management decided there should be cross-engineering
>> [between AMD and ATI], which meant we had to stop hand-crafting CPU
>> designs," he said.
>
> I'm purely armchair when it comes to low-level chip design, but to me,
> this makes it sound like there are problems with their tools. what's
> the nature of the magic that slower/human design makes, as opposed to
> the magic-less automatic design?

One place where humans can do a better job is in the place and route,
particularly if the design is tight on available space. If there's plenty
of room, an autorouter can do pretty well, but if it's tight, you get to
high 90s % routed, and then it gets sticky. It's a very, very complex
problem because you have to not only find room for interconnects, but
trade off propagation delay so that it can actually run at rated speed:
spreading out slows you down. (same basic problem as routing printed
circuit boards)

Granted modern place and route is very sophisticated, but ultimately, it's
a heuristic process (Xilinx had simulated annealing back in the 80s, for
instance) which is trying to capture routing guidelines and rules (as
opposed to trying guided random strategies like GA, etc.)

Skilled humans can "learn" from previous similar experience, which so far,
the automated tools don't. That is, a company doesn't do new CPU designs
every week, so there's not a huge experience base for a "learning" router
to learn from.
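The simulated annealing mentioned above can be sketched as a toy placement loop. This is a minimal illustration, not a real P&R tool: cells sit on a 1-D row, the cost is total wirelength of a small fixed netlist, and a random pairwise swap is accepted if it helps, or with a temperature-dependent probability if it hurts. The netlist and all the tuning parameters here are arbitrary assumptions:

```python
import math
import random

# Toy 1-D placement by simulated annealing: minimize total wirelength.
# nets: pairs of cell indices that must be connected (arbitrary example).
nets = [(0, 5), (1, 4), (2, 7), (3, 6), (0, 7), (2, 5)]

def wirelength(order):
    # position of each cell = its slot index in the row
    pos = {cell: slot for slot, cell in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in nets)

def anneal(order, temp=10.0, cooling=0.995, steps=20000, seed=42):
    rng = random.Random(seed)
    order = list(order)
    cost = wirelength(order)
    for _ in range(steps):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]      # propose a swap
        new_cost = wirelength(order)
        # accept improvements always; uphill moves with prob e^(-delta/T)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
        temp *= cooling
    return order, cost

placed, cost = anneal(range(8))
print(placed, cost)
```

As the temperature cools, the loop degrades into pure greedy improvement, which is exactly the local-optimum trap the early uphill moves are there to escape; real P&R works on 2-D grids with timing and congestion in the cost function, but the accept/reject skeleton is the same.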
The other thing that humans can do is have a better feel for working the
tolerances.. That is, they can make use of knowledge that some
variabilities are correlated (e.g. two parts side by side on the die will
"track", something that is poorly captured in a spec for the individual
parts).

Pushing the timing margins is where it's all done.

> is this a tooling-up issue that would
> only affect the first rev of auto-designed CPUs? does this also imply
> that having humans tweak the design would make the GPU/APU chips faster,
> smaller or more power-efficient?

Historically, the output of the automated tools is very hard to modify by
a human, except in a peephole optimization sense. This is because a human
generated design will typically have some sort of conceptual architecture
that all hangs together. An automated design tends to be, well,
unconstrained by the need for a consistent conceptual view.

It's a lot harder to change something in one place and know that it won't
break something else, if you didn't follow and participate in the design
process from the top.

There's a very distinct parallel here to optimizing compilers and "hand
coded assembly". There are equivalent tools to profilers and such, but
it's the whole thing about how a bad top level design can't be saved by
extreme low level optimization.

Bear in mind that Verilog and VHDL are about like Assembler (even if they
have a "high level" sort of C-like look to them). There are big
subroutine libraries (aka IP cores), but it's nothing like, say, an
automatically parallelizing FORTRAN compiler that makes effective use of a
vector unit.

> presumably this change from semi-manual to automatic design (layout?)
> was motivated by a desire to improve time-to-market. or perhaps improve
> consistency/predictability of development? have any such improvements
> resulted?
from here, it looks like BD was a bit of a stinker and that >the market is to some extent waiting to see whether Piledriver is the >chip that BD should have been. if PD had followed BD by a few months, >this discussion would have a different tone. There is a HUGE desire to do better automated design, for the same reason we use high level languages to develop software: it greatly improves productivity (in terms of number of designs that can be produced by one person). There aren't all that many people doing high complexity IC development. Consider something like a IEEE-1394 (Firewire) core. There are probably only 4 or 5 people in the *world* who are competent to design it or at least lead a design: not only do you need to know all the idiosyncracies of the process, but you also need to really understand IEEE-1394 in all of it's funky protocol details. Ditto for processor cores. For an example of a fairly simple and well documented core, take a look at the LEON implementations of the SPARC (which are available for free as GPLed VHDL). That's still a pretty complex piece of logic, and not something you just leap into modifying, or recreating. http://www.gaisler.com/cms/index.php?option=com_content&task=view&id=156&It emid=104 > >then again, GPUs were once claimed to have a rapid innovation cycle, >but afaikt that was a result of immaturity. current GPU cycles are >pretty long, seemlingly as long as, say, Intel's tick-tock. Fermi >has been out for a long while with no significant successor. ATI >chips seem to rev a high-order digit about once a year, but I'm not >sure I'd really call 5xxx a whole different generation than 6xxx. >(actually, 4xxx (2008) was pretty similar as well...) I suspect that the "cycle rate" is driven by market forces. At some point, there's less demand for higher performance, particularly for something consumer driven like GPUs. 
At some point, you're rendering all the objects you need at resolutions higher than human visual resolution, and you don't need to go faster. Maybe the back-end physics engine could be improved (render individual sparks in a flame or droplets in a cloud) but there's a sort of cost benefit analysis that goes into this. For consumer "single processor" kinds of applications we're probably in that zone.. How much faster do you need to render that spreadsheet or word document. The bottleneck isn't the processor, it's the data pipe coming in, whether streamed from a DVD or over the network connection. > >> Production switched to faster automated methods, but Maier says the >> change meant AMD?s chips lost ?performance and efficiency? as crucial >> parts were designed by machines, rather than experienced engineers. > >were these experienced engineers sitting on their hands during this time? No, they were designing other things (or were hired away by someone else). There's always more design work to be done than people to do it. Maybe AMD had some Human Resources/Talent Management/Human Capital issues and their top talent bolted to somewhere else? (there are people with a LOT of cash in the financial industry and in government who are interested in ASIC designs.. At least if the ads in the back of IEEE Spectrum and similar are any sign.) Being a skilled VLSI designer capable of leading a big CPU design these days is probably a "guaranteed employment and name your salary" kind of profession. > >> AMD?s latest chips haven?t stoked the fires of consumers, either. >> Martin Sawyer, technical director at Chillblast, reports that ?demand >> for AMD has been quite slow?, and there?s no rush to buy Bulldozer. > >well, APU demand seems OK, though not very exciting because the CPU >cores in these chips are largely what AMD has been shipping for years. I would speculate that consumer performance demands have leveled out, for the data bottleneck reasons discussed above. 
Sure, I'd like to rip DVDs to my server a bit faster, but I'm not going to go out and buy a new computer to do it (and of course, it's still limited by how fast I can read the DVD) > >> ?With no AMD solutions competitive with an Intel Core i5-2500K?, he >> says, ?AMD is a tough sell in the mid- and high-end market.? Another >> British PC supplier told us off-the-record that sales are partly >> propped up by die-hards who only buy AMD ?because they don?t like >>Intel?. > >to some extent. certainly AMD has at various times in the past been able >to claim the crown in: > - 64b ISA and performance > - memory bandwidth and/or cpu:mem balance > - power efficiency > - integrated CPU-GPU price/performance. > - specrate-type throughput/price efficiency > >but Intel has executed remarkably well to take these away. for instance, >although AMD's APUs are quite nice, Intel systems are power efficient >enough that you can build a system with an add-in-card and still match >or beat the APU power envelope. Intel seems to extract more stream-type >memory bandwidth from the same dimms. and Intel has what seems like a >pipeline already loaded with promising chips (SB Xeons, and presumably >ivybridge improvements after that). MIC seems promising, but then again >with GCN, GPUs are becoming less of an obstacle course for masochists. Maybe Intel hired all of AMDs top folks away, and that's why AMD is using more automated design? > >from the outside, we have very little visibility into what's going on with >AMD. they seem to be making some changes, which is good, since there have >been serious problems. whether they're the right changes, I donno. it's >a little surprising to me how slowly they're moving, since being >near-death >would seem to encourage urgency. in some sense, the current state is near >market equilibrium, though: Intel has the performance lead and is clearly >charging a premium, with AMD trailing but arguably offering decent value >with cheaper chips. 
this doesn't seem like a way for AMD to grow market
> share, though.

But hasn't that really been the case since the very early days of x86? I
seem to recall some computers out in my garage with AMD 286 and 386
clones in them.

AMD could also attack the embedded processor market with high integration
flavors of the processors.

Does AMD really need to grow market share? If the overall pie keeps
getting bigger, they can grow, keeping constant percentage market share.
They've been around long enough that by no means could they be considered
a start-up in a rapid growth phase.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Mon Feb 20 15:48:49 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Mon, 20 Feb 2012 21:48:49 +0100
Subject: [Beowulf] PCPro: AMD: what went wrong?
In-Reply-To:
References:
Message-ID:

On Feb 20, 2012, at 9:29 PM, Lux, Jim (337C) wrote:

> Comments below about automated vs manual design..
>
> On 2/20/12 10:10 AM, "Mark Hahn" wrote:
>
>>> mid-range Core i5s. The verdict was unanimous; our sister title
>>> bit-tech dubbed the FX-8150 a "stinker".
>>
>> well, for desktops. specFPrate scores are pretty competitive
>> (though sandybridge xeons are reportedly quite a bit better.)
>>
>>> Light was shed on Bulldozer's problems when ex-AMD engineer Cliff
>>> Maier spoke out about manufacturing issues during the earliest
>>> stages
>>> of design. "Management decided there should be cross-engineering
>>> [between AMD and ATI], which meant we had to stop hand-crafting CPU
>>> designs," he said.
>>
>> I'm purely armchair when it comes to low-level chip design, but to
>> me,
>> this makes it sound like there are problems with their tools.
what's >> the nature of the magic that slower/human design makes, as opposed to >> the magic-less automatic design? > > One place where humans can do a better job is in the place and route, > particularly if the design is tight on available space. If there's > plenty > of room, an autorouter can do pretty well, but if it's tight, you > get to > high 90s % routed, and then it gets sticky. It's a very, very complex > problem because you have to not only find room for interconnects, but > trade off propagation delay so that it can actually run at rated > speed: > spreading out slows you down. (same basic problem as routing printed > circuit boards) > > Granted modern place and route is very sophisticated, but > ultimately, it's > a heuristic process (Xilinx had simulated annealing back in the > 80s, for > instance) which is trying to capture routine guidelines and rules (as > opposed to trying guided random strategies like GA, etc.) Actually for hand optimization of yields at modern CPU's stuff like simulated annealing is less popular. You can actually also use lineair solvers for that, in order to recalculate entire design and under the right constraints it gives an optimal solution, which is not garantueed for the non-lineair solving methods as those also easily can pick a local maximum. Stuff like simulated annealing is more popular at the non-lineair problems such as in artificial intelligence. > > Skilled humans can "learn" from previous similar experience, which > so far, > the automated tools don't. That is, a company doesn't do new CPU > designs > every week, so there's not a huge experience base for a "learning" > router > to learn from. > > The other thing that humans can do is have a better feel for > working the > tolerances.. That is, they can make use of knowledge that some > variabilities are correlated (e.g. Two parts side by side on the > die will > "track", something that is poorly captured in a spec for the > individual > parts). 
> Pushing the timing margins is where it's all done. >> is this a tooling-up issue that would only affect the first rev of auto-designed CPUs? does this also imply that having humans tweak the design would make the GPU/APU chips faster, smaller or more power-efficient? > Historically, the output of the automated tools is very hard to modify by a human, except in a peephole optimization sense. This is because a human generated design will typically have some sort of conceptual architecture that all hangs together. An automated design tends to be, well, unconstrained by the need for a consistent conceptual view. > It's a lot harder to change something in one place and know that it won't break something else, if you didn't follow and participate in the design process from the top. > There's a very distinct parallel here to optimizing compilers and "hand coded assembly". There are equivalent tools to profilers and such, but it's the whole thing about how a bad top level design can't be saved by extreme low level optimization. > Bear in mind that Verilog and VHDL are about like Assembler (even if they have a "high level" sort of C-like look to them). There are big subroutine libraries (aka IP cores), but it's nothing like, say, an automatically parallelizing FORTRAN compiler that makes effective use of a vector unit. >> presumably this change from semi-manual to automatic design (layout?) was motivated by a desire to improve time-to-market. or perhaps improve consistency/predictability of development? have any such improvements resulted? from here, it looks like BD was a bit of a stinker and that the market is to some extent waiting to see whether Piledriver is the chip that BD should have been. if PD had followed BD by a few months, this discussion would have a different tone.
> There is a HUGE desire to do better automated design, for the same reason we use high level languages to develop software: it greatly improves productivity (in terms of number of designs that can be produced by one person). > There aren't all that many people doing high complexity IC development. Consider something like an IEEE-1394 (Firewire) core. There are probably only 4 or 5 people in the *world* who are competent to design it or at least lead a design: not only do you need to know all the idiosyncrasies of the process, but you also need to really understand IEEE-1394 in all of its funky protocol details. > Ditto for processor cores. For an example of a fairly simple and well documented core, take a look at the LEON implementations of the SPARC (which are available for free as GPLed VHDL). That's still a pretty complex piece of logic, and not something you just leap into modifying, or recreating. > http://www.gaisler.com/cms/index.php?option=com_content&task=view&id=156&Itemid=104 >> then again, GPUs were once claimed to have a rapid innovation cycle, but afaict that was a result of immaturity. current GPU cycles are pretty long, seemingly as long as, say, Intel's tick-tock. Fermi has been out for a long while with no significant successor. ATI chips seem to rev a high-order digit about once a year, but I'm not sure I'd really call 5xxx a whole different generation than 6xxx. (actually, 4xxx (2008) was pretty similar as well...) > I suspect that the "cycle rate" is driven by market forces. At some point, there's less demand for higher performance, particularly for something consumer driven like GPUs. At some point, you're rendering all the objects you need at resolutions higher than human visual resolution, and you don't need to go faster.
Maybe the back-end physics engine could be improved (render individual sparks in a flame or droplets in a cloud) but there's a sort of cost benefit analysis that goes into this. > For consumer "single processor" kinds of applications we're probably in that zone.. How much faster do you need to render that spreadsheet or word document? The bottleneck isn't the processor, it's the data pipe coming in, whether streamed from a DVD or over the network connection. >>> Production switched to faster automated methods, but Maier says the change meant AMD's chips lost "performance and efficiency" as crucial parts were designed by machines, rather than experienced engineers. >> were these experienced engineers sitting on their hands during this time? > No, they were designing other things (or were hired away by someone else). There's always more design work to be done than people to do it. Maybe AMD had some Human Resources/Talent Management/Human Capital issues and their top talent bolted to somewhere else? (there are people with a LOT of cash in the financial industry and in government who are interested in ASIC designs.. At least if the ads in the back of IEEE Spectrum and similar are any sign.) > Being a skilled VLSI designer capable of leading a big CPU design these days is probably a "guaranteed employment and name your salary" kind of profession. >>> AMD's latest chips haven't stoked the fires of consumers, either. Martin Sawyer, technical director at Chillblast, reports that "demand for AMD has been quite slow", and there's no rush to buy Bulldozer. >> well, APU demand seems OK, though not very exciting because the CPU cores in these chips are largely what AMD has been shipping for years. > I would speculate that consumer performance demands have leveled out, for the data bottleneck reasons discussed above.
Sure, I'd like to rip DVDs to my server a bit faster, but I'm not going to go out and buy a new computer to do it (and of course, it's still limited by how fast I can read the DVD) >>> "With no AMD solutions competitive with an Intel Core i5-2500K", he says, "AMD is a tough sell in the mid- and high-end market." Another British PC supplier told us off-the-record that sales are partly propped up by die-hards who only buy AMD "because they don't like Intel". >> to some extent. certainly AMD has at various times in the past been able to claim the crown in: >> - 64b ISA and performance >> - memory bandwidth and/or cpu:mem balance >> - power efficiency >> - integrated CPU-GPU price/performance. >> - specrate-type throughput/price efficiency >> but Intel has executed remarkably well to take these away. for instance, although AMD's APUs are quite nice, Intel systems are power efficient enough that you can build a system with an add-in-card and still match or beat the APU power envelope. Intel seems to extract more stream-type memory bandwidth from the same dimms. and Intel has what seems like a pipeline already loaded with promising chips (SB Xeons, and presumably ivybridge improvements after that). MIC seems promising, but then again with GCN, GPUs are becoming less of an obstacle course for masochists. > Maybe Intel hired all of AMD's top folks away, and that's why AMD is using more automated design? >> from the outside, we have very little visibility into what's going on with AMD. they seem to be making some changes, which is good, since there have been serious problems. whether they're the right changes, I dunno. it's a little surprising to me how slowly they're moving, since being near-death would seem to encourage urgency.
in some sense, the current state is near market equilibrium, though: Intel has the performance lead and is clearly charging a premium, with AMD trailing but arguably offering decent value with cheaper chips. this doesn't seem like a way for AMD to grow market share, though. > But hasn't that really been the case since the very early days of x86? I seem to recall some computers out in my garage with AMD 286 and 386 clones in them. > AMD could also attack the embedded processor market with high-integration flavors of the processors. > Does AMD really need to grow market share? If the overall pie keeps getting bigger, they can grow, keeping constant percentage market share. They've been around long enough that by no means could they be considered a start-up in a rapid growth phase. From michf at post.tau.ac.il Mon Feb 20 17:52:03 2012 From: michf at post.tau.ac.il (Micha) Date: Tue, 21 Feb 2012 00:52:03 +0200 Subject: [Beowulf] newb question - help with NUMA + mpich2 + GPUs Message-ID: <4F42CE93.4070600@post.tau.ac.il> Sorry if this is inappropriate here. I'm finally growing from clusters of single CPUs to a machine with multiple CPUs, which means that I need to start taking note of NUMA issues. I'm looking for information on how to achieve that with mpi under linux.
I'm currently using mpich2, but I don't mind switching if needed. Things are actually more complex as this is a mixed CPU/GPU (CUDA) system, so I'm also looking for how to effectively transfer data between GPUs sitting on different PCIe slots and find the affinity between GPUs and CPUs. Also, at what stage is the support for using MPI to copy between GPUs? Thanks for any pointers From samuel at unimelb.edu.au Mon Feb 20 22:31:12 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Tue, 21 Feb 2012 14:31:12 +1100 Subject: [Beowulf] newb question - help with NUMA + mpich2 + GPUs In-Reply-To: <4F42CE93.4070600@post.tau.ac.il> References: <4F42CE93.4070600@post.tau.ac.il> Message-ID: <4F431000.3060808@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 21/02/12 09:52, Micha wrote: > Things are actually more complex as this is a mixed CPU/GPU (CUDA) system so I'm also looking for how to effectively transfer data between GPUs sitting on different PCIe slots and find the affinity between GPUs and CPUs. Also at what stage is the support for using MPI to copy between GPUs? The hwloc library from the Open-MPI folks will probably help with some of it: http://www.open-mpi.org/projects/hwloc/ It can show you which cores are near which PCI devices for instance, and lstopo is a fantastic tool for getting a quick overview of a node. I *believe* that code is in the 1.5 series but it'd be well worth asking the question on the open-mpi lists to get a definitive answer from someone who knows what they're talking about.
:-) There was also a discussion on the Open-MPI devel list recently about why MVAPICH2 appears to do better than it with GPUs (for the moment); the summary is here: http://www.open-mpi.org/community/lists/devel/2012/02/10430.php Hope this helps! Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk9DEAAACgkQO2KABBYQAh8h1gCfRtYtAY6hra6ckeoC60ZkfqOe qPwAnAsZCHB/5E9QMYutgTMKiW4cdlxO =cwG5 -----END PGP SIGNATURE----- From samuel at unimelb.edu.au Mon Feb 20 22:39:37 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Tue, 21 Feb 2012 14:39:37 +1100 Subject: [Beowulf] Controlling java's hunger for RAM Message-ID: <4F4311F9.9060200@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi folks, A perennial issue on our x86 clusters is Java and its unpredictable appetite for RAM. It seems it defaults to trying to mmap() a quarter of system RAM (12GB on our 48GB nodes, for instance) unless overridden by the -Xmx parameter. Now we'd like a way to be able to set a default for -Xmx for all Java processes, but cannot use $_JAVA_OPTIONS as that overrides the command line options rather than the other way around (which would have been the sensible way to do it).
The reason why is that Torque sets RLIMIT_RSS to enforce memory requests on jobs, and malloc() now usually calls mmap() to allocate rather than sbrk(), so RLIMIT_DATA is useless (not enforced). Any ideas? cheers! Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk9DEfkACgkQO2KABBYQAh8tbACfavVT01WYQYbkeYgKfjoUJ6jP 9AEAnRDqHcU9Egf9fM24KTtZxUhvfN9u =/fHd -----END PGP SIGNATURE----- From james.p.lux at jpl.nasa.gov Mon Feb 20 23:14:06 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Mon, 20 Feb 2012 20:14:06 -0800 Subject: [Beowulf] PCPro: AMD: what went wrong? In-Reply-To: Message-ID: On 2/20/12 12:48 PM, "Vincent Diepeveen" wrote: > On Feb 20, 2012, at 9:29 PM, Lux, Jim (337C) wrote: >> Comments below about automated vs manual design.. >> Granted modern place and route is very sophisticated, but ultimately, it's a heuristic process (Xilinx had simulated annealing back in the 80s, for instance) which is trying to capture routing guidelines and rules (as opposed to trying guided random strategies like GA, etc.) > Actually for hand optimization of yields on modern CPUs, techniques like simulated annealing are less popular. I don't know that anyone still uses simulated annealing; it was an example of what kinds of strategies were used in the early days.
Back in the late 70s, early 80s, I was looking into automated layout of PCBs. It was pretty grim.. 80% routed, then it would die. The computational challenge is substantial. > You can also use linear solvers for that, in order to recalculate the entire design; under the right constraints that gives an optimal solution, which is not guaranteed for the non-linear solving methods, as those can easily get stuck in a local maximum. I don't think a linear solver would work. > Techniques like simulated annealing are more popular for non-linear problems such as those in artificial intelligence. The place and route problem is highly nonlinear with a lot of weird interactions. I'll be the first to confess that I am pretty bad at PCB or ASIC layout, but there's a lot of tricky constraints that aren't a linear function of position in some form. Imagine having a data bus with 32 lines that you need to have minimal skew between, so it can be latched. I suppose this is a kind of game playing application, so move tree search strategies might work. Certainly it has EP aspects (or nearly EP), so a big parallel machine might help. For all we know, Intel and AMD have big clusters helping the designers out, running 1000 copies of timing simulators. Do Cadence, Synopsys, etc. have parallel versions?
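The heuristic place-and-route strategy discussed above can be sketched in miniature. Below is a toy simulated-annealing placer in Python, purely illustrative and unrelated to any real EDA tool: cells sit on a small grid, and the placer minimises total Manhattan wirelength of a fixed two-pin netlist by swapping random pairs of cells, occasionally accepting a worsening swap so it can climb out of shallow local minima.

```python
import math
import random

def wirelength(pos, nets):
    """Total Manhattan length of all two-pin nets, given cell -> (x, y)."""
    return sum(abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])
               for a, b in nets)

def anneal_place(cells, nets, grid, steps=20000, t0=5.0, seed=1):
    rng = random.Random(seed)
    slots = [(x, y) for x in range(grid) for y in range(grid)]
    rng.shuffle(slots)
    pos = dict(zip(cells, slots))          # random initial placement
    cost = wirelength(pos, nets)
    for i in range(steps):
        t = t0 * (1.0 - i / steps) + 1e-6  # linear cooling schedule
        a, b = rng.sample(cells, 2)
        pos[a], pos[b] = pos[b], pos[a]    # propose swapping two cells
        new = wirelength(pos, nets)
        # always accept improvements; accept worsenings with prob exp(-delta/t)
        if new <= cost or rng.random() < math.exp((cost - new) / t):
            cost = new
        else:
            pos[a], pos[b] = pos[b], pos[a]  # undo the swap
    return pos, cost

cells = list(range(16))
nets = [(i, i + 1) for i in range(15)]     # a simple chain netlist
pos, cost = anneal_place(cells, nets, grid=4)
print("final wirelength:", cost)           # the optimum for this chain is 15
```

On this tiny instance the annealer usually lands at or near the snake-shaped optimum; real place-and-route adds timing, congestion, and skew constraints that make the landscape far nastier, which is exactly the point being argued in the thread.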
From hahn at mcmaster.ca Tue Feb 21 00:02:39 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Tue, 21 Feb 2012 00:02:39 -0500 (EST) Subject: [Beowulf] Controlling java's hunger for RAM In-Reply-To: <4F4311F9.9060200@unimelb.edu.au> References: <4F4311F9.9060200@unimelb.edu.au> Message-ID: > The reason why is that Torque sets RLIMIT_RSS to enforce memory RLIMIT_RSS is simply a no-op; we use RLIMIT_AS (vmem, pvmem). a good thing about this is that it's fully consistent with the vm.overcommit_memory=2 (ie conservative) mode. some people find it offputting the way VM counts things like code (usually mapped multiple times) or F77 arrays that are defined larger than used. regards, mark hahn. From samuel at unimelb.edu.au Tue Feb 21 00:21:50 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Tue, 21 Feb 2012 16:21:50 +1100 Subject: [Beowulf] Controlling java's hunger for RAM In-Reply-To: References: <4F4311F9.9060200@unimelb.edu.au> Message-ID: <4F4329EE.8070406@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 21/02/12 16:02, Mark Hahn wrote: > RLIMIT_RSS is simply a no-op; we use RLIMIT_AS (vmem, pvmem). Gah, sorry, it is indeed RLIMIT_AS we set too (locally patched Torque so that mem and pmem set it too). Perils of writing this stuff from memory whilst dealing with the after-effects of someone accidentally upgrading one node of a multi-master CentOS DS LDAP cluster whilst I was away..
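The distinction between the two limits is easy to demonstrate from any language; here is a small Python sketch (Linux-specific and illustrative only) showing that capping RLIMIT_AS is what actually stops a large mmap()-backed allocation, exactly the kind the JVM heap reservation makes:

```python
import subprocess
import sys

# Run the allocation in a child process so the cap can't affect this one.
child = """
import resource
# Cap the address space at 1 GiB. An RLIMIT_RSS cap here would be ignored.
resource.setrlimit(resource.RLIMIT_AS, (1 << 30, 1 << 30))
try:
    buf = bytearray(2 << 30)   # ask for 2 GiB; glibc satisfies this via mmap()
    print("allocated")
except MemoryError:
    print("MemoryError")
"""
result = subprocess.run([sys.executable, "-c", child],
                        capture_output=True, text=True)
print(result.stdout.strip())   # prints "MemoryError" on Linux
```

The mmap() call fails once the address-space cap is hit, and CPython turns that into a MemoryError; an RSS limit would never fire because the pages are reserved, not resident.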
- -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk9DKe4ACgkQO2KABBYQAh8wYQCcC1wbmsXdpqhd28UP8h7wFD5X aV0AnjfLcIu6z2E0zyKMhJWT7/Kz4vJD =y1WX -----END PGP SIGNATURE----- From diep at xs4all.nl Tue Feb 21 04:32:33 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Tue, 21 Feb 2012 10:32:33 +0100 Subject: [Beowulf] PCPro: AMD: what went wrong? In-Reply-To: References: Message-ID: <2DE30CF4-F028-4798-967F-A9C3FD703509@xs4all.nl> On Feb 21, 2012, at 5:14 AM, Lux, Jim (337C) wrote: > On 2/20/12 12:48 PM, "Vincent Diepeveen" wrote: >> On Feb 20, 2012, at 9:29 PM, Lux, Jim (337C) wrote: >>> Comments below about automated vs manual design.. >>> Granted modern place and route is very sophisticated, but ultimately, it's a heuristic process (Xilinx had simulated annealing back in the 80s, for instance) which is trying to capture routing guidelines and rules (as opposed to trying guided random strategies like GA, etc.) >> Actually for hand optimization of yields on modern CPUs, techniques like simulated annealing are less popular. > I don't know that anyone still uses simulated annealing; it was an example of what kinds of strategies were used in the early days. Back in the late 70s, early 80s, I was looking into automated layout of PCBs. It was pretty grim.. 80% routed, then it would die.
The computational challenge is substantial. >> You can also use linear solvers for that, in order to recalculate the entire design; under the right constraints that gives an optimal solution, which is not guaranteed for the non-linear solving methods, as those can easily get stuck in a local maximum. > I don't think a linear solver would work. That's always the tool that gets used. A chip has too many parameters to approximate in the first place. Non-linear approximation by randomness is more popular in my own area of expertise - parameter tuning, for example for chess programs. Even though Diep is the chess program with the world's largest evaluation function (as written by one programmer - sure, some parts are totally outdated 1990s code), it has just some 20k tunable parameters or so, depending on how you count. In CPUs that's different of course, so given the constraints, linear solvers get used, at least at the feature sizes CPUs have right now. I'm not familiar with 22 nm there. >> Techniques like simulated annealing are more popular for non-linear problems such as those in artificial intelligence. > The place and route problem is highly nonlinear with a lot of weird interactions. > I'll be the first to confess that I am pretty bad at PCB or ASIC layout, Well, realize I'm a software engineer, yet the optimization is 100% software. Usually there are small companies around the big shots that deliver machines to AMD and Intel, relatively tiny companies more or less belonging to them, which try to improve yields. See it as service software belonging to the machines delivered by the likes of ASML. > but there's a lot of tricky constraints that aren't a linear function of position in some form. Imagine having a data bus with 32 lines that you need to have minimal skew between, so it can be latched.
It's not artificial intelligence; it's in all cases something that eats 2-dimensional space, where in the relatively primitive model the lines are in fact not straight lines but have roundings everywhere. You can do this incredibly complexly, with 0% odds you're going to solve it optimally, or you can make a simpler model and solve it perfectly. It definitely would be tougher if there were moveable parts within the component. So they all choose to model it using linear programming. That gives a perfect solution, and within a few days of calculation at most. Using approximation, even for a moderate CPU you would need the world's largest supercomputer for longer than we live. > I suppose this is a kind of game playing application so move tree search strategies might work. Certainly it has EP aspects (or nearly EP), so a Tree search is far more complex than the above. The above has a clear goal, yield optimization, and everything is always possible to solve by taking a tad more space. The non-linear aspects of doing this, which the linear model doesn't take into consideration, are the reason why throwing a couple of hundred engineers at hand-optimizing the CPU is so effective. > big parallel machine might help. For all we know, Intel and AMD have big clusters helping the designers out, running 1000 copies of timing simulators. Do Cadence, Synopsys, etc. have parallel versions? I don't speak for anyone here except myself. For AMD it's easier to guess than for Intel, as Intel is moving to 22 nm. I'm not familiar with 22 nm. Each new generation machine has new problems, let me assure you of that. Realize how rather inefficient simulated annealing and similar methods are: trying to progress using randomness. This is a problem already with 20k parameters. Realize the big supercomputers thrown at improving parameters of engines smaller than Diep, with at most 2000 parameters or so.
Initially some guys tried to get away with defining functions, reducing the number of parameters to 50-200. They still do. Then N*SA throws a big supercomputer at the problem, and that's how they do it. At 150 parameters you're already looking at an oracle size of 2+ million instances, and a multiple of that in tuning steps; this still tunes into a local maximum, by the way, with no guarantee of the optimum solution. Reaching even this local maximum already takes long. For something with 2+ billion transistors, are you anywhere close to realizing the problem of doing that in a lossy manner, risking getting stuck in a local maximum? It means even a tiny modification takes forever to solve, just to keep the same yields. So in the case of CPUs, just redefine the entire problem as linear, OR try to create a lookup table which you can insert into the linear solver. You can then insert a non-linear subproblem with just a handful of parameters into a linear solver, if you have a table that lists all possibilities. For any scientist who claims on paper that, using approximation, he has a non-linear solver that will always find the guaranteed best optimum, the first question I ask myself is: "what's the big-O worst case to get *any* reasonable solution at all?" Most methods start from total randomness and need to run through all steps, and only when nearing the end of the algorithm do they start having *a solution*, which still is not optimal. You don't have time to wait for that *a solution*, in fact.
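The local-maximum worry raised above is easy to picture with a toy objective (a made-up function, not any real yield model): greedy hill-climbing from the wrong start stalls on a minor peak, while an exhaustive exact search, standing in here for an exact solver, finds the global one.

```python
def f(x):
    """Toy objective: a local peak at x=2 (value 3), the global peak at x=8 (value 10)."""
    return {0: 0, 1: 2, 2: 3, 3: 1, 4: 0, 5: 2, 6: 5, 7: 8, 8: 10, 9: 4}[x]

def hill_climb(x):
    # Move to a better neighbour until none improves: a purely local method.
    while True:
        neighbours = [n for n in (x - 1, x + 1) if 0 <= n <= 9]
        best = max(neighbours, key=f)
        if f(best) <= f(x):
            return x
        x = best

exact = max(range(10), key=f)   # exhaustive search stands in for an exact solver
print(hill_climb(0), f(hill_climb(0)))   # stuck at x=2 with value 3
print(exact, f(exact))                   # global optimum x=8 with value 10
```

In one dimension the exhaustive search is trivial; the point of the thread is that for billions of variables only a solver with optimality guarantees (or a lot of engineers) closes that gap.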
From john.hearns at mclaren.com Tue Feb 21 04:42:26 2012 From: john.hearns at mclaren.com (Hearns, John) Date: Tue, 21 Feb 2012 09:42:26 -0000 Subject: [Beowulf] newb question - help with NUMA + mpich2 + GPUs References: <4F42CE93.4070600@post.tau.ac.il> Message-ID: <207BB2F60743C34496BE41039233A8090B7D6D6C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> Micha, here is probably a good starting point for you: http://www.open-mpi.org/projects/hwloc/ I would download hwloc, install it on your system and print out the topology. That page provides good reading matter on NUMA placement. Also, on a NUMA machine I find that the 'htop' utility is very useful - you should always check that processes are running on the CPUs you think they should be: http://htop.sourceforge.net/ The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From eugen at leitl.org Wed Feb 22 10:50:33 2012 From: eugen at leitl.org (Eugen Leitl) Date: Wed, 22 Feb 2012 16:50:33 +0100 Subject: [Beowulf] Chinese 16 core CPU uses message passing Message-ID: <20120222155033.GK7343@leitl.org> (experimental chip. not Godson) http://semiaccurate.com/2012/02/21/chinese-16-core-cpu-uses-message-passing/ Chinese 16 core CPU uses message passing Gone are the old days of massive shared memory architectures.
Feb 21, 2012 in Chips by Mads Ølholm During the first day of ISSCC in San Francisco, research from Fudan University in Shanghai described a brand new microprocessor that does away with the traditional shared memory architecture. Photo courtesy of Fudan The advantage of using a message passing scheme is that it scales much better than shared memory. Whereas shared memory relies on software, the message passing scheme has been implemented using mailboxes designed in hardware, according to the research paper that was presented at ISSCC. The processor itself consists of 16 RISC cores that share two small cores for shared memory access, but much of the communication is done using message passing. The processor also does away with the traditional caches and instead implements an extended register file. The end result is a processor that has been implemented in a TSMC 65nm L CMOS process and runs at up to 800MHz. When dialed back to 750MHz each core can run at 1.2V and consume only 34mW, which shows that the design is extremely energy efficient. We look forward to seeing this in the wild. From hahn at mcmaster.ca Wed Feb 22 17:08:22 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 22 Feb 2012 17:08:22 -0500 (EST) Subject: [Beowulf] Chinese 16 core CPU uses message passing In-Reply-To: <20120222155033.GK7343@leitl.org> References: <20120222155033.GK7343@leitl.org> Message-ID: > (experimental chip. not Godson) > http://semiaccurate.com/2012/02/21/chinese-16-core-cpu-uses-message-passing/ unfortunately little real info there. > Gone are the old days of massive shared memory architectures. weird.
has this writer looked at chip diagrams for the past few years? does 16 cores per memory interface (AMD) count as massive? yes, most systems are CC-NUMA, but the number of "massive" CC-NUMA systems can really be spelled with three letters: SGI. and basically boutique. > During the first day of ISSCC in San Francisco research from Fudan University in Shanghai described a brand new microprocessor "brand new microprocessor" is a sort of funny phrase. lots of things have been tried before, including, afaict, everything in this chip. this paper seems similar to http://dx.doi.org/10.1109/ICSICT.2010.5667778 (some of the same authors) which also involves a network-on-chip and "extended register file". it's based on MIPS32, which is a pretty popular choice for arch experiments. > that does away with the traditional shared memory architecture. not really. > The advantage of using a message passing scheme is that it scales much better than shared memory. apples scale better than oranges, too. the duality of MP and SM is not a new concept - not that we have such a great handle on it. > Whereas shared memory relies on software, yikes. oversimplify much? > the message passing scheme has been implemented using mailboxes designed in hardware, according to the research paper that was presented at ISSCC. The processor itself consists of 16 RISC cores that share two small cores for shared memory access, I'd prefer to see it described as 8 compute cores surrounding a memory core, with all cores on an in-chip network, but (presumably) no coherency between the two memory cores. the diagram makes the chip look to be focused on stream processing (the related paper uses reed-solomon decoding as its test load). > but much of the communication is done using message passing. The processor also does away with the traditional caches and instead implements an extended register file.
well, I think I'd call the MCore a cache; if you do, the diagram looks much more conventional... I love experimental chips and arch; I wish this paper were available already. but the field is very well-plowed - that doesn't detract from its fertility. what I _don't_ see in my sampling of current papers is any attempt to create a new or improved programming model that can nicely scale, both in terms of architecture and productivity, to systems of many cores. I'm also pretty convinced that one needs to start with a model that doesn't start with separate boxes labeled "cpu" and "memory". _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From john.hearns at mclaren.com Tue Feb 28 10:36:51 2012 From: john.hearns at mclaren.com (Hearns, John) Date: Tue, 28 Feb 2012 15:36:51 -0000 Subject: [Beowulf] Computer on a stick Message-ID: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> Cotton Candy http://www.reghardware.com/2012/02/28/fxi_technologies_offers_cotton_candy_linux_pc_on_a_stick/ A candidate for my 'fit a cluster inside the glovebox of your car' idea if ever there was one! 1.2GHz processor and 1GB DRAM What was the spec of the original Beowulf project nodes? The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. -------------- next part -------------- An HTML attachment was scrubbed...
URL: -------------- next part -------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Daniel.Pfenniger at unige.ch Tue Feb 28 11:19:02 2012 From: Daniel.Pfenniger at unige.ch (Daniel Pfenniger) Date: Tue, 28 Feb 2012 17:19:02 +0100 Subject: [Beowulf] Computer on a stick In-Reply-To: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> Message-ID: <4F4CFE76.5000502@unige.ch> Hearns, John a écrit : > Cotton Candy > > http://www.reghardware.com/2012/02/28/fxi_technologies_offers_cotton_candy_linux_pc_on_a_stick/ > > A candidate for my 'fit a cluster inside the glovebox of your car' idea if ever > there was one! > > 1.2GHz processor and 1GB DRAM > > What was the spec of the original Beowulf project nodes? > From top of my approximate brain memory, in 1994 a Pentium 70-100MHz processor and 8MB RAM memory were upper commodity hardware specs. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From james.p.lux at jpl.nasa.gov Tue Feb 28 11:32:26 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Tue, 28 Feb 2012 08:32:26 -0800 Subject: [Beowulf] Computer on a stick In-Reply-To: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> Message-ID: Way slower than that, I'm sure. Were they even Pentiums?
From: "Hearns, John" > Date: Tue, 28 Feb 2012 07:36:51 -0800 To: "beowulf at beowulf.org" > Subject: [Beowulf] Computer on a stick Cotton Candy http://www.reghardware.com/2012/02/28/fxi_technologies_offers_cotton_candy_linux_pc_on_a_stick/ A candidate for my 'fit a cluster inside the glovebox of your car' idea if ever there was one! 1.2GHz processor and 1GB DRAM What was the spec of the original Beowulf project nodes? The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Tue Feb 28 11:39:08 2012 From: eugen at leitl.org (Eugen Leitl) Date: Tue, 28 Feb 2012 17:39:08 +0100 Subject: [Beowulf] Computer on a stick In-Reply-To: References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> Message-ID: <20120228163908.GQ7343@leitl.org> On Tue, Feb 28, 2012 at 08:32:26AM -0800, Lux, Jim (337C) wrote: > Way slower than that, I'm sure. Were they even Pentiums? IIRC Becker's first cluster were i486? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
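[Editor's note on the Fudan chip thread above: the hardware-mailbox scheme Hahn discusses can be approximated in software as one bounded FIFO per core, with senders depositing messages there instead of touching shared state. A minimal sketch, with Python threads standing in for cores; all names are hypothetical and this is an illustration of the idea, not the chip's actual mechanism.]

```python
from queue import Queue
from threading import Thread

class Core:
    """Toy model of a core with a hardware-style mailbox (a bounded FIFO)."""
    def __init__(self, core_id, mailbox_depth=4):
        self.core_id = core_id
        # put() blocks when the FIFO is full, like a hardware mailbox
        self.mailbox = Queue(maxsize=mailbox_depth)

    def send(self, dest, payload):
        # the sender never touches the receiver's state, only its mailbox
        dest.mailbox.put((self.core_id, payload))

    def receive_loop(self, n_msgs, results):
        total = 0
        for _ in range(n_msgs):
            _sender, payload = self.mailbox.get()  # blocks until a message arrives
            total += payload
        results[self.core_id] = total

# one "core" streams partial results to another; no shared variables involved
producer, consumer = Core(0), Core(1)
results = {}
t = Thread(target=consumer.receive_loop, args=(8, results))
t.start()
for i in range(8):
    producer.send(consumer, i)
t.join()
print(results[1])  # 28
```

The bounded depth is the point: backpressure is built into the transport, which is what a fixed-size hardware mailbox gives you for free.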
From raysonlogin at gmail.com Tue Feb 28 11:57:59 2012 From: raysonlogin at gmail.com (Rayson Ho) Date: Tue, 28 Feb 2012 11:57:59 -0500 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: References: <4F354F26.2040103@scalableinformatics.com> Message-ID: The paper is now available online, "CPU-Assisted GPGPU on Fused CPU-GPU Architectures": http://people.engr.ncsu.edu/hzhou/hpca_12_final.pdf (I have not read the whole paper yet) I think the core idea is that the CPU acts as a prefetch thread and pulls data into the shared L3 for the GPU cores (this work is like other prefetch thread research projects that use the otherwise spare SMT threads to do prefetching for the main compute thread), and as the GPU cores get more cache hits, the performance is better (hence the 20% mentioned in the extremetech article). Rayson ================================= Open Grid Scheduler / Grid Engine http://gridscheduler.sourceforge.net/ Scalable Grid Engine Support Program http://www.scalablelogic.com/ On Fri, Feb 10, 2012 at 12:48 PM, Vincent Diepeveen wrote: > Another interesting question is how a few cores would be able > to speed up > a typical single precision gpgpu application by 20%. > > That would mean that the gpu is really slow, especially if we > realize this is just 1 or 2 CPU cores or so. > > Your gpgpu code really has to kind of be not so very professional to > have 2 cpu cores already contribute > some 20% to that. > > Most gpgpu codes here on a modern GPU you need about 200+ cpu cores > and that's usually codes which > do not run optimal at gpu's, as it has to do with huge prime numbers, > so simulating that at a 64 bits cpu is more > efficient than a 32 bits gpu. > > So in their case the claim is that for their experiments, assuming 2 > cpu cores, that would be 20%. Means we have a > gpu that's 20x slower or so than a fermi at 512 cores/HD6970 @ 1536. > > 1536 / 20 = 76.8 gpu streamcores.
That's AMD Processing Element > count. for nvidia this is similar to 76.8 / 4 = 19.2 cores > > This laptop is from 2007, sure it is a macbookpro 17'' apple, has a > core2 duo 2.4Ghz and has a Nvidia GT 8600M with 32 CUDA cores. > > So if we extrapolate back, the built in gpu is gonna kick that new > AMD chip, right? > > Vincent > > On Feb 10, 2012, at 6:08 PM, Joe Landman wrote: > >> On 02/10/2012 12:00 PM, Lux, Jim (337C) wrote: >>> Expecting headlines to be accurate is a fool's errand... >>> Be glad it actually said AMD. >> >> Expecting articles contents to reflect in any reasonable way upon >> reality may be a similar problem. There are a few, precious few >> writers >> who really grok the technology because they live it: Doug Eadline, >> Jeff >> Layton, Henry Newman, Chris Mellor, Dan Olds, Rich Brueckner, ... . >> >> The vast majority of articles I've had some contact with the >> authors on >> (not in the above group) have been erroneous to the point of being >> completely non-informational. >> >> >> >> -- >> Joseph Landman, Ph.D >> Founder and CEO >> Scalable Informatics Inc. >> email: landman at scalableinformatics.com >> web : http://scalableinformatics.com >>
http://scalableinformatics.com/sicluster >> phone: +1 734 786 8423 x121 >> fax : +1 866 888 3112 >> cell : +1 734 612 4615 >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >> Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- ================================================== Open Grid Scheduler - The Official Open Source Grid Engine http://gridscheduler.sourceforge.net/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Daniel.Pfenniger at unige.ch Tue Feb 28 12:34:52 2012 From: Daniel.Pfenniger at unige.ch (Daniel Pfenniger) Date: Tue, 28 Feb 2012 18:34:52 +0100 Subject: [Beowulf] Computer on a stick In-Reply-To: References: Message-ID: <4F4D103C.8080005@unige.ch> Lux, Jim (337C) a écrit : > Way slower than that, I'm sure. Were they even Pentiums? > From http://www.intel.com/pressroom/kits/quickrefyr.htm#1994 the 60MHz Pentium was announced March 1993, and one year later the 100MHz Pentium. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From hahn at mcmaster.ca Tue Feb 28 15:09:28 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Tue, 28 Feb 2012 15:09:28 -0500 (EST) Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: References: <4F354F26.2040103@scalableinformatics.com> Message-ID: > The paper is now available online, "CPU-Assisted GPGPU on Fused > CPU-GPU Architectures": > > http://people.engr.ncsu.edu/hzhou/hpca_12_final.pdf thanks for the reference. > (I have not read the whole paper yet) I think the core idea is that > the CPU acts as a prefetch thread and pulls data into the shared L3 > for the GPU cores (this work is like other prefetch thread research yes, though it's a bit puzzling, since the whole point of GPU design is to have lots of runnable threads on hand, so that you simply switch from stalled to non-stalled threads to hide latency. so in the context of prefetching, I'd expect a bundle of threads to make a non-prefetched reference, stall, but for other bundles to utilize the vector unit while the reference is resolved. gotta read the paper I guess! _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
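[Editor's note: the paper's core idea as Rayson summarizes it — a CPU thread running ahead of the GPU's access stream and pulling data into the shared L3 so the GPU cores hit in cache — can be sketched in miniature. This is an illustrative model only, not the paper's code; the dict stands in for the shared cache.]

```python
import threading

# "Main memory" plus a dict standing in for the shared L3 the CPU warms up.
memory = list(range(1000))
l3 = {}

def prefetcher(indices):
    # CPU helper thread: walks the access stream ahead of the GPU-style
    # consumer and pulls the needed lines into the shared cache.
    for i in indices:
        l3[i] = memory[i]

def consumer(indices):
    hits, total = 0, 0
    for i in indices:
        if i in l3:            # hit: the helper prefetched this line
            hits += 1
            total += l3[i]
        else:                  # miss: pay the "memory" latency instead
            total += memory[i]
    return hits, total

stream = list(range(0, 1000, 7))
t = threading.Thread(target=prefetcher, args=(stream,))
t.start()
t.join()   # let the helper run fully ahead, so the outcome is deterministic
hits, total = consumer(stream)
print(hits, total)
```

In the real scheme the helper and consumer run concurrently and the win depends on the helper staying far enough ahead; the join here just makes the toy deterministic.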
From samuel at unimelb.edu.au Tue Feb 28 17:48:21 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Wed, 29 Feb 2012 09:48:21 +1100 Subject: [Beowulf] Computer on a stick In-Reply-To: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> Message-ID: <4F4D59B5.1010804@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 29/02/12 02:36, Hearns, John wrote: > What was the spec of the original Beowulf project nodes? Their paper says: http://egscbeowulf.er.usgs.gov/geninfo/Beowulf-ICPP95.pdf # The Beowulf prototype employs 100 MHz Intel DX4 microprocessors # and a 500 MByte disk drive per processor. [...] # The DX4 delivers greater computational power than other members # of the 486 family not only from its higher clock speed, but also # from its 16 KByte primary cache (twice the size of other 486 # primary caches) 6]. Each motherboard also contains a 256 KByte # secondary cache. cheers! Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk9NWbUACgkQO2KABBYQAh8fygCbBOWa54mENcGbxPzVxlXJf/v5 efEAniJyHtVYra+atRMr/drJzP9oVZ70 =ESXv -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From james.p.lux at jpl.nasa.gov Tue Feb 28 18:27:08 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Tue, 28 Feb 2012 15:27:08 -0800 Subject: [Beowulf] Computer on a stick In-Reply-To: <4F4D59B5.1010804@unimelb.edu.au> References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> <4F4D59B5.1010804@unimelb.edu.au> Message-ID: And from a simple statement in that paper: "It is clear from these results that higher bandwidth networks are required" Did an entire industry spring.. -----Original Message----- From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Christopher Samuel Sent: Tuesday, February 28, 2012 2:48 PM To: beowulf at beowulf.org Subject: Re: [Beowulf] Computer on a stick -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 29/02/12 02:36, Hearns, John wrote: > What was the spec of the original Beowulf project nodes? Their paper says: http://egscbeowulf.er.usgs.gov/geninfo/Beowulf-ICPP95.pdf # The Beowulf prototype employs 100 MHz Intel DX4 microprocessors # and a 500 MByte disk drive per processor. [...] # The DX4 delivers greater computational power than other members # of the 486 family not only from its higher clock speed, but also # from its 16 KByte primary cache (twice the size of other 486 # primary caches) 6]. Each motherboard also contains a 256 KByte # secondary cache. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From hahn at mcmaster.ca Tue Feb 28 22:26:57 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Tue, 28 Feb 2012 22:26:57 -0500 (EST) Subject: [Beowulf] Computer on a stick In-Reply-To: References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> <4F4D59B5.1010804@unimelb.edu.au> Message-ID: > And from a simple statement in that paper: > "It is clear from these results that higher bandwidth networks are required" half-duplex 10Mb! _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From james.p.lux at jpl.nasa.gov Tue Feb 28 23:53:08 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Tue, 28 Feb 2012 20:53:08 -0800 Subject: [Beowulf] Computer on a stick In-Reply-To: Message-ID: *Bonded* 10mbps ethernet... And zero copy drivers.. On 2/28/12 7:26 PM, "Mark Hahn" wrote: >> And from a simple statement in that paper: >> "It is clear from these results that higher bandwidth networks are >>required" > >half-duplex 10Mb! >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
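[Editor's note: the "bonded" Ethernet Jim mentions aggregated parallel links below the IP layer. The scheduling idea is just round-robin striping across the links, sketched here as a toy model — not the actual Beowulf channel-bonding driver.]

```python
from itertools import cycle

def bond(packets, n_links=2):
    """Round-robin striping: deal packets out across parallel links."""
    links = [[] for _ in range(n_links)]
    for pkt, link in zip(packets, cycle(range(n_links))):
        links[link].append(pkt)
    return links

# ten packets over two 10 Mb/s links: each link carries half the stream,
# so aggregate bandwidth roughly doubles (ignoring reordering costs)
links = bond(list(range(10)))
print(links[0])  # [0, 2, 4, 6, 8]
print(links[1])  # [1, 3, 5, 7, 9]
```

The cost of this simplicity is possible packet reordering at the receiver, which is why bonding drivers pair the scheduler with logic to reassemble the stream.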
From Daniel.Pfenniger at unige.ch Wed Feb 29 03:10:55 2012 From: Daniel.Pfenniger at unige.ch (Daniel Pfenniger) Date: Wed, 29 Feb 2012 09:10:55 +0100 Subject: [Beowulf] Computer on a stick In-Reply-To: <4F4D59B5.1010804@unimelb.edu.au> References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> <4F4D59B5.1010804@unimelb.edu.au> Message-ID: <4F4DDD8F.6030500@unige.ch> Christopher Samuel a écrit : > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 29/02/12 02:36, Hearns, John wrote: > >> What was the spec of the original Beowulf project nodes? > > Their paper says: > > http://egscbeowulf.er.usgs.gov/geninfo/Beowulf-ICPP95.pdf > Thanks for the link. In the article, just before the hardware specs, the then very young Linux operating system is aptly mentioned: "The Beowulf parallel workstation project is driven by a set of requirements for high performance scientific workstations in the Earth and space sciences community and the opportunity of low cost computing made available through the PC related mass market of commodity subsystems. This opportunity is also facilitated by the availability of the Linux operating system, a robust Unix-like system environment with source code that is targeted for the x86 family of microprocessors including the Intel Pentium." It is precisely this combination of commodity standardized hardware (computer, mass storage, and network) and freely tunable software which allowed the project to flourish. Dan _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From deadline at eadline.org Wed Feb 29 08:52:42 2012 From: deadline at eadline.org (Douglas Eadline) Date: Wed, 29 Feb 2012 08:52:42 -0500 Subject: [Beowulf] Computer on a stick In-Reply-To: References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> <4F4D59B5.1010804@unimelb.edu.au> Message-ID: <1f28a103248eb307389618a33618b779.squirrel@mail.eadline.org> FYI There is a very good article in Linux magazine written by Tom Sterling in 2003 that provides a first person history (I have used it to stamp out more than a few urban legends) http://www.linux-mag.com/id/1378/ -- Doug > And from a simple statement in that paper: > "It is clear from these results that higher bandwidth networks are > required" > > Did an entire industry spring.. > > > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On > Behalf Of Christopher Samuel > Sent: Tuesday, February 28, 2012 2:48 PM > To: beowulf at beowulf.org > Subject: Re: [Beowulf] Computer on a stick > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 29/02/12 02:36, Hearns, John wrote: > >> What was the spec of the original Beowulf project nodes? > > Their paper says: > > http://egscbeowulf.er.usgs.gov/geninfo/Beowulf-ICPP95.pdf > > # The Beowulf prototype employs 100 MHz Intel DX4 microprocessors # and a > 500 MByte disk drive per processor. > [...] > # The DX4 delivers greater computational power than other members # of the > 486 family not only from its higher clock speed, but also # from its 16 > KByte primary cache (twice the size of other 486 # primary caches) 6]. > Each motherboard also contains a 256 KByte # secondary cache. 
> > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From trainor at presciencetrust.org Wed Feb 29 16:40:52 2012 From: trainor at presciencetrust.org (Douglas J. Trainor) Date: Wed, 29 Feb 2012 16:40:52 -0500 Subject: [Beowulf] Raspberry Pi Message-ID: thought some people here should see the Raspberry Pi -- $35 computer with Toronto-designed software sells out worldwide in minutes http://bit.ly/xDVEub [takes you to thestar.com] Say hi to the Raspberry Pi, the $35 computer (with photo) http://bit.ly/xDe8fJ [takes you to csmonitor.com] _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From samuel at unimelb.edu.au Wed Feb 29 17:40:41 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Thu, 01 Mar 2012 09:40:41 +1100 Subject: [Beowulf] Computer on a stick In-Reply-To: <1f28a103248eb307389618a33618b779.squirrel@mail.eadline.org> References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> <4F4D59B5.1010804@unimelb.edu.au> <1f28a103248eb307389618a33618b779.squirrel@mail.eadline.org> Message-ID: <4F4EA969.90003@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 01/03/12 00:52, Douglas Eadline wrote: > FYI > > There is a very good article in Linux magazine written by Tom > Sterling in 2003 that provides a first person history (I have used > it to stamp out more than a few urban legends) > > http://www.linux-mag.com/id/1378/ Thanks Doug.. he writes: # Comprising sixteen Intel 100 MHz 80486-based PCs, each with # 32 Mbytes of memory and a gigabyte hard disk, and interconnected # by means of two, parallel 10-Base-T Ethernet LANs, this first # PC cluster delivered sustained performance on real world, # numerically-intensive, scientific applications (e.g., PPM) in # the range of 70 Mflops. - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk9OqWkACgkQO2KABBYQAh8QFgCggq3gnvvERDlR7WAY+ywc7u8B KKwAniifLZ3xLjWVpMygYMzjyelQCr4k =Go8R -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
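[Editor's note: putting the thread's numbers side by side — Sterling's 16 nodes of 100 MHz DX4 with 32 MB each, versus the Cotton Candy stick's 1.2 GHz and 1 GB — the ratios are easy to compute. Clock ratio is per-core only and says nothing about sustained Mflops, so no performance claim is made for the stick here.]

```python
# Figures quoted in this thread: 16 nodes of 100 MHz DX4 with 32 MB each
# (the cluster sustained ~70 Mflops on PPM), versus one 1.2 GHz / 1 GB stick.
nodes, node_clock_mhz, node_ram_mb = 16, 100, 32
stick_clock_mhz, stick_ram_mb = 1200, 1024

clock_ratio = stick_clock_mhz / node_clock_mhz      # per-core clock only
ram_ratio = stick_ram_mb / (nodes * node_ram_mb)    # vs. the whole cluster's RAM
print(clock_ratio)  # 12.0
print(ram_ratio)    # 2.0
```

So one 2012 USB stick carries twice the DRAM of the entire 1994 prototype cluster, with a 12x per-core clock on top.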
From samuel at unimelb.edu.au Wed Feb 29 17:48:07 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Thu, 01 Mar 2012 09:48:07 +1100 Subject: [Beowulf] Raspberry Pi In-Reply-To: References: Message-ID: <4F4EAB27.2080201@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 01/03/12 08:40, Douglas J. Trainor wrote: > thought some people here should see the Raspberry Pi -- Been following this for a while, they're rather neat little devices.. http://www.raspberrypi.org/#modelb http://en.wikipedia.org/wiki/Raspberry_Pi#Hardware Their usual site is down at the moment as it couldn't cope with demand (but then again, neither could Farnell or RS, their retailers :-) ). cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk9OqycACgkQO2KABBYQAh9fPQCfclqiYqHuwnU6fZWanGDr4x+D lwcAoJHn0BYTIKAPZBGPusZw6FfHRwcC =mJdE -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From mdidomenico4 at gmail.com Wed Feb 1 08:20:56 2012 From: mdidomenico4 at gmail.com (Michael Di Domenico) Date: Wed, 1 Feb 2012 08:20:56 -0500 Subject: [Beowulf] rear door heat exchangers In-Reply-To: References: Message-ID: On Tue, Jan 31, 2012 at 5:23 PM, wrote: > Hi, > > We have installed a lot of racks with rear door heat exchangers but these > are without fans instead using the in-server fans to push the air through > the element. We are doing this with ~20kW per rack. > > How the hell are you drinking 35kW in a rack? start working with GPU's... you'll find out real fast... _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From mdidomenico4 at gmail.com Wed Feb 1 08:23:06 2012 From: mdidomenico4 at gmail.com (Michael Di Domenico) Date: Wed, 1 Feb 2012 08:23:06 -0500 Subject: [Beowulf] rear door heat exchangers In-Reply-To: References: Message-ID: On Tue, Jan 31, 2012 at 6:47 PM, Lux, Jim (337C) wrote: > Maybe there's an issue with the weight and or flexible tubing on a swinging door? > > The Hoffman products in Andrew's email, I think, aren't the kind that hang on a door, more hang on the side of a large box/cabinet (Type 4,12, 3R enclosure) or wall. > > They're also air/air heat exchanges or airconditioners (and vortex coolers..
but you don't want one of those unless you have a LOT of compressed air available) > > http://www.42u.com/cooling/liquid-cooling/liquid-cooling.htm > shows "in-row liquid cooling" but I think that's sort of in parallel > > They do mention, lower down on the page, "Rear Door Liquid Cooling" > But I notice that the Liebert XDF-5, which is basically a rack and chiller deck in one, only pulls out 14kW. > > From DoE: > http://www1.eere.energy.gov/femp/pdfs/rdhe_cr.pdf > > They refer to the ones installed at LBNL as RDHx units, but carefully avoid telling you the brand or any decent data. They do say they cost $6k/door, and suck up 10-11kW/rack with 9 gal/min flow of 72F water. > > Googling RDHx turns up "CoolCentric.com" > http://www.coolcentric.com/resources/data_sheets/Coolcentric-Rear-Door-Heat-Exchanger-Data-Sheet.pdf > > 33kW is as good as they can do. > > I also note that they have no fans in them. Yes, these are the doors we have now. I was trying to remain vendor agnostic on the list. We have them running well passively up to 25kW now. From eugen at leitl.org Wed Feb 1 08:33:04 2012 From: eugen at leitl.org (Eugen Leitl) Date: Wed, 1 Feb 2012 14:33:04 +0100 Subject: [Beowulf] moderation - was cpu's versus gpu's - was Intel buys QLogic In-Reply-To: References: <97E75B730E3076479EB136E2BC5D410054874491@uos-dun-mbx1> Message-ID: <20120201133304.GG7343@leitl.org> On Wed, Feb 01, 2012 at 10:04:56AM +0100, Kilian Cavalotti wrote: > Right. Simply ignoring posts from people you don't want to read about > is not so taxing, and it's also the best way to keep trolling attacks Utterly wrong.
Empirically, key contributors will be the first to jump ship. Walking is easier than participating in a poorly managed forum. > at a reasonable level. There's probably a dozen ways to automatically > filter them, the easier one being the old faithful eyeball grep, which The point is that most people won't bother, and just leave. > can match a sender's name way before your conscious brain can realize > it. We don't seem to share the same reality. From landman at scalableinformatics.com Wed Feb 1 08:42:30 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Wed, 01 Feb 2012 08:42:30 -0500 Subject: [Beowulf] On filtering In-Reply-To: <20120201133304.GG7343@leitl.org> References: <97E75B730E3076479EB136E2BC5D410054874491@uos-dun-mbx1> <20120201133304.GG7343@leitl.org> Message-ID: <4F294146.4030500@scalableinformatics.com> We seem to have morphed from a technical/business discussion into a meta discussion on filtering. In an ironic development, I am seriously considering filtering this discussion. For those who want/demand a strong moderation hand, I simply don't see this happening. Eugen's point about the strong contributors leaving first doesn't appear to be the case here (or on any list I have ever been on over the past ... 20-ish years). Likewise, a strong moderation queue will do what it's done to other mailing lists, with moderators who have day jobs, and that's to pretty much kill the discussion. I can point to a number where the moderation queue (used mostly for spam filtering) has worked against free form discussion (as we have here).
Some of the lists on bioinformatics.org specifically demonstrate that moderation isn't conducive to discussion. For those who don't want moderation and prefer local filtering ... procmail based, eyeball grep based (not egrep but close), the system will continue to function. Now that this is said, can we please .... PLEASE .... go back to our regularly scheduled cluster(s)? Please? -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From eugen at leitl.org Wed Feb 1 09:04:43 2012 From: eugen at leitl.org (Eugen Leitl) Date: Wed, 1 Feb 2012 15:04:43 +0100 Subject: [Beowulf] Seamicro switches Atoms with Xeons in SM10000-XE; 64x Sandy Bridge in 10" Message-ID: <20120201140443.GI7343@leitl.org> http://www.seamicro.com/sm10000xe uses custom "Freedom" fabric more coverage in German at http://www.heise.de/newsticker/meldung/Server-packt-256-Xeon-Kerne-in-10-Hoeheneinheiten-1425949.html From hahn at mcmaster.ca Wed Feb 1 10:08:24 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 1 Feb 2012 10:08:24 -0500 (EST) Subject: [Beowulf] cloud: ho hum?
Message-ID: in hopes of leaving the moderation discussion behind, here's a more interesting topic: cloud wrt beowulf/hpc. when I meet cloud-enthused people, normally I just explain how HPC clustering has been doing PaaS cloud all along. there are some people who run with it though: bioinformatics people mostly, who take personal affront to the concept of their jobs being queued. (they don't seem to understand that queueing is a function of how efficiently utilized a cluster is, and since a cloud is indeed a cluster, you get queueing in a cloud as well.) part of the issue here seems to be that people buy into a couple of fallacies that they apply to cloud: - private sector is inherently more efficient. this is a bit of a mystery to me, but I guess this is one of the great rhetorical successes of the neocon movement. I've looked at Amazon prices, and they are remarkably high - depending on purchasing model, about 20x higher than an academic-run research cluster. why is there not more skepticism of outsourcing, since it always means your cost includes one or more corporate profit margins? - economies of scale: people seem to think that a datacenter at the scale of google/amazon/facebook is going to be dramatically cheaper. while I'm sure they get a good deal from their suppliers, I also doubt it's game-changing. power, for instance, is a relatively modest portion of costs, ~10% per year of a server's purchase price. machineroom cost is pretty linear with number of nodes (power); people overhead is very small (say, > 1000 servers per fte.) most of all, I just don't see how cloud changes the HPC picture at all. HPC is already based on shared resources handling burstiness of demand - if anything, cloud is simply slower. certainly I can't submit a job to EC2 that uses half the Virginia zone and expect it to run immediately. it's not clear to me whether cloud-pushers are getting real traction with the funding agencies (gov is neocon here in Canada.)
it worries me that cloud might be framed as "better computing than HPC". I'm curious: what kind of cloudiness are you seeing? thanks, mark hahn. From dag at sonsorol.org Wed Feb 1 10:37:30 2012 From: dag at sonsorol.org (Chris Dagdigian) Date: Wed, 01 Feb 2012 10:37:30 -0500 Subject: [Beowulf] cloud: ho hum? In-Reply-To: References: Message-ID: <4F295C3A.4030606@sonsorol.org> My $.02 from what I see in industry (life sciences) - The ability to transform capital expense money into OpEx money alone is pushing some cloud interest at high levels. No joke. Possibly a very large cloud interest driver in the larger organizations. This is also attractive for tiny startups and companies just leaving the VC incubation phase. - Deployment speed. We have customers who wait weeks after making an IT helpdesk request for a new VM to be created. Other customers take 1+ years to design, RFP and choose their HPC solution and another 4 months to deploy it. If you can do in minutes (via good DevOps techniques) what the IT organization normally takes weeks or months to do then you've got some good arguments for targeting cloud environments for quick, dev, test and one-off scientific computing environments - Quick capability gains - in some cases it's quicker and easier to get quick access to GPUs, servers with 10Gbe interconnects and well-built systems for running MapReduce style big data workflows on cloud platforms - Data exchange. Cloud is a good place for collaborators to meet and work together without punching massive holes in local firewalls.
It's also a good place to either put data or get data from an outsourced provider or collaborator/partner. Many Genome Sequencing outsourcing companies can deliver your genomes directly to an EBS volume or AWS S3 bucket these days. - I'm a believer in the pricing and economies of scale argument in some cases. For pricing take AWS S3 as an example - internal IT people who snipe at the pricing willfully (or not) seem to ignore the inconvenient fact that S3 does not acknowledge a successful object PUT request until the data has landed in 3 datacenters. If you want an honest cost comparison for cloud-based object storage then you have to start with legit fully-loaded cost estimates for deploying and running an internal petascale-capable system that spans three separate facilities. That ain't cheap. - Truthfully though I don't use or push cloud economic arguments all that much these days. It's incredibly easy to distort the numbers any way you want so it's rare to have a - Ability to do work that was not considered viable at home. The 90,000 core AWS Top500 cluster that was in the news is a good example. Some organizations have HPC or other problems of such scale that running them internally is not even on the radar. In rare cases spinning up something massive and exotic for a few days is a viable option. - Cyclical needs. Some of my customers have big compute needs that come about only every 3-4 years; most are looking at cloud now rather than buying local gear and seeing it depreciate or be under-utilized most of the time I agree that the cloud is overhyped and we certainly don't see a ton of HPC migrating entirely to the cloud. What we see in the trenches and out in the real world is significant interest in leveraging the cloud for Speed, Capability, Cost or "weird" use cases.
-Chris From hahn at mcmaster.ca Wed Feb 1 10:41:38 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 1 Feb 2012 10:41:38 -0500 (EST) Subject: [Beowulf] cloud: ho hum? In-Reply-To: <3E91C69ADC46C4408C85991183A275F4096A548F68@FRGOCMSXMB04.EAME.SYNGENTA.ORG> References: <3E91C69ADC46C4408C85991183A275F4096A548F68@FRGOCMSXMB04.EAME.SYNGENTA.ORG> Message-ID: > My take on it is if we've got a large, steady scientific HPC load then we'd >want in-house capacity to cover that. you mean "production", basically. mostly fixed in size/length. > But if we had a project that had small bursts of intense computation we >might prefer to find a larger pool of compute resource - cloud could be one >of the options. In fact cloud could well be the most straightforward. The >alternative might mean slowing down a key piece of R&D project work. you seem to be comparing to small HPC, sized to meet production demand. I'm not talking about that at all: I'm assuming, perhaps unwarrantedly, that most large HPC facilities are like ours, with some modest production demand, but with most of the workload already comprised of the interleaved bursts from thousands of researchers. > I'm not ignoring your points, I'm flagging up that our unusual burst of >demand might be someone else's minor blip. It then becomes worth our while >to offload that to an outsourced resource, whether cloud or not. afaict, you're just saying "bursts and production don't mix". that's true, but isn't it very small-scale? handling burstiness just means finding a deep enough pool.
efficient use of that pool just means getting enough (hopefully independently timed) bursters. this is not an argument for outsourcing per se, or for private-sector somehow being more efficient. there also seems to be a bit of class-warfare surrounding this issue: the claim that "new" disciplines ("disciplines") like bioinformatics and big-data are poorly served by traditional HPC clusters. they seem to resent spending on interconnect, for instance. to me, this seems like novices being obliviously ignorant - sure, QDR to each node seems like a waste of money, but once you get 12,24,32 cores per node, you're going to want to have something faster than Gb or 10Gb, even if you only ever use it for files, not MPI. (for that matter, I think there's a natural progression toward more complex processing as a field matures, which will lead fields that currently do serial farming towards "real" parallelism...) there's a nasty "don't give them money because they don't do it right" thing going on in the guise of cloud and (mostly) bioIT. regards, mark hahn. From landman at scalableinformatics.com Wed Feb 1 10:45:42 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Wed, 01 Feb 2012 10:45:42 -0500 Subject: [Beowulf] cloud: ho hum? In-Reply-To: References: Message-ID: <4F295E26.1050700@scalableinformatics.com> On 02/01/2012 10:08 AM, Mark Hahn wrote: > in hopes of leaving the moderation discussion behind, > here's a more interesting topic: cloud wrt beowulf/hpc. > > when I meet cloud-enthused people, normally I just explain how > HPC clustering has been doing PaaS cloud all along.
there are some > people who run with it though: bioinformatics people mostly, who > take personal affront to the concept of their jobs being queued. Heh ... to put it mildly, this subset of HPC users tend to be more prone to fads a fair number of others. As often as not, we have to work to solve the real problem in part by helping to unmask the real problem (and move past the perceptions of what some CS person told them the problems were). > (they don't seem to understand that queueing is a function of how > efficiently utilized a cluster is, and since a cloud is indeed a > cluster, you get queueing in a cloud as well.) Sort of, but the illusion in a cloud is, that its all theirs, regardless of whether or not its emulated/virtualized/bare metal. > > part of the issue here seems to be that people buy into a couple > fallacies that they apply to cloud: > - private sector is inherently more efficient. this is a bit > of a mystery to me, but I guess this is one of the great rhetorical > successes of the neocon movement. I've looked at Amazon prices, I'll ignore the obvious (and profoundly incorrect) political stance here, and focus upon the (failed) economic argument. Yes, the competitive private sector is *always* more efficient at delivering goods and services than the non-competitive government sector. The only time the private sector is less efficient is when there is no meaningful competition, then the consumers of a good or service will pay market pricing set, not by competitive forces, but by the preference of the dominant vendor which does not need to compete to win the business. For example, in desktop software environments, for the better part of 20 years, Microsoft has been the dominant player, and has had complete freedom to set whatever pricing it wishes. Now that it faces competitive pressure on several fronts, you are seeing pricing starting to react accordingly to market forces. Economics 101 applies: Competitive market forces enable efficient markets. 
Non-competitive market forces don't. > and they are remarkably high - depending on purchasing model, > about 20x higher than an academic-run research cluster. why is there Hmmm ... I don't think you are taking everything into account, and more to the point, you are not comparing apples to apples. Compare Amazon to CRL to Joyent to Sabalcore to ... . You will find competitive pricing among these for similar use cases. In all cases, your up-front costs and recurring costs are capped. You want to use 10k nodes for 1 hour, you can. And it won't cost you 10k nodes of capital + infrastructure, power, cooling, ... to make it happen. You want 10k nodes for one hour at an academic site? Get in line, and someone has to have laid out the capex for all of this. Just because you don't see this direct cost, or the chargeback to you as an end user doesn't reflect a cost recovery and a profit (the latter being irrelevant for most academic sites), doesn't mean it "costs 1/20 as much". It means you haven't accounted for the real costs correctly. > not more skepticism of outsourcing, since it always means your cost > includes one or more corporate profit margins? ... and is corporate profit a bad thing? Seriously? There is a cost associated with you not taking the capital charge for the systems you use, or for the OPEX of using them. Or for the other indirect costs surrounding the rest of this. You are paying for the privilege of keeping your costs down. So, for an academic user that has to obtain 10k CPU hours on 1000 CPUs, in order to solve their problem, they can a) sign on to and get a grant for SHARCNET and others, which involve some sort of chargeback mechanism (because SHARCNET and others have to pay for their power, cooling, data, people), b) build their own cluster (which makes sense only if you do many runs), c) buy it from Amazon/CRL/Sabalcore/... and only pay for what they use and start running right away. So which one makes the most sense?
Rhetorical question to a degree as it depends strongly upon the use case, the grant needs, etc. > > - economies of scale: people seem to think that a datacenter at the > scale of google/amazon/facebook is going to be dramatically cheaper. It generally is. > while I'm sure they get a good deal from their suppliers, I also > doubt it's game-changing. power, for instance, is a relatively > modest portion of costs, ~10% per year of a server's purchase price. Then why do Google et al colocate their data centers near cheap power if power is only a modest/minute fraction of the total cost? TCO matters, and if you have to pay for power 24x7 during the life of the system, you want to minimize this cost. Multiply the cost of power for 1 server by 100k, add in other bits, and this modest fraction starts adding up to significant amounts (and fractions of the total cost), very quickly. It can be game-changing. Which is why they locate their data centers where there is an optimum (minimizing total lifetime costs of power, taxes, etc.) as compared with the nearby data center where you pay a premium for convenience. > machineroom cost is pretty linear with number of nodes (power); > people overhead is very small (say, > 1000 servers per fte.) > > most of all, I just don't see how cloud changes the HPC picture at all. > HPC is already based on shared resources handling burstiness of demand - Not all HPC is this way. Actually most isn't. > if anything, cloud is simply slower. certainly I can't submit a job to > EC2 that uses half the Virginia zone and expect it to run immediately. > it's not clear to me whether cloud-pushers are getting real traction with > the funding agencies (gov is neocon here in Canada.) it worries me that > cloud might be framed as "better computing than HPC". Hmmm. > > I'm curious: what kind of cloudiness are you seeing? Quite a bit. People are looking at clouds for private use with trivial extension to public usage for computing.
We are seeing huge amounts of private storage cloud builds. Cloud is ASP v3 (or v4 if you count clusters). In ASPs, large external high-cost gear was centralized. Economics simply didn't work for it and this model died. Clusters started around then. Grid/Utility Computing started around then, and Amazon launched their offering at the notional end of this market. Grid was largely a bust from a commercial view, as it again had bad economics. Clusters were in full blossom then. Economics favored them. If you like to look at Clusters as ASP v3, you can, though they've been running alongside the fads. Cloud is ASP v3 or v4 (if you say clusters were v3). Natural evolution of taking a cluster, putting a VM on demand on it, or running something bare metal on it. Where it's located matters to a degree, and data motion is still the hardest problem, and it's getting harder. This is why private data clouds (and computing clouds) are getting more popular. This said, like all other fads/trends, Cloud is (massively over-)hyped. It has value, it has staying power (unlike grid, ASP, ...). It solves a specific set of problems, and does so well, and you pay a premium for solving that set of problems in that manner. We see more folks building private clouds (e.g. clusters with more intelligent allocation/provisioning) than we do see people run exclusively on the cloud. In financial services, we've had customers tell us how wonderful it was (from a convenience view) and how awful it was (from a performance view). It matters more to people who care about getting cycles than for people who care about getting really good single CPU performance. Cloud is a throughput engine, and this mode of operation is becoming more important over time. Even in HPC. Especially with BigData (hey, wanna talk about a massively over-hyped term? There's one for ya ... the hype masks the real issues, and this is a shame, but such is life).
And for what it's worth, VCs are positively throwing money at cloud/big data companies. This doesn't make it better. Probably worse. But that's a whole other discussion. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From hahn at mcmaster.ca Wed Feb 1 10:59:33 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 1 Feb 2012 10:59:33 -0500 (EST) Subject: [Beowulf] cloud: ho hum? In-Reply-To: <4F295C3A.4030606@sonsorol.org> References: <4F295C3A.4030606@sonsorol.org> Message-ID: > - The ability to transform capital expense money into OpEx money alone > is pushing some cloud interest at high levels. No joke. Possibly a very > large cloud interest driver in the larger organizations. This is also > attractive for tiny startups and companies just leaving the VC > incubation phase. why is that? in a simple example, EC2 m1.small on-demand costs $745 per ecu-year; a $3k server gets you about 18 ecu-years and you can run it for at least 3 years. going EC2 means you buy the server over four times during its life. obviously a workload with steep, narrow and sparse demand will prefer to rent - is that it? (I'm not clear on why workloads would be like that...) > - Deployment speed. We have customers who wait weeks after making an IT > helpdesk request for a new VM to be created. Other customers take 1+ no. there's nothing technical here: dysfunctional IT orgs should simply be fixed. outsourcing as a workaround for BOFHishness is stupid...
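The back-of-envelope arithmetic behind that rent-vs-buy comparison, taking the quoted 2012 figures at face value (they come from the message above, not from current pricing):

```python
# Renting the lifetime capacity of a server from EC2 versus buying the box.
# All figures are the ones quoted in the thread (2012), not current pricing.
EC2_DOLLARS_PER_ECU_YEAR = 745.0   # m1.small on-demand, per ECU-year
SERVER_PRICE = 3000.0              # the "$3k server"
SERVER_ECU_YEARS = 18.0            # ~6 ECU box run for ~3 years

rental_cost = SERVER_ECU_YEARS * EC2_DOLLARS_PER_ECU_YEAR  # total EC2 spend
times_rebought = rental_cost / SERVER_PRICE                # servers "re-bought"
print(round(times_rebought, 1))   # -> 4.5
```

On these numbers, renting the same capacity costs about 4.5x the purchase price over the server's life, which is the core of the "why is outsourcing not questioned more" argument.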
> - Quick capability gains - in some cases it's quicker and easier to get > quick access to GPUs, servers with 10Gbe interconnects and well-built > systems for running MapReduce style big data workflows on cloud platforms again, this only makes sense if your demand is impulse-like. is that actually the case? > - Data exchange. Cloud is a good place for collaborators to meet and > work together without punching massive holes in local firewalls. It's again, fire your BOFHish IT. > provider or collaborator/partner. Many Genome Sequencing outsourcing > companies can deliver your genomes directly to an EBS or AWS S3 bucket > these days. interesting, but is this really a concern? how big is a genome "these days"? > - I'm a believer in the pricing and economies of scale argument in some > cases. For pricing take AWS S3 as an example - internal IT people who > snipe at the pricing willfully (or not) seem to ignore the inconvenient > fact that S3 does not acknowledge a successful object PUT request until > the data has landed in 3 datacenters. you lost me there. do you mean your in-house IT can't do reliable storage? > If you want an honest cost comparison for cloud-based object storage > then you have to start with legit fully-loaded cost estimates for > deploying and running an internal petascale-capable system that spans > three separate facilities. That ain't cheap. you mean "granularity is large", I guess. obviously, it _is_ cheap: anything above ~10 racks is linear (I claim). > - Truthfully though I don't use or push cloud economic arguments all > that much these days. It's incredibly easy to distort the numbers anyway > you want so it's rare to have a it's just that I can't figure out any way to make costs of running EC2 more than about $80 per ecu-hour (vs even spot pricing which is $237). are you suggesting that ec2 compute costs are subsidizing the storage and transfer facilities (in spite of Amazon having separate prices for store/transfer)? 
> - Ability to do work that was not considered viable at home. The 90,000 > core AWS Top500 cluster that was in the news is a good example. Some > organizations have HPC or other problems of such scale that running them > internally is not even on the radar. In rare cases spinning up something > massive and exotic for a few days is a viable option. I'd love to hear of a case that wasn't a PR stunt... > - Cyclical needs. Some of my customers have big compute needs that come > about only every 3-4 years; most are looking at cloud now rather than > buying local gear and seeing it depreciate or be under-utilized most of > the time seems weird to me. thanks! _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From james.p.lux at jpl.nasa.gov Wed Feb 1 11:03:25 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 1 Feb 2012 08:03:25 -0800 Subject: [Beowulf] cloud: ho hum? In-Reply-To: Message-ID: On 2/1/12 7:08 AM, "Mark Hahn" wrote: >in hopes of leaving the moderation discussion behind, >here's a more interesting topic: cloud wrt beowulf/hpc. > >when I meet cloud-enthused people, normally I just explain how >HPC clustering has been doing PaaS cloud all along. there are some >people who run with it though: bioinformatics people mostly, who >take personal affront to the concept of their jobs being queued. >(they don't seem to understand that queueing is a function of how >efficiently utilized a cluster is, and since a cloud is indeed a >cluster, you get queueing in a cloud as well.) > >part of the issue here seems to be that people buy into a couple >fallacies that they apply to cloud: > - private sector is inherently more efficient. 
this is a bit > of a mystery to me, but I guess this is one of the great rhetorical > successes of the neocon movement. I've looked at Amazon prices, > and they are remarkably high - depending on purchasing model, > about 20x higher than an academic-run research cluster. why is there > not more skepticism of outsourcing, since it always means your cost > includes one or more corporate profit margins? 'twas ever thus, I suspect. We get the same thing at JPL. Whatever potential inefficiencies there are with academically oriented or government toilers, the fact that we are non-profit means instantly that we have a 10% advantage. But we have an overhead of proving we're not ripping off the taxpayer, and that probably eats up the advantage. That said, private industry does have some advantages in some circumstances: They are probably more nimble when it comes to ramping up manufacturing. There are definitely inefficiencies in government work, because of the increased scrutiny that expenditures of tax dollars get. We bear a heavier burden of proving that we're getting what we paid for, that the procurement was free and unbiased, etc. Those $1000 hammer stories are a case in point. There are numerous common business-to-business practices that are outright illegal when done in a business-to-government context. You can argue about whether the practices are moral or ethical, but the fact of the matter is that things like finder's fees, profit as a fixed percentage of job cost, etc. are all perfectly legal and common in business. There are probably some aspects of this that allow business to perform some task cheaper than government can, at least in the short run. That is, business can externalize some of the costs, while government cannot. These days, though, industry is paying more for software talent than the government is (you won't see JPL or civil service offering fresh-out CS majors $100k/yr+50k hire bonus + 100k RSU like facebook is).
I think that when all is said and done, it's about the same. After all, everyone is buying the same sand and the same people to do the work. Any differences are really small scale arbitrage opportunities. > > - economies of scale: people seem to think that a datacenter at the > scale of google/amazon/facebook is going to be dramatically cheaper. > while I'm sure they get a good deal from their suppliers, I also > doubt it's game-changing. power, for instance, is a relatively > modest portion of costs, ~10% per year of a server's purchase price. > machineroom cost is pretty linear with number of nodes (power); > people overhead is very small (say, > 1000 servers per fte.) I suspect that there's a sort of middle ground where "clouding" or "co-lo hosting" or "rent a rack" is cheaper. Someone who has a need for, say, 10-50 machines. That's really not enough to justify a built-in infrastructure, but it's too big to "have the receptionist manage it". The folks running 1000s of servers, they've got the economy of scale built in, so they'll be making their choice upon small optimizations (cheaper to buy Amazon time because our electricity rates happen to be high right now) or because they have a wildly fluctuating need (we need 10,000 CPUs this week, but none for the next 3 weeks after that) But there are thousands and thousands of medium-sized organizations that could probably benefit from "someone else" providing the computing infrastructure. Think of some manufacturing and design company that makes widgets, but needs some server horsepower to do whatever it is. Their business isn't doing sys admin, backups, etc. They can usefully outsource that to "the cloud" and focus their efforts on their core competencies. They can work a deal where someone else does the off-site backups, etc. and they don't have to worry about it.
(yes, they could also hire a consulting company to do much of this as well, but for "commodity computing" maybe a generic provider "the cloud" is a better solution.) > >most of all, I just don't see how cloud changes the HPC picture at all. >HPC is already based on shared resources handling burstiness of demand - >if anything, cloud is simply slower. certainly I can't submit a job to >EC2 that uses half the Virgina zone and expect it to run immediately. >it's not clear to me whether cloud-pushers are getting real traction with >the funding agencies (gov is neocon here in Canada.) it worries me that >cloud might be framed as "better computing than HPC". > >I'm curious: what kind of cloudiness are you seeing? We've got a big "use the cloud" thing going on at JPL (and within NASA as well). To a certain extent, I think (personal opinion here, not JPL's or NASA's) it's a "everyone is talking about cloud, so we better do something with it, so at least we can comment intelligently". But it's also useful for bursty load. We have a real problem with physical space for more computers amid our aging infrastructure (most of our buildings are 40-50 years old) and the need for "I must have my hands on the physical box" is going away, as the interface mechanisms get smoother and cleaner. It's all about control, after all.. I, who strongly advocate personal supercomputers under your desk, because nobody is looking over your shoulder trying to optimize their utilization, find that the concept of smoothly divisible and scalable compute power available with a network connection is pretty close to what you want. 
The "external control and optimization" aspects that prompt my desire for personal computing come about when the cost granularity of the system is sufficiently coarse that a bureaucracy springs up to manage the system, which inevitably means that the "transaction cost" to get an increment of computation goes up, and they impose a minimum transaction size that is substantially larger than my "incremental need". Example using test equipment. I might want to use a $100,000 spectrum analyzer for half a day. That's a $5,000/month kind of rental, with a 2 month minimum. I'd happily pay the $50-100 for a half day's use, but because the system doesn't accommodate short usage, I'm stuck for $10,000 to do my measurement which is worth $100. And there's no effective way for me to resell the extra 60 days worth of spectrum analyzer availability. This is because my need patterns are mismatched to the supply patterns. The cloud concept has definitely worked to reduce the "transaction cost". You can buy an hour's time on 100 CPUs, pretty easily. Nobody is coming after you to help chip in for the capital cost on the machine room, or asking you to buy a month's worth of time. And, I think that in the HPC world in general, this sort of model has already existed (and heck, it goes way back to when IBM didn't sell computers, they sold CPU seconds and Core seconds and I/O Channel seconds). But it is totally unfamiliar to a lot of current IT people, who have never worked with "timesharing" systems. Their conceptual models are built around "buy a PC or three or hundred" and then scaled to "buy racks of servers and put them in a room" or, "get a loan to buy 1000 servers and put them in a room", or, perhaps "lease 1000 servers and hire a room"... All of those are really based upon "buying" (in some sense) a physical box and providing for it's care and feeding. The big difference in cloud is that you are buying "service" on a fine scale. 
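Jim's transaction-cost point can be put in numbers. A minimal back-of-the-envelope sketch, using the figures from his spectrum-analyzer example; the $0.10/CPU-hour cloud rate is an illustrative assumption, not a real price quote:

```python
# Back-of-the-envelope transaction-cost comparison, using the figures
# from the post above. The $0.10/CPU-hour cloud rate is an illustrative
# assumption, not a quoted price.

def rental_cost(monthly_rate, min_months):
    """Coarse-grained model: you pay the minimum commitment regardless of use."""
    return monthly_rate * min_months

def cloud_cost(cpus, hours, rate_per_cpu_hour):
    """Fine-grained model: you pay only for what you actually use."""
    return cpus * hours * rate_per_cpu_hour

# Spectrum-analyzer example from the post: $5,000/month with a
# 2-month minimum, for a measurement worth ~$100 of instrument time.
stuck_for = rental_cost(5_000, 2)
print(f"coarse-grained rental: ${stuck_for:,}")        # $10,000 committed

# Cloud example from the post: an hour on 100 CPUs, at an assumed rate.
burst = cloud_cost(100, 1, 0.10)
print(f"fine-grained cloud:    ${burst:,.2f}")         # pay-per-use
```

The gap between the committed and the metered figure is exactly the "transaction cost" reduction described above.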
And the term cloud is just a wonderful sexy marketing description that no-doubt came from someone looking at network diagrams. There has been that "cloud" bubble around for decades to represent "stuff over which we don't have control nor do we care, it's just there and outside our domain" > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From james.p.lux at jpl.nasa.gov Wed Feb 1 11:23:16 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 1 Feb 2012 08:23:16 -0800 Subject: [Beowulf] cloud: ho hum? In-Reply-To: Message-ID: On 2/1/12 7:59 AM, "Mark Hahn" wrote: > >> - Deployment speed. We have customers who wait weeks after making an IT >> helpdesk request for a new VM to be created. Other customers take 1+ > >no. there's nothing technical here: dysfunctional IT orgs should simply >be fixed. outsourcing as a workaround for BOFHishness is stupid... > The IT org in this situation isn't necessarily dysfunctional. Say you're an R&D group of 100 people in a company with 200k employees. Their IT org is optimized for the 200k, not for the 100. Outsourcing is a logical choice here. (this is the specialization, vertical vs horizontal integration, etc. discussion). Yes, there are inefficient service organizations everywhere, and there always will be. The hardest thing for project managers to learn is that you MUST plan for average, not above average, performance. The fact that sometimes you get above average helps counteract the unknowable problems that result in below average. Example from NASA.. Pathfinder put a rover on Mars for (ostensibly) $25M and set a mind bendingly aggressively low bar for future missions. 
That's not because Pathfinder was particularly well managed (it was well managed, but that's not why the cost was low).. It's more because of a happy coincidence of lots of circumstances that made something that realistically should have cost around $150M cost 1/6 of that. They got lucky with people to work on it, they got lucky with spare parts from other missions, they got lucky in being small, so avoiding a lot of oversight costs. Next Mars missions in 1998.. Hey Faster, Better, Cheaper, we can do it again. We'll put TWO probes at Mars for the cost of one $100M mission. Oops, one crashed into the surface, the other missed orbit injection and probably burned up. Much soul searching and reflection.. Next Mars mission (MER 2003) costs over $1B for two rovers. (and you can bet there was a LOT more reviews and oversight) MER got unlucky, in a lot of ways. Original estimates of costs (from Pathfinder) turned out to be inappropriate (some examples below). But the real story is that Pathfinder happened to be out on the tail of the probability distribution of cost, and MER was more in the middle. Pathfinder's probability of failure was MUCH higher than MERs. - You can't just scale up airbags and parachutes - The fast, low documentation approach of Pathfinder means you don't actually have drawings from which you can build stuff with no changes. - Parts that survived for Pathfinder, when actually tested for environments, had a high probability of failing, so Pathfinder "got lucky" and the parts had to get redesigned. - MER was a lot bigger, so the "average" performance of the team inevitably showed the applicability of the central limit theorem. - MER was a lot bigger, so the N^k, where k>1, communications costs rose faster than the job size. - As the job costs more, it gets more attention, so more management controls and reviews are put into place. 
There's a big difference between a failure of a mission flying one or two instruments on a cheap and cheerful rover assembled from commercial parts and flying a dozen instruments on a $400M rover. >> _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Greg at Keller.net Wed Feb 1 13:27:20 2012 From: Greg at Keller.net (Greg Keller) Date: Wed, 1 Feb 2012 12:27:20 -0600 Subject: [Beowulf] cloud: ho hum? Message-ID: I Sell HPC Cycles over the Internet > Date: Wed, 1 Feb 2012 10:08:24 -0500 (EST) > From: Mark Hahn > Subject: [Beowulf] cloud: ho hum? > - private sector is inherently more efficient. this is a bit > of a mystery to me, but I guess this is one of the great rhetorical > successes of the neocon movement. It depends on who's accounting for what. Businesses typically have to include Power, Tax and Higher markups from Vendors for HW and SW than Gov't and Academics. Also NSF and others, at least on paper, require them to charge no more than "cost" when selling cycles as I understand. So they should always be cheaper if the system is 100% utilized over the life of the system. > I've looked at Amazon prices, > and they are remarkably high - depending on purchasing model, > about 20x higher than an academic-run research cluster. > why is there > not more skepticism of outsourcing, since it always means your cost > includes one or more corporate profit margins? Please don't consider Amazon pricing "HPC in the Cloud"'s baseline or norm. Especially price/performance. They do a great job at what they do well, but in this instance they actually poison the market because the price/performance is so bad for many workloads. > >
- economies of scale: people seem to think that a datacenter at the > scale of google/amazon/facebook is going to be dramatically cheaper. > while I'm sure they get a good deal from their suppliers, I also > doubt it's game-changing. power, for instance, is a relatively > modest portion of costs, ~10% per year of a server's purchase price. > machineroom cost is pretty linear with number of nodes (power); > people overhead is very small (say, > 1000 servers per fte.) There is also a significant penalty for any provider that builds first, sells second that negates much of the "Economy of scale". It's like buying hard drive space 2 years in advance, if you wait and buy in smaller chunks as needed you will end up with a lot more space over the 2 years for the same spend. > > most of all, I just don't see how cloud changes the HPC picture at all. > HPC is already based on shared resources handling burstiness of demand - > if anything, cloud is simply slower. certainly I can't submit a job to > EC2 that uses half the Virgina zone and expect it to run immediately. > it's not clear to me whether cloud-pushers are getting real traction with > the funding agencies (gov is neocon here in Canada.) it worries me that > cloud might be framed as "better computing than HPC". When done well, it's a continuation of a longer term trend: researchers build their own cluster and don't share... then they put them together in departments and team share... then those get pulled into enterprise scale systems... then enterprises "share" discretely/blindly through time-sharing at some external provider. > > I'm curious: what kind of cloudiness are you seeing? Most organizations that have a large enough continuous baseline load can probably save money doing the baseline in-house and "bursting" to providers for special projects and anything that's speculative enough that they may not be doing it for 3-5 years.
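Greg's baseline-plus-burst argument reduces to a utilization break-even: own capacity for load you keep busy, rent for the rest. A rough sketch; every price below is an illustrative assumption, not a vendor quote:

```python
# Break-even utilization for owning a baseline cluster vs. renting cycles.
# All prices are illustrative assumptions, not vendor quotes.

HOURS_PER_YEAR = 8760

def inhouse_cost_per_node_hour(capex_per_node, years, opex_per_node_year,
                               utilization):
    """Amortized cost of a node-hour you actually use, at a given utilization."""
    total = capex_per_node + opex_per_node_year * years
    used_hours = HOURS_PER_YEAR * years * utilization
    return total / used_hours

# Assumed: a $4,000 node amortized over 3 years, $1,000/node/year for
# power/space/admin, against an assumed $0.50/node-hour provider price.
PROVIDER = 0.50
for u in (0.2, 0.5, 0.9):
    own = inhouse_cost_per_node_hour(4_000, 3, 1_000, u)
    verdict = "own" if own < PROVIDER else "rent"
    print(f"utilization {u:.0%}: in-house ${own:.2f}/node-hour -> {verdict}")
```

With these made-up numbers, a well-utilized baseline is cheaper to own, while bursty or speculative load is cheaper to rent, which is the split Greg describes.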
If your admin costs as much as your cluster of 16 nodes (because they read this list and are awesome), you may be better off outsourcing even the baseline. Ridiculous overhead, space, delay, or power costs can also help make outsourcing HPC a better use of budget. If you outsource everything and don't have redundant providers, you may be adding risk to your organization to save a little money. Once a few independent providers support similar access and control systems, you could have redundant providers and shift workloads; if your competitor buys your current provider and shuts it down Oracle-style, you still have a system to run on while you set up at a new redundant provider. This is IMHO the key limiter of HPC outsourcing growth, and for good reasons. Some of our early adopters have no choice but to go external, because the budget doesn't allow for a purchase big enough to meet a short-term project's needs, so the redundancy risk of outsourcing is negated. > > thanks, mark hahn. > Also, a few points in reply to Chris... > Date: Wed, 01 Feb 2012 10:37:30 -0500 > From: Chris Dagdigian > Subject: Re: [Beowulf] cloud: ho hum? > My $.02 from what I see in industry (life sciences) > > - The ability to transform capital expense money into OpEx money alone > is pushing some cloud interest at high levels. No joke. Possibly a very > large cloud interest driver in the larger organizations. This is also > attractive for tiny startups and companies just leaving the VC > incubation phase. Very proven in our business experience. Not just VC: even Fortune 10 companies have projects that look like internal startups and may fail in months. > > - Deployment speed. We have customers who wait weeks after making an IT > helpdesk request for a new VM to be created. Other customers take 1+ > years to design, RFP and choose their HPC solution and another 4 months > to deploy it.
If you can do in minutes (via good DevOps techniques) > what the IT organization normally takes weeks or months to do then > you've got some good arguments for targeting cloud environments for > quick, dev, test and one-off scientific computing environments We call this "Corporate Inertia". Risk aversion by internal staff turns obvious decisions into committee decisions, so no one person gets blamed if there are complaints. > > - Quick capability gains - in some cases it's quicker and easier to get > quick access to GPUs, servers with 10Gbe interconnects and well-built > systems for running MapReduce style big data workflows on cloud platforms > > I agree that the cloud is overhyped and we certainly don't see a ton of > HPC migrating entirely to the cloud. What we see in the trenches and out > in the real world is significant interest in leveraging the cloud for > Speed, Capability, Cost or "weird" use cases. Agreed. "Enterprise" cloud and "HPC" cloud aim at opposite purposes: one subdivides a system for higher utilization (value), the other combines multiple systems for performance. HPC benefits greatly from a transparent understanding of the actual hardware and configurations; Enterprise benefits from not caring or needing to. So "good" HPC clouds are transparent, whereas Enterprise clouds should only be translucent. You can't win a NASCAR race with $xM worth of convertible Geo Metros; you need a single $xM NASCAR car and a good pit crew (admins). > > > -Chris > Cheers! Greg I Sell HPC Cycles over the Internet _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From skylar.thompson at gmail.com Wed Feb 1 20:09:27 2012 From: skylar.thompson at gmail.com (Skylar Thompson) Date: Wed, 01 Feb 2012 17:09:27 -0800 Subject: [Beowulf] rear door heat exchangers In-Reply-To: References: Message-ID: <4F29E247.1070807@gmail.com> On 2/1/2012 5:20 AM, Michael Di Domenico wrote: > On Tue, Jan 31, 2012 at 5:23 PM, wrote: >> Hi, >> >> We have installed a lot of racks with rear door heat exchangers but these >> are without fans instead using the in-server fans to push the air through >> the element. We are doing this with ~20kW per rack. >> >> How the hell are you drinking 35kW in a rack? > > start working with GPU's... you'll find out real fast... You don't even necessarily need GPUs --- our latest blade chassis suck up 7500W in 7U going at full bore. It's pretty unpleasant standing behind them, though. -- -- Skylar Thompson (skylar.thompson at gmail.com) -- http://www.cs.earlham.edu/~skylar/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From DIEP at xs4all.nl Fri Feb 3 17:24:17 2012 From: DIEP at xs4all.nl (Vincent Diepeveen) Date: Fri, 3 Feb 2012 23:24:17 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap Message-ID: http://www.anandtech.com/show/5503/understanding-amds-roadmap-new- direction/2 AMD's new roadmap basically says they stop high-performance CPU development. The GPU line will continue, also integrated inside CPUs. A total monopoly for Intel for applications needing CPUs, as it seems. That might mean that companies that need some more CPU crunching no longer can build cheap 4-socket machines that have good performance.
The top-end AMD 4-socket system usually sold just above or under $10k, with double the core count you'd normally expect from a 4-socket system, making it to some extent similar to an 8-socket system (albeit with very low-clocked CPUs); that option will no longer perform well. Intel's latest 8-socket solution, at Oracle, is usually around $200k a machine. So this means that clustering is the only cheap choice then. Comments? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From DIEP at xs4all.nl Sat Feb 4 00:09:53 2012 From: DIEP at xs4all.nl (Vincent Diepeveen) Date: Sat, 4 Feb 2012 06:09:53 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: Message-ID: On Feb 4, 2012, at 5:09 AM, Mark Hahn wrote: >> http://www.anandtech.com/show/5503/understanding-amds-roadmap-new- >> direction/2 >> >> AMD's new roadmap basically says they stop high performance CPU >> development. > > well, they won't pursue P4-netburst-ish damn-the-torpedoes style > "high" _desktop_ performance. their server roadmap is pretty > solid, though they've dropped the 10/20c chips. (which might not > have made sense in terms of power envelope. or, for that matter, > the fact that even cache-friendly code probably wants more than 2 > memory channels for 10 cores...) > > I think AMD is absolutely right: the market is for mobile devices, > for power-efficient servers, and for media-intensive desktops. > >> GPU line will continue >> also inside cpu's integrated. Total monopoly for intel for >> applications needing CPU's as it seems.
> > you seem to have missed AMD's main point, which HSA: the concept of > pushing x86 and GPU together to enable something higher-performing. > it's not a crazy idea, though pretty ambitious. > >> That might that companies that need some more CPU crunching no longer >> can build cheap 4 socket > > 4s is still on the roadmap; I don't see why you'd expect it to > disappear. > it costs them very little to support, and does serve a modest market. AMD is moving to 28 nm years after Intel has reached 22 nm. So for anything that has to perform, it's over, of course. Add to that, we see how Indian engineering works: using 2 billion transistors for something Intel did 3 years earlier in the same process technology using far under 1 billion. The mobile market is very competitive, with many players, not just Intel and AMD. Price matters a lot there. Intel, with the same quality of engineers, would easily beat AMD there. 22 nm versus 28 nm: it's not a contest; it's game over, normally speaking. Yet in the mobile market a lot is possible; it's a competitive market. As for 4-socket servers, their roadmap basically shows the same CPUs they have now. They can clock them maybe a tad higher, win 1% here, win 1% there. That's about it. So AMD doesn't compete on number of cores, and they basically can't be on par with Intel 2-socket machines if this roadmap reflects reality. Basically AMD just mastered 32 nm, 3 years after Intel released CPUs on it. Just to get Bulldozer on par with the i7-965 from years ago, if we speak about integers, they needed to have Bulldozer consume A LOT more power. If their new design team is so bad at producing equipment that uses little power, how do you suppose they can compete, in an older process technology, against mature Intel 22 nm products in the 'high-end' mobile market? CPU-wise they're total history. Moving your R&D to the third world has become a total disaster for AMD.
It's true that their GPU line is doing well, but let's ask around a bit: how many run GPGPU on AMD GPUs in OpenCL? They're not supporting their OpenCL very well, to put it politely. I can give examples if you're interested. Yet the most important point to make here is that one of AMD's biggest competitive advantages was getting a lot of CPU cores for a cheap price. Those days are definitely over. I don't see how their design team even remotely has any clue about building low-power products in the coming years, given the massive mess-ups. Basically AMD has 1 great chip from a few years ago still playing in the marketplace, if we speak about the CPU division. It won't be long until everyone has forgotten about that as well. As for crunching, which is what most people do on this list, the AMD CPUs aren't interesting anymore. Sure, their GPUs are, for the floating-point guys (not for integers, as they don't support OpenCL very well - not all hardware instructions, including some crucial ones, are available in that GPU's OpenCL, which is a major reason to choose Nvidia if you have to do integer crunching). But one very big division called CPU sales: forget it. They have a big problem. It seems they become some sort of Asian company now, if we study the code names closely, which can just deliver crap CPUs for a cheap price; CPUs eating too much power by Western standards. Forget low-power design in India - won't happen. > >> So this means that clustering is only cheap choice then. > > clustering has always been the cheap solution. hence this list! _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From hahn at mcmaster.ca Fri Feb 3 23:09:34 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Fri, 3 Feb 2012 23:09:34 -0500 (EST) Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: Message-ID: > http://www.anandtech.com/show/5503/understanding-amds-roadmap-new- > direction/2 > > AMD's new roadmap basically says they stop high performance CPU > development. well, they won't pursue P4-netburst-ish damn-the-torpedoes style "high" _desktop_ performance. their server roadmap is pretty solid, though they've dropped the 10/20c chips. (which might not have made sense in terms of power envelope. or, for that matter, the fact that even cache-friendly code probably wants more than 2 memory channels for 10 cores...) I think AMD is absolutely right: the market is for mobile devices, for power-efficient servers, and for media-intensive desktops. > GPU line will continue > also inside cpu's integrated. Total monopoly for intel for > applications needing CPU's as it seems. you seem to have missed AMD's main point, which HSA: the concept of pushing x86 and GPU together to enable something higher-performing. it's not a crazy idea, though pretty ambitious. > That might that companies that need some more CPU crunching no longer > can build cheap 4 socket 4s is still on the roadmap; I don't see why you'd expect it to disappear. it costs them very little to support, and does serve a modest market. > So this means that clustering is only cheap choice then. clustering has always been the cheap solution. hence this list! _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From cap at nsc.liu.se Wed Feb 8 08:13:49 2012 From: cap at nsc.liu.se (Peter =?iso-8859-1?q?Kjellstr=F6m?=) Date: Wed, 8 Feb 2012 14:13:49 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: Message-ID: <201202081413.53665.cap@nsc.liu.se> > > GPU line will continue > > also inside cpu's integrated. Total monopoly for intel for > > applications needing CPU's as it seems. > > you seem to have missed AMD's main point, which HSA: the concept of > pushing x86 and GPU together to enable something higher-performing. > it's not a crazy idea, though pretty ambitious. The APU concept has a few interesting points but certainly also a few major problems (when comparing it to a cpu + stand alone gpu setup): * Memory bandwidth to all those FPUs * Power (CPUs in servers today max out around 120W with GPUs at >250W) Either way we're in for an interesting future (as usual) :-) /Peter -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. URL: -------------- next part -------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From diep at xs4all.nl Wed Feb 8 08:27:31 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed, 8 Feb 2012 14:27:31 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <201202081413.53665.cap@nsc.liu.se> References: <201202081413.53665.cap@nsc.liu.se> Message-ID: On Feb 8, 2012, at 2:13 PM, Peter Kjellstr?m wrote: >>> GPU line will continue >>> also inside cpu's integrated. Total monopoly for intel for >>> applications needing CPU's as it seems. 
>> >> you seem to have missed AMD's main point, which HSA: the concept of >> pushing x86 and GPU together to enable something higher-performing. >> it's not a crazy idea, though pretty ambitious. > > The APU concept has a few interesting points but certainly also a > few major > problems (when comparing it to a cpu + stand alone gpu setup): > > * Memory bandwidth to all those FPUs > * Power (CPUs in servers today max out around 120W with GPUs at > >250W) And the GPUs in those CPUs probably won't be doing double precision at all; maybe they'll work at 16% of the speed of a similar high-end GPU. AMD announced exactly that for their cheap-line GPUs, which are already a lot better than what's going to be put in the CPUs, as it seems. Add to that that the GPU will probably have a very limited number of cores. So it'll possibly be a factor of 200 slower than a 7990 GPU in double precision, which should be nearly a teraflop or two in double precision and will probably eat around 450 watts when crunching double-precision code on all Processing Elements, as the stream cores are called in OpenCL. I'm not sure why people on this mailing list are excited about including GPUs in CPUs. It's great for mobile phones and netbooks and cheapskate laptops and so on. From a performance viewpoint, ignore it. > > Either way we're in for an interesting future (as usual) :-) > > /Peter > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From eugen at leitl.org Wed Feb 8 08:34:12 2012 From: eugen at leitl.org (Eugen Leitl) Date: Wed, 8 Feb 2012 14:34:12 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <201202081413.53665.cap@nsc.liu.se> References: <201202081413.53665.cap@nsc.liu.se> Message-ID: <20120208133412.GN7343@leitl.org> On Wed, Feb 08, 2012 at 02:13:49PM +0100, Peter Kjellstr?m wrote: > * Memory bandwidth to all those FPUs Memory stacking via TSV is coming. APUs with their very apparent memory bottlenecks will accelerate it. > * Power (CPUs in servers today max out around 120W with GPUs at >250W) I don't see why you can't integrate APU+memory+heatsink in a watercooled module that is plugged into the backplane which contains the switched signalling fabric. > Either way we're in for an interesting future (as usual) :-) I don't see how x86 should make it to exascale. It's too bad MRAM/FeRAM/whatever isn't ready for SoC yet. Also, Moore should end by around 2020 or earlier, and architecture only pushes you one or two generations further at most. Don't see how 3D integration should be ready by then, and 2.5 D only buys you another one or two doublings at best. (TSV stacking is obviously off-Moore). _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From diep at xs4all.nl Wed Feb 8 09:01:30 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed, 8 Feb 2012 15:01:30 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <20120208133412.GN7343@leitl.org> References: <201202081413.53665.cap@nsc.liu.se> <20120208133412.GN7343@leitl.org> Message-ID: <71EF5B74-3187-4BA6-B8F4-D27E6EE95A7F@xs4all.nl> On Feb 8, 2012, at 2:34 PM, Eugen Leitl wrote: > On Wed, Feb 08, 2012 at 02:13:49PM +0100, Peter Kjellström wrote: > >> * Memory bandwidth to all those FPUs > > Memory stacking via TSV is coming. APUs with their very apparent > memory bottlenecks will accelerate it. > >> * Power (CPUs in servers today max out around 120W with GPUs at >> >250W) > > I don't see why you can't integrate APU+memory+heatsink in a > watercooled module that is plugged into the backplane which > contains the switched signalling fabric. > Because also for the upcoming new Xbox they have the same power envelope for the built-in GPU as they have in the 'high-end' CPUs. And we're not even speaking about laptop CPUs yet, as it'll be less there. They have at most 18 watts for the built-in GPU. So the first thing they do is kill all its double precision. Even if they didn't: the 6990, the high-end Nvidia, and the 7990 are all 375 watt TDP on paper (in reality 450+ watts). So whatever your 'opinion' is on how they design stuff, it will always be a factor of 375 / 18 = 20.8 times slower than a GPU. And they can easily get over the TDP with the GPUs, as the PCIe connectors will happily pump in more watts; with the built-in GPUs they can't, as the power doesn't come from PCIe but from stricter specs. But now let's look at design. They cannot 'turn off' the AVX in the CPU, as then it doesn't support the latest games, and CPUs nowadays are only about making sure you do better at the latest game; nothing else matters, whatever fairy tale they'll tell you. CPUs are an exorbitantly expensive part of the computer.
They are so expensive those x64 cpu's, because of the 'blessing' of patents. Only 2 companies are able to release x64 cpu's right now and probably soon only 1, as we'll have to see whether AMD survives this. One of those companies is not even in a hurry to release their 8 core Xeons in 32 nm, maybe they want to make more profit with a higher yield cpu at 22 nm. If we already know the gpu is crap in double precision because it just has 18 watts, and we also know that the CPU has AVX, it's pretty useless to let the GPU do the double precision calculations. So the obvious optimization is to kick out all double precision logics in the gpu, which doesn't save transistors as some will tell you, as it usually is all the same chip, just they turn off the transistors, giving them higher yields, so cheaper production price. That's what they'll do if they want to make a profit and i bet their owners will be very unhappy if they do not make a profit. So yes, in a nerd world it would be possible to just include a 2 core chippie that's just 32 bits x86 of a watt or 10, and give majority of the power envelope to a double precision optimized gpu, maybe even 50 watts, which makes it 'only' factor 8 slower, in theory, than a GPU card. Yet that's not very likely to happen. >> Either way we're in for an interesting future (as usual) :-) > > I don't see how x86 should make it to exascale. It's too > bad MRAM/FeRAM/whatever isn't ready for SoC yet. Also, Moore > should end by around 2020 or earlier, and architecture only > pushes you one or two generations further at most. Don't see > how 3D integration should be ready by then, and 2.5 D only > buys you another one or two doublings at best. (TSV stacking > is obviously off-Moore). 
> _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at eadline.org Wed Feb 8 09:06:31 2012 From: deadline at eadline.org (Douglas Eadline) Date: Wed, 8 Feb 2012 09:06:31 -0500 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <201202081413.53665.cap@nsc.liu.se> References: <201202081413.53665.cap@nsc.liu.se> Message-ID: <9923a899e0c2735b549b108ad889cf32.squirrel@mail.eadline.org> >> > GPU line will continue >> > also inside cpu's integrated. Total monopoly for intel for >> > applications needing CPU's as it seems. >> >> you seem to have missed AMD's main point, which is HSA: the concept of >> pushing x86 and GPU together to enable something higher-performing. >> it's not a crazy idea, though pretty ambitious. > > The APU concept has a few interesting points but certainly also a few > major > problems (when comparing it to a cpu + stand alone gpu setup): > > * Memory bandwidth to all those FPUs I thought that was part of the issue, removing the PCI bus from the CPU/GPU connection. Of course, the APU has a lower memory bandwidth than the pure GPU, but in theory now the PCI bottleneck is gone. > * Power (CPUs in servers today max out around 120W with GPUs at >250W) I see this as more of a smearing out of the GPU (SIMD unit). Instead of one big GPU sitting on the PCI bus shared by 2-4 sockets, now each socket has its own GPU on the same memory bus.
Unless I'm not following the APU design correctly. > > Either way we're in for an interesting future (as usual) :-) Indeed. > > /Peter > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Doug _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From james.p.lux at jpl.nasa.gov Wed Feb 8 09:18:50 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 8 Feb 2012 06:18:50 -0800 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <20120208133412.GN7343@leitl.org> Message-ID: On 2/8/12 5:34 AM, "Eugen Leitl" wrote: >On Wed, Feb 08, 2012 at 02:13:49PM +0100, Peter Kjellström wrote: > >> * Memory bandwidth to all those FPUs > >Memory stacking via TSV is coming. APUs with their very apparent >memory bottlenecks will accelerate it. > >> * Power (CPUs in servers today max out around 120W with GPUs at >250W) > >I don't see why you can't integrate APU+memory+heatsink in a >watercooled module that is plugged into the backplane which >contains the switched signalling fabric. I don't know about that.. I don't see the semiconductor companies making such an integrated widget, so it's basically some sort of integrator that would do it: like a mobo manufacturer. But I don't think the volume is there for the traditional mobo types to find it interesting. So now you're talking about small volume specialized mfrs, like the ones who sell into the conduction cooled MIL/AERO market. And those are *expensive*...
Not just because of the plethora of requirements and documentation that the customer wants in that market.. It's all about mfr volume. The whole idea of "plugging in" a liquid cooled thing to a backplane is also sort of unusual. A connector that can carry both high speed digital signals, power, AND liquid without leaking would be weird. And even if it's not "one connector", logically, that whole mating surface of the module is a connector. Reliable liquid connectors usually need some sort of latching or positive action: a collar that snaps in place (think air hose) or turns or does something to put a clamping force on an O-ring or other gasket. It can be done (and probably has), but it's going to be "exotic" and expensive. > >> Either way we're in for an interesting future (as usual) :-) > >I don't see how x86 should make it to exascale. It's too >bad MRAM/FeRAM/whatever isn't ready for SoC yet. Even if you put the memory on the chip, you still have the interconnect scaling problem. Light speed and distance, if nothing else. Putting everything on a chip just shrinks the problem, but it's just like 15 years ago with PC tower cases on shelving and Ethernet interconnects. > Also, Moore >should end by around 2020 or earlier, and architecture only >pushes you one or two generations further at most. Don't see >how 3D integration should be ready by then, and 2.5 D only >buys you another one or two doublings at best. (TSV stacking >is obviously off-Moore). > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From eugen at leitl.org Wed Feb 8 11:39:24 2012 From: eugen at leitl.org (Eugen Leitl) Date: Wed, 8 Feb 2012 17:39:24 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: <20120208133412.GN7343@leitl.org> Message-ID: <20120208163924.GQ7343@leitl.org> On Wed, Feb 08, 2012 at 06:18:50AM -0800, Lux, Jim (337C) wrote: > It can be done (and probably has), but it's going to be "exotic" and > expensive. There are some liquid metal (gallium alloy, strangely enough not sodium/potassium eutectic which would be plenty cheaper albeit a fire hazard if exposed to air) cooled GPUs for the gamer market. I might have also read about one which uses a metal pump with no movable parts which utilizes MHD, though I don't remember where I've read that. There are also plenty of watercooled systems among enthusiasts, including some that include CPU and GPU coolers in the same circuit. I could see how gamers could push watercooled systems into COTS mainstream, it wouldn't be the first time. Multi-GPU settings are reasonably common there, so the PCIe would seem like a good initial fabric candidate. > >I don't see how x86 should make it to exascale. It's too > >bad MRAM/FeRAM/whatever isn't ready for SoC yet. > > > Even if you put the memory on the chip, you still have the interconnect > scaling problem. Light speed and distance, if nothing else. Putting You can eventually put that on WSI (in fact, somebody wanted to do that with ARM-based nodes and DRAM wafers bonded on top, with redundant routing around dead dies -- I presume this would also take care of graceful degradation if you can do it at runtime, or at least reconfigure after failure, and go back to last snapshot). Worst-case distances would be then ~400 mm within the wafer, and possibly shorter if you interconnect these with fiber. The only other way to reduce average signalling distance is real 3D integration. 
> everything on a chip just shrinks the problem, but it's just like 15 years > ago with PC tower cases on shelving and Ethernet interconnects. Sooner or later you run into questions like "what's within the lightcone of a 1 nm device", at which point you've reached the limits of classical computation, nevermind that I don't see how you can cool anything with effective >THz refresh rate. I'm extremely sceptical about QC feasibility, though there's some work with nitrogen vacancies in diamond which could produce qubit entanglement in solid state, perhaps even at room temperature. I just don't think it would scale well enough, and since Scott Aaronson also appears dubious I'm in good company. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Wed Feb 8 12:15:01 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 8 Feb 2012 12:15:01 -0500 (EST) Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <201202081413.53665.cap@nsc.liu.se> References: <201202081413.53665.cap@nsc.liu.se> Message-ID: > The APU concept has a few interesting points but certainly also a few major > problems (when comparing it to a cpu + stand alone gpu setup): > > * Memory bandwidth to all those FPUs well, sorta. my experience with GP-GPU programming today is that your first goal is to avoid touching anything offchip anyway (spilling, etc), so I'm not sure this is a big problem. obviously, the integrated GPU is a small slice of a "real" add-in GPU, so needs proportionately less bandwidth. 
> * Power (CPUs in servers today max out around 120W with GPUs at >250W) sure, though the other way to think of this is that you have 250W or so of power overhead hanging off your GPU cards. you can amortize the "host overhead" by adding several GPUs, but... think of it this way: an APU is just a low-mid-end add-in GPU with the host integrated onto it ;) I think the real question is whether someone will produce a minimalist APU node. since Llano has on-die PCIE, it seems like you'd need only APU, 2-4 dimms and a network chip or two. that's going to add up to very little beyond the APU's 65 or 100W TDP... (I figure 150/node including PSU overhead.) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From james.p.lux at jpl.nasa.gov Wed Feb 8 12:52:55 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 8 Feb 2012 09:52:55 -0800 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <20120208163924.GQ7343@leitl.org> References: <20120208133412.GN7343@leitl.org> <20120208163924.GQ7343@leitl.org> Message-ID: Odd threading/quoting behavior in my mail client.. Comments below with ** -----Original Message----- From: Eugen Leitl [mailto:eugen at leitl.org] Sent: Wednesday, February 08, 2012 8:39 AM To: Lux, Jim (337C); Beowulf at beowulf.org Subject: Re: [Beowulf] Clusters just got more important - AMD's roadmap On Wed, Feb 08, 2012 at 06:18:50AM -0800, Lux, Jim (337C) wrote: > It can be done (and probably has), but it's going to be "exotic" and > expensive.
There are some liquid metal (gallium alloy, strangely enough not sodium/potassium eutectic which would be plenty cheaper albeit a fire hazard if exposed to air) cooled GPUs for the gamer market. I might have also read about one which uses a metal pump with no movable parts which utilizes MHD, though I don't remember where I've read that. There are also plenty of watercooled systems among enthusiasts, including some that include CPU and GPU coolers in the same circuit. I could see how gamers could push watercooled systems into COTS mainstream, it wouldn't be the first time. Multi-GPU settings are reasonably common there, so the PCIe would seem like a good initial fabric candidate. **Those aren't plug into a backplane type configurations, though. They can use permanent connections, or something that is a pain to do, but you only have to do it once. I could see something using heat pipes, too, which possibly mates with some sort of thermal transfer socket, but again, we're talking exotic, and not amenable to large volumes to bring the price down. > >I don't see how x86 should make it to exascale. It's too bad > >MRAM/FeRAM/whatever isn't ready for SoC yet. > > > Even if you put the memory on the chip, you still have the > interconnect scaling problem. Light speed and distance, if nothing > else. Putting You can eventually put that on WSI (in fact, somebody wanted to do that with ARM-based nodes and DRAM wafers bonded on top, with redundant routing around dead dies -- I presume this would also take care of graceful degradation if you can do it at runtime, or at least reconfigure after failure, and go back to last snapshot). Worst-case distances would be then ~400 mm within the wafer, and possibly shorter if you interconnect these with fiber. ** Even better is free space optical interconnect, but that's pretty speculative today. 
And even if you went to WSI (or thick film hybrids or something similar), you're still limited in scalability by the light time delay between nodes. If you had just the bare silicon (no packages) for all the memory and CPU chips in a 1000 node cluster, that's still a pretty big ball o'silicon. And a big ball o'silicon that is dissipating a fair amount of heat. The only other way to reduce average signalling distance is real 3D integration. ** I agree. This has been done numerous times in the history of computing. IBM had that cryogenically cooled stack a few decades ago. The "round" Cray is another example. > everything on a chip just shrinks the problem, but it's just like 15 > years ago with PC tower cases on shelving and Ethernet interconnects. Sooner or later you run into questions like "what's within the lightcone of a 1 nm device", at which point you've reached the limits of classical computation, nevermind that I don't see how you can cool anything with effective >THz refresh rate. I'm extremely sceptical about QC feasibility, though there's some work with nitrogen vacancies in diamond which could produce qubit entanglement in solid state, perhaps even at room temperature. I just don't think it would scale well enough, and since Scott Aaronson also appears dubious I'm in good company. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
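Jim's "light speed and distance, if nothing else" bound is easy to quantify. A rough sketch; the 3 GHz clock and the 0.6c propagation factor for a real electrical link are illustrative values assumed here, not figures from the thread:

```python
# How far a signal can travel in one clock period -- the hard limit Jim
# is pointing at. Even on a wafer, Eugen's ~400 mm worst-case path costs
# multiple cycles of latency before any switching logic is counted.
C = 299_792_458.0  # speed of light in vacuum, m/s

def reach_per_cycle_mm(clock_hz, velocity_factor=0.6):
    """Distance (mm) covered in one clock period at a fraction of c."""
    return C * velocity_factor / clock_hz * 1e3

reach = reach_per_cycle_mm(3e9)
print(round(reach))           # ~60 mm per cycle at 3 GHz
print(round(400 / reach, 1))  # ~6.7 cycles just to cross 400 mm
```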
From james.p.lux at jpl.nasa.gov Wed Feb 8 12:55:23 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 8 Feb 2012 09:55:23 -0800 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: <201202081413.53665.cap@nsc.liu.se> Message-ID: We can probably look back to the history of non-integrated floating point for this kind of thing. 8087/8086, etc. I used to work with a guy who was a key mover at Floating Point Systems, probably one of the first applications of "attached special purpose processor", and ALL of the issues we're talking about here came up in that connection, just as with coprocessors since time immemorial. I think the real question is: "does the fact we're doing this at a different scale, change any of the fundamental limitations or make something easier than it was the last time" -----Original Message----- From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Mark Hahn Sent: Wednesday, February 08, 2012 9:15 AM To: Beowulf Mailing List Subject: Re: [Beowulf] Clusters just got more important - AMD's roadmap > The APU concept has a few interesting points but certainly also a few > major problems (when comparing it to a cpu + stand alone gpu setup): > > * Memory bandwidth to all those FPUs well, sorta. my experience with GP-GPU programming today is that your first goal is to avoid touching anything offchip anyway (spilling, etc), so I'm not sure this is a big problem. obviously, the integrated GPU is a small slice of a "real" add-in GPU, so needs proportionately less bandwidth. > * Power (CPUs in servers today max out around 120W with GPUs at >250W) sure, though the other way to think of this is that you have 250W or so of power overhead hanging off your GPU cards. you can amortize the "host overhead" by adding several GPUs, but... 
think of it this way: an APU is just a low-mid-end add-in GPU with the host integrated onto it ;) I think the real question is whether someone will produce a minimalist APU node. since Llano has on-die PCIE, it seems like you'd need only APU, 2-4 dimms and a network chip or two. that's going to add up to very little beyond the the APU's 65 or 100W TDP... (I figure 150/node including PSU overhead.) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Wed Feb 8 13:41:34 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed, 8 Feb 2012 19:41:34 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: <201202081413.53665.cap@nsc.liu.se> Message-ID: On Feb 8, 2012, at 6:15 PM, Mark Hahn wrote: >> The APU concept has a few interesting points but certainly also a >> few major >> problems (when comparing it to a cpu + stand alone gpu setup): >> >> * Memory bandwidth to all those FPUs > > well, sorta. my experience with GP-GPU programming today is that your > first goal is to avoid touching anything offchip anyway (spilling, > etc), > so I'm not sure this is a big problem. obviously, the integrated GPU > is a small slice of a "real" add-in GPU, so needs proportionately > less bandwidth. Most of the code that's real fast on gpgpu simply doesn't leave the compute units at all. 
For outsiders: a compute unit is basically 1 vector core (or SIMD) of a gpu with its own registers and its own shared memory (64 KB or so at nvidia + registers, which is quite a tad, and 32 KB shared memory for AMD + a big multiple of that for local registers). So that's 64 PE's (processing elements) at newer generation AMD's (6000 and 7000 series), or 32 at nvidia. Nvidia has 512 PE's and latest AMD has 2048 PE's. You really don't want to touch the RAM much in gpgpu computing. RAM slows down. There is zero difference in programming model there between AMD and Nvidia gpu's. Anything that does stuff other than just inside a compute unit of 32 or 64 'cores' is not gonna scale well. A good example is the Trial Factorisation for Mersenne that works at Nvidia very well in CUDA. Basically candidates get generated at cpu's, a bunch gets shipped to the gpu, then all calculations occur within a compute unit for a bunch of candidates. The problem there you stumble upon as well is not so much the bandwidth from cpu to gpu. It's simply the problem that the CPU's are not fast enough to generate candidates for the GPU, as the GPU is 200x or so faster than a CPU core. The cpu's just can't feed the gpu as they're too slow generating factor candidates to keep the gpu busy. Remember this is just a single GPU and a relatively cheap one. As for games, one would guess it's easier to scale well for graphics, yet they do not. Call it clumsy programming, call it badly paid coders, call it 'not necessary to fix as we'll buy a faster gpu soon'; as a result you typically see that gpgpu programs that scale well cause the gpu's to eat a lot more power than any game.
> > think of it this way: an APU is just a low-mid-end add-in GPU > with the host integrated onto it ;) > > I think the real question is whether someone will produce a minimalist > APU node. since Llano has on-die PCIE, it seems like you'd need only > APU, 2-4 dimms and a network chip or two. that's going to add up to > very little beyond the the APU's 65 or 100W TDP... (I figure 150/node > including PSU overhead.) > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From deadline at eadline.org Wed Feb 8 14:05:29 2012 From: deadline at eadline.org (Douglas Eadline) Date: Wed, 8 Feb 2012 14:05:29 -0500 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: <201202081413.53665.cap@nsc.liu.se> Message-ID: <1673fe4c73cb087ec4b98a15de5b1d29.squirrel@mail.eadline.org> snip > think of it this way: an APU is just a low-mid-end add-in GPU > with the host integrated onto it ;) > > I think the real question is whether someone will produce a minimalist > APU node. since Llano has on-die PCIE, it seems like you'd need only > APU, 2-4 dimms and a network chip or two. that's going to add up to > very little beyond the the APU's 65 or 100W TDP... (I figure 150/node > including PSU overhead.) I plan on looking at these for my Limulus systems. There are a bunch of microATX and miniITX boards for these APUs, if you use the A8 it has 4 x86 cores and 400 SIMD cores. 
-- Doug _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From cap at nsc.liu.se Wed Feb 8 14:27:49 2012 From: cap at nsc.liu.se (Peter Kjellström) Date: Wed, 8 Feb 2012 20:27:49 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: References: <201202081413.53665.cap@nsc.liu.se> Message-ID: <201202082027.53627.cap@nsc.liu.se> On Wednesday, February 08, 2012 06:15:01 PM Mark Hahn wrote: > > The APU concept has a few interesting points but certainly also a few > > major problems (when comparing it to a cpu + stand alone gpu setup): > > > > * Memory bandwidth to all those FPUs > > well, sorta. my experience with GP-GPU programming today is that your > first goal is to avoid touching anything offchip anyway (spilling, etc), > so I'm not sure this is a big problem. obviously, the integrated GPU > is a small slice of a "real" add-in GPU, so needs proportionately > less bandwidth. Well yes, you want to avoid touching memory on a GPU (just as you do on a CPU). But just as you can't completely avoid it on a CPU nor can you on a GPU. On a current socket (CPU) you see maybe 20 GB/s and 50 GF and the flop-wise much faster GPU is also a lot faster in memory access (>200 GB/s). Now I admit I'm not a GPU programmer but are you saying those 200 GB/s aren't needed? My assumption was that the fact that CPU-codes depend on cache for performance but still need good memory bandwidth held true even on GPUs. Anyway, my point I guess was mostly that it's a lot easier to sort out hundreds of gigs per second to memory on a device with RAM directly on the PCB than on a server socket.
Also, if the APU is a "small slice of a real GPU" then I question the point (not much GPU power per classic core or total system foot-print). ... > I think the real question is whether someone will produce a minimalist > APU node. since Llano has on-die PCIE, it seems like you'd need only > APU, 2-4 dimms and a network chip or two. that's going to add up to > very little beyond the the APU's 65 or 100W TDP... (I figure 150/node > including PSU overhead.) I think anything beyond early testing is a fair bit into the future. For the APU to become interesting I think we need a few (or all of): * Memory shared with the CPU in some useable way (did not say the c-word..) * A proper number crunching version (ecc...) * A fairly high tdp part on a socket with good memory bw * Noticeably better "host to device" bandwidth and even more, latency And don't get me wrong, I'm not saying the above is particularly unlikely... /Peter -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. 
URL: -------------- next part -------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From diep at xs4all.nl Wed Feb 8 16:01:08 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Wed, 8 Feb 2012 22:01:08 +0100 Subject: [Beowulf] Clusters just got more important - AMD's roadmap In-Reply-To: <201202082027.53627.cap@nsc.liu.se> References: <201202081413.53665.cap@nsc.liu.se> <201202082027.53627.cap@nsc.liu.se> Message-ID: <47D69765-C40D-4A78-815C-E5EDFA4881E4@xs4all.nl> On Feb 8, 2012, at 8:27 PM, Peter Kjellström wrote: > On Wednesday, February 08, 2012 06:15:01 PM Mark Hahn wrote: >>> The APU concept has a few interesting points but certainly also a >>> few >>> major problems (when comparing it to a cpu + stand alone gpu setup): >>> >>> * Memory bandwidth to all those FPUs >> >> well, sorta. my experience with GP-GPU programming today is that >> your >> first goal is to avoid touching anything offchip anyway (spilling, >> etc), >> so I'm not sure this is a big problem. obviously, the integrated GPU >> is a small slice of a "real" add-in GPU, so needs proportionately >> less bandwidth. > > Well yes you want to avoid touching memory on a GPU (just as you do > on a CPU). > But just as you cant completely avoid it on a CPU nor can you on a > GPU. On a > current socket (CPU) you see maybe 20 GB/s and 50 GF and the flop- > wise much 50 gflop on a cpu - first of all, very little software actually gets 50 gflop out of a CPU. It might execute 2 instructions a cycle in SIMD, yet not when you multiply. To start with it has just 1 multiplication unit, so you already start with losing factor 2. So the effective output that the CPU delivers isn't much more than its bandwidth and caches can handle.
Now let's skip the multiply-add further in this discussion; AFAIK most totally optimized codes can't use it, yet for gpu's the discussion is the same there. But not for the output bandwidth. In a GPU on the other hand you do achieve the throughput it can deliver. It's delivering, multiply-add not counted, 0.5 Tflop per second, that's 4 Tbytes/s. Or factor 20 above its maximum bandwidth to the RAM. RAM can get prefetched, yet there are no clever caches on the GPU. Some read L2 cache, that's about it. Writes to the local shared cache also are not advised, as its bandwidth is a lot slower than the speed the compute units can deliver. So basically if you read and/or write at full speed to the RAM, you slow down factor 20 or so, a slowdown a CPU does *not* have, as basically it's so slow that CPU, that its RAM can keep up with it. > faster GPU is also alot faster in memory access (>200 GB/s). > > Now I admit I'm not a GPU programmer but are you saying those 200 > GB/s aren't > needed? My assumption was that the fact that CPU-codes depend on > cache for > performance but still need good memory bandwidth held true even on > GPUs. > > Anyway, my point I guess was mostly that it's a lot easier to sort out > hundreds of gigs per second to memory on a device with RAM directly > on the PCB > than on a server socket. > > Also, if the APU is a "small slice of a real GPU" then I question > the point > (not much GPU power per classic core or total system foot-print). > > ... >> I think the real question is whether someone will produce a >> minimalist >> APU node. since Llano has on-die PCIE, it seems like you'd need only >> APU, 2-4 dimms and a network chip or two. that's going to add up to >> very little beyond the the APU's 65 or 100W TDP... (I figure 150/ >> node >> including PSU overhead.) > > I think anything beyond early testing is a fair bit into the > future.
For the > APU to become interesting I think we need a few (or all of): > > * Memory shared with the CPU in some useable way (did not say the > c-word..) > * A proper number crunching version (ecc...) > * A fairly high tdp part on a socket with good memory bw > * Noticeably better "host to device" bandwidth and even more, latency > > And don't get me wrong, I'm not saying the above is particularly > unlikely... > > /Peter > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Thu Feb 9 06:12:12 2012 From: eugen at leitl.org (Eugen Leitl) Date: Thu, 9 Feb 2012 12:12:12 +0100 Subject: [Beowulf] The death of CPU scaling: From one core to many - and why we're still stuck Message-ID: <20120209111212.GX7343@leitl.org> http://www.extremetech.com/computing/116561-the-death-of-cpu-scaling-from-one-core-to-many-and-why-were-still-stuck?print The death of CPU scaling: From one core to many - and why we're still stuck By Joel Hruska on February 1, 2012 at 2:31 pm It's been nearly eight years since Intel canceled Tejas and announced its plans for a new multi-core architecture. The press wasted little time in declaring conventional CPU scaling dead - and while the media has a tendency to bury products, trends, and occasionally people well before their expiration date, this is one declaration that's stood the test of time.
To understand the magnitude of what happened in 2004 it may help to consult the following chart. It shows transistor counts, clock speeds, power consumption, and instruction-level parallelism (ILP). The doubling of transistor counts every two years is known as Moore's law, but over time, assumptions about performance and power consumption were also made and shown to advance along similar lines. Moore got all the credit, but he wasn't the only visionary at work. For decades, microprocessors followed what's known as Dennard scaling. Dennard predicted that oxide thickness, transistor length, and transistor width could all be scaled by a constant factor. Dennard scaling is what gave Moore's law its teeth; it's the reason the general-purpose microprocessor was able to overtake and dominate other types of computers. CPU Scaling [1]CPU scaling showing transistor density, power consumption, and efficiency. Chart originally from The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software [2] The original 8086 drew ~1.84W and the P3 1GHz drew 33W, meaning that CPU power consumption increased by 17.9x while CPU frequency improved by 125x. Note that this doesn't include the other advances that occurred over the same time period, such as the adoption of L1/L2 caches, the invention of out-of-order execution, or the use of superscalar execution and pipelining to improve processor efficiency. It's for this reason that the 1990s are sometimes referred to as the golden age of scaling. This expanded version of Moore's law held true into the mid-2000s, at which point the power consumption and clock speed improvements collapsed. The problem at 90nm was that transistor gates became too thin to prevent current from leaking out into the substrate. Intel and other semiconductor manufacturers have fought back with innovations [3] like strained silicon, hi-k metal gate, FinFET, and FD-SOI -- but none of these has re-enabled anything like the scaling we once enjoyed.
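The scaling gap in those numbers is easy to check. A quick sketch using only the article's own figures (1.84W and 33W, and the 125x clock growth):

```python
# Ratios quoted in the article: 8086 (~1.84 W) vs. Pentium III 1 GHz (33 W).
p_8086_watts = 1.84
p_p3_watts = 33.0
power_growth = p_p3_watts / p_8086_watts   # ~17.9x more power...
freq_growth = 125.0                        # ...bought 125x more clock

# Dennard scaling at work: ~7x better clock-per-watt from process
# shrinks alone, before caches, OoO execution, or pipelining.
perf_per_watt_gain = freq_growth / power_growth
print(round(power_growth, 1), round(perf_per_watt_gain, 1))  # -> 17.9 7.0
```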
From 2007 to 2011, maximum CPU clock speed (with Turbo Mode enabled) rose from 2.93GHz to 3.9GHz, an increase of 33%. From 1994 to 1998, CPU clock speeds rose by 300%. Next page: The multi-core swerve [4] The multi-core swerve For the past seven years, Intel and AMD have emphasized multi-core CPUs as the answer to scaling system performance, but there are multiple reasons to think the trend towards rising core counts is largely over. First and foremost, there's the fact that adding more CPU cores never results in perfect scaling. In any parallelized program, performance is ultimately limited by the amount of serial code (code that can only be executed on one processor). This is known as Amdahl's law. Other factors, such as the difficulty of maintaining concurrency across a large number of cores, also limit the practical scaling of multi-core solutions. Amdahl's Law [5] AMD's Bulldozer is a further example of how bolting more cores together can result in a slower end product [6]. Bulldozer was designed to share logic and caches in order to reduce die size and allow for more cores per processor, but the chip's power consumption badly limits its clock speed while slow caches hamstring instructions per cycle (IPC). Even if Bulldozer had been a significantly better chip, it wouldn't change the long-term trend towards diminishing marginal returns. The more cores per die, the lower the chip's overall clock speed. This leaves the CPU ever more reliant on parallelism to extract acceptable performance. AMD isn't the only company to run into this problem; Oracle's new T4 processor is the first Niagara-class chip to focus on improving single-thread performance rather than pushing up the total number of threads per CPU. Rage Jobs [7] The difficulty of software optimization is a further reason why adding more CPU cores doesn't help much. Game developers have made progress in using multi-core systems, but the rate of advance has been slow. Games like Rage [8] and Battlefield 3 --
two high-profile titles that use multiple cores -- both utilized new engines designed from the ground up with multi-core scaling as a primary goal. The bottom line is that it's been easier for Intel and AMD to add cores than it is for software to take advantage of them. Seven years after the multi-core era began, it's already morphing into something different. Next page: The rise (and limit) of Many-Core [9] The rise (and limit) of Many-Core In this context, we're using the term "many-core" to refer to a wide range of programmable hardware. GPUs from AMD and Nvidia are both "many-core" products, as are chips from companies like Tilera. Intel's Knights Corner [10] is a many-core chip. The death of conventional scaling has sparked a sharp increase in the number of companies researching various types of specialized CPU cores. Prior to that point, general-purpose CPU architectures, exemplified by Intel's x86, had eaten through the high-end domains of add-in boards and co-processors at a ferocious rate. Once that trend slammed into the brick wall of physics, more specialist architectures began to appear. Many-core Scaling [11]Note: Three exclamation points doesn't actually mean anything, despite the fondest wishes of AMD's marketing department Despite what some companies like to claim, specialized many-core chips don't "break" Moore's law in any way and are not exempt from the realities of semiconductor manufacturing. What they offer is a tradeoff -- a less general, more specialized architecture that's capable of superior performance on a narrower range of problems. They're also less encumbered by socket power constraints -- Intel's CPUs top out at 140W TDP; Nvidia's upper-range GPUs are in the 250W range. Intel's upcoming Many Integrated Core (MIC) architecture is partly an attempt to capitalize on the benefits of having a separate interface and giant PCB for specialized, ultra-parallel data crunching.
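Amdahl's law, cited on the previous page, is worth making concrete: the serial fraction alone caps the speedup no matter how many cores you add. A minimal sketch of the standard formula (the 5% serial fraction is an illustrative assumption, not a figure from the article):

```python
def amdahl_speedup(cores, serial_fraction):
    """Upper bound on speedup with `cores` processors when
    `serial_fraction` of the work runs on only one core."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Just 5% serial code keeps an 8-core chip well under 8x,
# and piling on more cores quickly stops paying:
print(round(amdahl_speedup(8, 0.05), 2))    # -> 5.93
print(round(amdahl_speedup(64, 0.05), 2))   # -> 15.42
print(round(1 / 0.05, 2))                   # -> 20.0, the hard ceiling
```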
AMD, meanwhile, has focused on consumer-side applications and the integration of CPU and GPU via what it calls Graphics Core Next [12]. Regardless of market segmentation, all three companies are talking about integrating specialized co-processors that excel at specific tasks, one of which happens to be graphics. AMD's many-core strategy [13] Unfortunately, this isn't a solution. Incorporating a specialized many-core processor on-die or relying on a discrete solution to boost performance is a bid to improve efficiency per watt, but it does nothing to address the underlying problem that transistors can no longer be counted on to scale the way they used to. The fact that transistor density continues to scale while power consumption and clock speed do not has given rise to a new term: dark silicon. It refers to the percentage of silicon on a processor that can't be powered up simultaneously without breaching the chip's TDP. A recent report on dark silicon and the future of multi-core devices describes the future in stark terms. The researchers considered both transistor scaling as forecast by the International Technology Roadmap for Semiconductors (ITRS) and by a more conservative amount; they factored in the use of APU-style combinations, the rise of so-called "wimpy" cores [14], and the future scaling of general-purpose multiprocessors. They concluded: Regardless of chip organization and topology, multicore scaling is power limited to a degree not widely appreciated by the computing community... Given the low performance returns... adding more cores will not provide sufficient benefit to justify continued process scaling. Given the time-frame of this problem and its scale, radical or even incremental ideas simply cannot be developed along typical academic research and industry product cycles...
A new driver of transistor utility must be found, or the economics of process scaling will break and Moore's Law will end well before we hit final manufacturing limits. Over the next few years scaling will continue to slowly improve. Intel will likely meander up to 6-8 cores for mainstream desktop users at some point, quad-cores will become standard at every product level, and we'll see much tighter integration of CPU and GPU. Past that, it's unclear what happens next. The gap between present-day systems and DARPA's exascale computing initiative [15] will diminish only marginally with each successive node; there's no clear understanding of how -- or if -- classic Dennard scaling can be re-initiated. This is part one of a two-part story. Part two will deal with how Intel is addressing the problem through what it calls the "More than Moore" approach and its impact on the mobile market. Endnotes : http://www.extremetech.com/wp-content/uploads/2012/02/CPU-Scaling.jpg The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software: http://www.gotw.ca/publications/concurrency-ddj.htm fought back with innovations: http://www.extremetech.com/extreme/106899-beyond-22nm-applied-materials-the-unsung-silicon-hero The multi-core swerve: http://www.extremetech.com/computing/116561-the-death-of-cpu-scaling-from-one-core-to-many-and-why-were-still-stuck/2 : http://www.extremetech.com/wp-content/uploads/2012/02/Amdahl.png a slower end product: http://www.extremetech.com/computing/100583-analyzing-bulldozers-scaling-single-thread-performance : http://www.extremetech.com/wp-content/uploads/2012/02/Rage-Jobs.jpg Rage: http://www.extremetech.com/gaming/99729-deconstructing-rage-what-went-wrong-and-how-to-fix-it The rise (and limit) of Many-Core: http://www.extremetech.com/computing/116561-the-death-of-cpu-scaling-from-one-core-to-many-and-why-were-still-stuck/3 Knights Corner: http://www.extremetech.com/extreme/73426-intel-plans-specialized-50core-chip :
http://www.extremetech.com/wp-content/uploads/2012/02/Scaling1.jpg Graphics Core Next: http://www.extremetech.com/computing/110133-radeon-hd-7970-one-gpu-to-rule-them-all : http://www.extremetech.com/wp-content/uploads/2012/02/ManyCoreAMD.jpg "wimpy" cores: http://www.extremetech.com/computing/112319-creative-announces-100-core-system-on-a-chip DARPA's exascale computing initiative: http://www.extremetech.com/computing/116081-darpa-summons-researchers-to-reinvent-computing _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Thu Feb 9 06:42:55 2012 From: eugen at leitl.org (Eugen Leitl) Date: Thu, 9 Feb 2012 12:42:55 +0100 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking Message-ID: <20120209114255.GC7343@leitl.org> http://www.extremetech.com/computing/117377-engineers-boost-amd-cpu-performance-by-20-without-overclocking Engineers boost AMD CPU performance by 20% without overclocking By Sebastian Anthony on February 7, 2012 at 12:44 pm AMD Llano APU die (GPU on the right) Engineers at North Carolina State University have used a novel technique to boost the performance of an AMD Fusion APU by more than 20%. This speed-up was achieved purely through software and using commercial (probably Llano) silicon. No overclocking was used. In an AMD APU there is both a CPU and GPU, both on the same piece of silicon. In conventional applications -- in a Llano-powered laptop, for example -- the CPU and GPU hardly talk to each other; the CPU does its thing, and the GPU pushes polygons. What the researchers have done is to marry the CPU and GPU together to take advantage of each core's strengths.
To achieve the 20% boost, the researchers reduce the CPU to a fetch/decode unit, and the GPU becomes the primary computation unit. This works out well because CPUs are generally very strong at fetching data from memory, and GPUs are essentially just monstrous floating point units. In practice, this means the CPU is focused on working out what data the GPU needs (pre-fetching), the GPU's pipes stay full, and a 20% performance boost arises. Now, unfortunately we don't have the exact details of how the North Carolina researchers achieved this speed-up. We know it's in software, but that's about it. The team probably wrote a very specific piece of code (or a compiler) that uses the AMD APU in this way. The press release doesn't say "Windows ran 20% faster" or "Crysis 2 ran 20% faster," which suggests we're probably looking at a synthetic, hand-coded benchmark. We will know more when the team presents its research on February 27 at the International Symposium on High Performance Computer Architecture. For what it's worth, this kind of CPU/GPU integration is exactly what AMD is angling for with its Heterogeneous System Architecture (formerly known as Fusion System Architecture). AMD has a huge advantage over Intel when it comes to GPUs, but that means nothing if the software chain (compilers, libraries, developers) isn't in place. The good news is that Intel doesn't have anything even remotely close to AMD's APU coming down the pipeline, which means AMD has a few years to see where this HSA path leads. If the 20% speed boost can be brought to market in the next year or two, AMD might actually have a chance. Updated @ 17:54: The co-author of the paper, Huiyang Zhou, was kind enough to send us the research paper. It seems production silicon wasn't actually used; instead, the software tweaks were carried out on a simulated future AMD APU with shared L3 cache (probably Trinity). It's also worth noting that AMD sponsored and co-authored this paper.
Updated @ 04:11 Some further clarification: Basically, the research paper is a bit cryptic. It seems the engineers wrote some real code, but executed it on a simulated AMD CPU with L3 cache (i.e. probably Trinity). It does seem like their working is correct. In other words, this is still a good example of the speed-ups that heterogeneous systems will bring... in a year or two. Read more at North Carolina State University _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hahn at mcmaster.ca Thu Feb 9 11:20:32 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Thu, 9 Feb 2012 11:20:32 -0500 (EST) Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: <20120209114255.GC7343@leitl.org> References: <20120209114255.GC7343@leitl.org> Message-ID: > http://www.extremetech.com/computing/117377-engineers-boost-amd-cpu-performance-by-20-without-overclocking afaict, they discovered that using the cpu to prefetch for the gpu is a win. this is either obvious or quite strange - the latter because one of the basic principles of gpu programming is to have several times more threads than cores in order to let the scheduler hide latency. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
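The scheme the article (and Mark's summary) describes, one unit fetching ahead so the other never stalls on memory, is classic double-buffered prefetching. The paper's code isn't public, so the sketch below only illustrates the general pattern with a prefetch thread feeding a bounded queue; all names here are invented for illustration.

```python
import threading
import queue

def prefetch(chunks, q):
    """'CPU' role: fetch/decode data ahead of the compute unit."""
    for chunk in chunks:
        q.put(chunk)       # stand-in for warming caches / issuing loads
    q.put(None)            # sentinel: no more data

def compute(q):
    """'GPU' role: pure number crunching; never waits on cold memory
    as long as the prefetcher stays ahead."""
    total = 0.0
    while (chunk := q.get()) is not None:
        total += sum(x * x for x in chunk)
    return total

chunks = [[float(i) for i in range(j, j + 4)] for j in range(0, 12, 4)]
q = queue.Queue(maxsize=2)   # small bound: prefetcher runs just ahead
t = threading.Thread(target=prefetch, args=(chunks, q))
t.start()
result = compute(q)
t.join()
print(result)                # -> 506.0, same answer as a plain serial loop
```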
From hahn at mcmaster.ca Thu Feb 9 11:44:18 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Thu, 9 Feb 2012 11:44:18 -0500 (EST) Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: References: <20120209114255.GC7343@leitl.org> Message-ID: > I am afraid to use again AMD CPUs. choosing chips should not be about fear. > I already have 3 Intel 7 2600 and One > Laptop i7 2630. I do algoritms that needs much powerfull processor. > I bought one AMD 8120, after 24 hours , I have switched for other 2600 K. > Unless AMD will make one powerfull Processor. I will continue using Intel. I'm curious whether you have any insight into what about your workload fared poorly on the AMD chip. these particular models are similar in their cache and clocks; I wonder whether your experience could be due to, for instance, code that spends all its time in cache misses (where temporal interleaving with HT might work well.) or whether the AMD chip was not adequately cooled, preventing it from scaling its clock. or whether your test was hurt by the now well-known module/L1 scheduling issue. specrateFP results seem to indicate AMD is not doing badly. of course, the Intel system is also more expensive. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Fri Feb 10 08:58:21 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 10 Feb 2012 14:58:21 +0100 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: <20120209114255.GC7343@leitl.org> References: <20120209114255.GC7343@leitl.org> Message-ID: <4A633EAC-CD78-44A7-9C82-455CAD09B89F@xs4all.nl> On Feb 9, 2012, at 12:42 PM, Eugen Leitl wrote: > > http://www.extremetech.com/computing/117377-engineers-boost-amd-cpu- > performance-by-20-without-overclocking It seems they used a GPGPU application and had the CPU speed up the GPGPU code by taking part in the computation. So the GPU doesn't help the CPU, and the article title has it backwards. It should be: engineers boost AMD GPU performance by 20% by having the CPU give a hand. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From james.p.lux at jpl.nasa.gov Fri Feb 10 12:00:02 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 10 Feb 2012 09:00:02 -0800 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: <4A633EAC-CD78-44A7-9C82-455CAD09B89F@xs4all.nl> Message-ID: Expecting headlines to be accurate is a fool's errand... Be glad it actually said AMD. It's bad enough that the article writers often erroneously summarize, but a still different person writes the headline. Back in print days, the headline had to "look nice". Today, it's probably more about SEO to drive traffic. But both lead to interesting headlines that don't necessarily reflect the content of the article.
On 2/10/12 5:58 AM, "Vincent Diepeveen" wrote: > >On Feb 9, 2012, at 12:42 PM, Eugen Leitl wrote: > >> >> http://www.extremetech.com/computing/117377-engineers-boost-amd-cpu- >> performance-by-20-without-overclocking > >Seems that they used a GPGPU application and had the cpu help speedup >the gpgpu by also helping to calculate. > >So the gpu doesn't help the cpu. So the article title is wrong. > >It should be : engineers boost AMD gpu performance by 20% by having >the CPU give a hand > >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From landman at scalableinformatics.com Fri Feb 10 12:08:54 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Fri, 10 Feb 2012 12:08:54 -0500 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: References: Message-ID: <4F354F26.2040103@scalableinformatics.com> On 02/10/2012 12:00 PM, Lux, Jim (337C) wrote: > Expecting headlines to be accurate is a fool's errand... > Be glad it actually said AMD. Expecting articles' contents to reflect in any reasonable way upon reality may be a similar problem. There are a few, precious few writers who really grok the technology because they live it: Doug Eadline, Jeff Layton, Henry Newman, Chris Mellor, Dan Olds, Rich Brueckner, ... . The vast majority of articles I've had some contact with the authors on (not in the above group) have been erroneous to the point of being completely non-informational.
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Fri Feb 10 12:48:08 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 10 Feb 2012 18:48:08 +0100 Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking In-Reply-To: <4F354F26.2040103@scalableinformatics.com> References: <4F354F26.2040103@scalableinformatics.com> Message-ID: Another interesting question is how a few cores cores would be able to speedup a typical single precision gpgpu application by 20%. That would means that the gpu is really slow, especially if we realize this is just 1 or 2 CPU cores or so. Your gpgpu code really has to kind of be not so very professional to have 2 cpu cores alraedy contribute some 20% to that. Most gpgpu codes here on a modern GPU you need about a 200+ cpu cores and that's usually codes which do not run optimal at gpu's, as it has to do with huge prime numbers, so simulating that at a 64 bits cpu is more efficient than a 32 bits gpu. So in their case the claim is that for their experiments, assuming 2 cpu cores, that would be 20%. Means we have a gpu that's 20x slower or so than a fermi at 512 cores/HD6970 @ 1536. 1536 / 20 = 76.8 gpu streamcores. That's AMD Processing Element count. for nvidia this is similar to 76.8 / 4 = 19.2 cores This laptop is from 2007, sure it is a macbookpro 17'' apple, has a core2 duo 2.4Ghz and has a Nvidia GT 8600M with 32 CUDA cores. 
So if we extrapolate back, the built in gpu is gonna kick that new AMD chip, right? Vincent On Feb 10, 2012, at 6:08 PM, Joe Landman wrote: > On 02/10/2012 12:00 PM, Lux, Jim (337C) wrote: >> Expecting headlines to be accurate is a fool's errand... >> Be glad it actually said AMD. > > Expecting articles contents to reflect in any reasonable way upon > reality may be a similar problem. There are a few, precious few > writers > who really grok the technology because they live it: Doug Eadline, > Jeff > Layton, Henry Newman, Chris Mellor, Dan Olds, Rich Brueckner, ... . > > The vast majority of articles I've had some contact with the > authors on > (not in the above group) have been erroneous to the point of being > completely non-informational. > > > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics Inc. > email: landman at scalableinformatics.com > web : http://scalableinformatics.com > http://scalableinformatics.com/sicluster > phone: +1 734 786 8423 x121 > fax : +1 866 888 3112 > cell : +1 734 612 4615 > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
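Vincent's back-of-the-envelope above, spelled out (his assumptions, not established figures: the 20% gain comes from about 2 CPU cores, and roughly 4 AMD processing elements correspond to one Nvidia CUDA core):

```python
# Vincent's extrapolation from the post above.
hd6970_stream_cores = 1536   # AMD processing elements on a Radeon HD 6970
slowdown_factor = 20         # if 2 CPU cores add 20%, the simulated GPU
                             # would be ~20x slower than a top part

equivalent_amd_pes = hd6970_stream_cores / slowdown_factor
print(round(equivalent_amd_pes, 1))        # -> 76.8

# ~4 AMD VLIW lanes per Nvidia CUDA core (rough rule of thumb):
equivalent_cuda_cores = equivalent_amd_pes / 4
print(round(equivalent_cuda_cores, 1))     # -> 19.2

# A 2007 MacBook Pro's GeForce 8600M GT has 32 CUDA cores -- more than that.
print(32 > equivalent_cuda_cores)          # -> True
```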
From Shainer at Mellanox.com Fri Feb 10 19:36:45 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Sat, 11 Feb 2012 00:36:45 +0000 Subject: [Beowulf] HPC Advisory Council and Swiss Supercomputing Centre to host HPC Switzerland Conference 2012 Message-ID: Sending on behalf of the HPC Advisory Council and the Swiss Supercomputing Centre Date: March 13-15, 2012 Location: Palazzo dei Congressi, Lugano, Switzerland The HPC Advisory Council and the Swiss Supercomputing Centre will host the HPC Advisory Council Switzerland Conference 2012 in the Lugano Convention Centre, Lugano, Switzerland, from March 13-15, 2012. The conference will focus on High-Performance Computing (HPC) education, hands-on and classroom training, and overviews of important new HPC developments and trends. The conference will include comprehensive education on topics such as high-performance and parallel I/O, communication libraries (such as MPI, SHMEM and PGAS), GPUs and accelerators, Big Data, high-performance cloud computing, and high-speed interconnects, and will include advanced topics and development for upcoming HPC technologies. In addition, attendees will receive hands-on training on clustering, networking, troubleshooting, tuning, and optimizations. For the complete agenda and schedule, please refer to the conference website - http://www.hpcadvisorycouncil.com/events/2012/Switzerland-Workshop/index.php. The fee for the 3-day conference is CHF 80.00. Registration is required and can be made at the HPC Advisory Council Switzerland Conference registration page. Media sponsorship and coverage is being provided by HPC-CH, insideHPC and Scientific Computing World. Thanks, Gilad
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eugen at leitl.org Thu Feb 16 10:26:53 2012 From: eugen at leitl.org (Eugen Leitl) Date: Thu, 16 Feb 2012 16:26:53 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right Message-ID: <20120216152653.GQ7343@leitl.org> http://www.fragmentationneeded.net/2011/12/pricing-and-trading-networks-down-is-up.html Pricing and Trading Networks: Down is Up, Left is Right My introduction to enterprise networking was a little backward. I started out supporting trading floors, backend pricing systems, low-latency algorithmic trading systems, etc... I got there because I'd been responsible for UNIX systems producing and consuming multicast data at several large financial firms. Inevitably, the firm's network admin folks weren't up to speed on matters of performance tuning, multicast configuration and QoS, so that's where I focused my attention. One of these firms offered me a job with the word "network" in the title, and I was off to the races. It amazes me how little I knew in those days. I was doing PIM and MSDP designs before the phrases "link state" and "distance vector" were in my vocabulary! I had no idea what was populating the unicast routing table of my switches, but I knew that the table was populated, and I knew what PIM was going to do with that data. More incredible is how my ignorance of "normal" ways of doing things (AVVID, SONA, Cisco Enterprise Architecture, multi-tier designs, etc...) gave me an advantage over folks who had been properly indoctrinated. My designs worked well for these applications, but looked crazy to the rest of the network staff (whose underperforming traditional designs I was replacing).
The trading floor is a weird place, with funny requirements. In this post I'm going to go over some of the things that make trading floor networking... Interesting. Redundant Application Flows The first thing to know about pricing systems is that you generally have two copies of any pricing data flowing through the environment at any time. Ideally, these two sets originate from different head-end systems, get transit from different wide area service providers, ride different physical infrastructure into opposite sides of your data center, and terminate on different NICs in the receiving servers. If you're getting data directly from an exchange, that data will probably be arriving as multicast flows. Redundant multicast flows. The same data arrives at your edge from two different sources, using two different multicast groups. If you're buying data from a value-add aggregator (Reuters, Bloomberg, etc...), then it probably arrives via TCP from at least two different sources. The data may be duplicate copies (redundancy), or be distributed among the flows with an N+1 load-sharing scheme. Losing One Packet Is Bad Most application flows have no problem with packet loss. High performance trading systems are not in this category. Think of the state of the pricing data like a spreadsheet. The rows represent securities -- something that traders buy and sell. The columns represent attributes of that security: bid price, ask price, daily high and low, last trade price, last trade exchange, etc... Our spreadsheet has around 100 columns and 200,000 rows. That's 20 million cells. Every message that rolls in from a multicast feed updates one of those cells. You just lost a packet. Which cell is wrong? Easy answer: All of them. If a trader can't trust his data, he can't trade. These applications have repair mechanisms, but they're generally slow and/or clunky. Some of them even involve touch tone.
Really: The Securities Industry Automation Corporation (SIAC) provides a retransmission capability for the output data from host systems. As part of this service, SIAC provides the AutoLink facility to assist vendors with requesting retransmissions by submitting requests over a touch-tone telephone set Reconvergence Is Bad Because we've got two copies of the data coming in. There's no reason to fix a single failure. If something breaks, you can let it stay broken until the end of the day. What's that? You think it's worth fixing things with a dynamic routing protocol? Okay cool, route around the problem. Just so long as you can guarantee that "flow A" and "flow B" never traverse the same core router. Why am I paying for two copies of this data if you're going to push it through a single device? You just told me that the device is so fragile that you feel compelled to route around failures! Don't Cluster the Firewalls The same reason we don't let routing reconverge applies here. If there are two pricing firewalls, don't tell them about each other. Run them as standalone units. Put them in separate rooms, even. We can afford to lose half of a redundant feed. We cannot afford to lose both feeds, even for the few milliseconds required for the standby firewall take over. Two clusters (four firewalls) would be okay, just keep the "A" and "B" feeds separate! Don't team the server NICs The flow-splitting logic applies all the way down to the servers. If they've got two NICs available for incoming pricing data, these NICs should be dedicated per-flow. Even if there are NICs-a-plenty, the teaming schemes are all bad news because like flows, application components are also disposable. It's okay to lose one. Getting one back? That's sometimes worse. Keep reading... Recovery Can Kill You Most of these pricing systems include a mechanism for data receivers to request retransmission of lost data, but the recovery can be a problem. 
With few exceptions, the network applications in use on the trading floor don't do any sort of flow control. It's like they're trying to hurt you.

Imagine a university lecture where a sleeping student wakes up, asks the lecturer to repeat the last 30 minutes, and the lecturer complies. That's kind of how these systems work.

Except that the lecturer complies at wire speed, and the whole lecture hall full of students is compelled to continue taking notes. Why should every other receiver be penalized because one system screwed up? I've got trades to clear!

The following snapshot is from the Cisco CVD for trading systems. It shows how aggressive these systems can be. A nominal 5Mb/s trading application regularly hits wire speed (100Mb/s) in this case.

The graph shows a small network when things are working right. A big trading backend at a large financial services firm can easily push that green line into the multi-gigabit range. Make things interesting by breaking stuff and you'll over-run even your best 10Gb/s switch buffers (6716 cards have 90MB per port) easily.

Slow Servers Are Good

Lots of networks run with clients deliberately connected at slower speeds than their servers. Maybe you have 10/100 ports in the wiring closet and gigabit-attached servers. Pricing networks require exactly the opposite. The lecturer in my analogy isn't just a single lecturer. It's a team of lecturers. They all go into wire-speed mode when the sleeping student wakes up.

How will you deliver multiple simultaneous gigabit-ish multicast streams to your access ports? You can't. I've fixed more than one trading system by setting server interfaces down to 100Mb/s or even 10Mb/s. Fast clients, slow servers is where you want to be.

Slowing down the servers can turn N*1Gb/s worth of data into N*100Mb/s -- something we can actually handle.

Bad Apple Syndrome

The sleeping student example is actually pretty common.
It's amazing to see the impact that can arise from things like:

- a clock update on a workstation
- ripping a CD with iTunes
- briefly closing the lid on a laptop

The trading floor is usually a population of Windows machines with users sitting behind them. Keeping these things from killing each other is a daunting task. One bad apple will truly spoil the bunch.

How Fast Is It?

System performance is usually measured in terms of stuff per interval. That's meaningless on the trading floor. The opening bell at NYSE is like turning on a fire hose. The only metric that matters is the answer to this question: Did you spill even one drop of water?

How close were you to the limit? Will you make it through tomorrow's trading day too? I read on Twitter that Ben Bernanke got a bad piece of fish for dinner. How confident are you now?

Performance of these systems is binary. You either survived or you did not. There is no "system is running slow" in this world.

Routing Is Upside Down

While not unique to trading floors, we do lots of multicast here. Multicast is funny because it relies on routing traffic away from the source, rather than routing it toward the destination. Getting into and staying in this mindset can be a challenge. I started out with no idea how routing worked, so had no problem getting into the multicast mindset :-)

NACK not ACK

Almost every network protocol relies on data receivers ACKnowledging their receipt of data. But not here. Pricing systems only speak up when something goes missing.

QoS Isn't The Answer

QoS might seem like the answer to making sure we get through the day smoothly, but it's not. In fact, it can be counterproductive.

QoS is about managed un-fairness... choosing which packets to drop. But pricing systems are usually deployed on dedicated systems with dedicated switches. Every packet is critical, and there's probably more of them than we can handle. There's nothing we can drop.
Making matters worse, enabling QoS on many switching platforms reduces the buffers available to our critical pricing flows, because the buffers necessarily get carved up so that they can be allocated to different kinds of traffic. It's counterintuitive, but 'no mls qos' is sometimes the right thing to do.

Load Balancing Ain't All It's Cracked Up To Be

By default, CEF doesn't load balance multicast flows. CEF load balancing of multicast can be enabled and enhanced, but it doesn't happen out of the box.

We can get screwed on EtherChannel links too: sometimes these quirky applications intermingle unicast data with the multicast stream. Perhaps a latecomer to the trading floor wants to start watching Cisco's stock price. Before he can begin, he needs all 100 cells associated with CSCO. This is sometimes called the "Initial Image." He ignores updates for CSCO until he's got that starting point loaded up.

CSCO has updated 9000 times today, so the server unicasts the initial image: "Here are all 100 cells for CSCO as of update #9000: blah blah blah...". Then the price changes, and the server multicasts update #9001 to all receivers.

If there's a load-balanced path (either CEF or an aggregate link) between the server and client, then our new client could get update #9001 (multicast) before the initial image (unicast) shows up. The client will discard update #9001 because he's expecting a full record, not an update to a single cell. Next, the initial image shows up, and the client knows he's got everything through update #9000. Then update #9002 arrives. Hey, what happened to #9001?

Post-mortem analysis of these kinds of incidents will boil down to the software folks saying:

    We put the messages on the wire in the correct order. They were delivered by the network in the wrong order.

ARP Times Out

NACK-based applications sit quietly until there's a problem. So quietly that they might forget the hardware address associated with their gateway or with a neighbor. No problem, right?
ARP will figure it out... eventually. Because these are generally UDP-based applications without flow control, the system doesn't fire off a single packet, then sit and wait like it might when talking TCP. No, these systems can suddenly kick off a whole bunch of UDP datagrams destined for a system they haven't talked to in hours.

The lower layers in the IP stack need to hold onto these packets until the ARP resolution process is complete. But the packets keep rolling down the stack! The outstanding ARP queue is only 1 packet deep in many implementations. The queue overflows and data is lost. It's not strictly a network problem, but don't worry. Your phone will ring.

Losing Data Causes You to Lose Data

There's a nasty failure mode underlying the NACK-based scheme. Lost data will be retransmitted. If you couldn't handle the data flow the first time around, why expect to handle wire-speed retransmission of that data on top of the data that's coming in the next instant?

If the data loss was caused by a Bad Apple receiver, then all his peers suffer the consequences. You may have many bad apples in a moment. One bad apple will spoil the bunch.

If the data loss was caused by an overloaded network component, then you're rewarded by compounding increases in packet rate. The exchanges don't stop trading, and the data sources have a large queue of data to re-send. TCP applications slow down in the face of congestion. Pricing applications speed up.

Packet Decodes Aren't Available

Some of the wire formats you'll be dealing with are closed-source secrets. Others are published standards for which no Wireshark decodes are publicly available. Either way, you're pretty much on your own when it comes to analysis.

Updates

Responding to Will's question about data sources: the streams come from the various exchanges (NASDAQ, NYSE, FTSE, etc...).
Because each of these exchanges uses its own data format, there are usually some layers of processing required to get them into a common format for application consumption. This processing can happen at a value-add data distributor (Reuters, Bloomberg, Activ), or it can be done in-house by the end user. Local processing has the advantage of lower latency, because you don't have to have the data shipped from the exchange to a middleman before you see it.

Other streams come from application components within the company. There are usually some layers of processing (between 2 and 12) between a pricing update first hitting your equipment and when that update is consumed by a trader. The processing can include format changes, addition of custom fields, delay engines (delayed data can be given away for free), vendor-switch systems (I don't trust data vendor "A", switch me to "B"), etc...

Most of those layers are going to be multicast, and they're going to be the really dangerous ones, because the sources can clobber you with LAN speeds, rather than WAN speeds.

As far as getting the data goes, you can move your servers into the exchange's facility for low-latency access (some exchanges actually provision the same length of fiber to each colocated customer, so that nobody can claim a latency disadvantage), you can provision your own point-to-point circuit for data access, you can buy a fat local loop from a financial network provider like BT/Radianz (probably MPLS on the back end, so that one local loop can get you to all your pricing and clearing partners), or you can buy the data from a value-add aggregator like Reuters or Bloomberg.

Responding to Will's question about SSM: I've never seen an SSM pricing component. They may be out there, but they might not be a super good fit. Here's why: everything in these setups is redundant, all the way down to software components. It's redundant in ways we're not used to seeing in enterprises. No load-balancer required here.
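For reference, an any-source multicast receiver names only the group, never a server. A minimal join in socket terms looks like this (the group and port below are placeholders, not a real feed's values):

```python
import socket
import struct

GROUP, PORT = "239.192.0.1", 5000   # placeholder group/port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# IP_ADD_MEMBERSHIP: "send me this group's traffic from *any* source" (ASM).
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
try:
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
except OSError:
    pass  # no multicast-capable interface in this environment

# An SSM join (IP_ADD_SOURCE_MEMBERSHIP) would additionally have to name
# the sender's address here -- exactly the thing these applications
# never learn about their servers.
```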
The software components collaborate and share workload dynamically. If one ticker plant fails, his partner knows what update was successfully transmitted by the dead peer, and takes over from that point. Consuming systems don't know who the servers are, and don't care. A server could be replaced at any moment.

In fact, it's not just downstream pricing data that's multicast. Many of these systems use a model where the clients don't know who the data sources are. Instead of sending requests to a server, they multicast their requests for data, and the servers multicast the replies back. Instead of:

    hello server, nice to meet you. I'd like such-and-such.

it's actually:

    hello? servers? I'd like such-and-such! I'm ready, so go ahead and send it whenever...

Not knowing who your server is kind of runs counter to the SSM ideal. It could be done with a pool of servers; I've just never seen it.

The exchanges are particularly slow-moving when it comes to changing things. The modern exchange feeds, particularly ones like the "touch tone" example I cited, are literally ticker-tape punch signals wrapped up in an IP multicast header.

The old school scheme was to have a ticker tape machine hooked to a "line" from the exchange. Maybe you'd have two of them (A and B again). There would be a third one for retransmit. Ticker machine ran out of paper? Call the exchange, and here's more-or-less what happens:

- They cut the chunk of paper containing the updates you missed out of their spool of tape. Scissors are involved here.
- They grab a bit of header tape that says: "this is retransmit data for XYZ Bank".
- They tape these two pieces of paper together, and feed them through a reader that's attached to the "retransmit line".
- Every bank in New York will get the retransmits, but they'll know to ignore them.
- XYZ Bank clips the retransmit data out of the retransmit ticker machine, and pastes it into place on the end where the machine ran out of paper.

These terms "tick", "line", "retransmit", etc...
all still apply with modern IP-based systems. I've read the developer guides for these systems (to write Wireshark decodes), and it's like a trip back in time. Some of these systems are still so closely coupled to the paper-punch system that you get chads all over the floor and paper cuts all over your hands just from reading the API guide :-)

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Thu Feb 16 11:26:08 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 16 Feb 2012 17:26:08 +0100
Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right
In-Reply-To: <20120216152653.GQ7343@leitl.org>
References: <20120216152653.GQ7343@leitl.org>
Message-ID:

Yes, very good article. In fact it's even clumsier than most would guess.

For those who didn't get the problem in the article: there are two datafeeds, A and B, that ship 'market data' at most exchanges. Basically all the big exchanges work similarly there. Especially the derivatives exchanges; most of them have a very similar protocol, and that's the only spot where you really can make money now based upon speed.

The market data is lists of what price you can get something for (say a Mellanox future; MLNX is its symbol), and what price it sells for. It's an incremental update, however, and the datafeeds are raw UDP, not TCP. TCP, as we know, is about the only protocol where loss gets repaired for you, so the raw format poses a big problem there. On paper it is indeed possible to ask for retransmission on a different channel, but in case of market surges that's gonna be too slow, of course. You can't simply use the other feed either, as one of the two feeds A and B is gonna be a lot faster than the other.
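For what it's worth, feed handlers commonly deal with the two lines by arbitration rather than failover: deliver whichever copy of each sequence number arrives first and drop the late twin, so the handler runs at the speed of the faster line at every instant. A toy sketch (message fields hypothetical):

```python
# Line arbitration between redundant A and B feeds: first copy of each
# sequence number wins, the late duplicate is discarded.

class Arbiter:
    def __init__(self):
        self.seen = set()       # real handlers use a bounded window, not
        self.delivered = []     # an ever-growing set

    def on_message(self, line, seq, payload):
        if seq in self.seen:
            return              # the slower line's copy: drop it
        self.seen.add(seq)
        self.delivered.append(payload)

arb = Arbiter()
arb.on_message("A", 1, "bid CSCO 18.25")
arb.on_message("B", 1, "bid CSCO 18.25")   # late duplicate, ignored
arb.on_message("B", 2, "ask CSCO 18.27")   # B delivers this one first
assert arb.delivered == ["bid CSCO 18.25", "ask CSCO 18.27"]
```

A side effect: a packet lost on one line is silently covered by the other, with no retransmission needed, as long as both lines don't lose the same sequence number.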
Even for the very slow IBM software, which you can publicly buy if you google for it, their twitter claims a total processing speed for the 'market data' of around 7-11 us (microseconds). The trading itself (buying and selling) usually happens over a TCP connection.

Of course you can forget trading from a platform or computer that's not at the exchange itself. The latency of receiving data from a datacenter with an ocean in between, as we know, is measured in milliseconds, a factor 1000 slower than microseconds. So trading like that won't make you much of a profit, except if you use it to compare two different exchanges with each other and try to profit from that. That, however, basically requires you to be a billion-dollar company, as you need quite some infrastructure for that to be fast.

Speaking of big cash being an advantage: some exchanges offer, if you pay really big cash, a faster connection (10 gigabit, versus 1 gigabit for cheap dollars). Most traders won't be able to pay for that big connection. So it's funny that different exchanges get mentioned here, as a small trader with a local machine is only fast enough for one exchange.

But now the weirdest thing: I offered at different spots to write sub-microsecond software to parse that market data, but it seems no one wants to hire for that; they just look for Java coders. Morgan Chase is an example: of all their job offers, which I receive daily, 99% are Java jobs. Java is of course TOO SLOW for working with trading data. Much better is C/C++, with assembler after you've already achieved that < 1 microsecond in C.

Note that most of the FPGAs that are advertised cost millions, and the only latency quote I saw there is 2 microseconds, which also sounds a bit slow to me, but well. Furthermore, they usually hire FINANCIALS who happen to be able to program.
They are so, so behind in this world - not seldom because many of the IT managers I've spoken with at financial companies hardly have a high-school degree - and they don't take any risk. What's the risk of paying one programmer to make a faster framework for you? We're speaking of hundreds-of-millions to billion-dollar companies now which don't take that risk.

Speed is of course everything. It's an NSA-style game now, and I keep wondering why only a few hedge funds hire such persons. The majority of the traders over here really run so far behind; you would be very shocked if you realized how much profit they do not make because of this.

Most importantly - and now I'm gonna say something most on this list understand and which is nearly impossible to explain to financial guys - being very fast removes a worst case. The odds you go bankrupt are A LOT SMALLER during market surges, and you lose less during surges your algorithms didn't predict.

Speaking of algorithms: the word "algorithm" in the financial world is too heavy a word. "Gibberish" is a better wording. But well, I say that last of course as someone who has made pretty complex algorithms for patterns in computer chess - it also took me 15 years to learn how to do that.

Please note that it's wishful thinking to guess anything would change in how trading happens at exchanges - if one nation modified something, traders would just move to a different exchange. Right now CME (Chicago) is the biggest derivatives market. It seems they're trying to create a bigger one in Europe. I'm pretty sure I don't betray any banking secrecy code if I call them very clever. If they learn one day what a computer is, that is.

And as zero of the traders will *ever* in their lives be busy 'improving' the system, of course zero politicians have any clue what happens over there and what sort of NSA-style race for speed it has become. Though I'm very good at that, I'm not sure whether I like it.
During a congressional hearing in the US a year or two ago, one academic who clearly realized the problem stated that he wanted to introduce a penalty directly after a trade, as he had figured out that some traders trade something like 200 times a second in the same Mellanox future (the same "instrument", as this is called in traders' terminology), to keep using the same example. He wanted to 'solve' that problem by introducing a rule that after trading in an instrument, one would need to wait another 100 milliseconds to trade again in that instrument. A guy from Tradeworx hammered that away, saying it would hurt liquidity at the market.

Note such academic solutions do not solve the fundamental problem: if someone goes first and is 1 picosecond faster, he's the one allowed to buy that instrument at the price it was offered for. That there is a delay afterwards doesn't solve that fundamental problem. Furthermore, you kind of chase away traders, and exchanges don't like that.

In the meantime, exchanges are upgrading their hardware and moving to new datacenters. Some already migrated in past years. So any discussion in politics here is already totally outdated, as the datacenters got way faster. FTSE, for example, announced that their total processing time has been reduced to just above 100 microseconds, and migrated to InfiniBand. More are to follow there.

On Feb 16, 2012, at 4:26 PM, Eugen Leitl wrote:

> http://www.fragmentationneeded.net/2011/12/pricing-and-trading-networks-down-is-up.html
>
> Pricing and Trading Networks: Down is Up, Left is Right
>
> My introduction to enterprise networking was a little backward. I started out supporting trading floors, backend pricing systems, low-latency algorithmic trading systems, etc... I got there because I'd been responsible for UNIX systems producing and consuming multicast data at several large financial firms.
> > Inevitably, the firm's network admin folks weren't up to speed on > matters of > performance tuning, multicast configuration and QoS, so that's where I > focused my attention. One of these firms offered me a job with the > word > "network" in the title, and I was off to the races. > > It amazes me how little I knew in those days. I was doing PIM and MSDP > designs before the phrases "link state" and "distance vector" were > in my > vocabulary! I had no idea what was populating the unicast routing > table of my > switches, but I knew that the table was populated, and I knew what > PIM was > going to do with that data. > > More incredible is how my ignorance of "normal" ways of doing > things (AVVID, > SONA, Cisco Enterprise Architecture, multi-tier designs, etc...) > gave me an > advantage over folks who had been properly indoctrinated. My > designs worked > well for these applications, but looked crazy to the rest of the > network > staff (whose underperforming traditional designs I was replacing). > > The trading floor is a weird place, with funny requirements. In > this post I'm > going to go over some of the things that make trading floor > networking... > Interesting. > > Redundant Application Flows > > The first thing to know about pricing systems is that you generally > have two > copies of any pricing data flowing through the environment at any > time. > Ideally, these two sets originate from different head-end systems, get > transit from different wide area service providers, ride different > physical > infrastructure into opposite sides of your data center, and > terminate on > different NICs in the receiving servers. > > If you're getting data directly from an exchange, that data will > probably be > arriving as multicast flows. Redundant multicast flows. The same > data arrives > at your edge from two different sources, using two different multicast > groups. 
> > If you're buying data from a value-add aggregator (Reuters, Bloomberg, > etc...), then it probably arrives via TCP from at least two different > sources. The data may be duplicate copies (redundancy), or be > distributed > among the flows with an N+1 load-sharing scheme. > > Losing One Packet Is Bad > > Most application flows have no problem with packet loss. High > performance > trading systems are not in this category. > > Think of the state of the pricing data like a spreadsheet. The rows > represents a securities -- something that traders buy and sell. The > columns > represent attributes of that security: bid price, ask price, daily > high and > low, last trade price, last trade exchange, etc... > > Our spreadsheet has around 100 columns and 200,000 rows. That's 20 > million > cells. Every message that rolls in from a multicast feed updates > one of those > cells. You just lost a packet. Which cell is wrong? Easy answer: > All of them. > If a trader can't trust his data, he can't trade. > > These applications have repair mechanisms, but they're generally > slow and/or > clunky. Some of them even involve touch tone. Really: > > The Securities Industry Automation Corporation (SIAC) provides a > retransmission capability for the output data from host systems. > As part of > this service, SIAC provides the AutoLink facility to assist vendors > with > requesting retransmissions by submitting requests over a touch-tone > telephone > set > > Reconvergence Is Bad > > Because we've got two copies of the data coming in. There's no > reason to fix > a single failure. If something breaks, you can let it stay broken > until the > end of the day. > > What's that? You think it's worth fixing things with a dynamic routing > protocol? Okay cool, route around the problem. Just so long as you can > guarantee that "flow A" and "flow B" never traverse the same core > router. Why > am I paying for two copies of this data if you're going to push it > through a > single device? 
You just told me that the device is so fragile that > you feel > compelled to route around failures! > > Don't Cluster the Firewalls > > The same reason we don't let routing reconverge applies here. If > there are > two pricing firewalls, don't tell them about each other. Run them as > standalone units. Put them in separate rooms, even. We can afford > to lose > half of a redundant feed. We cannot afford to lose both feeds, even > for the > few milliseconds required for the standby firewall take over. Two > clusters > (four firewalls) would be okay, just keep the "A" and "B" feeds > separate! > > Don't team the server NICs > > The flow-splitting logic applies all the way down to the servers. > If they've > got two NICs available for incoming pricing data, these NICs should be > dedicated per-flow. Even if there are NICs-a-plenty, the teaming > schemes are > all bad news because like flows, application components are also > disposable. > It's okay to lose one. Getting one back? That's sometimes worse. Keep > reading... > > Recovery Can Kill You > > Most of these pricing systems include a mechanism for data > receivers to > request retransmission of lost data, but the recovery can be a > problem. With > few exceptions, the network applications in use on the trading > floor don't do > any sort of flow control. It's like they're trying to hurt you. > > Imagine a university lecture where a sleeping student wakes up, > asks the > lecturer to repeat the last 30 minutes, and the lecturer complies. > That's > kind of how these systems work. > > Except that the lecturer complies at wire speed, and the whole > lecture hall > full of students is compelled to continue taking notes. Why should > the every > other receiver be penalized because one system screwed up? I've got > trades to > clear! > > The following snapshot is from the Cisco CVD for trading systems. > it shows > how aggressive these systems can be. 
A nominal 5Mb/s trading > application > regularly hits wire-speed (100Mb/s) in this case. > > The graph shows a small network when things are working right. A > big trading > backend at a large financial services firm can easily push that > green line > into the multi-gigabit range. Make things interesting by breaking > stuff and > you'll over-run even your best 10Gb/s switch buffers (6716 cards > have 90MB > per port) easily. > > Slow Servers Are Good > > Lots of networks run with clients deliberately connected at slower > speeds > than their server. Maybe you have 10/100 ports in the wiring closet > and > gigabit-attached servers. Pricing networks require exactly the > opposite. The > lecturer in my analogy isn't just a single lecturer. It's a team of > lecturers. They all go into wire-speed mode when the sleeping > student wakes > up. > > How will you deliver multiple simultaneous gigabit-ish multicast > streams to > your access ports? You can't. I've fixed more than one trading > system by > setting server interfaces down to 100Mb/s or even 10Mb/s. Fast > clients, slow > servers is where you want to be. > > Slowing down the servers can turn N*1Gb/s worth of data into > N*100Mb/s -- > something we can actually handle. > > Bad Apple Syndrome > > The sleeping student example is actually pretty common. It's > amazing to see > the impact that can arise from things like: > > a clock update on a workstation > > ripping a CD with iTunes > > briefly closing the lid on a laptop > > The trading floor is usually a population of Windows machines with > users > sitting behind them. Keeping these things from killing each other is a > daunting task. One bad apple will truly spoil the bunch. > > How Fast Is It? > > System performance is usually measured in terms of stuff per > interval. That's > meaningless on the trading floor. The opening bell at NYSE is like > turning on > a fire hose. 
The only metric that matters is the answer to this > question: Did > you spill even one drop of water? > > How close were you to the limit? Will you make it through > tomorrow's trading > day too? > > I read on twitter that Ben Bernanke got a bad piece of fish for > dinner. How > confident are you now? Performance of these systems is binary. You > either > survived or you did not. There is no "system is running slow" in > this world. > > Routing Is Upside Down > > While not unique to trading floors, we do lots of multicast here. > Multicast > is funny because it relies on routing traffic away from the source, > rather > than routing it toward the destination. Getting into and staying in > this > mindset can be a challenge. I started out with no idea how routing > worked, so > had no problem getting into the multicast mindset :-) > > NACK not ACK > > Almost every network protocol relies on data receivers > ACKnowledging their > receipt of data. But not here. Pricing systems only speak up when > something > goes missing. > > QoS Isn't The Answer > > QoS might seem like the answer to make sure that we get through the > day > smoothly, but it's not. In fact, it can be counterproductive. > > QoS is about managed un-fairness... Choosing which packets to drop. > But > pricing systems are usually deployed on dedicated systems with > dedicated > switches. Every packet is critical, and there's probably more of > them than we > can handle. There's nothing we can drop. > > Making matters worse, enabling QoS on many switching platforms > reduces the > buffers available to our critical pricing flows, because the buffers > necessarily get carved so that they can be allocated to different > kinds of > traffic. It's counter intuitive, but 'no mls qos' is sometimes the > right > thing to do. > > Load Balancing Ain't All It's Cracked Up To Be > > By default, CEF doesn't load balance multicast flows. 
CEF load > balancing of > multicast can be enabled and enhanced, but doesn't happen out of > the box. > > We can get screwed on EtherChannel links too: Sometimes these quirky > applications intermingle unicast data with the multicast stream. > Perhaps a > latecomer to the trading floor wants to start watching Cisco's > stock price. > Before he can begin, he needs all 100 cells associated with CSCO. > This is > sometimes called the "Initial Image." He ignores updates for CSCO > until he's > got the that starting point loaded up. > > CSCO has updated 9000 times today, so the server unicasts the > initial image: > "Here are all 100 cells for CSCO as of update #9000: blah blah > blah...". Then > the price changes, and the server multicasts update #9001 to all > receivers. > > If there's a load balanced path (either CEF or an aggregate link) > between the > server and client, then our new client could get update 9001 > (multicast) > before the initial image (unicast) shows up. The client will > discard update > 9001 because he's expecting a full record, not an update to a > single cell. > > Next, the initial image shows up, and the client knows he's got > everything > through update #9000. Then update #9002 arrives. Hey, what happened > to #9001? > > Post-mortem analysis of these kinds of incidents will boil down to the > software folks saying: > > We put the messages on the wire in the correct order. They were > delivered > by the network in the wrong order. > > ARP Times Out > > NACK-based applications sit quietly until there's a problem. So > quietly that > they might forget the hardware address associated with their > gateway or with > a neighbor. > > No problem, right? ARP will figure it out... Eventually. Because > these are > generally UDP-based applications without flow control, the system > doesn't > fire off a single packet, then sit and wait like it might when > talking TCP. 
> No, these systems can suddenly kick off a whole bunch of UDP datagrams > destined for a system it hasn't talked to in hours. > > The lower layers in the IP stack need to hold onto these packets > until the > ARP resolution process is complete. But the packets keep rolling > down the > stack! The outstanding ARP queue is only 1 packet deep in many > implementations. The queue overflows and data is lost. It's not > strictly a > network problem, but don't worry. Your phone will ring. > > Losing Data Causes You to Lose Data > > There's a nasty failure mode underlying the NACK-based scheme. Lost > data will > be retransmitted. If you couldn't handle the data flow the first > time around, > why expect to handle wire speed retransmission of that data on top > of the > data that's coming in the next instant? > > If the data loss was caused by a Bad Apple receiver, then all his > peers > suffer the consequences. You may have many bad apples in a moment. > One Bad > Apple will spoil the bunch. > > If the data loss was caused by an overloaded network component, > then you're > rewarded by compounding increases in packet rate. The exchanges > don't stop > trading, and the data sources have a large queue of data to re-send. > > TCP applications slow down in the face of congestion. Pricing > applications > speed up. > > Packet Decodes Aren't Available > > Some of the wire formats you'll be dealing with are closed-source > secrets. > Others are published standards for which no WireShark decodes are > publicly > available. Either way, you're pretty much on your own when it comes to > analysis. > > Updates > > Responding to Will's question about data sources: The streams come > from the > various exchanges (NASDAQ, NYSE, FTSE, etc...) Because each of these > exchanges use their own data format, there's usually some layers of > processing required to get them into a common format for application > consumption. 
This processing can happen at a value-add data > distributor > (Reuters, Bloomberg, Activ), or it can be done in-house by the end > user. > Local processing has the advantage of lower latency because you > don't have to > have the data shipped from the exchange to a middleman before you > see it. > > Other streams come from application components within the company. > There are > usually some layers of processing (between 2 and 12) between a > pricing update > first hitting your equipment, and when that update is consumed by a > trader. > The processing can include format changes, addition of custom > fields, delay > engines (delayed data can be given away for free), vendor-switch > systems (I > don't trust data vendor "A", switch me to "B"), etc... > > Most of those layers are going to be multicast, and they're going > to be the > really dangerous ones, because the sources can clobber you with LAN > speeds, > rather than WAN speeds. > > As far as getting the data goes, you can move your servers into the > exchange's facility for low-latency access (some exchanges actually > provision > the same length of fiber to each colocated customer, so that nobody > can claim > a latency disadvantage), you can provision your own point-to-point > circuit > for data access, you can buy a fat local loop from a financial network > provider like BT/Radianz (probably MPLS on the back end so that one > local > loop can get you to all your pricing and clearing partners), or you > can buy > the data from a value-add aggregator like Reuters or Bloomberg. > > Responding to Will's question about SSM: I've never seen an SSM > pricing > component. They may be out there, but they might not be a super > good fit. > Here's why: Everything in these setups is redundant, all the way > down to > software components. It's redundant in ways we're not used to > seeing in > enterprises. No load-balancer required here. The software components > collaborate and share workload dynamically. 
If one ticker plant > fails, his > partner knows what update was successfully transmitted by the dead > peer, and > takes over from that point. Consuming systems don't know who the > servers are, > and don't care. A server could be replaced at any moment. > > In fact, it's not just downstream pricing data that's multicast. > Many of > these systems use a model where the clients don't know who the data > sources > are. Instead of sending requests to a server, they multicast their > requests > for data, and the servers multicast the replies back. Instead of: > > hello server, nice to meet you. I'd like such-and- > such. > > it's actually: > > hello? servers? I'd like such-and-such! I'm ready, so go ahead > and send > it whenever... > > Not knowing who your server is kind of runs counter to the SSM > ideal. It > could be done with a pool of servers, I've just never seen it. > > The exchanges are particularly slow-moving when it comes to > changing things. > The modern exchange feed, particularly ones like the "touch tone" > example I > cited are literally ticker-tape punch signals wrapped up in an IP > multicast > header. > > The old school scheme was to have a ticker tape machine hooked to a > "line" > from the exchange. Maybe you'd have two of them (A and B again). > There would > be a third one for retransmit. Ticker machine run out of paper? > Call the > exchange, and here's more-or-less what happens: > > Cut the chunk of paper containing the updates you missed out of > their > spool of tape. Scissors are involved here. > > Grab a bit of header tape that says: "this is retransmit data > for XYZ > Bank". > > Tape these two pieces of paper together, and feed them through > a reader > that's attached to the "retransmit line" > > Every bank in New York will get the retransmits, but they'll > know to > ignore them. > > XYZ Bank clips the retransmit data out of the retransmit ticker > machine, > and pastes it into place on the end where the machine ran out of > paper. 
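The splice-and-paste procedure above is exactly what a modern NACK-based receiver does with sequence numbers. A minimal sketch of the idea in Python; the class, field names, and NACK list are hypothetical, not any particular vendor's API:

```python
class NackReceiver:
    """Deliver in-sequence updates; NACK any gap (the modern 'retransmit line')."""

    def __init__(self, initial_image_seq):
        # The unicast Initial Image told us we're complete through this update.
        self.expected = initial_image_seq + 1
        self.nacks = []  # sequence ranges we've asked the source to re-send

    def on_update(self, seq, payload):
        if seq < self.expected:
            return None  # duplicate or stale update: ignore it
        if seq > self.expected:
            # Gap detected, e.g. #9001 lost: request exactly the missing range.
            self.nacks.append((self.expected, seq - 1))
        self.expected = seq + 1
        return payload  # real code would buffer updates until the gap fills

rx = NackReceiver(initial_image_seq=9000)
rx.on_update(9002, "CSCO cell 37 changed")  # 9001 never arrived
print(rx.nacks)                             # [(9001, 9001)]
```

The point of the sketch is the failure mode discussed above: success generates no traffic at all, while loss is what generates (retransmit) traffic.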
> These terms "tick," "line," "retransmit," etc. all still apply with
> modern IP-based systems. I've read the developer guides for these
> systems (to write Wireshark decodes), and it's like a trip back in
> time. Some of these systems are still so closely coupled to the
> paper-punch system that you get chads all over the floor and paper
> cuts all over your hands just from reading the API guide :-)

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Thu Feb 16 16:29:22 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 16 Feb 2012 13:29:22 -0800 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: <20120216152653.GQ7343@leitl.org> Message-ID:

They don't hire their High Frequency Trading software people from ads. Personal recommendations, more likely. The ads for Java coders are for run-of-the-mill back-end banking stuff. Most banks are doing their enterprise-scale work in Java (which is replacing PowerBuilder, RPG, and COBOL). Or Java interfaces to a SQL backend.

I don't know of any million-dollar FPGAs. Even space-qualified big Xilinx parts are about a tenth of that.

Modern FPGAs could do substantially better than 1 microsecond latency. They have multi-Gbps interfaces on chip (e.g. Rocket I/O) and external clocks in the hundreds of MHz range.
Now, if you're doing store-and-forward routing of 1000-bit packets on a 1 Gbps wire, of course you're going to have 1 microsecond latency.

-----Original Message----- From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Vincent Diepeveen Sent: Thursday, February 16, 2012 8:26 AM To: Eugen Leitl Cc: tt at postbiota.org; Beowulf Mailing List; forkit! Subject: Re: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right

Yes, very good article.

1) But now the weirdest thing - I offered at several shops to write sub-microsecond software to parse that market data, but it seems no one wants to hire for that - they just look for Java coders.

An example is Morgan Chase. Of all their job offers, which I receive daily, 99% are Java jobs.

---------------- Java is of course TOO SLOW for working with trading data. Much better is C/C++, with assembler once you've already achieved < 1 microsecond in C.

Note that most of the FPGA solutions that are advertised cost millions, and the only latency quote I saw there is 2 microseconds, which also sounds a bit slow to me, but well.

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Fri Feb 17 04:11:29 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 17 Feb 2012 10:11:29 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: <20120216152653.GQ7343@leitl.org> Message-ID: <63FF25A2-52FD-42FC-9C12-04188E68E2C8@xs4all.nl>

On Feb 16, 2012, at 10:29 PM, Lux, Jim (337C) wrote:

> They don't hire their High Frequency Trading software people from
> ads. Personal recommendations, more likely.
> The ads for Java coders are for run of the mill back end banking
> stuff. Most banks are doing their enterprise scale work in Java
> (which is replacing PowerBuilder, RPG, and COBOL). Or Java
> interfaces to a SQL backend.

Most platforms have been written entirely in Java - there are many platforms, and I recently read an estimate that the platforms account for 80% of total trading volume - what happens inside the platforms didn't even get counted. If your entire code base is already in Java, it's the logical choice for a newcomer to build the platform in as well. Yet it directly makes you a loser. A good example of such a platform in the Netherlands - though there are a lot more - is Flowtraders. They hire exclusively Java coders.

Nearly all the NSA-level programmers hardly bother with object orientation (that's dead slow of course; with code many layers deep, no compiler can make sense of it).

I bet those using those platforms just lose money nowadays. It must be very exceptional to find someone who made a profit there - certainly not in a systematic manner. Yet of course the platform owners make cash from the fixed fee on each transaction - so they cheer loudly.

In fact, nearly all big financial institutions have such a platform that will make sure you lose money, as the only way to win some is longer-term trading. "Long term" here means trading that takes longer than a few seconds - say, buy something in the morning and sell it in the afternoon.
Getting a FPGA a tad higher clocked from Xilinx, say the first sample at 22 nm, is probably not cheap either. Being 1 Mhz higher clocked with your FPGA there than the competition is worth tens of millions of course. Price doesn't matter at this area; the hedgefunds who are convinced invest major cash into being faster. > Modern FPGAs could do substantially better than 1 microsecond > latency. They have multiGbps interfaces on chip (e.g. Rocket I/O) > and external clocks in the hundreds of MHz range. Now, if you're > doing store and forward routing of 1000 bit packets on a 1Gbps > wire, of course you're going to have 1 microsecond latency. > Actually most trading software that you can hire for tens of thousands a month has latencies closer to 500 microseconds to parse some data and then give the order to do a trade. Only recently they try to speed it up. Majority trades at 100+ microseconds. Of course NIC latencies are not counted here - we just speak about latencies you suffer at the computer. And sure this is at very high clocked Nehalem processors. By now i guess many will have moved to IBM software as that's mass software that's offered for just a 100k dollar a year or so, and claims, last time i checked, around a 7 to 11 microseconds. This was tested by themselves, so not during a surge. In fact whatever fast latency you claim is useless, it's all about the surge latency of course. The Exchanges measure surges in intervals of 50 milliseconds. So to give an example Bernanke coughs loud when the word 'US overspending', which currently is approaching 50% of total income of the US government is (income projected 2644 billion, spendings far over 3600 billion). No way to fix that. Obama's trick to express it as a percentage of GDP, i'm sure financial world is total ignoring that. This makes the markets extremely volatile though. From traders viewpoint that's a long term consideration which todays exchanges won't reflect of course. 
Yet it means that if 1 sentence gets said, that for a second suddenly you'll see huge surge in the market. The external platforms usually get blocked out so you can have up to 15 minutes delay for your query to reach the market, if you're on an external platform, as you can see in some analysis; any ticker you'll see or reflection of what's going on is gonna be minutes behind - we saw this clearly during the flash crash. Only the datacenter of the exchange itself was accurate and anyone far away from that had to wait for minutes because of all the traffic jams caused by massive trading volumes. In short trading from external platforms during a crash is the silliest thing possible. You need a box inside the exchange - or you better be prepared to be a loser when trading in derivatives. So just to avoid you from losing all your money, investing big cash into being the fastest, it is worth it. For some of you, they are maybe a bit surprised i just speak about futures here and not other tradeables. That's for a simple reason, the derivatives market has expanded major league and is probably the only exchange where, if you're having fast hardware and fast software, can still make good money. Hope this doesn't get as a shock for you, but if you bought a year or 70 ago the Dow Jones Industrial, same by the way for others, and just did do nothing for 70 years then by 2005 or so, you would've had an average profit of 12% a year roughly. From which far over 7% is indexation and nearly 5% is dividend. Most financials however found that 12% not good enough as they wanted to perform above the market average (indexation). It was common to see no one got hired who couldn't bring home 20%+, that's what they wanted. The stock market is simply moving total horizontal now, they no longer can make that 12% now; also it's much harder to sell them, whereas futures are easy to sell and buy. 
There is other derivatives as well, and i see on TV financials each time advertize other derivatives, yet in reality the markets mainly trades futures, things like spreads and swaps are less popular. Again the difference of presentation to the public what to buy and do, versus what they do their own. Huge differences there. Just not even funny it is. > > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf- > bounces at beowulf.org] On Behalf Of Vincent Diepeveen > Sent: Thursday, February 16, 2012 8:26 AM > To: Eugen Leitl > Cc: tt at postbiota.org; Beowulf Mailing List; forkit! > Subject: Re: [Beowulf] Pricing and Trading Networks: Down is Up, > Left is Right > > Yes very good article. > > 1) > But now the weirdest thing - i offered myself at different spots to > write a < 1 microsecond software to parse that marketdata, but no > one wants to hire you it seems - they just look for JAVA coders. > > Example is also Morgan Chase. All their job offers, which i receive > daily, 99% is JAVA jobs. > > ---------------- > Java is of course TOO SLOW for working with trading data. Much > better is C/C++ and assembler after you already have achieved that > < 1 microseconds in C. > > Note that the 'FPGA'S" that are advertized costs millions most of > them and the only latency quote i saw there is 2 microseconds, > which also sounds a bit slow to me, but well. > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From james.p.lux at jpl.nasa.gov Fri Feb 17 09:12:13 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 17 Feb 2012 06:12:13 -0800 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <63FF25A2-52FD-42FC-9C12-04188E68E2C8@xs4all.nl> Message-ID: On 2/17/12 1:11 AM, "Vincent Diepeveen" wrote: > >> I don't know of any million dollar FPGAs. Even space qualified big >> Xilinx parts are about a tenth of that. >> > >Most software has a price of a couple of tens of thousands of dollars >a month. >The FPGA's are a multiple of that. >the IBM websphere bla bla that can be used to parse market data and >trade, it's around a $100k a year. >Add some tens of thousands for additional functionality. Are you talking about the software cost, not the hardware platform cost? If so, I'd go for that.. The population of FPGA developers is probably 1/100 the number of conventional Von Neuman machine developers (in whatever language). Interestingly, such a scarcity does not translate to 100x higher pay. Most of the surveys show that in terms of median compensation FPGA designers get maybe 30-40% more than software developers. I guess there's much more than 100x the demand for generalized software developers. I wonder if the same is true of GPU developers. A slight premium in pay, but not much. > >Getting a FPGA a tad higher clocked from Xilinx, say the first sample >at 22 nm, is probably not cheap either. The biggest, baddest Virtex 7 (XC7V2000TL2FLG1925E) is a mere $132k, qty 1 (drops to $127k in qty 1000) (16 week lead time) 2 million logic cells 70 Mb onchip block ram, 3600 dsp slices, 28Gb/s transceivers, etc.etc.etc. To put it in a box with power supply and interfaces probably would set you back a good chunk of a million dollars. 
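Jim's store-and-forward remark earlier in the thread is simple serialization arithmetic: a hop that must receive a whole packet before forwarding it pays at least bits / line-rate. A quick sanity check in Python (the 10 Gbps comparison is an added assumption, not from the thread):

```python
# Serialization delay of a store-and-forward hop: the forwarder must
# receive the entire packet before it can send the first bit onward.

def serialization_delay_us(packet_bits, line_rate_bps):
    """Minimum store-and-forward latency, in microseconds."""
    return packet_bits / line_rate_bps * 1e6

# Jim's example: a 1000-bit packet on a 1 Gbps wire.
print(serialization_delay_us(1000, 1e9))   # 1.0 microsecond

# The same packet on an assumed 10 Gbps port is 10x cheaper.
print(serialization_delay_us(1000, 10e9))  # 0.1 microseconds
```

Cut-through switching and faster line rates attack exactly this term, which is why the sub-microsecond figures discussed here are plausible at all.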
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Fri Feb 17 10:42:35 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 17 Feb 2012 16:42:35 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: Message-ID: <550AB79F-4939-4D56-AF2E-61E14792D071@xs4all.nl>

On Feb 17, 2012, at 3:12 PM, Lux, Jim (337C) wrote:

> On 2/17/12 1:11 AM, "Vincent Diepeveen" wrote:
>>
>>> I don't know of any million dollar FPGAs. Even space qualified big
>>> Xilinx parts are about a tenth of that.
>>>
>>
>> Most software has a price of a couple of tens of thousands of dollars
>> a month.
>> The FPGA's are a multiple of that.
>> the IBM websphere bla bla that can be used to parse market data and
>> trade, it's around a $100k a year.
>> Add some tens of thousands for additional functionality.
>
> Are you talking about the software cost, not the hardware platform
> cost?

I thought it was obvious from what I wrote that I mean solution costs. Traders usually buy in ready-made solutions - a few hedge funds excepted.

> If so, I'd go for that.. The population of FPGA developers is probably
> 1/100 the number of conventional Von Neuman machine developers (in
> whatever language).
>
> Interestingly, such a scarcity does not translate to 100x higher pay.

I'm not a magnificent FPGA developer - I'd say speed up the software first; so many traders are really slow there. But they aren't even interested IN THAT.

> Most of the surveys show that in terms of median compensation FPGA
> designers get maybe 30-40% more than software developers.

The rate for development is $1 an hour in India.
If you deal with major companies in India it's $2.50 an hour including everything. > > I guess there's much more than 100x the demand for generalized > software > developers. > How good are the 'generalized software developers' they are hiring actually to do their development? Suppose you just hire JAVA guys and girls. Now ignore the quants of course, what language they develop in is not so important as you can easily parse that to your own solution. Besides i'd guess, but have no information there, that most queries to trade are actually dead simple decisions. > I wonder if the same is true of GPU developers. A slight premium > in pay, > but not much. > > > >> >> Getting a FPGA a tad higher clocked from Xilinx, say the first sample >> at 22 nm, is probably not cheap either. > > The biggest, baddest Virtex 7 (XC7V2000TL2FLG1925E) is a mere > $132k, qty 1 > (drops to $127k in qty 1000) > (16 week lead time) > > 2 million logic cells 70 Mb onchip block ram, 3600 dsp slices, 28Gb/s > transceivers, etc.etc.etc. > how high that would clock? note you also need to integrate the fastest 10 gigabit nic - right now that's solarflare. > > To put it in a box with power supply and interfaces probably would > set you > back a good chunk of a million dollars. > if you'd do it in a simple manner sure - but if you're already on that path of speed and your main income is based upon speed you want something faster than what's online there of course. You phone xilinx and intel and demand they print fpga's before printing cpu's in a new proces technology, and clock 'em as high as possible. I would guess that's not a beginnersteam and has several members. you soon have an expensive team there, in terms of salary pressure i'd guess it's around a 4 million a year minimum. That's not a problem for the hedgefunds to pay actually. If you buy things in it's also a seven digit number a year. 
I do believe however there is enough room for a good software implementation to make great cash based upon speed. The rule is you need to be within the 10% fastest traders to make a profit. That might or might not be true when it's based upon speed. Many trader groups are just busy with 1 specific part of the market - say for example oil companies. Most are experts at just 1 tiny market. For them i'd suspect a software implementation that gets them in the top, would already make them very effective traders. There is limits on what you are allowed to sell in quantity, meanwhile the total trading volume in derivatives keeps growing, so you sure can make great cash if you're amongst the fastest and not necessarily need to beat the top trading hedgefunds. For me it's incredible that there is so little jobs actually for trading in imperative languages and basically 99.9% of all jobs are Java there. I just simply would never do business with jpmorgan chase. they seem to hire 0 persons who are busy imperative or even in C++. Basically means that the entire market of genius guys who know how to beat you in game tree search, which is about all artificial intelligence experts, they're shut out from majority of companies to get a job there. The way to get a job is to be young and have a degree in economics. Now a year or 10 ago that might've been most interesting in the trading world - but things have changed. Derivatives market back then was ultra tiny, right now it's of gigantic proportions, trade volumes have exploded, so the traders simply didn't see how some others who make a profit innovate - and those sure keep it a BIG secret. From my viewpoint however it's a very dangerous thing what happens there. At military level secrets they try to keep secret - yet there is no commercial money involved such as in the trading world, where even knowing 1 sentence here can make you major cash. Government is even more behind. You first have to prove that speed is everything. 
Well i do know - i come from computerchess - if majority of decision taking is so dead simple in trading (so the patterns), compare 2 things then based upon that take a decision, obviously it means that speed is everything. If i look at IBM, i guess it took them until 2008 to really put an improved trading solution at the market. I'd take that and just make a dedicated trading framework that's 10x faster than that (so that means parsing the market data and integrated take the trading decisions). Though there are hundreds of thousands of trading groups world wide - google a bit around - you'll just find Java jobs. 99% of the traders are just clueless whom to hire i'd guess. > >>> > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From james.p.lux at jpl.nasa.gov Fri Feb 17 13:05:15 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 17 Feb 2012 10:05:15 -0800 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <550AB79F-4939-4D56-AF2E-61E14792D071@xs4all.nl> Message-ID: On 2/17/12 7:42 AM, "Vincent Diepeveen" wrote: > >On Feb 17, 2012, at 3:12 PM, Lux, Jim (337C) wrote: > > >> If so, I'd go for that.. The population of FPGA developers is probably >> 1/100 the number of conventional Von Neuman machine developers (in >> whatever language). >> >> Interestingly, such a scarcity does not translate to 100x higher pay. > >I'm not a magnificent FPGA developer - i'd say first speedup the >software - there is so many traders >real slow there. But they don't even are interested IN THAT. > >> Most of the surveys show that in terms of median compensation FPGA >> designers get maybe 30-40% more than software developers. 
> >The rate for development is $1 an hour in india. If you deal with >major companies in India it's $2.50 an hour >including everything. I think you're a bit low there. People I know who are contracting off-shore development say that the net cost (to the US firm) is about 1/4 and 1/3 what the equivalent person would cost in the US. (and the price is rising) You can hire very low level people quite inexpensively on a bare contract, but you spend more managing them, and compensating for the incredible defect density. And I doubt that good FPGA folks are as thick on the ground in low cost places as they are in the US or Europe. >>> >>> Getting a FPGA a tad higher clocked from Xilinx, say the first sample >>> at 22 nm, is probably not cheap either. >> >> The biggest, baddest Virtex 7 (XC7V2000TL2FLG1925E) is a mere >> $132k, qty 1 >> (drops to $127k in qty 1000) >> (16 week lead time) >> >> 2 million logic cells 70 Mb onchip block ram, 3600 dsp slices, 28Gb/s >> transceivers, etc.etc.etc. >> > >how high that would clock? It's a bit tricky when talking clock rates in FPGAs.. Most designs are only regionally synchronous, so you need to take into account propagation delays across the chip if you want max performance. You feed a low speed clock (few hundred MHz) into the chip and it gets multiplied up in onchip DPLLs/DCMs. The MMCM takes 1066 Mhz max input. There's also different clocks for "entirely on chip" and "going off chip", particularly for things like the GigE or PCI-X interfaces (which have their own clocks) Likewise these things have built in interfaces to RAM (so you can get like 1900 Mb/s to DDR ram, running the core at 2V). It's more like doing logic designs.. Propagation delays from D to O through combinatorial logic look like they're in the 0.09 ns range. 
CLB flip-flops have a setup time of 0.04 ns and hold time of 0.13 ns, so it kind of looks like you could toggle at around 2 GHz. There's plenty of documentation out there, but it's not as simple as "I'm running the CPU at 3 GHz".

> note you also need to integrate the fastest 10 gigabit nic - right
> now that's solarflare.

Do you? Why not use some other interconnect?

>> To put it in a box with power supply and interfaces probably would
>> set you back a good chunk of a million dollars.
>
> if you'd do it in a simple manner sure - but if you're already on
> that path of speed and your main income is based upon speed you
> want something faster than what's online there of course.
>
> You phone xilinx and intel and demand they print fpga's before
> printing cpu's in a new proces technology, and clock 'em as high as
> possible.

I seriously doubt that even a huge customer is going to change Xilinx's plans. They basically push the technology to what they can do, constrained by manufacturability. Then you have all the thermal issues to worry about. These big parts can dissipate more heat than you can get out through the package/pins, or even spread across the die.

> guess it's around a 4 million a year minimum.
>
> That's not a problem for the hedgefunds to pay actually.

I suspect that money isn't the scarce resource; people are. There aren't many people in the world who can effectively do this kind of thing. (For the kind of thing you're talking about, it's probably in the 100s, maybe 1000s, total, worldwide.)

> For me it's incredible that there is so little jobs actually for
> trading in imperative languages and basically 99.9% of all jobs are
> Java there.

Why is this amazing? The vast majority of money spent on software is spent on run-of-the-mill, mundane chores like payroll accounting, inventory control, processing consumer transactions, etc. So there's a huge population of people to draw from.
If you're looking for top people, and you need some number of them, you're better off taking the 3 or 4 sigma from the mean people from a huge population, than taking the 1 sigma people from a small population. > >I just simply would never do business with jpmorgan chase. they seem >to hire 0 persons who are busy imperative or even in C++. So what? That's just a personal preference on your part. How do you know they aren't hiring those people through some other channel? You don't see a lot of ads out there for FORTRAN programmers, but here at JPL, about 25% of the software work that's being done is in FORTRAN. And of the people doing software at JPL, more than half do NOT have degrees in CS or even EE, and I'd venture that they were not hired as "software developers": more likely they were hired for their domain specific knowledge. >Basically means that the entire market of genius guys who know how to >beat you in game tree search, which is about all artificial >intelligence experts, >they're shut out from majority of companies to get a job there. Definitely not. If you're in that tippy top 0.1%, you're not getting jobs by throwing your resume over the transom. You're getting a job because you know someone or someone came to know of you through other means. I didn't get my job at JPL by submitting a resume, and I think that's true of the vast majority of people here. It was also true of my last job, doing special effects work. And in fact, now that I think back, I think I have had only one job which was a resume in response to an ad. > >The way to get a job is to be young and have a degree in economics. Depends on what job you want. A good fraction of the technical degree grads at MIT are being hired by the finance industry. This concerns people like us at JPL, because we can't offer competitive pay and benefits, and on a personal note, I think it's a shame that they're probably not going to be using their skills in the field in which they were actually trained. 
But the way to get a good job has always been, and will always be, to know someone. There have been *numerous* well controlled studies that looked at hiring behavior, and regardless of what the recruiting people say, in reality managers make their decisions on the same few factors, and "recommendation of coworker or industry colleague" is right up there in the top few. > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Fri Feb 17 14:54:53 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 17 Feb 2012 20:54:53 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: Message-ID: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> On Feb 17, 2012, at 7:05 PM, Lux, Jim (337C) wrote: > > > On 2/17/12 7:42 AM, "Vincent Diepeveen" wrote: > >> >> On Feb 17, 2012, at 3:12 PM, Lux, Jim (337C) wrote: >> >> >>> If so, I'd go for that.. The population of FPGA developers is >>> probably >>> 1/100 the number of conventional Von Neuman machine developers (in >>> whatever language). >>> >>> Interestingly, such a scarcity does not translate to 100x higher >>> pay. >> >> I'm not a magnificent FPGA developer - i'd say first speedup the >> software - there is so many traders >> real slow there. But they don't even are interested IN THAT. >> >>> Most of the surveys show that in terms of median compensation FPGA >>> designers get maybe 30-40% more than software developers. >> >> The rate for development is $1 an hour in india. If you deal with >> major companies in India it's $2.50 an hour >> including everything. > > > I think you're a bit low there. 
People I know who are contracting > off-shore development say that the net cost (to the US firm) is > between about 1/4 > and 1/3 of what the equivalent person would cost in the US. (and the > price > is rising) Well, they are wrong then. Usually they also count in the cost of an on-site staff at their own location, which is not in India. The real cost in India is $2.50 an hour, and that is at the *big* consultancy companies. Independent consultants usually charge around $1 - $2 an hour; the Philippines sits around $1.11 an hour for consultants. If you look at the work actually produced in India, you see they're very good at drumming up extra work, like additional pay for extra features - so most projects budgeted at X hours usually end up at 2X; of course you and I both know that doesn't mean it actually took 2X :) The actual developers working for such companies usually complain loudly about their salary, yet for them too, getting work as a consultant is very difficult. They earn around $150 a month. This is *normal* development. As you know, I'm more involved in mass-market products, which usually attract the better developers on this planet; nothing is as demanding as producing a good mass-market product, since it has to work everywhere. In India, rates for that are set on the open market and are also going up rapidly. I've even heard of some who got $1000 a month there in India, but that is really 1 in 100 developers. > You can hire very low level people quite inexpensively on a bare > contract, > but you spend more managing them, and compensating for the incredible > defect density. It was indeed the habit to keep the managers over here. However, times have changed: nowadays management also sits in Asia. > And I doubt that good FPGA folks are as thick on the ground in low > cost > places as they are in the US or Europe.
We were discussing normal development and i noted that normal development happens in India for $1 an hour for independants and $2.50 including management overhead and everything an hour, by the major consultancy companies. If i see how much power Bulldozer CPU eat, then i can assure you that India is about the last spot on the planet where i'd have develop a FPGA for trading :) To start with you're not gonna ship a development board to India - it's gonna disappear without anyone knowing and without anyone who can be blamed. I remember how i shipped some stuff to India. 0% arrived. Tracking codes - forget it - that's just a number from the west - something utmost useless in India. > > >>>> >>>> Getting a FPGA a tad higher clocked from Xilinx, say the first >>>> sample >>>> at 22 nm, is probably not cheap either. >>> >>> The biggest, baddest Virtex 7 (XC7V2000TL2FLG1925E) is a mere >>> $132k, qty 1 >>> (drops to $127k in qty 1000) >>> (16 week lead time) >>> >>> 2 million logic cells 70 Mb onchip block ram, 3600 dsp slices, >>> 28Gb/s >>> transceivers, etc.etc.etc. >>> >> >> how high that would clock? > It's a bit tricky when talking clock rates in FPGAs.. Most designs are > only regionally synchronous, so you need to take into account > propagation > delays across the chip if you want max performance. > > You feed a low speed clock (few hundred MHz) into the chip and it gets > multiplied up in onchip DPLLs/DCMs. > The MMCM takes 1066 Mhz max input. I'm guessing around a 2.5Ghz. We know they expected some boards to be able to clock around 1.7Ghz. > > There's also different clocks for "entirely on chip" and "going off > chip", > particularly for things like the GigE or PCI-X interfaces (which have > their own clocks) Likewise these things have built in interfaces > to RAM > (so you can get like 1900 Mb/s to DDR ram, running the core at 2V). > > It's more like doing logic designs.. 
Propagation delays from D to O > through combinatorial logic look like they're in the 0.09 ns > range. CLB > flipflops have a setup time of 0.04ns and hold time of 0.13 ns, so > that > kind of looks like you could toggle at around 2 Ghz > > There's plenty of documentation out there, but it's not as simple > as "I'm > running the CPU at 3 GHz" > I've not seen any actual designs - they are utmost top secret - no military secret is as secret as what they use in the datacenters in own designs - besides that the biggest military secrets, like when a war is gonna happen, you roughly can predict quite well if you watch the news. Forget DDR ram, DDR ram really is too slow in latency also forget about pci-x. I'd say also forget about pci-e - think of custom mainboards. On the FPGA card there's gonna be massive SRAM besides a solarflare NIC. You need quite a lot actually as the entire datafeed has to get streamed to the rest of the mainboard for storage and more complicated trading analysis i'd suppose. I would guess the chip can do the simple trading decisions in a pretty simple manner. This keeps the chipdesign relative simple and you can focus upon clocking it high. The rest you want to do in software. Losing the pci-e latencies then is crucial as that's a big bottleneck. Also forget normal RAM. No nothing DDR3. Just SRAM and many dozens of gigabytes of it as we want to keep the entire day preferably in RAM and our analysis are nonstop hammering onto the SRAM so basically the bandwidth to the SRAM determines how fast our analysis will be. The simple trading decisions already get done by the FPGA of course. But well i guess they probably tried to get ALL trading decisions inside the fpga - who knows. As for the mainboards, to quote someone : "there are some very custom designs out there". Yet of course you can do real well in software already for a fraction of that budget. > >> >> note you also need to integrate the fastest 10 gigabit nic - right >> now that's solarflare. 
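The setup and propagation figures quoted above give a rough upper bound on register-to-register clock rate. A back-of-the-envelope check (the propagation and setup numbers are taken from the discussion; the routing delay is an invented assumption, added to show how the "around 2 GHz" estimate could arise):

```python
# Timing figures from the thread, in nanoseconds (illustrative only).
t_prop = 0.09    # combinatorial D-to-O propagation delay
t_setup = 0.04   # CLB flip-flop setup time
t_route = 0.30   # assumed routing delay between CLBs (NOT from the thread)

# A register-to-register path must fit logic delay + routing + setup
# into one clock period.
t_min_ns = t_prop + t_route + t_setup
f_max_ghz = 1.0 / t_min_ns

print(f"{f_max_ghz:.2f} GHz")  # ~2.33 GHz with the assumed routing delay
```

With zero routing delay the same arithmetic gives almost 8 GHz on paper, which is why datasheet cell timings alone say little about achievable design clock rates.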
> > Do you? Why not use some other interconnect. Because the exchange's datacenter dictates which protocol you use. Several are now exclusively Solarflare. FTSE has gone to InfiniBand - though I'm not sure whether that's for their internal machines only. FTSE is of limited interest anyway - Chicago is more interesting :) Realize also that you have many connections - really a lot. Market A comes from IP address X, market B from IP address Y, and so on; we're talking hundreds of IP addresses you have to connect to simultaneously. You get those addresses dynamically, in an XML-like format. It's a total mess they created, and they love creating a mess, because a mess means more work, and work means money from rich traders. They really overcomplicated it all. Also realize that at any moment they can change the messages - the message layouts themselves arrive dynamically in XML. Nothing is hard-defined. That's why all that generic software trades at such slow speeds: nothing can be hardcoded. The whole protocol is simply NOT designed for speed. Calling it a spaghetti design would be a compliment. The financials LOVE to keep adding to the mess and never 'fix' anything. Think of it as bolting components onto the space shuttle and praying it still works. To get back to the factual criticism in that article: the space shuttle was at least intended to work correctly - the exchanges don't even guarantee that you get the information at all :) Only order entry is TCP; the crucial stream of bids and offers comes in a raw format, and just one channel is all you get to try to follow it. If that channel doesn't have the correct info - bad luck for you. And that bad luck naturally happens during the big surges - we all realize that.
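Because the raw market-data channel is lossy, feed handlers typically watch sequence numbers and flag gaps rather than assume delivery. A minimal sketch (the function and its inputs are invented for illustration, not from any real exchange feed specification):

```python
def detect_gaps(sequence_numbers):
    """Return the inclusive (first, last) ranges of missing sequence numbers
    seen in an increasing stream of packet sequence numbers."""
    gaps = []
    expected = None
    for seq in sequence_numbers:
        if expected is not None and seq > expected:
            gaps.append((expected, seq - 1))  # inclusive range of lost packets
        expected = seq + 1
    return gaps

# During a surge, packets 4-5 and 9 are dropped by the feed:
print(detect_gaps([1, 2, 3, 6, 7, 8, 10]))  # [(4, 5), (9, 9)]
```

A real handler would then request a retransmission or fall back to a recovery channel for the missing range; here the point is only that loss is detected, never prevented.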
This lossy way of information is the underlying method based upon which decisions get taken which have major implications :) There is actually from the FIX/FAST community meetings on the protocol at a regular interval. To just show your company name you have to pay a $25k. To also display it at the conference is a big multiple of that. The next one is scheduled for London at the 13th of March 2012: www.fixprotocol.org >> >>> >>> To put it in a box with power supply and interfaces probably would >>> set you >>> back a good chunk of a million dollars. >>> >> >> if you'd do it in a simple manner sure - but if you're already on >> that path of speed and your main income is based upon speed you >> want something faster than what's online there of course. >> >> You phone xilinx and intel and demand they print fpga's before >> printing cpu's in a new proces technology, and clock 'em as high as >> possible. > > > I seriously doubt that even a huge customer is going to change > Xilinx's > plans. They basically push the technology to what they can do > constrained > by manufacturability. > > Then you have all the thermal issues to worry about. These big > parts can > dissipate more heat than you can get out through the package/pins, > or even > spread across the die. > It has crossed my mind for just a second that if 1 government would put 1 team together and fund it, and have them produce a magnificent trading solution, which for example plugs in flawless into the IBM websphere, and have their own traders buy this, especially small ones with just ties to their own nation, that in theory things still are total legal. A team, of course so called under a company name, produces a product, and traders buy this in from this company. The only thing is they don't tell around they do business with this company - only their accountant sees that name - things stil perfectly legal. 
Now some of those traders will make a profit and others will lose some; statistically, however, they will perform a lot better than they used to. That means a big flow of cash moves toward one nation. Of course you can also do this purely in software - all you have to beat is IBM, which is peanuts for a good programmer. The only important thing is to be in that fastest 10%. No need to have the fastest solution out there - let some hedge funds design that (they already did). It's just a thought experiment, but it would win a lot of money from abroad for your own nation; in the end a big part of that flows back home and pays for things. Don't think in small numbers here - they trade so much money daily that even a small statistical advantage from faster software has a huge impact. > >> guess it's around a 4 million a year minimum. >> >> That's not a problem for the hedgefunds to pay actually. > > I suspect that money isn't the scarce resource, people are. There > aren't > many people in the world who can effectively do this kind of thing. > (For > the kind of thing you're talking about, it's probably in the 100s, > maybe > 1000s, total, worldwide) Well, I'm sure there aren't many, but I'm also very sure they aren't trying to get the best programmers - they mainly hire Java coders everywhere. > >> >> For me it's incredible that there are so few jobs for >> trading in imperative languages and basically 99.9% of all jobs there are >> Java. > > Why is this amazing? Suppose you only hired people who are left-handed. That's basically what they're doing. Speed is everything at the exchanges now, and you aren't going to get the best people this way: with a few exceptions, they shut out the biggest experts.
> The vast majority of money spent on software is > spent on run of the mill, mundane chores like payroll accounting, > inventory control, processing consumer transactions, etc. > > So there's a huge population of people to draw from. > > If you're looking for top people, and you need some number of them, > you're > better off taking the 3 or 4 sigma from the mean people from a huge > population, than taking the 1 sigma people from a small population. > They just draw from a small population of usually mediocre programmers who studied finance. How do you get the best software engineers for your trading application that way? > > >> >> I simply would never do business with JPMorgan Chase; they seem >> to hire zero people who work in imperative languages, or even in C++. > > So what? That's just a personal preference on your part. How do > you know > they aren't hiring those people through some other channel? And what would that 'other channel' be then? The vast majority simply isn't doing this. > > You don't see a lot of ads out there for FORTRAN programmers, but > here at > JPL, about 25% of the software work that's being done is in > FORTRAN. And No matter how brilliant you are, if all you write is FORTRAN and you didn't study finance, then you don't get to write a trading application, AS YOU WON'T GET HIRED :) > of the people doing software at JPL, more than half do NOT have > degrees in > CS or even EE, and I'd venture that they were not hired as "software > developers": more likely they were hired for their domain specific > knowledge. > >> Basically this means that the entire market of genius guys who know how to >> beat you in game-tree search - which is about all artificial >> intelligence experts - is shut out from getting a job at the majority of companies. > > Definitely not. If you're in that tippy top 0.1%, you're not > getting jobs > by throwing your resume over the transom.
You're getting a job > because > you know someone or someone came to know of you through other means. > You seem to be the expert in hiring people :) > I didn't get my job at JPL by submitting a resume, and I think > that's true > of the vast majority of people here. It was also true of my last job, > doing special effects work. And in fact, now that I think back, I > think I > have had only one job which was a resume in response to an ad. > When you got there, there was a circle at your resume around the word 'NASA' And nothing else mattered i bet :) > > >> >> The way to get a job is to be young and have a degree in economics. > > Depends on what job you want. > > A good fraction of the technical degree grads at MIT are being > hired by > the finance industry. This concerns people like us at JPL, because we > can't offer competitive pay and benefits, and on a personal note, I > think > it's a shame that they're probably not going to be using their > skills in > the field in which they were actually trained. > > > But the way to get a good job has always been, and will always be, > to know > someone. > > There have been *numerous* well controlled studies that looked at > hiring > behavior, and regardless of what the recruiting people say, in reality > managers make their decisions on the same few factors, and > "recommendation > of coworker or industry colleague" is right up there in the top few. > > >> > > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From worringen at googlemail.com Fri Feb 17 18:39:19 2012 From: worringen at googlemail.com (Joachim Worringen) Date: Sat, 18 Feb 2012 00:39:19 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> References: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> Message-ID: Vincent, I haven't read all zillion lines of your posts, but as I'm heading software engineering of a very successful "prop shop" (proprietary trading company), I might add some real-world comments: - Execution speed is important, but it's not everything. Only the simplest strategies purely rely on speed for success. - Even more important than execution speed is time-to-market. It's of no use to have the superfast thing ready when the market has moved into a different direction nine months ago. - Equally important is reliability and maintainability. Our inhouse development is based on C++, but we make very good money with Java-based third-party solutions as well. FPGAs are not the silver-bullet-solution either. Finding people to program them is hard, development is complex and takes time, verifying them takes even more, and is required for every little change. Think about time-to-market. Therefore, they are mainly used in limited scenarios (risk-checks), or are used with high-level-compiler support, giving away significant fraction of the potential performance. Oh, btw, we are always looking for bright, but also socially compliant developers. Joachim _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From james.p.lux at jpl.nasa.gov Fri Feb 17 23:45:17 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 17 Feb 2012 20:45:17 -0800 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> Message-ID: > > > >>>> The biggest, baddest Virtex 7 (XC7V2000TL2FLG1925E) is a mere >>>> $132k, qty 1 >>>> (drops to $127k in qty 1000) >>>> (16 week lead time) >>>> >>>> 2 million logic cells 70 Mb onchip block ram, 3600 dsp slices, >>>> 28Gb/s >>>> transceivers, etc.etc.etc. >>>> >>> >>> how high that would clock? >> It's a bit tricky when talking clock rates in FPGAs.. Most designs are >> only regionally synchronous, so you need to take into account >> propagation >> delays across the chip if you want max performance. >> >> You feed a low speed clock (few hundred MHz) into the chip and it gets >> multiplied up in onchip DPLLs/DCMs. >> The MMCM takes 1066 Mhz max input. > >I'm guessing around a 2.5Ghz. Maybe, maybe not. Most big FPGA designs I've seen aren't one big synchronous blob in any case. If you have a pipelined process spread across the chip, you might clock individual chunks at X Mhz, but there's N cycles to get through the pipeline. Kind of depends whether in to out latency is important or bits per second throughput, I suppose. The old stationwagon full of tapes vs hot stuff ASIC. > >On the FPGA card there's gonna be massive SRAM besides a solarflare NIC. Maybe, maybe not. If you're shoving buffers around, parallelism might be more important than raw memory access time. > >I would guess the chip can do the simple trading decisions in a >pretty simple manner. >This keeps the chipdesign relative simple and you can focus upon >clocking it high. When you're talking designs with 10s of millions of gates, you can do pretty complex things. > >Also forget normal RAM. No nothing DDR3. 
Just SRAM and many dozens of >gigabytes of it as we want to keep the entire day preferably in RAM >and our analysis are nonstop hammering onto the SRAM so basically the >bandwidth to the SRAM determines how fast our analysis will be. I just cited the DDR3 as an example out of the datasheet. I'm sure that if you're interested you'll go download the Virtex 7 data sheet and study it. > >As for the mainboards, to quote someone : "there are some very custom >designs out there". I doubt there is anything such as a "standard" board using a $100k FPGA. I'd bet a fair number of cold frosty beverages that ALL boards using this kind of thing fit the "custom" category. >> >> >> I seriously doubt that even a huge customer is going to change >> Xilinx's >> plans. They basically push the technology to what they can do >> constrained >> by manufacturability. >> >> Then you have all the thermal issues to worry about. These big >> parts can >> dissipate more heat than you can get out through the package/pins, >> or even >> spread across the die. >> > >It has crossed my mind for just a second that if 1 government would >put 1 team together and fund it, >and have them produce a magnificent trading solution, >which for example plugs in flawless into the IBM websphere, >and have their own traders buy this, especially small ones with just >ties to their own nation, that in theory things still are total legal. There was a famous challenge about breaking DES, where it was done with FPGAs. But why would a government fund such a thing (except perhaps as an economic weapon.. Humorous thoughts of died in the wool Marxists cackling at the thought of destroying the capitalists with their own trading tools, developed by a centrally planned "trading machine establishment #3". > >How do you get the best software engineers then for your trading >application? By asking your other software developers? As we used to call it in the entertainment industry "Neportunity". 
> >And what would that 'other channel' be then? Personal contacts. > >Vaste majority simply isn't doing this. Let's see.. Unemployment in the software industry is down around 3% these days (viz 8-12% in general, and 20-25% in certain demographics and areas). They're finding jobs somehow. 10:1 or 50:1 resume to open position isn't uncommon. Somehow they find the 1 in 50, and a fair number of studies show that it's not done by some HR person carefully reviewing the 50 resumes to find the one shining diamond. >No matter how genius you are, if all you do is write fortran, and >didn't study finance, >then you are not allowed to write a trading application, AS YOU WON'T >GET HIRED :) More an example that different industries hire for different skill sets and backgrounds. You're right, I can't imagine FORTRAN being very useful in trading. But hey, I don't write trading apps.. For all I know it's mostly matrix math and FORTRAN is pretty good for that. Maybe they want people who write FORTH or LISP or PROLOG. The point is, it's a very niche market, looking for a very niche programmer that is probably not remotely representative of software developers at large. > >> I didn't get my job at JPL by submitting a resume, and I think >> > >When you got there, there was a circle at your resume around the word >'NASA' Actually not.. I hadn't worked at JPL then. The circle was around "microscan compressive receiver", purely by chance. > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
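Jim's earlier point about pipelined FPGA designs - clock the stages fast, accept N cycles of fill latency - reduces to simple arithmetic. A sketch with invented numbers (neither figure comes from the thread):

```python
clock_mhz = 400        # per-stage clock, hypothetical
pipeline_depth = 20    # cycles from input to first result, hypothetical

cycle_ns = 1000.0 / clock_mhz                        # 2.5 ns per cycle
first_result_latency_ns = pipeline_depth * cycle_ns  # time to the FIRST answer
throughput_per_sec = clock_mhz * 1_000_000           # one result per cycle once full

print(first_result_latency_ns)  # 50.0
print(throughput_per_sec)       # 400000000
```

Whether the 50 ns fill latency or the 400 million results per second matters more is exactly the in-to-out latency versus throughput trade-off Jim describes - the stationwagon full of tapes in miniature.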
From diep at xs4all.nl Sat Feb 18 03:23:59 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 09:23:59 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> Message-ID: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> On Feb 18, 2012, at 12:39 AM, Joachim Worringen wrote: > Vincent, > > I haven't read all zillion lines of your posts, but as I'm heading > software engineering of a very successful "prop shop" (proprietary > trading company), I might add some real-world comments: > - Execution speed is important, but it's not everything. Only the > simplest strategies purely rely on speed for success. Which is 90% of all traders' strategies. So the statistics refute you, in a very hard way. The markets are moving completely sideways now - speed is the only thing that can still make you good money in the derivatives markets. > - Even more important than execution speed is time-to-market. It's of > no use to have the superfast thing ready when the market has moved > into a different direction nine months ago. > - Equally important is reliability and maintainability. > > Our inhouse development is based on C++, but we make very good money > with Java-based third-party solutions as well. > > FPGAs are not the silver-bullet-solution either. Finding people to > program them is hard, development is complex and takes time, verifying > them takes even more, and is required for every little change. Think > about time-to-market. Therefore, they are mainly used in limited > scenarios (risk-checks), or are used with high-level-compiler support, > giving away significant fraction of the potential performance. > I don't see FPGAs as the silver bullet either, because of the huge costs - as I described before, the costs are much higher than for normal FPGA development. Your time-to-market argument is nonsense, though. Only a civil servant would show up with such a statement. If you quickly tape out a trading product that is dog-slow again, say 100 microseconds of latency, no one will buy it. FAST/FIX was used 10 years ago and will still be used 10 years from now. And if some politician in nation A says "we limit the exchanges", then they move to nation B. > Oh, btw, we are always looking for bright, but also socially compliant > developers. > Yes, all those social people you hire to work in the financial industry - they'd never sell a bad product to anyone - making money is not the main concern - so very, very social :) > Joachim > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From diep at xs4all.nl Sat Feb 18 03:36:03 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 09:36:03 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> Message-ID: <42D8407A-9716-48BD-8381-14DC483B5909@xs4all.nl> Out of interest, if I google you I find, dated 2009: "Joachim Worringen is a software architect at Dolphin Interconnect". Is that what you did before moving on to keeping a few guys busy in your prop shop?
Kind Regards, Vincent On Feb 18, 2012, at 12:39 AM, Joachim Worringen wrote: > Vincent, > > I haven't read all zillion lines of your posts, but as I'm heading > software engineering of a very successful "prop shop" (proprietary > trading company), I might add some real-world comments: > - Execution speed is important, but it's not everything. Only the > simplest strategies purely rely on speed for success. > - Even more important than execution speed is time-to-market. It's of > no use to have the superfast thing ready when the market has moved > into a different direction nine months ago. > - Equally important is reliability and maintainability. > > Our inhouse development is based on C++, but we make very good money > with Java-based third-party solutions as well. > > FPGAs are not the silver-bullet-solution either. Finding people to > program them is hard, development is complex and takes time, verifying > them takes even more, and is required for every little change. Think > about time-to-market. Therefore, they are mainly used in limited > scenarios (risk-checks), or are used with high-level-compiler support, > giving away significant fraction of the potential performance. > > Oh, btw, we are always looking for bright, but also socially compliant > developers. > > Joachim > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From worringen at googlemail.com Sat Feb 18 04:13:03 2012 From: worringen at googlemail.com (Joachim Worringen) Date: Sat, 18 Feb 2012 10:13:03 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> References: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> Message-ID: On Sat, Feb 18, 2012 at 9:23 AM, Vincent Diepeveen wrote: > On Feb 18, 2012, at 12:39 AM, Joachim Worringen wrote: >> - Execution speed is important, but it's not everything. Only the >> simplest strategies purely rely on speed for success. > > Which is 90% of all strategies of all traders. > > So statistics total refute you. > in a very very hard way. Our daily P&L statistics give us a different impression, but we are probably just too stupid to read them correctly. Joachim _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From diep at xs4all.nl Sat Feb 18 04:31:26 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 10:31:26 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: <508776CC-60DF-4CF1-AB3F-2377C766567D@xs4all.nl> <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> Message-ID: On Feb 18, 2012, at 10:13 AM, Joachim Worringen wrote: > On Sat, Feb 18, 2012 at 9:23 AM, Vincent Diepeveen > wrote: >> On Feb 18, 2012, at 12:39 AM, Joachim Worringen wrote: >>> - Execution speed is important, but it's not everything. Only the >>> simplest strategies purely rely on speed for success. >> >> Which is 90% of all strategies of all traders. >> >> So statistics total refute you. >> in a very very hard way.
> > Our daily P&L statistics give us a different impression, but we are > probably just too stupid to read them correctly. > If your software is duck-slow, I bet you even have problems doing simple trades. It reminds me a lot of a discussion with a few government guys who produce lousy software to keep people busy; they complained that 90% of the trades they tried to submit were refused. Of course it's easy to show that this happens because their software is far too slow - not seldom still software with hundreds of microseconds of latency, and that's before counting the hardware latencies or the extremely slow built-in NICs. In fact, in a congressional hearing this was also mentioned - again by someone using the same software as the first guy who complained to me - that 90% of submissions were getting refused; that obviously means you're too slow. Maybe hire better software engineers and buy a decent network card? > Joachim > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Sat Feb 18 06:04:28 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 12:04:28 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: Message-ID: On Feb 18, 2012, at 5:45 AM, Lux, Jim (337C) wrote: >> It has crossed my mind for just a second that if 1 government would >> put 1 team together and fund it, >> and have them produce a magnificent trading solution, >> which for example plugs in flawless into the IBM websphere, >> and have their own traders buy this, especially small ones with just >> ties to their own nation, that in theory things still are total >> legal. > There was a famous challenge about breaking DES, where it was done > with > FPGAs. > But why would a government fund such a thing (except perhaps as an > economic weapon.. Humorous thoughts of dyed-in-the-wool Marxists > cackling > at the thought of destroying the capitalists with their own trading > tools, > developed by a centrally planned "trading machine establishment #3". In the first place, governments already run so many companies indirectly (paying directly or indirectly for the jobs created there) - a sick habit, especially in Europe - that the plan I sketched would be peanuts to execute. They do have the people for it; they already pay companies to keep people busy and waste years of their lives, which IMHO is a very evil thing. Basically anyone above IQ 120 gets hammered down by the government in a merciless manner; that means the vast majority, 95% or so, will shut up for the rest of their lives, never voice criticism publicly again, and go the selfish route - whereas he or she would otherwise have given that criticism, which is so crucial for democracies to self-correct. So that really is a problem for democracies now, as a democracy cannot function if the clever get hammered down. Fools take over then.
Right now the statistics here in the Netherlands are that 90% of politicians are there just with the intention of getting a job. Very self-explanatory statistics.

Now, on countries: most nations - and Germany and the USA are no different there - work pretty mechanically in how they do business. If they see an opportunity to make big cash for their country, they will be tempted to take it.

Consider the huge changes of the past few years in the financial industry, from an industry dominated by traders with a financial background to the current NSA-type struggle for speed, hardware and game-tree-search-type ways of making money. In the past, what was very common was some guy X who owned a bunch of houses/buildings and who traded. He talked to someone, the CEO of some company. The CEO said nothing useful, but our guy X concluded he blinked a lot, and based upon that tried to sell his interest in that company. That game has changed.

What has replaced it is that new datacenters allow trading thousands of times per second in the same instrument. Say we hold a future on Mellanox, which in the long term we expect to go up, yet we trade thousands of times per second in this same instrument. When it drops 1 cent, we sell a little again; the slower traders will then take another few milliseconds or so to sell, it'll go down further, so we can make a big profit based upon one expectation, just by trading nonstop. This is actually what happens. This is not an 'example'; this behaviour is what happens at the exchanges. At most exchanges it's limited right now to 200 trades per second in the same instrument, yet where the datacenters have faster networks this 200 already goes up by a lot. These are *measured* statistics: the hedge funds making great profits the past few years factually trade up to 200 times per second in the same instrument during surges. Sure, the rest of the day nothing happens there - but each day there are of course at least 2 surges, sometimes 3 or more.
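The down-tick scalping behaviour described above can be put into a toy sketch. Everything here - the `fast_scalper` function, the price path, the slice size - is invented purely to illustrate the mechanics; it is not real trading code from any firm:

```python
# Toy illustration of the behaviour described above: a fast trader that
# sells a small slice on every 1-cent down-tick, filling before slower
# participants react. All prices, sizes and the price path are made up.

def fast_scalper(price_path, slice_size=10):
    """Sell slice_size units on every down-tick of at least one cent;
    return (units sold, volume-weighted average sale price)."""
    sold = 0
    proceeds = 0.0
    for prev, cur in zip(price_path, price_path[1:]):
        if prev - cur >= 0.01 - 1e-9:     # a 1-cent (or larger) down-tick
            sold += slice_size
            proceeds += slice_size * cur  # the fast trader fills at the new price
    avg = proceeds / sold if sold else 0.0
    return sold, avg

# A surge where the price ticks down one cent four times:
path = [10.00, 9.99, 9.98, 9.97, 9.96]
sold, avg = fast_scalper(path)
print(sold, round(avg, 3))  # 40 units sold at an average of 9.975
```

Repeated over hundreds of ticks per second, even a tiny edge per slice compounds, which is the point the post makes about per-instrument trade-rate limits at the exchanges.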
>> How do you get the best software engineers then for your trading application?
>
> By asking your other software developers? As we used to call it in the entertainment industry, "Neportunity".
>
>> And what would that 'other channel' be then?
>
> Personal contacts.

Wasn't the idea of building a resume that you could get hired based upon NOT having a friend somewhere?

>> The vast majority simply isn't doing this.
>
> Let's see.. Unemployment in the software industry is down around 3% these days (viz 8-12% in general, and 20-25% in certain demographics and areas). They're finding jobs somehow. 10:1 or 50:1 resume to open position isn't uncommon. Somehow they find the 1 in 50, and a fair number of studies show that it's not done by some HR person carefully reviewing the 50 resumes to find the one shining diamond.

Not sure about the States, but in Europe the statistics get manipulated big time, of course. A good example: one day we had big unemployment, so they moved lots of folks from the unemployment statistics to the disability statistics. Suddenly lots of folks still got the same amount of money, yet politicians cried victory that the unemployment numbers went down. This nation, the Netherlands, is a bad example anyway, as out of the 7.8 million people of working age, 5.5 million work at the (semi-)government. Not really following the elections in the US, but it seems Obama wants to adopt that model as well? - let's take that off the mailing list though

>> No matter how genius you are, if all you do is write fortran, and didn't study finance, then you are not allowed to write a trading application, AS YOU WON'T GET HIRED :)
>
> More an example that different industries hire for different skill sets and backgrounds. You're right, I can't imagine FORTRAN being very useful in trading. But hey, I don't write trading apps.. For all I know it's mostly matrix math and FORTRAN is pretty good for that.
> Maybe they want people who write FORTH or LISP or PROLOG.
>
> The point is, it's a very niche market, looking for a very niche programmer that is probably not remotely representative of software developers at large.

My point was that I would want to hire that genius guy. Fortran or C, no big deal. If he can write Fortran he can write C, or he will have enough skills to get down to the utmost details, speeding me up somewhere, fixing another problem, or finding a problem we have to avoid that we didn't notice yet.

The financial industry is so spoiled, since it pays such high salaries, that the thing that best describes it is 'the old boys' network'. It's so lucrative to work there in the higher positions that everyone of course wants it. I remember speaking to some very influential people over the past years, and I sometimes wondered how such mediocre guys or ladies managed to get where they are. But in the end I always concluded that some very mediocre people have one big talent which nearly zero geniuses have - and that's getting hired.

>>> I didn't get my job at JPL by submitting a resume, and I think
>>
>> When you got there, there was a circle on your resume around the word 'NASA'
>
> Actually not.. I hadn't worked at JPL then. The circle was around "microscan compressive receiver", purely by chance.

Ah, the truth is always painful, isn't it?

Over here most managers aren't very impressed by their own HR departments. Most HR departments here get manned - actually get 'girled' - by 22-30 year olds, usually ladies, sometimes men, with a simple college degree at most. Most are actually well informed, yet it's questionable whether they are also capable of thinking at that level. Selection of resumes/CVs always happens in the same manner. In this nation, a job that involves technology, say writing software for a network card, means they require you to come from a technical university.
So CVs from normal universities, say for example Utrecht (in the top 50 of the planet), where I studied, get thrown away. Not a single look gets taken. It's just 'elimination time'. I heard many reports of guys who were invited to interview for a job based upon one company they had worked at and nothing else on their CV. One of them literally reported that when he asked who had put the circle around that specific company he worked for - which for him was a minor job, maybe 1% of what he had achieved in life - the manager's explanation was that he got the CV like that from HR, who had encircled that company for him; he hadn't done it himself.

Usually these ladies and men at HR here are paid a salary well below what the actual software engineers make. The IT average over the entire nation is around 52k euro a year. HR is nearly half of that. HR management is hardly over 40k euro a year. I remember talking to some major Chinese factories, and after a good look over there I was a tad amazed by the differences in pay. Workers and software engineers really made little - total peanuts. They also worked 6 days a week and lived on site in a small room, often shared with others (depending upon position). On the other hand, the HR manager made 50k dollars a year - a royal salary over there, over 10 times that of the workers. They have far more capable HR over there than any HR department in this entire nation.
From atp at piskorski.com Sat Feb 18 11:12:15 2012 From: atp at piskorski.com (Andrew Piskorski) Date: Sat, 18 Feb 2012 11:12:15 -0500 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> Message-ID: <20120218161215.GA48861@piskorski.com> On Sat, Feb 18, 2012 at 09:23:59AM +0100, Vincent Diepeveen wrote: > On Feb 18, 2012, at 12:39 AM, Joachim Worringen wrote: >> - Execution speed is important, but it's not everything. Only the >> simplest strategies purely rely on speed for success. > > Which is 90% of all strategies of all traders. And you know that how, Vincent? You've worked in the financial trading world so long and have talked with so many different (successful) traders, that a guess based on your own experience might well be reasonable? Oh wait, you've never done anything remotely like that. The only plausible way you could know that is if you've been reading lots of comprehensive academic and/or industry surveys (if they exist) of traders. Sounds interesting. So how about you point us to the best summary of all that data on what sort of strategies traders say they use? I'm sure there must be one, because you wouldn't just be making wild assertions unsupported by any evidence whatsoever, right? Joachim, thanks for chiming in with observations based on your real-world experience. It's nice to see some piece of Vincent's rants occasionally inspire worthwhile content. Jim Lux, you too, even more so; I've learned interesting tidbits about FPGAs, etc. from your recent posts. 
-- Andrew Piskorski

From landman at scalableinformatics.com Sat Feb 18 12:02:08 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Sat, 18 Feb 2012 12:02:08 -0500 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <20120218161215.GA48861@piskorski.com> References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> <20120218161215.GA48861@piskorski.com> Message-ID: <4F3FD990.7000503@scalableinformatics.com>

On 02/18/2012 11:12 AM, Andrew Piskorski wrote:
> You've worked in the financial trading world so long and have talked with so many different (successful) traders, that a guess based on your own experience might well be reasonable? Oh wait, you've never done anything remotely like that.

FWIW: our customers in this market all say the same thing in terms of "Time To Market" and correctness/maintainability. It sucks if your code is write once. It's bad if it takes you N months to deploy something that a competitor can deploy in N weeks (or N days).

What we are seeing (from customers) are mixes of C/C++ and a number of domain-specific languages (DSLs), as well as "scripting" languages which have JIT compilers or JIT->VM execution paths. Java is one of these, as well as a few others.

Oddly, I haven't seen so much Python in this, a little Perl, and zero Ruby. The languages that are in use are very interesting (the well known ones), and the ones that aren't as well known or are private DSLs are pretty darn cool.
There is much to be said for a language that enables an ease of expression of an algorithm, and doesn't get in your way with housekeeping and language bureaucracy crap. Understand that I am a fan of more terse languages, and the ones that force you into massive over-keyboarding (cough cough ... where's my coffee) should just have a nice '#include "boring_stuff.inc"' to simplify them. > The only plausible way you could know that is if you've been reading > lots of comprehensive academic and/or industry surveys (if they exist) > of traders. Sounds interesting. So how about you point us to the > best summary of all that data on what sort of strategies traders say > they use? I'm sure there must be one, because you wouldn't just be > making wild assertions unsupported by any evidence whatsoever, right? Heh ... /stands up to give a good takedown ovation ... There's lots of (mis)information out there on what HF* (multiple different types of high frequency trading ... not just equities) implies. The naive view is that the only thing that matters is speed of execution. As we service this market, we aren't directly involved in day to day elements of the participants coding, but we are aware of (some) issues that impact it. > Joachim, thanks for chiming in with observations based on your > real-world experience. It's nice to see some piece of Vincent's rants > occasionally inspire worthwhile content. Jim Lux, you too, even more > so; I've learned interesting tidbits about FPGAs, etc. from your > recent posts. Seconded. Nice to see some real end users speak up here (and HF* is most definitely a big data/HPC problem ... big HPC?). And I always enjoy Jim's posts. On silver bullets, there aren't any. Ever. Anyone trying to convince you of this is selling you something. FPGAs are very good at some subset of problems, but they are extremely hard to 'program'. Unless you get one of the "compilers" which use a virtual CPU of some sort to execute the code ... 
in which case you are giving up a majority of your usable performance anyway. And if someone from Convey or Mitrionics v2 wants to jump in and call BS (and even better, say something interesting on how you can avoid giving up the performance), I'd love to see/hear this. FPGAs have become something of a "red-headed stepchild" of accelerators. The tasks they are good for, they are very good for. But getting near-optimal performance is hard (based upon my past experience/knowledge ... more than 1 year old), and usually violates the "minimize time to market" criterion.

If you have a problem which will change infrequently, and doesn't involve too much DP floating point, and lots of integer ops ... FPGAs might be a great fit technologically, though the other aspects have to be taken into account.

Joe

-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615
From diep at xs4all.nl Sat Feb 18 12:30:49 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 18:30:49 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <4F3FD990.7000503@scalableinformatics.com> References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> <20120218161215.GA48861@piskorski.com> <4F3FD990.7000503@scalableinformatics.com> Message-ID: <7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl>

On Feb 18, 2012, at 6:02 PM, Joe Landman wrote:
> On 02/18/2012 11:12 AM, Andrew Piskorski wrote:
>> You've worked in the financial trading world so long and have talked with so many different (successful) traders, that a guess based on your own experience might well be reasonable? Oh wait, you've never done anything remotely like that.
>
> FWIW: our customers in this market all say the same thing in terms of "Time To Market" and correctness/maintainability. It sucks if your code is write once. It's bad if it takes you N months to deploy something that a competitor can deploy in N weeks (or N days).

Say you produce a CPU now within 2 weeks. It's 200 watts, it's 1 gflop, and it's $100 production price including R&D overhead - including everything except shipping to customers. Good idea to sell? So time to market matters for simple software engineering, but not for stuff that has to perform, OK?

> What we are seeing (from customers) are mixes of C/C++ and a number of domain-specific languages (DSLs), as well as "scripting" languages which have JIT compilers or JIT->VM execution paths. Java is one of these, as well as a few others.
>
> Oddly, I haven't seen so much Python in this, a little Perl, and zero Ruby. The languages that are in use are very interesting (the well known ones), and the ones that aren't as well known or are private DSLs are pretty darn cool.
> There is much to be said for a language that enables an ease of expression of an algorithm, and doesn't get in your way with housekeeping and language bureaucracy crap.
>
> Understand that I am a fan of more terse languages, and the ones that force you into massive over-keyboarding (cough cough ... where's my coffee) should just have a nice '#include "boring_stuff.inc"' to simplify them.
>
>> The only plausible way you could know that is if you've been reading lots of comprehensive academic and/or industry surveys (if they exist) of traders. Sounds interesting. So how about you point us to the best summary of all that data on what sort of strategies traders say they use? I'm sure there must be one, because you wouldn't just be making wild assertions unsupported by any evidence whatsoever, right?
>
> Heh ...
>
> /stands up to give a good takedown ovation ...

Yeah, I bet some dudes want to know at which age I wrote my first mortgage calculator and at which age I wrote my first trading application - but I won't say. Had you read what I posted - which you obviously never do - it would be pretty obvious to you that I had. It also shows how totally ignorant you are about performance. But let me ask you the next 3 questions:

a) When are you going to buy a Bulldozer CPU with your own cash? If so, why? And if not, why not?

b) You drink a cola from a local supermarket, don't you, so no Pepsi nor Coca-Cola - price matters most, doesn't it? Expensive wines? Time to market, huh?

c) In 2003 AMD was first to market with an x64 CPU, Intel following with Core 2 some years later. Did you buy it yourself or advise others to buy it? I bet, like everyone on this list, you get daily requests on what to buy, so be honest - what did you advise in 2003 and 2004 to those around you?

> There's lots of (mis)information out there on what HF* (multiple different types of high-frequency trading ... not just equities) implies.
The naive view is that the only thing that matters is > speed of > execution. As we service this market, we aren't directly involved in > day to day elements of the participants coding, but we are aware of > (some) issues that impact it. > >> Joachim, thanks for chiming in with observations based on your >> real-world experience. It's nice to see some piece of Vincent's >> rants >> occasionally inspire worthwhile content. Jim Lux, you too, even more >> so; I've learned interesting tidbits about FPGAs, etc. from your >> recent posts. > > Seconded. Nice to see some real end users speak up here (and HF* is > most definitely a big data/HPC problem ... big HPC?). And I always > enjoy Jim's posts. > > On silver bullets, there aren't any. Ever. Anyone trying to convince > you of this is selling you something. > > FPGAs are very good at some subset of problems, but they are extremely > hard to 'program'. Unless you get one of the "compilers" which use a > virtual CPU of some sort to execute the code ... in which case you are > giving up a majority of your usable performance anyway. And if > someone > from Convey or Mitrionics v2 wants to jump in and call BS (and even > better, say something interesting on how you can avoid giving up the > performance), I'd love to see/hear this. FPGAs have become > something of > a "red headed stepchild" of accelerators. The tasks they are good > for, > they are very good for. But getting near optimal performance is hard > (based upon my past experience/knowledge ... more than 1 year old), > and > usually violates the "minimize time to market" criterion. > > If you have a problem which will change infrequently, and doesn't > involve too much DP floating point, and lots of integer ops ... FPGAs > might be a great fit technologically, though the other aspects have to > be taken into account. > > Joe > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics Inc. 
From landman at scalableinformatics.com Sat Feb 18 13:26:20 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Sat, 18 Feb 2012 13:26:20 -0500 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl> References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> <20120218161215.GA48861@piskorski.com> <4F3FD990.7000503@scalableinformatics.com> <7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl> Message-ID: <4F3FED4C.3010903@scalableinformatics.com>

On 02/18/2012 12:30 PM, Vincent Diepeveen wrote:
> On Feb 18, 2012, at 6:02 PM, Joe Landman wrote:
>> On 02/18/2012 11:12 AM, Andrew Piskorski wrote:
>>> You've worked in the financial trading world so long and have talked with so many different (successful) traders, that a guess based on your own experience might well be reasonable? Oh wait, you've never done anything remotely like that.
>>
>> FWIW: our customers in this market all say the same thing in terms of "Time To Market" and correctness/maintainability. It sucks if your code is write once.
>> It's bad if it takes you N months to deploy something that a competitor can deploy in N weeks (or N days).
>
> Say you produce a CPU now within 2 weeks. It's 200 watts, it's 1 gflop, and it's $100 production price including R&D overhead, including everything except shipping to customers.
>
> Good idea to sell?

... from which I take it you didn't comprehend Joachim's point. Time to market is a way to say how quickly a code or computing platform (software side) can be put into production. At least this is what we are told. Time to market has *nothing* whatsoever to do with what you are talking about in this context. Others feel free to jump in and correct/dispute/etc. this.

> time to market matters for simple software engineering - not for stuff that has to perform ok?

See above.

[...]

> Yeah, I bet some dudes want to know at which age I wrote my first mortgage calculator and at which age I wrote my first trading application - but I won't say.
>
> Had you read what I posted - which you obviously never do - it would be pretty obvious to you that I had.

Hmmm ... I used to read what you wrote (note the past tense) though I skim it for ... er ... nuggets ... these days.

> It also shows how totally ignorant you are about performance. But let me ask you the next 3 questions:

I. am. ignorant. about. performance.

This is either the most insulting or amusing thing I've read in a really long time. Years. I'll take it as amusing. Yes Vincent. We, who push our gear that hits more than 5 GB/s sustained to and from spinning rust, in a single box, with less than 50 disks required to get there ... we who push our flash arrays and SSD arrays that hit millions of IOPS ... Yes Vincent, we are ignorant of performance. We obviously don't get or understand performance. We have no interest in it. (the preceding should be read aloud with a voice positively dripping in sarcasm)

I am laughing now. No, really. That's laughter you hear.
> a) When are you going to buy a Bulldozer CPU with your own cash? If so, why? And if not, why not?

I bought (with my own cash) Opteron and Xeon, ClearSpeed, FPGAs, GPUs, etc. And you? Won't buy Bulldozer for us for a while. Looks like it needs some performance tweaks, and the compilers have to do a better job for it.

> b) You drink a cola from a local supermarket, don't you, so no Pepsi nor Coca-Cola - price matters most, doesn't it?

Er ... I don't drink soda (sound of an opaque metaphor shattering).

> Expensive wines? Time to market, huh?

Ahh ... an obtuse path to discuss time to market as a function of cost.

So if your development cost requires you pay expensive programmers for 6 months working on very expensive hardware with very expensive tools for an application that will have a usable lifetime of 3-6 months (or whatever window is relevant) ...

... versus ...

you pay your very expensive programmers for 1-2 weeks working on less expensive hardware with well designed and inexpensive tools for an application that will have a usable lifetime of 3-6 months (or whatever window is relevant) ...

... which of these will

a) cost less to build/test/deploy
b) have a longer time in market to make you money?

And your expensive programmers get to work on the next task, therefore increasing your ability to have a collection of tools actively engaged on the market.

Seriously, if you don't get why this matters, well ...

> c) In 2003 AMD was first to market with an x64 CPU, Intel following with Core 2 some years later. Did you buy it yourself or advise others to buy it? I bet, like everyone on this list, you get daily requests on what to buy, so be honest - what did you advise in 2003 and 2004 to those around you?

Oh. My. Vincent ... um ... How do I say this ... Why don't you google me, with the phrase ... I dunno ... "AMD whitepaper opteron" or similar things.
Here's one you might find: http://developer.amd.com/assets/Computational_Chemistry_Paper.pdf

I can send you others in PDF form if you like. If you read some of the white papers we wrote for them, you might even find where they got the APU expression from ... Heh.

-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615

From diep at xs4all.nl Sat Feb 18 13:48:01 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 18 Feb 2012 19:48:01 +0100 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: <4F3FED4C.3010903@scalableinformatics.com> References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> <20120218161215.GA48861@piskorski.com> <4F3FD990.7000503@scalableinformatics.com> <7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl> <4F3FED4C.3010903@scalableinformatics.com> Message-ID:

On Feb 18, 2012, at 7:26 PM, Joe Landman wrote:
> Ahh ... an obtuse path to discuss time to market as a function of cost.
>
> So if your development cost requires you pay expensive programmers for 6 months working on very expensive hardware with very expensive tools for an application that will have a usable lifetime of 3-6 months (or whatever window is relevant) ...
>
> ... versus ...
> you pay your very expensive programmers for 1-2 weeks working on less expensive hardware with well designed and inexpensive tools for an application that will have a usable lifetime of 3-6 months (or whatever window is relevant)
>
> ... which of these will
>
> a) cost less to build/test/deploy
> b) have a longer time in market to make you money?
>
> And your expensive programmers get to work on the next task, therefore increasing your ability to have a collection of tools actively engaged on the market.
>
> Seriously, if you don't get why this matters, well ...

Why is Quadrics bankrupt? Why has everyone forgotten about Myri as a high-performance network? Why is Itanium end-of-line?

Simple - they didn't perform well and/or were too expensive.

In all your naivety you forgot the most important things: PERFORMANCE and RELIABILITY. Your fast $2.50-an-hour software engineers are a) not producing reliable code and b) not producing performance - both crucial for trading software.

The entire discussion here, and the entire mailing list, is about reliability and performance. People care about the big clusters the traders have in their backyard that do all the calculations - and I can't say anything sensible there either, except that there are A FEW hedge funds that publicly admit they have big clusters with high-end, reliable networks, and we do know it's all in big bunkers - so we can be sure that no one, especially not the government, knows what happens there. So that's why we aren't discussing the most interesting stuff they've got. From just one hedge fund we know they have a couple of thousand nodes; that's a rather big one though. The rest we can only guess.

What we can discuss is the high-performance part, which is the trading engine - either in FPGA, or in software, or a hybrid. Now you're telling me that 'time to market' matters there? What world are you from? THIS IS ABOUT PERFORMANCE. If you can't deliver the performance, you don't even need to START the project.
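The time-to-market argument quoted above comes down to simple arithmetic: development time is both a direct cost and time subtracted from the window in which the strategy can earn. A back-of-envelope sketch - every figure below (weekly cost, window length, revenue rate) is invented purely for illustration:

```python
# Back-of-envelope comparison of slow vs. fast development against a
# finite market window. All figures are invented for illustration.

def net_return(dev_weeks, weekly_dev_cost, window_weeks, weekly_revenue):
    """Revenue over what remains of the market window after development,
    minus the development cost."""
    weeks_in_market = max(0, window_weeks - dev_weeks)
    return weeks_in_market * weekly_revenue - dev_weeks * weekly_dev_cost

window, revenue, cost = 26, 50_000, 20_000  # ~6-month profitable window

slow = net_return(24, cost, window, revenue)  # ~6 months of development
fast = net_return(2, cost, window, revenue)   # ~2 weeks of development
print(slow, fast)  # -380000 1160000
```

Under these made-up numbers the slower build loses money outright even if both teams ship the same strategy - which is the sense in which time to market is itself a performance number.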
I really doubt you know anything about selling a performance product if all you care about is time to market :)

Vincent

From landman at scalableinformatics.com Sat Feb 18 14:40:09 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Sat, 18 Feb 2012 14:40:09 -0500 Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> <20120218161215.GA48861@piskorski.com> <4F3FD990.7000503@scalableinformatics.com> <7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl> <4F3FED4C.3010903@scalableinformatics.com> Message-ID: <4F3FFE99.1040305@scalableinformatics.com>

On 02/18/2012 01:48 PM, Vincent Diepeveen wrote:
>> Seriously, if you don't get why this matters, well ...
>
> Why is Quadrics bankrupt?

Because ... they ... ran ... out ... of ... money?

> Why has everyone forgotten about Myri as a high-performance network?

Because ... other ... competitors ... have ... emerged ... with ... better ... more standardized ... stuff?

> Why is Itanium end-of-line?

Umm ... anyone who knows anything about business would be able to answer this one for you.

> Simple - they didn't perform well and/or were too expensive.

Not so simple. Quadrics ran out of money. Myri was surpassed by other tech. Itanium was never a good business idea.

> In all your naivety you forgot the most important things: PERFORMANCE and RELIABILITY.

Ahh ... all of my naiveté. And on that note, I'll close my participation in this amusing thread. A conversation where you have not only a failure in communication, but a profound ... seemingly unbounded ... seemingly willful ...
failure in comprehension ... yeah, not so much of a good conversation.

Really Vincent, it's been entertaining. Naiveté ... /chuckles

-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615

From hahn at mcmaster.ca Sat Feb 18 15:33:33 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat, 18 Feb 2012 15:33:33 -0500 (EST) Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right In-Reply-To: References: <032DD585-C678-4D27-902C-40823405348D@xs4all.nl> <20120218161215.GA48861@piskorski.com> <4F3FD990.7000503@scalableinformatics.com> <7FFFEFA3-B7C6-4DB1-8C65-3C631D922F5F@xs4all.nl> <4F3FED4C.3010903@scalableinformatics.com> Message-ID:

> Why is Quadrics bankrupt?
> Why has everyone forgotten about Myri as a high-performance network?
> Why is Itanium end-of-line?
>
> Simple - they didn't perform well and/or were too expensive.

False. They didn't execute well enough. Quadrics owned HPC for a while, but didn't execute properly in transitioning past QsNet2. They could have nipped IB in the bud. Imagine if Quadrics had managed to ship 10Gb VM-aware adapters as well as intelligent switching fabrics, and quickly moved to 40Gb and DCE-like wire-level features.

Myrinet also lost its place in HPC, probably for similar reasons, though it isn't quite gone. Its current product line seems focused on high-frequency trading, though I have no idea how well they succeed. Ironically, MX, with 3-4 us latency, is still reasonably attractive.
I have no idea why quadrics and myricom didn't manage to win by following the ethernet(ish) path. perhaps it was just the drag induced by the rest of the eth world, since they would have needed to make 40Gb prevalent several years ago to compete with IB.

itanium, as well, succeeded in a limited domain, but suffered because Intel wasn't willing to commit to it instead of facing the threat of AMD's x86_64 head-on. was it the right approach? I still think VLIW is a mistake ISA-wise, but perhaps if Intel had put as much ingenuity into it as they did into >=nehalem, it might have succeeded.

all three cases are failures of execution.

> The entire discussion here and the entire mailing list is interested
> in reliability and performance.

HPC is about performance; reliability is only of interest when it becomes a threat to performance.

> I really doubt you know anything about selling a performance product
> if all you care about is time to market :)

poor execution becomes a failure precisely because of time-to-market. specifically, the three examples have failed because poor execution let alternatives arrive in the market ahead of them.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Sat Feb 18 17:17:53 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Sat, 18 Feb 2012 14:17:53 -0800
Subject: [Beowulf] Pricing and Trading Networks: Down is Up, Left is Right
In-Reply-To: <4F3FD990.7000503@scalableinformatics.com>
Message-ID:

On 2/18/12 9:02 AM, "Joe Landman" wrote:

> FPGAs are very good at some subset of problems, but they are extremely
> hard to 'program'.
> Unless you get one of the "compilers" which use a virtual CPU of some
> sort to execute the code ... in which case you are giving up a majority
> of your usable performance anyway. And if someone from Convey or
> Mitrionics v2 wants to jump in and call BS (and even better, say
> something interesting on how you can avoid giving up the performance),
> I'd love to see/hear this. FPGAs have become something of a "red headed
> stepchild" of accelerators. The tasks they are good for, they are very
> good for. But getting near optimal performance is hard (based upon my
> past experience/knowledge ... more than 1 year old), and usually
> violates the "minimize time to market" criterion.
>
> If you have a problem which will change infrequently, and doesn't
> involve too much DP floating point, and lots of integer ops ... FPGAs
> might be a great fit technologically, though the other aspects have to
> be taken into account.

Reprogrammable FPGAs (tiny ones) were available in the mid 80s, so you could say that they're about 25 years old now. Compare that to more conventional computers, say, mid 40s. Think about how mature compilers and such were in 1965, especially in terms of optimizers, etc. And think about how many software developers there were back then (in comparison to the general technical professional population). FPGAs will get there. (of course, conventional CPUs are always going to be ahead)..

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From samuel at unimelb.edu.au Sun Feb 19 22:24:13 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Mon, 20 Feb 2012 14:24:13 +1100
Subject: [Beowulf] PCPro: AMD: what went wrong?
Message-ID: <4F41BCDD.4080408@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Interesting article on why AMD has been on the back foot.

http://www.pcpro.co.uk/features/372859/amd-what-went-wrong/print

[...]

Yet comparison is inevitable - and not very complimentary. Our review concluded that "Intel still holds all the cards", with pricier AMD FX processors delivering benchmark scores synonymous with Intel's mid-range Core i5s. The verdict was unanimous; our sister title bit-tech dubbed the FX-8150 a "stinker".

Light was shed on Bulldozer's problems when ex-AMD engineer Cliff Maier spoke out about manufacturing issues during the earliest stages of design. "Management decided there should be cross-engineering [between AMD and ATI], which meant we had to stop hand-crafting CPU designs," he said. Production switched to faster automated methods, but Maier says the change meant AMD's chips lost "performance and efficiency" as crucial parts were designed by machines, rather than experienced engineers.

AMD's latest chips haven't stoked the fires of consumers, either. Martin Sawyer, technical director at Chillblast, reports that "demand for AMD has been quite slow", and there's no rush to buy Bulldozer. "With no AMD solutions competitive with an Intel Core i5-2500K", he says, "AMD is a tough sell in the mid- and high-end market." Another British PC supplier told us off-the-record that sales are partly propped up by die-hards who only buy AMD "because they don't like Intel".

[...]
- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au  Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9BvN0ACgkQO2KABBYQAh81OACfU+Lzu7NANVdGm8BJ1+mwuEp+
Z1wAnRKgNOby5Jn56W0LCSeVsn88bpih
=c54h
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From hahn at mcmaster.ca Mon Feb 20 13:10:22 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Mon, 20 Feb 2012 13:10:22 -0500 (EST)
Subject: [Beowulf] PCPro: AMD: what went wrong?
In-Reply-To: <4F41BCDD.4080408@unimelb.edu.au>
References: <4F41BCDD.4080408@unimelb.edu.au>
Message-ID:

> mid-range Core i5s. The verdict was unanimous; our sister title
> bit-tech dubbed the FX-8150 a "stinker".

well, for desktops. specFPrate scores are pretty competitive (though sandybridge xeons are reportedly quite a bit better.)

> Light was shed on Bulldozer's problems when ex-AMD engineer Cliff
> Maier spoke out about manufacturing issues during the earliest stages
> of design. "Management decided there should be cross-engineering
> [between AMD and ATI], which meant we had to stop hand-crafting CPU
> designs," he said.

I'm purely armchair when it comes to low-level chip design, but to me, this makes it sound like there are problems with their tools. what's the nature of the magic that slower/human design makes, as opposed to the magic-less automatic design? is this a tooling-up issue that would only affect the first rev of auto-designed CPUs?
does this also imply that having humans tweak the design would make the GPU/APU chips faster, smaller or more power-efficient?

presumably this change from semi-manual to automatic design (layout?) was motivated by a desire to improve time-to-market. or perhaps improve consistency/predictability of development? have any such improvements resulted? from here, it looks like BD was a bit of a stinker and that the market is to some extent waiting to see whether Piledriver is the chip that BD should have been. if PD had followed BD by a few months, this discussion would have a different tone.

then again, GPUs were once claimed to have a rapid innovation cycle, but afaict that was a result of immaturity. current GPU cycles are pretty long, seemingly as long as, say, Intel's tick-tock. Fermi has been out for a long while with no significant successor. ATI chips seem to rev a high-order digit about once a year, but I'm not sure I'd really call 5xxx a whole different generation than 6xxx. (actually, 4xxx (2008) was pretty similar as well...)

> Production switched to faster automated methods, but Maier says the
> change meant AMD's chips lost "performance and efficiency" as crucial
> parts were designed by machines, rather than experienced engineers.

were these experienced engineers sitting on their hands during this time?

> AMD's latest chips haven't stoked the fires of consumers, either.
> Martin Sawyer, technical director at Chillblast, reports that "demand
> for AMD has been quite slow", and there's no rush to buy Bulldozer.

well, APU demand seems OK, though not very exciting because the CPU cores in these chips are largely what AMD has been shipping for years.

> "With no AMD solutions competitive with an Intel Core i5-2500K", he
> says, "AMD is a tough sell in the mid- and high-end market." Another
> British PC supplier told us off-the-record that sales are partly
> propped up by die-hards who only buy AMD "because they don't like Intel".

to some extent.
certainly AMD has at various times in the past been able to claim the crown in:
- 64b ISA and performance
- memory bandwidth and/or cpu:mem balance
- power efficiency
- integrated CPU-GPU price/performance
- specrate-type throughput/price efficiency

but Intel has executed remarkably well to take these away. for instance, although AMD's APUs are quite nice, Intel systems are power efficient enough that you can build a system with an add-in-card and still match or beat the APU power envelope. Intel seems to extract more stream-type memory bandwidth from the same dimms. and Intel has what seems like a pipeline already loaded with promising chips (SB Xeons, and presumably ivybridge improvements after that). MIC seems promising, but then again with GCN, GPUs are becoming less of an obstacle course for masochists.

from the outside, we have very little visibility into what's going on with AMD. they seem to be making some changes, which is good, since there have been serious problems. whether they're the right changes, I dunno. it's a little surprising to me how slowly they're moving, since being near-death would seem to encourage urgency. in some sense, the current state is near market equilibrium, though: Intel has the performance lead and is clearly charging a premium, with AMD trailing but arguably offering decent value with cheaper chips. this doesn't seem like a way for AMD to grow market share, though.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Mon Feb 20 15:29:43 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 20 Feb 2012 12:29:43 -0800
Subject: [Beowulf] PCPro: AMD: what went wrong?
In-Reply-To:
Message-ID:

Comments below about automated vs manual design..

On 2/20/12 10:10 AM, "Mark Hahn" wrote:

>> mid-range Core i5s. The verdict was unanimous; our sister title
>> bit-tech dubbed the FX-8150 a "stinker".
>
> well, for desktops. specFPrate scores are pretty competitive
> (though sandybridge xeons are reportedly quite a bit better.)
>
>> Light was shed on Bulldozer's problems when ex-AMD engineer Cliff
>> Maier spoke out about manufacturing issues during the earliest stages
>> of design. "Management decided there should be cross-engineering
>> [between AMD and ATI], which meant we had to stop hand-crafting CPU
>> designs," he said.
>
> I'm purely armchair when it comes to low-level chip design, but to me,
> this makes it sound like there are problems with their tools. what's
> the nature of the magic that slower/human design makes, as opposed to
> the magic-less automatic design?

One place where humans can do a better job is in the place and route, particularly if the design is tight on available space. If there's plenty of room, an autorouter can do pretty well, but if it's tight, you get to high 90s % routed, and then it gets sticky. It's a very, very complex problem because you have to not only find room for interconnects, but trade off propagation delay so that it can actually run at rated speed: spreading out slows you down. (same basic problem as routing printed circuit boards)

Granted modern place and route is very sophisticated, but ultimately, it's a heuristic process (Xilinx had simulated annealing back in the 80s, for instance) which is trying to capture routine guidelines and rules (as opposed to trying guided random strategies like GA, etc.)

Skilled humans can "learn" from previous similar experience, which so far, the automated tools don't. That is, a company doesn't do new CPU designs every week, so there's not a huge experience base for a "learning" router to learn from.
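[Editor's note: the annealing-based placement heuristic described above is easy to sketch. The toy below is an illustration only, not any vendor's tool; the netlist, cost function (total net span along a 1-D row of cells) and cooling schedule are invented for the example.]

```python
import math
import random

def wirelength(pos, nets):
    # Total span of each net: distance between its leftmost and rightmost cell.
    return sum(max(pos[c] for c in net) - min(pos[c] for c in net)
               for net in nets)

def anneal_placement(n_cells, nets, steps=20000, t0=5.0, seed=0):
    rng = random.Random(seed)
    pos = list(range(n_cells))        # placement = cell -> slot, start identity
    cost = wirelength(pos, nets)
    t = t0
    for _ in range(steps):
        i, j = rng.randrange(n_cells), rng.randrange(n_cells)
        pos[i], pos[j] = pos[j], pos[i]             # propose swapping two cells
        new_cost = wirelength(pos, nets)
        # Always accept improvements; accept uphill moves with probability
        # exp(-delta/T), so the early high-temperature phase can escape
        # local optima while the cooled late phase is effectively greedy.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
            cost = new_cost
        else:
            pos[i], pos[j] = pos[j], pos[i]         # revert the swap
        t *= 0.9995                                  # geometric cooling
    return pos, cost

# Toy netlist: 6 cells connected in a ring; each pair wants to be adjacent.
nets = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
placement, cost = anneal_placement(6, nets)
print("placement:", placement, "wirelength:", cost)
```

Real place-and-route adds legality constraints, timing-driven costs, and far better move generation, which is exactly where the hand-tuning Jim describes comes in.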
The other thing that humans can do is have a better feel for working the tolerances. That is, they can make use of knowledge that some variabilities are correlated (e.g. two parts side by side on the die will "track", something that is poorly captured in a spec for the individual parts).

Pushing the timing margins is where it's all done.

> is this a tooling-up issue that would
> only affect the first rev of auto-designed CPUs? does this also imply
> that having humans tweak the design would make the GPU/APU chips faster,
> smaller or more power-efficient?

Historically, the output of the automated tools is very hard to modify by a human, except in a peephole optimization sense. This is because a human generated design will typically have some sort of conceptual architecture that all hangs together. An automated design tends to be, well, unconstrained by the need for a consistent conceptual view.

It's a lot harder to change something in one place and know that it won't break something else, if you didn't follow and participate in the design process from the top.

There's a very distinct parallel here to optimizing compilers and "hand coded assembly". There are equivalent tools to profilers and such, but it's the whole thing about how a bad top level design can't be saved by extreme low level optimization.

Bear in mind that Verilog and VHDL are about like Assembler (even if they have a "high level" sort of C-like look to them). There are big subroutine libraries (aka IP cores), but it's nothing like, say, an automatically parallelizing FORTRAN compiler that makes effective use of a vector unit.

> presumably this change from semi-manual to automatic design (layout?)
> was motivated by a desire to improve time-to-market. or perhaps improve
> consistency/predictability of development? have any such improvements
> resulted?
> from here, it looks like BD was a bit of a stinker and that
> the market is to some extent waiting to see whether Piledriver is the
> chip that BD should have been. if PD had followed BD by a few months,
> this discussion would have a different tone.

There is a HUGE desire to do better automated design, for the same reason we use high level languages to develop software: it greatly improves productivity (in terms of number of designs that can be produced by one person).

There aren't all that many people doing high complexity IC development. Consider something like a IEEE-1394 (Firewire) core. There are probably only 4 or 5 people in the *world* who are competent to design it or at least lead a design: not only do you need to know all the idiosyncracies of the process, but you also need to really understand IEEE-1394 in all of its funky protocol details.

Ditto for processor cores. For an example of a fairly simple and well documented core, take a look at the LEON implementations of the SPARC (which are available for free as GPLed VHDL). That's still a pretty complex piece of logic, and not something you just leap into modifying, or recreating.

http://www.gaisler.com/cms/index.php?option=com_content&task=view&id=156&Itemid=104

> then again, GPUs were once claimed to have a rapid innovation cycle,
> but afaict that was a result of immaturity. current GPU cycles are
> pretty long, seemingly as long as, say, Intel's tick-tock. Fermi
> has been out for a long while with no significant successor. ATI
> chips seem to rev a high-order digit about once a year, but I'm not
> sure I'd really call 5xxx a whole different generation than 6xxx.
> (actually, 4xxx (2008) was pretty similar as well...)

I suspect that the "cycle rate" is driven by market forces. At some point, there's less demand for higher performance, particularly for something consumer driven like GPUs.
At some point, you're rendering all the objects you need at resolutions higher than human visual resolution, and you don't need to go faster. Maybe the back-end physics engine could be improved (render individual sparks in a flame or droplets in a cloud) but there's a sort of cost benefit analysis that goes into this.

For consumer "single processor" kinds of applications we're probably in that zone. How much faster do you need to render that spreadsheet or word document? The bottleneck isn't the processor, it's the data pipe coming in, whether streamed from a DVD or over the network connection.

>> Production switched to faster automated methods, but Maier says the
>> change meant AMD's chips lost "performance and efficiency" as crucial
>> parts were designed by machines, rather than experienced engineers.
>
> were these experienced engineers sitting on their hands during this time?

No, they were designing other things (or were hired away by someone else). There's always more design work to be done than people to do it. Maybe AMD had some Human Resources/Talent Management/Human Capital issues and their top talent bolted to somewhere else? (there are people with a LOT of cash in the financial industry and in government who are interested in ASIC designs.. At least if the ads in the back of IEEE Spectrum and similar are any sign.)

Being a skilled VLSI designer capable of leading a big CPU design these days is probably a "guaranteed employment and name your salary" kind of profession.

>> AMD's latest chips haven't stoked the fires of consumers, either.
>> Martin Sawyer, technical director at Chillblast, reports that "demand
>> for AMD has been quite slow", and there's no rush to buy Bulldozer.
>
> well, APU demand seems OK, though not very exciting because the CPU
> cores in these chips are largely what AMD has been shipping for years.

I would speculate that consumer performance demands have leveled out, for the data bottleneck reasons discussed above.
Sure, I'd like to rip DVDs to my server a bit faster, but I'm not going to go out and buy a new computer to do it (and of course, it's still limited by how fast I can read the DVD).

>> "With no AMD solutions competitive with an Intel Core i5-2500K", he
>> says, "AMD is a tough sell in the mid- and high-end market." Another
>> British PC supplier told us off-the-record that sales are partly
>> propped up by die-hards who only buy AMD "because they don't like Intel".
>
> to some extent. certainly AMD has at various times in the past been able
> to claim the crown in:
> - 64b ISA and performance
> - memory bandwidth and/or cpu:mem balance
> - power efficiency
> - integrated CPU-GPU price/performance
> - specrate-type throughput/price efficiency
>
> but Intel has executed remarkably well to take these away. for instance,
> although AMD's APUs are quite nice, Intel systems are power efficient
> enough that you can build a system with an add-in-card and still match
> or beat the APU power envelope. Intel seems to extract more stream-type
> memory bandwidth from the same dimms. and Intel has what seems like a
> pipeline already loaded with promising chips (SB Xeons, and presumably
> ivybridge improvements after that). MIC seems promising, but then again
> with GCN, GPUs are becoming less of an obstacle course for masochists.

Maybe Intel hired all of AMD's top folks away, and that's why AMD is using more automated design?

> from the outside, we have very little visibility into what's going on with
> AMD. they seem to be making some changes, which is good, since there have
> been serious problems. whether they're the right changes, I dunno. it's
> a little surprising to me how slowly they're moving, since being near-death
> would seem to encourage urgency. in some sense, the current state is near
> market equilibrium, though: Intel has the performance lead and is clearly
> charging a premium, with AMD trailing but arguably offering decent value
> with cheaper chips.
> this doesn't seem like a way for AMD to grow market share, though.

But hasn't that really been the case since the very early days of x86? I seem to recall some computers out in my garage with AMD 286 and 386 clones in them.

AMD could also attack the embedded processor market with high integration flavors of the processors.

Does AMD really need to grow market share? If the overall pie keeps getting bigger, they can grow, keeping constant percentage market share. They've been around long enough that by no means could they be considered a start-up in a rapid growth phase.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Mon Feb 20 15:48:49 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Mon, 20 Feb 2012 21:48:49 +0100
Subject: [Beowulf] PCPro: AMD: what went wrong?
In-Reply-To: References: Message-ID:

On Feb 20, 2012, at 9:29 PM, Lux, Jim (337C) wrote:

> Comments below about automated vs manual design..
>
> On 2/20/12 10:10 AM, "Mark Hahn" wrote:
>
>>> mid-range Core i5s. The verdict was unanimous; our sister title
>>> bit-tech dubbed the FX-8150 a "stinker".
>>
>> well, for desktops. specFPrate scores are pretty competitive
>> (though sandybridge xeons are reportedly quite a bit better.)
>>
>>> Light was shed on Bulldozer's problems when ex-AMD engineer Cliff
>>> Maier spoke out about manufacturing issues during the earliest stages
>>> of design. "Management decided there should be cross-engineering
>>> [between AMD and ATI], which meant we had to stop hand-crafting CPU
>>> designs," he said.
>>
>> I'm purely armchair when it comes to low-level chip design, but to me,
>> this makes it sound like there are problems with their tools.
>> what's the nature of the magic that slower/human design makes, as opposed
>> to the magic-less automatic design?
>
> One place where humans can do a better job is in the place and route,
> particularly if the design is tight on available space. If there's plenty
> of room, an autorouter can do pretty well, but if it's tight, you get to
> high 90s % routed, and then it gets sticky. It's a very, very complex
> problem because you have to not only find room for interconnects, but
> trade off propagation delay so that it can actually run at rated speed:
> spreading out slows you down. (same basic problem as routing printed
> circuit boards)
>
> Granted modern place and route is very sophisticated, but ultimately, it's
> a heuristic process (Xilinx had simulated annealing back in the 80s, for
> instance) which is trying to capture routine guidelines and rules (as
> opposed to trying guided random strategies like GA, etc.)

Actually, for hand optimization of yields on modern CPUs, techniques like simulated annealing are less popular. You can also use linear solvers to recalculate the entire design, and under the right constraints that gives an optimal solution, which is not guaranteed for the non-linear solving methods, as those can easily end up in a local optimum. Simulated annealing is more popular for non-linear problems, such as in artificial intelligence.

> [snip]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From michf at post.tau.ac.il Mon Feb 20 17:52:03 2012
From: michf at post.tau.ac.il (Micha)
Date: Tue, 21 Feb 2012 00:52:03 +0200
Subject: [Beowulf] newb question - help with NUMA + mpich2 + GPUs
Message-ID: <4F42CE93.4070600@post.tau.ac.il>

Sorry if this is inappropriate here. I'm finally growing from clusters of single CPUs to a machine with multiple CPUs, which means that I need to start taking note of NUMA issues. I'm looking for information on how to achieve that with MPI under Linux.
I'm currently using mpich2, but I don't mind switching if needed.

Things are actually more complex as this is a mixed CPU/GPU (CUDA)
system, so I'm also looking for how to effectively transfer data
between GPUs sitting on different PCIe slots and how to find the
affinity between GPUs and CPUs. Also, at what stage is the support for
using MPI to copy between GPUs?

Thanks for any pointers

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From samuel at unimelb.edu.au Mon Feb 20 22:31:12 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Tue, 21 Feb 2012 14:31:12 +1100
Subject: [Beowulf] newb question - help with NUMA + mpich2 + GPUs
In-Reply-To: <4F42CE93.4070600@post.tau.ac.il>
References: <4F42CE93.4070600@post.tau.ac.il>
Message-ID: <4F431000.3060808@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 21/02/12 09:52, Micha wrote:

> Things are actually more complex as this is a mixed CPU/GPU (CUDA)
> system, so I'm also looking for how to effectively transfer data
> between GPUs sitting on different PCIe slots and how to find the
> affinity between GPUs and CPUs. Also, at what stage is the support
> for using MPI to copy between GPUs?

The hwloc library from the Open-MPI folks will probably help with some
of it:

http://www.open-mpi.org/projects/hwloc/

It can show you which cores are near which PCI devices for instance,
and lstopo is a fantastic tool for getting a quick overview of a node.

I *believe* that code is in the 1.5 series, but it'd be well worth
asking the question on the open-mpi lists to get a definitive answer
from someone who knows what they're talking about.
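[Editorial aside: hwloc is the right tool here, but the GPU-to-CPU
affinity it reports comes from data the Linux kernel already exports
under sysfs. A rough sketch of reading it directly, assuming Linux;
the sysfs walk returns an empty map on machines without those files:]

```python
from pathlib import Path

def parse_cpulist(s):
    """Expand a kernel cpulist string like '0-3,8,10-11' into CPU ids."""
    cpus = []
    for part in s.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus

def pci_local_cpus():
    """Map each PCI device (e.g. a GPU) to the CPUs local to its node."""
    out = {}
    base = Path("/sys/bus/pci/devices")
    if not base.is_dir():
        return out          # not Linux, or no sysfs: nothing to report
    for dev in base.glob("*"):
        f = dev / "local_cpulist"
        if f.exists():
            out[dev.name] = parse_cpulist(f.read_text())
    return out

print(parse_cpulist("0-3,8,10-11"))  # [0, 1, 2, 3, 8, 10, 11]
```

Pinning an MPI rank to the CPUs local to "its" GPU is then a matter of
feeding that list to a binding tool (taskset, hwloc-bind, or the MPI
launcher's binding options).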
:-)

There was also a discussion on the Open-MPI devel list recently about
why MVAPICH2 appears to do better than Open MPI with GPUs (for the
moment); the summary is here:

http://www.open-mpi.org/community/lists/devel/2012/02/10430.php

Hope this helps!
Chris

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9DEAAACgkQO2KABBYQAh8h1gCfRtYtAY6hra6ckeoC60ZkfqOe
qPwAnAsZCHB/5E9QMYutgTMKiW4cdlxO
=cwG5
-----END PGP SIGNATURE-----
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From samuel at unimelb.edu.au Mon Feb 20 22:39:37 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Tue, 21 Feb 2012 14:39:37 +1100
Subject: [Beowulf] Controlling java's hunger for RAM
Message-ID: <4F4311F9.9060200@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi folks,

A perennial issue on our x86 clusters is Java and its unpredictability
in wanting RAM. It seems it defaults to trying to mmap() a quarter of
system RAM (12GB on our 48GB nodes, for instance) unless overridden by
the -Xmx parameter.

Now we'd like a way to be able to set a default for -Xmx for all Java
processes, but cannot use $_JAVA_OPTIONS as that overrides the command
line options rather than the other way around (which would have been
the sensible way to do it).
The reason why is that Torque sets RLIMIT_RSS to enforce memory
requests on jobs, and malloc() now usually calls mmap() to allocate
rather than sbrk(), so RLIMIT_DATA is useless (not enforced).

Any ideas?

cheers!
Chris

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9DEfkACgkQO2KABBYQAh8tbACfavVT01WYQYbkeYgKfjoUJ6jP
9AEAnRDqHcU9Egf9fM24KTtZxUhvfN9u
=/fHd
-----END PGP SIGNATURE-----
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Mon Feb 20 23:14:06 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 20 Feb 2012 20:14:06 -0800
Subject: [Beowulf] PCPro: AMD: what went wrong?
In-Reply-To:
Message-ID:

On 2/20/12 12:48 PM, "Vincent Diepeveen" wrote:

> On Feb 20, 2012, at 9:29 PM, Lux, Jim (337C) wrote:
>
>> Comments below about automated vs manual design..
>>
>> Granted modern place and route is very sophisticated, but ultimately,
>> it's a heuristic process (Xilinx had simulated annealing back in the
>> 80s, for instance) which is trying to capture routing guidelines and
>> rules (as opposed to trying guided random strategies like GA, etc.)
>
> Actually for hand optimization of yields at modern CPUs stuff like
> simulated annealing is less popular.

I don't know that anyone still uses simulated annealing.. it was an
example of what kinds of strategies were used in the early days.
Back in the late 70s, early 80s, I was looking into automated layout of
PCBs. It was pretty grim.. 80% routing, then it would die. The
computational challenge is substantial.

> You can actually also use linear solvers for that, in order to
> recalculate the entire design, and under the right constraints it
> gives an optimal solution, which is not guaranteed for the non-linear
> solving methods, as those can also easily get stuck in a local
> maximum.

I don't think a linear solver would work.

> Stuff like simulated annealing is more popular for non-linear problems
> such as in artificial intelligence.

The place and route problem is highly nonlinear with a lot of weird
interactions.

I'll be the first to confess that I am pretty bad at PCB or ASIC
layout, but there are a lot of tricky constraints that aren't a linear
function of position in some form. Imagine having a data bus with 32
lines that you need to have minimal skew between, so it can be latched.

I suppose this is a kind of game playing application, so move tree
search strategies might work. Certainly it has EP aspects (or nearly
EP), so a big parallel machine might help. For all we know, Intel and
AMD have big clusters helping the designers out, running 1000 copies of
timing simulators. Do Cadence, Synopsys, etc. have parallel versions?

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From hahn at mcmaster.ca Tue Feb 21 00:02:39 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Tue, 21 Feb 2012 00:02:39 -0500 (EST)
Subject: [Beowulf] Controlling java's hunger for RAM
In-Reply-To: <4F4311F9.9060200@unimelb.edu.au>
References: <4F4311F9.9060200@unimelb.edu.au>
Message-ID:

> The reason why is that Torque sets RLIMIT_RSS to enforce memory

RLIMIT_RSS is simply a noop; we use RLIMIT_AS (vmem, pvmem). a good
thing about this is that it's fully consistent with the
vm.overcommit_memory=2 (ie conservative) mode. some people find it
offputting the way VM counts things like code (usually mapped multiple
times) or F77 arrays that are defined larger than used.

regards, mark hahn.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From samuel at unimelb.edu.au Tue Feb 21 00:21:50 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Tue, 21 Feb 2012 16:21:50 +1100
Subject: [Beowulf] Controlling java's hunger for RAM
In-Reply-To:
References: <4F4311F9.9060200@unimelb.edu.au>
Message-ID: <4F4329EE.8070406@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 21/02/12 16:02, Mark Hahn wrote:

> RLIMIT_RSS is simply a noop; we use RLIMIT_AS (vmem, pvmem).

Gah, sorry, it is indeed RLIMIT_AS we set too (locally patched Torque
so that mem and pmem set it too). Perils of writing this stuff from
memory whilst dealing with the after-effects of someone accidentally
upgrading one node of a multi-master CentOS DS LDAP cluster whilst I
was away..
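[Editorial aside: the RLIMIT distinction in this thread is easy to see
from Python on a Linux box. A rough sketch, assuming Linux, where
RLIMIT_RSS has long been unenforced while RLIMIT_AS caps the address
space that mmap()-backed malloc() actually consumes:]

```python
import resource

# RLIMIT_RSS: the kernel accepts the value but does not enforce it on
# modern Linux, so a large allocation still succeeds.
rss_soft, rss_hard = resource.getrlimit(resource.RLIMIT_RSS)
resource.setrlimit(resource.RLIMIT_RSS, (1 << 20, rss_hard))  # "1 MiB"
buf = bytearray(64 * 1024 * 1024)  # 64 MiB, far over the "limit"
print("RLIMIT_RSS ignored:", len(buf) == 64 * 1024 * 1024)

# RLIMIT_AS caps total address space, which is what malloc()'s
# mmap()-backed allocations hit, so an oversized request fails cleanly.
as_soft, as_hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (2 * 1024 ** 3, as_hard))
try:
    big = bytearray(4 * 1024 ** 3)  # 4 GiB: exceeds the 2 GiB cap
    print("RLIMIT_AS not enforced?")
except MemoryError:
    print("RLIMIT_AS enforced: 4 GiB allocation refused")
finally:
    resource.setrlimit(resource.RLIMIT_AS, (as_soft, as_hard))
```

The same applies to a batch system: setting the address-space limit on
the job's processes is what actually bounds them, which is why the
locally patched Torque above maps mem/pmem onto RLIMIT_AS.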
- -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk9DKe4ACgkQO2KABBYQAh8wYQCcC1wbmsXdpqhd28UP8h7wFD5X aV0AnjfLcIu6z2E0zyKMhJWT7/Kz4vJD =y1WX -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Tue Feb 21 04:32:33 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Tue, 21 Feb 2012 10:32:33 +0100 Subject: [Beowulf] PCPro: AMD: what went wrong? In-Reply-To: References: Message-ID: <2DE30CF4-F028-4798-967F-A9C3FD703509@xs4all.nl> On Feb 21, 2012, at 5:14 AM, Lux, Jim (337C) wrote: > > > On 2/20/12 12:48 PM, "Vincent Diepeveen" wrote: > >> >> On Feb 20, 2012, at 9:29 PM, Lux, Jim (337C) wrote: >> >>> Comments below about automated vs manual design.. >>> >>> Granted modern place and route is very sophisticated, but >>> ultimately, it's >>> a heuristic process (Xilinx had simulated annealing back in the >>> 80s, for >>> instance) which is trying to capture routine guidelines and rules >>> (as >>> opposed to trying guided random strategies like GA, etc.) >> >> Actually for hand optimization of yields at modern CPU's stuff like >> simulated annealing is less popular. > > I don't know that anyone still uses simulated annealing..it was an > example > of what kinds of strategies were used in the early days. Back in > the late > 70s, early 80s, I was looking into automated layout of PCBs. It was > pretty grim.. 80% routing, then it would die. 
> The computational challenge is substantial.
>
>> You can actually also use linear solvers for that, in order to
>> recalculate the entire design, and under the right constraints it
>> gives an optimal solution, which is not guaranteed for the non-linear
>> solving methods, as those can also easily get stuck in a local
>> maximum.
>
> I don't think a linear solver would work.

That's always the tool that gets used. A chip has too many parameters
to approximate in the first place.

Non-linear approximation by randomness is more popular in my own area
of expertise - parameter tuning, for example for chess programs.
Despite Diep being the chess program with the world's largest
evaluation function (and as programmed by one programmer - sure, some
parts are totally outdated 1990s code), it has just some 20k tunable
parameters or so, depending upon how you count. In CPUs that's
different of course, so given the constraints, linear solvers get used,
at least for the feature sizes that CPUs have right now. I'm not
familiar with 22 nm there.

>> Stuff like simulated annealing is more popular for non-linear
>> problems such as in artificial intelligence.
>
> The place and route problem is highly nonlinear with a lot of weird
> interactions.
>
> I'll be the first to confess that I am pretty bad at PCB or ASIC
> layout,

Well, realize I'm a software engineer, yet the optimization is 100%
software. Usually there are small companies around the big shots that
deliver machines to AMD and Intel - relatively tiny companies more or
less belonging to them - which try to improve yields. See it as service
software belonging to the machines delivered by the ASMLs.

> but there's a lot of tricky constraints that aren't a linear function
> of position in some form. Imagine having a data bus with 32 lines
> that you need to have minimal skew between, so it can be latched.

It's not artificial intelligence; it's in all cases something that eats
two-dimensional space, where the relatively primitive model of the
lines is in fact not straight lines but shapes with roundings
everywhere. You can model this incredibly complexly, with 0% odds that
you're going to solve it optimally, or you can make a simpler model and
solve that perfectly. Something that would definitely be tougher if
there were moveable parts within the component.

So they all choose to model it using linear programming. That gives a
perfect solution, and within at most a few days of calculation. Using
approximation, even a moderate CPU would need the world's largest
supercomputer for longer than we live.

> I suppose this is a kind of game playing application so move tree
> search strategies might work. Certainly it has EP aspects (or nearly
> EP), so a

Tree search is far more complex than the above. The above has a clear
goal - yield optimization - and everything is always possible to solve
by taking a tad more space. The non-linear aspects of doing this, which
the linear model doesn't take into consideration, are the reason why
throwing a couple of hundred engineers at hand-optimizing the CPU is so
effective.

> big parallel machine might help. For all we know, Intel and AMD have
> big clusters helping the designers out, running 1000 copies of timing
> simulators. Do Cadence, Synopsys, etc. have parallel versions?

I don't speak for anyone here except myself. For AMD it's easier to
guess than for Intel, as Intel is moving to 22 nm, and I'm not familiar
with 22 nm. Each new generation of machine has new problems, let me
assure you of that.

Realize how rather inefficient simulated annealing and similar methods
are - trying to progress using randomness. This is a problem already
with 20k parameters. Realize the big supercomputers thrown at improving
parameters of engines smaller than Diep, with at most 2000 parameters
or so.

Initially some guys tried to get away with defining functions, reducing
that amount of parameters to 50-200. They still do. Then N*SA throws a
big supercomputer at the problem, and that's how they do it. At 150
parameters you're already looking at an oracle size of 2+ million
instances and a multiple of that in tuning steps. This still tunes into
a local maximum, by the way - no guarantee of the optimum solution -
and this local maximum already takes long.

For something with 2+ billion transistors, are you anywhere close to
realizing the problem of doing that in a lossy manner, risking getting
into a local maximum? It means even a tiny modification takes forever
to solve, just to keep the same yields.

So in the case of CPUs, just redefine the entire problem to be linear,
OR try to create a lookup table which you can insert into the linear
solver. You can insert a non-linear subproblem with just a handful of
parameters into a linear solver, if you have a table that lists all
possibilities.

For any scientist who claims on paper that using approximation he has a
non-linear solver that will always find the guaranteed best optimum,
the first question I ask myself is: "what's the big-O worst case to get
*any* reasonable solution at all?" Most usually start from total
randomness and need to run through all steps, and only when nearing the
end of the algorithm do they start having *a* solution, which still is
not optimal. You don't have time to wait for that *a* solution, in
fact.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
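[Editorial aside: for readers who have never seen why "progress by
randomness" is viewed with suspicion here, a toy simulated-annealing
placer. This has nothing to do with any real EDA flow - the cell count,
net list, and cooling schedule are invented for illustration - but it
shows the basic loop the thread is arguing about: propose a random
swap, accept it if it shortens total wirelength (or occasionally even
if it doesn't, while the "temperature" is high), and cool down:]

```python
import math
import random

def wirelength(pos, nets):
    """Total half-perimeter wirelength of all nets for a placement."""
    total = 0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def anneal(cells, nets, grid=8, steps=20000, seed=1):
    rng = random.Random(seed)
    slots = [(x, y) for x in range(grid) for y in range(grid)]
    rng.shuffle(slots)                      # random initial placement
    pos = dict(zip(cells, slots))
    cost = wirelength(pos, nets)
    temp = 10.0
    for _ in range(steps):
        a, b = rng.sample(cells, 2)         # propose swapping two cells
        pos[a], pos[b] = pos[b], pos[a]
        new = wirelength(pos, nets)
        if new > cost and rng.random() >= math.exp((cost - new) / temp):
            pos[a], pos[b] = pos[b], pos[a]  # reject: undo the swap
        else:
            cost = new                       # accept
        temp *= 0.9995                       # cool slowly toward greedy
    return pos, cost

cells = list(range(32))
rng = random.Random(0)
nets = [rng.sample(cells, 3) for _ in range(48)]  # invented netlist
pos0 = dict(zip(cells, [(x, y) for x in range(8) for y in range(8)]))
start = wirelength(pos0, nets)
_, end = anneal(cells, nets)
print("wirelength:", start, "->", end)
```

Note what the toy demonstrates about the complaint above: every
proposal re-evaluates the cost, thousands of steps only reach a local
minimum, and nothing guarantees optimality - which is the argument for
recasting the problem so an exact (linear) solver applies.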
From john.hearns at mclaren.com Tue Feb 21 04:42:26 2012 From: john.hearns at mclaren.com (Hearns, John) Date: Tue, 21 Feb 2012 09:42:26 -0000 Subject: [Beowulf] newb question - help with NUMA + mpich2 + GPUs References: <4F42CE93.4070600@post.tau.ac.il> Message-ID: <207BB2F60743C34496BE41039233A8090B7D6D6C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> Micha. Here is probably a good starting point for you: http://www.open-mpi.org/projects/hwloc/ I would download hwloc, install it on your system and print out the topology. That page provides good reading matter on numa placement. Also on a NUMA machine I find that the 'htop' utility is very useful - you should always check that Processes are running on the CPUs you think they should be http://htop.sourceforge.net/ The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From eugen at leitl.org Wed Feb 22 10:50:33 2012 From: eugen at leitl.org (Eugen Leitl) Date: Wed, 22 Feb 2012 16:50:33 +0100 Subject: [Beowulf] Chinese 16 core CPU uses message passing Message-ID: <20120222155033.GK7343@leitl.org> (experimental chip. not Godson) http://semiaccurate.com/2012/02/21/chinese-16-core-cpu-uses-message-passing/ Chinese 16 core CPU uses message passing Gone are the old days of massive shared memory architectures. 
Feb 21, 2012, in Chips, by Mads Ølholm

During the first day of ISSCC in San Francisco, researchers from Fudan
University in Shanghai described a brand new microprocessor that does
away with the traditional shared memory architecture.

Photo courtesy of Fudan

The advantage of using a message passing scheme is that it scales much
better than shared memory. Whereas shared memory relies on software,
the message passing scheme has been implemented using mailboxes
designed in hardware, according to the research paper that was
presented at ISSCC. The processor itself consists of 16 RISC cores that
share two small cores for shared memory access, but much of the
communication is done using message passing. The processor also does
away with traditional caches and instead implements an extended
register file.

The end result is a processor that has been implemented in a TSMC 65nm
CMOS process and runs at up to 800MHz. When dialed back to 750MHz, each
core can run at 1.2V and consume only 34mW, which shows that the design
is extremely energy efficient. We look forward to seeing this in the
wild.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From hahn at mcmaster.ca Wed Feb 22 17:08:22 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Wed, 22 Feb 2012 17:08:22 -0500 (EST)
Subject: [Beowulf] Chinese 16 core CPU uses message passing
In-Reply-To: <20120222155033.GK7343@leitl.org>
References: <20120222155033.GK7343@leitl.org>
Message-ID:

> (experimental chip. not Godson)
>
> http://semiaccurate.com/2012/02/21/chinese-16-core-cpu-uses-message-passing/

unfortunately little real info there.

> Gone are the old days of massive shared memory architectures.

weird.
has this writer looked at chip diagrams for the past few years? does 16
cores per memory interface (AMD) count as massive? yes, most systems
are CC-NUMA, but the number of "massive" CC-NUMA systems can really be
spelled with three letters: SGI. and basically boutique.

> During the first day of ISSCC in San Francisco research from Fudan
> University in Shanghai described a brand new microprocessor

"brand new microprocessor" is a sort of funny phrase. lots of things
have been tried before, including, afaict, everything in this chip.
this paper seems similar to http://dx.doi.org/10.1109/ICSICT.2010.5667778
(some of the same authors) which also involves a network-on-chip and
"extended register file". it's based on MIPS32, which is a pretty
popular choice for arch experiments.

> that does away with the traditional shared memory architecture.

not really.

> Photo courtesy of Fudan
>
> The advantage of using a message passing scheme is that it scales
> much better than the shared memory.

apples scale better than oranges, too. the duality of MP and SM is not
a new concept - not that we have such a great handle on it.

> Whereas shared memory relies on software,

yikes. oversimplify much?

> the message passing scheme has been implemented using mailboxes
> designed in hardware, according to the research paper that was
> presented at ISSCC. The processor itself consists of 16 RISC cores
> that share two small cores for shared memory access,

I'd prefer to see it described as 8 compute cores surrounding a memory
core, with all cores on an in-chip network, but (presumably) no
coherency between the two memory cores. the diagram makes the chip
look to be focused on stream processing (the related paper uses
reed-solomon decoding as its test load).

> but much of the communication is done using message passing. The
> processor also does away with the traditional caches and instead
> implements an extended register file.

well, I think I'd call the MCore a cache; if you do, the diagram looks
much more conventional...

I love experimental chips and arch; I wish this paper were available
already. but the field is very well-plowed - that doesn't detract from
its fertility. what I _don't_ see in my sampling of current papers is
any attempt to create a new or improved programming model that can
nicely scale, both in terms of architecture and productivity, to
systems of many cores. I'm also pretty convinced that one needs to
start with a model that doesn't start with separate boxes labeled
"cpu" and "memory".
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From john.hearns at mclaren.com Tue Feb 28 10:36:51 2012
From: john.hearns at mclaren.com (Hearns, John)
Date: Tue, 28 Feb 2012 15:36:51 -0000
Subject: [Beowulf] Computer on a stick
Message-ID: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com>

Cotton Candy

http://www.reghardware.com/2012/02/28/fxi_technologies_offers_cotton_candy_linux_pc_on_a_stick/

A candidate for my 'fit a cluster inside the glovebox of your car' idea
if ever there was one!

1.2Ghz processor and 1GB DRAM

What was the spec of the original Beowulf project nodes?
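[Editorial aside on the Fudan chip thread above: the hardware-mailbox
style of message passing it describes - each core owning a small,
fixed-capacity inbox, with senders blocking when it is full, instead of
coordinating through shared memory - can be sketched in software. A toy
model only; the core count and mailbox depth are invented:]

```python
import threading
import queue

class Core(threading.Thread):
    """A toy core with a fixed-capacity mailbox, mimicking a hardware
    mailbox: senders block when the destination's slots are full."""
    def __init__(self, cid, mailbox_slots=4):
        super().__init__()
        self.cid = cid
        self.mailbox = queue.Queue(maxsize=mailbox_slots)
        self.received = []

    def send(self, dst, payload):
        dst.mailbox.put((self.cid, payload))  # blocks if dst is full

    def run(self):
        while True:
            src, payload = self.mailbox.get()
            if payload is None:               # poison pill: stop
                break
            self.received.append((src, payload))

cores = [Core(i) for i in range(4)]
for c in cores:
    c.start()
# core 0 scatters a value to every other core's mailbox
for c in cores[1:]:
    cores[0].send(c, "hello")
for c in cores:
    c.mailbox.put((None, None))
for c in cores:
    c.join()
print([c.received for c in cores])
```

The bounded mailbox is the point: back-pressure is built into the
transport, which is one reason such schemes are argued to scale better
than coherence traffic over shared memory.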
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From Daniel.Pfenniger at unige.ch Tue Feb 28 11:19:02 2012
From: Daniel.Pfenniger at unige.ch (Daniel Pfenniger)
Date: Tue, 28 Feb 2012 17:19:02 +0100
Subject: [Beowulf] Computer on a stick
In-Reply-To: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com>
References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com>
Message-ID: <4F4CFE76.5000502@unige.ch>

Hearns, John a écrit :
> Cotton Candy
>
> http://www.reghardware.com/2012/02/28/fxi_technologies_offers_cotton_candy_linux_pc_on_a_stick/
>
> A candidate for my 'fit a cluster inside the glovebox of your car'
> idea if ever there was one!
>
> 1.2Ghz processor and 1GB DRAM
>
> What was the spec of the original Beowulf project nodes?

From the top of my approximate brain memory, in 1994 a Pentium
70-100MHz processor and 8MB of RAM were upper commodity hardware specs.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Tue Feb 28 11:32:26 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Tue, 28 Feb 2012 08:32:26 -0800
Subject: [Beowulf] Computer on a stick
In-Reply-To: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com>
Message-ID:

Way slower than that, I'm sure. Were they even Pentiums?
From: "Hearns, John"
Date: Tue, 28 Feb 2012 07:36:51 -0800
To: "beowulf at beowulf.org"
Subject: [Beowulf] Computer on a stick

> Cotton Candy
>
> http://www.reghardware.com/2012/02/28/fxi_technologies_offers_cotton_candy_linux_pc_on_a_stick/
>
> A candidate for my 'fit a cluster inside the glovebox of your car'
> idea if ever there was one!
>
> 1.2Ghz processor and 1GB DRAM
>
> What was the spec of the original Beowulf project nodes?

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org Tue Feb 28 11:39:08 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 28 Feb 2012 17:39:08 +0100
Subject: [Beowulf] Computer on a stick
In-Reply-To:
References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com>
Message-ID: <20120228163908.GQ7343@leitl.org>

On Tue, Feb 28, 2012 at 08:32:26AM -0800, Lux, Jim (337C) wrote:
> Way slower than that, I'm sure. Were they even Pentiums?

IIRC Becker's first cluster were i486?
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From raysonlogin at gmail.com Tue Feb 28 11:57:59 2012
From: raysonlogin at gmail.com (Rayson Ho)
Date: Tue, 28 Feb 2012 11:57:59 -0500
Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking
In-Reply-To:
References: <4F354F26.2040103@scalableinformatics.com>
Message-ID:

The paper is now available online, "CPU-Assisted GPGPU on Fused CPU-GPU
Architectures":

http://people.engr.ncsu.edu/hzhou/hpca_12_final.pdf

(I have not read the whole paper yet.)

I think the core idea is that the CPU acts as a prefetch thread and
pulls data into the shared L3 for the GPU cores (this work is like
other prefetch-thread research projects that use otherwise spare SMT
threads to do prefetching for the main compute thread), and as the GPU
cores get more cache hits, the performance is better (hence the 20%
mentioned in the extremetech article).

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/

On Fri, Feb 10, 2012 at 12:48 PM, Vincent Diepeveen wrote:
> Another interesting question is how a few cores would be able to
> speed up a typical single precision gpgpu application by 20%.
>
> That would mean that the gpu is really slow, especially if we realize
> this is just 1 or 2 CPU cores or so.
>
> Your gpgpu code really has to be kind of unprofessional to have 2 cpu
> cores already contribute some 20% to it.
>
> For most gpgpu codes on a modern GPU you need about 200+ cpu cores,
> and that's usually for codes which do not run optimally on gpus, as
> it has to do with huge prime numbers, so simulating that on a 64 bit
> cpu is more efficient than on a 32 bit gpu.
>
> So in their case the claim is that for their experiments, assuming 2
> cpu cores, that would be 20%. Means we have a gpu that's 20x slower
> or so than a fermi at 512 cores / HD6970 @ 1536.
>
> 1536 / 20 = 76.8 gpu streamcores. That's the AMD Processing Element
> count; for nvidia this is similar to 76.8 / 4 = 19.2 cores.
>
> This laptop is from 2007 - sure, it is a MacBook Pro 17'' from Apple
> - with a Core 2 Duo 2.4GHz and a Nvidia GT 8600M with 32 CUDA cores.
>
> So if we extrapolate back, the built-in gpu is gonna kick that new
> AMD chip, right?
>
> Vincent
>
> On Feb 10, 2012, at 6:08 PM, Joe Landman wrote:
>
>> On 02/10/2012 12:00 PM, Lux, Jim (337C) wrote:
>>> Expecting headlines to be accurate is a fool's errand...
>>> Be glad it actually said AMD.
>>
>> Expecting articles' contents to reflect in any reasonable way upon
>> reality may be a similar problem. There are a few, precious few
>> writers who really grok the technology because they live it: Doug
>> Eadline, Jeff Layton, Henry Newman, Chris Mellor, Dan Olds, Rich
>> Brueckner, ...
>>
>> The vast majority of articles I've had some contact with the authors
>> on (not in the above group) have been erroneous to the point of
>> being completely non-informational.
>>
>> --
>> Joseph Landman, Ph.D
>> Founder and CEO
>> Scalable Informatics Inc.
>> email: landman at scalableinformatics.com
>> web  : http://scalableinformatics.com
>>        http://scalableinformatics.com/sicluster
>> phone: +1 734 786 8423 x121
>> fax  : +1 866 888 3112
>> cell : +1 734 612 4615

--
==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From Daniel.Pfenniger at unige.ch Tue Feb 28 12:34:52 2012
From: Daniel.Pfenniger at unige.ch (Daniel Pfenniger)
Date: Tue, 28 Feb 2012 18:34:52 +0100
Subject: [Beowulf] Computer on a stick
In-Reply-To:
References:
Message-ID: <4F4D103C.8080005@unige.ch>
http://scalableinformatics.com/sicluster >> phone: +1 734 786 8423 x121 >> fax ?: +1 866 888 3112 >> cell : +1 734 612 4615 >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >> Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- ================================================== Open Grid Scheduler - The Official Open Source Grid Engine http://gridscheduler.sourceforge.net/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Daniel.Pfenniger at unige.ch Tue Feb 28 12:34:52 2012 From: Daniel.Pfenniger at unige.ch (Daniel Pfenniger) Date: Tue, 28 Feb 2012 18:34:52 +0100 Subject: [Beowulf] Computer on a stick In-Reply-To: References: Message-ID: <4F4D103C.8080005@unige.ch> Lux, Jim (337C) a ?crit : > Way slower than that, I'm sure. Were they even Pentiums? > From http://www.intel.com/pressroom/kits/quickrefyr.htm#1994 the Pentium 60MHz Pentium was announced March 1993, and one year later the 100MHz Pentium. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From hahn at mcmaster.ca Tue Feb 28 15:09:28 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Tue, 28 Feb 2012 15:09:28 -0500 (EST)
Subject: [Beowulf] Engineers boost AMD CPU performance by 20% without overclocking
References: <4F354F26.2040103@scalableinformatics.com>

> The paper is now available online, "CPU-Assisted GPGPU on Fused
> CPU-GPU Architectures":
>
> http://people.engr.ncsu.edu/hzhou/hpca_12_final.pdf

thanks for the reference.

> (I have not read the whole paper yet) I think the core idea is that
> the CPU acts as a prefetch thread and pulls data into the shared L3
> for the GPU cores (this work is like other prefetch thread research

yes, though it's a bit puzzling, since the whole point of GPU design is to have lots of runnable threads on hand, so that you simply switch from stalled to non-stalled threads to hide latency. so in the context of prefetching, I'd expect a bundle of threads to make a non-prefetched reference, stall, but for other bundles to utilize the vector unit while the reference is resolved. gotta read the paper I guess!
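[Editor's note: neither message includes code. As a toy illustration of the mechanism Rayson describes -- a CPU helper pass warming a shared last-level cache so a later GPU compute pass hits instead of misses -- here is a hypothetical sketch. The cache model, sizes, and names are all made up for illustration; this is not the paper's implementation, which runs on a real fused CPU-GPU chip.]

```python
# Toy model: a CPU "prefetch pass" warms a shared cache ahead of a
# GPU-like compute pass, turning compulsory misses into hits.

class SharedCache:
    """A crude fully-associative cache that just counts hits/misses."""

    def __init__(self, capacity_lines):
        self.capacity = capacity_lines
        self.lines = set()
        self.hits = 0
        self.misses = 0

    def touch(self, line_addr):
        if line_addr in self.lines:
            self.hits += 1
        else:
            self.misses += 1
            if len(self.lines) >= self.capacity:
                self.lines.pop()  # evict an arbitrary line
            self.lines.add(line_addr)

def gpu_compute_pass(cache, line_addrs):
    # Stand-in for the GPU cores streaming over the working set.
    for addr in line_addrs:
        cache.touch(addr)

working_set = list(range(64))  # 64 cache lines; fits in the model cache

# Without the CPU helper: every first touch misses.
cold = SharedCache(capacity_lines=128)
gpu_compute_pass(cold, working_set)

# With the CPU helper: a prefetch pass runs ahead, then we count only
# the compute pass, which now hits in the shared cache.
warm = SharedCache(capacity_lines=128)
for addr in working_set:
    warm.touch(addr)          # CPU prefetch pass
warm.hits = warm.misses = 0   # reset: measure the compute pass alone
gpu_compute_pass(warm, working_set)

print("cold:", cold.hits, "hits,", cold.misses, "misses")  # cold: 0 hits, 64 misses
print("warm:", warm.hits, "hits,", warm.misses, "misses")  # warm: 64 hits, 0 misses
```

Mark's point fits this model too: a GPU with enough runnable thread bundles can hide those cold misses by switching bundles, which is why the measured 20% gain is the interesting part.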
From samuel at unimelb.edu.au Tue Feb 28 17:48:21 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Wed, 29 Feb 2012 09:48:21 +1100
Subject: [Beowulf] Computer on a stick
References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com>
Message-ID: <4F4D59B5.1010804@unimelb.edu.au>

On 29/02/12 02:36, Hearns, John wrote:
> What was the spec of the original Beowulf project nodes?

Their paper says:

http://egscbeowulf.er.usgs.gov/geninfo/Beowulf-ICPP95.pdf

# The Beowulf prototype employs 100 MHz Intel DX4 microprocessors
# and a 500 MByte disk drive per processor.
[...]
# The DX4 delivers greater computational power than other members
# of the 486 family not only from its higher clock speed, but also
# from its 16 KByte primary cache (twice the size of other 486
# primary caches) [6]. Each motherboard also contains a 256 KByte
# secondary cache.

cheers!
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au  Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
From james.p.lux at jpl.nasa.gov Tue Feb 28 18:27:08 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Tue, 28 Feb 2012 15:27:08 -0800
Subject: [Beowulf] Computer on a stick
References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> <4F4D59B5.1010804@unimelb.edu.au>

And from a simple statement in that paper:
"It is clear from these results that higher bandwidth networks are required"

Did an entire industry spring..

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Christopher Samuel
Sent: Tuesday, February 28, 2012 2:48 PM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] Computer on a stick

On 29/02/12 02:36, Hearns, John wrote:
> What was the spec of the original Beowulf project nodes?

Their paper says:

http://egscbeowulf.er.usgs.gov/geninfo/Beowulf-ICPP95.pdf

# The Beowulf prototype employs 100 MHz Intel DX4 microprocessors
# and a 500 MByte disk drive per processor.
[...]
# The DX4 delivers greater computational power than other members
# of the 486 family not only from its higher clock speed, but also
# from its 16 KByte primary cache (twice the size of other 486
# primary caches) [6]. Each motherboard also contains a 256 KByte
# secondary cache.
From hahn at mcmaster.ca Tue Feb 28 22:26:57 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Tue, 28 Feb 2012 22:26:57 -0500 (EST)
Subject: [Beowulf] Computer on a stick
References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> <4F4D59B5.1010804@unimelb.edu.au>

> And from a simple statement in that paper:
> "It is clear from these results that higher bandwidth networks are required"

half-duplex 10Mb!

From james.p.lux at jpl.nasa.gov Tue Feb 28 23:53:08 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Tue, 28 Feb 2012 20:53:08 -0800
Subject: [Beowulf] Computer on a stick

*Bonded* 10Mbps Ethernet... and zero-copy drivers..

On 2/28/12 7:26 PM, "Mark Hahn" wrote:
>> And from a simple statement in that paper:
>> "It is clear from these results that higher bandwidth networks are
>> required"
>
> half-duplex 10Mb!
From Daniel.Pfenniger at unige.ch Wed Feb 29 03:10:55 2012
From: Daniel.Pfenniger at unige.ch (Daniel Pfenniger)
Date: Wed, 29 Feb 2012 09:10:55 +0100
Subject: [Beowulf] Computer on a stick
References: <207BB2F60743C34496BE41039233A8090BB9972C@MRL-PWEXCHMB02.mil.tagmclarengroup.com> <4F4D59B5.1010804@unimelb.edu.au>
Message-ID: <4F4DDD8F.6030500@unige.ch>

Christopher Samuel a écrit :
> On 29/02/12 02:36, Hearns, John wrote:
>
>> What was the spec of the original Beowulf project nodes?
>
> Their paper says:
>
> http://egscbeowulf.er.usgs.gov/geninfo/Beowulf-ICPP95.pdf

Thanks for the link. In the article, just before the hardware specs, the then very young Linux operating system is aptly mentioned:

"The Beowulf parallel workstation project is driven by a set of requirements for high performance scientific workstations in the Earth and space sciences community and the opportunity of low cost computing made available through the PC related mass market of commodity subsystems. This opportunity is also facilitated by the availability of the Linux operating system, a robust Unix-like system environment with source code that is targeted for the x86 family of microprocessors including the Intel Pentium."

It is precisely this combination of commodity standardized hardware (computer, mass storage, and network) and freely tunable software that allowed the project to flourish.

Dan
From deadline at eadline.org Wed Feb 29 08:52:42 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Wed, 29 Feb 2012 08:52:42 -0500
Subject: [Beowulf] Computer on a stick
Message-ID: <1f28a103248eb307389618a33618b779.squirrel@mail.eadline.org>

FYI, there is a very good article in Linux Magazine written by Tom Sterling in 2003 that provides a first-person history (I have used it to stamp out more than a few urban legends):

http://www.linux-mag.com/id/1378/

> And from a simple statement in that paper:
> "It is clear from these results that higher bandwidth networks are
> required"
>
> Did an entire industry spring..
>
> -----Original Message-----
> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On
> Behalf Of Christopher Samuel
> Sent: Tuesday, February 28, 2012 2:48 PM
> To: beowulf at beowulf.org
> Subject: Re: [Beowulf] Computer on a stick
>
> On 29/02/12 02:36, Hearns, John wrote:
>
>> What was the spec of the original Beowulf project nodes?
>
> Their paper says:
>
> http://egscbeowulf.er.usgs.gov/geninfo/Beowulf-ICPP95.pdf
>
> # The Beowulf prototype employs 100 MHz Intel DX4 microprocessors
> # and a 500 MByte disk drive per processor.
> [...]
> # The DX4 delivers greater computational power than other members
> # of the 486 family not only from its higher clock speed, but also
> # from its 16 KByte primary cache (twice the size of other 486
> # primary caches) [6]. Each motherboard also contains a 256 KByte
> # secondary cache.

--
Doug

From trainor at presciencetrust.org Wed Feb 29 16:40:52 2012
From: trainor at presciencetrust.org (Douglas J. Trainor)
Date: Wed, 29 Feb 2012 16:40:52 -0500
Subject: [Beowulf] Raspberry Pi

Thought some people here should see the Raspberry Pi --

$35 computer with Toronto-designed software sells out worldwide in minutes
http://bit.ly/xDVEub [takes you to thestar.com]

Say hi to the Raspberry Pi, the $35 computer (with photo)
http://bit.ly/xDe8fJ [takes you to csmonitor.com]
From samuel at unimelb.edu.au Wed Feb 29 17:40:41 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Thu, 01 Mar 2012 09:40:41 +1100
Subject: [Beowulf] Computer on a stick
References: <1f28a103248eb307389618a33618b779.squirrel@mail.eadline.org>
Message-ID: <4F4EA969.90003@unimelb.edu.au>

On 01/03/12 00:52, Douglas Eadline wrote:
> FYI
>
> There is a very good article in Linux magazine written by Tom
> Sterling in 2003 that provides a first person history (I have used
> it to stamp out more than a few urban legends)
>
> http://www.linux-mag.com/id/1378/

Thanks Doug, he writes:

# Comprising sixteen Intel 100 MHz 80486-based PCs, each with
# 32 Mbytes of memory and a gigabyte hard disk, and interconnected
# by means of two, parallel 10-Base-T Ethernet LANs, this first
# PC cluster delivered sustained performance on real world,
# numerically-intensive, scientific applications (e.g., PPM) in
# the range of 70 Mflops.

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au  Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
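[Editor's note: Sterling's figures quoted above make a quick back-of-envelope check possible, using only the numbers in the excerpt (16 nodes, ~70 Mflops sustained aggregate):]

```python
# Per-node sustained rate implied by the quoted figures: 16 DX4 nodes
# delivering ~70 Mflops aggregate is roughly 4.4 Mflops per node.
nodes = 16
aggregate_mflops = 70.0

per_node_mflops = aggregate_mflops / nodes
print(per_node_mflops)  # 4.375
```

That per-node figure is sustained application performance (PPM), not peak, which is why it sits well below the chip's theoretical rate.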
From samuel at unimelb.edu.au Wed Feb 29 17:48:07 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Thu, 01 Mar 2012 09:48:07 +1100
Subject: [Beowulf] Raspberry Pi
Message-ID: <4F4EAB27.2080201@unimelb.edu.au>

On 01/03/12 08:40, Douglas J. Trainor wrote:
> thought some people here should see the Raspberry Pi --

Been following this for a while; they're rather neat little devices:

http://www.raspberrypi.org/#modelb
http://en.wikipedia.org/wiki/Raspberry_Pi#Hardware

Their usual site is down at the moment as it couldn't cope with demand (but then again, neither could Farnell or RS, their retailers :-) ).

cheers,
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au  Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/