If it was easy everybody would be doing it, right? And what about those multi-cores?
Update: Microsoft has released their Windows Compute Cluster Server 2003 and the Beowulf list is again actively discussing this topic. In addition, previous postings on the Beowulf Mailing List (Look for "Why I want a Microsoft cluster...") discussed the entry of Microsoft into the HPC cluster space. I found the discussions interesting and well informed. I did, however, take a step back to look at some fundamentals that define the HPC (High Performance Computing) market and came to the conclusion that before anyone "takes over anything", there are some issues that need to be addressed.
The fundamental issue is that doing HPC is hard. There is no easy way around it and there are no shortcuts. Practitioners need to roll up their sleeves and work to get the performance and results they desire. Unless Microsoft has some magic, all the corporate Windows goodness will not help them in this arena. Just like everybody else, they will have to roll up their sleeves. And, by the way, money cannot necessarily buy magic. For the record, that is all I'm going to say about Microsoft because, in my opinion, the things holding back HPC have little to do with the plumbing and a lot to do with the fundamentals. If you are looking for an anti-Microsoft rant, this is the wrong article. Please stop reading.
Update: Joe Landman has some great follow-up comments at scalability.org.
Thanks for continuing. Now let's talk about the hard stuff. I previously covered some of the important issues Linux addresses in the HPC world. But clearly, these advantages are not enough. Piling up processors to achieve heroic TFLOPS numbers sounds like a dream come true, but one has to wonder: where are all the killer cluster applications that take advantage of the unlimited computational power? The Blue Collar Computing effort at OSC has a very good take on this issue.
I wonder about this as well, and I believe I know part of the answer. Let's look at this from a product standpoint because if there is an HPC market, there need to be products that solve problems and earn money.
How to enter the HPC market
As a savvy business person, you know that hardware is cheap, software is freely available, the need is real, so there must be a way to clean up in this market. How are you going to "productize" this trend and take over the world?
Build an Appliance
The appliance concept is a good approach. We buy everything from televisions to toasters as appliances. No need to build it from parts because the market for these items is so large it is economical for a manufacturer to make millions of these items. And, they are easy to use.
An appliance is usually simple to use because it is built to perform a very specific set of tasks. These constraints also make it easy to service. As with a game console, the games may be different, but the reason they all work (mostly) is a very tightly controlled hardware and software environment. If it breaks, the recipe for fixing it is known (a diagnosis flow chart). Or, more commonly, you just throw it out because it is cheaper to build a new one than to pay someone to fix the old one.
So where are the HPC appliances? I will argue that they exist, but not quite at the desktop level. As an example, modern medical imaging machines are HPC appliances. They do a lot of very specific calculations at the push of a button. The new generation of game consoles (to a degree) are HPC appliances. And what about the desktop? Why don't I have a simple sixteen-processor cluster cube that I can plug into my desktop PC and run HPC codes to my heart's content? (Those old enough to remember the Inmos Transputer know that in 1987 it was possible to stuff quite a few very fast processors into a desktop PC.)
Even today, it is not hard to envision a cluster cube using commodity hardware and open software. There would probably be some constraints: gigabit Ethernet, a preset number of processors, and limited memory and storage options. Such a system would be an appealing target for an ISV (Independent Software Vendor). Like a game console, a cluster cube represents a predictable and reproducible environment. So where is it? Why isn't everyone running out and building cheap parallel systems? The multi-core processor strategies will put plenty of processors on your desk in the coming years. Is the market ready? In a word, no.
In my opinion, there are several reasons why dedicated HPC appliances might fail. First, there does not seem to be a "one size fits all" cluster design. Fine, you say, but then there must be a subset of applications for which a general cluster can be designed. Perhaps, but even within a "problem space", the design of the cluster may need to reflect how the application is to be used. Some users may need a small number of "fat nodes" (a large amount of memory) while others may need a larger number of "skinny nodes" (a small amount of memory) to run a specific type of problem using the same application.
A second reason is the "boat anchor" problem. Those old game consoles worked well, but want to run a next-generation game on them or try something new? Sorry, that is not possible. Setting aside the people who seem to be able to run Linux on anything, the convenience of a console also means sacrificing some control. Nintendo GameCubes may run Linux, but you can't add memory or upgrade the CPU. Appliances also help vendors "lock in" customers. Customers usually don't like this approach. At some point, a vendor may also stop supporting the appliance. Now you have a boat anchor.
Finally, and perhaps most importantly, there is a lack of application software to drive sales of such a device (including desktop multi-core machines). Thus, we have finally arrived at what I consider one of the hard parts. I believe the administrative software issue (the plumbing, as I call it) is largely solved for cluster appliances and multi-core systems. The application software is not so easy. We'll take a look at this issue shortly. For now, let's look at the more traditional approach to HPC.
Build a Cluster
The HPC cluster has a thousand faces. It can be a small eight-box cluster or it can be a large 1024+ node beast. The choices are plentiful, the prices are reasonable, and the software is freely available. You can offer customers unbelievable amounts of computing power housed in your custom-designed enclosures. Microsoft now offers a "turn-key/point and click" Windows cluster system (complete with support for ISV applications). Why isn't the world running toward this solution?
For me, the cluster/HPC proposition is kind of like offering to put wings and jet engines on cars. You can give your customers the freedom to travel faster and farther than before, but your customers don't know how to drive cars all that well, let alone become sky pilots. Plus, the infrastructure is not there to support the new breed of flying cars.
For those practiced in the art of HPC (i.e. those that know how to fly), clusters provide a large amount of "bang for the buck." End users need only buy what they need and no more. Recently at SC05, IDC reported that over the last five years the use of HPC clusters has exceeded IDC's optimistic projections. In the last two years alone, clusters have grown from a one-third market share to encompass almost half the market. Large capability systems (heroic supercomputers) have seen a decrease in market share. HPC clusters are disruptive. And by the way, IDC only counts those units shipped as "clusters"; they don't count what they call "dark clusters" built by end users.
So where is the desk side/top cluster? There have been some efforts in this area including Orion Multisystems (now defunct) and the recent introduction of the Personal Cluster by Penguin Computing. (Update: Tyan has announced their personal super computer (PSC) and Ciara has announced a desk side cluster as well.) That is 200 GFLOPS next to your desk! Just for you! What are you going to do with it?