The other main advantage is that the entire "cluster plumbing" is open. This allows optimizations and fixes that may not be needed by the mainstream and are thus deemed unimportant by the kernel maintainers. A good example of this is the TCP acknowledgment fix implemented by Josip Loncaric. (See Sidebar Josip's Fix)
Another shining example of cluster customization has been the process migration facilities introduced by bproc and openMosix. They were able to do things with an open kernel that would be almost impossible in a closed source environment.
If we move above the kernel to the distribution level, we see a large amount of "customization" being done for specific HPC distributions of Linux. The BioBrew distribution is an example of a full Linux version tailored to bioinformatics users. Open software seems to have no bounds when it comes to HPC. If there is a need, the infrastructure is available for customization.
Finally, another factor that is often taken for granted is the Internet. Open collaboration and sharing would be quite difficult without it. News, packages, distributions, fixes, updates, patches, How To's, mailing lists, and even grids, all circulate freely throughout an international community.
The Marketing Department is Closed
In a closed source model, the features that the end users see are, of course, determined by the owner of the source code. Deciding what features a new product should have often falls into the hands of the marketing department. A good marketer checks to see what the competition has and what the current users want, and makes a decision based on the cost to implement and release new features. If you and a small cadre of users require some special feature, you are at the mercy of the "marketing optimization" equations. For closed source, there is no other way. If you don't make the features list you are, as they say, SOL (bad-word-your-mom-told-you-not-to-say Out of Luck).
In the case of HPC, many features are at the bottom of the list because the HPC market is not that big compared to other market segments. A vendor will see a better return on its money by appeasing the bigger markets.
Let's look at process migration as an example. Both bproc and mosix required access to the intimate details of the kernel. These packages are extraordinarily useful to the HPC market. The funny thing is they only show up in open software. There is no marketing department attempting to optimize ROI (Return on Investment). There are users who need something, there are implementors who will build things and get paid to keep them working, and no one else (no costs) in the middle. Marketing, in a sense, has been optimized out of the equation.
Lawyer Free Zone
In 1997, I found an article in EE Times describing the creation of a "Lawyer Free Zone" in Scotland to help foster collaboration in the semiconductor market. Interesting idea.
When I think about Linux and clusters, I think about how the GPL has created a "Lawyer Free Zone" for software development. (SCO, of course, believes otherwise.) Think about the fact that there are people from many large companies (like IBM, SGI, Sun, and HP) who would, outside of the GPL, never put their development people in the same room -- let alone co-develop software. The large array of Linux file systems is only one example of how clusters have benefited from this safe haven. In a sense, the GPL has lowered the "lawyer latency" (measured in months/years) for collaborative projects to near zero.
In addition, discussion on mailing lists and at technical meetings is also unencumbered. Everyone benefits by co-operating, which, by the way, is the goal of any successful legal agreement.
Because We Play Computer Hardball
From my experience with the HPC community, I can say with complete confidence that if Linux were unstable or did not work as expected, it would have been given the boot long ago. Losing a week's worth of results because of a node crash can be a serious setback. Although a lower level of stability is often an accepted part of the mainstream, it will find no quarter in the HPC world. The HPC market, by definition, pushes the limits of everything it touches. In this respect, Linux is a major league player.
Vendor Lock-in
The classic business strategy of selling a customer something that requires them to continue buying products and services has fueled the growth of many companies. It has also been the best source of boat anchors in the HPC market.
Let's consider a common scenario. Your organization buys a nice new supercomputer called the Whopper Z1 from FBN (Fly By Night) Systems. The Whopper Z1 runs WOS 1.0, a version of UNIX ported for their system. The computer works well for the first year. Everyone is happy. Then, in the second year, you want to add more memory. Well, in order to keep the service contract intact, you need to buy the memory from FBN Systems. Funny, it looks like the memory you bought for your home computer, but it costs ten times more than you paid. So you upgrade the memory, and while you are at it you upgrade to the next version of WOS (version 2.0). Everything is fine, until year three. It turns out that the Whopper Z1 is now going "off contract" because a new replacement system your organization is buying, called the Whopper Z2, has been installed. The Whopper Z2 also has a new version of WOS (version 3.0), which does not run on the old Z1 system. Now the old Whopper Z1 is pretty much useless and will be kept on-line for another year to allow everyone to move their codes over to the new machine. After this time, you cannot really sell it or use it, because hardware and software support is expensive and the system is considered obsolete. Ah, but if you tied a rope to it, it could indeed be used as a boat anchor.
Now consider the scenario where Linux was used for the operating system. Since the source code is available, you can, if you choose, keep the old Whopper Z1 running without a support contract. You can find people who can help you fix things. You may even have some "Linux Hackers" on staff because they have been running Linux at home for five years. And, as you find out, this is a good thing, because otherwise, when FBN Systems goes out of business, you are stuck with two large pieces of hardware and a tape with a binary version of WOS on it.
In the end, "Vendor Lock-in" is always bad for the customer. No one likes to hear "you can't do that." The words "can't" and "Linux" are not often used in the same sentence.