[Beowulf] Redmond is at it, again

Robert G. Brown rgb at phy.duke.edu
Wed Jun 2 11:54:41 EDT 2004

On Thu, 27 May 2004, Laurence Liew wrote:

> Hi all
> > > If you work for a distro company and you are reading this,
> > > all I can say is, WAKE UP! You are going down the tubes fast
> > > in the HPC market. If you work for a cluster vendor and you are
> > > reading this, please push your management hard to adopt at
> > > least one open distro. We'll pay for support for it, but not using
> > > the current pricing scheme that the Redhat's, Suse, etc. are
> > > charging.
> So what would be a per node price people (edu? commercial?) would pay
> for the OS with support and updates? Do note that support for HPC
> clusters requires technically competent and HPC savvy engineers and
> these do not come cheap....
> Based on previous discussions and email, it seems that USD$50 - USD$100
> per node is about right/acceptable for a cluster OS that is supported
> with updates and patches and basic support for HPC type questions.

Well, since both our cluster and LAN installations for the entire campus
proceed directly from a (mirror) repository that we maintain, let's look
at this very question.

  There is a baseline cost for building any distribution.  However, most
distributions are built on top of a common open code base with tens of
thousands of contributors, a few tens of whom (at most) actually work
directly for the distribution companies.  The bulk of building a "new"
upgrade of any rpm-based distribution is running a straight rpm
--rebuild of most of the -- unchanged -- application source rpm's.

  Testing the distribution WOULD be an expense, but all linux
distributions rely on the world's largest free beta testing network.
That is, everybody on the planet that develops, maintains, and uses the
open source software that they repackage and distribute.  The "rawhide"
process is where any distribution becomes ready to actually sell to
somebody, and guess who provides the actual testing and a whole lot of
the debugging of the inevitable bugs of a new release?  We do.  For
free.  The biggest single testing and certification process is likely
that of the kernel and its hardware drivers, and here you are screwed
either way -- the conservative kernels released only after extensive
testing still have bugs and problems with specific device combinations
(the permutations of hardware combinations are far too large a number to
test) and don't support lots of newer hardware; the development kernels
support more hardware and newer hardware but are, yes, less stable.  Once
again, the real work is done by the kernel development team(s) and the
real debugging by the people who use the development kernels long before
they THINK of making it into a distribution.

  "Cluster" distributions are more of the same, except that few
distribution vendors take clusters seriously enough to provide a full
suite of tools including e.g. SGE, all flavors of MPI, bproc -- most of
the open source stuff available besides PVM and some flavor of MPI
doesn't make it into a mainstream distribution.  Yes, this provides a
market opportunity for cluster vendors to add value; however the basic
problem is one of pure packaging, as most of the tools one might wish to
add, if they are properly packaged, are an rpm --rebuild away from being
rpm -Uvh installed and yum updated.  How much is it REALLY worth to
provide this service?  If one amortizes the cost over all the clusters
out there (and clusters are still a small fraction of the total LAN and
server base) it wouldn't be a whole lot of money, provided only that one
did the work one time, well, and provided its value many times.

Beyond this, there are capital costs for servers to provide primary
mirrors for the distribution, infrastructure to provide all of the
above, and don't forget marketing, management, G&A costs.  Most of the
latter are expenses associated with the fact that you are running a
business and want to make as large a profit as you possibly can, which
means basically charging as much as the market will bear and a lot more
than it costs you to provide the actual services.

Now, let's look at the distribution system.  Outside of the fairly
trivial/negligible box-set market in bookstores and the like (we're
talking University-wide usage, which isn't based on distributing box
sets even for WinXX):  

  Do we package the software in actual boxes to sell?  No, there is no
packaging cost.  

  Do we manufacture anything and have to pay laborers in third world
countries to assemble complicated electronic and plastic parts?  No,
there is no assembly required, at least assembly with per-unit scaling
of costs.  

  Do we require and consume lots of raw materials?  No, only electrical
energy (which is negligibly cheap -- we're not talking smelting of
aluminum here).

  Do we need a huge and expensive assembly line or other excessively
expensive infrastructure?  No, a handful of air conditioned offices, a
few tens of employees, a midscale server environment, a LAN, and of
course computers all around.  The cost of this is not negligible, but it
is a cost borne by nearly every University in existence and so don't
expect them to be overwhelmingly impressed by it when they imagine this
cost divided by the number of Universities in the world.  It is also
leveraged, as noted, by the fact that MOST of what is being "assembled"
is a) free; and b) provided assemble-ready and tested by somebody else,
so that "assembly" is a totally automated process that need not rely on
human effort at all.

  Do we need an expensive testing environment?  Yes, but again the issue
is HOW expensive.  All software companies make the customer the final
round of beta testers, generally charging them for the privilege.  Open
Source software companies carry this far beyond what has been done with
closed source software.  So let's admit the cost of a testing lab and
some real expense here, but also note that it doesn't come CLOSE to
testing every feature of every provided software package -- that is done
in rawhide by actual users at no cost to the providing company except
helping to resolve problems that surface IF they cannot get them
resolved by the actual maintainers of the software packages.

  Do we need to provide a lot of service and how do service costs scale
across the enterprise? Here is the real, fundamental issue.  The answer,
in fact, is no.  Supporting an entire University costs (to the
distribution company) little more than supporting a single department
within a University, and not THAT much more than supporting a single
individual within the single department of that University.  The real
support costs are borne by the University's own support staff and
infrastructure, and since many of those individuals actively contribute
to the development and maintenance of open source software, the
distribution company may MAKE MONEY NET providing software for FREE to a
University but getting back the labor of those individuals in the REAL
process that develops and maintains their product.

I hope that it is obvious that the costs above are in fact laughably
small for software compared to, say, building and selling hard disk
drives or system motherboards or hard goods with per unit cost scaling.
Building a new disk drive requires many hours of engineering and testing
by an expensive team, capital investment in retooling expensive climate
controlled and dust free plants, lots of raw materials assembled
incrementally in OTHER plants and sold for a profit to the end-stage
assembly company, large labor costs, large marketing costs, large
warranty support and RMA costs, and of course the inevitable management
costs and profit margins.  Finally, hard drives are sold almost
exclusively by retailers with significant retail markup and a whole
additional layer of mouths to feed, warehouse and inventory expenses,
physical delivery expenses, consumer advertising, and local support.

Yet a hard drive sells typically for less than $100 per unit, and COULD
sell for as low as $50-60 and still make money for companies if they
weren't constantly reengineering them to make them bigger and faster.
Or possibly less -- a floppy drive now costs on the order of $10, a CD drive
$30, and a mass-produced hard drive might cost no more without the
running R&D and retooling.

So, explain again how we'd be happy paying unit prices of $100 per hard
disk if we knew that the company had to engineer precisely one hard disk
and put it into a magic duplicator box that made an infinite number of
duplicates, nearly instantly and in a fully automated fashion, at a
per-unit amortized and extraordinarily generous cost of a few kW-hours
of electricity (say, $0.25)?  Buying the disk directly from the company
(no middleman retail layer), delivered on magic wings at a cost too low
to incrementally account for?  Actually needing ourselves to only buy
ONE copy of the disk as we possess our OWN magic duplicator and delivery
system?  And knowing as we buy the disk that one of our own employees
built the three capacitors on the controller (all magically duplicated
for free), another employee helped engineer the head relay, that employees
down the road at UNC contributed part of the platter design and
employees up the road a bit at NC State contributed the casing, knowing
that in fact the ONLY thing contributed by the company selling us the
disk at per-unit costs of $100 or so is a nifty logo, the packaging and
primary assembly itself and some of the documentation?

Personally I think Fedora is the way to go (not religiously "Fedora" per
se, but open source free distribution linux/unix in one or another of
its many variants -- Debian, Fedora, FreeBSD, or any of the rest of
them).  I think that Fedora 2 is looking remarkably rich and expect that
it will rapidly become highly stable.  Fedora 1 wasn't half bad, given
that it was mostly repackaged RH 9 and RH 9 wasn't half bad.

Here is how I'd recommend that we (the cluster community) manage cluster

  a) Mirror Fedora (as we do at Duke).

  b) Contribute some small fraction of IT staff time to helping support
Fedora -- maintaining selected packages, helping debug/test, providing
secondary public mirrors.  Most of us (Universities in particular) do
this anyway and have ever since the kernel was in its 0.x state.  Making
it formal and a job descriptor for selected individuals enables Fedora
to build a reliable web of contributed support.

  c) Build one or more (campus wide or department local) distribution
repositories from the Fedora mirror.  This is basically the mirror plus
e.g. DHCP, tftp, kickstart support, yum support, additional
university-local packages.

  d) Install it on one host or 10,000 hosts at no additional incremental
costs.  Use it for servers, for LAN clients, for beowulf nodes, for
other kinds of cluster nodes.  Note that 10,000 hosts at $100 each is
a solid million dollars -- even 1000 hosts at $100 is $100,000.
This pays for a whole lot of the above (a few thousand for the mirror
and repositories, call it one whole FTE in labor, a trivial fraction of
the total infrastructure expense you bear anyway and a lot of it
opportunity cost expense with a lot of leeway as to how and when work is
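
To make the arithmetic in d) concrete, here is a minimal Python sketch
comparing a per-node licensing bill with the roughly flat local cost of
running your own mirror.  The $100 per-node price and the host counts
are the ones used above.  The $3,000 mirror figure and the $75,000 FTE
figure are illustrative assumptions of mine, not measured numbers, and
the function names are made up for this example.

```python
# Illustrative assumptions, not measured figures.
PER_NODE_PRICE = 100   # USD per host, the price discussed in this thread
MIRROR_COST = 3_000    # USD, assumed value for "a few thousand" (mirror + repositories)
FTE_COST = 75_000      # USD per year, assumed cost of one full-time person


def per_node_bill(hosts, price=PER_NODE_PRICE):
    """Licensing bill that grows linearly with the number of hosts."""
    return hosts * price


def local_cost(mirror=MIRROR_COST, fte=FTE_COST):
    """Mirror plus staff cost; it does not depend on the host count."""
    return mirror + fte


def break_even_hosts(price=PER_NODE_PRICE):
    """Smallest host count at which the per-node bill reaches the local cost."""
    return -(-local_cost() // price)  # ceiling division on integers


if __name__ == "__main__":
    for hosts in (1_000, 10_000):
        print(f"{hosts:>6} hosts: per-node ${per_node_bill(hosts):,} "
              f"vs local ${local_cost():,}")
    print("break-even at", break_even_hosts(), "hosts")
```

Under these assumed numbers the per-node bill overtakes the whole local
cost somewhere below a thousand hosts, and the local cost is one you
are largely paying anyway.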

I should note that I >>personally<< do >>all the above<< to install and
maintain my >>home LAN/cluster<<.  Yes, I PXE/kickstart install my kids'
desktop computers from my own repository mirror, with my own local rpm's
added on top of Duke's on top of a mirror.  One of my kids accidentally
reinstalled his desktop just two nights ago, and it took me five whole
minutes to make it right again post install (mostly because we use the
nvidia drivers and he has an ancient monitor not on any list, so I have
to finish off video by hand).  Cluster nodes at Duke I reinstall on a
whim as it takes five minutes of waiting and NO extra work.  This gives
you an accurate idea of the cost scaling for an enterprise, as my home
repository could SUPPORT a 10,000 CPU enterprise, quite nicely actually.

With kickstart and PXE and grub, (re)installing a cluster node is
basically selecting an installation option at boot time.  With yum,
updating a cluster node is automatic and transparent.  Adding software
is trivial.  Since cluster nodes tend to be identical, they REALLY scale
out there at the ultimate edge of scalability -- a single kickstart file
works for an entire cluster, and support issues are not per system, they
are strictly per cluster whether the cluster has one box or 10000
identical boxes in it.
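
As a rough illustration of "selecting an installation option at boot
time", here is a small Python sketch that writes out a PXELINUX-style
boot menu with a local-boot default and a reinstall entry pointing at
one shared kickstart file.  The server URL and the kernel/initrd file
names are hypothetical placeholders, and the exact boot argument that
names the kickstart file depends on your installer release, so treat
this as a sketch to adapt rather than a drop-in config.

```python
# Hypothetical placeholder: not a real install server.
KS_URL = "http://install.example.edu/ks/node.cfg"


def pxe_menu(ks_url=KS_URL, kernel="vmlinuz", initrd="initrd.img"):
    """Build the text of a pxelinux.cfg/default style file.

    The default entry boots the local disk; choosing "reinstall" at the
    boot prompt starts the installer and hands it the kickstart URL.
    """
    lines = [
        "default local",
        "prompt 1",
        "timeout 100",  # pxelinux counts the timeout in tenths of a second
        "",
        "label local",
        "  localboot 0",
        "",
        "label reinstall",
        f"  kernel {kernel}",
        f"  append initrd={initrd} ks={ks_url}",
    ]
    return "\n".join(lines) + "\n"


def menus_for(nodes):
    """Every node gets the same menu, so one kickstart file serves them all."""
    menu = pxe_menu()
    return {node: menu for node in nodes}


if __name__ == "__main__":
    print(pxe_menu())
```

The point of menus_for() is only that the mapping is trivial: one node
or ten thousand, the menu and the kickstart file behind it are the same.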

I think the above makes it painfully clear why I consider per-node costs
of $100 (about the same as the hard disk in those nodes) for software of
any sort to be highway robbery, legalized larceny, and just plain silly
on the part of the consumer.  What is RHEL (for example) ever going to
be but older Fedora?  Fedora is an essential component of the RHEL
"certification" process.  Red Hat cannot sell software for these kinds
of per unit prices and still have it in beta mode where customers find
bugs, and NOBODY can test a distribution's worth of software in-house.
Even Sun realizes that they cannot charge Universities and non-profits
for desktop software any more and seems to have programs in place whereby one
can get e.g. Solaris for x86 for free.

What WOULD I consider reasonable?  That is very, very simple.  At MOST a
few thousand dollars for a University wide, no-questions-asked site
license.  The software provided in the form of a private secure channel
to a primary repository mirror, so that the University can refresh their
mirror anywhere from one to four times a day.  Unlimited distribution
within e.g. duke.edu from campus repositories (something that is easy
enough to restrict at the web level).  Access to bugzilla or equivalent
for the masses, access to a "real" support person only for selected
toplevel repository staff in a gatekeeper model.  Hell, I'd even
tolerate a limit of (say) fifty hours of "real" support before one has
to buy additional time a la carte -- I don't think any school with a
competent staff would ever use up 50 hours in a year unless the
providing company was distributing crap.

This would >>fairly<< compensate the providing company, would reduce per
unit cost scaling for OS and support to pennies added onto the real and
inevitable LOCAL costs for providing same, and would in fact more or
less reproduce the way Fedora et al. will continue to work anyway but
with a small but meaningful income stream attached to it.
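
A quick Python sketch of what a flat site license does to per-unit
cost.  The $3,000 yearly fee is my assumed stand-in for "a few thousand
dollars", and the function name is invented for the example.

```python
def per_host_cost(site_fee, hosts):
    """Yearly site-license fee spread evenly over every installed host."""
    if hosts <= 0:
        raise ValueError("hosts must be positive")
    return site_fee / hosts


if __name__ == "__main__":
    SITE_FEE = 3_000  # USD per year; assumed value for "a few thousand dollars"
    for hosts in (100, 1_000, 10_000):
        print(f"{hosts:>6} hosts: "
              f"${per_host_cost(SITE_FEE, hosts):.2f} per host per year")
```

With that assumed fee, a campus of 10,000 hosts pays well under a
dollar per host per year, against $100 or more under per-node pricing.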

As some Sun Microsystems humans wryly commented to me not so long ago,
Duke and other Universities are where the future decision makers of
every company in the world are being trained, today.  If they learn to
use Sun boxes while at Duke, they'll be likely to use Sun products in
the future when they are spending real corporate dollars to get them.
If they learn to use WinXX boxes, that's what they'll want to get on
exactly the same basis.  If they learn to use Linux boxes, and discover
that they can (for example) install linux and maintain linux
transparently and for free from a common repository and that it has all
the tools that they need to do nearly anything right there at their
mouse pointertips once they've done so, what do you think that they'll
favor when THEY are making decisions in five years?

Sun Microsystems, of course, self-destructed back in the early 90's when
they acquired a working x86 Unix and failed to give it away for free, or
nearly so.  If they'd sold it at DOS-like prices back then, Windows
would have been stillborn, linux would have never been more than a nifty
project, OS/2 would have withered, and Sun would "own the universe" the
way Microsoft does today.

It is really interesting to watch as Linux vendors follow in these
venerable footsteps.  Linux would literally not exist if it weren't for
Universities, as they are virtually the only large-scale environment
that adopted it on a widespread basis for purely cost-scaling reasons
and Universities have contributed, as noted above, a huge fraction of
the real costs of its development and maintenance and continue to so
contribute today.  So the first thing the companies that were >>built<<
on the basis of University support do when they finally start to break
into the corporate world -- largely because yes, Universities have been
graduating students with linux experience and perhaps more importantly
have been contributing a steady stream of linux-trained sysadmins and
programmers into the corporate world where they did indeed select what
they knew to be functional and stable -- is to RAISE PRICES THROUGH THE
ROOF back to Universities!

I think of it as evolution in action, and I don't mean the software...

So I'd have to say that I agree with Jeff, don't you think?  Prices in
the brave new world of software distribution need to reflect actual per
unit costs to the distributor, not just their understandable eagerness
to realize a revenue stream of $100's per unit per year on something
that costs them $1 per unit per year to provide.  For cluster people, I
think you'd have to be crazy to pay $100's per node unless there is some
show-stopper associated with the choice.  100 nodes at $400/year is
$40K, which is well over half an FTE at University prices.  Half an FTE
is more than enough to maintain Fedora for the entire University, let
alone a single 100 node cluster, even allowing for that individual
having to spend a whole month packaging and building stuff like SGE,
MPICH, bproc, that might or might not be NEEDED in that particular
cluster and that might or might not be available in Fedora anyway, once
people get smart about contributing stuff like this back to the


> Comments?
> Cheers!
> Laurence
> ps. I work for a cluster vendor. 
> On Wed, 2004-05-26 at 05:42, Jeffrey B. Layton wrote:
> > For some people, me being one of them, this is a HUGE threat. Let me
> > explain.
> > 
> > Redhat, Suse, etc. have gone to a per-CPU price for license
> > and support driving up the costs of installing the OS on a cluster.
> > In fact, the last quote I got for a 72 node cluster averaged about
> > $400 a node! Now management has started to compare this price
> > to Windows and guess what? - Windows is cheaper on a per
> > node basis. This includes purchasing and support! We already
> > have a massive license with MS so we get Windows pretty
> > cheap, but I can get Windows myself cheaper than $400 a node.
> > We started to talk to management about the support costs, etc.
> > but they argue that they already have a whole bunch of admins
> > trained in Windows so support costs should be low (I tried to
> > explain the difference between admining a cluster and admining
> > a Windows server). Plus we have an infrastructure to push
> > updates to Windows machines, so they thought updates would
> > be a no-brainer as well.
> > 
> > At the ClusterWorld conference I tried to talk to various distro
> > vendors about pricing, but they weren't listening. I don't mind
> > paying a relatively small fee to purchase a distro, but paying
> > support costs for each CPU is nuts! If I have a problem on one
> > node, I have a problem on ALL nodes. The distro companies
> > have yet to understand this. Novell is starting to come around
> > a bit, but they still aren't there yet.
> > 
> > We are encouraging cluster vendors to support at least one
> > open distro. That way we aren't held hostage to these companies
> > that don't understand HPC. I don't know if we're making any
> > head way, but I constantly hope we are. We're also trying to
> > push IT management into "approving" a couple of open distros,
> > but they are reluctant because they don't see a company behind
> > them (they insist on the "one throat to choke" model of support).
> > However, I'm still hopeful.
> > 
> > If you work for a distro company and you are reading this,
> > all I can say is, WAKE UP! You are going down the tubes fast
> > in the HPC market. If you work for a cluster vendor and you are
> > reading this, please push your management hard to adopt at
> > least one open distro. We'll pay for support for it, but not using
> > the current pricing scheme that the Redhat's, Suse, etc. are
> > charging.
> > 
> > If you work for a company that sells commercial software for
> > Linux (e.g. compilers, MPI, etc.), please support more than
> > RHEL and SLES! Think seriously about supporting an open
> > distro and also supporting a kernel/glibc combo.
> > 
> > I'm sure I'm not the only one who feels this way. Let's tell the
> > distro companies that we won't stand for it!
> > 
> > Thanks!
> > 
> > Jeff
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
