Why MDKC & CLIC are not comparable to RH Advanced [Was Re: Redhat Fedora]

Robert G. Brown rgb at phy.duke.edu
Fri Sep 26 09:45:23 EDT 2003

On Thu, 25 Sep 2003, Erwan Velu wrote:

> > > Scientists for example are not Linux Guru.
> > Ummm, what makes you say that?
> Sorry, I meant that not all scientists are able to manage all the
> configuration needed by a cluster.
> We give scientists an easier way to set up things that could take them
> longer.
> > p.s. Oh, you mean ALL scientists aren't linux bozos (gurus seems like a
> > pretentious word:-).  Sure, but scientists who work on massively
> > parallel computers are fairly likely to be, or they are likely to hire
> > someone who is.  If they are already paying that person large sums of
> > money, why pay you as well?  
> You can pay someone to manage more clusters when he's working with
> powerful tools rather than paying one to do everything by hand.
> If everyone had to compile/configure/understand all the technologies
> included in a product, some could be frightened by this!

I know, I was just teasing you a bit -- note smileys (:-).  

However, from a purely practical point of view, we find that the thing
that limits the number of systems a manager can take care of for pretty
much any of the distributions these days is hardware reliability and
sheer number of systems.  To put it another way, give me an unlimited
supply of flunkies to install and repair hardware (only) and I'll
cheerily take care of a thousand node cluster all by myself, as far as
the software installation and management is concerned.  

At this point PXE, a variety of rapid (scripted or image based) install
methods, and yum for rpm post-install maintenance mean that installation
and maintenance of an almost indefinite number of machines is primarily
a matter of taking care of the repository(s) from which they are
installed and maintained, and this has to be done the same for a cluster
of eight nodes or eight hundred nodes.  Once the base repository is set
up and maintained, installation is turning on a node, reinstallation is
rebooting a node, software maintenance is doing nothing (the node
updates itself nightly) or doing a rare cluster-push to update in real

These things do take time, but the BIGGEST time is setting up the
repository and configuring a reasonable node image in the first place
(neither of which is terribly hard, as one can rsync a repository
mirror from many existing repositories and with a repository in hand a
node image is, fundamentally, a list of rpm's that a tool like yum can
be used to consistently install on top of a very minimal base image). In
a successful cluster design the average non-hardware per-node FTE effort
spent on the nodes themselves should be on the order of minutes per year.
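
To make the "a node image is a list of rpm's" point concrete, here is a
minimal Python sketch of the bookkeeping involved.  It is not how yum
works internally; the function name plan_node_sync, the SyncPlan tuple,
and the package names are all made up for illustration.  It only shows
that bringing a node to a known state reduces to comparing what is
installed against the base image plus the list.

```python
from typing import Iterable, List, NamedTuple


class SyncPlan(NamedTuple):
    """What has to change on one node (hypothetical structure)."""
    to_install: List[str]
    to_remove: List[str]


def plan_node_sync(installed: Iterable[str],
                   base: Iterable[str],
                   manifest: Iterable[str]) -> SyncPlan:
    """Compare a node's installed packages with the desired node image.

    The desired state is the minimal base image plus the package list
    (the "manifest").  Anything wanted but absent gets installed;
    anything present but not wanted gets removed.
    """
    wanted = set(base) | set(manifest)
    have = set(installed)
    return SyncPlan(
        to_install=sorted(wanted - have),
        to_remove=sorted(have - wanted),
    )


if __name__ == "__main__":
    # Illustrative package names only, not a recommended node image.
    plan = plan_node_sync(
        installed=["glibc", "bash", "sendmail"],
        base=["glibc", "bash"],
        manifest=["lam", "pvm"],
    )
    print("install:", plan.to_install)
    print("remove: ", plan.to_remove)
```

The same list serves eight nodes or eight hundred, which is why the
per-node effort stays small once the repository and list exist.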

Again, this doesn't mean that your product packaging is without value --
I'm sure that it is engineered to help neophytes who are unlikely to
achieve that level of efficiency initially get there quickly, and there
is indeed value in providing people with a repository image to mirror
including regular updates of the important core packages (as I said,
this is what I think you should sell). 

It's just that ANY of these packagings competes with the opportunity
cost of the time available in many environments and with alternative,
more or less prebuilt or easy to build cluster packages (OSCAR, Rocks,
Scyld, Clustermatic...).  And BECAUSE it is a big toolset built so that
one size fits all, it contains a lot of tools that many cluster people
will never need or use (just like a swiss army knife).  Some people just
want to whittle wood and clean fish and don't NEED a magnifying glass on
their pocket knife!  They are going to be asking themselves why they are
"paying" for all these things when they use only three of them, which
they could build themselves in an afternoon (or mirror prebuilt from
various public repositories).

This is what I think will ultimately limit your price point and needs to
shape your marketing strategy.  Your choice is ultimately going to be
fewer high margin sales or more low margin sales (I'm pretty safe there,
supply and demand law and all that:-).  This is true for all the linux
distributions.  Or you can try some sort of mixed strategy (as was
suggested earlier) and sell at all price points trying to leverage
additional margin on the basis of added value and willingness to pay and

In my opinion (for what it is worth) it would be a capital mistake for a
distribution to seek to armtwist high margins in the linux community.
As in the last mistake a lot of distributions may ever make.  The
problem is that the reason that there ARE distributions at all is
largely because of the difficulty, once upon a time, of an ordinary
human obtaining all the components required or desired to assemble a
usable system.  The distribution companies "sold" media where the whole
shebang was right there, packaged to be "easy to install", and (as time
passed) with package management systems and gradually improving
installation and administrative interfaces.

However, the interesting thing about software and the evolution of
hardware and the internet is that once a GPL package exists, it is
available forever and eventually gets to be feature stable and readily
available in a stable, functional form to pretty much anybody.  New
hardware (like PXE) has changed the installation equation.  And the
internet is now supported by new tools such as rsync, with massive
amounts of storage routinely available.  If I can afford a 150 GB RAID 5
system to serve my home LAN (and I just built one for about $800) then
storing an utterly exhaustive linux repository isn't a big deal anymore.

These things CAN change the way linux distribution operates altogether.
They ARE changing the way linux distribution operates altogether.
Nobody buys distribution CD's, they mirror.  In professional
environments, booting from floppy has all but disappeared.  Packages are
designed a priori to permit simple rebuilds and dependency resolution.
I could >>really easily see<< an environment in five years where linux
distribution is >>nothing but a network of repositories<< driven by
tools like yum, augmented so that the rpm's are rebuilt as necessary in
real time.  Install a very primitive, pure kernel+gnu base, then
yum-build from SOURCE repositories into a known final state.
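
As a rough illustration of what "yum-build from SOURCE" would need, the
sketch below orders packages so that each one is rebuilt only after the
things it depends on.  This is a plain depth-first topological sort, not
a feature of yum or rpm; build_order and the dependency table are
hypothetical.

```python
from typing import Dict, List, Set


def build_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return packages ordered so each comes after its dependencies.

    `deps` maps a package name to the set of packages it needs at build
    time.  Packages that appear only as dependencies are treated as
    having no dependencies of their own.
    """
    order: List[str] = []
    state: Dict[str, str] = {}  # package -> "visiting" or "done"

    def visit(pkg: str) -> None:
        if state.get(pkg) == "done":
            return
        if state.get(pkg) == "visiting":
            raise ValueError(f"dependency cycle involving {pkg}")
        state[pkg] = "visiting"
        # Sorted so the result is deterministic from run to run.
        for dep in sorted(deps.get(pkg, ())):
            visit(dep)
        state[pkg] = "done"
        order.append(pkg)

    for pkg in sorted(deps):
        visit(pkg)
    return order


if __name__ == "__main__":
    # Made-up dependency table for illustration.
    table = {
        "mpi-app": {"lam", "glibc"},
        "lam": {"glibc"},
        "glibc": set(),
    }
    for name in build_order(table):
        print("rebuild", name)
```

A real tool would read the dependencies out of the source packages
rather than from a hand-written table, but the ordering problem is the
same.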

Right now there is little incentive for this to be built, at least not
in a hurry.  However, high margin price points would be a great
incentive.  Already half the systems people I know are wandering around
moaning (and irritated), and these are the very people to do something
about it if they get irritated enough.


Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu

Beowulf mailing list, Beowulf at beowulf.org