Fedora cluster project? (was Re: [Beowulf] Opteron/Athlon Clustering)

Robert G. Brown rgb at phy.duke.edu
Wed Jun 9 09:44:25 EDT 2004

On Wed, 9 Jun 2004, Mitchell Skinner wrote:

> I've been kicking around the idea of starting a fedora-oriented cluster
> distribution.
> ***My goals for it would be:
> Social/political goals:
> 1. Ease of installation ("yum install cluster-master")
> 2. piggyback on other work (Fedora release engineering, mobs of people
> trying it out on commodity hardware)
> 3. Encourage outside contributions (have a completely open devel
> process, use a license without an advertising clause)
> 4. Be an integration point for applications ("yum install mpiblast")
> 5. Feed back upstream (to fedora and/or directly to maintainers)
> Technical goals/hopes:
> 1. Organize as a set of add-on packages, rather than a whole
> distribution (like OSCAR, but without the extra complexity of multiple
> base distributions).  This means creating SRPMs that can be fed upstream
> (unlike rocks-sge, for example).
> 2. Use RPM/anaconda to select architecture-specific files, like Rocks
> (handles heterogeneous clusters more cleanly than systemimager (OSCAR) or
> network booting (warewulf))

Yes, these are dead on the money.  And with anonymous rsync and other
mirror tools, one doesn't even have to make it a separate "distribution"
-- one can either simply put up a repository to which yum can be
directed from anywhere or set up a toplevel site that others can mirror
(or both).  Mozilla, for example, distributes in this way -- one can
actually yum update directly from the mozilla site or use mozilla as it
comes in your primary repository (giving you a bit more choice and
control over when it is updated from the primary site) or you can create
a mirror of their repository and yum update or distribute from it.
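
To make the "repository to which yum can be directed" idea concrete, here
is a minimal sketch in Python that renders a yum repository stanza.  The
repository id, the description, the URL and the function name are all
invented for illustration and are not a real site; only the option names
(name, baseurl, enabled, gpgcheck) are standard yum settings.

```python
import configparser
import io


def render_repo_stanza(repo_id, name, baseurl, gpgcheck=True):
    """Return the text of a yum repository stanza.

    All values are supplied by the caller; nothing here is a real site.
    """
    parser = configparser.ConfigParser()
    # Keep option names exactly as written (configparser lowercases by default).
    parser.optionxform = str
    parser[repo_id] = {
        "name": name,
        "baseurl": baseurl,
        "enabled": "1",
        "gpgcheck": "1" if gpgcheck else "0",
    }
    buffer = io.StringIO()
    # Write key=value pairs without spaces around the delimiter.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Hypothetical add-on repository layered on top of the base distribution.
stanza = render_repo_stanza(
    "cluster-tools",
    "Cluster add-on packages (example)",
    "http://mirror.example.org/cluster-tools/",
)
print(stanza)
```

A local mirror would differ only in the baseurl line; the packages
themselves stay the same.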

> ***Potential Objections:
> 1.  "Fedora changes too frequently" - This is problematic in proportion
> to the pain of change.  One reason that change is painful is that people
> put it off, and then have to make a huge change all at once.  More
> frequent, more incremental changes can work, especially if you have the
> source to your apps.  This is assuming that the still-somewhat-untested
> fedora-legacy project doesn't work out; if it does then this objection
> is moot.  OTOH, if you want your closed-source ISV apps to be certified
> for your setup, then maybe this approach is not for you.

I don't know that I'd agree with this assertion.  Fedora is a project in
its first year, and began by basically repackaging RH 9 just to get a
supported base before RH 9 was officially unsupported.  However, it was
also forced almost immediately to e.g. do a build for x64, driven by
overwhelming demand.  It also became clear that it had started up right
on the cusp of a major kernel upgrade, since RH 9 was 2.4 and 2.6 had
been out there long enough to be considered quite mature and (as always)
has lots of advantages mixed in with the pain of change.

FC2 is "out there", but I'm certain that many places (Duke, for
instance) are proceeding very cautiously to build and test it pretty
thoroughly locally with a mix of rawhide process, early adopters (mostly
systems persons), and local debugging ultimately contributed back to the
primary image.  This has to proceed (at an institutional level) for ALL
supported architectures -- we installed FC1-x64 without really going
through this the first time because we had opterons we needed to run
"right now", but for FC2 we are proceeding very deliberately and one of
my cluster nodes is currently sacrificed solely for development/test
purposes for FC2.

Lots of sites tend to mirror us because Seth is both yum-god and
moderately compulsive when it comes to setting up stable repositories
that will de facto be used by thousands of systems here and elsewhere.
This is the REAL development cycle for Fedora (just as it was for RH).
RH released 8 and then 9 "too fast", but many sites (us included) simply
skipped 8 altogether and went from 7.3 to 9 when we finished putting
together a 9 repository that suited us and tested well.

So I personally think that Fedora will very likely stabilize and slow
down (in one sense) once FC2 is really out there and fully adopted by
many sites.  What I'd REALLY like to see is a period where the core is
very stable but where people develop and add fully consistent packages
and package groups (like a cluster package group) that are essentially a
set of source rpms and a build script.  This sort of packaging is HOW
FC2 can manage to "evolve" so quickly in spite of being run primarily by
the community.  As long as the core remains solid, well-constructed
source RPMs will just rebuild and run, with only rare problems caused by
major dependency/library changes.
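
The "set of source rpms and a build script" could be as simple as the
following sketch.  It only constructs the `rpmbuild --rebuild` command
lines for a list of source RPMs and prints them, leaving execution to the
caller.  The package file names and the function name are hypothetical
placeholders, not real packages.

```python
import shlex


def rebuild_commands(srpms, target=None):
    """Build one `rpmbuild --rebuild` command per source RPM.

    `target` is an optional architecture string passed via --target.
    """
    commands = []
    for srpm in srpms:
        if not srpm.endswith(".src.rpm"):
            raise ValueError(f"not a source RPM: {srpm}")
        cmd = ["rpmbuild", "--rebuild"]
        if target:
            cmd += ["--target", target]
        cmd.append(srpm)
        commands.append(cmd)
    return commands


# Hypothetical package group; these file names are placeholders.
cluster_group = ["toolA-1.0-1.src.rpm", "toolB-2.3-1.src.rpm"]

for cmd in rebuild_commands(cluster_group, target="x86_64"):
    # Print instead of executing so the sketch is safe to run anywhere.
    print(shlex.join(cmd))
```

Running the printed commands on a new core release is then the whole
"port": if the source RPMs are well constructed, the rebuild either
succeeds or fails on a specific dependency that can be fixed in one place.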

I've been thinking about trying to set up a similar cluster tool
packaging project (not really a "distribution" project, since it would
just be an optional layer on top of Fedora or really pretty much any rpm
based distro) here at Duke, as you can probably tell from my previous
message.  However, the BEST way to do it is likely with lots of
volunteers in many places working according to a common set of rules and
with a common core repository and maybe CVS server.  That way different
groups could "own" particular tools and be responsible for packaging and
documenting the packaging of those tools, and (relatively infrequently)
for repackaging updates/upgrades.  The BEST groups to do this, of
course, are the primary tool developers, but this will likely come in
time as they see how this process increases the users of their tools
manyfold.

> What do people think?

Obviously I think it is a peachy idea;-) I'm going to try to talk to
some of the various powers that be here at Duke this summer about this
and see if we cannot create a core site for this within the Duke CSEM
project and get some of the cluster engineers on campus to agree to
contribute some time (which they are doing anyway for a lot of these
tools, it is just time spent building one-offs for their clusters from
tarballs) to do a proper, reusable packaging.  It would almost certainly
be primarily built on top of Fedora-current (since that's where we are
going at the moment) but I would expect the source rpms to rebuild
cleanly on most rpm architectures, and with volunteers using the rpms on
other distributions we would eventually get them patched up so that they
DID rebuild cleanly, possibly even on e.g. Solaris and non-Linux but
rpm-supporting systems.


> Mitch

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
