[Beowulf] best Linux distribution

Robert G. Brown rgb at phy.duke.edu
Tue Oct 9 09:01:28 EDT 2007


On Tue, 9 Oct 2007, Douglas Eadline wrote:

>
> Excellent point. I have often thought that "diskless" provisioning
> opens up lots of opportunities to create custom node groups
> based on kernels or distributions. Throw in a virtualized
> head node and many ISV requirements could be handled this way
> e.g. a virtualized Suse environment running on top of Red Hat
> could request 32 Suse nodes from the scheduler (running under a
> Red Hat instance). The scheduler just provisions nodes as needed
> and sets them in a low power state when not being used.
> Going with fully virtualized nodes is another option provided
> the applications are still close to hardware.
>
> Note that diskless provisioning does not imply diskless nodes,
> if you need local drives, then you can still use them in a
> diskless booting scheme. Not nailing an OS to the hard drive
> on cluster nodes has lots of advantages.

This has been the subject for lots of Real Computer Science, some of it
done by Jeff Chase and students here at Duke (including Greg Lindahl's
ex-student, Justin Moore).  See "Cluster On Demand" here:

    http://www.cs.duke.edu/nicl/cod/

(and there are various other links GIYF).  COD is basically (as I
understand it) a layer of automated "provisioning software" that takes a
user's resource request (written IIRC in xmlish but I could easily be
wrong), creates a one-time cluster boot image that satisfies it,
allocates nodes from a large, generic pool, reboots them into the boot
image, connects them with suitable workspace (part of the provisioning),
and even starts up the user's job(s) on them.  Or not.
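
A minimal sketch of that allocate/boot/release cycle, in Python.  This
is not COD's code or API: the names Node, Pool, allocate and release and
the image labels are hypothetical, and "rebooting" a node is reduced to
recording which image it was assigned.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    image: Optional[str] = None  # boot image assigned to the node, None when idle


@dataclass
class Pool:
    free: List[Node] = field(default_factory=list)
    busy: Dict[str, List[Node]] = field(default_factory=dict)

    def allocate(self, job_id: str, image: str, count: int) -> List[Node]:
        """Take `count` nodes from the free pool and assign them a boot image."""
        if count > len(self.free):
            raise RuntimeError(f"only {len(self.free)} free nodes, {count} requested")
        nodes, self.free = self.free[:count], self.free[count:]
        for node in nodes:
            node.image = image  # stand-in for "reboot the node into this image"
        self.busy[job_id] = nodes
        return nodes

    def release(self, job_id: str) -> None:
        """Return a job's nodes to the free pool with no image assigned."""
        for node in self.busy.pop(job_id):
            node.image = None
            self.free.append(node)


# Hypothetical four-node pool and a request for three nodes.
pool = Pool(free=[Node(f"n{i:02d}") for i in range(4)])
nodes = pool.allocate("job-1", image="suse-10", count=3)
print([n.name for n in nodes], len(pool.free))
pool.release("job-1")
print(len(pool.free))
```

A real system would also have to build the image, hook up the workspace
and start the job; the sketch only shows the bookkeeping around the pool.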

The provisioning is pretty much OS-neutral.  Want a Windows cluster?  No
problem (licensing permitting, of course).  Solaris?  If it will run on
the hardware and is supported, sure.  Linux obviously -- any flavor, any
size, licensed as needed or not.  Ditto all the other free and open
source OS's.  If you have specific needs for libraries, tools, memory,
processor count or cores, networking (and the needs can be met within
the cluster pool) it will allocate nodes, provision them, and crank them
up for you.
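
Matching a request against the pool can be sketched as a simple filter.
Again this is only an illustration under assumptions: NodeSpec, its
fields and the matching_nodes function are made up for the example, and
the node inventory is invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NodeSpec:
    name: str
    cores: int
    memory_gb: int
    interconnect: str


def matching_nodes(
    pool: List[NodeSpec],
    min_cores: int,
    min_memory_gb: int,
    interconnect: Optional[str] = None,
) -> List[NodeSpec]:
    """Return the nodes in `pool` that satisfy every stated requirement."""
    return [
        n
        for n in pool
        if n.cores >= min_cores
        and n.memory_gb >= min_memory_gb
        and (interconnect is None or n.interconnect == interconnect)
    ]


# Hypothetical inventory of a small, mixed pool.
pool = [
    NodeSpec("n00", cores=2, memory_gb=4, interconnect="gige"),
    NodeSpec("n01", cores=4, memory_gb=8, interconnect="infiniband"),
    NodeSpec("n02", cores=4, memory_gb=16, interconnect="gige"),
]
candidates = matching_nodes(pool, min_cores=4, min_memory_gb=8)
print([n.name for n in candidates])
```

If the filter comes back with fewer nodes than requested, the needs
cannot be met within the pool and the request has to wait or be refused.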

One of several GOOD things about this is that your nodes GO AWAY after
you are done with them.  Doing top-secret work for NSA?  Once you're
done (especially with diskless provisioning) there isn't even a disk
image left on the nodes to be reconstructed by means of advanced
magnetic analysis...

Provisioning really doesn't take very long any more.  Diskless
provisioning takes almost no time at all, but provisioning a full local
boot image needn't take very long either.

I don't know the status of this project, but just wanted to point out
that this is going on and that one day we may yet see a full open source
solution built into Linux (as Linux is a very reasonable choice for a
top-level platform to run this).  All of this can obviously be done by
hand, but it's the automation part that is interesting.  And of course
the advent of serious virtualization with processor-level support means
that we will shortly have even more options -- a whole second way of
doing it NOW is to create and provision portable VMs and run them under
e.g. VMware or whatever.

    rgb

>
> --
> Doug
>
>
>> On Mon, 8 Oct 2007, Robert G. Brown wrote:
>>
>>> RHEL/Centos are good where vendors require "binary compatibility" on
>>> closed source software, as the standard of said binary
>>> compatibility.
>>
>> What strikes me in this whole discussion is the idea of 'one
>> distribution fits all' when applied to all nodes of a cluster and all
>> applications that run on that cluster. In the days of PXE booting,
>> with several solutions readily available for either building a node
>> from scratch (like kickstart) or booting a prebuilt setup with
>> NFS-root or ramdisk, what's so difficult in matching on request a
>> node, an application and a distribution/custom setup ?
>>
>> Real case: A quantum mechanics code that we bought some years ago
>> was provided only as statically linked binaries. They worked fine
>> on the distros current at that time and we successfully used them
>> on CentOS-3 (2.4 kernel). However, we discovered the hard way on the
>> new CentOS-5 (2.6 kernel) that the statically linked binaries didn't
>> work anymore as the kernel interfaces had changed - but, after a few
>> lines were changed in the config files and the nodes rebooted, the
>> binaries were again happily running in their required configuration.
>>
>> Of course, the admin is responsible for defining which
>> distributions/custom setups can run on a certain node, based on the
>> hardware of that node and the kernel of the distribution/custom setup.
>> But after this is done, the user can limit his/her jobs to running on
>> these nodes or ask the queueing system to set up a node according to
>> the requirements of the job (I think that term is 'provisioning').
>> Sure, it helps in this case to run a distribution with long support
>> (like RHEL/CentOS/SL, SLES or Ubuntu LTS) such that you don't have to
>> waste too much time yourself with updates, especially security related
>> ones.
>>
>>> Far short of Debian, but plenty big enough to include just about all
>>> mainstream useful packages for any cluster or LAN.
>>
>> I'm making sure that any cluster related package that is part of the
>> default distribution is not part of what the nodes get to run. Why ?
>> Because very often the common ground options used for building the
>> package (which is a good idea for a widely used distribution) don't
>> fit _my_ setup. So, I take the fact that the distribution offers me all
>> the needed tools as a fallback, but I'm always trying to match as well
>> as possible all the components. And if you search the archives of the
>> LAM/MPI mailing lists you'll see the larger picture...
>>
>> --
>> Bogdan Costescu
>>
>> IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
>> Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
>> Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
>> E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
>> 
>>

-- 
Robert G. Brown
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone(cell): 1-919-280-8443
Web: http://www.phy.duke.edu/~rgb
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977