[Beowulf] Hypothetical Situation

Bogdan Costescu bogdan.costescu at iwr.uni-heidelberg.de
Thu Jan 22 13:58:26 EST 2004

On Thu, 22 Jan 2004, Brent M. Clements wrote:

> 1. It must be part of a shared computing facility controlled by a batch
> queuing system.

Use the "epilogue" facility to run a script that either installs a sane 
image and then reboots, or reboots first and lets PXE+something (a recent 
thread mentioned what this something can be :-)) take care of the rest.
The trickier part is starting the user compiled kernel, because the
"epilogue" of a job would need to know about the requirements of the next
job. Making the change in the "prologue" of the user's job might not work
properly because the node is rebooted in the process of changing the
kernel and this breaks the link to the batch system daemon...
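
As a sketch of the first variant (the paths, the hex-IP config naming and 
the dry-run helper are my assumptions, not a tested recipe): the epilogue 
flips the node's pxelinux entry back to the sane config and reboots, so 
the next PXE boot reinstalls the stock image.

```shell
#!/bin/sh
# Hypothetical epilogue sketch: point the node's pxelinux entry back
# at the sane config, then reboot so PXE reinstalls the stock image.

restore_sane_image() {
    tftp_root=$1    # e.g. /tftpboot/pxelinux.cfg on the boot server
    node_hexip=$2   # pxelinux names per-node config files by the IP in hex
    reboot_cmd=$3   # normally /sbin/reboot; pass "echo ..." to dry-run

    ln -sf sane-image "$tftp_root/$node_hexip"
    $reboot_cmd
}

# Dry-run demonstration; a real epilogue would pass /sbin/reboot.
mkdir -p /tmp/pxe-demo
restore_sane_image /tmp/pxe-demo 0A000001 "echo rebooting"
```

(0A000001 is 10.0.0.1 in hex, which is how pxelinux looks up per-node 
config files.)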

> 2. A normal user must be able to compile a customized kernel, then submit
> a job with a variable pointing to that kernel. The batch queuing system
> must then load that kernel onto the allocated nodes and reboot the nodes
> allocated.

I did something similar as a test, but once I realized the implications I 
never put it into production, even though I would have allowed only 
kernels compiled by myself :-)

Basically, a user compiled kernel means that the kernel can do anything:
reading the contents of /etc/shadow and dumping them into an unprotected
file, disturbing the network activity on the switch(es) where the node is
connected, overloading file-servers, making connections from privileged
ports (which can upset the batch system :-)), destroying local
file-systems. It can even destroy hardware by driving it with
out-of-spec parameters. Completely hijacking the node is another
possibility; there's no guarantee with a user compiled kernel that the
batch system daemons are started, for example, so the master node might
lose all control over the node, requiring a human to press a reset button.

Now even if these kernels come only from trusted persons, there are still 
many things that can go wrong. For example, running a RHEL 3.0 user space 
on a non-NPTL kernel will break threaded applications. Compiling most 
(or all) drivers into the kernel while user space expects modules might 
break the whole system. Not including IP autoconfiguration when the root 
FS is over NFS will prevent the node from booting at all. And so on...
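
To make the NFS-root example concrete, these are the kernel config 
options that must be built in (y, not m) for a diskless node to mount its 
root over NFS; a sketch, not anyone's exact config:

```
CONFIG_IP_PNP=y        # kernel-level IP autoconfiguration at boot
CONFIG_IP_PNP_DHCP=y   # ...obtained via DHCP
CONFIG_NFS_FS=y        # NFS client compiled in, not as a module
CONFIG_ROOT_NFS=y      # allow the root filesystem to live on NFS
```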

> 2a. If the node doesn't come back up after rebooting, the job must be
> canceled and the node rebuilt automatically with a stable kernel/image.

"doesn't come back up after rebooting" is a pretty vague description. 
Probably a watchdog (even the software one) would help, but might still 
not catch all cases.
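
The software watchdog works by having the kernel reboot the machine when 
nothing has written to /dev/watchdog for a while. A minimal sketch of the 
feeding side, with the device path made a parameter so the loop can be 
exercised harmlessly against a plain file (a real daemon would loop 
forever on /dev/watchdog, with softdog or a hardware driver loaded):

```shell
#!/bin/sh
# Sketch of a watchdog feeder: something must keep writing to
# /dev/watchdog; when the writes stop (hung node, broken user kernel),
# the watchdog timer expires and the box reboots on its own.

feed_watchdog() {
    dev=$1      # /dev/watchdog in real life
    beats=$2    # real daemons loop forever; bounded here for the demo
    i=0
    while [ "$i" -lt "$beats" ]; do
        echo . > "$dev"     # any write resets the timer
        sleep 1
        i=$((i + 1))
    done
}

feed_watchdog /tmp/fake-watchdog 3   # demo against a plain file
```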

> 3. When the job is finished, the node must be rebuilt automatically using
> a stable kernel/image.

If the user compiled kernel plays nice and gives back the node to the 
batch system... just reboot and instruct PXE to give a sane kernel/image.
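
A sketch of what that sane pxelinux entry might look like (the kernel and 
initrd file names are invented for illustration):

```
DEFAULT sane

LABEL sane
    KERNEL vmlinuz-stable
    APPEND initrd=initrd-stable.img root=/dev/nfs ip=dhcp
```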

You might have a better chance of keeping things under control using some 
kind of virtualization technique, such as UML or VMware/Bochs. Limits can 
be imposed from the host system, but performance might drop a lot.

One solution that I used at some point and will probably use again in the
near future is to have a (small) set of sane kernels that are allowed for
such purposes; that would allow for example conflicting drivers (like GM
and PM for Myrinet cards) to be used - for each job the kernel that
corresponds to the right driver is chosen. This greatly reduces the risks
as the kernels are compiled only by the admin, who is supposed to know
what he is doing :-)
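
A sketch of how a prologue could honour a job-supplied kernel name only 
when it appears on the admin's list (the paths, kernel names and the idea 
of a whitelist file are my own illustration):

```shell
#!/bin/sh
# Sketch: accept a job's requested kernel only if the admin built it;
# otherwise fall back to the default sane kernel.

pick_kernel() {
    requested=$1    # e.g. taken from a job variable
    whitelist=$2    # one admin-built kernel name per line
    fallback=$3
    if grep -qx "$requested" "$whitelist"; then
        echo "$requested"    # approved, e.g. the GM vs PM driver kernels
    else
        echo "$fallback"
    fi
}

printf 'kernel-gm\nkernel-pm\n' > /tmp/kernels.ok
pick_kernel kernel-gm   /tmp/kernels.ok kernel-default   # -> kernel-gm
pick_kernel kernel-evil /tmp/kernels.ok kernel-default   # -> kernel-default
```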

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De

Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
