Building a Virtual Cluster with Xen (Part Two)


Installing MPI (Message Passing Interface)

We will install the MPICH implementation of MPI, version 1.2.7p1. The link below should point to the most recent release, which is 1.2.7p1 at the time of writing. The installation is very simple:

  
-bash-3.00# wget -nd http://www-unix.mcs.anl.gov/mpi/mpich1/downloads/mpich.tar.gz  
-bash-3.00# tar -zxf mpich.tar.gz
-bash-3.00# cd mpich-1.2.7p1/
-bash-3.00# ./configure --prefix=/usr/local/mpich/mpich-1.2.7p1
-bash-3.00# make

We modify the file util/machines/machines.LINUX so that it contains the names of the machines on which we want the MPI jobs to run:

  
-bash-3.00# cat util/machines/machines.LINUX
# Change this file to contain the machines that you want to use
# to run MPI jobs on.  The format is one host name per line, with either
#    hostname
# or
#    hostname:n
# where n is the number of processors in an SMP.  The hostname should
# be the same as the result from the command "hostname"
slave1:4
slave2:4
slave3:4
slave4:4
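
When choosing a value for mpirun -np later on, it helps to know how many processor slots this file declares. A minimal sketch, assuming only the hostname and hostname:n formats described in the comments above; count_slots is an illustrative helper name, not part of MPICH:

```shell
#!/bin/bash
# count_slots FILE: sum the processor counts declared in an MPICH
# machines file. Lines are "hostname" (1 slot) or "hostname:n" (n slots);
# comment lines and blank lines are ignored.
count_slots() {
    awk -F: '
        /^[[:space:]]*#/ || /^[[:space:]]*$/ { next }
        { total += (NF >= 2 ? $2 : 1) }
        END { print total + 0 }
    ' "$1"
}

# Demo on a throwaway copy of the entries used in this article.
demo=$(mktemp)
printf '%s\n' '# comment' 'slave1:4' 'slave2:4' 'slave3:4' 'slave4:4' > "$demo"
count_slots "$demo"    # prints 16
rm -f "$demo"
```

On the master node the function would be pointed at util/machines/machines.LINUX instead of the temporary file.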

Then we can proceed with the installation:

  
-bash-3.00# make install


Configuration Of The MPICH Modulefile

As mentioned above, we will use the modules package to let users switch easily from one version of the library to another (useful if we later decide to install, for example, the MPICH2 version of the library). Following the instructions in the INSTALL file of the modules package, we create the directory /usr/local/Modules/3.2.3/modulefiles/mpich/. Inside it we create two files:

  
-bash-3.00# cat /usr/local/Modules/3.2.3/modulefiles/mpich/mpich127p1
#%Module1.0#####################################################################
##
## mpich 1.2.7p1 modulefile
##
## modulefiles/mpich/mpich127p1
##
proc ModulesHelp { } {
        global version mpichroot

        puts stderr "\tmpich 1.2.7p1 - loads the MPICH version 1.2.7p1 library"
        puts stderr "\n\tThis adds $mpichroot/* to several of the"
        puts stderr "\tenvironment variables.\n"
}

module-whatis   "loads the MPICH 1.2.7p1 library"

# for Tcl script use only
set     version         1.2.7p1
set     mpichroot       /usr/local/mpich/mpich-1.2.7p1

prepend-path    PATH            $mpichroot/bin
prepend-path    MANPATH         $mpichroot/man
prepend-path    LD_LIBRARY_PATH $mpichroot/lib



-bash-3.00# cat /usr/local/Modules/3.2.3/modulefiles/mpich/.version
#%Module1.0###########################################################
##
## version file for MPICH
##
set ModulesVersion      "mpich127p1"
-bash-3.00#
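
As far as we understand the modules package, the name given to ModulesVersion is expected to match a modulefile in the same directory, otherwise the default cannot be resolved. A hedged sketch of a check, assuming the set ModulesVersion "name" layout shown above; check_default is a hypothetical helper and not part of the modules package:

```shell
#!/bin/bash
# check_default DIR: read the default named in DIR/.version and report
# whether a modulefile with that name exists in DIR.
check_default() {
    local dir=$1 name
    # Extract the quoted value from a line of the form: set ModulesVersion "name"
    name=$(sed -n 's/^set[[:space:]]*ModulesVersion[[:space:]]*"\(.*\)".*$/\1/p' "$dir/.version")
    if [ -n "$name" ] && [ -f "$dir/$name" ]; then
        echo "ok: default $name exists"
    else
        echo "warning: default '$name' has no matching modulefile"
        return 1
    fi
}

# Demo in a scratch directory standing in for modulefiles/mpich/.
demo=$(mktemp -d)
printf 'set ModulesVersion      "mpich127p1"\n' > "$demo/.version"
touch "$demo/mpich127p1"
check_default "$demo"
rm -rf "$demo"
```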

And we replicate them to the slave nodes:

  
-bash-3.00# cexec mkdir /usr/local/Modules/default/modulefiles/mpich
-bash-3.00# cpush /usr/local/Modules/default/modulefiles/mpich/.version 
-bash-3.00# cpush /usr/local/Modules/default/modulefiles/mpich/mpich127p1

Now we can verify that, as user angelv, we can use Modules to load and unload this version of MPICH:

  
-bash-3.00# su - angelv
[angelv@boldo ~]$ echo $PATH
[angelv@boldo ~]$ module avail
[angelv@boldo ~]$ module load mpich/mpich127p1
[angelv@boldo ~]$ echo $PATH           # the path to mpich127p1 should now be included
[angelv@boldo ~]$ module unload mpich/mpich127p1
[angelv@boldo ~]$ echo $PATH           # the path should again be as in the first call
[angelv@boldo ~]$   
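
The visual comparison of $PATH before and after loading the module can also be scripted. A minimal sketch, assuming the install prefix used in this article; path_contains is an illustrative helper name:

```shell
#!/bin/bash
# path_contains DIR [LIST]: succeed if DIR is one of the colon-separated
# entries of LIST (LIST defaults to the current $PATH).
path_contains() {
    local dir=$1 list=${2-$PATH}
    case ":$list:" in
        *":$dir:"*) return 0 ;;
        *)          return 1 ;;
    esac
}

mpich_bin=/usr/local/mpich/mpich-1.2.7p1/bin
if path_contains "$mpich_bin"; then
    echo "MPICH module is loaded"
else
    echo "MPICH module is not loaded"
fi
```

Wrapping the list in colons before matching avoids false positives when one directory name is a prefix of another.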

And since for the moment we will only have this version of MPICH installed, we can load this module by default by modifying the file /home/angelv/.bash_profile. The relevant lines of this file would look like:

  
# put any module loads here
        module add null
        module load mpich/mpich127p1

Now we can do a quick test to verify that MPICH works as intended. As angelv we do:

  
[angelv@boldo ~]$ cp /usr/local/mpich/mpich-1.2.7p1/examples/cpi.c .
[angelv@boldo ~]$ mpicc -o cpi cpi.c
[angelv@boldo ~]$ mpirun -np 17 cpi

Since the test runs without problems, we distribute the MPICH installation to the slave nodes by running the following on the master node as root:

  
-bash-3.00# cd /cshare
-bash-3.00# tar -cPf mpich-dist /usr/local/mpich
-bash-3.00# cexec tar -xPf /cshare/mpich-dist
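
The -P option matters here: it keeps the leading / in the member names, so the archive should unpack to the same absolute location on every node regardless of the current directory. A small sketch, assuming GNU tar, that lists an archive to confirm this before it is pushed; the scratch directory stands in for /usr/local/mpich and first_member is a made-up helper name:

```shell
#!/bin/bash
# first_member ARCHIVE: print the first member name stored in a tar archive,
# listing with -P so that a leading "/" is shown rather than stripped.
first_member() {
    tar -tPf "$1" | head -n 1
}

# Demo: a scratch directory stands in for /usr/local/mpich.
src=$(mktemp -d)
echo demo > "$src/file"
archive=$(mktemp)
tar -cPf "$archive" "$src"

case "$(first_member "$archive")" in
    /*) echo "absolute paths preserved" ;;
    *)  echo "leading / was stripped" ;;
esac

rm -rf "$src" "$archive"
```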

Installing the Resource Manager and Scheduler

If you just wanted a cluster for yourself, you could probably get away without a resource manager and scheduler, but if you intend to let a number of people make use of the cluster, a job queue management system will help you increase its utilization and at the same time share the resources amongst the users in a fair way. Our choice of software to do this will be Torque as the resource manager and Maui as the resource scheduler.

Torque Installation

Torque installation is really simple:

 
-bash-3.00# wget -nd http://www.clusterresources.com/downloads/torque/torque-2.1.2.tar.gz
-bash-3.00# tar -zxf torque-2.1.2.tar.gz
-bash-3.00# cd torque-2.1.2
-bash-3.00# ./configure --prefix=/usr/local/torque/torque-2.1.2
-bash-3.00# make
-bash-3.00# make install

In order to use the modules package with this version of Torque, we create the directory /usr/local/Modules/3.2.3/modulefiles/torque/. We do not replicate it to the slave nodes, because it mainly gives users access to the man pages and binaries, which they would not normally need on the slaves. Inside the directory we create two files:

  
-bash-3.00# cat /usr/local/Modules/3.2.3/modulefiles/torque/.version
#%Module1.0###########################################################
##
## version file for Torque
##
set ModulesVersion      "torque212"

-bash-3.00# cat /usr/local/Modules/3.2.3/modulefiles/torque/torque212
#%Module1.0#####################################################################
##
## Torque 2.1.2 modulefile
##
## modulefiles/torque/torque212
##
proc ModulesHelp { } {
        global version torqueroot

        puts stderr "\ttorque 2.1.2 - loads TORQUE version 2.1.2"
        puts stderr "\n\tThis adds $torqueroot/* to several of the"
        puts stderr "\tenvironment variables.\n"
}

module-whatis   "loads TORQUE 2.1.2"

# for Tcl script use only
set     version         2.1.2
set     torqueroot       /usr/local/torque/torque-2.1.2

prepend-path    PATH            $torqueroot/bin
prepend-path    MANPATH         $torqueroot/man
prepend-path    LD_LIBRARY_PATH $torqueroot/lib

Since for the moment we will have only one version of Torque, and we want to provide it by default to all users, we add the following line to the end of the file /etc/bashrc:

  
module load torque/torque212

To have access to the Torque commands, libraries, etc. from the root account, root also needs its own dot files, so we create the following two files (Note: Remember to log out and log in again so that these changes take effect):

 
-bash-3.00# cat /root/.bashrc
module load torque/torque212

-bash-3.00# cat /root/.bash_profile
# start .profile
if [ -f /etc/profile.modules ]
then
        . /etc/profile.modules
# put any module loads here
fi

sh() { bash "$@"; }

# end .profile


# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

We configure Torque following the Torque Quick Start Guide. Specifically, we will use the manual configuration instructions.

  
-bash-3.00# /usr/local/torque/torque-2.1.2/sbin/pbs_server -t create
-bash-3.00# qmgr -c "set server scheduling=true"
-bash-3.00# qmgr -c "create queue batch queue_type=execution"
-bash-3.00# qmgr -c "set queue batch started=true"
-bash-3.00# qmgr -c "set queue batch enabled=true"
-bash-3.00# qmgr -c "set queue batch resources_default.nodes=1"
-bash-3.00# qmgr -c "set queue batch resources_default.walltime=3600"
-bash-3.00# qmgr -c "set server default_queue=batch"
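
The walltime default above is given in seconds, so 3600 means one hour. To our knowledge Torque also accepts the HH:MM:SS form. A small sketch for converting between the two when choosing queue limits; secs_to_hms is a helper name made up for this example:

```shell
#!/bin/bash
# secs_to_hms SECONDS: print a duration given in seconds as HH:MM:SS.
secs_to_hms() {
    local s=$1
    printf '%02d:%02d:%02d\n' $((s / 3600)) $((s % 3600 / 60)) $((s % 60))
}

secs_to_hms 3600     # prints 01:00:00
secs_to_hms 5400     # prints 01:30:00
```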

We create the following two files:

  
-bash-3.00# cat /var/spool/torque/server_priv/nodes
slave1 np=4
slave2 np=4
slave3 np=4
slave4 np=4

-bash-3.00# cat /var/spool/torque/mom_priv/config
$usecp *:/home  /home
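
The nodes file carries the same host and processor information as the machines.LINUX file used for MPICH, only in a different syntax. A sketch that derives one from the other so that the two stay consistent, assuming the hostname:n entries shown earlier; machines_to_nodes is a hypothetical helper:

```shell
#!/bin/bash
# machines_to_nodes: read MPICH machines entries ("host" or "host:n") on
# stdin and print Torque nodes entries ("host np=n") on stdout.
machines_to_nodes() {
    awk -F: '
        /^[[:space:]]*#/ || /^[[:space:]]*$/ { next }   # skip comments, blanks
        { printf "%s np=%d\n", $1, (NF >= 2 ? $2 : 1) }
    '
}

printf '%s\n' 'slave1:4' 'slave2:4' | machines_to_nodes
# prints:
#   slave1 np=4
#   slave2 np=4
```

After reviewing the output, it could be redirected to /var/spool/torque/server_priv/nodes.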

And we replicate the installation to the slaves:

  
-bash-3.00# tar -cPf /cshare/torque_slaves /usr/local/torque
-bash-3.00# tar -cPf /cshare/torque_spool /var/spool/torque

-bash-3.00# cexec tar -xPf /cshare/torque_slaves
-bash-3.00# cexec tar -xPf /cshare/torque_spool

We can then restart the server, start the MOMs on the slaves, and verify that they report to the server:

  
-bash-3.00# /usr/local/torque/torque-2.1.2/bin/qterm -t quick
-bash-3.00# cexec /usr/local/torque/torque-2.1.2/sbin/pbs_mom
-bash-3.00# /usr/local/torque/torque-2.1.2/sbin/pbs_server
-bash-3.00# /usr/local/torque/torque-2.1.2/bin/pbsnodes -a 
            (the nodes take a while to report, so we may have to wait before they appear)
slave1
     state = free
     np = 4
     ntype = cluster
[...]
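
Because the nodes take a while to report, it can be convenient to summarise the pbsnodes output rather than read it by eye. A sketch, assuming the layout shown above (an unindented node name followed by indented attribute = value lines); count_free is an illustrative name:

```shell
#!/bin/bash
# count_free: read "pbsnodes -a" style output on stdin and print the
# number of nodes whose state is exactly "free".
count_free() {
    awk '$1 == "state" && $2 == "=" && $3 == "free" { n++ } END { print n + 0 }'
}

# Demo with canned text; on the master one would run: pbsnodes -a | count_free
count_free <<'EOF'
slave1
     state = free
     np = 4
     ntype = cluster

slave2
     state = down
     np = 4
     ntype = cluster
EOF
# prints 1
```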

Automatic Start of Torque at Boot Time

We probably want Torque to start automatically at boot time. This is easily accomplished by creating the init file torque_server on the master node as follows (similar scripts can be found here or in the contrib/init.d directory of the source code). Remember to set its permissions to 755 with the command chmod 755 /etc/init.d/torque_server:

  
-bash-3.00# cat /etc/init.d/torque_server
#!/bin/bash
#
# Red Hat Linux Torque Resource script
#
# chkconfig: 345 80 80
# description: TORQUE is a scalable resource manager which manages jobs in
# cluster environments.

# Source function library.
. /etc/init.d/functions

TORQUEBINARY="/usr/local/torque/torque-2.1.2/sbin/pbs_server"

start() {
        if [ -x $TORQUEBINARY ]; then
          daemon $TORQUEBINARY
          RETVAL=$?
          return $RETVAL
        else
          echo "$0 ERROR: Torque server program not found"
          return 1
        fi
}

stop() {
        echo -n "Stopping $TORQUEBINARY: "
        killproc $TORQUEBINARY
        RETVAL=$?
        echo
        return $RETVAL
}

restart() {
        stop
        start
}

reload() {
        restart
}

case "$1" in
start)
        start
        ;;
stop)
        stop
        ;;
reload|restart)
        restart
        ;;
status)
        status $TORQUEBINARY
        ;;
*)
        echo $"Usage: $0 {start|stop|restart|reload|status}"
        exit 1
esac

exit $?

Similarly, we would create torque_mom init scripts in the slaves, stored as /etc/init.d/torque_mom, and with permissions 755. These files are almost identical to the file /etc/init.d/torque_server, except for the following two changes (output from the diff command):

 
-bash-3.00# diff /etc/init.d/torque_server torque_mom_boldo_slave1
12c12
< TORQUEBINARY="/usr/local/torque/torque-2.1.2/sbin/pbs_server"
---
> TORQUEBINARY="/usr/local/torque/torque-2.1.2/sbin/pbs_mom"
20c20
<           echo "$0 ERROR: Torque server program not found"
---
>           echo "$0 ERROR: Torque mom program not found"

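Given that the two scripts differ only in these two lines, the mom script could be derived from the server script instead of being edited by hand. A sketch using sed, demonstrated here on the two affected lines; server_to_mom is a made-up name, and on the master one would feed it /etc/init.d/torque_server instead:

```shell
#!/bin/bash
# server_to_mom: rewrite a torque_server init script (stdin) into the
# torque_mom variant (stdout) by applying the two changes from the diff.
server_to_mom() {
    sed -e 's|sbin/pbs_server|sbin/pbs_mom|' \
        -e 's|Torque server program|Torque mom program|'
}

server_to_mom <<'EOF'
TORQUEBINARY="/usr/local/torque/torque-2.1.2/sbin/pbs_server"
          echo "$0 ERROR: Torque server program not found"
EOF
```

Running diff on the original and the generated file is a quick way to confirm that nothing else changed.
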
Then we create the necessary symbolic links in both the master and the slaves:

  
-bash-3.00# /sbin/chkconfig --add torque_server
-bash-3.00# cexec /sbin/chkconfig --add torque_mom 
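
chkconfig --add takes its settings from the "# chkconfig: 345 80 80" comment in the init script: the runlevels in which the service is enabled, followed by the start and stop priorities. A sketch that extracts those three fields as a sanity check before registering a script; the function name chkconfig_header is an assumption of this example:

```shell
#!/bin/bash
# chkconfig_header FILE: print the runlevels, start priority and stop
# priority declared in the "# chkconfig:" comment of an init script.
chkconfig_header() {
    awk '$1 == "#" && $2 == "chkconfig:" {
        printf "runlevels=%s start=%s stop=%s\n", $3, $4, $5
        exit
    }' "$1"
}

# Demo on a throwaway file containing the header used in this article.
demo=$(mktemp)
printf '%s\n' '#!/bin/bash' '# chkconfig: 345 80 80' > "$demo"
chkconfig_header "$demo"    # prints runlevels=345 start=80 stop=80
rm -f "$demo"
```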

And lastly we link the libtorque.so.0 file to /usr/lib/ because later on Maui will need it at start-up time, and it will not be able to find it in its current location:

  
-bash-3.00# ln -s /usr/local/torque/torque-2.1.2/lib/libtorque.so.0 /usr/lib/

    Creative Commons License
    ©2005-2012 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.