starting jobs via bproc

Sean Dilda agrajag at scyld.com
Fri Aug 10 10:46:47 EDT 2001


On Thu, 09 Aug 2001, Nicholas Henke wrote:

> Ok, so now that we can detect jobs stopping, what would be the best way to
> start them using bproc? I need to be able to start mpi,pvm, and batch
> jobs. The batch jobs could be interactive or not. I know that Scyld uses
> beompi to run a mpich style mpi interaction, but I would really like to be
> able to use lam as well.
> 	I could see using either bproc_rexec, or using bpsh as a
> replacement for rsh. I have heard that there are problems using bpsh
> instead of rsh.
> 	Any ideas?

We no longer use beompi.  Our newest release just uses a modified
version of mpich.  beompi was also a modified version of mpich, but we
started over and changed less of it for our newest release, and
it seems to be working a lot better than beompi did.  I'm not really
sure what the status of getting LAM to work with our stuff is.  If it
just uses rsh to start jobs, it's theoretically possible to just use bpsh
instead.  However, our mpich modifications actually use BProc functions
like bproc_move() to send jobs to slave nodes.

I think you are missing the point with starting MPI jobs.  You don't
need to use BProc to start the MPI job; instead, the MPI implementation
should know how to distribute itself using BProc.  You would want to use
BProc functions to start jobs that aren't inherently parallel, or in the
code of a parallel program to make it parallel.  You shouldn't need to
use BProc functions to actually start an already parallel job.
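
As a rough illustration of the first case, independent serial jobs can be
spread across slave nodes by the launcher itself.  The Python sketch below
only builds bpsh command lines and does not run them.  It assumes the
usual "bpsh <node> <command...>" form; the helper name assign_round_robin,
the ./simulate program, and the node numbers are made up for the example.

```python
from itertools import cycle


def assign_round_robin(jobs, nodes):
    """Pair each serial job with a slave node, cycling through the nodes.

    jobs  -- list of argv lists, one per independent serial job
    nodes -- list of slave node numbers (hypothetical values in the example)
    Returns a list of bpsh argv lists; nothing is executed here.
    """
    if not nodes:
        raise ValueError("need at least one slave node")
    node_cycle = cycle(nodes)
    # Assumed command form: bpsh <node number> <command> [args...]
    return [["bpsh", str(next(node_cycle))] + list(job) for job in jobs]


if __name__ == "__main__":
    # ./simulate is a placeholder for any serial program.
    jobs = [["./simulate", "--seed", str(i)] for i in range(5)]
    for argv in assign_round_robin(jobs, nodes=[0, 1, 2]):
        print(" ".join(argv))
```

On a real master node, each returned list could be handed to
subprocess.run() to actually start the job.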

What are the problems with using bpsh as a replacement for rsh?  The
only one I know of personally is that bpsh won't start an interactive
shell on the slave node.  Other than that, it should be a suitable
replacement for just starting jobs on slave nodes.  If there is a
fundamental problem with bpsh, please let me know so I can see about
getting it fixed.
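
To make the substitution concrete, here is a minimal sketch of a launcher
helper that can emit either an rsh or a bpsh command line.  It assumes
bpsh addresses slave nodes by number and takes the command right after
the node; the function name remote_argv is hypothetical.  Because bpsh
won't start an interactive shell, the sketch refuses an empty command
instead of falling back to a login shell the way rsh would.

```python
def remote_argv(launcher, target, command):
    """Build the argv that starts `command` on `target`.

    launcher -- "rsh" or "bpsh"
    target   -- hostname for rsh, slave node number for bpsh (assumption)
    command  -- non-empty list: program followed by its arguments
    """
    if launcher not in ("rsh", "bpsh"):
        raise ValueError("unknown launcher: %r" % (launcher,))
    if not command:
        # rsh without a command opens a login shell; bpsh has no
        # interactive-shell equivalent, so require an explicit command.
        raise ValueError("a command is required")
    if launcher == "bpsh":
        # In this sketch bpsh targets are node numbers, not hostnames.
        target = int(target)
    return [launcher, str(target)] + list(command)


if __name__ == "__main__":
    print(remote_argv("rsh", "node3", ["hostname"]))
    print(remote_argv("bpsh", 3, ["hostname"]))
```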