You should try mpiexec
cafe
Hi,
we're running MPICH 1.2.4 on a 32-node dual-CPU Linux cluster (Fast
Ethernet), and are having some problems with the MPICH job distribution.
An example from today:
The PBS job:
----------------------------------------
#PBS -l nodes=4:ppn=2,walltime=100:00:00
#
mpirun -np `wc -l < $PBS_NODEFILE` -machinefile $PBS_NODEFILE mfix.exe
----------------------------------------
is assigned to nodes:
node17/0+node15/0+node14/0+node11/0+node17/1+node15/1+node14/1+node11/1
PBS generates a PBS_NODEFILE containing:
-----------------------------
node17
node15
node14
node11
node17
node15
node14
node11
-----------------------------
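For reference, the per-node process count that this file asks for can be read off by counting how often each hostname appears. This is only a minimal sketch, assuming a POSIX shell with sort, uniq and awk available; the helper name count_slots is made up for illustration and is not part of PBS or MPICH:

```shell
# Hypothetical helper: count how many slots the node file assigns to each host.
# Each line of the node file is one hostname; a hostname that appears
# twice means two processes are expected on that node.
count_slots() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Inside a PBS job this would be called as:
#   count_slots "$PBS_NODEFILE"
```

For the node file listed above, this should report two slots on each of node11, node14, node15 and node17, which is the distribution one would expect mpirun to follow.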
And this command is started on node17:
mpirun -np 8 -machinefile /var/spool/PBS/aux/20996.fire executable
And then when I look over the nodes, there's 1 executable running on
node17, 3 on node15, 2 on node14 and 2 on node11.
Anybody seen something like this, and maybe have an idea of what might
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf