[Beowulf] PVM BEOLIN & SMP
Eduardo Cesar Cabrera Flores
eccf at super.unam.mx
Thu Jan 29 20:16:13 EST 2004
Has anybody successfully run the PVM BEOLIN configuration on a
cluster using both processors of each dual-CPU machine?
I get the following output (I'm using OpenPBS to submit my job):
[eccf at mixbaal BEOLIN]$ more ~/pvml.10006
[pvmd pid1213] 01/29 19:43:50 PROC_LIST=nodo13:nodo13:nodo12:nodo12:nodo11:nodo11:nodo10:nodo10:
[pvmd pid1213] 01/29 19:43:50 8 nodes in list.
[t80040000] 01/29 19:43:50 nodo13 (192.168.1.13:32770) BEOLIN 3.4.4
[t80040000] 01/29 19:43:50 ready Thu Jan 29 19:43:50 2004
[t80040000] 01/29 19:43:50 mpp_find looking for t60000
[t80040000] 01/29 19:43:50 mpp_find looking for t60001
[t80040000] 01/29 19:43:50 mpp_find: Task not found
[t80040000] 01/29 19:43:50 mpp_find looking for t60000
[t80040000] 01/29 19:43:50 mpp_find looking for t60001
[t80040000] 01/29 19:43:50 mpp_find: Task not found
[t80040000] 01/29 19:43:50 mpp_new() sp=806f510 sp->n_link=806f510
[t80040000] 01/29 19:43:50 mpp_new() sp=806f510 sp->n_link=80828e8
[t80040000] 01/29 19:43:50 mppload(): Forking to nodo13
[t80040000] 01/29 19:43:50 mppload(): Forking to nodo13
[t80040000] 01/29 19:43:50 mppload(): Forking to nodo12
[t80040000] 01/29 19:43:50 mpp_find looking for t60000
[t80040000] 01/29 19:43:50 mpp_find looking for t60000
[t80040000] 01/29 19:43:50 mpp_find looking for t60002
[t80040000] 01/29 19:43:50 [t60000]
[t80040000] 01/29 19:43:50 [t60000] Estoy en el nodo nodo13
[t80040000] 01/29 19:43:50 mpp_free() sp=80828e8 sp->n_link=806f510
[t80040000] 01/29 19:43:50 mpp_free() n_ptype = 0, ptype = 0
[eccf at mixbaal BEOLIN]$
And that's it.
The process is apparently running, but nothing else happens.
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
9947.mixbaal.su jlgr     q_250m_2 chel         1247   2  --  200mb 08:00 R 01:42
   nodo04/0+nodo03/0
9952.mixbaal.su eccf     q_500m_4 myjob.tcsh   1246   4  --  500mb 16:00 R 00:00
   nodo09/1+nodo09/0+nodo08/1+nodo08/0+nodo07/1+nodo07/0+nodo05/1+nodo05/0
9953.mixbaal.su eccf     q_500m_4 shel         1146   4  --  500mb 16:00 R 00:00
   nodo13/1+nodo13/0+nodo12/1+nodo12/0+nodo11/1+nodo11/0+nodo10/1+nodo10/0
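The pvmd log reports 8 entries in PROC_LIST but shows only three mppload() fork lines. A quick way to tally forks per host is sketched below. The log excerpt is inlined so the snippet is self-contained; the variable names are illustrative only, and the sketch only counts lines, it does not explain why the remaining forks are missing.

```shell
#!/bin/bash
# mppload() lines copied from the pvmd log above, inlined for this sketch.
log=$(cat <<'EOF'
[t80040000] 01/29 19:43:50 mppload(): Forking to nodo13
[t80040000] 01/29 19:43:50 mppload(): Forking to nodo13
[t80040000] 01/29 19:43:50 mppload(): Forking to nodo12
EOF
)

# Total number of forks recorded in the log.
forks=$(printf '%s\n' "$log" | grep -c 'Forking to')

# Forks per host: the host name is the last field of each matching line.
per_host=$(printf '%s\n' "$log" \
  | awk '/Forking to/ {n[$NF]++} END {for (h in n) print h, n[h]}' \
  | sort)

echo "forks: $forks"
echo "$per_host"
```

Against the real file, the same pipeline could read from ~/pvml.10006 instead of the inlined excerpt.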
This is my shell script:
#!/bin/bash
#PBS -q q_500m_4n_2h
#PBS -l mem=500Mb
#PBS -l nodes=4:ppn=2
#PBS -o temporal1
#PBS -e tiempo_temporal
cd /home/staff/eccf/pvm3/bin/BEOLIN
cat $PBS_NODEFILE > salida
export PROC_LIST=`gawk 'BEGIN{ORS=":"} {getline a < FILENAME; print a}' salida`
echo "Value of PROC_LIST = $PROC_LIST"
/local/pvm3/lib/BEOLIN/pvm
/home/staff/eccf/pvm3/bin/BEOLIN/master1
echo "halt" | pvm
Any ideas?
If I run 8 processes on 8 different nodes, it runs perfectly.
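The visible difference between the two cases is in PROC_LIST: with ppn=2, PBS lists each host once per requested processor, so every host name appears twice. A minimal sketch to reproduce the list outside PBS follows. It assumes a temporary file standing in for $PBS_NODEFILE, with made-up contents for nodes=2:ppn=2, and uses plain awk in place of the gawk/getline form from the script.

```shell
#!/bin/bash
# Hypothetical stand-in for $PBS_NODEFILE with nodes=2:ppn=2:
# each host is listed once per requested processor.
nodefile=$(mktemp)
printf '%s\n' nodo13 nodo13 nodo12 nodo12 > "$nodefile"

# Join the lines with ":" and keep the trailing ":" seen in the pvmd log.
PROC_LIST=$(awk 'BEGIN{ORS=":"} {print}' "$nodefile")
export PROC_LIST

# Count list entries and distinct hosts to see the duplication.
entries=$(wc -l < "$nodefile" | tr -d ' ')
hosts=$(sort -u "$nodefile" | wc -l | tr -d ' ')

echo "PROC_LIST=$PROC_LIST"
echo "$entries entries on $hosts distinct hosts"
rm -f "$nodefile"
```

With 8 processes on 8 different nodes, the two counts are equal. With ppn=2, the entry count is twice the host count.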
Thanks a lot
cafe