[Beowulf] PVM BEOLIN & SMP

Eduardo Cesar Cabrera Flores eccf at super.unam.mx
Thu Jan 29 20:16:13 EST 2004



Has anybody successfully run a PVM BEOLIN configuration on a cluster 
using both processors of each dual-processor machine?

I ask because I get the following output (I'm using OpenPBS to submit my job):


[eccf at mixbaal BEOLIN]$ more ~/pvml.10006 
[pvmd pid1213] 01/29 19:43:50 PROC_LIST=nodo13:nodo13:nodo12:nodo12:nodo11:nodo11:nodo10:nodo10:
[pvmd pid1213] 01/29 19:43:50 8 nodes in list.
[t80040000] 01/29 19:43:50 nodo13 (192.168.1.13:32770) BEOLIN 3.4.4
[t80040000] 01/29 19:43:50 ready Thu Jan 29 19:43:50 2004
[t80040000] 01/29 19:43:50 mpp_find looking for t60000
[t80040000] 01/29 19:43:50 mpp_find looking for t60001
[t80040000] 01/29 19:43:50 mpp_find:  Task not found
[t80040000] 01/29 19:43:50 mpp_find looking for t60000
[t80040000] 01/29 19:43:50 mpp_find looking for t60001
[t80040000] 01/29 19:43:50 mpp_find:  Task not found
[t80040000] 01/29 19:43:50 mpp_new() sp=806f510 sp->n_link=806f510
[t80040000] 01/29 19:43:50 mpp_new() sp=806f510 sp->n_link=80828e8
[t80040000] 01/29 19:43:50 mppload(): Forking to nodo13
[t80040000] 01/29 19:43:50 mppload(): Forking to nodo13
[t80040000] 01/29 19:43:50 mppload(): Forking to nodo12
[t80040000] 01/29 19:43:50 mpp_find looking for t60000
[t80040000] 01/29 19:43:50 mpp_find looking for t60000
[t80040000] 01/29 19:43:50 mpp_find looking for t60002
[t80040000] 01/29 19:43:50 [t60000] 
[t80040000] 01/29 19:43:50 [t60000]  Estoy en el nodo nodo13 
[t80040000] 01/29 19:43:50 mpp_free() sp=80828e8 sp->n_link=806f510
[t80040000] 01/29 19:43:50 mpp_free() n_ptype = 0, ptype = 0
[eccf at mixbaal BEOLIN]$ 


And that's it.


The process is apparently running, but nothing else happens:

                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
9947.mixbaal.su jlgr     q_250m_2 chel         1247   2  --  200mb 08:00 R 01:42
   nodo04/0+nodo03/0
9952.mixbaal.su eccf     q_500m_4 myjob.tcsh   1246   4  --  500mb 16:00 R 00:00
   nodo09/1+nodo09/0+nodo08/1+nodo08/0+nodo07/1+nodo07/0+nodo05/1+nodo05/0
9953.mixbaal.su eccf     q_500m_4 shel         1146   4  --  500mb 16:00 R 00:00
   nodo13/1+nodo13/0+nodo12/1+nodo12/0+nodo11/1+nodo11/0+nodo10/1+nodo10/0


This is my shell script:

#!/bin/bash
#PBS -q q_500m_4n_2h
#PBS -l mem=500Mb
#PBS -l nodes=4:ppn=2
#PBS -o temporal1
#PBS -e tiempo_temporal

cd  /home/staff/eccf/pvm3/bin/BEOLIN
cat $PBS_NODEFILE > salida
export PROC_LIST=`gawk 'BEGIN{ORS=":"} {getline a < FILENAME; print a}' salida`
echo "El valor de PROC_LIST = `echo $PROC_LIST`"
/local/pvm3/lib/BEOLIN/pvm
/home/staff/eccf/pvm3/bin/BEOLIN/master1
echo "halt" | pvm
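
For comparison, here is a minimal sketch of another way to build the same colon-separated PROC_LIST from the PBS node file, plus a quick check of how many times each host is listed. The function names build_proc_list and count_slots are made up for illustration and are not part of PVM or OpenPBS. The sketch assumes the node file has one hostname per line and ends with a newline, and that BEOLIN accepts the trailing colon, as the pvml log above suggests.

```shell
#!/bin/bash
# Hypothetical helper (not part of PVM or OpenPBS): join the lines of a
# node file into a colon-separated list with a trailing colon, matching
# the PROC_LIST format printed in the pvml log.
build_proc_list() {
    # tr turns every newline into ":"; assumes the file ends with a newline
    tr '\n' ':' < "$1"
}

# Hypothetical helper: print each host with the number of times it
# appears, to confirm that dual-processor nodes are listed twice.
count_slots() {
    sort "$1" | uniq -c
}

# Only act when running under PBS, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    PROC_LIST="$(build_proc_list "$PBS_NODEFILE")"
    export PROC_LIST
    echo "PROC_LIST = $PROC_LIST"
    count_slots "$PBS_NODEFILE"
fi
```

With nodes=4:ppn=2, count_slots should report each of the four hosts twice. This only verifies the list handed to pvmd; it does not explain the hang.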


Any idea?

If I run 8 processes on 8 different nodes, it runs perfectly.


Thanks a lot


cafe
