DQS drops jobs on SuSE 6.3 cluster

Kris Thielemans kris.thielemans at csc.mrc.ac.uk
Thu Nov 2 06:52:58 EST 2000


I'm trying to get DQS running on our cluster of 4 SuSE 6.3 systems. I have
tried 3 different versions of DQS:
- the RPM package on the original CD
- the RPM package provided on the SuSE website as an update to fix a y2k
problem (version 3.2.7)
- the newest version (3.3.1) from ftp.scri.fsu.edu (compiled from

All 3 versions have the same problem:
jobs are occasionally dropped from the queue, or are never started at all.

qsub somejob.sh   -> works ok
qstat -f          -> lists job

(a little bit later)
qstat -f          -> job gone

This happens with the simple dqs.sh example script that they provide for

There is NO error message in the DQS err_file, and nothing in the log_file
either.

This problem also occurs when I disable all queues except 1 (on the same
node as the qmaster).
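
To pin down when a job vanishes, one option is to poll `qstat -f` after
submitting and record the time at which the job stops being listed. The
following is only a minimal sketch: the function names `job_listed` and
`watch_job` and the `QSTAT_CMD` variable are hypothetical, and it assumes that
the job can be recognised by grepping the `qstat -f` output for a string such
as the script name, which may not match how DQS actually labels jobs. A job
that finished normally also disappears from the listing, so the timestamp
would still have to be compared against the job's own output.

```shell
#!/bin/sh
# Sketch only: poll the queue listing and report when a job stops appearing.
# QSTAT_CMD is a hypothetical override; by default the `qstat -f` call from
# the post is used. Matching the job by a grep pattern is an assumption.

job_listed() {
    # $1: pattern to look for in the queue listing.
    # Succeeds if the pattern appears in the listing, fails otherwise.
    ${QSTAT_CMD:-qstat -f} 2>/dev/null | grep -q -- "$1"
}

watch_job() {
    # $1: pattern, $2: seconds between polls, $3: maximum number of polls.
    pattern=$1
    interval=${2:-5}
    max_polls=${3:-120}
    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        if ! job_listed "$pattern"; then
            # Job is not (or no longer) in the listing: log when we noticed.
            echo "$(date '+%Y-%m-%d %H:%M:%S') job matching '$pattern' no longer listed after $polls polls"
            return 0
        fi
        polls=$((polls + 1))
        sleep "$interval"
    done
    echo "job matching '$pattern' still listed after $max_polls polls"
    return 1
}
```

With these functions loaded, running `qsub somejob.sh` followed by
`watch_job somejob.sh 5 120` would poll every 5 seconds for up to 10 minutes
and print a timestamp once the job is gone from the listing.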

Any ideas?


Kris Thielemans

MRC Cyclotron Unit,
Hammersmith Hospital,
DuCane Rd, London W12 0NN, United Kingdom

Beowulf mailing list
Beowulf at beowulf.org
