upgrading rh73 on an xCAT cluster

Cristian Tibirna ctibirna at giref.ulaval.ca
Wed Sep 24 14:51:52 EDT 2003


Yesterday I upgraded (first time after 7 months... I know, I know) the rh73 
rpms and the kernel. Since then, I have two nasty issues:

The update installed a new openssh (3.1.p1-14).

Since then, sshd authentication through PAM is annoyingly slow. All ssh
connections (both from outside to the master and from any node to any node
inside) _are_ succeeding, but a lot more slowly. I also see this in
/var/log/messages:

Sep 24 13:16:04 n15 sshd(pam_unix)[27164]: authentication failure; logname=\ 
uid=0 euid=0 tty=NODEVssh ruser= rhost=n01  user=root
Sep 24 13:16:04 n15 sshd(pam_unix)[24856]: session opened for user root by\ 

Both messages are for the same ssh connection attempt and the attempt 
succeeds, as I said. The only visible effect to the user is the slowness (the 
first failure is followed by a programmed delay in pam).
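
For anyone who wants to check their own logs for the same pattern, here is a
minimal Python sketch. It assumes the log lines are unwrapped (one message per
line, without the "\" continuations my mailer added above) and that a failure
and the matching session opening share the same host, timestamp and user. The
PIDs differ between the two messages, so they cannot be used for pairing. The
function name count_spurious_failures is my own, purely for illustration.

```python
import re
from collections import Counter

# Matches syslog lines written by pam_unix on behalf of sshd, e.g.
# "Sep 24 13:16:04 n15 sshd(pam_unix)[27164]: authentication failure; ..."
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"sshd\(pam_unix\)\[\d+\]:\s+(?P<msg>.*)$"
)
FAIL_USER_RE = re.compile(r"(?:^|\s)user=(\S+)")
OPEN_USER_RE = re.compile(r"session opened for user (\S+)")


def count_spurious_failures(lines):
    """Count, per host, auth failures followed by a successful session
    opening for the same user within the same logged second."""
    pending = Counter()  # (host, stamp, user) -> unmatched failures
    hits = Counter()     # host -> failures that were followed by success
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue
        host, stamp, msg = m.group("host"), m.group("stamp"), m.group("msg")
        if msg.startswith("authentication failure"):
            u = FAIL_USER_RE.search(msg)
            if u:
                pending[(host, stamp, u.group(1))] += 1
        else:
            u = OPEN_USER_RE.search(msg)
            if u and pending[(host, stamp, u.group(1))] > 0:
                # Assumption: same host, second and user means same attempt.
                pending[(host, stamp, u.group(1))] -= 1
                hits[host] += 1
    return hits
```

Calling count_spurious_failures(open("/var/log/messages")) on each node should
show whether every login is affected or only some of them.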

I looked a bit around the 'net and people have already complained a lot about 
this problem but I found no solution.

I also updated the kernel to 2.4.20-20.7 (redhat rpm).

Afterwards, my (and other users') SGE qmake jobs just get stuck in the middle
(i.e. they run correctly for a while, then suddenly sit there and do nothing
for a long time, without having completed). I suspect some sort of NFS lockup
problem, as the master node (NFS server) now shows much higher loads (6.0-8.0)
than before the kernel update (2.0-3.0). /var/log/messages says nothing
useful.
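
To correlate the stalls with the load on the master, something like the
following Python sketch could record the load averages at regular intervals.
It assumes the usual Linux /proc/loadavg layout (three load averages followed
by other fields); the function names, the interval and the sample count are
arbitrary choices of mine.

```python
import time


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def sample_load(path="/proc/loadavg", interval=60, count=5):
    """Read the load averages `count` times, `interval` seconds apart."""
    samples = []
    for i in range(count):
        with open(path) as f:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            samples.append((stamp, parse_loadavg(f.read())))
        if i < count - 1:
            time.sleep(interval)
    return samples
```

Running sample_load() on the master while a qmake job is stuck would at least
show whether the load climbs before the job stalls or only afterwards.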

Has anybody already updated a rh73 cluster equipped with SGE and using ssh
internally? Have you observed these problems? Found solutions?

Thanks a lot.

Cristian Tibirna				(1-418-) 656-2131 / 4340
  Laval University - Quebec, CAN ... http://www.giref.ulaval.ca/~ctibirna
  Research professional at GIREF ... ctibirna at giref.ulaval.ca

Beowulf mailing list, Beowulf at beowulf.org
