BEOWULF cluster hangs

Dean Johnson dtj at
Thu Sep 26 11:26:46 EDT 2002

You might also check what the network is doing. 
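
A rough sketch of one way to look at the network on a Linux node is to
sample the interface counters in /proc/net/dev twice and compare them.
The function names below are made up for illustration, and the 5 second
interval is just an assumption; tools like netstat or ifconfig report
the same counters.

```python
import time


def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into {interface: counters}."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = [int(x) for x in data.split()]
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        # Columns 0-7 are receive counters, columns 8-15 are transmit.
        stats[name.strip()] = {
            "rx_bytes": fields[0],
            "rx_errs": fields[2],
            "rx_drop": fields[3],
            "tx_bytes": fields[8],
            "tx_errs": fields[10],
            "tx_drop": fields[11],
        }
    return stats


def counter_deltas(before, after):
    """Return how much each counter grew between two parsed samples."""
    deltas = {}
    for iface, new in after.items():
        old = before.get(iface)
        if old is None:
            continue  # interface appeared between samples
        deltas[iface] = {key: new[key] - old[key] for key in new}
    return deltas


def sample_interval(path="/proc/net/dev", seconds=5.0):
    """Read the counters, wait, read again, and return the differences."""
    with open(path) as f:
        before = parse_net_dev(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_net_dev(f.read())
    return counter_deltas(before, after)
```

Calling sample_interval() on a node while the job is "hung" shows
whether traffic is still flowing and whether the error or drop counters
are climbing.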

It may also be an issue with your application. I know of a molecular
dynamics app (NAMD) with a pathological case that exhibits similar
behaviour. After many hours of running, the one simulation that had
the problem would cause the application to do progressively more
"housekeeping", and CPU utilization would drop greatly over time. If
you killed and resumed it, utilization went back up to where it should
be, but after many more hours the problem would return. That particular
problem was "fixed" (pronounced "big kludge") by an iterative script
that wouldn't let it get into that state.
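
This is not the original script, just a minimal sketch of how that kind
of kludge might look: run the job in fixed-length segments and restart
it before it has time to degrade. It assumes the application resumes
from its own checkpoint or restart files when relaunched with the same
command. The function name, the segment length, and the 30 second grace
period are all assumptions.

```python
import subprocess
import sys


def run_in_segments(cmd, segment_seconds, max_segments):
    """Run cmd repeatedly, stopping each run after segment_seconds.

    Returns the number of segments started. Stops early when a run
    exits on its own with status 0, which is taken to mean "finished".
    """
    for segment in range(1, max_segments + 1):
        print(f"starting segment {segment}", file=sys.stderr)
        proc = subprocess.Popen(cmd)
        try:
            status = proc.wait(timeout=segment_seconds)
        except subprocess.TimeoutExpired:
            # Still running at the end of the segment: stop it so the
            # next loop iteration can relaunch it fresh.
            proc.terminate()
            try:
                proc.wait(timeout=30)  # grace period for a clean exit
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
            continue
        if status == 0:
            return segment  # the job completed by itself
        raise RuntimeError(f"job exited with status {status}")
    return max_segments
```

The segment length would be picked to be comfortably shorter than the
"many hours" it takes for utilization to fall off, and no shorter than
the interval at which the application writes its restart data.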

