[Beowulf] When is compute-node load-average "high" in the HPC context? Setting correct thresholds on a warning script.
rpnabar at gmail.com
Wed Sep 1 11:18:06 EDT 2010
On Wed, Sep 1, 2010 at 3:47 AM, Reuti <reuti at staff.uni-marburg.de> wrote:
> My impression was always (as there is a similar setting for the load_threshold in OGE) that it should limit the number of jobs on a big SMP machine when you oversubscribe intentionally, since not all parallel jobs really use all the CPU power over their lifetime (maybe such a machine was even operated w/o any NFS). Allowing e.g. 72 slots for jobs on a 60-core machine might then get the most out of it, with a load near 100%.
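To make the arithmetic in that example concrete, here is a rough Python sketch of how a warning script might decide that a node's load is "high" relative to the slots the scheduler may fill. The function names and the 10% margin are my own assumptions for illustration, not anything taken from OGE or from an existing script; only os.getloadavg() and os.cpu_count() are standard library calls.

```python
import os


def load_threshold(slots: int, margin: float = 0.1) -> float:
    """Load average above which we would warn.

    `slots` is the number of job slots the scheduler may fill on the node
    (equal to the core count when over-subscription is disabled).
    `margin` is an assumed tolerance for system daemons and I/O wait.
    """
    if slots <= 0:
        raise ValueError("slots must be positive")
    if margin < 0:
        raise ValueError("margin must not be negative")
    return slots * (1.0 + margin)


def is_load_high(load_avg: float, slots: int, margin: float = 0.1) -> bool:
    """Return True when the load average exceeds the warning threshold."""
    return load_avg > load_threshold(slots, margin)


if __name__ == "__main__":
    # Check the local node, assuming one slot per core (no over-subscription).
    load1, load5, load15 = os.getloadavg()
    cores = os.cpu_count() or 1
    state = "HIGH" if is_load_high(load5, cores) else "ok"
    print(f"5-min load {load5:.2f} on {cores} cores: {state}")
```

With 72 slots on the 60-core machine, a load of 72 would count as normal under this rule and a warning would fire only somewhat above that. On a node with 8 slots and no over-subscription, the same rule would warn once the load climbs noticeably past 8.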
Our scheduler is currently set to never allow over-subscription.
Also, we don't allocate shared nodes. Users get resources in 8-core