queuing systems and cpu usage? (Partly OT)

John Brookes johnb at quadrics.com
Thu Jun 5 12:18:03 EDT 2003


Peter,

I came across LoadLeveler on an SP3 in a former job and also found the
scheduling to be pretty poor. To be fair to IBM, they readily admitted the
fact. I quickly found the Maui scheduler from the Maui Supercomputing Centre
(now defunct? Its old URL no longer works). At the time it was licensed
(though free), but they were working on the legal issues to make it freely
distributable.

It's now a SourceForge project (so I assume they sorted out the legal side :).
I'm not sure if/where a fully-supported version that plugs into LL can be
found these days (or even whether the project retains the LL interface), but
it did much better than either of the built-ins at that time and works well
now (on other systems, at least). It's also highly configurable, so you can
make it as nice (or nasty!) as you like.

YM will almost certainly V, as my only experiences under IBM are from ~3yo
versions, but if you can get it to work it'd probably be a Good Thing.

Maui's often used as the scheduler for PBS and Sun GridEngine nowadays, so
getting to know its foibles wouldn't be wasted once your Linux/Athlon
cluster arrives.

The project is at:
http://mauischeduler.sourceforge.net/

Some information on Maui and the Maui/LL tie-in can be found at, e.g.:
http://supercluster.org/documentation/

Cheers,

John Brookes
Quadrics Ltd.


> -----Original Message-----
> From: Peter Beerli [mailto:beerli at csit.fsu.edu]
> Sent: 05 June 2003 04:53
> To: Beowulf Mailing list
> Subject: queuing systems and cpu usage?
> 
> 
> Hi all,
> I am a novice when it comes to clustering and queueing systems.
> So far, my cluster needs were satisfied by an ad hoc cluster running
> embarrassingly parallel code (MCMC runs). After my move, I can use
> time on an IBM Regatta cluster (and I am impatiently waiting for my
> very own Linux Athlon cluster). The Regatta cluster is running
> LoadLeveler, which seems to have abysmal job-scheduling performance:
> currently only 288 of the max=480 cpus are in use [4 idle nodes out
> of 15], while many jobs (including mine) are waiting in the queues.
> 
> I would be glad to hear which schedulers you prefer and (preferably)
> also get some numbers on how many nodes and cpus are idle under
> standard load**, or other appropriate statistics (what are the
> appropriate statistics?).
> 
> (This is not really a question for this list, but some might be able
> to answer: is our Regatta LoadLeveler misconfigured?)
> 
> **standard load: too many users submit jobs to the queue with the
> longest time limit, very few submit medium- or small-length jobs, and
> most of the jobs are obviously only using <32 cpus (a node on the
> Regatta has 32 cpus)
> 
> thanks,
> Peter
> ----
> Peter Beerli,
> School of Computational Science and Information Technology (CSIT)
> Dirac Science Library, Florida State University
> Tallahassee, Florida 32306-4120 USA
> old webpage: http://evolution.genetics.washington.edu/PBhtmls/beerli.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf