[Beowulf] Best Setup for Batch Systems

Chris Samuel csamuel at vpac.org
Thu Feb 19 17:22:38 EST 2004

On Fri, 20 Feb 2004 02:41 am, Rayson Ho wrote:

[No failover support in the pbs_server]

> I think it is one of the biggest problems with *PBS, especially in the
> compute farm environment.

Torque (formerly SPBS) is very stable, especially since we helped the 
SuperCluster folks clobber the various memory leaks in the server.

Our pbs_server has been running for almost a month since I last restarted 
it (for a bit of system maintenance, not because of PBS problems; I think 
it had been running for about 2 months before that), and its memory use is 
only VSZ 3148 and RSS 2136. :-)
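
For anyone wanting to keep an eye on figures like these over time, here is a
minimal sketch, assuming a Linux /proc filesystem. The function names are
hypothetical; this is not part of Torque and not how the numbers above were
obtained. VSZ and RSS correspond to the VmSize and VmRSS fields of
/proc/<pid>/status, both reported in kB.

```python
import re
from pathlib import Path


def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        # Lines of interest look like "VmRSS:     2136 kB"
        match = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def memory_of(pid):
    """Read VmSize/VmRSS for a running process (Linux only, hypothetical helper)."""
    return parse_vm_fields(Path(f"/proc/{pid}/status").read_text())
```

Calling memory_of() with the pbs_server PID from cron and logging the result
would show whether the footprint creeps up between restarts.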

NB: I'm still running an SPBS release from early November as that's when we 
fixed the last memory leak and it's worked like a dream since then.

> The more advanced batch systems (SGE and LSF) have this feature for
> years, not sure why *PBS still don't have it.

I believe it's on the SuperCluster folks' list of things to do, but they've 
been busy working on the stability front (as well as MAUI and Silver).

CC'd to the SuperCluster folks so they can respond.

> (AFAIK, PBSPro 5.4 will include it, but isn't it late??)

No idea, don't use it.

-- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia


