[Beowulf] Best Setup for Batch Systems
csamuel at vpac.org
Thu Feb 19 17:22:38 EST 2004
On Fri, 20 Feb 2004 02:41 am, Rayson Ho wrote:
[No failover support in the pbs_server]
> I think it is one of the biggest problems with *PBS, especially in the
> compute farm environment.
Torque (formerly SPBS) is very stable, especially since we helped the
SuperCluster folks clobber the various memory leaks in the server.
Our pbs_server has been running for almost a month now since I last restarted
it (for a bit of system maintenance, not because of PBS problems; I think it
had been running for about 2 months before that), and it's only at VSZ 3148
and RSS 2136 (KB). :-)
NB: I'm still running an SPBS release from early November as that's when we
fixed the last memory leak and it's worked like a dream since then.
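
For anyone who wants to watch the same figures on their own server, here is a
minimal Python sketch. It assumes a Linux /proc filesystem, where the VmSize
and VmRSS lines of /proc/<pid>/status hold the values that ps shows as VSZ and
RSS (both in kB). The function names parse_vm_fields and memory_of are made up
for illustration and are not part of Torque or PBS.

```python
import re
from typing import Dict


def parse_vm_fields(status_text: str) -> Dict[str, int]:
    """Pull VmSize and VmRSS (in kB) out of the text of /proc/<pid>/status."""
    fields: Dict[str, int] = {}
    for line in status_text.splitlines():
        # Lines of interest look like "VmRSS:      2136 kB".
        match = re.match(r"^(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def memory_of(pid: int) -> Dict[str, int]:
    """Read the status file of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_fields(handle.read())
```

Calling memory_of() with the PID of pbs_server from a cron job and logging the
result would be one way to spot a slow leak: a VmRSS value that keeps climbing
between restarts is the thing to look for.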
> The more advanced batch systems (SGE and LSF) have had this feature for
> years, not sure why *PBS still doesn't have it.
I believe it's on the SuperCluster folks' list of things to do, but they've
been busy working on the stability front (as well as MAUI and Silver).
CC'd to the SuperCluster folks so they can respond.
> (AFAIK, PBSPro 5.4 will include it, but isn't it late??)
No idea, don't use it.
Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia