[Beowulf] New member, upgrading our existing Beowulf cluster

Greg Lindahl lindahl at pbm.com
Thu Dec 3 21:41:29 EST 2009


> > E.g. you see a system disk going bad, but the user
> > will lose all their output unless the job runs for
> > 4 more weeks...
> 
> We run SMART tests and the like trying to proactively
> spot bad disks (and other hardware) prior to failures,
> but yes, that's inevitable.

It's not inevitable that the policy be that 3-month jobs are allowed.

But you know me: I never saw a battle I didn't want to fight :-) Arrr,
mateys, this be the BOFH, and I'm here to educate you about the right
way to use this here supercomputer... my way... or walk the plank!

-- greg



_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


