[Beowulf] New member, upgrading our existing Beowulf cluster

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Tue Dec 8 12:56:49 EST 2009




On 12/8/09 9:22 AM, "james bardin" <jbardin at bu.edu> wrote:

> On Tue, Dec 8, 2009 at 10:50 AM, Prentice Bisbal <prentice at ias.edu> wrote:
> 
>> You'd hope that. Most of my current cluster users are scientific
>> researchers in academia, not computer scientists. While some are
>> extremely computer savvy, others have learned just enough about
>> programming to do their calculations. Expecting the latter to write code
>> with checkpointing is unrealistic, and working in academia, I can't
>> force them to. Which is why taking down 4 nodes instead of just one is
>> less than ideal.
>> 
> 
> I find it's still advantageous to push them to learn it. A researcher
> working with a tight deadline for a grant will often see the light
> when a hardware failure loses them a month or more of data processing.
> It really is in their own best interests to learn about their tools.


What about some form of "image checkpoint", like "hibernation"? It should be
application-unaware: it just snapshots the process's memory.
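
For contrast, here is a rough sketch of what application-level checkpointing
asks of a researcher, i.e. the work an image-level snapshot would avoid. It is
only an illustration under assumptions: the function names (save_checkpoint,
load_checkpoint, run), the pickle file format, and the toy "work" loop are all
hypothetical and not taken from any cluster tool discussed in this thread.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: dump to a temp file, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace is atomic on the same filesystem, so a crash mid-write
    # never leaves a half-written checkpoint at `path`.
    os.replace(tmp, path)


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, n_steps, every=100):
    """Toy long-running job that resumes from the last checkpoint."""
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < n_steps:
        state["total"] += state["step"]  # stand-in for the real calculation
        state["step"] += 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

If the node dies, rerunning the same job picks up from the last saved step
rather than from zero. The catch is that the researcher has to decide what
"state" is and restructure the loop around it, which is exactly the burden an
application-unaware snapshot would remove.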


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


