[Beowulf] Re: Interesting google server design

Greg Lindahl lindahl at pbm.com
Sat Apr 4 18:10:57 EDT 2009


On Sat, Apr 04, 2009 at 05:24:23PM -0400, Jason Riedy wrote:
> And Robert G. Brown writes:
> > For them servicing/replacing a system is cheap: Box dies.
> > Employee notes this, grabs box from Big Stack of Boxes, carries
> > it to dead box, removes dead box, replace it with new working
> > box, presses power switch, walks away.
> 
> Plus, your operator can be unskilled.

Um, not completely. These clusters work by starting with 3 copies of
every chunk of the data, and as you work you have to make sure that
you don't take down the wrong system and leave the cluster with 0 or 1
copies of a chunk of data. There are software mechanisms you can use
to help, but the operator needs to know how the rules work.
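
To make the rule concrete, here is a rough sketch of the kind of check such
a software mechanism might run before an operator pulls a box. Everything in
it is hypothetical (the safe_to_take_down name, the chunk-to-node map, the
threshold of 2 live copies taken from the "0 or 1 copies" rule above); it is
not a description of any particular cluster's tooling.

```python
# Hypothetical pre-maintenance check: would taking `node` offline leave any
# chunk with fewer than `min_live` live copies? All names are illustrative.

MIN_LIVE_REPLICAS = 2  # assumption: 0 or 1 remaining copies is unsafe


def safe_to_take_down(node, chunk_locations, live_nodes,
                      min_live=MIN_LIVE_REPLICAS):
    """Return (ok, at_risk).

    chunk_locations maps chunk id -> set of nodes holding a copy.
    live_nodes is the set of nodes currently up.
    at_risk maps chunk id -> number of live copies left without `node`.
    """
    at_risk = {}
    for chunk, holders in chunk_locations.items():
        if node not in holders:
            continue  # this chunk is unaffected by the removal
        remaining = [n for n in holders if n != node and n in live_nodes]
        if len(remaining) < min_live:
            at_risk[chunk] = len(remaining)
    return (not at_risk, at_risk)


if __name__ == "__main__":
    # Three copies of every chunk; node "d" has already died.
    chunks = {
        "c1": {"a", "b", "c"},
        "c2": {"a", "b", "d"},
        "c3": {"b", "c", "d"},
    }
    live = {"a", "b", "c"}
    print(safe_to_take_down("d", chunks, live))  # dead box: safe to swap
    print(safe_to_take_down("a", chunks, live))  # would strand chunk c2
```

Swapping the already-dead box "d" passes the check, while rebooting "a"
does not, because chunk c2 would be down to a single live copy until "d"
is replaced and re-replicated.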

Some tasks, yeah, no problem: if the box is already dead, just swap it.
But many tasks involve boxes which aren't dead yet: one disk has failed,
the box needs a reboot to run a new kernel, a new release of the
application software has to go on, etc.

-- greg

