building a RAID system - yup - superglue

Robert G. Brown rgb at phy.duke.edu
Thu Oct 9 09:42:46 EDT 2003


On Thu, 9 Oct 2003, Alvin Oga wrote:

> > Pants AND suspenders.  Superglue around the waistband, actually.  Who
> > wants to be caught with their pants down in this way?
> 
> always got bit by tapes... somebody didnt change the tape on the 13th
> a couple months ago ... and critical data is now found to be missing
> 	- people do forget to change tapes ... or clean heads...
> 	( thats the part i dont like about tapes .. and is the most
> 	( common failure mode for tapes ... easily/trivially avoided by
> 	( disks-to-disk backups
> 
> 	- people get sick .. people go on vacations .. people forget
> 
> 
> - no (similar) problems since doing disk-to-disk backups
> 	- and i usually have 3-6 months of full backups floating around
> 	in compressed form

All agreed.  And tapes aren't that permanent a medium either -- they
deteriorate on a timescale of years to decades, with data bleeding
through the film, dropped bits due to cosmic ray strikes, and
depolymerization of the underlying tape itself.  Even before the tape
itself is unreadable, you are absolutely certain to be unable to find a
working drive to read it with.  I have a small pile of obsolete tapes in
my office -- tapes made with drives that no longer "exist", and that is
after dumping the most egregiously useless of them.

Still, I'd argue that the best system for many environments is to use
all three: RAID, real backup to (separate) disk (possibly a RAID as
well), and tape for offsite and archival purposes.  The first two layers
protect you against the TIME required to handle users accidentally
deleting files (the most common reason to access a backup), as retrieval
is usually nearly instantaneous and not at all labor intensive.  The disk
backup also protects you against the most common single-server failures
that get past the protection of RAID itself (multidisk failures, blown
controllers).
The tape (with periodic offsite storage) protects you against server
room fire, brownouts or spikes that cause immediate data corruption or
disk loss on both original and backup servers, and tapes can be saved
for years -- far longer than one typically can go back on a disk backup
mechanism.  Users not infrequently want to get at a file version they
had LAST YEAR, especially if they don't use CVS.  Finally, some research
groups generate data that exceeds even TB-scale disk resources -- they
constantly move data in and out of their space in GB-sized chunks.  They
often like to create their own tape library as a virtual extension of
the active space.  Tapes aren't only about backup.
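
As a rough illustration of how the layers divide the work, here is a
minimal Python sketch of deciding where a restore request goes.  The
retention windows (180 days on disk, five years on tape), the function
name restore_source, and the site_lost flag are all assumptions made up
for the example, not anything a particular site or backup tool uses.

```python
from datetime import date, timedelta

# Hypothetical retention windows; tune to what you can afford.
DISK_RETENTION = timedelta(days=180)       # roughly 6 months of disk-to-disk backups
TAPE_RETENTION = timedelta(days=365 * 5)   # tapes kept offsite for years


def restore_source(wanted: date, today: date, site_lost: bool = False) -> str:
    """Pick the layer to try first for a file version dated `wanted`.

    site_lost=True models a server room fire or power event that took
    out both the original and the disk backup server.
    """
    age = today - wanted
    if age < timedelta(0):
        raise ValueError("requested version is in the future")
    # Fast path: recent versions come straight off the backup disk.
    if not site_lost and age <= DISK_RETENTION:
        return "disk backup"
    # Older versions, or anything after a site loss, come from tape.
    if age <= TAPE_RETENTION:
        return "offsite tape"
    return "unavailable"


if __name__ == "__main__":
    today = date(2003, 10, 9)
    print(restore_source(date(2003, 10, 1), today))                  # deleted last week
    print(restore_source(date(2002, 10, 9), today))                  # "LAST YEAR"
    print(restore_source(date(2003, 10, 1), today, site_lost=True))  # fire
```

The sketch ignores the gap between offsite tape runs, so after a site
loss the newest version actually on tape may be older than the one
requested.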

So you engineer according to what you can afford and what you need,
making the usual compromises brought about by finite resources.

BTW, one point that hasn't been made in the soft vs hard RAID argument
is that with hard RAID you are subject to (proprietary) HARDWARE
obsolescence, which is typically more difficult to control than
software obsolescence.  You build a RAID, populate it, use it.  After a
few years,
the RAID controller itself dies (but the disks are still good).

Can you get another?  One that can actually retrieve the data on your
disks?  There are no guarantees.  Maybe the company that made your
controller is still in business (or rather, still in the RAID business).
Maybe they still carry old models or can do depot repair, or maybe new
models can still handle the RAID encoding they implemented with the old
model.  Maybe you can AFFORD a new model, or maybe it has
all sorts of new features and costs 3x as much as the first one did
(which may not have been cheap).  Maybe it takes you weeks to find a
replacement and restore access to your data.

Soft RAID can have problems of its own (if, for example, the software
evolves to where it is no longer backwards compatible), but it is a whole
lot easier to cope with these problems, and they are strictly under your
control.  You are very unlikely to have any "event" like the death of
the RAID server that prevents you from retrieving what is on the disks
(at a cost likely to be quite controllable and in a timely way) as long
as the disks themselves are not corrupted.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu



_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


