[Beowulf] OT: recoverable optical media archive format?
reuti at staff.uni-marburg.de
Tue Jun 8 15:03:59 EDT 2010
On 08.06.2010 at 19:44, David Mathog wrote:
> This is off topic so I will try to keep it short: is there an
> "archival" format for large binary files which contains enough error
> correction so that all original data may be recovered even if there is a
> little data loss in the storage media?
> For my purposes these are disk images, sometimes .tar.gz, other times
> gunzip -c of dd dumps of whole partitions which have been "cleared" by
> filling the empty space with one big file full of zero, and then that
> file deleted. I'm thinking of putting this information on DVDs (only
> need to keep it for a few years at a time) but I don't trust them
> not to lose a sector here or there - having watched far too many
> scratched DVD movies with playback problems.
> Unlike an SDLT with a bad section, the good parts of a DVD are still
> readable when there is a bad block (using dd or ddrescue), but of course
> even a single missing chunk makes it impossible to decompress a .gz
> correctly. So what I'm looking for is some sort of .img.gz.ecc
> where the .ecc puts in enough redundant information to recover the
> underlying img.gz even when sectors or data are missing. If no such
> tool/format exists then two copies should be enough to recover all of an
> .img.gz so long as the same data wasn't lost on both media, and if bad
> DVD sectors always come back as "failed read", never ever showing up as
> a good read but actually containing bad data. Perhaps the frame
> checksum on a DVD is enough to guarantee that?
Besides splitting the file, I would suggest generating some par/par2
files. This format was originally used on Usenet to have a
reliable way to transfer binary attachments. I.e. first you split
each file into e.g. 10 pieces and generate 5 par/par2 files for
it. Then you need any 10 out of these 15 in total to be
good to recover the original file.
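
To make the "any 10 out of 15" property concrete, here is a minimal
sketch in Python. It is not the PAR2 format or algorithm: as far as I
know PAR2 uses Reed-Solomon coding over GF(2^16), while this toy works
byte by byte in the prime field GF(257) with plain Lagrange
interpolation, and the names encode, recover and lagrange_eval are
made up for the illustration. The idea is the same, though: the k data
pieces fix a polynomial of degree below k, the parity pieces are extra
points on it, and any k points give the polynomial back.

```python
# Toy "any k of n" erasure code over the prime field GF(257).
# Illustration only; this is NOT the PAR2 file format or its exact algorithm.
P = 257  # smallest prime above 255, so every byte value is a field element


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial (mod P) through the (xi, yi) points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (needs Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data_blocks, n_parity):
    """Return n_parity parity blocks for k equal-length data blocks.

    Data block i is the polynomial's value at x = i; parity block j is
    its value at x = k + j. Parity values can be 256, so they are kept
    as lists of ints rather than bytes.
    """
    k = len(data_blocks)
    size = len(data_blocks[0])
    parity = []
    for x in range(k, k + n_parity):
        block = []
        for pos in range(size):
            points = [(i, data_blocks[i][pos]) for i in range(k)]
            block.append(lagrange_eval(points, x))
        parity.append(block)
    return parity


def recover(surviving, k):
    """Rebuild the k data blocks from any k surviving blocks.

    surviving maps block index (0..n-1) to its content.
    """
    chosen = sorted(surviving.items())[:k]
    size = len(chosen[0][1])
    data = []
    for x in range(k):
        values = []
        for pos in range(size):
            points = [(idx, blk[pos]) for idx, blk in chosen]
            values.append(lagrange_eval(points, x))
        data.append(bytes(values))
    return data


if __name__ == "__main__":
    pieces = [bytes([i] * 8) for i in range(10)]      # 10 data pieces
    extra = encode(pieces, 5)                         # 5 parity pieces
    everything = dict(enumerate(list(pieces) + extra))
    for lost in (1, 2, 5, 8, 12):                     # lose any 5 of the 15
        del everything[lost]
    assert recover(everything, 10) == pieces
```

The real par2 tools do the equivalent on large blocks and also store
checksums, so a damaged piece can be detected and treated as missing.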
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf