building a RAID system - 8 drives - drive-net - tapes

Robert G. Brown rgb at phy.duke.edu
Fri Oct 10 09:34:25 EDT 2003


On Fri, 10 Oct 2003, Jakob Oestergaard wrote:

> On Thu, Oct 09, 2003 at 09:31:13PM -0400, Robert G. Brown wrote:
> ...
> > Each disk has about one fourth of the information.  English is about 3:1
> > compressible (really more; this is using simple symbolic compression).
> > A good cryptanalyst could probably recover "most" of what is on the
> > disks from any one disk, depending on what kind of data is there.
> 
> You overlook the fact that data on a RAID-5 is distributed in 'chunks'
> of sizes around 4k-128k (depending...)

Overlook, hell.  I'm using my usual strategy of feigning knowledge with
the complete faith that my true state of ignorance will be publicly
displayed to the entire internet.  This humiliation, in turn, will
eventually cause such mental anguish that I'll be able to claim mental
disability and retire to tending potted plants on a disability check for
the rest of my life...

You probably noticed that I used the same strategy quite recently
regarding things like factors of N in disk read speed estimates, certain
components in disk latency, and oh, too many other things to mention.
Pardon me if I babble on a bit this morning, but my lawy... erm,
"psychiatrist" insists that I need fairly clear evidence of disability
to get away with this.

I personally find that smoking crack cocaine induces a pleasant tendency
to babble nonsense.  And there is no place to babble for the record like
the beowulf list archives, I always say...:-)

> So you would get the entire first 'Introduction to evil empire plans',
> but the entire 'Subverting existing banana government' chapter may be on
> one of the disks that you are missing.
...
> I'm just thinking of distributing two tapes for each disk - one with
> 200G of random numbers, the other with 200G of data XOR'ed with the data
> from the first tape.

Or just one tape, xor'd with 200G worth of random numbers generated from
a cryptographically strong generator via a relatively short key that you
can (as you note) send or carry separately and which is smaller, easier
to secure, and less susceptible to degradation or loss than a second
tape.  It's cheaper that way, and even if you use two tapes people are
going to try cracking the master tape by trying to guess the
key+algorithm you almost certainly used to generate it (see below), so
the xor is no stronger than the key+algorithm combination.;-)
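
A minimal sketch of the one-tape idea, assuming SHA-256 in counter mode
as a stand-in for "a cryptographically strong generator" (the post does
not name one). The names keystream and xor_with_key and the sample key
are hypothetical, and this is an illustration rather than a vetted
cipher:

```python
import hashlib


def keystream(key: bytes, nbytes: int) -> bytes:
    """Expand a short key into nbytes of pseudorandom bytes.

    Assumption: SHA-256 over (key || counter) stands in for whatever
    cryptographically strong generator one would actually choose.
    """
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:nbytes])


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the data."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


secret = b"Introduction to evil empire plans"
key = b"a relatively short key"          # hypothetical key, carried separately
tape = xor_with_key(secret, key)         # what goes on the single tape
assert xor_with_key(tape, key) == secret
```

Only the short key has to travel separately, which is also why the
scheme is exactly as strong as the key+algorithm combination and no
stronger.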

> Enter the one-time pad - unbreakable encryption (unless you get a hold
> of both tapes of course).

Or determine the method and key you used for (oxymoronically) generating
200 Gigarands (which is NOT going to be a hardware generator, I don't
think, unless you are a very patient person or build/buy a quantum
generator or the like -- entropy based things like /dev/random are too
slow, and even quantum generators I've looked into are barely fast
enough:-).

> You'd need to make sure you have good random numbers - as an extra

Ah, that's the rub.  "Good random numbers" isn't quite an oxymoron.
Why, there is even a government standard measure for cryptographic
strength in the US (which many/most generators fail, by the way).
Entropy based generators tend to be very slow -- order of 10-100 kbps
depending on the source of entropy, last I looked.  Quantum generators
IIRC that rely on e.g. single photon transmission events at
half-silvered mirrors have to run at light intensities where single
photon events are discernible (rare, that is) and STILL have to wait for
an autocorrelation time or ten before opening a window for the next
event because even quantum events like this have an associated
correlation time due to the existence of extended correlated states in
the radiating system.  Photon emission from a single atom itself is
antibunched, for example, as after an emission the system requires time
for the single radiating atom to regain a degree of excitation
sufficient to enable re-emission.  I believe that they can achieve more
like 1 Mbps of randomness or at least unpredictability.  As you'd need
1.6x10^12 bits to encode your tape, you'd have to wait around 1.6x10^6
seconds to generate the key.  That is, hmmm, between two and three weeks,
or twenty to thirty weeks with a 100 kbps entropy generator, unless you
used a beowulf of entropy generators to shorten the time:-).

Not exactly in the category of "generate a one-time pad while I go have
a cup of coffee".
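
The arithmetic can be checked with a short sketch, assuming "200G"
means 200x10^9 bytes and taking the rates quoted above at face value
(the function name and the rate labels are made up for illustration):

```python
TAPE_BYTES = 200e9               # assumption: 200G = 200 x 10^9 bytes
TAPE_BITS = TAPE_BYTES * 8       # 1.6 x 10^12 bits of pad needed
SECONDS_PER_WEEK = 7 * 24 * 3600


def weeks_to_generate(bits_per_second: float) -> float:
    """Weeks needed to produce one pad bit for every bit on the tape."""
    return TAPE_BITS / bits_per_second / SECONDS_PER_WEEK


rates = {
    "quantum generator, 1 Mbps": 1e6,
    "entropy generator, 100 kbps": 1e5,
    "entropy generator, 10 kbps": 1e4,
}
for name, rate in rates.items():
    print(f"{name}: {weeks_to_generate(rate):.1f} weeks")
```

At the slow end of the entropy range (10 kbps) the wait stretches to
several years, so the twenty-to-thirty-week figure is the optimistic
case.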

Using a truly oxymoronic but much faster pseudorandom number generator,
e.g. the mt19937 from the GSL, one can generate a respectable ballpark
of 16 MBps (note B, not b) of random bytes and be done in a mere four
hours.  Alas, mt19937 is not cryptographically strong: it is seeded from
a long int, so the seed doesn't have enough bits to be secure against a
brute force attack, and its internal state can be reconstructed from 624
consecutive 32-bit outputs.  One would have to fall back on one of the
actual cryptographic algorithms that permit the use of long keys (1024
bits or even more).
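
A toy sketch of the brute force worry, assuming Python's random module
(which is also an MT19937, though seeded differently from the GSL's) and
a deliberately tiny 12-bit seed space so that it runs instantly. The
helper names are hypothetical:

```python
import random


def first_outputs(seed: int, n: int = 4) -> list:
    """First n 32-bit outputs of a Mersenne Twister seeded with seed."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(n)]


def brute_force_seed(observed: list, seed_bits: int):
    """Try every seed in a seed_bits-wide space until the outputs match."""
    for candidate in range(2 ** seed_bits):
        if first_outputs(candidate, len(observed)) == observed:
            return candidate
    return None


SEED_BITS = 12                      # toy value; a real long int is 32 or 64 bits
true_seed = 2003
leaked = first_outputs(true_seed)   # e.g. recovered via known plaintext on the tape
assert brute_force_seed(leaked, SEED_BITS) == true_seed
```

The search cost doubles with every added seed bit, which is the
argument for generators that accept long keys.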

> No no no no no!  Think big!
> 
> Think: cobalt bomb in own backyard - threaten anyone who steals your
> data, that you'll make the planet uninhabitable for a few hundred
> decades unless they hand back your tapes.   ;)
> 
> (I'm drafting up 'Introduction to evil empire plans' soon by the way  ;)

Hmm, I'll have to mail you some of my lithium pills, Jakob.  Your own
prescription obviously ran out...:-)

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu



_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


