<HTML>

<HEAD>

<TITLE>Re: [Beowulf] dedupe filesystem</TITLE>

</HEAD>

<BODY>

<FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'><BR>

<BR>

<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>On 6/5<BR>

<BR>

So what we really want is a storage system that will swallow up drives<BR>

as they get bigger and bigger - so as your researchers create more and<BR>

more data, or stream in more and more satellite/accelerator data/logs<BR>

of phone calls (a la GCHQ) then your storage system is expanding at a<BR>

faster rate.<BR>

<BR>

</SPAN></FONT></BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>---<BR>

Many years ago I read an interesting paper about how modern user interfaces are hobbled by assumptions baked in decades ago. &nbsp;When disk space was slow and precious, having users explicitly decide to save their file while editing was a good idea (don&#8217;t even contemplate cassette tape on microcomputers). &nbsp;Now, though, disk is cheap and fast and so are processors, so there&#8217;s really no reason why you shouldn&#8217;t store your word processing as a chain of keystrokes, with infinite undo available. &nbsp;Say I spent 8 hours a day doing nothing but typing at 100 wpm. That&#8217;s 480 minutes * 500 characters/minute = 240,000 characters; call it a measly 250,000 bytes per day. Heck, the 2GB of RAM in the MacBook I&#8217;m typing this on would hold 8000 days of work. &nbsp;In reality, a few GB would probably hold more characters (or mouse clicks, etc.) than I will type in my entire life.<BR>
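<BR>
For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes 5 characters per word, 1 byte per character, and 1 GB = 10**9 bytes; the function names are made up for illustration.<BR>

```python
# Back-of-the-envelope check of the keystroke storage estimate.
# Assumptions (mine, for illustration): 5 characters per word,
# 1 byte per character, and 1 GB = 10**9 bytes.
WORDS_PER_MINUTE = 100
CHARS_PER_WORD = 5
MINUTES_PER_DAY = 8 * 60  # 8 hours of nonstop typing


def bytes_per_day(wpm=WORDS_PER_MINUTE,
                  chars_per_word=CHARS_PER_WORD,
                  minutes=MINUTES_PER_DAY):
    """Bytes generated by one day of continuous typing."""
    return wpm * chars_per_word * minutes


def days_of_typing(capacity_bytes, daily_bytes):
    """Whole days of typing that fit in the given capacity."""
    return capacity_bytes // daily_bytes


exact = bytes_per_day()          # 480 minutes * 500 characters/minute
rounded = 250_000                # the "call it" figure
two_gb = 2 * 10**9

print(exact)                             # 240000
print(days_of_typing(two_gb, rounded))   # 8000
print(days_of_typing(two_gb, exact))     # 8333
```

Either way the answer is on the order of 8000 working days, which is more than 20 years of typing every single day.<BR>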

<BR>

In theory, then, with sufficient computational power (and that&#8217;s what this list is all about) and the data on a small thumb drive, I should be able to reconstruct everything I&#8217;ve ever created or will create, in every version. &nbsp;All it takes is a sufficiently powerful &#8220;rendering engine&#8221;.<BR>
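<BR>
A toy version of such a rendering engine might look like the sketch below. It is only an illustration under simple assumptions: the log holds single typed characters plus a backspace marker, and the names <TT>render</TT> and <TT>BACKSPACE</TT> are hypothetical. A real editor would also have to record cursor movement, selections, mouse clicks and so on.<BR>

```python
# Toy "rendering engine": rebuild a document by replaying a keystroke log.
# Hypothetical log format: each entry is one typed character, or the
# BACKSPACE marker, which deletes the previous character.
BACKSPACE = "\b"


def render(keystrokes, upto=None):
    """Replay the first `upto` keystrokes (all of them if None) into text."""
    buf = []
    for key in keystrokes[:upto]:
        if key == BACKSPACE:
            if buf:            # backspace on an empty buffer is a no-op
                buf.pop()
        else:
            buf.append(key)
    return "".join(buf)


# The typist mistypes "helo", backspaces once, then carries on.
log = list("helo") + [BACKSPACE] + list("lo world")

print(render(log))      # hello world
print(render(log, 4))   # helo  (the document as it stood after 4 keystrokes)
```

Because nothing is ever discarded from the log, "infinite undo" is just replaying a shorter prefix of it.<BR>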

<BR>

I readily concede that much of the data stored on computers is NOT the direct result of someone typing. &nbsp;Imagery is probably the best example of huge data that isn&#8217;t suitable for the &#8220;base version + all diffs&#8221; model.</SPAN></FONT>

</BODY>

</HTML>