Petabits/sec, and the like

Robert G. Brown rgb at phy.duke.edu
Wed Nov 5 13:39:54 EST 2003


On Wed, 5 Nov 2003, Jim Lux wrote:

> by c, to make the units more useful..I'd modestly propose calling the new 
> unit the Lux, but it's already been used, so perhaps we should recognize 
> rgb's contributions by calling it the "Brown"  10Mbps over thinnet would 
> then be 6MegaBrowns.  100mbps over twisted pair would be 70MegaBrowns.  The 
> 1 Pb/s truckload of disks would be 100MegaBrowns.

You are clearly an evil man, and children and pets probably cross the
street to avoid you.  For the love of God, don't name a unit the Brown.
Megabrowns.  Sheeesh.

> This is clearly the "raw pipe speed" too... not taking into account the 
> headers and any coding that's going on.  The disk drive pipe hides all the 
> coding and sector headers, so the measurement is a real data transfer 
> throughput.  The Ethernet scheme on the other hand, is just the signalling 
> rate, and there is some significant non-zero overhead.
> 
> One might also ask whether physical size of the system being communicated 
> within should be factored in (say, when talking about bisection 
> bandwidth).  Clearly, a cluster with a physical dimension of 100meters is 
> going to be slower than one with a physical dimension of 1 meter, all other 
> things (processor speed, comm speed, etc.) being equal.
> 
> One has to also consider the bandwidth of the entrance and exit to the 
> pipe... merely having the capability to transport Tb of disk drives rapidly 
> doesn't mean that you can put data onto those disks at a Pb/s and get it 
> off at the other end of the shipping channel.  This is where those "use 
> free air as a communication medium" schemes get into trouble.  Sure, the 
> optical bandwidth of air (or optical fiber) is pretty darn wide (on the 
> order of 0.5 PetaHertz (a unit I never thought I'd ever use) for just the 
> visible spectrum) but the modulation and demodulation might prove to be a 
> problem.
> 
> 
> There's also the issue of real computing efficiency.. speed is not 
> everything in some applications... some applications might optimize for 
> calculations per Dollar/Euro or calculations/Joule.  Coming up with a 
> metric for the calculation is a bit tricky.  The calculations could be 
> viewed as extracting information bits from a redundant data set (a 
> coding/decoding process), or as creating new information (although, hmmm... 
> this gets a bit metaphysical)
> 
> I leave the selection of appropriate units and names to the community.

By the time you add dollars to the problem those truckfuls of disks
look pretty damn good, actually.  Which one is cheaper:  Building an
optical fiber network capable of distributing the kinds of datasets they
accumulate at the big accelerator labs to the participating Universities
(often on the other side of the country) with enough bandwidth to be
useful, or cross-shipping a RAID that gets refilled and emptied at the
ends?

Consider a metaphor:  Fermilab is a river of data.  People at Duke are
thirsty, but they can only drink just so much just so fast.  It is very
likely much cheaper to just ship Duke an occasional truckful of bottled
water -- I mean data -- than to build a cross-country pipeline just to
put a high capacity spigot in a single room.

It is also useful to consider how long it takes to FILL a terabyte RAID.
Even at (say) 100 MB/sec it is still 10,000 seconds, or about three
hours.  A petabyte at the same rate would require nearly 3000 hours
(admittedly potentially in parallel).  Done serially, that would be a
goodly chunk of a year.  By the time bottlenecks like this are
considered, the time and cost of overnight shipping a containerized PB
across the country are relatively insignificant.
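
The arithmetic above is easy to check, and to rerun with other numbers.
What follows is only a rough sketch: the 100 MB/sec rate is the figure
used above, while the decimal TB/PB sizes, the 100 parallel streams, the
24-hour transit time, and the assumption that draining takes as long as
filling are all illustrative guesses, not measurements.

```python
# Back-of-envelope fill times and "effective bandwidth" of shipped disks.
# Assumptions (not from any measurement): decimal units, 100 parallel
# streams, 24 h in transit, drain time equal to fill time.

TB = 10**12          # bytes in a (decimal) terabyte
PB = 10**15          # bytes in a (decimal) petabyte
RATE = 100 * 10**6   # 100 MB/sec per stream, the rate used in the text


def fill_seconds(capacity_bytes, rate_bytes_per_s, streams=1):
    """Seconds needed to write capacity_bytes using `streams` parallel writers."""
    return capacity_bytes / (rate_bytes_per_s * streams)


def effective_bits_per_s(capacity_bytes, fill_s, transit_s, drain_s):
    """End-to-end rate once filling, shipping and draining are all counted."""
    return 8 * capacity_bytes / (fill_s + transit_s + drain_s)


if __name__ == "__main__":
    tb_fill = fill_seconds(TB, RATE)               # one stream, 1 TB
    pb_serial = fill_seconds(PB, RATE)             # one stream, 1 PB
    pb_parallel = fill_seconds(PB, RATE, streams=100)  # hypothetical 100 streams

    print(f"1 TB, 1 stream:    {tb_fill / 3600:.1f} hours")
    print(f"1 PB, 1 stream:    {pb_serial / 3600:.0f} hours "
          f"({pb_serial / 86400:.0f} days)")
    print(f"1 PB, 100 streams: {pb_parallel / 3600:.1f} hours")

    transit = 24 * 3600                            # assumed overnight shipping
    rate = effective_bits_per_s(PB, pb_parallel, transit, pb_parallel)
    print(f"Effective rate, fill + ship + drain: {rate / 1e9:.1f} Gb/s")
```

Under these assumptions the fill and drain steps, not the truck, set the
end-to-end rate, which is the point about the entrance and exit of the
pipe.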

There are interesting transformations between time and spatial
dimensions involved in all of this:  wire/fiber carrier frequency,
wire/fiber bundle density and multiplexing/termination costs, plus the
cost of the wire/fiber itself, vs achieving a very high spatial
information density using a storage VOLUME and moving the space, with
THOSE associated costs.

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu



_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


