Petabits/sec, and the like

Jim Lux James.P.Lux at jpl.nasa.gov
Wed Nov 5 12:40:09 EST 2003


All of the enjoyable chat about achieving stupendous data rates with disk 
drives in trucks is quite interesting. By the way, I don't know why you 
insist on having the drives mounted in racks... why not just leave them in 
their original shipping containers?  There's also the concept of how many 
bits are being moved in, say, a container load of Britney Spears DVDs. 
(leaving aside questions about redundancy, information entropy, and whether 
there is any information content in Britney Spears to begin with).

But, on to a more practical aspect.  It seems that a mere bits per second 
number isn't useful, because it doesn't embody some practically important 
things, like latency or transport time, both of which can be 
significant.  This is of particular concern to me, because I'm used to 
having to deal with networks where the round trip light time is significant.

So, I propose that an interesting single metric might be to scale the bit 
rate by the latency with which the bits appear at the other end of the 
pipe.  As illustrious an early high performance computing figure as Seymour Cray 
recognized that this could be significant when you're looking at pumping 
lots of bits real fast.

And, there's a handy yardstick to measure by (issues of quantum 
entanglement and photon twinning aside), in vacuo speed of light.

For example... old style 10Mbps thinnet Ethernet used solid dielectric 
coax, which had a propagation velocity of about 0.66 c.  Twisted pair is 
probably around 0.75 c.  Fiber optics are a bit tricky, depending on the mode 
of propagation, but silica fiber (refractive index near 1.47) comes out 
around 0.68 c.  The pickup truck full of disks 
is about 1E-7 c.  The units of the new measure would be, what, (bits per 
second)*(meters per second), or bit meters per second squared.  I'd normalize 
by c to make the units more useful.  I'd modestly propose calling the new 
unit the Lux, but it's already been used, so perhaps we should recognize 
rgb's contributions by calling it the "Brown".  10Mbps over thinnet would 
then be about 6.6 MegaBrowns.  100Mbps over twisted pair would be 75 
MegaBrowns.  The 1 Pb/s truckload of disks would be 100 MegaBrowns.
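
As a rough sketch of the arithmetic above (the function name and the table 
of links are just my own illustration, and the velocity fractions are the 
ballpark figures quoted in this message, not measured values):

```python
# "Brown" metric: bit rate scaled by propagation velocity as a fraction of c.
C = 299_792_458.0  # speed of light in vacuum, m/s


def browns(bit_rate_bps, velocity_fraction):
    """Return the bit rate (bits/s) multiplied by v/c."""
    return bit_rate_bps * velocity_fraction


# (bit rate in bits/s, velocity as a fraction of c) -- ballpark figures only
links = {
    "10 Mbps thinnet": (10e6, 0.66),
    "100 Mbps twisted pair": (100e6, 0.75),
    "1 Pb/s truck of disks": (1e15, 1e-7),
}

for name, (rate, frac) in links.items():
    speed = frac * C  # propagation speed in m/s
    print(f"{name}: {browns(rate, frac) / 1e6:.1f} MegaBrowns "
          f"(v = {speed:.3g} m/s)")
```

Note that 1E-7 c works out to roughly 30 m/s, which is a plausible highway 
speed for the truck.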

This is clearly the "raw pipe speed" too... not taking into account the 
headers and any coding that's going on.  The disk drive pipe hides all the 
coding and sector headers, so the measurement is a real data transfer 
throughput.  The Ethernet scheme on the other hand, is just the signalling 
rate, and there is some significant non-zero overhead.

One might also ask whether physical size of the system being communicated 
within should be factored in (say, when talking about bisection 
bandwidth).  Clearly, a cluster with a physical dimension of 100 meters is 
going to be slower than one with a physical dimension of 1 meter, all other 
things (processor speed, comm speed, etc.) being equal.
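
To put a number on that, here is a small sketch of the one-way propagation 
delay across a system of a given size.  The function name is hypothetical, 
and the 0.66 velocity fraction is just the coax figure from earlier, used as 
an assumed example:

```python
# One-way propagation delay across a system of a given physical size.
C = 299_792_458.0  # speed of light in vacuum, m/s


def one_way_delay_ns(distance_m, velocity_fraction=1.0):
    """Time in nanoseconds for a signal to cover distance_m at v = fraction * c."""
    return distance_m / (velocity_fraction * C) * 1e9


for size_m in (1.0, 100.0):
    in_vacuum = one_way_delay_ns(size_m)
    in_coax = one_way_delay_ns(size_m, 0.66)  # assumed velocity fraction
    print(f"{size_m:6.1f} m: {in_vacuum:7.2f} ns in vacuo, "
          f"{in_coax:7.2f} ns at 0.66 c")
```

At light speed that is about 3.3 ns for 1 meter and about 334 ns for 100 
meters, before any switch or software latency is counted.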

One has to also consider the bandwidth of the entrance and exit to the 
pipe... merely having the capability to transport Tb of disk drives rapidly 
doesn't mean that you can put data onto those disks at a Pb/s and get it 
off at the other end of the shipping channel.  This is where those "use 
free air as a communication medium" schemes get into trouble.  Sure, the 
optical bandwidth of air (or optical fiber) is pretty darn wide (on the 
order of 0.5 PetaHertz (a unit I never thought I'd ever use) for just the 
visible spectrum) but the modulation and demodulation might prove to be a 
problem.
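
For what it's worth, the order-of-magnitude figure can be checked with 
f = c/wavelength.  The band edges below (380 nm and 750 nm) are my assumed 
limits for "visible"; other conventions shift the answer somewhat:

```python
# Rough optical bandwidth of the visible spectrum, from f = c / wavelength.
C = 299_792_458.0  # speed of light in vacuum, m/s

# Assumed band edges for visible light; conventions vary.
VIOLET_EDGE_M = 380e-9
RED_EDGE_M = 750e-9


def frequency_hz(wavelength_m):
    """Frequency of light with the given vacuum wavelength."""
    return C / wavelength_m


bandwidth_hz = frequency_hz(VIOLET_EDGE_M) - frequency_hz(RED_EDGE_M)
print(f"visible bandwidth: {bandwidth_hz / 1e15:.2f} PHz")
```

With these edges the result is a bit under 0.4 PHz, consistent with "on the 
order of 0.5 PetaHertz".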


There's also the issue of real computing efficiency... speed is not 
everything in some applications.  Some applications might optimize for 
calculations per Dollar/Euro or calculations per Joule.  Coming up with a 
metric for the calculation is a bit tricky.  The calculations could be 
viewed as extracting information bits from a redundant data set (a 
coding/decoding process), or as creating new information (although, hmmm... 
this gets a bit metaphysical).

I leave the selection of appropriate units and names to the community.


James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


