[Beowulf] Re: SATA or SCSI drives - Multiple Read/write speeds.

Robert G. Brown rgb at phy.duke.edu
Tue Dec 9 12:42:46 EST 2003

On Tue, 9 Dec 2003, Robin Laing wrote:

> Andrew Latham wrote:
> > While I understand your pain, I have no facts for you other than that SATA is
> > much faster than IDE. It can come close to SCSI (160). I have used SATA a little
> > and am happy with it. The selling point for me is the cost of controller and disk
> > (SATA controllers cost much less), and the smaller cable format. The cable is
> > so small and easy to use that it is the major draw for me.
> > 
> > good luck on your quest!
> > 
> I knew this for straight throughput, but it is random access that 
> is the real question.

Random access is complicated for any drive system.  It tends to be
latency dominated -- the drive has to do lots of seeks.  Seek time, in
turn, is dominated by platter speed and platter density, with worst case
latencies set by the time required to position the head and then turn
the disk until the start of the data is underneath it.  With drive
speeds of 5000-10000 rpm (6-12 ms per revolution), this time is pretty
much fixed and not all that different from cheap disks to the most
expensive.  Reads and writes also behave a bit differently, so it even
matters whether you do random access reads from e.g. a big filesystem
with lots of little files or random writes to one.  Note also that there
are LOTS of components to file latency, and disk speed is only one of
them.  To open a file, for example, the kernel must first stat it to see
if you are PERMITTED to open it.
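
To put rough numbers on the rotational part of that latency, here is a
minimal sketch in Python.  It only does the arithmetic for one
revolution of the platter; it ignores head movement, settle time and
everything the kernel adds on top.  The function name and the list of
spindle speeds are my own choices for illustration (7200 rpm is simply
a common speed inside the range above).

```python
def rotational_latency_ms(rpm):
    """Return (average, worst-case) rotational latency in milliseconds.

    Worst case is one full revolution (the data has just passed the
    head); the average is half a revolution.
    """
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    full_rotation_ms = 60_000.0 / rpm  # 60,000 ms per minute / revs per minute
    return full_rotation_ms / 2.0, full_rotation_ms


if __name__ == "__main__":
    # Illustrative spindle speeds only, not measurements of any drive.
    for rpm in (5000, 7200, 10000):
        avg, worst = rotational_latency_ms(rpm)
        print(f"{rpm:>6} rpm: average {avg:.2f} ms, worst case {worst:.2f} ms")
```

Doubling the spindle speed only halves this term, which is why it does
not differ by orders of magnitude between cheap and expensive disks.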

Note also that the kernel is DESIGNED to hide slow filesystem speeds
from the user.  The kernel caches and buffers and never throws anything
away it might need later unless/until it has to.  A common benchmarking
mistake is to open a file (to see how long it takes) and then open it
again right away in a loop.  Surprise!  It takes a ``long time'' the
first time but the second time is nearly instantaneous, because the
second time the request is served out of the kernel's cache.  A system
with a lot of memory will use all but a tiny fraction of that memory
caching things, if it can.
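
The benchmarking mistake is easy to reproduce.  The sketch below (in
Python; the function name, chunk size and scratch file are my own
assumptions, not anybody's real benchmark) reads the same file several
times in a loop and reports the time for each pass.  Note that the
scratch file it creates has just been written, so it is probably
already sitting in the cache and every pass is likely to be fast; point
the function at a large file that has not been touched recently to see
the slow first pass.

```python
import os
import tempfile
import time


def time_repeated_reads(path, repeats=3):
    """Open and fully read `path` `repeats` times; return seconds per pass."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(1 << 20):  # read 1 MiB chunks until EOF
                pass
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Hypothetical 8 MiB scratch file, used only so the sketch runs as is.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 << 20))
        scratch = tmp.name
    try:
        for i, seconds in enumerate(time_repeated_reads(scratch), 1):
            print(f"pass {i}: {seconds * 1000:.2f} ms")
    finally:
        os.remove(scratch)
```

Only a pass that really has to go to the disk tells you anything about
the disk.  Getting one requires defeating the cache somehow, e.g. by
using a file set larger than memory, or on later Linux kernels by
writing to /proc/sys/vm/drop_caches as root.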

I don't expect things like latency to be VASTLY affected by SATA vs PATA
vs SCSI.  See Mark's remarks on disk speed and platter density -- latency
is more strongly related to the disk hardware than to the interface.
Even things like on-disk cache are trivial in size compared to the
kernel's caches, although I'm sure they help somewhat under some
circumstances.
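
To see the size comparison on your own machine, here is a rough sketch
in Python.  It assumes a Linux system, where /proc/meminfo reports the
page cache on a "Cached:" line in kB.  The 8 MiB drive buffer is purely
an assumed figure for illustration, not a number from this thread, and
the function name is my own.

```python
ASSUMED_DISK_CACHE_KIB = 8 * 1024  # assumption: 8 MiB on-drive buffer


def cached_kib(meminfo_text):
    """Return the 'Cached:' value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        # 'SwapCached:' does not match, since it does not start with 'Cached:'.
        if line.startswith("Cached:"):
            return int(line.split()[1])
    raise ValueError("no 'Cached:' line found")


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            kib = cached_kib(f.read())
    except (OSError, ValueError) as exc:
        print(f"could not read page cache size: {exc}")
    else:
        ratio = kib / ASSUMED_DISK_CACHE_KIB
        print(f"page cache: {kib / 1024:.0f} MiB, "
              f"about {ratio:.0f}x an assumed 8 MiB drive buffer")
```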


Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu

