interconnect latency, dissected.

John Taylor johnt at quadrics.com
Tue Jul 1 11:18:19 EDT 2003


I agree with Joachim et al. on the merit of the paper. In relation to IB,
there has been some work at Ohio State comparing Myrinet and QsNet. That
work, however, only discusses MPI, whereas the UPC group behind the Berkeley
paper, quite correctly IMHO, discusses lower-level APIs that better suit
some applications and algorithms and are also the target of specific
compiler environments.

On the Berkeley paper specifically, my only concern is that there is no
mention of the influence of the PCI-bridge implementation, notwithstanding
its specification. For instance, the system at ORNL is based on the ES40; a
similar system gives the following 8-byte latency:

prun -N2 mping 0 8 
  1 pinged   0:        0 bytes      7.76 uSec     0.00 MB/s
  1 pinged   0:        1 bytes      8.11 uSec     0.12 MB/s
  1 pinged   0:        2 bytes      8.06 uSec     0.25 MB/s
  1 pinged   0:        4 bytes      8.35 uSec     0.48 MB/s
  1 pinged   0:        8 bytes      8.20 uSec     0.98 MB/s
  .
  .
  .
  1 pinged   0:   524288 bytes   2469.61 uSec   212.30 MB/s
  1 pinged   0:  1048576 bytes   4955.28 uSec   211.61 MB/s

This is similar to the latency and bandwidth reported in the authors' benchmark,

whereas the same code on the same Quadrics hardware, running on a Xeon
(GC-LE) platform, gives:

prun -N2 mping 0 8
  1 pinged   0:        0 bytes      4.31 uSec     0.00 MB/s
  1 pinged   0:        1 bytes      4.40 uSec     0.23 MB/s
  1 pinged   0:        2 bytes      4.40 uSec     0.45 MB/s
  1 pinged   0:        4 bytes      4.39 uSec     0.91 MB/s
  1 pinged   0:        8 bytes      4.38 uSec     1.83 MB/s
  .
  .
  .
  1 pinged   0:   524288 bytes   1632.61 uSec   321.13 MB/s
  1 pinged   0:  1048576 bytes   3252.28 uSec   322.41 MB/s
  
It may be the case that the Myrinet performance could also be improved (it
is stated as PCI 32/66 in the paper) by benchmarking on a system with a more
recent PCI bridge.
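
For reference, figures like the ones above come from a simple two-node
ping-pong microbenchmark. The sketch below is a minimal MPI version, not the
actual Quadrics mping source; it assumes the reported latency is half the
round-trip time and that MB/s means bytes per microsecond, which matches the
numbers quoted above.

/*
 * Minimal MPI ping-pong sketch (not the Quadrics "mping" tool itself).
 * Reports half the round-trip time as latency, plus bandwidth.
 * Run with exactly two ranks, e.g.:  prun -N2 ./pingpong
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define REPS 1000          /* round trips per message size */
#define MAXBYTES (1 << 20) /* largest message: 1 MB */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 2) {
        if (rank == 0) fprintf(stderr, "run with exactly two ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    int peer = 1 - rank;

    char *buf = malloc(MAXBYTES);
    memset(buf, 0, MAXBYTES);

    /* sweep sizes 0, 1, 2, 4, ... up to 1 MB, as in the output above */
    for (int bytes = 0; bytes <= MAXBYTES; bytes = bytes ? bytes * 2 : 1) {
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < REPS; i++) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_BYTE, peer, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_BYTE, peer, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else {
                MPI_Recv(buf, bytes, MPI_BYTE, peer, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_BYTE, peer, 0, MPI_COMM_WORLD);
            }
        }
        /* half round-trip time per message, in microseconds */
        double usec = (MPI_Wtime() - t0) * 1e6 / (2.0 * REPS);
        if (rank == 1)
            printf("%8d bytes  %10.2f uSec  %8.2f MB/s\n",
                   bytes, usec, bytes / usec);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}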


John Taylor
Quadrics Limited
http://www.quadrics.com

> -----Original Message-----
> From: Joachim Worringen [mailto:joachim at ccrl-nece.de]
> Sent: 01 July 2003 09:03
> To: Beowulf mailinglist
> Subject: Re: interconnect latency, dissected.
> 
> 
> James Cownie:
> > Mark Hahn wrote:
> > > does anyone have references handy for recent work on interconnect
> > > latency?
> >
> > Try http://www.cs.berkeley.edu/~bonachea/upc/netperf.pdf
> >
> It doesn't have InfiniBand, but does have Quadrics, Myrinet 2000, GigE
> and IBM.
> 
> Nice paper showing interesting properties.  But some metrics seem a little
> bit dubious to me: in 5.2, they seem to see an advantage if the "overlap
> potential" is higher (when they compare Quadrics and Myrinet) - which
> usually just results in higher MPI latencies, as this potential (on small
> messages) cannot be exploited. Even with overlapping multiple communication
> operations, the faster interconnect remains faster. This is especially
> true for small-message latency.
> 
> Among the contemporary (cluster) interconnects, SCI is missing, as is
> InfiniBand. It would have been interesting to see the results for SCI, as
> it has a very different communication model from most of the other
> interconnects (most resembling the T3E one).
> 
>  Joachim
> 
> -- 
> Joachim Worringen - NEC C&C research lab St.Augustin
> fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de
> 
> 
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


