Ole W. Saastad
ole at scali.com
Wed Nov 12 10:18:47 EST 2003
Some comments on channel aggregation of Gigabit Ethernet
channels and switches.
> Message: 1
> Date: Tue, 11 Nov 2003 12:32:32 -0500 (EST)
> From: Donald Becker <becker at scyld.com>
> To: Keyan Mehravaran <keyanm at yahoo.com>
> cc: beowulf at beowulf.org
> Subject: Re: Gigabit Switch
> On Tue, 11 Nov 2003, Keyan Mehravaran wrote:
> > I am planning to connect 8 dual Xeon PCs
> > with onboard gigabit through a switch and
> > I only need access to the "zeroth" node.
> > I have two questions:
> > 1) Is there any benefit to using "managed"
> > switch rather than the "unmanaged" ones?
> Frequently "managed" switches are a negative.
> An Ethernet switch should "just work".
> Providing configuration options just encourages setting the switch to
> flawed modes, such as forced-full-duplex or filtering packet types you
> thought you were not using.
> > 2) Is it possible to increase bandwidth by
> > adding an extra gigabit NIC to each node?
> > If the answer is yes, then should all the
> > 16 ports connect to the same switch?
Scali has also addressed this issue and developed a device for
Scali MPI Connect (SMC) called Direct Ethernet Transport, DET.
This bypasses the TCP/IP stack and works well with a single
Gigabit Ethernet channel. However, the main benefit of DET is
that it is very simple to bond two NICs into a single device,
usually named det2. For ScaMPI usage you just select det2 at run
time instead of det0, tcp, myr0 or sci.
> Yes, you can marginally increase bandwidth. But it's not worth it.
I do not agree. We see a marginally lower latency, but the
bandwidth increase when going from one to two gigabit channels
is on the order of 50-60%. Where we get approx. 110 MB/sec using
one channel, this approach yields 165 to 175 MB/sec. When doing
a full-duplex exchange we can quote a number like 350 MB/sec.
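To make the arithmetic behind these figures explicit, here is a small sketch. The function name scaling and the notion of "efficiency relative to an ideal 2x" are my own illustration, not part of DET or ScaMPI; the input numbers are the ones quoted above.

```python
def scaling(single_mb_s, bonded_mb_s, channels=2):
    """Return (gain in percent, efficiency in percent of ideal N-times scaling)."""
    gain = (bonded_mb_s - single_mb_s) / single_mb_s * 100.0
    efficiency = bonded_mb_s / (channels * single_mb_s) * 100.0
    return gain, efficiency


if __name__ == "__main__":
    # 110 MB/sec on one channel, 165 to 175 MB/sec on two bonded channels.
    for bonded in (165.0, 175.0):
        gain, eff = scaling(110.0, bonded)
        print(f"{bonded:.0f} MB/sec: +{gain:.1f}% over one channel, "
              f"{eff:.1f}% of an ideal two-channel setup")
```

So the second channel is far from free bandwidth doubling, but it is clearly more than a marginal gain.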
> If you channel bond GbE, you'll likely get out-of-order packets on
> the receiving side and consume much more CPU to reassemble.
> If you trunk, you will not see higher peak bandwidth, and may still
> suffer from bad cache or interrupt affinity effects.
Yes, this is true, and if you are not really constrained by
bandwidth it does not pay off. Latency is the constraint most of
the time. Check your application with a test setup using channel
aggregation and measure for yourself.
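As a rough idea of what such a measurement can look like, below is a minimal ping-pong sketch. It is not a ScaMPI or DET tool; all names (recv_exact, echo, pingpong, run) are made up for illustration, and it uses a local socket pair as a stand-in for the link between two nodes. For a real test you would replace the socket pair with a connection between two machines, once over a single channel and once over the bonded device. Small messages mostly show latency, large messages mostly show bandwidth.

```python
import socket
import threading
import time


def recv_exact(sock, n):
    """Read exactly n bytes from sock."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo(sock, size, reps):
    """Peer side: send every received message straight back."""
    for _ in range(reps):
        sock.sendall(recv_exact(sock, size))


def pingpong(sock, size, reps):
    """Return (one-way time in seconds, MB/sec) for a given message size."""
    msg = b"x" * size
    start = time.perf_counter()
    for _ in range(reps):
        sock.sendall(msg)
        recv_exact(sock, size)
    elapsed = time.perf_counter() - start
    one_way = elapsed / (2 * reps)      # half of the average round trip
    return one_way, size / one_way / 1e6


def run(sizes=(8, 1024, 1 << 20), reps=50):
    """Measure each message size over a fresh local socket pair."""
    results = {}
    for size in sizes:
        a, b = socket.socketpair()      # assumption: stand-in for a real link
        peer = threading.Thread(target=echo, args=(b, size, reps))
        peer.start()
        results[size] = pingpong(a, size, reps)
        peer.join()
        a.close()
        b.close()
    return results


if __name__ == "__main__":
    for size, (one_way, mb_s) in run().items():
        print(f"{size:>8} bytes: {one_way * 1e6:10.1f} us one-way, "
              f"{mb_s:10.1f} MB/sec")
```

The numbers from a local socket pair say nothing about Ethernet; the point is only the method of comparing the small-message and large-message ends with and without aggregation.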
> You should use separate switches for channel bonding. Although it's
> possible to use VLAN to avoid this, that's brings us back to the
> switch configuration issue. And two half size switches are less
> expensive than one.
Switches up to 24 ports are so cheap today that you can just buy
an extra one. For large switches the arithmetic becomes more
complex. The cost per port becomes so high that you can consider
using a high performance interconnect like Myrinet, Infiniband or
SCI. In addition to high bandwidth, this will give you a very low
latency, which is beneficial for most applications.
> Bottom line: use a single GbE channel unless there is a specific
> application reason to do otherwise.
Agreed, but as said before, if you really need bandwidth there is
an option you can try with Gigabit Ethernet.
> Donald Becker becker at scyld.com
> Scyld Computing Corporation http://www.scyld.com
> 914 Bay Ridge Road, Suite 220 Scyld Beowulf cluster system
> Annapolis MD 21403 410-990-9993
Ole W. Saastad, Dr.Scient.
Manager, ISV relations/Business Dev.
dir. +47 22 62 89 68
fax. +47 22 62 89 51
mob. +47 93 05 74 87
ole at scali.com
Scali - www.scali.com
High Performance Clustering
Beowulf mailing list, Beowulf at beowulf.org