In this installment of the Best of the Beowulf Mailing List we look at discussions of Gigabit switches, channel bonding, Opterons, and large memory allocations. You can consult the archives for the actual conversations.
Choosing a Gigabit Switch
On November 11, 2003, Keyan Mehravaran said that he was connecting eight
dual Xeon PCs with on-board Gigabit NICs (Network Interface Cards). He
asked about the relative advantages of a managed switch versus an
unmanaged switch and asked about channel bonding Gigabit NICs.
An unmanaged switch is one where the management of the switch is
entirely internal and cannot be configured. A managed switch is one
that is configurable via an interface such as a web browser or serial
terminal. Donald Becker replied that in his opinion managed switches
are frequently not a good choice because the switch can be set to work
in "flawed modes" that can cause problems in the future. He also
explained in another posting that he thought auto-negotiation, which
unmanaged switches do by default, was a good thing because it was
automatic, transparent, and extensible. Moreover, he pointed out that
most switches now use Ethernet flow control, and users don't know
this because it's configured during auto-negotiation. The bottom
line is that things just seem to work better with auto-negotiation.
The discussion about managed versus unmanaged switches turned to an
interesting topic when Mark Hahn mentioned that he spoke to a large
switch manufacturer and asked about some HPC features such as adding
a special Quality of Service (QOS) tag for small packets to give them
preferential treatment. He also mentioned that he would like to get
performance statistics from the switch for the ports. Donald Becker
mentioned that QOS tags already existed but cautioned about using them
on LANs (Local Area Networks). In his opinion they are very good for
multi-traffic, multi-path WANs (Wide Area Networks) where high-volume
bulk traffic might block low-volume traffic such as telnet sessions.
He then explained why QOS tags in a LAN would not be a good idea.
There was also a discussion of Channel Bonding, where two NICs are
combined with the goal of doubling the bandwidth. Rafael Orinoco
pointed out that to do this within a single switch, the switch needs
to be capable of handling multiple VLANs (Virtual LANs). Donald Becker
pointed out that you could use multiple switches to avoid this
problem, with one NIC going to one switch and the other NIC to the
other switch. Donald also shared that Channel Bonding Gigabit
Ethernet (GigE) NICs would only marginally increase bandwidth because
you are likely to get some out-of-order packets on the receiving
side, and putting them back in order can consume more CPU time.
There is another option, however. Scali has an MPI implementation
that bypasses the Linux kernel. Scali claims that by doing this
they can Channel Bond GigE NICs and get 50-60% of the bandwidth
of the second NIC.
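To see where the out-of-order packets come from, here is a minimal
sketch in Python. It assumes packets are striped round-robin across
the links and that each link adds a fixed delay. The function name
count_out_of_order, the delay values, and the send interval are
illustrative assumptions, not taken from the mailing list discussion
or from any particular bonding driver.

```python
def count_out_of_order(num_packets, send_interval, link_delays):
    """Stripe packets round-robin over links and count late arrivals.

    A packet counts as out of order if it arrives after a packet with a
    higher sequence number has already arrived.
    """
    arrivals = []
    for seq in range(num_packets):
        link = seq % len(link_delays)  # round-robin striping (assumption)
        arrival_time = seq * send_interval + link_delays[link]
        arrivals.append((arrival_time, seq))

    arrivals.sort()  # order in which the receiver sees the packets

    highest_seen = -1
    late = 0
    for _, seq in arrivals:
        if seq < highest_seen:
            late += 1  # receiver must hold or reorder this packet
        else:
            highest_seen = seq
    return late


if __name__ == "__main__":
    # Hypothetical numbers: one packet every 2 time units.
    print(count_out_of_order(10, 2, [10, 10]))  # two identical links
    print(count_out_of_order(10, 2, [10, 15]))  # second link is slower
```

With identical delays nothing arrives late. With delays of 10 and 15
time units, 4 of the 10 packets arrive behind a later packet, and each
of those is extra reordering work for the receiving CPU.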
Opteron Thoughts
There was a very good discussion on the Beowulf mailing list about
Opteron systems and what to look for in a system. The discussion
began on the 24th of November, 2003, with Derek Richardson asking
about sources of information for tuning the Linux kernel for Opteron
processors. The ever-present Donald Becker asked for some clarification
about tuning, but mentioned that the easiest performance improvement
could come from a proper memory DIMM (Dual In-line Memory Module)
configuration to match the application layout. You may recall that
each Opteron has its own local memory controller, but in
multi-CPU systems each CPU can see all of the memory. Don pointed
out that understanding how the memory slots are filled and what
BIOS options are used can make a huge difference. He mentioned
that simply choosing to interleave memory in the BIOS can
produce a 30% difference on a dual CPU system.
Greg Lindahl supported this observation and said that Opteron systems
can run counter to what you might think about other systems. In
particular, filling all of the memory slots on a single-channel
memory system can actually improve performance. He also went on to
mention that the 2.6 kernel is better (faster) than the 2.4 kernel
on Opteron processors.
Bill Broadley also mentioned that he has seen significant speedups
by adjusting the node interleaving and the memory interleaving.
He said that he has a benchmark that shows about 2 Gigabytes/sec
(GB/sec) for a single Opteron and about 3 GB/sec for a dual Opteron
if properly configured. Another good benchmark for testing
memory bandwidth is the STREAM benchmark. It is a widely accepted
benchmark and tests several aspects of memory access. It can
also be used to test dual CPU systems. During this
discussion about Opterons, several people, including Greg Lindahl,
Mike Snitzer, and Egan Ford, pointed out that there are utilities
for pinning a process to a specific CPU. This feature allows you
to test the memory access speed of each processor individually.
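As a rough illustration of pinning, the sketch below uses Python's
os.sched_setaffinity (a Linux-only wrapper around the
sched_setaffinity system call) to bind the running process to one CPU
and then time large buffer copies. The function name
copy_bandwidth_on_cpu, the buffer size, and the repeat count are
hypothetical choices for this example. It is not the benchmark Bill
Broadley described and is no substitute for STREAM, since interpreter
overhead and caching affect the numbers.

```python
import os
import time


def copy_bandwidth_on_cpu(cpu, size_bytes=32 * 1024 * 1024, repeats=3):
    """Pin this process to one CPU and time large buffer copies.

    Returns the best observed copy rate in GB/sec (bytes copied per second).
    Linux only: relies on os.sched_getaffinity / os.sched_setaffinity.
    """
    original = os.sched_getaffinity(0)  # remember the allowed CPU set
    os.sched_setaffinity(0, {cpu})      # run only on the chosen CPU
    try:
        # Allocate and fill the buffer after pinning, so the pages are
        # first touched from the CPU under test.
        src = bytearray(b"\x01") * size_bytes
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            dst = bytes(src)            # one full copy of the buffer
            elapsed = time.perf_counter() - start
            best = min(best, elapsed)
            del dst
        return size_bytes / best / 1e9
    finally:
        os.sched_setaffinity(0, original)  # restore the original CPU set


if __name__ == "__main__":
    for cpu in sorted(os.sched_getaffinity(0)):
        print(f"CPU {cpu}: {copy_bandwidth_on_cpu(cpu):.2f} GB/sec")
```

Comparing the per-CPU figures gives a first hint of whether one
processor has slower access to memory than the other, which is the
kind of difference the DIMM layout and BIOS interleaving settings
discussed above can cause.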