
In this installment of the Best of the Beowulf Mailing List, we look at Gigabit switches, channel bonding, Opterons, and large memory allocations. You can consult the Beowulf mailing list archives for the actual conversations.

Choosing a Gigabit Switch

On November 11, 2003, Keyan Mehravaran said that he was connecting eight dual Xeon PCs with on-board Gigabit NICs (Network Interface Cards). He asked about the relative advantages of a managed switch versus an unmanaged switch, and about channel bonding Gigabit NICs. An unmanaged switch is one where the management of the switch is entirely internal and cannot be configured. A managed switch is one that is configurable via an interface such as a web browser or serial terminal. Donald Becker replied that, in his opinion, managed switches are frequently not a good choice because the switch can be set to work in "flawed modes" that can cause problems in the future. He also explained in another posting that he thought auto-negotiation, which unmanaged switches do by default, was a good thing because it is automatic, transparent, and extensible. Moreover, he pointed out that most switches now use Ethernet flow control, and that users are unaware of this because it is configured during auto-negotiation. His bottom line was that switches left to auto-negotiate just seem to work better.

The discussion about managed versus unmanaged switches took an interesting turn when Mark Hahn mentioned that he had spoken to a large switch manufacturer and asked about some HPC features, such as adding a special Quality of Service (QoS) tag for small packets to give them preferential treatment. He also mentioned that he would like to get per-port performance statistics from the switch. Donald Becker mentioned that QoS tags already existed but cautioned against using them on LANs (Local Area Networks). In his opinion they are very good for multi-traffic, multi-path WANs (Wide Area Networks), where high-volume bulk traffic might block low-volume traffic such as telnet sessions. He then explained why QoS tags in a LAN would not be a good idea.

There was also a discussion of Channel Bonding, where two NICs are combined to get up to twice the bandwidth. Rafael Orinoco pointed out that to do this within a single switch, the switch needs to be capable of handling multiple VLANs (Virtual LANs). Donald Becker pointed out that you could use multiple switches to avoid this problem, with one NIC going to one switch and the other NIC to the other switch. Donald also shared that Channel Bonding Gigabit Ethernet (GigE) NICs would only marginally increase bandwidth, because you are likely to get some out-of-order packets on the receiving side, and putting them back in order can consume more CPU time. There is another option, however. Scali has an MPI implementation that bypasses the Linux kernel. Scali claims that by doing this they can Channel Bond GigE NICs and get 50-60% of the bandwidth of the second NIC.
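To put the Scali figure in perspective, here is a small back-of-the-envelope sketch. It assumes, purely for illustration, that the first NIC delivers its full nominal line rate of 1000 Mb/s and that "50-60% of the second NIC" means the second NIC adds that fraction of its nominal rate. The function name and constant are hypothetical, and real throughput would be lower once protocol overhead is included.

```python
# Nominal line rate of one Gigabit Ethernet NIC in megabits per second.
# This is an assumed ideal figure, not a measured throughput.
GIGE_MBPS = 1000.0


def bonded_bandwidth(first_nic_mbps, second_nic_fraction):
    """Aggregate rate when the second NIC adds only a fraction of its rate."""
    if not 0.0 <= second_nic_fraction <= 1.0:
        raise ValueError("second_nic_fraction must be between 0 and 1")
    return first_nic_mbps * (1.0 + second_nic_fraction)


# The 50-60% range claimed for the second NIC.
low = bonded_bandwidth(GIGE_MBPS, 0.50)
high = bonded_bandwidth(GIGE_MBPS, 0.60)
print(f"Bonded pair: roughly {low:.0f} to {high:.0f} Mb/s nominal")
```

Under these assumptions, a bonded pair would offer roughly 1.5 to 1.6 times the bandwidth of a single NIC rather than the ideal factor of two.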

Opteron Thoughts

There was a very good discussion on the Beowulf mailing list about Opteron systems and what to look for in a system. The discussion began on the 24th of November, 2003, with Derek Richardson asking about sources of information for tuning the Linux kernel for Opteron processors. The ever-present Donald Becker asked for some clarification about tuning, but mentioned that the easiest performance improvement could come from a proper memory DIMM (Dual In-line Memory Module) configuration to match the application layout. You may recall that each Opteron has its own local memory controller, but that in multi-CPU systems each CPU can see all of the memory. Don pointed out that understanding how the memory slots are filled and what BIOS options are used can make a huge difference. He mentioned that simply choosing to interleave memory in the BIOS can produce a 30% difference on a dual CPU system.
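The effect of interleaving across CPUs can be pictured with a toy model. The sketch below is not how any particular BIOS works: the 4 KiB interleave granularity, the 1 GiB of memory per CPU, and all function names are assumptions chosen only to show how a buffer's addresses might be spread across two memory controllers with and without interleaving.

```python
CHUNK_SIZE = 4096        # assumed interleave granularity in bytes (hypothetical)
NODE_MEMORY = 1 << 30    # assumed 1 GiB attached to each CPU (hypothetical)
NUM_NODES = 2            # dual CPU system, one memory controller per CPU


def node_without_interleave(addr):
    """Contiguous layout: the first NODE_MEMORY bytes sit behind CPU 0."""
    return addr // NODE_MEMORY


def node_with_interleave(addr):
    """Interleaved layout: consecutive chunks alternate between the CPUs."""
    return (addr // CHUNK_SIZE) % NUM_NODES


def spread(mapping, start, length):
    """Count how many chunks of a buffer land on each CPU's memory."""
    counts = [0] * NUM_NODES
    for addr in range(start, start + length, CHUNK_SIZE):
        counts[mapping(addr)] += 1
    return counts


buffer_bytes = 64 * CHUNK_SIZE
print(spread(node_without_interleave, 0, buffer_bytes))  # [64, 0]
print(spread(node_with_interleave, 0, buffer_bytes))     # [32, 32]
```

In this model, the interleaved layout lets both memory controllers serve the buffer, while the contiguous layout keeps it local to one CPU. Which is better depends on how the application touches memory, which is why the configuration should match the application layout.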

Greg Lindahl supported this observation and said that Opteron systems can run counter to what you might expect from other systems. In particular, filling all of the memory slots on a single-channel memory system can actually improve performance. He also went on to mention that the 2.6 kernel is better (faster) than the 2.4 kernel on Opteron processors.

Bill Broadley also mentioned that he has seen significant speedups by adjusting the node interleaving and the memory interleaving. He said that he has a benchmark that shows about 2 Gigabytes/sec (GB/sec) for a single Opteron and about 3 GB/sec for a dual Opteron if properly configured. Another good benchmark for testing memory bandwidth is the STREAM benchmark. It is widely accepted and tests several aspects of memory access. It can also be used to test dual CPU systems. During this discussion about Opterons, several people, including Greg Lindahl, Mike Snitzer, and Egan Ford, pointed out that there are utilities for forcing a process onto a specific CPU. This feature allows you to test the memory access speed of each processor separately.
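As a rough illustration of pinning a process before measuring memory speed, here is a minimal sketch. It is not STREAM and not one of the utilities mentioned on the list. It assumes a Linux system where Python exposes `os.sched_setaffinity`, and the helper names `pin_to_cpu` and `copy_bandwidth` are hypothetical. The number it reports is a crude buffer-copy rate, useful only for comparing one CPU against another on the same machine.

```python
import os
import time


def pin_to_cpu(cpu):
    """Restrict this process to one CPU. Returns False if that is not possible."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # not available on this platform
    try:
        os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process
    except OSError:
        return False
    return True


def copy_bandwidth(size_bytes=32 * 1024 * 1024, repeats=3):
    """Best-of-N buffer copy rate in GB/sec, counting the read and the write."""
    # Fill the source with non-zero data so its pages are really allocated.
    src = bytearray(b"\x01" * size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one pass reading src, one pass writing dst
        best = min(best, time.perf_counter() - start)
    return 2 * size_bytes / best / 1e9


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        cpus = sorted(os.sched_getaffinity(0))
    else:
        cpus = []
    # Only the first two CPUs are tested, mirroring the dual CPU case above.
    for cpu in cpus[:2]:
        if pin_to_cpu(cpu):
            print(f"CPU {cpu}: about {copy_bandwidth():.2f} GB/sec copy rate")
```

Pinning before allocating the buffers matters because the memory is then likely to be allocated close to the CPU under test. On current multi-core machines, CPU numbers 0 and 1 may share a socket, so the CPU list would need adjusting to compare separate memory controllers.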




©2005-2023 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.