Cluster interconnects are the key to good performance. Understanding what type of interconnect works best for your application can make a big difference in both cost and performance.

All About Lowest Latency and Highest Scalability

High-speed InfiniBand server and storage connectivity has become the de facto scalable solution for systems of any size, ranging from small departmental compute infrastructures to the world's largest petascale systems. Its rich feature set and design flexibility let users deploy InfiniBand connectivity between servers and storage in various architectures and topologies to meet performance or productivity goals. These benefits make InfiniBand the best cost/performance solution when compared to proprietary and Ethernet-based options.

From the "why didn't I think of that" department

Recently, Richard Walsh put together a table with both PCI and InfiniBand (IB) specifications and published the spreadsheet on the Beowulf Mailing List. Cluster Monkey contacted Richard and asked permission to reproduce the table in HTML so that it would be more accessible. We split it into two tables so it would fit on the page, but the information is the same.

Ethernet is still a viable option for some users

Ethernet has been a key component in HPC clustering since the beginning. Over the years, interconnects like Myrinet and InfiniBand (and some others) have replaced Ethernet as the main compute interconnect, largely due to better performance. High-performance interconnects like InfiniBand are now the interconnect of choice for those who require performance and scalability.

With such high-performance interconnects available, one has to ask why people still use Ethernet. The answer is threefold. First, Gigabit Ethernet (1000 Megabits/second) is "everywhere." Multiple Gigabit Ethernet (GigE) ports can be found on almost every server motherboard, and users are comfortable with Ethernet technology. Second, it is inexpensive. The commodity market has pushed prices to the point where clusters with low node counts can expect Cat 5e cabling and switching costs of between $10 and $20 per port. And finally, Ethernet is virtually plug-and-play. In other words, it just works.

Multi-core is changing everything. What effect do you think multi-core has on the interconnect requirements for your cluster? Hint: More cores need more interconnect. You may want to read Real Application Performance and Beyond and Single Points of Performance as well.

"In 1978, a commercial flight between New York and Paris cost around $900 and took seven hours. If the principles of Moore's Law had been applied to the airline industry the way they have to the semiconductor industry since 1978, that flight would now cost about a penny and take less than one second." (Source: Intel)

In 1965, Gordon Moore predicted that the number of transistors that could be integrated into a single silicon chip would double roughly every year, a rate he revised in 1975 to about every two years. For more than forty years Intel has been transforming that law into reality (See Figure One). The increase in transistor density enables more transistors on a single chip and therefore increases CPU performance. However, it is not the only factor driving CPU performance: the increase in CPU clock frequency, a by-product of the transistor density, was also an important factor in the overall performance improvement.
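As a rough illustration of the two-year doubling rate described above, the compounding can be sketched in Python. The function name `growth_factor` and the 40-year span are assumptions chosen for this example, not figures from the article, and the model is an idealized constant doubling period rather than measured transistor counts.

```python
def growth_factor(years: float, doubling_period: float = 2.0) -> float:
    """Return the multiplicative growth after `years` of steady doubling.

    Assumes an idealized, constant doubling period; real transistor
    counts only roughly follow this curve.
    """
    if doubling_period <= 0:
        raise ValueError("doubling_period must be positive")
    return 2.0 ** (years / doubling_period)


if __name__ == "__main__":
    # Hypothetical 40-year span: 20 doublings at one doubling every two years.
    print(f"{growth_factor(40):,.0f}x")
```

Forty years at this rate is 20 doublings, or roughly a million-fold increase, which gives a sense of why the airline comparison in the quote produces such extreme numbers.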

Can you draw a line with a single point? Sure you can, but it may not tell you anything. Join Gilad Shainer as he discusses high performance networking and single point performance metrics. You may want to read Real Application Performance and Beyond and Optimum Connectivity in the Multi-core Environment as well.

High-performance computing is rapidly becoming a critical tool for conducting research, creative activity, and economic development. In the global economy, speed-to-market is essential to getting ahead of the competition. High-performance computing uses compute power to solve highly complex problems, perform critical analysis, or run computationally intensive workloads faster and with greater efficiency. During the time needed to read this sentence, each of the top 10 clusters on the Top500 list would have performed over 150,000,000,000,000 calculations.




This work is licensed under CC BY-NC-SA 4.0

©2005-2023 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.