Cluster Newbie

Don't know a thing about clusters? Not to worry. Prodigious cluster scrivener Robert Brown is here to present the basics in a clear and concise set of introductory columns (and then some). Welcome to High-tech hardball.

From the things to consider while on vacation department

Last fall when my daughter started school, she came home and said the teacher recommended students get a graphing calculator. Mind you, it was not the hundred bucks for the calculator that prompted me to grab a pencil and paper and say "Back in my day, this was our graphing calculator. No batteries needed, you can even keep the stylus on your ear, and as a bonus feature it has an undo tip as well", but rather the idea that pencil and paper was becoming a lost art. I mean, if you are on a desert island and need to plot a parabola, where are you going to find a graphing calculator? (You might be wondering what this has to do with clusters. Read on.)

Engineering Clusters, Revisited

Many months ago we started learning how to design serious clusters, starting with prototyping and benchmarking hardware. The CPU part appeared to be pretty easy -- just get the fastest processor you can afford, with enough memory to run your application. Actually it isn't QUITE that simple, as there are some real differences between processors and computer architectures to choose from (and hence real choices to make) but the idea was simple enough.

From the "How tall is a building?" department

The Beowulf Mailing List is a source of valuable information about high performance computing (HPC) Linux clusters. Conversations on the list apply not only to HPC, but also to Linux performance on any system. Recently (March 8, 2007) the following was posted to the Beowulf Mailing List:

I would like to know what server has the best performance for HPC systems between The Dell Poweredge 1950 (Xeon) And 1435SC (Opteron). Please send me suggestions...

Here are the complete specifications for both servers:
- Poweredge 1435SC, Dual Core AMD Opteron 2216 2.4GHz 3GB RAM 667MHz, 2x512MB and 2x1GB Single Ranked DIMMs
- Poweredge 1950, Dual Core Intel Xeon 5130 2.0GHz 2GB 533MHz (4x512MB), Single Ranked DIMMs

From your specifications, almost certainly the Opteron. For a variety of reasons, but higher clock certainly helps -- it would probably have been faster at equivalent clock anyway. Now that I've "answered", let me tell you why you shouldn't believe me and what you should actually do to answer your own question.

Tools of the cluster trade, don't leave home without them

In past articles, we looked at basic Linux networking. At this point you should have a pretty good idea of how the basic network, common to nearly every modern computer system (TCP/IP over Ethernet), is structured at the packet level. We have learned about Ethernet packets and their header, encapsulating IP packets with their header, encapsulating TCP packets with their header. This process of encapsulation isn't quite finished -- there is nothing to prevent anyone from adding YAH (yet another header) inside the TCP header to further encapsulate actual data, but since a lot of network communications carry the data inside the TCP layer without further headers or encapsulation, we'll quit at this point and move on to the next burning question.
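The nesting described above can be sketched in a few lines of Python. This is only an illustration, not code from the articles: the function names, addresses, and port numbers are made up for the example, the checksums are left at zero, and the Ethernet CRC (normally added by the network hardware) is omitted.

```python
import struct

def tcp_segment(payload: bytes, src_port: int = 40000, dst_port: int = 80) -> bytes:
    """Prepend a 20-byte TCP header (no options) to the payload."""
    offset_flags = (5 << 12) | 0x018  # data offset = 5 words, PSH+ACK flags
    header = struct.pack("!HHIIHHHH",
                         src_port, dst_port,
                         0, 0,            # sequence and acknowledgment numbers
                         offset_flags,
                         65535,           # window size
                         0, 0)            # checksum (not computed here), urgent pointer
    return header + payload

def ip_packet(segment: bytes,
              src: bytes = bytes([10, 0, 0, 1]),
              dst: bytes = bytes([10, 0, 0, 2])) -> bytes:
    """Prepend a 20-byte IPv4 header (no options) to a TCP segment."""
    total_length = 20 + len(segment)
    header = struct.pack("!BBHHHBBH4s4s",
                         0x45,            # version 4, header length 5 words
                         0,               # type of service
                         total_length,
                         0, 0,            # identification, flags/fragment offset
                         64,              # time to live
                         6,               # protocol number 6 = TCP
                         0,               # header checksum (not computed here)
                         src, dst)
    return header + segment

def ethernet_frame(packet: bytes,
                   dst_mac: bytes = b"\xff" * 6,
                   src_mac: bytes = b"\x02\x00\x00\x00\x00\x01") -> bytes:
    """Prepend a 14-byte Ethernet header; EtherType 0x0800 marks an IP payload."""
    return struct.pack("!6s6sH", dst_mac, src_mac, 0x0800) + packet

data = b"hello"
segment = tcp_segment(data)
packet = ip_packet(segment)
frame = ethernet_frame(packet)

# Each layer wraps the previous one and adds its own header in front.
print(len(data), len(segment), len(packet), len(frame))
```

Each function treats whatever it is handed as opaque bytes, which is the whole point of encapsulation: the Ethernet layer neither knows nor cares that the IP packet it carries happens to contain TCP.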

Given this marvelous understanding of how the network is supposed to be working, the burning question du jour is: How well is it working?

The cruel truth about IP Datagrams and other things you may have forgot (or never learned).

In the last column we learned some things that every Cluster Engineer should know about Ethernet and the Internet Protocol (IP). The former specification, recall, is defined by IEEE documents that are "open" but not freely re-publishable; the latter by fully open RFCs that you can read yourself for free and from which I can actually cut and paste while describing them. The article contained a synopsis of information from RFC 791 (IP) and referenced RFC 792 (ICMP) and RFC 894 (IP over Ethernet).

Of course when you read that article (studied it, really) you noticed the fact that the smallest packet that can be sent to deliver one single byte of actual data via IP over Ethernet is exactly 64 bytes, the smallest permitted Ethernet packet size. Of these 64 bytes, 18 bytes are Ethernet header and CRC, 20 bytes are IP header, one byte is data, and the rest (25 bytes, about 40% of the packet) is "padding" (although in practice it will generally be at least partially used for higher-level headers, e.g. the TCP headers discussed below).
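The byte accounting above is easy to check with a short calculation. The sketch below uses only the figures quoted in the paragraph; the constant and function names are invented for the example, and the IP header is assumed to carry no options.

```python
ETHERNET_MIN_FRAME = 64   # smallest permitted Ethernet packet, in bytes
ETHERNET_OVERHEAD = 18    # Ethernet header plus CRC
IP_HEADER = 20            # IP header, assuming no options

def padding_for(payload_bytes: int) -> int:
    """Bytes of padding needed to reach the minimum Ethernet packet size."""
    used = ETHERNET_OVERHEAD + IP_HEADER + payload_bytes
    return max(0, ETHERNET_MIN_FRAME - used)

pad = padding_for(1)
share = 100 * pad / ETHERNET_MIN_FRAME
print(f"{pad} bytes of padding, {share:.0f}% of the packet")
# prints "25 bytes of padding, 39% of the packet"
```

With these numbers the padding disappears once the payload reaches 26 bytes, since 18 + 20 + 26 = 64.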

This work is licensed under CC BY-NC-SA 4.0

©2005-2023 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.