In this installment we look at Wake-on-LAN and processor benchmarking threads from the Beowulf mailing list and file system benchmarks from the Linux kernel list.

Turning on Nodes Through the Network

On November 3, 2003, Mathias Brito posted to the Beowulf mailing list asking how he could boot the master node of his cluster and then have the slave nodes boot automatically. There were several responses to his request. The most common suggestion was to use the Wake-on-LAN (WOL) feature of the NIC (Network Interface Card) in each node so that the node boots when it receives a signal. For WOL to work, the NIC must have a WOL-capable chipset, and typically you connect a cable from the NIC to the WOL connector on the motherboard. The NIC has a very low-power mode in which it monitors the network for a special data packet that wakes the system and causes it to boot. Erwan Velu pointed out that you can use Scyld's ether-wake program (it's freely available, see Resources sidebar) to boot the nodes. He added that by simply running ether-wake in rc.local on the master node, the compute nodes will start booting while the master node is still booting. However, care must be taken with this approach so that anything the compute nodes need from the master node, such as NFS file systems, DHCP, and TFTP, is available when they boot.
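To make the "special data packet" concrete, here is a minimal Python sketch of the commonly documented WOL "magic packet": six 0xFF bytes followed by the target MAC address repeated 16 times. This is not how ether-wake itself is implemented (ether-wake sends a raw Ethernet frame); the sketch assumes the packet is sent as a UDP broadcast to port 9, which is a widespread convention. The function names and the MAC address in the usage note are hypothetical.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for one MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Synchronization stream (6 x 0xFF) followed by 16 copies of the MAC.
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (assumed transport, see lead-in)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

A master node could call something like `wake("00:11:22:33:44:55")` once per compute node after its NFS, DHCP, and TFTP services are up, which sidesteps the ordering problem mentioned above.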

Don Becker went on to state that he thought a more reliable and sophisticated approach was to use systems with IPMI (Intelligent Platform Management Interface) 1.5 support. IPMI is a specification that defines a standard, abstracted, message-based interface to intelligent platform management hardware. It is used for system health monitoring, chassis intrusion monitoring, and other aspects of server monitoring on systems that have such hardware. It is supported by Intel, Dell, HP, and NEC. Don mentioned that waking a node over the network is included in the IPMI specification. He also mentioned that most motherboards equipped for IPMI need a Baseboard Management Controller (BMC), which adds about $25-$150 to the cost of the motherboard. There was a short discussion about the price, but the final conclusion fell within the range Don had mentioned.
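As an illustration of powering on a node through IPMI, the following Python sketch builds a command line for the open source `ipmitool` utility. The thread did not name a specific tool, so the use of `ipmitool` is an assumption, as are the host name and credentials in the usage note. The `lan` interface selects IPMI 1.5 over the network.

```python
import subprocess


def ipmi_power_on_command(host: str, user: str, password: str) -> list[str]:
    """Build an ipmitool command that asks a node's BMC to power the node on."""
    return [
        "ipmitool",
        "-I", "lan",      # IPMI v1.5 LAN interface
        "-H", host,       # address of the node's BMC
        "-U", user,
        "-P", password,   # note: visible in the process list while running
        "chassis", "power", "on",
    ]


def power_on(host: str, user: str, password: str) -> None:
    """Run the command; raises CalledProcessError if ipmitool reports failure."""
    subprocess.run(ipmi_power_on_command(host, user, password), check=True)
```

For example, `power_on("node01-bmc", "admin", "secret")` (hypothetical values) would request a power-on from that node's BMC, assuming `ipmitool` is installed on the master node and the BMC is reachable.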

Better Benchmarking

A very interesting discussion grew out of a question, asked by Gabriele Butti on October 28, 2003, as to whether the Itanium 2 (I2) CPU or the Opteron CPU was better for a new cluster. There were some initial responses to the question, but a response from Richard Walsh started a discussion about benchmarks, mostly between Richard and Mark Hahn. Richard and Mark are both well known on the Beowulf mailing list.

Richard started by discussing the I2's SpecFP 2000 performance and the Opteron's HyperTransport bus, which allows each processor access to the full memory bandwidth of the system. SPEC (Standard Performance Evaluation Corporation) is a non-profit corporation founded to establish and maintain a relevant set of benchmarks for high-performance computers. While SPEC has several benchmarks, the primary ones, SpecINT 2000 and SpecFP 2000, test integer and floating-point performance. Each benchmark suite consists of several programs that test different aspects of the computer system. Testers must use a standard set of compiler flags for baseline performance, and may use any combination of compiler flags to obtain peak performance.

Mark responded that he thought the SPEC results for the I2 indicated that the SPEC codes were well suited to the large cache of the I2 but did not necessarily test the I2 itself. Mark and Richard then had a very detailed discussion about benchmarking, in which Mark also pointed out that some CPUs have very high results on certain parts of the SPEC tests but weak results on others. Although SPEC averages the scores with a geometric mean, which should reduce the impact of a large score on one particular test, a very strong result can still skew the overall SpecFP 2000 number. Mark and Richard also discussed cache effects on the SpecFP 2000 benchmark, with Richard stating that he thought a benchmark or two that were sensitive to cache effects were important because some real-world codes behave the same way.
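The effect of one outsized score on a geometric mean is easy to check numerically. The sketch below uses made-up component ratios (they are not real SPEC results) to compare the arithmetic and geometric means when one of four component scores is 16 times the others.

```python
import math


def geometric_mean(ratios: list[float]) -> float:
    """Geometric mean computed via logs to avoid overflow on long lists."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def arithmetic_mean(ratios: list[float]) -> float:
    return sum(ratios) / len(ratios)


# Hypothetical per-benchmark ratios: three ordinary scores and one outlier,
# for example a code that happens to fit entirely in a large cache.
scores = [1.0, 1.0, 1.0, 16.0]

print(f"arithmetic mean: {arithmetic_mean(scores):.2f}")  # 4.75
print(f"geometric mean:  {geometric_mean(scores):.2f}")   # 2.00
```

The geometric mean damps the outlier considerably compared with the arithmetic mean, yet the single strong result still doubles the summary number even though three of the four components are unchanged. That is the kind of skew discussed in the thread, and it is a reason to look at the individual component scores as well.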

Eric Moore joined the discussion with a very interesting look at the SpecFP 2000 benchmark and some of the codes that make up the results. Robert Brown also posted his thoughts, which were in line with both Richard's and Mark's. He suggested that one needs to look at the results of ALL of the components of the SpecFP 2000 benchmark to get a good idea of the performance, rather than only at the geometric mean. Robert provided some very well-written comments about benchmarks for HPC (High-Performance Computing). He pointed out that he likes benchmarks that address various problem sizes and different aspects of the hardware, including the interconnect. Mark originally brought up the idea of a database of benchmarks that one could search or combine to generate meaningful results, and Robert seconded the idea. Now, if someone could find the time to do it...


Creative Commons License
©2005-2018 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.