Issues, but no real answers
The Beowulf mailing list provides detailed discussions about issues
concerning Linux HPC clusters.
In this article I review some postings to the Beowulf list on
performance measurements and Microsoft's foray into the HPC market
(nice historical perspective here), which resulted in a good and
timely discussion about Linux distributions for the HPC world.
While these discussions are from 2004 and a bit on the older side
(aren't we all?), they do provide some good insights into these
continuing debates.
G5 Performance and Benchmarks
This Beowulf list thread started when Eugen Leitel posted a forward
from another mailing list asking about Apple G5 performance,
particularly on computational chemistry QM and MD codes (the
original poster was Joe Leonard). Eugen also forwarded a response to
Joe's initial posting from the other list. Mike McCallum had tested
dual G5s and found them to be near the top in price/performance,
especially using the IBM compilers. Based on some benchmark numbers
that he posted, he also found the built-in GigE to be quite good for
NAMD and CHARMM, which according to Mike scale well with GigE.
Bill Broadley was the first to reply, with a request for
clarification of some of the test results that Mike had posted. He
also took issue with some of the conclusions that Mike made regarding
the scalability of NAMD and CHARMM on GigE vs. Myrinet. Michael
Huntingdon then posted that he thought that the Itanium 2 was a good
solution based on the performance numbers.
Joe Landman took issue with the idea of using SPEC numbers as a
"good" predictive indication of performance. Joe also pointed out that
the benchmarks posted by Mike McCallum may have had some flaws because
of the compilers and compile options used. He also had problems with
recommending Itanium 2 processors for all HPC applications, noting
that for bioinformatics applications, Itanium 2 processors don't
compare well to other processors.
A frequent contributor, Mark Hahn, joined in to say that he had done
a small analysis of the SpecFP benchmarks. His goal was to point out
that the overall SpecFP scores of some machines are driven by a small
subset of the individual benchmark scores. He sorted the scores for
each machine, omitted the top scores, and plotted the results. Based
on his analysis he concluded that the PowerPC 970 was good for
cache-friendly codes and codes that could perform two vector
multiply-add operations per cycle; the Itanium 2 is great if your
code is cache friendly, your code is amenable to the pipelining
required by EPIC (the Itanium 2 architecture), and you can afford
them; and Opterons are great if your working set makes caches less
effective.
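The kind of analysis Mark describes can be sketched in a few lines of
Python. This is only an illustration, not Mark's actual script: the
function names and the component scores below are hypothetical, and
the only outside fact assumed is that SPEC combines its component
ratios with a geometric mean.

```python
import math


def geometric_mean(scores):
    """Geometric mean of positive scores (SPEC aggregates ratios this way)."""
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


def trimmed_geomean(scores, drop_top=0):
    """Geometric mean after discarding the `drop_top` highest scores."""
    if drop_top < 0 or drop_top >= len(scores):
        raise ValueError("drop_top must leave at least one score")
    ordered = sorted(scores)
    kept = ordered[:len(ordered) - drop_top]
    return geometric_mean(kept)


# Hypothetical component scores, invented for illustration only.
machine_a = [900, 950, 1000, 1050, 6000]    # one outlier benchmark
machine_b = [1200, 1250, 1300, 1350, 1400]  # evenly spread scores

for drop in range(3):
    a = trimmed_geomean(machine_a, drop)
    b = trimmed_geomean(machine_b, drop)
    print(f"drop top {drop}: A={a:.0f}  B={b:.0f}")
```

With these made-up numbers, machine A leads on the full set of scores
but falls behind machine B as soon as its single best result is
dropped, which is the effect the analysis was meant to expose.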
Greg Lindahl posted a response to Mark's analysis. He didn't think
the analysis was valid because the scores were not normalized to make
their absolute scores valid. Greg also took issue with the phrase,
"...if you can afford them..." in regard to the Itanium 2. Greg's
point was that you either find the best price/performance and buy
those systems, or you buy the most performance for a given price.
Neither approach cares about the cost of a single system, just
the performance and cost of the entire cluster.
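Greg's argument is easy to see with a little arithmetic. The Python
sketch below is a simplified illustration: the node names, prices, and
performance figures are hypothetical, and it assumes (optimistically)
that cluster performance scales linearly with node count, which real
applications rarely achieve.

```python
from dataclasses import dataclass


@dataclass
class NodeType:
    name: str
    price: float  # cost per node (hypothetical units)
    perf: float   # per-node performance on your own application


def cluster_for_budget(node, budget):
    """Return (node count, total performance) for a fixed budget.

    Assumes linear scaling with node count, which is a simplification.
    """
    count = int(budget // node.price)
    return count, count * node.perf


def best_for_budget(nodes, budget):
    """Pick the node type that gives the most total performance."""
    return max(nodes, key=lambda n: cluster_for_budget(n, budget)[1])


# Hypothetical node types, invented for illustration only.
cheap = NodeType("cheap", price=2000, perf=1.0)
pricey = NodeType("pricey", price=6000, perf=2.5)
faster_pricey = NodeType("faster-pricey", price=6000, perf=3.5)

budget = 60000
print(best_for_budget([cheap, pricey], budget).name)
print(best_for_budget([cheap, faster_pricey], budget).name)
```

The comparison is made on the whole cluster, so an expensive node can
still win if its performance advantage outweighs its price, and lose
if it does not.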
Windows HPC Edition and Linux Clustering
As would be expected, this thread brought out a lot of opinions and
insights from the list membership. On May 25, 2004, Eugen Leitel posted
an article describing how Microsoft was creating an HPC version of
Windows (now available as Windows Compute Cluster Server 2003).
Of course, there were the immediate comments about seeing
hundreds of nodes all getting the "blue screen of death" at the same
time. These comments are always good for a chuckle, but
there are some real issues behind Microsoft's entry into this space.
Robert Brown started off the discussion by writing a "few" comments
about Microsoft's motives for entering the HPC market. Shortly
thereafter Joe Landman jumped in to say that companies don't really
have nefarious motives behind their efforts. In his opinion, most of
their efforts follow clearly from their basic goals (usually
involving making lots of money). Douglas Eadline,
editor of ClusterMonkey, posted that he thought there
were two reasons that Microsoft was entering this market. The first
reason, as for any corporation, is profit. He noted that the article
stated that Microsoft was focusing on the financial and
cycle-scavenging markets, where the return on investment is good. The
second reason was competitive. He suggested that Microsoft was
trying to limit "Linux creep," and that the use of Linux to build
airplanes, find oil, and search the genome had added some legitimacy
to Linux in the data center. His final comment, "I think we just got
legit.", was a positive one because the entry of a big company often
helps legitimize and solidify a market.