Cheap PCs from Wal-Mart
Robert G. Brown
rgb at phy.duke.edu
Mon Jun 2 22:23:47 EDT 2003
On Mon, 2 Jun 2003, Mark Hahn wrote:
> but the real question is: can you afford to use a cluster node which
> has, say, 10-20% of the performance? you can stretch Amdahl's law a bit
> and see that the further you push wimpy nodes, the smaller a problem
> domain you can address (requires ever looser coupled programs, longer
> latency of individual work-units, etc).
Ya. I've refrained from this discussion because I'm busy and y'all've
been doing so very well anyway, but I think Mark has it dead right.
Really, truly cheap-o nodes are a lovely thing for (as was proposed in
the original post long ago) a tiny $1K hobby/play cluster, or even for a
toy cluster for a high school cluster laboratory where $2-3K is "a lot
of money" for them to raise. The purpose of such a cluster is NOT to
get some piece of numerical work done as cost-effectively as possible.
It is to teach a generation of kids about managing linux systems,
writing programs, and the rudiments of cluster computing (Amdahl's law
and so forth) in an inexpensive environment that they can "play" with.
It is for cluster humans like me to have cheap miniclusters at home to
play with to try things out in an environment they can "break the hell
out of" without interfering with their production environment.
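
To put rough numbers on Mark's point, here is a minimal Python sketch of plain Amdahl's law applied to slow nodes. The function names, the serial fractions, and the 15% relative speed (a value picked from inside Mark's 10-20% range) are all assumptions for illustration. The model ignores communication cost entirely, which if anything flatters the wimpy nodes.

```python
def cluster_time(serial_fraction, nodes, relative_speed):
    """Run time on `nodes` nodes, each `relative_speed` times as fast as a
    reference node, in units of the reference node's single-node run time.

    Plain Amdahl's law: the serial part runs on one (slow) node, the rest
    is split evenly, and communication is assumed to be free.
    """
    parallel_fraction = 1.0 - serial_fraction
    return (serial_fraction + parallel_fraction / nodes) / relative_speed


def nodes_to_match(serial_fraction, relative_speed, max_nodes=10_000):
    """Smallest number of slow nodes that matches ONE reference node,
    or None if no count up to max_nodes is enough."""
    for nodes in range(1, max_nodes + 1):
        if cluster_time(serial_fraction, nodes, relative_speed) <= 1.0:
            return nodes
    return None


if __name__ == "__main__":
    # Hypothetical serial fractions; 0.15 is an assumed relative node speed.
    for serial_fraction in (0.0, 0.05, 0.10, 0.20):
        needed = nodes_to_match(serial_fraction, relative_speed=0.15)
        print(f"serial fraction {serial_fraction:.2f}: nodes needed = {needed}")
```

In this model, once the serial fraction of a program reaches the relative speed of the slow node, no number of slow nodes catches a single fast one. That is the "smaller problem domain" effect: only the more loosely coupled programs remain worth running.
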
For real numerical work, the following argue against using the cheapest
nodes:
a) MFLOPS/$$ are generally not peak for the cheapest hardware, even in
raw aggregate. I "usually" find that I can get the most work done for the
least money one or two clock generations back from peak/bleeding edge.
b) Mark's repeated observations on "good benchmarks" in terms of which
to find the cost-benefit winner are also dead on. The best benchmark is
(of course) the application(s) you plan to run. Lacking numbers for
that application, you'll have to do your best to guesstimate it from e.g.
specfp, stream, and other benchmarks. Most people on this list will
find raw integer benchmarks to be somewhat irrelevant, as HPC
performance tends to be dominated by floating point operations. Then
there is the usual list of rate-limiting bottlenecks: cpu, memory,
c) Cheap nodes are cheap, and in a tanstaafl world will likely break
early, break often. For a few nodes in a toy cluster with some sort of
warranty, your aggregate risk may be low and you may have the energy to
deal with failures that occur. For 128 nodes in a production cluster, a
failure a week or more will soon drive you mad.
d) Cheap slow nodes probably draw MORE power per unit of work done
than do faster newer more expensive nodes. I say "probably" because I
haven't done all of the measurements to be able to say this for a fact,
but note that a number of items in a "typical computer" chassis produce
a more or less invariant draw. You have to power the power supply
itself, a floppy, a NIC, a hdd, a motherboard, a video card. Some of
those things actually run quieter and draw less in more expensive
versions. Others draw at a rate that scales with clock (although not
necessarily linearly). Cooling goes with heat. Both cost money.
For example, consider an imaginary node where overhead is 40 watts and
CPU/memory at speed "1" are 40 more watts. To get to speed 2 with
perfect scaling you can buy two nodes and draw 160 watts total. Or you
can buy one node at speed 2, forget scaling, and even if the 2x cpu
draws 80 watts, end up drawing only 120 watts. If the faster CPU has a
marginal cost of less than the second entire system, you end up winning
on raw dollars as well (one of the things that favors fast nodes as in
e) Finally, cheap nodes are almost obsolete when you buy them, so it
should come as no surprise that in a production environment they'll be
obsolete in a lot less than a typical three year lifetime. I've read
cost benefit studies that suggest that from a TCO perspective even
fairly MODERN nodes should be replaced every 12 to 18 months (although
this particular timeframe depends a bit on a variety of overhead and
management costs). Cheap nodes shouldn't even be brought home. Moore's
Law is inexorable and unforgiving, and you'll find that your entire
$1000 "cheap cluster" may be replaceable by a single $1000 desktop
system that is twice as fast INSIDE a year from when you bought it if
you're not careful.
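
The arithmetic in c) and d) can be worked through in a few lines of Python. This is only a sketch: the function names are made up for illustration, the wattages are the imaginary ones from d), and the per-node failure interval of 128 weeks is just what "128 nodes, a failure a week" implies if failures are spread evenly, not a measured figure.

```python
def power_draw(nodes, overhead_watts, cpu_watts, speed_per_node):
    """Return (total watts, watts per unit of speed) for a set of
    identical nodes, assuming perfect scaling across nodes."""
    total_watts = nodes * (overhead_watts + cpu_watts)
    total_speed = nodes * speed_per_node
    return total_watts, total_watts / total_speed


def failures_per_week(nodes, weeks_between_failures_per_node):
    """Expected failures per week for the whole cluster, assuming each
    node fails independently at the same average rate."""
    return nodes / weeks_between_failures_per_node


if __name__ == "__main__":
    # Imaginary node from d): 40 W overhead, 40 W CPU/memory at speed 1.
    two_slow = power_draw(nodes=2, overhead_watts=40, cpu_watts=40, speed_per_node=1)
    # One node at speed 2 whose CPU/memory draws 80 W.
    one_fast = power_draw(nodes=1, overhead_watts=40, cpu_watts=80, speed_per_node=2)
    print("two slow nodes (watts, watts per unit speed):", two_slow)
    print("one fast node  (watts, watts per unit speed):", one_fast)

    # Assumed failure interval of 128 weeks per node (implied by c), not measured).
    print("128-node cluster, failures/week:", failures_per_week(128, 128))
    print("4-node toy cluster, failures/week:", failures_per_week(4, 128))
```

The fixed overhead is what hurts the slow nodes: it is paid once per chassis no matter how much work the chassis does.
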
So for hobby/home, sure, wal-mart specials, cheapest homemade,
web-special boxes are fine. Hell, I still run nodes at home as slow as
400 MHz Celerons -- but they aren't "production", they are for fun. At
Duke I buy much more expensive, much more reliable, much faster systems,
and curse the gods when even THEY break down from time to time.
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu