[Beowulf] Selling computation time

Robert G. Brown rgb at phy.duke.edu
Thu Dec 28 11:37:57 EST 2006

On Tue, 26 Dec 2006, Chetoo Valux wrote:

> I wonder then if there would be potential buyers for cluster time. I've been
> browsing,  not too deep, the net, and I've not found (yet) any information
> of someone selling cluster time.

This is a perennial topic of discussion on the list, and the general
answer is that SO FAR there are two generically distinct marketplaces
for this sort of remote clustering.  One is the sort that is already
being filled by e.g. Google -- remote computations that may well be
distributed over a carefully engineered cluster to perform a single task
that is of value in some very specific milieu.  In many cases the
computations at hand aren't properly "HPC" in that they may not be
"numerical", but they are certainly cluster apps and may well run on
very massive clusters indeed.

The other is numerical HPC applications.  Here the marketplace is one
where it is difficult to achieve a win.  First of all, most people who
are doing HPC have very specific, very diverse, applications and often
these applications run on clusters that are at least to some extent
custom-engineered for the application.

A general purpose commercial cluster would face immediate problems
providing a "grid-like" interface to the general population.  Even
something as simple as compiling for the target platform would become
difficult, and is one of those areas where solutions on real grid
computers tend to be at least somewhat ugly.  Then there is access,
accounting, security, storage, and whether the application is EP
(embarrassingly parallel) or has actual IPCs (interprocessor
communications), so that it needs allocations of blocks of nodes with
some given communications stack and physical network.

By the time one works out the economics, it tends to be a lose-lose
proposition.  It is just plain difficult to offer computational
resources to your potential marketplace:

   a) in a way that they can afford -- more specifically can write into a
grant proposal to afford.

   b) in a way that is cheaper than they could obtain it by e.g. spending
the same budget on hardware they themselves own and operate.  Clusters
are really amazingly cheap, after all -- as little as a few $100 per
node, almost certainly less than $1000 per CPU core even on bleeding
edge hardware.  Yes, there are administrative costs and so on, but for
many projects those costs can be filled out of opportunity cost labor
you're paying for anyway.

   c) and, if you manage a rate that satisfies b), that still makes YOU
money.  Dedicated cluster admins are expensive -- suppose you have just
one of these (yourself) and are willing to do the entire entrepreneurial
thing for a mere $60K/year salary and benefits (which I'd argue is
starvation wages for this kind of work).  A 100-node (CPU) pro-grade
cluster will likely cost you at LEAST $50,000 up front, plus the rent of
a physical space with adequate AC and power resources plus roughly
$100/node/year -- call it between $10 and $20K/year just to keep the
nodes powered up, plus another $5000 or so in spare parts and
maintenance expenses.  The amortization of the $50K up front investment
is over at most three years (at which point your nodes will be too slow
to be worth renting anyway, and you'll likely have to drop rental rates
yearly to keep them in a worthwhile zone as it is).  So call it $20K of
depreciation and interest on the borrowed money per year plus $20K in
operating expenses per year plus $60K for your salary -- you have to
make about $100K/year, absolute minimum, just to break barely arguably
not quite even.
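
The arithmetic above can be sketched in a few lines of Python.  This is
only a back-of-the-envelope illustration using the round numbers quoted
in this post; the function name and the "utilization" parameter (the
fraction of node-time actually rented out) are assumptions added for
illustration, not part of the original estimate.

```python
def break_even(nodes=100,
               depreciation_and_interest=20_000,  # $/year, figure from the post
               operating=20_000,                  # $/year: power, space, spares
               salary=60_000,                     # $/year for one admin/owner
               utilization=1.0):                  # ASSUMPTION: fraction of time rented
    """Return (annual revenue needed, rate needed per node per year)."""
    annual = depreciation_and_interest + operating + salary
    per_node_year = annual / (nodes * utilization)
    return annual, per_node_year


print(break_even())                 # (100000, 1000.0)
print(break_even(utilization=0.5))  # (100000, 2000.0)
```

At anything less than full occupancy the required rate per node climbs
quickly, which is why the nodes have to stay rented essentially all of
the time.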

That's $1000/node/year, and you have to KEEP them rented out in such a way as
to make this ALL the time for ALL three years to be able to rollover
replace the cluster nodes over that interval and stay in business.  In
reality, in the closely comparable business of renting space and
sysadmin time for network servers, rates are 2-5 times this, so even
allowing for better scaling of service delivery for compute nodes
compared to webservers, this estimate is still very likely quite low.

Well hell, for $1000 I can buy my OWN compute node -- one with multiple
cores at that -- house it in my OWN space if I or my university have
anything that will do for this purpose (as is usually the case), feed
and cool it, and with FC+PXE+Kickstart and/or warewulf, installing and
maintaining it is, for me at least, a matter of a few hours' initial
investment for the entire cluster plus a couple of boots per node, as we
have a FC repository and a PXE/DHCP server already configured.  We can
even handle moderate node heterogeneity with some of the tools developed
at Duke that can rewrite kickstart scripts on the fly according to
xmlish rules.  And I can buy just as many newer, faster nodes next year,
and the year after that, instead of renting your aging nodes.
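
A rough sketch of the rent-versus-own comparison from the customer's
side, again using only the round figures in this post ($1000/node/year
break-even rental rate, $1000 to buy a node, roughly $100/node/year to
power it).  The function names are hypothetical, and the sketch
deliberately ignores space, cooling infrastructure and admin time, which
the argument above assumes the customer already has or is paying for
anyway.

```python
def rent_cost(rate_per_node_year=1000, years=3):
    """Total paid to rent one node at the provider's break-even rate."""
    return rate_per_node_year * years


def own_cost(purchase=1000, power_per_year=100, years=3):
    """Total paid to buy one node and keep it powered.

    ASSUMPTION: housing and administration are treated as already paid
    for, as in the post's opportunity-cost argument.
    """
    return purchase + power_per_year * years


print(rent_cost())  # 3000
print(own_cost())   # 1300
```

Under these assumptions owning comes out well ahead over three years,
and the owner still has the hardware at the end.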

The point being that with VERY FEW EXCEPTIONS the economics just doesn't
work out.  Yes, some things can be scaled up or down to improve the
basic picture I present, but only at a tremendous risk for a general
purpose business.  The only exceptions I know of personally are where
somebody is already operating a cluster consultation service --
something like Scalable Informatics -- that helps customers design and
build clusters for specific purposes.  In some cases those customers
have "no" existing expertise or infrastructure for the cluster they need
and can obtain a task-specific win by effectively subcontracting the
cluster's purchase AND housing AND operation to SI, where SI already
has space and resources and administration and installation support set
up and can install and operate their client's cluster with absolutely
minimal investment in node-scaled time.

Note that doing THIS provides you with all sorts of things that alter
the basic equation portrayed above -- you already have an income from
the consultative side and don't have to "live" on what you make running
clusters, you don't actually buy the cluster and rent it out, the client
buys the cluster and pays you to house and run it (so you don't have to
deal with depreciation or rollover renewal, they do), you have
preexisting but scalable infrastructure support for cluster installation
and software maintenance, etc.  Even here I imagine that the margins are
dicey and somewhat high risk, but maybe Joe will comment.  Maybe not --
I doubt that he'd welcome more competition, since there is probably
"just enough" business for those fulfilling the need already.  I don't
view this as a high-growth industry...;-)


