Clusters Vs Grids

nixon at nsc.liu.se
Mon Jul 21 04:18:15 EDT 2003


Shin <shin at solarider.org> writes:

> Broadly (very broadly) as I understand it a cluster is a collection
> of machines that will run parallel jobs for codes that require high
> performance - they might be connected by a high-speed interconnect
> (e.g. Myrinet, SCI, etc.) or via normal Ethernet-type connections.
> The former are described as closely or tightly coupled and the
> latter as loosely coupled? Hopefully I'm correct so far. 

You're basically correct, except that a cluster doesn't necessarily
run parallel jobs. A common situation is that you have lots and lots
of non-interdependent, single-CPU jobs that you want to run as quickly
as possible.
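
A minimal sketch of such a job farm on a single node, assuming Python's
standard process pool; `simulate` and `run_farm` are hypothetical names,
and the arithmetic inside `simulate` is only a stand-in for real work:

```python
from concurrent.futures import ProcessPoolExecutor


def simulate(seed: int) -> int:
    """Hypothetical stand-in for one independent, single-CPU job."""
    # A small deterministic computation; no job reads another job's result.
    state = seed
    for _ in range(1000):
        state = (state * 1103515245 + 12345) % 2**31
    return state


def run_farm(seeds, workers=4):
    """Run every job on whichever worker process is free."""
    # The jobs never communicate with each other, so the speed of the
    # interconnect between workers does not affect the total run time.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, seeds))


if __name__ == "__main__":
    print(run_farm(range(8)))
```

On a real cluster a batch system would hand the jobs to nodes rather
than to local processes, but the shape of the workload is the same.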

> A grid is also a collection of computing resources (CPUs, storage)
> that will run parallel jobs for codes that also require high
> performance (or perhaps very long run times?). However these
> resources might be distributed over a department, campus or even
> further afield in other organisations, in different parts of the
> world?

Again, basically correct, except for the same point as above. I think
the key points about a grid are that the resources are:

a) possibly distributed over large geographical distances,

b) possibly belonging to different organizations with different
   policies; there is no centralized administrative control over them.

> As such a grid cannot be closely coupled and any codes that are
> developed for a grid will have to take the very high latency
> overheads of a grid into consideration. Not sure what the bandwidth
> of a grid would be like?

That depends only on how fat the pipes you put in are. In Nordugrid there is
gigabit-class bandwidth between (most of) the resources. The latency,
on the other hand, is harder to do anything about.
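
A rough sketch of why the two differ, assuming the simple model
"transfer time = latency + size / bandwidth". The function name and the
LAN/WAN figures are illustrative assumptions, not Nordugrid measurements:

```python
def transfer_time(message_bytes: float, latency_s: float, bandwidth_bps: float) -> float:
    """One-way transfer time under the model: latency + size / bandwidth."""
    return latency_s + (message_bytes * 8) / bandwidth_bps


# Assumed, illustrative numbers (not measurements):
LAN = dict(latency_s=50e-6, bandwidth_bps=1e9)  # gigabit link inside a cluster
WAN = dict(latency_s=20e-3, bandwidth_bps=1e9)  # gigabit link between sites

for size in (64, 1_000_000_000):
    lan = transfer_time(size, **LAN)
    wan = transfer_time(size, **WAN)
    print(f"{size:>13} bytes: LAN {lan:.6f} s, WAN {wan:.6f} s, ratio {wan / lan:.2f}")
```

Under these assumptions the small message is hundreds of times slower
across sites, while the bulk transfer takes almost the same time on
both links, because a fatter pipe shortens only the size / bandwidth
term.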

> So I was wondering just how all those coders out there who are
> developing codes on clusters connected with fast interconnects are
> going to convert their codes to use on a grid - or is there even the
> concept of a highly coupled grid - i.e. grid components that are
> connected via fast interconnections (10Gb ethernet perhaps?) or is
> that still very low in terms of what closely coupled clusters are
> capable of.

There are MPI implementations that run in grid environments, but of
course you might get horrible latency if you have processes running at
different sites.

> Or are people making their clusters available as components of a
> grid, call it a ClusterGrid and in the same way that a grid app
> would specify certain resource requirements - it could specify that
> it should look for an available cluster on a grid.

That is a much more likely scenario for running parallel applications
on a grid, yes.

> However I can't see why establishments who have spent a lot of money
> developing their clusters would then make them available on a grid
> for others to use - when they could just create an account for the
> user on their cluster to run their code on.

It is partly a question of administrative overhead. In a non-grid
situation, if a user gets resources allocated to him at n computing
sites, he typically needs to go through n different account activation
processes. Now, consider a large project like LHC at CERN, where you
have dozens and dozens of participating computing sites and a large
number of users - it's just not feasible to have individual accounts
at individual sites.

Another part is resource location; if you have dozens and dozens of
potential job submission sites, you really don't want to manually
keep track of the current load at the different sites. 

In a grid situation, you just need your grid identity, which is a
member of the project's virtual organization (VO). You only need to submit
your job to the grid, and it will automatically be scheduled on the
least loaded site where your project VO has been granted resources.
(In theory at least. I'm not aware of many grid projects that have
gotten this far. Nordugrid is one, though.)
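
The scheduling idea can be sketched roughly as follows. This is a toy
model, not how Nordugrid or any real grid middleware is implemented;
`Site`, `pick_site`, the site names and the VO names are all
hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    load: float  # fraction of CPUs busy, 0.0 to 1.0
    allowed_vos: set = field(default_factory=set)


def pick_site(sites, vo):
    """Return the least loaded site that grants resources to `vo`, or None."""
    candidates = [s for s in sites if vo in s.allowed_vos]
    return min(candidates, key=lambda s: s.load, default=None)


# Hypothetical sites and virtual organizations, for illustration only.
sites = [
    Site("site-a", 0.90, {"lhc-demo"}),
    Site("site-b", 0.35, {"lhc-demo", "bio-demo"}),
    Site("site-c", 0.10, {"bio-demo"}),
]

chosen = pick_site(sites, "lhc-demo")
print(chosen.name)  # site-b: the idlest site that admits the lhc-demo VO
```

Note that site-c is idler still, but it is skipped because the VO has
no allocation there; the user never has to know either fact.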

> So I was just looking to see if I have my terminology above correct
> for grids and clusters and whether there was any concept of a
> tightly coupled grid or even a ClusterGrid. And if there was any
> useful cross over between clusters and grids - or are the two so
> completely different architecturally that they will never meet; or
> not for the near future at least.

Think of the grid as a generalized way of locating and getting access
to resources in a fluffy, vague "network cloud" of computing
resources.

Clusters are just one type of resource that can be present in the
cloud.

Certain types of applications run best on clusters with high-speed
interconnects - well, then you can use the grid to locate and get
access to suitable clusters.
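
A small sketch of that kind of lookup, assuming a flat catalogue of
resources. The `Resource` type, the `find_clusters` helper, the catalogue
entries and the set of "fast" interconnects are all assumptions made for
illustration; real grid information systems use their own schemas:

```python
from typing import NamedTuple, Optional


class Resource(NamedTuple):
    name: str
    kind: str  # "cluster", "storage", ...
    cpus: int = 0
    interconnect: Optional[str] = None  # e.g. "myrinet", "sci", "ethernet"


FAST_INTERCONNECTS = {"myrinet", "sci"}  # assumed classification


def find_clusters(resources, min_cpus, need_fast_interconnect):
    """Yield clusters that satisfy a job's stated requirements."""
    for r in resources:
        if r.kind != "cluster" or r.cpus < min_cpus:
            continue
        if need_fast_interconnect and r.interconnect not in FAST_INTERCONNECTS:
            continue
        yield r


# Hypothetical catalogue: clusters are just one kind of entry in it.
catalog = [
    Resource("cluster-x", "cluster", cpus=64, interconnect="myrinet"),
    Resource("cluster-y", "cluster", cpus=128, interconnect="ethernet"),
    Resource("archive-z", "storage"),
]

matches = [r.name for r in find_clusters(catalog, min_cpus=32, need_fast_interconnect=True)]
print(matches)  # ['cluster-x']
```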

-- 
Leif Nixon                                    Systems expert
------------------------------------------------------------
National Supercomputer Centre           Linkoping University
------------------------------------------------------------
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


