[Beowulf] Can one Infiniband net support MPI and a parallel filesystem?

Ashley Pittman apittman at concurrent-thinking.com
Wed Aug 13 06:29:05 EDT 2008

On Tue, 2008-08-12 at 12:09 -0600, Craig Tierney wrote:
> Chris Samuel wrote:
> > ----- "I Kozin (Igor)" <i.kozin at dl.ac.uk> wrote:

> > But that assumes you're not sharing a node with other
> > jobs that may well be doing I/O.
> > 
> I am wondering, who shares nodes in cluster systems with
> MPI codes?

In my experience, almost everyone.  In practice though, most jobs ask for
even numbers of CPUs, so larger jobs rarely get scheduled this way.

>  We never have shared nodes for codes that need
> multiple cores since we built our first SMP cluster
> in 2001.  The contention for shared resources (like memory
> bandwidth and disk IO) would lead to unpredictable code performance.

Unpredictable maybe, but if the alternative is to not run at all then
it's still a win.  What you wouldn't want is a small number of
processes from a big job sharing a node with a resource-hogging job and
slowing down the entire big job; however, I've never seen this happen
in the wild.

> Also, a poorly behaved program can cause the other codes on
> that node to crash (which we don't want).

It goes without saying that this shouldn't be able to happen.

