[Beowulf] WRF model on linux cluster: Mpi problem

John Hearns john.hearns at streamline-computing.com
Mon Jul 4 03:48:47 EDT 2005


On Fri, 2005-07-01 at 09:38 +0200, Federico Ceccarelli wrote:
> yes,
> 
> I will remove openmosix.
> I patched the kernel with openmosix because I also use the cluster for
> other, smaller applications, so the load balancing was useful to me.
> 
> I already tried to switch off openmosix with
> 
> > service openmosix stop
I have a small amount of openMosix experience, and that should work.

Have you used the little graphical tool that displays the load on each
node? (I can't remember the name.)

Anyway, I go along with the earlier advice to look at the network card
performance.
Run lspci -vv on every node to check that your riser cards are running
at full speed.
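
One way to run that check across the cluster is sketched below. It
assumes passwordless ssh to each node, and the node names in the usage
line are hypothetical. The filter keeps only the Ethernet controller
entries and their Status/LnkSta lines, so widen the pattern if your
cards report bus speed on a different line.

```shell
#!/bin/sh
# Reduce `lspci -vv` output (read from stdin) to the Ethernet
# controllers and the lines that mention bus or link status.
nic_summary() {
    awk '
        # Device header lines start at column 0 with the bus address.
        /^[0-9a-f]/ { show = ($0 ~ /Ethernet controller/); if (show) print; next }
        # Indented detail lines belong to the most recent device.
        show && /(Status:|LnkSta:)/ { print }
    '
}

# Run the summary on each node given as an argument.
# Assumption: passwordless ssh works for every node name passed in.
check_nodes() {
    for n in "$@"; do
        echo "== $n =="
        ssh "$n" lspci -vv | nic_summary
    done
}

# Usage (hypothetical node names): check_nodes node01 node02 node03 node04
```

Comparing the output node by node should make a card that negotiated a
slower bus stand out.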

What I would do is break this problem down.
Start by running the Pallas benchmark on one node, then two, then four,
and so on, and see if a pattern develops.
Do the same with your model, if it is possible to cut down the problem
size: run on one node (two processors), then two, then four.
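
The doubling runs could be scripted roughly as follows. This is only a
sketch: it assumes two processors per node as above, an mpirun that
accepts -np, and a benchmark binary path that you supply (the PMB-MPI1
name in the usage line is an assumption about how your Pallas build is
named). How processes are placed on nodes depends on your MPI setup and
is not handled here.

```shell
#!/bin/sh
# Assumption: two processors per node, as in the message above.
PROCS_PER_NODE=2

# Print the process counts for 1, 2, 4, ... nodes, stopping at max_nodes.
proc_counts() {
    max_nodes=$1
    nodes=1
    while [ "$nodes" -le "$max_nodes" ]; do
        echo $((nodes * PROCS_PER_NODE))
        nodes=$((nodes * 2))
    done
}

# Run a benchmark at each process count, writing one log file per run.
# $1 = largest node count to try, $2 = path to the benchmark binary
run_scaling() {
    for np in $(proc_counts "$1"); do
        echo "running $2 on $np processes"
        mpirun -np "$np" "$2" > "scaling_np${np}.log" 2>&1
    done
}

# Usage (hypothetical binary path): run_scaling 8 ./PMB-MPI1
```

The same loop can be pointed at the reduced model run instead of the
benchmark, so the two sets of logs can be compared side by side.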
