# [Beowulf] Parallel Programming Question

Joe Landman landman at scalableinformatics.com
Tue Jul 17 07:39:43 EDT 2007

```Hi James:

James Roberson wrote:
> Hello,
>
> I'm new to parallel programming and MPI.  I've developed a simulator in
> C++ for which I would
> like to decrease the running time by using a Beowulf cluster.  I'm not
> interested in optimizing
> speed, I'm just looking for a quick and easy way to significantly
> improve the speed over running
> the program on a single machine.

I am not sure I grasp "I'm not interested in optimizing speed, I'm just
looking for a quick and easy way to significantly improve the speed".

Basically I think you are saying you want to run it in parallel, in
order to speed up the program.  Parallelization is an optimization.

> Basically, I need to learn how to parallelize a C++ function.  The
> following functions in particular
> take the longest to run in my simulator.  The first implements LU
> decomposition on a large matrix and
> the second implements the backsubstitution method to solve matrix division.

First, how large are your matrices?  Second, how many times is this
called?

>
> void NR::ludcmp(Mat_IO_DP &a, Vec_O_INT &indx, DP &d)
> {
>     const DP TINY=1.0e-20;
>     int i,imax,j,k;
>     DP big,dum,sum,temp;
>
>     int n=a.nrows();
>     Vec_DP vv(n);
>     d=1.0;
>     for (i=0;i<n;i++) {
>         big=0.0;
>         for (j=0;j<n;j++)
>             if ((temp=fabs(a[i][j])) > big) big=temp;

(warning: coffee impaired, may not be correct in all aspects, notation,
typing, rambling ....)

abs is a slow function.  Think about mathematical equivalents.  abs(x)
maps x into the [0,) range.  Multiplication is much faster than abs.  If

temp=fabs(a[i][j]);
if ( (temp*temp) > (big*big) ) big = temp;

might be a bit faster.

>         if (big == 0.0) nrerror("Singular matrix in routine ludcmp");
>         vv[i]=1.0/big;

Second, you might want to define delta = some_small_number  and compare
big to that.  You would get a catastrophic loss of precision if you
divided by a very small number, and amplify error in the process.
Renormalization/scaling can be your friend, as well as row manipulation.

That said, I would suggest avoiding re-inventing wheels.  There are a
number of excellent LU-decomposition and back substitution codes out
there ...

>
> (The functions are borrowed from the library provided by "Numerical
> Recipes in C++")
> I'm currently calling these functions from the main loop with the lines:

ah.... that explains it...

>
> NR::ludcmp(c,indx,d);
>
> and
>
> NR::lubksb(c,indx,xv);
>
> where the variable 'c' is a large matrix (holding image pixel values)
> and 'xv' is a vector
> used in backsubstitution.
> All of the variables passed into these functions are of types defined in
> " nr.h"
> 'c' is a Mat_DP  (double-precision matrix)
> 'indx' is a Vec_INT  (integer vector)
> 'd' is a DP    (double precision)
> 'xv' is a Vec_DP  (double precision vector)
>
> Is there a simple way to call these functions which will cause the
> cluster to distribute the
> load of operations?  Currently, when I run the program with a 50x50
> array on a single machine,
> it takes about 5 minutes to process a single iteration through the
> matrix division.

Huh?  What machine?  What compiler?  What options?

(more lack of coffee induced stupor here)

This should take well under 1 second.  O(N**3) operations (50*50*50 =
0.125E+6 operations) on a 1 GFLOP machine ( 1E+9 FLOPS) should be
achievable in well under 1 second (ball park of 0.125/1000 seconds times
a constant)

Maybe I am missing something.  This *should* be fast.  Unless you are
running on a Palm pilot or something like that ...

In octave, using an interpreted LU decomposition on a hilbert matrix

h=hilb(50);
a=time ; [l,u,p] = lu(h); b=time; b-a
ans = 0.00063705

this is ball-park what I had estimated, within a constant factor
(cough).  This is an Opteron 242 with 4 GB ram, so it isn't a speed demon...

--

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

!DSPAM:469cab2b226721923983023!

```