Parallel Programming

We are going to be honest here. Writing parallel codes is not simple. You can learn the mechanics of writing MPI codes from Jeff Squyres' MPI Monkey column, but what about the computer science? Or more specifically, how are you going to make sure your code runs faster on multiple processors? Join Pavel Telegin and Douglas Eadline as they explain the issues and the answers.

HPC without coding in MPI is possible, but only if your problem fits into one of several high-level frameworks.

[Note: The following updated article was originally published in Linux Magazine in June 2009. The background presented in this article has recently become relevant due to the resurgence of things like genetic algorithms and the rapid growth of MapReduce (Hadoop). It does not cover deep learning.]

Not all HPC applications are created in the same way. There are applications like Gromacs, Amber, OpenFoam, etc. that allow domain specialists to input their problem into an HPC framework. Although there is some work required to "get the problem into the application", these are really application-specific solutions that do not require the end user to write a program. At the other end of the spectrum are the user-written applications. The starting points for these problems include a compiler (C/C++ or Fortran), an MPI library, and other programming tools. The work involved can range from small to large, as the user must concern themselves with the "parallel aspects of the problem". Note: all application software started out at this point some time in the past.

From the Spinal Tap "-011" option working group.

Creating code for GPU accelerators is often not a simple task. In particular, if one wants to convert an application to run on GPUs, the resultant code will often look very different from the original. If the application were a typical HPC code, it would probably be written in Fortran, C, or C++ and use OpenMP or MPI (or both) to express parallelism. A GPU version must be rewritten to use either NVidia CUDA or OpenCL. For the average HPC user, this process can be rather daunting and present a fairly high barrier to GPU use. Fortunately, many of the popular applications (e.g. Amber) have been ported to use NVidia GPUs with great results, although they will only run on NVidia hardware. What about other open applications or user-generated codes?

Yes, mowing one's lawn and HPC have much in common

In many of my previous columns I mentioned Amdahl's Law -- the golden rule of parallel computing. Before you click away, rest assured I have no intention of talking specifically about Amdahl’s Law and I promise not to place a single equation or derivation in this column. Oftentimes people are put off by Amdahl’s Law. Such discussions usually start with an equation and talk of the limit as N goes to infinity. Not to worry. There are no formulas, no esoteric terms (sorry, no big words), just the skinny on the limits of parallel computing. I’ll even go one further: I’ll hardly mention parallel computers, multi-core, and other such overworked topics. In this article, I’ll discuss lawn care.

You really should meet Julia

A long time ago, BASIC was "the language" in the PC world. There were other languages of course, but BASIC was, well, "basic," and it was the only thing beyond machine code for many early PC enthusiasts. The name was an acronym for "Beginner's All-purpose Symbolic Instruction Code." New users could easily start writing programs after learning a few commands. It was fun and easy. Many scientists and engineers taught themselves BASIC. Of course, many would have argued at the time that real scientific programs were written in BASIC's big brother FORTRAN, but FORTRAN was a different world in terms of development cycle and hardware. There was this thing called a compiler that had to be run before you could execute your program. BASIC, on the other hand, seemed to keep track of your code and would just "RUN" whenever you wanted. Of course you might get some errors, but the "edit, run, edit" cycle was rather short and allowed one to easily play with the computer.

A hard to swallow conclusion from the table of cluster

A confession is in order. The last two (1, 2) installments of this column have been a sales pitch of sorts. If you believe some of the things I talked about (large clusters will break, and applications will need to tolerate failure and be easy to write), then you may agree that dynamic programming algorithms are one method to solve these problems. The next question is, how do we implement these ideas?

The answer is the part many may find distasteful. If you are one of the brave few who take pause to think about how we are going to achieve pervasive cluster (parallel) computing, then take a bite of this column. The rest of you weenies should at least nibble at the edges.

Creative Commons License
©2005-2016 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.