semaphore problem with mpich-1.2.5

bvds at bvds at
Mon Jul 7 23:13:46 EDT 2003

I have an Opteron system running GinGin64 with 
a 2.4.21 kernel and gcc-3.3.  I compiled
mpich-1.2.5 with --with-comm=shared, but mpirun 
crashes with the error:

 semget failed for setnum = 0

This is a known problem with mpich (see

Has anyone else seen this error?

I found a discussion, reprinted below, by Douglas Roberts at LANL.
His fix worked for me.  Does anyone know of a "real" solution?

Brett van de Sande


I think the reason we get semget errors is that the operating system is not
releasing inter-process communication resources (e.g. semaphores) when a
job is finished. It's possible to do this manually. ...
I wrote the following script, which removes
all the shared memory and semaphore resources held by the user:

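#! /bin/csh
# Try to remove each shared memory segment listed by ipcs.
# NR>4 skips the header lines that ipcs prints before the entries.
foreach id (`ipcs -m | gawk 'NR>4 {print $2}'`)
        ipcrm shm $id
end

# Do the same for each semaphore set.
foreach id (`ipcs -s | gawk 'NR>4 {print $2}'`)
        ipcrm sem $id
end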
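
For anyone who would rather not depend on the number of header lines that ipcs prints, here is a rough Bourne-shell variant of the same idea. It is only a sketch, not part of the original fix. It assumes the Linux (util-linux) ipcs layout, where the columns are key, id, owner, and it assumes an ipcrm that accepts the -m and -s options. The function names owned_ids and cleanup_ipc are made up for illustration.

```shell
#!/bin/sh
# Sketch: remove only the calling user's leftover System V IPC objects.
# Assumes the Linux (util-linux) ipcs column layout: key, id, owner, ...

# Read ipcs output on stdin and print the ids of entries owned by user $1.
# Requiring a hex key and a numeric id skips header and blank lines,
# so no fixed header line count is needed.
owned_ids() {
    awk -v user="$1" \
        '$1 ~ /^0x[0-9a-fA-F]+$/ && $2 ~ /^[0-9]+$/ && $3 == user { print $2 }'
}

# Hypothetical helper: remove the caller's shared memory segments
# and semaphore sets.
cleanup_ipc() {
    me=$(id -un)
    for id in $(ipcs -m | owned_ids "$me"); do
        ipcrm -m "$id"
    done
    for id in $(ipcs -s | owned_ids "$me"); do
        ipcrm -s "$id"
    done
}
```

Nothing is removed until cleanup_ipc is called, for example after a crashed mpirun. To preview what would be removed, run something like: ipcs -s | owned_ids `id -un`. Note that ipcs may truncate long user names in the owner column, in which case the comparison would need adjusting.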


Beowulf mailing list, Beowulf at