Semaphore controlling access to resource in OpenPBS? SGE?

hanzl at noel.feld.cvut.cz
Fri Sep 5 11:03:05 EDT 2003


>> semaphore/mutex locking access to the directory

> If you use SGE, this howto will provide the info:
>
> http://gridengine.sunsource.net/project/gridengine/howto/resource.html

But to create a true mutex you want a slightly different configuration:

 - a 'consumable' resource whose total available amount is just '1'
 - each job requests amount '1' of this resource

This way the resource is exhausted by the first job requesting it and
becomes available again once that job has finished. The decision is
made in the central scheduler, so it is a safe mutex.

I think this howto is a closer match to this task:

http://gridengine.sunsource.net/project/gridengine/howto/consumable.html

(It takes some time to get used to GE's system of resources,
consumables, various default values etc., but it is well worth it:
they are nice building blocks for many useful things.)
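
To make the consumable-as-mutex idea above concrete, here is a minimal
toy model in Python. It is only a sketch of the bookkeeping a central
scheduler would do, not SGE code: the class ToyScheduler and its
methods submit() and finish() are hypothetical names invented for this
illustration, and FIFO dispatch is an assumption.

```python
from collections import deque


class ToyScheduler:
    """Toy model of a central scheduler tracking one consumable resource.

    Hypothetical illustration only; this is not how SGE is implemented.
    """

    def __init__(self, capacity=1):
        self.capacity = capacity   # total amount of the consumable
        self.in_use = 0            # amount held by running jobs
        self.pending = deque()     # (job, request) pairs waiting, FIFO
        self.running = {}          # job -> amount it holds

    def submit(self, job, request=1):
        """Queue a job that needs `request` units of the consumable."""
        self.pending.append((job, request))
        self._dispatch()

    def finish(self, job):
        """Mark a running job as done and give its units back."""
        self.in_use -= self.running.pop(job)
        self._dispatch()

    def _dispatch(self):
        # Start pending jobs in order while the head job's request fits.
        while self.pending and self.in_use + self.pending[0][1] <= self.capacity:
            job, request = self.pending.popleft()
            self.in_use += request
            self.running[job] = request


sched = ToyScheduler(capacity=1)
sched.submit("job_a")            # gets the single unit and starts
sched.submit("job_b")            # has to wait, the resource is exhausted
assert set(sched.running) == {"job_a"}
sched.finish("job_a")            # unit is returned, job_b starts
assert set(sched.running) == {"job_b"}
```

Because all accounting happens in one place, two jobs can never hold
the single unit at the same time. In SGE itself the request would be
made with the -l option of qsub, naming whatever consumable you
defined.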


I would, however, still worry about the coherency of things cached by
the network filesystem, although some slow setups of, say, NFS are
probably nearly safe.

I personally think there is one great way of doing this:

 - The scheduler should know that one job needs to see filesystem changes
 made by another job (there are job dependencies specified via -hold_jid,
 or there is a 'directory mutex')

 - The scheduler should be able to ask the network filesystem "please
 propagate cached changes made on node A so that they are visible on node B"

This would IMHO solve the speed/consistency dilemma for many practical
purposes. It should be easy to implement the SGE part of this trick,
but I am not aware of any network filesystem being able to "cache a
lot, propagate on demand" as described above.
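
As a rough illustration of what "cache a lot, propagate on demand"
could mean, here is a toy Python model. Everything in it is
hypothetical (the class ToySharedFS, its methods and the node names);
it does not describe any existing filesystem or any SGE interface, it
only shows the behaviour the scheduler would be asking for.

```python
class ToySharedFS:
    """Toy model: nodes cache aggressively, changes move only on request."""

    def __init__(self):
        self.server = {}   # path -> contents as stored on the file server
        self.dirty = {}    # node -> {path: contents} not yet written back
        self.cache = {}    # node -> {path: contents} read cache, may be stale

    def write(self, node, path, data):
        # Writes stay in the writing node's local cache.
        self.dirty.setdefault(node, {})[path] = data

    def read(self, node, path):
        # A node sees its own unflushed writes first.
        if path in self.dirty.get(node, {}):
            return self.dirty[node][path]
        # Otherwise it serves from its read cache, filling it from the
        # server on a miss and never revalidating on its own.
        cache = self.cache.setdefault(node, {})
        if path not in cache:
            cache[path] = self.server.get(path)
        return cache[path]

    def propagate(self, src, dst):
        """The on-demand step: make src's changes visible on dst."""
        self.server.update(self.dirty.pop(src, {}))  # write back from src
        self.cache.pop(dst, None)                    # drop dst's stale cache


fs = ToySharedFS()
fs.server["/shared/result"] = "old"
assert fs.read("B", "/shared/result") == "old"   # B caches the old contents
fs.write("A", "/shared/result", "new")           # change stays cached on A
assert fs.read("B", "/shared/result") == "old"   # B still sees stale data
fs.propagate("A", "B")                           # scheduler-triggered sync
assert fs.read("B", "/shared/result") == "new"
```

In this picture the scheduler would call something like propagate()
once, after the job on node A finishes and before the dependent job
starts on node B, and both nodes would be free to cache heavily the
rest of the time.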

Regards

Vaclav Hanzl
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org