data storage location

Sat Sep 13 11:10:11 EDT 2003

On Sat, Sep 13, 2003 at 10:13:54AM -0400, Robert G. Brown wrote:
> On Sat, 13 Sep 2003, John Hearns wrote:
> 
> > On Fri, 12 Sep 2003 hanzl at noel.feld.cvut.cz wrote:
> > 
> > > 
> > > Cache-like behavior would save a lot of manual work but unfortunately
> > > I am not aware of any working solution for linux, I want something
> > > like cachefs (nonexistent for linux) or caching ability of AFS/Coda
> > > (too cumbersome for cluster) or theoretical features of
> > Why do you say AFS is too cumbersome?
> > I'm just saying that to provoke a debate - I've never actually set up
> > an AFS infrastructure from scratch (kerberos, kernel patches etc...)
> > but I know it's not an afternoon's work.
> > I have had call to work closely with AFS, doing a bit of performance
> > tuning for caches etc.
> 
> AFS is in wide use at Duke (it is the basis of the students' globally
> shared home directory system on campus).  It is doable administratively,
> although as you note it isn't an afternoon's work.  However, it isn't
> likely to be a really good cluster fs.  Reasons:
> 
>   a) It is even less efficient and more resource intensive than NFS
> (unsurprising, as it does more)
> 
>   b) It can (unless you take care with e.g. kerberos tickets) produce
> "odd" and undesirable behavior such as losing the ability to write to a
> file you are holding open after x hours unless/until you
> re-authenticate, blocking the process that owns the open file
> 

First of all, you should be relying on your batch system to manage your job's
credentials for you. Second, you should be prepared to deal with a blocked
write anyway: an NFS server can go away at any time too.

>   c) Its caching behavior is insane and (in my opinion) unreliable, at
> least as of the last time I tried it.  By this I mean that I have
> directly experienced the following AFS bug/feature:
> 
>    System A opens file output in AFS directory
>    System A writes lots of stuff into output
>    System A fflushes and keeps running, output is still open
>    System B wants to graze the current contents of output, so it opens
>    output for reading.
> 
> What does system B find?  According to standard/reliable programming
> conventions, the use of fflush should force a write of all cached data
> back to the file.  AFS, however, only flushes the data (apparently) back
> to its LOCAL cache/image of the file on system A, and only resyncs with
> the global image when the file is closed.  So System B will read nothing
> of what System A has written until A closes the file.
> 
> So sure, system A could close and open the file repeatedly, but this
> adds a lot of overhead as open/close is expensive (open requires a full
> stat, checking permissions, etc.).
> 
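
A minimal sketch of that close-and-reopen workaround, in Python for brevity.
The helper name append_and_close is made up for illustration, and the idea
that close() is the point where AFS stores data back to the server comes from
this thread; the code itself runs on any filesystem and does not verify it.

```python
import os


def append_and_close(path, text):
    """Append text to path, then close the file right away.

    Hypothetical helper: on a store-on-close filesystem (AFS, as described
    in this thread) the close is what makes the data visible to other hosts.
    """
    f = open(path, "a")
    try:
        f.write(text)
        f.flush()  # hands Python's buffer to the OS; not a store to the server
    finally:
        f.close()  # per the thread, AFS writes back to the server here


def read_current(path):
    """Reader side: open, read everything written so far, close."""
    if not os.path.exists(path):
        return ""
    with open(path) as f:
        return f.read()
```

System A would call append_and_close("output", chunk) once per batch of
results rather than once per line, which keeps the open/close overhead down.
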
> Most of this is manageable and one can learn how to work with the system
> in a cluster environment for at least certain classes of task, but it is
> (as the man says:-) "cumbersome", as to a certain extent is
> administration (file acls are more powerful but also require more
> thought and human energy, etc).  Too cumbersome is up to personal
> judgement.  Many clusters use a user filesystem only to launch tasks and
> permit a single results file to eventually be written back -- "anything"
> can be made to work there, and AFS would work as well as NFS or
> whatever.  Other cluster tasks might well be unsuitable as in my system
> A/B example, although with foreknowledge this can be coped with.
> 
> Hope this helps.  I personally no longer use AFS so perhaps the fflush
> "bug" has been fixed,

None of the AFS people will tell you it's a bug. AFS is by design a more
file-oriented system: it caches whole files, not subblocks of a file, so it
makes sense that changes are propagated back to the server only at close().
(And hey, watch out: close can fail. How many people actually check the
return code from close?) AFS is probably a big win for regular users, where
there's not much active sharing between processes. However, if you're using
the filesystem for coordination between multiple processes, AFS is not for
you.
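
For what checking close looks like in practice, here is a hedged sketch in
Python. The function name write_and_close is invented for this example. In
Python, os.close() raises OSError instead of returning a code as close() does
in C, so "checking the return code" means not swallowing that exception.

```python
import os


def write_and_close(path, data):
    """Write bytes to path; raise OSError if write() or close() fails."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
    except BaseException:
        try:
            os.close(fd)
        except OSError:
            pass  # keep the original error, not the secondary close failure
        raise
    # On a store-on-close filesystem this is where a failed write-back would
    # show up, so let the OSError reach the caller.
    os.close(fd)
```

A caller that treats the results as safely stored only after write_and_close
returns without raising is doing the check most programs skip.
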

> and we have/do/are still meditating upon using AFS
> (since we have it running) in at least some capacity on a public cluster
> we're engineering, but it isn't a knee-jerk thing to do EVEN if it is
> already there, and will definitely require a bit of consideration if you
> have to set it up from scratch to use it at all.
> 

For a shared global namespace filesystem that actually does some sort of 
authentication, AFS is really the only game in town. I can't imagine doing a
campus-wide NFS or PVFS setup...

-Erik