data storage location

Robert G. Brown rgb at phy.duke.edu
Sat Sep 13 17:34:13 EDT 2003


On Sat, 13 Sep 2003, Erik Paulson wrote:

> None of the AFS people will tell you it's a bug. AFS is a naturally 
> more file-oriented system - AFS caches whole files, not subblocks of the 
> file, so it makes sense that changes are propagated back to the server only 
> at close() (and hey, watch out - close can fail - how many people actually
> check the return code from close?). AFS is probably a big win for regular
> users, where there's not much active sharing between processes. However,
> if you're using the filesystem for coordination between multiple processes, 
> AFS is not for you.

Well, I did put bug/feature together, or used quotes, because I know
they don't think it is a bug, but I still get/got somewhat annoyed when
standard system calls behave(d) in unexpected ways.  Technically, of
course, fflush only flushes user-level buffers to the kernel, and
leaves it to the kernel to flush them back to the actual file (which it
does according to the dictates of the filesystem and so forth).  So
technically it isn't a real bug, just an arguable/debatable design
decision.  Some of this made more sense a decade ago than it does now,
too, since the overall capacity of a cheap >>client<< today exceeds
that of most of the most expensive >>servers<< in use back then by a
Moore's Law mile.
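
Schematically -- the file name here is made up, and this is only a
sketch of the layering, not an AFS test:

 #include <stdio.h>   /* fopen(), fprintf(), fflush() */

 int main(void)
 {
   FILE *fp = fopen("scratch", "w");   /* made-up file name */

   fprintf(fp, "some data\n");  /* lands in the user-level stdio buffer */
   fflush(fp);                  /* stdio buffer -> kernel, and no further */
   /* when the kernel writes this back to the actual file is up to the
      filesystem -- under AFS, possibly not until the close below */
   fclose(fp);
   return 0;
 }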

The feature can bite the hell out of you anyway, giving a whole new
meaning to the term "race condition" if you're not aware of the
possibility.  One usually thinks of a race condition as being "short",
and NFS is really very good at ensuring that any cached writebacks are
physically written to disk (or serviced out of its cache) if you write
on A and read on B.  It is "difficult" to catch the system in the
window between a write and the point where a read will return the
correct data.  With AFS the race isn't just tortoise and hare; it can
actually be against a creature that is virtually sessile...

It's fixed now anyway, or there is a workaround (I just logged into two
systems that share my campus AFS home directory to try it out).  If you
do

 #include <fcntl.h>   /* open(), O_WRONLY, O_CREAT */
 #include <stdio.h>   /* printf() */
 #include <unistd.h>  /* write(), fsync(), sleep() */

 int fd;

 /* O_CREAT so the test file need not already exist */
 fd = open("dummy", O_WRONLY | O_CREAT, 0644);

 printf("Writing out to dummy.\n");
 write(fd, "This is written out.", 20);  /* 20 chars, no trailing NUL */
 fsync(fd);   /* push the data through the kernel to the file */
 printf("Done.\n");
 sleep(30);   /* leave a window in which to read on system B */

on system A now, the fsync() actually forces a kernel-level
write-through.  fflush() on files opened with fopen() still doesn't
work.  My recollection of the last time I tried this (quite a few years
ago) is that neither of them worked, which I would have (and did) call
a bug.
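
During the 30 second sleep you can look at the file from system B to
see whether the write made it through.  A minimal sketch of the reader
side, assuming the same shared "dummy" path in the AFS home directory:

 #include <fcntl.h>   /* open(), O_RDONLY */
 #include <stdio.h>   /* printf() */
 #include <unistd.h>  /* read(), ssize_t */

 int main(void)
 {
   char buf[64];
   int fd = open("dummy", O_RDONLY);
   ssize_t n = read(fd, buf, sizeof(buf) - 1);

   if (n > 0) {
     buf[n] = '\0';
     printf("visible on B: %s\n", buf);
   } else {
     printf("nothing visible on B yet\n");
   }
   return 0;
 }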

So now you can force a writeback, at the expense of using low-level
file I/O -- unless somebody knows how to get the low-level file
descriptor associated with a high-level file pointer; I've never tried
to do this.
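
POSIX does provide such a beast, though: fileno() returns the file
descriptor underlying a high-level FILE pointer, so in principle you
can fflush() the stdio buffer and then fsync() the result.  A sketch --
I haven't actually verified this one on AFS:

 #include <stdio.h>   /* fopen(), fprintf(), fflush(); fileno() is POSIX */
 #include <unistd.h>  /* fsync() */

 int main(void)
 {
   FILE *fp = fopen("dummy", "w");

   fprintf(fp, "This is written out.\n");
   fflush(fp);          /* user-level stdio buffer -> kernel */
   fsync(fileno(fp));   /* kernel -> file, via the underlying descriptor */
   return 0;
 }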

> > and we have/do/are still meditating upon using AFS
> > (since we have it running) in at least some capacity on a public cluster
> > we're engineering, but it isn't a knee-jerk thing to do EVEN if it is
> > already there, and will definitely require a bit of consideration if you
> > have to set it up from scratch to use it at all.
> > 
> 
> For a shared global namespace filesystem that actually does some sort of 
> authentication, AFS is really the only game in town. I can't imagine doing a
> campus-wide NFS or PVFS setup...

No, we agree, which is why we're thinking seriously about it.  We may
end up doing both -- permitting AFS access to a "head node" so people
can connect their own account space to the cluster across the campus
WAN, while using something else on the "inside" of the head
node/firewall.

Or something else entirely.  I'd welcome suggestions -- we're having a
fairly major meeting on the project in a week or two.  Real-life
experiences of people who use AFS across a cluster would be lovely,
especially tests/benchmarks regarding things like efficiency and system
overhead/load/requirements, and any caveats or gotchas to be overcome
(assuming a longstanding, well-managed campus-wide AFS setup -- ours is
at least seven or eight years old, though I don't really remember when
they started using AFS and it might be a fair bit longer).

   rgb

> 
> -Erik

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


