data storage location

Nicholas Henke henken at seas.upenn.edu
Sat Sep 13 09:16:28 EDT 2003


On Sat, 2003-09-13 at 05:34, John Hearns wrote:
> On Fri, 12 Sep 2003 hanzl at noel.feld.cvut.cz wrote:
> 
> > 
> > Cache-like behavior would save a lot of manual work but unfortunately
> > I am not aware of any working solution for linux, I want something
> > like cachefs (nonexistent for linux) or caching ability of AFS/Coda
> > (too cumbersome for cluster) or theoretical features of
> Why do you say AFS is too cumbersome?
> I'm just saying that to provoke a debate - I've never actually set up
> an AFS infrastructure from scratch (kerberos, kernel patches etc...)
> but I know it's not an afternoon's work.
> I have had call to work closely with AFS, doing a bit of performance
> tuning for caches etc.

Ok, I'll bite :) We have been exploring filesystem options here, trying
to get away from NFS. Our 'architecture' is 128 nodes with FastE, and 4
IO servers, each with ~600 GB of RAID5 and GigE.  

At first glance, AFS (really OpenAFS in this case) seemed like a dream
come true. It offered multiple file servers in a global namespace,
support for doing amanda backups, and a way for users to export their
own data to any machine that is an AFS client -- even across the
country/world.

Now... for the practical side. When I went to implement it, I was
unaware how much of a pain it could be to set up the kerberos stuff. At
Penn we have a campus-wide krb5 setup, and getting users to use krb5
logins was not a problem. However, OpenAFS needs a krb4 ticket to work,
and Penn will not add the krb524 translator. To get around this, it was
necessary to do a bunch of hacks to store the krb4 ticket on the machine
itself, and tell krb5 that localhost was a valid krb524 server. This may
not sound too bad, but figuring it out and then implementing it
correctly was such a major PITA that I almost dumped the project there.
Once it was talking to the kerberos setup just fine, it became very
apparent that having your ticket time out was a very annoying problem.
Basically you have to do some really ugly script/hack to have a program
remember all of the users' passwords, either in memory or in a
gpg-protected file, periodically check for a ticket that is about to
expire, and re-klogin for that user. Not pretty.

The second issue, and the real deal killer, is the lack of support for
files >2GB. Absolutely unacceptable. We do natural language processing
on one cluster, and genomics on another, with files that regularly
exceed 2GB.
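
The "check for a ticket that is about to expire and log in again" hack
described above can be sketched roughly as follows. This is only a
minimal illustration, not the script actually used: the names
needs_renewal and renew_expiring, the 30-minute margin in the usage
note, and the renew callback are all assumptions made for the example.
The callback stands in for whatever actually obtains a fresh ticket
(for instance a wrapper around the AFS login command), and password
storage is deliberately left out.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List


def needs_renewal(expires_at: datetime, now: datetime, margin: timedelta) -> bool:
    """Return True if the ticket has expired or will expire within `margin`."""
    return expires_at - now <= margin


def renew_expiring(
    tickets: Dict[str, datetime],
    now: datetime,
    margin: timedelta,
    renew: Callable[[str], datetime],
) -> List[str]:
    """Renew every ticket that is close to expiry.

    `tickets` maps a user name to that user's ticket expiry time.
    `renew` is a hypothetical callback that gets a new ticket for the
    user and returns its expiry time. Returns the users that were
    renewed, in sorted order.
    """
    renewed = []
    for user, expires_at in sorted(tickets.items()):
        if needs_renewal(expires_at, now, margin):
            tickets[user] = renew(user)
            renewed.append(user)
    return renewed
```

A periodic job (cron, or a loop with a sleep) would call
renew_expiring(tickets, datetime.now(), timedelta(minutes=30), renew)
so that tickets are refreshed shortly before they run out rather than
after a job has already failed.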

So... in conclusion: IF OpenAFS could support >2GB files, and the krb5
mess could be cleaned up, it might work. For now, we will be installing
& playing with Lustre.
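
For anyone wanting to see how many of their files would hit the 2GB
ceiling, here is a small sketch. It assumes the limit corresponds to a
signed 32-bit file offset (2**31 - 1 bytes), which is the usual origin
of a 2GB cap but is not something stated above for OpenAFS
specifically. The function names are made up for the example.

```python
from typing import Dict, List

# Largest size addressable with a signed 32-bit offset (assumed limit).
MAX_SIGNED_32BIT_OFFSET = 2**31 - 1  # 2147483647 bytes


def fits_in_32bit_offset(size_bytes: int) -> bool:
    """Return True if a file of this size stays within the assumed limit."""
    return 0 <= size_bytes <= MAX_SIGNED_32BIT_OFFSET


def oversized_files(sizes: Dict[str, int]) -> List[str]:
    """Given a mapping of path -> size in bytes, list the paths that are too big."""
    return sorted(path for path, size in sizes.items()
                  if not fits_in_32bit_offset(size))
```

The sizes mapping could be built with os.path.getsize over the data
directories of interest.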

Nic
-- 
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


