Need comments about cluster file systems
Philippe Blaise - GRENOBLE
pblaise at cea.fr
Fri Nov 15 04:10:53 EST 2002
"Walter B. Ligon III" wrote:
> > > > (For sure others will point you to PVFS, which IMHO makes sense only
> > > > if network card is quicker than local disk.)
> > >
> > > It's frustrating to hear people talk about how wonderful InterMezzo and
> > > Lustre _are_, and dismiss PVFS and GFS. Software that is not quite
> > > finished is always better and faster than software that already
> > > exists. It only loses speed and features when reality looms.
> > >
> > > Talk about vaporware and deployed systems in different categories unless
> > > you clearly use the future tense.
> > I quite like PVFS, but I think it does not solve the problem. AFAIK it
> > can get speed of NIC. I want the speed of local harddisk, which is
> > much bigger with my hardware. But again, I am ready for any
> > enlightenment.
> Well, the point is there isn't one kind of file system that serves everyone's
> needs. If your application is such that you know where your data needs to be
> beforehand and you can run your computations on the same node, then you don't
> need anything more than the local file system. If you need to be able to
> access any data from any node with a parallel application, that is what PVFS
> was designed for. If the data you need is on the local disk, PVFS gives you
> local disk speeds. If it's not, you are limited by the network speed. There
> is no way around that.
> The original poster who started this didn't specify what kind of applications
> he was considering, thus it seems rather impolite to respond assuming that
> HIS needs are the same as YOUR needs - especially when that involves trashing
> someone else's work.
I don't think you are impolite ?
In fact the discussion started yesterday in my office with two of my colleagues.
We are working on PC clusters and a Compaq SC machine.
Let's say that from a hardware
point of view, the Compaq SC solution is no more than an Alpha cluster
with a fast interconnect (QSW). The OS is Tru64, and in our opinion,
the only significant difference from a Linux solution is the file systems:
(S)CFS and PFS (cluster and parallel file systems).
From a user's point of view, CFS gives a single system view of the files,
uses the QSW NIC to access disks via some file servers, uses local memory caching,
blablabla (you should be able to find a lot about this on the web),
and PFS is able to manage parallel access to the file servers, which gives you more
throughput of course.
What about PC clusters? Here, our dear users use local disks! They seem to be
happy with that
(one day one of my colleagues tried to install PVFS over SCI, but it was a disaster),
so we are a little bit disappointed. Why are HPC vendors making such efforts
if most user programs can use local disks, plus an ftp to a remote file server
at the end of their runs?
That's why I posted my super naive email yesterday.
Are there any figures on PVFS performance over SCI and/or Myrinet?
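
In the absence of published figures, a rough comparison can be made on one's own
hardware. The following is only a minimal sketch, not a proper benchmark: the
function name measure_throughput and its parameters are made up for illustration,
and it simply times one sequential write and one sequential read-back of a
scratch file in a given directory. The read figure will usually be inflated by
the page cache unless the file is larger than RAM or the cache is dropped first.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=64, block_kb=1024):
    """Write, then read back, one scratch file in `directory`.

    Returns (write_mb_per_s, read_mb_per_s). Hypothetical helper, for
    illustration only; it is not part of PVFS or any other package.
    """
    block = b"\0" * (block_kb * 1024)
    n_blocks = (size_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Sequential write, timed up to and including the fsync.
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        write_s = time.perf_counter() - start

        # Sequential read-back (likely served from the page cache).
        start = time.perf_counter()
        total = 0
        with open(path, "rb") as f:
            while True:
                chunk = f.read(len(block))
                if not chunk:
                    break
                total += len(chunk)
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the scratch file

    mb = total / (1024 * 1024)
    return mb / write_s, mb / read_s
```

Calling it once on a local scratch directory and once on the mount point of the
parallel file system (both paths depend on the installation) gives two pairs of
numbers to compare. A single client doing sequential I/O says nothing about
aggregate throughput with many nodes, which is what a parallel file system is for.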
> The original poster said something about his experience in the world of
> supercomputers and MPPs there were parallel file systems, so one MIGHT think
> that in fact he DOES have applications relevant to PVFS - but who knows, maybe
> he'll tell us.
> In the meantime, since there aren't any good open source solutions to the
> cached file system problem, why don't you work on it yourself? Our newest
> version of PVFS is designed so that things like that can be added as modules.
> We don't have the manpower to solve ALL of the problems, so we are trying to
> at least create a framework for collaboration.
> Dr. Walter B. Ligon III
> Associate Professor
> ECE Department
> Clemson University
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf