[Beowulf] High Performance for Large Database
ctierney at hpti.com
Tue Nov 16 12:54:29 EST 2004
On Tue, 2004-11-16 at 02:01, Laurence Liew wrote:
> From what I understand it has to do with "locking" on the SAN devices
> by the GFS drivers.
> Yes.. you are right.. most implementations will have separate IO and
> compute nodes... in fact that is the recommended way.... what I had
> meant in my earlier statement was that "I prefer data to be distributed
> amongst nodes - IO nodes - rather than have them centralised in a single
> SAN backend."
Do you have an issue with a single storage unit, or actually using
a SAN? You could connect, dare I say "cluster", smaller FC based
storage units together. You will get much better price/performance
than going with larger storage units. This solution would work
for shared filesystems like GFS, CXFS, or StorNext. You could
connect the same units directly to IO nodes for distributed filesystems
like Lustre, PVFS1/2, or Ibrix.
> GFS + NFS is painful and slow as you have experienced it... hopefully
> RHEL V4 will bring about better performance and new features to address
> HPC by GFS (unlikely but just hoping).
> Craig Tierney wrote:
> > On Mon, 2004-11-15 at 06:26, Laurence Liew wrote:
> >>The current version of GFS has a 64 node limit.. something to do with
> >>maximum number of connections thru a SAN switch.
> > I would suspect the problem is that GFS doesn't scale past
> > 64 nodes. There is no inherent limitation in Linux on the
> > size of a SAN (well, if there is, it is much larger than 64 nodes).
> > Other shared filesystems, like StorNEXT and CXFS, are limited
> > to 128 nodes due to scalability reasons.
> >>I believe the limit could be removed in RHEL v4.
> >>BTW, GFS was built for enterprise and not specifically for HPC... the
> >>use of SAN (all nodes need to be connected to a single SAN storage)..
> >>may be a bottleneck...
> >>I would still prefer the model of PVFS1/2 and Lustre where the data is
> >>distributed amongst the compute nodes
> > You can do this, but does anyone do it? I suspect that most
> > implementations are set up so that the servers are not on the compute
> > nodes. This provides for more consistent performance across
> > the cluster. Also, are you going to install redundant storage in
> > all of your compute nodes so that you can build a FS across the
> > compute nodes? Unless the FS is for scratch only, I don't want
> > to have to explain to the users why the system keeps losing their data.
> > Even if you use some RAID1 or RAID5 ATA controllers in a
> > few storage servers, you are going to be able to build a faster and
> > more fault-tolerant system than just using disk in the compute nodes.
> >>I suspect GFS could prove useful however for enterprise clusters say 32
> >>- 128 nodes where the number of IO nodes (GFS nodes with exported NFS)
> >>can be small (less than 8 nodes)... it could work well
> > I had some experience with a NFS exported GFS system about 12
> > months ago and it wasn't very pleasant. I could feel the latency
> > in the meta-data operations when accessing the front ends of the
> > cluster interactively. It didn't surprise me because other experiences
> > I have had with shared file-systems have been similar.
> > Craig
> >>Chris Samuel wrote:
> >>>On Wed, 10 Nov 2004 12:08 pm, Laurence Liew wrote:
> >>>>You may wish to try GFS (open sourced by Red Hat after buying
> >>>>Sistina)... it may give better performance.
> >>>Anyone here using the GPL'd version of GFS on large clusters ?
> >>>Be really interested to hear how folks find that..
> >>>Beowulf mailing list, Beowulf at beowulf.org
> >>>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf