Cluster file systems are hot. What good are 1000 processors if you can't write to a file without clogging your network or server? Learn about the issues and experiences of parallel file systems with Distinguished Cluster Monkey Jeff Layton as your guide.

An update to our previous story on FhGFS

Released on March 10th, 2014, the new major version of Fraunhofer’s parallel file system (FhGFS) comes with two significant new features. While last year's major release focused primarily on metadata server improvements, the new release focuses on performance optimizations on the storage server side. Under the hood, the storage servers now create a new data layout that groups chunk files by user and by time. This approach significantly reduces disk seeks by improving the cache efficiency of the underlying server file system, and it helps to avoid aging-related performance reductions. More visible to the user is a new option for connection-based authentication between clients and servers using a pre-shared secret.

Crafted in Germany, FhGFS is ready to take on the world's biggest IO challenges

The Fraunhofer Parallel File System (FhGFS) is the high-performance parallel file system of the Fraunhofer Institute for Industrial Mathematics in Kaiserslautern, Germany. It includes a distributed metadata architecture that has been designed to provide the scalability and flexibility required to run today's most demanding HPC applications while being easy to use and manage.

Lots of bits in lots of places

In the final part of our world's biggest and best Parallel File Systems Review, we take a look at object-based parallel file systems. Although this installment stands on its own, you may find it valuable to take a look at Part One: The Basics, Taxonomy and NFS and Part Two: NAS, AoE, iSCSI, and more! to round out your knowledge. Let's jump in! And don't miss the biggest and best summary table I ever created on the last page.

Ever hear of an exabyte?

In the second part of File Systems O'Plenty, we take a look at NAS, Distributed File Systems, AoE, iSCSI, and Parallel File Systems. In case you missed part one, you can find it here. In this part, we will also point out why IO is important in HPC clustering. Many a CPU cycle is wasted waiting for that data block. Read on to learn how to feed your data appetite.

Storage: it's where we put things

Clusters have become the dominant type of HPC system, but that doesn't mean they are perfect (sounds like a Dr. Phil show, doesn't it?). While you get a huge bang for the buck from them, somehow you have to get the data to and from the processors. Moreover, some applications have fairly benign IO requirements, while others need really large amounts of IO. Regardless of your IO requirements, you will need some type of file system for your cluster.

I wrote a file system/storage survey article for clusters in the past, but as always, things change rather rapidly in the HPC arena. Originally, I had wanted to update that article; however, the updates became so large that it's really an entirely new article. So this article, I hope, is a bit more in depth and a bit more helpful than the past file system article.




This work is licensed under CC BY-NC-SA 4.0

©2005-2023 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.