Home
Learning About Clusters
Programming Clusters
Administering Clusters
Benchmarking Clusters
File Systems for Clusters
Cluster Applications/Grid
Cluster News
Site Map
 
    Home
Search
Main Menu
Home
News
Features
Columns
Reviews
Links
FAQ's
Contact
Site Information
Cluster Classifieds
Projects
Conference Reports
Cluster Agenda
Site Map
Add This Article

Cluster Agenda

Cluster Builder

Appro International


Benchmarking Parallel File Systems PDF Print E-mail
Written by Jeff Layton   
Thursday, 22 December 2005
Article Index
Benchmarking Parallel File Systems
Page 2
Page 3

b_eff_io

The first one I'll examine is b_eff_io, or "Effective I/O Bandwidth Benchmark." In a general sense, this benchmark measures the time it takes to transfer data from a location in memory to a location in a file, which can then be used to compute the effective bandwidth. There are a huge number of parameters that could influence the time for this transfer. The benchmark developers chose to classify the parameters of a parallel file system benchmark into 6 general categories: application parameters, access methods, file systems parameters, programming interface, usage aspects, and statistical aspects.

The developers of the benchmark consider application parameters to be classified by things such as the way data is organized into memory for instance contiguous and non-contiguous, the way data is read or written to a file also in a contiguous or noncontiguous fashion, sizes of memory pages, size of data blocks, and the distribution of the blocks.

The interface aspect of the benchmark concerns the choice of file access API. There are several API's that could have been used - Posix I/O buffered or raw, vendors specific API's, or MPI-IO. The developers chose to use MPI-IO since it is a standard interface used to access many parallel file systems.

The access methods are all based on the interface aspect - namely MPI-IO. There are a wide rage of parameters covered for access methods. Since the code is testing I/O bandwidth and there is no overlap of computation and I/O, the authors chose to use blocking calls only.

File System parameters include such things as how many nodes are used, how much memory is used as buffer space, disk block size, disk stripping size, and number of parallel striping devices used. The b_eff_io benchmark cannot control these aspects directly, but the user can change as them as needed.

Parameters such as how many processes are used and how many parallel processors and threads are used for process are examples of usage aspects.

Statistical aspects include things such as a repetition factors and how to calculate to effective bandwidth based on only a subset of parameters used in the testing.

To make life easier for testing, the developers of b_eff_io have tried to ensure that the benchmark will run in 10-15 minutes. They try a number of different access patterns to the data in memory so that a variety of codes are effectively represented in the benchmark. However, only a single run of a specific pattern is run so repeatability is never tested. In addition, only a small amount of data for each access pattern is used, limiting the applicability of the results to larger problems. Consequently, only a rough average of the effective bandwidth is obtained.

IOR

Lawrence Livermore National Laboratory has developed a benchmark called IOR - "Interleaved or Random." The random portion of the code has been removed from the benchmark, but the name has remained the same. The goal of the benchmark is to measure I/O rates for combined open, write, read, and close operations.

There are two versions of the code: a Posix version that primarily views files as streams of bytes, and an MPI-IO version that allows for regions of files to represented as a datatype. In contrast to b_eff_io, IOR performs a great deal of I/O during the benchmark. The run will take longer than b_eff_io but you are more likely to simulate applications that use a parallel file system. Unfortunately, IOR only performs one access pattern, so applying the results to codes with differing I/O requirements is not easy.

When IOR is run, it creates a file in which data are written by each process at some offset in the file. The data do not overlapped, it is contiguous, and there are no gaps. The data are then read back by a differing process. Then the file is deleted, and the resulting throughput information is computed.

With IOR you can repeat the I/O phase of the benchmark any number of times you wish. You can also vary the number of block sizes using an environment variable.


Last Updated ( Thursday, 13 April 2006 )
 
< Prev Article   Next Article >
Mellanox
 

Creative Commons License
  ©2005-2008 Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a
Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.
Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.