[Beowulf] Mature open source hierarchical storage management

Nifty Tom Mitchell niftyompi at niftyegg.com
Tue Oct 27 21:02:03 EDT 2009

On Fri, Oct 23, 2009 at 04:12:11PM +1100, Carl Thomas wrote:
>    We are currently in the midst of planning a major refresh of our existing
>    HPC cluster.


Do add "PowerFile" to your research list.


My back-of-the-email-envelope view is that what you are doing should have
quick cluster disks for binary objects, libs, swap, /scratch and /tmp, plus a
largish NFS RAID-based filesystem with an archival back end.  Perhaps a
large, slow spinning-disk staging RAID in the middle or off to the side too.

There are multiple "delta equations" that
you need to evaluate.  I know I missed some:
   - delta file change (GB/day).
   - performance delta at each layer.
   - cost delta at each layer.
   - management cost delta.
   - operational cost delta.
   - cost of compliance -- what the law requires, by method.
   - cost of physical storage on and off site, include handling and shipping.
   - cost of user training delta.
   - cost of expansion delta.
   - cost of necessary bandwidth, by layer.
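
A rough way to start on a few of these deltas is a small script.  This is
only a sketch: the Tier fields, the linear growth model, and every number
in it are hypothetical placeholders, not measurements or vendor prices.
Plug in your own figures for each layer.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    """One storage layer. All values are hypothetical placeholders."""
    name: str
    cost_per_tb_year: float  # assumed purchase + operations, per TB per year
    bandwidth_mb_s: float    # assumed sustained bandwidth in MB/s


def capacity_needed_tb(start_tb, delta_gb_per_day, days):
    """Capacity after `days`, assuming linear growth (1 TB = 1000 GB)."""
    return start_tb + delta_gb_per_day * days / 1000.0


def yearly_cost(tier, tb):
    """Cost of holding `tb` terabytes on this tier for one year."""
    return tier.cost_per_tb_year * tb


def hours_to_move(tier, tb):
    """Hours to stream `tb` terabytes at the tier's sustained bandwidth."""
    megabytes = tb * 1_000_000
    return megabytes / tier.bandwidth_mb_s / 3600.0


if __name__ == "__main__":
    # Made-up layers, only to show the shape of the comparison.
    tiers = [
        Tier("scratch", cost_per_tb_year=500.0, bandwidth_mb_s=400.0),
        Tier("nfs-raid", cost_per_tb_year=200.0, bandwidth_mb_s=100.0),
        Tier("archive", cost_per_tb_year=50.0, bandwidth_mb_s=30.0),
    ]
    need = capacity_needed_tb(start_tb=10, delta_gb_per_day=50, days=365)
    for t in tiers:
        print(t.name, need, yearly_cost(t, need), hours_to_move(t, need))
```

The point is only to put the per-layer cost and bandwidth numbers side by
side; compliance, training, and shipping costs still need their own lines.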

Clusters are unique in that they have the potential
to host their own distributed RAID (Lustre, Gluster, ZFS),
and with a sufficient archival back end, life could be good.
Thus select systems that you can add a second disk to.
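
To get a feel for what a second disk per node buys, here is a
back-of-the-envelope sketch.  It assumes plain N-way replication across
nodes, which is a simplification of my own and not how any particular
filesystem above lays out data; the node count and disk size in the usage
line are made up.

```python
def usable_capacity_tb(nodes, disk_tb, replicas):
    """Usable TB when every block is stored `replicas` times.

    Assumes one extra disk of `disk_tb` TB per node and simple
    replication; real filesystems have their own overheads.
    """
    if replicas < 1 or replicas > nodes:
        raise ValueError("replicas must be between 1 and the node count")
    return nodes * disk_tb / replicas


if __name__ == "__main__":
    # Hypothetical cluster: 64 nodes, one extra 1 TB disk each, 2 copies.
    print(usable_capacity_tb(nodes=64, disk_tb=1.0, replicas=2))
```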

Choice of filesystem can help too (see DMAPI, the data management API
that HSM software hooks into, and friends).

Have fun.