Buying a Beowulf Cluster (Help)

Tue May 27 17:04:51 EDT 2003

hi ya

- 2 replies in 1

On Tue, 27 May 2003, John Bushnell wrote:

> I would strongly suggest knocking on a few doors where you're
> at and finding some local cluster folks.  There have got to be
> some people maintaining clusters at your University.  If nothing
> else, you will have company during your misery. 

failing that, one can then look outside the area

- you will get 10x better hw support if the support folks are local
	- biggest problem is keeping the machines up ...
	( something dies, and you need it fixed "now" )

- admin and howto support can be done remotely .. by people
  that can keep the cluster up ..

- i think "hands-on support" vs "uptime/reliability support" should be 
  split up ...

> On Sat, 24 May 2003, Shashank Khanvilkar wrote:
> > 1. Opinions on the OS to be installed on the cluster: We have decided on Red
> > Hat 7.3, however suggestions are welcome. (We are not going for RH 8/9
> > because of some known compiler (Intel and Portland) problems.. If anyone has
> > knowledge about this, please let me know).

gcc problems will be across the board .. 
	- old gcc on new hw or new gcc on new hw 
	- old gcc on old hw or new gcc on old hw
	- you will have problems ( glibc + gcc-x.y problems )

	- there's probably more open source support for new gcc w/ new hw
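
- a minimal sketch ( python, not something from this thread ) for writing
  down which gcc + glibc a box actually has before you decide, so the
  old-vs-new comparison is based on what is installed ..
  the function name and the dict keys are made up for illustration, and it
  assumes the compiler is on the PATH and understands "-dumpversion"

```python
import platform
import shutil
import subprocess


def toolchain_versions(compiler="gcc"):
    """Return the compiler and C library versions seen on this machine.

    Hypothetical helper: values are None when they cannot be determined.
    """
    # platform.libc_ver() inspects the running interpreter's C library
    libc_name, libc_version = platform.libc_ver()

    compiler_version = None
    if shutil.which(compiler):
        # gcc prints only its version number for -dumpversion
        result = subprocess.run(
            [compiler, "-dumpversion"],
            capture_output=True,
            text=True,
            check=False,
        )
        if result.returncode == 0:
            compiler_version = result.stdout.strip() or None

    return {
        "host": platform.node(),
        "compiler": compiler,
        "compiler_version": compiler_version,
        "libc": libc_name or None,
        "libc_version": libc_version or None,
    }


if __name__ == "__main__":
    for key, value in toolchain_versions().items():
        print(f"{key}: {value}")
```

  run it on each candidate install and keep the output with your test
  notes .. it only reports versions, it does not tell you whether a given
  app will build or run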

i think building new boxes on an old distro is a bad idea,
since you'd run into old known bugs that have since been fixed
in the newer distro

- yes, you might get the new bugs in the new systems/distro ..
  but you will also get old bugs in old distro and a lot smaller
  group of open source folks addressing those older issues

- there are usually workarounds for most bugs/problems ..
	- typical/usual workaround for older bugs is to upgrade 

	- new bugs/problems -- simply means you need more time
	to do more detailed testing before deciding 

- i have yet to see a newer distro NOT be able to run an older app
  that the 3rd party vendor claimed "requires old foo-x.y" ..
  it works on the newer version too, unless they did something
  unique in their code to lock it to that linux distro

- given the known bugs and your feature requirements, i'd build on the
  latest/greatest stuff

> > 2. Any documentation on the software that needs to be installed (MPI, PVM,
> > admin stuff etc) that will help us in the long run.

random collection of stuff

> > 3. Any documentation on TO-DO's..or things that we need to check/do before
> > working on the cluster.

pricing and support ... 
	- simulate the "my node just died and our deadline was yesterday,
	how do you (vendor) plan to fix it ??" scenario before you buy

have fun

