From john.hearns at clustervision.com  Mon Dec  1 09:37:41 2003
From: john.hearns at clustervision.com (John Hearns)
Date: Mon, 1 Dec 2003 15:37:41 +0100 (CET)
Subject: Fedora for x86_64
Message-ID: <Pine.LNX.4.44.0312011535240.2673-100000@druifje.clustervision.com>

I saw this on the Fedora list that it has been released for x86_64
http://fedora.linux.duke.edu/fc1_x86_64/

I should say that I haven't tried/used this myself, just thought
it would be of interest to this list.


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Mon Dec  1 09:43:47 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Mon, 1 Dec 2003 06:43:47 -0800
Subject: Mainboard identification and BIOS dump
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF5E@orsmsx402.jf.intel.com>

From: Anas Nashif, Saturday, November 29, 2003 8:29 PM
> 
> DMI decode is your friend
> 
> http://www.nongnu.org/dmidecode/
> 
This is definitely your friend.

HOWEVER, be aware that the information can vary widely and wildly from
one model computer to another, even among different models from the same
OEM.

A while ago, I was using the precursor to the above as the basis for a
"system serial number" utility -- even with the few vendors that I was
using at the time, the variety of places to put a serial number, if
available at all, was daunting.

Bottom line: there's some great info there, but don't be surprised by
the inconsistencies.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From msnitzer at lnxi.com  Mon Dec  1 10:50:03 2003
From: msnitzer at lnxi.com (Mike Snitzer)
Date: Mon, 1 Dec 2003 08:50:03 -0700
Subject: Fedora for x86_64
In-Reply-To: <Pine.LNX.4.44.0312011535240.2673-100000@druifje.clustervision.com>; from john.hearns@clustervision.com on Mon, Dec 01, 2003 at 03:37:41PM +0100
References: <Pine.LNX.4.44.0312011535240.2673-100000@druifje.clustervision.com>
Message-ID: <20031201085003.A28915@lnxi.com>

On Mon, Dec 01 2003 at 07:37,
John Hearns <john.hearns at clustervision.com> wrote:

> I saw this on the Fedora list that it has been released for x86_64
> http://fedora.linux.duke.edu/fc1_x86_64/
> 
> I should say that I haven't tried/used this myself, just thought
> it would be of interest to this list.

It should be noted that this is NOT an official Fedora Core 1 release for
amd64; as taken from the post to fedora-devel:

...
ISOs will not be provided for this release, but everything is there for
an install.
...

/***************************************************************************
*       WARNING: This release is a preview, it is not an official Fedora
*       Core 1 Release, this is not an official Fedora Core Test Release.
*       This release may very well cause damage to your data, your system,
*       your pets and loved ones, and most certainly your sleep schedule.
*       There is no guarantee of any type on performance, stability, or
*       your sanity.  Use at your own risk.
***************************************************************************/


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From a.j.martin at qmul.ac.uk  Mon Dec  1 12:47:18 2003
From: a.j.martin at qmul.ac.uk (Alex Martin)
Date: Mon, 1 Dec 2003 17:47:18 +0000
Subject: Fedora for x86_64
In-Reply-To: <854qwkmpj4.fsf@blindglobe.net>
References: <Pine.LNX.4.44.0312011535240.2673-100000@druifje.clustervision.com> <20031201085003.A28915@lnxi.com> <854qwkmpj4.fsf@blindglobe.net>
Message-ID: <200312011747.hB1HlIv21111@heppcb.ph.qmw.ac.uk>

After just installing it...It appears to be mostly 64-bit with support for 
32-bit bins...some applications e.g. openoffice don't apparently yet compile 
for x86_64.

cheers,
Alex


On Monday 01 December 2003 5:11 pm, A.J. Rossini wrote:
> Anyone know if it is a "true 64-bit" release, or a biarch (32/64), or
> just a 32bit?
>
> best,
> -tony
>
> Mike Snitzer <msnitzer at lnxi.com> writes:
> > On Mon, Dec 01 2003 at 07:37,
> >
> > John Hearns <john.hearns at clustervision.com> wrote:
> >> I saw this on the Fedora list that it has been released for x86_64
> >> http://fedora.linux.duke.edu/fc1_x86_64/
> >>
> >> I should say that I haven't tried/used this myself, just thought
> >> it would be of interest to this list.
> >
> > It should be noted that this is NOT an official Fedora Core 1 release for
> > amd64; as taken from the post to fedora-devel:
> >
> > ...
> > ISOs will not be provided for this release, but everything is there for
> > an install.
> > ...
> >
> > /************************************************************************
> >*** *       WARNING: This release is a preview, it is not an official
> > Fedora *       Core 1 Release, this is not an official Fedora Core Test
> > Release. *       This release may very well cause damage to your data,
> > your system, *       your pets and loved ones, and most certainly your
> > sleep schedule. *       There is no guarantee of any type on performance,
> > stability, or *       your sanity.  Use at your own risk.
> > *************************************************************************
> >**/
> >
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf

-- 
------------------------------------------------------------------------------
|                                                                            |
|  Dr. Alex Martin                                                           |
|  e-Mail:   a.j.martin at qmul.ac.uk        Queen Mary, University of London,  |
|  Phone :   +44-(0)20-7882-5033          Mile End Road,                     |
|  Fax   :   +44-(0)20-8981-9465          London, UK   E1 4NS                |
|                                                                            |
------------------------------------------------------------------------------
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rossini at blindglobe.net  Mon Dec  1 12:11:27 2003
From: rossini at blindglobe.net (A.J. Rossini)
Date: Mon, 01 Dec 2003 09:11:27 -0800
Subject: Fedora for x86_64
In-Reply-To: <20031201085003.A28915@lnxi.com> (Mike Snitzer's message of
 "Mon, 1 Dec 2003 08:50:03 -0700")
References: <Pine.LNX.4.44.0312011535240.2673-100000@druifje.clustervision.com>
	<20031201085003.A28915@lnxi.com>
Message-ID: <854qwkmpj4.fsf@blindglobe.net>


Anyone know if it is a "true 64-bit" release, or a biarch (32/64), or
just a 32bit?

best,
-tony

Mike Snitzer <msnitzer at lnxi.com> writes:

> On Mon, Dec 01 2003 at 07:37,
> John Hearns <john.hearns at clustervision.com> wrote:
>
>> I saw this on the Fedora list that it has been released for x86_64
>> http://fedora.linux.duke.edu/fc1_x86_64/
>> 
>> I should say that I haven't tried/used this myself, just thought
>> it would be of interest to this list.
>
> It should be noted that this is NOT an official Fedora Core 1 release for
> amd64; as taken from the post to fedora-devel:
>
> ...
> ISOs will not be provided for this release, but everything is there for
> an install.
> ...
>
> /***************************************************************************
> *       WARNING: This release is a preview, it is not an official Fedora
> *       Core 1 Release, this is not an official Fedora Core Test Release.
> *       This release may very well cause damage to your data, your system,
> *       your pets and loved ones, and most certainly your sleep schedule.
> *       There is no guarantee of any type on performance, stability, or
> *       your sanity.  Use at your own risk.
> ***************************************************************************/
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>

-- 
rossini at u.washington.edu            http://www.analytics.washington.edu/ 
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN          Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC  (M/W): 206-667-7025 FAX=206-667-4812 | use Email

CONFIDENTIALITY NOTICE: This e-mail message and any attachments may be
confidential and privileged. If you received this message in error,
please destroy it and notify the sender. Thank you.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From verycoldpenguin at hotmail.com  Tue Dec  2 06:05:03 2003
From: verycoldpenguin at hotmail.com (Gareth Glaccum)
Date: Tue, 02 Dec 2003 11:05:03 +0000
Subject: PBS/Maui problem
Message-ID: <Law15-F68Qez1enAqpf000020b9@hotmail.com>

Hi,
I have been trying to get a large cluster working, but am having
problems with PBS crashing if I submit a job with qsub asking for more
than 112 (dual processor) nodes. I have applied the patches to allow
PBS to use large numbers of nodes, but it does not seem to help.

Any ideas as to where I should look?
PBS 2.3.12,
MAUI 3.2.5 (patch 5)

Thanks,
Gareth

_________________________________________________________________
Express yourself with cool emoticons - download MSN Messenger today! 
http://www.msn.co.uk/messenger

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From verycoldpenguin at hotmail.com  Tue Dec  2 09:46:12 2003
From: verycoldpenguin at hotmail.com (Gareth Glaccum)
Date: Tue, 02 Dec 2003 14:46:12 +0000
Subject: PBS/Maui problem
Message-ID: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>

Yes, we have tried that patch, but to no avail.
We are trying to run on Suse advanced server with opterons.
Gareth

>From: Bill Wichser <bill at Princeton.EDU>
>Date: Tue, 02 Dec 2003 09:12:55 -0500
>
>The NCSA scaling patch fixed this for me.  Is this the one you applied?
>http://www-unix.mcs.anl.gov/openpbs/
>Bill

>Gareth Glaccum wrote:
>>I have been trying to get a large cluster working, but am having
>>problems with PBS crashing if I submit a job with qsub asking for more
>>than 112 (dual processor) nodes. I have applied the patches to
...
>>PBS 2.3.12,
>>MAUI 3.2.5 (patch 5)

_________________________________________________________________
Use MSN Messenger to send music and pics to your friends 
http://www.msn.co.uk/messenger

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From nashif at planux.com  Tue Dec  2 12:37:46 2003
From: nashif at planux.com (Anas Nashif)
Date: Tue, 02 Dec 2003 12:37:46 -0500
Subject: clusterworldexpo 2003 Pages!
Message-ID: <3FCCCDEA.10108@planux.com>

hi,

Any idea where can I find the old pages  of clusterworldexpo 2003, 
http://www.clusterworldexpo.com./ is a dead end at the moment! Is there 
an archive somewhere?


Anas

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From pesch at attglobal.net  Tue Dec  2 21:40:24 2003
From: pesch at attglobal.net (pesch at attglobal.net)
Date: Tue, 02 Dec 2003 18:40:24 -0800
Subject: Beowulf of bare motherboards
References: <Pine.LNX.3.96.1031124165330.22139B-100000@Maggie.Linux-Consulting.com>
Message-ID: <3FCD4D18.FE7DCD4E@attglobal.net>

We used that technique in the late nineties: one 300W PS for 4 or more
motherboards (we had 1:6 power multiplier
pc boards and cabling made). Worked well and saved lots of space. The idea
might again become interesting for the
new low power processors (VIA 1 Ghz = 7W).

To support the motherboards we used prepunched steel sheetmetal bent to fit and nylon pc guides (remember the
s-100 bus?)

Paul Schenker


Alvin Oga wrote:

> hi ya
>
> On Mon, 24 Nov 2003, Jean-Christophe Ducom wrote:
>
> > I tried to find a link to a 'old' project where people were using racks to put
> > barebone motherboards (to save the cost of the case basically).
>
> hotmail and google used those motherboard in the 19" (kingstarusa.com)
> racks  -- looks like its discontinued ??
>
> - a flat piece of (aluminum/steel) metal (from home depot/orchard) will
>   work too you know
>         - just add a couple holes on stand off for the mb and power supply
>         - or get a sheet metal shop to bend and drill a few holes w
>           rack mounting ears
>
> > It was similar to the following project but was more elaborated (it was possible
> > to pull out the bare motherboards of the shelf, etc...)
> > http://www.abo.fi/~physcomp/cluster/celeron.html
>
> i'm very interested in those systems ...
>         - to build a cluster w/ just motherboards and optionally w/ disks
>         - power supply will be simple +12vDC wall adaptor ...
>         - P4-3G equivalent mb/cpu
>
>         - it'd be a good engineering challenge :-)
>         ( big question is what holds up the back of the "caseless"
>         ( motherboards and disks
>
> c ya
> alvin
>
> > I spent hours to find it on google..without success.
> > Could anyone remember it? Please send the link.
> > Thanks a lot
>
> there are other pc104 based caseless clusters
>         http://eri.ca.sandia.gov/eri/howto.html
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From andrewxwang at yahoo.com.tw  Tue Dec  2 15:06:55 2003
From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=)
Date: Wed, 3 Dec 2003 04:06:55 +0800 (CST)
Subject: PBS/Maui problem
In-Reply-To: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>
Message-ID: <20031202200655.95901.qmail@web16801.mail.tpe.yahoo.com>

Did you try SPBS (scalable edition)?

And how did PBS fail? qsub, scheduler, server?

Andrew.

 --- Gareth Glaccum <verycoldpenguin at hotmail.com>
????
>
> Yes, we have tried that patch, but to no avail.
> We are trying to run on Suse advanced server with
> opterons.
> Gareth
> 
> >From: Bill Wichser <bill at Princeton.EDU>
> >Date: Tue, 02 Dec 2003 09:12:55 -0500
> >
> >The NCSA scaling patch fixed this for me.  Is this
> the one you applied?
> >http://www-unix.mcs.anl.gov/openpbs/
> >Bill
> 
> >Gareth Glaccum wrote:
> >>I have been trying to get a large cluster working,
> but am having
> >>problems with PBS crashing if I submit a job with
> qsub asking for more
> >>than 112 (dual processor) nodes. I have applied
> the patches to
> ...
> >>PBS 2.3.12,
> >>MAUI 3.2.5 (patch 5)
> 
>
_________________________________________________________________
> Use MSN Messenger to send music and pics to your
> friends 
> http://www.msn.co.uk/messenger
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or
> unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf 

-----------------------------------------------------------------
??? Yahoo!??
??????????????????????
http://tw.promo.yahoo.com/mail_premium/stationery.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From csamuel at vpac.org  Tue Dec  2 20:09:45 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Wed, 3 Dec 2003 12:09:45 +1100
Subject: PBS/Maui problem
In-Reply-To: <Law15-F68Qez1enAqpf000020b9@hotmail.com>
References: <Law15-F68Qez1enAqpf000020b9@hotmail.com>
Message-ID: <200312031209.53543.csamuel@vpac.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 2 Dec 2003 10:05 pm, Gareth Glaccum wrote:

> I have been trying to get a large cluster working, but am having
> problems with PBS crashing if I submit a job with qsub asking for more
> than 112 (dual processor) nodes. I have applied the patches to allow
> PBS to use large numbers of nodes, but it does not seem to help.
>
> Any ideas as to where I should look?
> PBS 2.3.12,
> MAUI 3.2.5 (patch 5)

I'd stronly suggest trying out Scalable PBS instead of OpenPBS.  It's actively 
developed and they've been fixing lots of problems that are still in OpenPBS 
and adding enhancements.

	http://www.supercluster.org/

It's freely available (they forked from an earlier OpenPBS release which had a 
more liberal license than the later ones).

cheers!
Chris
- -- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/zTfdO2KABBYQAh8RAoLAAJ94HRU9Dgu2B4fLhwQdQ2EDnp1q+gCfZHk8
utf26uf4JQL2eNVFv7vxi1c=
=AQ/L
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From deadline at linux-mag.com  Tue Dec  2 20:27:40 2003
From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine)
Date: Tue, 2 Dec 2003 20:27:40 -0500 (EST)
Subject: clusterworldexpo 2003 Pages!
In-Reply-To: <3FCCCDEA.10108@planux.com>
Message-ID: <Pine.LNX.4.44.0312022021580.12961-100000@boltzmann.basement-supercomputing.com>

On Tue, 2 Dec 2003, Anas Nashif wrote:

> hi,
> 
> Any idea where can I find the old pages  of clusterworldexpo 2003, 
> http://www.clusterworldexpo.com./ is a dead end at the moment! Is there 
> an archive somewhere?

What exactly do you need?

The www.clusterworldexpo.com site is morphing into the 2004 meeting site.
ClusterWorld Expo will be held on April 5-8, 2004, keynotes include
Tom Sterling, Ian Foster, and Dave Turek.

Doug

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From csamuel at vpac.org  Tue Dec  2 20:12:38 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Wed, 3 Dec 2003 12:12:38 +1100
Subject: PBS/Maui problem
In-Reply-To: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>
References: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>
Message-ID: <200312031212.39775.csamuel@vpac.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 3 Dec 2003 01:46 am, Gareth Glaccum wrote:

> We are trying to run on Suse advanced server with opterons.

Here's a quote from the Scalable PBS guys from the mailing list:

[quote]

  The next release of SPBS is under testing and is currently available as 
a snapshot in the spbs/temp download directory.  This snapshot 
incorporates a number of patches which assist in the following areas:

SUSE Linux support
IA64 support
large job support 
readline support in qmgr
support for very large node memory and filesystems
correct ncpus reporting

  Many thanks go out to NCSA and the TeraGrid team for their excellent 
help in identifing and correcting a number of remaining high-end scaling 
issues found within SPBS.  

  Please let us know if any issues are discovered with this release and 
please keep the patches coming!

[/quote]

- -- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/zTiGO2KABBYQAh8RAuppAJ9LGg7Pj7MLlT1MSb2oW2WABWB4CgCdF7Dq
Tq4fnxlcaDA/5vIGCf9QNeQ=
=YwfO
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From nashif at planux.com  Tue Dec  2 22:12:56 2003
From: nashif at planux.com (Anas Nashif)
Date: Tue, 02 Dec 2003 22:12:56 -0500
Subject: clusterworldexpo 2003 Pages!
In-Reply-To: <Pine.LNX.4.44.0312022021580.12961-100000@boltzmann.basement-supercomputing.com>
References: <Pine.LNX.4.44.0312022021580.12961-100000@boltzmann.basement-supercomputing.com>
Message-ID: <3FCD54B8.8070805@planux.com>


Douglas Eadline, Cluster World Magazine wrote:
> On Tue, 2 Dec 2003, Anas Nashif wrote:
> 
> 
>>hi,
>>
>>Any idea where can I find the old pages  of clusterworldexpo 2003, 
>>http://www.clusterworldexpo.com./ is a dead end at the moment! Is there 
>>an archive somewhere?
> 
> 
> What exactly do you need?
> 
Everything :-)
I'd like to see who talked there and to see what talk were given etc. 
Its always good to have some kind of archive with the program of old 
conferences, for example something like www.supercomp.org.


> The www.clusterworldexpo.com site is morphing into the 2004 meeting site.
> ClusterWorld Expo will be held on April 5-8, 2004, keynotes include
> Tom Sterling, Ian Foster, and Dave Turek.
> 

Yes, I could see that on the new page, but as I said, its a dead end, no 
links to anything there...


Thanks,

Anas

> Doug
> 
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From alvin at Mail.Linux-Consulting.com  Wed Dec  3 04:27:24 2003
From: alvin at Mail.Linux-Consulting.com (Alvin Oga)
Date: Wed, 3 Dec 2003 01:27:24 -0800 (PST)
Subject: Beowulf of bare motherboards
In-Reply-To: <Pine.LNX.4.44.0312030958010.9015-100000@druifje.clustervision.com>
Message-ID: <Pine.LNX.3.96.1031203011718.21885A-100000@Maggie.Linux-Consulting.com>


hi ya john

On Wed, 3 Dec 2003, John Hearns wrote:

> Someone mention VIA mini-ITXes?
> If I could have the resources, I wouldn't fan out a single PSU
> to several mini-ITX boards. It would be cheap, but introduce a single
> point of failure, and you'd have to cobble somthing together to
> deal with ATX power on/off.

single point of failures is not acceptable if the cost of that
item is small compared to the "overall system"
	- hvac, public utilty point of failure is harder
	to avoid, but can be avoided w/ a data center setup
	on the opposite side of the country

> Funds permitting, one of the small 12V DC-DC PSU per board.

you can use a simple wall adaptor to +12v adaptor
	and a +12vdc to +{various-atx} voltage dc-dc convertor

	www.mini-itx.com sells their proprietory +12v dc-dc convertors
	( $50ea range ) and we're debating what the "cluster/blade" of
	mini-itx mb should look like when its mounted in a standard rack
	or custom rack .. and why one way is better than another .. fun
	stuff ..

- if you want a p4-3Ghz in a mini-itx form factor, than we're back
  to only one mb manufacturer :-)

> Then run a high current 12V supply along the rack.
> Simple cheap relay would do the job of power cycling  also.

"relays" has had the worst reliability of any electromechanical part
( so its been long replaced by transistors :-) especially at high
( currents and low/medium voltages

> On the VIA front, the smaller  nano-ITX form factor boards are due soon.
> Could make nice building blocks.

those nano-itx mb is due out (in production) around may/june time frame
??

c ya
alvin

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From john.hearns at clustervision.com  Wed Dec  3 04:02:56 2003
From: john.hearns at clustervision.com (John Hearns)
Date: Wed, 3 Dec 2003 10:02:56 +0100 (CET)
Subject: Beowulf of bare motherboards
In-Reply-To: <3FCD4D18.FE7DCD4E@attglobal.net>
Message-ID: <Pine.LNX.4.44.0312030958010.9015-100000@druifje.clustervision.com>

On Tue, 2 Dec 2003 pesch at attglobal.net wrote:

> We used that technique in the late nineties: one 300W PS for 4 or more
> motherboards (we had 1:6 power multiplier
> pc boards and cabling made). Worked well and saved lots of space. The idea
> might again become interesting for the
> new low power processors (VIA 1 Ghz = 7W).
> 
Someone mention VIA mini-ITXes?
If I could have the resources, I wouldn't fan out a single PSU
to several mini-ITX boards. It would be cheap, but introduce a single
point of failure, and you'd have to cobble somthing together to
deal with ATX power on/off.
Funds permitting, one of the small 12V DC-DC PSU per board.
Then run a high current 12V supply along the rack.
Simple cheap relay would do the job of power cycling  also.

On the VIA front, the smaller  nano-ITX form factor boards are due soon.
Could make nice building blocks.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From derek.richardson at pgs.com  Wed Dec  3 11:27:59 2003
From: derek.richardson at pgs.com (Derek Richardson)
Date: Wed, 03 Dec 2003 10:27:59 -0600
Subject: Opteron kernel
In-Reply-To: <3FCD0CEE.80908@seismiccity.com>
References: <Pine.LNX.4.44.0311251500590.4408-100000@training.scyld.com> <3FC4EFB3.10708@pgs.com> <3FCD0CEE.80908@seismiccity.com>
Message-ID: <3FCE0F0F.9020407@pgs.com>

Claude,
I'm thinking there is a lot of potential for optimization is the x86-64 
architecture.  Two different versions of our code ( they have slightly 
differing code and were compiled w/ same GNU compilers but using 
different flags ) had a large performance difference.  One version ran 
at ~ 85% of the speed of the P4 gear, and another at ~ 140% of P4 gear ( 
dual Xeon 3.06 GHz boxen ).  Having found this out two days ago and 
spent all of yesterday repairing some dead nodes, I haven't had a chance 
to chase the testing up ( find out which flags, code differences, etc. 
).  We are planning on doing a run w/ the same code base, but the 
changed compiler flags.  That should bring out whether it is the code 
changes, or the compiler flags.  My guess would be the compiler flags, 
but I don't know ( yet ) what changes were made in the code itself.  
There's also some pre-fetching optimization work that can be done as 
well, so things are looking a bit brighter.
As a side note, AMD recommends the SUSE 64 bit kernel ( apparently even 
for non-SUSE, non-64bit OSes like RedHat ).  I don't know where they 
stand on RH Advanced Whatchamadoodle vs. SUSE, but I'll have to sort 
that out in the future, if we actually ever get around to getting some 
Opterons ( our stance has been that they have to outperform the P4 Xeon 
gear using the same code and OS, then we'll worry about seriously 
optimizing ).
I suppose I'll let everyone know when we discover what made such a large 
difference.
Regards,
Derek R.

Claude Pignol wrote:

>
>
> Derek Richardson wrote:
>
>> Donald,
>> Sorry for the late reply, bloody Exchange server didn't drop it in my 
>> inbox until late this morning.  Memory and scheduling would probably 
>> be the biggest factor.  Processor affinity doesn't matter as much, 
>> because in my experience we haven't had problems w/ processes 
>> bouncing between CPUs.  PCI bus is almost a non-issue, since our 
>> application is embarassingly parallel and therefore has no need for > 
>> 100 Mbit ethernet, and there is no disk on a PCI-attached controller, 
>> so we have very little information passing over the PCI bus.
>> By interleaving, I assume you mean at the physical level, which I had 
>> a quick peek at when we got the system ( it's an IBM eServer 325, a 
>> loaner for testing ) and I assumed to be correct.  But given the poor 
>> performance I have seen ( 2 GHz Opterons coming in at ~15% slower 
>> than a 3 GHz P4 on a compute/memory intensive application when most 
>> benchmarks I have seen would imply the inverse ), I will double-check 
>> that when given a chance. 
>
> I have the same conclusion concerning the performance. I haven't seen 
> on our application (floating point and memory  intensive) the speed up 
> that we could expect from the SPEC benchmark.
> (using gcc 3.3 Kernel NUMA  bank interleaving ON CPU interleaving OFF)
> The problem is probably due to the compiler that doesn't generate a 
> very optimized code on common application.
> It seems that the price performance ratio is still in favor of Xeon 
> for dual processor machine.
>
>>
>> I will probably just try the latest 2.6 kernel and a few other tweaks 
>> as well, and AMD has also offerred help, but that would more likely 
>> be at the application layer ( which I don't have control of, 
>> unfortunately ).
>> Thanks for the response, and my apologies for the vagueness of the 
>> question.
>> Derek R.
>>
>> Donald Becker wrote:
>>
>>> On Mon, 24 Nov 2003, Derek Richardson wrote:
>>>
>>>  
>>>
>>>> Does anyone know where to find info on tuning the linux kernel for 
>>>> Opterons?  Googling hasn't turned up much useful information.
>>>>   
>>>
>>>
>>> What type of tuning?
>>> PCI bus transactions (the Itanium required more, but the Opteron still
>>> benefits)?  Scheduling?  Processor affinity?  What kernel version?
>>> If you ask specific questions, there is likely someone on the list that
>>> knows the specific answer.
>>>
>>> The easiest performance improvement comes from proper memory DIMM
>>> configuration to match the application layout.  Each processor has its
>>> own local memory controller, and understanding how the memory slots are
>>> filled and the options e.g. interleave can make a 30% difference on a
>>> dual processor system.
>>>
>>>  
>>>
>>
>
> -- 
> ------------------------------------------------------------------------
> Claude Pignol 	SeismicCity, Inc. <http://www.seismiccity.com>
> 2900 Wilcrest Dr.    Suite 370 	 Houston TX 77042
> Phone:832 251 1471 Mob:281 703 2933 	 Fax:832 251 0586
>
>

-- 
Linux Administrator
derek.derekson at pgs.com
derek.derekson at ieee.org
Office 713-781-4000
Cell 713-817-1197
Disease can be cured; fate is incurable.
		-- Chinese proverb


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From raysonlogin at yahoo.com  Wed Dec  3 17:50:39 2003
From: raysonlogin at yahoo.com (Rayson Ho)
Date: Wed, 3 Dec 2003 14:50:39 -0800 (PST)
Subject: Scalable PBS (was:  PBS/Maui problem)
In-Reply-To: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>
Message-ID: <20031203225039.79037.qmail@web11404.mail.yahoo.com>

While Scalable PBS is technically better than OpenPBS, I found that it
is actually less open than other batch systems (condor, OpenPBS, SGE)

All "scalablepbsusers" mail messages are filtered by hand by Cluster
Resource INC. This creates significant delays to the mail response
rate.

All major lists are not filtered by hand, I just don't understand the
reasons of doing that...

BTW, anyone on that list but is not encountering the same experience??

Rayson


__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From sean at asacomputers.com  Wed Dec  3 18:19:23 2003
From: sean at asacomputers.com (Sean)
Date: Wed, 03 Dec 2003 15:19:23 -0800
Subject: U320 and 64 bit Itanium
In-Reply-To: <20031203225039.79037.qmail@web11404.mail.yahoo.com>
References: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>
Message-ID: <5.1.0.14.2.20031203151755.02fb1aa0@pop.asacomputers.com>

Can somebody suggest us where to get the U320 drivers for  64 bit Redhat 
Linux that will work with the Itanium solution ?

Thanks and Regards
Sean
ASA Computers Inc.
2354, Calle Del Mundo
Santa Clara CA 95054
Telephone : (408) 654-2901 xtn 205
                   (408) 654-2900 ask for Sean
                   (800) REAL-PCS (1-800-732-5727)
Fax:            (408) 654-2910
E-mail : sean at asacomputers.com
URL    : http://www.asacomputers.com


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rokrau at yahoo.com  Wed Dec  3 21:24:47 2003
From: rokrau at yahoo.com (Roland Krause)
Date: Wed, 3 Dec 2003 18:24:47 -0800 (PST)
Subject: problem allocating large amount of memory
Message-ID: <20031204022447.32578.qmail@web40014.mail.yahoo.com>

Hi all,
I am trying to allocate a continuous chunk of memory of more than
2GBytes using malloc(). 

My sytem is a Microway Dual Athlon node with 4GB of physical RAM. The
kernel identifies itself as Redhat-2.4.20 (it runs RH-9). It has been
compiled with the CONFIG_HIGHMEM4G and CONFIG_HIGHMEM options turned
on. 

Here is what I _am_ able to do. Using a little test program that I have
written I can pretty much get 3 GB of memory allocated in chunks. The
largest chunk is 2,143 GBytes, then one of 0.939 GBytes size and
finally some smaller chunks of 10MBytes. So the total amount of memory
I can get is close enough to the promised 3G/1G split which is well
documented on the net. 

What I am not able to do currently is to get the 2.95GB all at once.
"But I must have it all."

I have set the overcommit_memory kernel parameter to 1 already but that
that doesn't seem to change anything. 

Also has someone experience with the various kernel patches for large
memory out there (im's 4G/4G or IBM's 3.5G/0.5G hack)? 

I would be very greatful for any kind of advice with regards to this
problem. I am certain that more people here must have the same problem.


Best regards,
Roland


__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Wed Dec  3 22:31:20 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Wed, 3 Dec 2003 19:31:20 -0800
Subject: problem allocating large amount of memory
In-Reply-To: <20031204022447.32578.qmail@web40014.mail.yahoo.com>
References: <20031204022447.32578.qmail@web40014.mail.yahoo.com>
Message-ID: <20031204033120.GJ20846@cse.ucdavis.edu>

ACK, sorry, I missed the mention of running Redhat-9.

Do you have an example program?

Did you link static or dynamic?

Is it possible your process has 0.05GB of memory used in some other
way?

-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Thu Dec  4 00:49:50 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Thu, 4 Dec 2003 00:49:50 -0500 (EST)
Subject: problem allocating large amount of memory
In-Reply-To: <20031204022447.32578.qmail@web40014.mail.yahoo.com>
Message-ID: <Pine.LNX.4.44.0312032353040.7748-100000@coffee.psychology.mcmaster.ca>

> Here is what I _am_ able to do. Using a little test program that I have
> written I can pretty much get 3 GB of memory allocated in chunks. The
> largest chunk is 2,143 GBytes, then one of 0.939 GBytes size and
> finally some smaller chunks of 10MBytes. So the total amount of memory

yes.  unless you are quite careful, your address space looks like this:

0-128M		zero page
128M + small	program text
		sbrk heap (grows up)
1GB		mmap arena (grows up)
3GB - small	stack base (grows down)
3GB-4GB		kernel direct-mapped area

your ~1GB is allocated in the sbrk heap (above text, below 1GB).
the ~2GB is allocated in the mmap arena (glibc puts large allocations
there, if possible, since you can munmap arbitrary pages, but heaps can 
only rarely shrink).

interestingly, you can avoid the mmap arena entirely if you try (static linking,
avoid even static stdio).  that leaves nearly 3 GB available for the heap or stack.  
also interesting is that you can use mmap with MAP_FIXED to avoid the default 
mmap-arena at 1GB.  the following code demonstrates all of these.  the last time
I tried, you could also move around the default mmap base (TASK_UNMAPPED_BASE,
and could squeeze the 3G barier, too (TASK_SIZE).  I've seen patches to make 
TASK_UNMAPPED_BASE a /proc setting, and to make the mmap arena grow down
(which lets you start it at a little under 3G, leaving a few hundred MB for stack).
finally, there is a patch which does away with the kernel's 1G chunk entirely
(leaving 4G:4G, but necessitating some nastiness on context switches)


#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

void print(char *message) {
    unsigned l = strlen(message);
    write(1,message,l);
}
void printuint(unsigned u) {
    char buf[20];
    char *p = buf + sizeof(buf) - 1;
    *p-- = 0;
    do {
        *p-- = "0123456789"[u % 10];
        u /= 10;
    } while (u);
    print(p+1);
}

int main() {
#if 1
//    unsigned chunk = 128*1024;                                                
    unsigned chunk = 124*1024;
    unsigned total = 0;
    void *p;

    while (p = malloc(chunk)) {
        total += chunk;
        printuint(total);
        print("MB\t: ");
        printuint((unsigned)p);
        print("\n");
    }
#else
    unsigned offset = 150*1024*1024;
    unsigned size = (unsigned) 3e9;
    void *p = mmap((void*) offset,
                   size,
                   PROT_READ|PROT_WRITE,
                   MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS,
                   0,0);
    printuint(size >> 20);
    print(" MB\t: ");
    printuint((unsigned) p);
    print("\n");
#endif
    return 0;
}

> Also has someone experience with the various kernel patches for large
> memory out there (im's 4G/4G or IBM's 3.5G/0.5G hack)? 

there's nothing IBM-specific about 3.5/.5, that's for sure.

as it happens, I'm going to be doing some measurements of performance soon.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Thu Dec  4 10:39:24 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Thu, 4 Dec 2003 07:39:24 -0800
Subject: problem allocating large amount of memory
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF78@orsmsx402.jf.intel.com>

From: Mark Hahn; Sent: Wednesday, December 03, 2003 9:50 PM
> 
> From: Roland Krause; Sent: Wednesday, December 03, 2003 6:25 PM
> > Here is what I _am_ able to do. Using a little test program that I
have
> > written I can pretty much get 3 GB of memory allocated in chunks.
The
> > largest chunk is 2,143 GBytes, then one of 0.939 GBytes size and
> > finally some smaller chunks of 10MBytes. So the total amount of
memory

The 2.143 GB chunk is above TASK_UNMAPPED_BASE and the 0.939 chunk is
below TASK_UNMAPPED_BASE.

> yes.  unless you are quite careful, your address space looks like
this:
> 
> 0-128M		zero page
> 128M + small	program text
> 		sbrk heap (grows up)
> 1GB		mmap arena (grows up)
> 3GB - small	stack base (grows down)
> 3GB-4GB		kernel direct-mapped area
> 
> your ~1GB is allocated in the sbrk heap (above text, below 1GB).
> the ~2GB is allocated in the mmap arena (glibc puts large allocations
> there, if possible, since you can munmap arbitrary pages, but heaps
can
> only rarely shrink).

Right.

> interestingly, you can avoid the mmap arena entirely if you try
(static
> linking,
> avoid even static stdio).  that leaves nearly 3 GB available for the
heap
> or stack.

Interesting, never tried static linking.  While I worked with an app
that needed dynamic linking, this is an experiment I will certainly try.

> also interesting is that you can use mmap with MAP_FIXED to avoid the
> default
> mmap-arena at 1GB.  the following code demonstrates all of these.  the
> last time
> I tried, you could also move around the default mmap base
> (TASK_UNMAPPED_BASE,
> and could squeeze the 3G barier, too (TASK_SIZE).  I've seen patches
to
> make
> TASK_UNMAPPED_BASE a /proc setting, and to make the mmap arena grow
down
> (which lets you start it at a little under 3G, leaving a few hundred
MB
> for stack).

Prior to RH 7.3, you could use one of the extant TASK_UNMAPPED_BASE
patches to address this problem.  I always used the patch to move
TASK_UNMAPPED_BASE UP, so that the brk() area (the 0.939 chunk above)
could get larger.  I could reliably get this up to about 2.2 GB or so
(on a per-process basis). The original requestor would want to move
TASK_UNMAPPED_BASE DOWN, so that the first big malloc() could be larger.


Starting at RH Linux 7.3, Red Hat prelinked glibc to the fixed value of
TASK_UNMAPPED_BASE so that moving TASK_UNMAPPED_BASE around only caused
heartache and despair, a.k.a., you app crashed and burned as you
trampled over glibc.

I have rebuilt a few pairs of RH kernels and glibc's to add the kernel
patch and not prelink glibc, thereby restoring the wonders of the
per-process TASK_UNMAPPED_BASE patch.  But, this must be done to both
the kernel and glibc.

So, the biggest issue in an unpatched RH world is not the user app, but
glibc.

> finally, there is a patch which does away with the kernel's 1G chunk
> entirely
> (leaving 4G:4G, but necessitating some nastiness on context switches)

This is something I want to look at, to quantify how bad it actually is.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Wed Dec  3 22:28:33 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Wed, 3 Dec 2003 19:28:33 -0800
Subject: problem allocating large amount of memory
In-Reply-To: <20031204022447.32578.qmail@web40014.mail.yahoo.com>
References: <20031204022447.32578.qmail@web40014.mail.yahoo.com>
Message-ID: <20031204032833.GI20846@cse.ucdavis.edu>

On Wed, Dec 03, 2003 at 06:24:47PM -0800, Roland Krause wrote:
> What I am not able to do currently is to get the 2.95GB all at once.
> "But I must have it all."

A small example program is useful.

I'll include one that works for me.  Here's the output of the run:

[root at quad root]# gcc -Wall -Wno-long-long -pedantic memfill.c -o memfill && ./memfill
Array size of 483183820 doubles (3.60 GB) allocated
Initialized 1GB.
Initialized 1GB.
Initialized 1GB.
Initialized 1GB.
Sleeping for 60 seconds so you can check top.

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
17182 root      25   0 3584M 3.5G   340 S     0.0 48.4   0:10   3 memfill
 
I'll attach my source.  This particular machine has 8GB ram, but it
would be kinda strange for this to fall just because it's virtual.

You do have enough swap right?

-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
-------------- next part --------------
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define RAM_USED 3.6   /* 3.6 GB */
#define GB 1073741824  /* bytes per GB */

int main()
{
	double *x;
	long long i;
	long long array_size;

	array_size=RAM_USED*GB/sizeof(double);	
	x=malloc(RAM_USED*GB);
	if (x)
	{
		printf ("Array size of %lld doubles (%3.2f GB) allocated\n",array_size,RAM_USED);
		for (i=0;i<array_size;i++)
		{
			x[i]=i;
			if (i%(GB/sizeof(double))==0)
			{ 
				printf ("Initialized 1GB.\n");
			}
		}
		printf ("Sleeping for 60 seconds so you can check top.\n");
		sleep(60);
		return(0);
	}	
	printf ("We couldn't allocate an array of %3.2f GB!\n",RAM_USED);
	return(-1);
}
	
-------------- next part --------------
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define RAM_USED 3.6   /* 3.6 GB */
#define GB 1073741824  /* bytes per GB */

int main()
{
	double *x;
	long long i;
	long long array_size;

	array_size=RAM_USED*GB/sizeof(double);	
	x=malloc(RAM_USED*GB);
	if (x)
	{
		printf ("Array size of %lld doubles (%3.2f GB) allocated\n",array_size,RAM_USED);
		for (i=0;i<array_size;i++)
		{
			x[i]=i;
			if (i%(GB/sizeof(double))==0)
			{ 
				printf ("Initialized 1GB.\n");
			}
		}
		printf ("Sleeping for 60 seconds so you can check top.\n");
		sleep(60);
		return(0);
	}	
	printf ("We couldn't allocate an array of %3.2f GB!\n",RAM_USED);
	return(-1);
}
	

From jrdm at sdf.lonestar.org  Thu Dec  4 12:27:42 2003
From: jrdm at sdf.lonestar.org (Linux Guy)
Date: Thu, 4 Dec 2003 17:27:42 +0000 (UTC)
Subject: maui scheduler
Message-ID: <Pine.NEB.4.58.0312041717130.5995@otaku.freeshell.org>


Will the real Maui Scheduler please stand up?

How many maui's are out there?

http://sourceforge.net/projects/mauischeduler/
http://sourceforge.net/projects/mauisched/
http://supercluster.org/maui/

others?

I thought this was a MHPCC project?

--
jrdm at sdf.lonestar.org
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From henken at seas.upenn.edu  Thu Dec  4 13:35:15 2003
From: henken at seas.upenn.edu (Nicholas Henke)
Date: Thu, 04 Dec 2003 13:35:15 -0500
Subject: maui scheduler
In-Reply-To: <Pine.NEB.4.58.0312041717130.5995@otaku.freeshell.org>
References: <Pine.NEB.4.58.0312041717130.5995@otaku.freeshell.org>
Message-ID: <1070562915.28739.20.camel@roughneck.liniac.upenn.edu>

On Thu, 2003-12-04 at 12:27, Linux Guy wrote:
> Will the real Maui Scheduler please stand up?
> 
> How many maui's are out there?
> 
> http://sourceforge.net/projects/mauischeduler/
> http://sourceforge.net/projects/mauisched/
> http://supercluster.org/maui/
> 

The 'real' one is supercluster.org.

Nic
-- 
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rokrau at yahoo.com  Thu Dec  4 14:40:30 2003
From: rokrau at yahoo.com (Roland Krause)
Date: Thu, 4 Dec 2003 11:40:30 -0800 (PST)
Subject: problem allocating large amount of memory
In-Reply-To: <20031204033120.GJ20846@cse.ucdavis.edu>
Message-ID: <20031204194030.88760.qmail@web40006.mail.yahoo.com>

Bill,
thanks a lot for your help. 
Please find attached a little test program. I use

g++ -O -Wall memchk.cpp -static -o memchk

Afaik size_t is unsigned long on 32 bit systems and long long is the
same.

I've linked the code first dynamic then static with no differences in
the amount I am getting. 

Roland


--- Bill Broadley <bill at cse.ucdavis.edu> wrote:
> ACK, sorry, I missed the mention of running Redhat-9.
> 
> Do you have an example program?
> 
> Did you link static or dynamic?
> 
> Is it possible your process has 0.05GB of memory used in some other
> way?
> 
> -- 
> Bill Broadley
> Information Architect
> Computational Science and Engineering
> UC Davis

__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: memchk.cpp
Type: text/x-c++src
Size: 636 bytes
Desc: memchk.cpp
URL: <http://www.clustermonkey.net/pipermail/beowulf/attachments/20031204/9137ff92/attachment.bin>

From josip at lanl.gov  Thu Dec  4 15:05:17 2003
From: josip at lanl.gov (Josip Loncaric)
Date: Thu, 04 Dec 2003 13:05:17 -0700
Subject: problem allocating large amount of memory
In-Reply-To: <Pine.LNX.4.44.0312032353040.7748-100000@coffee.psychology.mcmaster.ca>
References: <Pine.LNX.4.44.0312032353040.7748-100000@coffee.psychology.mcmaster.ca>
Message-ID: <3FCF937D.5070109@lanl.gov>

In addition to Mark's very helpful address space layout, you may want to 
consult this web page:

http://www.intel.com/support/performancetools/c/linux/2gbarray.htm

which saye:

"The maximum size of an array that can be created by Intel? IA-32 
compilers is 2 GB."

due to the fact that:

"The default Linux* kernel on IA-32 loads shared libraries at 1 GB, 
which limits the contiguous address space available to your program. You 
will get a load time error if your program + static data exceed this."

Intel offers several helpful hints on being able to declare larger 
arrays (e.g. -static linking, etc.).

Sincerely,
Josip

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Thu Dec  4 18:05:57 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Thu, 4 Dec 2003 15:05:57 -0800
Subject: problem allocating large amount of memory
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF82@orsmsx402.jf.intel.com>

From: Josip Loncaric; Sent: Thursday, December 04, 2003 12:05 PM
> 
> In addition to Mark's very helpful address space layout, you may want
to
> consult this web page:
> 
> http://www.intel.com/support/performancetools/c/linux/2gbarray.htm
> 
> which saye:
> 
> "The maximum size of an array that can be created by Intel(r) IA-32
> compilers is 2 GB."

Using the Intel or gcc compilers, a TASK_UNMAPPED_BASE patch, and some
other fiddling, you can create a larger array via brk(2), or (I assume)
malloc(3), and use a larger array.
 
> due to the fact that:
> 
> "The default Linux* kernel on IA-32 loads shared libraries at 1 GB,
> which limits the contiguous address space available to your program.
You
> will get a load time error if your program + static data exceed this."

Again, back to the TASK_UNMAPPED_BASE patch and glibc fiddling.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Thu Dec  4 20:06:59 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Thu, 4 Dec 2003 20:06:59 -0500 (EST)
Subject: problem allocating large amount of memory
In-Reply-To: <20031205003627.2288.qmail@web40013.mail.yahoo.com>
Message-ID: <Pine.LNX.4.44.0312041949590.14254-100000@coffee.psychology.mcmaster.ca>

> I've tried your code and, yes, I am able to allocate up to 3G of memory
> in 124K chunks.

I probably should have commented on the code a bit more.  it demonstrates
three separate things: that for <128K allocations, libc uses the heap
first, then when that fills (hits the mmap arena) it switches to allocating
in the mmap arena.  if allocations are 128K or more, it *starts* in the 
mmap arena (since mmap has advantages when doing large allocations - munmap).  
finally, if you statically link and avoid the use of stdio,
you can make one giant allocation from the end of text up to stack.

you can't make that one giant allocation with malloc, though, simply
because glibc has this big-alloc-via-mmap policy.  I dimly recall that 
you can change this behavior at runtime.

> Unfortunately this doesn't not help me because the
> memory needed is allocated for a large software package, written in
> Fortran, that makes heavy use of all kinds of libraries (libc among
> others) over which I have no control. 

I'd suggest moving TASK_UNMAPPED_BASE down, and possibly going to a 
3.5 or 4GB userspace.  I think I also mentioned there's a patch to make
the mmap arena grow down - start it below your max stack extent, and 
let it grow towards the heap.

> Also, if I change your code to try to allocate the available memory in
> one chunk I am obviously in the same situation as before. If I
> understand you correctly, this is because small chunks of memory are
> allocated with sbrk, large ones with mmap.

right, though that's a purely user-space choice, nothing to do with the OS.

> I notice from the output of
> your program that the allocated memory is also not in a contiguous
> block. 

the demo program operates in three modes, one of which is a single chunk,
the other is a contiguous series of small chunks, and the other is two 
series of chunks.

> This must be because Redhat's prelinking of glibc to a fixed
> address in memory as noted by David Lombard. 

as I mentioned, this is irrelevant if you link statically.

> What I dont understand at all then is why your second code example
> (mmap) is able to return
>  2861 MB : 157286400
> or even more memory upon changing size to 4.e9. Isn't this supposently
> simply overwriting the area where glibc is in?

if you link my demo statically, there *is* no mmaped glibc chopping
up the address space.

> Will that prevent me from using stdio. 

stdio (last time I checked) used mmap even when statically linked - 
a single page, presumably a conversion buffer.  you'd have to check the 
source to see whether that can be changed.  I presume it's trying to 
initialize the buffer before the malloc heap is set up, or something 
like that.

> There is no problem linking
> statically for me. I am doing that for other reasons anyway. 

remember, no one says you have to use glibc...

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From andrewxwang at yahoo.com.tw  Thu Dec  4 20:15:54 2003
From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=)
Date: Fri, 5 Dec 2003 09:15:54 +0800 (CST)
Subject: maui scheduler
In-Reply-To: <1070562915.28739.20.camel@roughneck.liniac.upenn.edu>
Message-ID: <20031205011554.53403.qmail@web16811.mail.tpe.yahoo.com>

Not trying to say which one is real, which one is not,
but just want to provide a link:

http://bohnsack.com/lists/archives/xcat-user/2385.html

Further, the one from supercluster.org is the most
popular one, and is the safest choice.

Andrew.

--- Nicholas Henke <henken at seas.upenn.edu> ????
> On Thu, 2003-12-04 at 12:27, Linux Guy wrote:
> > Will the real Maui Scheduler please stand up?
> > 
> > How many maui's are out there?
> > 
> > http://sourceforge.net/projects/mauischeduler/
> > http://sourceforge.net/projects/mauisched/
> > http://supercluster.org/maui/
> > 
> 
> The 'real' one is supercluster.org.
> 
> Nic
> -- 
> Nicholas Henke
> Penguin Herder & Linux Cluster System Programmer
> Liniac Project - Univ. of Pennsylvania
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or
> unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf 

-----------------------------------------------------------------
??? Yahoo!??
??????????????????????
http://tw.promo.yahoo.com/mail_premium/stationery.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rokrau at yahoo.com  Thu Dec  4 19:36:27 2003
From: rokrau at yahoo.com (Roland Krause)
Date: Thu, 4 Dec 2003 16:36:27 -0800 (PST)
Subject: problem allocating large amount of memory
In-Reply-To: <Pine.LNX.4.44.0312032353040.7748-100000@coffee.psychology.mcmaster.ca>
Message-ID: <20031205003627.2288.qmail@web40013.mail.yahoo.com>

Mark,
thanks a lot for your helpful comments.

So, now I am somewhat more confused :-) 

I've tried your code and, yes, I am able to allocate up to 3G of memory
in 124K chunks. Unfortunately this doesn't not help me because the
memory needed is allocated for a large software package, written in
Fortran, that makes heavy use of all kinds of libraries (libc among
others) over which I have no control. 

Also, if I change your code to try to allocate the available memory in
one chunk I am obviously in the same situation as before. If I
understand you correctly, this is because small chunks of memory are
allocated with sbrk, large ones with mmap. I notice from the output of
your program that the allocated memory is also not in a contiguous
block. This must be because Redhat's prelinking of glibc to a fixed
address in memory as noted by David Lombard. 

What I dont understand at all then is why your second code example
(mmap) is able to return
 2861 MB : 157286400
or even more memory upon changing size to 4.e9. Isn't this supposently
simply overwriting the area where glibc is in? That confuses me now.
Will that prevent me from using stdio. There is no problem linking
statically for me. I am doing that for other reasons anyway. 

Best regards and many thanks for your input. 
Roland


--- Mark Hahn <hahn at physics.mcmaster.ca> wrote:
> 
> yes.  unless you are quite careful, your address space looks like
> this:
> 
> 0-128M		zero page
> 128M + small	program text
> 		sbrk heap (grows up)
> 1GB		mmap arena (grows up)
> 3GB - small	stack base (grows down)
> 3GB-4GB		kernel direct-mapped area
> 
> your ~1GB is allocated in the sbrk heap (above text, below 1GB).
> the ~2GB is allocated in the mmap arena (glibc puts large allocations
> there, if possible, since you can munmap arbitrary pages, but heaps
> can 
> only rarely shrink).
> 
> interestingly, you can avoid the mmap arena entirely if you try
> (static linking,
> avoid even static stdio).  that leaves nearly 3 GB available for the
> heap or stack.  
> also interesting is that you can use mmap with MAP_FIXED to avoid the
> default 
> mmap-arena at 1GB.  the following code demonstrates all of these. 
> the last time
> I tried, you could also move around the default mmap base
> (TASK_UNMAPPED_BASE,
> and could squeeze the 3G barier, too (TASK_SIZE).  I've seen patches
> to make 
> TASK_UNMAPPED_BASE a /proc setting, and to make the mmap arena grow
> down
> (which lets you start it at a little under 3G, leaving a few hundred
> MB for stack).
> finally, there is a patch which does away with the kernel's 1G chunk
> entirely
> (leaving 4G:4G, but necessitating some nastiness on context switches)
> 
> 
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/mman.h>
> 
> void print(char *message) {
>     unsigned l = strlen(message);
>     write(1,message,l);
> }
> void printuint(unsigned u) {
>     char buf[20];
>     char *p = buf + sizeof(buf) - 1;
>     *p-- = 0;
>     do {
>         *p-- = "0123456789"[u % 10];
>         u /= 10;
>     } while (u);
>     print(p+1);
> }
> 
> int main() {
> #if 1
> //    unsigned chunk = 128*1024;                                     
>           
>     unsigned chunk = 124*1024;
>     unsigned total = 0;
>     void *p;
> 
>     while (p = malloc(chunk)) {
>         total += chunk;
>         printuint(total);
>         print("MB\t: ");
>         printuint((unsigned)p);
>         print("\n");
>     }
> #else
>     unsigned offset = 150*1024*1024;
>     unsigned size = (unsigned) 3e9;
>     void *p = mmap((void*) offset,
>                    size,
>                    PROT_READ|PROT_WRITE,
>                    MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS,
>                    0,0);
>     printuint(size >> 20);
>     print(" MB\t: ");
>     printuint((unsigned) p);
>     print("\n");
> #endif
>     return 0;
> }
> 
> > Also has someone experience with the various kernel patches for
> large
> > memory out there (im's 4G/4G or IBM's 3.5G/0.5G hack)? 
> 
> there's nothing IBM-specific about 3.5/.5, that's for sure.
> 
> as it happens, I'm going to be doing some measurements of performance
> soon.
> 

__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From csamuel at vpac.org  Thu Dec  4 20:38:16 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Fri, 5 Dec 2003 12:38:16 +1100
Subject: LONG RANT [RE: RHEL Copyright Removal]
In-Reply-To: <20031125013008.GA6416@sphere.math.ucdavis.edu>
References: <0B27450D68F1D511993E0001FA7ED2B3036EE4F8@ukjhmbx12.ukjh.zeneca.com> <1069682488.2179.127.camel@scalable> <20031125013008.GA6416@sphere.math.ucdavis.edu>
Message-ID: <200312051238.17633.csamuel@vpac.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 25 Nov 2003 12:30 pm, Bill Broadley wrote:

> On Mon, Nov 24, 2003 at 10:01:30PM +0800, Laurence Liew wrote:
> > Hi all,
> >
> > RedHat have annouced academic pricing at USD25 per desktop (RHEL WS
> > based) and USD50 for Academic server (RHEL ES based) a week or so ago.
>
> This sounded relatively attractive to me, until I found out that
> USD25 per desktop for RHEL WS did NOT include the Opteron version.

I know this is a reply to an old message, but I think it's worth mentioning.

Looking at:

	http://www.redhat.com/solutions/industries/education/products/

It says that AMD64 (presumably both Opteron and Athlon 64) is included in this 
deal.  To quote:

	Versions available: x86, IPF, or AMD64

Chris
- -- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/z+GIO2KABBYQAh8RAoz1AJ9q9LAB3zfMyT566v0U7+71ykSlxACdHZKJ
9yrL/fFEX1oSwtYYdeHizS8=
=nRd0
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From mperez at delta.ft.uam.es  Fri Dec  5 05:55:31 2003
From: mperez at delta.ft.uam.es (Manuel J)
Date: Fri, 5 Dec 2003 11:55:31 +0100
Subject: looking for specific PXE application
Message-ID: <200312051155.31691.mperez@delta.ft.uam.es>


	Hi. I am now involved in a clustering project and I need an application to 
collect all MAC addresses sent from PXE clients to a DHCP host with 
DHCPDISCOVER packets. I am trying to find out before start developing it by 
myself, so I think maybe I could get it from the beowulf project.

Could someone help me with some kind of reference, please?
Thanks.

Manuel J.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From agrajag at dragaera.net  Fri Dec  5 09:14:38 2003
From: agrajag at dragaera.net (Sean Dilda)
Date: Fri, 5 Dec 2003 09:14:38 -0500
Subject: looking for specific PXE application
In-Reply-To: <200312051155.31691.mperez@delta.ft.uam.es>; from mperez@delta.ft.uam.es on Fri, Dec 05, 2003 at 11:55:31AM +0100
References: <200312051155.31691.mperez@delta.ft.uam.es>
Message-ID: <20031205091438.C8280@vallista.dragaera.net>

On Fri, 05 Dec 2003, Manuel J wrote:

> 
> 	Hi. I am now involved in a clustering project and I need an application to 
> collect all MAC addresses sent from PXE clients to a DHCP host with 
> DHCPDISCOVER packets. I am trying to find out before start developing it by 
> myself, so I think maybe I could get it from the beowulf project.
> 
> Could someone help me with some kind of reference, please?
> Thanks.

dhcpd logs all requests, including the requesting MAC address and what
IP (if any) is assigned.  You can find those logs in /var/log/messages.
You can also check /var/lib/dhcpd.leases to see what leases (including
MAC addresses) are currently assigned.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From amitoj at cs.uh.edu  Fri Dec  5 10:09:56 2003
From: amitoj at cs.uh.edu (Amitoj G. Singh)
Date: Fri, 5 Dec 2003 09:09:56 -0600 (CST)
Subject: looking for specific PXE application
In-Reply-To: <200312051155.31691.mperez@delta.ft.uam.es>
Message-ID: <Pine.GSO.4.33.0312050909030.10008-100000@themis.cs.uh.edu>


I recall OSCAR could do that ...

http://oscar.openclustergroup.org

Hope this helps.

- Amitoj.

On Fri, 5 Dec 2003, Manuel J wrote:

>
> 	Hi. I am now involved in a clustering project and I need an application to
> collect all MAC addresses sent from PXE clients to a DHCP host with
> DHCPDISCOVER packets. I am trying to find out before start developing it by
> myself, so I think maybe I could get it from the beowulf project.
>
> Could someone help me with some kind of reference, please?
> Thanks.
>
> Manuel J.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From erwan at mandrakesoft.com  Fri Dec  5 07:56:54 2003
From: erwan at mandrakesoft.com (Erwan Velu)
Date: Fri, 05 Dec 2003 13:56:54 +0100
Subject: looking for specific PXE application
In-Reply-To: <200312051155.31691.mperez@delta.ft.uam.es>
References: <200312051155.31691.mperez@delta.ft.uam.es>
Message-ID: <1070629014.7715.1660.camel@revolution.mandrakesoft.com>

Hi, you could have a look to the script we are using in
CLIC/MandrakeClustering. This scripts are written in Perl and collect
mac addresses and assign in the dhcp configuration as static addresses.
You can use it and tune it for your needs.

> Could someone help me with some kind of reference, please?
> Thanks.
http://cvs.mandrakesoft.com/cgi-bin/cvsweb.cgi/cluster/clic/Devel_admin/add_nodes_to_dhcp_cluster.pm?rev=1.37&content-type=text/x-cvsweb-markup

-- 
Erwan Velu
Linux Cluster Distribution Project Manager
MandrakeSoft
43 rue d'aboukir 75002 Paris
Phone Number : +33 (0) 1 40 41 17 94
Fax Number   : +33 (0) 1 40 41 92 00
Web site     : http://www.mandrakesoft.com
OpenPGP key  : http://www.mandrakesecure.net/cks/ 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Fri Dec  5 11:15:22 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 5 Dec 2003 11:15:22 -0500 (EST)
Subject: looking for specific PXE application
In-Reply-To: <20031205091438.C8280@vallista.dragaera.net>
Message-ID: <Pine.LNX.4.44.0312051114420.17311-100000@ganesh.phy.duke.edu>

On Fri, 5 Dec 2003, Sean Dilda wrote:

> On Fri, 05 Dec 2003, Manuel J wrote:
> 
> > 
> > 	Hi. I am now involved in a clustering project and I need an application to 
> > collect all MAC addresses sent from PXE clients to a DHCP host with 
> > DHCPDISCOVER packets. I am trying to find out before start developing it by 
> > myself, so I think maybe I could get it from the beowulf project.
> > 
> > Could someone help me with some kind of reference, please?
> > Thanks.
> 
> dhcpd logs all requests, including the requesting MAC address and what
> IP (if any) is assigned.  You can find those logs in /var/log/messages.
> You can also check /var/lib/dhcpd.leases to see what leases (including
> MAC addresses) are currently assigned.

Somebody did publish a grazing script (or reference to one) in this
venue sometime in the last year, maybe.  Google the archives.

  rgb

> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From raysonlogin at yahoo.com  Fri Dec  5 12:02:45 2003
From: raysonlogin at yahoo.com (Rayson Ho)
Date: Fri, 5 Dec 2003 09:02:45 -0800 (PST)
Subject: Latency on Beowulf Mailing list
In-Reply-To: <1070639232.7721.1672.camel@revolution.mandrakesoft.com>
Message-ID: <20031205170245.58039.qmail@web11404.mail.yahoo.com>

It usually takes less than 20 minutes for me.

Rayson

--- Erwan Velu <erwan at mandrakesoft.com> wrote:
> When I'm sending messages to beowulf mailing list, I can see them
> after
> 8 hours :(
> 
> Sometimes, my answers are too old for being intresting :(
> 
> Any ideas? Does other users are in the same case ?
> -- 
> Erwan Velu
> Linux Cluster Distribution Project Manager
> MandrakeSoft
> 43 rue d'aboukir 75002 Paris
> Phone Number : +33 (0) 1 40 41 17 94
> Fax Number   : +33 (0) 1 40 41 92 00
> Web site     : http://www.mandrakesoft.com
> OpenPGP key  : http://www.mandrakesecure.net/cks/ 
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf


__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From erwan at mandrakesoft.com  Fri Dec  5 10:47:13 2003
From: erwan at mandrakesoft.com (Erwan Velu)
Date: Fri, 05 Dec 2003 16:47:13 +0100
Subject: Latency on Beowulf Mailing list
Message-ID: <1070639232.7721.1672.camel@revolution.mandrakesoft.com>

When I'm sending messages to beowulf mailing list, I can see them after
8 hours :(

Sometimes, my answers are too old for being intresting :(

Any ideas? Does other users are in the same case ?
-- 
Erwan Velu
Linux Cluster Distribution Project Manager
MandrakeSoft
43 rue d'aboukir 75002 Paris
Phone Number : +33 (0) 1 40 41 17 94
Fax Number   : +33 (0) 1 40 41 92 00
Web site     : http://www.mandrakesoft.com
OpenPGP key  : http://www.mandrakesecure.net/cks/ 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From tim.carlson at pnl.gov  Fri Dec  5 11:57:08 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Fri, 05 Dec 2003 08:57:08 -0800 (PST)
Subject: looking for specific PXE application
In-Reply-To: <200312051631.hB5GV6S10698@NewBlue.scyld.com>
Message-ID: <Pine.LNX.4.44.0312050853020.30089-100000@scorpion.emsl.pnl.gov>


> On Fri, 5 Dec 2003, Sean Dilda wrote:
>
> > On Fri, 05 Dec 2003, Manuel J wrote:
> >
> > >
> > > 	Hi. I am now involved in a clustering project and I need an application to
> > > collect all MAC addresses sent from PXE clients to a DHCP host with
> > > DHCPDISCOVER packets. I am trying to find out before start developing it by
> > > myself, so I think maybe I could get it from the beowulf project.
> > >
> > > Could someone help me with some kind of reference, please?
> > > Thanks.
> >
> > dhcpd logs all requests, including the requesting MAC address and what
> > IP (if any) is assigned.  You can find those logs in /var/log/messages.
> > You can also check /var/lib/dhcpd.leases to see what leases (including
> > MAC addresses) are currently assigned.
>
> Somebody did publish a grazing script (or reference to one) in this
> venue sometime in the last year, maybe.  Google the archives.
>
>   rgb

This is exactly how ROCKS clusters add nodes. Install your frontend, PXE
boot your nodes. If you've already decided on a clusters solution, then
nevermind :)

http://www.rocksclusters.org/

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Daniel.Kidger at quadrics.com  Fri Dec  5 12:04:42 2003
From: Daniel.Kidger at quadrics.com (Daniel Kidger)
Date: Fri, 5 Dec 2003 17:04:42 -0000
Subject: Latency on Beowulf Mailing list
Message-ID: <010C86D15E4D1247B9A5DD312B7F5AA78DE2BC@stegosaurus.bristol.quadrics.com>

> From: Erwan Velu [mailto:erwan at mandrakesoft.com]
> Subject: Latency on Beowulf Mailing list
> When I'm sending messages to beowulf mailing list, I can see 
> them after  8 hours :(

yes me too.

Much of the time my positngs take a median of say 5 hours.
In the mean time several other folk often manage to post their replies.


Daniel.

--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.      daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK         0117 915 5505
----------------------- www.quadrics.com --------------------

> 
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From herrold at owlriver.com  Fri Dec  5 13:58:18 2003
From: herrold at owlriver.com (R P Herrold)
Date: Fri, 5 Dec 2003 13:58:18 -0500 (EST)
Subject: beowulf]  Re: looking for specific PXE application
In-Reply-To: <Pine.GSO.4.33.0312050909030.10008-100000@themis.cs.uh.edu>
References: <Pine.GSO.4.33.0312050909030.10008-100000@themis.cs.uh.edu>
Message-ID: <Pine.LNX.4.53.0312051354180.8041@swampfox.owlriver.com>

On Fri, 5 Dec 2003, Amitoj G. Singh wrote:

> I recall OSCAR could do that ...
> http://oscar.openclustergroup.org

> > collect all MAC addresses sent from PXE clients to a DHCP host with
> > Could someone help me with some kind of reference, please?

These should all show with the 'arpwatch' package; or in the
alternative, by turning logging up for the tftp server(atftp 
works well) or on the dhcp server, can be extracted from 
/var/log/messages , and then awk |sort |uniq'ed out.

Is the Oscar tool more sophisticated than that?

-- Russ Herrold
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From becker at scyld.com  Fri Dec  5 15:28:42 2003
From: becker at scyld.com (Donald Becker)
Date: Fri, 5 Dec 2003 15:28:42 -0500 (EST)
Subject: Latency on Beowulf Mailing list
In-Reply-To: <1070639232.7721.1672.camel@revolution.mandrakesoft.com>
Message-ID: <Pine.LNX.4.44.0312051429120.2537-100000@training.scyld.com>

On Fri, 5 Dec 2003, Erwan Velu wrote:

> When I'm sending messages to beowulf mailing list, I can see them after
> 8 hours :(
> 
> Sometimes, my answers are too old for being intresting :(
> 
> Any ideas? Does other users are in the same case ?

The quick answer is "spammers and viruses".

There are several reasons that this is the case:
  Over 95% of Beowulf list postings are held for moderation
  The Beowulf list alone has about 3000 addresses

95% might seem large, but considering only 1 in ten attempted postings
is valid, only about 50% of the posts are held for moderation.  While I
do sometimes wake up in the middle of the night to moderate, you
shouldn't really expect that.

A posting may be held for moderation by match any of about 25 patterns.
Some of those patterns are pretty general -- even certain three digit
numbers and two digit country codes will trigger moderation.

Once held for moderation the posting may be automatically deleted.
Right now there are 1439 phrases and 3298 IP address prefixes and domain
names.  All were hand added.  My goal is over 90% automatic deletions.
If I stop adding rules, it drops below that number in a week or two as
spammers move machines and change tactics.

Less common is that a post is automatically approved.  Some spammers
have taken to including sections of web pages in their email, so don't
expect this increasing in the future.

The second point is also a result of spammers, albeit indirectly.  The
list is run by mailman, which splits the list up into sections.  If your
position is after a Teergruber, or the link is just busy, your email
will be delayed for several hours.  Despite being very responsible
mailers, our machine (or perhaps our IP address block) does sometimes
end up on a RBL.

I see this problem as only getting worse.  Our "3c509" mailing list is
first alphabetically, and thus is the first recipient of new spam.  I've
mostly given up on it, but leave it in place to harvest new patterns.
It received 5 new messages in the past 30 minutes, a rate up
substantially over just a few months ago. 

So, what can you do to avoid delays?  Nothing especially predictable,
because predictable measures are easily defeated by spammers.  But you
can 
- avoid posting or having a return address from free mail account services
- have a reverse-resolving host name on all mail hops
- don't have "adsl" or "dialup" in the header
- avoid all mention of "personal enhancement" drugs, purchasing drugs of
    any kind, moving money out of your sub-sahara country, mentions of
    credit card names, sex with a farm animal, sex with multiple farm
    animals, webcams, etc.  Talking about your cluster of webcams of
    viagra-enhanced farm animals trying to move their lottery winnings
    out of from Nigeria, even if they puchased the viagra at dicount
    rates from Canadian suppliers that included a free mini-rc car could
    conceivably be brought on-topic, but that posting will never make it.

-- 
Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
914 Bay Ridge Road, Suite 220		Scyld Beowulf cluster systems
Annapolis MD 21403			410-990-9993

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From jac67 at georgetown.edu  Fri Dec  5 16:54:06 2003
From: jac67 at georgetown.edu (Jess Cannata)
Date: Fri, 05 Dec 2003 16:54:06 -0500
Subject: Latency on Beowulf Mailing list
In-Reply-To: <Pine.LNX.4.44.0312051429120.2537-100000@training.scyld.com>
References: <Pine.LNX.4.44.0312051429120.2537-100000@training.scyld.com>
Message-ID: <3FD0FE7E.7070806@georgetown.edu>

Is the list a restricted list, meaning that only subscribers to the list 
can post messages? If it isn't, wouldn't this help reduce the number of 
messages that need to be moderated? If it is restricted, then I guess 
that the spammers are getting really good if they are spoofing the 
addresses of the 3000 subscribers.

Jess

Donald Becker wrote:

>On Fri, 5 Dec 2003, Erwan Velu wrote:
>
>  
>
>>When I'm sending messages to beowulf mailing list, I can see them after
>>8 hours :(
>>
>>Sometimes, my answers are too old for being intresting :(
>>
>>Any ideas? Does other users are in the same case ?
>>    
>>
>
>The quick answer is "spammers and viruses".
>
>There are several reasons that this is the case:
>  Over 95% of Beowulf list postings are held for moderation
>  The Beowulf list alone has about 3000 addresses
>
>95% might seem large, but considering only 1 in ten attempted postings
>is valid, only about 50% of the posts are held for moderation.  While I
>do sometimes wake up in the middle of the night to moderate, you
>shouldn't really expect that.
>
>A posting may be held for moderation by match any of about 25 patterns.
>Some of those patterns are pretty general -- even certain three digit
>numbers and two digit country codes will trigger moderation.
>
>Once held for moderation the posting may be automatically deleted.
>Right now there are 1439 phrases and 3298 IP address prefixes and domain
>names.  All were hand added.  My goal is over 90% automatic deletions.
>If I stop adding rules, it drops below that number in a week or two as
>spammers move machines and change tactics.
>
>Less common is that a post is automatically approved.  Some spammers
>have taken to including sections of web pages in their email, so don't
>expect this increasing in the future.
>
>The second point is also a result of spammers, albeit indirectly.  The
>list is run by mailman, which splits the list up into sections.  If your
>position is after a Teergruber, or the link is just busy, your email
>will be delayed for several hours.  Despite being very responsible
>mailers, our machine (or perhaps our IP address block) does sometimes
>end up on a RBL.
>
>I see this problem as only getting worse.  Our "3c509" mailing list is
>first alphabetically, and thus is the first recipient of new spam.  I've
>mostly given up on it, but leave it in place to harvest new patterns.
>It received 5 new messages in the past 30 minutes, a rate up
>substantially over just a few months ago. 
>
>So, what can you do to avoid delays?  Nothing especially predictable,
>because predictable measures are easily defeated by spammers.  But you
>can 
>- avoid posting or having a return address from free mail account services
>- have a reverse-resolving host name on all mail hops
>- don't have "adsl" or "dialup" in the header
>- avoid all mention of "personal enhancement" drugs, purchasing drugs of
>    any kind, moving money out of your sub-sahara country, mentions of
>    credit card names, sex with a farm animal, sex with multiple farm
>    animals, webcams, etc.  Talking about your cluster of webcams of
>    viagra-enhanced farm animals trying to move their lottery winnings
>    out of from Nigeria, even if they puchased the viagra at dicount
>    rates from Canadian suppliers that included a free mini-rc car could
>    conceivably be brought on-topic, but that posting will never make it.
>
>  
>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rokrau at yahoo.com  Fri Dec  5 20:05:56 2003
From: rokrau at yahoo.com (Roland Krause)
Date: Fri, 5 Dec 2003 17:05:56 -0800 (PST)
Subject: problem allocating large amount of memory
In-Reply-To: <Pine.LNX.4.44.0312041949590.14254-100000@coffee.psychology.mcmaster.ca>
Message-ID: <20031206010556.44528.qmail@web40012.mail.yahoo.com>

Mark,

thanks for the clarification. I was now able to squeeze
TASK_UNMAPPED_BASE to a rather small fraction of TASK_SIZE and to
allocate enough memory for the application in question. 

Again, thanks a lot for your very helpful comments. 

Roland


--- Mark Hahn <hahn at physics.mcmaster.ca> wrote:
> 
> I probably should have commented on the code a bit more.  it
> demonstrates
> three separate things: that for <128K allocations, libc uses the heap
> first, then when that fills (hits the mmap arena) it switches to
> allocating
> in the mmap arena.  if allocations are 128K or more, it *starts* in
> the 
> mmap arena (since mmap has advantages when doing large allocations -
> munmap).  
> finally, if you statically link and avoid the use of stdio,
> you can make one giant allocation from the end of text up to stack.
> 
> 

__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From pesch at attglobal.net  Sat Dec  6 16:50:11 2003
From: pesch at attglobal.net (pesch at attglobal.net)
Date: Sat, 06 Dec 2003 13:50:11 -0800
Subject: Latency on Beowulf Mailing list
References: <Pine.LNX.4.44.0312051429120.2537-100000@training.scyld.com>
Message-ID: <3FD24F13.8B8CE253@attglobal.net>

I was planning to call Beowulf clustering the "Viagra of computing" - but after Donalds elaborations I plan to
change my mind :(((

Donald Becker wrote:

> On Fri, 5 Dec 2003, Erwan Velu wrote:
>
> > When I'm sending messages to beowulf mailing list, I can see them after
> > 8 hours :(
> >
> > Sometimes, my answers are too old for being intresting :(
> >
> > Any ideas? Does other users are in the same case ?
>
> The quick answer is "spammers and viruses".
>
> There are several reasons that this is the case:
>   Over 95% of Beowulf list postings are held for moderation
>   The Beowulf list alone has about 3000 addresses
>
> 95% might seem large, but considering only 1 in ten attempted postings
> is valid, only about 50% of the posts are held for moderation.  While I
> do sometimes wake up in the middle of the night to moderate, you
> shouldn't really expect that.
>
> A posting may be held for moderation by match any of about 25 patterns.
> Some of those patterns are pretty general -- even certain three digit
> numbers and two digit country codes will trigger moderation.
>
> Once held for moderation the posting may be automatically deleted.
> Right now there are 1439 phrases and 3298 IP address prefixes and domain
> names.  All were hand added.  My goal is over 90% automatic deletions.
> If I stop adding rules, it drops below that number in a week or two as
> spammers move machines and change tactics.
>
> Less common is that a post is automatically approved.  Some spammers
> have taken to including sections of web pages in their email, so don't
> expect this increasing in the future.
>
> The second point is also a result of spammers, albeit indirectly.  The
> list is run by mailman, which splits the list up into sections.  If your
> position is after a Teergruber, or the link is just busy, your email
> will be delayed for several hours.  Despite being very responsible
> mailers, our machine (or perhaps our IP address block) does sometimes
> end up on a RBL.
>
> I see this problem as only getting worse.  Our "3c509" mailing list is
> first alphabetically, and thus is the first recipient of new spam.  I've
> mostly given up on it, but leave it in place to harvest new patterns.
> It received 5 new messages in the past 30 minutes, a rate up
> substantially over just a few months ago.
>
> So, what can you do to avoid delays?  Nothing especially predictable,
> because predictable measures are easily defeated by spammers.  But you
> can
> - avoid posting or having a return address from free mail account services
> - have a reverse-resolving host name on all mail hops
> - don't have "adsl" or "dialup" in the header
> - avoid all mention of "personal enhancement" drugs, purchasing drugs of
>     any kind, moving money out of your sub-sahara country, mentions of
>     credit card names, sex with a farm animal, sex with multiple farm
>     animals, webcams, etc.  Talking about your cluster of webcams of
>     viagra-enhanced farm animals trying to move their lottery winnings
>     out of from Nigeria, even if they puchased the viagra at dicount
>     rates from Canadian suppliers that included a free mini-rc car could
>     conceivably be brought on-topic, but that posting will never make it.
>
> --
> Donald Becker                           becker at scyld.com
> Scyld Computing Corporation             http://www.scyld.com
> 914 Bay Ridge Road, Suite 220           Scyld Beowulf cluster systems
> Annapolis MD 21403                      410-990-9993
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lange at informatik.Uni-Koeln.DE  Sun Dec  7 15:37:50 2003
From: lange at informatik.Uni-Koeln.DE (Thomas Lange)
Date: Sun, 7 Dec 2003 21:37:50 +0100
Subject: looking for specific PXE application
In-Reply-To: <200312051155.31691.mperez@delta.ft.uam.es>
References: <200312051155.31691.mperez@delta.ft.uam.es>
Message-ID: <16339.36766.310248.237282@informatik.uni-koeln.de>

>>>>> On Fri, 5 Dec 2003 11:55:31 +0100, Manuel J <mperez at delta.ft.uam.es> said:

    > 	Hi. I am now involved in a clustering project and I need an
    > 	application to
    > collect all MAC addresses sent from PXE clients to a DHCP host

FAI, the fully automatic installation uses following simple command
pipe:

> tcpdump -qte broadcast and port bootpc >/tmp/mac.lis

The when all machines send out some broadcast messages, you will get
the list with

> perl -ane 'print "\U$F[0]\n"' /tmp/mac.lis|sort|uniq

-- 
regrads Thomas Lange
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Robin.Laing at drdc-rddc.gc.ca  Mon Dec  8 16:23:26 2003
From: Robin.Laing at drdc-rddc.gc.ca (Robin Laing)
Date: Mon, 08 Dec 2003 14:23:26 -0700
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <3FD4EBCE.4060908@drdc-rddc.gc.ca>

Hello,

This may be off topic but may be of interest to many that follow this 
list.

I have searched the WWW until my eyes are seeing double (and it isn't 
just the beer) trying to find a real answer to my question.  I have 
read the reviews and the hype about SATA being better than IDE/ATA and 
almost as good as SCSI, even better in a couple of areas.

I have talked to our computer people but they don't have enough 
experience with SATA drives to give me a straight answer.

With most new motherboards coming with controllers for SATA drives, I 
am considering using SATA drives for a new high-end workstation and 
small cluster.  I have seen RAID arrays using SATA drives which just 
makes the question even greater.  Of course I have seen RAID arrays 
using IDE drives.

I have used SCSI on all workstations I have built in the past, but the 
cost of SATA drives is making me think twice about this.  Files seem 
to be getting larger from day to day.

My concern is regarding multiple disk read/writes.  With IDE, you can 
wait for what seems like hours while data is being read off of the HD. 
  I want to know if the problem is still as bad with SATA as the 
original ATA drives?  Will the onboard RAID speed up access?

I know that throughput on large files is close and is usually related 
to platter speed.  I am also pleased that the buffers is now 8mb on 
all the drives I am looking at.

Main issue is writing and reading swap on those really large files and 
how it affects other work.

OS will be Linux on all.

-- 
Robin Laing


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Mon Dec  8 17:18:52 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Mon, 8 Dec 2003 14:18:52 -0800
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD4EBCE.4060908@drdc-rddc.gc.ca>
References: <3FD4EBCE.4060908@drdc-rddc.gc.ca>
Message-ID: <20031208221852.GB22702@cse.ucdavis.edu>

In my experience there are many baises, religious opinions, and rules
of thumb that are just extremely BAD basis for making these related
decisions.  Especially since many people's idea about such things change
relatively slowly compared to the actual hardware.

My best recommendation is to either find a benchmark that closely resembles
your application load (Similar mix of read/writes, same level of RAID, same
size read/writes, same locality) and actually benchmark.

I'm sure people can produce a particular configuration of SCSI, ATA, and SATA that 
will be best AND worst for a given benchmark.

So I'd look at bonnie++, postmark, or one of the other opensource benchmarks
see if any of those can be configured to be similar to your workload.  If not
write a benchmark that is similar to your workload and post it to the list asking
people to run it on their hardware.  The more effort you put into it the
more responses your likely to get.  Posting a table of performance results
on a website seems to encourage more to participate.

There are no easy answers, it depends on many many variables, the type
of OS, how long the partition has been live (i.e. fragmentation),
the IDE/SCSI chipset, the drivers, the OS, even the cables can have
performance effects.

The market seems to be going towards SATA, seems like many if not all major
storage vendors have an entry level SATA product, I've no idea if this
is just the latest fad or justified from a pure price/performance perspective.

Good luck.

-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From alvin at Mail.Linux-Consulting.com  Mon Dec  8 18:15:24 2003
From: alvin at Mail.Linux-Consulting.com (Alvin Oga)
Date: Mon, 8 Dec 2003 15:15:24 -0800 (PST)
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <20031208221852.GB22702@cse.ucdavis.edu>
Message-ID: <Pine.LNX.3.96.1031208150058.16362A-100000@Maggie.Linux-Consulting.com>


hi ya robin/bill

On Mon, 8 Dec 2003, Bill Broadley wrote:

> In my experience there are many baises, religious opinions, and rules
> of thumb that are just extremely BAD basis for making these related
> decisions.  Especially since many people's idea about such things change
> relatively slowly compared to the actual hardware.

yupperz !!
 
> My best recommendation is to either find a benchmark that closely resembles
> your application load (Similar mix of read/writes, same level of RAID, same
> size read/writes, same locality) and actually benchmark.
> 
> I'm sure people can produce a particular configuration of SCSI, ATA, and SATA that 
> will be best AND worst for a given benchmark.

yupperz ... no problem ... you want theirs to look not as good, and our
version look like its better... yupp ..

definite yupppers one do a benchmark and compare only similar environments
and apps ... otherwise one is comparing christmas shopping to studing
to be a vet ( benchmarks not related to each other )

-----

for which disks ...
	- i'd stick with plain ole ide disks
	- its cheap
	- you can have a whole 2nd system to backup the primary array
	for about the same cost as an expensive dual-cpu or scsi-based
	system

for serial ata ...
	- dont use its onboard controller for raid ... 
	- it probably be as good as onboard raid on existing mb...
	( ie ... none of um works right 
		works == hands off booting of any disk 
		works == data resyncs by itself w/o intervention

		but doing the same tests w/ sw raid or hw raid
		controller w/ scsi works fine

> So I'd look at bonnie++, postmark, or one of the other opensource benchmarks
> see if any of those can be configured to be similar to your workload.  If not
> write a benchmark that is similar to your workload and post it to the list asking
> people to run it on their hardware.  The more effort you put into it the
> more responses your likely to get.  Posting a table of performance results
> on a website seems to encourage more to participate.

other benchmark tests you can run ....

	http://www.Linux-1U.net/Benchmarks

other tuning you can to to tweek the last instruction out of the system

	http://www.Linux-1U.net/Tuning
 
> There are no easy answers, it depends on many many variables, the type
> of OS, how long the partition has been live (i.e. fragmentation),
> the IDE/SCSI chipset, the drivers, the OS, even the cables can have
> performance effects.

(look for the) picture of partitions/layout ... makes  big difference

	http://www.Linux-1U.net/Partition/

> The market seems to be going towards SATA, seems like many if not all major
> storage vendors have an entry level SATA product, I've no idea if this
> is just the latest fad or justified from a pure price/performance perspective.

if the disk manufacturers stop making scsi/ide disks .. we wont have
any choice... unless we go to the super fast "compact flash"
and its next generation 100GB "compact flash" in the r/d labs 
which is why ibm sold its klunky mechanical disk drives in favor
of its new "solid state disks"  ( forgot its official name )

c ya
alvin

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Mon Dec  8 18:18:07 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Mon, 8 Dec 2003 18:18:07 -0500 (EST)
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD4EBCE.4060908@drdc-rddc.gc.ca>
Message-ID: <Pine.LNX.4.44.0312081736360.13824-100000@coffee.psychology.mcmaster.ca>

| read the reviews and the hype about SATA being better than IDE/ATA and 
| almost as good as SCSI, even better in a couple of areas.

<sigh>


| I have talked to our computer people but they don't have enough 
| experience with SATA drives to give me a straight answer.

there's not THAT much to know.

SCSI:	pro: a nice bus-based architecture which makes it easy to put
	many disks in one enclosure.  the bus is fast enough to
	support around 3-5 disks without compromising bandwidth
	(in fact, you'll probably saturate your PCI(x) bus(es) first
	if you're not careful!) 10 and 15K RPM SCSI disks are common, 
	leading to serious advantages if your workload is latency-dominated
	(mostly of small, scattered, uncachable reads, and/or synchronous
	writes.)  5yr warrantees and 1.2 Mhour MTBF are very comforting.

	con: price.  older (pre Ultra2) disks lack even basic CRC protection.
	always lower-density than ATA; often hotter, too.  (note that the 
	density can actually negate any MTBF advantage!)

ATA:	pro: price.  massive density (and that means that bandwidth is 
	excellent, even at much lower RPM.)  ease of purchase/replacement;
	ubiquity (and cheapness) of controllers.

	con: probably half the MTBF of SCSI, 1yr warrantee is common, 
	though the price-premium for 3yr models is small.  most disks are
	5400 or	7200 RPM so latency is potentially a problem (though there is
	one line of 10K RPM'ers but at close to SCSI prices and density).

PATA:	pro: still a bit cheaper than SATA.  PATA *does* include tagged 
	command queueing, but it's mostly ignored by vendors and drivers.

	con: cabling just plain sucks for more than a few disks (remember:
	the standard STILL requires cable be <= 18" of flat ribbon).

SATA:	pro: nicer cable, albeit not bus or daisy-chain (until sata2);
	much improved support for hot-plug and TCQ.

	con: not quite mainstream (price and availability).  putting many
	in one box is still a bit of a problem (albeit also a power problem
	for any kind of disk...)

I have no idea what to make of the roadmappery that shows sata merging with 
serial-attached scsi in a few years.

| My concern is regarding multiple disk read/writes.  With IDE, you can 
| wait for what seems like hours while data is being read off of the HD. 

nah.  it's basically just a design mistake to put two active PATA disks 
on the same channel.  it's fine if one is usually idle (say, cdrom or 
perhaps a disk containing old archives).  most people just avoid putting 
two disks on a channel at all, since channels are almost free, and you 
get to ignore jumpers.


|   I want to know if the problem is still as bad with SATA as the 
| original ATA drives?  Will the onboard RAID speed up access?

there was no problem with "original" disks.  and raid works fine, up until
you saturate your PCI bus...


| I know that throughput on large files is close and is usually related 
| to platter speed.  I am also pleased that the buffers is now 8mb on 
| all the drives I am looking at.

one of the reasons that TCQ is not a huge win is that the kernel's cache
is ~500x bigger than the disk's.  however, it's true that bigger ondisk cache
lets the drive better optimize delayed writes within a cylinder.  for non-TCQ
ATA to be competitive when writing, it's common to enable write-behind
caching.  this can cause data loss or corruption if you crash at exactly the 
right time (paranoids take note).


| Main issue is writing and reading swap on those really large files and 
| how it affects other work.

swap thrashing is a non-fatal error that should be fixed, 
not band-aided by gold-plated hardware.

finally, I should mention that Jeff Garzik is doing a series of good new SATA
drivers (deliberately ignoring the accumulated kruft in the kernel's PATA
code).  they plug into the kernel's SCSI interface, purely to take advantage 
of support for queueing and hotplug, I think.

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Mon Dec  8 20:24:44 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Mon, 8 Dec 2003 20:24:44 -0500 (EST)
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <Pine.LNX.4.44.0312081736360.13824-100000@coffee.psychology.mcmaster.ca>
Message-ID: <Pine.LNX.4.44.0312082013040.15807-100000@lilith.rgb.private.net>

On Mon, 8 Dec 2003, Mark Hahn wrote:

> | read the reviews and the hype about SATA being better than IDE/ATA and 
> | almost as good as SCSI, even better in a couple of areas.
> 
> <sigh>
> 
> 
> | I have talked to our computer people but they don't have enough 
> | experience with SATA drives to give me a straight answer.
> 
> there's not THAT much to know.

But what there is is a pleasure to read, as always, when you write it.
One tiny question:

> | My concern is regarding multiple disk read/writes.  With IDE, you can 
> | wait for what seems like hours while data is being read off of the HD. 
> 
> nah.  it's basically just a design mistake to put two active PATA disks 
> on the same channel.  it's fine if one is usually idle (say, cdrom or 
> perhaps a disk containing old archives).  most people just avoid putting 
> two disks on a channel at all, since channels are almost free, and you 
> get to ignore jumpers.

So, admitting my near total ignorance about SATA and whether or not I
should lust after it, does SATA perpetuate this problem, or is it more
like a SCSI daisy chain, where each drive gets its own ID and there is a
better handling of parallel access?

The "almost free" part has several annoying aspects, after all.  An
extra controller (or two).  One cable per disk if you use one disk per
channel.  The length thing.  The fact that ribbon cables, when they turn
sideways, do a gangbusters job of occluding fans and airflow, and with
four or five of them in a case routing them around can be a major pain.

There is also a small price premium for SATA, although admittedly it
isn't much.  So, in your fairly expert opinion, is it worth it?

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Mon Dec  8 21:44:14 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Mon, 8 Dec 2003 21:44:14 -0500 (EST)
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <Pine.LNX.4.44.0312082013040.15807-100000@lilith.rgb.private.net>
Message-ID: <Pine.LNX.4.44.0312082111330.15283-100000@coffee.psychology.mcmaster.ca>

> > | My concern is regarding multiple disk read/writes.  With IDE, you can 
> > | wait for what seems like hours while data is being read off of the HD. 
> > 
> > nah.  it's basically just a design mistake to put two active PATA disks 
> > on the same channel.  it's fine if one is usually idle (say, cdrom or 
> > perhaps a disk containing old archives).  most people just avoid putting 
> > two disks on a channel at all, since channels are almost free, and you 
> > get to ignore jumpers.
> 
> So, admitting my near total ignorance about SATA and whether or not I
> should lust after it, does SATA perpetuate this problem, or is it more
> like a SCSI daisy chain, where each drive gets its own ID and there is a
> better handling of parallel access?

no, or maybe yes.  SATA is *not* becoming more SCSI-like: drives don't
get their own ID (since they're not on a bus).  in SATA-1 at least,
the cable is strictly point-to-point, and each drive acts like a separate
channel (which were always parallel even in PATA).  basically, master/slave
was just a really crappy implementation of SCSI IDs, and SATA has done away
with it.  given that IO is almost always host<>device, there's no real value
in making devices peers, IMO.  

yes to concurrency, but no to "like SCSI" (peers, IDs and multidrop).

> extra controller (or two).  One cable per disk if you use one disk per
> channel.

one cable per disk, period.  this is sort of an interesting design trend,
actually: away from parallel multidrop buses, towards serial point-to-point
ones.  in fact, the sata2 "port multiplier" extension is really a sort
of packet-switching mechanism...

> There is also a small price premium for SATA, although admittedly it
> isn't much.  So, in your fairly expert opinion, is it worth it?

my next 8x250G server(s) will use a pair of promise s150tx4 (non-raid) 4-port
sata controllers ;)

I don't see any significant benefit except where you need lots of devices
and/or hotswap.  well, beyond the obvious coolness factor, of course...
though come to think of it, there should be some performance, and probably
robustness benefits from Jeff Garzik's clean-slate approach.

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From glen at mail.cert.ucr.edu  Mon Dec  8 22:27:12 2003
From: glen at mail.cert.ucr.edu (Glen Kaukola)
Date: Mon, 08 Dec 2003 19:27:12 -0800
Subject: autofs
Message-ID: <3FD54110.7090703@cert.ucr.edu>

Hi,

I was wanting to use autofs to mount all the nfs shares on my nodes to 
ease the pain of having an nfs server go down.  But the problem with 
that, is that mpich jobs don't seem to want to run the first time 
around.  If I then run them a second time, the drives are mounted, and 
they run fine.  I don't think my users are going to like that too much 
though, so would anyone know a solution?

Thanks,
Glen

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hanzl at noel.feld.cvut.cz  Tue Dec  9 04:53:54 2003
From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz)
Date: Tue, 09 Dec 2003 10:53:54 +0100
Subject: autofs
In-Reply-To: <3FD54110.7090703@cert.ucr.edu>
References: <3FD54110.7090703@cert.ucr.edu>
Message-ID: <20031209105354B.hanzl@unknown-domain>

> I was wanting to use autofs to mount all the nfs shares on my nodes to 
> ease the pain of having an nfs server go down.  But the problem with 
> that, is that mpich jobs don't seem to want to run the first time 
> around.

If you are using bproc than there is a slight chance that it is somehow
related to autofs/bproc deadlock which I discovered long time ago (and
I have no idea whether my fix made it to bproc or not), see:

  http://www.beowulf.org/pipermail/beowulf/2002-May/003508.html

Regards

Vaclav Hanzl
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From David_Walters at sra.com  Tue Dec  9 07:46:24 2003
From: David_Walters at sra.com (Walters, David)
Date: Tue, 9 Dec 2003 07:46:24 -0500
Subject: EMC, anyone?
Message-ID: <0EB5C81FE6FE5A4F8D1FEBF59C6C7BAA1A1824@durham.sra.com>

Our group has an opportunity that few would pass up - more or less free
storage.  Our parent organization is preparing to purchase a large amount of
EMC storage, the configuration of which is not yet nailed down.  We are
investigating the potential to be the recipients of part of that storage,
and (crossing fingers) no one has mentioned the dreaded chargeback word yet.
Obviously, we would be thrilled to gain access to TBs of free storage, so we
can spend more of our budget on people and compute platforms.

Naturally, the EMC reps are plying us with lots of jargon, PR, white papers,
and so on explaining why their technology is the perfect fit for us.
However, I am bothered by the fact that EMC does not have a booth at SC each
year, and I do not see them mentioned in the HPC trade rags.  Makes me think
that they don't really have the technology and support tailored for the HPC
community.

We, of course, are doing due diligence on the business case side, matching
our needs with their numbers.  My question to this group is "Do any of you
use EMC for your HPC storage?"  If so, how?  Been happy with it?

We do primarily models with heavy latency dependency (meteorological, with
CMAQ and MM5).  This will not be the near-line storage, but rather NAS
attached to HiPPI or gigE.

Thanks in advance,

Dave Walters
Project Manager, SRA
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From kallio at ebi.ac.uk  Tue Dec  9 07:24:56 2003
From: kallio at ebi.ac.uk (Kimmo Kallio)
Date: Tue, 9 Dec 2003 12:24:56 +0000 (GMT)
Subject: autofs
In-Reply-To: <3FD54110.7090703@cert.ucr.edu>
Message-ID: <Pine.LNX.4.44.0312091157560.25441-100000@rakkine.ebi.ac.uk>

On Mon, 8 Dec 2003, Glen Kaukola wrote:

> Hi,
> 
> I was wanting to use autofs to mount all the nfs shares on my nodes to 
> ease the pain of having an nfs server go down.  But the problem with 
> that, is that mpich jobs don't seem to want to run the first time 
> around.  If I then run them a second time, the drives are mounted, and 
> they run fine.  I don't think my users are going to like that too much 
> though, so would anyone know a solution?

Hi, 

This is not specific to your application but a general autofs issue: If
and autofs directory is not mounted, it simply doesn't exists and some
operations (like file exists) do fail. As a workaround try doing a
indirect autofs mount via a symlink, instead of mounting:

  /my/directory 

do a link :

  /my/directory -> /tmp_mnt/my/directory
  
and automount /tmp_mnt/my/directory instead, but always use /my/directory
in file references. Resolving the link forces the mount operation and
solves the problem. 

However if the automount fails (server down) it doesn't necessarely make
your users any happier as the applications will fail, unless if an long
nfs timeout would kill your application anyway...

Regards,

  Kimmo Kallio, European Bioinformatics Institute

> 
> Thanks,
> Glen
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From dskr at mac.com  Tue Dec  9 08:40:34 2003
From: dskr at mac.com (dskr at mac.com)
Date: Tue, 9 Dec 2003 08:40:34 -0500
Subject: Terrasoft Black Lab Linux
Message-ID: <43DEB9EC-2A4D-11D8-9EAE-00039394839E@mac.com>


Greetings:

Does anyone on the list have any experience with TerraSoft's Black Lab 
linux?

As many of you may recall, I am a big fan of 'software that sucks less' 
-- to quote a
wonderful Scyld T-shirt I once saw. Imagine my surprise, then, when I 
found that
TerraSoft (promulgators of YellowDog and BlackLab Linux for PPC) is 
shipping a
new version (2.2) of BlackLab that is based on BProc.

Is this good news? I think it could be for TerraSoft ; this move is a 
big upgrade from
their earlier offering which reminded me of the Dark Times in 
clustering.
(Does anyone else still remember when we had to set up .rhosts files 
and grab
our copy of PVM out of someone else's home directory and copy it into 
our own?)

I'd like to see what BlackLab's story is. but I have been unable to 
find any of the
sources for this product available for download. In particular, I would 
like to know:

	* Does it use beonss?

	* Does it use beoboot?

	* Does it netboot remote Macintoshes?

	* What version of BProc does it use?

	* How did they do MPI? Did they crib Don's version
	of MPICH for BProc?

Additionally, I'm looking for good ideas which can be adapted to a 
little
toy I wrote years ago called 'mpi-mandel'. They tout a similar program 
and
I was hoping to have a peek at it. Does anyone know if their similar 
program
is available under the GPL?

If anyone on this forum has experience with this product, I would 
appreciate
your feedback. If anyone can furnish me with the sources or links for 
the BlackLab
MPI, beoboot, and mandelbrot program, I would be grateful.

Regards,
	Dan Ridge

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Patrick.Begou at hmg.inpg.fr  Tue Dec  9 08:22:39 2003
From: Patrick.Begou at hmg.inpg.fr (Patrick Begou)
Date: Tue, 09 Dec 2003 14:22:39 +0100
Subject: autofs
References: <Pine.LNX.4.44.0312091157560.25441-100000@rakkine.ebi.ac.uk>
Message-ID: <3FD5CC9F.C34CC0CE@hmg.inpg.fr>

Kimmo Kallio a ?crit :
> This is not specific to your application but a general autofs issue: If
> and autofs directory is not mounted, it simply doesn't exists and some
> operations (like file exists) do fail. As a workaround try doing a
> indirect autofs mount via a symlink, instead of mounting:
> 
>   /my/directory
> 
> do a link :
> 
>   /my/directory -> /tmp_mnt/my/directory
> 
> and automount /tmp_mnt/my/directory instead, but always use /my/directory
> in file references. Resolving the link forces the mount operation and
> solves the problem.

I've done something similar but I've added "." in the linked path, like
this:
/my/directory -> /tmp_mnt/my/directory/.

I didn't get any problem with such a link.

Patrick

-- 
===============================================================
|  Equipe M.O.S.T.         | http://most.hmg.inpg.fr          |
|  Patrick BEGOU           |       ------------               |
|  LEGI                    | mailto:Patrick.Begou at hmg.inpg.fr |
|  BP 53 X                 | Tel 04 76 82 51 35               |
|  38041 GRENOBLE CEDEX    | Fax 04 76 82 52 71               |
===============================================================
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Tue Dec  9 10:03:12 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Tue, 9 Dec 2003 07:03:12 -0800
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF92@orsmsx402.jf.intel.com>

From: Mark Hahn; Sent: Monday, December 08, 2003 3:18 PM
> 
> SCSI:	pro: a nice bus-based architecture which makes it easy to put
> 	many disks in one enclosure.  the bus is fast enough to
> 	support around 3-5 disks without compromising bandwidth
> 	(in fact, you'll probably saturate your PCI(x) bus(es) first
> 	if you're not careful!) 10 and 15K RPM SCSI disks are common,
> 	leading to serious advantages if your workload is
latency-dominated
> 	(mostly of small, scattered, uncachable reads, and/or
synchronous
> 	writes.)  5yr warrantees and 1.2 Mhour MTBF are very comforting.

Very big pro:  You can get much higher *sustained* bandwidth levels,
regardless of CPU load.  ATA/PATA requires CPU involvement, and
bandwidth tanks under moderate CPU load.

The highest SCSI bandwidth rates I've seen first hand are 290 MB/S for
IA32 and 380 MB/S for IPF. Both had two controllers on independent PCI-X
busses, 6 disks for IA32 and 12 for IPF in a s/w RAID-0 config.


Does SATA reduce the CPU requirement from ATA/PATA, or is it the same?
Unless it's substantially lower, you still have a system best suited for
low to moderate I/O needs.

BTW, http://www.iozone.org/ is a nice standard I/O benchmark.  But, as
mentioned earlier in this thread, app-specific benchmarking is *always*
best.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Robin.Laing at drdc-rddc.gc.ca  Tue Dec  9 11:11:30 2003
From: Robin.Laing at drdc-rddc.gc.ca (Robin Laing)
Date: Tue, 09 Dec 2003 09:11:30 -0700
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <20031208223637.47124.qmail@web60310.mail.yahoo.com>
References: <20031208223637.47124.qmail@web60310.mail.yahoo.com>
Message-ID: <3FD5F432.6040600@drdc-rddc.gc.ca>

Andrew Latham wrote:
> While I understand your pain I have no facts for you other than that SATA is
> much faster than IDE. It can come close to SCSI(160). I have used SATA a little
> but am happy with it. the selling point for me is cost of controler and disk
> (controlers of SATA are much less), and the smaller cable format. The cable is
> so small and easy to use that it is the major draw for me.
> 
> good luck on your quest!
> 

I knew this but for straight throughput but it is random access that 
is the real question.

-- 
Robin Laing

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgoornaden at scyld.com  Tue Dec  9 03:12:03 2003
From: rgoornaden at scyld.com (rgoornaden at scyld.com)
Date: Tue, 9 Dec 2003 03:12:03 -0500
Subject: fstab
Message-ID: <200312090812.hB98C3S25365@NewBlue.scyld.com>


hello everybody
after i have edit the file /etc/fstab that I amend the fellowing line to
the file
masternode:/home /home nfs
OR
I use the command "mount -t nfs masternode:/home /home@

to check whether the nfs was successful or not I type "df" on node2 for
instance and i get this result...

"/dev/hda3  17992668 682888 16395776 4% /
none	    256900	  0  256900  0% /dev/shm "

I suposse that this is wrong as it was not mounted on the masternode

thanks
 Ryan


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Tue Dec  9 11:58:32 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Tue, 9 Dec 2003 11:58:32 -0500 (EST)
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD5F6DD.6000505@drdc-rddc.gc.ca>
Message-ID: <Pine.LNX.4.44.0312091146320.19007-100000@coffee.psychology.mcmaster.ca>

> > | My concern is regarding multiple disk read/writes.  With IDE, you can 
> > | wait for what seems like hours while data is being read off of the HD. 
> > 
> > nah.  it's basically just a design mistake to put two active PATA disks 
> > on the same channel.  it's fine if one is usually idle (say, cdrom or 
> > perhaps a disk containing old archives).  most people just avoid putting 
> > two disks on a channel at all, since channels are almost free, and you 
> > get to ignore jumpers.
>
> So it would be a good idea to put data and /tmp on a different channel 
> than swap?

if you're expecting concurrency, then you shouldn't share a limited resource.
a single (master/slave) PATA channel is one such resource.  sharing a spindle
(two partitions on a single disk of any sort) is just as much a mistake.

> > caching.  this can cause data loss or corruption if you crash at exactly the 
> > right time (paranoids take note).
> > 
> I forgot about the "write-behind" problem.  I have been burned with 
> this before.

really?  the window is quite small, since lazy-writing IDEs *do* have a 
timeout for how long they'll delay a write.  or are you thinking of the 
issue of shutting down a machine - when the ATX poweroff happens before
the write is flushed?  (and the OS fails to properly flush the cache...)
the latter is fixed in current Linux.

> memeory while working.  I know on my present workstation I will work 
> with a file that is 2X the memory and I find that the machine stutters 
> (locks for a few seconds) every time there is any disk ascess.  I 

I'll bet you a beer that this is a memory-management problem rather than 
anything wrong with the disk.  Linux has always had a tendency to over-cache,
and get to a point where you clearly notice its scavenging scans.

> one thing I was looking at with SCSI.  From this I take it that SATA 
> can handle some queueing but it just isn't supported yet?

grep LKML for jgarzik and libata.  my real point is that queueing is not 
all that important, since the kernel has always done seek scheduling.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Robin.Laing at drdc-rddc.gc.ca  Tue Dec  9 11:22:53 2003
From: Robin.Laing at drdc-rddc.gc.ca (Robin Laing)
Date: Tue, 09 Dec 2003 09:22:53 -0700
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <Pine.LNX.4.44.0312081736360.13824-100000@coffee.psychology.mcmaster.ca>
References: <Pine.LNX.4.44.0312081736360.13824-100000@coffee.psychology.mcmaster.ca>
Message-ID: <3FD5F6DD.6000505@drdc-rddc.gc.ca>

Mark Hahn wrote:
> | read the reviews and the hype about SATA being better than IDE/ATA and 
> | almost as good as SCSI, even better in a couple of areas.
> 
> <sigh>
> 
> 
> | I have talked to our computer people but they don't have enough 
> | experience with SATA drives to give me a straight answer.
> 
> there's not THAT much to know.
> 


> | My concern is regarding multiple disk read/writes.  With IDE, you can 
> | wait for what seems like hours while data is being read off of the HD. 
> 
> nah.  it's basically just a design mistake to put two active PATA disks 
> on the same channel.  it's fine if one is usually idle (say, cdrom or 
> perhaps a disk containing old archives).  most people just avoid putting 
> two disks on a channel at all, since channels are almost free, and you 
> get to ignore jumpers.
> 
So it would be a good idea to put data and /tmp on a different channel 
than swap?

> 
> |   I want to know if the problem is still as bad with SATA as the 
> | original ATA drives?  Will the onboard RAID speed up access?
> 
> there was no problem with "original" disks.  and raid works fine, up until
> you saturate your PCI bus...
> 
> 
> | I know that throughput on large files is close and is usually related 
> | to platter speed.  I am also pleased that the buffers is now 8mb on 
> | all the drives I am looking at.
> 
> one of the reasons that TCQ is not a huge win is that the kernel's cache
> is ~500x bigger than the disk's.  however, it's true that bigger ondisk cache
> lets the drive better optimize delayed writes within a cylinder.  for non-TCQ
> ATA to be competitive when writing, it's common to enable write-behind
> caching.  this can cause data loss or corruption if you crash at exactly the 
> right time (paranoids take note).
> 
I forgot about the "write-behind" problem.  I have been burned with 
this before.

> 
> | Main issue is writing and reading swap on those really large files and 
> | how it affects other work.
> 
> swap thrashing is a non-fatal error that should be fixed, 
> not band-aided by gold-plated hardware.
> 
I agree but I am not looking at swap thrashing in the sense of many 
small files.  I am looking at 1 or 2 large files that are bigger than 
memeory while working.  I know on my present workstation I will work 
with a file that is 2X the memory and I find that the machine stutters 
(locks for a few seconds) every time there is any disk ascess.  I 
would like to add more ram but that is impossible as there are only 
two slots and they are full.  Management won't provide the funds.

> finally, I should mention that Jeff Garzik is doing a series of good new SATA
> drivers (deliberately ignoring the accumulated kruft in the kernel's PATA
> code).  they plug into the kernel's SCSI interface, purely to take advantage 
> of support for queueing and hotplug, I think.
This is interesting.  I like the idea of hot-swap drives and this is 
one thing I was looking at with SCSI.  From this I take it that SATA 
can handle some queueing but it just isn't supported yet?
> 
> regards, mark hahn.
> 

-- 
Robin Laing

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgoornaden at scyld.com  Tue Dec  9 00:17:02 2003
From: rgoornaden at scyld.com (rgoornaden at scyld.com)
Date: Tue, 9 Dec 2003 00:17:02 -0500
Subject: Just Begin
Message-ID: <200312090517.hB95H2S09271@NewBlue.scyld.com>


Hello everybody...
I has just started to build a beowulf cluster and after making some
research about it, I decided to use RedHat 9.0 and using MPICH2-0.94 as
message passing software..
Well, I will be very glad if someone can guide me as a friend to construct
this cluster 
Thanks
Ryan


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Robin.Laing at drdc-rddc.gc.ca  Tue Dec  9 11:09:09 2003
From: Robin.Laing at drdc-rddc.gc.ca (Robin Laing)
Date: Tue, 09 Dec 2003 09:09:09 -0700
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <3FD5F3A5.6090802@drdc-rddc.gc.ca>

> hi ya robin/bill On Mon, 8 Dec 2003, Bill Broadley wrote:
> 
SNIP
> 
> definite yupppers one do a benchmark and compare only similar environments
> and apps ... otherwise one is comparing christmas shopping to studing
> to be a vet ( benchmarks not related to each other )
> 

I like the idea of shopping for a christmas vet. :)

> -----
> 
> for which disks ...
> 	- i'd stick with plain ole ide disks
> 	- its cheap
> 	- you can have a whole 2nd system to backup the primary array
> 	for about the same cost as an expensive dual-cpu or scsi-based
> 	system
> 
> for serial ata ...
> 	- dont use its onboard controller for raid ... 
> 	- it probably be as good as onboard raid on existing mb...
> 	( ie ... none of um works right 
> 		works == hands off booting of any disk 
> 		works == data resyncs by itself w/o intervention
> 
> 		but doing the same tests w/ sw raid or hw raid
> 		controller w/ scsi works fine
> 
This is an answer that is at least in the direction of what I am 
looking for.

> 
>>> So I'd look at bonnie++, postmark, or one of the other opensource benchmarks
>>> see if any of those can be configured to be similar to your workload.  If not
>>> write a benchmark that is similar to your workload and post it to the list asking
>>> people to run it on their hardware.  The more effort you put into it the
>>> more responses your likely to get.  Posting a table of performance results
>>> on a website seems to encourage more to participate.
> 
> 
> other benchmark tests you can run ....
> 
> 	http://www.Linux-1U.net/Benchmarks

Correct link,
http://www.Linux-1U.net/BenchMarks

The problem benchmarks software is you need the hardware to test it 
with.  What a nice circle to be involved in.

> 
> other tuning you can to to tweek the last instruction out of the system
> 
> 	http://www.Linux-1U.net/Tuning
>  

I have looked at http://www.Linux-1U.net before posting my questions 
about SATA.

> 
>>> There are no easy answers, it depends on many many variables, the type
>>> of OS, how long the partition has been live (i.e. fragmentation),
>>> the IDE/SCSI chipset, the drivers, the OS, even the cables can have
>>> performance effects.
> 
> 
> (look for the) picture of partitions/layout ... makes  big difference
> 
> 	http://www.Linux-1U.net/Partition/

I would prefer not to use SWAP at all.  Of course 1Gig of ram is now 
minimum I would put into a desktop.

> 
>>> The market seems to be going towards SATA, seems like many if not all major
>>> storage vendors have an entry level SATA product, I've no idea if this
>>> is just the latest fad or justified from a pure price/performance perspective.
> 
> 
> if the disk manufacturers stop making scsi/ide disks .. we wont have
> any choice... unless we go to the super fast "compact flash"
> and its next generation 100GB "compact flash" in the r/d labs 
> which is why ibm sold its klunky mechanical disk drives in favor
> of its new "solid state disks"  ( forgot its official name )
> 

Solid state memory has been talked about for years.  I remember the 
discussion about bubble memory.

> c ya
> alvin
> 
> 

-- 
Robin Laing

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From msnitzer at lnxi.com  Tue Dec  9 12:21:49 2003
From: msnitzer at lnxi.com (Mike Snitzer)
Date: Tue, 9 Dec 2003 10:21:49 -0700
Subject: [Beowulf] Re: Terrasoft Black Lab Linux
In-Reply-To: <43DEB9EC-2A4D-11D8-9EAE-00039394839E@mac.com>; from dskr@mac.com on Tue, Dec 09, 2003 at 08:40:34AM -0500
References: <43DEB9EC-2A4D-11D8-9EAE-00039394839E@mac.com>
Message-ID: <20031209102149.A21557@lnxi.com>

On Tue, Dec 09 2003 at 06:40,
dskr at mac.com <dskr at mac.com> wrote:

> 
> Greetings:
> 
> Does anyone on the list have any experience with TerraSoft's Black Lab 
> linux?
> 
> As many of you may recall, I am a big fan of 'software that sucks less' 
> -- to quote a
> wonderful Scyld T-shirt I once saw. Imagine my surprise, then, when I 
> found that
> TerraSoft (promulgators of YellowDog and BlackLab Linux for PPC) is 
> shipping a
> new version (2.2) of BlackLab that is based on BProc.
> 
> Is this good news? I think it could be for TerraSoft ; this move is a 
> big upgrade from
> their earlier offering which reminded me of the Dark Times in 
> clustering.
> (Does anyone else still remember when we had to set up .rhosts files 
> and grab
> our copy of PVM out of someone else's home directory and copy it into 
> our own?)
> 
> I'd like to see what BlackLab's story is. but I have been unable to 
> find any of the
> sources for this product available for download. In particular, I would 
> like to know:
> 
> 	* Does it use beonss?
> 
> 	* Does it use beoboot?
> 
> 	* Does it netboot remote Macintoshes?
> 
> 	* What version of BProc does it use?
> 
> 	* How did they do MPI? Did they crib Don's version
> 	of MPICH for BProc?

I'd imagine you've seen this link:
http://www.terrasoftsolutions.com/products/blacklab/

On that site it details the fact that BlackLab v2.2 uses Yellow Dog 3.0 as
its base and that its using Bproc 3.x and Supermon; so they likely just
used Eric Hendrik's (LANL's) Clustermatic 3.0. 

Also here is a listing of included software from their site; not too many
_real_ details: 
http://www.terrasoftsolutions.com/products/blacklab/included.shtml

Scouring ftp.{yellowdoglinux,terrasoftsolutions.com}.com didn't yield
anything.  I'd imagine terrasoft would answer emailed questions.

Mike
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Tue Dec  9 13:22:43 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Tue, 9 Dec 2003 10:22:43 -0800
Subject: [Beowulf] RE: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF97@orsmsx402.jf.intel.com>

From: Robin Laing; Sent: Tuesday, December 09, 2003 8:23 AM
> Mark Hahn wrote:
> > nah.  it's basically just a design mistake to put two active PATA
disks
> > on the same channel.  it's fine if one is usually idle (say, cdrom
or
> > perhaps a disk containing old archives).  most people just avoid
putting
> > two disks on a channel at all, since channels are almost free, and
you
> > get to ignore jumpers.
> >
> So it would be a good idea to put data and /tmp on a different channel
> than swap?

This is true of *every* system, regardless of disk technology.

However, it's even better, if possible, to put enough memory in the box
to avoid swap.

> I agree but I am not looking at swap thrashing in the sense of many
> small files.  I am looking at 1 or 2 large files that are bigger than
> memeory while working.  I know on my present workstation I will work
> with a file that is 2X the memory and I find that the machine stutters
> (locks for a few seconds) every time there is any disk ascess.  I
> would like to add more ram but that is impossible as there are only
> two slots and they are full.  Management won't provide the funds.

What kernel are you using? There were a couple/few 2.4 kernels that
would behave badly with this.  Changing the kernel and/or tuning in
/proc can help, I ran in to this and used both fixes.  I don't have the
specifics with me, but they're googleable...

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From msnitzer at lnxi.com  Tue Dec  9 12:27:14 2003
From: msnitzer at lnxi.com (Mike Snitzer)
Date: Tue, 9 Dec 2003 10:27:14 -0700
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <Pine.LNX.4.44.0312091146320.19007-100000@coffee.psychology.mcmaster.ca>; from hahn@physics.mcmaster.ca on Tue, Dec 09, 2003 at 11:58:32AM -0500
References: <3FD5F6DD.6000505@drdc-rddc.gc.ca> <Pine.LNX.4.44.0312091146320.19007-100000@coffee.psychology.mcmaster.ca>
Message-ID: <20031209102714.B21557@lnxi.com>

On Tue, Dec 09 2003 at 09:58,
Mark Hahn <hahn at physics.mcmaster.ca> wrote:

> > one thing I was looking at with SCSI.  From this I take it that SATA 
> > can handle some queueing but it just isn't supported yet?
> 
> grep LKML for jgarzik and libata.  my real point is that queueing is not 
> all that important, since the kernel has always done seek scheduling.

FYI, here is Jeff Garzik's latest Status report for Linux SATA support:
http://lwn.net/Articles/61288/

Mike
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From linux-man at verizon.net  Tue Dec  9 12:45:57 2003
From: linux-man at verizon.net (mark kandianis)
Date: Tue, 09 Dec 2003 12:45:57 -0500
Subject: [Beowulf] beowulf and X
Message-ID: <oprzxcyvu4w6rjvl@outgoing.verizon.net>

hello

i have a background in linux but not particularly beowulf.  i've lately 
been recruited
to develop a graphics system for beowulf with xfree86 and twm.  is anyone 
else doing this
out there?  also, how does beowulf get its graphics currently?  i could 
not figure that out
 from the links on the site.

mark

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From becker at scyld.com  Tue Dec  9 13:24:59 2003
From: becker at scyld.com (Donald Becker)
Date: Tue, 9 Dec 2003 13:24:59 -0500 (EST)
Subject: [Beowulf] BW-BUG meeting, Today Dec. 9, 2003, in Greenbelt MD;  -- Red Hat
Message-ID: <Pine.LNX.4.44.0312091319540.1723-100000@training.scyld.com>


[[ Please note that this month's meeting is East: Greenbelt, not McLean VA. ]]

     Baltimore Washington  Beowulf Users Group 
            December 2003 Meeting 
               www.bwbug.org
    December 9th at 3:00PM in Greenbelt MD
 
____

        RedHat Roadmap for HPC Beowulf Clusters.

        RedHat is pleased to have the opportunity to present to Baltimore-
Washington Beowulf User Group on Tuesday Dec 9th. Robert Hibbard, Red Hat's
Federal Partner Alliance Manager, will provide information on Red Hat's
Enterprise Linux product strategy, with particular emphasis on it's
relevance to High Performance Computing Clusters. 

        Discussion will include information on the background, current
product optimizations, as well as possible futures for Red Hat efforts
focused on HPCC. 
____

Our meeting facilities are once again provided by Northrup Grumman
	7501 Greenway Center Drive
	Suite 1000 (10th floor)
	Greenbelt, MD 20770, phone
	703-628-7451


-- 
Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
914 Bay Ridge Road, Suite 220		Scyld Beowulf cluster systems
Annapolis MD 21403			410-990-9993

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Tue Dec  9 12:49:26 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Tue, 9 Dec 2003 12:49:26 -0500 (EST)
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD5F6DD.6000505@drdc-rddc.gc.ca>
Message-ID: <Pine.LNX.4.44.0312091244160.18002-100000@lilith.rgb.private.net>

On Tue, 9 Dec 2003, Robin Laing wrote:

> I agree but I am not looking at swap thrashing in the sense of many 
> small files.  I am looking at 1 or 2 large files that are bigger than 
> memeory while working.  I know on my present workstation I will work 
> with a file that is 2X the memory and I find that the machine stutters 
> (locks for a few seconds) every time there is any disk ascess.  I 
> would like to add more ram but that is impossible as there are only 
> two slots and they are full.  Management won't provide the funds.

I have to ask.  Is it a P4?  Strictly empirically I have experienced
similar things even without filling memory.  I actually moved my
fileserver off onto a Celeron (which it has run flawlessly) because it
was so visible, so annoying.

I have no idea why a P4 would behave that way, but to my direct
experience at least some P4-based servers can be really BAD on file
latency for reasons that have nothing to do with the disk hardware or
kernel per se.  Maybe some sort of chipset problem, maybe related to the
particular onboard IDE/ATA controllers -- I never bothered to try to
debug it other than to move the server onto something else where it
worked.  AMD or Celeron or PIII are all just fine.

If you're stuck on the hardware side with no money to get better
hardware, well, you're stuck.  My P4 system had plenty of memory and a
1.8 MHz clock and still was a pig compared to a 400 MHz Celery serving
the SAME DISK physically moved from one to the other.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Tue Dec  9 12:42:46 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Tue, 9 Dec 2003 12:42:46 -0500 (EST)
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD5F432.6040600@drdc-rddc.gc.ca>
Message-ID: <Pine.LNX.4.44.0312091230320.18002-100000@lilith.rgb.private.net>

On Tue, 9 Dec 2003, Robin Laing wrote:

> Andrew Latham wrote:
> > While I understand your pain I have no facts for you other than that SATA is
> > much faster than IDE. It can come close to SCSI(160). I have used SATA a little
> > but am happy with it. the selling point for me is cost of controler and disk
> > (controlers of SATA are much less), and the smaller cable format. The cable is
> > so small and easy to use that it is the major draw for me.
> > 
> > good luck on your quest!
> > 
> 
> I knew this but for straight throughput but it is random access that 
> is the real question.

Random access is complicated for any drive system.  It tends to be
latency dominated -- the drive has to do lots of seeks.  Seek time, in
turn, is dominated by platter speed and platter density, with worst case
latencies related to the time required to position the head and turn the
disk so that the track start is underneath.  With drive speeds of
5000-10000 rpm, this time is pretty much fixed and not all that
different from cheap disks to the most expensive, with read and write
being a bit different (so it even matters if you do random access reads
from e.g. a big filesystem with lots of little files or random writes
ditto).  Note also that there are LOTS of components to file latency,
and disk speed is only one of them.  To open a file, the kernel must
first stat it to see if you are PERMITTED to open it.

Note also that the kernel is DESIGNED to hide slow filesystem speeds
from the user.  The kernel caches and buffers and never throws anything
away it might need later unless/until it has to.  A common benchmarking
mistake is to open a file (to see how long it takes) and then open it
again right away in a loop.  Surprise!  It takes a ``long time'' the
first time but the second time is nearly instantaneous, because the
second time the request is served out of the kernel's cache.  A system
with a lot of memory will use all but a tiny fraction of that memory
caching things, if it can.

I don't expect things like latency to be VASTLY affected by SATA vs PATA
vs SCSI, see Mark's remarks on disk speed and platter density -- that is
more strongly related to the disk hardware, not the interface.  Even
things like on-disk cache are trivial in size compared to the kernel's
caches, although I'm sure they help somewhat under some circumstances.  

     rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Tue Dec  9 13:50:00 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Tue, 9 Dec 2003 13:50:00 -0500 (EST)
Subject: [Beowulf] beowulf and X
In-Reply-To: <oprzxcyvu4w6rjvl@outgoing.verizon.net>
Message-ID: <Pine.LNX.4.44.0312091341170.9213-100000@ganesh.phy.duke.edu>

On Tue, 9 Dec 2003, mark kandianis wrote:

> hello
> 
> i have a background in linux but not particularly beowulf.  i've lately 
> been recruited
> to develop a graphics system for beowulf with xfree86 and twm.  is anyone 
> else doing this
> out there?  also, how does beowulf get its graphics currently?  i could 
> not figure that out
>  from the links on the site.

What exactly do you mean?  Or rather, I think that defining your
engineering goal is the first step for you to accomplish.  "Beowulf"
doesn't get its graphics any particular way, but systems with graphical
heads can be nodes on a beowulfish or other cluster computer design, and
a piece of parallel software could certainly be written to do the
computation on a collection of nodes and graphically represent the
computation on a graphical head in real time or as movies afterward.
Several demo/benchmarky type applications exist that sort-of demonstrate
this -- pvmpov (a raytracing application) and various mandelbrot set
demos e.g. xep in PVM.

So to even get started you've got to figure out what the problem really
is.  Do you mean:

     Display                     "beowulf"

  Graphics head =====network====head node 00
                                |node 01
                                |node 02
                                |node 03
                                |...

(do a computation on the beowulf that e.g. makes an image or creates a
data representation of some sort, send it via the network to the
display, then graphically display it) or:

  Graphical head node 00
   |node 01
   |node 02
   |node 03
   |...

(do the computation where the graphical display is FULLY INTEGRATED with
the nodes, so each node result is independently updated and displayed
with or without a synchronization step/barrier) or:

  ...something else?

In each case, the suitable design will almost certainly be fairly
uniquely suggested by the task, if it is feasible at all to accomplish
the task.  It may not be.

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From laytonjb at comcast.net  Tue Dec  9 17:56:04 2003
From: laytonjb at comcast.net (Jeffrey B. Layton)
Date: Tue, 09 Dec 2003 17:56:04 -0500
Subject: [Beowulf] Re: EMC, anyone?
In-Reply-To: <0EB5C81FE6FE5A4F8D1FEBF59C6C7BAA1A1824@durham.sra.com>
References: <0EB5C81FE6FE5A4F8D1FEBF59C6C7BAA1A1824@durham.sra.com>
Message-ID: <3FD65304.7030606@comcast.net>

David,

   We tried using EMC for storage for one of our
cluster at work. We have a node in the cluster (we
called it the IO node) that was SAN attached to an
EMC SAN. Then that space was NFS exported
throughout the cluster (288 nodes in total).
   Initially we exported the NFS storage over
Myrinet. After some problems we tried it over
FastE.
   The end result was that we never got it to work
correctly. We had filesystems that would just
disappear from the IO node and then reappear. We
had lots of file corruptions and files lost. My favorite
was the 2 TB filesystem that had to be fsck (man
that took a long time). We had EMC folks in, Dell
people in (they supplied the EMC certified IO node)
and the cluster vendor in. No one could ever figure
out the problems although the cluster vendor was
able to help the situations some (Dell and EMC
really did nothing to help). Finally, we ended up
taking the IO node out of the cluster and only
NFS mounting it on the master node. We also
forced people to run using the local hard drives and
not over NFS. This helped things, but we still had
problems from time to time.
   The ultimate solution was to convert the IO node
to a NAS box with attached storage.
   Good Luck with your project!

Jeff

>Our group has an opportunity that few would pass up - more or less free
>storage.  Our parent organization is preparing to purchase a large amount of
>EMC storage, the configuration of which is not yet nailed down.  We are
>investigating the potential to be the recipients of part of that storage,
>and (crossing fingers) no one has mentioned the dreaded chargeback word yet.
>Obviously, we would be thrilled to gain access to TBs of free storage, so we
>can spend more of our budget on people and compute platforms.
>
>Naturally, the EMC reps are plying us with lots of jargon, PR, white papers,
>and so on explaining why their technology is the perfect fit for us.
>However, I am bothered by the fact that EMC does not have a booth at SC each
>year, and I do not see them mentioned in the HPC trade rags.  Makes me think
>that they don't really have the technology and support tailored for the HPC
>community.
>
>We, of course, are doing due diligence on the business case side, matching
>our needs with their numbers.  My question to this group is "Do any of you
>use EMC for your HPC storage?"  If so, how?  Been happy with it?
>
>We do primarily models with heavy latency dependency (meteorological, with
>CMAQ and MM5).  This will not be the near-line storage, but rather NAS
>attached to HiPPI or gigE.
>
>Thanks in advance,
>
>Dave Walters
>Project Manager, SRA
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
>  
>


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Tue Dec  9 18:31:55 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Tue, 9 Dec 2003 15:31:55 -0800
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <187D3A7CAB42A54DB61F1D05F012572201D4BF92@orsmsx402.jf.intel.com>
References: <187D3A7CAB42A54DB61F1D05F012572201D4BF92@orsmsx402.jf.intel.com>
Message-ID: <20031209233155.GC7713@cse.ucdavis.edu>

On Tue, Dec 09, 2003 at 07:03:12AM -0800, Lombard, David N wrote:
> Very big pro:  You can get much higher *sustained* bandwidth levels,
> regardless of CPU load.  ATA/PATA requires CPU involvement, and
> bandwidth tanks under moderate CPU load.

I've heard this before, I've yet to see it.  To what do you attribute
this advantage?  DMA scatter gather?  Higher bitrate at the read head?

Do you have a way to quantify this *sustained* bandwidth?  Care to share?

> The highest SCSI bandwidth rates I've seen first hand are 290 MB/S for
> IA32 and 380 MB/S for IPF. Both had two controllers on independent PCI-X
> busses, 6 disks for IA32 and 12 for IPF in a s/w RAID-0 config.

Was this RAID-5?  In Hardware?  In Software?  Which controllers?

Do you have any reason to believe you wouldn't see similar with the
same number of SATA drives on 2 independent PCI-X busses?

I've seen 250 MB/sec from a relatively vanilla single controller setup.

Check out: (no I don't really trust tom's that much):
http://www6.tomshardware.com/storage/20031114/raidcore-24.html#data_transfer_diagrams_raid_5

The RaidCore manages 250 MB/sec decaying to 180MB/sec on the slower inner
tracks of a drive.  Certainly seems like 2 of these on seperate busses
would have a good change of hitting the above numbers.

Note the very similar SCSI 8 drive setups are slower.

> Does SATA reduce the CPU requirement from ATA/PATA, or is it the same?
> Unless it's substantially lower, you still have a system best suited for
> low to moderate I/O needs.

Do you have any way to quantify this?  Care to share?  I've seen many similar
comments but when I actually go measure I get very similar numbers, often
single disks managing 40-60 MB/sec and 10% cpu, and maximum disk transfer
rates around 300-400 MB/sec at fairly high rates of cpu usage.

> BTW, http://www.iozone.org/ is a nice standard I/O benchmark.  But, as
> mentioned earlier in this thread, app-specific benchmarking is *always*
> best.

Agreed.  Iozone or bonnie++ seem to do fine on large sequential file
benchmarking I prefer postmark for replicating database like access patterns.


-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From linux-man at verizon.net  Tue Dec  9 19:43:20 2003
From: linux-man at verizon.net (mark kandianis)
Date: Tue, 09 Dec 2003 19:43:20 -0500
Subject: [Beowulf] beowulf and X
In-Reply-To: <Pine.LNX.4.44.0312091341170.9213-100000@ganesh.phy.duke.edu>
References: <Pine.LNX.4.44.0312091341170.9213-100000@ganesh.phy.duke.edu>
Message-ID: <oprzxwaijew6rjvl@outgoing.verizon.net>

On Tue, 9 Dec 2003 13:50:00 -0500 (EST), Robert G. Brown 
<rgb at phy.duke.edu> wrote:

> On Tue, 9 Dec 2003, mark kandianis wrote:
>
>> hello
>>
>> i have a background in linux but not particularly beowulf.  i've lately
>> been recruited
>> to develop a graphics system for beowulf with xfree86 and twm.  is 
>> anyone
>> else doing this
>> out there?  also, how does beowulf get its graphics currently?  i could
>> not figure that out
>>  from the links on the site.
>
> What exactly do you mean?  Or rather, I think that defining your
> engineering goal is the first step for you to accomplish.  "Beowulf"
> doesn't get its graphics any particular way, but systems with graphical
> heads can be nodes on a beowulfish or other cluster computer design, and
> a piece of parallel software could certainly be written to do the
> computation on a collection of nodes and graphically represent the
> computation on a graphical head in real time or as movies afterward.
> Several demo/benchmarky type applications exist that sort-of demonstrate
> this -- pvmpov (a raytracing application) and various mandelbrot set
> demos e.g. xep in PVM.
>
> So to even get started you've got to figure out what the problem really
> is.  Do you mean:
>
>      Display                     "beowulf"
>
>   Graphics head =====network====head node 00
>                                 |node 01
>                                 |node 02
>                                 |node 03
>                                 |...
>
> (do a computation on the beowulf that e.g. makes an image or creates a
> data representation of some sort, send it via the network to the
> display, then graphically display it) or:
>
>   Graphical head node 00
>    |node 01
>    |node 02
>    |node 03
>    |...
>
> (do the computation where the graphical display is FULLY INTEGRATED with
> the nodes, so each node result is independently updated and displayed
> with or without a synchronization step/barrier) or:
>
>   ...something else?
>
> In each case, the suitable design will almost certainly be fairly
> uniquely suggested by the task, if it is feasible at all to accomplish
> the task.  It may not be.
>
>    rgb
>
> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
>
>

quite honestly if mosix can do it, it seems that xfree86 is already there,
so it looks like my question is moot. so i think i can get this up quicker 
than
i thought.

are there any particular kernels that are geared to beowulf?  or is this 
something
that one has to roll their own?

regards

mark


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Tue Dec  9 19:37:32 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Tue, 9 Dec 2003 16:37:32 -0800
Subject: [Beowulf] RE: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF9D@orsmsx402.jf.intel.com>

From: Bill Broadley [mailto:bill at cse.ucdavis.edu]
> 
> On Tue, Dec 09, 2003 at 07:03:12AM -0800, Lombard, David N wrote:
> > Very big pro:  You can get much higher *sustained* bandwidth levels,
> > regardless of CPU load.  ATA/PATA requires CPU involvement, and
> > bandwidth tanks under moderate CPU load.
> 
> I've heard this before, I've yet to see it.  To what do you attribute
> this advantage?  DMA scatter gather?  Higher bitrate at the read head?

Non involvement of the CPU with direct disk activities (i.e., the bits
handled by the SCSI controller) plus *way* faster CPU to handle the
high-level RAID processing v. the pokey processors found on most RAID
cards.  With multiple controllers on separate busses, I don't funnel all
my I/O through one bus.  Note again, I only discuss maximal disk
bandwidth, which means RAID-0.

> Do you have a way to quantify this *sustained* bandwidth?  Care to
share?

Direct measurement with both standard testers and applications.
Sustained means a dataset substantially larger than memory to avoid
cache effects.

> > The highest SCSI bandwidth rates I've seen first hand are 290 MB/S
for
> > IA32 and 380 MB/S for IPF. Both had two controllers on independent
PCI-X
> > busses, 6 disks for IA32 and 12 for IPF in a s/w RAID-0 config.
                                                 ==========
> Was this RAID-5?  In Hardware?  In Software?  Which controllers?
See underlining immediately above.

> Do you have any reason to believe you wouldn't see similar with the
> same number of SATA drives on 2 independent PCI-X busses?

I have no info on SATA, thus the question later on.
 
> I've seen 250 MB/sec from a relatively vanilla single controller
setup.

What file size v. memory and what CPU load *not* associated with
actually driving the I/O?

> Check out: (no I don't really trust tom's that much):
> http://www6.tomshardware.com/storage/20031114/raidcore-
> 24.html#data_transfer_diagrams_raid_5
> 
> The RaidCore manages 250 MB/sec decaying to 180MB/sec on the slower
inner
> tracks of a drive.  Certainly seems like 2 of these on seperate busses
> would have a good change of hitting the above numbers.
> 
> Note the very similar SCSI 8 drive setups are slower.

I'll look at this.

> > Does SATA reduce the CPU requirement from ATA/PATA, or is it the
same?
> > Unless it's substantially lower, you still have a system best suited
for
> > low to moderate I/O needs.
> 
> Do you have any way to quantify this?  Care to share?  I've seen many
> similar
> comments but when I actually go measure I get very similar numbers,
often
> single disks managing 40-60 MB/sec and 10% cpu, and maximum disk
transfer
> rates around 300-400 MB/sec at fairly high rates of cpu usage.

Direct measurement with both standard testers and applications.
Sustained means a dataset substantially larger than memory to avoid
cache effects.

You repeated my comment, "fairly high rates of cpu usage" -- high cpu
usage _just_to_drive_the_I/O_ meaning it's unavailable for the
application.  Also, are you quoting a burst number, that can benefit
from caching, or a sustained number, where the cache was exhausted long
ago?

The high cpu load hurts scientific/engineering apps that want to access
lots of data on disk, and burst rates are meaningless. In addition, I've
repeatedly heard that same thing from sysadmins setting up NFS servers
-- the ATA/PATA disks have too great a *negative* impact on NFS server
performance -- here the burst rates should have been more significant,
but the CPU load got in the way.

> > BTW, http://www.iozone.org/ is a nice standard I/O benchmark.  But,
as
> > mentioned earlier in this thread, app-specific benchmarking is
*always*
> > best.
> 
> Agreed.  Iozone or bonnie++ seem to do fine on large sequential file
> benchmarking I prefer postmark for replicating database like access
> patterns.

Good to know. Thanks.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Tue Dec  9 21:19:18 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Tue, 9 Dec 2003 18:19:18 -0800
Subject: [Beowulf] RE: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <187D3A7CAB42A54DB61F1D05F012572201D4BF9D@orsmsx402.jf.intel.com>
References: <187D3A7CAB42A54DB61F1D05F012572201D4BF9D@orsmsx402.jf.intel.com>
Message-ID: <20031210021918.GL7713@cse.ucdavis.edu>

On Tue, Dec 09, 2003 at 04:37:32PM -0800, Lombard, David N wrote:
> From: Bill Broadley [mailto:bill at cse.ucdavis.edu]
> > 
> > On Tue, Dec 09, 2003 at 07:03:12AM -0800, Lombard, David N wrote:
> > > Very big pro:  You can get much higher *sustained* bandwidth levels,
> > > regardless of CPU load.  ATA/PATA requires CPU involvement, and
> > > bandwidth tanks under moderate CPU load.
> > 
> > I've heard this before, I've yet to see it.  To what do you attribute
> > this advantage?  DMA scatter gather?  Higher bitrate at the read head?
> 
> Non involvement of the CPU with direct disk activities (i.e., the bits
> handled by the SCSI controller)

Er, the way I understand it is with PATA, SCSI, or SATA the driver
basically says Read or write these block(s) at this ADDR and raise
an interupt when done.  Any corrections?

> plus *way* faster CPU to handle the
> high-level RAID processing

I'm a big fan of software RAID, although it's not a SATA vs SCSI issue.

> v. the pokey processors found on most RAID
> cards. 

Agreed.

> With multiple controllers on separate busses, I don't funnel all
> my I/O through one bus.  Note again, I only discuss maximal disk
> bandwidth, which means RAID-0.

Right, sorry I missed the mention.

> Direct measurement with both standard testers and applications.
> Sustained means a dataset substantially larger than memory to avoid
> cache effects.

Seems that it's fairly common to manage 300 MB/sec +/- 50 MB/sec from
1-2 PCI cards.  I've done similar with 3 U160 channels on an older
dual P4.  The URL I posted shows the same for SATA.

> > > The highest SCSI bandwidth rates I've seen first hand are 290 MB/S
> for
> > > IA32 and 380 MB/S for IPF. Both had two controllers on independent
> PCI-X
> > > busses, 6 disks for IA32 and 12 for IPF in a s/w RAID-0 config.
>                                                  ==========
> > Was this RAID-5?  In Hardware?  In Software?  Which controllers?

> See underlining immediately above.

Sorry.

> > Do you have any reason to believe you wouldn't see similar with the
> > same number of SATA drives on 2 independent PCI-X busses?
> 
> I have no info on SATA, thus the question later on.

Ah, well the URL shows a single card managing 250 MB/sec which decays
to 180 MB/sec on the slower tracks.  Filesystems, PCI busses, and memory
systems seem to start being an effect here.  I've not seen much more
the 330 MB/sec (my case) up to 400 MB/sec (various random sources).  Even
my switch from ext3 to XFS helped substantially.  With ext3 I was getting
265-280 MB/sec, with XFS my highest sustained sequential bandwidth was
around 330 MB/sec.

Presumably the mentioned raidcore card could perform even better with
raid-0 then raid-5.

> > I've seen 250 MB/sec from a relatively vanilla single controller
> setup.
> 
> What file size v. memory.

18 GBs of file I/O with 6 GB ram on a dual p4 1.8 GHz

> and what CPU load *not* associated with
> actually driving the I/O?

None, just a benchmark, but it showed 50-80% cpu usage for a single CPU,
this was SCSI though.  I've yet to see any I/O system PC based system 
shove this much data around without significant CPU usage.

> Direct measurement with both standard testers and applications.
> Sustained means a dataset substantially larger than memory to avoid
> cache effects.

Of course, I use a factor of 4 minimum to minimize cache effects.

> You repeated my comment, "fairly high rates of cpu usage" -- high cpu
> usage _just_to_drive_the_I/O_ meaning it's unavailable for the
> application.  Also, are you quoting a burst number, that can benefit
> from caching, or a sustained number, where the cache was exhausted long
> ago?

Well the cost of adding an additional cpu to a fileserver is usually
fairly minimal compared to the cost to own of a few TB of disk.  My
system was configured to look like a quad p4-1.8 (because of hyperthreading)
and one cpu would be around 60-80% depending on FS and which stage
of the benchmark was running.  I was careful to avoid cache effects.

I do have a quad CPU opteron I could use as a test bed as well.

> The high cpu load hurts scientific/engineering apps that want to access
> lots of data on disk, and burst rates are meaningless.

Agreed.

> In addition, I've
> repeatedly heard that same thing from sysadmins setting up NFS servers
> -- the ATA/PATA disks have too great a *negative* impact on NFS server
> performance -- here the burst rates should have been more significant,
> but the CPU load got in the way.

An interesting comment, one that I've not noticed personally, can
anyone offer a benchmark or application?  Was this mostly sequential?
Mostly random?  I'd be happy to run some benchmarks over NFS.

I'd love to quantify an honest to god advantage in one direction or
another, preferably collected from some kind of reproducable workload
so that the numerous variables can be pruned down to the ones
with the largest effect on performance or CPU load.

-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lathama at yahoo.com  Tue Dec  9 22:16:44 2003
From: lathama at yahoo.com (Andrew Latham)
Date: Tue, 9 Dec 2003 19:16:44 -0800 (PST)
Subject: [Beowulf] RE: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <20031210021918.GL7713@cse.ucdavis.edu>
Message-ID: <20031210031644.84113.qmail@web60302.mail.yahoo.com>

Amature thought but give it a read.

Would the advances in compressed filesystems like cramfs allow you to access
the 18gig of info on 6gig of ram. I do not know what the file type is and I am
assuming that it is not flat text (xml or other). If however you where working
on a dataset in xml at about 18gig would a compressed filesystem on 6gig of ram
be fast?

Andrew Latham
Wanna Be Employed :-)


--- Bill Broadley <bill at cse.ucdavis.edu> wrote:
> On Tue, Dec 09, 2003 at 04:37:32PM -0800, Lombard, David N wrote:
> > From: Bill Broadley [mailto:bill at cse.ucdavis.edu]
> > > 
> > > On Tue, Dec 09, 2003 at 07:03:12AM -0800, Lombard, David N wrote:
> > > > Very big pro:  You can get much higher *sustained* bandwidth levels,
> > > > regardless of CPU load.  ATA/PATA requires CPU involvement, and
> > > > bandwidth tanks under moderate CPU load.
> > > 
> > > I've heard this before, I've yet to see it.  To what do you attribute
> > > this advantage?  DMA scatter gather?  Higher bitrate at the read head?
> > 
> > Non involvement of the CPU with direct disk activities (i.e., the bits
> > handled by the SCSI controller)
> 
> Er, the way I understand it is with PATA, SCSI, or SATA the driver
> basically says Read or write these block(s) at this ADDR and raise
> an interupt when done.  Any corrections?
> 
> > plus *way* faster CPU to handle the
> > high-level RAID processing
> 
> I'm a big fan of software RAID, although it's not a SATA vs SCSI issue.
> 
> > v. the pokey processors found on most RAID
> > cards. 
> 
> Agreed.
> 
> > With multiple controllers on separate busses, I don't funnel all
> > my I/O through one bus.  Note again, I only discuss maximal disk
> > bandwidth, which means RAID-0.
> 
> Right, sorry I missed the mention.
> 
> > Direct measurement with both standard testers and applications.
> > Sustained means a dataset substantially larger than memory to avoid
> > cache effects.
> 
> Seems that it's fairly common to manage 300 MB/sec +/- 50 MB/sec from
> 1-2 PCI cards.  I've done similar with 3 U160 channels on an older
> dual P4.  The URL I posted shows the same for SATA.
> 
> > > > The highest SCSI bandwidth rates I've seen first hand are 290 MB/S
> > for
> > > > IA32 and 380 MB/S for IPF. Both had two controllers on independent
> > PCI-X
> > > > busses, 6 disks for IA32 and 12 for IPF in a s/w RAID-0 config.
> >                                                  ==========
> > > Was this RAID-5?  In Hardware?  In Software?  Which controllers?
> 
> > See underlining immediately above.
> 
> Sorry.
> 
> > > Do you have any reason to believe you wouldn't see similar with the
> > > same number of SATA drives on 2 independent PCI-X busses?
> > 
> > I have no info on SATA, thus the question later on.
> 
> Ah, well the URL shows a single card managing 250 MB/sec which decays
> to 180 MB/sec on the slower tracks.  Filesystems, PCI busses, and memory
> systems seem to start being an effect here.  I've not seen much more
> the 330 MB/sec (my case) up to 400 MB/sec (various random sources).  Even
> my switch from ext3 to XFS helped substantially.  With ext3 I was getting
> 265-280 MB/sec, with XFS my highest sustained sequential bandwidth was
> around 330 MB/sec.
> 
> Presumably the mentioned raidcore card could perform even better with
> raid-0 then raid-5.
> 
> > > I've seen 250 MB/sec from a relatively vanilla single controller
> > setup.
> > 
> > What file size v. memory.
> 
> 18 GBs of file I/O with 6 GB ram on a dual p4 1.8 GHz
> 
> > and what CPU load *not* associated with
> > actually driving the I/O?
> 
> None, just a benchmark, but it showed 50-80% cpu usage for a single CPU,
> this was SCSI though.  I've yet to see any I/O system PC based system 
> shove this much data around without significant CPU usage.
> 
> > Direct measurement with both standard testers and applications.
> > Sustained means a dataset substantially larger than memory to avoid
> > cache effects.
> 
> Of course, I use a factor of 4 minimum to minimize cache effects.
> 
> > You repeated my comment, "fairly high rates of cpu usage" -- high cpu
> > usage _just_to_drive_the_I/O_ meaning it's unavailable for the
> > application.  Also, are you quoting a burst number, that can benefit
> > from caching, or a sustained number, where the cache was exhausted long
> > ago?
> 
> Well the cost of adding an additional cpu to a fileserver is usually
> fairly minimal compared to the cost to own of a few TB of disk.  My
> system was configured to look like a quad p4-1.8 (because of hyperthreading)
> and one cpu would be around 60-80% depending on FS and which stage
> of the benchmark was running.  I was careful to avoid cache effects.
> 
> I do have a quad CPU opteron I could use as a test bed as well.
> 
> > The high cpu load hurts scientific/engineering apps that want to access
> > lots of data on disk, and burst rates are meaningless.
> 
> Agreed.
> 
> > In addition, I've
> > repeatedly heard that same thing from sysadmins setting up NFS servers
> > -- the ATA/PATA disks have too great a *negative* impact on NFS server
> > performance -- here the burst rates should have been more significant,
> > but the CPU load got in the way.
> 
> An interesting comment, one that I've not noticed personally, can
> anyone offer a benchmark or application?  Was this mostly sequential?
> Mostly random?  I'd be happy to run some benchmarks over NFS.
> 
> I'd love to quantify an honest to god advantage in one direction or
> another, preferably collected from some kind of reproducable workload
> so that the numerous variables can be pruned down to the ones
> with the largest effect on performance or CPU load.
> 
> -- 
> Bill Broadley
> Information Architect
> Computational Science and Engineering
> UC Davis
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Wed Dec 10 07:25:50 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 10 Dec 2003 07:25:50 -0500 (EST)
Subject: [Beowulf] beowulf and X
In-Reply-To: <oprzxwaijew6rjvl@outgoing.verizon.net>
Message-ID: <Pine.LNX.4.44.0312100707360.18621-100000@lilith.rgb.private.net>

On Tue, 9 Dec 2003, mark kandianis wrote:

> quite honestly if mosix can do it, it seems that xfree86 is already there,
> so it looks like my question is moot. so i think i can get this up quicker 
> than
> i thought.
> 
> are there any particular kernels that are geared to beowulf?  or is this 
> something
> that one has to roll their own?

Hmmm, it looks like you really need a general introduction to the
subject.  Mosix may or may not be the most desireable way to proceed, as
it is quite "expensive" in terms of overhead and requires a custom
(patched) kernel.  It is also not exactly a GPL product, although it is
free and open source.  If you like, its "fork and forget" design
requires all I/O channels of any sort to be transparently encapsulated
and forwarded over TCP sockets to the master host where the jobs are
begun.  For something with little, rare I/O this is fine -- Mosix then
becomes a sort of distributed interface to a standard Linux scheduler
with a moderate degree of load balancing over the network.  For
something that opens lots of files or pipes and does a lot of writing to
them, it can clog up your network and kernel somewhat faster than an
actual parallel program where you can control e.g. data collection
patterns and avoid collisions and reduce the overhead of encapsulation.

If you're talking only a "small" cluster -- < 64 nodes, maybe < 32 nodes
(it depends on the I/O load of your application) -- you have a decent
chance of not getting into trouble with scaling, but you should
definitely experiment.  If you're wanting to run on hundreds of nodes,
I'd be concerned that you'll only be able to use ten, or thirty, or
forty seven, before your application scaling craps out -- all the other
nodes are then potentially "wasted".

There are quite a few resources for cluster beginners out there, many of
them linked to:

   http://www.phy.duke.edu/brahma

(so I won't bother detailing URL's to them all here).  Links and
resources on this site include papers and talks, an online book
(perennially unfinished, but still mostly complete and even
sorta-current:-) on cluster engineering, links to the FAQ, HOWTO, the
Beowulf Underground, turnkey vendor/cluster consultants, useful
hardware, networking stuff -- I've tried to make it a resource
clearinghouse although even so it is far from complete and gets out of
date if I blink.

Finally, I'd urge you to subscribe to the new Cluster Magazine (plug
plug, hint hint) which has articles that will undoubtedly help you out
with all sorts of things over the next twelve months.  I just got my
first issue, and its articles are being written by really smart people
on this list (and a few bozos -- sorry, OLD joke:-) and should be very,
very helpful to people trying to engineer their first cluster or their
fifteenth.  Besides, you get three free trial issues if you sign up now
and live in the US.

Best of luck, and to get even MORE help, describe your actual problem in
more detail.  Possibly after reading about parallel scaling and Amdahl's
Law.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From msnitzer at lnxi.com  Wed Dec 10 09:02:28 2003
From: msnitzer at lnxi.com (Mike Snitzer)
Date: Wed, 10 Dec 2003 07:02:28 -0700
Subject: [Beowulf] Re: BW-BUG meeting, Today Dec. 9, 2003, in Greenbelt MD;  -- Red Hat
In-Reply-To: <Pine.LNX.4.44.0312091319540.1723-100000@training.scyld.com>; from becker@scyld.com on Tue, Dec 09, 2003 at 01:24:59PM -0500
References: <Pine.LNX.4.44.0312091319540.1723-100000@training.scyld.com>
Message-ID: <20031210070228.A28351@lnxi.com>

On Tue, Dec 09 2003 at 11:24,
Donald Becker <becker at scyld.com> wrote:

> 
> [[ Please note that this month's meeting is East: Greenbelt, not McLean VA. ]]
> 
>      Baltimore Washington  Beowulf Users Group 
>             December 2003 Meeting 
>                www.bwbug.org
>     December 9th at 3:00PM in Greenbelt MD
>  
> ____
> 
>         RedHat Roadmap for HPC Beowulf Clusters.
> 
>         RedHat is pleased to have the opportunity to present to Baltimore-
> Washington Beowulf User Group on Tuesday Dec 9th. Robert Hibbard, Red Hat's
> Federal Partner Alliance Manager, will provide information on Red Hat's
> Enterprise Linux product strategy, with particular emphasis on it's
> relevance to High Performance Computing Clusters. 
> 
>         Discussion will include information on the background, current
> product optimizations, as well as possible futures for Red Hat efforts
> focused on HPCC. 

Can those who attended this meeting provide a summary?  Was bwbug able to
get Robert's presentation?

thanks,
Mike

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Robin.Laing at drdc-rddc.gc.ca  Wed Dec 10 11:32:56 2003
From: Robin.Laing at drdc-rddc.gc.ca (Robin Laing)
Date: Wed, 10 Dec 2003 09:32:56 -0700
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <Pine.LNX.4.44.0312091244160.18002-100000@lilith.rgb.private.net>
References: <Pine.LNX.4.44.0312091244160.18002-100000@lilith.rgb.private.net>
Message-ID: <3FD74AB8.2090802@drdc-rddc.gc.ca>

Robert G. Brown wrote:
> On Tue, 9 Dec 2003, Robin Laing wrote:
> 
> 
>>I agree but I am not looking at swap thrashing in the sense of many 
>>small files.  I am looking at 1 or 2 large files that are bigger than 
>>memeory while working.  I know on my present workstation I will work 
>>with a file that is 2X the memory and I find that the machine stutters 
>>(locks for a few seconds) every time there is any disk ascess.  I 
>>would like to add more ram but that is impossible as there are only 
>>two slots and they are full.  Management won't provide the funds.
> 
> 
> I have to ask.  Is it a P4?  Strictly empirically I have experienced
> similar things even without filling memory.  I actually moved my
> fileserver off onto a Celeron (which it has run flawlessly) because it
> was so visible, so annoying.

Dell P4 with 512M ram.  IDE drive.

> 
> I have no idea why a P4 would behave that way, but to my direct
> experience at least some P4-based servers can be really BAD on file
> latency for reasons that have nothing to do with the disk hardware or
> kernel per se.  Maybe some sort of chipset problem, maybe related to the
> particular onboard IDE/ATA controllers -- I never bothered to try to
> debug it other than to move the server onto something else where it
> worked.  AMD or Celeron or PIII are all just fine.
Even more reason for me to stick with AMD's.
> 
> If you're stuck on the hardware side with no money to get better
> hardware, well, you're stuck.  My P4 system had plenty of memory and a
> 1.8 MHz clock and still was a pig compared to a 400 MHz Celery serving
> the SAME DISK physically moved from one to the other.
> 
>    rgb
> 

I find that my P90 at home with UWSCSI is faster most of the time than 
my computer at work.

This thread has sure opened up some debate.  I didn't think it would 
raise the number of issues it has.

-- 
Robin Laing

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From landman at scalableinformatics.com  Wed Dec 10 13:12:47 2003
From: landman at scalableinformatics.com (landman)
Date: Wed, 10 Dec 2003 13:12:47 -0500
Subject: [Beowulf] MPICH error
Message-ID: <20031210180551.M13163@scalableinformatics.com>

Hi Folks:

  A customer is seeing 

        rm_1310:  p4_error: rm_start: net_conn_to_listener failed: 33220

on an MPI job.  Used to work (just last week).  Updated the kernel was the major
change (added XFS support)

  Any idea of what this is?  I assume a network change.  MPICH 1.2.4.  Do I need
to recompile MPICH to match the kernel?

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From victor_ms at bol.com.br  Wed Dec 10 05:59:29 2003
From: victor_ms at bol.com.br (Victor Lima)
Date: Wed, 10 Dec 2003 07:59:29 -0300
Subject: [Beowulf] About Linpack
Message-ID: <3FD6FC91.1030402@bol.com.br>

Hello All,
I have a problem with a Linpack (HPL) with my small LinuxRedHat 7.1 with 
kernel 2.4.20 Cluster (17 Machines) with Mosix and MPI.
When I try to execute xhpl this message apear on my screan:

mpirun -np X xhpl

Where X is the number 17, I try to change the file, hpl.dat, but nothin 
happend.


HPL ERROR from process # 0, on line 408 of function HPL_pdinfo:
 >>> Need at least 4 processes for these tests <<<

HPL ERROR from process # 0, on line 610 of function HPL_pdinfo:
 >>> Illegal input in file HPL.dat. Exiting ... <<<

HPL ERROR from process # 0, on line 408 of function HPL_pdinfo:
 >>> Need at least 4 processes for these tests <<<

HPL ERROR from process # 0, on line 610 of function HPL_pdinfo:
 >>> Illegal input in file HPL.dat. Exiting ... <<<

HPL ERROR from process # 0, on line 408 of function HPL_pdinfo:
 >>> Need at least 4 processes for these tests <<<

HPL ERROR from process # 0, on line 610 of function HPL_pdinfo:
 >>> Illegal input in file HPL.dat. Exiting ... <<<

HPL ERROR from process # 0, on line 408 of function HPL_pdinfo:
 >>> Need at least 4 processes for these tests <<<

HPL ERROR from process # 0, on line 610 of function HPL_pdinfo:
 >>> Illegal input in file HPL.dat. Exiting ... <<<


Some one here has the same problem?

+------------------------------------------------
Universidade Cat?lica Dom Bosco
Esp. Redes de Computadores
+55 67 312-3300
Campo Grande / Mato Grosso do Sul
BRAZIL


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Wed Dec 10 14:07:53 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 10 Dec 2003 14:07:53 -0500 (EST)
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD74AB8.2090802@drdc-rddc.gc.ca>
Message-ID: <Pine.LNX.4.44.0312101357070.11945-100000@ganesh.phy.duke.edu>

On Wed, 10 Dec 2003, Robin Laing wrote:

> > I have to ask.  Is it a P4?  Strictly empirically I have experienced
> > similar things even without filling memory.  I actually moved my
> > fileserver off onto a Celeron (which it has run flawlessly) because it
> > was so visible, so annoying.
> 
> Dell P4 with 512M ram.  IDE drive.

One thing Mark suggested (offline, I think) is that TOO MUCH memory can
confuse the caching system of at least some kernels.  Since I never
fully debugged this problem, but instead worked around it (a Celeron,
memory, motherboard, case costs maybe $350 and my time and annoyance are
worth much more than this) I don't know if this is true or not, but it
got to where it could actually crash the system when it was running as
an NFS server with lots of sporadic traffic.  It behaved like it was
swapping (and getting behind in swapping at that), but it wasn't.  It
may well have been a memory management problem, but it seemed pretty
specific to that system.

> > worked.  AMD or Celeron or PIII are all just fine.
> Even more reason for me to stick with AMD's.

Ya, me too.  Although the P4 has worked fine since I stopped making it a
server.  I still get rare mini-delays -- it seems a bit more sluggish
than a 1.8 MHz system with really fast memory has ANY business being --
but overall it is satisfactory.

> I find that my P90 at home with UWSCSI is faster most of the time than 
> my computer at work.
> 
> This thread has sure opened up some debate.  I didn't think it would 
> raise the number of issues it has.

Yeah, it's what I love about this list.  Ask the right question, and the
list generates what amounts to a textbook on the technology, tools, and
current best practice.  Poor Jeffrey then has to pick from all this and
condense it for CM.  By tonight, Jeffrey! ;-)

Oh wait, that's my deadline too...:-(

Grumble.  Off to the salt mines.  Except that I'm double parked, with
the final exam for the course I'm teaching being given in a few hours.
So Doug may have to wait a few days for this one...

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From horacio at acm.org  Wed Dec 10 13:50:59 2003
From: horacio at acm.org (Horacio Gonzalez-Velez)
Date: Wed, 10 Dec 2003 18:50:59 -0000
Subject: [Beowulf] Newbie in Beowulf
Message-ID: <002401c3bf4e$8d4305c0$33000e0a@RESNETHGV>

I need to do MPI programming in a Beowulf cluster.  I am porting from Sun
SOlaris to Beowulf so any pointers are extremely appreciated.

Thanks.

-- 
Horacio Gonzalez-Velez,
Institute for Computing Systems Architecture,
School of Informatics, JCMB-1420
Ph.: +44- (0) 131 650 5171 (direct)
Fax: +44- (0) 131 667 7209
University of Edinburgh,
e-mail:  H.Gonzalez-Velez at sms.ed.ac.uk, horacio at acm.org

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From gropp at mcs.anl.gov  Wed Dec 10 13:38:46 2003
From: gropp at mcs.anl.gov (William Gropp)
Date: Wed, 10 Dec 2003 12:38:46 -0600
Subject: [Beowulf] MPICH error
In-Reply-To: <20031210180551.M13163@scalableinformatics.com>
References: <20031210180551.M13163@scalableinformatics.com>
Message-ID: <6.0.0.22.2.20031210123625.025d8e88@localhost>

At 12:12 PM 12/10/2003, you wrote:
>Hi Folks:
>
>   A customer is seeing
>
>         rm_1310:  p4_error: rm_start: net_conn_to_listener failed: 33220
>
>on an MPI job.  Used to work (just last week).  Updated the kernel was the 
>major
>change (added XFS support)
>
>   Any idea of what this is?  I assume a network change.  MPICH 1.2.4.  Do 
> I need
>to recompile MPICH to match the kernel?

No, you shouldn't need to recompile MPICH.  The most likely cause is a 
change in how TCP connections are handled.  See 
http://www-unix.mcs.anl.gov/mpi/mpich/docs/faq.htm#linux-redhat for some 
suggestions.

Bill 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Wed Dec 10 14:22:30 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 10 Dec 2003 14:22:30 -0500 (EST)
Subject: [Beowulf] Newbie in Beowulf
In-Reply-To: <002401c3bf4e$8d4305c0$33000e0a@RESNETHGV>
Message-ID: <Pine.LNX.4.44.0312101412540.11945-100000@ganesh.phy.duke.edu>

On Wed, 10 Dec 2003, Horacio Gonzalez-Velez wrote:

> I need to do MPI programming in a Beowulf cluster.  I am porting from Sun
> SOlaris to Beowulf so any pointers are extremely appreciated.

MPI books from MIT press.  Beowulf book, also from MIT press.  Probably
more books out there.  Online book on beowulf engineering on

 http://www.phy.duke.edu/brahma

(along with many other resource links).  Ian Foster and others' books on
parallel programming in general.  Articles (past and present) in both
Linux Magazine and Cluster Magazine, some of Forrest's LM articles
online last I checked.  MPICH website (www.mpich.org), LAM website
(www.lam-mpi.org) with of course MANY resource links and tutorials as
well.

When you get through this, if you still need help as again; there are
lots of MPI programmers on the list that can help with specific
questions, but this should get you started generally speaking.

Note that nearly any way you set up a compute cluster, "true beowulf" or
NOW or background utilization of mostly idle boxes on a linux LAN, will
let you do MPI programming and run the result in parallel.  Cluster
distributions will generally install MPI ready to run, more or less.
Ordinary over the counter distributions e.g. Red Hat will generally
permit you to install it as part of the supported distribution as an
optional package.  As for MPICH vs LAM vs commercial offerings, I'm not
an MPI expert and have no religious feelings -- reportedly one is a bit
easier to run from userspace and the other a bit easier to control in a
managed environment, but this sort of thing is amorphous and hard to
quantify and time dependent, so I won't even say which is which.

  rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Wed Dec 10 16:25:16 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Wed, 10 Dec 2003 13:25:16 -0800
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BFA2@orsmsx402.jf.intel.com>

From: Robert G. Brown; Sent: Wednesday, December 10, 2003 11:08 AM
> 
> On Wed, 10 Dec 2003, Robin Laing wrote:
> 
> > > I have to ask.  Is it a P4?  Strictly empirically I have
experienced
> > > similar things even without filling memory.  I actually moved my
> > > fileserver off onto a Celeron (which it has run flawlessly)
because it
> > > was so visible, so annoying.
> >
> > Dell P4 with 512M ram.  IDE drive.
> 
> One thing Mark suggested (offline, I think) is that TOO MUCH memory
can
> confuse the caching system of at least some kernels.  Since I never
> fully debugged this problem, but instead worked around it (a Celeron,
> memory, motherboard, case costs maybe $350 and my time and annoyance
are
> worth much more than this) I don't know if this is true or not, but it
> got to where it could actually crash the system when it was running as
> an NFS server with lots of sporadic traffic.  It behaved like it was
> swapping (and getting behind in swapping at that), but it wasn't.  It
> may well have been a memory management problem, but it seemed pretty
> specific to that system.

This is very much like the kernel i/o tuning problems that I described
earlier, that were fixed by replacing the kernel (the offending kernel
was a 2.4.17 or 2.4.18), or in some cases, by tuning i/o parameters.

I first saw this on IPF systems with a very high-end I/O subsystem, I
later saw it on other fast 32-bit systems.  All involved significant I/O
traffic, -- the system would appear to hang for extended periods and
then continue on.  The impact ranged from annoying (the IPF) to
debilitating.  The underlying cause was in the use and retirement of
buffers by the kernel.  IIRC, the kernel got to the point of holding on
to too much cache, and then deciding it needed to dump it all before
continuing on.

As I said before, the problem was reported several times on the LK list.
The first reports were with really poor I/O devices, and were dismissed
as such, but later reports showed up with well configured I/O systems,
but any system with the right I/O load could trigger it.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From gabriele.butti at unimib.it  Thu Dec 11 10:25:06 2003
From: gabriele.butti at unimib.it (Butti Gabriele - Dottorati di Ricerca)
Date: 11 Dec 2003 16:25:06 +0100
Subject: [Beowulf] SWAP management
Message-ID: <1071156306.16827.53.camel@tantalio.mater.unimib.it>

Hi everybody,
      the question I am going to ask is not strictly releted to a
beowulf cluster but has to do with scientific computing in general, also
with scalar codes. I would like to learn more on how swap memory pages
are handled by a Linux OS. 
My problem is that when I'm running a code, it starts swapping even if
its memory requirements are lower than the total amount of memory
availble. For exaple if there 750 Mb of memory, the program swaps when
using only 450 Mb. How can avoid such a thing to happen? One solution
could be not to create any SWAP partition during the installation but I
think this is a very dramatic solution. 
Is there any other method to force a code to use only RAM ?
It seems that my Linux OS [RH 7.3 basically, in some cases SuSE 8.2]
tries to avoid that the percentage of memory used by a single process
becomes higher than 60-70 %.
Any idea would be appreciated.
TIA
Gabriel
-- 
                                 \\|//
                                -(o o)-
              /------------oOOOo--(_)--oOOOo-------------\
              |                                          |
              |             Gabriele Butti               |
              |        -----------------------           |
              |      Department of Material Science      |     
              |      University of Milano-Bicocca        |     
              |      Via Cozzi 53, 20125 Milano, ITALY   |     
              |      Tel (+39)02 64485214                |           
              |             .oooO   Oooo.                |
              \--------------(   )---(   )---------------/
                              \ (     ) /
                               \_)   (_/


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Thu Dec 11 12:02:39 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Thu, 11 Dec 2003 12:02:39 -0500 (EST)
Subject: [Beowulf] SWAP management
In-Reply-To: <1071156306.16827.53.camel@tantalio.mater.unimib.it>
Message-ID: <Pine.LNX.4.44.0312111139550.32494-100000@coffee.psychology.mcmaster.ca>

> with scalar codes. I would like to learn more on how swap memory pages
> are handled by a Linux OS. 

in Linux, there is user memory and kernel memory.  the latter is unswappable,
and only for internal kernel uses, though that includes some user-visible
caches like dcache.  it's not anything you can do anything about, so I'll 
ignore it here.

user-level memory includes cached pages of files, user-level stack or sbrk
heap, mmaped shared libraries, MAP_ANON memory, etc.  some of this is what 
you think of as being part of your process's virtual address space.  other
pages are done behind your back - especially caching of file-backed pages.
all IO normally goes through the page cache and thus competes for physical
pages with all the other page users.  this means that by doing a lot of IO,
you can cause enough page scavenging to force other pages (sufficiently idle)
out to swap or backing store.  (for instance, backing store of an mmaped file
is the file itself, on disk.)

> My problem is that when I'm running a code, it starts swapping even if
> its memory requirements are lower than the total amount of memory
> availble. For exaple if there 750 Mb of memory, the program swaps when
> using only 450 Mb.

are you also doing a lot of file IO?

with IO, the problem is that pages doing IO are "hot looking" to the kernel,
since they are touched by the device driver as well as userspace.  the kernel
will tend to leave them in the pagecache at the expense of other kinds of
pages, which may not be touched as often or fast.  in a way, this is really
a problem with the kernel simply not having enough memory for the properties
of a virtual page.

> How can avoid such a thing to happen?

there is NOTHING wrong with swapping, since it is merely the kernel trying 
to find the set of pages that make the best use of a limited amount of ram.
a moderate amount of swap OUT traffic is very much a good thing, since 
it means that old/idle processes won't clutter up your ram which could be 
more effectively used by something recent.

the problem (if any) is swap IN - especially when there's also swapouts
happening.  when this happens, it means that the kernel is choosing the wrong
pages to swap out, and is winding up having to read them back in immediately.
this is called "thrashing", and barring kernel bugs (such as early 2.4
kernels) the only solution is to add more ram.

> One solution
> could be not to create any SWAP partition during the installation but I
> think this is a very dramatic solution. 

disk is very cheap; ram is still a lot more expensive.  a modest amount of 
swapouts are really a tradeoff: move idle ram pages into cheap disk so the 
expensive ram can be used for something more important.

> Is there any other method to force a code to use only RAM ?

of course: mlock.

> It seems that my Linux OS [RH 7.3 basically, in some cases SuSE 8.2]
> tries to avoid that the percentage of memory used by a single process
> becomes higher than 60-70 %.

I don't believe there is any such heuristic.  it wouldn't have anything to do 
with the distribution, of course, only with the kernel.

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From mwheeler at startext.co.uk  Thu Dec 11 12:38:11 2003
From: mwheeler at startext.co.uk (Martin WHEELER)
Date: Thu, 11 Dec 2003 17:38:11 +0000 (UTC)
Subject: [Beowulf] Re: [OT] statistical calculations - report
Message-ID: <Pine.LNX.4.33.0312111723290.4108-100000@caxton.startext.demon.co.uk>

Many thanks to all who replied to me both on- and off-list; in
particular those who pointed me towards the ability to create customised
R and Python plugins for gnumeric.  (About which I knew nothing.)
Although I can't do anything about the use of spreadsheet technology in
the first place, at least yesterday I was enable to muster up enough
backup to be able to influence the choice of /which/ spreadsheet I will
be expected to use.

Also thanks to those who made practical suggestions concerning the use
of postgres/mysql databases; this was enough to convince me I had to do
something about certain areas of (natural language) data manipulation by
myself; and eschew the spreadsheet for something that more naturally
fits the way I work!

Regards,
-- 
Martin Wheeler   -   StarTEXT / AVALONIX - Glastonbury - BA6 9PH - England
mwheeler at startext.co.uk                http://www.startext.co.uk/mwheeler/
GPG pub key : 01269BEB  6CAD BFFB DB11 653E B1B7 C62B  AC93 0ED8 0126 9BEB
      - Share your knowledge. It's a way of achieving immortality. -


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rmd003 at sympatico.ca  Thu Dec 11 17:14:40 2003
From: rmd003 at sympatico.ca (rmd003 at sympatico.ca)
Date: Thu, 11 Dec 2003 17:14:40 -0500
Subject: [Beowulf] Simple Cluster
Message-ID: <3FD8EC50.7060606@sympatico.ca>

Hello,

Would anyone know if it is possible to make a cluster with four P1 
computers? If it is possible are there any instructions on how to do 
this or the software required etc...?

Robert Van Amelsvoort
rmd003 at sympatico.ca

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From andrewxwang at yahoo.com.tw  Thu Dec 11 20:15:02 2003
From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=)
Date: Fri, 12 Dec 2003 09:15:02 +0800 (CST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <3FD8EC50.7060606@sympatico.ca>
Message-ID: <20031212011502.19428.qmail@web16811.mail.tpe.yahoo.com>

It all depends on what you want to do with the
cluster.

Andrew.

--- rmd003 at sympatico.ca ????
> Hello,
> 
> Would anyone know if it is possible to make a
> cluster with four P1 
> computers? If it is possible are there any
> instructions on how to do 
> this or the software required etc...?
> 
> Robert Van Amelsvoort
> rmd003 at sympatico.ca
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or
> unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf 

-----------------------------------------------------------------
??? Yahoo!??
??????????????????????
http://tw.promo.yahoo.com/mail_premium/stationery.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Fri Dec 12 07:25:03 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 12 Dec 2003 07:25:03 -0500 (EST)
Subject: [Beowulf] SWAP management
In-Reply-To: <Pine.LNX.4.44.0312111139550.32494-100000@coffee.psychology.mcmaster.ca>
Message-ID: <Pine.LNX.4.44.0312120654090.1868-100000@lilith.rgb.private.net>

On Thu, 11 Dec 2003, Mark Hahn wrote:

> > It seems that my Linux OS [RH 7.3 basically, in some cases SuSE 8.2]
> > tries to avoid that the percentage of memory used by a single process
> > becomes higher than 60-70 %.
> 
> I don't believe there is any such heuristic.  it wouldn't have anything to do 
> with the distribution, of course, only with the kernel.

To add to Mark's comment, it is not exactly easy to see what's going on
with a system's memory usage.  Using top and/or vmstat for starters --
vmstat 5 will let you differentiate "swap" events from other paging and
disk activity (possibly associated with applications) while letting you
see memory consumption in real time.  top will give you a lovely picture
of the active process space that auto-updates ever (interval) seconds.
If you enter M, it will toggle into a mode where the list is sorted by
memory consumption instead of run queue (which I find often misses
problems, or rather flashes them up only rarely).  You can then look at
Size (full virtual memory allocation of process) and RSS (space the
process is actually using in memory at the time) while looking at total
memory and swap usage in the header.

Note well that the "used/free" part of memory is not an accurate
reflection of the system's available memory in this display -- to get
that you have to subtract buffer and cached memory from the used
component.  This yields the memory that CAN be made available to
a process if all the cached pages are paged out and all the buffers
flushed and freed.  Linux does NOT like to run in a mode with no cache
and buffer space as it is not efficient -- one reason linux generally
appears so smooth and fast is that a rather large fraction of the time
"I/O" from slow resources is actually served from the cache and "I/O" to
slow resources is actually written into a buffer so that the task can
continue unblocked.  If you do suck up all the free memory, it will then
fuss a bit and try paging things out to free up at least a small bit of
cache/buffer space.

Note that a small amount of swap space usage is fairly normal and
doesn't mean that your system is "swapping".  A small amount of swap
out events is also normal ditto.  It's the swap ins that are more of a
problem.

One problem that can be very difficult to detect is a problem with a
daemon or networking stack.  A runaway forking daemon can consume large
amounts of resources and clutter your system with processes.  A runaway
networking application that is trying to make connections on a "bad"
port or networking connection can sometimes contain a loop that e.g.
tries to make a socket and succeeds, whereby the connection breaks and
the socket has to terminate, which takes a timeout.  I've seen loops
that would leave you with a - um - "large number" of these dying
sockets, which again suck up resources and may or may not eventually
cause problems.  There used to be a similar problem with zombie
processes and I suppose there still is if you right just the right code,
but I haven't seen an actual zombie for a long time.

Note also that top and to a less detailed extent vmstat give you a way
of seeing whether or not an application is leaking.  If a system
"suddenly" starts paging/swapping, chances are really, really good that
one of your applications is leaking sieve-like.  Having written a number
of applications myself which I proudly acknowledge leaked like a
sumbitch until I finally tracked them down with free plumber's putty, I
know just how bone-simple it is to do, especially if you use certain
libraries (e.g. libxml*) where nearly everything you handle is a pointer
to space malloc'd by a called routine that has to be freed before you
reuse it.  top with M can help a bit -- watch that Size and if it grows
while RSS remains constant, suspect a leak.

Finally, a few programs may or may not leak, but they constitute a big
sucking noise when run on your system.  Open Office, for example, is
lovely but uses more memory than X itself (which is also rather a pig).
Some of the gnome apps are similarly quite large and tend to have RSS
close to SIZE.  In general, if you are running a GUI, it is not at all
unlikely that you're using 100 MB or more and might be using several
hundred MB.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Fri Dec 12 08:40:12 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 12 Dec 2003 08:40:12 -0500 (EST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <3FD8EC50.7060606@sympatico.ca>
Message-ID: <Pine.LNX.4.44.0312120834490.2033-100000@lilith.rgb.private.net>

On Thu, 11 Dec 2003 rmd003 at sympatico.ca wrote:

> Hello,
> 
> Would anyone know if it is possible to make a cluster with four P1 
> computers? If it is possible are there any instructions on how to do 

Sure.  There are instructions in my column in Cluster World 1,1 that
should suffice.  There is also a bunch of stuff that might be enough in
resources linked to http:/www.phy.duke.edu/brahma/index.php, including
an online book on clusters.  You can probably get free issues including
this one with a trial subscription at the clusterworld website.

The problems I can see with using Pentiums at this point are:

  a) likely insufficient memory and disk unless you really work on the
linux installation;

  b) a single $500 vanilla box from your local cheap vendor would be
MUCH MUCH MUCH faster.  As in MUCH faster.  Raw CPU clock a factor of
10, add a factor of 2 to 4 for CPU family and more memory and so forth.
Likely ten or more times faster than your entire cluster of four
Pentiums on a good day.  SO your cluster needs to be a "just for fun"
cluster, for hobbyist or teaching purposes, and would still be much
better (faster and easier to build) with more current CPUs and systems
with a minimum of 128 to 256 MB of memory each.

   rgb

> this or the software required etc...?
> 
> Robert Van Amelsvoort
> rmd003 at sympatico.ca
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From alvin at Mail.Linux-Consulting.com  Fri Dec 12 09:10:15 2003
From: alvin at Mail.Linux-Consulting.com (Alvin Oga)
Date: Fri, 12 Dec 2003 06:10:15 -0800 (PST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <Pine.LNX.4.44.0312120834490.2033-100000@lilith.rgb.private.net>
Message-ID: <Pine.LNX.3.96.1031212060516.12940A-100000@Maggie.Linux-Consulting.com>


On Fri, 12 Dec 2003, Robert G. Brown wrote:

> On Thu, 11 Dec 2003 rmd003 at sympatico.ca wrote:
> 
> > Hello,
> > 
> > Would anyone know if it is possible to make a cluster with four P1 
> > computers? If it is possible are there any instructions on how to do 

only good thing that wuld come out of it would be learning what files
need to be changed  to get a cluster working

> Sure.  There are instructions in my column in Cluster World 1,1 that
> should suffice.  There is also a bunch of stuff that might be enough in
> resources linked to http:/www.phy.duke.edu/brahma/index.php, including
> an online book on clusters.  You can probably get free issues including
> this one with a trial subscription at the clusterworld website.
> 
> The problems I can see with using Pentiums at this point are:
> 
>   a) likely insufficient memory and disk unless you really work on the
> linux installation;
> 
>   b) a single $500 vanilla box from your local cheap vendor would be
> MUCH MUCH MUCH faster.  As in MUCH faster.  Raw CPU clock a factor of

now days.. you can get a brand new mini-itx P3-800 equivalent for $125
and you can even use the old memory from the Pentium ( the p3-800 uses
pc-133 memory .. amazingly silly.. .. p3-800 is the EPIA-800 )
	- just the diference in time spent waiting for the old pentiums
	vs the mini-itx would make the mini-itx a better choice
	since you can have a useful cluster after playing

	- but than again, one of my 3 primary "useful" machine is still a
	p-90 w/ 48MB of memory ( primary == used everyday by me )

have fun
alvin

> 10, add a factor of 2 to 4 for CPU family and more memory and so forth.
> Likely ten or more times faster than your entire cluster of four
> Pentiums on a good day.  SO your cluster needs to be a "just for fun"
> cluster, for hobbyist or teaching purposes, and would still be much
> better (faster and easier to build) with more current CPUs and systems
> with a minimum of 128 to 256 MB of memory each.
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Fri Dec 12 09:51:51 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Fri, 12 Dec 2003 06:51:51 -0800
Subject: [Beowulf] SWAP management
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BFBD@orsmsx402.jf.intel.com>

From: Robert G. Brown; Sent: Friday, December 12, 2003 4:25 AM
> On Thu, 11 Dec 2003, Mark Hahn wrote:
> 
> > > It seems that my Linux OS [RH 7.3 basically, in some cases SuSE
8.2]
> > > tries to avoid that the percentage of memory used by a single
process
> > > becomes higher than 60-70 %.
> >
> > I don't believe there is any such heuristic.  it wouldn't have
anything
> to do
> > with the distribution, of course, only with the kernel.
> 
> To add to Mark's comment, it is not exactly easy to see what's going
on
> with a system's memory usage.  Using top and/or vmstat for starters --
> vmstat 5 will let you differentiate "swap" events from other paging
and
> disk activity (possibly associated with applications) while letting
you
> see memory consumption in real time.  top will give you a lovely
picture
> of the active process space that auto-updates ever (interval) seconds.
> If you enter M, it will toggle into a mode where the list is sorted by
> memory consumption instead of run queue (which I find often misses
> problems, or rather flashes them up only rarely).  You can then look
at
> Size (full virtual memory allocation of process) and RSS (space the
> process is actually using in memory at the time) while looking at
total
> memory and swap usage in the header.

I find that atop is a valuable tool to see what going on in a system,
much better than standard top.

Atop doesn't display inactive processes, so your display isn't clutter
with processes you don't care about, regardless of your sort; atop also
shows the growth of both virtual and resident memory.  In addition, atop
also gives a very good look at the system, including cpu, memory, disk,
and network.

One final Good Thing, atop can keep raw data in files that you can
"replay" later, allowing you to see a time-history of activity on the
node.

Take a look at ftp://ftp.atcomputing.nl/pub/tools/linux/

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From james.p.lux at jpl.nasa.gov  Fri Dec 12 10:32:57 2003
From: james.p.lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 12 Dec 2003 07:32:57 -0800
Subject: [Beowulf] Simple Cluster
References: <3FD8EC50.7060606@sympatico.ca>
Message-ID: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>

Sure you can do it. It won't be a ball of fire speed wise, and probably
wouldn't be a cost effective solution to doing any "real work", but it will
compute..

Search the web for the "Pondermatic" which, as I recall, was a couple or
three P1s.  And of course, very early clusters were made with 486's.

Your big challenge is probably going to be (easily) getting an appropriate
distribution that fits within the disk and RAM limits.  Yes, before all the
flames start, I know it's possible to make a version that fits in 16K on an
8088, and that would be bloatware compared to someone's special 6502 Linux
implementation that runs on old Apple IIs, etc.etc.etc., but nobody would
call that easy.  What Robert is probably looking for is a "stick the CDROM
in and go" kind of solution, and, just like in the Windows world, the
current, readily available (as in download the ISO and go) solutions tend to
assume one has a vintage 2001 computer sitting around with a several hundred
MHz processor and 64MB of RAM, etc.

Actually, I'd be very glad to hear that this is not the case..

Maybe one of the old Scyld "cluster on a disk" might be a good way?

Perhaps Rocks?  It sort of self installs.

One could always just boot 4 copies of Knoppix, but I don't know that
there's many "cluster management" tools in Knoppix.

----- Original Message -----
From: <rmd003 at sympatico.ca>
To: <beowulf at beowulf.org>
Sent: Thursday, December 11, 2003 2:14 PM
Subject: [Beowulf] Simple Cluster


> Hello,
>
> Would anyone know if it is possible to make a cluster with four P1
> computers? If it is possible are there any instructions on how to do
> this or the software required etc...?
>
> Robert Van Amelsvoort
> rmd003 at sympatico.ca
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From mbanck at gmx.net  Fri Dec 12 10:50:22 2003
From: mbanck at gmx.net (Michael Banck)
Date: Fri, 12 Dec 2003 16:50:22 +0100
Subject: [Beowulf] Simple Cluster
In-Reply-To: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
References: <3FD8EC50.7060606@sympatico.ca> <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <20031212155022.GB25554@blackbird.oase.mhn.de>

On Fri, Dec 12, 2003 at 07:32:57AM -0800, Jim Lux wrote:
> One could always just boot 4 copies of Knoppix, but I don't know that
> there's many "cluster management" tools in Knoppix.
 
While Knoppix is all cool with that self-configuration and stuff, I've
never heard it mentioned when it came to low-level hardware and RAM
requirements. Sure, one must not boot up in KDE|GNOME, but I doubt that
even the console mode has a small memory footprint. I'd like to be
proven wrong, of course :)


Michael
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From jeffrey.b.layton at lmco.com  Fri Dec 12 11:50:43 2003
From: jeffrey.b.layton at lmco.com (Jeff Layton)
Date: Fri, 12 Dec 2003 11:50:43 -0500
Subject: [Beowulf] Simple Cluster
In-Reply-To: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
References: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <3FD9F1E3.2080805@lmco.com>


   I can think of three solutions. The first one I can think of is
called clusterKnoppix (bofh.be/clusterknoppix/). It has
OpenMOSIX built-in so you can run compute farm types
of applications (and you get to learn about OpenMOSIX).
You can also run MPI and PVM apps on it.
   The second one I can think of is Warewulf
(warewulf-cluster.org). The primary 'mode' of it allows
you to boot the nodes over the network to a RAM disk
about 70 Megs in size. You could also boot of a CD or floppy
and then pull the install over the network.
   The third one is called Bootable Cluster CD
(www.cs.uni.edu/~gray/bccd/). It is somewhat like
clusterKnoppix but I'm not sure it uses OpenMOSIX.
   A fourth alternative might be Thin-Oscar
(thin-oscar.ccs.usherbrooke.ca/). I don't think it's ready
for prime-time, but you might take a look.

Good Luck!

Jeff

> Sure you can do it. It won't be a ball of fire speed wise, and probably
> wouldn't be a cost effective solution to doing any "real work", but it 
> will
> compute..
>
> Search the web for the "Pondermatic" which, as I recall, was a couple or
> three P1s.  And of course, very early clusters were made with 486's.
>
> Your big challenge is probably going to be (easily) getting an 
> appropriate
> distribution that fits within the disk and RAM limits.  Yes, before 
> all the
> flames start, I know it's possible to make a version that fits in 16K 
> on an
> 8088, and that would be bloatware compared to someone's special 6502 
> Linux
> implementation that runs on old Apple IIs, etc.etc.etc., but nobody would
> call that easy.  What Robert is probably looking for is a "stick the 
> CDROM
> in and go" kind of solution, and, just like in the Windows world, the
> current, readily available (as in download the ISO and go) solutions 
> tend to
> assume one has a vintage 2001 computer sitting around with a several 
> hundred
> MHz processor and 64MB of RAM, etc.
>
> Actually, I'd be very glad to hear that this is not the case..
>
> Maybe one of the old Scyld "cluster on a disk" might be a good way?
>
> Perhaps Rocks?  It sort of self installs.
>
> One could always just boot 4 copies of Knoppix, but I don't know that
> there's many "cluster management" tools in Knoppix.
>
> ----- Original Message -----
> From: <rmd003 at sympatico.ca>
> To: <beowulf at beowulf.org>
> Sent: Thursday, December 11, 2003 2:14 PM
> Subject: [Beowulf] Simple Cluster
>
>
> > Hello,
> >
> > Would anyone know if it is possible to make a cluster with four P1
> > computers? If it is possible are there any instructions on how to do
> > this or the software required etc...?
> >
> > Robert Van Amelsvoort
> > rmd003 at sympatico.ca
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
>


-- 
Dr. Jeff Layton
Aerodynamics and CFD
Lockheed-Martin Aeronautical Company - Marietta


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From henken at seas.upenn.edu  Fri Dec 12 12:24:39 2003
From: henken at seas.upenn.edu (Nicholas Henke)
Date: Fri, 12 Dec 2003 12:24:39 -0500
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
Message-ID: <1071249879.25601.14.camel@roughneck.liniac.upenn.edu>

On Fri, 2003-12-12 at 12:08, Joshua Baker-LePain wrote:
> Yes, I know this has been discussed a couple of times, and that my stated 
> goals are at odds with each other.  But I really need the best bang for 
> the noise for a system that will reside in the same room with patients 
> undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
> 2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
> and annoys both the patients and the physicians.  This is Bad.
> 
> We're willing to pay for better, but don't want to take too much of a 
> speed hit.  Does anybody have a good vendor for quiet but still high 
> performing systems?  Is there any hope in the 1U form factor (my Opteron 
> nodes are somewhat quieter, since they use squirrel cage fans, but are 
> still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
> or 4U case?  Or should I look at lashing together some towers (this system 
> also needs to be somewhat portable)?

They are not 1U, but the Dell 650N I have is just about silent. At most
I hear a faint harddrive noise, but most times I hear nothing at all. 
FYI, this is a dual processor machine as well.

Nic
-- 
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From jlb17 at duke.edu  Fri Dec 12 12:08:31 2003
From: jlb17 at duke.edu (Joshua Baker-LePain)
Date: Fri, 12 Dec 2003 12:08:31 -0500 (EST)
Subject: [Beowulf] Quiet *and* powerful
Message-ID: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>

Yes, I know this has been discussed a couple of times, and that my stated 
goals are at odds with each other.  But I really need the best bang for 
the noise for a system that will reside in the same room with patients 
undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
and annoys both the patients and the physicians.  This is Bad.

We're willing to pay for better, but don't want to take too much of a 
speed hit.  Does anybody have a good vendor for quiet but still high 
performing systems?  Is there any hope in the 1U form factor (my Opteron 
nodes are somewhat quieter, since they use squirrel cage fans, but are 
still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
or 4U case?  Or should I look at lashing together some towers (this system 
also needs to be somewhat portable)?

Thanks for any hints, pointers, recommendations, or flames.

-- 
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From alvin at Mail.Linux-Consulting.com  Fri Dec 12 12:54:09 2003
From: alvin at Mail.Linux-Consulting.com (Alvin Oga)
Date: Fri, 12 Dec 2003 09:54:09 -0800 (PST)
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <1071249879.25601.14.camel@roughneck.liniac.upenn.edu>
Message-ID: <Pine.LNX.3.96.1031212094142.11569A-100000@Maggie.Linux-Consulting.com>


hi ya joshua

- what makes noise is typiclly the qualityof the fan, the 
  fan blade design and the size of the air holes and
  distance to the fans

- a fan held up in the open air shold be close to noiseless

- if you want 1U .. you have to put good quality lateral squirrel cages
  far away from everything and still be able to force air
  across the cpu heatsink fins 
	- you should not hear anything

- if you go to 12U or midtower... there is no noise problem
  except for the cheezy "el cheapo" power supply fan 
	( get a good power supply w/ good fan and you wont hear
	( the power supply either

- next choice isto use peltier cooling but you still have to cool
  down the fin on the hot side of the peltier..
	- you can also attach a bracket from teh cpu heatink
	or peltier heatsink to the case ... to get rid of the heat
	assuming the ambient room temp can pull heat off the case

- sounds like your app is based on "quiet operation" and does
  not need to be 1Us ...
	- i'd stack 6 dual-xons mb into one custom chassis and
	it should be quiet as a "nursing room"

== to prove the point ...
	- take all the fans off ( its not needed for this test )

	- take off all the covers to the chassis

	- arrange the motherboards all facing the same way

	- put a giant 12" household fan blowing air across
	the cpu heatink ( air flowing only in 1 direction )
		- preferably side to side in the direction
		of the cpu heatsink fins

	- put a carboard box around the chassis and leave the
	unobstructed air flow of the cardboard opn on the
	cpu side and opposite site
		- put white hospital linen on the box
		that says "do not sit here"

	( probably should do that with the doors locked so
	( that nobody see the cardboard experiment

- after that, you know what your chassis looks like ...
  and still be quiet ..  or you're stuck with 6 mid-tower systems
  vs noisy 1Us
	- 2Us suffer the same noise fate as 1Us ( the way most people
	build it )

- fun stuff .. making the system quiet and run cool ... temperature wise

have fun
alvin

On Fri, 12 Dec 2003, Nicholas Henke wrote:

> On Fri, 2003-12-12 at 12:08, Joshua Baker-LePain wrote:
> > Yes, I know this has been discussed a couple of times, and that my stated 
> > goals are at odds with each other.  But I really need the best bang for 
> > the noise for a system that will reside in the same room with patients 
> > undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
> > 2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
> > and annoys both the patients and the physicians.  This is Bad.
> > 
> > We're willing to pay for better, but don't want to take too much of a 
> > speed hit.  Does anybody have a good vendor for quiet but still high 
> > performing systems?  Is there any hope in the 1U form factor (my Opteron 
> > nodes are somewhat quieter, since they use squirrel cage fans, but are 
> > still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
> > or 4U case?  Or should I look at lashing together some towers (this system 
> > also needs to be somewhat portable)?
> 
> They are not 1U, but the Dell 650N I have is just about silent. At most
> I hear a faint harddrive noise, but most times I hear nothing at all. 
> FYI, this is a dual processor machine as well.
> 
> Nic
> -- 
> Nicholas Henke
> Penguin Herder & Linux Cluster System Programmer
> Liniac Project - Univ. of Pennsylvania
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Fri Dec 12 13:10:51 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Fri, 12 Dec 2003 10:10:51 -0800
Subject: [Beowulf] Quiet *and* powerful
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BFBE@orsmsx402.jf.intel.com>

From: Joshua Baker-LePain; Sent: Friday, December 12, 2003 9:09 AM
> 
> Yes, I know this has been discussed a couple of times, and that my
stated
> goals are at odds with each other.  But I really need the best bang
for
> the noise for a system that will reside in the same room with patients
> undergoing diagnostic ultrasound scanning.  Our current setup (6 1U
dual
> 2.4GHz Xeons, with about 11 little fans per node) is ridiculously
loud,
> and annoys both the patients and the physicians.  This is Bad.

It's those tiny high-speed (< 1U) fans that are killing you.

Cheapest solution: Move the system out of the room?  Have you looked at
just running a network cable to a minimal diskless system for the
in-room needs?  I assume those needs are graphic head plus some manner
of sensor input.  The in-room unit could boot from the cluster, located
elsewhere.

In-room solution, but possibly above your price range: Go to a cluster
builder for a custom solution that removes the p/s and fans from each
box, centralizes the larger and slower fan(s) and p/s in the cabinet,
running dc to each node.

Depending on your skill and labor availability (I did see duke.edu in
your addr), you might be able to do this yourself or get some, um, cheap
labor.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From landman at scalableinformatics.com  Fri Dec 12 13:03:00 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Fri, 12 Dec 2003 13:03:00 -0500
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
Message-ID: <3FDA02D4.3070209@scalableinformatics.com>

Hi Joshua:

  You should probably look to larger cases with larger fans.  The bigger 
fans move more air at the same RPM.  Also, larger cases are easier to 
pad for sound absorption.  The Xeon's I have seen have been using blower 
technology which is simply not quiet.  A 2-3 U system might be easier to 
cool with a larger fan (~10+ cm).  A better case would help as well if 
you could pad it without drastically affecting cooling (airflow).

   Other options include silencing enclosures (enclosures with acoustic 
padding) to encapsulate the existing systems.  These reduce roars to 
hums, annoying but lower intensity.

Joe

Joshua Baker-LePain wrote:

>Yes, I know this has been discussed a couple of times, and that my stated 
>goals are at odds with each other.  But I really need the best bang for 
>the noise for a system that will reside in the same room with patients 
>undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
>2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
>and annoys both the patients and the physicians.  This is Bad.
>
>We're willing to pay for better, but don't want to take too much of a 
>speed hit.  Does anybody have a good vendor for quiet but still high 
>performing systems?  Is there any hope in the 1U form factor (my Opteron 
>nodes are somewhat quieter, since they use squirrel cage fans, but are 
>still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
>or 4U case?  Or should I look at lashing together some towers (this system 
>also needs to be somewhat portable)?
>
>Thanks for any hints, pointers, recommendations, or flames.
>
>  
>

-- 
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 12 13:30:44 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 12 Dec 2003 10:30:44 -0800
Subject: [Beowulf] Simple Cluster
In-Reply-To: <20031212155022.GB25554@blackbird.oase.mhn.de>
References: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
 <3FD8EC50.7060606@sympatico.ca>
 <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <5.2.0.9.2.20031212102848.02fa4e70@mailhost4.jpl.nasa.gov>

At 04:50 PM 12/12/2003 +0100, Michael Banck wrote:
>On Fri, Dec 12, 2003 at 07:32:57AM -0800, Jim Lux wrote:
> > One could always just boot 4 copies of Knoppix, but I don't know that
> > there's many "cluster management" tools in Knoppix.
>
>While Knoppix is all cool with that self-configuration and stuff, I've
>never heard it mentioned when it came to low-level hardware and RAM
>requirements. Sure, one must not boot up in KDE|GNOME, but I doubt that
>even the console mode has a small memory footprint. I'd like to be
>proven wrong, of course :)


I have booted Knoppix into command line mode in 64MB, and maybe 32MB.. I'll 
have to go down into the lab and check.  These are ancient Micron Win95 ISA 
machines we have to run old hardware specific in-circuit-emulators from 
Analog Devices.  One machine didn't work, but I think that was because the 
CD-ROM is broken, not because of other resources.


>Michael
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit 
>http://www.beowulf.org/mailman/listinfo/beowulf

James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Fri Dec 12 12:42:24 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Fri, 12 Dec 2003 09:42:24 -0800
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <1071249879.25601.14.camel@roughneck.liniac.upenn.edu>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu> <1071249879.25601.14.camel@roughneck.liniac.upenn.edu>
Message-ID: <20031212174224.GA24197@cse.ucdavis.edu>

On Fri, Dec 12, 2003 at 12:24:39PM -0500, Nicholas Henke wrote:
> > We're willing to pay for better, but don't want to take too much of a 
> > speed hit.  Does anybody have a good vendor for quiet but still high 
> > performing systems?  Is there any hope in the 1U form factor (my Opteron 
> > nodes are somewhat quieter, since they use squirrel cage fans, but are 
> > still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
> > or 4U case?  Or should I look at lashing together some towers (this system 
> > also needs to be somewhat portable)?
> 
> They are not 1U, but the Dell 650N I have is just about silent. At most
> I hear a faint harddrive noise, but most times I hear nothing at all. 
> FYI, this is a dual processor machine as well.

I'd recommend trying the 360N, I've seen the single p4 substantially
outperform the dual (even with 2 jobs running).

Basically the memory bus is substantially better on the 360N, and it's
even quieter then the 650N.  Of course this depends on the worldload.

I've never heard a quiet 1 or 2U.  Even the apple xserv's are pretty
loud.

If building yourself I recommend a case with rubber grommets, and
slow RPM 120mm fans similarly mounted.  The Antec Sonnata is an example.

Other possibilities include placing the servers elsewhere and using
a small quiet machine with an LCD/keyboard/mouse.
-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 12 13:27:22 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 12 Dec 2003 10:27:22 -0800
Subject: [Beowulf] Simple Cluster
In-Reply-To: <Pine.LNX.4.44.0312120834490.2033-100000@lilith.rgb.private
 .net>
References: <3FD8EC50.7060606@sympatico.ca>
Message-ID: <5.2.0.9.2.20031212102323.018cc750@mailhost4.jpl.nasa.gov>

At 08:40 AM 12/12/2003 -0500, Robert G. Brown wrote:
>On Thu, 11 Dec 2003 rmd003 at sympatico.ca wrote:
>
> > Hello,
> >
> > Would anyone know if it is possible to make a cluster with four P1
> > computers? If it is possible are there any instructions on how to do
>
>Sure.  There are instructions in my column in Cluster World 1,1 that
>should suffice.  There is also a bunch of stuff that might be enough in
>resources linked to http:/www.phy.duke.edu/brahma/index.php, including
>an online book on clusters.  You can probably get free issues including
>this one with a trial subscription at the clusterworld website.
>
>The problems I can see with using Pentiums at this point are:
>
>   a) likely insufficient memory and disk unless you really work on the
>linux installation;
>
>   b) a single $500 vanilla box from your local cheap vendor would be
>MUCH MUCH MUCH faster.  As in MUCH faster.  Raw CPU clock a factor of
>10, add a factor of 2 to 4 for CPU family and more memory and so forth.
>Likely ten or more times faster than your entire cluster of four
>Pentiums on a good day.  SO your cluster needs to be a "just for fun"
>cluster, for hobbyist or teaching purposes, and would still be much
>better (faster and easier to build) with more current CPUs and systems
>with a minimum of 128 to 256 MB of memory each.


Unless you've got computers for free, and your time is free, Robert's words 
are well spoken..

That said.. if you just want to fool with MPI, for instance, and, you've 
got institutional computing resources running WinNT floating around on the 
network, the MPICH-NT version works quite well.  My first MPI program used 
this, with one node being an OLD, OLD ('98-'99 vintage) Win NT4.0 box on a 
P1, and the other node being a PPro desktop, also running NT4.0

I wrote and compiled everything in Visual C... (4 or 5, I can't recall 
which...) and I started working on a wrapper to allow use in Visual Basic, 
for a true thrill..


James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lindahl at pathscale.com  Fri Dec 12 14:04:05 2003
From: lindahl at pathscale.com (Greg Lindahl)
Date: Fri, 12 Dec 2003 11:04:05 -0800
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
Message-ID: <20031212190405.GA3036@greglaptop.internal.keyresearch.com>

On Fri, Dec 12, 2003 at 12:08:31PM -0500, Joshua Baker-LePain wrote:

> Our current setup (6 1U dual 
> 2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 

The main issue is 1U -- small fans are inefficient, so you end up with
a lot more noise for a given amount of power.

-- greg

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ShiYi.Yue at astrazeneca.com  Fri Dec 12 14:31:11 2003
From: ShiYi.Yue at astrazeneca.com (ShiYi.Yue at astrazeneca.com)
Date: Fri, 12 Dec 2003 20:31:11 +0100
Subject: [Beowulf] Pros and cons of different beowulf clusters
Message-ID: <D2A2B86E8730D711B8560008028AC980257A22@camrd9.camrd.astrazeneca.net>

Hi,

Can someone point me out if there is any comparison of different (small)
beowulf clusters? The hardware will be limited in < 20 PCs. As an example of
this comparison, something like Rocks vs. OSCAR, what do you think about the
installation, maintenance, and upgrade, which one is easier? which one is
more flexible?

Thank you in advance!

shiyi
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From dtj at uberh4x0r.org  Fri Dec 12 14:57:07 2003
From: dtj at uberh4x0r.org (Dean Johnson)
Date: Fri, 12 Dec 2003 13:57:07 -0600
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <20031212190405.GA3036@greglaptop.internal.keyresearch.com>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
	 <20031212190405.GA3036@greglaptop.internal.keyresearch.com>
Message-ID: <1071259026.1556.124.camel@terra>

On Fri, 2003-12-12 at 13:04, Greg Lindahl wrote:
> On Fri, Dec 12, 2003 at 12:08:31PM -0500, Joshua Baker-LePain wrote:
> 
> > Our current setup (6 1U dual 
> > 2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
> 
> The main issue is 1U -- small fans are inefficient, so you end up with
> a lot more noise for a given amount of power.
> 

And they are MUCH higher pitched, which pegs the annoy-o-meter. I used
to have an SGI 1100 (1U dual PIII) and an SGI Origin 200 in my home
office. They were both probably the same overall loudness, but it was
the 1100 that I would shut off when I wasn't using it.

-- 

	-Dean

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 12 17:48:26 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 12 Dec 2003 14:48:26 -0800
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu
 >
Message-ID: <5.2.0.9.2.20031212144145.03173b80@mailhost4.jpl.nasa.gov>

First question... does it "really" need to be in the same room?

There is a huge variation in fan noise among models and makes of fan, and, 
furthermore, the structural stuff around it has an effect.  Perhaps just 
buying quieter fans and retrofitting?

Can you put the whole thing in a BIG sound isolated box (read, rack)... 
most equipment racks aren't designed for good acoustical properties.  There 
are, however, industries which are noise level sensitive (sound recording 
and mixing), and they have standard 19" racks, but with better 
design/packaging/etc.

If you're not hugely cost constrained, you can do away with fans entirely 
and sink the whole thing into a tank of fluorinert (but, at $70+/gallon....)

The other thing to think about is whether many smaller/lower power nodes 
can do your job.   If things scaled exactly as processor speed (don't we 
wish).. you've got 12 * 2.4 GHz = 28.8 GHz... Could 40 or 50 1GHz VIA type 
fanless processors work?

Overall, your best bet might be to get some custom sheet metal made to 
mount your motherboards in a more congenial (acoustic and thermal) 
environment.  Rather than have 2 layers of metal between each mobo, make a 
custom enclosure that stacks the boards a few inches apart, and which 
shares a couple big, but quiet, fans to push air through it.

In general, for a given amount of air moved, small fans are much less 
efficient and more noisy than big fans. (efficiency and noise are not very 
well correlated... the mechanical power in the noise is vanishingly small).


At 12:08 PM 12/12/2003 -0500, Joshua Baker-LePain wrote:
>Yes, I know this has been discussed a couple of times, and that my stated
>goals are at odds with each other.  But I really need the best bang for
>the noise for a system that will reside in the same room with patients
>undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual
>2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud,
>and annoys both the patients and the physicians.  This is Bad.
>
>We're willing to pay for better, but don't want to take too much of a
>speed hit.  Does anybody have a good vendor for quiet but still high
>performing systems?  Is there any hope in the 1U form factor (my Opteron
>nodes are somewhat quieter, since they use squirrel cage fans, but are
>still too loud), or should I look at, e.g., putting Quad Opterons in a 3
>or 4U case?  Or should I look at lashing together some towers (this system
>also needs to be somewhat portable)?
>
>Thanks for any hints, pointers, recommendations, or flames.
>
>--
>Joshua Baker-LePain
>Department of Biomedical Engineering
>Duke University
>
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit 
>http://www.beowulf.org/mailman/listinfo/beowulf

James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From john.hearns at clustervision.com  Sat Dec 13 05:57:13 2003
From: john.hearns at clustervision.com (John Hearns)
Date: Sat, 13 Dec 2003 11:57:13 +0100 (CET)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <Pine.LNX.4.44.0312131154320.3660-100000@druifje.clustervision.com>

On Fri, 12 Dec 2003, Jim Lux wrote:

> call that easy.  What Robert is probably looking for is a "stick the CDROM
> in and go" kind of solution, and, just like in the Windows world, the
> current, readily available (as in download the ISO and go) solutions tend to
> assume one has a vintage 2001 computer sitting around with a several hundred
> MHz processor and 64MB of RAM, etc.
> 
> 
> One could always just boot 4 copies of Knoppix, but I don't know that
> there's many "cluster management" tools in Knoppix.

How about ClusterKnoppix then?
http://bofh.be/clusterknoppix/

Its a Knoppix version which runs OpenMosix.
Teh slaves boot via PXE - which might rule out the old P1s.
You probably could boot via floppy though.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgoornaden at scyld.com  Sat Dec 13 04:45:20 2003
From: rgoornaden at scyld.com (rgoornaden at scyld.com)
Date: Sat, 13 Dec 2003 04:45:20 -0500
Subject: [Beowulf] java virtual machine
Message-ID: <200312130945.hBD9jKS29056@NewBlue.scyld.com>


hello
has someone ever met this package while installing mpich2-0.94???
thanks


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From gerry.creager at tamu.edu  Sat Dec 13 07:52:26 2003
From: gerry.creager at tamu.edu (Gerry Creager N5JXS)
Date: Sat, 13 Dec 2003 06:52:26 -0600
Subject: [Beowulf] BW-BUG meeting, Today Dec. 9, 2003, in Greenbelt MD;
  -- Red Hat
In-Reply-To: <Pine.LNX.4.44.0312091319540.1723-100000@training.scyld.com>
References: <Pine.LNX.4.44.0312091319540.1723-100000@training.scyld.com>
Message-ID: <3FDB0B8A.8040903@tamu.edu>

Would it be possible for someone to give a synopsis (assuming that, due 
to travel and a catch-up effort on my part, I didn't miss it already) of 
this meeting?

Thanks, Gerry

Donald Becker wrote:
> [[ Please note that this month's meeting is East: Greenbelt, not McLean VA. ]]
> 
>      Baltimore Washington  Beowulf Users Group 
>             December 2003 Meeting 
>                www.bwbug.org
>     December 9th at 3:00PM in Greenbelt MD
>  
> ____
> 
>         RedHat Roadmap for HPC Beowulf Clusters.
> 
>         RedHat is pleased to have the opportunity to present to Baltimore-
> Washington Beowulf User Group on Tuesday Dec 9th. Robert Hibbard, Red Hat's
> Federal Partner Alliance Manager, will provide information on Red Hat's
> Enterprise Linux product strategy, with particular emphasis on it's
> relevance to High Performance Computing Clusters. 
> 
>         Discussion will include information on the background, current
> product optimizations, as well as possible futures for Red Hat efforts
> focused on HPCC. 
> ____
> 
> Our meeting facilities are once again provided by Northrup Grumman
> 	7501 Greenway Center Drive
> 	Suite 1000 (10th floor)
> 	Greenbelt, MD 20770, phone
> 	703-628-7451
> 
> 

-- 
Gerry Creager -- gerry.creager at tamu.edu
Network Engineering -- AATLT, Texas A&M University	
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578
Page: 979.228.0173
Office: 903A Eller Bldg, TAMU, College Station, TX 77843

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From laytonjb at comcast.net  Sat Dec 13 12:51:37 2003
From: laytonjb at comcast.net (Jeffrey B. Layton)
Date: Sat, 13 Dec 2003 12:51:37 -0500
Subject: [Beowulf] Anyone recently build a small cluster?
Message-ID: <3FDB51A9.1030406@comcast.net>

Good morning,

   I'm looking for someone or a group that has recently
built a small (16 nodes or less) cluster that was their
first cluster. I'm working on a part of one of my columns
for Cluster World and I want to feature a small cluster
that someone built for the first time.

Thanks!

Jeff


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From daniel.pfenniger at obs.unige.ch  Sat Dec 13 13:28:45 2003
From: daniel.pfenniger at obs.unige.ch (Daniel Pfenniger)
Date: Sat, 13 Dec 2003 19:28:45 +0100
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
Message-ID: <3FDB5A5D.8090206@obs.unige.ch>

Hi,

Its not a 1U, its a low-noise Linux P4 box, but really *low* noise: the 
transtec 1200
We bought these for offices precisely because these boxes are designed 
for low noise.

http://www.transtec.ch/CH/E/products/workstations/linuxworkstations/transtec1200lownoiseworkstation.html?fsid=342edfd38a845c179dd18ef965091b2d

In practice in the office the box can barely be noticed,  I can imagine 
a dozen or more
of these boxes would not disturb a normal conversation.

    Dan


Joshua Baker-LePain wrote:

>Yes, I know this has been discussed a couple of times, and that my stated 
>goals are at odds with each other.  But I really need the best bang for 
>the noise for a system that will reside in the same room with patients 
>undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
>2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
>and annoys both the patients and the physicians.  This is Bad.
>
>We're willing to pay for better, but don't want to take too much of a 
>speed hit.  Does anybody have a good vendor for quiet but still high 
>performing systems?  Is there any hope in the 1U form factor (my Opteron 
>nodes are somewhat quieter, since they use squirrel cage fans, but are 
>still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
>or 4U case?  Or should I look at lashing together some towers (this system 
>also needs to be somewhat portable)?
>
>Thanks for any hints, pointers, recommendations, or flames.
>
>  
>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lathama at yahoo.com  Sat Dec 13 13:55:12 2003
From: lathama at yahoo.com (Andrew Latham)
Date: Sat, 13 Dec 2003 10:55:12 -0800 (PST)
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <3FDB5A5D.8090206@obs.unige.ch>
Message-ID: <20031213185512.48570.qmail@web60304.mail.yahoo.com>

koolance.com has rackmount cases that use a water cooling system that is both
cool and quite. It also is a standard rackmount case that would free up some
design issues..


--- Daniel Pfenniger <daniel.pfenniger at obs.unige.ch> wrote:
> Hi,
> 
> Its not a 1U, its a low-noise Linux P4 box, but really *low* noise: the 
> transtec 1200
> We bought these for offices precisely because these boxes are designed 
> for low noise.
> 
>
http://www.transtec.ch/CH/E/products/workstations/linuxworkstations/transtec1200lownoiseworkstation.html?fsid=342edfd38a845c179dd18ef965091b2d
> 
> In practice in the office the box can barely be noticed,  I can imagine 
> a dozen or more
> of these boxes would not disturb a normal conversation.
> 
>     Dan
> 
> 
> Joshua Baker-LePain wrote:
> 
> >Yes, I know this has been discussed a couple of times, and that my stated 
> >goals are at odds with each other.  But I really need the best bang for 
> >the noise for a system that will reside in the same room with patients 
> >undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
> >2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
> >and annoys both the patients and the physicians.  This is Bad.
> >
> >We're willing to pay for better, but don't want to take too much of a 
> >speed hit.  Does anybody have a good vendor for quiet but still high 
> >performing systems?  Is there any hope in the 1U form factor (my Opteron 
> >nodes are somewhat quieter, since they use squirrel cage fans, but are 
> >still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
> >or 4U case?  Or should I look at lashing together some towers (this system 
> >also needs to be somewhat portable)?
> >
> >Thanks for any hints, pointers, recommendations, or flames.
> >
> >  
> >
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

=====
/---------------------------------------------------------------------------------------------------\
Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.

What Is an agnostic? -  An agnostic thinks it impossible to know the truth
in matters such as, a god or the future with which religions are concerned 
with. Or, if not impossible, at least impossible at the present time.
 
LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
\---------------------------------------------------------------------------------------------------/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From joelja at darkwing.uoregon.edu  Sat Dec 13 17:26:37 2003
From: joelja at darkwing.uoregon.edu (Joel Jaeggli)
Date: Sat, 13 Dec 2003 14:26:37 -0800 (PST)
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <20031213185512.48570.qmail@web60304.mail.yahoo.com>
Message-ID: <Pine.LNX.4.44.0312131422190.1124-100000@twin.uoregon.edu>

On Sat, 13 Dec 2003, Andrew Latham wrote:

> koolance.com has rackmount cases that use a water cooling system that is both
> cool and quite. It also is a standard rackmount case that would free up some
> design issues..

it's also 4u...

in 4u I have 8 opteron 242 cpu's in 4 cases with three panaflo crossflow 
blowers ea which are quite bearable compared to screaming loud 8000rpm 
40mm fans.

> 
> 
> --- Daniel Pfenniger <daniel.pfenniger at obs.unige.ch> wrote:
> > Hi,
> > 
> > Its not a 1U, its a low-noise Linux P4 box, but really *low* noise: the 
> > transtec 1200
> > We bought these for offices precisely because these boxes are designed 
> > for low noise.
> > 
> >
> http://www.transtec.ch/CH/E/products/workstations/linuxworkstations/transtec1200lownoiseworkstation.html?fsid=342edfd38a845c179dd18ef965091b2d
> > 
> > In practice in the office the box can barely be noticed,  I can imagine 
> > a dozen or more
> > of these boxes would not disturb a normal conversation.
> > 
> >     Dan
> > 
> > 
> > Joshua Baker-LePain wrote:
> > 
> > >Yes, I know this has been discussed a couple of times, and that my stated 
> > >goals are at odds with each other.  But I really need the best bang for 
> > >the noise for a system that will reside in the same room with patients 
> > >undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
> > >2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
> > >and annoys both the patients and the physicians.  This is Bad.
> > >
> > >We're willing to pay for better, but don't want to take too much of a 
> > >speed hit.  Does anybody have a good vendor for quiet but still high 
> > >performing systems?  Is there any hope in the 1U form factor (my Opteron 
> > >nodes are somewhat quieter, since they use squirrel cage fans, but are 
> > >still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
> > >or 4U case?  Or should I look at lashing together some towers (this system 
> > >also needs to be somewhat portable)?
> > >
> > >Thanks for any hints, pointers, recommendations, or flames.
> > >
> > >  
> > >
> > 
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 
> =====
> /---------------------------------------------------------------------------------------------------\
> Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.
> 
> What Is an agnostic? -  An agnostic thinks it impossible to know the truth
> in matters such as, a god or the future with which religions are concerned 
> with. Or, if not impossible, at least impossible at the present time.
>  
> LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
> \---------------------------------------------------------------------------------------------------/
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

-- 
-------------------------------------------------------------------------- 
Joel Jaeggli  	       Unix Consulting 	       joelja at darkwing.uoregon.edu    
GPG Key Fingerprint:     5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From andrewxwang at yahoo.com.tw  Mon Dec 15 00:48:45 2003
From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=)
Date: Mon, 15 Dec 2003 13:48:45 +0800 (CST)
Subject: [Beowulf] Fwd: GridEngine for AMD64 available on ftp.suse.com
In-Reply-To: <200308201609.UAA08558@nocserv.free.net>
Message-ID: <20031215054845.37575.qmail@web16802.mail.tpe.yahoo.com>

I downloaded the rpm -- I didn't install it, but I
just extracted the files, and did a "file" command.
The binaries are compiled as 64-bit.

Andrew.

> SuSE ship SGE on their CDs, including their AMD64
> and
> Athlon64 distribution:
> 
>
http://www.suse.de/us/private/products/suse_linux/i386/packages_amd64/gridengine.html
> 
> And it is also available on
>
ftp.suse.com:/pub/suse/x86_64/9.0/suse/x86_64/gridengine-5.3-257.x86_64.rpm
> 
> But on sure if it works on RedHat or not.
> 
>  -Ron
> 
> 

-----------------------------------------------------------------
??? Yahoo!??
??????????????????????
http://tw.promo.yahoo.com/mail_premium/stationery.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ranjit.chagar at ntlworld.com  Mon Dec 15 08:54:31 2003
From: ranjit.chagar at ntlworld.com (Ranjit Chagar)
Date: Mon, 15 Dec 2003 13:54:31 -0000
Subject: [Beowulf] Simple Cluster
References: <3FD8EC50.7060606@sympatico.ca> <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <005001c3c312$f7418600$0301a8c0@chagar>

Hi robert/jim,

Well I built a cluster just for the hell of it. And as you said, before the
flames start, it was built just to see what I could do, built from cheap
PCs, just for the fun of it. They are 133Mhz PII and I built mine following
the instructions from pondermatic. Okay, so in this day and age that is old
hat, and so is my system but I enjoyed building it and enjoy playing around
with it. And then, being stupid myself, I wrote out instructions so that I
could did it again cause I will be the first to admit my memory isn't that
good.

Full details at http://homepage.ntlworld.com/ranjit.chagar/

Robert - if you have any questions let me know.

Jim - I dont mean for this email to sound bad but my english sometimes is
taken wrong. I mean to say that you can do it if you want.

Best Regards, Ranjit

----- Original Message -----
From: "Jim Lux" <james.p.lux at jpl.nasa.gov>
To: <rmd003 at sympatico.ca>
Cc: <beowulf at beowulf.org>
Sent: Friday, December 12, 2003 3:32 PM
Subject: Re: [Beowulf] Simple Cluster


> Sure you can do it. It won't be a ball of fire speed wise, and probably
> wouldn't be a cost effective solution to doing any "real work", but it
will
> compute..
>
> Search the web for the "Pondermatic" which, as I recall, was a couple or
> three P1s.  And of course, very early clusters were made with 486's.
>
> Your big challenge is probably going to be (easily) getting an appropriate
> distribution that fits within the disk and RAM limits.  Yes, before all
the
> flames start, I know it's possible to make a version that fits in 16K on
an
> 8088, and that would be bloatware compared to someone's special 6502 Linux
> implementation that runs on old Apple IIs, etc.etc.etc., but nobody would
> call that easy.  What Robert is probably looking for is a "stick the CDROM
> in and go" kind of solution, and, just like in the Windows world, the
> current, readily available (as in download the ISO and go) solutions tend
to
> assume one has a vintage 2001 computer sitting around with a several
hundred
> MHz processor and 64MB of RAM, etc.
>
> Actually, I'd be very glad to hear that this is not the case..
>
> Maybe one of the old Scyld "cluster on a disk" might be a good way?
>
> Perhaps Rocks?  It sort of self installs.
>
> One could always just boot 4 copies of Knoppix, but I don't know that
> there's many "cluster management" tools in Knoppix.
>
> ----- Original Message -----
> From: <rmd003 at sympatico.ca>
> To: <beowulf at beowulf.org>
> Sent: Thursday, December 11, 2003 2:14 PM
> Subject: [Beowulf] Simple Cluster
>
>
> > Hello,
> >
> > Would anyone know if it is possible to make a cluster with four P1
> > computers? If it is possible are there any instructions on how to do
> > this or the software required etc...?
> >
> > Robert Van Amelsvoort
> > rmd003 at sympatico.ca
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Mon Dec 15 12:41:12 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Mon, 15 Dec 2003 12:41:12 -0500 (EST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <5.2.0.9.2.20031215091234.02fa5c88@mailhost4.jpl.nasa.gov>
Message-ID: <Pine.LNX.4.44.0312151240560.3201-100000@ganesh.phy.duke.edu>

On Mon, 15 Dec 2003, Jim Lux wrote:

> Outstanding, Ranjit...
> Great that you wrote up a page describing how you did it, too!! Especially, 
> describing the problems you encountered (i.e. slot dependence for network 
> cards..)
> 
> So now you can say you built your own supercomputer.  How cool is that.


And just in time for Jeff's column, too;-)

   rgb

> 
> Jim
> 
> 
> At 01:54 PM 12/15/2003 +0000, Ranjit Chagar wrote:
> >Hi robert/jim,
> >
> >Well I built a cluster just for the hell of it. And as you said, before the
> >flames start, it was built just to see what I could do, built from cheap
> >PCs, just for the fun of it. They are 133Mhz PII and I built mine following
> >the instructions from pondermatic. Okay, so in this day and age that is old
> >hat, and so is my system but I enjoyed building it and enjoy playing around
> >with it. And then, being stupid myself, I wrote out instructions so that I
> >could did it again cause I will be the first to admit my memory isn't that
> >good.
> >
> >Full details at http://homepage.ntlworld.com/ranjit.chagar/
> >
> >Robert - if you have any questions let me know.
> >
> >Jim - I dont mean for this email to sound bad but my english sometimes is
> >taken wrong. I mean to say that you can do it if you want.
> >
> >Best Regards, Ranjit
> 
> James Lux, P.E.
> Spacecraft Telecommunications Section
> Jet Propulsion Laboratory, Mail Stop 161-213
> 4800 Oak Grove Drive
> Pasadena CA 91109
> tel: (818)354-2075
> fax: (818)393-6875
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Mon Dec 15 12:15:31 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Mon, 15 Dec 2003 09:15:31 -0800
Subject: [Beowulf] Simple Cluster
In-Reply-To: <005001c3c312$f7418600$0301a8c0@chagar>
References: <3FD8EC50.7060606@sympatico.ca>
 <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <5.2.0.9.2.20031215091234.02fa5c88@mailhost4.jpl.nasa.gov>

Outstanding, Ranjit...
Great that you wrote up a page describing how you did it, too!! Especially, 
describing the problems you encountered (i.e. slot dependence for network 
cards..)

So now you can say you built your own supercomputer.  How cool is that.

Jim


At 01:54 PM 12/15/2003 +0000, Ranjit Chagar wrote:
>Hi robert/jim,
>
>Well I built a cluster just for the hell of it. And as you said, before the
>flames start, it was built just to see what I could do, built from cheap
>PCs, just for the fun of it. They are 133Mhz PII and I built mine following
>the instructions from pondermatic. Okay, so in this day and age that is old
>hat, and so is my system but I enjoyed building it and enjoy playing around
>with it. And then, being stupid myself, I wrote out instructions so that I
>could did it again cause I will be the first to admit my memory isn't that
>good.
>
>Full details at http://homepage.ntlworld.com/ranjit.chagar/
>
>Robert - if you have any questions let me know.
>
>Jim - I dont mean for this email to sound bad but my english sometimes is
>taken wrong. I mean to say that you can do it if you want.
>
>Best Regards, Ranjit

James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Mon Dec 15 13:56:04 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Mon, 15 Dec 2003 13:56:04 -0500 (EST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <20031215183945.41825.qmail@web60305.mail.yahoo.com>
Message-ID: <Pine.LNX.4.44.0312151347300.3201-100000@ganesh.phy.duke.edu>

On Mon, 15 Dec 2003, Andrew Latham wrote:

> I have a small 9 node p133 cluster. It works.
> 
> What does the list think about the idea of developing software on the
> smaller(mem) and older systems. I have one so I am bias but I do see that
> developing software that can handle 64meg of ram on a P586 system would lend to
> tighter and more efficant code. I am not trying to sell the P133 systems, only
> thinking about good code for them would be really nice(fast) on a Xeon or
> better. I already know this could spark a discussion on busses and chipsets and
> processors. Just thinking

More likely a discussion on balance.  I actually think that developing
on small clusters is good, but I'm not so sure about small REALLY old
systems.  The problem is that things like memory access speed and
pipelining change so much across processor generations that not only are
the bottlenecks different, the bottlenecking processes have different
thresholds and are in different ratios to the other system performance
determiners.  Just as performance on such a cluster would not be
terribly good as a predictor of performance on modern cluster from a
hardware point of view, it isn't certain that it would be all that great
from a software point of view.

My favorite case study to illustrate the point is what I continue to
think of as a brilliant piece of code -- ATLAS.  Would an ATLAS-tuned
BLAS built on and for a 586 still perform optimally on a P4 or Opteron?
I think not.  Not even close.  Even if ATLAS-level tuning may be beyond
most programmers, there are issues with stride, cache size and type, and
for parallel programmers the relative speeds of CPU, memory, and network
that can strongly affect program design and performance and scaling.

So I too have a small cluster at home and develop there, and for a lot
of code it doesn't matter as long as one doesn't test SCALING there.
But I'm not sure the code itself is any better "because" it was
developed there.

Although given that my beer-filled refrigerator is just downstairs, it
may be...;-)

   rgb

> 
> 
> --- Jim Lux <James.P.Lux at jpl.nasa.gov> wrote:
> > Outstanding, Ranjit...
> > Great that you wrote up a page describing how you did it, too!! Especially, 
> > describing the problems you encountered (i.e. slot dependence for network 
> > cards..)
> > 
> > So now you can say you built your own supercomputer.  How cool is that.
> > 
> > Jim
> > 
> > 
> > At 01:54 PM 12/15/2003 +0000, Ranjit Chagar wrote:
> > >Hi robert/jim,
> > >
> > >Well I built a cluster just for the hell of it. And as you said, before the
> > >flames start, it was built just to see what I could do, built from cheap
> > >PCs, just for the fun of it. They are 133Mhz PII and I built mine following
> > >the instructions from pondermatic. Okay, so in this day and age that is old
> > >hat, and so is my system but I enjoyed building it and enjoy playing around
> > >with it. And then, being stupid myself, I wrote out instructions so that I
> > >could did it again cause I will be the first to admit my memory isn't that
> > >good.
> > >
> > >Full details at http://homepage.ntlworld.com/ranjit.chagar/
> > >
> > >Robert - if you have any questions let me know.
> > >
> > >Jim - I dont mean for this email to sound bad but my english sometimes is
> > >taken wrong. I mean to say that you can do it if you want.
> > >
> > >Best Regards, Ranjit
> > 
> > James Lux, P.E.
> > Spacecraft Telecommunications Section
> > Jet Propulsion Laboratory, Mail Stop 161-213
> > 4800 Oak Grove Drive
> > Pasadena CA 91109
> > tel: (818)354-2075
> > fax: (818)393-6875
> > 
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 
> =====
> /---------------------------------------------------------------------------------------------------\
> Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.
> 
> What Is an agnostic? -  An agnostic thinks it impossible to know the truth
> in matters such as, a god or the future with which religions are concerned 
> with. Or, if not impossible, at least impossible at the present time.
>  
> LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
> \---------------------------------------------------------------------------------------------------/
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lathama at yahoo.com  Mon Dec 15 13:39:45 2003
From: lathama at yahoo.com (Andrew Latham)
Date: Mon, 15 Dec 2003 10:39:45 -0800 (PST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <5.2.0.9.2.20031215091234.02fa5c88@mailhost4.jpl.nasa.gov>
Message-ID: <20031215183945.41825.qmail@web60305.mail.yahoo.com>

I have a small 9 node p133 cluster. It works.

What does the list think about the idea of developing software on the
smaller(mem) and older systems. I have one so I am bias but I do see that
developing software that can handle 64meg of ram on a P586 system would lend to
tighter and more efficant code. I am not trying to sell the P133 systems, only
thinking about good code for them would be really nice(fast) on a Xeon or
better. I already know this could spark a discussion on busses and chipsets and
processors. Just thinking


--- Jim Lux <James.P.Lux at jpl.nasa.gov> wrote:
> Outstanding, Ranjit...
> Great that you wrote up a page describing how you did it, too!! Especially, 
> describing the problems you encountered (i.e. slot dependence for network 
> cards..)
> 
> So now you can say you built your own supercomputer.  How cool is that.
> 
> Jim
> 
> 
> At 01:54 PM 12/15/2003 +0000, Ranjit Chagar wrote:
> >Hi robert/jim,
> >
> >Well I built a cluster just for the hell of it. And as you said, before the
> >flames start, it was built just to see what I could do, built from cheap
> >PCs, just for the fun of it. They are 133Mhz PII and I built mine following
> >the instructions from pondermatic. Okay, so in this day and age that is old
> >hat, and so is my system but I enjoyed building it and enjoy playing around
> >with it. And then, being stupid myself, I wrote out instructions so that I
> >could did it again cause I will be the first to admit my memory isn't that
> >good.
> >
> >Full details at http://homepage.ntlworld.com/ranjit.chagar/
> >
> >Robert - if you have any questions let me know.
> >
> >Jim - I dont mean for this email to sound bad but my english sometimes is
> >taken wrong. I mean to say that you can do it if you want.
> >
> >Best Regards, Ranjit
> 
> James Lux, P.E.
> Spacecraft Telecommunications Section
> Jet Propulsion Laboratory, Mail Stop 161-213
> 4800 Oak Grove Drive
> Pasadena CA 91109
> tel: (818)354-2075
> fax: (818)393-6875
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

=====
/---------------------------------------------------------------------------------------------------\
Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.

What Is an agnostic? -  An agnostic thinks it impossible to know the truth
in matters such as, a god or the future with which religions are concerned 
with. Or, if not impossible, at least impossible at the present time.
 
LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
\---------------------------------------------------------------------------------------------------/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From gerry.creager at tamu.edu  Mon Dec 15 11:01:23 2003
From: gerry.creager at tamu.edu (Gerry Creager N5JXS)
Date: Mon, 15 Dec 2003 10:01:23 -0600
Subject: [Beowulf] Simple Cluster
In-Reply-To: <005001c3c312$f7418600$0301a8c0@chagar>
References: <3FD8EC50.7060606@sympatico.ca> <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422> <005001c3c312$f7418600$0301a8c0@chagar>
Message-ID: <3FDDDAD3.3020005@tamu.edu>

The flames come sometimes... And in today's world, where a high end box 
can outperform a small, low-power cluster, it's often hard to separate 
the flames from significant help/tips.

My first cluster was 7 66 MHz 486's, and it was done as a proof of 
concept project.  I demonstrated that I could improve performance with 
the cluster over a single machine doing serialized processing of 
geodetic data.  Note that it was still faster to run the code serially 
on a dual-processor Pentium 266 with more memory than any of the nodes 
in the cluster...  But it proved the point and was a valid academic 
exercise.

Now you're ready to try code on a little cluster, and gain some 
programming skills.  After that, you're ready to build something bigger 
and more capable.

Good luck!
Gerry

Ranjit Chagar wrote:
> Hi robert/jim,
> 
> Well I built a cluster just for the hell of it. And as you said, before the
> flames start, it was built just to see what I could do, built from cheap
> PCs, just for the fun of it. They are 133Mhz PII and I built mine following
> the instructions from pondermatic. Okay, so in this day and age that is old
> hat, and so is my system but I enjoyed building it and enjoy playing around
> with it. And then, being stupid myself, I wrote out instructions so that I
> could did it again cause I will be the first to admit my memory isn't that
> good.
> 
> Full details at http://homepage.ntlworld.com/ranjit.chagar/
> 
> Robert - if you have any questions let me know.
> 
> Jim - I dont mean for this email to sound bad but my english sometimes is
> taken wrong. I mean to say that you can do it if you want.
> 
> Best Regards, Ranjit
> 
> ----- Original Message -----
> From: "Jim Lux" <james.p.lux at jpl.nasa.gov>
> To: <rmd003 at sympatico.ca>
> Cc: <beowulf at beowulf.org>
> Sent: Friday, December 12, 2003 3:32 PM
> Subject: Re: [Beowulf] Simple Cluster
> 
> 
> 
>>Sure you can do it. It won't be a ball of fire speed wise, and probably
>>wouldn't be a cost effective solution to doing any "real work", but it
> 
> will
> 
>>compute..
>>
>>Search the web for the "Pondermatic" which, as I recall, was a couple or
>>three P1s.  And of course, very early clusters were made with 486's.
>>
>>Your big challenge is probably going to be (easily) getting an appropriate
>>distribution that fits within the disk and RAM limits.  Yes, before all
> 
> the
> 
>>flames start, I know it's possible to make a version that fits in 16K on
> 
> an
> 
>>8088, and that would be bloatware compared to someone's special 6502 Linux
>>implementation that runs on old Apple IIs, etc.etc.etc., but nobody would
>>call that easy.  What Robert is probably looking for is a "stick the CDROM
>>in and go" kind of solution, and, just like in the Windows world, the
>>current, readily available (as in download the ISO and go) solutions tend
> 
> to
> 
>>assume one has a vintage 2001 computer sitting around with a several
> 
> hundred
> 
>>MHz processor and 64MB of RAM, etc.
>>
>>Actually, I'd be very glad to hear that this is not the case..
>>
>>Maybe one of the old Scyld "cluster on a disk" might be a good way?
>>
>>Perhaps Rocks?  It sort of self installs.
>>
>>One could always just boot 4 copies of Knoppix, but I don't know that
>>there's many "cluster management" tools in Knoppix.
>>
>>----- Original Message -----
>>From: <rmd003 at sympatico.ca>
>>To: <beowulf at beowulf.org>
>>Sent: Thursday, December 11, 2003 2:14 PM
>>Subject: [Beowulf] Simple Cluster
>>
>>
>>
>>>Hello,
>>>
>>>Would anyone know if it is possible to make a cluster with four P1
>>>computers? If it is possible are there any instructions on how to do
>>>this or the software required etc...?
>>>
>>>Robert Van Amelsvoort
>>>rmd003 at sympatico.ca
>>>
>>>_______________________________________________
>>>Beowulf mailing list, Beowulf at beowulf.org
>>>To change your subscription (digest mode or unsubscribe) visit
>>
>>http://www.beowulf.org/mailman/listinfo/beowulf
>>
>>_______________________________________________
>>Beowulf mailing list, Beowulf at beowulf.org
>>To change your subscription (digest mode or unsubscribe) visit
> 
> http://www.beowulf.org/mailman/listinfo/beowulf
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Gerry Creager -- gerry.creager at tamu.edu
Network Engineering -- AATLT, Texas A&M University	
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578
Page: 979.228.0173
Office: 903A Eller Bldg, TAMU, College Station, TX 77843

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From henken at seas.upenn.edu  Mon Dec 15 16:27:10 2003
From: henken at seas.upenn.edu (Nicholas Henke)
Date: Mon, 15 Dec 2003 16:27:10 -0500
Subject: [Beowulf] clubmask 0.6b2 released
Message-ID: <1071523630.6527.2.camel@roughneck.liniac.upenn.edu>

Changes since 0.6b1:
-----------------------------------------
Add support for runtime (clubmask.conf) choice of resource manager subsystem.
The available options now are ganglia and supermon. Support for ganglia3 will be
added once it is released. Ganglia is now the preferred choice, as it is _much_
more stable.

add --with-supermon to setup.py to turn on compiling of supermon python module.
It is now off by default, as ganglia is the preferred and default RM subsystem.
------------------------------------------------------------------------------

Name        : Clubmask
Version     : 0.6                             
Release     : b2
Group       : Cluster Resource Management and Scheduling
Vendor      : Liniac Project, University of Pennsylvania
License     : GPL-2
URL         : http://clubmask.sourceforge.net

What is Clubmask
------------------------------------------------------------------------------
Clubmask is a resource manager designed to allow Bproc based clusters
enjoy the full scheduling power and configuration of the Maui HPC
Scheduler.

Clubmask uses a modified version of the Supermon resource monitoring
software to gather resource information from the cluster nodes. This
information is combined with job submission data and delivered to the
Maui scheduler. Maui issues job control commands back to Clubmask,
which then starts or stops the job scripts using the Bproc environment.

Clubmask also provides builtin support for a supermon2ganglia translator
that allows a standard Ganlgia  web backend to contact supermon and get
XML data that will disply through the Ganglia web interface.

Clubmask is currently running on around 10 clusters, varying in size
from 8 to 128 nodes, and has been tested up to 5000 jobs.


Notes/warnings on this release:
------------------------------------------------------------------------------
Before upgrading, please make sure to save your /etc/clubmask/clubmask.conf
file, as it may get overwritten.

To use the resource requests, you must be running the latest snapshot of maui.


Links
-------------
Bproc: http://bproc.sourceforge.net
Ganglia: http://ganglia.sourceforge.net
Maui Scheduler: http://www.supercluster.org/maui
Supermon: http://supermon.sourceforge.net

Nic
-- 
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From camm at enhanced.com  Mon Dec 15 17:00:14 2003
From: camm at enhanced.com (Camm Maguire)
Date: 15 Dec 2003 17:00:14 -0500
Subject: [Beowulf] Simple Cluster
In-Reply-To: <Pine.LNX.4.44.0312151347300.3201-100000@ganesh.phy.duke.edu>
References: <Pine.LNX.4.44.0312151347300.3201-100000@ganesh.phy.duke.edu>
Message-ID: <547k0xu4gh.fsf@intech19.enhanced.com>

Greetings!  You may be interested in Debian's atlas setup.  We have
several binary packages which depend on a virtual blas2 and lapack2
package, which can be provided by either the reference libraries or a
variety of atlas provided versions with various ISA instructions
supported.  For example, on i386, we have sse, sse2, and 3dnow builds
in addition to the 'vanilla' x86 build.  As you know, the isa
instructions are only one of many factors affecting atlas tuning.
They are the key one, however, in a) determining whether the lib will
run at all on a given system, and b) that delivers the lion's share of
the performance.  The philosophy here is to provide binaries which
give factors of 2 or more of performance gain to be had, while making
it easy for users to get the remaining 10-20% by customizing the
package for their site.  'apt-get -q source atlas; cd atlas-3.2.1ln;
fakeroot debian/rules custom' gives one a tuned .deb for the running
box. 

We need to get newer versions of the lib uploaded, but otherwise it
works great.  'Almost' customized performance automatically available
to R, octave,.... without recompilation.

Take care,

"Robert G. Brown" <rgb at phy.duke.edu> writes:

> On Mon, 15 Dec 2003, Andrew Latham wrote:
> 
> > I have a small 9 node p133 cluster. It works.
> > 
> > What does the list think about the idea of developing software on the
> > smaller(mem) and older systems. I have one so I am bias but I do see that
> > developing software that can handle 64meg of ram on a P586 system would lend to
> > tighter and more efficant code. I am not trying to sell the P133 systems, only
> > thinking about good code for them would be really nice(fast) on a Xeon or
> > better. I already know this could spark a discussion on busses and chipsets and
> > processors. Just thinking
> 
> More likely a discussion on balance.  I actually think that developing
> on small clusters is good, but I'm not so sure about small REALLY old
> systems.  The problem is that things like memory access speed and
> pipelining change so much across processor generations that not only are
> the bottlenecks different, the bottlenecking processes have different
> thresholds and are in different ratios to the other system performance
> determiners.  Just as performance on such a cluster would not be
> terribly good as a predictor of performance on modern cluster from a
> hardware point of view, it isn't certain that it would be all that great
> from a software point of view.
> 
> My favorite case study to illustrate the point is what I continue to
> think of as a brilliant piece of code -- ATLAS.  Would an ATLAS-tuned
> BLAS built on and for a 586 still perform optimally on a P4 or Opteron?
> I think not.  Not even close.  Even if ATLAS-level tuning may be beyond
> most programmers, there are issues with stride, cache size and type, and
> for parallel programmers the relative speeds of CPU, memory, and network
> that can strongly affect program design and performance and scaling.
> 
> So I too have a small cluster at home and develop there, and for a lot
> of code it doesn't matter as long as one doesn't test SCALING there.
> But I'm not sure the code itself is any better "because" it was
> developed there.
> 
> Although given that my beer-filled refrigerator is just downstairs, it
> may be...;-)
> 
>    rgb
> 
> > 
> > 
> > --- Jim Lux <James.P.Lux at jpl.nasa.gov> wrote:
> > > Outstanding, Ranjit...
> > > Great that you wrote up a page describing how you did it, too!! Especially, 
> > > describing the problems you encountered (i.e. slot dependence for network 
> > > cards..)
> > > 
> > > So now you can say you built your own supercomputer.  How cool is that.
> > > 
> > > Jim
> > > 
> > > 
> > > At 01:54 PM 12/15/2003 +0000, Ranjit Chagar wrote:
> > > >Hi robert/jim,
> > > >
> > > >Well I built a cluster just for the hell of it. And as you said, before the
> > > >flames start, it was built just to see what I could do, built from cheap
> > > >PCs, just for the fun of it. They are 133Mhz PII and I built mine following
> > > >the instructions from pondermatic. Okay, so in this day and age that is old
> > > >hat, and so is my system but I enjoyed building it and enjoy playing around
> > > >with it. And then, being stupid myself, I wrote out instructions so that I
> > > >could did it again cause I will be the first to admit my memory isn't that
> > > >good.
> > > >
> > > >Full details at http://homepage.ntlworld.com/ranjit.chagar/
> > > >
> > > >Robert - if you have any questions let me know.
> > > >
> > > >Jim - I dont mean for this email to sound bad but my english sometimes is
> > > >taken wrong. I mean to say that you can do it if you want.
> > > >
> > > >Best Regards, Ranjit
> > > 
> > > James Lux, P.E.
> > > Spacecraft Telecommunications Section
> > > Jet Propulsion Laboratory, Mail Stop 161-213
> > > 4800 Oak Grove Drive
> > > Pasadena CA 91109
> > > tel: (818)354-2075
> > > fax: (818)393-6875
> > > 
> > > _______________________________________________
> > > Beowulf mailing list, Beowulf at beowulf.org
> > > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> > 
> > =====
> > /---------------------------------------------------------------------------------------------------\
> > Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.
> > 
> > What Is an agnostic? -  An agnostic thinks it impossible to know the truth
> > in matters such as, a god or the future with which religions are concerned 
> > with. Or, if not impossible, at least impossible at the present time.
> >  
> > LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
> > \---------------------------------------------------------------------------------------------------/
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> > 
> 
> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
> 
> 
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 
> 
> 

-- 
Camm Maguire			     			camm at enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From jlb17 at duke.edu  Tue Dec 16 11:35:32 2003
From: jlb17 at duke.edu (Joshua Baker-LePain)
Date: Tue, 16 Dec 2003 11:35:32 -0500 (EST)
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
Message-ID: <Pine.LNX.4.44.0312161131580.2243-100000@chaos.egr.duke.edu>

I just wanted to thank everybody who's gotten back to me, both on and off 
list -- lots of good suggestions.  Now, off to see what I can implement...

-- 
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From avkon at imm.uran.ru  Wed Dec 17 09:08:15 2003
From: avkon at imm.uran.ru (Alexandr Konovalov)
Date: Wed, 17 Dec 2003 19:08:15 +0500
Subject: [Beowulf] Right place to MPICH discussions
Message-ID: <3FE0634F.8060300@imm.uran.ru>

Hi,

Where is relevant place to discuss MPICH internal problems?
I send mail to mpi-maint at mcs.anl.gov but receive no reaction.

Basically we have problem with shmat in
mpid/ch_p4/p4/lib/p4_MD.c:MD_initmem in linux around 2.4.20
kernels. It seems to me that if we change System V IPC
horrors with plain and simple mmap in MD_initmem we have
broke nothing anyway. Is this reasonable?

While googling I found only the hint "to play with
P4_GLOBMEMSIZE" but in our case P4_GLOBMEMSIZE always too
small (so MPICH complaine) or too high (so shmat failed).

It's quite strange to me that we have very general 
configuration (2 CPU Xeons, Redhat 7.3 etc) and problems 
arise at wide class of MPI programs. The only specific there 
I think the -with-comm=shared flag in confogure.

-- 
Best regards,
Alexandr Konovalov


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Wed Dec 17 13:54:31 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 17 Dec 2003 13:54:31 -0500 (EST)
Subject: [Beowulf] PVM master/slave project template...
Message-ID: <Pine.LNX.4.44.0312171346530.8574-100000@lucifer.rgb.private.net>

Dear Listvolken,

I just finished building a PVM master/slave project template for public
and private re-use (mostly for re-use in the CW column that I WILL
finish in the next couple of days:-).  I am curious as to whether it
works for anybody other than myself.  If there is anybody out there who
always wanted an automagical PVM template that does n hello worlds in
parallel with d delay (ready to be gutted and replaced with your own
code) then it would be lovely if you would grab it and give it a try.

I'm testing the included documentation too (yes, it is at least modestly
autodocumenting) so I won't tell you much more besides:

 http://www.phy.duke.edu/~rgb/General/general.php

from whence you can grab it.

N.B. -- the included docs do NOT tell you how to get and install pvm or
how to configure a pvm cluster; it is presumed that you can do or have
done that by other means.  For many of you it is at most a:

 yum install pvm

per node, or perhaps a rpm -Uvh /path/to/pvm-whatever.i386.rpm if you
don't have a yummified public repository.  Plus perhaps installing
pvm-gui on a head node. Then it is just setting the environment (e.g.
PVM_ROOT, PVM_RSH...) up correctly and cranking either pvm or xpvm to
create a virtual cluster.  This is all well documented elsewhere.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hunting at ix.netcom.com  Wed Dec 17 22:21:05 2003
From: hunting at ix.netcom.com (Michael Huntingdon)
Date: Wed, 17 Dec 2003 19:21:05 -0800
Subject: [Beowulf] 1 hour benchmark account request
In-Reply-To: <20031218015322.GJ7381@cse.ucdavis.edu>
Message-ID: <3.0.3.32.20031217192105.00f11fc0@popd.ix.netcom.com>

Bill is toying with Itanium?


At 05:53 PM 12/17/2003 -0800, Bill Broadley wrote:
>
>Does anyone have a benchmark account available for an hour or so (afterhours
>is fine) that has the following available:
>* 32 nodes (p4 or athlon > 2 Ghz or opteron) 
>* Myrinet (any flavor) or
>* Infiniband gcc (any flavor) MPI (any flavor)
>
>I could return the favor with various opteron/itanium 2 benchmarking.
>
>-- 
>Bill Broadley
>Computational Science and Engineering
>UC Davis
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Wed Dec 17 20:53:22 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Wed, 17 Dec 2003 17:53:22 -0800
Subject: [Beowulf] 1 hour benchmark account request
Message-ID: <20031218015322.GJ7381@cse.ucdavis.edu>


Does anyone have a benchmark account available for an hour or so (afterhours
is fine) that has the following available:
* 32 nodes (p4 or athlon > 2 Ghz or opteron) 
* Myrinet (any flavor) or
* Infiniband gcc (any flavor) MPI (any flavor)

I could return the favor with various opteron/itanium 2 benchmarking.

-- 
Bill Broadley
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From andrewxwang at yahoo.com.tw  Thu Dec 18 10:41:34 2003
From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=)
Date: Thu, 18 Dec 2003 23:41:34 +0800 (CST)
Subject: [Beowulf] real Grid computing
Message-ID: <20031218154134.90104.qmail@web16810.mail.tpe.yahoo.com>

BONIC will be replacing SETI at home's client for the
next generation of SETI at home.

http://boinc.berkeley.edu

It's opensource, and looks like it is better than to
wait for SGE 6.0 to get the P2P client.

Andrew.


-----------------------------------------------------------------
??? Yahoo!??
??????????????????????
http://tw.promo.yahoo.com/mail_premium/stationery.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From leigh at twilightdreams.net  Thu Dec 18 16:19:01 2003
From: leigh at twilightdreams.net (Leigh)
Date: Thu, 18 Dec 2003 16:19:01 -0500
Subject: [Beowulf] Semi-philosophical Question
Message-ID: <4.2.0.58.20031218161530.00b68e00@mail.flexfeed.com>

I was talking over Beowulf clusters with a coworker (as I have been working 
on learning to build one for my company) and he came up with an interesting 
question that I was unsure of.

As most of the data is saved upon the "gateway" and the other machines 
simply access it to use the data, what happens when multiple machines are 
making use of the same data and they all try to save at once? Do they all 
work as one system and save it only once, or can multiple nodes 
theoretically be using one file and both try to save to it?

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From optimize at optimization.net  Wed Dec 17 14:46:34 2003
From: optimize at optimization.net (optimize)
Date: Wed, 17 Dec 2003 14:46:34 -0500
Subject: [Beowulf] PVM master/slave project template...
Message-ID: <200312171946.AYF63595@ms7.verisignmail.com>

i would volunteer to get a copy of your good PVM work. i will 
try to validate/test it if at all possible. i could use it 
over large_scale combinatorial optimization problems.

thanks & bol
ralph
optimal regards.
-------------- next part --------------
An embedded message was scrubbed...
From: "Robert G. Brown" <rgb at phy.duke.edu>
Subject: [Beowulf] PVM master/slave project template...
Date: Wed, 17 Dec 2003 13:54:31 -0500 (EST)
Size: 3997
URL: <http://www.clustermonkey.net/pipermail/beowulf/attachments/20031217/db04212c/attachment.mht>

From agrajag at dragaera.net  Thu Dec 18 17:35:36 2003
From: agrajag at dragaera.net (Jag)
Date: Thu, 18 Dec 2003 17:35:36 -0500
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <4.2.0.58.20031218161530.00b68e00@mail.flexfeed.com>
References: <4.2.0.58.20031218161530.00b68e00@mail.flexfeed.com>
Message-ID: <1071786936.4291.69.camel@pel>

On Thu, 2003-12-18 at 16:19, Leigh wrote:
> I was talking over Beowulf clusters with a coworker (as I have been working 
> on learning to build one for my company) and he came up with an interesting 
> question that I was unsure of.
> 
> As most of the data is saved upon the "gateway" and the other machines 

Not really in answer to your question, but some general info..
That is one configuration, but is not always the case.  As an example, I
currently have a cluster that has a node dedicated to sharing out a
terrabyte of space over NFS.  There's one node that's dedicated to doing
the scheduling (using SGE), and three other nodes allow user logins for
them to submit jobs from.  Jobs aren't executed on any of these nodes.

There's also something called pvfs that'll let you use the harddrives on
all your slave nodes and combine them into one shared filesystem that
they can all use.

> simply access it to use the data, what happens when multiple machines are 
> making use of the same data and they all try to save at once? Do they all 
> work as one system and save it only once, or can multiple nodes 
> theoretically be using one file and both try to save to it?

This is really an application specific question.  A lot of MPI jobs
shuffle all the data back to one of the processes and let that process
write out the output files, so you won't have a problem.  There are also
other programs that may have a seperate output file for every slave node
the job is run on.  What is your cluster going to be used for?  The best
way to answer the question is to determine what apps will be used and
see how they handle output.  If its an inhouse program, you may want to
make sure your programmers are aware they'll be writing to a shared
filesystem so that they don't accidently write the code in such a way
that the results get corrupted by having them all use the same output
file.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From leigh at twilightdreams.net  Thu Dec 18 17:45:09 2003
From: leigh at twilightdreams.net (Leigh)
Date: Thu, 18 Dec 2003 17:45:09 -0500
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <1071786936.4291.69.camel@pel>
References: <4.2.0.58.20031218161530.00b68e00@mail.flexfeed.com>
 <4.2.0.58.20031218161530.00b68e00@mail.flexfeed.com>
Message-ID: <4.2.0.58.20031218174322.00b6d310@mail.flexfeed.com>

At 05:35 PM 12/18/2003 -0500, Jag wrote:
>On Thu, 2003-12-18 at 16:19, Leigh wrote:
>
> >What is your cluster going to be used for?


It hasn't been decided what the cluster will be used for. The entire thing, 
thus far, is an experiment. Mostly to see if two people who so far, have no 
clue how to build one can get one built and running (so far so good, I 
think) and from there, we'll putz around and see what we can do with it. 
Maybe have fun with SETI at home, or perhaps sell space upon the "big" one 
once we get it going for scientists to be able to run data upon.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Fri Dec 19 06:38:48 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 19 Dec 2003 06:38:48 -0500 (EST)
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <4.2.0.58.20031218174322.00b6d310@mail.flexfeed.com>
Message-ID: <Pine.LNX.4.44.0312190636570.29889-100000@lucifer.rgb.private.net>

On Thu, 18 Dec 2003, Leigh wrote:

> At 05:35 PM 12/18/2003 -0500, Jag wrote:
> >On Thu, 2003-12-18 at 16:19, Leigh wrote:
> >
> > >What is your cluster going to be used for?
> 
> 
> It hasn't been decided what the cluster will be used for. The entire thing, 
> thus far, is an experiment. Mostly to see if two people who so far, have no 
> clue how to build one can get one built and running (so far so good, I 
> think) and from there, we'll putz around and see what we can do with it. 
> Maybe have fun with SETI at home, or perhaps sell space upon the "big" one 
> once we get it going for scientists to be able to run data upon.

Do you know how to program?  C?  Perl?

If so, I've got a few toys for you to play with...but they'll be boring
toys if you can't tinker.

   rgb

> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Fri Dec 19 09:52:32 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Fri, 19 Dec 2003 06:52:32 -0800
Subject: [Beowulf] Semi-philosophical Question
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BFE2@orsmsx402.jf.intel.com>

From: Leigh; Sent: Thursday, December 18, 2003 4:45 PM
> 
> It hasn't been decided what the cluster will be used for. The entire
> thing,
> thus far, is an experiment. Mostly to see if two people who so far,
have
> no
> clue how to build one can get one built and running (so far so good, I
> think) and from there, we'll putz around and see what we can do with
it.
> Maybe have fun with SETI at home, or perhaps sell space upon the "big"
one
> once we get it going for scientists to be able to run data upon.

For learning purposes, have a blast.  But, before you make a "big" one
that others will use, make sure you know *what* the cluster is being
used for and that you design the cluster to meet those requirements.

You will probably be much better off going to a cluster builder that
focuses on your users' applications and builds the right system based on
those requirements.

-- 
David N. Lombard

My comments do not represent the opinion of Intel
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From topa_007 at yahoo.com  Thu Dec 18 23:47:39 2003
From: topa_007 at yahoo.com (70uf33q Hu5541n)
Date: Thu, 18 Dec 2003 20:47:39 -0800 (PST)
Subject: [Beowulf] University project Help required
Message-ID: <20031219044739.55802.qmail@web12703.mail.yahoo.com>

hi all,

20 yr old, Engineering grad from India doing a project
in Distributed Computing which is to be submitted in
March for evaluation to The University.

The Project deals with Computing Primes on a LAN based
network which is under load. 

The project aims at Real time Load Balancement on a
Heterogenous Cluster such that the Load is Distributed
such that the clients get synchronised and when the
data is received back it arrives at approx the same
time.

I'm attaching an Abstract on the project.Please go
through it and any comments/advice/guidance will be
helpful.

My prof says that JAVA RMI can be implemented for this
project.
I'm a noob programmer with exp in C/C++/JAVA.

I badly need guidance on how to go ahead with this
project.

Any help will be appreciated.
Thanks in advance.

Cheers,
Toufeeq

=====
"Love is control,I'll die if I let go
I will only let you breathe
My air that you receive
Then we'll see if I let you love me."
-James Hetfield
All Within My Hands,St.Anger
Metallica

__________________________________
Do you Yahoo!?
New Yahoo! Photos - easier uploading and sharing.
http://photos.yahoo.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Abstract.zip
Type: application/x-zip-compressed
Size: 6869 bytes
Desc: Abstract.zip
URL: <http://www.clustermonkey.net/pipermail/beowulf/attachments/20031218/ef8d5587/attachment.bin>

From sal10 at utah.edu  Fri Dec 19 12:03:16 2003
From: sal10 at utah.edu (sal10 at utah.edu)
Date: Fri, 19 Dec 2003 10:03:16 -0700
Subject: [Beowulf] Wireless Channel Bonding
Message-ID: <1071853396.3fe32f54cb231@webmail.utah.edu>

I am working on a project to create a wireless network that uses several 
802.11 channels in an attempt to increase data throughput.  The network would 
link 2 computers and each computer would have 2 wireless cards.  Does anyone 
know if this can be done the same way as Ethernet channel bonding?   If anyone 
has any ideas, let me know. In addition, if anyone is aware of sources of 
information about wireless channel bonding, please let me know.
Thanks
Andy  

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From leigh at twilightdreams.net  Fri Dec 19 13:10:26 2003
From: leigh at twilightdreams.net (Leigh)
Date: Fri, 19 Dec 2003 13:10:26 -0500
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <Pine.LNX.4.44.0312190636570.29889-100000@lucifer.rgb.priva
 te.net>
References: <4.2.0.58.20031218174322.00b6d310@mail.flexfeed.com>
Message-ID: <4.2.0.58.20031219130954.00b43798@mail.flexfeed.com>

At 06:38 AM 12/19/2003 -0500, Robert G. Brown wrote:


>Do you know how to program?  C?  Perl?
>
>If so, I've got a few toys for you to play with...but they'll be boring
>toys if you can't tinker.
>
>    rgb

Unfortunately, I don't. I can read and understand code, but I can't code 
myself  yet.


Leigh
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From leigh at twilightdreams.net  Fri Dec 19 13:12:31 2003
From: leigh at twilightdreams.net (Leigh)
Date: Fri, 19 Dec 2003 13:12:31 -0500
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <187D3A7CAB42A54DB61F1D05F012572201D4BFE2@orsmsx402.jf.inte
 l.com>
Message-ID: <4.2.0.58.20031219131146.00b3c050@mail.flexfeed.com>

At 06:52 AM 12/19/2003 -0800, Lombard, David N wrote:

>For learning purposes, have a blast.  But, before you make a "big" one
>that others will use, make sure you know *what* the cluster is being
>used for and that you design the cluster to meet those requirements.
>
>You will probably be much better off going to a cluster builder that
>focuses on your users' applications and builds the right system based on
>those requirements.
>
>--
>David N. Lombard
>
>My comments do not represent the opinion of Intel


Currently, the plan is just to tinker around with a few (4) small systems 
to get the hang of things and figure out what I'm doing. Once we know what 
we're doing and what we want, we'll make more plans for the big stuff.


Leigh


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 19 14:17:42 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 19 Dec 2003 11:17:42 -0800
Subject: [Beowulf] Wireless Channel Bonding
In-Reply-To: <1071853396.3fe32f54cb231@webmail.utah.edu>
Message-ID: <5.2.0.9.2.20031219111046.0313ed08@mailhost4.jpl.nasa.gov>

At 10:03 AM 12/19/2003 -0700, sal10 at utah.edu wrote:
>I am working on a project to create a wireless network that uses several
>802.11 channels in an attempt to increase data throughput.  The network would
>link 2 computers and each computer would have 2 wireless cards.  Does anyone
>know if this can be done the same way as Ethernet channel bonding?   If 
>anyone
>has any ideas, let me know. In addition, if anyone is aware of sources of
>information about wireless channel bonding, please let me know.
>Thanks
>Andy

I've been looking into something quite similar, and, superficially at 
least, it should be possible, although clunky..

Here's one technique that will almost certainly work:

Two wired interfaces in the machine
Each interface is connected to a wireless bridge (something like the 
LinkSys WET11)
The two WETs are configured for different, non-overlapping, RF channels 
(1,6,11 for 802.11b)
As far as the machine is concerned, it's just like having two parallel wires.


Bear in mind that 802.11 is a half duplex medium!  Any one node can either 
be transmitting or receiving  but not both.  Think old style Coax Ethernet.


I see no philosophical reason why one couldn't, for instance, plug in 
multiple PCI based wireless cards. To the computer they just look like 
network interfaces.  The problem you might face is the lack of drivers for 
the high performance 802.11a or 802.11g PCI cards.

If someone can confirm that, for instance, the LinkSys WMP55AG works with 
Linux, particularly in connection with a VIA Mini-ITX motherboard, I'd be 
real happy to hear about it.


James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lathama at yahoo.com  Fri Dec 19 14:38:13 2003
From: lathama at yahoo.com (Andrew Latham)
Date: Fri, 19 Dec 2003 11:38:13 -0800 (PST)
Subject: [Beowulf] 2.6
In-Reply-To: <5.2.0.9.2.20031219111046.0313ed08@mailhost4.jpl.nasa.gov>
Message-ID: <20031219193813.11509.qmail@web60304.mail.yahoo.com>

Most already know and have played with 2.6.

lots of smp fixes and Linus is fixing documentation.

But did you read the changelog?....    ....hint read the last entry.


Summary of changes from v2.6.0-test11 to v2.6.0
============================================

<hirofumi at mail.parknet.co.jp>
	[PATCH] Missing initialization of /proc/net/tcp seq_file
	
	We need to initialize st->state in tcp_seq_start().  Otherwise
	tcp_seq_stop() is run with previous st->state, and it calls the unneeded
	unlock etc, causing a kernel crash.

<mingo at elte.hu>
	[PATCH] Fix lost wakeups problem

	When doing sync wakeups we must not skip the notification of other cpus
	if the task is not on this runqueue.

<torvalds at home.osdl.org>
	Fix x86 kernel page fault error codes

<torvalds at home.osdl.org>
	Fix ide-scsi.c uninitialized variable

<yoshfuji at linux-ipv6.org>
	[IPV6]: Fix ipv4 mapped address calculation in udpv6_sendmsg().

<laforge at netfilter.org>
	[NETFILTER]: Sanitize ip_ct_tcp_timeout_close_wait value, from 2.4.x

<pavlin at icir.org>
	[RTNETLINK]: Add RTPROT_XORP.

<mingo at elte.hu>
	[PATCH] Fix /proc access to dead thread group list oops
	
	The pid_alive() check within the loop is incorrect.  If we are within
	the tasklist lock and the thread group leader is valid then the thread
	chain will be fully intact.
	
	Instead, the check should be _outside_ the loop, since if the group
	leader no longer exists, the whole list is gone and we must not try
	to access it.
	
	Move the check around, and add comment.
	
	Bug-hunting and fix by Srivatsa Vaddagiri

<axboe at suse.de>
	[PATCH] fix broken x86_64 rdtscll
	
	The scheduler is completed b0rked on x86_64, and I finally found out
	why.  sched_clock() always returned 0, because rdtscll() always returned
	0.  The 'a' in the macro doesn't agree with the 'a' in the function,
	yippe :-)
	
	This is a show stopper for x86_64.

<khali at linux-fr.org>
	[PATCH] I2C: fix i2c_smbus_write_byte() for i2c-nforce2
	
	This patch fixes i2c_smbus_write_byte() being broken for i2c-nforce2.
	This causes trouble when that module is used together with eeprom (which
	is also in 2.6). We have had three user reports about the problem.
	
	Credits go to Mark D. Studebaker for finding and fixing the problem.

<drepper at redhat.com>
	[PATCH] Fix 'noexec' behaviour
	
	We should not allow mmap() with PROT_EXEC on mounts marked "noexec",
	since otherwise there is no way for user-supplied executable loaders
	(like ld.so and emulator environments) to properly honour the
	"noexec"ness of the target.

<davem at nuts.ninka.net>
	[NETFILTER]: In conntrack, do not fragment TSO packets by accident.

<ja at ssi.bg>
	[BRIDGE]: Provide correct TOS value to IPv4 routing.

<jgarzik at pobox.com>
	[PATCH] fix use-after-free in libata
	
	Fixes oops some were seeing on module unload.
	
	Caught by Jon Burgess.

<jgarzik at pobox.com>
	[PATCH] fix oops on unload in pcnet32
	
	The driver was calling pci_unregister_driver for each _device_, and then
	again at the end of the module unload routine.  Remove the call that's
	inside the loop, pci_unregister_driver should only be called once.
	   
	Caught by Don Fry (and many others)

<jgarzik at pobox.com>
	[PATCH] remove manual driver poisoning of net_device
	
	From: Al Viro <viro at parcelfarce.linux.theplanet.co.uk>
	   
	   Such poisoning can cause oopses either because the refcount is not
	   zero when the poisoning occurs, or due to kernel debugging options
	   being enabled.

<torvalds at home.osdl.org>
	Fix the PROT_EXEC breakage on anonymous mmap.
	
	Clean up the tests while at it.

<jgarzik at pobox.com>
	[PATCH] wireless airo oops fix
	
	From Javier Achirica:
	
	Delay MIC activation to prevent Oops

<davem at nuts.ninka.net>
	[PKT_SCHED]: Do not dereference the special pointer value 'HTB_DIRECT'.
	
	Based upon a patch from devik.

<devik at cdi.cz>
	[PKT_SCHED]: In HTB, filters must be destroyed before the classes.

<James_McMechan at hotmail.com>
	[PATCH] tmpfs oops fix
	
	The problem was that the cursor was in the list being walked, and when
	the pointer pointed to the cursor the list_del/list_add_tail pair would
	oops trying to find the entry pointed to by the prev pointer of the
	deleted cursor element.
	
	The solution I found was to move the list_del earlier, before the
	beginning of the list walk. since it is not used during the list walk and
	should not count in the list enumeration it can be deleted, then the
	list pointer cannot point to it so it can be added safely with the
	list_add_tail without oopsing, and everything works as expected.
	
	I am unable to oops this version with any of my test programs. 
	
	Patch acked by Al Viro.

<greg at kroah.com>
	[PATCH] USB: register usb-serial ports in the proper place in sysfs
	
	They should be bound to the interface the driver is attached to, not
	the device.

<david-b at pacbell.net>
	[PATCH] USB: fix remove device after set_configuration
	
	If a device can't be configured, the current test9 code forgets
	to clean it out of sysfs.  This resolves that issue, so the retry
	in usb_new_device() stands a chance of working.
	
	The enumeration code still doesn't handle such errors well, but
	at least this way that hub port can be used for another device.

<greg at kroah.com>
	[PATCH] USB: fix race with hub devices disconnecting while stuff is still
happening to them.

<acme at conectiva.com.br>
	[IPV6]: Fix TCP socket leak.
	
	TCP IPV6 ->hash() method should not grab a socket reference.

<axboe at suse.de>
	[PATCH] scsi_ioctl memcpy'ing user address
	
	James reported a bug in scsi_ioctl.c where it mem copies a user pointer
	instead of using copy_from_user(). I inadvertently introduced this one
	when getting rid of CDROM_SEND_PACKET. Here's a trivial patch to fix it.

<mdharm-usb at one-eyed-alien.net>
	[PATCH] USB storage: fix for jumpshot and datafab devices
	
	This patch fixes some obvious errors in the jumpshot and datafab drivers.
	
	This should close out Bugzilla bug #1408
	
	> Date: Mon, 1 Dec 2003 12:14:53 -0500 (EST)
	> From: Alan Stern <stern at rowland.harvard.edu>
	> Subject: Patch from Eduard Hasenleithner
	> To: Matthew Dharm <mdharm-usb at one-eyed-alien.net>
	> cc: USB Storage List <usb-storage at one-eyed-alien.net>
	>
	> Matt:
	>
	> Did you see this patch?  It was posted to the usb-development mailing list
	> about a week ago, before I started making all my changes.  It is clearly
	> correct and necessary.
	>
	> Alan Stern

<trini at kernel.crashing.org>
	[PATCH] USB: mark the scanner driver as obsolete
	
	On Mon, Dec 01, 2003 at 11:21:58AM -0800, Greg KH wrote:
	> Can't you use xsane without the scanner kernel driver?  I thought the
	> latest versions used libusb/usbfs to talk directly to the hardware.
	> Because of this, the USB scanner driver is marked to be removed from the
	> kernel sometime in the near future.
	
	After a bit of mucking around (and possibly finding a bug with debian's
	libusb/xsane/hotplug interaction, nothing seems to run
	/etc/hotplug/usb/libusbscanner and thus only root can scan, anyone whose
	got this working please let me know), the problem does not exist if I
	only use  libusb xsane.
	
	How about the following:

<oliver at neukum.org>
	[PATCH] USB: fix sleping in interrupt bug in auerswald driver
	
	this fixes two instances of GFP_KERNEL from completion handlers.

<oliver at neukum.org>
	[PATCH] USB: fix race with signal delivery in usbfs
	
	apart from locking bugs, there are other races. This fixes one with
	signal delivery. The signal should be delivered _before_ the reciever
	is woken.

<stern at rowland.harvard.edu>
	[PATCH] USB: fix bug not setting device state following usb_device_reset()

<herbert at gondor.apana.org.au>
	[PATCH] USB: Fix connect/disconnect race
	
	This patch was integrated by you in 2.4 six months ago.  Unfortunately
	it never got into 2.5.  Without it you can end up with crashes such
	as http://bugs.debian.org/218670

<greg at kroah.com>
	[PATCH] USB: fix bug for multiple opens on ttyUSB devices.
	
	This patch fixes the bug where running ppp over a ttyUSB device would fail.

<arvidjaar at mail.ru>
	[PATCH] USB: prevent catch-all USB aliases in modules.alias
	
	visor.c defines one empty slot in USB ids table that can be filled in at
	runtime using module parameters. file2alias generates catch-all alias for it:
	
	alias usb:v*p*dl*dh*dc*dsc*dp*ic*isc*ip* visor
	
	patch adds the same sanity check as in depmod to scripts/file2alias.

<greg at kroah.com>
	kobject: fix bug where a parent could be deleted before a child device.

<torvalds at home.osdl.org>
	Fix subtle bug in "finish_wait()", which can cause kernel stack
	corruption on SMP because of another CPU still accessing a waitqueue
	even after it was de-allocated.
	
	Use a careful version of the list emptiness check to make sure we
	don't de-allocate the stack frame before the waitqueue is all done.

<axboe at suse.de>
	[PATCH] no bio unmap on cdb copy failure
	
	The previous scsi_ioctl.c patch didn't cleanup the buffer/bio in the
	error case. 
	
	Fix it by copying the command data earlier.

<l.s.r at web.de>
	[PATCH] HPFS: missing lock_kernel() in hpfs_readdir()
	
	In 2.5.x, the BKL was pushed from vfs_readdir() into the filesystem
	specific functions.  But only the unlock_kernel() made it into the HPFS
	code, lock_kernel() got lost on the way.  This rendered the filesystem
	unusable.
	
	This adds the missing lock_kernel().  It's been tested by Timo Maier who
	also reported the problem earlier today.

<torvalds at home.osdl.org>
	More subtle SMP bugs in prepare_to_wait()/finish_wait(). 
	
	This time we have a SMP memory ordering issue in prepare_to_wait(),
	where we really need to make sure that subsequent tests for the
	event we are waiting for can not migrate up to before the wait
	queue has been set up.

<torvalds at home.osdl.org>
	Fix thread group leader zombie leak
	
	Petr Vandrovec noticed a problem where the thread group leader
	would not be properly reaped if the parent of the thread group
	was ignoring SIGCHLD, and the thread group leader had exited
	before the last sub-thread.
	
	Fixed by Ingo Molnar.

<neilb at cse.unsw.edu.au>
	[PATCH] Fix possible bio corruption with RAID5
	
	 1/ make sure raid5 doesn't try to handle multiple overlaping
	    requests at the same time as this would confuse things badly.
	    Currently it justs BUGs if this is attempted.
	 2/ Fix a possible data-loss-on-write problem.  If two or
	    more bio's that write to the same page are processed at the
	    same time, only the first was actually commited to storage.
	 3/ Fix a use-after-free bug.  raid5 keeps the bio's it is given
	    in linked lists when more than one bio touch a single page.
	    In some cases the tail of this list can be freed, and
	    the current test for 'are we at the end' isn't reliable.
	    This patch strengths the test to make it reliable.

<axboe at suse.de>
	[PATCH] Fix IDE bus reset and DMA disable when reading blank DVD-R
	
	From Jon Burgess:
	
	  There is a problems with blank DVD media using the ide-cd driver.
	
	  When we attempt to read the blank disk, the drive responds to the read
	  request by returning a "blank media" error.  The kernel doesn't have
	  any special case handling for this sense value and retries the request
	  a couple of times, then gives up and does a bus reset and disables DMA
	  to the device.
	
	  Which obviously doesn't help the situation.
	
	  The sense key value of 8 isn't listed in ide-cd.h, but it is listed in
	  scsi.h as a "BLANK_CHECK" error.
	
	  This trivial patch treats this error condition as a reason to abort
	  the request.  This behaviour is the same as what we do with a blank CD-R.
	
	  It looks like the same fix might be desired for 2.4 as well, although
	  is perhaps not so important since scsi-ide is normally used instead.

<axboe at suse.de>
	[PATCH] CDROM_SEND_PACKET bug
	
	I just found Yet Another Bug in scsi_ioctl - CDROM_SEND_PACKET puts a
	kernel pointer in hdr->cmdp, where sg_io() expects to find user address.
	This worked up until recently because of the memcpy bug, but now it
	doesn't because we do the proper copy_from_user(). 
	
	This fix undoes the user copy code from sg_io, and instead makes the
	SG_IO ioctl copy it locally.  This makes SG_IO and CDROM_SEND_PACKET
	agree on the calling convention, and everybody is happy. 
	
	I've tested that both
	
	   cdrecord -dev=/dev/hdc -inq
	
	and
	
	   cdrecord -dev=ATAPI:/dev/hdc -inq
	
	works now.  The former will use SG_IO, the latter CDROM_SEND_PACKET (and
	incidentally would work in both 2.4 and 2.6, if it wasn't for
	CDROM_SEND_PACKET sucking badly in 2.4).

<jes at trained-monkey.org>
	[PATCH] qla1280 crash fix in error handling
	
	This fixes a bug in the qla1280 driver where it would leave a pointer to
	an on the stack completion event in a command structure if
	qla1280_mailbox_command fails.  The result is that the interrupt handler
	later tries to complete() garbage on the stack.  The mailbox command can
	fail if a device on the bus decides to lock up etc.

<torvalds at home.osdl.org>
	Linux 2.6.0


=====
/---------------------------------------------------------------------------------------------------\
Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.

What Is an agnostic? -  An agnostic thinks it impossible to know the truth
in matters such as, a god or the future with which religions are concerned 
with. Or, if not impossible, at least impossible at the present time.
 
LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
\---------------------------------------------------------------------------------------------------/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Fri Dec 19 14:10:52 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 19 Dec 2003 14:10:52 -0500 (EST)
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <4.2.0.58.20031219130954.00b43798@mail.flexfeed.com>
Message-ID: <Pine.LNX.4.44.0312191408390.15354-100000@ganesh.phy.duke.edu>

On Fri, 19 Dec 2003, Leigh wrote:

> At 06:38 AM 12/19/2003 -0500, Robert G. Brown wrote:
> 
> 
> >Do you know how to program?  C?  Perl?
> >
> >If so, I've got a few toys for you to play with...but they'll be boring
> >toys if you can't tinker.
> >
> >    rgb
> 
> Unfortunately, I don't. I can read and understand code, but I can't code 
> myself  yet.

Ahh.  The biggest problem you'll then have with clusters is that you're
stuck running other people's code.  There is some "fun" code out there
to play with that doesn't require anything but building and running in
e.g. the PVM or MPI distributions and elsewhere, but not a whole lot.
To go further at some point you'll have to learn to code.  Then you can
write applications like one that generates all the prime numbers with
less than X digits and the like...;-)

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From alvin at Mail.Linux-Consulting.com  Fri Dec 19 14:48:50 2003
From: alvin at Mail.Linux-Consulting.com (Alvin Oga)
Date: Fri, 19 Dec 2003 11:48:50 -0800 (PST)
Subject: [Beowulf] Wireless Channel Bonding
In-Reply-To: <5.2.0.9.2.20031219111046.0313ed08@mailhost4.jpl.nasa.gov>
Message-ID: <Pine.LNX.3.96.1031219114419.24392A-100000@Maggie.Linux-Consulting.com>


hi ya jim

On Fri, 19 Dec 2003, Jim Lux wrote:

> I see no philosophical reason why one couldn't, for instance, plug in 
> multiple PCI based wireless cards. To the computer they just look like 
> network interfaces.  The problem you might face is the lack of drivers for 
> the high performance 802.11a or 802.11g PCI cards.

i've gotten a netgear wg311 (802.11g) nic recognized/configured
on my test redhat EL - ws setup with the madwifi drivers

collection of wireless drivers and supported cards:

	http://www.Linux-Sec.net/Wireless

> If someone can confirm that, for instance, the LinkSys WMP55AG works with 
> Linux, particularly in connection with a VIA Mini-ITX motherboard, I'd be 
> real happy to hear about it.

i'll be playing with the linksys WMP54g next for the other end of the
wireless connection, and hopefully run ipsec between teh two connections
since wep is a cracked technology
 
c ya
alvin

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 19 18:08:16 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 19 Dec 2003 15:08:16 -0800
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <4.2.0.58.20031219130954.00b43798@mail.flexfeed.com>
References: <Pine.LNX.4.44.0312190636570.29889-100000@lucifer.rgb.priva te.net>
 <4.2.0.58.20031218174322.00b6d310@mail.flexfeed.com>
Message-ID: <5.2.0.9.2.20031219150401.0319aaa8@mailhost4.jpl.nasa.gov>

At 01:10 PM 12/19/2003 -0500, Leigh wrote:
>At 06:38 AM 12/19/2003 -0500, Robert G. Brown wrote:
>
>
>>Do you know how to program?  C?  Perl?
>>
>>If so, I've got a few toys for you to play with...but they'll be boring
>>toys if you can't tinker.
>>
>>    rgb
>
>Unfortunately, I don't. I can read and understand code, but I can't code 
>myself  yet.
>


Hah... if you can read and understand code, you can tinker with it.. If you 
break it.. well, that's why you keep versions.  Surely you can use a text 
editor and invoke the compiler/linker.

I'll even point out that one can run parallel applications using Visual 
Basic (or even, qbasic, for that matter)

Leap in and start modifying.  Do those series expansions for psi, e, 
Euler's Constant, pi, etc.  Solve the 8 queens problems. Crack 
DES.  Calculate casino odds by monte carlo simulation ( a nice 
embarrassingly parallel challenge...)

If you want something more "useful", take a look at one of the genetic 
optimizing algorithms and parallelize it (or, more usefully, find someone 
else's parallel implementation, and modify or configure it with something 
practical.)


James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 19 18:10:12 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 19 Dec 2003 15:10:12 -0800
Subject: [Beowulf] Wireless Channel Bonding
References: <5.2.0.9.2.20031219111046.0313ed08@mailhost4.jpl.nasa.gov>
Message-ID: <5.2.0.9.2.20031219150853.031a8870@mailhost4.jpl.nasa.gov>

At 11:48 AM 12/19/2003 -0800, Alvin Oga wrote:


I'm more interested in the 802.11a 5GHz technologies.. the WMP54g is a 2.4 
GHz band device (read, incredibly congested in my lab).  However, I have 
been given to understand that the WMP55AG is based on the Atheros chipset, 
and that they have actually published a Linux driver...


>hi ya jim
>
>On Fri, 19 Dec 2003, Jim Lux wrote:
>
> > I see no philosophical reason why one couldn't, for instance, plug in
> > multiple PCI based wireless cards. To the computer they just look like
> > network interfaces.  The problem you might face is the lack of drivers for
> > the high performance 802.11a or 802.11g PCI cards.
>
>i've gotten a netgear wg311 (802.11g) nic recognized/configured
>on my test redhat EL - ws setup with the madwifi drivers
>
>collection of wireless drivers and supported cards:
>
>         http://www.Linux-Sec.net/Wireless
>
> > If someone can confirm that, for instance, the LinkSys WMP55AG works with
> > Linux, particularly in connection with a VIA Mini-ITX motherboard, I'd be
> > real happy to hear about it.
>
>i'll be playing with the linksys WMP54g next for the other end of the
>wireless connection, and hopefully run ipsec between teh two connections
>since wep is a cracked technology
>
>c ya
>alvin

James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From brian.dobbins at yale.edu  Fri Dec 19 19:19:42 2003
From: brian.dobbins at yale.edu (Brian Dobbins)
Date: Fri, 19 Dec 2003 19:19:42 -0500 (EST)
Subject: [Beowulf] 2.6
In-Reply-To: <20031219193813.11509.qmail@web60304.mail.yahoo.com>
Message-ID: <Pine.LNX.4.44.0312191915120.18934-100000@email.combustion.eng.yale.edu>

> Most already know and have played with 2.6.
> lots of smp fixes and Linus is fixing documentation.
> But did you read the changelog?....    ....hint read the last entry.

  On a slightly different note (.. different from the changelog bit ...), 
what are people's experiences in terms of performance?  Any noticeable 
difference in, ie, SMP codes?  I/O?  Network performance?  

  I have some Opterons here which, as soon as the jobs they need to run 
are done, I'm going to reboot with a PXE+Etherboot (*) 2.6 kernel to play 
with, but that could be a while yet. 

  (*) And, for the sake of saving anyone who may be trying the same thing, 
for some reason when using "mkelf-linux", I couldn't specify:

  --append="root=/dev/ram"

  .. like I could with the 2.4 kernel.  This time, I had to use the device 
numbers:

  --append="root=0100"

  Not 100% sure that was the problem, since it was done on very little 
sleep, but if any of you are booting diskless opterons and want to try the 
2.6 kernel but aren't having much luck, give that a shot.

  Cheers,
  - Brian


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From douglas at shore.net  Sat Dec 20 12:54:32 2003
From: douglas at shore.net (Douglas O'Flaherty)
Date: Sat, 20 Dec 2003 12:54:32 -0500
Subject: [Beowulf] RH Update 1 Announcement
Message-ID: <3FE48CD8.1060708@shore.net>

Thought this list would be interested... Now if they only also announced 
cluster pricing...

RedHat goes public with what is in Update 1:

http://news.com.com/2100-7344_3-5130174.html?tag=nefd_top

*Red Hat began public testing this week of an update designed to make 
its new premium Linux product work better on IBM servers and computers 
that use Advanced Micro Devices' Opteron chip. *

Update 1 of Red Hat Enterprise Linux 3 
<http://news.com.com/2100-7344-5094774.html?tag=nl> is expected to be 
final in mid-January, spokeswoman Leigh Day said on Friday.

The update will speed up RHEL 3 on IBM mainframes, Red Hat said. It will 
also make it work on a broader number of IBM's Power-chip-based pSeries 
and iSeries servers and on some new servers using Intel's Itanium 2 
processor.

doug


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From landman at scalableinformatics.com  Sat Dec 20 14:16:02 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Sat, 20 Dec 2003 14:16:02 -0500
Subject: [Beowulf] RH Update 1 Announcement
In-Reply-To: <3FE48CD8.1060708@shore.net>
References: <3FE48CD8.1060708@shore.net>
Message-ID: <1071947761.12682.18.camel@protein.scalableinformatics.com>

So my questions are (relative to this), which product would be used for
the compute nodes on a cluster?  

Redhat has:
	RHEL WS at ~$792 from web store
	
SUSE has:
	SLES 2 CPU license at ~$767 from their web store
	SL Pro 9.0 for AMD64 at  ~$120    

I assume the $700++ items have the NUMA patches.  Does the SL Pro
product?  

Of course there are other distributions one could use.  Commercially
Scyld, CLIC, and a few others are out or coming out such as Callident . 
Non-commercial you have ROCKS, cAos (soon), White-Box, Debian, OSCAR +
[RH | Mandrake], biobrew, Gentoo, and probably a few others.

Who is going to support the x86_64 platforms?  RH and SUSE are obvious,
but I think that cAos, ROCKS, CLIC, Gentoo, et al may/will support
x86_64.  Has anyone compiled a list yet?

Curious.

Joe

-- 
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman at scalableinformatics.com
  web: http://scalableinformatics.com
phone: +1 734 612 4615


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From kervin at blueprintinc.com  Sat Dec 20 17:30:13 2003
From: kervin at blueprintinc.com (Kervin L. Pierre)
Date: Sat, 20 Dec 2003 17:30:13 -0500
Subject: [Beowulf] is "TCP Short Messages" patch necessary/available for 2.4 kernel?
Message-ID: <3FE4CD75.9070805@blueprintinc.com>

Hello,

I am upgrading software on a cluster at my college and part of the 
documentation says to patch the kernel with the "TCP Short Messages" 
patch found at http://www.icase.edu/coral/LinuxTCP.html .

The patch is only available for 2.2 series kernel and none seems to be 
done for the 2.4 kernel.  The contact email on that page bounces as well.

Is this patch still necessary for TCP Short Messages functionality?  If 
so where can I find the patch against 2.4?

Any information would be appreciated,
--Kervin
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From laytonjb at comcast.net  Sat Dec 20 18:49:03 2003
From: laytonjb at comcast.net (Jeffrey B. Layton)
Date: Sat, 20 Dec 2003 18:49:03 -0500
Subject: [Beowulf] is "TCP Short Messages" patch necessary/available for
 2.4 kernel?
In-Reply-To: <3FE4CD75.9070805@blueprintinc.com>
References: <3FE4CD75.9070805@blueprintinc.com>
Message-ID: <3FE4DFEF.2000502@comcast.net>

Kervin,

   You don't need it for the 2.4 or 2.6 kernels.

Enjoy!

Jeff

> Hello,
>
> I am upgrading software on a cluster at my college and part of the 
> documentation says to patch the kernel with the "TCP Short Messages" 
> patch found at http://www.icase.edu/coral/LinuxTCP.html .
>
> The patch is only available for 2.2 series kernel and none seems to be 
> done for the 2.4 kernel.  The contact email on that page bounces as well.
>
> Is this patch still necessary for TCP Short Messages functionality?  
> If so where can I find the patch against 2.4?
>
> Any information would be appreciated,
> --Kervin
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
>


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From kervin at blueprintinc.com  Sat Dec 20 23:36:09 2003
From: kervin at blueprintinc.com (Kervin L. Pierre)
Date: Sat, 20 Dec 2003 23:36:09 -0500
Subject: [Beowulf] is "TCP Short Messages" patch necessary/available for
 2.4 kernel?
In-Reply-To: <3FE4DFEF.2000502@comcast.net>
References: <3FE4CD75.9070805@blueprintinc.com> <3FE4DFEF.2000502@comcast.net>
Message-ID: <3FE52339.5090203@blueprintinc.com>

Thanks Jeffrey,

Is there a kernel config or a /proc file associated with TCP Short 
Messages?  Or is it enabled by default?  Eg with the patch one had to 
'echo 1 > /proc/sys/net/ipv4/tcp_faster_timeouts', but this file is not 
in 2.4's /proc.

On a related note, does anyone have any TCP options I can turn on to 
improve the network performance of my beowulf?  I have 50 nodes using 
channel-bonding on 4 cisco switches.

Thanks again,
--Kervin

Jeffrey B. Layton wrote:
> Kervin,
> 
>   You don't need it for the 2.4 or 2.6 kernels.
> 
> Enjoy!
> 
> Jeff
> 
>> Hello,
>>
>> I am upgrading software on a cluster at my college and part of the 
>> documentation says to patch the kernel with the "TCP Short Messages" 
>> patch found at http://www.icase.edu/coral/LinuxTCP.html .
>>
>> The patch is only available for 2.2 series kernel and none seems to be 
>> done for the 2.4 kernel.  The contact email on that page bounces as well.
>>
>> Is this patch still necessary for TCP Short Messages functionality?  
>> If so where can I find the patch against 2.4?
>>
>> Any information would be appreciated,
>> --Kervin
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit 
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
> 
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From laytonjb at comcast.net  Sun Dec 21 07:55:32 2003
From: laytonjb at comcast.net (Jeffrey B. Layton)
Date: Sun, 21 Dec 2003 07:55:32 -0500
Subject: [Beowulf] is "TCP Short Messages" patch necessary/available for
 2.4 kernel?
In-Reply-To: <3FE52339.5090203@blueprintinc.com>
References: <3FE4CD75.9070805@blueprintinc.com> <3FE4DFEF.2000502@comcast.net> <3FE52339.5090203@blueprintinc.com>
Message-ID: <3FE59844.8050002@comcast.net>

Kervin,

> Thanks Jeffrey,
>
> Is there a kernel config or a /proc file associated with TCP Short 
> Messages?  Or is it enabled by default?  Eg with the patch one had to 
> 'echo 1 > /proc/sys/net/ipv4/tcp_faster_timeouts', but this file is 
> not in 2.4's /proc. 


   To be honest, I can't remember. Josip found that problem way back
in the 2.2 kernel days and I haven't used a 2.2 kernel in about a
year. Here's the original link I have:

http://www.icase.edu/coral/LinuxTCP2.html

Here's a post from Josip explaining that the short message problem
was fixed in the 2.4 series kernels:

http://www.beowulf.org/pipermail/beowulf/2001-August/000988.html

Once again, you don't have to worry about the problem. However, if
you think it's a problem, I'd contact Josip directly and see if he can
help you determine if it is a problem for your code and perhaps how
you can fix it.
   To be honest, it might not be worth fixing. The TCP stack and
networking in the 2.6 kernel are pretty good from what I've heard.
Maybe switching to a 2.6 kernel could help the problem.

> On a related note, does anyone have any TCP options I can turn on to 
> improve the network performance of my beowulf?  I have 50 nodes using 
> channel-bonding on 4 cisco switches. 


   My condolences on using Cisco. I've need had the displeasure
of using them in clusters, but from everything I've heard and
everyone I have spoken with, they're not the best. Difficult
beasts to work with and they don't have good throughput.
   Can you give us some more details? What kind of nodes?
What kind of NICs? Driver version? Switch version? Are you
just trying to get better performance or do you think there's
a problem? What kind of network performance are you getting
now? Have you run things like netpipe and/or netperf between
two nodes? How about testing the NASA Parallel benchmarks
between various combinations of nodes to check performance?
What MPI are you running?
   Also, since you're bonding, have you applied the latest
bonding patches?

http://sourceforge.net/projects/bonding/

You might also join the bonding mailing list if the problem
appears to be with the channel bonding.


Good Luck!

Jeff

P.S. It's Jeff, not Jeffrey. Only RGB calls me Jeffrey and I think
he does it to tweak me. Well, there is my wife when she's
angry with me. Wait, I think I hear some yelling... .

>
>
> Thanks again,
> --Kervin
>
> Jeffrey B. Layton wrote:
>
>> Kervin,
>>
>>   You don't need it for the 2.4 or 2.6 kernels.
>>
>> Enjoy!
>>
>> Jeff
>>
>>> Hello,
>>>
>>> I am upgrading software on a cluster at my college and part of the 
>>> documentation says to patch the kernel with the "TCP Short Messages" 
>>> patch found at http://www.icase.edu/coral/LinuxTCP.html .
>>>
>>> The patch is only available for 2.2 series kernel and none seems to 
>>> be done for the 2.4 kernel.  The contact email on that page bounces 
>>> as well.
>>>
>>> Is this patch still necessary for TCP Short Messages functionality?  
>>> If so where can I find the patch against 2.4?
>>>
>>> Any information would be appreciated,
>>> --Kervin
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org
>>> To change your subscription (digest mode or unsubscribe) visit 
>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>>
>>
>>
>
>


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From mbanck at gmx.net  Sun Dec 21 07:34:00 2003
From: mbanck at gmx.net (Michael Banck)
Date: Sun, 21 Dec 2003 13:34:00 +0100
Subject: [Beowulf] RH Update 1 Announcement
In-Reply-To: <1071947761.12682.18.camel@protein.scalableinformatics.com>
References: <3FE48CD8.1060708@shore.net> <1071947761.12682.18.camel@protein.scalableinformatics.com>
Message-ID: <20031221123400.GA23879@blackbird.oase.mhn.de>

On Sat, Dec 20, 2003 at 02:16:02PM -0500, Joe Landman wrote:
> Who is going to support the x86_64 platforms?  RH and SUSE are obvious,
> but I think that cAos, ROCKS, CLIC, Gentoo, et al may/will support
> x86_64.  Has anyone compiled a list yet?

Debian will. Stuff is still being hashed out, though, as being able to
have both 32 and 64 bit packages installed concurrently requires some
changes to the low-level packaging system.


Michael
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From csamuel at vpac.org  Sun Dec 21 18:33:55 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Mon, 22 Dec 2003 10:33:55 +1100
Subject: [Beowulf] RH Update 1 Announcement
In-Reply-To: <1071947761.12682.18.camel@protein.scalableinformatics.com>
References: <3FE48CD8.1060708@shore.net> <1071947761.12682.18.camel@protein.scalableinformatics.com>
Message-ID: <200312221033.56256.csamuel@vpac.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sun, 21 Dec 2003 06:16 am, Joe Landman wrote:

> Who is going to support the x86_64 platforms?  RH and SUSE are obvious,
> but I think that cAos, ROCKS, CLIC, Gentoo, et al may/will support
> x86_64.  Has anyone compiled a list yet?

Data points that I'm aware off (apart from SuSE and RHEL):

NPACI Rocks - 3.1 due out Real Soon Now (tm) (maybe this week) will be rebuild 
from trademark-stripped RHEL SRPMS (as Redhat require) and will support 
Opterons as well as the previous IA32 and IA64 architectures.

Mandrake 9.2 for AMD64 - currently at RC1 and freely downloadable for Opterons 
and Athlon64 processors.

Gentoo's AMD64 support sounds distinctly early beta-ish from their technical 
notes at http://dev.gentoo.org/~brad_mssw/amd64-tech-notes.html - there's 
also a report (no details) of a successful install at 
http://www.odegard.uni.cc/index.php?itemid=3

Debian likewise sounds like a work in progress, the port home page is at 
http://www.debian.org/ports/amd64/ and there's an FAQ linked from it which 
gives a lot more information. Of course, given the recent compromise of 
Debian systems the development may be more advanced than the web pages.

The cAos website lists AMD64 as a target, but the download sites only list 
i386 for the moment.

TurboLinux now supports Opterons with the release of their AMD64 Update Kit at 
http://www.turbolinux.com/products/tl8a/tl8a_uk/ - I guess this shouldn't be 
suprising as they're a UnitedLinux distro just like SuSE is. Connectia 
(another UL distro) doesn't seem to, although their website is in Spanish and 
I had to guess what the search form was. :-)

cheers!
Chris
- -- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/5i3jO2KABBYQAh8RAktXAJ9qjfnmUTfMgUkTR3ujtgGvonfqcgCghvAp
c4thcjce81kA9t6odoowblc=
=k/Dd
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From michael.worsham at mci.com  Mon Dec 22 11:34:31 2003
From: michael.worsham at mci.com (Michael Worsham)
Date: Mon, 22 Dec 2003 11:34:31 -0500
Subject: [Beowulf] QNX Support?
Message-ID: <001001c3c8a9$7ab5d130$2f7032a6@Wcomnet.com>

Hi all.

Anyone have any documentation/links to sites of setting up a beowulf under
QNX?

Thanks.

-- M


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From josip at lanl.gov  Mon Dec 22 13:15:46 2003
From: josip at lanl.gov (Josip Loncaric)
Date: Mon, 22 Dec 2003 11:15:46 -0700
Subject: [Beowulf] is "TCP Short Messages" patch necessary/available for
 2.4 kernel?
In-Reply-To: <3FE4CD75.9070805@blueprintinc.com>
References: <3FE4CD75.9070805@blueprintinc.com>
Message-ID: <3FE734D2.2080802@lanl.gov>

Kervin L. Pierre wrote:
> 
> I am upgrading software on a cluster at my college and part of the 
> documentation says to patch the kernel with the "TCP Short Messages" 
> patch found at http://www.icase.edu/coral/LinuxTCP.html .
> 
> The patch is only available for 2.2 series kernel and none seems to be 
> done for the 2.4 kernel.  The contact email on that page bounces as well.

Unfortunately, ICASE is no more: it was "improved out of existence" (the 
successor organization NIA operates somewhat differently).

The Dec. 31, 2002 snapshot of the official ICASE web site is hosted by 
USRA, so papers etc. can be retrieved using the old URLs, but ICASE 
E-mail addresses and personal web pages are defunct.

> Is this patch still necessary for TCP Short Messages functionality?  If 
> so where can I find the patch against 2.4?

The patch was needed for 2.0 and 2.2 Linux kernels due to a quirk in 
their TCP stack implementation.  Since 2.4 Linux kernels perform fine 
without the patch, you do not need it any more.

Sincerely,
Josip

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From msaleh at ihu.ac.ir  Sat Dec 27 05:05:04 2003
From: msaleh at ihu.ac.ir (Mahmoud Saleh)
Date: Sat, 27 Dec 2003 13:35:04 +0330
Subject: [Beowulf] Gigabit Ethernet vs Myrinet
Message-ID: <WorldClient-F200312271335.AA35040001@ihu.ac.ir>

Folks,

Reading a couple of comparison tables regarding latency of different NIC 
protocols, I noticed that many solutions suggest to use Myrinet style 
NIC due to its low latency, namely around 8usec for I/O intensive 
cluster jobs. I was wondering if Gigabit Ethernet does the same.

Suppose that Maximum packet size in GE is 1500 bytes and the minimum is 
aroud 100 bytes. This translates to an average of 800 bytes or 6400 
bits. In Gigabit Ethernet that would cause a delay of 6400/10^9 sec  or 
6.4usec for packet assembly, which is in the same order as Myrinet. 

Is this justification correct? If so, how wise is it to use Gigabit 
Ethernet for an I/O intensive cluster?

 
Regards,
--
Mahmoud

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From deadline at linux-mag.com  Sat Dec 27 13:21:34 2003
From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine)
Date: Sat, 27 Dec 2003 13:21:34 -0500 (EST)
Subject: [Beowulf] Gigabit Ethernet vs Myrinet
In-Reply-To: <WorldClient-F200312271335.AA35040001@ihu.ac.ir>
Message-ID: <Pine.LNX.4.44.0312271303380.2670-100000@boltzmann.basement-supercomputing.com>

On Sat, 27 Dec 2003, Mahmoud Saleh wrote:

> Folks,
> 
> Reading a couple of comparison tables regarding latency of different NIC 
> protocols, I noticed that many solutions suggest to use Myrinet style 
> NIC due to its low latency, namely around 8usec for I/O intensive 
> cluster jobs. I was wondering if Gigabit Ethernet does the same.
> 
> Suppose that Maximum packet size in GE is 1500 bytes and the minimum is 
> aroud 100 bytes. This translates to an average of 800 bytes or 6400 
> bits. In Gigabit Ethernet that would cause a delay of 6400/10^9 sec  or 
> 6.4usec for packet assembly, which is in the same order as Myrinet. 

The best 1 byte latency for GigE I have measured has been 25 us.
This test was using netpipe/TCP. It is hard to provide a solid number
because Ethernet chip-sets/drivers vary as do motherboards that include
GigE. The best thing to do is test some hardware.

> 
> Is this justification correct? If so, how wise is it to use Gigabit 
> Ethernet for an I/O intensive cluster?

More tests are needed to answer that. With Myrinet, Quadrics, SCI, you 
will get better performance -- and spend more money. Somethings you may 
need to consider with this decision:

1. What  API you will use MPI, PVM, sockets? (API can add overhead to 
   latency numbers)
2. How many nodes do expect to use ?
3. Is there a single NFS server for the data or are you using something 
   like PVFS or GFS?
4. What are your I/O block sizes?

Doug
> 
>  
> 
> Regards,
> --
> Mahmoud
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lindahl at pathscale.com  Sat Dec 27 12:52:31 2003
From: lindahl at pathscale.com (Greg Lindahl)
Date: Sat, 27 Dec 2003 09:52:31 -0800
Subject: [Beowulf] Gigabit Ethernet vs Myrinet
In-Reply-To: <WorldClient-F200312271335.AA35040001@ihu.ac.ir>
References: <WorldClient-F200312271335.AA35040001@ihu.ac.ir>
Message-ID: <20031227175231.GB1642@greglaptop.earthlink.net>

On Sat, Dec 27, 2003 at 01:35:04PM +0330, Mahmoud Saleh wrote:

> Reading a couple of comparison tables regarding latency of different NIC 
> protocols, I noticed that many solutions suggest to use Myrinet style 
> NIC due to its low latency, namely around 8usec for I/O intensive 
> cluster jobs. I was wondering if Gigabit Ethernet does the same.

First off, most people separate disk I/O from program
communications. I'll assume that you're talking about the second.

> Suppose that Maximum packet size in GE is 1500 bytes and the minimum is 
> aroud 100 bytes. This translates to an average of 800 bytes or 6400 
> bits. In Gigabit Ethernet that would cause a delay of 6400/10^9 sec  or 
> 6.4usec for packet assembly, which is in the same order as Myrinet. 

You are only thinking about the time needed to send the actual
bytes. The total time to send a small message is much bigger than
that. There are published papers that show the "ping pong" latency for
gigabit ethernet. This number is highly dependent on the exact
gigE card, switch, OS, and gigE driver that you're using.

-- greg

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From deadline at linux-mag.com  Sun Dec 28 08:30:23 2003
From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine)
Date: Sun, 28 Dec 2003 08:30:23 -0500 (EST)
Subject: [Beowulf] New Poll on Cluster-Rant
In-Reply-To: <Pine.LNX.4.44.0312271303380.2670-100000@boltzmann.basement-supercomputing.com>
Message-ID: <Pine.LNX.4.44.0312280815000.10843-100000@boltzmann.basement-supercomputing.com>


For those interested, there is a new poll asking about kernel 2.6 at
cluster-rant.com. The links to the new poll and previous interconnects
poll (107 votes) can be found here:

http://www.cluster-rant.com/article.pl?sid=03/12/22/1625228

Doug

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ds10025 at cam.ac.uk  Mon Dec 29 06:27:21 2003
From: ds10025 at cam.ac.uk (D. Scott)
Date: 29 Dec 2003 11:27:21 +0000
Subject: [Beowulf] X-window, MPICH, MPE, Cluster performance test
Message-ID: <E1AavYP-0004YS-JM@maroon.csi.cam.ac.uk>


Hi

At last! My cluster is now online. I would like to thank everyone for they 
help. I thinking of putting a website together covering my experience in 
putting this cluster together. Will this be of use to anyone? Is they 
website that covers top 100 list of small cluster?.

Now it is online I would like to test it.

MPICH comes with test program, eg mpptest. Programs works and it produce 
nice graph. Is they any documentation/tutorial that explains meaning of 
these graphs?

MPICH also comes with MPE graphic test programs, mandel. Problem is that I 
have only got X-window installed on the master node. But, when I run 
pmandel, it returms an error, staying that it can not find shared library 
for X-window on other nodes. How can I make X-window shared across other 
nodes from the Master node? Same me install GUI programs on other nodes.

This could be related problem, but when I complied life (that uses MPE 
libraries) it returns error that MPE libraries are undefined. Any ideas?


Can I install both LAM/MPICH and MPICH-1.2.5 on the same machine?

How to calculate flops?


Are they any other performance test?

Thanks in advance.


Dan


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From anand at mecheng.iisc.ernet.in  Mon Dec 29 12:13:17 2003
From: anand at mecheng.iisc.ernet.in (Anand TNC)
Date: Mon, 29 Dec 2003 22:43:17 +0530 (IST)
Subject: [Beowulf] X-window, MPICH, MPE, Cluster performance test
In-Reply-To: <E1AavYP-0004YS-JM@maroon.csi.cam.ac.uk>
Message-ID: <Pine.LNX.4.33.0312292242180.11359-100000@mecheng.iisc.ernet.in>

^Hi
^
^At last! My cluster is now online. I would like to thank everyone for they 
^help. I thinking of putting a website together covering my experience in 
^putting this cluster together. Will this be of use to anyone? Is they 
^website that covers top 100 list of small cluster?.

Hi,

we're planning to set up a small cluster ~6 nodes - it will be very useful 
to people like me

thanks

Anand

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From brian.dobbins at yale.edu  Mon Dec 29 15:41:25 2003
From: brian.dobbins at yale.edu (Brian Dobbins)
Date: Mon, 29 Dec 2003 15:41:25 -0500 (EST)
Subject: [Beowulf] Q: Any beowulf people in the Beijing area?
Message-ID: <Pine.LNX.4.44.0312291532090.20289-100000@email.combustion.eng.yale.edu>


Hi guys,

  I'm just curious if there are any people here in the Beijing area who 
are doing work with Beowulf clusters?  I may be moving there sometime next 
year, but would like very much to stay involved in the realm of Beowulf 
and parallel computing in general.

  This is very preliminary, but if there are any of you out there who do 
work with clusters, or are planning on building one, etc., and happen to 
be in the Beijing area, I'd love to know!  :-)

  Cheers,
  - Brian

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Per.Lindstrom at me.chalmers.se  Mon Dec 29 14:59:27 2003
From: Per.Lindstrom at me.chalmers.se (=?ISO-8859-1?Q?Per_Lindstr=F6m?=)
Date: Mon, 29 Dec 2003 20:59:27 +0100
Subject: [Beowulf] Websites for small clusters
Message-ID: <3FF0879F.4080301@me.chalmers.se>

Hi Dan,

It should be great if you publish your cluster work instructions on a 
website. I have found that there is need for a such place.

The site http://www.msm.cam.ac.uk/map/mapmain.html is a good example on 
how a website sharing scientific and/or professional experience can be 
aranged.

If it not allready exist, shall we arrange something similar for few 
node clusters? (Few node clusters 2 - 30 nodes?)

Best regards
Per Lindstr?m


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From james.p.lux at jpl.nasa.gov  Mon Dec 29 21:45:05 2003
From: james.p.lux at jpl.nasa.gov (Jim Lux)
Date: Mon, 29 Dec 2003 18:45:05 -0800
Subject: [Beowulf] Websites for small clusters
References: <3FF0879F.4080301@me.chalmers.se>
Message-ID: <004001c3ce7e$f06507e0$36a8a8c0@LAPTOP152422>

I fully agree... I suspect that most readers of this list start with a small
cluster, and a historical record of what it took to get it up and running is
quite useful, especially the hiccups and problems that you inevitably
encounter. (e.g. what do you mean the circuit breaker just tripped on the
plug strip when we plugged all those things into it?)

----- Original Message -----
From: "Per Lindstr?m" <Per.Lindstrom at me.chalmers.se>
To: "D. Scott" <ds10025 at cam.ac.uk>; "Anand TNC"
<anand at mecheng.iisc.ernet.in>
Cc: "Beowulf" <beowulf at beowulf.org>; "Josh Moore" <kong at thejosh.net>; "Per"
<Per.Lindstrom at madpenguin.org>
Sent: Monday, December 29, 2003 11:59 AM
Subject: [Beowulf] Websites for small clusters


> Hi Dan,
>
> It should be great if you publish your cluster work instructions on a
> website. I have found that there is need for a such place.
>
> The site http://www.msm.cam.ac.uk/map/mapmain.html is a good example on
> how a website sharing scientific and/or professional experience can be
> aranged.
>
> If it not allready exist, shall we arrange something similar for few
> node clusters? (Few node clusters 2 - 30 nodes?)
>
> Best regards
> Per Lindstr?m
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ds10025 at cam.ac.uk  Tue Dec 30 06:42:43 2003
From: ds10025 at cam.ac.uk (D. Scott)
Date: 30 Dec 2003 11:42:43 +0000
Subject: [Beowulf] Websites for small clusters
Message-ID: <E1AbIGp-0002rN-By@maroon.csi.cam.ac.uk>

Hi

How did you calculate mega flops?

I will download and look into the benchmark program you are using.

Could we aggreed on which benchmark software to use so that we can compare 
performance of each small cluster?

http://home.attmil.ne.jp/a/jm/

Gives me an idea to put together a basic site. I'll see what I can do.

It did take me alot of time and effect searching the net for information. 
I'll see if I can put it all together.


Dan

On Dec 30 2003, Josh Moore wrote:

> I have seen a few but not that many websites around dealing with 
> indviduals clusters.  Most links were down and it took a great deal of 
> searching to come up with a few pages.  That is the main reason I made 
> my website http://home.attmil.ne.jp/a/jm/  dealing with the building of 
> my cluster.  It started as a two node cluster and has updates has I add 
> more nodes and run other tests.
> 
> Jim Lux wrote:
> 
> > I fully agree... I suspect that most readers of this list start with a 
> > small cluster, and a historical record of what it took to get it up and 
> > running is quite useful, especially the hiccups and problems that you 
> > inevitably encounter. (e.g. what do you mean the circuit breaker just 
> > tripped on the plug strip when we plugged all those things into it?)
> >
> > ----- Original Message ----- From: "Per Lindstr?m" 
> > <Per.Lindstrom at me.chalmers.se> To: "D. Scott" <ds10025 at cam.ac.uk>; 
> > "Anand TNC" <anand at mecheng.iisc.ernet.in> Cc: "Beowulf" 
> > <beowulf at beowulf.org>; "Josh Moore" <kong at thejosh.net>; "Per" 
> > <Per.Lindstrom at madpenguin.org> Sent: Monday, December 29, 2003 11:59 AM 
> > Subject: [Beowulf] Websites for small clusters
> >
> >
> >  
> >
> >>Hi Dan,
> >>
> >>It should be great if you publish your cluster work instructions on a
> >>website. I have found that there is need for a such place.
> >>
> >>The site http://www.msm.cam.ac.uk/map/mapmain.html is a good example on
> >>how a website sharing scientific and/or professional experience can be
> >>aranged.
> >>
> >>If it not allready exist, shall we arrange something similar for few
> >>node clusters? (Few node clusters 2 - 30 nodes?)
> >>
> >>Best regards
> >>Per Lindstr?m
> >>
> >>
> >>_______________________________________________
> >>Beowulf mailing list, Beowulf at beowulf.org
> >>To change your subscription (digest mode or unsubscribe) visit
> >>    
> >>
> >http://www.beowulf.org/mailman/listinfo/beowulf
> >
> >  
> >
> 
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ds10025 at cam.ac.uk  Tue Dec 30 08:55:00 2003
From: ds10025 at cam.ac.uk (D. Scott)
Date: 30 Dec 2003 13:55:00 +0000
Subject: [Beowulf] Websites for small clusters
Message-ID: <E1AbKKq-0007fl-7z@maroon.csi.cam.ac.uk>

Hi

Another site had a paper. Have anyone come across Linpack paper?

The site is http://www.csis.hku.hk/~clwang/gideon300/peak.html.

I had the same problem interpret the results when I ran test supplied my 
mpich, eg mpptest and gotest.

Will it be worth setting up a performance chart for small clusters. It can 
include, FLOPS, Network performance etc.

Dan
On Dec 30 2003, Josh Moore wrote:Linpack paper,

> Hi,
> I found a site that contained a modified version of the PI calculator 
> that comes bundled with mpich.  I have attached it.  I'm not sure on the 
> accuracy of it, but it seems to work.  I would love to have a standard 
> bench marking program to compare results.  Pallas is good, but it takes 
> a while to run and it can be hard to interpret the results.  It would be 
> much easier to say this setup has this many megaflops/gigaflops and this 
> setup has this many instead of saying here is a 200 line test result of 
> my setup from Pallas.  Pallas is great but it can be over-kill when you 
> want a quick estimate of overall performance while adding nodes or doing 
> different tweaking.  I am constatly adding stuff to my site.  I hope to 
> add some nodes and upgrade to 100Mbps by the end of January.  I am also 
> hoping to make the site easier to navigate instead of just having a 
> single page.
> 
> Josh
> 
> 
> D. Scott wrote:
> 
> > Hi
> >
> > How did you calculate mega flops?
> >
> > I will download and look into the benchmark program you are using.
> >
> > Could we aggreed on which benchmark software to use so that we can 
> > compare performance of each small cluster?
> >
> > http://home.attmil.ne.jp/a/jm/
> >
> > Gives me an idea to put together a basic site. I'll see what I can do.
> >
> > It did take me alot of time and effect searching the net for 
> > information. I'll see if I can put it all together.
> >
> >
> > Dan
> >
> >
> 
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From deadline at linux-mag.com  Tue Dec 30 10:16:40 2003
From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine)
Date: Tue, 30 Dec 2003 10:16:40 -0500 (EST)
Subject: [Beowulf] Websites for small clusters
In-Reply-To: <E1AbIGp-0002rN-By@maroon.csi.cam.ac.uk>
Message-ID: <Pine.LNX.4.44.0312301011370.10843-100000@boltzmann.basement-supercomputing.com>

On 30 Dec 2003, D. Scott wrote:

> Hi
> 
> How did you calculate mega flops?
> 
> I will download and look into the benchmark program you are using.
> 
> Could we aggreed on which benchmark software to use so that we can compare 
> performance of each small cluster?

Check out:

http://www.cluster-rant.com/article.pl?sid=03/03/17/1838236

It explains a bit about the Beowulf Performance suite (BPS), it is
not intended to measure LINPAC MFLOPS, but rather help you see if the
cluster is working properly. It may take a little fidgeting to get it
to work as there is no standard way to do things, but it can be useful
to test if the cluster is working properly and measure perforamnce using 
the NAS parallel test suite.

Let me know if you need help.

Doug

> 
> http://home.attmil.ne.jp/a/jm/
> 
> Gives me an idea to put together a basic site. I'll see what I can do.
> 
> It did take me alot of time and effect searching the net for information. 
> I'll see if I can put it all together.
> 
> 
> Dan
> 
> On Dec 30 2003, Josh Moore wrote:
> 
> > I have seen a few but not that many websites around dealing with 
> > indviduals clusters.  Most links were down and it took a great deal of 
> > searching to come up with a few pages.  That is the main reason I made 
> > my website http://home.attmil.ne.jp/a/jm/  dealing with the building of 
> > my cluster.  It started as a two node cluster and has updates has I add 
> > more nodes and run other tests.
> > 
> > Jim Lux wrote:
> > 
> > > I fully agree... I suspect that most readers of this list start with a 
> > > small cluster, and a historical record of what it took to get it up and 
> > > running is quite useful, especially the hiccups and problems that you 
> > > inevitably encounter. (e.g. what do you mean the circuit breaker just 
> > > tripped on the plug strip when we plugged all those things into it?)
> > >
> > > ----- Original Message ----- From: "Per Lindstr?m" 
> > > <Per.Lindstrom at me.chalmers.se> To: "D. Scott" <ds10025 at cam.ac.uk>; 
> > > "Anand TNC" <anand at mecheng.iisc.ernet.in> Cc: "Beowulf" 
> > > <beowulf at beowulf.org>; "Josh Moore" <kong at thejosh.net>; "Per" 
> > > <Per.Lindstrom at madpenguin.org> Sent: Monday, December 29, 2003 11:59 AM 
> > > Subject: [Beowulf] Websites for small clusters
> > >
> > >
> > >  
> > >
> > >>Hi Dan,
> > >>
> > >>It should be great if you publish your cluster work instructions on a
> > >>website. I have found that there is need for a such place.
> > >>
> > >>The site http://www.msm.cam.ac.uk/map/mapmain.html is a good example on
> > >>how a website sharing scientific and/or professional experience can be
> > >>aranged.
> > >>
> > >>If it not allready exist, shall we arrange something similar for few
> > >>node clusters? (Few node clusters 2 - 30 nodes?)
> > >>
> > >>Best regards
> > >>Per Lindstr?m
> > >>
> > >>
> > >>_______________________________________________
> > >>Beowulf mailing list, Beowulf at beowulf.org
> > >>To change your subscription (digest mode or unsubscribe) visit
> > >>    
> > >>
> > >http://www.beowulf.org/mailman/listinfo/beowulf
> > >
> > >  
> > >
> > 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

-- 
Editor-in-chief                   ClusterWorld Magazine
Desk: 610.865.6061                            
Cell: 610.390.7765         Redefining High Performance Computing
Fax:  610.865.6618 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From jgreenseid at wesleyan.edu  Tue Dec 30 12:18:46 2003
From: jgreenseid at wesleyan.edu (Joe Greenseid)
Date: Tue, 30 Dec 2003 12:18:46 -0500 (EST)
Subject: [Beowulf] Websites for small clusters
In-Reply-To: <Pine.LNX.4.44.0312301011370.10843-100000@boltzmann.basement-supercomputing.com>
References: <Pine.LNX.4.44.0312301011370.10843-100000@boltzmann.basement-supercomputing.com>
Message-ID: <Pine.GSO.4.53.0312301215030.19375@alumni.wesleyan.edu>

I have tried to post as many of these "how to build a beowulf" sites as i
can find on my website here:  http://lcic.org/documentation.html#comp

right now it looks like i have 5 or 6 of them that aren't from places like
IBM and stuff (and a few from IBM, ameslab, etc).  if folks come across
others i'm missing, please send them along to me, i'd be happy to post
them (i've seen a few things on the list here in the past month that i
have on the TODO list; i just haven't had much time with the real job
taking all my time lately, but that is changing shortly).

--Joe

***************************************
*  Joe Greenseid                      *
*  jgreenseid [at] wesleyan [dot] edu *
*  http://www.thunderlizards.net      *
*  http://lcic.org                    *
***************************************

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ddw at dreamscape.com  Tue Dec 30 21:41:19 2003
From: ddw at dreamscape.com (Daniel Williams)
Date: Tue, 30 Dec 2003 21:41:19 -0500
Subject: [Beowulf] Megaflops & Benchmarks
Message-ID: <3FF23739.BC46C032@dreamscape.com>

I am hoping to build a cluster as soon as I can find 8 or more Pentium II
class machines being scrapped, and I would be interested in being able to
compare a cluster's performance with all my single processor machines. Is
there a benchmark that will run on a single processor PC, as well as a
cluster, so you can compare them directly?
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From mikhailberis at free.net.ph  Wed Dec 31 03:22:00 2003
From: mikhailberis at free.net.ph (Dean Michael C. Berris)
Date: 31 Dec 2003 16:22:00 +0800
Subject: [Beowulf] Beowulf Benchmark
Message-ID: <1072858917.3845.8.camel@mikhail>

Good day everyone,

I am a student at the University of the Philippines at Los Banos (UPLB)
here in the Philippines, and I'm currently doing my thesis on projective
computational load balancing algorithm for Beowulf clusters. I am in
charge of two homogeneous clusters, each having 5 nodes, one based on
the x86 architecture while the other is based on the UltraSPARC
architecture.

I am relatively new to clustering technologies, but I have been at a
loss while looking for possible benchmarking tools for clusters. I have
seen some libraries like LINPACK for linear algebra, but I don't know
how to use it for benchmarking.

I have implemented a parallel genetic algorithm solution to the
asymmetric traveling salesman problem (100 nodes) on the x86 based
cluster, as well as a prime number finder on both the x86 and UltraSPARC
clusters. I have results on both the x86 cluster as well as the
UltraSPARC cluster with regard to the prime number finder, but I haven't
an idea as to how I could come up with the FLOPS that either cluster can
do.

Any tutorials, insights, and examples would be most welcome.

Thanks in advance!

-- 
Dean Michael C. Berris <mikhailberis at free.net.ph>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From john.hearns at clustervision.com  Mon Dec  1 09:37:41 2003
From: john.hearns at clustervision.com (John Hearns)
Date: Mon, 1 Dec 2003 15:37:41 +0100 (CET)
Subject: Fedora for x86_64
Message-ID: <Pine.LNX.4.44.0312011535240.2673-100000@druifje.clustervision.com>

I saw this on the Fedora list that it has been released for x86_64
http://fedora.linux.duke.edu/fc1_x86_64/

I should say that I haven't tried/used this myself, just thought
it would be of interest to this list.


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Mon Dec  1 09:43:47 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Mon, 1 Dec 2003 06:43:47 -0800
Subject: Mainboard identification and BIOS dump
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF5E@orsmsx402.jf.intel.com>

From: Anas Nashif, Saturday, November 29, 2003 8:29 PM
> 
> DMI decode is your friend
> 
> http://www.nongnu.org/dmidecode/
> 
This is definitely your friend.

HOWEVER, be aware that the information can vary widely and wildly from
one model computer to another, even among different models from the same
OEM.

A while ago, I was using the precursor to the above as the basis for a
"system serial number" utility -- even with the few vendors that I was
using at the time, the variety of places to put a serial number, if
available at all, was daunting.

Bottom line: there's some great info there, but don't be surprised by
the inconsistencies.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From msnitzer at lnxi.com  Mon Dec  1 10:50:03 2003
From: msnitzer at lnxi.com (Mike Snitzer)
Date: Mon, 1 Dec 2003 08:50:03 -0700
Subject: Fedora for x86_64
In-Reply-To: <Pine.LNX.4.44.0312011535240.2673-100000@druifje.clustervision.com>; from john.hearns@clustervision.com on Mon, Dec 01, 2003 at 03:37:41PM +0100
References: <Pine.LNX.4.44.0312011535240.2673-100000@druifje.clustervision.com>
Message-ID: <20031201085003.A28915@lnxi.com>

On Mon, Dec 01 2003 at 07:37,
John Hearns <john.hearns at clustervision.com> wrote:

> I saw this on the Fedora list that it has been released for x86_64
> http://fedora.linux.duke.edu/fc1_x86_64/
> 
> I should say that I haven't tried/used this myself, just thought
> it would be of interest to this list.

It should be noted that this is NOT an official Fedora Core 1 release for
amd64; as taken from the post to fedora-devel:

...
ISOs will not be provided for this release, but everything is there for
an install.
...

/***************************************************************************
*       WARNING: This release is a preview, it is not an official Fedora
*       Core 1 Release, this is not an official Fedora Core Test Release.
*       This release may very well cause damage to your data, your system,
*       your pets and loved ones, and most certainly your sleep schedule.
*       There is no guarantee of any type on performance, stability, or
*       your sanity.  Use at your own risk.
***************************************************************************/


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From a.j.martin at qmul.ac.uk  Mon Dec  1 12:47:18 2003
From: a.j.martin at qmul.ac.uk (Alex Martin)
Date: Mon, 1 Dec 2003 17:47:18 +0000
Subject: Fedora for x86_64
In-Reply-To: <854qwkmpj4.fsf@blindglobe.net>
References: <Pine.LNX.4.44.0312011535240.2673-100000@druifje.clustervision.com> <20031201085003.A28915@lnxi.com> <854qwkmpj4.fsf@blindglobe.net>
Message-ID: <200312011747.hB1HlIv21111@heppcb.ph.qmw.ac.uk>

After just installing it...It appears to be mostly 64-bit with support for 
32-bit bins...some applications e.g. openoffice don't apparently yet compile 
for x86_64.

cheers,
Alex


On Monday 01 December 2003 5:11 pm, A.J. Rossini wrote:
> Anyone know if it is a "true 64-bit" release, or a biarch (32/64), or
> just a 32bit?
>
> best,
> -tony
>
> Mike Snitzer <msnitzer at lnxi.com> writes:
> > On Mon, Dec 01 2003 at 07:37,
> >
> > John Hearns <john.hearns at clustervision.com> wrote:
> >> I saw this on the Fedora list that it has been released for x86_64
> >> http://fedora.linux.duke.edu/fc1_x86_64/
> >>
> >> I should say that I haven't tried/used this myself, just thought
> >> it would be of interest to this list.
> >
> > It should be noted that this is NOT an official Fedora Core 1 release for
> > amd64; as taken from the post to fedora-devel:
> >
> > ...
> > ISOs will not be provided for this release, but everything is there for
> > an install.
> > ...
> >
> > /************************************************************************
> >*** *       WARNING: This release is a preview, it is not an official
> > Fedora *       Core 1 Release, this is not an official Fedora Core Test
> > Release. *       This release may very well cause damage to your data,
> > your system, *       your pets and loved ones, and most certainly your
> > sleep schedule. *       There is no guarantee of any type on performance,
> > stability, or *       your sanity.  Use at your own risk.
> > *************************************************************************
> >**/
> >
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf

-- 
------------------------------------------------------------------------------
|                                                                            |
|  Dr. Alex Martin                                                           |
|  e-Mail:   a.j.martin at qmul.ac.uk        Queen Mary, University of London,  |
|  Phone :   +44-(0)20-7882-5033          Mile End Road,                     |
|  Fax   :   +44-(0)20-8981-9465          London, UK   E1 4NS                |
|                                                                            |
------------------------------------------------------------------------------
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rossini at blindglobe.net  Mon Dec  1 12:11:27 2003
From: rossini at blindglobe.net (A.J. Rossini)
Date: Mon, 01 Dec 2003 09:11:27 -0800
Subject: Fedora for x86_64
In-Reply-To: <20031201085003.A28915@lnxi.com> (Mike Snitzer's message of
 "Mon, 1 Dec 2003 08:50:03 -0700")
References: <Pine.LNX.4.44.0312011535240.2673-100000@druifje.clustervision.com>
	<20031201085003.A28915@lnxi.com>
Message-ID: <854qwkmpj4.fsf@blindglobe.net>


Anyone know if it is a "true 64-bit" release, or a biarch (32/64), or
just a 32bit?

best,
-tony

Mike Snitzer <msnitzer at lnxi.com> writes:

> On Mon, Dec 01 2003 at 07:37,
> John Hearns <john.hearns at clustervision.com> wrote:
>
>> I saw this on the Fedora list that it has been released for x86_64
>> http://fedora.linux.duke.edu/fc1_x86_64/
>> 
>> I should say that I haven't tried/used this myself, just thought
>> it would be of interest to this list.
>
> It should be noted that this is NOT an official Fedora Core 1 release for
> amd64; as taken from the post to fedora-devel:
>
> ...
> ISOs will not be provided for this release, but everything is there for
> an install.
> ...
>
> /***************************************************************************
> *       WARNING: This release is a preview, it is not an official Fedora
> *       Core 1 Release, this is not an official Fedora Core Test Release.
> *       This release may very well cause damage to your data, your system,
> *       your pets and loved ones, and most certainly your sleep schedule.
> *       There is no guarantee of any type on performance, stability, or
> *       your sanity.  Use at your own risk.
> ***************************************************************************/
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>

-- 
rossini at u.washington.edu            http://www.analytics.washington.edu/ 
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN          Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC  (M/W): 206-667-7025 FAX=206-667-4812 | use Email

CONFIDENTIALITY NOTICE: This e-mail message and any attachments may be
confidential and privileged. If you received this message in error,
please destroy it and notify the sender. Thank you.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From verycoldpenguin at hotmail.com  Tue Dec  2 06:05:03 2003
From: verycoldpenguin at hotmail.com (Gareth Glaccum)
Date: Tue, 02 Dec 2003 11:05:03 +0000
Subject: PBS/Maui problem
Message-ID: <Law15-F68Qez1enAqpf000020b9@hotmail.com>

Hi,
I have been trying to get a large cluster working, but am having
problems with PBS crashing if I submit a job with qsub asking for more
than 112 (dual processor) nodes. I have applied the patches to allow
PBS to use large numbers of nodes, but it does not seem to help.

Any ideas as to where I should look?
PBS 2.3.12,
MAUI 3.2.5 (patch 5)

Thanks,
Gareth

_________________________________________________________________
Express yourself with cool emoticons - download MSN Messenger today! 
http://www.msn.co.uk/messenger

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From verycoldpenguin at hotmail.com  Tue Dec  2 09:46:12 2003
From: verycoldpenguin at hotmail.com (Gareth Glaccum)
Date: Tue, 02 Dec 2003 14:46:12 +0000
Subject: PBS/Maui problem
Message-ID: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>

Yes, we have tried that patch, but to no avail.
We are trying to run on Suse advanced server with opterons.
Gareth

>From: Bill Wichser <bill at Princeton.EDU>
>Date: Tue, 02 Dec 2003 09:12:55 -0500
>
>The NCSA scaling patch fixed this for me.  Is this the one you applied?
>http://www-unix.mcs.anl.gov/openpbs/
>Bill

>Gareth Glaccum wrote:
>>I have been trying to get a large cluster working, but am having
>>problems with PBS crashing if I submit a job with qsub asking for more
>>than 112 (dual processor) nodes. I have applied the patches to
...
>>PBS 2.3.12,
>>MAUI 3.2.5 (patch 5)

_________________________________________________________________
Use MSN Messenger to send music and pics to your friends 
http://www.msn.co.uk/messenger

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From nashif at planux.com  Tue Dec  2 12:37:46 2003
From: nashif at planux.com (Anas Nashif)
Date: Tue, 02 Dec 2003 12:37:46 -0500
Subject: clusterworldexpo 2003 Pages!
Message-ID: <3FCCCDEA.10108@planux.com>

hi,

Any idea where can I find the old pages  of clusterworldexpo 2003, 
http://www.clusterworldexpo.com./ is a dead end at the moment! Is there 
an archive somewhere?


Anas

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From pesch at attglobal.net  Tue Dec  2 21:40:24 2003
From: pesch at attglobal.net (pesch at attglobal.net)
Date: Tue, 02 Dec 2003 18:40:24 -0800
Subject: Beowulf of bare motherboards
References: <Pine.LNX.3.96.1031124165330.22139B-100000@Maggie.Linux-Consulting.com>
Message-ID: <3FCD4D18.FE7DCD4E@attglobal.net>

We used that technique in the late nineties: one 300W PS for 4 or more
motherboards (we had 1:6 power multiplier
pc boards and cabling made). Worked well and saved lots of space. The idea
might again become interesting for the
new low power processors (VIA 1 Ghz = 7W).

To support the motherboards we used prepunched steel sheetmetal bent to fit and nylon pc guides (remember the
s-100 bus?)

Paul Schenker


Alvin Oga wrote:

> hi ya
>
> On Mon, 24 Nov 2003, Jean-Christophe Ducom wrote:
>
> > I tried to find a link to a 'old' project where people were using racks to put
> > barebone motherboards (to save the cost of the case basically).
>
> hotmail and google used those motherboard in the 19" (kingstarusa.com)
> racks  -- looks like its discontinued ??
>
> - a flat piece of (aluminum/steel) metal (from home depot/orchard) will
>   work too you know
>         - just add a couple holes on stand off for the mb and power supply
>         - or get a sheet metal shop to bend and drill a few holes w
>           rack mounting ears
>
> > It was similar to the following project but was more elaborated (it was possible
> > to pull out the bare motherboards of the shelf, etc...)
> > http://www.abo.fi/~physcomp/cluster/celeron.html
>
> i'm very interested in those systems ...
>         - to build a cluster w/ just motherboards and optionally w/ disks
>         - power supply will be simple +12vDC wall adaptor ...
>         - P4-3G equivalent mb/cpu
>
>         - it'd be a good engineering challenge :-)
>         ( big question is what holds up the back of the "caseless"
>         ( motherboards and disks
>
> c ya
> alvin
>
> > I spent hours to find it on google..without success.
> > Could anyone remember it? Please send the link.
> > Thanks a lot
>
> there are other pc104 based caseless clusters
>         http://eri.ca.sandia.gov/eri/howto.html
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From andrewxwang at yahoo.com.tw  Tue Dec  2 15:06:55 2003
From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=)
Date: Wed, 3 Dec 2003 04:06:55 +0800 (CST)
Subject: PBS/Maui problem
In-Reply-To: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>
Message-ID: <20031202200655.95901.qmail@web16801.mail.tpe.yahoo.com>

Did you try SPBS (scalable edition)?

And how did PBS fail? qsub, scheduler, server?

Andrew.

 --- Gareth Glaccum <verycoldpenguin at hotmail.com>
????
>
> Yes, we have tried that patch, but to no avail.
> We are trying to run on Suse advanced server with
> opterons.
> Gareth
> 
> >From: Bill Wichser <bill at Princeton.EDU>
> >Date: Tue, 02 Dec 2003 09:12:55 -0500
> >
> >The NCSA scaling patch fixed this for me.  Is this
> the one you applied?
> >http://www-unix.mcs.anl.gov/openpbs/
> >Bill
> 
> >Gareth Glaccum wrote:
> >>I have been trying to get a large cluster working,
> but am having
> >>problems with PBS crashing if I submit a job with
> qsub asking for more
> >>than 112 (dual processor) nodes. I have applied
> the patches to
> ...
> >>PBS 2.3.12,
> >>MAUI 3.2.5 (patch 5)
> 
>
_________________________________________________________________
> Use MSN Messenger to send music and pics to your
> friends 
> http://www.msn.co.uk/messenger
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or
> unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf 

-----------------------------------------------------------------
??? Yahoo!??
??????????????????????
http://tw.promo.yahoo.com/mail_premium/stationery.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From csamuel at vpac.org  Tue Dec  2 20:09:45 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Wed, 3 Dec 2003 12:09:45 +1100
Subject: PBS/Maui problem
In-Reply-To: <Law15-F68Qez1enAqpf000020b9@hotmail.com>
References: <Law15-F68Qez1enAqpf000020b9@hotmail.com>
Message-ID: <200312031209.53543.csamuel@vpac.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 2 Dec 2003 10:05 pm, Gareth Glaccum wrote:

> I have been trying to get a large cluster working, but am having
> problems with PBS crashing if I submit a job with qsub asking for more
> than 112 (dual processor) nodes. I have applied the patches to allow
> PBS to use large numbers of nodes, but it does not seem to help.
>
> Any ideas as to where I should look?
> PBS 2.3.12,
> MAUI 3.2.5 (patch 5)

I'd stronly suggest trying out Scalable PBS instead of OpenPBS.  It's actively 
developed and they've been fixing lots of problems that are still in OpenPBS 
and adding enhancements.

	http://www.supercluster.org/

It's freely available (they forked from an earlier OpenPBS release which had a 
more liberal license than the later ones).

cheers!
Chris
- -- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/zTfdO2KABBYQAh8RAoLAAJ94HRU9Dgu2B4fLhwQdQ2EDnp1q+gCfZHk8
utf26uf4JQL2eNVFv7vxi1c=
=AQ/L
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From deadline at linux-mag.com  Tue Dec  2 20:27:40 2003
From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine)
Date: Tue, 2 Dec 2003 20:27:40 -0500 (EST)
Subject: clusterworldexpo 2003 Pages!
In-Reply-To: <3FCCCDEA.10108@planux.com>
Message-ID: <Pine.LNX.4.44.0312022021580.12961-100000@boltzmann.basement-supercomputing.com>

On Tue, 2 Dec 2003, Anas Nashif wrote:

> hi,
> 
> Any idea where can I find the old pages  of clusterworldexpo 2003, 
> http://www.clusterworldexpo.com./ is a dead end at the moment! Is there 
> an archive somewhere?

What exactly do you need?

The www.clusterworldexpo.com site is morphing into the 2004 meeting site.
ClusterWorld Expo will be held on April 5-8, 2004, keynotes include
Tom Sterling, Ian Foster, and Dave Turek.

Doug

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From csamuel at vpac.org  Tue Dec  2 20:12:38 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Wed, 3 Dec 2003 12:12:38 +1100
Subject: PBS/Maui problem
In-Reply-To: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>
References: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>
Message-ID: <200312031212.39775.csamuel@vpac.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 3 Dec 2003 01:46 am, Gareth Glaccum wrote:

> We are trying to run on Suse advanced server with opterons.

Here's a quote from the Scalable PBS guys from the mailing list:

[quote]

  The next release of SPBS is under testing and is currently available as 
a snapshot in the spbs/temp download directory.  This snapshot 
incorporates a number of patches which assist in the following areas:

SUSE Linux support
IA64 support
large job support 
readline support in qmgr
support for very large node memory and filesystems
correct ncpus reporting

  Many thanks go out to NCSA and the TeraGrid team for their excellent 
help in identifing and correcting a number of remaining high-end scaling 
issues found within SPBS.  

  Please let us know if any issues are discovered with this release and 
please keep the patches coming!

[/quote]

- -- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/zTiGO2KABBYQAh8RAuppAJ9LGg7Pj7MLlT1MSb2oW2WABWB4CgCdF7Dq
Tq4fnxlcaDA/5vIGCf9QNeQ=
=YwfO
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From nashif at planux.com  Tue Dec  2 22:12:56 2003
From: nashif at planux.com (Anas Nashif)
Date: Tue, 02 Dec 2003 22:12:56 -0500
Subject: clusterworldexpo 2003 Pages!
In-Reply-To: <Pine.LNX.4.44.0312022021580.12961-100000@boltzmann.basement-supercomputing.com>
References: <Pine.LNX.4.44.0312022021580.12961-100000@boltzmann.basement-supercomputing.com>
Message-ID: <3FCD54B8.8070805@planux.com>


Douglas Eadline, Cluster World Magazine wrote:
> On Tue, 2 Dec 2003, Anas Nashif wrote:
> 
> 
>>hi,
>>
>>Any idea where can I find the old pages  of clusterworldexpo 2003, 
>>http://www.clusterworldexpo.com./ is a dead end at the moment! Is there 
>>an archive somewhere?
> 
> 
> What exactly do you need?
> 
Everything :-)
I'd like to see who talked there and to see what talk were given etc. 
Its always good to have some kind of archive with the program of old 
conferences, for example something like www.supercomp.org.


> The www.clusterworldexpo.com site is morphing into the 2004 meeting site.
> ClusterWorld Expo will be held on April 5-8, 2004, keynotes include
> Tom Sterling, Ian Foster, and Dave Turek.
> 

Yes, I could see that on the new page, but as I said, its a dead end, no 
links to anything there...


Thanks,

Anas

> Doug
> 
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From alvin at Mail.Linux-Consulting.com  Wed Dec  3 04:27:24 2003
From: alvin at Mail.Linux-Consulting.com (Alvin Oga)
Date: Wed, 3 Dec 2003 01:27:24 -0800 (PST)
Subject: Beowulf of bare motherboards
In-Reply-To: <Pine.LNX.4.44.0312030958010.9015-100000@druifje.clustervision.com>
Message-ID: <Pine.LNX.3.96.1031203011718.21885A-100000@Maggie.Linux-Consulting.com>


hi ya john

On Wed, 3 Dec 2003, John Hearns wrote:

> Someone mention VIA mini-ITXes?
> If I could have the resources, I wouldn't fan out a single PSU
> to several mini-ITX boards. It would be cheap, but introduce a single
> point of failure, and you'd have to cobble somthing together to
> deal with ATX power on/off.

single point of failures is not acceptable if the cost of that
item is small compared to the "overall system"
	- hvac, public utilty point of failure is harder
	to avoid, but can be avoided w/ a data center setup
	on the opposite side of the country

> Funds permitting, one of the small 12V DC-DC PSU per board.

you can use a simple wall adaptor to +12v adaptor
	and a +12vdc to +{various-atx} voltage dc-dc convertor

	www.mini-itx.com sells their proprietory +12v dc-dc convertors
	( $50ea range ) and we're debating what the "cluster/blade" of
	mini-itx mb should look like when its mounted in a standard rack
	or custom rack .. and why one way is better than another .. fun
	stuff ..

- if you want a p4-3Ghz in a mini-itx form factor, than we're back
  to only one mb manufacturer :-)

> Then run a high current 12V supply along the rack.
> Simple cheap relay would do the job of power cycling  also.

"relays" has had the worst reliability of any electromechanical part
( so its been long replaced by transistors :-) especially at high
( currents and low/medium voltages

> On the VIA front, the smaller  nano-ITX form factor boards are due soon.
> Could make nice building blocks.

those nano-itx mb is due out (in production) around may/june time frame
??

c ya
alvin

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From john.hearns at clustervision.com  Wed Dec  3 04:02:56 2003
From: john.hearns at clustervision.com (John Hearns)
Date: Wed, 3 Dec 2003 10:02:56 +0100 (CET)
Subject: Beowulf of bare motherboards
In-Reply-To: <3FCD4D18.FE7DCD4E@attglobal.net>
Message-ID: <Pine.LNX.4.44.0312030958010.9015-100000@druifje.clustervision.com>

On Tue, 2 Dec 2003 pesch at attglobal.net wrote:

> We used that technique in the late nineties: one 300W PS for 4 or more
> motherboards (we had 1:6 power multiplier
> pc boards and cabling made). Worked well and saved lots of space. The idea
> might again become interesting for the
> new low power processors (VIA 1 Ghz = 7W).
> 
Someone mention VIA mini-ITXes?
If I could have the resources, I wouldn't fan out a single PSU
to several mini-ITX boards. It would be cheap, but introduce a single
point of failure, and you'd have to cobble somthing together to
deal with ATX power on/off.
Funds permitting, one of the small 12V DC-DC PSU per board.
Then run a high current 12V supply along the rack.
Simple cheap relay would do the job of power cycling  also.

On the VIA front, the smaller  nano-ITX form factor boards are due soon.
Could make nice building blocks.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From derek.richardson at pgs.com  Wed Dec  3 11:27:59 2003
From: derek.richardson at pgs.com (Derek Richardson)
Date: Wed, 03 Dec 2003 10:27:59 -0600
Subject: Opteron kernel
In-Reply-To: <3FCD0CEE.80908@seismiccity.com>
References: <Pine.LNX.4.44.0311251500590.4408-100000@training.scyld.com> <3FC4EFB3.10708@pgs.com> <3FCD0CEE.80908@seismiccity.com>
Message-ID: <3FCE0F0F.9020407@pgs.com>

Claude,
I'm thinking there is a lot of potential for optimization is the x86-64 
architecture.  Two different versions of our code ( they have slightly 
differing code and were compiled w/ same GNU compilers but using 
different flags ) had a large performance difference.  One version ran 
at ~ 85% of the speed of the P4 gear, and another at ~ 140% of P4 gear ( 
dual Xeon 3.06 GHz boxen ).  Having found this out two days ago and 
spent all of yesterday repairing some dead nodes, I haven't had a chance 
to chase the testing up ( find out which flags, code differences, etc. 
).  We are planning on doing a run w/ the same code base, but the 
changed compiler flags.  That should bring out whether it is the code 
changes, or the compiler flags.  My guess would be the compiler flags, 
but I don't know ( yet ) what changes were made in the code itself.  
There's also some pre-fetching optimization work that can be done as 
well, so things are looking a bit brighter.
As a side note, AMD recommends the SUSE 64 bit kernel ( apparently even 
for non-SUSE, non-64bit OSes like RedHat ).  I don't know where they 
stand on RH Advanced Whatchamadoodle vs. SUSE, but I'll have to sort 
that out in the future, if we actually ever get around to getting some 
Opterons ( our stance has been that they have to outperform the P4 Xeon 
gear using the same code and OS, then we'll worry about seriously 
optimizing ).
I suppose I'll let everyone know when we discover what made such a large 
difference.
Regards,
Derek R.

Claude Pignol wrote:

>
>
> Derek Richardson wrote:
>
>> Donald,
>> Sorry for the late reply, bloody Exchange server didn't drop it in my 
>> inbox until late this morning.  Memory and scheduling would probably 
>> be the biggest factor.  Processor affinity doesn't matter as much, 
>> because in my experience we haven't had problems w/ processes 
>> bouncing between CPUs.  PCI bus is almost a non-issue, since our 
>> application is embarassingly parallel and therefore has no need for > 
>> 100 Mbit ethernet, and there is no disk on a PCI-attached controller, 
>> so we have very little information passing over the PCI bus.
>> By interleaving, I assume you mean at the physical level, which I had 
>> a quick peek at when we got the system ( it's an IBM eServer 325, a 
>> loaner for testing ) and I assumed to be correct.  But given the poor 
>> performance I have seen ( 2 GHz Opterons coming in at ~15% slower 
>> than a 3 GHz P4 on a compute/memory intensive application when most 
>> benchmarks I have seen would imply the inverse ), I will double-check 
>> that when given a chance. 
>
> I have the same conclusion concerning the performance. I haven't seen 
> on our application (floating point and memory  intensive) the speed up 
> that we could expect from the SPEC benchmark.
> (using gcc 3.3 Kernel NUMA  bank interleaving ON CPU interleaving OFF)
> The problem is probably due to the compiler that doesn't generate a 
> very optimized code on common application.
> It seems that the price performance ratio is still in favor of Xeon 
> for dual processor machine.
>
>>
>> I will probably just try the latest 2.6 kernel and a few other tweaks 
>> as well, and AMD has also offerred help, but that would more likely 
>> be at the application layer ( which I don't have control of, 
>> unfortunately ).
>> Thanks for the response, and my apologies for the vagueness of the 
>> question.
>> Derek R.
>>
>> Donald Becker wrote:
>>
>>> On Mon, 24 Nov 2003, Derek Richardson wrote:
>>>
>>>  
>>>
>>>> Does anyone know where to find info on tuning the linux kernel for 
>>>> Opterons?  Googling hasn't turned up much useful information.
>>>>   
>>>
>>>
>>> What type of tuning?
>>> PCI bus transactions (the Itanium required more, but the Opteron still
>>> benefits)?  Scheduling?  Processor affinity?  What kernel version?
>>> If you ask specific questions, there is likely someone on the list that
>>> knows the specific answer.
>>>
>>> The easiest performance improvement comes from proper memory DIMM
>>> configuration to match the application layout.  Each processor has its
>>> own local memory controller, and understanding how the memory slots are
>>> filled and the options e.g. interleave can make a 30% difference on a
>>> dual processor system.
>>>
>>>  
>>>
>>
>
> -- 
> ------------------------------------------------------------------------
> Claude Pignol 	SeismicCity, Inc. <http://www.seismiccity.com>
> 2900 Wilcrest Dr.    Suite 370 	 Houston TX 77042
> Phone:832 251 1471 Mob:281 703 2933 	 Fax:832 251 0586
>
>

-- 
Linux Administrator
derek.derekson at pgs.com
derek.derekson at ieee.org
Office 713-781-4000
Cell 713-817-1197
Disease can be cured; fate is incurable.
		-- Chinese proverb


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From raysonlogin at yahoo.com  Wed Dec  3 17:50:39 2003
From: raysonlogin at yahoo.com (Rayson Ho)
Date: Wed, 3 Dec 2003 14:50:39 -0800 (PST)
Subject: Scalable PBS (was:  PBS/Maui problem)
In-Reply-To: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>
Message-ID: <20031203225039.79037.qmail@web11404.mail.yahoo.com>

While Scalable PBS is technically better than OpenPBS, I found that it
is actually less open than other batch systems (condor, OpenPBS, SGE)

All "scalablepbsusers" mail messages are filtered by hand by Cluster
Resource INC. This creates significant delays to the mail response
rate.

All major lists are not filtered by hand, I just don't understand the
reasons of doing that...

BTW, anyone on that list but is not encountering the same experience??

Rayson


__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From sean at asacomputers.com  Wed Dec  3 18:19:23 2003
From: sean at asacomputers.com (Sean)
Date: Wed, 03 Dec 2003 15:19:23 -0800
Subject: U320 and 64 bit Itanium
In-Reply-To: <20031203225039.79037.qmail@web11404.mail.yahoo.com>
References: <Law15-F82T9IeV3jRFx00002c84@hotmail.com>
Message-ID: <5.1.0.14.2.20031203151755.02fb1aa0@pop.asacomputers.com>

Can somebody suggest us where to get the U320 drivers for  64 bit Redhat 
Linux that will work with the Itanium solution ?

Thanks and Regards
Sean
ASA Computers Inc.
2354, Calle Del Mundo
Santa Clara CA 95054
Telephone : (408) 654-2901 xtn 205
                   (408) 654-2900 ask for Sean
                   (800) REAL-PCS (1-800-732-5727)
Fax:            (408) 654-2910
E-mail : sean at asacomputers.com
URL    : http://www.asacomputers.com


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rokrau at yahoo.com  Wed Dec  3 21:24:47 2003
From: rokrau at yahoo.com (Roland Krause)
Date: Wed, 3 Dec 2003 18:24:47 -0800 (PST)
Subject: problem allocating large amount of memory
Message-ID: <20031204022447.32578.qmail@web40014.mail.yahoo.com>

Hi all,
I am trying to allocate a continuous chunk of memory of more than
2GBytes using malloc(). 

My sytem is a Microway Dual Athlon node with 4GB of physical RAM. The
kernel identifies itself as Redhat-2.4.20 (it runs RH-9). It has been
compiled with the CONFIG_HIGHMEM4G and CONFIG_HIGHMEM options turned
on. 

Here is what I _am_ able to do. Using a little test program that I have
written I can pretty much get 3 GB of memory allocated in chunks. The
largest chunk is 2,143 GBytes, then one of 0.939 GBytes size and
finally some smaller chunks of 10MBytes. So the total amount of memory
I can get is close enough to the promised 3G/1G split which is well
documented on the net. 

What I am not able to do currently is to get the 2.95GB all at once.
"But I must have it all."

I have set the overcommit_memory kernel parameter to 1 already but that
that doesn't seem to change anything. 

Also has someone experience with the various kernel patches for large
memory out there (im's 4G/4G or IBM's 3.5G/0.5G hack)? 

I would be very greatful for any kind of advice with regards to this
problem. I am certain that more people here must have the same problem.


Best regards,
Roland


__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Wed Dec  3 22:31:20 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Wed, 3 Dec 2003 19:31:20 -0800
Subject: problem allocating large amount of memory
In-Reply-To: <20031204022447.32578.qmail@web40014.mail.yahoo.com>
References: <20031204022447.32578.qmail@web40014.mail.yahoo.com>
Message-ID: <20031204033120.GJ20846@cse.ucdavis.edu>

ACK, sorry, I missed the mention of running Redhat-9.

Do you have an example program?

Did you link static or dynamic?

Is it possible your process has 0.05GB of memory used in some other
way?

-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Thu Dec  4 00:49:50 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Thu, 4 Dec 2003 00:49:50 -0500 (EST)
Subject: problem allocating large amount of memory
In-Reply-To: <20031204022447.32578.qmail@web40014.mail.yahoo.com>
Message-ID: <Pine.LNX.4.44.0312032353040.7748-100000@coffee.psychology.mcmaster.ca>

> Here is what I _am_ able to do. Using a little test program that I have
> written I can pretty much get 3 GB of memory allocated in chunks. The
> largest chunk is 2,143 GBytes, then one of 0.939 GBytes size and
> finally some smaller chunks of 10MBytes. So the total amount of memory

yes.  unless you are quite careful, your address space looks like this:

0-128M		zero page
128M + small	program text
		sbrk heap (grows up)
1GB		mmap arena (grows up)
3GB - small	stack base (grows down)
3GB-4GB		kernel direct-mapped area

your ~1GB is allocated in the sbrk heap (above text, below 1GB).
the ~2GB is allocated in the mmap arena (glibc puts large allocations
there, if possible, since you can munmap arbitrary pages, but heaps can 
only rarely shrink).

interestingly, you can avoid the mmap arena entirely if you try (static linking,
avoid even static stdio).  that leaves nearly 3 GB available for the heap or stack.  
also interesting is that you can use mmap with MAP_FIXED to avoid the default 
mmap-arena at 1GB.  the following code demonstrates all of these.  the last time
I tried, you could also move around the default mmap base (TASK_UNMAPPED_BASE,
and could squeeze the 3G barier, too (TASK_SIZE).  I've seen patches to make 
TASK_UNMAPPED_BASE a /proc setting, and to make the mmap arena grow down
(which lets you start it at a little under 3G, leaving a few hundred MB for stack).
finally, there is a patch which does away with the kernel's 1G chunk entirely
(leaving 4G:4G, but necessitating some nastiness on context switches)


#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

void print(char *message) {
    unsigned l = strlen(message);
    write(1,message,l);
}
void printuint(unsigned u) {
    char buf[20];
    char *p = buf + sizeof(buf) - 1;
    *p-- = 0;
    do {
        *p-- = "0123456789"[u % 10];
        u /= 10;
    } while (u);
    print(p+1);
}

int main() {
#if 1
//    unsigned chunk = 128*1024;                                                
    unsigned chunk = 124*1024;
    unsigned total = 0;
    void *p;

    while (p = malloc(chunk)) {
        total += chunk;
        printuint(total);
        print("MB\t: ");
        printuint((unsigned)p);
        print("\n");
    }
#else
    unsigned offset = 150*1024*1024;
    unsigned size = (unsigned) 3e9;
    void *p = mmap((void*) offset,
                   size,
                   PROT_READ|PROT_WRITE,
                   MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS,
                   0,0);
    printuint(size >> 20);
    print(" MB\t: ");
    printuint((unsigned) p);
    print("\n");
#endif
    return 0;
}

> Also has someone experience with the various kernel patches for large
> memory out there (im's 4G/4G or IBM's 3.5G/0.5G hack)? 

there's nothing IBM-specific about 3.5/.5, that's for sure.

as it happens, I'm going to be doing some measurements of performance soon.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Thu Dec  4 10:39:24 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Thu, 4 Dec 2003 07:39:24 -0800
Subject: problem allocating large amount of memory
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF78@orsmsx402.jf.intel.com>

From: Mark Hahn; Sent: Wednesday, December 03, 2003 9:50 PM
> 
> From: Roland Krause; Sent: Wednesday, December 03, 2003 6:25 PM
> > Here is what I _am_ able to do. Using a little test program that I
have
> > written I can pretty much get 3 GB of memory allocated in chunks.
The
> > largest chunk is 2,143 GBytes, then one of 0.939 GBytes size and
> > finally some smaller chunks of 10MBytes. So the total amount of
memory

The 2.143 GB chunk is above TASK_UNMAPPED_BASE and the 0.939 chunk is
below TASK_UNMAPPED_BASE.

> yes.  unless you are quite careful, your address space looks like
this:
> 
> 0-128M		zero page
> 128M + small	program text
> 		sbrk heap (grows up)
> 1GB		mmap arena (grows up)
> 3GB - small	stack base (grows down)
> 3GB-4GB		kernel direct-mapped area
> 
> your ~1GB is allocated in the sbrk heap (above text, below 1GB).
> the ~2GB is allocated in the mmap arena (glibc puts large allocations
> there, if possible, since you can munmap arbitrary pages, but heaps
can
> only rarely shrink).

Right.

> interestingly, you can avoid the mmap arena entirely if you try
(static
> linking,
> avoid even static stdio).  that leaves nearly 3 GB available for the
heap
> or stack.

Interesting, never tried static linking.  While I worked with an app
that needed dynamic linking, this is an experiment I will certainly try.

> also interesting is that you can use mmap with MAP_FIXED to avoid the
> default
> mmap-arena at 1GB.  the following code demonstrates all of these.  the
> last time
> I tried, you could also move around the default mmap base
> (TASK_UNMAPPED_BASE,
> and could squeeze the 3G barier, too (TASK_SIZE).  I've seen patches
to
> make
> TASK_UNMAPPED_BASE a /proc setting, and to make the mmap arena grow
down
> (which lets you start it at a little under 3G, leaving a few hundred
MB
> for stack).

Prior to RH 7.3, you could use one of the extant TASK_UNMAPPED_BASE
patches to address this problem.  I always used the patch to move
TASK_UNMAPPED_BASE UP, so that the brk() area (the 0.939 chunk above)
could get larger.  I could reliably get this up to about 2.2 GB or so
(on a per-process basis). The original requestor would want to move
TASK_UNMAPPED_BASE DOWN, so that the first big malloc() could be larger.


Starting at RH Linux 7.3, Red Hat prelinked glibc to the fixed value of
TASK_UNMAPPED_BASE so that moving TASK_UNMAPPED_BASE around only caused
heartache and despair, a.k.a., you app crashed and burned as you
trampled over glibc.

I have rebuilt a few pairs of RH kernels and glibc's to add the kernel
patch and not prelink glibc, thereby restoring the wonders of the
per-process TASK_UNMAPPED_BASE patch.  But, this must be done to both
the kernel and glibc.

So, the biggest issue in an unpatched RH world is not the user app, but
glibc.

> finally, there is a patch which does away with the kernel's 1G chunk
> entirely
> (leaving 4G:4G, but necessitating some nastiness on context switches)

This is something I want to look at, to quantify how bad it actually is.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Wed Dec  3 22:28:33 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Wed, 3 Dec 2003 19:28:33 -0800
Subject: problem allocating large amount of memory
In-Reply-To: <20031204022447.32578.qmail@web40014.mail.yahoo.com>
References: <20031204022447.32578.qmail@web40014.mail.yahoo.com>
Message-ID: <20031204032833.GI20846@cse.ucdavis.edu>

On Wed, Dec 03, 2003 at 06:24:47PM -0800, Roland Krause wrote:
> What I am not able to do currently is to get the 2.95GB all at once.
> "But I must have it all."

A small example program is useful.

I'll include one that works for me.  Here's the output of the run:

[root at quad root]# gcc -Wall -Wno-long-long -pedantic memfill.c -o memfill && ./memfill
Array size of 483183820 doubles (3.60 GB) allocated
Initialized 1GB.
Initialized 1GB.
Initialized 1GB.
Initialized 1GB.
Sleeping for 60 seconds so you can check top.

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
17182 root      25   0 3584M 3.5G   340 S     0.0 48.4   0:10   3 memfill
 
I'll attach my source.  This particular machine has 8GB ram, but it
would be kinda strange for this to fall just because it's virtual.

You do have enough swap right?

-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
-------------- next part --------------
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define RAM_USED 3.6   /* 3.6 GB */
#define GB 1073741824  /* bytes per GB */

int main()
{
	double *x;
	long long i;
	long long array_size;

	array_size=RAM_USED*GB/sizeof(double);	
	x=malloc(RAM_USED*GB);
	if (x)
	{
		printf ("Array size of %lld doubles (%3.2f GB) allocated\n",array_size,RAM_USED);
		for (i=0;i<array_size;i++)
		{
			x[i]=i;
			if (i%(GB/sizeof(double))==0)
			{ 
				printf ("Initialized 1GB.\n");
			}
		}
		printf ("Sleeping for 60 seconds so you can check top.\n");
		sleep(60);
		return(0);
	}	
	printf ("We couldn't allocate an array of %3.2f GB!\n",RAM_USED);
	return(-1);
}
	
-------------- next part --------------
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define RAM_USED 3.6   /* 3.6 GB */
#define GB 1073741824  /* bytes per GB */

int main()
{
	double *x;
	long long i;
	long long array_size;

	array_size=RAM_USED*GB/sizeof(double);	
	x=malloc(RAM_USED*GB);
	if (x)
	{
		printf ("Array size of %lld doubles (%3.2f GB) allocated\n",array_size,RAM_USED);
		for (i=0;i<array_size;i++)
		{
			x[i]=i;
			if (i%(GB/sizeof(double))==0)
			{ 
				printf ("Initialized 1GB.\n");
			}
		}
		printf ("Sleeping for 60 seconds so you can check top.\n");
		sleep(60);
		return(0);
	}	
	printf ("We couldn't allocate an array of %3.2f GB!\n",RAM_USED);
	return(-1);
}
	

From jrdm at sdf.lonestar.org  Thu Dec  4 12:27:42 2003
From: jrdm at sdf.lonestar.org (Linux Guy)
Date: Thu, 4 Dec 2003 17:27:42 +0000 (UTC)
Subject: maui scheduler
Message-ID: <Pine.NEB.4.58.0312041717130.5995@otaku.freeshell.org>


Will the real Maui Scheduler please stand up?

How many maui's are out there?

http://sourceforge.net/projects/mauischeduler/
http://sourceforge.net/projects/mauisched/
http://supercluster.org/maui/

others?

I thought this was a MHPCC project?

--
jrdm at sdf.lonestar.org
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From henken at seas.upenn.edu  Thu Dec  4 13:35:15 2003
From: henken at seas.upenn.edu (Nicholas Henke)
Date: Thu, 04 Dec 2003 13:35:15 -0500
Subject: maui scheduler
In-Reply-To: <Pine.NEB.4.58.0312041717130.5995@otaku.freeshell.org>
References: <Pine.NEB.4.58.0312041717130.5995@otaku.freeshell.org>
Message-ID: <1070562915.28739.20.camel@roughneck.liniac.upenn.edu>

On Thu, 2003-12-04 at 12:27, Linux Guy wrote:
> Will the real Maui Scheduler please stand up?
> 
> How many maui's are out there?
> 
> http://sourceforge.net/projects/mauischeduler/
> http://sourceforge.net/projects/mauisched/
> http://supercluster.org/maui/
> 

The 'real' one is supercluster.org.

Nic
-- 
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rokrau at yahoo.com  Thu Dec  4 14:40:30 2003
From: rokrau at yahoo.com (Roland Krause)
Date: Thu, 4 Dec 2003 11:40:30 -0800 (PST)
Subject: problem allocating large amount of memory
In-Reply-To: <20031204033120.GJ20846@cse.ucdavis.edu>
Message-ID: <20031204194030.88760.qmail@web40006.mail.yahoo.com>

Bill,
thanks a lot for your help. 
Please find attached a little test program. I use

g++ -O -Wall memchk.cpp -static -o memchk

Afaik size_t is unsigned long on 32 bit systems and long long is the
same.

I've linked the code first dynamic then static with no differences in
the amount I am getting. 

Roland


--- Bill Broadley <bill at cse.ucdavis.edu> wrote:
> ACK, sorry, I missed the mention of running Redhat-9.
> 
> Do you have an example program?
> 
> Did you link static or dynamic?
> 
> Is it possible your process has 0.05GB of memory used in some other
> way?
> 
> -- 
> Bill Broadley
> Information Architect
> Computational Science and Engineering
> UC Davis

__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: memchk.cpp
Type: text/x-c++src
Size: 636 bytes
Desc: memchk.cpp
URL: <http://www.clustermonkey.net/pipermail/beowulf/attachments/20031204/9137ff92/attachment-0001.bin>

From josip at lanl.gov  Thu Dec  4 15:05:17 2003
From: josip at lanl.gov (Josip Loncaric)
Date: Thu, 04 Dec 2003 13:05:17 -0700
Subject: problem allocating large amount of memory
In-Reply-To: <Pine.LNX.4.44.0312032353040.7748-100000@coffee.psychology.mcmaster.ca>
References: <Pine.LNX.4.44.0312032353040.7748-100000@coffee.psychology.mcmaster.ca>
Message-ID: <3FCF937D.5070109@lanl.gov>

In addition to Mark's very helpful address space layout, you may want to 
consult this web page:

http://www.intel.com/support/performancetools/c/linux/2gbarray.htm

which saye:

"The maximum size of an array that can be created by Intel? IA-32 
compilers is 2 GB."

due to the fact that:

"The default Linux* kernel on IA-32 loads shared libraries at 1 GB, 
which limits the contiguous address space available to your program. You 
will get a load time error if your program + static data exceed this."

Intel offers several helpful hints on being able to declare larger 
arrays (e.g. -static linking, etc.).

Sincerely,
Josip

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Thu Dec  4 18:05:57 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Thu, 4 Dec 2003 15:05:57 -0800
Subject: problem allocating large amount of memory
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF82@orsmsx402.jf.intel.com>

From: Josip Loncaric; Sent: Thursday, December 04, 2003 12:05 PM
> 
> In addition to Mark's very helpful address space layout, you may want
to
> consult this web page:
> 
> http://www.intel.com/support/performancetools/c/linux/2gbarray.htm
> 
> which saye:
> 
> "The maximum size of an array that can be created by Intel(r) IA-32
> compilers is 2 GB."

Using the Intel or gcc compilers, a TASK_UNMAPPED_BASE patch, and some
other fiddling, you can create a larger array via brk(2), or (I assume)
malloc(3), and use a larger array.
 
> due to the fact that:
> 
> "The default Linux* kernel on IA-32 loads shared libraries at 1 GB,
> which limits the contiguous address space available to your program.
You
> will get a load time error if your program + static data exceed this."

Again, back to the TASK_UNMAPPED_BASE patch and glibc fiddling.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Thu Dec  4 20:06:59 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Thu, 4 Dec 2003 20:06:59 -0500 (EST)
Subject: problem allocating large amount of memory
In-Reply-To: <20031205003627.2288.qmail@web40013.mail.yahoo.com>
Message-ID: <Pine.LNX.4.44.0312041949590.14254-100000@coffee.psychology.mcmaster.ca>

> I've tried your code and, yes, I am able to allocate up to 3G of memory
> in 124K chunks.

I probably should have commented on the code a bit more.  it demonstrates
three separate things: that for <128K allocations, libc uses the heap
first, then when that fills (hits the mmap arena) it switches to allocating
in the mmap arena.  if allocations are 128K or more, it *starts* in the 
mmap arena (since mmap has advantages when doing large allocations - munmap).  
finally, if you statically link and avoid the use of stdio,
you can make one giant allocation from the end of text up to stack.

you can't make that one giant allocation with malloc, though, simply
because glibc has this big-alloc-via-mmap policy.  I dimly recall that 
you can change this behavior at runtime.

> Unfortunately this doesn't not help me because the
> memory needed is allocated for a large software package, written in
> Fortran, that makes heavy use of all kinds of libraries (libc among
> others) over which I have no control. 

I'd suggest moving TASK_UNMAPPED_BASE down, and possibly going to a 
3.5 or 4GB userspace.  I think I also mentioned there's a patch to make
the mmap arena grow down - start it below your max stack extent, and 
let it grow towards the heap.

> Also, if I change your code to try to allocate the available memory in
> one chunk I am obviously in the same situation as before. If I
> understand you correctly, this is because small chunks of memory are
> allocated with sbrk, large ones with mmap.

right, though that's a purely user-space choice, nothing to do with the OS.

> I notice from the output of
> your program that the allocated memory is also not in a contiguous
> block. 

the demo program operates in three modes, one of which is a single chunk,
the other is a contiguous series of small chunks, and the other is two 
series of chunks.

> This must be because Redhat's prelinking of glibc to a fixed
> address in memory as noted by David Lombard. 

as I mentioned, this is irrelevant if you link statically.

> What I dont understand at all then is why your second code example
> (mmap) is able to return
>  2861 MB : 157286400
> or even more memory upon changing size to 4.e9. Isn't this supposently
> simply overwriting the area where glibc is in?

if you link my demo statically, there *is* no mmaped glibc chopping
up the address space.

> Will that prevent me from using stdio. 

stdio (last time I checked) used mmap even when statically linked - 
a single page, presumably a conversion buffer.  you'd have to check the 
source to see whether that can be changed.  I presume it's trying to 
initialize the buffer before the malloc heap is set up, or something 
like that.

> There is no problem linking
> statically for me. I am doing that for other reasons anyway. 

remember, no one says you have to use glibc...

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From andrewxwang at yahoo.com.tw  Thu Dec  4 20:15:54 2003
From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=)
Date: Fri, 5 Dec 2003 09:15:54 +0800 (CST)
Subject: maui scheduler
In-Reply-To: <1070562915.28739.20.camel@roughneck.liniac.upenn.edu>
Message-ID: <20031205011554.53403.qmail@web16811.mail.tpe.yahoo.com>

Not trying to say which one is real, which one is not,
but just want to provide a link:

http://bohnsack.com/lists/archives/xcat-user/2385.html

Further, the one from supercluster.org is the most
popular one, and is the safest choice.

Andrew.

--- Nicholas Henke <henken at seas.upenn.edu> ????
> On Thu, 2003-12-04 at 12:27, Linux Guy wrote:
> > Will the real Maui Scheduler please stand up?
> > 
> > How many maui's are out there?
> > 
> > http://sourceforge.net/projects/mauischeduler/
> > http://sourceforge.net/projects/mauisched/
> > http://supercluster.org/maui/
> > 
> 
> The 'real' one is supercluster.org.
> 
> Nic
> -- 
> Nicholas Henke
> Penguin Herder & Linux Cluster System Programmer
> Liniac Project - Univ. of Pennsylvania
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or
> unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf 

-----------------------------------------------------------------
??? Yahoo!??
??????????????????????
http://tw.promo.yahoo.com/mail_premium/stationery.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rokrau at yahoo.com  Thu Dec  4 19:36:27 2003
From: rokrau at yahoo.com (Roland Krause)
Date: Thu, 4 Dec 2003 16:36:27 -0800 (PST)
Subject: problem allocating large amount of memory
In-Reply-To: <Pine.LNX.4.44.0312032353040.7748-100000@coffee.psychology.mcmaster.ca>
Message-ID: <20031205003627.2288.qmail@web40013.mail.yahoo.com>

Mark,
thanks a lot for your helpful comments.

So, now I am somewhat more confused :-) 

I've tried your code and, yes, I am able to allocate up to 3G of memory
in 124K chunks. Unfortunately this doesn't not help me because the
memory needed is allocated for a large software package, written in
Fortran, that makes heavy use of all kinds of libraries (libc among
others) over which I have no control. 

Also, if I change your code to try to allocate the available memory in
one chunk I am obviously in the same situation as before. If I
understand you correctly, this is because small chunks of memory are
allocated with sbrk, large ones with mmap. I notice from the output of
your program that the allocated memory is also not in a contiguous
block. This must be because Redhat's prelinking of glibc to a fixed
address in memory as noted by David Lombard. 

What I dont understand at all then is why your second code example
(mmap) is able to return
 2861 MB : 157286400
or even more memory upon changing size to 4.e9. Isn't this supposently
simply overwriting the area where glibc is in? That confuses me now.
Will that prevent me from using stdio. There is no problem linking
statically for me. I am doing that for other reasons anyway. 

Best regards and many thanks for your input. 
Roland


--- Mark Hahn <hahn at physics.mcmaster.ca> wrote:
> 
> yes.  unless you are quite careful, your address space looks like
> this:
> 
> 0-128M		zero page
> 128M + small	program text
> 		sbrk heap (grows up)
> 1GB		mmap arena (grows up)
> 3GB - small	stack base (grows down)
> 3GB-4GB		kernel direct-mapped area
> 
> your ~1GB is allocated in the sbrk heap (above text, below 1GB).
> the ~2GB is allocated in the mmap arena (glibc puts large allocations
> there, if possible, since you can munmap arbitrary pages, but heaps
> can 
> only rarely shrink).
> 
> interestingly, you can avoid the mmap arena entirely if you try
> (static linking,
> avoid even static stdio).  that leaves nearly 3 GB available for the
> heap or stack.  
> also interesting is that you can use mmap with MAP_FIXED to avoid the
> default 
> mmap-arena at 1GB.  the following code demonstrates all of these. 
> the last time
> I tried, you could also move around the default mmap base
> (TASK_UNMAPPED_BASE,
> and could squeeze the 3G barier, too (TASK_SIZE).  I've seen patches
> to make 
> TASK_UNMAPPED_BASE a /proc setting, and to make the mmap arena grow
> down
> (which lets you start it at a little under 3G, leaving a few hundred
> MB for stack).
> finally, there is a patch which does away with the kernel's 1G chunk
> entirely
> (leaving 4G:4G, but necessitating some nastiness on context switches)
> 
> 
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/mman.h>
> 
> void print(char *message) {
>     unsigned l = strlen(message);
>     write(1,message,l);
> }
> void printuint(unsigned u) {
>     char buf[20];
>     char *p = buf + sizeof(buf) - 1;
>     *p-- = 0;
>     do {
>         *p-- = "0123456789"[u % 10];
>         u /= 10;
>     } while (u);
>     print(p+1);
> }
> 
> int main() {
> #if 1
> //    unsigned chunk = 128*1024;                                     
>           
>     unsigned chunk = 124*1024;
>     unsigned total = 0;
>     void *p;
> 
>     while (p = malloc(chunk)) {
>         total += chunk;
>         printuint(total);
>         print("MB\t: ");
>         printuint((unsigned)p);
>         print("\n");
>     }
> #else
>     unsigned offset = 150*1024*1024;
>     unsigned size = (unsigned) 3e9;
>     void *p = mmap((void*) offset,
>                    size,
>                    PROT_READ|PROT_WRITE,
>                    MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS,
>                    0,0);
>     printuint(size >> 20);
>     print(" MB\t: ");
>     printuint((unsigned) p);
>     print("\n");
> #endif
>     return 0;
> }
> 
> > Also has someone experience with the various kernel patches for
> large
> > memory out there (im's 4G/4G or IBM's 3.5G/0.5G hack)? 
> 
> there's nothing IBM-specific about 3.5/.5, that's for sure.
> 
> as it happens, I'm going to be doing some measurements of performance
> soon.
> 

__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From csamuel at vpac.org  Thu Dec  4 20:38:16 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Fri, 5 Dec 2003 12:38:16 +1100
Subject: LONG RANT [RE: RHEL Copyright Removal]
In-Reply-To: <20031125013008.GA6416@sphere.math.ucdavis.edu>
References: <0B27450D68F1D511993E0001FA7ED2B3036EE4F8@ukjhmbx12.ukjh.zeneca.com> <1069682488.2179.127.camel@scalable> <20031125013008.GA6416@sphere.math.ucdavis.edu>
Message-ID: <200312051238.17633.csamuel@vpac.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 25 Nov 2003 12:30 pm, Bill Broadley wrote:

> On Mon, Nov 24, 2003 at 10:01:30PM +0800, Laurence Liew wrote:
> > Hi all,
> >
> > RedHat have annouced academic pricing at USD25 per desktop (RHEL WS
> > based) and USD50 for Academic server (RHEL ES based) a week or so ago.
>
> This sounded relatively attractive to me, until I found out that
> USD25 per desktop for RHEL WS did NOT include the Opteron version.

I know this is a reply to an old message, but I think it's worth mentioning.

Looking at:

	http://www.redhat.com/solutions/industries/education/products/

It says that AMD64 (presumably both Opteron and Athlon 64) is included in this 
deal.  To quote:

	Versions available: x86, IPF, or AMD64

Chris
- -- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/z+GIO2KABBYQAh8RAoz1AJ9q9LAB3zfMyT566v0U7+71ykSlxACdHZKJ
9yrL/fFEX1oSwtYYdeHizS8=
=nRd0
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From mperez at delta.ft.uam.es  Fri Dec  5 05:55:31 2003
From: mperez at delta.ft.uam.es (Manuel J)
Date: Fri, 5 Dec 2003 11:55:31 +0100
Subject: looking for specific PXE application
Message-ID: <200312051155.31691.mperez@delta.ft.uam.es>


	Hi. I am now involved in a clustering project and I need an application to 
collect all MAC addresses sent from PXE clients to a DHCP host with 
DHCPDISCOVER packets. I am trying to find out before start developing it by 
myself, so I think maybe I could get it from the beowulf project.

Could someone help me with some kind of reference, please?
Thanks.

Manuel J.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From agrajag at dragaera.net  Fri Dec  5 09:14:38 2003
From: agrajag at dragaera.net (Sean Dilda)
Date: Fri, 5 Dec 2003 09:14:38 -0500
Subject: looking for specific PXE application
In-Reply-To: <200312051155.31691.mperez@delta.ft.uam.es>; from mperez@delta.ft.uam.es on Fri, Dec 05, 2003 at 11:55:31AM +0100
References: <200312051155.31691.mperez@delta.ft.uam.es>
Message-ID: <20031205091438.C8280@vallista.dragaera.net>

On Fri, 05 Dec 2003, Manuel J wrote:

> 
> 	Hi. I am now involved in a clustering project and I need an application to 
> collect all MAC addresses sent from PXE clients to a DHCP host with 
> DHCPDISCOVER packets. I am trying to find out before start developing it by 
> myself, so I think maybe I could get it from the beowulf project.
> 
> Could someone help me with some kind of reference, please?
> Thanks.

dhcpd logs all requests, including the requesting MAC address and what
IP (if any) is assigned.  You can find those logs in /var/log/messages.
You can also check /var/lib/dhcpd.leases to see what leases (including
MAC addresses) are currently assigned.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From amitoj at cs.uh.edu  Fri Dec  5 10:09:56 2003
From: amitoj at cs.uh.edu (Amitoj G. Singh)
Date: Fri, 5 Dec 2003 09:09:56 -0600 (CST)
Subject: looking for specific PXE application
In-Reply-To: <200312051155.31691.mperez@delta.ft.uam.es>
Message-ID: <Pine.GSO.4.33.0312050909030.10008-100000@themis.cs.uh.edu>


I recall OSCAR could do that ...

http://oscar.openclustergroup.org

Hope this helps.

- Amitoj.

On Fri, 5 Dec 2003, Manuel J wrote:

>
> 	Hi. I am now involved in a clustering project and I need an application to
> collect all MAC addresses sent from PXE clients to a DHCP host with
> DHCPDISCOVER packets. I am trying to find out before start developing it by
> myself, so I think maybe I could get it from the beowulf project.
>
> Could someone help me with some kind of reference, please?
> Thanks.
>
> Manuel J.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From erwan at mandrakesoft.com  Fri Dec  5 07:56:54 2003
From: erwan at mandrakesoft.com (Erwan Velu)
Date: Fri, 05 Dec 2003 13:56:54 +0100
Subject: looking for specific PXE application
In-Reply-To: <200312051155.31691.mperez@delta.ft.uam.es>
References: <200312051155.31691.mperez@delta.ft.uam.es>
Message-ID: <1070629014.7715.1660.camel@revolution.mandrakesoft.com>

Hi, you could have a look to the script we are using in
CLIC/MandrakeClustering. This scripts are written in Perl and collect
mac addresses and assign in the dhcp configuration as static addresses.
You can use it and tune it for your needs.

> Could someone help me with some kind of reference, please?
> Thanks.
http://cvs.mandrakesoft.com/cgi-bin/cvsweb.cgi/cluster/clic/Devel_admin/add_nodes_to_dhcp_cluster.pm?rev=1.37&content-type=text/x-cvsweb-markup

-- 
Erwan Velu
Linux Cluster Distribution Project Manager
MandrakeSoft
43 rue d'aboukir 75002 Paris
Phone Number : +33 (0) 1 40 41 17 94
Fax Number   : +33 (0) 1 40 41 92 00
Web site     : http://www.mandrakesoft.com
OpenPGP key  : http://www.mandrakesecure.net/cks/ 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Fri Dec  5 11:15:22 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 5 Dec 2003 11:15:22 -0500 (EST)
Subject: looking for specific PXE application
In-Reply-To: <20031205091438.C8280@vallista.dragaera.net>
Message-ID: <Pine.LNX.4.44.0312051114420.17311-100000@ganesh.phy.duke.edu>

On Fri, 5 Dec 2003, Sean Dilda wrote:

> On Fri, 05 Dec 2003, Manuel J wrote:
> 
> > 
> > 	Hi. I am now involved in a clustering project and I need an application to 
> > collect all MAC addresses sent from PXE clients to a DHCP host with 
> > DHCPDISCOVER packets. I am trying to find out before start developing it by 
> > myself, so I think maybe I could get it from the beowulf project.
> > 
> > Could someone help me with some kind of reference, please?
> > Thanks.
> 
> dhcpd logs all requests, including the requesting MAC address and what
> IP (if any) is assigned.  You can find those logs in /var/log/messages.
> You can also check /var/lib/dhcpd.leases to see what leases (including
> MAC addresses) are currently assigned.

Somebody did publish a grazing script (or reference to one) in this
venue sometime in the last year, maybe.  Google the archives.

  rgb

> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From raysonlogin at yahoo.com  Fri Dec  5 12:02:45 2003
From: raysonlogin at yahoo.com (Rayson Ho)
Date: Fri, 5 Dec 2003 09:02:45 -0800 (PST)
Subject: Latency on Beowulf Mailing list
In-Reply-To: <1070639232.7721.1672.camel@revolution.mandrakesoft.com>
Message-ID: <20031205170245.58039.qmail@web11404.mail.yahoo.com>

It usually takes less than 20 minutes for me.

Rayson

--- Erwan Velu <erwan at mandrakesoft.com> wrote:
> When I'm sending messages to beowulf mailing list, I can see them
> after
> 8 hours :(
> 
> Sometimes, my answers are too old for being intresting :(
> 
> Any ideas? Does other users are in the same case ?
> -- 
> Erwan Velu
> Linux Cluster Distribution Project Manager
> MandrakeSoft
> 43 rue d'aboukir 75002 Paris
> Phone Number : +33 (0) 1 40 41 17 94
> Fax Number   : +33 (0) 1 40 41 92 00
> Web site     : http://www.mandrakesoft.com
> OpenPGP key  : http://www.mandrakesecure.net/cks/ 
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf


__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From erwan at mandrakesoft.com  Fri Dec  5 10:47:13 2003
From: erwan at mandrakesoft.com (Erwan Velu)
Date: Fri, 05 Dec 2003 16:47:13 +0100
Subject: Latency on Beowulf Mailing list
Message-ID: <1070639232.7721.1672.camel@revolution.mandrakesoft.com>

When I'm sending messages to beowulf mailing list, I can see them after
8 hours :(

Sometimes, my answers are too old for being intresting :(

Any ideas? Does other users are in the same case ?
-- 
Erwan Velu
Linux Cluster Distribution Project Manager
MandrakeSoft
43 rue d'aboukir 75002 Paris
Phone Number : +33 (0) 1 40 41 17 94
Fax Number   : +33 (0) 1 40 41 92 00
Web site     : http://www.mandrakesoft.com
OpenPGP key  : http://www.mandrakesecure.net/cks/ 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From tim.carlson at pnl.gov  Fri Dec  5 11:57:08 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Fri, 05 Dec 2003 08:57:08 -0800 (PST)
Subject: looking for specific PXE application
In-Reply-To: <200312051631.hB5GV6S10698@NewBlue.scyld.com>
Message-ID: <Pine.LNX.4.44.0312050853020.30089-100000@scorpion.emsl.pnl.gov>


> On Fri, 5 Dec 2003, Sean Dilda wrote:
>
> > On Fri, 05 Dec 2003, Manuel J wrote:
> >
> > >
> > > 	Hi. I am now involved in a clustering project and I need an application to
> > > collect all MAC addresses sent from PXE clients to a DHCP host with
> > > DHCPDISCOVER packets. I am trying to find out before start developing it by
> > > myself, so I think maybe I could get it from the beowulf project.
> > >
> > > Could someone help me with some kind of reference, please?
> > > Thanks.
> >
> > dhcpd logs all requests, including the requesting MAC address and what
> > IP (if any) is assigned.  You can find those logs in /var/log/messages.
> > You can also check /var/lib/dhcpd.leases to see what leases (including
> > MAC addresses) are currently assigned.
>
> Somebody did publish a grazing script (or reference to one) in this
> venue sometime in the last year, maybe.  Google the archives.
>
>   rgb

This is exactly how ROCKS clusters add nodes. Install your frontend, PXE
boot your nodes. If you've already decided on a clusters solution, then
nevermind :)

http://www.rocksclusters.org/

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Daniel.Kidger at quadrics.com  Fri Dec  5 12:04:42 2003
From: Daniel.Kidger at quadrics.com (Daniel Kidger)
Date: Fri, 5 Dec 2003 17:04:42 -0000
Subject: Latency on Beowulf Mailing list
Message-ID: <010C86D15E4D1247B9A5DD312B7F5AA78DE2BC@stegosaurus.bristol.quadrics.com>

> From: Erwan Velu [mailto:erwan at mandrakesoft.com]
> Subject: Latency on Beowulf Mailing list
> When I'm sending messages to beowulf mailing list, I can see 
> them after  8 hours :(

yes me too.

Much of the time my positngs take a median of say 5 hours.
In the mean time several other folk often manage to post their replies.


Daniel.

--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.      daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK         0117 915 5505
----------------------- www.quadrics.com --------------------

> 
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From herrold at owlriver.com  Fri Dec  5 13:58:18 2003
From: herrold at owlriver.com (R P Herrold)
Date: Fri, 5 Dec 2003 13:58:18 -0500 (EST)
Subject: beowulf]  Re: looking for specific PXE application
In-Reply-To: <Pine.GSO.4.33.0312050909030.10008-100000@themis.cs.uh.edu>
References: <Pine.GSO.4.33.0312050909030.10008-100000@themis.cs.uh.edu>
Message-ID: <Pine.LNX.4.53.0312051354180.8041@swampfox.owlriver.com>

On Fri, 5 Dec 2003, Amitoj G. Singh wrote:

> I recall OSCAR could do that ...
> http://oscar.openclustergroup.org

> > collect all MAC addresses sent from PXE clients to a DHCP host with
> > Could someone help me with some kind of reference, please?

These should all show with the 'arpwatch' package; or in the
alternative, by turning logging up for the tftp server(atftp 
works well) or on the dhcp server, can be extracted from 
/var/log/messages , and then awk |sort |uniq'ed out.

Is the Oscar tool more sophisticated than that?

-- Russ Herrold
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From becker at scyld.com  Fri Dec  5 15:28:42 2003
From: becker at scyld.com (Donald Becker)
Date: Fri, 5 Dec 2003 15:28:42 -0500 (EST)
Subject: Latency on Beowulf Mailing list
In-Reply-To: <1070639232.7721.1672.camel@revolution.mandrakesoft.com>
Message-ID: <Pine.LNX.4.44.0312051429120.2537-100000@training.scyld.com>

On Fri, 5 Dec 2003, Erwan Velu wrote:

> When I'm sending messages to beowulf mailing list, I can see them after
> 8 hours :(
> 
> Sometimes, my answers are too old for being intresting :(
> 
> Any ideas? Does other users are in the same case ?

The quick answer is "spammers and viruses".

There are several reasons that this is the case:
  Over 95% of Beowulf list postings are held for moderation
  The Beowulf list alone has about 3000 addresses

95% might seem large, but considering only 1 in ten attempted postings
is valid, only about 50% of the posts are held for moderation.  While I
do sometimes wake up in the middle of the night to moderate, you
shouldn't really expect that.

A posting may be held for moderation by match any of about 25 patterns.
Some of those patterns are pretty general -- even certain three digit
numbers and two digit country codes will trigger moderation.

Once held for moderation the posting may be automatically deleted.
Right now there are 1439 phrases and 3298 IP address prefixes and domain
names.  All were hand added.  My goal is over 90% automatic deletions.
If I stop adding rules, it drops below that number in a week or two as
spammers move machines and change tactics.

Less common is that a post is automatically approved.  Some spammers
have taken to including sections of web pages in their email, so don't
expect this increasing in the future.

The second point is also a result of spammers, albeit indirectly.  The
list is run by mailman, which splits the list up into sections.  If your
position is after a Teergruber, or the link is just busy, your email
will be delayed for several hours.  Despite being very responsible
mailers, our machine (or perhaps our IP address block) does sometimes
end up on a RBL.

I see this problem as only getting worse.  Our "3c509" mailing list is
first alphabetically, and thus is the first recipient of new spam.  I've
mostly given up on it, but leave it in place to harvest new patterns.
It received 5 new messages in the past 30 minutes, a rate up
substantially over just a few months ago. 

So, what can you do to avoid delays?  Nothing especially predictable,
because predictable measures are easily defeated by spammers.  But you
can 
- avoid posting or having a return address from free mail account services
- have a reverse-resolving host name on all mail hops
- don't have "adsl" or "dialup" in the header
- avoid all mention of "personal enhancement" drugs, purchasing drugs of
    any kind, moving money out of your sub-sahara country, mentions of
    credit card names, sex with a farm animal, sex with multiple farm
    animals, webcams, etc.  Talking about your cluster of webcams of
    viagra-enhanced farm animals trying to move their lottery winnings
    out of from Nigeria, even if they puchased the viagra at dicount
    rates from Canadian suppliers that included a free mini-rc car could
    conceivably be brought on-topic, but that posting will never make it.

-- 
Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
914 Bay Ridge Road, Suite 220		Scyld Beowulf cluster systems
Annapolis MD 21403			410-990-9993

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From jac67 at georgetown.edu  Fri Dec  5 16:54:06 2003
From: jac67 at georgetown.edu (Jess Cannata)
Date: Fri, 05 Dec 2003 16:54:06 -0500
Subject: Latency on Beowulf Mailing list
In-Reply-To: <Pine.LNX.4.44.0312051429120.2537-100000@training.scyld.com>
References: <Pine.LNX.4.44.0312051429120.2537-100000@training.scyld.com>
Message-ID: <3FD0FE7E.7070806@georgetown.edu>

Is the list a restricted list, meaning that only subscribers to the list 
can post messages? If it isn't, wouldn't this help reduce the number of 
messages that need to be moderated? If it is restricted, then I guess 
that the spammers are getting really good if they are spoofing the 
addresses of the 3000 subscribers.

Jess

Donald Becker wrote:

>On Fri, 5 Dec 2003, Erwan Velu wrote:
>
>  
>
>>When I'm sending messages to beowulf mailing list, I can see them after
>>8 hours :(
>>
>>Sometimes, my answers are too old for being intresting :(
>>
>>Any ideas? Does other users are in the same case ?
>>    
>>
>
>The quick answer is "spammers and viruses".
>
>There are several reasons that this is the case:
>  Over 95% of Beowulf list postings are held for moderation
>  The Beowulf list alone has about 3000 addresses
>
>95% might seem large, but considering only 1 in ten attempted postings
>is valid, only about 50% of the posts are held for moderation.  While I
>do sometimes wake up in the middle of the night to moderate, you
>shouldn't really expect that.
>
>A posting may be held for moderation by match any of about 25 patterns.
>Some of those patterns are pretty general -- even certain three digit
>numbers and two digit country codes will trigger moderation.
>
>Once held for moderation the posting may be automatically deleted.
>Right now there are 1439 phrases and 3298 IP address prefixes and domain
>names.  All were hand added.  My goal is over 90% automatic deletions.
>If I stop adding rules, it drops below that number in a week or two as
>spammers move machines and change tactics.
>
>Less common is that a post is automatically approved.  Some spammers
>have taken to including sections of web pages in their email, so don't
>expect this increasing in the future.
>
>The second point is also a result of spammers, albeit indirectly.  The
>list is run by mailman, which splits the list up into sections.  If your
>position is after a Teergruber, or the link is just busy, your email
>will be delayed for several hours.  Despite being very responsible
>mailers, our machine (or perhaps our IP address block) does sometimes
>end up on a RBL.
>
>I see this problem as only getting worse.  Our "3c509" mailing list is
>first alphabetically, and thus is the first recipient of new spam.  I've
>mostly given up on it, but leave it in place to harvest new patterns.
>It received 5 new messages in the past 30 minutes, a rate up
>substantially over just a few months ago. 
>
>So, what can you do to avoid delays?  Nothing especially predictable,
>because predictable measures are easily defeated by spammers.  But you
>can 
>- avoid posting or having a return address from free mail account services
>- have a reverse-resolving host name on all mail hops
>- don't have "adsl" or "dialup" in the header
>- avoid all mention of "personal enhancement" drugs, purchasing drugs of
>    any kind, moving money out of your sub-sahara country, mentions of
>    credit card names, sex with a farm animal, sex with multiple farm
>    animals, webcams, etc.  Talking about your cluster of webcams of
>    viagra-enhanced farm animals trying to move their lottery winnings
>    out of from Nigeria, even if they puchased the viagra at dicount
>    rates from Canadian suppliers that included a free mini-rc car could
>    conceivably be brought on-topic, but that posting will never make it.
>
>  
>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rokrau at yahoo.com  Fri Dec  5 20:05:56 2003
From: rokrau at yahoo.com (Roland Krause)
Date: Fri, 5 Dec 2003 17:05:56 -0800 (PST)
Subject: problem allocating large amount of memory
In-Reply-To: <Pine.LNX.4.44.0312041949590.14254-100000@coffee.psychology.mcmaster.ca>
Message-ID: <20031206010556.44528.qmail@web40012.mail.yahoo.com>

Mark,

thanks for the clarification. I was now able to squeeze
TASK_UNMAPPED_BASE to a rather small fraction of TASK_SIZE and to
allocate enough memory for the application in question. 

Again, thanks a lot for your very helpful comments. 

Roland


--- Mark Hahn <hahn at physics.mcmaster.ca> wrote:
> 
> I probably should have commented on the code a bit more.  it
> demonstrates
> three separate things: that for <128K allocations, libc uses the heap
> first, then when that fills (hits the mmap arena) it switches to
> allocating
> in the mmap arena.  if allocations are 128K or more, it *starts* in
> the 
> mmap arena (since mmap has advantages when doing large allocations -
> munmap).  
> finally, if you statically link and avoid the use of stdio,
> you can make one giant allocation from the end of text up to stack.
> 
> 

__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From pesch at attglobal.net  Sat Dec  6 16:50:11 2003
From: pesch at attglobal.net (pesch at attglobal.net)
Date: Sat, 06 Dec 2003 13:50:11 -0800
Subject: Latency on Beowulf Mailing list
References: <Pine.LNX.4.44.0312051429120.2537-100000@training.scyld.com>
Message-ID: <3FD24F13.8B8CE253@attglobal.net>

I was planning to call Beowulf clustering the "Viagra of computing" - but after Donalds elaborations I plan to
change my mind :(((

Donald Becker wrote:

> On Fri, 5 Dec 2003, Erwan Velu wrote:
>
> > When I'm sending messages to beowulf mailing list, I can see them after
> > 8 hours :(
> >
> > Sometimes, my answers are too old for being intresting :(
> >
> > Any ideas? Does other users are in the same case ?
>
> The quick answer is "spammers and viruses".
>
> There are several reasons that this is the case:
>   Over 95% of Beowulf list postings are held for moderation
>   The Beowulf list alone has about 3000 addresses
>
> 95% might seem large, but considering only 1 in ten attempted postings
> is valid, only about 50% of the posts are held for moderation.  While I
> do sometimes wake up in the middle of the night to moderate, you
> shouldn't really expect that.
>
> A posting may be held for moderation by match any of about 25 patterns.
> Some of those patterns are pretty general -- even certain three digit
> numbers and two digit country codes will trigger moderation.
>
> Once held for moderation the posting may be automatically deleted.
> Right now there are 1439 phrases and 3298 IP address prefixes and domain
> names.  All were hand added.  My goal is over 90% automatic deletions.
> If I stop adding rules, it drops below that number in a week or two as
> spammers move machines and change tactics.
>
> Less common is that a post is automatically approved.  Some spammers
> have taken to including sections of web pages in their email, so don't
> expect this increasing in the future.
>
> The second point is also a result of spammers, albeit indirectly.  The
> list is run by mailman, which splits the list up into sections.  If your
> position is after a Teergruber, or the link is just busy, your email
> will be delayed for several hours.  Despite being very responsible
> mailers, our machine (or perhaps our IP address block) does sometimes
> end up on a RBL.
>
> I see this problem as only getting worse.  Our "3c509" mailing list is
> first alphabetically, and thus is the first recipient of new spam.  I've
> mostly given up on it, but leave it in place to harvest new patterns.
> It received 5 new messages in the past 30 minutes, a rate up
> substantially over just a few months ago.
>
> So, what can you do to avoid delays?  Nothing especially predictable,
> because predictable measures are easily defeated by spammers.  But you
> can
> - avoid posting or having a return address from free mail account services
> - have a reverse-resolving host name on all mail hops
> - don't have "adsl" or "dialup" in the header
> - avoid all mention of "personal enhancement" drugs, purchasing drugs of
>     any kind, moving money out of your sub-sahara country, mentions of
>     credit card names, sex with a farm animal, sex with multiple farm
>     animals, webcams, etc.  Talking about your cluster of webcams of
>     viagra-enhanced farm animals trying to move their lottery winnings
>     out of from Nigeria, even if they puchased the viagra at dicount
>     rates from Canadian suppliers that included a free mini-rc car could
>     conceivably be brought on-topic, but that posting will never make it.
>
> --
> Donald Becker                           becker at scyld.com
> Scyld Computing Corporation             http://www.scyld.com
> 914 Bay Ridge Road, Suite 220           Scyld Beowulf cluster systems
> Annapolis MD 21403                      410-990-9993
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lange at informatik.Uni-Koeln.DE  Sun Dec  7 15:37:50 2003
From: lange at informatik.Uni-Koeln.DE (Thomas Lange)
Date: Sun, 7 Dec 2003 21:37:50 +0100
Subject: looking for specific PXE application
In-Reply-To: <200312051155.31691.mperez@delta.ft.uam.es>
References: <200312051155.31691.mperez@delta.ft.uam.es>
Message-ID: <16339.36766.310248.237282@informatik.uni-koeln.de>

>>>>> On Fri, 5 Dec 2003 11:55:31 +0100, Manuel J <mperez at delta.ft.uam.es> said:

    > 	Hi. I am now involved in a clustering project and I need an
    > 	application to
    > collect all MAC addresses sent from PXE clients to a DHCP host

FAI, the fully automatic installation uses following simple command
pipe:

> tcpdump -qte broadcast and port bootpc >/tmp/mac.lis

The when all machines send out some broadcast messages, you will get
the list with

> perl -ane 'print "\U$F[0]\n"' /tmp/mac.lis|sort|uniq

-- 
regrads Thomas Lange
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Robin.Laing at drdc-rddc.gc.ca  Mon Dec  8 16:23:26 2003
From: Robin.Laing at drdc-rddc.gc.ca (Robin Laing)
Date: Mon, 08 Dec 2003 14:23:26 -0700
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <3FD4EBCE.4060908@drdc-rddc.gc.ca>

Hello,

This may be off topic but may be of interest to many that follow this 
list.

I have searched the WWW until my eyes are seeing double (and it isn't 
just the beer) trying to find a real answer to my question.  I have 
read the reviews and the hype about SATA being better than IDE/ATA and 
almost as good as SCSI, even better in a couple of areas.

I have talked to our computer people but they don't have enough 
experience with SATA drives to give me a straight answer.

With most new motherboards coming with controllers for SATA drives, I 
am considering using SATA drives for a new high-end workstation and 
small cluster.  I have seen RAID arrays using SATA drives which just 
makes the question even greater.  Of course I have seen RAID arrays 
using IDE drives.

I have used SCSI on all workstations I have built in the past, but the 
cost of SATA drives is making me think twice about this.  Files seem 
to be getting larger from day to day.

My concern is regarding multiple disk read/writes.  With IDE, you can 
wait for what seems like hours while data is being read off of the HD. 
  I want to know if the problem is still as bad with SATA as the 
original ATA drives?  Will the onboard RAID speed up access?

I know that throughput on large files is close and is usually related 
to platter speed.  I am also pleased that the buffers is now 8mb on 
all the drives I am looking at.

Main issue is writing and reading swap on those really large files and 
how it affects other work.

OS will be Linux on all.

-- 
Robin Laing


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Mon Dec  8 17:18:52 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Mon, 8 Dec 2003 14:18:52 -0800
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD4EBCE.4060908@drdc-rddc.gc.ca>
References: <3FD4EBCE.4060908@drdc-rddc.gc.ca>
Message-ID: <20031208221852.GB22702@cse.ucdavis.edu>

In my experience there are many baises, religious opinions, and rules
of thumb that are just extremely BAD basis for making these related
decisions.  Especially since many people's idea about such things change
relatively slowly compared to the actual hardware.

My best recommendation is to either find a benchmark that closely resembles
your application load (Similar mix of read/writes, same level of RAID, same
size read/writes, same locality) and actually benchmark.

I'm sure people can produce a particular configuration of SCSI, ATA, and SATA that 
will be best AND worst for a given benchmark.

So I'd look at bonnie++, postmark, or one of the other opensource benchmarks
see if any of those can be configured to be similar to your workload.  If not
write a benchmark that is similar to your workload and post it to the list asking
people to run it on their hardware.  The more effort you put into it the
more responses your likely to get.  Posting a table of performance results
on a website seems to encourage more to participate.

There are no easy answers, it depends on many many variables, the type
of OS, how long the partition has been live (i.e. fragmentation),
the IDE/SCSI chipset, the drivers, the OS, even the cables can have
performance effects.

The market seems to be going towards SATA, seems like many if not all major
storage vendors have an entry level SATA product, I've no idea if this
is just the latest fad or justified from a pure price/performance perspective.

Good luck.

-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From alvin at Mail.Linux-Consulting.com  Mon Dec  8 18:15:24 2003
From: alvin at Mail.Linux-Consulting.com (Alvin Oga)
Date: Mon, 8 Dec 2003 15:15:24 -0800 (PST)
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <20031208221852.GB22702@cse.ucdavis.edu>
Message-ID: <Pine.LNX.3.96.1031208150058.16362A-100000@Maggie.Linux-Consulting.com>


hi ya robin/bill

On Mon, 8 Dec 2003, Bill Broadley wrote:

> In my experience there are many baises, religious opinions, and rules
> of thumb that are just extremely BAD basis for making these related
> decisions.  Especially since many people's idea about such things change
> relatively slowly compared to the actual hardware.

yupperz !!
 
> My best recommendation is to either find a benchmark that closely resembles
> your application load (Similar mix of read/writes, same level of RAID, same
> size read/writes, same locality) and actually benchmark.
> 
> I'm sure people can produce a particular configuration of SCSI, ATA, and SATA that 
> will be best AND worst for a given benchmark.

yupperz ... no problem ... you want theirs to look not as good, and our
version look like its better... yupp ..

definite yupppers one do a benchmark and compare only similar environments
and apps ... otherwise one is comparing christmas shopping to studing
to be a vet ( benchmarks not related to each other )

-----

for which disks ...
	- i'd stick with plain ole ide disks
	- its cheap
	- you can have a whole 2nd system to backup the primary array
	for about the same cost as an expensive dual-cpu or scsi-based
	system

for serial ata ...
	- dont use its onboard controller for raid ... 
	- it probably be as good as onboard raid on existing mb...
	( ie ... none of um works right 
		works == hands off booting of any disk 
		works == data resyncs by itself w/o intervention

		but doing the same tests w/ sw raid or hw raid
		controller w/ scsi works fine

> So I'd look at bonnie++, postmark, or one of the other opensource benchmarks
> see if any of those can be configured to be similar to your workload.  If not
> write a benchmark that is similar to your workload and post it to the list asking
> people to run it on their hardware.  The more effort you put into it the
> more responses your likely to get.  Posting a table of performance results
> on a website seems to encourage more to participate.

other benchmark tests you can run ....

	http://www.Linux-1U.net/Benchmarks

other tuning you can to to tweek the last instruction out of the system

	http://www.Linux-1U.net/Tuning
 
> There are no easy answers, it depends on many many variables, the type
> of OS, how long the partition has been live (i.e. fragmentation),
> the IDE/SCSI chipset, the drivers, the OS, even the cables can have
> performance effects.

(look for the) picture of partitions/layout ... makes  big difference

	http://www.Linux-1U.net/Partition/

> The market seems to be going towards SATA, seems like many if not all major
> storage vendors have an entry level SATA product, I've no idea if this
> is just the latest fad or justified from a pure price/performance perspective.

if the disk manufacturers stop making scsi/ide disks .. we wont have
any choice... unless we go to the super fast "compact flash"
and its next generation 100GB "compact flash" in the r/d labs 
which is why ibm sold its klunky mechanical disk drives in favor
of its new "solid state disks"  ( forgot its official name )

c ya
alvin

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Mon Dec  8 18:18:07 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Mon, 8 Dec 2003 18:18:07 -0500 (EST)
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD4EBCE.4060908@drdc-rddc.gc.ca>
Message-ID: <Pine.LNX.4.44.0312081736360.13824-100000@coffee.psychology.mcmaster.ca>

| read the reviews and the hype about SATA being better than IDE/ATA and 
| almost as good as SCSI, even better in a couple of areas.

<sigh>


| I have talked to our computer people but they don't have enough 
| experience with SATA drives to give me a straight answer.

there's not THAT much to know.

SCSI:	pro: a nice bus-based architecture which makes it easy to put
	many disks in one enclosure.  the bus is fast enough to
	support around 3-5 disks without compromising bandwidth
	(in fact, you'll probably saturate your PCI(x) bus(es) first
	if you're not careful!) 10 and 15K RPM SCSI disks are common, 
	leading to serious advantages if your workload is latency-dominated
	(mostly of small, scattered, uncachable reads, and/or synchronous
	writes.)  5yr warrantees and 1.2 Mhour MTBF are very comforting.

	con: price.  older (pre Ultra2) disks lack even basic CRC protection.
	always lower-density than ATA; often hotter, too.  (note that the 
	density can actually negate any MTBF advantage!)

ATA:	pro: price.  massive density (and that means that bandwidth is 
	excellent, even at much lower RPM.)  ease of purchase/replacement;
	ubiquity (and cheapness) of controllers.

	con: probably half the MTBF of SCSI, 1yr warrantee is common, 
	though the price-premium for 3yr models is small.  most disks are
	5400 or	7200 RPM so latency is potentially a problem (though there is
	one line of 10K RPM'ers but at close to SCSI prices and density).

PATA:	pro: still a bit cheaper than SATA.  PATA *does* include tagged 
	command queueing, but it's mostly ignored by vendors and drivers.

	con: cabling just plain sucks for more than a few disks (remember:
	the standard STILL requires cable be <= 18" of flat ribbon).

SATA:	pro: nicer cable, albeit not bus or daisy-chain (until sata2);
	much improved support for hot-plug and TCQ.

	con: not quite mainstream (price and availability).  putting many
	in one box is still a bit of a problem (albeit also a power problem
	for any kind of disk...)

I have no idea what to make of the roadmappery that shows sata merging with 
serial-attached scsi in a few years.

| My concern is regarding multiple disk read/writes.  With IDE, you can 
| wait for what seems like hours while data is being read off of the HD. 

nah.  it's basically just a design mistake to put two active PATA disks 
on the same channel.  it's fine if one is usually idle (say, cdrom or 
perhaps a disk containing old archives).  most people just avoid putting 
two disks on a channel at all, since channels are almost free, and you 
get to ignore jumpers.


|   I want to know if the problem is still as bad with SATA as the 
| original ATA drives?  Will the onboard RAID speed up access?

there was no problem with "original" disks.  and raid works fine, up until
you saturate your PCI bus...


| I know that throughput on large files is close and is usually related 
| to platter speed.  I am also pleased that the buffers is now 8mb on 
| all the drives I am looking at.

one of the reasons that TCQ is not a huge win is that the kernel's cache
is ~500x bigger than the disk's.  however, it's true that bigger ondisk cache
lets the drive better optimize delayed writes within a cylinder.  for non-TCQ
ATA to be competitive when writing, it's common to enable write-behind
caching.  this can cause data loss or corruption if you crash at exactly the 
right time (paranoids take note).


| Main issue is writing and reading swap on those really large files and 
| how it affects other work.

swap thrashing is a non-fatal error that should be fixed, 
not band-aided by gold-plated hardware.

finally, I should mention that Jeff Garzik is doing a series of good new SATA
drivers (deliberately ignoring the accumulated kruft in the kernel's PATA
code).  they plug into the kernel's SCSI interface, purely to take advantage 
of support for queueing and hotplug, I think.

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Mon Dec  8 20:24:44 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Mon, 8 Dec 2003 20:24:44 -0500 (EST)
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <Pine.LNX.4.44.0312081736360.13824-100000@coffee.psychology.mcmaster.ca>
Message-ID: <Pine.LNX.4.44.0312082013040.15807-100000@lilith.rgb.private.net>

On Mon, 8 Dec 2003, Mark Hahn wrote:

> | read the reviews and the hype about SATA being better than IDE/ATA and 
> | almost as good as SCSI, even better in a couple of areas.
> 
> <sigh>
> 
> 
> | I have talked to our computer people but they don't have enough 
> | experience with SATA drives to give me a straight answer.
> 
> there's not THAT much to know.

But what there is is a pleasure to read, as always, when you write it.
One tiny question:

> | My concern is regarding multiple disk read/writes.  With IDE, you can 
> | wait for what seems like hours while data is being read off of the HD. 
> 
> nah.  it's basically just a design mistake to put two active PATA disks 
> on the same channel.  it's fine if one is usually idle (say, cdrom or 
> perhaps a disk containing old archives).  most people just avoid putting 
> two disks on a channel at all, since channels are almost free, and you 
> get to ignore jumpers.

So, admitting my near total ignorance about SATA and whether or not I
should lust after it, does SATA perpetuate this problem, or is it more
like a SCSI daisy chain, where each drive gets its own ID and there is a
better handling of parallel access?

The "almost free" part has several annoying aspects, after all.  An
extra controller (or two).  One cable per disk if you use one disk per
channel.  The length thing.  The fact that ribbon cables, when they turn
sideways, do a gangbusters job of occluding fans and airflow, and with
four or five of them in a case routing them around can be a major pain.

There is also a small price premium for SATA, although admittedly it
isn't much.  So, in your fairly expert opinion, is it worth it?

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Mon Dec  8 21:44:14 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Mon, 8 Dec 2003 21:44:14 -0500 (EST)
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <Pine.LNX.4.44.0312082013040.15807-100000@lilith.rgb.private.net>
Message-ID: <Pine.LNX.4.44.0312082111330.15283-100000@coffee.psychology.mcmaster.ca>

> > | My concern is regarding multiple disk read/writes.  With IDE, you can 
> > | wait for what seems like hours while data is being read off of the HD. 
> > 
> > nah.  it's basically just a design mistake to put two active PATA disks 
> > on the same channel.  it's fine if one is usually idle (say, cdrom or 
> > perhaps a disk containing old archives).  most people just avoid putting 
> > two disks on a channel at all, since channels are almost free, and you 
> > get to ignore jumpers.
> 
> So, admitting my near total ignorance about SATA and whether or not I
> should lust after it, does SATA perpetuate this problem, or is it more
> like a SCSI daisy chain, where each drive gets its own ID and there is a
> better handling of parallel access?

no, or maybe yes.  SATA is *not* becoming more SCSI-like: drives don't
get their own ID (since they're not on a bus).  in SATA-1 at least,
the cable is strictly point-to-point, and each drive acts like a separate
channel (which were always parallel even in PATA).  basically, master/slave
was just a really crappy implementation of SCSI IDs, and SATA has done away
with it.  given that IO is almost always host<>device, there's no real value
in making devices peers, IMO.  

yes to concurrency, but no to "like SCSI" (peers, IDs and multidrop).

> extra controller (or two).  One cable per disk if you use one disk per
> channel.

one cable per disk, period.  this is sort of an interesting design trend,
actually: away from parallel multidrop buses, towards serial point-to-point
ones.  in fact, the sata2 "port multiplier" extension is really a sort
of packet-switching mechanism...

> There is also a small price premium for SATA, although admittedly it
> isn't much.  So, in your fairly expert opinion, is it worth it?

my next 8x250G server(s) will use a pair of promise s150tx4 (non-raid) 4-port
sata controllers ;)

I don't see any significant benefit except where you need lots of devices
and/or hotswap.  well, beyond the obvious coolness factor, of course...
though come to think of it, there should be some performance, and probably
robustness benefits from Jeff Garzik's clean-slate approach.

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From glen at mail.cert.ucr.edu  Mon Dec  8 22:27:12 2003
From: glen at mail.cert.ucr.edu (Glen Kaukola)
Date: Mon, 08 Dec 2003 19:27:12 -0800
Subject: autofs
Message-ID: <3FD54110.7090703@cert.ucr.edu>

Hi,

I was wanting to use autofs to mount all the nfs shares on my nodes to 
ease the pain of having an nfs server go down.  But the problem with 
that, is that mpich jobs don't seem to want to run the first time 
around.  If I then run them a second time, the drives are mounted, and 
they run fine.  I don't think my users are going to like that too much 
though, so would anyone know a solution?

Thanks,
Glen

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hanzl at noel.feld.cvut.cz  Tue Dec  9 04:53:54 2003
From: hanzl at noel.feld.cvut.cz (hanzl at noel.feld.cvut.cz)
Date: Tue, 09 Dec 2003 10:53:54 +0100
Subject: autofs
In-Reply-To: <3FD54110.7090703@cert.ucr.edu>
References: <3FD54110.7090703@cert.ucr.edu>
Message-ID: <20031209105354B.hanzl@unknown-domain>

> I was wanting to use autofs to mount all the nfs shares on my nodes to 
> ease the pain of having an nfs server go down.  But the problem with 
> that, is that mpich jobs don't seem to want to run the first time 
> around.

If you are using bproc than there is a slight chance that it is somehow
related to autofs/bproc deadlock which I discovered long time ago (and
I have no idea whether my fix made it to bproc or not), see:

  http://www.beowulf.org/pipermail/beowulf/2002-May/003508.html

Regards

Vaclav Hanzl
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From David_Walters at sra.com  Tue Dec  9 07:46:24 2003
From: David_Walters at sra.com (Walters, David)
Date: Tue, 9 Dec 2003 07:46:24 -0500
Subject: EMC, anyone?
Message-ID: <0EB5C81FE6FE5A4F8D1FEBF59C6C7BAA1A1824@durham.sra.com>

Our group has an opportunity that few would pass up - more or less free
storage.  Our parent organization is preparing to purchase a large amount of
EMC storage, the configuration of which is not yet nailed down.  We are
investigating the potential to be the recipients of part of that storage,
and (crossing fingers) no one has mentioned the dreaded chargeback word yet.
Obviously, we would be thrilled to gain access to TBs of free storage, so we
can spend more of our budget on people and compute platforms.

Naturally, the EMC reps are plying us with lots of jargon, PR, white papers,
and so on explaining why their technology is the perfect fit for us.
However, I am bothered by the fact that EMC does not have a booth at SC each
year, and I do not see them mentioned in the HPC trade rags.  Makes me think
that they don't really have the technology and support tailored for the HPC
community.

We, of course, are doing due diligence on the business case side, matching
our needs with their numbers.  My question to this group is "Do any of you
use EMC for your HPC storage?"  If so, how?  Been happy with it?

We do primarily models with heavy latency dependency (meteorological, with
CMAQ and MM5).  This will not be the near-line storage, but rather NAS
attached to HiPPI or gigE.

Thanks in advance,

Dave Walters
Project Manager, SRA
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From kallio at ebi.ac.uk  Tue Dec  9 07:24:56 2003
From: kallio at ebi.ac.uk (Kimmo Kallio)
Date: Tue, 9 Dec 2003 12:24:56 +0000 (GMT)
Subject: autofs
In-Reply-To: <3FD54110.7090703@cert.ucr.edu>
Message-ID: <Pine.LNX.4.44.0312091157560.25441-100000@rakkine.ebi.ac.uk>

On Mon, 8 Dec 2003, Glen Kaukola wrote:

> Hi,
> 
> I was wanting to use autofs to mount all the nfs shares on my nodes to 
> ease the pain of having an nfs server go down.  But the problem with 
> that, is that mpich jobs don't seem to want to run the first time 
> around.  If I then run them a second time, the drives are mounted, and 
> they run fine.  I don't think my users are going to like that too much 
> though, so would anyone know a solution?

Hi, 

This is not specific to your application but a general autofs issue: If
and autofs directory is not mounted, it simply doesn't exists and some
operations (like file exists) do fail. As a workaround try doing a
indirect autofs mount via a symlink, instead of mounting:

  /my/directory 

do a link :

  /my/directory -> /tmp_mnt/my/directory
  
and automount /tmp_mnt/my/directory instead, but always use /my/directory
in file references. Resolving the link forces the mount operation and
solves the problem. 

However if the automount fails (server down) it doesn't necessarely make
your users any happier as the applications will fail, unless if an long
nfs timeout would kill your application anyway...

Regards,

  Kimmo Kallio, European Bioinformatics Institute

> 
> Thanks,
> Glen
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From dskr at mac.com  Tue Dec  9 08:40:34 2003
From: dskr at mac.com (dskr at mac.com)
Date: Tue, 9 Dec 2003 08:40:34 -0500
Subject: Terrasoft Black Lab Linux
Message-ID: <43DEB9EC-2A4D-11D8-9EAE-00039394839E@mac.com>


Greetings:

Does anyone on the list have any experience with TerraSoft's Black Lab 
linux?

As many of you may recall, I am a big fan of 'software that sucks less' 
-- to quote a
wonderful Scyld T-shirt I once saw. Imagine my surprise, then, when I 
found that
TerraSoft (promulgators of YellowDog and BlackLab Linux for PPC) is 
shipping a
new version (2.2) of BlackLab that is based on BProc.

Is this good news? I think it could be for TerraSoft ; this move is a 
big upgrade from
their earlier offering which reminded me of the Dark Times in 
clustering.
(Does anyone else still remember when we had to set up .rhosts files 
and grab
our copy of PVM out of someone else's home directory and copy it into 
our own?)

I'd like to see what BlackLab's story is. but I have been unable to 
find any of the
sources for this product available for download. In particular, I would 
like to know:

	* Does it use beonss?

	* Does it use beoboot?

	* Does it netboot remote Macintoshes?

	* What version of BProc does it use?

	* How did they do MPI? Did they crib Don's version
	of MPICH for BProc?

Additionally, I'm looking for good ideas which can be adapted to a 
little
toy I wrote years ago called 'mpi-mandel'. They tout a similar program 
and
I was hoping to have a peek at it. Does anyone know if their similar 
program
is available under the GPL?

If anyone on this forum has experience with this product, I would 
appreciate
your feedback. If anyone can furnish me with the sources or links for 
the BlackLab
MPI, beoboot, and mandelbrot program, I would be grateful.

Regards,
	Dan Ridge

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Patrick.Begou at hmg.inpg.fr  Tue Dec  9 08:22:39 2003
From: Patrick.Begou at hmg.inpg.fr (Patrick Begou)
Date: Tue, 09 Dec 2003 14:22:39 +0100
Subject: autofs
References: <Pine.LNX.4.44.0312091157560.25441-100000@rakkine.ebi.ac.uk>
Message-ID: <3FD5CC9F.C34CC0CE@hmg.inpg.fr>

Kimmo Kallio a ?crit :
> This is not specific to your application but a general autofs issue: If
> and autofs directory is not mounted, it simply doesn't exists and some
> operations (like file exists) do fail. As a workaround try doing a
> indirect autofs mount via a symlink, instead of mounting:
> 
>   /my/directory
> 
> do a link :
> 
>   /my/directory -> /tmp_mnt/my/directory
> 
> and automount /tmp_mnt/my/directory instead, but always use /my/directory
> in file references. Resolving the link forces the mount operation and
> solves the problem.

I've done something similar but I've added "." in the linked path, like
this:
/my/directory -> /tmp_mnt/my/directory/.

I didn't get any problem with such a link.

Patrick

-- 
===============================================================
|  Equipe M.O.S.T.         | http://most.hmg.inpg.fr          |
|  Patrick BEGOU           |       ------------               |
|  LEGI                    | mailto:Patrick.Begou at hmg.inpg.fr |
|  BP 53 X                 | Tel 04 76 82 51 35               |
|  38041 GRENOBLE CEDEX    | Fax 04 76 82 52 71               |
===============================================================
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Tue Dec  9 10:03:12 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Tue, 9 Dec 2003 07:03:12 -0800
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF92@orsmsx402.jf.intel.com>

From: Mark Hahn; Sent: Monday, December 08, 2003 3:18 PM
> 
> SCSI:	pro: a nice bus-based architecture which makes it easy to put
> 	many disks in one enclosure.  the bus is fast enough to
> 	support around 3-5 disks without compromising bandwidth
> 	(in fact, you'll probably saturate your PCI(x) bus(es) first
> 	if you're not careful!) 10 and 15K RPM SCSI disks are common,
> 	leading to serious advantages if your workload is
latency-dominated
> 	(mostly of small, scattered, uncachable reads, and/or
synchronous
> 	writes.)  5yr warrantees and 1.2 Mhour MTBF are very comforting.

Very big pro:  You can get much higher *sustained* bandwidth levels,
regardless of CPU load.  ATA/PATA requires CPU involvement, and
bandwidth tanks under moderate CPU load.

The highest SCSI bandwidth rates I've seen first hand are 290 MB/S for
IA32 and 380 MB/S for IPF. Both had two controllers on independent PCI-X
busses, 6 disks for IA32 and 12 for IPF in a s/w RAID-0 config.


Does SATA reduce the CPU requirement from ATA/PATA, or is it the same?
Unless it's substantially lower, you still have a system best suited for
low to moderate I/O needs.

BTW, http://www.iozone.org/ is a nice standard I/O benchmark.  But, as
mentioned earlier in this thread, app-specific benchmarking is *always*
best.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Robin.Laing at drdc-rddc.gc.ca  Tue Dec  9 11:11:30 2003
From: Robin.Laing at drdc-rddc.gc.ca (Robin Laing)
Date: Tue, 09 Dec 2003 09:11:30 -0700
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <20031208223637.47124.qmail@web60310.mail.yahoo.com>
References: <20031208223637.47124.qmail@web60310.mail.yahoo.com>
Message-ID: <3FD5F432.6040600@drdc-rddc.gc.ca>

Andrew Latham wrote:
> While I understand your pain I have no facts for you other than that SATA is
> much faster than IDE. It can come close to SCSI(160). I have used SATA a little
> but am happy with it. the selling point for me is cost of controler and disk
> (controlers of SATA are much less), and the smaller cable format. The cable is
> so small and easy to use that it is the major draw for me.
> 
> good luck on your quest!
> 

I knew this but for straight throughput but it is random access that 
is the real question.

-- 
Robin Laing

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgoornaden at scyld.com  Tue Dec  9 03:12:03 2003
From: rgoornaden at scyld.com (rgoornaden at scyld.com)
Date: Tue, 9 Dec 2003 03:12:03 -0500
Subject: fstab
Message-ID: <200312090812.hB98C3S25365@NewBlue.scyld.com>


hello everybody
after i have edit the file /etc/fstab that I amend the fellowing line to
the file
masternode:/home /home nfs
OR
I use the command "mount -t nfs masternode:/home /home@

to check whether the nfs was successful or not I type "df" on node2 for
instance and i get this result...

"/dev/hda3  17992668 682888 16395776 4% /
none	    256900	  0  256900  0% /dev/shm "

I suposse that this is wrong as it was not mounted on the masternode

thanks
 Ryan


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Tue Dec  9 11:58:32 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Tue, 9 Dec 2003 11:58:32 -0500 (EST)
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD5F6DD.6000505@drdc-rddc.gc.ca>
Message-ID: <Pine.LNX.4.44.0312091146320.19007-100000@coffee.psychology.mcmaster.ca>

> > | My concern is regarding multiple disk read/writes.  With IDE, you can 
> > | wait for what seems like hours while data is being read off of the HD. 
> > 
> > nah.  it's basically just a design mistake to put two active PATA disks 
> > on the same channel.  it's fine if one is usually idle (say, cdrom or 
> > perhaps a disk containing old archives).  most people just avoid putting 
> > two disks on a channel at all, since channels are almost free, and you 
> > get to ignore jumpers.
>
> So it would be a good idea to put data and /tmp on a different channel 
> than swap?

if you're expecting concurrency, then you shouldn't share a limited resource.
a single (master/slave) PATA channel is one such resource.  sharing a spindle
(two partitions on a single disk of any sort) is just as much a mistake.

> > caching.  this can cause data loss or corruption if you crash at exactly the 
> > right time (paranoids take note).
> > 
> I forgot about the "write-behind" problem.  I have been burned with 
> this before.

really?  the window is quite small, since lazy-writing IDEs *do* have a 
timeout for how long they'll delay a write.  or are you thinking of the 
issue of shutting down a machine - when the ATX poweroff happens before
the write is flushed?  (and the OS fails to properly flush the cache...)
the latter is fixed in current Linux.

> memeory while working.  I know on my present workstation I will work 
> with a file that is 2X the memory and I find that the machine stutters 
> (locks for a few seconds) every time there is any disk ascess.  I 

I'll bet you a beer that this is a memory-management problem rather than 
anything wrong with the disk.  Linux has always had a tendency to over-cache,
and get to a point where you clearly notice its scavenging scans.

> one thing I was looking at with SCSI.  From this I take it that SATA 
> can handle some queueing but it just isn't supported yet?

grep LKML for jgarzik and libata.  my real point is that queueing is not 
all that important, since the kernel has always done seek scheduling.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Robin.Laing at drdc-rddc.gc.ca  Tue Dec  9 11:22:53 2003
From: Robin.Laing at drdc-rddc.gc.ca (Robin Laing)
Date: Tue, 09 Dec 2003 09:22:53 -0700
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <Pine.LNX.4.44.0312081736360.13824-100000@coffee.psychology.mcmaster.ca>
References: <Pine.LNX.4.44.0312081736360.13824-100000@coffee.psychology.mcmaster.ca>
Message-ID: <3FD5F6DD.6000505@drdc-rddc.gc.ca>

Mark Hahn wrote:
> | read the reviews and the hype about SATA being better than IDE/ATA and 
> | almost as good as SCSI, even better in a couple of areas.
> 
> <sigh>
> 
> 
> | I have talked to our computer people but they don't have enough 
> | experience with SATA drives to give me a straight answer.
> 
> there's not THAT much to know.
> 


> | My concern is regarding multiple disk read/writes.  With IDE, you can 
> | wait for what seems like hours while data is being read off of the HD. 
> 
> nah.  it's basically just a design mistake to put two active PATA disks 
> on the same channel.  it's fine if one is usually idle (say, cdrom or 
> perhaps a disk containing old archives).  most people just avoid putting 
> two disks on a channel at all, since channels are almost free, and you 
> get to ignore jumpers.
> 
So it would be a good idea to put data and /tmp on a different channel 
than swap?

> 
> |   I want to know if the problem is still as bad with SATA as the 
> | original ATA drives?  Will the onboard RAID speed up access?
> 
> there was no problem with "original" disks.  and raid works fine, up until
> you saturate your PCI bus...
> 
> 
> | I know that throughput on large files is close and is usually related 
> | to platter speed.  I am also pleased that the buffers is now 8mb on 
> | all the drives I am looking at.
> 
> one of the reasons that TCQ is not a huge win is that the kernel's cache
> is ~500x bigger than the disk's.  however, it's true that bigger ondisk cache
> lets the drive better optimize delayed writes within a cylinder.  for non-TCQ
> ATA to be competitive when writing, it's common to enable write-behind
> caching.  this can cause data loss or corruption if you crash at exactly the 
> right time (paranoids take note).
> 
I forgot about the "write-behind" problem.  I have been burned with 
this before.

> 
> | Main issue is writing and reading swap on those really large files and 
> | how it affects other work.
> 
> swap thrashing is a non-fatal error that should be fixed, 
> not band-aided by gold-plated hardware.
> 
I agree but I am not looking at swap thrashing in the sense of many 
small files.  I am looking at 1 or 2 large files that are bigger than 
memeory while working.  I know on my present workstation I will work 
with a file that is 2X the memory and I find that the machine stutters 
(locks for a few seconds) every time there is any disk ascess.  I 
would like to add more ram but that is impossible as there are only 
two slots and they are full.  Management won't provide the funds.

> finally, I should mention that Jeff Garzik is doing a series of good new SATA
> drivers (deliberately ignoring the accumulated kruft in the kernel's PATA
> code).  they plug into the kernel's SCSI interface, purely to take advantage 
> of support for queueing and hotplug, I think.
This is interesting.  I like the idea of hot-swap drives and this is 
one thing I was looking at with SCSI.  From this I take it that SATA 
can handle some queueing but it just isn't supported yet?
> 
> regards, mark hahn.
> 

-- 
Robin Laing

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgoornaden at scyld.com  Tue Dec  9 00:17:02 2003
From: rgoornaden at scyld.com (rgoornaden at scyld.com)
Date: Tue, 9 Dec 2003 00:17:02 -0500
Subject: Just Begin
Message-ID: <200312090517.hB95H2S09271@NewBlue.scyld.com>


Hello everybody...
I has just started to build a beowulf cluster and after making some
research about it, I decided to use RedHat 9.0 and using MPICH2-0.94 as
message passing software..
Well, I will be very glad if someone can guide me as a friend to construct
this cluster 
Thanks
Ryan


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Robin.Laing at drdc-rddc.gc.ca  Tue Dec  9 11:09:09 2003
From: Robin.Laing at drdc-rddc.gc.ca (Robin Laing)
Date: Tue, 09 Dec 2003 09:09:09 -0700
Subject: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <3FD5F3A5.6090802@drdc-rddc.gc.ca>

> hi ya robin/bill On Mon, 8 Dec 2003, Bill Broadley wrote:
> 
SNIP
> 
> definite yupppers one do a benchmark and compare only similar environments
> and apps ... otherwise one is comparing christmas shopping to studing
> to be a vet ( benchmarks not related to each other )
> 

I like the idea of shopping for a christmas vet. :)

> -----
> 
> for which disks ...
> 	- i'd stick with plain ole ide disks
> 	- its cheap
> 	- you can have a whole 2nd system to backup the primary array
> 	for about the same cost as an expensive dual-cpu or scsi-based
> 	system
> 
> for serial ata ...
> 	- dont use its onboard controller for raid ... 
> 	- it probably be as good as onboard raid on existing mb...
> 	( ie ... none of um works right 
> 		works == hands off booting of any disk 
> 		works == data resyncs by itself w/o intervention
> 
> 		but doing the same tests w/ sw raid or hw raid
> 		controller w/ scsi works fine
> 
This is an answer that is at least in the direction of what I am 
looking for.

> 
>>> So I'd look at bonnie++, postmark, or one of the other opensource benchmarks
>>> see if any of those can be configured to be similar to your workload.  If not
>>> write a benchmark that is similar to your workload and post it to the list asking
>>> people to run it on their hardware.  The more effort you put into it the
>>> more responses your likely to get.  Posting a table of performance results
>>> on a website seems to encourage more to participate.
> 
> 
> other benchmark tests you can run ....
> 
> 	http://www.Linux-1U.net/Benchmarks

Correct link,
http://www.Linux-1U.net/BenchMarks

The problem benchmarks software is you need the hardware to test it 
with.  What a nice circle to be involved in.

> 
> other tuning you can to to tweek the last instruction out of the system
> 
> 	http://www.Linux-1U.net/Tuning
>  

I have looked at http://www.Linux-1U.net before posting my questions 
about SATA.

> 
>>> There are no easy answers, it depends on many many variables, the type
>>> of OS, how long the partition has been live (i.e. fragmentation),
>>> the IDE/SCSI chipset, the drivers, the OS, even the cables can have
>>> performance effects.
> 
> 
> (look for the) picture of partitions/layout ... makes  big difference
> 
> 	http://www.Linux-1U.net/Partition/

I would prefer not to use SWAP at all.  Of course 1Gig of ram is now 
minimum I would put into a desktop.

> 
>>> The market seems to be going towards SATA, seems like many if not all major
>>> storage vendors have an entry level SATA product, I've no idea if this
>>> is just the latest fad or justified from a pure price/performance perspective.
> 
> 
> if the disk manufacturers stop making scsi/ide disks .. we wont have
> any choice... unless we go to the super fast "compact flash"
> and its next generation 100GB "compact flash" in the r/d labs 
> which is why ibm sold its klunky mechanical disk drives in favor
> of its new "solid state disks"  ( forgot its official name )
> 

Solid state memory has been talked about for years.  I remember the 
discussion about bubble memory.

> c ya
> alvin
> 
> 

-- 
Robin Laing

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From msnitzer at lnxi.com  Tue Dec  9 12:21:49 2003
From: msnitzer at lnxi.com (Mike Snitzer)
Date: Tue, 9 Dec 2003 10:21:49 -0700
Subject: [Beowulf] Re: Terrasoft Black Lab Linux
In-Reply-To: <43DEB9EC-2A4D-11D8-9EAE-00039394839E@mac.com>; from dskr@mac.com on Tue, Dec 09, 2003 at 08:40:34AM -0500
References: <43DEB9EC-2A4D-11D8-9EAE-00039394839E@mac.com>
Message-ID: <20031209102149.A21557@lnxi.com>

On Tue, Dec 09 2003 at 06:40,
dskr at mac.com <dskr at mac.com> wrote:

> 
> Greetings:
> 
> Does anyone on the list have any experience with TerraSoft's Black Lab 
> linux?
> 
> As many of you may recall, I am a big fan of 'software that sucks less' 
> -- to quote a
> wonderful Scyld T-shirt I once saw. Imagine my surprise, then, when I 
> found that
> TerraSoft (promulgators of YellowDog and BlackLab Linux for PPC) is 
> shipping a
> new version (2.2) of BlackLab that is based on BProc.
> 
> Is this good news? I think it could be for TerraSoft ; this move is a 
> big upgrade from
> their earlier offering which reminded me of the Dark Times in 
> clustering.
> (Does anyone else still remember when we had to set up .rhosts files 
> and grab
> our copy of PVM out of someone else's home directory and copy it into 
> our own?)
> 
> I'd like to see what BlackLab's story is. but I have been unable to 
> find any of the
> sources for this product available for download. In particular, I would 
> like to know:
> 
> 	* Does it use beonss?
> 
> 	* Does it use beoboot?
> 
> 	* Does it netboot remote Macintoshes?
> 
> 	* What version of BProc does it use?
> 
> 	* How did they do MPI? Did they crib Don's version
> 	of MPICH for BProc?

I'd imagine you've seen this link:
http://www.terrasoftsolutions.com/products/blacklab/

On that site it details the fact that BlackLab v2.2 uses Yellow Dog 3.0 as
its base and that its using Bproc 3.x and Supermon; so they likely just
used Eric Hendrik's (LANL's) Clustermatic 3.0. 

Also here is a listing of included software from their site; not too many
_real_ details: 
http://www.terrasoftsolutions.com/products/blacklab/included.shtml

Scouring ftp.{yellowdoglinux,terrasoftsolutions.com}.com didn't yield
anything.  I'd imagine terrasoft would answer emailed questions.

Mike
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Tue Dec  9 13:22:43 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Tue, 9 Dec 2003 10:22:43 -0800
Subject: [Beowulf] RE: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF97@orsmsx402.jf.intel.com>

From: Robin Laing; Sent: Tuesday, December 09, 2003 8:23 AM
> Mark Hahn wrote:
> > nah.  it's basically just a design mistake to put two active PATA
disks
> > on the same channel.  it's fine if one is usually idle (say, cdrom
or
> > perhaps a disk containing old archives).  most people just avoid
putting
> > two disks on a channel at all, since channels are almost free, and
you
> > get to ignore jumpers.
> >
> So it would be a good idea to put data and /tmp on a different channel
> than swap?

This is true of *every* system, regardless of disk technology.

However, it's even better, if possible, to put enough memory in the box
to avoid swap.

> I agree but I am not looking at swap thrashing in the sense of many
> small files.  I am looking at 1 or 2 large files that are bigger than
> memeory while working.  I know on my present workstation I will work
> with a file that is 2X the memory and I find that the machine stutters
> (locks for a few seconds) every time there is any disk ascess.  I
> would like to add more ram but that is impossible as there are only
> two slots and they are full.  Management won't provide the funds.

What kernel are you using? There were a couple/few 2.4 kernels that
would behave badly with this.  Changing the kernel and/or tuning in
/proc can help, I ran in to this and used both fixes.  I don't have the
specifics with me, but they're googleable...

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From msnitzer at lnxi.com  Tue Dec  9 12:27:14 2003
From: msnitzer at lnxi.com (Mike Snitzer)
Date: Tue, 9 Dec 2003 10:27:14 -0700
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <Pine.LNX.4.44.0312091146320.19007-100000@coffee.psychology.mcmaster.ca>; from hahn@physics.mcmaster.ca on Tue, Dec 09, 2003 at 11:58:32AM -0500
References: <3FD5F6DD.6000505@drdc-rddc.gc.ca> <Pine.LNX.4.44.0312091146320.19007-100000@coffee.psychology.mcmaster.ca>
Message-ID: <20031209102714.B21557@lnxi.com>

On Tue, Dec 09 2003 at 09:58,
Mark Hahn <hahn at physics.mcmaster.ca> wrote:

> > one thing I was looking at with SCSI.  From this I take it that SATA 
> > can handle some queueing but it just isn't supported yet?
> 
> grep LKML for jgarzik and libata.  my real point is that queueing is not 
> all that important, since the kernel has always done seek scheduling.

FYI, here is Jeff Garzik's latest Status report for Linux SATA support:
http://lwn.net/Articles/61288/

Mike
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From linux-man at verizon.net  Tue Dec  9 12:45:57 2003
From: linux-man at verizon.net (mark kandianis)
Date: Tue, 09 Dec 2003 12:45:57 -0500
Subject: [Beowulf] beowulf and X
Message-ID: <oprzxcyvu4w6rjvl@outgoing.verizon.net>

hello

i have a background in linux but not particularly beowulf.  i've lately 
been recruited
to develop a graphics system for beowulf with xfree86 and twm.  is anyone 
else doing this
out there?  also, how does beowulf get its graphics currently?  i could 
not figure that out
 from the links on the site.

mark

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From becker at scyld.com  Tue Dec  9 13:24:59 2003
From: becker at scyld.com (Donald Becker)
Date: Tue, 9 Dec 2003 13:24:59 -0500 (EST)
Subject: [Beowulf] BW-BUG meeting, Today Dec. 9, 2003, in Greenbelt MD;  -- Red Hat
Message-ID: <Pine.LNX.4.44.0312091319540.1723-100000@training.scyld.com>


[[ Please note that this month's meeting is East: Greenbelt, not McLean VA. ]]

     Baltimore Washington  Beowulf Users Group 
            December 2003 Meeting 
               www.bwbug.org
    December 9th at 3:00PM in Greenbelt MD
 
____

        RedHat Roadmap for HPC Beowulf Clusters.

        RedHat is pleased to have the opportunity to present to Baltimore-
Washington Beowulf User Group on Tuesday Dec 9th. Robert Hibbard, Red Hat's
Federal Partner Alliance Manager, will provide information on Red Hat's
Enterprise Linux product strategy, with particular emphasis on it's
relevance to High Performance Computing Clusters. 

        Discussion will include information on the background, current
product optimizations, as well as possible futures for Red Hat efforts
focused on HPCC. 
____

Our meeting facilities are once again provided by Northrup Grumman
	7501 Greenway Center Drive
	Suite 1000 (10th floor)
	Greenbelt, MD 20770, phone
	703-628-7451


-- 
Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
914 Bay Ridge Road, Suite 220		Scyld Beowulf cluster systems
Annapolis MD 21403			410-990-9993

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Tue Dec  9 12:49:26 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Tue, 9 Dec 2003 12:49:26 -0500 (EST)
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD5F6DD.6000505@drdc-rddc.gc.ca>
Message-ID: <Pine.LNX.4.44.0312091244160.18002-100000@lilith.rgb.private.net>

On Tue, 9 Dec 2003, Robin Laing wrote:

> I agree but I am not looking at swap thrashing in the sense of many 
> small files.  I am looking at 1 or 2 large files that are bigger than 
> memeory while working.  I know on my present workstation I will work 
> with a file that is 2X the memory and I find that the machine stutters 
> (locks for a few seconds) every time there is any disk ascess.  I 
> would like to add more ram but that is impossible as there are only 
> two slots and they are full.  Management won't provide the funds.

I have to ask.  Is it a P4?  Strictly empirically I have experienced
similar things even without filling memory.  I actually moved my
fileserver off onto a Celeron (which it has run flawlessly) because it
was so visible, so annoying.

I have no idea why a P4 would behave that way, but to my direct
experience at least some P4-based servers can be really BAD on file
latency for reasons that have nothing to do with the disk hardware or
kernel per se.  Maybe some sort of chipset problem, maybe related to the
particular onboard IDE/ATA controllers -- I never bothered to try to
debug it other than to move the server onto something else where it
worked.  AMD or Celeron or PIII are all just fine.

If you're stuck on the hardware side with no money to get better
hardware, well, you're stuck.  My P4 system had plenty of memory and a
1.8 MHz clock and still was a pig compared to a 400 MHz Celery serving
the SAME DISK physically moved from one to the other.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Tue Dec  9 12:42:46 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Tue, 9 Dec 2003 12:42:46 -0500 (EST)
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD5F432.6040600@drdc-rddc.gc.ca>
Message-ID: <Pine.LNX.4.44.0312091230320.18002-100000@lilith.rgb.private.net>

On Tue, 9 Dec 2003, Robin Laing wrote:

> Andrew Latham wrote:
> > While I understand your pain I have no facts for you other than that SATA is
> > much faster than IDE. It can come close to SCSI(160). I have used SATA a little
> > but am happy with it. the selling point for me is cost of controler and disk
> > (controlers of SATA are much less), and the smaller cable format. The cable is
> > so small and easy to use that it is the major draw for me.
> > 
> > good luck on your quest!
> > 
> 
> I knew this but for straight throughput but it is random access that 
> is the real question.

Random access is complicated for any drive system.  It tends to be
latency dominated -- the drive has to do lots of seeks.  Seek time, in
turn, is dominated by platter speed and platter density, with worst case
latencies related to the time required to position the head and turn the
disk so that the track start is underneath.  With drive speeds of
5000-10000 rpm, this time is pretty much fixed and not all that
different from cheap disks to the most expensive, with read and write
being a bit different (so it even matters if you do random access reads
from e.g. a big filesystem with lots of little files or random writes
ditto).  Note also that there are LOTS of components to file latency,
and disk speed is only one of them.  To open a file, the kernel must
first stat it to see if you are PERMITTED to open it.

Note also that the kernel is DESIGNED to hide slow filesystem speeds
from the user.  The kernel caches and buffers and never throws anything
away it might need later unless/until it has to.  A common benchmarking
mistake is to open a file (to see how long it takes) and then open it
again right away in a loop.  Surprise!  It takes a ``long time'' the
first time but the second time is nearly instantaneous, because the
second time the request is served out of the kernel's cache.  A system
with a lot of memory will use all but a tiny fraction of that memory
caching things, if it can.

I don't expect things like latency to be VASTLY affected by SATA vs PATA
vs SCSI, see Mark's remarks on disk speed and platter density -- that is
more strongly related to the disk hardware, not the interface.  Even
things like on-disk cache are trivial in size compared to the kernel's
caches, although I'm sure they help somewhat under some circumstances.  

     rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Tue Dec  9 13:50:00 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Tue, 9 Dec 2003 13:50:00 -0500 (EST)
Subject: [Beowulf] beowulf and X
In-Reply-To: <oprzxcyvu4w6rjvl@outgoing.verizon.net>
Message-ID: <Pine.LNX.4.44.0312091341170.9213-100000@ganesh.phy.duke.edu>

On Tue, 9 Dec 2003, mark kandianis wrote:

> hello
> 
> i have a background in linux but not particularly beowulf.  i've lately 
> been recruited
> to develop a graphics system for beowulf with xfree86 and twm.  is anyone 
> else doing this
> out there?  also, how does beowulf get its graphics currently?  i could 
> not figure that out
>  from the links on the site.

What exactly do you mean?  Or rather, I think that defining your
engineering goal is the first step for you to accomplish.  "Beowulf"
doesn't get its graphics any particular way, but systems with graphical
heads can be nodes on a beowulfish or other cluster computer design, and
a piece of parallel software could certainly be written to do the
computation on a collection of nodes and graphically represent the
computation on a graphical head in real time or as movies afterward.
Several demo/benchmarky type applications exist that sort-of demonstrate
this -- pvmpov (a raytracing application) and various mandelbrot set
demos e.g. xep in PVM.

So to even get started you've got to figure out what the problem really
is.  Do you mean:

     Display                     "beowulf"

  Graphics head =====network====head node 00
                                |node 01
                                |node 02
                                |node 03
                                |...

(do a computation on the beowulf that e.g. makes an image or creates a
data representation of some sort, send it via the network to the
display, then graphically display it) or:

  Graphical head node 00
   |node 01
   |node 02
   |node 03
   |...

(do the computation where the graphical display is FULLY INTEGRATED with
the nodes, so each node result is independently updated and displayed
with or without a synchronization step/barrier) or:

  ...something else?

In each case, the suitable design will almost certainly be fairly
uniquely suggested by the task, if it is feasible at all to accomplish
the task.  It may not be.

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From laytonjb at comcast.net  Tue Dec  9 17:56:04 2003
From: laytonjb at comcast.net (Jeffrey B. Layton)
Date: Tue, 09 Dec 2003 17:56:04 -0500
Subject: [Beowulf] Re: EMC, anyone?
In-Reply-To: <0EB5C81FE6FE5A4F8D1FEBF59C6C7BAA1A1824@durham.sra.com>
References: <0EB5C81FE6FE5A4F8D1FEBF59C6C7BAA1A1824@durham.sra.com>
Message-ID: <3FD65304.7030606@comcast.net>

David,

   We tried using EMC for storage for one of our
cluster at work. We have a node in the cluster (we
called it the IO node) that was SAN attached to an
EMC SAN. Then that space was NFS exported
throughout the cluster (288 nodes in total).
   Initially we exported the NFS storage over
Myrinet. After some problems we tried it over
FastE.
   The end result was that we never got it to work
correctly. We had filesystems that would just
disappear from the IO node and then reappear. We
had lots of file corruptions and files lost. My favorite
was the 2 TB filesystem that had to be fsck (man
that took a long time). We had EMC folks in, Dell
people in (they supplied the EMC certified IO node)
and the cluster vendor in. No one could ever figure
out the problems although the cluster vendor was
able to help the situations some (Dell and EMC
really did nothing to help). Finally, we ended up
taking the IO node out of the cluster and only
NFS mounting it on the master node. We also
forced people to run using the local hard drives and
not over NFS. This helped things, but we still had
problems from time to time.
   The ultimate solution was to convert the IO node
to a NAS box with attached storage.
   Good Luck with your project!

Jeff

>Our group has an opportunity that few would pass up - more or less free
>storage.  Our parent organization is preparing to purchase a large amount of
>EMC storage, the configuration of which is not yet nailed down.  We are
>investigating the potential to be the recipients of part of that storage,
>and (crossing fingers) no one has mentioned the dreaded chargeback word yet.
>Obviously, we would be thrilled to gain access to TBs of free storage, so we
>can spend more of our budget on people and compute platforms.
>
>Naturally, the EMC reps are plying us with lots of jargon, PR, white papers,
>and so on explaining why their technology is the perfect fit for us.
>However, I am bothered by the fact that EMC does not have a booth at SC each
>year, and I do not see them mentioned in the HPC trade rags.  Makes me think
>that they don't really have the technology and support tailored for the HPC
>community.
>
>We, of course, are doing due diligence on the business case side, matching
>our needs with their numbers.  My question to this group is "Do any of you
>use EMC for your HPC storage?"  If so, how?  Been happy with it?
>
>We do primarily models with heavy latency dependency (meteorological, with
>CMAQ and MM5).  This will not be the near-line storage, but rather NAS
>attached to HiPPI or gigE.
>
>Thanks in advance,
>
>Dave Walters
>Project Manager, SRA
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
>  
>


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Tue Dec  9 18:31:55 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Tue, 9 Dec 2003 15:31:55 -0800
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <187D3A7CAB42A54DB61F1D05F012572201D4BF92@orsmsx402.jf.intel.com>
References: <187D3A7CAB42A54DB61F1D05F012572201D4BF92@orsmsx402.jf.intel.com>
Message-ID: <20031209233155.GC7713@cse.ucdavis.edu>

On Tue, Dec 09, 2003 at 07:03:12AM -0800, Lombard, David N wrote:
> Very big pro:  You can get much higher *sustained* bandwidth levels,
> regardless of CPU load.  ATA/PATA requires CPU involvement, and
> bandwidth tanks under moderate CPU load.

I've heard this before, I've yet to see it.  To what do you attribute
this advantage?  DMA scatter gather?  Higher bitrate at the read head?

Do you have a way to quantify this *sustained* bandwidth?  Care to share?

> The highest SCSI bandwidth rates I've seen first hand are 290 MB/S for
> IA32 and 380 MB/S for IPF. Both had two controllers on independent PCI-X
> busses, 6 disks for IA32 and 12 for IPF in a s/w RAID-0 config.

Was this RAID-5?  In Hardware?  In Software?  Which controllers?

Do you have any reason to believe you wouldn't see similar with the
same number of SATA drives on 2 independent PCI-X busses?

I've seen 250 MB/sec from a relatively vanilla single controller setup.

Check out: (no I don't really trust tom's that much):
http://www6.tomshardware.com/storage/20031114/raidcore-24.html#data_transfer_diagrams_raid_5

The RaidCore manages 250 MB/sec decaying to 180MB/sec on the slower inner
tracks of a drive.  Certainly seems like 2 of these on seperate busses
would have a good change of hitting the above numbers.

Note the very similar SCSI 8 drive setups are slower.

> Does SATA reduce the CPU requirement from ATA/PATA, or is it the same?
> Unless it's substantially lower, you still have a system best suited for
> low to moderate I/O needs.

Do you have any way to quantify this?  Care to share?  I've seen many similar
comments but when I actually go measure I get very similar numbers, often
single disks managing 40-60 MB/sec and 10% cpu, and maximum disk transfer
rates around 300-400 MB/sec at fairly high rates of cpu usage.

> BTW, http://www.iozone.org/ is a nice standard I/O benchmark.  But, as
> mentioned earlier in this thread, app-specific benchmarking is *always*
> best.

Agreed.  Iozone or bonnie++ seem to do fine on large sequential file
benchmarking I prefer postmark for replicating database like access patterns.


-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From linux-man at verizon.net  Tue Dec  9 19:43:20 2003
From: linux-man at verizon.net (mark kandianis)
Date: Tue, 09 Dec 2003 19:43:20 -0500
Subject: [Beowulf] beowulf and X
In-Reply-To: <Pine.LNX.4.44.0312091341170.9213-100000@ganesh.phy.duke.edu>
References: <Pine.LNX.4.44.0312091341170.9213-100000@ganesh.phy.duke.edu>
Message-ID: <oprzxwaijew6rjvl@outgoing.verizon.net>

On Tue, 9 Dec 2003 13:50:00 -0500 (EST), Robert G. Brown 
<rgb at phy.duke.edu> wrote:

> On Tue, 9 Dec 2003, mark kandianis wrote:
>
>> hello
>>
>> i have a background in linux but not particularly beowulf.  i've lately
>> been recruited
>> to develop a graphics system for beowulf with xfree86 and twm.  is 
>> anyone
>> else doing this
>> out there?  also, how does beowulf get its graphics currently?  i could
>> not figure that out
>>  from the links on the site.
>
> What exactly do you mean?  Or rather, I think that defining your
> engineering goal is the first step for you to accomplish.  "Beowulf"
> doesn't get its graphics any particular way, but systems with graphical
> heads can be nodes on a beowulfish or other cluster computer design, and
> a piece of parallel software could certainly be written to do the
> computation on a collection of nodes and graphically represent the
> computation on a graphical head in real time or as movies afterward.
> Several demo/benchmarky type applications exist that sort-of demonstrate
> this -- pvmpov (a raytracing application) and various mandelbrot set
> demos e.g. xep in PVM.
>
> So to even get started you've got to figure out what the problem really
> is.  Do you mean:
>
>      Display                     "beowulf"
>
>   Graphics head =====network====head node 00
>                                 |node 01
>                                 |node 02
>                                 |node 03
>                                 |...
>
> (do a computation on the beowulf that e.g. makes an image or creates a
> data representation of some sort, send it via the network to the
> display, then graphically display it) or:
>
>   Graphical head node 00
>    |node 01
>    |node 02
>    |node 03
>    |...
>
> (do the computation where the graphical display is FULLY INTEGRATED with
> the nodes, so each node result is independently updated and displayed
> with or without a synchronization step/barrier) or:
>
>   ...something else?
>
> In each case, the suitable design will almost certainly be fairly
> uniquely suggested by the task, if it is feasible at all to accomplish
> the task.  It may not be.
>
>    rgb
>
> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
>
>

quite honestly if mosix can do it, it seems that xfree86 is already there,
so it looks like my question is moot. so i think i can get this up quicker 
than
i thought.

are there any particular kernels that are geared to beowulf?  or is this 
something
that one has to roll their own?

regards

mark


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Tue Dec  9 19:37:32 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Tue, 9 Dec 2003 16:37:32 -0800
Subject: [Beowulf] RE: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BF9D@orsmsx402.jf.intel.com>

From: Bill Broadley [mailto:bill at cse.ucdavis.edu]
> 
> On Tue, Dec 09, 2003 at 07:03:12AM -0800, Lombard, David N wrote:
> > Very big pro:  You can get much higher *sustained* bandwidth levels,
> > regardless of CPU load.  ATA/PATA requires CPU involvement, and
> > bandwidth tanks under moderate CPU load.
> 
> I've heard this before, I've yet to see it.  To what do you attribute
> this advantage?  DMA scatter gather?  Higher bitrate at the read head?

Non involvement of the CPU with direct disk activities (i.e., the bits
handled by the SCSI controller) plus *way* faster CPU to handle the
high-level RAID processing v. the pokey processors found on most RAID
cards.  With multiple controllers on separate busses, I don't funnel all
my I/O through one bus.  Note again, I only discuss maximal disk
bandwidth, which means RAID-0.

> Do you have a way to quantify this *sustained* bandwidth?  Care to
share?

Direct measurement with both standard testers and applications.
Sustained means a dataset substantially larger than memory to avoid
cache effects.

> > The highest SCSI bandwidth rates I've seen first hand are 290 MB/S
for
> > IA32 and 380 MB/S for IPF. Both had two controllers on independent
PCI-X
> > busses, 6 disks for IA32 and 12 for IPF in a s/w RAID-0 config.
                                                 ==========
> Was this RAID-5?  In Hardware?  In Software?  Which controllers?
See underlining immediately above.

> Do you have any reason to believe you wouldn't see similar with the
> same number of SATA drives on 2 independent PCI-X busses?

I have no info on SATA, thus the question later on.
 
> I've seen 250 MB/sec from a relatively vanilla single controller
setup.

What file size v. memory and what CPU load *not* associated with
actually driving the I/O?

> Check out: (no I don't really trust tom's that much):
> http://www6.tomshardware.com/storage/20031114/raidcore-
> 24.html#data_transfer_diagrams_raid_5
> 
> The RaidCore manages 250 MB/sec decaying to 180MB/sec on the slower
inner
> tracks of a drive.  Certainly seems like 2 of these on seperate busses
> would have a good change of hitting the above numbers.
> 
> Note the very similar SCSI 8 drive setups are slower.

I'll look at this.

> > Does SATA reduce the CPU requirement from ATA/PATA, or is it the
same?
> > Unless it's substantially lower, you still have a system best suited
for
> > low to moderate I/O needs.
> 
> Do you have any way to quantify this?  Care to share?  I've seen many
> similar
> comments but when I actually go measure I get very similar numbers,
often
> single disks managing 40-60 MB/sec and 10% cpu, and maximum disk
transfer
> rates around 300-400 MB/sec at fairly high rates of cpu usage.

Direct measurement with both standard testers and applications.
Sustained means a dataset substantially larger than memory to avoid
cache effects.

You repeated my comment, "fairly high rates of cpu usage" -- high cpu
usage _just_to_drive_the_I/O_ meaning it's unavailable for the
application.  Also, are you quoting a burst number, that can benefit
from caching, or a sustained number, where the cache was exhausted long
ago?

The high cpu load hurts scientific/engineering apps that want to access
lots of data on disk, and burst rates are meaningless. In addition, I've
repeatedly heard that same thing from sysadmins setting up NFS servers
-- the ATA/PATA disks have too great a *negative* impact on NFS server
performance -- here the burst rates should have been more significant,
but the CPU load got in the way.

> > BTW, http://www.iozone.org/ is a nice standard I/O benchmark.  But,
as
> > mentioned earlier in this thread, app-specific benchmarking is
*always*
> > best.
> 
> Agreed.  Iozone or bonnie++ seem to do fine on large sequential file
> benchmarking I prefer postmark for replicating database like access
> patterns.

Good to know. Thanks.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Tue Dec  9 21:19:18 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Tue, 9 Dec 2003 18:19:18 -0800
Subject: [Beowulf] RE: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <187D3A7CAB42A54DB61F1D05F012572201D4BF9D@orsmsx402.jf.intel.com>
References: <187D3A7CAB42A54DB61F1D05F012572201D4BF9D@orsmsx402.jf.intel.com>
Message-ID: <20031210021918.GL7713@cse.ucdavis.edu>

On Tue, Dec 09, 2003 at 04:37:32PM -0800, Lombard, David N wrote:
> From: Bill Broadley [mailto:bill at cse.ucdavis.edu]
> > 
> > On Tue, Dec 09, 2003 at 07:03:12AM -0800, Lombard, David N wrote:
> > > Very big pro:  You can get much higher *sustained* bandwidth levels,
> > > regardless of CPU load.  ATA/PATA requires CPU involvement, and
> > > bandwidth tanks under moderate CPU load.
> > 
> > I've heard this before, I've yet to see it.  To what do you attribute
> > this advantage?  DMA scatter gather?  Higher bitrate at the read head?
> 
> Non involvement of the CPU with direct disk activities (i.e., the bits
> handled by the SCSI controller)

Er, the way I understand it is with PATA, SCSI, or SATA the driver
basically says Read or write these block(s) at this ADDR and raise
an interupt when done.  Any corrections?

> plus *way* faster CPU to handle the
> high-level RAID processing

I'm a big fan of software RAID, although it's not a SATA vs SCSI issue.

> v. the pokey processors found on most RAID
> cards. 

Agreed.

> With multiple controllers on separate busses, I don't funnel all
> my I/O through one bus.  Note again, I only discuss maximal disk
> bandwidth, which means RAID-0.

Right, sorry I missed the mention.

> Direct measurement with both standard testers and applications.
> Sustained means a dataset substantially larger than memory to avoid
> cache effects.

Seems that it's fairly common to manage 300 MB/sec +/- 50 MB/sec from
1-2 PCI cards.  I've done similar with 3 U160 channels on an older
dual P4.  The URL I posted shows the same for SATA.

> > > The highest SCSI bandwidth rates I've seen first hand are 290 MB/S
> for
> > > IA32 and 380 MB/S for IPF. Both had two controllers on independent
> PCI-X
> > > busses, 6 disks for IA32 and 12 for IPF in a s/w RAID-0 config.
>                                                  ==========
> > Was this RAID-5?  In Hardware?  In Software?  Which controllers?

> See underlining immediately above.

Sorry.

> > Do you have any reason to believe you wouldn't see similar with the
> > same number of SATA drives on 2 independent PCI-X busses?
> 
> I have no info on SATA, thus the question later on.

Ah, well the URL shows a single card managing 250 MB/sec which decays
to 180 MB/sec on the slower tracks.  Filesystems, PCI busses, and memory
systems seem to start being an effect here.  I've not seen much more
the 330 MB/sec (my case) up to 400 MB/sec (various random sources).  Even
my switch from ext3 to XFS helped substantially.  With ext3 I was getting
265-280 MB/sec, with XFS my highest sustained sequential bandwidth was
around 330 MB/sec.

Presumably the mentioned raidcore card could perform even better with
raid-0 then raid-5.

> > I've seen 250 MB/sec from a relatively vanilla single controller
> setup.
> 
> What file size v. memory.

18 GBs of file I/O with 6 GB ram on a dual p4 1.8 GHz

> and what CPU load *not* associated with
> actually driving the I/O?

None, just a benchmark, but it showed 50-80% cpu usage for a single CPU,
this was SCSI though.  I've yet to see any I/O system PC based system 
shove this much data around without significant CPU usage.

> Direct measurement with both standard testers and applications.
> Sustained means a dataset substantially larger than memory to avoid
> cache effects.

Of course, I use a factor of 4 minimum to minimize cache effects.

> You repeated my comment, "fairly high rates of cpu usage" -- high cpu
> usage _just_to_drive_the_I/O_ meaning it's unavailable for the
> application.  Also, are you quoting a burst number, that can benefit
> from caching, or a sustained number, where the cache was exhausted long
> ago?

Well the cost of adding an additional cpu to a fileserver is usually
fairly minimal compared to the cost to own of a few TB of disk.  My
system was configured to look like a quad p4-1.8 (because of hyperthreading)
and one cpu would be around 60-80% depending on FS and which stage
of the benchmark was running.  I was careful to avoid cache effects.

I do have a quad CPU opteron I could use as a test bed as well.

> The high cpu load hurts scientific/engineering apps that want to access
> lots of data on disk, and burst rates are meaningless.

Agreed.

> In addition, I've
> repeatedly heard that same thing from sysadmins setting up NFS servers
> -- the ATA/PATA disks have too great a *negative* impact on NFS server
> performance -- here the burst rates should have been more significant,
> but the CPU load got in the way.

An interesting comment, one that I've not noticed personally, can
anyone offer a benchmark or application?  Was this mostly sequential?
Mostly random?  I'd be happy to run some benchmarks over NFS.

I'd love to quantify an honest to god advantage in one direction or
another, preferably collected from some kind of reproducable workload
so that the numerous variables can be pruned down to the ones
with the largest effect on performance or CPU load.

-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lathama at yahoo.com  Tue Dec  9 22:16:44 2003
From: lathama at yahoo.com (Andrew Latham)
Date: Tue, 9 Dec 2003 19:16:44 -0800 (PST)
Subject: [Beowulf] RE: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <20031210021918.GL7713@cse.ucdavis.edu>
Message-ID: <20031210031644.84113.qmail@web60302.mail.yahoo.com>

Amature thought but give it a read.

Would the advances in compressed filesystems like cramfs allow you to access
the 18gig of info on 6gig of ram. I do not know what the file type is and I am
assuming that it is not flat text (xml or other). If however you where working
on a dataset in xml at about 18gig would a compressed filesystem on 6gig of ram
be fast?

Andrew Latham
Wanna Be Employed :-)


--- Bill Broadley <bill at cse.ucdavis.edu> wrote:
> On Tue, Dec 09, 2003 at 04:37:32PM -0800, Lombard, David N wrote:
> > From: Bill Broadley [mailto:bill at cse.ucdavis.edu]
> > > 
> > > On Tue, Dec 09, 2003 at 07:03:12AM -0800, Lombard, David N wrote:
> > > > Very big pro:  You can get much higher *sustained* bandwidth levels,
> > > > regardless of CPU load.  ATA/PATA requires CPU involvement, and
> > > > bandwidth tanks under moderate CPU load.
> > > 
> > > I've heard this before, I've yet to see it.  To what do you attribute
> > > this advantage?  DMA scatter gather?  Higher bitrate at the read head?
> > 
> > Non involvement of the CPU with direct disk activities (i.e., the bits
> > handled by the SCSI controller)
> 
> Er, the way I understand it is with PATA, SCSI, or SATA the driver
> basically says Read or write these block(s) at this ADDR and raise
> an interupt when done.  Any corrections?
> 
> > plus *way* faster CPU to handle the
> > high-level RAID processing
> 
> I'm a big fan of software RAID, although it's not a SATA vs SCSI issue.
> 
> > v. the pokey processors found on most RAID
> > cards. 
> 
> Agreed.
> 
> > With multiple controllers on separate busses, I don't funnel all
> > my I/O through one bus.  Note again, I only discuss maximal disk
> > bandwidth, which means RAID-0.
> 
> Right, sorry I missed the mention.
> 
> > Direct measurement with both standard testers and applications.
> > Sustained means a dataset substantially larger than memory to avoid
> > cache effects.
> 
> Seems that it's fairly common to manage 300 MB/sec +/- 50 MB/sec from
> 1-2 PCI cards.  I've done similar with 3 U160 channels on an older
> dual P4.  The URL I posted shows the same for SATA.
> 
> > > > The highest SCSI bandwidth rates I've seen first hand are 290 MB/S
> > for
> > > > IA32 and 380 MB/S for IPF. Both had two controllers on independent
> > PCI-X
> > > > busses, 6 disks for IA32 and 12 for IPF in a s/w RAID-0 config.
> >                                                  ==========
> > > Was this RAID-5?  In Hardware?  In Software?  Which controllers?
> 
> > See underlining immediately above.
> 
> Sorry.
> 
> > > Do you have any reason to believe you wouldn't see similar with the
> > > same number of SATA drives on 2 independent PCI-X busses?
> > 
> > I have no info on SATA, thus the question later on.
> 
> Ah, well the URL shows a single card managing 250 MB/sec which decays
> to 180 MB/sec on the slower tracks.  Filesystems, PCI busses, and memory
> systems seem to start being an effect here.  I've not seen much more
> the 330 MB/sec (my case) up to 400 MB/sec (various random sources).  Even
> my switch from ext3 to XFS helped substantially.  With ext3 I was getting
> 265-280 MB/sec, with XFS my highest sustained sequential bandwidth was
> around 330 MB/sec.
> 
> Presumably the mentioned raidcore card could perform even better with
> raid-0 then raid-5.
> 
> > > I've seen 250 MB/sec from a relatively vanilla single controller
> > setup.
> > 
> > What file size v. memory.
> 
> 18 GBs of file I/O with 6 GB ram on a dual p4 1.8 GHz
> 
> > and what CPU load *not* associated with
> > actually driving the I/O?
> 
> None, just a benchmark, but it showed 50-80% cpu usage for a single CPU,
> this was SCSI though.  I've yet to see any I/O system PC based system 
> shove this much data around without significant CPU usage.
> 
> > Direct measurement with both standard testers and applications.
> > Sustained means a dataset substantially larger than memory to avoid
> > cache effects.
> 
> Of course, I use a factor of 4 minimum to minimize cache effects.
> 
> > You repeated my comment, "fairly high rates of cpu usage" -- high cpu
> > usage _just_to_drive_the_I/O_ meaning it's unavailable for the
> > application.  Also, are you quoting a burst number, that can benefit
> > from caching, or a sustained number, where the cache was exhausted long
> > ago?
> 
> Well the cost of adding an additional cpu to a fileserver is usually
> fairly minimal compared to the cost to own of a few TB of disk.  My
> system was configured to look like a quad p4-1.8 (because of hyperthreading)
> and one cpu would be around 60-80% depending on FS and which stage
> of the benchmark was running.  I was careful to avoid cache effects.
> 
> I do have a quad CPU opteron I could use as a test bed as well.
> 
> > The high cpu load hurts scientific/engineering apps that want to access
> > lots of data on disk, and burst rates are meaningless.
> 
> Agreed.
> 
> > In addition, I've
> > repeatedly heard that same thing from sysadmins setting up NFS servers
> > -- the ATA/PATA disks have too great a *negative* impact on NFS server
> > performance -- here the burst rates should have been more significant,
> > but the CPU load got in the way.
> 
> An interesting comment, one that I've not noticed personally, can
> anyone offer a benchmark or application?  Was this mostly sequential?
> Mostly random?  I'd be happy to run some benchmarks over NFS.
> 
> I'd love to quantify an honest to god advantage in one direction or
> another, preferably collected from some kind of reproducable workload
> so that the numerous variables can be pruned down to the ones
> with the largest effect on performance or CPU load.
> 
> -- 
> Bill Broadley
> Information Architect
> Computational Science and Engineering
> UC Davis
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Wed Dec 10 07:25:50 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 10 Dec 2003 07:25:50 -0500 (EST)
Subject: [Beowulf] beowulf and X
In-Reply-To: <oprzxwaijew6rjvl@outgoing.verizon.net>
Message-ID: <Pine.LNX.4.44.0312100707360.18621-100000@lilith.rgb.private.net>

On Tue, 9 Dec 2003, mark kandianis wrote:

> quite honestly if mosix can do it, it seems that xfree86 is already there,
> so it looks like my question is moot. so i think i can get this up quicker 
> than
> i thought.
> 
> are there any particular kernels that are geared to beowulf?  or is this 
> something
> that one has to roll their own?

Hmmm, it looks like you really need a general introduction to the
subject.  Mosix may or may not be the most desireable way to proceed, as
it is quite "expensive" in terms of overhead and requires a custom
(patched) kernel.  It is also not exactly a GPL product, although it is
free and open source.  If you like, its "fork and forget" design
requires all I/O channels of any sort to be transparently encapsulated
and forwarded over TCP sockets to the master host where the jobs are
begun.  For something with little, rare I/O this is fine -- Mosix then
becomes a sort of distributed interface to a standard Linux scheduler
with a moderate degree of load balancing over the network.  For
something that opens lots of files or pipes and does a lot of writing to
them, it can clog up your network and kernel somewhat faster than an
actual parallel program where you can control e.g. data collection
patterns and avoid collisions and reduce the overhead of encapsulation.

If you're talking only a "small" cluster -- < 64 nodes, maybe < 32 nodes
(it depends on the I/O load of your application) -- you have a decent
chance of not getting into trouble with scaling, but you should
definitely experiment.  If you're wanting to run on hundreds of nodes,
I'd be concerned that you'll only be able to use ten, or thirty, or
forty seven, before your application scaling craps out -- all the other
nodes are then potentially "wasted".

There are quite a few resources for cluster beginners out there, many of
them linked to:

   http://www.phy.duke.edu/brahma

(so I won't bother detailing URL's to them all here).  Links and
resources on this site include papers and talks, an online book
(perennially unfinished, but still mostly complete and even
sorta-current:-) on cluster engineering, links to the FAQ, HOWTO, the
Beowulf Underground, turnkey vendor/cluster consultants, useful
hardware, networking stuff -- I've tried to make it a resource
clearinghouse although even so it is far from complete and gets out of
date if I blink.

Finally, I'd urge you to subscribe to the new Cluster Magazine (plug
plug, hint hint) which has articles that will undoubtedly help you out
with all sorts of things over the next twelve months.  I just got my
first issue, and its articles are being written by really smart people
on this list (and a few bozos -- sorry, OLD joke:-) and should be very,
very helpful to people trying to engineer their first cluster or their
fifteenth.  Besides, you get three free trial issues if you sign up now
and live in the US.

Best of luck, and to get even MORE help, describe your actual problem in
more detail.  Possibly after reading about parallel scaling and Amdahl's
Law.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From msnitzer at lnxi.com  Wed Dec 10 09:02:28 2003
From: msnitzer at lnxi.com (Mike Snitzer)
Date: Wed, 10 Dec 2003 07:02:28 -0700
Subject: [Beowulf] Re: BW-BUG meeting, Today Dec. 9, 2003, in Greenbelt MD;  -- Red Hat
In-Reply-To: <Pine.LNX.4.44.0312091319540.1723-100000@training.scyld.com>; from becker@scyld.com on Tue, Dec 09, 2003 at 01:24:59PM -0500
References: <Pine.LNX.4.44.0312091319540.1723-100000@training.scyld.com>
Message-ID: <20031210070228.A28351@lnxi.com>

On Tue, Dec 09 2003 at 11:24,
Donald Becker <becker at scyld.com> wrote:

> 
> [[ Please note that this month's meeting is East: Greenbelt, not McLean VA. ]]
> 
>      Baltimore Washington  Beowulf Users Group 
>             December 2003 Meeting 
>                www.bwbug.org
>     December 9th at 3:00PM in Greenbelt MD
>  
> ____
> 
>         RedHat Roadmap for HPC Beowulf Clusters.
> 
>         RedHat is pleased to have the opportunity to present to Baltimore-
> Washington Beowulf User Group on Tuesday Dec 9th. Robert Hibbard, Red Hat's
> Federal Partner Alliance Manager, will provide information on Red Hat's
> Enterprise Linux product strategy, with particular emphasis on it's
> relevance to High Performance Computing Clusters. 
> 
>         Discussion will include information on the background, current
> product optimizations, as well as possible futures for Red Hat efforts
> focused on HPCC. 

Can those who attended this meeting provide a summary?  Was bwbug able to
get Robert's presentation?

thanks,
Mike

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Robin.Laing at drdc-rddc.gc.ca  Wed Dec 10 11:32:56 2003
From: Robin.Laing at drdc-rddc.gc.ca (Robin Laing)
Date: Wed, 10 Dec 2003 09:32:56 -0700
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <Pine.LNX.4.44.0312091244160.18002-100000@lilith.rgb.private.net>
References: <Pine.LNX.4.44.0312091244160.18002-100000@lilith.rgb.private.net>
Message-ID: <3FD74AB8.2090802@drdc-rddc.gc.ca>

Robert G. Brown wrote:
> On Tue, 9 Dec 2003, Robin Laing wrote:
> 
> 
>>I agree but I am not looking at swap thrashing in the sense of many 
>>small files.  I am looking at 1 or 2 large files that are bigger than 
>>memeory while working.  I know on my present workstation I will work 
>>with a file that is 2X the memory and I find that the machine stutters 
>>(locks for a few seconds) every time there is any disk ascess.  I 
>>would like to add more ram but that is impossible as there are only 
>>two slots and they are full.  Management won't provide the funds.
> 
> 
> I have to ask.  Is it a P4?  Strictly empirically I have experienced
> similar things even without filling memory.  I actually moved my
> fileserver off onto a Celeron (which it has run flawlessly) because it
> was so visible, so annoying.

Dell P4 with 512M ram.  IDE drive.

> 
> I have no idea why a P4 would behave that way, but to my direct
> experience at least some P4-based servers can be really BAD on file
> latency for reasons that have nothing to do with the disk hardware or
> kernel per se.  Maybe some sort of chipset problem, maybe related to the
> particular onboard IDE/ATA controllers -- I never bothered to try to
> debug it other than to move the server onto something else where it
> worked.  AMD or Celeron or PIII are all just fine.
Even more reason for me to stick with AMD's.
> 
> If you're stuck on the hardware side with no money to get better
> hardware, well, you're stuck.  My P4 system had plenty of memory and a
> 1.8 MHz clock and still was a pig compared to a 400 MHz Celery serving
> the SAME DISK physically moved from one to the other.
> 
>    rgb
> 

I find that my P90 at home with UWSCSI is faster most of the time than 
my computer at work.

This thread has sure opened up some debate.  I didn't think it would 
raise the number of issues it has.

-- 
Robin Laing

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From landman at scalableinformatics.com  Wed Dec 10 13:12:47 2003
From: landman at scalableinformatics.com (landman)
Date: Wed, 10 Dec 2003 13:12:47 -0500
Subject: [Beowulf] MPICH error
Message-ID: <20031210180551.M13163@scalableinformatics.com>

Hi Folks:

  A customer is seeing 

        rm_1310:  p4_error: rm_start: net_conn_to_listener failed: 33220

on an MPI job.  Used to work (just last week).  Updated the kernel was the major
change (added XFS support)

  Any idea of what this is?  I assume a network change.  MPICH 1.2.4.  Do I need
to recompile MPICH to match the kernel?

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From victor_ms at bol.com.br  Wed Dec 10 05:59:29 2003
From: victor_ms at bol.com.br (Victor Lima)
Date: Wed, 10 Dec 2003 07:59:29 -0300
Subject: [Beowulf] About Linpack
Message-ID: <3FD6FC91.1030402@bol.com.br>

Hello All,
I have a problem with a Linpack (HPL) with my small LinuxRedHat 7.1 with 
kernel 2.4.20 Cluster (17 Machines) with Mosix and MPI.
When I try to execute xhpl this message apear on my screan:

mpirun -np X xhpl

Where X is the number 17, I try to change the file, hpl.dat, but nothin 
happend.


HPL ERROR from process # 0, on line 408 of function HPL_pdinfo:
 >>> Need at least 4 processes for these tests <<<

HPL ERROR from process # 0, on line 610 of function HPL_pdinfo:
 >>> Illegal input in file HPL.dat. Exiting ... <<<

HPL ERROR from process # 0, on line 408 of function HPL_pdinfo:
 >>> Need at least 4 processes for these tests <<<

HPL ERROR from process # 0, on line 610 of function HPL_pdinfo:
 >>> Illegal input in file HPL.dat. Exiting ... <<<

HPL ERROR from process # 0, on line 408 of function HPL_pdinfo:
 >>> Need at least 4 processes for these tests <<<

HPL ERROR from process # 0, on line 610 of function HPL_pdinfo:
 >>> Illegal input in file HPL.dat. Exiting ... <<<

HPL ERROR from process # 0, on line 408 of function HPL_pdinfo:
 >>> Need at least 4 processes for these tests <<<

HPL ERROR from process # 0, on line 610 of function HPL_pdinfo:
 >>> Illegal input in file HPL.dat. Exiting ... <<<


Some one here has the same problem?

+------------------------------------------------
Universidade Cat?lica Dom Bosco
Esp. Redes de Computadores
+55 67 312-3300
Campo Grande / Mato Grosso do Sul
BRAZIL


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Wed Dec 10 14:07:53 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 10 Dec 2003 14:07:53 -0500 (EST)
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
In-Reply-To: <3FD74AB8.2090802@drdc-rddc.gc.ca>
Message-ID: <Pine.LNX.4.44.0312101357070.11945-100000@ganesh.phy.duke.edu>

On Wed, 10 Dec 2003, Robin Laing wrote:

> > I have to ask.  Is it a P4?  Strictly empirically I have experienced
> > similar things even without filling memory.  I actually moved my
> > fileserver off onto a Celeron (which it has run flawlessly) because it
> > was so visible, so annoying.
> 
> Dell P4 with 512M ram.  IDE drive.

One thing Mark suggested (offline, I think) is that TOO MUCH memory can
confuse the caching system of at least some kernels.  Since I never
fully debugged this problem, but instead worked around it (a Celeron,
memory, motherboard, case costs maybe $350 and my time and annoyance are
worth much more than this) I don't know if this is true or not, but it
got to where it could actually crash the system when it was running as
an NFS server with lots of sporadic traffic.  It behaved like it was
swapping (and getting behind in swapping at that), but it wasn't.  It
may well have been a memory management problem, but it seemed pretty
specific to that system.

> > worked.  AMD or Celeron or PIII are all just fine.
> Even more reason for me to stick with AMD's.

Ya, me too.  Although the P4 has worked fine since I stopped making it a
server.  I still get rare mini-delays -- it seems a bit more sluggish
than a 1.8 MHz system with really fast memory has ANY business being --
but overall it is satisfactory.

> I find that my P90 at home with UWSCSI is faster most of the time than 
> my computer at work.
> 
> This thread has sure opened up some debate.  I didn't think it would 
> raise the number of issues it has.

Yeah, it's what I love about this list.  Ask the right question, and the
list generates what amounts to a textbook on the technology, tools, and
current best practice.  Poor Jeffrey then has to pick from all this and
condense it for CM.  By tonight, Jeffrey! ;-)

Oh wait, that's my deadline too...:-(

Grumble.  Off to the salt mines.  Except that I'm double parked, with
the final exam for the course I'm teaching being given in a few hours.
So Doug may have to wait a few days for this one...

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From horacio at acm.org  Wed Dec 10 13:50:59 2003
From: horacio at acm.org (Horacio Gonzalez-Velez)
Date: Wed, 10 Dec 2003 18:50:59 -0000
Subject: [Beowulf] Newbie in Beowulf
Message-ID: <002401c3bf4e$8d4305c0$33000e0a@RESNETHGV>

I need to do MPI programming in a Beowulf cluster.  I am porting from Sun
SOlaris to Beowulf so any pointers are extremely appreciated.

Thanks.

-- 
Horacio Gonzalez-Velez,
Institute for Computing Systems Architecture,
School of Informatics, JCMB-1420
Ph.: +44- (0) 131 650 5171 (direct)
Fax: +44- (0) 131 667 7209
University of Edinburgh,
e-mail:  H.Gonzalez-Velez at sms.ed.ac.uk, horacio at acm.org

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From gropp at mcs.anl.gov  Wed Dec 10 13:38:46 2003
From: gropp at mcs.anl.gov (William Gropp)
Date: Wed, 10 Dec 2003 12:38:46 -0600
Subject: [Beowulf] MPICH error
In-Reply-To: <20031210180551.M13163@scalableinformatics.com>
References: <20031210180551.M13163@scalableinformatics.com>
Message-ID: <6.0.0.22.2.20031210123625.025d8e88@localhost>

At 12:12 PM 12/10/2003, you wrote:
>Hi Folks:
>
>   A customer is seeing
>
>         rm_1310:  p4_error: rm_start: net_conn_to_listener failed: 33220
>
>on an MPI job.  Used to work (just last week).  Updated the kernel was the 
>major
>change (added XFS support)
>
>   Any idea of what this is?  I assume a network change.  MPICH 1.2.4.  Do 
> I need
>to recompile MPICH to match the kernel?

No, you shouldn't need to recompile MPICH.  The most likely cause is a 
change in how TCP connections are handled.  See 
http://www-unix.mcs.anl.gov/mpi/mpich/docs/faq.htm#linux-redhat for some 
suggestions.

Bill 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Wed Dec 10 14:22:30 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 10 Dec 2003 14:22:30 -0500 (EST)
Subject: [Beowulf] Newbie in Beowulf
In-Reply-To: <002401c3bf4e$8d4305c0$33000e0a@RESNETHGV>
Message-ID: <Pine.LNX.4.44.0312101412540.11945-100000@ganesh.phy.duke.edu>

On Wed, 10 Dec 2003, Horacio Gonzalez-Velez wrote:

> I need to do MPI programming in a Beowulf cluster.  I am porting from Sun
> SOlaris to Beowulf so any pointers are extremely appreciated.

MPI books from MIT press.  Beowulf book, also from MIT press.  Probably
more books out there.  Online book on beowulf engineering on

 http://www.phy.duke.edu/brahma

(along with many other resource links).  Ian Foster and others' books on
parallel programming in general.  Articles (past and present) in both
Linux Magazine and Cluster Magazine, some of Forrest's LM articles
online last I checked.  MPICH website (www.mpich.org), LAM website
(www.lam-mpi.org) with of course MANY resource links and tutorials as
well.

When you get through this, if you still need help as again; there are
lots of MPI programmers on the list that can help with specific
questions, but this should get you started generally speaking.

Note that nearly any way you set up a compute cluster, "true beowulf" or
NOW or background utilization of mostly idle boxes on a linux LAN, will
let you do MPI programming and run the result in parallel.  Cluster
distributions will generally install MPI ready to run, more or less.
Ordinary over the counter distributions e.g. Red Hat will generally
permit you to install it as part of the supported distribution as an
optional package.  As for MPICH vs LAM vs commercial offerings, I'm not
an MPI expert and have no religious feelings -- reportedly one is a bit
easier to run from userspace and the other a bit easier to control in a
managed environment, but this sort of thing is amorphous and hard to
quantify and time dependent, so I won't even say which is which.

  rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Wed Dec 10 16:25:16 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Wed, 10 Dec 2003 13:25:16 -0800
Subject: [Beowulf] Re: SATA  or SCSI drives - Multiple Read/write speeds.
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BFA2@orsmsx402.jf.intel.com>

From: Robert G. Brown; Sent: Wednesday, December 10, 2003 11:08 AM
> 
> On Wed, 10 Dec 2003, Robin Laing wrote:
> 
> > > I have to ask.  Is it a P4?  Strictly empirically I have
experienced
> > > similar things even without filling memory.  I actually moved my
> > > fileserver off onto a Celeron (which it has run flawlessly)
because it
> > > was so visible, so annoying.
> >
> > Dell P4 with 512M ram.  IDE drive.
> 
> One thing Mark suggested (offline, I think) is that TOO MUCH memory
can
> confuse the caching system of at least some kernels.  Since I never
> fully debugged this problem, but instead worked around it (a Celeron,
> memory, motherboard, case costs maybe $350 and my time and annoyance
are
> worth much more than this) I don't know if this is true or not, but it
> got to where it could actually crash the system when it was running as
> an NFS server with lots of sporadic traffic.  It behaved like it was
> swapping (and getting behind in swapping at that), but it wasn't.  It
> may well have been a memory management problem, but it seemed pretty
> specific to that system.

This is very much like the kernel i/o tuning problems that I described
earlier, that were fixed by replacing the kernel (the offending kernel
was a 2.4.17 or 2.4.18), or in some cases, by tuning i/o parameters.

I first saw this on IPF systems with a very high-end I/O subsystem, I
later saw it on other fast 32-bit systems.  All involved significant I/O
traffic, -- the system would appear to hang for extended periods and
then continue on.  The impact ranged from annoying (the IPF) to
debilitating.  The underlying cause was in the use and retirement of
buffers by the kernel.  IIRC, the kernel got to the point of holding on
to too much cache, and then deciding it needed to dump it all before
continuing on.

As I said before, the problem was reported several times on the LK list.
The first reports were with really poor I/O devices, and were dismissed
as such, but later reports showed up with well configured I/O systems,
but any system with the right I/O load could trigger it.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From gabriele.butti at unimib.it  Thu Dec 11 10:25:06 2003
From: gabriele.butti at unimib.it (Butti Gabriele - Dottorati di Ricerca)
Date: 11 Dec 2003 16:25:06 +0100
Subject: [Beowulf] SWAP management
Message-ID: <1071156306.16827.53.camel@tantalio.mater.unimib.it>

Hi everybody,
      the question I am going to ask is not strictly releted to a
beowulf cluster but has to do with scientific computing in general, also
with scalar codes. I would like to learn more on how swap memory pages
are handled by a Linux OS. 
My problem is that when I'm running a code, it starts swapping even if
its memory requirements are lower than the total amount of memory
availble. For exaple if there 750 Mb of memory, the program swaps when
using only 450 Mb. How can avoid such a thing to happen? One solution
could be not to create any SWAP partition during the installation but I
think this is a very dramatic solution. 
Is there any other method to force a code to use only RAM ?
It seems that my Linux OS [RH 7.3 basically, in some cases SuSE 8.2]
tries to avoid that the percentage of memory used by a single process
becomes higher than 60-70 %.
Any idea would be appreciated.
TIA
Gabriel
-- 
                                 \\|//
                                -(o o)-
              /------------oOOOo--(_)--oOOOo-------------\
              |                                          |
              |             Gabriele Butti               |
              |        -----------------------           |
              |      Department of Material Science      |     
              |      University of Milano-Bicocca        |     
              |      Via Cozzi 53, 20125 Milano, ITALY   |     
              |      Tel (+39)02 64485214                |           
              |             .oooO   Oooo.                |
              \--------------(   )---(   )---------------/
                              \ (     ) /
                               \_)   (_/


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hahn at physics.mcmaster.ca  Thu Dec 11 12:02:39 2003
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Thu, 11 Dec 2003 12:02:39 -0500 (EST)
Subject: [Beowulf] SWAP management
In-Reply-To: <1071156306.16827.53.camel@tantalio.mater.unimib.it>
Message-ID: <Pine.LNX.4.44.0312111139550.32494-100000@coffee.psychology.mcmaster.ca>

> with scalar codes. I would like to learn more on how swap memory pages
> are handled by a Linux OS. 

in Linux, there is user memory and kernel memory.  the latter is unswappable,
and only for internal kernel uses, though that includes some user-visible
caches like dcache.  it's not anything you can do anything about, so I'll 
ignore it here.

user-level memory includes cached pages of files, user-level stack or sbrk
heap, mmaped shared libraries, MAP_ANON memory, etc.  some of this is what 
you think of as being part of your process's virtual address space.  other
pages are done behind your back - especially caching of file-backed pages.
all IO normally goes through the page cache and thus competes for physical
pages with all the other page users.  this means that by doing a lot of IO,
you can cause enough page scavenging to force other pages (sufficiently idle)
out to swap or backing store.  (for instance, backing store of an mmaped file
is the file itself, on disk.)

> My problem is that when I'm running a code, it starts swapping even if
> its memory requirements are lower than the total amount of memory
> availble. For exaple if there 750 Mb of memory, the program swaps when
> using only 450 Mb.

are you also doing a lot of file IO?

with IO, the problem is that pages doing IO are "hot looking" to the kernel,
since they are touched by the device driver as well as userspace.  the kernel
will tend to leave them in the pagecache at the expense of other kinds of
pages, which may not be touched as often or fast.  in a way, this is really
a problem with the kernel simply not having enough memory for the properties
of a virtual page.

> How can avoid such a thing to happen?

there is NOTHING wrong with swapping, since it is merely the kernel trying 
to find the set of pages that make the best use of a limited amount of ram.
a moderate amount of swap OUT traffic is very much a good thing, since 
it means that old/idle processes won't clutter up your ram which could be 
more effectively used by something recent.

the problem (if any) is swap IN - especially when there's also swapouts
happening.  when this happens, it means that the kernel is choosing the wrong
pages to swap out, and is winding up having to read them back in immediately.
this is called "thrashing", and barring kernel bugs (such as early 2.4
kernels) the only solution is to add more ram.

> One solution
> could be not to create any SWAP partition during the installation but I
> think this is a very dramatic solution. 

disk is very cheap; ram is still a lot more expensive.  a modest amount of 
swapouts are really a tradeoff: move idle ram pages into cheap disk so the 
expensive ram can be used for something more important.

> Is there any other method to force a code to use only RAM ?

of course: mlock.

> It seems that my Linux OS [RH 7.3 basically, in some cases SuSE 8.2]
> tries to avoid that the percentage of memory used by a single process
> becomes higher than 60-70 %.

I don't believe there is any such heuristic.  it wouldn't have anything to do 
with the distribution, of course, only with the kernel.

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From mwheeler at startext.co.uk  Thu Dec 11 12:38:11 2003
From: mwheeler at startext.co.uk (Martin WHEELER)
Date: Thu, 11 Dec 2003 17:38:11 +0000 (UTC)
Subject: [Beowulf] Re: [OT] statistical calculations - report
Message-ID: <Pine.LNX.4.33.0312111723290.4108-100000@caxton.startext.demon.co.uk>

Many thanks to all who replied to me both on- and off-list; in
particular those who pointed me towards the ability to create customised
R and Python plugins for gnumeric.  (About which I knew nothing.)
Although I can't do anything about the use of spreadsheet technology in
the first place, at least yesterday I was enable to muster up enough
backup to be able to influence the choice of /which/ spreadsheet I will
be expected to use.

Also thanks to those who made practical suggestions concerning the use
of postgres/mysql databases; this was enough to convince me I had to do
something about certain areas of (natural language) data manipulation by
myself; and eschew the spreadsheet for something that more naturally
fits the way I work!

Regards,
-- 
Martin Wheeler   -   StarTEXT / AVALONIX - Glastonbury - BA6 9PH - England
mwheeler at startext.co.uk                http://www.startext.co.uk/mwheeler/
GPG pub key : 01269BEB  6CAD BFFB DB11 653E B1B7 C62B  AC93 0ED8 0126 9BEB
      - Share your knowledge. It's a way of achieving immortality. -


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rmd003 at sympatico.ca  Thu Dec 11 17:14:40 2003
From: rmd003 at sympatico.ca (rmd003 at sympatico.ca)
Date: Thu, 11 Dec 2003 17:14:40 -0500
Subject: [Beowulf] Simple Cluster
Message-ID: <3FD8EC50.7060606@sympatico.ca>

Hello,

Would anyone know if it is possible to make a cluster with four P1 
computers? If it is possible are there any instructions on how to do 
this or the software required etc...?

Robert Van Amelsvoort
rmd003 at sympatico.ca

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From andrewxwang at yahoo.com.tw  Thu Dec 11 20:15:02 2003
From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=)
Date: Fri, 12 Dec 2003 09:15:02 +0800 (CST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <3FD8EC50.7060606@sympatico.ca>
Message-ID: <20031212011502.19428.qmail@web16811.mail.tpe.yahoo.com>

It all depends on what you want to do with the
cluster.

Andrew.

--- rmd003 at sympatico.ca ????
> Hello,
> 
> Would anyone know if it is possible to make a
> cluster with four P1 
> computers? If it is possible are there any
> instructions on how to do 
> this or the software required etc...?
> 
> Robert Van Amelsvoort
> rmd003 at sympatico.ca
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or
> unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf 

-----------------------------------------------------------------
??? Yahoo!??
??????????????????????
http://tw.promo.yahoo.com/mail_premium/stationery.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Fri Dec 12 07:25:03 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 12 Dec 2003 07:25:03 -0500 (EST)
Subject: [Beowulf] SWAP management
In-Reply-To: <Pine.LNX.4.44.0312111139550.32494-100000@coffee.psychology.mcmaster.ca>
Message-ID: <Pine.LNX.4.44.0312120654090.1868-100000@lilith.rgb.private.net>

On Thu, 11 Dec 2003, Mark Hahn wrote:

> > It seems that my Linux OS [RH 7.3 basically, in some cases SuSE 8.2]
> > tries to avoid that the percentage of memory used by a single process
> > becomes higher than 60-70 %.
> 
> I don't believe there is any such heuristic.  it wouldn't have anything to do 
> with the distribution, of course, only with the kernel.

To add to Mark's comment, it is not exactly easy to see what's going on
with a system's memory usage.  Using top and/or vmstat for starters --
vmstat 5 will let you differentiate "swap" events from other paging and
disk activity (possibly associated with applications) while letting you
see memory consumption in real time.  top will give you a lovely picture
of the active process space that auto-updates ever (interval) seconds.
If you enter M, it will toggle into a mode where the list is sorted by
memory consumption instead of run queue (which I find often misses
problems, or rather flashes them up only rarely).  You can then look at
Size (full virtual memory allocation of process) and RSS (space the
process is actually using in memory at the time) while looking at total
memory and swap usage in the header.

Note well that the "used/free" part of memory is not an accurate
reflection of the system's available memory in this display -- to get
that you have to subtract buffer and cached memory from the used
component.  This yields the memory that CAN be made available to
a process if all the cached pages are paged out and all the buffers
flushed and freed.  Linux does NOT like to run in a mode with no cache
and buffer space as it is not efficient -- one reason linux generally
appears so smooth and fast is that a rather large fraction of the time
"I/O" from slow resources is actually served from the cache and "I/O" to
slow resources is actually written into a buffer so that the task can
continue unblocked.  If you do suck up all the free memory, it will then
fuss a bit and try paging things out to free up at least a small bit of
cache/buffer space.

Note that a small amount of swap space usage is fairly normal and
doesn't mean that your system is "swapping".  A small amount of swap
out events is also normal ditto.  It's the swap ins that are more of a
problem.

One problem that can be very difficult to detect is a problem with a
daemon or networking stack.  A runaway forking daemon can consume large
amounts of resources and clutter your system with processes.  A runaway
networking application that is trying to make connections on a "bad"
port or networking connection can sometimes contain a loop that e.g.
tries to make a socket and succeeds, whereby the connection breaks and
the socket has to terminate, which takes a timeout.  I've seen loops
that would leave you with a - um - "large number" of these dying
sockets, which again suck up resources and may or may not eventually
cause problems.  There used to be a similar problem with zombie
processes and I suppose there still is if you right just the right code,
but I haven't seen an actual zombie for a long time.

Note also that top and to a less detailed extent vmstat give you a way
of seeing whether or not an application is leaking.  If a system
"suddenly" starts paging/swapping, chances are really, really good that
one of your applications is leaking sieve-like.  Having written a number
of applications myself which I proudly acknowledge leaked like a
sumbitch until I finally tracked them down with free plumber's putty, I
know just how bone-simple it is to do, especially if you use certain
libraries (e.g. libxml*) where nearly everything you handle is a pointer
to space malloc'd by a called routine that has to be freed before you
reuse it.  top with M can help a bit -- watch that Size and if it grows
while RSS remains constant, suspect a leak.

Finally, a few programs may or may not leak, but they constitute a big
sucking noise when run on your system.  Open Office, for example, is
lovely but uses more memory than X itself (which is also rather a pig).
Some of the gnome apps are similarly quite large and tend to have RSS
close to SIZE.  In general, if you are running a GUI, it is not at all
unlikely that you're using 100 MB or more and might be using several
hundred MB.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Fri Dec 12 08:40:12 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 12 Dec 2003 08:40:12 -0500 (EST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <3FD8EC50.7060606@sympatico.ca>
Message-ID: <Pine.LNX.4.44.0312120834490.2033-100000@lilith.rgb.private.net>

On Thu, 11 Dec 2003 rmd003 at sympatico.ca wrote:

> Hello,
> 
> Would anyone know if it is possible to make a cluster with four P1 
> computers? If it is possible are there any instructions on how to do 

Sure.  There are instructions in my column in Cluster World 1,1 that
should suffice.  There is also a bunch of stuff that might be enough in
resources linked to http:/www.phy.duke.edu/brahma/index.php, including
an online book on clusters.  You can probably get free issues including
this one with a trial subscription at the clusterworld website.

The problems I can see with using Pentiums at this point are:

  a) likely insufficient memory and disk unless you really work on the
linux installation;

  b) a single $500 vanilla box from your local cheap vendor would be
MUCH MUCH MUCH faster.  As in MUCH faster.  Raw CPU clock a factor of
10, add a factor of 2 to 4 for CPU family and more memory and so forth.
Likely ten or more times faster than your entire cluster of four
Pentiums on a good day.  SO your cluster needs to be a "just for fun"
cluster, for hobbyist or teaching purposes, and would still be much
better (faster and easier to build) with more current CPUs and systems
with a minimum of 128 to 256 MB of memory each.

   rgb

> this or the software required etc...?
> 
> Robert Van Amelsvoort
> rmd003 at sympatico.ca
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From alvin at Mail.Linux-Consulting.com  Fri Dec 12 09:10:15 2003
From: alvin at Mail.Linux-Consulting.com (Alvin Oga)
Date: Fri, 12 Dec 2003 06:10:15 -0800 (PST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <Pine.LNX.4.44.0312120834490.2033-100000@lilith.rgb.private.net>
Message-ID: <Pine.LNX.3.96.1031212060516.12940A-100000@Maggie.Linux-Consulting.com>


On Fri, 12 Dec 2003, Robert G. Brown wrote:

> On Thu, 11 Dec 2003 rmd003 at sympatico.ca wrote:
> 
> > Hello,
> > 
> > Would anyone know if it is possible to make a cluster with four P1 
> > computers? If it is possible are there any instructions on how to do 

only good thing that wuld come out of it would be learning what files
need to be changed  to get a cluster working

> Sure.  There are instructions in my column in Cluster World 1,1 that
> should suffice.  There is also a bunch of stuff that might be enough in
> resources linked to http:/www.phy.duke.edu/brahma/index.php, including
> an online book on clusters.  You can probably get free issues including
> this one with a trial subscription at the clusterworld website.
> 
> The problems I can see with using Pentiums at this point are:
> 
>   a) likely insufficient memory and disk unless you really work on the
> linux installation;
> 
>   b) a single $500 vanilla box from your local cheap vendor would be
> MUCH MUCH MUCH faster.  As in MUCH faster.  Raw CPU clock a factor of

now days.. you can get a brand new mini-itx P3-800 equivalent for $125
and you can even use the old memory from the Pentium ( the p3-800 uses
pc-133 memory .. amazingly silly.. .. p3-800 is the EPIA-800 )
	- just the diference in time spent waiting for the old pentiums
	vs the mini-itx would make the mini-itx a better choice
	since you can have a useful cluster after playing

	- but than again, one of my 3 primary "useful" machine is still a
	p-90 w/ 48MB of memory ( primary == used everyday by me )

have fun
alvin

> 10, add a factor of 2 to 4 for CPU family and more memory and so forth.
> Likely ten or more times faster than your entire cluster of four
> Pentiums on a good day.  SO your cluster needs to be a "just for fun"
> cluster, for hobbyist or teaching purposes, and would still be much
> better (faster and easier to build) with more current CPUs and systems
> with a minimum of 128 to 256 MB of memory each.
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Fri Dec 12 09:51:51 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Fri, 12 Dec 2003 06:51:51 -0800
Subject: [Beowulf] SWAP management
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BFBD@orsmsx402.jf.intel.com>

From: Robert G. Brown; Sent: Friday, December 12, 2003 4:25 AM
> On Thu, 11 Dec 2003, Mark Hahn wrote:
> 
> > > It seems that my Linux OS [RH 7.3 basically, in some cases SuSE
8.2]
> > > tries to avoid that the percentage of memory used by a single
process
> > > becomes higher than 60-70 %.
> >
> > I don't believe there is any such heuristic.  it wouldn't have
anything
> to do
> > with the distribution, of course, only with the kernel.
> 
> To add to Mark's comment, it is not exactly easy to see what's going
on
> with a system's memory usage.  Using top and/or vmstat for starters --
> vmstat 5 will let you differentiate "swap" events from other paging
and
> disk activity (possibly associated with applications) while letting
you
> see memory consumption in real time.  top will give you a lovely
picture
> of the active process space that auto-updates ever (interval) seconds.
> If you enter M, it will toggle into a mode where the list is sorted by
> memory consumption instead of run queue (which I find often misses
> problems, or rather flashes them up only rarely).  You can then look
at
> Size (full virtual memory allocation of process) and RSS (space the
> process is actually using in memory at the time) while looking at
total
> memory and swap usage in the header.

I find that atop is a valuable tool to see what going on in a system,
much better than standard top.

Atop doesn't display inactive processes, so your display isn't clutter
with processes you don't care about, regardless of your sort; atop also
shows the growth of both virtual and resident memory.  In addition, atop
also gives a very good look at the system, including cpu, memory, disk,
and network.

One final Good Thing, atop can keep raw data in files that you can
"replay" later, allowing you to see a time-history of activity on the
node.

Take a look at ftp://ftp.atcomputing.nl/pub/tools/linux/

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From james.p.lux at jpl.nasa.gov  Fri Dec 12 10:32:57 2003
From: james.p.lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 12 Dec 2003 07:32:57 -0800
Subject: [Beowulf] Simple Cluster
References: <3FD8EC50.7060606@sympatico.ca>
Message-ID: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>

Sure you can do it. It won't be a ball of fire speed wise, and probably
wouldn't be a cost effective solution to doing any "real work", but it will
compute..

Search the web for the "Pondermatic" which, as I recall, was a couple or
three P1s.  And of course, very early clusters were made with 486's.

Your big challenge is probably going to be (easily) getting an appropriate
distribution that fits within the disk and RAM limits.  Yes, before all the
flames start, I know it's possible to make a version that fits in 16K on an
8088, and that would be bloatware compared to someone's special 6502 Linux
implementation that runs on old Apple IIs, etc.etc.etc., but nobody would
call that easy.  What Robert is probably looking for is a "stick the CDROM
in and go" kind of solution, and, just like in the Windows world, the
current, readily available (as in download the ISO and go) solutions tend to
assume one has a vintage 2001 computer sitting around with a several hundred
MHz processor and 64MB of RAM, etc.

Actually, I'd be very glad to hear that this is not the case..

Maybe one of the old Scyld "cluster on a disk" might be a good way?

Perhaps Rocks?  It sort of self installs.

One could always just boot 4 copies of Knoppix, but I don't know that
there's many "cluster management" tools in Knoppix.

----- Original Message -----
From: <rmd003 at sympatico.ca>
To: <beowulf at beowulf.org>
Sent: Thursday, December 11, 2003 2:14 PM
Subject: [Beowulf] Simple Cluster


> Hello,
>
> Would anyone know if it is possible to make a cluster with four P1
> computers? If it is possible are there any instructions on how to do
> this or the software required etc...?
>
> Robert Van Amelsvoort
> rmd003 at sympatico.ca
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From mbanck at gmx.net  Fri Dec 12 10:50:22 2003
From: mbanck at gmx.net (Michael Banck)
Date: Fri, 12 Dec 2003 16:50:22 +0100
Subject: [Beowulf] Simple Cluster
In-Reply-To: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
References: <3FD8EC50.7060606@sympatico.ca> <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <20031212155022.GB25554@blackbird.oase.mhn.de>

On Fri, Dec 12, 2003 at 07:32:57AM -0800, Jim Lux wrote:
> One could always just boot 4 copies of Knoppix, but I don't know that
> there's many "cluster management" tools in Knoppix.
 
While Knoppix is all cool with that self-configuration and stuff, I've
never heard it mentioned when it came to low-level hardware and RAM
requirements. Sure, one must not boot up in KDE|GNOME, but I doubt that
even the console mode has a small memory footprint. I'd like to be
proven wrong, of course :)


Michael
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From jeffrey.b.layton at lmco.com  Fri Dec 12 11:50:43 2003
From: jeffrey.b.layton at lmco.com (Jeff Layton)
Date: Fri, 12 Dec 2003 11:50:43 -0500
Subject: [Beowulf] Simple Cluster
In-Reply-To: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
References: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <3FD9F1E3.2080805@lmco.com>


   I can think of three solutions. The first one I can think of is
called clusterKnoppix (bofh.be/clusterknoppix/). It has
OpenMOSIX built-in so you can run compute farm types
of applications (and you get to learn about OpenMOSIX).
You can also run MPI and PVM apps on it.
   The second one I can think of is Warewulf
(warewulf-cluster.org). The primary 'mode' of it allows
you to boot the nodes over the network to a RAM disk
about 70 Megs in size. You could also boot of a CD or floppy
and then pull the install over the network.
   The third one is called Bootable Cluster CD
(www.cs.uni.edu/~gray/bccd/). It is somewhat like
clusterKnoppix but I'm not sure it uses OpenMOSIX.
   A fourth alternative might be Thin-Oscar
(thin-oscar.ccs.usherbrooke.ca/). I don't think it's ready
for prime-time, but you might take a look.

Good Luck!

Jeff

> Sure you can do it. It won't be a ball of fire speed wise, and probably
> wouldn't be a cost effective solution to doing any "real work", but it 
> will
> compute..
>
> Search the web for the "Pondermatic" which, as I recall, was a couple or
> three P1s.  And of course, very early clusters were made with 486's.
>
> Your big challenge is probably going to be (easily) getting an 
> appropriate
> distribution that fits within the disk and RAM limits.  Yes, before 
> all the
> flames start, I know it's possible to make a version that fits in 16K 
> on an
> 8088, and that would be bloatware compared to someone's special 6502 
> Linux
> implementation that runs on old Apple IIs, etc.etc.etc., but nobody would
> call that easy.  What Robert is probably looking for is a "stick the 
> CDROM
> in and go" kind of solution, and, just like in the Windows world, the
> current, readily available (as in download the ISO and go) solutions 
> tend to
> assume one has a vintage 2001 computer sitting around with a several 
> hundred
> MHz processor and 64MB of RAM, etc.
>
> Actually, I'd be very glad to hear that this is not the case..
>
> Maybe one of the old Scyld "cluster on a disk" might be a good way?
>
> Perhaps Rocks?  It sort of self installs.
>
> One could always just boot 4 copies of Knoppix, but I don't know that
> there's many "cluster management" tools in Knoppix.
>
> ----- Original Message -----
> From: <rmd003 at sympatico.ca>
> To: <beowulf at beowulf.org>
> Sent: Thursday, December 11, 2003 2:14 PM
> Subject: [Beowulf] Simple Cluster
>
>
> > Hello,
> >
> > Would anyone know if it is possible to make a cluster with four P1
> > computers? If it is possible are there any instructions on how to do
> > this or the software required etc...?
> >
> > Robert Van Amelsvoort
> > rmd003 at sympatico.ca
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
>


-- 
Dr. Jeff Layton
Aerodynamics and CFD
Lockheed-Martin Aeronautical Company - Marietta


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From henken at seas.upenn.edu  Fri Dec 12 12:24:39 2003
From: henken at seas.upenn.edu (Nicholas Henke)
Date: Fri, 12 Dec 2003 12:24:39 -0500
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
Message-ID: <1071249879.25601.14.camel@roughneck.liniac.upenn.edu>

On Fri, 2003-12-12 at 12:08, Joshua Baker-LePain wrote:
> Yes, I know this has been discussed a couple of times, and that my stated 
> goals are at odds with each other.  But I really need the best bang for 
> the noise for a system that will reside in the same room with patients 
> undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
> 2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
> and annoys both the patients and the physicians.  This is Bad.
> 
> We're willing to pay for better, but don't want to take too much of a 
> speed hit.  Does anybody have a good vendor for quiet but still high 
> performing systems?  Is there any hope in the 1U form factor (my Opteron 
> nodes are somewhat quieter, since they use squirrel cage fans, but are 
> still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
> or 4U case?  Or should I look at lashing together some towers (this system 
> also needs to be somewhat portable)?

They are not 1U, but the Dell 650N I have is just about silent. At most
I hear a faint harddrive noise, but most times I hear nothing at all. 
FYI, this is a dual processor machine as well.

Nic
-- 
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From jlb17 at duke.edu  Fri Dec 12 12:08:31 2003
From: jlb17 at duke.edu (Joshua Baker-LePain)
Date: Fri, 12 Dec 2003 12:08:31 -0500 (EST)
Subject: [Beowulf] Quiet *and* powerful
Message-ID: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>

Yes, I know this has been discussed a couple of times, and that my stated 
goals are at odds with each other.  But I really need the best bang for 
the noise for a system that will reside in the same room with patients 
undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
and annoys both the patients and the physicians.  This is Bad.

We're willing to pay for better, but don't want to take too much of a 
speed hit.  Does anybody have a good vendor for quiet but still high 
performing systems?  Is there any hope in the 1U form factor (my Opteron 
nodes are somewhat quieter, since they use squirrel cage fans, but are 
still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
or 4U case?  Or should I look at lashing together some towers (this system 
also needs to be somewhat portable)?

Thanks for any hints, pointers, recommendations, or flames.

-- 
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From alvin at Mail.Linux-Consulting.com  Fri Dec 12 12:54:09 2003
From: alvin at Mail.Linux-Consulting.com (Alvin Oga)
Date: Fri, 12 Dec 2003 09:54:09 -0800 (PST)
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <1071249879.25601.14.camel@roughneck.liniac.upenn.edu>
Message-ID: <Pine.LNX.3.96.1031212094142.11569A-100000@Maggie.Linux-Consulting.com>


hi ya joshua

- what makes noise is typiclly the qualityof the fan, the 
  fan blade design and the size of the air holes and
  distance to the fans

- a fan held up in the open air shold be close to noiseless

- if you want 1U .. you have to put good quality lateral squirrel cages
  far away from everything and still be able to force air
  across the cpu heatsink fins 
	- you should not hear anything

- if you go to 12U or midtower... there is no noise problem
  except for the cheezy "el cheapo" power supply fan 
	( get a good power supply w/ good fan and you wont hear
	( the power supply either

- next choice isto use peltier cooling but you still have to cool
  down the fin on the hot side of the peltier..
	- you can also attach a bracket from teh cpu heatink
	or peltier heatsink to the case ... to get rid of the heat
	assuming the ambient room temp can pull heat off the case

- sounds like your app is based on "quiet operation" and does
  not need to be 1Us ...
	- i'd stack 6 dual-xons mb into one custom chassis and
	it should be quiet as a "nursing room"

== to prove the point ...
	- take all the fans off ( its not needed for this test )

	- take off all the covers to the chassis

	- arrange the motherboards all facing the same way

	- put a giant 12" household fan blowing air across
	the cpu heatink ( air flowing only in 1 direction )
		- preferably side to side in the direction
		of the cpu heatsink fins

	- put a carboard box around the chassis and leave the
	unobstructed air flow of the cardboard opn on the
	cpu side and opposite site
		- put white hospital linen on the box
		that says "do not sit here"

	( probably should do that with the doors locked so
	( that nobody see the cardboard experiment

- after that, you know what your chassis looks like ...
  and still be quiet ..  or you're stuck with 6 mid-tower systems
  vs noisy 1Us
	- 2Us suffer the same noise fate as 1Us ( the way most people
	build it )

- fun stuff .. making the system quiet and run cool ... temperature wise

have fun
alvin

On Fri, 12 Dec 2003, Nicholas Henke wrote:

> On Fri, 2003-12-12 at 12:08, Joshua Baker-LePain wrote:
> > Yes, I know this has been discussed a couple of times, and that my stated 
> > goals are at odds with each other.  But I really need the best bang for 
> > the noise for a system that will reside in the same room with patients 
> > undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
> > 2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
> > and annoys both the patients and the physicians.  This is Bad.
> > 
> > We're willing to pay for better, but don't want to take too much of a 
> > speed hit.  Does anybody have a good vendor for quiet but still high 
> > performing systems?  Is there any hope in the 1U form factor (my Opteron 
> > nodes are somewhat quieter, since they use squirrel cage fans, but are 
> > still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
> > or 4U case?  Or should I look at lashing together some towers (this system 
> > also needs to be somewhat portable)?
> 
> They are not 1U, but the Dell 650N I have is just about silent. At most
> I hear a faint harddrive noise, but most times I hear nothing at all. 
> FYI, this is a dual processor machine as well.
> 
> Nic
> -- 
> Nicholas Henke
> Penguin Herder & Linux Cluster System Programmer
> Liniac Project - Univ. of Pennsylvania
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Fri Dec 12 13:10:51 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Fri, 12 Dec 2003 10:10:51 -0800
Subject: [Beowulf] Quiet *and* powerful
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BFBE@orsmsx402.jf.intel.com>

From: Joshua Baker-LePain; Sent: Friday, December 12, 2003 9:09 AM
> 
> Yes, I know this has been discussed a couple of times, and that my
stated
> goals are at odds with each other.  But I really need the best bang
for
> the noise for a system that will reside in the same room with patients
> undergoing diagnostic ultrasound scanning.  Our current setup (6 1U
dual
> 2.4GHz Xeons, with about 11 little fans per node) is ridiculously
loud,
> and annoys both the patients and the physicians.  This is Bad.

It's those tiny high-speed (< 1U) fans that are killing you.

Cheapest solution: Move the system out of the room?  Have you looked at
just running a network cable to a minimal diskless system for the
in-room needs?  I assume those needs are graphic head plus some manner
of sensor input.  The in-room unit could boot from the cluster, located
elsewhere.

In-room solution, but possibly above your price range: Go to a cluster
builder for a custom solution that removes the p/s and fans from each
box, centralizes the larger and slower fan(s) and p/s in the cabinet,
running dc to each node.

Depending on your skill and labor availability (I did see duke.edu in
your addr), you might be able to do this yourself or get some, um, cheap
labor.

-- 
David N. Lombard

My comments represent my opinions, not those of Intel.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From landman at scalableinformatics.com  Fri Dec 12 13:03:00 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Fri, 12 Dec 2003 13:03:00 -0500
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
Message-ID: <3FDA02D4.3070209@scalableinformatics.com>

Hi Joshua:

  You should probably look to larger cases with larger fans.  The bigger 
fans move more air at the same RPM.  Also, larger cases are easier to 
pad for sound absorption.  The Xeon's I have seen have been using blower 
technology which is simply not quiet.  A 2-3 U system might be easier to 
cool with a larger fan (~10+ cm).  A better case would help as well if 
you could pad it without drastically affecting cooling (airflow).

   Other options include silencing enclosures (enclosures with acoustic 
padding) to encapsulate the existing systems.  These reduce roars to 
hums, annoying but lower intensity.

Joe

Joshua Baker-LePain wrote:

>Yes, I know this has been discussed a couple of times, and that my stated 
>goals are at odds with each other.  But I really need the best bang for 
>the noise for a system that will reside in the same room with patients 
>undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
>2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
>and annoys both the patients and the physicians.  This is Bad.
>
>We're willing to pay for better, but don't want to take too much of a 
>speed hit.  Does anybody have a good vendor for quiet but still high 
>performing systems?  Is there any hope in the 1U form factor (my Opteron 
>nodes are somewhat quieter, since they use squirrel cage fans, but are 
>still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
>or 4U case?  Or should I look at lashing together some towers (this system 
>also needs to be somewhat portable)?
>
>Thanks for any hints, pointers, recommendations, or flames.
>
>  
>

-- 
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 12 13:30:44 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 12 Dec 2003 10:30:44 -0800
Subject: [Beowulf] Simple Cluster
In-Reply-To: <20031212155022.GB25554@blackbird.oase.mhn.de>
References: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
 <3FD8EC50.7060606@sympatico.ca>
 <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <5.2.0.9.2.20031212102848.02fa4e70@mailhost4.jpl.nasa.gov>

At 04:50 PM 12/12/2003 +0100, Michael Banck wrote:
>On Fri, Dec 12, 2003 at 07:32:57AM -0800, Jim Lux wrote:
> > One could always just boot 4 copies of Knoppix, but I don't know that
> > there's many "cluster management" tools in Knoppix.
>
>While Knoppix is all cool with that self-configuration and stuff, I've
>never heard it mentioned when it came to low-level hardware and RAM
>requirements. Sure, one must not boot up in KDE|GNOME, but I doubt that
>even the console mode has a small memory footprint. I'd like to be
>proven wrong, of course :)


I have booted Knoppix into command line mode in 64MB, and maybe 32MB.. I'll 
have to go down into the lab and check.  These are ancient Micron Win95 ISA 
machines we have to run old hardware specific in-circuit-emulators from 
Analog Devices.  One machine didn't work, but I think that was because the 
CD-ROM is broken, not because of other resources.


>Michael
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit 
>http://www.beowulf.org/mailman/listinfo/beowulf

James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Fri Dec 12 12:42:24 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Fri, 12 Dec 2003 09:42:24 -0800
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <1071249879.25601.14.camel@roughneck.liniac.upenn.edu>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu> <1071249879.25601.14.camel@roughneck.liniac.upenn.edu>
Message-ID: <20031212174224.GA24197@cse.ucdavis.edu>

On Fri, Dec 12, 2003 at 12:24:39PM -0500, Nicholas Henke wrote:
> > We're willing to pay for better, but don't want to take too much of a 
> > speed hit.  Does anybody have a good vendor for quiet but still high 
> > performing systems?  Is there any hope in the 1U form factor (my Opteron 
> > nodes are somewhat quieter, since they use squirrel cage fans, but are 
> > still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
> > or 4U case?  Or should I look at lashing together some towers (this system 
> > also needs to be somewhat portable)?
> 
> They are not 1U, but the Dell 650N I have is just about silent. At most
> I hear a faint harddrive noise, but most times I hear nothing at all. 
> FYI, this is a dual processor machine as well.

I'd recommend trying the 360N, I've seen the single p4 substantially
outperform the dual (even with 2 jobs running).

Basically the memory bus is substantially better on the 360N, and it's
even quieter then the 650N.  Of course this depends on the worldload.

I've never heard a quiet 1 or 2U.  Even the apple xserv's are pretty
loud.

If building yourself I recommend a case with rubber grommets, and
slow RPM 120mm fans similarly mounted.  The Antec Sonnata is an example.

Other possibilities include placing the servers elsewhere and using
a small quiet machine with an LCD/keyboard/mouse.
-- 
Bill Broadley
Information Architect
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 12 13:27:22 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 12 Dec 2003 10:27:22 -0800
Subject: [Beowulf] Simple Cluster
In-Reply-To: <Pine.LNX.4.44.0312120834490.2033-100000@lilith.rgb.private
 .net>
References: <3FD8EC50.7060606@sympatico.ca>
Message-ID: <5.2.0.9.2.20031212102323.018cc750@mailhost4.jpl.nasa.gov>

At 08:40 AM 12/12/2003 -0500, Robert G. Brown wrote:
>On Thu, 11 Dec 2003 rmd003 at sympatico.ca wrote:
>
> > Hello,
> >
> > Would anyone know if it is possible to make a cluster with four P1
> > computers? If it is possible are there any instructions on how to do
>
>Sure.  There are instructions in my column in Cluster World 1,1 that
>should suffice.  There is also a bunch of stuff that might be enough in
>resources linked to http:/www.phy.duke.edu/brahma/index.php, including
>an online book on clusters.  You can probably get free issues including
>this one with a trial subscription at the clusterworld website.
>
>The problems I can see with using Pentiums at this point are:
>
>   a) likely insufficient memory and disk unless you really work on the
>linux installation;
>
>   b) a single $500 vanilla box from your local cheap vendor would be
>MUCH MUCH MUCH faster.  As in MUCH faster.  Raw CPU clock a factor of
>10, add a factor of 2 to 4 for CPU family and more memory and so forth.
>Likely ten or more times faster than your entire cluster of four
>Pentiums on a good day.  SO your cluster needs to be a "just for fun"
>cluster, for hobbyist or teaching purposes, and would still be much
>better (faster and easier to build) with more current CPUs and systems
>with a minimum of 128 to 256 MB of memory each.


Unless you've got computers for free, and your time is free, Robert's words 
are well spoken..

That said.. if you just want to fool with MPI, for instance, and, you've 
got institutional computing resources running WinNT floating around on the 
network, the MPICH-NT version works quite well.  My first MPI program used 
this, with one node being an OLD, OLD ('98-'99 vintage) Win NT4.0 box on a 
P1, and the other node being a PPro desktop, also running NT4.0

I wrote and compiled everything in Visual C... (4 or 5, I can't recall 
which...) and I started working on a wrapper to allow use in Visual Basic, 
for a true thrill..


James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lindahl at pathscale.com  Fri Dec 12 14:04:05 2003
From: lindahl at pathscale.com (Greg Lindahl)
Date: Fri, 12 Dec 2003 11:04:05 -0800
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
Message-ID: <20031212190405.GA3036@greglaptop.internal.keyresearch.com>

On Fri, Dec 12, 2003 at 12:08:31PM -0500, Joshua Baker-LePain wrote:

> Our current setup (6 1U dual 
> 2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 

The main issue is 1U -- small fans are inefficient, so you end up with
a lot more noise for a given amount of power.

-- greg

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ShiYi.Yue at astrazeneca.com  Fri Dec 12 14:31:11 2003
From: ShiYi.Yue at astrazeneca.com (ShiYi.Yue at astrazeneca.com)
Date: Fri, 12 Dec 2003 20:31:11 +0100
Subject: [Beowulf] Pros and cons of different beowulf clusters
Message-ID: <D2A2B86E8730D711B8560008028AC980257A22@camrd9.camrd.astrazeneca.net>

Hi,

Can someone point me out if there is any comparison of different (small)
beowulf clusters? The hardware will be limited in < 20 PCs. As an example of
this comparison, something like Rocks vs. OSCAR, what do you think about the
installation, maintenance, and upgrade, which one is easier? which one is
more flexible?

Thank you in advance!

shiyi
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From dtj at uberh4x0r.org  Fri Dec 12 14:57:07 2003
From: dtj at uberh4x0r.org (Dean Johnson)
Date: Fri, 12 Dec 2003 13:57:07 -0600
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <20031212190405.GA3036@greglaptop.internal.keyresearch.com>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
	 <20031212190405.GA3036@greglaptop.internal.keyresearch.com>
Message-ID: <1071259026.1556.124.camel@terra>

On Fri, 2003-12-12 at 13:04, Greg Lindahl wrote:
> On Fri, Dec 12, 2003 at 12:08:31PM -0500, Joshua Baker-LePain wrote:
> 
> > Our current setup (6 1U dual 
> > 2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
> 
> The main issue is 1U -- small fans are inefficient, so you end up with
> a lot more noise for a given amount of power.
> 

And they are MUCH higher pitched, which pegs the annoy-o-meter. I used
to have an SGI 1100 (1U dual PIII) and an SGI Origin 200 in my home
office. They were both probably the same overall loudness, but it was
the 1100 that I would shut off when I wasn't using it.

-- 

	-Dean

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 12 17:48:26 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 12 Dec 2003 14:48:26 -0800
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu
 >
Message-ID: <5.2.0.9.2.20031212144145.03173b80@mailhost4.jpl.nasa.gov>

First question... does it "really" need to be in the same room?

There is a huge variation in fan noise among models and makes of fan, and, 
furthermore, the structural stuff around it has an effect.  Perhaps just 
buying quieter fans and retrofitting?

Can you put the whole thing in a BIG sound isolated box (read, rack)... 
most equipment racks aren't designed for good acoustical properties.  There 
are, however, industries which are noise level sensitive (sound recording 
and mixing), and they have standard 19" racks, but with better 
design/packaging/etc.

If you're not hugely cost constrained, you can do away with fans entirely 
and sink the whole thing into a tank of fluorinert (but, at $70+/gallon....)

The other thing to think about is whether many smaller/lower power nodes 
can do your job.   If things scaled exactly as processor speed (don't we 
wish).. you've got 12 * 2.4 GHz = 28.8 GHz... Could 40 or 50 1GHz VIA type 
fanless processors work?

Overall, your best bet might be to get some custom sheet metal made to 
mount your motherboards in a more congenial (acoustic and thermal) 
environment.  Rather than have 2 layers of metal between each mobo, make a 
custom enclosure that stacks the boards a few inches apart, and which 
shares a couple big, but quiet, fans to push air through it.

In general, for a given amount of air moved, small fans are much less 
efficient and more noisy than big fans. (efficiency and noise are not very 
well correlated... the mechanical power in the noise is vanishingly small).


At 12:08 PM 12/12/2003 -0500, Joshua Baker-LePain wrote:
>Yes, I know this has been discussed a couple of times, and that my stated
>goals are at odds with each other.  But I really need the best bang for
>the noise for a system that will reside in the same room with patients
>undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual
>2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud,
>and annoys both the patients and the physicians.  This is Bad.
>
>We're willing to pay for better, but don't want to take too much of a
>speed hit.  Does anybody have a good vendor for quiet but still high
>performing systems?  Is there any hope in the 1U form factor (my Opteron
>nodes are somewhat quieter, since they use squirrel cage fans, but are
>still too loud), or should I look at, e.g., putting Quad Opterons in a 3
>or 4U case?  Or should I look at lashing together some towers (this system
>also needs to be somewhat portable)?
>
>Thanks for any hints, pointers, recommendations, or flames.
>
>--
>Joshua Baker-LePain
>Department of Biomedical Engineering
>Duke University
>
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit 
>http://www.beowulf.org/mailman/listinfo/beowulf

James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From john.hearns at clustervision.com  Sat Dec 13 05:57:13 2003
From: john.hearns at clustervision.com (John Hearns)
Date: Sat, 13 Dec 2003 11:57:13 +0100 (CET)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <Pine.LNX.4.44.0312131154320.3660-100000@druifje.clustervision.com>

On Fri, 12 Dec 2003, Jim Lux wrote:

> call that easy.  What Robert is probably looking for is a "stick the CDROM
> in and go" kind of solution, and, just like in the Windows world, the
> current, readily available (as in download the ISO and go) solutions tend to
> assume one has a vintage 2001 computer sitting around with a several hundred
> MHz processor and 64MB of RAM, etc.
> 
> 
> One could always just boot 4 copies of Knoppix, but I don't know that
> there's many "cluster management" tools in Knoppix.

How about ClusterKnoppix then?
http://bofh.be/clusterknoppix/

Its a Knoppix version which runs OpenMosix.
Teh slaves boot via PXE - which might rule out the old P1s.
You probably could boot via floppy though.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgoornaden at scyld.com  Sat Dec 13 04:45:20 2003
From: rgoornaden at scyld.com (rgoornaden at scyld.com)
Date: Sat, 13 Dec 2003 04:45:20 -0500
Subject: [Beowulf] java virtual machine
Message-ID: <200312130945.hBD9jKS29056@NewBlue.scyld.com>


hello
has someone ever met this package while installing mpich2-0.94???
thanks


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From gerry.creager at tamu.edu  Sat Dec 13 07:52:26 2003
From: gerry.creager at tamu.edu (Gerry Creager N5JXS)
Date: Sat, 13 Dec 2003 06:52:26 -0600
Subject: [Beowulf] BW-BUG meeting, Today Dec. 9, 2003, in Greenbelt MD;
  -- Red Hat
In-Reply-To: <Pine.LNX.4.44.0312091319540.1723-100000@training.scyld.com>
References: <Pine.LNX.4.44.0312091319540.1723-100000@training.scyld.com>
Message-ID: <3FDB0B8A.8040903@tamu.edu>

Would it be possible for someone to give a synopsis (assuming that, due 
to travel and a catch-up effort on my part, I didn't miss it already) of 
this meeting?

Thanks, Gerry

Donald Becker wrote:
> [[ Please note that this month's meeting is East: Greenbelt, not McLean VA. ]]
> 
>      Baltimore Washington  Beowulf Users Group 
>             December 2003 Meeting 
>                www.bwbug.org
>     December 9th at 3:00PM in Greenbelt MD
>  
> ____
> 
>         RedHat Roadmap for HPC Beowulf Clusters.
> 
>         RedHat is pleased to have the opportunity to present to Baltimore-
> Washington Beowulf User Group on Tuesday Dec 9th. Robert Hibbard, Red Hat's
> Federal Partner Alliance Manager, will provide information on Red Hat's
> Enterprise Linux product strategy, with particular emphasis on it's
> relevance to High Performance Computing Clusters. 
> 
>         Discussion will include information on the background, current
> product optimizations, as well as possible futures for Red Hat efforts
> focused on HPCC. 
> ____
> 
> Our meeting facilities are once again provided by Northrup Grumman
> 	7501 Greenway Center Drive
> 	Suite 1000 (10th floor)
> 	Greenbelt, MD 20770, phone
> 	703-628-7451
> 
> 

-- 
Gerry Creager -- gerry.creager at tamu.edu
Network Engineering -- AATLT, Texas A&M University	
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578
Page: 979.228.0173
Office: 903A Eller Bldg, TAMU, College Station, TX 77843

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From laytonjb at comcast.net  Sat Dec 13 12:51:37 2003
From: laytonjb at comcast.net (Jeffrey B. Layton)
Date: Sat, 13 Dec 2003 12:51:37 -0500
Subject: [Beowulf] Anyone recently build a small cluster?
Message-ID: <3FDB51A9.1030406@comcast.net>

Good morning,

   I'm looking for someone or a group that has recently
built a small (16 nodes or less) cluster that was their
first cluster. I'm working on a part of one of my columns
for Cluster World and I want to feature a small cluster
that someone built for the first time.

Thanks!

Jeff


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From daniel.pfenniger at obs.unige.ch  Sat Dec 13 13:28:45 2003
From: daniel.pfenniger at obs.unige.ch (Daniel Pfenniger)
Date: Sat, 13 Dec 2003 19:28:45 +0100
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
References: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
Message-ID: <3FDB5A5D.8090206@obs.unige.ch>

Hi,

Its not a 1U, its a low-noise Linux P4 box, but really *low* noise: the 
transtec 1200
We bought these for offices precisely because these boxes are designed 
for low noise.

http://www.transtec.ch/CH/E/products/workstations/linuxworkstations/transtec1200lownoiseworkstation.html?fsid=342edfd38a845c179dd18ef965091b2d

In practice in the office the box can barely be noticed,  I can imagine 
a dozen or more
of these boxes would not disturb a normal conversation.

    Dan


Joshua Baker-LePain wrote:

>Yes, I know this has been discussed a couple of times, and that my stated 
>goals are at odds with each other.  But I really need the best bang for 
>the noise for a system that will reside in the same room with patients 
>undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
>2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
>and annoys both the patients and the physicians.  This is Bad.
>
>We're willing to pay for better, but don't want to take too much of a 
>speed hit.  Does anybody have a good vendor for quiet but still high 
>performing systems?  Is there any hope in the 1U form factor (my Opteron 
>nodes are somewhat quieter, since they use squirrel cage fans, but are 
>still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
>or 4U case?  Or should I look at lashing together some towers (this system 
>also needs to be somewhat portable)?
>
>Thanks for any hints, pointers, recommendations, or flames.
>
>  
>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lathama at yahoo.com  Sat Dec 13 13:55:12 2003
From: lathama at yahoo.com (Andrew Latham)
Date: Sat, 13 Dec 2003 10:55:12 -0800 (PST)
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <3FDB5A5D.8090206@obs.unige.ch>
Message-ID: <20031213185512.48570.qmail@web60304.mail.yahoo.com>

koolance.com has rackmount cases that use a water cooling system that is both
cool and quite. It also is a standard rackmount case that would free up some
design issues..


--- Daniel Pfenniger <daniel.pfenniger at obs.unige.ch> wrote:
> Hi,
> 
> Its not a 1U, its a low-noise Linux P4 box, but really *low* noise: the 
> transtec 1200
> We bought these for offices precisely because these boxes are designed 
> for low noise.
> 
>
http://www.transtec.ch/CH/E/products/workstations/linuxworkstations/transtec1200lownoiseworkstation.html?fsid=342edfd38a845c179dd18ef965091b2d
> 
> In practice in the office the box can barely be noticed,  I can imagine 
> a dozen or more
> of these boxes would not disturb a normal conversation.
> 
>     Dan
> 
> 
> Joshua Baker-LePain wrote:
> 
> >Yes, I know this has been discussed a couple of times, and that my stated 
> >goals are at odds with each other.  But I really need the best bang for 
> >the noise for a system that will reside in the same room with patients 
> >undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
> >2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
> >and annoys both the patients and the physicians.  This is Bad.
> >
> >We're willing to pay for better, but don't want to take too much of a 
> >speed hit.  Does anybody have a good vendor for quiet but still high 
> >performing systems?  Is there any hope in the 1U form factor (my Opteron 
> >nodes are somewhat quieter, since they use squirrel cage fans, but are 
> >still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
> >or 4U case?  Or should I look at lashing together some towers (this system 
> >also needs to be somewhat portable)?
> >
> >Thanks for any hints, pointers, recommendations, or flames.
> >
> >  
> >
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

=====
/---------------------------------------------------------------------------------------------------\
Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.

What Is an agnostic? -  An agnostic thinks it impossible to know the truth
in matters such as, a god or the future with which religions are concerned 
with. Or, if not impossible, at least impossible at the present time.
 
LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
\---------------------------------------------------------------------------------------------------/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From joelja at darkwing.uoregon.edu  Sat Dec 13 17:26:37 2003
From: joelja at darkwing.uoregon.edu (Joel Jaeggli)
Date: Sat, 13 Dec 2003 14:26:37 -0800 (PST)
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <20031213185512.48570.qmail@web60304.mail.yahoo.com>
Message-ID: <Pine.LNX.4.44.0312131422190.1124-100000@twin.uoregon.edu>

On Sat, 13 Dec 2003, Andrew Latham wrote:

> koolance.com has rackmount cases that use a water cooling system that is both
> cool and quite. It also is a standard rackmount case that would free up some
> design issues..

it's also 4u...

in 4u I have 8 opteron 242 cpu's in 4 cases with three panaflo crossflow 
blowers ea which are quite bearable compared to screaming loud 8000rpm 
40mm fans.

> 
> 
> --- Daniel Pfenniger <daniel.pfenniger at obs.unige.ch> wrote:
> > Hi,
> > 
> > Its not a 1U, its a low-noise Linux P4 box, but really *low* noise: the 
> > transtec 1200
> > We bought these for offices precisely because these boxes are designed 
> > for low noise.
> > 
> >
> http://www.transtec.ch/CH/E/products/workstations/linuxworkstations/transtec1200lownoiseworkstation.html?fsid=342edfd38a845c179dd18ef965091b2d
> > 
> > In practice in the office the box can barely be noticed,  I can imagine 
> > a dozen or more
> > of these boxes would not disturb a normal conversation.
> > 
> >     Dan
> > 
> > 
> > Joshua Baker-LePain wrote:
> > 
> > >Yes, I know this has been discussed a couple of times, and that my stated 
> > >goals are at odds with each other.  But I really need the best bang for 
> > >the noise for a system that will reside in the same room with patients 
> > >undergoing diagnostic ultrasound scanning.  Our current setup (6 1U dual 
> > >2.4GHz Xeons, with about 11 little fans per node) is ridiculously loud, 
> > >and annoys both the patients and the physicians.  This is Bad.
> > >
> > >We're willing to pay for better, but don't want to take too much of a 
> > >speed hit.  Does anybody have a good vendor for quiet but still high 
> > >performing systems?  Is there any hope in the 1U form factor (my Opteron 
> > >nodes are somewhat quieter, since they use squirrel cage fans, but are 
> > >still too loud), or should I look at, e.g., putting Quad Opterons in a 3 
> > >or 4U case?  Or should I look at lashing together some towers (this system 
> > >also needs to be somewhat portable)?
> > >
> > >Thanks for any hints, pointers, recommendations, or flames.
> > >
> > >  
> > >
> > 
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 
> =====
> /---------------------------------------------------------------------------------------------------\
> Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.
> 
> What Is an agnostic? -  An agnostic thinks it impossible to know the truth
> in matters such as, a god or the future with which religions are concerned 
> with. Or, if not impossible, at least impossible at the present time.
>  
> LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
> \---------------------------------------------------------------------------------------------------/
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

-- 
-------------------------------------------------------------------------- 
Joel Jaeggli  	       Unix Consulting 	       joelja at darkwing.uoregon.edu    
GPG Key Fingerprint:     5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From andrewxwang at yahoo.com.tw  Mon Dec 15 00:48:45 2003
From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=)
Date: Mon, 15 Dec 2003 13:48:45 +0800 (CST)
Subject: [Beowulf] Fwd: GridEngine for AMD64 available on ftp.suse.com
In-Reply-To: <200308201609.UAA08558@nocserv.free.net>
Message-ID: <20031215054845.37575.qmail@web16802.mail.tpe.yahoo.com>

I downloaded the rpm -- I didn't install it, but I
just extracted the files, and did a "file" command.
The binaries are compiled as 64-bit.

Andrew.

> SuSE ship SGE on their CDs, including their AMD64
> and
> Athlon64 distribution:
> 
>
http://www.suse.de/us/private/products/suse_linux/i386/packages_amd64/gridengine.html
> 
> And it is also available on
>
ftp.suse.com:/pub/suse/x86_64/9.0/suse/x86_64/gridengine-5.3-257.x86_64.rpm
> 
> But on sure if it works on RedHat or not.
> 
>  -Ron
> 
> 

-----------------------------------------------------------------
??? Yahoo!??
??????????????????????
http://tw.promo.yahoo.com/mail_premium/stationery.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ranjit.chagar at ntlworld.com  Mon Dec 15 08:54:31 2003
From: ranjit.chagar at ntlworld.com (Ranjit Chagar)
Date: Mon, 15 Dec 2003 13:54:31 -0000
Subject: [Beowulf] Simple Cluster
References: <3FD8EC50.7060606@sympatico.ca> <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <005001c3c312$f7418600$0301a8c0@chagar>

Hi robert/jim,

Well I built a cluster just for the hell of it. And as you said, before the
flames start, it was built just to see what I could do, built from cheap
PCs, just for the fun of it. They are 133Mhz PII and I built mine following
the instructions from pondermatic. Okay, so in this day and age that is old
hat, and so is my system but I enjoyed building it and enjoy playing around
with it. And then, being stupid myself, I wrote out instructions so that I
could did it again cause I will be the first to admit my memory isn't that
good.

Full details at http://homepage.ntlworld.com/ranjit.chagar/

Robert - if you have any questions let me know.

Jim - I dont mean for this email to sound bad but my english sometimes is
taken wrong. I mean to say that you can do it if you want.

Best Regards, Ranjit

----- Original Message -----
From: "Jim Lux" <james.p.lux at jpl.nasa.gov>
To: <rmd003 at sympatico.ca>
Cc: <beowulf at beowulf.org>
Sent: Friday, December 12, 2003 3:32 PM
Subject: Re: [Beowulf] Simple Cluster


> Sure you can do it. It won't be a ball of fire speed wise, and probably
> wouldn't be a cost effective solution to doing any "real work", but it
will
> compute..
>
> Search the web for the "Pondermatic" which, as I recall, was a couple or
> three P1s.  And of course, very early clusters were made with 486's.
>
> Your big challenge is probably going to be (easily) getting an appropriate
> distribution that fits within the disk and RAM limits.  Yes, before all
the
> flames start, I know it's possible to make a version that fits in 16K on
an
> 8088, and that would be bloatware compared to someone's special 6502 Linux
> implementation that runs on old Apple IIs, etc.etc.etc., but nobody would
> call that easy.  What Robert is probably looking for is a "stick the CDROM
> in and go" kind of solution, and, just like in the Windows world, the
> current, readily available (as in download the ISO and go) solutions tend
to
> assume one has a vintage 2001 computer sitting around with a several
hundred
> MHz processor and 64MB of RAM, etc.
>
> Actually, I'd be very glad to hear that this is not the case..
>
> Maybe one of the old Scyld "cluster on a disk" might be a good way?
>
> Perhaps Rocks?  It sort of self installs.
>
> One could always just boot 4 copies of Knoppix, but I don't know that
> there's many "cluster management" tools in Knoppix.
>
> ----- Original Message -----
> From: <rmd003 at sympatico.ca>
> To: <beowulf at beowulf.org>
> Sent: Thursday, December 11, 2003 2:14 PM
> Subject: [Beowulf] Simple Cluster
>
>
> > Hello,
> >
> > Would anyone know if it is possible to make a cluster with four P1
> > computers? If it is possible are there any instructions on how to do
> > this or the software required etc...?
> >
> > Robert Van Amelsvoort
> > rmd003 at sympatico.ca
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Mon Dec 15 12:41:12 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Mon, 15 Dec 2003 12:41:12 -0500 (EST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <5.2.0.9.2.20031215091234.02fa5c88@mailhost4.jpl.nasa.gov>
Message-ID: <Pine.LNX.4.44.0312151240560.3201-100000@ganesh.phy.duke.edu>

On Mon, 15 Dec 2003, Jim Lux wrote:

> Outstanding, Ranjit...
> Great that you wrote up a page describing how you did it, too!! Especially, 
> describing the problems you encountered (i.e. slot dependence for network 
> cards..)
> 
> So now you can say you built your own supercomputer.  How cool is that.


And just in time for Jeff's column, too;-)

   rgb

> 
> Jim
> 
> 
> At 01:54 PM 12/15/2003 +0000, Ranjit Chagar wrote:
> >Hi robert/jim,
> >
> >Well I built a cluster just for the hell of it. And as you said, before the
> >flames start, it was built just to see what I could do, built from cheap
> >PCs, just for the fun of it. They are 133Mhz PII and I built mine following
> >the instructions from pondermatic. Okay, so in this day and age that is old
> >hat, and so is my system but I enjoyed building it and enjoy playing around
> >with it. And then, being stupid myself, I wrote out instructions so that I
> >could did it again cause I will be the first to admit my memory isn't that
> >good.
> >
> >Full details at http://homepage.ntlworld.com/ranjit.chagar/
> >
> >Robert - if you have any questions let me know.
> >
> >Jim - I dont mean for this email to sound bad but my english sometimes is
> >taken wrong. I mean to say that you can do it if you want.
> >
> >Best Regards, Ranjit
> 
> James Lux, P.E.
> Spacecraft Telecommunications Section
> Jet Propulsion Laboratory, Mail Stop 161-213
> 4800 Oak Grove Drive
> Pasadena CA 91109
> tel: (818)354-2075
> fax: (818)393-6875
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Mon Dec 15 12:15:31 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Mon, 15 Dec 2003 09:15:31 -0800
Subject: [Beowulf] Simple Cluster
In-Reply-To: <005001c3c312$f7418600$0301a8c0@chagar>
References: <3FD8EC50.7060606@sympatico.ca>
 <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422>
Message-ID: <5.2.0.9.2.20031215091234.02fa5c88@mailhost4.jpl.nasa.gov>

Outstanding, Ranjit...
Great that you wrote up a page describing how you did it, too!! Especially, 
describing the problems you encountered (i.e. slot dependence for network 
cards..)

So now you can say you built your own supercomputer.  How cool is that.

Jim


At 01:54 PM 12/15/2003 +0000, Ranjit Chagar wrote:
>Hi robert/jim,
>
>Well I built a cluster just for the hell of it. And as you said, before the
>flames start, it was built just to see what I could do, built from cheap
>PCs, just for the fun of it. They are 133Mhz PII and I built mine following
>the instructions from pondermatic. Okay, so in this day and age that is old
>hat, and so is my system but I enjoyed building it and enjoy playing around
>with it. And then, being stupid myself, I wrote out instructions so that I
>could did it again cause I will be the first to admit my memory isn't that
>good.
>
>Full details at http://homepage.ntlworld.com/ranjit.chagar/
>
>Robert - if you have any questions let me know.
>
>Jim - I dont mean for this email to sound bad but my english sometimes is
>taken wrong. I mean to say that you can do it if you want.
>
>Best Regards, Ranjit

James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Mon Dec 15 13:56:04 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Mon, 15 Dec 2003 13:56:04 -0500 (EST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <20031215183945.41825.qmail@web60305.mail.yahoo.com>
Message-ID: <Pine.LNX.4.44.0312151347300.3201-100000@ganesh.phy.duke.edu>

On Mon, 15 Dec 2003, Andrew Latham wrote:

> I have a small 9 node p133 cluster. It works.
> 
> What does the list think about the idea of developing software on the
> smaller(mem) and older systems. I have one so I am bias but I do see that
> developing software that can handle 64meg of ram on a P586 system would lend to
> tighter and more efficant code. I am not trying to sell the P133 systems, only
> thinking about good code for them would be really nice(fast) on a Xeon or
> better. I already know this could spark a discussion on busses and chipsets and
> processors. Just thinking

More likely a discussion on balance.  I actually think that developing
on small clusters is good, but I'm not so sure about small REALLY old
systems.  The problem is that things like memory access speed and
pipelining change so much across processor generations that not only are
the bottlenecks different, the bottlenecking processes have different
thresholds and are in different ratios to the other system performance
determiners.  Just as performance on such a cluster would not be
terribly good as a predictor of performance on modern cluster from a
hardware point of view, it isn't certain that it would be all that great
from a software point of view.

My favorite case study to illustrate the point is what I continue to
think of as a brilliant piece of code -- ATLAS.  Would an ATLAS-tuned
BLAS built on and for a 586 still perform optimally on a P4 or Opteron?
I think not.  Not even close.  Even if ATLAS-level tuning may be beyond
most programmers, there are issues with stride, cache size and type, and
for parallel programmers the relative speeds of CPU, memory, and network
that can strongly affect program design and performance and scaling.

So I too have a small cluster at home and develop there, and for a lot
of code it doesn't matter as long as one doesn't test SCALING there.
But I'm not sure the code itself is any better "because" it was
developed there.

Although given that my beer-filled refrigerator is just downstairs, it
may be...;-)

   rgb

> 
> 
> --- Jim Lux <James.P.Lux at jpl.nasa.gov> wrote:
> > Outstanding, Ranjit...
> > Great that you wrote up a page describing how you did it, too!! Especially, 
> > describing the problems you encountered (i.e. slot dependence for network 
> > cards..)
> > 
> > So now you can say you built your own supercomputer.  How cool is that.
> > 
> > Jim
> > 
> > 
> > At 01:54 PM 12/15/2003 +0000, Ranjit Chagar wrote:
> > >Hi robert/jim,
> > >
> > >Well I built a cluster just for the hell of it. And as you said, before the
> > >flames start, it was built just to see what I could do, built from cheap
> > >PCs, just for the fun of it. They are 133Mhz PII and I built mine following
> > >the instructions from pondermatic. Okay, so in this day and age that is old
> > >hat, and so is my system but I enjoyed building it and enjoy playing around
> > >with it. And then, being stupid myself, I wrote out instructions so that I
> > >could did it again cause I will be the first to admit my memory isn't that
> > >good.
> > >
> > >Full details at http://homepage.ntlworld.com/ranjit.chagar/
> > >
> > >Robert - if you have any questions let me know.
> > >
> > >Jim - I dont mean for this email to sound bad but my english sometimes is
> > >taken wrong. I mean to say that you can do it if you want.
> > >
> > >Best Regards, Ranjit
> > 
> > James Lux, P.E.
> > Spacecraft Telecommunications Section
> > Jet Propulsion Laboratory, Mail Stop 161-213
> > 4800 Oak Grove Drive
> > Pasadena CA 91109
> > tel: (818)354-2075
> > fax: (818)393-6875
> > 
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 
> =====
> /---------------------------------------------------------------------------------------------------\
> Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.
> 
> What Is an agnostic? -  An agnostic thinks it impossible to know the truth
> in matters such as, a god or the future with which religions are concerned 
> with. Or, if not impossible, at least impossible at the present time.
>  
> LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
> \---------------------------------------------------------------------------------------------------/
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lathama at yahoo.com  Mon Dec 15 13:39:45 2003
From: lathama at yahoo.com (Andrew Latham)
Date: Mon, 15 Dec 2003 10:39:45 -0800 (PST)
Subject: [Beowulf] Simple Cluster
In-Reply-To: <5.2.0.9.2.20031215091234.02fa5c88@mailhost4.jpl.nasa.gov>
Message-ID: <20031215183945.41825.qmail@web60305.mail.yahoo.com>

I have a small 9 node p133 cluster. It works.

What does the list think about the idea of developing software on the
smaller(mem) and older systems. I have one so I am bias but I do see that
developing software that can handle 64meg of ram on a P586 system would lend to
tighter and more efficant code. I am not trying to sell the P133 systems, only
thinking about good code for them would be really nice(fast) on a Xeon or
better. I already know this could spark a discussion on busses and chipsets and
processors. Just thinking


--- Jim Lux <James.P.Lux at jpl.nasa.gov> wrote:
> Outstanding, Ranjit...
> Great that you wrote up a page describing how you did it, too!! Especially, 
> describing the problems you encountered (i.e. slot dependence for network 
> cards..)
> 
> So now you can say you built your own supercomputer.  How cool is that.
> 
> Jim
> 
> 
> At 01:54 PM 12/15/2003 +0000, Ranjit Chagar wrote:
> >Hi robert/jim,
> >
> >Well I built a cluster just for the hell of it. And as you said, before the
> >flames start, it was built just to see what I could do, built from cheap
> >PCs, just for the fun of it. They are 133Mhz PII and I built mine following
> >the instructions from pondermatic. Okay, so in this day and age that is old
> >hat, and so is my system but I enjoyed building it and enjoy playing around
> >with it. And then, being stupid myself, I wrote out instructions so that I
> >could did it again cause I will be the first to admit my memory isn't that
> >good.
> >
> >Full details at http://homepage.ntlworld.com/ranjit.chagar/
> >
> >Robert - if you have any questions let me know.
> >
> >Jim - I dont mean for this email to sound bad but my english sometimes is
> >taken wrong. I mean to say that you can do it if you want.
> >
> >Best Regards, Ranjit
> 
> James Lux, P.E.
> Spacecraft Telecommunications Section
> Jet Propulsion Laboratory, Mail Stop 161-213
> 4800 Oak Grove Drive
> Pasadena CA 91109
> tel: (818)354-2075
> fax: (818)393-6875
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

=====
/---------------------------------------------------------------------------------------------------\
Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.

What Is an agnostic? -  An agnostic thinks it impossible to know the truth
in matters such as, a god or the future with which religions are concerned 
with. Or, if not impossible, at least impossible at the present time.
 
LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
\---------------------------------------------------------------------------------------------------/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From gerry.creager at tamu.edu  Mon Dec 15 11:01:23 2003
From: gerry.creager at tamu.edu (Gerry Creager N5JXS)
Date: Mon, 15 Dec 2003 10:01:23 -0600
Subject: [Beowulf] Simple Cluster
In-Reply-To: <005001c3c312$f7418600$0301a8c0@chagar>
References: <3FD8EC50.7060606@sympatico.ca> <001901c3c0c5$38b25f10$36a8a8c0@LAPTOP152422> <005001c3c312$f7418600$0301a8c0@chagar>
Message-ID: <3FDDDAD3.3020005@tamu.edu>

The flames come sometimes... And in today's world, where a high end box 
can outperform a small, low-power cluster, it's often hard to separate 
the flames from significant help/tips.

My first cluster was 7 66 MHz 486's, and it was done as a proof of 
concept project.  I demonstrated that I could improve performance with 
the cluster over a single machine doing serialized processing of 
geodetic data.  Note that it was still faster to run the code serially 
on a dual-processor Pentium 266 with more memory than any of the nodes 
in the cluster...  But it proved the point and was a valid academic 
exercise.

Now you're ready to try code on a little cluster, and gain some 
programming skills.  After that, you're ready to build something bigger 
and more capable.

Good luck!
Gerry

Ranjit Chagar wrote:
> Hi robert/jim,
> 
> Well I built a cluster just for the hell of it. And as you said, before the
> flames start, it was built just to see what I could do, built from cheap
> PCs, just for the fun of it. They are 133Mhz PII and I built mine following
> the instructions from pondermatic. Okay, so in this day and age that is old
> hat, and so is my system but I enjoyed building it and enjoy playing around
> with it. And then, being stupid myself, I wrote out instructions so that I
> could did it again cause I will be the first to admit my memory isn't that
> good.
> 
> Full details at http://homepage.ntlworld.com/ranjit.chagar/
> 
> Robert - if you have any questions let me know.
> 
> Jim - I dont mean for this email to sound bad but my english sometimes is
> taken wrong. I mean to say that you can do it if you want.
> 
> Best Regards, Ranjit
> 
> ----- Original Message -----
> From: "Jim Lux" <james.p.lux at jpl.nasa.gov>
> To: <rmd003 at sympatico.ca>
> Cc: <beowulf at beowulf.org>
> Sent: Friday, December 12, 2003 3:32 PM
> Subject: Re: [Beowulf] Simple Cluster
> 
> 
> 
>>Sure you can do it. It won't be a ball of fire speed wise, and probably
>>wouldn't be a cost effective solution to doing any "real work", but it
> 
> will
> 
>>compute..
>>
>>Search the web for the "Pondermatic" which, as I recall, was a couple or
>>three P1s.  And of course, very early clusters were made with 486's.
>>
>>Your big challenge is probably going to be (easily) getting an appropriate
>>distribution that fits within the disk and RAM limits.  Yes, before all
> 
> the
> 
>>flames start, I know it's possible to make a version that fits in 16K on
> 
> an
> 
>>8088, and that would be bloatware compared to someone's special 6502 Linux
>>implementation that runs on old Apple IIs, etc.etc.etc., but nobody would
>>call that easy.  What Robert is probably looking for is a "stick the CDROM
>>in and go" kind of solution, and, just like in the Windows world, the
>>current, readily available (as in download the ISO and go) solutions tend
> 
> to
> 
>>assume one has a vintage 2001 computer sitting around with a several
> 
> hundred
> 
>>MHz processor and 64MB of RAM, etc.
>>
>>Actually, I'd be very glad to hear that this is not the case..
>>
>>Maybe one of the old Scyld "cluster on a disk" might be a good way?
>>
>>Perhaps Rocks?  It sort of self installs.
>>
>>One could always just boot 4 copies of Knoppix, but I don't know that
>>there's many "cluster management" tools in Knoppix.
>>
>>----- Original Message -----
>>From: <rmd003 at sympatico.ca>
>>To: <beowulf at beowulf.org>
>>Sent: Thursday, December 11, 2003 2:14 PM
>>Subject: [Beowulf] Simple Cluster
>>
>>
>>
>>>Hello,
>>>
>>>Would anyone know if it is possible to make a cluster with four P1
>>>computers? If it is possible are there any instructions on how to do
>>>this or the software required etc...?
>>>
>>>Robert Van Amelsvoort
>>>rmd003 at sympatico.ca
>>>
>>>_______________________________________________
>>>Beowulf mailing list, Beowulf at beowulf.org
>>>To change your subscription (digest mode or unsubscribe) visit
>>
>>http://www.beowulf.org/mailman/listinfo/beowulf
>>
>>_______________________________________________
>>Beowulf mailing list, Beowulf at beowulf.org
>>To change your subscription (digest mode or unsubscribe) visit
> 
> http://www.beowulf.org/mailman/listinfo/beowulf
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Gerry Creager -- gerry.creager at tamu.edu
Network Engineering -- AATLT, Texas A&M University	
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578
Page: 979.228.0173
Office: 903A Eller Bldg, TAMU, College Station, TX 77843

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From henken at seas.upenn.edu  Mon Dec 15 16:27:10 2003
From: henken at seas.upenn.edu (Nicholas Henke)
Date: Mon, 15 Dec 2003 16:27:10 -0500
Subject: [Beowulf] clubmask 0.6b2 released
Message-ID: <1071523630.6527.2.camel@roughneck.liniac.upenn.edu>

Changes since 0.6b1:
-----------------------------------------
Add support for runtime (clubmask.conf) choice of resource manager subsystem.
The available options now are ganglia and supermon. Support for ganglia3 will be
added once it is released. Ganglia is now the preferred choice, as it is _much_
more stable.

add --with-supermon to setup.py to turn on compiling of supermon python module.
It is now off by default, as ganglia is the preferred and default RM subsystem.
------------------------------------------------------------------------------

Name        : Clubmask
Version     : 0.6                             
Release     : b2
Group       : Cluster Resource Management and Scheduling
Vendor      : Liniac Project, University of Pennsylvania
License     : GPL-2
URL         : http://clubmask.sourceforge.net

What is Clubmask
------------------------------------------------------------------------------
Clubmask is a resource manager designed to allow Bproc based clusters
enjoy the full scheduling power and configuration of the Maui HPC
Scheduler.

Clubmask uses a modified version of the Supermon resource monitoring
software to gather resource information from the cluster nodes. This
information is combined with job submission data and delivered to the
Maui scheduler. Maui issues job control commands back to Clubmask,
which then starts or stops the job scripts using the Bproc environment.

Clubmask also provides builtin support for a supermon2ganglia translator
that allows a standard Ganlgia  web backend to contact supermon and get
XML data that will disply through the Ganglia web interface.

Clubmask is currently running on around 10 clusters, varying in size
from 8 to 128 nodes, and has been tested up to 5000 jobs.


Notes/warnings on this release:
------------------------------------------------------------------------------
Before upgrading, please make sure to save your /etc/clubmask/clubmask.conf
file, as it may get overwritten.

To use the resource requests, you must be running the latest snapshot of maui.


Links
-------------
Bproc: http://bproc.sourceforge.net
Ganglia: http://ganglia.sourceforge.net
Maui Scheduler: http://www.supercluster.org/maui
Supermon: http://supermon.sourceforge.net

Nic
-- 
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From camm at enhanced.com  Mon Dec 15 17:00:14 2003
From: camm at enhanced.com (Camm Maguire)
Date: 15 Dec 2003 17:00:14 -0500
Subject: [Beowulf] Simple Cluster
In-Reply-To: <Pine.LNX.4.44.0312151347300.3201-100000@ganesh.phy.duke.edu>
References: <Pine.LNX.4.44.0312151347300.3201-100000@ganesh.phy.duke.edu>
Message-ID: <547k0xu4gh.fsf@intech19.enhanced.com>

Greetings!  You may be interested in Debian's atlas setup.  We have
several binary packages which depend on a virtual blas2 and lapack2
package, which can be provided by either the reference libraries or a
variety of atlas provided versions with various ISA instructions
supported.  For example, on i386, we have sse, sse2, and 3dnow builds
in addition to the 'vanilla' x86 build.  As you know, the isa
instructions are only one of many factors affecting atlas tuning.
They are the key one, however, in a) determining whether the lib will
run at all on a given system, and b) that delivers the lion's share of
the performance.  The philosophy here is to provide binaries which
give factors of 2 or more of performance gain to be had, while making
it easy for users to get the remaining 10-20% by customizing the
package for their site.  'apt-get -q source atlas; cd atlas-3.2.1ln;
fakeroot debian/rules custom' gives one a tuned .deb for the running
box. 

We need to get newer versions of the lib uploaded, but otherwise it
works great.  'Almost' customized performance automatically available
to R, octave,.... without recompilation.

Take care,

"Robert G. Brown" <rgb at phy.duke.edu> writes:

> On Mon, 15 Dec 2003, Andrew Latham wrote:
> 
> > I have a small 9 node p133 cluster. It works.
> > 
> > What does the list think about the idea of developing software on the
> > smaller(mem) and older systems. I have one so I am bias but I do see that
> > developing software that can handle 64meg of ram on a P586 system would lend to
> > tighter and more efficant code. I am not trying to sell the P133 systems, only
> > thinking about good code for them would be really nice(fast) on a Xeon or
> > better. I already know this could spark a discussion on busses and chipsets and
> > processors. Just thinking
> 
> More likely a discussion on balance.  I actually think that developing
> on small clusters is good, but I'm not so sure about small REALLY old
> systems.  The problem is that things like memory access speed and
> pipelining change so much across processor generations that not only are
> the bottlenecks different, the bottlenecking processes have different
> thresholds and are in different ratios to the other system performance
> determiners.  Just as performance on such a cluster would not be
> terribly good as a predictor of performance on modern cluster from a
> hardware point of view, it isn't certain that it would be all that great
> from a software point of view.
> 
> My favorite case study to illustrate the point is what I continue to
> think of as a brilliant piece of code -- ATLAS.  Would an ATLAS-tuned
> BLAS built on and for a 586 still perform optimally on a P4 or Opteron?
> I think not.  Not even close.  Even if ATLAS-level tuning may be beyond
> most programmers, there are issues with stride, cache size and type, and
> for parallel programmers the relative speeds of CPU, memory, and network
> that can strongly affect program design and performance and scaling.
> 
> So I too have a small cluster at home and develop there, and for a lot
> of code it doesn't matter as long as one doesn't test SCALING there.
> But I'm not sure the code itself is any better "because" it was
> developed there.
> 
> Although given that my beer-filled refrigerator is just downstairs, it
> may be...;-)
> 
>    rgb
> 
> > 
> > 
> > --- Jim Lux <James.P.Lux at jpl.nasa.gov> wrote:
> > > Outstanding, Ranjit...
> > > Great that you wrote up a page describing how you did it, too!! Especially, 
> > > describing the problems you encountered (i.e. slot dependence for network 
> > > cards..)
> > > 
> > > So now you can say you built your own supercomputer.  How cool is that.
> > > 
> > > Jim
> > > 
> > > 
> > > At 01:54 PM 12/15/2003 +0000, Ranjit Chagar wrote:
> > > >Hi robert/jim,
> > > >
> > > >Well I built a cluster just for the hell of it. And as you said, before the
> > > >flames start, it was built just to see what I could do, built from cheap
> > > >PCs, just for the fun of it. They are 133Mhz PII and I built mine following
> > > >the instructions from pondermatic. Okay, so in this day and age that is old
> > > >hat, and so is my system but I enjoyed building it and enjoy playing around
> > > >with it. And then, being stupid myself, I wrote out instructions so that I
> > > >could did it again cause I will be the first to admit my memory isn't that
> > > >good.
> > > >
> > > >Full details at http://homepage.ntlworld.com/ranjit.chagar/
> > > >
> > > >Robert - if you have any questions let me know.
> > > >
> > > >Jim - I dont mean for this email to sound bad but my english sometimes is
> > > >taken wrong. I mean to say that you can do it if you want.
> > > >
> > > >Best Regards, Ranjit
> > > 
> > > James Lux, P.E.
> > > Spacecraft Telecommunications Section
> > > Jet Propulsion Laboratory, Mail Stop 161-213
> > > 4800 Oak Grove Drive
> > > Pasadena CA 91109
> > > tel: (818)354-2075
> > > fax: (818)393-6875
> > > 
> > > _______________________________________________
> > > Beowulf mailing list, Beowulf at beowulf.org
> > > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> > 
> > =====
> > /---------------------------------------------------------------------------------------------------\
> > Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.
> > 
> > What Is an agnostic? -  An agnostic thinks it impossible to know the truth
> > in matters such as, a god or the future with which religions are concerned 
> > with. Or, if not impossible, at least impossible at the present time.
> >  
> > LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
> > \---------------------------------------------------------------------------------------------------/
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> > 
> 
> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
> 
> 
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 
> 
> 

-- 
Camm Maguire			     			camm at enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From jlb17 at duke.edu  Tue Dec 16 11:35:32 2003
From: jlb17 at duke.edu (Joshua Baker-LePain)
Date: Tue, 16 Dec 2003 11:35:32 -0500 (EST)
Subject: [Beowulf] Quiet *and* powerful
In-Reply-To: <Pine.LNX.4.44.0312121012330.2243-100000@chaos.egr.duke.edu>
Message-ID: <Pine.LNX.4.44.0312161131580.2243-100000@chaos.egr.duke.edu>

I just wanted to thank everybody who's gotten back to me, both on and off 
list -- lots of good suggestions.  Now, off to see what I can implement...

-- 
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From avkon at imm.uran.ru  Wed Dec 17 09:08:15 2003
From: avkon at imm.uran.ru (Alexandr Konovalov)
Date: Wed, 17 Dec 2003 19:08:15 +0500
Subject: [Beowulf] Right place to MPICH discussions
Message-ID: <3FE0634F.8060300@imm.uran.ru>

Hi,

Where is relevant place to discuss MPICH internal problems?
I send mail to mpi-maint at mcs.anl.gov but receive no reaction.

Basically we have problem with shmat in
mpid/ch_p4/p4/lib/p4_MD.c:MD_initmem in linux around 2.4.20
kernels. It seems to me that if we change System V IPC
horrors with plain and simple mmap in MD_initmem we have
broke nothing anyway. Is this reasonable?

While googling I found only the hint "to play with
P4_GLOBMEMSIZE" but in our case P4_GLOBMEMSIZE always too
small (so MPICH complaine) or too high (so shmat failed).

It's quite strange to me that we have very general 
configuration (2 CPU Xeons, Redhat 7.3 etc) and problems 
arise at wide class of MPI programs. The only specific there 
I think the -with-comm=shared flag in confogure.

-- 
Best regards,
Alexandr Konovalov


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Wed Dec 17 13:54:31 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 17 Dec 2003 13:54:31 -0500 (EST)
Subject: [Beowulf] PVM master/slave project template...
Message-ID: <Pine.LNX.4.44.0312171346530.8574-100000@lucifer.rgb.private.net>

Dear Listvolken,

I just finished building a PVM master/slave project template for public
and private re-use (mostly for re-use in the CW column that I WILL
finish in the next couple of days:-).  I am curious as to whether it
works for anybody other than myself.  If there is anybody out there who
always wanted an automagical PVM template that does n hello worlds in
parallel with d delay (ready to be gutted and replaced with your own
code) then it would be lovely if you would grab it and give it a try.

I'm testing the included documentation too (yes, it is at least modestly
autodocumenting) so I won't tell you much more besides:

 http://www.phy.duke.edu/~rgb/General/general.php

from whence you can grab it.

N.B. -- the included docs do NOT tell you how to get and install pvm or
how to configure a pvm cluster; it is presumed that you can do or have
done that by other means.  For many of you it is at most a:

 yum install pvm

per node, or perhaps a rpm -Uvh /path/to/pvm-whatever.i386.rpm if you
don't have a yummified public repository.  Plus perhaps installing
pvm-gui on a head node. Then it is just setting the environment (e.g.
PVM_ROOT, PVM_RSH...) up correctly and cranking either pvm or xpvm to
create a virtual cluster.  This is all well documented elsewhere.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From hunting at ix.netcom.com  Wed Dec 17 22:21:05 2003
From: hunting at ix.netcom.com (Michael Huntingdon)
Date: Wed, 17 Dec 2003 19:21:05 -0800
Subject: [Beowulf] 1 hour benchmark account request
In-Reply-To: <20031218015322.GJ7381@cse.ucdavis.edu>
Message-ID: <3.0.3.32.20031217192105.00f11fc0@popd.ix.netcom.com>

Bill is toying with Itanium?


At 05:53 PM 12/17/2003 -0800, Bill Broadley wrote:
>
>Does anyone have a benchmark account available for an hour or so (afterhours
>is fine) that has the following available:
>* 32 nodes (p4 or athlon > 2 Ghz or opteron) 
>* Myrinet (any flavor) or
>* Infiniband gcc (any flavor) MPI (any flavor)
>
>I could return the favor with various opteron/itanium 2 benchmarking.
>
>-- 
>Bill Broadley
>Computational Science and Engineering
>UC Davis
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From bill at cse.ucdavis.edu  Wed Dec 17 20:53:22 2003
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Wed, 17 Dec 2003 17:53:22 -0800
Subject: [Beowulf] 1 hour benchmark account request
Message-ID: <20031218015322.GJ7381@cse.ucdavis.edu>


Does anyone have a benchmark account available for an hour or so (afterhours
is fine) that has the following available:
* 32 nodes (p4 or athlon > 2 Ghz or opteron) 
* Myrinet (any flavor) or
* Infiniband gcc (any flavor) MPI (any flavor)

I could return the favor with various opteron/itanium 2 benchmarking.

-- 
Bill Broadley
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From andrewxwang at yahoo.com.tw  Thu Dec 18 10:41:34 2003
From: andrewxwang at yahoo.com.tw (=?big5?q?Andrew=20Wang?=)
Date: Thu, 18 Dec 2003 23:41:34 +0800 (CST)
Subject: [Beowulf] real Grid computing
Message-ID: <20031218154134.90104.qmail@web16810.mail.tpe.yahoo.com>

BONIC will be replacing SETI at home's client for the
next generation of SETI at home.

http://boinc.berkeley.edu

It's opensource, and looks like it is better than to
wait for SGE 6.0 to get the P2P client.

Andrew.


-----------------------------------------------------------------
??? Yahoo!??
??????????????????????
http://tw.promo.yahoo.com/mail_premium/stationery.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From leigh at twilightdreams.net  Thu Dec 18 16:19:01 2003
From: leigh at twilightdreams.net (Leigh)
Date: Thu, 18 Dec 2003 16:19:01 -0500
Subject: [Beowulf] Semi-philosophical Question
Message-ID: <4.2.0.58.20031218161530.00b68e00@mail.flexfeed.com>

I was talking over Beowulf clusters with a coworker (as I have been working 
on learning to build one for my company) and he came up with an interesting 
question that I was unsure of.

As most of the data is saved upon the "gateway" and the other machines 
simply access it to use the data, what happens when multiple machines are 
making use of the same data and they all try to save at once? Do they all 
work as one system and save it only once, or can multiple nodes 
theoretically be using one file and both try to save to it?

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From optimize at optimization.net  Wed Dec 17 14:46:34 2003
From: optimize at optimization.net (optimize)
Date: Wed, 17 Dec 2003 14:46:34 -0500
Subject: [Beowulf] PVM master/slave project template...
Message-ID: <200312171946.AYF63595@ms7.verisignmail.com>

i would volunteer to get a copy of your good PVM work. i will 
try to validate/test it if at all possible. i could use it 
over large_scale combinatorial optimization problems.

thanks & bol
ralph
optimal regards.
-------------- next part --------------
An embedded message was scrubbed...
From: "Robert G. Brown" <rgb at phy.duke.edu>
Subject: [Beowulf] PVM master/slave project template...
Date: Wed, 17 Dec 2003 13:54:31 -0500 (EST)
Size: 3997
URL: <http://www.clustermonkey.net/pipermail/beowulf/attachments/20031217/db04212c/attachment-0001.mht>

From agrajag at dragaera.net  Thu Dec 18 17:35:36 2003
From: agrajag at dragaera.net (Jag)
Date: Thu, 18 Dec 2003 17:35:36 -0500
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <4.2.0.58.20031218161530.00b68e00@mail.flexfeed.com>
References: <4.2.0.58.20031218161530.00b68e00@mail.flexfeed.com>
Message-ID: <1071786936.4291.69.camel@pel>

On Thu, 2003-12-18 at 16:19, Leigh wrote:
> I was talking over Beowulf clusters with a coworker (as I have been working 
> on learning to build one for my company) and he came up with an interesting 
> question that I was unsure of.
> 
> As most of the data is saved upon the "gateway" and the other machines 

Not really in answer to your question, but some general info..
That is one configuration, but is not always the case.  As an example, I
currently have a cluster that has a node dedicated to sharing out a
terrabyte of space over NFS.  There's one node that's dedicated to doing
the scheduling (using SGE), and three other nodes allow user logins for
them to submit jobs from.  Jobs aren't executed on any of these nodes.

There's also something called pvfs that'll let you use the harddrives on
all your slave nodes and combine them into one shared filesystem that
they can all use.

> simply access it to use the data, what happens when multiple machines are 
> making use of the same data and they all try to save at once? Do they all 
> work as one system and save it only once, or can multiple nodes 
> theoretically be using one file and both try to save to it?

This is really an application specific question.  A lot of MPI jobs
shuffle all the data back to one of the processes and let that process
write out the output files, so you won't have a problem.  There are also
other programs that may have a seperate output file for every slave node
the job is run on.  What is your cluster going to be used for?  The best
way to answer the question is to determine what apps will be used and
see how they handle output.  If its an inhouse program, you may want to
make sure your programmers are aware they'll be writing to a shared
filesystem so that they don't accidently write the code in such a way
that the results get corrupted by having them all use the same output
file.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From leigh at twilightdreams.net  Thu Dec 18 17:45:09 2003
From: leigh at twilightdreams.net (Leigh)
Date: Thu, 18 Dec 2003 17:45:09 -0500
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <1071786936.4291.69.camel@pel>
References: <4.2.0.58.20031218161530.00b68e00@mail.flexfeed.com>
 <4.2.0.58.20031218161530.00b68e00@mail.flexfeed.com>
Message-ID: <4.2.0.58.20031218174322.00b6d310@mail.flexfeed.com>

At 05:35 PM 12/18/2003 -0500, Jag wrote:
>On Thu, 2003-12-18 at 16:19, Leigh wrote:
>
> >What is your cluster going to be used for?


It hasn't been decided what the cluster will be used for. The entire thing, 
thus far, is an experiment. Mostly to see if two people who so far, have no 
clue how to build one can get one built and running (so far so good, I 
think) and from there, we'll putz around and see what we can do with it. 
Maybe have fun with SETI at home, or perhaps sell space upon the "big" one 
once we get it going for scientists to be able to run data upon.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Fri Dec 19 06:38:48 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 19 Dec 2003 06:38:48 -0500 (EST)
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <4.2.0.58.20031218174322.00b6d310@mail.flexfeed.com>
Message-ID: <Pine.LNX.4.44.0312190636570.29889-100000@lucifer.rgb.private.net>

On Thu, 18 Dec 2003, Leigh wrote:

> At 05:35 PM 12/18/2003 -0500, Jag wrote:
> >On Thu, 2003-12-18 at 16:19, Leigh wrote:
> >
> > >What is your cluster going to be used for?
> 
> 
> It hasn't been decided what the cluster will be used for. The entire thing, 
> thus far, is an experiment. Mostly to see if two people who so far, have no 
> clue how to build one can get one built and running (so far so good, I 
> think) and from there, we'll putz around and see what we can do with it. 
> Maybe have fun with SETI at home, or perhaps sell space upon the "big" one 
> once we get it going for scientists to be able to run data upon.

Do you know how to program?  C?  Perl?

If so, I've got a few toys for you to play with...but they'll be boring
toys if you can't tinker.

   rgb

> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From david.n.lombard at intel.com  Fri Dec 19 09:52:32 2003
From: david.n.lombard at intel.com (Lombard, David N)
Date: Fri, 19 Dec 2003 06:52:32 -0800
Subject: [Beowulf] Semi-philosophical Question
Message-ID: <187D3A7CAB42A54DB61F1D05F012572201D4BFE2@orsmsx402.jf.intel.com>

From: Leigh; Sent: Thursday, December 18, 2003 4:45 PM
> 
> It hasn't been decided what the cluster will be used for. The entire
> thing,
> thus far, is an experiment. Mostly to see if two people who so far,
have
> no
> clue how to build one can get one built and running (so far so good, I
> think) and from there, we'll putz around and see what we can do with
it.
> Maybe have fun with SETI at home, or perhaps sell space upon the "big"
one
> once we get it going for scientists to be able to run data upon.

For learning purposes, have a blast.  But, before you make a "big" one
that others will use, make sure you know *what* the cluster is being
used for and that you design the cluster to meet those requirements.

You will probably be much better off going to a cluster builder that
focuses on your users' applications and builds the right system based on
those requirements.

-- 
David N. Lombard

My comments do not represent the opinion of Intel
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From topa_007 at yahoo.com  Thu Dec 18 23:47:39 2003
From: topa_007 at yahoo.com (70uf33q Hu5541n)
Date: Thu, 18 Dec 2003 20:47:39 -0800 (PST)
Subject: [Beowulf] University project Help required
Message-ID: <20031219044739.55802.qmail@web12703.mail.yahoo.com>

hi all,

20 yr old, Engineering grad from India doing a project
in Distributed Computing which is to be submitted in
March for evaluation to The University.

The Project deals with Computing Primes on a LAN based
network which is under load. 

The project aims at Real time Load Balancement on a
Heterogenous Cluster such that the Load is Distributed
such that the clients get synchronised and when the
data is received back it arrives at approx the same
time.

I'm attaching an Abstract on the project.Please go
through it and any comments/advice/guidance will be
helpful.

My prof says that JAVA RMI can be implemented for this
project.
I'm a noob programmer with exp in C/C++/JAVA.

I badly need guidance on how to go ahead with this
project.

Any help will be appreciated.
Thanks in advance.

Cheers,
Toufeeq

=====
"Love is control,I'll die if I let go
I will only let you breathe
My air that you receive
Then we'll see if I let you love me."
-James Hetfield
All Within My Hands,St.Anger
Metallica

__________________________________
Do you Yahoo!?
New Yahoo! Photos - easier uploading and sharing.
http://photos.yahoo.com/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Abstract.zip
Type: application/x-zip-compressed
Size: 6869 bytes
Desc: Abstract.zip
URL: <http://www.clustermonkey.net/pipermail/beowulf/attachments/20031218/ef8d5587/attachment-0001.bin>

From sal10 at utah.edu  Fri Dec 19 12:03:16 2003
From: sal10 at utah.edu (sal10 at utah.edu)
Date: Fri, 19 Dec 2003 10:03:16 -0700
Subject: [Beowulf] Wireless Channel Bonding
Message-ID: <1071853396.3fe32f54cb231@webmail.utah.edu>

I am working on a project to create a wireless network that uses several 
802.11 channels in an attempt to increase data throughput.  The network would 
link 2 computers and each computer would have 2 wireless cards.  Does anyone 
know if this can be done the same way as Ethernet channel bonding?   If anyone 
has any ideas, let me know. In addition, if anyone is aware of sources of 
information about wireless channel bonding, please let me know.
Thanks
Andy  

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From leigh at twilightdreams.net  Fri Dec 19 13:10:26 2003
From: leigh at twilightdreams.net (Leigh)
Date: Fri, 19 Dec 2003 13:10:26 -0500
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <Pine.LNX.4.44.0312190636570.29889-100000@lucifer.rgb.priva
 te.net>
References: <4.2.0.58.20031218174322.00b6d310@mail.flexfeed.com>
Message-ID: <4.2.0.58.20031219130954.00b43798@mail.flexfeed.com>

At 06:38 AM 12/19/2003 -0500, Robert G. Brown wrote:


>Do you know how to program?  C?  Perl?
>
>If so, I've got a few toys for you to play with...but they'll be boring
>toys if you can't tinker.
>
>    rgb

Unfortunately, I don't. I can read and understand code, but I can't code 
myself  yet.


Leigh
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From leigh at twilightdreams.net  Fri Dec 19 13:12:31 2003
From: leigh at twilightdreams.net (Leigh)
Date: Fri, 19 Dec 2003 13:12:31 -0500
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <187D3A7CAB42A54DB61F1D05F012572201D4BFE2@orsmsx402.jf.inte
 l.com>
Message-ID: <4.2.0.58.20031219131146.00b3c050@mail.flexfeed.com>

At 06:52 AM 12/19/2003 -0800, Lombard, David N wrote:

>For learning purposes, have a blast.  But, before you make a "big" one
>that others will use, make sure you know *what* the cluster is being
>used for and that you design the cluster to meet those requirements.
>
>You will probably be much better off going to a cluster builder that
>focuses on your users' applications and builds the right system based on
>those requirements.
>
>--
>David N. Lombard
>
>My comments do not represent the opinion of Intel


Currently, the plan is just to tinker around with a few (4) small systems 
to get the hang of things and figure out what I'm doing. Once we know what 
we're doing and what we want, we'll make more plans for the big stuff.


Leigh


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 19 14:17:42 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 19 Dec 2003 11:17:42 -0800
Subject: [Beowulf] Wireless Channel Bonding
In-Reply-To: <1071853396.3fe32f54cb231@webmail.utah.edu>
Message-ID: <5.2.0.9.2.20031219111046.0313ed08@mailhost4.jpl.nasa.gov>

At 10:03 AM 12/19/2003 -0700, sal10 at utah.edu wrote:
>I am working on a project to create a wireless network that uses several
>802.11 channels in an attempt to increase data throughput.  The network would
>link 2 computers and each computer would have 2 wireless cards.  Does anyone
>know if this can be done the same way as Ethernet channel bonding?   If 
>anyone
>has any ideas, let me know. In addition, if anyone is aware of sources of
>information about wireless channel bonding, please let me know.
>Thanks
>Andy

I've been looking into something quite similar, and, superficially at 
least, it should be possible, although clunky..

Here's one technique that will almost certainly work:

Two wired interfaces in the machine
Each interface is connected to a wireless bridge (something like the 
LinkSys WET11)
The two WETs are configured for different, non-overlapping, RF channels 
(1,6,11 for 802.11b)
As far as the machine is concerned, it's just like having two parallel wires.


Bear in mind that 802.11 is a half duplex medium!  Any one node can either 
be transmitting or receiving  but not both.  Think old style Coax Ethernet.


I see no philosophical reason why one couldn't, for instance, plug in 
multiple PCI based wireless cards. To the computer they just look like 
network interfaces.  The problem you might face is the lack of drivers for 
the high performance 802.11a or 802.11g PCI cards.

If someone can confirm that, for instance, the LinkSys WMP55AG works with 
Linux, particularly in connection with a VIA Mini-ITX motherboard, I'd be 
real happy to hear about it.


James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lathama at yahoo.com  Fri Dec 19 14:38:13 2003
From: lathama at yahoo.com (Andrew Latham)
Date: Fri, 19 Dec 2003 11:38:13 -0800 (PST)
Subject: [Beowulf] 2.6
In-Reply-To: <5.2.0.9.2.20031219111046.0313ed08@mailhost4.jpl.nasa.gov>
Message-ID: <20031219193813.11509.qmail@web60304.mail.yahoo.com>

Most already know and have played with 2.6.

lots of smp fixes and Linus is fixing documentation.

But did you read the changelog?....    ....hint read the last entry.


Summary of changes from v2.6.0-test11 to v2.6.0
============================================

<hirofumi at mail.parknet.co.jp>
	[PATCH] Missing initialization of /proc/net/tcp seq_file
	
	We need to initialize st->state in tcp_seq_start().  Otherwise
	tcp_seq_stop() is run with previous st->state, and it calls the unneeded
	unlock etc, causing a kernel crash.

<mingo at elte.hu>
	[PATCH] Fix lost wakeups problem

	When doing sync wakeups we must not skip the notification of other cpus
	if the task is not on this runqueue.

<torvalds at home.osdl.org>
	Fix x86 kernel page fault error codes

<torvalds at home.osdl.org>
	Fix ide-scsi.c uninitialized variable

<yoshfuji at linux-ipv6.org>
	[IPV6]: Fix ipv4 mapped address calculation in udpv6_sendmsg().

<laforge at netfilter.org>
	[NETFILTER]: Sanitize ip_ct_tcp_timeout_close_wait value, from 2.4.x

<pavlin at icir.org>
	[RTNETLINK]: Add RTPROT_XORP.

<mingo at elte.hu>
	[PATCH] Fix /proc access to dead thread group list oops
	
	The pid_alive() check within the loop is incorrect.  If we are within
	the tasklist lock and the thread group leader is valid then the thread
	chain will be fully intact.
	
	Instead, the check should be _outside_ the loop, since if the group
	leader no longer exists, the whole list is gone and we must not try
	to access it.
	
	Move the check around, and add comment.
	
	Bug-hunting and fix by Srivatsa Vaddagiri

<axboe at suse.de>
	[PATCH] fix broken x86_64 rdtscll
	
	The scheduler is completed b0rked on x86_64, and I finally found out
	why.  sched_clock() always returned 0, because rdtscll() always returned
	0.  The 'a' in the macro doesn't agree with the 'a' in the function,
	yippe :-)
	
	This is a show stopper for x86_64.

<khali at linux-fr.org>
	[PATCH] I2C: fix i2c_smbus_write_byte() for i2c-nforce2
	
	This patch fixes i2c_smbus_write_byte() being broken for i2c-nforce2.
	This causes trouble when that module is used together with eeprom (which
	is also in 2.6). We have had three user reports about the problem.
	
	Credits go to Mark D. Studebaker for finding and fixing the problem.

<drepper at redhat.com>
	[PATCH] Fix 'noexec' behaviour
	
	We should not allow mmap() with PROT_EXEC on mounts marked "noexec",
	since otherwise there is no way for user-supplied executable loaders
	(like ld.so and emulator environments) to properly honour the
	"noexec"ness of the target.

<davem at nuts.ninka.net>
	[NETFILTER]: In conntrack, do not fragment TSO packets by accident.

<ja at ssi.bg>
	[BRIDGE]: Provide correct TOS value to IPv4 routing.

<jgarzik at pobox.com>
	[PATCH] fix use-after-free in libata
	
	Fixes oops some were seeing on module unload.
	
	Caught by Jon Burgess.

<jgarzik at pobox.com>
	[PATCH] fix oops on unload in pcnet32
	
	The driver was calling pci_unregister_driver for each _device_, and then
	again at the end of the module unload routine.  Remove the call that's
	inside the loop, pci_unregister_driver should only be called once.
	   
	Caught by Don Fry (and many others)

<jgarzik at pobox.com>
	[PATCH] remove manual driver poisoning of net_device
	
	From: Al Viro <viro at parcelfarce.linux.theplanet.co.uk>
	   
	   Such poisoning can cause oopses either because the refcount is not
	   zero when the poisoning occurs, or due to kernel debugging options
	   being enabled.

<torvalds at home.osdl.org>
	Fix the PROT_EXEC breakage on anonymous mmap.
	
	Clean up the tests while at it.

<jgarzik at pobox.com>
	[PATCH] wireless airo oops fix
	
	From Javier Achirica:
	
	Delay MIC activation to prevent Oops

<davem at nuts.ninka.net>
	[PKT_SCHED]: Do not dereference the special pointer value 'HTB_DIRECT'.
	
	Based upon a patch from devik.

<devik at cdi.cz>
	[PKT_SCHED]: In HTB, filters must be destroyed before the classes.

<James_McMechan at hotmail.com>
	[PATCH] tmpfs oops fix
	
	The problem was that the cursor was in the list being walked, and when
	the pointer pointed to the cursor the list_del/list_add_tail pair would
	oops trying to find the entry pointed to by the prev pointer of the
	deleted cursor element.
	
	The solution I found was to move the list_del earlier, before the
	beginning of the list walk. since it is not used during the list walk and
	should not count in the list enumeration it can be deleted, then the
	list pointer cannot point to it so it can be added safely with the
	list_add_tail without oopsing, and everything works as expected.
	
	I am unable to oops this version with any of my test programs. 
	
	Patch acked by Al Viro.

<greg at kroah.com>
	[PATCH] USB: register usb-serial ports in the proper place in sysfs
	
	They should be bound to the interface the driver is attached to, not
	the device.

<david-b at pacbell.net>
	[PATCH] USB: fix remove device after set_configuration
	
	If a device can't be configured, the current test9 code forgets
	to clean it out of sysfs.  This resolves that issue, so the retry
	in usb_new_device() stands a chance of working.
	
	The enumeration code still doesn't handle such errors well, but
	at least this way that hub port can be used for another device.

<greg at kroah.com>
	[PATCH] USB: fix race with hub devices disconnecting while stuff is still
happening to them.

<acme at conectiva.com.br>
	[IPV6]: Fix TCP socket leak.
	
	TCP IPV6 ->hash() method should not grab a socket reference.

<axboe at suse.de>
	[PATCH] scsi_ioctl memcpy'ing user address
	
	James reported a bug in scsi_ioctl.c where it mem copies a user pointer
	instead of using copy_from_user(). I inadvertently introduced this one
	when getting rid of CDROM_SEND_PACKET. Here's a trivial patch to fix it.

<mdharm-usb at one-eyed-alien.net>
	[PATCH] USB storage: fix for jumpshot and datafab devices
	
	This patch fixes some obvious errors in the jumpshot and datafab drivers.
	
	This should close out Bugzilla bug #1408
	
	> Date: Mon, 1 Dec 2003 12:14:53 -0500 (EST)
	> From: Alan Stern <stern at rowland.harvard.edu>
	> Subject: Patch from Eduard Hasenleithner
	> To: Matthew Dharm <mdharm-usb at one-eyed-alien.net>
	> cc: USB Storage List <usb-storage at one-eyed-alien.net>
	>
	> Matt:
	>
	> Did you see this patch?  It was posted to the usb-development mailing list
	> about a week ago, before I started making all my changes.  It is clearly
	> correct and necessary.
	>
	> Alan Stern

<trini at kernel.crashing.org>
	[PATCH] USB: mark the scanner driver as obsolete
	
	On Mon, Dec 01, 2003 at 11:21:58AM -0800, Greg KH wrote:
	> Can't you use xsane without the scanner kernel driver?  I thought the
	> latest versions used libusb/usbfs to talk directly to the hardware.
	> Because of this, the USB scanner driver is marked to be removed from the
	> kernel sometime in the near future.
	
	After a bit of mucking around (and possibly finding a bug with debian's
	libusb/xsane/hotplug interaction, nothing seems to run
	/etc/hotplug/usb/libusbscanner and thus only root can scan, anyone whose
	got this working please let me know), the problem does not exist if I
	only use  libusb xsane.
	
	How about the following:

<oliver at neukum.org>
	[PATCH] USB: fix sleping in interrupt bug in auerswald driver
	
	this fixes two instances of GFP_KERNEL from completion handlers.

<oliver at neukum.org>
	[PATCH] USB: fix race with signal delivery in usbfs
	
	apart from locking bugs, there are other races. This fixes one with
	signal delivery. The signal should be delivered _before_ the reciever
	is woken.

<stern at rowland.harvard.edu>
	[PATCH] USB: fix bug not setting device state following usb_device_reset()

<herbert at gondor.apana.org.au>
	[PATCH] USB: Fix connect/disconnect race
	
	This patch was integrated by you in 2.4 six months ago.  Unfortunately
	it never got into 2.5.  Without it you can end up with crashes such
	as http://bugs.debian.org/218670

<greg at kroah.com>
	[PATCH] USB: fix bug for multiple opens on ttyUSB devices.
	
	This patch fixes the bug where running ppp over a ttyUSB device would fail.

<arvidjaar at mail.ru>
	[PATCH] USB: prevent catch-all USB aliases in modules.alias
	
	visor.c defines one empty slot in USB ids table that can be filled in at
	runtime using module parameters. file2alias generates catch-all alias for it:
	
	alias usb:v*p*dl*dh*dc*dsc*dp*ic*isc*ip* visor
	
	patch adds the same sanity check as in depmod to scripts/file2alias.

<greg at kroah.com>
	kobject: fix bug where a parent could be deleted before a child device.

<torvalds at home.osdl.org>
	Fix subtle bug in "finish_wait()", which can cause kernel stack
	corruption on SMP because of another CPU still accessing a waitqueue
	even after it was de-allocated.
	
	Use a careful version of the list emptiness check to make sure we
	don't de-allocate the stack frame before the waitqueue is all done.

<axboe at suse.de>
	[PATCH] no bio unmap on cdb copy failure
	
	The previous scsi_ioctl.c patch didn't cleanup the buffer/bio in the
	error case. 
	
	Fix it by copying the command data earlier.

<l.s.r at web.de>
	[PATCH] HPFS: missing lock_kernel() in hpfs_readdir()
	
	In 2.5.x, the BKL was pushed from vfs_readdir() into the filesystem
	specific functions.  But only the unlock_kernel() made it into the HPFS
	code, lock_kernel() got lost on the way.  This rendered the filesystem
	unusable.
	
	This adds the missing lock_kernel().  It's been tested by Timo Maier who
	also reported the problem earlier today.

<torvalds at home.osdl.org>
	More subtle SMP bugs in prepare_to_wait()/finish_wait(). 
	
	This time we have a SMP memory ordering issue in prepare_to_wait(),
	where we really need to make sure that subsequent tests for the
	event we are waiting for can not migrate up to before the wait
	queue has been set up.

<torvalds at home.osdl.org>
	Fix thread group leader zombie leak
	
	Petr Vandrovec noticed a problem where the thread group leader
	would not be properly reaped if the parent of the thread group
	was ignoring SIGCHLD, and the thread group leader had exited
	before the last sub-thread.
	
	Fixed by Ingo Molnar.

<neilb at cse.unsw.edu.au>
	[PATCH] Fix possible bio corruption with RAID5
	
	 1/ make sure raid5 doesn't try to handle multiple overlaping
	    requests at the same time as this would confuse things badly.
	    Currently it justs BUGs if this is attempted.
	 2/ Fix a possible data-loss-on-write problem.  If two or
	    more bio's that write to the same page are processed at the
	    same time, only the first was actually commited to storage.
	 3/ Fix a use-after-free bug.  raid5 keeps the bio's it is given
	    in linked lists when more than one bio touch a single page.
	    In some cases the tail of this list can be freed, and
	    the current test for 'are we at the end' isn't reliable.
	    This patch strengths the test to make it reliable.

<axboe at suse.de>
	[PATCH] Fix IDE bus reset and DMA disable when reading blank DVD-R
	
	From Jon Burgess:
	
	  There is a problems with blank DVD media using the ide-cd driver.
	
	  When we attempt to read the blank disk, the drive responds to the read
	  request by returning a "blank media" error.  The kernel doesn't have
	  any special case handling for this sense value and retries the request
	  a couple of times, then gives up and does a bus reset and disables DMA
	  to the device.
	
	  Which obviously doesn't help the situation.
	
	  The sense key value of 8 isn't listed in ide-cd.h, but it is listed in
	  scsi.h as a "BLANK_CHECK" error.
	
	  This trivial patch treats this error condition as a reason to abort
	  the request.  This behaviour is the same as what we do with a blank CD-R.
	
	  It looks like the same fix might be desired for 2.4 as well, although
	  is perhaps not so important since scsi-ide is normally used instead.

<axboe at suse.de>
	[PATCH] CDROM_SEND_PACKET bug
	
	I just found Yet Another Bug in scsi_ioctl - CDROM_SEND_PACKET puts a
	kernel pointer in hdr->cmdp, where sg_io() expects to find user address.
	This worked up until recently because of the memcpy bug, but now it
	doesn't because we do the proper copy_from_user(). 
	
	This fix undoes the user copy code from sg_io, and instead makes the
	SG_IO ioctl copy it locally.  This makes SG_IO and CDROM_SEND_PACKET
	agree on the calling convention, and everybody is happy. 
	
	I've tested that both
	
	   cdrecord -dev=/dev/hdc -inq
	
	and
	
	   cdrecord -dev=ATAPI:/dev/hdc -inq
	
	works now.  The former will use SG_IO, the latter CDROM_SEND_PACKET (and
	incidentally would work in both 2.4 and 2.6, if it wasn't for
	CDROM_SEND_PACKET sucking badly in 2.4).

<jes at trained-monkey.org>
	[PATCH] qla1280 crash fix in error handling
	
	This fixes a bug in the qla1280 driver where it would leave a pointer to
	an on the stack completion event in a command structure if
	qla1280_mailbox_command fails.  The result is that the interrupt handler
	later tries to complete() garbage on the stack.  The mailbox command can
	fail if a device on the bus decides to lock up etc.

<torvalds at home.osdl.org>
	Linux 2.6.0


=====
/---------------------------------------------------------------------------------------------------\
Andrew Latham -LathamA - Penguin Loving, Moralist Agnostic.

What Is an agnostic? -  An agnostic thinks it impossible to know the truth
in matters such as, a god or the future with which religions are concerned 
with. Or, if not impossible, at least impossible at the present time.
 
LathamA.com - (lay-th-ham-eh) - lathama at lathama.com - lathama at yahoo.com
\---------------------------------------------------------------------------------------------------/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From rgb at phy.duke.edu  Fri Dec 19 14:10:52 2003
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 19 Dec 2003 14:10:52 -0500 (EST)
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <4.2.0.58.20031219130954.00b43798@mail.flexfeed.com>
Message-ID: <Pine.LNX.4.44.0312191408390.15354-100000@ganesh.phy.duke.edu>

On Fri, 19 Dec 2003, Leigh wrote:

> At 06:38 AM 12/19/2003 -0500, Robert G. Brown wrote:
> 
> 
> >Do you know how to program?  C?  Perl?
> >
> >If so, I've got a few toys for you to play with...but they'll be boring
> >toys if you can't tinker.
> >
> >    rgb
> 
> Unfortunately, I don't. I can read and understand code, but I can't code 
> myself  yet.

Ahh.  The biggest problem you'll then have with clusters is that you're
stuck running other people's code.  There is some "fun" code out there
to play with that doesn't require anything but building and running in
e.g. the PVM or MPI distributions and elsewhere, but not a whole lot.
To go further at some point you'll have to learn to code.  Then you can
write applications like one that generates all the prime numbers with
less than X digits and the like...;-)

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From alvin at Mail.Linux-Consulting.com  Fri Dec 19 14:48:50 2003
From: alvin at Mail.Linux-Consulting.com (Alvin Oga)
Date: Fri, 19 Dec 2003 11:48:50 -0800 (PST)
Subject: [Beowulf] Wireless Channel Bonding
In-Reply-To: <5.2.0.9.2.20031219111046.0313ed08@mailhost4.jpl.nasa.gov>
Message-ID: <Pine.LNX.3.96.1031219114419.24392A-100000@Maggie.Linux-Consulting.com>


hi ya jim

On Fri, 19 Dec 2003, Jim Lux wrote:

> I see no philosophical reason why one couldn't, for instance, plug in 
> multiple PCI based wireless cards. To the computer they just look like 
> network interfaces.  The problem you might face is the lack of drivers for 
> the high performance 802.11a or 802.11g PCI cards.

i've gotten a netgear wg311 (802.11g) nic recognized/configured
on my test redhat EL - ws setup with the madwifi drivers

collection of wireless drivers and supported cards:

	http://www.Linux-Sec.net/Wireless

> If someone can confirm that, for instance, the LinkSys WMP55AG works with 
> Linux, particularly in connection with a VIA Mini-ITX motherboard, I'd be 
> real happy to hear about it.

i'll be playing with the linksys WMP54g next for the other end of the
wireless connection, and hopefully run ipsec between teh two connections
since wep is a cracked technology
 
c ya
alvin

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 19 18:08:16 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 19 Dec 2003 15:08:16 -0800
Subject: [Beowulf] Semi-philosophical Question
In-Reply-To: <4.2.0.58.20031219130954.00b43798@mail.flexfeed.com>
References: <Pine.LNX.4.44.0312190636570.29889-100000@lucifer.rgb.priva te.net>
 <4.2.0.58.20031218174322.00b6d310@mail.flexfeed.com>
Message-ID: <5.2.0.9.2.20031219150401.0319aaa8@mailhost4.jpl.nasa.gov>

At 01:10 PM 12/19/2003 -0500, Leigh wrote:
>At 06:38 AM 12/19/2003 -0500, Robert G. Brown wrote:
>
>
>>Do you know how to program?  C?  Perl?
>>
>>If so, I've got a few toys for you to play with...but they'll be boring
>>toys if you can't tinker.
>>
>>    rgb
>
>Unfortunately, I don't. I can read and understand code, but I can't code 
>myself  yet.
>


Hah... if you can read and understand code, you can tinker with it.. If you 
break it.. well, that's why you keep versions.  Surely you can use a text 
editor and invoke the compiler/linker.

I'll even point out that one can run parallel applications using Visual 
Basic (or even, qbasic, for that matter)

Leap in and start modifying.  Do those series expansions for psi, e, 
Euler's Constant, pi, etc.  Solve the 8 queens problems. Crack 
DES.  Calculate casino odds by monte carlo simulation ( a nice 
embarrassingly parallel challenge...)

If you want something more "useful", take a look at one of the genetic 
optimizing algorithms and parallelize it (or, more usefully, find someone 
else's parallel implementation, and modify or configure it with something 
practical.)


James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From James.P.Lux at jpl.nasa.gov  Fri Dec 19 18:10:12 2003
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Fri, 19 Dec 2003 15:10:12 -0800
Subject: [Beowulf] Wireless Channel Bonding
References: <5.2.0.9.2.20031219111046.0313ed08@mailhost4.jpl.nasa.gov>
Message-ID: <5.2.0.9.2.20031219150853.031a8870@mailhost4.jpl.nasa.gov>

At 11:48 AM 12/19/2003 -0800, Alvin Oga wrote:


I'm more interested in the 802.11a 5GHz technologies.. the WMP54g is a 2.4 
GHz band device (read, incredibly congested in my lab).  However, I have 
been given to understand that the WMP55AG is based on the Atheros chipset, 
and that they have actually published a Linux driver...


>hi ya jim
>
>On Fri, 19 Dec 2003, Jim Lux wrote:
>
> > I see no philosophical reason why one couldn't, for instance, plug in
> > multiple PCI based wireless cards. To the computer they just look like
> > network interfaces.  The problem you might face is the lack of drivers for
> > the high performance 802.11a or 802.11g PCI cards.
>
>i've gotten a netgear wg311 (802.11g) nic recognized/configured
>on my test redhat EL - ws setup with the madwifi drivers
>
>collection of wireless drivers and supported cards:
>
>         http://www.Linux-Sec.net/Wireless
>
> > If someone can confirm that, for instance, the LinkSys WMP55AG works with
> > Linux, particularly in connection with a VIA Mini-ITX motherboard, I'd be
> > real happy to hear about it.
>
>i'll be playing with the linksys WMP54g next for the other end of the
>wireless connection, and hopefully run ipsec between teh two connections
>since wep is a cracked technology
>
>c ya
>alvin

James Lux, P.E.
Spacecraft Telecommunications Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From brian.dobbins at yale.edu  Fri Dec 19 19:19:42 2003
From: brian.dobbins at yale.edu (Brian Dobbins)
Date: Fri, 19 Dec 2003 19:19:42 -0500 (EST)
Subject: [Beowulf] 2.6
In-Reply-To: <20031219193813.11509.qmail@web60304.mail.yahoo.com>
Message-ID: <Pine.LNX.4.44.0312191915120.18934-100000@email.combustion.eng.yale.edu>

> Most already know and have played with 2.6.
> lots of smp fixes and Linus is fixing documentation.
> But did you read the changelog?....    ....hint read the last entry.

  On a slightly different note (.. different from the changelog bit ...), 
what are people's experiences in terms of performance?  Any noticeable 
difference in, ie, SMP codes?  I/O?  Network performance?  

  I have some Opterons here which, as soon as the jobs they need to run 
are done, I'm going to reboot with a PXE+Etherboot (*) 2.6 kernel to play 
with, but that could be a while yet. 

  (*) And, for the sake of saving anyone who may be trying the same thing, 
for some reason when using "mkelf-linux", I couldn't specify:

  --append="root=/dev/ram"

  .. like I could with the 2.4 kernel.  This time, I had to use the device 
numbers:

  --append="root=0100"

  Not 100% sure that was the problem, since it was done on very little 
sleep, but if any of you are booting diskless opterons and want to try the 
2.6 kernel but aren't having much luck, give that a shot.

  Cheers,
  - Brian


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From douglas at shore.net  Sat Dec 20 12:54:32 2003
From: douglas at shore.net (Douglas O'Flaherty)
Date: Sat, 20 Dec 2003 12:54:32 -0500
Subject: [Beowulf] RH Update 1 Announcement
Message-ID: <3FE48CD8.1060708@shore.net>

Thought this list would be interested... Now if they only also announced 
cluster pricing...

RedHat goes public with what is in Update 1:

http://news.com.com/2100-7344_3-5130174.html?tag=nefd_top

*Red Hat began public testing this week of an update designed to make 
its new premium Linux product work better on IBM servers and computers 
that use Advanced Micro Devices' Opteron chip. *

Update 1 of Red Hat Enterprise Linux 3 
<http://news.com.com/2100-7344-5094774.html?tag=nl> is expected to be 
final in mid-January, spokeswoman Leigh Day said on Friday.

The update will speed up RHEL 3 on IBM mainframes, Red Hat said. It will 
also make it work on a broader number of IBM's Power-chip-based pSeries 
and iSeries servers and on some new servers using Intel's Itanium 2 
processor.

doug


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From landman at scalableinformatics.com  Sat Dec 20 14:16:02 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Sat, 20 Dec 2003 14:16:02 -0500
Subject: [Beowulf] RH Update 1 Announcement
In-Reply-To: <3FE48CD8.1060708@shore.net>
References: <3FE48CD8.1060708@shore.net>
Message-ID: <1071947761.12682.18.camel@protein.scalableinformatics.com>

So my questions are (relative to this), which product would be used for
the compute nodes on a cluster?  

Redhat has:
	RHEL WS at ~$792 from web store
	
SUSE has:
	SLES 2 CPU license at ~$767 from their web store
	SL Pro 9.0 for AMD64 at  ~$120    

I assume the $700++ items have the NUMA patches.  Does the SL Pro
product?  

Of course there are other distributions one could use.  Commercially
Scyld, CLIC, and a few others are out or coming out such as Callident . 
Non-commercial you have ROCKS, cAos (soon), White-Box, Debian, OSCAR +
[RH | Mandrake], biobrew, Gentoo, and probably a few others.

Who is going to support the x86_64 platforms?  RH and SUSE are obvious,
but I think that cAos, ROCKS, CLIC, Gentoo, et al may/will support
x86_64.  Has anyone compiled a list yet?

Curious.

Joe

-- 
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman at scalableinformatics.com
  web: http://scalableinformatics.com
phone: +1 734 612 4615


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From kervin at blueprintinc.com  Sat Dec 20 17:30:13 2003
From: kervin at blueprintinc.com (Kervin L. Pierre)
Date: Sat, 20 Dec 2003 17:30:13 -0500
Subject: [Beowulf] is "TCP Short Messages" patch necessary/available for 2.4 kernel?
Message-ID: <3FE4CD75.9070805@blueprintinc.com>

Hello,

I am upgrading software on a cluster at my college and part of the 
documentation says to patch the kernel with the "TCP Short Messages" 
patch found at http://www.icase.edu/coral/LinuxTCP.html .

The patch is only available for 2.2 series kernel and none seems to be 
done for the 2.4 kernel.  The contact email on that page bounces as well.

Is this patch still necessary for TCP Short Messages functionality?  If 
so where can I find the patch against 2.4?

Any information would be appreciated,
--Kervin
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From laytonjb at comcast.net  Sat Dec 20 18:49:03 2003
From: laytonjb at comcast.net (Jeffrey B. Layton)
Date: Sat, 20 Dec 2003 18:49:03 -0500
Subject: [Beowulf] is "TCP Short Messages" patch necessary/available for
 2.4 kernel?
In-Reply-To: <3FE4CD75.9070805@blueprintinc.com>
References: <3FE4CD75.9070805@blueprintinc.com>
Message-ID: <3FE4DFEF.2000502@comcast.net>

Kervin,

   You don't need it for the 2.4 or 2.6 kernels.

Enjoy!

Jeff

> Hello,
>
> I am upgrading software on a cluster at my college and part of the 
> documentation says to patch the kernel with the "TCP Short Messages" 
> patch found at http://www.icase.edu/coral/LinuxTCP.html .
>
> The patch is only available for 2.2 series kernel and none seems to be 
> done for the 2.4 kernel.  The contact email on that page bounces as well.
>
> Is this patch still necessary for TCP Short Messages functionality?  
> If so where can I find the patch against 2.4?
>
> Any information would be appreciated,
> --Kervin
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
>


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From kervin at blueprintinc.com  Sat Dec 20 23:36:09 2003
From: kervin at blueprintinc.com (Kervin L. Pierre)
Date: Sat, 20 Dec 2003 23:36:09 -0500
Subject: [Beowulf] is "TCP Short Messages" patch necessary/available for
 2.4 kernel?
In-Reply-To: <3FE4DFEF.2000502@comcast.net>
References: <3FE4CD75.9070805@blueprintinc.com> <3FE4DFEF.2000502@comcast.net>
Message-ID: <3FE52339.5090203@blueprintinc.com>

Thanks Jeffrey,

Is there a kernel config or a /proc file associated with TCP Short 
Messages?  Or is it enabled by default?  Eg with the patch one had to 
'echo 1 > /proc/sys/net/ipv4/tcp_faster_timeouts', but this file is not 
in 2.4's /proc.

On a related note, does anyone have any TCP options I can turn on to 
improve the network performance of my beowulf?  I have 50 nodes using 
channel-bonding on 4 cisco switches.

Thanks again,
--Kervin

Jeffrey B. Layton wrote:
> Kervin,
> 
>   You don't need it for the 2.4 or 2.6 kernels.
> 
> Enjoy!
> 
> Jeff
> 
>> Hello,
>>
>> I am upgrading software on a cluster at my college and part of the 
>> documentation says to patch the kernel with the "TCP Short Messages" 
>> patch found at http://www.icase.edu/coral/LinuxTCP.html .
>>
>> The patch is only available for 2.2 series kernel and none seems to be 
>> done for the 2.4 kernel.  The contact email on that page bounces as well.
>>
>> Is this patch still necessary for TCP Short Messages functionality?  
>> If so where can I find the patch against 2.4?
>>
>> Any information would be appreciated,
>> --Kervin
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit 
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
> 
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From laytonjb at comcast.net  Sun Dec 21 07:55:32 2003
From: laytonjb at comcast.net (Jeffrey B. Layton)
Date: Sun, 21 Dec 2003 07:55:32 -0500
Subject: [Beowulf] is "TCP Short Messages" patch necessary/available for
 2.4 kernel?
In-Reply-To: <3FE52339.5090203@blueprintinc.com>
References: <3FE4CD75.9070805@blueprintinc.com> <3FE4DFEF.2000502@comcast.net> <3FE52339.5090203@blueprintinc.com>
Message-ID: <3FE59844.8050002@comcast.net>

Kervin,

> Thanks Jeffrey,
>
> Is there a kernel config or a /proc file associated with TCP Short 
> Messages?  Or is it enabled by default?  Eg with the patch one had to 
> 'echo 1 > /proc/sys/net/ipv4/tcp_faster_timeouts', but this file is 
> not in 2.4's /proc. 


   To be honest, I can't remember. Josip found that problem way back
in the 2.2 kernel days and I haven't used a 2.2 kernel in about a
year. Here's the original link I have:

http://www.icase.edu/coral/LinuxTCP2.html

Here's a post from Josip explaining that the short message problem
was fixed in the 2.4 series kernels:

http://www.beowulf.org/pipermail/beowulf/2001-August/000988.html

Once again, you don't have to worry about the problem. However, if
you think it's a problem, I'd contact Josip directly and see if he can
help you determine if it is a problem for your code and perhaps how
you can fix it.
   To be honest, it might not be worth fixing. The TCP stack and
networking in the 2.6 kernel are pretty good from what I've heard.
Maybe switching to a 2.6 kernel could help the problem.

> On a related note, does anyone have any TCP options I can turn on to 
> improve the network performance of my beowulf?  I have 50 nodes using 
> channel-bonding on 4 cisco switches. 


   My condolences on using Cisco. I've need had the displeasure
of using them in clusters, but from everything I've heard and
everyone I have spoken with, they're not the best. Difficult
beasts to work with and they don't have good throughput.
   Can you give us some more details? What kind of nodes?
What kind of NICs? Driver version? Switch version? Are you
just trying to get better performance or do you think there's
a problem? What kind of network performance are you getting
now? Have you run things like netpipe and/or netperf between
two nodes? How about testing the NASA Parallel benchmarks
between various combinations of nodes to check performance?
What MPI are you running?
   Also, since you're bonding, have you applied the latest
bonding patches?

http://sourceforge.net/projects/bonding/

You might also join the bonding mailing list if the problem
appears to be with the channel bonding.


Good Luck!

Jeff

P.S. It's Jeff, not Jeffrey. Only RGB calls me Jeffrey and I think
he does it to tweak me. Well, there is my wife when she's
angry with me. Wait, I think I hear some yelling... .

>
>
> Thanks again,
> --Kervin
>
> Jeffrey B. Layton wrote:
>
>> Kervin,
>>
>>   You don't need it for the 2.4 or 2.6 kernels.
>>
>> Enjoy!
>>
>> Jeff
>>
>>> Hello,
>>>
>>> I am upgrading software on a cluster at my college and part of the 
>>> documentation says to patch the kernel with the "TCP Short Messages" 
>>> patch found at http://www.icase.edu/coral/LinuxTCP.html .
>>>
>>> The patch is only available for 2.2 series kernel and none seems to 
>>> be done for the 2.4 kernel.  The contact email on that page bounces 
>>> as well.
>>>
>>> Is this patch still necessary for TCP Short Messages functionality?  
>>> If so where can I find the patch against 2.4?
>>>
>>> Any information would be appreciated,
>>> --Kervin
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org
>>> To change your subscription (digest mode or unsubscribe) visit 
>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>>
>>
>>
>
>


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From mbanck at gmx.net  Sun Dec 21 07:34:00 2003
From: mbanck at gmx.net (Michael Banck)
Date: Sun, 21 Dec 2003 13:34:00 +0100
Subject: [Beowulf] RH Update 1 Announcement
In-Reply-To: <1071947761.12682.18.camel@protein.scalableinformatics.com>
References: <3FE48CD8.1060708@shore.net> <1071947761.12682.18.camel@protein.scalableinformatics.com>
Message-ID: <20031221123400.GA23879@blackbird.oase.mhn.de>

On Sat, Dec 20, 2003 at 02:16:02PM -0500, Joe Landman wrote:
> Who is going to support the x86_64 platforms?  RH and SUSE are obvious,
> but I think that cAos, ROCKS, CLIC, Gentoo, et al may/will support
> x86_64.  Has anyone compiled a list yet?

Debian will. Stuff is still being hashed out, though, as being able to
have both 32 and 64 bit packages installed concurrently requires some
changes to the low-level packaging system.


Michael
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From csamuel at vpac.org  Sun Dec 21 18:33:55 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Mon, 22 Dec 2003 10:33:55 +1100
Subject: [Beowulf] RH Update 1 Announcement
In-Reply-To: <1071947761.12682.18.camel@protein.scalableinformatics.com>
References: <3FE48CD8.1060708@shore.net> <1071947761.12682.18.camel@protein.scalableinformatics.com>
Message-ID: <200312221033.56256.csamuel@vpac.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sun, 21 Dec 2003 06:16 am, Joe Landman wrote:

> Who is going to support the x86_64 platforms?  RH and SUSE are obvious,
> but I think that cAos, ROCKS, CLIC, Gentoo, et al may/will support
> x86_64.  Has anyone compiled a list yet?

Data points that I'm aware off (apart from SuSE and RHEL):

NPACI Rocks - 3.1 due out Real Soon Now (tm) (maybe this week) will be rebuild 
from trademark-stripped RHEL SRPMS (as Redhat require) and will support 
Opterons as well as the previous IA32 and IA64 architectures.

Mandrake 9.2 for AMD64 - currently at RC1 and freely downloadable for Opterons 
and Athlon64 processors.

Gentoo's AMD64 support sounds distinctly early beta-ish from their technical 
notes at http://dev.gentoo.org/~brad_mssw/amd64-tech-notes.html - there's 
also a report (no details) of a successful install at 
http://www.odegard.uni.cc/index.php?itemid=3

Debian likewise sounds like a work in progress, the port home page is at 
http://www.debian.org/ports/amd64/ and there's an FAQ linked from it which 
gives a lot more information. Of course, given the recent compromise of 
Debian systems the development may be more advanced than the web pages.

The cAos website lists AMD64 as a target, but the download sites only list 
i386 for the moment.

TurboLinux now supports Opterons with the release of their AMD64 Update Kit at 
http://www.turbolinux.com/products/tl8a/tl8a_uk/ - I guess this shouldn't be 
suprising as they're a UnitedLinux distro just like SuSE is. Connectia 
(another UL distro) doesn't seem to, although their website is in Spanish and 
I had to guess what the search form was. :-)

cheers!
Chris
- -- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/5i3jO2KABBYQAh8RAktXAJ9qjfnmUTfMgUkTR3ujtgGvonfqcgCghvAp
c4thcjce81kA9t6odoowblc=
=k/Dd
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From michael.worsham at mci.com  Mon Dec 22 11:34:31 2003
From: michael.worsham at mci.com (Michael Worsham)
Date: Mon, 22 Dec 2003 11:34:31 -0500
Subject: [Beowulf] QNX Support?
Message-ID: <001001c3c8a9$7ab5d130$2f7032a6@Wcomnet.com>

Hi all.

Anyone have any documentation/links to sites of setting up a beowulf under
QNX?

Thanks.

-- M


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From josip at lanl.gov  Mon Dec 22 13:15:46 2003
From: josip at lanl.gov (Josip Loncaric)
Date: Mon, 22 Dec 2003 11:15:46 -0700
Subject: [Beowulf] is "TCP Short Messages" patch necessary/available for
 2.4 kernel?
In-Reply-To: <3FE4CD75.9070805@blueprintinc.com>
References: <3FE4CD75.9070805@blueprintinc.com>
Message-ID: <3FE734D2.2080802@lanl.gov>

Kervin L. Pierre wrote:
> 
> I am upgrading software on a cluster at my college and part of the 
> documentation says to patch the kernel with the "TCP Short Messages" 
> patch found at http://www.icase.edu/coral/LinuxTCP.html .
> 
> The patch is only available for 2.2 series kernel and none seems to be 
> done for the 2.4 kernel.  The contact email on that page bounces as well.

Unfortunately, ICASE is no more: it was "improved out of existence" (the 
successor organization NIA operates somewhat differently).

The Dec. 31, 2002 snapshot of the official ICASE web site is hosted by 
USRA, so papers etc. can be retrieved using the old URLs, but ICASE 
E-mail addresses and personal web pages are defunct.

> Is this patch still necessary for TCP Short Messages functionality?  If 
> so where can I find the patch against 2.4?

The patch was needed for 2.0 and 2.2 Linux kernels due to a quirk in 
their TCP stack implementation.  Since 2.4 Linux kernels perform fine 
without the patch, you do not need it any more.

Sincerely,
Josip

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From msaleh at ihu.ac.ir  Sat Dec 27 05:05:04 2003
From: msaleh at ihu.ac.ir (Mahmoud Saleh)
Date: Sat, 27 Dec 2003 13:35:04 +0330
Subject: [Beowulf] Gigabit Ethernet vs Myrinet
Message-ID: <WorldClient-F200312271335.AA35040001@ihu.ac.ir>

Folks,

Reading a couple of comparison tables regarding latency of different NIC 
protocols, I noticed that many solutions suggest to use Myrinet style 
NIC due to its low latency, namely around 8usec for I/O intensive 
cluster jobs. I was wondering if Gigabit Ethernet does the same.

Suppose that Maximum packet size in GE is 1500 bytes and the minimum is 
aroud 100 bytes. This translates to an average of 800 bytes or 6400 
bits. In Gigabit Ethernet that would cause a delay of 6400/10^9 sec  or 
6.4usec for packet assembly, which is in the same order as Myrinet. 

Is this justification correct? If so, how wise is it to use Gigabit 
Ethernet for an I/O intensive cluster?

 
Regards,
--
Mahmoud

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From deadline at linux-mag.com  Sat Dec 27 13:21:34 2003
From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine)
Date: Sat, 27 Dec 2003 13:21:34 -0500 (EST)
Subject: [Beowulf] Gigabit Ethernet vs Myrinet
In-Reply-To: <WorldClient-F200312271335.AA35040001@ihu.ac.ir>
Message-ID: <Pine.LNX.4.44.0312271303380.2670-100000@boltzmann.basement-supercomputing.com>

On Sat, 27 Dec 2003, Mahmoud Saleh wrote:

> Folks,
> 
> Reading a couple of comparison tables regarding latency of different NIC 
> protocols, I noticed that many solutions suggest to use Myrinet style 
> NIC due to its low latency, namely around 8usec for I/O intensive 
> cluster jobs. I was wondering if Gigabit Ethernet does the same.
> 
> Suppose that Maximum packet size in GE is 1500 bytes and the minimum is 
> aroud 100 bytes. This translates to an average of 800 bytes or 6400 
> bits. In Gigabit Ethernet that would cause a delay of 6400/10^9 sec  or 
> 6.4usec for packet assembly, which is in the same order as Myrinet. 

The best 1 byte latency for GigE I have measured has been 25 us.
This test was using netpipe/TCP. It is hard to provide a solid number
because Ethernet chip-sets/drivers vary as do motherboards that include
GigE. The best thing to do is test some hardware.

> 
> Is this justification correct? If so, how wise is it to use Gigabit 
> Ethernet for an I/O intensive cluster?

More tests are needed to answer that. With Myrinet, Quadrics, SCI, you 
will get better performance -- and spend more money. Somethings you may 
need to consider with this decision:

1. What  API you will use MPI, PVM, sockets? (API can add overhead to 
   latency numbers)
2. How many nodes do expect to use ?
3. Is there a single NFS server for the data or are you using something 
   like PVFS or GFS?
4. What are your I/O block sizes?

Doug
> 
>  
> 
> Regards,
> --
> Mahmoud
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From lindahl at pathscale.com  Sat Dec 27 12:52:31 2003
From: lindahl at pathscale.com (Greg Lindahl)
Date: Sat, 27 Dec 2003 09:52:31 -0800
Subject: [Beowulf] Gigabit Ethernet vs Myrinet
In-Reply-To: <WorldClient-F200312271335.AA35040001@ihu.ac.ir>
References: <WorldClient-F200312271335.AA35040001@ihu.ac.ir>
Message-ID: <20031227175231.GB1642@greglaptop.earthlink.net>

On Sat, Dec 27, 2003 at 01:35:04PM +0330, Mahmoud Saleh wrote:

> Reading a couple of comparison tables regarding latency of different NIC 
> protocols, I noticed that many solutions suggest to use Myrinet style 
> NIC due to its low latency, namely around 8usec for I/O intensive 
> cluster jobs. I was wondering if Gigabit Ethernet does the same.

First off, most people separate disk I/O from program
communications. I'll assume that you're talking about the second.

> Suppose that Maximum packet size in GE is 1500 bytes and the minimum is 
> aroud 100 bytes. This translates to an average of 800 bytes or 6400 
> bits. In Gigabit Ethernet that would cause a delay of 6400/10^9 sec  or 
> 6.4usec for packet assembly, which is in the same order as Myrinet. 

You are only thinking about the time needed to send the actual
bytes. The total time to send a small message is much bigger than
that. There are published papers that show the "ping pong" latency for
gigabit ethernet. This number is highly dependent on the exact
gigE card, switch, OS, and gigE driver that you're using.

-- greg

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From deadline at linux-mag.com  Sun Dec 28 08:30:23 2003
From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine)
Date: Sun, 28 Dec 2003 08:30:23 -0500 (EST)
Subject: [Beowulf] New Poll on Cluster-Rant
In-Reply-To: <Pine.LNX.4.44.0312271303380.2670-100000@boltzmann.basement-supercomputing.com>
Message-ID: <Pine.LNX.4.44.0312280815000.10843-100000@boltzmann.basement-supercomputing.com>


For those interested, there is a new poll asking about kernel 2.6 at
cluster-rant.com. The links to the new poll and previous interconnects
poll (107 votes) can be found here:

http://www.cluster-rant.com/article.pl?sid=03/12/22/1625228

Doug

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ds10025 at cam.ac.uk  Mon Dec 29 06:27:21 2003
From: ds10025 at cam.ac.uk (D. Scott)
Date: 29 Dec 2003 11:27:21 +0000
Subject: [Beowulf] X-window, MPICH, MPE, Cluster performance test
Message-ID: <E1AavYP-0004YS-JM@maroon.csi.cam.ac.uk>


Hi

At last! My cluster is now online. I would like to thank everyone for they 
help. I thinking of putting a website together covering my experience in 
putting this cluster together. Will this be of use to anyone? Is they 
website that covers top 100 list of small cluster?.

Now it is online I would like to test it.

MPICH comes with test program, eg mpptest. Programs works and it produce 
nice graph. Is they any documentation/tutorial that explains meaning of 
these graphs?

MPICH also comes with MPE graphic test programs, mandel. Problem is that I 
have only got X-window installed on the master node. But, when I run 
pmandel, it returms an error, staying that it can not find shared library 
for X-window on other nodes. How can I make X-window shared across other 
nodes from the Master node? Same me install GUI programs on other nodes.

This could be related problem, but when I complied life (that uses MPE 
libraries) it returns error that MPE libraries are undefined. Any ideas?


Can I install both LAM/MPICH and MPICH-1.2.5 on the same machine?

How to calculate flops?


Are they any other performance test?

Thanks in advance.


Dan


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From anand at mecheng.iisc.ernet.in  Mon Dec 29 12:13:17 2003
From: anand at mecheng.iisc.ernet.in (Anand TNC)
Date: Mon, 29 Dec 2003 22:43:17 +0530 (IST)
Subject: [Beowulf] X-window, MPICH, MPE, Cluster performance test
In-Reply-To: <E1AavYP-0004YS-JM@maroon.csi.cam.ac.uk>
Message-ID: <Pine.LNX.4.33.0312292242180.11359-100000@mecheng.iisc.ernet.in>

^Hi
^
^At last! My cluster is now online. I would like to thank everyone for they 
^help. I thinking of putting a website together covering my experience in 
^putting this cluster together. Will this be of use to anyone? Is they 
^website that covers top 100 list of small cluster?.

Hi,

we're planning to set up a small cluster ~6 nodes - it will be very useful 
to people like me

thanks

Anand

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From brian.dobbins at yale.edu  Mon Dec 29 15:41:25 2003
From: brian.dobbins at yale.edu (Brian Dobbins)
Date: Mon, 29 Dec 2003 15:41:25 -0500 (EST)
Subject: [Beowulf] Q: Any beowulf people in the Beijing area?
Message-ID: <Pine.LNX.4.44.0312291532090.20289-100000@email.combustion.eng.yale.edu>


Hi guys,

  I'm just curious if there are any people here in the Beijing area who 
are doing work with Beowulf clusters?  I may be moving there sometime next 
year, but would like very much to stay involved in the realm of Beowulf 
and parallel computing in general.

  This is very preliminary, but if there are any of you out there who do 
work with clusters, or are planning on building one, etc., and happen to 
be in the Beijing area, I'd love to know!  :-)

  Cheers,
  - Brian

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From Per.Lindstrom at me.chalmers.se  Mon Dec 29 14:59:27 2003
From: Per.Lindstrom at me.chalmers.se (=?ISO-8859-1?Q?Per_Lindstr=F6m?=)
Date: Mon, 29 Dec 2003 20:59:27 +0100
Subject: [Beowulf] Websites for small clusters
Message-ID: <3FF0879F.4080301@me.chalmers.se>

Hi Dan,

It should be great if you publish your cluster work instructions on a 
website. I have found that there is need for a such place.

The site http://www.msm.cam.ac.uk/map/mapmain.html is a good example on 
how a website sharing scientific and/or professional experience can be 
aranged.

If it not allready exist, shall we arrange something similar for few 
node clusters? (Few node clusters 2 - 30 nodes?)

Best regards
Per Lindstr?m


_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From james.p.lux at jpl.nasa.gov  Mon Dec 29 21:45:05 2003
From: james.p.lux at jpl.nasa.gov (Jim Lux)
Date: Mon, 29 Dec 2003 18:45:05 -0800
Subject: [Beowulf] Websites for small clusters
References: <3FF0879F.4080301@me.chalmers.se>
Message-ID: <004001c3ce7e$f06507e0$36a8a8c0@LAPTOP152422>

I fully agree... I suspect that most readers of this list start with a small
cluster, and a historical record of what it took to get it up and running is
quite useful, especially the hiccups and problems that you inevitably
encounter. (e.g. what do you mean the circuit breaker just tripped on the
plug strip when we plugged all those things into it?)

----- Original Message -----
From: "Per Lindstr?m" <Per.Lindstrom at me.chalmers.se>
To: "D. Scott" <ds10025 at cam.ac.uk>; "Anand TNC"
<anand at mecheng.iisc.ernet.in>
Cc: "Beowulf" <beowulf at beowulf.org>; "Josh Moore" <kong at thejosh.net>; "Per"
<Per.Lindstrom at madpenguin.org>
Sent: Monday, December 29, 2003 11:59 AM
Subject: [Beowulf] Websites for small clusters


> Hi Dan,
>
> It should be great if you publish your cluster work instructions on a
> website. I have found that there is need for a such place.
>
> The site http://www.msm.cam.ac.uk/map/mapmain.html is a good example on
> how a website sharing scientific and/or professional experience can be
> aranged.
>
> If it not allready exist, shall we arrange something similar for few
> node clusters? (Few node clusters 2 - 30 nodes?)
>
> Best regards
> Per Lindstr?m
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ds10025 at cam.ac.uk  Tue Dec 30 06:42:43 2003
From: ds10025 at cam.ac.uk (D. Scott)
Date: 30 Dec 2003 11:42:43 +0000
Subject: [Beowulf] Websites for small clusters
Message-ID: <E1AbIGp-0002rN-By@maroon.csi.cam.ac.uk>

Hi

How did you calculate mega flops?

I will download and look into the benchmark program you are using.

Could we aggreed on which benchmark software to use so that we can compare 
performance of each small cluster?

http://home.attmil.ne.jp/a/jm/

Gives me an idea to put together a basic site. I'll see what I can do.

It did take me alot of time and effect searching the net for information. 
I'll see if I can put it all together.


Dan

On Dec 30 2003, Josh Moore wrote:

> I have seen a few but not that many websites around dealing with 
> indviduals clusters.  Most links were down and it took a great deal of 
> searching to come up with a few pages.  That is the main reason I made 
> my website http://home.attmil.ne.jp/a/jm/  dealing with the building of 
> my cluster.  It started as a two node cluster and has updates has I add 
> more nodes and run other tests.
> 
> Jim Lux wrote:
> 
> > I fully agree... I suspect that most readers of this list start with a 
> > small cluster, and a historical record of what it took to get it up and 
> > running is quite useful, especially the hiccups and problems that you 
> > inevitably encounter. (e.g. what do you mean the circuit breaker just 
> > tripped on the plug strip when we plugged all those things into it?)
> >
> > ----- Original Message ----- From: "Per Lindstr?m" 
> > <Per.Lindstrom at me.chalmers.se> To: "D. Scott" <ds10025 at cam.ac.uk>; 
> > "Anand TNC" <anand at mecheng.iisc.ernet.in> Cc: "Beowulf" 
> > <beowulf at beowulf.org>; "Josh Moore" <kong at thejosh.net>; "Per" 
> > <Per.Lindstrom at madpenguin.org> Sent: Monday, December 29, 2003 11:59 AM 
> > Subject: [Beowulf] Websites for small clusters
> >
> >
> >  
> >
> >>Hi Dan,
> >>
> >>It should be great if you publish your cluster work instructions on a
> >>website. I have found that there is need for a such place.
> >>
> >>The site http://www.msm.cam.ac.uk/map/mapmain.html is a good example on
> >>how a website sharing scientific and/or professional experience can be
> >>aranged.
> >>
> >>If it not allready exist, shall we arrange something similar for few
> >>node clusters? (Few node clusters 2 - 30 nodes?)
> >>
> >>Best regards
> >>Per Lindstr?m
> >>
> >>
> >>_______________________________________________
> >>Beowulf mailing list, Beowulf at beowulf.org
> >>To change your subscription (digest mode or unsubscribe) visit
> >>    
> >>
> >http://www.beowulf.org/mailman/listinfo/beowulf
> >
> >  
> >
> 
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ds10025 at cam.ac.uk  Tue Dec 30 08:55:00 2003
From: ds10025 at cam.ac.uk (D. Scott)
Date: 30 Dec 2003 13:55:00 +0000
Subject: [Beowulf] Websites for small clusters
Message-ID: <E1AbKKq-0007fl-7z@maroon.csi.cam.ac.uk>

Hi

Another site had a paper. Have anyone come across Linpack paper?

The site is http://www.csis.hku.hk/~clwang/gideon300/peak.html.

I had the same problem interpret the results when I ran test supplied my 
mpich, eg mpptest and gotest.

Will it be worth setting up a performance chart for small clusters. It can 
include, FLOPS, Network performance etc.

Dan
On Dec 30 2003, Josh Moore wrote:Linpack paper,

> Hi,
> I found a site that contained a modified version of the PI calculator 
> that comes bundled with mpich.  I have attached it.  I'm not sure on the 
> accuracy of it, but it seems to work.  I would love to have a standard 
> bench marking program to compare results.  Pallas is good, but it takes 
> a while to run and it can be hard to interpret the results.  It would be 
> much easier to say this setup has this many megaflops/gigaflops and this 
> setup has this many instead of saying here is a 200 line test result of 
> my setup from Pallas.  Pallas is great but it can be over-kill when you 
> want a quick estimate of overall performance while adding nodes or doing 
> different tweaking.  I am constatly adding stuff to my site.  I hope to 
> add some nodes and upgrade to 100Mbps by the end of January.  I am also 
> hoping to make the site easier to navigate instead of just having a 
> single page.
> 
> Josh
> 
> 
> D. Scott wrote:
> 
> > Hi
> >
> > How did you calculate mega flops?
> >
> > I will download and look into the benchmark program you are using.
> >
> > Could we aggreed on which benchmark software to use so that we can 
> > compare performance of each small cluster?
> >
> > http://home.attmil.ne.jp/a/jm/
> >
> > Gives me an idea to put together a basic site. I'll see what I can do.
> >
> > It did take me alot of time and effect searching the net for 
> > information. I'll see if I can put it all together.
> >
> >
> > Dan
> >
> >
> 
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From deadline at linux-mag.com  Tue Dec 30 10:16:40 2003
From: deadline at linux-mag.com (Douglas Eadline, Cluster World Magazine)
Date: Tue, 30 Dec 2003 10:16:40 -0500 (EST)
Subject: [Beowulf] Websites for small clusters
In-Reply-To: <E1AbIGp-0002rN-By@maroon.csi.cam.ac.uk>
Message-ID: <Pine.LNX.4.44.0312301011370.10843-100000@boltzmann.basement-supercomputing.com>

On 30 Dec 2003, D. Scott wrote:

> Hi
> 
> How did you calculate mega flops?
> 
> I will download and look into the benchmark program you are using.
> 
> Could we aggreed on which benchmark software to use so that we can compare 
> performance of each small cluster?

Check out:

http://www.cluster-rant.com/article.pl?sid=03/03/17/1838236

It explains a bit about the Beowulf Performance suite (BPS), it is
not intended to measure LINPAC MFLOPS, but rather help you see if the
cluster is working properly. It may take a little fidgeting to get it
to work as there is no standard way to do things, but it can be useful
to test if the cluster is working properly and measure perforamnce using 
the NAS parallel test suite.

Let me know if you need help.

Doug

> 
> http://home.attmil.ne.jp/a/jm/
> 
> Gives me an idea to put together a basic site. I'll see what I can do.
> 
> It did take me alot of time and effect searching the net for information. 
> I'll see if I can put it all together.
> 
> 
> Dan
> 
> On Dec 30 2003, Josh Moore wrote:
> 
> > I have seen a few but not that many websites around dealing with 
> > indviduals clusters.  Most links were down and it took a great deal of 
> > searching to come up with a few pages.  That is the main reason I made 
> > my website http://home.attmil.ne.jp/a/jm/  dealing with the building of 
> > my cluster.  It started as a two node cluster and has updates has I add 
> > more nodes and run other tests.
> > 
> > Jim Lux wrote:
> > 
> > > I fully agree... I suspect that most readers of this list start with a 
> > > small cluster, and a historical record of what it took to get it up and 
> > > running is quite useful, especially the hiccups and problems that you 
> > > inevitably encounter. (e.g. what do you mean the circuit breaker just 
> > > tripped on the plug strip when we plugged all those things into it?)
> > >
> > > ----- Original Message ----- From: "Per Lindstr?m" 
> > > <Per.Lindstrom at me.chalmers.se> To: "D. Scott" <ds10025 at cam.ac.uk>; 
> > > "Anand TNC" <anand at mecheng.iisc.ernet.in> Cc: "Beowulf" 
> > > <beowulf at beowulf.org>; "Josh Moore" <kong at thejosh.net>; "Per" 
> > > <Per.Lindstrom at madpenguin.org> Sent: Monday, December 29, 2003 11:59 AM 
> > > Subject: [Beowulf] Websites for small clusters
> > >
> > >
> > >  
> > >
> > >>Hi Dan,
> > >>
> > >>It should be great if you publish your cluster work instructions on a
> > >>website. I have found that there is need for a such place.
> > >>
> > >>The site http://www.msm.cam.ac.uk/map/mapmain.html is a good example on
> > >>how a website sharing scientific and/or professional experience can be
> > >>aranged.
> > >>
> > >>If it not allready exist, shall we arrange something similar for few
> > >>node clusters? (Few node clusters 2 - 30 nodes?)
> > >>
> > >>Best regards
> > >>Per Lindstr?m
> > >>
> > >>
> > >>_______________________________________________
> > >>Beowulf mailing list, Beowulf at beowulf.org
> > >>To change your subscription (digest mode or unsubscribe) visit
> > >>    
> > >>
> > >http://www.beowulf.org/mailman/listinfo/beowulf
> > >
> > >  
> > >
> > 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

-- 
Editor-in-chief                   ClusterWorld Magazine
Desk: 610.865.6061                            
Cell: 610.390.7765         Redefining High Performance Computing
Fax:  610.865.6618 

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From jgreenseid at wesleyan.edu  Tue Dec 30 12:18:46 2003
From: jgreenseid at wesleyan.edu (Joe Greenseid)
Date: Tue, 30 Dec 2003 12:18:46 -0500 (EST)
Subject: [Beowulf] Websites for small clusters
In-Reply-To: <Pine.LNX.4.44.0312301011370.10843-100000@boltzmann.basement-supercomputing.com>
References: <Pine.LNX.4.44.0312301011370.10843-100000@boltzmann.basement-supercomputing.com>
Message-ID: <Pine.GSO.4.53.0312301215030.19375@alumni.wesleyan.edu>

I have tried to post as many of these "how to build a beowulf" sites as i
can find on my website here:  http://lcic.org/documentation.html#comp

right now it looks like i have 5 or 6 of them that aren't from places like
IBM and stuff (and a few from IBM, ameslab, etc).  if folks come across
others i'm missing, please send them along to me, i'd be happy to post
them (i've seen a few things on the list here in the past month that i
have on the TODO list; i just haven't had much time with the real job
taking all my time lately, but that is changing shortly).

--Joe

***************************************
*  Joe Greenseid                      *
*  jgreenseid [at] wesleyan [dot] edu *
*  http://www.thunderlizards.net      *
*  http://lcic.org                    *
***************************************

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From ddw at dreamscape.com  Tue Dec 30 21:41:19 2003
From: ddw at dreamscape.com (Daniel Williams)
Date: Tue, 30 Dec 2003 21:41:19 -0500
Subject: [Beowulf] Megaflops & Benchmarks
Message-ID: <3FF23739.BC46C032@dreamscape.com>

I am hoping to build a cluster as soon as I can find 8 or more Pentium II
class machines being scrapped, and I would be interested in being able to
compare a cluster's performance with all my single processor machines. Is
there a benchmark that will run on a single processor PC, as well as a
cluster, so you can compare them directly?
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


From mikhailberis at free.net.ph  Wed Dec 31 03:22:00 2003
From: mikhailberis at free.net.ph (Dean Michael C. Berris)
Date: 31 Dec 2003 16:22:00 +0800
Subject: [Beowulf] Beowulf Benchmark
Message-ID: <1072858917.3845.8.camel@mikhail>

Good day everyone,

I am a student at the University of the Philippines at Los Banos (UPLB)
here in the Philippines, and I'm currently doing my thesis on projective
computational load balancing algorithm for Beowulf clusters. I am in
charge of two homogeneous clusters, each having 5 nodes, one based on
the x86 architecture while the other is based on the UltraSPARC
architecture.

I am relatively new to clustering technologies, but I have been at a
loss while looking for possible benchmarking tools for clusters. I have
seen some libraries like LINPACK for linear algebra, but I don't know
how to use it for benchmarking.

I have implemented a parallel genetic algorithm solution to the
asymmetric traveling salesman problem (100 nodes) on the x86 based
cluster, as well as a prime number finder on both the x86 and UltraSPARC
clusters. I have results on both the x86 cluster as well as the
UltraSPARC cluster with regard to the prime number finder, but I haven't
an idea as to how I could come up with the FLOPS that either cluster can
do.

Any tutorials, insights, and examples would be most welcome.

Thanks in advance!

-- 
Dean Michael C. Berris <mikhailberis at free.net.ph>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf