From prentice at ias.edu  Mon Jan 2 14:12:47 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Mon, 02 Jan 2012 14:12:47 -0500
Subject: [Beowulf] clustering using off the shelf systems in a fish tank full of oil.
In-Reply-To:
References: <4EFB5AAE.3030900@gmail.com> <715C5657-461B-41E7-9591-5DF89F3CC285@xs4all.nl> <4EFC8D03.4020406@gmail.com> <5AF52A05-28AA-4EE5-A081-EA60BD1E9B32@xs4all.nl> <4EFC9540.5010906@gmail.com>
Message-ID: <4F0201AF.6080509@ias.edu>

On 12/29/2011 02:49 PM, Mark Hahn wrote:
> guys, this isn't a dating site.

...yet.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From prentice at ias.edu  Mon Jan 2 14:15:16 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Mon, 02 Jan 2012 14:15:16 -0500
Subject: [Beowulf] clustering using off the shelf systems in a fish tank full of oil.
In-Reply-To:
References: <4EFB5AAE.3030900@gmail.com> <715C5657-461B-41E7-9591-5DF89F3CC285@xs4all.nl> <4EFC8D03.4020406@gmail.com> <5AF52A05-28AA-4EE5-A081-EA60BD1E9B32@xs4all.nl> <4EFC9540.5010906@gmail.com>
Message-ID: <4F020244.4040505@ias.edu>

On 12/29/2011 07:50 PM, Vincent Diepeveen wrote:
> it's very useful Mark, as we know now he works for the company and
> also for which nation.
>
> Vincent

For someone who's always bashing US foreign policy, you sure sound
like a Republican or a member of the Department of Homeland Security!

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org  Wed Jan 11 04:13:02 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Wed, 11 Jan 2012 10:13:02 +0100
Subject: [Beowulf] Course: Parallel Programming of High Performance Systems
Message-ID: <20120111091302.GU21917@leitl.org>

----- Forwarded message from Georg Hager -----

From: Georg Hager
Date: Wed, 11 Jan 2012 01:40:09 +0100 (CET)
To: eugen at leitl.org
Subject: Course: Parallel Programming of High Performance Systems

"Parallel Programming of High Performance Systems" is the yearly
course provided by LRZ and RRZE that gives students and scientists
a solid introduction to:

- Processor and HPC system architectures
- Code development and basic tools
- Scalar optimizations (generic and architecture-specific)
- Parallelization basics
- Parallel programming with OpenMP and MPI

There will also be an additional course with advanced topics, which
covers:

- Parallel performance tools for MPI and OpenMP
- Parallel I/O with MPI I/O
- I/O tuning and libraries

Hands-on sessions will enable participants to apply the concepts
right away. Although the federal HPC system at LRZ Munich is treated
in some detail, most of the concepts conveyed are of general use.
You can find the preliminary course agendas on the web:

Basic course:

Advanced course:

This year the basic course is hosted by RRZE in Erlangen and will be
available at LRZ in Garching via videoconferencing, if a sufficient
number of people are interested. Hands-on sessions will then be
provided at both locations. The advanced course will be hosted by LRZ
in Garching.

Basic course:
============
Location: RRZE, Martensstr. 1, 91058 Erlangen
Date: March 5-9, 2012, 9:00-18:00

Advanced course:
===============
Location: LRZ, Boltzmannstr. 1, 85748 Garching b. Muenchen
Date: March 19-22, 2012, 9:00-18:00

There is no course fee.

Please register for course "HPPP1W11" and/or "HPAT1W11" at the
following LRZ website:

Hoping to see you there,
G. Hager

--
Dr. Georg Hager, HPC Services
Friedrich-Alexander-Universitaet Erlangen-Nuernberg
Regionales RechenZentrum Erlangen (RRZE)
Martensstrasse 1, 91058 Erlangen, Germany
Tel. +49 9131 85-28973, Fax +49 9131 302941
mailto:georg.hager at rrze.uni-erlangen.de
http://www.hpc.rrze.uni-erlangen.de/

----- End forwarded message -----
--
Eugen* Leitl leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl  Wed Jan 11 10:36:48 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 16:36:48 +0100
Subject: [Beowulf] Course: Parallel Programming of High Performance Systems
In-Reply-To: <20120111091302.GU21917@leitl.org>
References: <20120111091302.GU21917@leitl.org>
Message-ID: <3DCEF7EC-45ED-43C0-9345-A59938AB9861@xs4all.nl>

Yeah, the slides there are from the 2003 lecture - filename
LRZ210703_1.pdf.

Very helpful if you have grey hair and want to port your 1980s
Fortran code to today's HPC hardware.

Vincent

On Jan 11, 2012, at 10:13 AM, Eugen Leitl wrote:

> ----- Forwarded message from Georg Hager -----
> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov  Wed Jan 11 11:09:00 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 08:09:00 -0800
Subject: [Beowulf] Course: Parallel Programming of High Performance Systems
In-Reply-To: <3DCEF7EC-45ED-43C0-9345-A59938AB9861@xs4all.nl>
Message-ID:

I don't have grey hair (part grey beard, I confess), but I have plenty
of 70s era FORTRAN that benefits from parallelization - Numerical
Electromagnetics Code V4, specifically.

The implementation has been thoroughly validated and has been used for
decades, finding all the little idiosyncrasies and dealing with
numerical precision issues, etc. There's extensive software around
that generates the card image input files it expects and parses the
line printer output files (with the 1 in column 1 for a page break).

Rewriting it from scratch would not be a very good use of time. You'd
have to revisit all the years of validation and make sure there were
no subtle differences in function, because while there's an official
validation suite, it's more to make sure that the compile worked OK
and there's no egregious problem. And who knows what users out there
have depended on some idiosyncratic implementation aspects.

I suspect the same is true for lots of fluid mechanics and other FEM
codes (NASTRAN, for instance).

So an incremental approach of parallelizing that old FORTRAN,
replacing pieces with "new FORTRAN", for instance, might be useful.

(and don't get me started on my experiences with the f2c engine)

On 1/11/12 7:36 AM, "Vincent Diepeveen" wrote:

> Yeah, the slides there are from the 2003 lecture - filename
> LRZ210703_1.pdf.
>
> Very helpful if you have grey hair and want to port your 1980s
> Fortran code to today's HPC hardware.
>
> Vincent

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
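As a concrete sketch of that incremental approach - hypothetical code,
not anything from NEC4, and in C++ with OpenMP rather than Fortran,
though the directive form is analogous (Fortran uses !$omp sentinels).
Compile with -fopenmp; note a parallel reduction can reorder
floating-point sums, so bit-for-bit agreement with the legacy code only
holds when the pragma is disabled:

    #include <cstdio>
    #include <vector>

    // Incremental parallelization: the serial loop body is untouched;
    // only the directive is new. Compiled without -fopenmp the pragma
    // is ignored and the validated serial behaviour is preserved.
    int main() {
        const int n = 1 << 20;
        std::vector<double> field(n, 1.0);
        double total = 0.0;

        #pragma omp parallel for reduction(+ : total)
        for (int i = 0; i < n; ++i) {
            field[i] = field[i] * 0.5 + 1.0;  // stand-in for the real kernel
            total += field[i];
        }

        std::printf("checksum %.6f\n", total);
        return 0;
    }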
From james.p.lux at jpl.nasa.gov  Wed Jan 11 11:18:41 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 08:18:41 -0800
Subject: [Beowulf] A cluster of Arduinos
Message-ID:

For educational purposes..

Has anyone done something where they implement some sort of message
passing API on a network of Arduinos? Since they cost only $20 each,
and have a fairly facile development environment, it seems you could
put together a simple demonstration of parallel processing and various
message passing things.

For instance, you could introduce errors in the message links and do
experiments with Byzantine General type algorithms, or with multiple
parallel routes, etc.

I've not actually tried hooking up multiple Arduinos through a USB hub
to one PC, but if that works, it gives you a nice "head node, debug
console" sort of interface.

Smaller, lighter, cheaper than lashing together MiniITX mobos or
building a Wal-Mart Cluster.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl  Wed Jan 11 12:00:43 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 18:00:43 +0100
Subject: [Beowulf] Course: Parallel Programming of High Performance Systems
In-Reply-To:
References:
Message-ID: <7B7DB325-4FFB-4C68-9602-2E1E71B41D12@xs4all.nl>

On Jan 11, 2012, at 5:09 PM, Lux, Jim (337C) wrote:

> I don't have grey hair (part grey beard, I confess), but I have
> plenty of 70s era FORTRAN that benefits from parallelization -
> Numerical Electromagnetics Code V4, specifically.
> [...]
> (and don't get me started on my experiences with the f2c engine)

No need to get started, Jim - NASA can ask the Russians about that as
well.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From prentice at ias.edu  Wed Jan 11 11:58:59 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Wed, 11 Jan 2012 11:58:59 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References:
Message-ID: <4F0DBFD3.3070503@ias.edu>

On 01/11/2012 11:18 AM, Lux, Jim (337C) wrote:
> For educational purposes..
>
> Has anyone done something where they implement some sort of message
> passing API on a network of Arduinos? Since they cost only $20 each,
> and have a fairly facile development environment, it seems you could
> put together a simple demonstration of parallel processing and
> various message passing things.
> [...]
> Smaller, lighter, cheaper than lashing together MiniITX mobos or
> building a Wal-Mart Cluster.

I started tinkering with Arduinos a couple of months ago. Got lots of
related goodies for Christmas, so I've been looking like a mad
scientist building Arduino things lately. I'm still a beginner Arduino
hacker, but I'd be game for giving this a try, if anyone else wants to
give this a go.

The Arduino Due, which is overdue in the marketplace, will have a
Cortex-M3 ARM processor.

--
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl  Wed Jan 11 12:30:30 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 18:30:30 +0100
Subject: [Beowulf] clustering using off the shelf systems in a fish tank full of oil.
In-Reply-To: <4F020244.4040505@ias.edu>
References: <4EFB5AAE.3030900@gmail.com> <715C5657-461B-41E7-9591-5DF89F3CC285@xs4all.nl> <4EFC8D03.4020406@gmail.com> <5AF52A05-28AA-4EE5-A081-EA60BD1E9B32@xs4all.nl> <4EFC9540.5010906@gmail.com> <4F020244.4040505@ias.edu>
Message-ID: <1820F354-C0D4-4337-A9EB-DDBD9CB50761@xs4all.nl>

On Jan 2, 2012, at 8:15 PM, Prentice Bisbal wrote:

> On 12/29/2011 07:50 PM, Vincent Diepeveen wrote:
>> it's very useful Mark, as we know now he works for the company and
>> also for which nation.
>>
>> Vincent
>
> For someone who's always bashing US foreign policy, you sure sound
> like a Republican or a member of the Department of Homeland Security!

Where is my paycheck?

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From ntmoore at gmail.com  Wed Jan 11 12:31:30 2012
From: ntmoore at gmail.com (Nathan Moore)
Date: Wed, 11 Jan 2012 11:31:30 -0600
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0DBFD3.3070503@ias.edu>
References: <4F0DBFD3.3070503@ias.edu>
Message-ID:

I think something like the Raspberry Pi might be easier for this sort
of task. They'll also be about $25, but they'll run something like
ARM/Linux. Not out yet, though.

http://www.raspberrypi.org/

On Wed, Jan 11, 2012 at 10:58 AM, Prentice Bisbal wrote:
> I started tinkering with Arduinos a couple of months ago. [...] I'm
> still a beginner Arduino hacker, but I'd be game for giving this a
> try, if anyone else wants to give this a go.
>
> The Arduino Due, which is overdue in the marketplace, will have a
> Cortex-M3 ARM processor.

--
- - - - - - - - - - - - - - - - - - - - -
Nathan Moore
Associate Professor, Physics
Winona State University
- - - - - - - - - - - - - - - - - - - - -

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
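For a sense of what the message-passing node Jim proposes might look
like, here is a minimal sketch against the standard Arduino core. The
frame layout, checksum, and node ID are invented for illustration -
there is no established protocol implied:

    // Minimal framed-message node (Arduino C++). Frame format (made
    // up for this sketch): 0x7E, dest, src, len, payload..., XOR sum.
    const byte FRAME = 0x7E;
    const byte MY_ID = 1;            // set differently on each node

    byte buf[32];

    void setup() {
      Serial.begin(9600);            // link to USB hub / neighbour
    }

    void loop() {
      if (Serial.available() < 4) return;   // wait for a full header
      if (Serial.read() != FRAME) return;   // resync on the frame byte
      byte dest = Serial.read();
      byte src  = Serial.read();
      byte len  = Serial.read();
      if (len > sizeof(buf)) return;        // toy: drop oversized frames
      byte sum = dest ^ src ^ len;
      for (byte i = 0; i < len; ++i) {
        while (!Serial.available()) {}      // busy-wait for payload
        buf[i] = Serial.read();
        sum ^= buf[i];
      }
      while (!Serial.available()) {}
      if (Serial.read() != sum) return;     // drop corrupt frames
      if (dest != MY_ID) return;            // a router node would forward here
      // ...act on buf/len, e.g. toggle an LED for the demo...
    }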
From diep at xs4all.nl  Wed Jan 11 12:43:17 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 18:43:17 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0DBFD3.3070503@ias.edu>
References: <4F0DBFD3.3070503@ias.edu>
Message-ID: <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl>

On Jan 11, 2012, at 5:58 PM, Prentice Bisbal wrote:
> [...]
> The Arduino Due, which is overdue in the marketplace, will have a
> Cortex-M3 ARM processor.

Completely superior chip, that Cortex-M3, though I haven't programmed
much for it so far - it's difficult to get contract jobs for. It can
do fast 32 x 32 bit multiplication; you can even implement RSA very
fast on that chip. Runs at 70 MHz or so?

Usually writing assembler for such CPUs is more efficient, by the way,
than using a compiler. Compilers are, to put it politely, not so
efficient for embedded CPUs. Writing assembler for such CPUs is pretty
straightforward, whereas in HPC things are far more complicated
because of vectorization. AVX is the latest there.

Speaking of AVX, is there already much HPC support for AVX? I see that
after years of wrestling, George Woltman released some prime number
code (GWNUM) that uses AVX - of course, as always, in beta for the
remainder of this century. Claims are that it's a tad faster than the
existing SIMD codes; I saw claims of even above 20% faster, which is
really a lot at that level of engineering - usually you work 6 months
for a 0.5% speedup.

Even if you improve the algorithm, you still lose to this code, as
your C/C++ code will by default be a factor of 10 slower, if not more.
I remember how I found a clever caching trick in 2006 for a Number
Theoretic Transform (that's an FFT over the integers, without the
rounding errors that floating-point FFTs give), yet after some hard
work my C code was still a factor of 8 slower than Woltman's SIMD
assembler.
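For readers who haven't seen AVX at the source level, a toy kernel
written with compiler intrinsics gives the flavour - this is only an
illustrative C++ sketch (compile with -mavx), nothing like Woltman's
hand-written assembler:

    #include <immintrin.h>
    #include <cstdio>

    // Toy AVX kernel: y[i] += a * x[i], 8 floats per iteration.
    void saxpy_avx(float a, const float* x, float* y, int n) {
        __m256 va = _mm256_set1_ps(a);
        int i = 0;
        for (; i + 8 <= n; i += 8) {
            __m256 vx = _mm256_loadu_ps(x + i);
            __m256 vy = _mm256_loadu_ps(y + i);
            vy = _mm256_add_ps(vy, _mm256_mul_ps(va, vx));
            _mm256_storeu_ps(y + i, vy);
        }
        for (; i < n; ++i) y[i] += a * x[i];   // scalar tail
    }

    int main() {
        float x[16], y[16];
        for (int i = 0; i < 16; ++i) { x[i] = (float)i; y[i] = 1.0f; }
        saxpy_avx(2.0f, x, y, 16);
        std::printf("y[15] = %f\n", y[15]);    // expect 31.0
        return 0;
    }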
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl  Wed Jan 11 12:44:43 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 18:44:43 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References: <4F0DBFD3.3070503@ias.edu>
Message-ID: <940F5BCF-8CC3-4461-ABA4-79FBCF9BF057@xs4all.nl>

That's all very expensive considering the CPUs are under $1, I'd
guess.

I actually might need some of this stuff some months from now to build
some robots.

On Jan 11, 2012, at 6:31 PM, Nathan Moore wrote:
> I think something like the Raspberry Pi might be easier for this
> sort of task. They'll also be about $25, but they'll run something
> like ARM/Linux. Not out yet, though.
>
> http://www.raspberrypi.org/
> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov  Wed Jan 11 12:58:13 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 09:58:13 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References: <4F0DBFD3.3070503@ias.edu>
Message-ID:

Yes.. better the widget that one can whip on down to Radio Shack and
buy on my way home from work than the ghostware that may live for
Christmas future.

Also, does the Raspberry Pi $25 price point include a power supply?
The Arduino runs off the USB 5V power, so it's one less thing to
hassle with.

I don't know that performance is all that important in this
application. It's more to experiment with message passing in a
multiprocessor system. Slow is fine. (I can't think of a computational
application for an ArdWulf (combining Italian and Saxon) that wouldn't
be blown away by almost any single computer, including something like
a smart phone.)

Realistically, you're looking at bitbanging kinds of serial
interfaces. I can see several network implementations: SPI shared bus,
hypercubes, toroidal surfaces, etc.

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Nathan Moore
Sent: Wednesday, January 11, 2012 9:32 AM
To: Prentice Bisbal
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] A cluster of Arduinos

I think something like the Raspberry Pi might be easier for this sort
of task. They'll also be about $25, but they'll run something like
ARM/Linux. Not out yet, though.

http://www.raspberrypi.org/
[...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
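A sketch of the SPI-shared-bus option from the master side, using the
stock Arduino SPI library; the pin assignments and the 0x01 "poll"
opcode are invented for illustration:

    #include <SPI.h>

    // Hypothetical 4-node SPI bus: one chip-select pin per worker.
    const int SS_PINS[] = {7, 8, 9, 10};
    const int N_NODES = 4;

    void setup() {
      SPI.begin();
      for (int i = 0; i < N_NODES; ++i) {
        pinMode(SS_PINS[i], OUTPUT);
        digitalWrite(SS_PINS[i], HIGH);   // deselect everyone
      }
    }

    // Send one command byte to node i and clock back its reply.
    byte exchange(int node, byte cmd) {
      digitalWrite(SS_PINS[node], LOW);   // select
      byte reply = SPI.transfer(cmd);     // full-duplex byte swap
      digitalWrite(SS_PINS[node], HIGH);  // deselect
      return reply;
    }

    void loop() {
      for (int i = 0; i < N_NODES; ++i) {
        byte status = exchange(i, 0x01);  // 0x01: invented poll opcode
        (void)status;                     // route/aggregate as desired
      }
      delay(10);
    }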
From james.p.lux at jpl.nasa.gov  Wed Jan 11 13:00:36 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 10:00:36 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl>
Message-ID:

> The Arduino Due, which is overdue in the marketplace, will have a
> Cortex-M3 ARM processor.

Completely superior chip, that Cortex-M3. [...] Writing assembler for
such CPUs is pretty straightforward, whereas in HPC things are far
more complicated because of vectorization.

-->> ah, but this is not really an HPC application. It's a cluster
computer architecture demonstration platform. The Java-based Arduino
environment is pretty simple and multiplatform. Yes, it uses a sort of
weird C-like language, but there it is... it's easy to use.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov  Wed Jan 11 13:19:24 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 10:19:24 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl>
Message-ID:

Yes.. And there have been a bunch of "value clusters" over the years
(StoneSouperComputer, for instance).. But that's still $3k. I could
see putting together 8 nodes for a few hundred dollars. Arduino Uno R3
is about $25 each in quantity.

Think in terms of a small class where you want to have, say, 10
mini-clusters, one per student. No sharing, etc.

-----Original Message-----
From: Alex Chekholko [mailto:alex.chekholko at gmail.com]
Sent: Wednesday, January 11, 2012 10:12 AM
To: Lux, Jim (337C)
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] A cluster of Arduinos

The LittleFe cluster is designed specifically for teaching and
demonstration. Current cost is ~$3k. But it's all standard x86 and
runs Linux and even has GPUs.

http://littlefe.net/

I saw them build a bunch of them at SC11.

On Wed, Jan 11, 2012 at 10:00 AM, Lux, Jim (337C) wrote:
> It's a cluster computer architecture demonstration platform.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov  Wed Jan 11 13:27:31 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 10:27:31 -0800
Subject: [Beowulf] PAPERS interface
Message-ID:

Arghh.. my google-fu is failing me..

I'm looking for the papers on the PAPERS cluster interface (based on
using parallel ports, back in the 90s) and, of course, if you search
for the word "papers", you get nothing useful..

I can't remember who the authors were or where it was done (I'm
thinking in the Southeast US, for some reason, but I'm not sure).

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From sabujp at gmail.com  Wed Jan 11 13:35:17 2012
From: sabujp at gmail.com (Sabuj Pattanayek)
Date: Wed, 11 Jan 2012 12:35:17 -0600
Subject: [Beowulf] PAPERS interface
In-Reply-To:
References:
Message-ID:

https://www.google.com/search?hl=en&q=%22PAPERS%22%20parallel%20port%20interface&btnG=Google+Search

http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1183&context=ecetr

HTH,
Sabuj
Google Proxy Certified Search Partner

On Wed, Jan 11, 2012 at 12:27 PM, Lux, Jim (337C) wrote:
> Arghh.. my google-fu is failing me..
>
> I'm looking for the papers on the PAPERS cluster interface (based on
> using parallel ports, back in the 90s) and, of course, if you search
> for the word "papers", you get nothing useful..
> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov  Wed Jan 11 13:37:14 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 10:37:14 -0800
Subject: [Beowulf] PAPERS interface
In-Reply-To: <4F0DD65B.3060808@nasa.gov>
References: <4F0DD65B.3060808@nasa.gov>
Message-ID:

Thanks.. Also props to Juan Gallego, who found it too..

From: Jeff Becker [mailto:Jeffrey.C.Becker at nasa.gov]
Sent: Wednesday, January 11, 2012 10:35 AM
To: Lux, Jim (337C)
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] PAPERS interface

On 01/11/12 10:27, Lux, Jim (337C) wrote:
> Arghh.. my google-fu is failing me..
> [...]

Hi Jim. The lead author is Hank Dietz. The acronym is: PAPERS -
Purdue's Adapter for Parallel Execution and Rapid Synchronization.

Cheers from NASA Ames...
-jeff

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov  Wed Jan 11 13:39:41 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 10:39:41 -0800
Subject: [Beowulf] PAPERS interface
In-Reply-To:
References:
Message-ID:

Excellent.. Purdue.. and have we really been beowulfing since 1994?
I'll bet that the earliest clusters can legally buy alcohol now...

So, if I build a cluster with Arduinos using the PAPERS style
interface, what will it be called... BeoPaperDuino?

From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Lux, Jim (337C)
Sent: Wednesday, January 11, 2012 10:28 AM
To: beowulf at beowulf.org
Subject: [Beowulf] PAPERS interface

Arghh.. my google-fu is failing me..
[...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From atp at piskorski.com  Wed Jan 11 14:38:53 2012
From: atp at piskorski.com (Andrew Piskorski)
Date: Wed, 11 Jan 2012 14:38:53 -0500
Subject: [Beowulf] PAPERS interface
In-Reply-To:
References:
Message-ID: <20120111193853.GA86203@piskorski.com>

On Wed, Jan 11, 2012 at 10:27:31AM -0800, Lux, Jim (337C) wrote:

> I'm looking for the papers on the PAPERS cluster interface (based on
> using parallel ports, back in the 90s) and, of course, if you

It also came up a few times here on the list, e.g.:

http://www.beowulf.org/archive/2004-October/010934.html
From: Tim Mattox
Date: Sat Oct 16 15:15:14 PDT 2004

--
Andrew Piskorski
http://www.piskorski.com/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl  Wed Jan 11 17:47:00 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 23:47:00 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl>
Message-ID: <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl>

Jim, your microcontroller cluster is not a very good idea.

Latency didn't keep up with CPU speeds...

Today's nodes have 12 CPU cores, soon 16, which can execute - to take
a simple integer example, my chess program and its IPC - about 24
instructions per cycle in aggregate. So nothing SIMD, just simple
integer instructions mostly; of course loads that are effectively
served from L1 play an overwhelming role there.
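Making that back-of-envelope figure explicit (the 3 GHz clock is an
assumption, and this anticipates the correction Vincent posts in his
follow-up below):

    \[
      \text{instructions per remote read}
        = \underbrace{24}_{\text{inst/cycle}}
          \times \underbrace{3\times 10^{9}}_{\text{cycles/s}}
          \times \underbrace{10^{-6}\,\text{s}}_{\text{RDMA latency}}
        = 72{,}000 \approx 75\text{k}.
    \]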
Typical latency for a random memory read from a remote node, even with
the latest networks, is between 0.85 and 1.9 microseconds. Let's take
an optimistic 1 microsecond for an RDMA read... So in that timeframe
you can execute 24k+ instructions.

IPC on the cheapo CPUs is effectively far under 1 - around 0.25 for
most codes. CPUs at 70 MHz with that IPC execute one instruction every
four cycles or so. Now, we are working with rough measures here: let's
call the latency on such a 'cluster' 1/4 of a millisecond - even USB
1.1 sticks have latencies far under 1 millisecond.

So the actual latency of today's clusters, relative to CPU speed, is a
factor of 25k worse than this 'cluster'. In fact your microcontroller
cluster here has relative latencies that you do not even get core to
core within a single CPU today.

There is still too much 1980s and 1990s software out there, written by
the guys who wrote the books about how to parallelize, which simply
doesn't scale at all on modern hardware. Let me not quote too many
names there, as I've done that before. They were just too lazy to
throw away their old code and start over with a new parallel design
that works on today's hardware.

If we involve GPUs, then there is going to be an even bigger problem,
and that's that the bandwidth of the network can't keep up with what a
single GPU delivers. Who is to blame for that is quite a complicated
discussion, if anyone has to be blamed at all. We just need more
clever algorithms there.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl  Wed Jan 11 17:56:12 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 23:56:12 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl>
Message-ID: <106FFC0A-B488-4A39-8C55-7FD27C3BCFC1@xs4all.nl>

On Jan 11, 2012, at 11:47 PM, Vincent Diepeveen wrote:

> Jim, your microcontroller cluster is not a very good idea.
> [...]
> So in that timeframe you can execute 24k+ instructions.

Hah, how easy it is to make a mistake - sorry for that. I didn't even
multiply by the GHz clock of the CPUs yet. So if it's 3 GHz or so,
it's actually closer to a factor of 75k than 24k.

Furthermore, another problem is that you can't fully load networks, of
course. So to keep the network functioning well you want to do such
hammering over the network no more than once every 750k instructions.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov  Wed Jan 11 18:24:55 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 15:24:55 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl>
Message-ID:

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Vincent Diepeveen
Sent: Wednesday, January 11, 2012 2:47 PM
To: Beowulf Mailing List
Subject: Re: [Beowulf] A cluster of Arduinos

Jim, your microcontroller cluster is not a very good idea.

Latency didn't keep up with CPU speeds...

--- You're missing the point of the cluster. It's not for performance
(where I can't imagine that the slowest single CPU PC out there
wouldn't blow the figurative doors off). It's to provide a very
inexpensive way to experiment/play/demonstrate loosely coupled
multiprocessor systems.

--> For example, you could experiment with redundant message routing
across a fabric of nodes. The algorithms are fairly simple, and this
gives you a testbed which is qualitatively different from just
simulating a bunch of nodes on a single PC. There is pedagogical value
in a system where you can force a link error by just disconnecting the
cable, and your blinky lights on each node show what's going on.

There is still too much 1980s and 1990s software out there, written by
the guys who wrote the books about how to parallelize, which simply
doesn't scale at all on modern hardware.

--> I think that a lot of the theory of parallel processes is speed
independent, and while some historical approaches might not be used in
a modern system for good implementation reasons, students and others
still need to learn about them, if only as the canonical approach.
Sure, you could do a simulation on a single PC (and I've seen them, in
Simulink and in other more specialized tools), but there's a lot of
appeal to a hands-on-the-cheap-hardware approach to learning.

--> To take an example, if you set a student a problem of lighting a
LED on each node in a specified node order at specified intervals,
where the node interconnects are not specified in advance, that's a
fairly interesting homework problem. You have to discover the network
connectivity graph, then figure out how to pass the message to the
appropriate node at the appropriate time. This is a classic "hot plug
network discovery" kind of problem, and in the face of intermittent
links, it's of great interest.

--> While that particular problem isn't exactly HPC, it DOES relate to
HPC in a world where you cannot assume perfect processor nodes and
perfect communications links. And that gets right to the whole
"scalability" thing in HPC. It wasn't till the implementation of error
correcting codes in logic that something like the Q7A computer was
even possible, because it was so large that you couldn't guarantee
that all the tubes would be working all the time. Likewise with many
other aspects of modern computing.

--> And, of course, in the spaceflight world, this kind of thing is
even more important. A concept of growing importance is the
"fractionated spacecraft", where all of the functions that would have
been all in one physical vehicle are now spread across many smaller
pieces. And one might reallocate spacecraft fractional pieces between
different virtual spacecraft. Maybe right now you need a lot of
processing power to do image compression and analysis, so you want to
allocate a lot of "processing pieces" to the job, with an ad hoc
network connection among them. Later, you don't need them, so you can
release them to other uses. The pieces might be in the immediate
vicinity, or they might be some distance away, which affects the data
rate in the link and its error rates.

--> You can legitimately ask whether this sort of thing (the
fractionated spacecraft) is a Beowulf (defined as a cluster
supercomputer built of commodity components) and I would say it shares
many of the same properties, especially in the early Beowulf days
before multicores and fancy interconnects were fashionable for
multi-thousand processor clusters. It's that idea of building a large
complex device out of many basically identical subunits, using open
source/simple software to manage it.

-->> In summary, it's not about performance.. it's about a teaching
tool for networking in the context of cluster computing. You claim we
need to cast off the shackles of old programming styles and get some
new blood and ideas. Well, you need to get people interested in
parallel computing and learning the basics (so at least they don't
reinvent the square wheel). One way might be challenges such as
parallelization of game play; another might be working with
parallelized databases; the way I propose is with experimenting with
message passing parallelization using dirt cheap hardware.
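The discovery half of the homework problem posed above reduces to
flooding/breadth-first search. A tiny host-side simulation sketch in
C++, with a made-up five-node topology standing in for the real links:

    #include <cstdio>
    #include <queue>
    #include <vector>

    // Simulated discovery of an unknown node graph by flooding from
    // node 0, recording each node's hop distance and upstream link -
    // the routing table the LED exercise needs. Adjacency is invented.
    int main() {
        std::vector<std::vector<int>> link = {
            {1, 2}, {0, 3}, {0, 3}, {1, 2, 4}, {3}};
        int n = (int)link.size();
        std::vector<int> hops(n, -1), via(n, -1);
        std::queue<int> q;
        hops[0] = 0;
        q.push(0);
        while (!q.empty()) {                 // each round = one message wave
            int u = q.front(); q.pop();
            for (int v : link[u]) {
                if (hops[v] != -1) continue; // already discovered
                hops[v] = hops[u] + 1;
                via[v] = u;                  // first-heard-from = route home
                q.push(v);
            }
        }
        for (int v = 0; v < n; ++v)
            std::printf("node %d: %d hops, via %d\n", v, hops[v], via[v]);
        return 0;
    }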
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From deadline at eadline.org  Wed Jan 11 19:18:11 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Wed, 11 Jan 2012 19:18:11 -0500
Subject: [Beowulf] PAPERS interface
In-Reply-To:
References:
Message-ID: <2d6fa78f1fc44cea3df118e1c0a27f31.squirrel@mail.eadline.org>

Hank Dietz - was at Purdue, now at Kentucky; see aggregate.org

--
Doug

> Arghh.. my google-fu is failing me..
>
> I'm looking for the papers on the PAPERS cluster interface (based on
> using parallel ports, back in the 90s) and, of course, if you search
> for the word "papers", you get nothing useful..
>
> I can't remember who the authors were or where it was done (I'm
> thinking in the Southeast US, for some reason, but I'm not sure).

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl  Wed Jan 11 19:36:37 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 01:36:37 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl>
Message-ID:

Yes, this was impossible to explain to a bunch of MIT folks as well -
some of whom wrote your book, I bet. Yet the slower the processor, the
more of a true SMP system it is. It's obvious that you missed that
point. Writing code for a multicore is tougher, from an SMP
constraints viewpoint, than for a bunch of 70 MHz CPUs that have a
millisecond of latency to the other CPUs.

So it's far from demonstrating cluster programming - light-years away.
Emulation on a simple quad-core is in fact more representative than
this.

If you want to get closer to cluster programming than this, just buy
yourself off eBay some Barcelona-core SMP system with 4 sockets, say
with energy-efficient 1.8 GHz CPUs - so with one of the first
incarnations of HyperTransport, as of course later on it dramatically
improved. Latency from CPU to CPU is some 300+ ns if you look up
randomly. Even good programmers in game tree search have big problems
working with those latencies.
If i put in the name 'mellanox' in ebay i see bunches of cheap cards out there and also switches. With a single switch you can teach half a dozen students. You can just connect the machines you already got there onto a few switches and write MPI code like that. Average cost per student also will be a couple of hundreds of dollars. Vincent On Jan 12, 2012, at 12:24 AM, Lux, Jim (337C) wrote: > > > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf- > bounces at beowulf.org] On Behalf Of Vincent Diepeveen > Sent: Wednesday, January 11, 2012 2:47 PM > To: Beowulf Mailing List > Subject: Re: [Beowulf] A cluster of Arduinos > > Jim, your microcontroller cluster is not a rather good idea. > > Latency didn't keep up with the CPU speeds... > > --- You're missing the point of the cluster. It's not for > performance (where I can't imagine that the slowest single CPU PC > out there wouldn't blow the figurative doors off). It's to provide > a very inexpensive way to experiment/play/demonstrate loosely > coupled multiprocessor systems. > > --> for example, you could experiment with redundant message > routing across a fabric of nodes. The algorithms are fairly > simple, and this gives you a testbed which is qualitatively > different than just simulating a bunch of nodes on a single PC. > There is pedagogical value in a system where you can force a link > error by just disconnecting the cable, and your blinky lights on > each node show what's going on. > > > There is still too much years 80s and years 90s software out there, > written by the guys who wrote books about how to parallellize, > which simply doesn't scale at all at modern hardware. > > --> I think that a lot of the theory of parallel processes is > speed independent, and while some historical approaches might not > be used in a modern system for good implementation reasons, > students and others still need to learn about them, if only as the > canonical approach. Sure, you could do a simulation on a single > PC (and I've seen them, in Simulink, and in other more specialized > tools), but there's a lot of appeal to a hands-on-the-cheap- > hardware approach to learning. > > --> To take an example, if you set a student a problem of lighting > a LED on each node in a specified node order at specified > intervals, and where the node interconnects are not specified in > advance, that's a fairly interesting homework problem. You have to > discover the network connectivity graph, then figure out how to > pass the message to the appropriate node at the appropriate time. > This is a classic "hot plug network discovery" kind of problem, and > in the face of intermittent links, it's of great interest. > > --> While that particular problem isn't exactly HPC, it DOES relate > to HPC in a world where you cannot assume perfect processor nodes > and perfect communications links. And that gets right to the whole > "scalability" thing in HPC. It wasn't til the implementation of > Error Correcting Codes in logic that something like the Q7A > computer was even possible, because it was so large that you > couldn't guarantee that all the tubes would be working all the > time. Likewise with many other aspects of modern computing. > > --> And, of course, in the spaceflight world, this kind of thing is > even more important. A concept of growing importance is the > "fractionated spacecraft" where all of the functions that would > have been all in one physical vehicle are now spread across many > smaller pieces. 
> And one might reallocate spacecraft fractional pieces between different virtual spacecraft. Maybe right now, you need a lot of processing power to do image compression and analysis, so you want to allocate a lot of "processing pieces" to the job, with an ad hoc network connection among them. Later, you don't need them, so you can release them to other uses. The pieces might be in the immediate vicinity, or they might be some distance away, which affects the data rate in the link and its error rates.
>
> --> You can legitimately ask whether this sort of thing (the fractionated spacecraft) is a Beowulf (defined as a cluster supercomputer built of commodity components) and I would say it shares many of the same properties, especially in the early Beowulf days before multicores and fancy interconnects were fashionable for multi-thousand-processor clusters. It's that idea of building a large complex device out of many basically identical subunits, using open source/simple software to manage it.
>
> -->> in summary, it's not about performance.. it's about a teaching tool for networking in the context of cluster computing. You claim we need to cast off the shackles of old programming styles and get some new blood and ideas. Well, you need to get people interested in parallel computing and learning the basics (so at least they don't reinvent the square wheel). One way might be challenges such as parallelization of game play; another might be working with a parallelized database; the way I propose is experimenting with message-passing parallelization using dirt cheap hardware.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From samuel at unimelb.edu.au Wed Jan 11 19:59:18 2012
From: samuel at unimelb.edu.au (Chris Samuel)
Date: Thu, 12 Jan 2012 11:59:18 +1100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: References: Message-ID: <201201121159.18993.samuel@unimelb.edu.au>

On Thu, 12 Jan 2012 11:36:37 AM Vincent Diepeveen wrote:

> So it's far from demonstrating cluster programming. Light-years away.

Whatever happened to hacking on hardware just for the fun of it?

Just because it's not going to be useful doesn't mean you won't learn from the experience, even if the lesson is only "don't do it again". :-)

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From samuel at unimelb.edu.au Wed Jan 11 20:04:32 2012
From: samuel at unimelb.edu.au (Chris Samuel)
Date: Thu, 12 Jan 2012 12:04:32 +1100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: References: Message-ID: <201201121204.32332.samuel@unimelb.edu.au>

On Thu, 12 Jan 2012 04:58:13 AM Lux, Jim (337C) wrote:

> Also, does the Raspberry PI $25 price point include a power supply?

I thought the plan was for them to be powered from the HDMI connector, but it appears I was wrong; it looks like it can use either micro-USB or the GPIO header.

http://elinux.org/RaspberryPiBoard

# The board takes fixed 5V input, (with the 1V2 core voltage generated
# directly from the input using the internal switch-mode supply on the
# BCM2835 die). This permits adoption of the micro USB form factor,
# which, in turn, prevents the user from inadvertently plugging in
# out-of-range power inputs; that would be dangerous, since the 5V
# would go straight to HDMI and output USB ports, even though the
# problem should be mitigated by some protections applied to the input
# power: The board provides a polarity protection diode, a voltage
# clamp, and a self-resetting semiconductor fuse.

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Wed Jan 11 20:09:53 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 17:09:53 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> Message-ID:

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Vincent Diepeveen
Sent: Wednesday, January 11, 2012 4:37 PM
To: Beowulf Mailing List
Subject: Re: [Beowulf] A cluster of Arduinos

Yes, this was impossible to explain to a bunch of MIT folks as well, some of whom wrote your book, I bet - yet the slower the processor, the more of a true SMP system it is. It's obvious that you missed that point.

Writing code for a multicore is tougher, from an SMP-constraints viewpoint, than for a bunch of 70 MHz CPUs that have a millisecond of latency to the other CPUs.

-> Yes, that's true... but that's also what I would think of as more advanced than understanding basic message passing or non-tightly-coupled multiprocessing systems. And there are lots of applications for the latter. Some might not be as sexy as others, but they exist.

So it's far from demonstrating cluster programming. Light-years away. Emulation on a simple quad-core is in fact more representative than this. If you want to get closer to cluster programming than this, just buy yourself, off eBay, some Barcelona-core SMP system with 4 sockets - say with energy-efficient 1.8 GHz CPUs, so with one of the first incarnations of HyperTransport (of course it improved dramatically later on). Latency from CPU to CPU is some 300+ ns if you look up randomly. Even good programmers in game-tree search have big problems working with those latencies.
-> but that's an entirely different sort of problem space and instructional area.

Clusters have latencies that are far worse than that. Yet as CPU speeds no longer increase much and the number of cores doesn't double that quickly, clusters are the way to go if you're CPU hungry. Setting up small clusters is cheap as well. If I put the name 'mellanox' into eBay I see bunches of cheap cards out there, and switches as well.

-> Oh, I'm sure the surplus market is full of things one could potentially use. But I suspect that by the time you lash together your $40 cards and $20 cables and several-hundred-$ switch, you're up in the total system price >$1k. And you're using surplus, so there's a support issue. If you're tinkering for yourself in the garage or as a one-off, then surplus is a fine way to go. If you want to be able to give a list of "go buy this" to a teacher, it needs to be off-the-shelf, currently-being-manufactured stuff.

-> Say you want to set up 10 demo systems with 8 nodes each, so that each student in a small class has their own to work with. There's a big difference between $30 Arduinos and $200 netbooks.

With a single switch you can teach half a dozen students. You can just connect the machines you already have onto a few switches and write MPI code like that.

-> The whole point is to give a student exclusive access to the system, without needing to share. Sure, we've all done the shared "computer lab" resource thing and managed to learn (in the late 1970s, I would have done quite a lot to have on-demand access to an 029 keypunch). That's part of what *personal* computers is all about. My program doesn't work right, I just hit the reset button and start over.

-> I confess, too, that there is an aspect of the "mass of boards on the desktop with cables strewn around" which is a learning experience in itself. On the other hand, the Arduino experience is a lot less hassle than, say, a mass of PC mobos, network cards, and power supplies and trying to get them to boot off the net or a USB drive.

Average cost per student will also be a couple of hundred dollars.

-> that's the "total cost of several thousand dollars divided by N students who share it", I suspect. We could get into a little BOM battle, and I'd venture that I can keep the off-the-shelf parts cost under $500, and give each student a dedicated system to play with. The only part that I don't know right off the top of my head is the actual interconnect hardware. I think you'd want to design some sort of board with a bunch of connectors that connects to the Arduinos with ribbon cables. But even there, that could be "here's your PCBExpress file.. order the board and you get 3 for $50".

-> over the years I've been involved in several of these "what can we set up for a demonstration" exercises, and I've converged to the realization that what you need is a parts list (preferably preloaded at Newark or DigiKey or Mouser or similar) and an explicit set of instructions. A setup that starts out with:
1) Find 8 motherboards on eBay or newegg with these sorts of specs
2) Find 8 power supplies that match the motherboards
is doomed to failure. You need "buy 3 of those and 6 of these, and hook them up this way".

This is the beauty of the whole Arduino culture. In fact, it's a bit too much of that.. there's not a lot of good overview tutorial material.. but lots of "here's how to do specific task X"... I got started looking at Arduinos because I want to build a multichannel temperature controller to smoke/cure sausage.
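To make the "message passing on dirt cheap hardware" exercise concrete: a node program can be a couple of dozen lines. The following is an illustrative sketch only - it assumes a hypothetical daisy-chained ring where each board's TX pin feeds the next board's RX, a fixed ring size, and a per-node ID edited before uploading; a PC on one node's USB port injects the first token:

// Toy "pass the token" node for a ring of Arduinos.
// Hypothetical wiring: this board's TX -> next board's RX, around the ring.
const int LED_PIN   = 13;  // onboard LED on an Uno
const int MY_ID     = 2;   // edit per node before uploading (0, 1, 2, ...)
const int NUM_NODES = 8;   // assumed ring size

void setup() {
  pinMode(LED_PIN, OUTPUT);
  Serial.begin(9600);      // the serial link doubles as the "interconnect"
}

void loop() {
  if (Serial.available() > 0) {
    int token = Serial.read();
    if (token == MY_ID) {
      digitalWrite(LED_PIN, HIGH);   // our turn: blink...
      delay(100);
      digitalWrite(LED_PIN, LOW);
      Serial.write((uint8_t)((token + 1) % NUM_NODES));  // ...then address the next node
    } else {
      Serial.write((uint8_t)token);  // not ours: forward unchanged
    }
  }
}

Pulling a jumper mid-run and watching the blink pattern stall is precisely the "force a link error by just disconnecting the cable" demonstration described above.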
But I've used just about every small single-board computer out there: Rabbit, Basic Stamp, various PIC boards, etc., not to mention various Mini-ITX and PC schemes. So far, the Arduino is the winner on dirt cheap and simple combined. Spend $30, plug in the USB cable, load the Java environment, done. Now I know why all those projects at the science fair are using them. You get to focus on what you want to do, rather than on getting a computer working.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Wed Jan 11 20:22:07 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 17:22:07 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <201201121204.32332.samuel@unimelb.edu.au> References: <201201121204.32332.samuel@unimelb.edu.au> Message-ID:

Interesting... That seems to be a growing trend, then. So, now we just have to wait for them to actually exist. The $35 B-style board has Ethernet, and assuming one could netboot and operate "headless", then a stack o'Raspberry Pis and a cheap Ethernet switch might be an alternate approach.

The "per node" cost is comparable to the Arduino, and it's true that Ethernet is probably more congenial in the long run.

Drawing 700 mA off the micro-USB, though.. that's fairly hefty (although not a big deal in general..
you might need to have some better power supply scheme for a basket-o'-Pi cluster. (An Arduino Uno runs around 40-50 mA.)

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Wed Jan 11 21:03:21 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 03:03:21 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> Message-ID: <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl>

The whole purpose of PCs is that they are generic to use. I remember how, in the past, the decision-takers bought low-clocked junk for a big price - much against the wishes of the sysadmins, who wanted a PC for every student exclusively. Outdated slow junk is not interesting to students. Now you and I might like that CPU as it's under $1, but to them it's just 70 MHz, a factor of 500 slower than their home PC's single core. What impresses is if you've got something that can beat their own machine at home.

In the end, in science we basically learn a lot more easily if we can take a look into the future - so being faster than a single PC is a good example of that. So let them do that. If you take care to launch 1 process on each machine, then with quad-core machines, not to mention i7s with hyperthreading, you can have 24 computers on 1 switch that serve 24 students, each using 12 logical cores (see the launch sketch below). And for demonstration purposes you can run successful applications on all 24 computers at the same time. Hey, there are switches with even more ports.

Average price per student is gonna beat the crap out of any junk solution you show up with - besides, how many are you gonna buy? Those computers are already there, one for each student I suspect. So they can toy away exclusively - for the switch it's not a real problem except if they really mess up. But most important, they learn something - by toying with 70 MHz hardware that's not representative, and only interesting to experts like you and me who are really good at embedded programming, they don't learn much.
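The one-process-per-machine launch sketched above is just a hostfile away. Illustrative only, assuming Open MPI; the host names are placeholders:

# hosts -- one slot per lab machine, so each rank lands on its own box
pc01 slots=1
pc02 slots=1
# ... continue through pc24 slots=1

mpirun --hostfile hosts -np 24 ./my_mpi_app

With slots=1 per entry, the 24 ranks spread out one per machine instead of piling onto the first few nodes.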
There is no replacement for the real thing to test upon. Besides, if you go programming embedded processors: me writing good, fast single-CPU code is probably gonna kick the hell out of you writing the same program for 8 CPUs - probably by a factor of 10+, my single core against your 8.

p.s. Not that it's disturbing, Jim, but your replies are always typed within my original message, so it's sometimes tough to read what you typed into the message I posted here - maybe this Apple MacBook Pro's mail program doesn't know how to handle it. FYI, I want to reformat the machine to Linux anyway - getting sick of being hacked silly each time by about every other consultant. But, well, this is all off topic - hence the postscriptum.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eagles051387 at gmail.com Thu Jan 12 02:42:26 2012
From: eagles051387 at gmail.com (Jonathan Aquilina)
Date: Thu, 12 Jan 2012 08:42:26 +0100
Subject: [Beowulf] clustering using off the shelf systems in a fish tank full of oil.
In-Reply-To: <1820F354-C0D4-4337-A9EB-DDBD9CB50761@xs4all.nl>
References: <4EFB5AAE.3030900@gmail.com> <715C5657-461B-41E7-9591-5DF89F3CC285@xs4all.nl> <4EFC8D03.4020406@gmail.com> <5AF52A05-28AA-4EE5-A081-EA60BD1E9B32@xs4all.nl> <4EFC9540.5010906@gmail.com> <4F020244.4040505@ias.edu> <1820F354-C0D4-4337-A9EB-DDBD9CB50761@xs4all.nl>
Message-ID: <4F0E8EE2.7040403@gmail.com>

On 11/01/2012 18:30, Vincent Diepeveen wrote:
> On Jan 2, 2012, at 8:15 PM, Prentice Bisbal wrote:
>> On 12/29/2011 07:50 PM, Vincent Diepeveen wrote:
>>> it's very useful Mark, as we know now he works for the company and
>>> also for which nation.
>>>
>>> Vincent
>> For someone who's always bashing on US Foreign policy, you sure sound
>> like a Republican or member of the Department of Homeland Security!
> Where is my paycheck?

FYI Vincent, I am now back in Malta.

Regards
Jonathan Aquilina

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org Thu Jan 12 03:49:45 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Thu, 12 Jan 2012 09:49:45 +0100
Subject: [Beowulf] the Barcelona Supercomputing Center
Message-ID: <20120112084945.GD21917@leitl.org>

Just some cluster porn:

http://imgur.com/a/OoNVI

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From john.hearns at mclaren.com Thu Jan 12 05:16:28 2012
From: john.hearns at mclaren.com (Hearns, John)
Date: Thu, 12 Jan 2012 10:16:28 -0000
Subject: [Beowulf] A cluster of Arduinos
References: <201201121204.32332.samuel@unimelb.edu.au>
Message-ID: <207BB2F60743C34496BE41039233A8090A7D728A@MRL-PWEXCHMB02.mil.tagmclarengroup.com>

> Interesting... That seems to be a growing trend, then. So, now we just have to wait
> for them to actually exist. The $35 B-style board has Ethernet, and
> assuming one could netboot and operate "headless", then a stack
> o'Raspberry Pis and a cheap Ethernet switch might be an alternate
> approach.

Regarding Ethernet switches, I had cause recently to look for a USB-powered switch. Such things exist; they are promoted for gamers.

http://www.scan.co.uk/products/8-port-eten-pw-108-pocket-size-metal-casing-10-100-switch-usb-powered-lan-party!

You could imagine a cluster being powered by those USB adapters which fit into the cigarette lighter socket of a car. How about a cluster which fits in the glovebox or under the seat of a car?

The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From peter.st.john at gmail.com Thu Jan 12 08:49:16 2012
From: peter.st.john at gmail.com (Peter St. John)
Date: Thu, 12 Jan 2012 08:49:16 -0500
Subject: [Beowulf] the Barcelona Supercomputing Center
In-Reply-To: <20120112084945.GD21917@leitl.org> References: <20120112084945.GD21917@leitl.org> Message-ID:

The architectural contrast (the building housing the racks is a chapel) is vivid. Sorta steampunkish. The place is described a bit at http://www.bsc.es/plantillaA.php?cat_id=1 (many of their pages seem to be in English).

Peter

On Thu, Jan 12, 2012 at 3:49 AM, Eugen Leitl wrote:
> Just some cluster porn:
>
> http://imgur.com/a/OoNVI

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From ellis at runnersroll.com Thu Jan 12 08:58:20 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 08:58:20 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl>
Message-ID: <4F0EE6FC.2050002@runnersroll.com>

On 01/11/2012 09:03 PM, Vincent Diepeveen wrote:
> The whole purpose of PCs is that they are generic to use. I remember
> how, in the past, the decision-takers bought low-clocked junk for a big price -
> much against the wishes of the sysadmins, who wanted a PC for every
> student exclusively. Outdated slow junk is not interesting
> to students. Now you and I might like that CPU as it's under $1, but
> to them it's just 70 MHz, a factor of 500 slower than their home PC's single core.
> What impresses is if you've got something that can beat their own
> machine at home.
>
> In the end, in science we basically learn a lot more easily if we can take
> a look into the future - so being faster than a single PC is a good
> example of that.

Take this advice into any other area - let's say Chemical Engineering or Mechanical Engineering - and the students are going to come out of the experience with, at the least, chemical burns, and at most they'll blow up half of the building. In the best case all they do is screw up very, very expensive equipment. So I have to respectfully disagree that learning is only possible, and students only interested, when working on the stuff of the "future." I think this is likely the reason why many introductory engineering classes incorporate Lego Mindstorms robots rather than lunar rovers (or even overstock lunar rovers :D).

Case in point: I got interested in HPC/Beowulfery back in 2006, read RGB's book and a few other texts on it, and finally found a small group (4) of unused PIIIs to play on in the attic of one of my college's buildings. Did I learn how to set up a reasonable cluster? Yes. Was it slow as dirt compared to then-modern Intel and AMD processors? Of course. But did the experience get me so completely hooked on HPC/cluster research that I went on to pursue a PhD on the topic? Absolutely.

Granted, I'm just one data point, but I think Jim's idea has all the right components for a great educational experience.

Best,

ellis

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From prentice at ias.edu Thu Jan 12 09:28:56 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Thu, 12 Jan 2012 09:28:56 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: References: <201201121204.32332.samuel@unimelb.edu.au> Message-ID: <4F0EEE28.6030404@ias.edu>

On 01/11/2012 08:22 PM, Lux, Jim (337C) wrote:
> Interesting... That seems to be a growing trend, then. So, now we just have to wait for them to actually exist. The $35 B-style board has Ethernet, and assuming one could netboot and operate "headless", then a stack o'Raspberry Pis and a cheap Ethernet switch might be an alternate approach.
>
> The "per node" cost is comparable to the Arduino, and it's true that Ethernet is probably more congenial in the long run.
You can get an Ethernet "shield" for the Arduino to add Ethernet capabilities, but at $35-50 each, your cost savings just went out the window, especially when compared to the Raspberry Pi. You can also buy the Arduino Ethernet, which is an Arduino board with Ethernet built in, but at a cost of ~$60 it is no better a value than buying an Arduino and the Ethernet shield separately.

> Drawing 700 mA off the micro-USB, though.. that's fairly hefty (although not a big deal in general.. you might need to have some better power supply scheme for a basket-o'-Pi cluster. (An Arduino Uno runs around 40-50 mA.)

The Arduino can be powered by USB or a 9V power supply, so if you plan on using lots of them (as Jim is, theoretically), you don't have to worry about overloading the USB bus.

--
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Thu Jan 12 09:35:50 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 06:35:50 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <207BB2F60743C34496BE41039233A8090A7D728A@MRL-PWEXCHMB02.mil.tagmclarengroup.com> Message-ID:

On 1/12/12 2:16 AM, "Hearns, John" wrote:

>Regarding Ethernet switches, I had cause recently to look for a USB-powered switch.
>Such things exist; they are promoted for gamers.
>http://www.scan.co.uk/products/8-port-eten-pw-108-pocket-size-metal-casing-10-100-switch-usb-powered-lan-party!
>
>You could imagine a cluster being powered by those USB adapters which
>fit into the cigarette lighter socket of a car.
>How about a cluster which fits in the glovebox or under the seat of a car?

Powering off the cigarette lighter socket (or 12V power socket, as they're now labeled) is probably feasible, but those USB widgets can't source a lot of power. Certainly not amps.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
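Some rough numbers, using the figures quoted earlier in this thread (the adapter rating is a typical value of the era, not a measured one):

  8 boards x 0.70 A x 5 V = 28 W  (a stack of eight Raspberry Pis)
  8 boards x 0.05 A x 5 V =  2 W  (the same stack built from Arduino Unos)

A typical 5 V car USB adapter supplies 0.5-1 A (2.5-5 W), so it could carry the whole Arduino stack but at best one Pi per adapter; a glovebox Pi cluster would need a proper supply off the 12 V rail.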
From diep at xs4all.nl Thu Jan 12 09:39:23 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 15:39:23 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0EE6FC.2050002@runnersroll.com> References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com> Message-ID: <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl>

The average guy is not interested in knowing all the details of how to play tennis with a wooden racket from the 1980s, from around the time McEnroe was out there on the tennis court. Most people are more interested in whether you can win that grand slam with what you produce.

The nerds, however, are interested in how well you can do with a wooden racket from the 1980s. Therefore, projecting your own interest upon those students will just get them disinterested, and you will be judged by them as an irrelevant person in their life, whose name they soon forget.

Vincent
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From prentice at ias.edu Thu Jan 12 09:38:13 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Thu, 12 Jan 2012 09:38:13 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> Message-ID: <4F0EF055.3050609@ias.edu>

On 01/11/2012 09:03 PM, Vincent Diepeveen wrote:
> The whole purpose of PCs is that they are generic to use.

That is also the purpose of the Arduino. That's why they open-sourced its hardware design.

> I remember how, in the past, the decision-takers bought low-clocked junk for a big price -
> much against the wishes of the sysadmins, who wanted a PC for every
> student exclusively. Outdated slow junk is not interesting
> to students. Now you and I might like that CPU as it's under $1, but
> to them it's just 70 MHz, a factor of 500 slower than their home PC's single core.
> What impresses is if you've got something that can beat their own
> machine at home.

Wrong. What impresses students is teaching them something they didn't already know, or showing them how to do something new. Using baking soda and vinegar to build a volcano is very low-tech, but it still impresses students of all ages (even in this modern Apple i-everything world), and it's done with ingredients just about everyone already has in their kitchen. Show them sodium acetate crystallizing out of a supersaturated solution, and their heads practically explode. Also very low-tech.

--
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From prentice at ias.edu Thu Jan 12 09:50:05 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Thu, 12 Jan 2012 09:50:05 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl> References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com> <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl> Message-ID: <4F0EF31D.8010603@ias.edu>

On 01/12/2012 09:39 AM, Vincent Diepeveen wrote:
> The average guy is not interested in knowing all the details of how to
> play tennis with a wooden racket from the 1980s, from around
> the time McEnroe was out there on the tennis court.
> Most people are more interested in whether you can win that grand slam
> with what you produce.
> The nerds, however, are interested in how well you can do with a wooden
> racket from the 1980s. Therefore, projecting your own interest upon those students
> will just get them disinterested, and you will be judged by them as an
> irrelevant person in their life, whose name they soon forget.

Vincent, I think the only person projecting here is you. You refer to the 'average guy'. The word 'average' itself implies that statistics have been collected and analyzed. Can you please show us your statistics, and how you collected them, to determine what the average guy is interested in? And what about the average girl - what is she interested in? If you are merely citing the work of other researchers, please include citations.

--
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From ellis at runnersroll.com Thu Jan 12 09:53:57 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 09:53:57 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0EF31D.8010603@ias.edu> References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com> <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl> <4F0EF31D.8010603@ias.edu> Message-ID: <4F0EF405.5070600@runnersroll.com>

On 01/12/2012 09:50 AM, Prentice Bisbal wrote:
> Vincent, I think the only person projecting here is you. You refer to
> the 'average guy'. The word 'average' itself implies that statistics
> have been collected and analyzed. [...]
> please include citations.

Guys, let's just let this one die in its traditional form of "Vincent disagrees with the list and there is nothing more that can be done." I recently read a blog that suggested (due to similar threads following these trajectories) that the Wulf list isn't what it used to be. Let's save the flames for editors,

ellis

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Thu Jan 12 10:03:49 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 16:03:49 +0100
Subject: [Beowulf] A cluster of Arduinos

Very simple. Wooden tennis rackets were dirt cheap in the 90s. No one
bought them. Instead, everyone bought a light-frame racket with a big
blade for the tennis court; in fact, those were pretty expensive in some
cases. Why did no one suddenly use those wooden rackets anymore?

How many people will watch the upcoming Australian grand slam? A lot.

How many will watch one or two dudes toy with a few embedded processors
using a language no one has heard of? Only a handful.

On Jan 12, 2012, at 3:50 PM, Prentice Bisbal wrote:
> Vincent, I think the only person projecting here is you. [...]

From james.p.lux at jpl.nasa.gov Thu Jan 12 10:10:40 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 07:10:40 -0800
Subject: [Beowulf] A cluster of Arduinos

On 1/12/12 6:39 AM, "Vincent Diepeveen" wrote:

> The average guy is not interested in knowing all the details of how to
> play tennis with a wooden racket from the 1980s, from around the time
> McEnroe was on the court.
>
> Most people are more interested in whether you can win a grand slam
> with what you produce.
>
> The nerds, however, are interested in how well you can do with a wooden
> racket from the 1980s. Projecting your own interests onto those students
> will just leave them uninterested, and they will judge you an irrelevant
> person in their lives whose name they will soon forget.

Having spent some time recently in Human Resources meetings about how to
better recruit software people for JPL, I'd say that something that
appeals to nerds and gives them something to do is not all bad. Part of
the educational process is to find and separate the people who are
interested and have a passion. I'm not sure that someone who starts
getting into clusters mostly because they are interested in breaking into
the Top500 is the target audience in any case.

If you look over the hobby clusters out there, the vast majority are "hey,
I heard about this interesting idea, I scrounged up N old/small/slow/easy
to find computers and tried to cluster them and do something. I learned
something about cluster administration, and it was fun, but I don't use it
anymore."

This is exactly the population you want to hit. Bring in 100 advanced
high school (grade 11-12 in US) students. Have them all use cheap
hardware to do a cluster. Some fraction will think, "this is kind of
cool, maybe I should major in CS instead of X." Some fraction will think,
"how lame, why not make the single processor faster," and they can be
CompEng or EE majors looking at how to reduce feature sizes and get the
heat out.

It's just like biology or chemistry classes. In high school biology
(9th/10th grade) most of it is mundane memorization (the Krebs cycle,
various descriptive stuff). Other than the use of cheap CMOS cameras,
microscopes used at this level haven't really changed much in the last
100 years (and the microscopes at my kids' school are probably 10-20
years old). They also do some more modern molecular biology in a series
of labs partly funded by Amgen: some recombinant DNA to put fluorescent
proteins in a bacterium, running some gels, etc. The vast majority of the
students will NOT go on to a career in biology, but some fraction do;
they get interested in some aspect, and they wind up majoring in bio, or
being a pre-med, etc.

Not everyone is looking for the world beater. A lot of kids start with
kart racing, even though even the fastest karts aren't as fast as F1 (or
even a Smart Car). How many engineers started by dismantling the
lawnmower engine?

For my own work, I'd rather have people who are interested in solving
problems by ganging up multiple failure-prone processors, rather than
centralizing it all in one monolithic box (even if the box happens to
have multiple cores).
From diep at xs4all.nl Thu Jan 12 10:13:00 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 16:13:00 +0100
Subject: [Beowulf] A cluster of Arduinos

On Jan 12, 2012, at 3:53 PM, Ellis H. Wilson III wrote:
> Guys, let's just let this one die in its traditional form of "Vincent
> disagrees with the list and there is nothing more that can be done."

Ah, no medicine seems to cure you.

Let me recall the original posting of Jim:

"it seems you could put together a simple demonstration of parallel
processing and various message passing things."

The insights presented here obviously render this platform no good for
that. It is not inspiring, and the clever students will certainly lose
interest; a bunch of them, out of disinterest, will probably not even
finish the course. Working with hardware that isn't even within a factor
of 500 of the speed of a normal CPU doesn't motivate, doesn't inspire,
and teaches a person very little.

Embedded CPUs are for professionals; leave it at that. They are too hard
for you to program efficiently.
From diep at xs4all.nl Thu Jan 12 10:21:54 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 16:21:54 +0100
Subject: [Beowulf] A cluster of Arduinos
Message-ID: <4B53A30C-FF3E-4996-916F-2D8455C90C5D@xs4all.nl>

On Jan 12, 2012, at 4:10 PM, Lux, Jim (337C) wrote:
> This is exactly the population you want to hit. Bring in 100 advanced
> high school (grade 11-12 in US) students. Have them all use cheap
> hardware to do a cluster. Some fraction will think, "this is kind of
> cool, maybe I should major in CS instead of X." [...]

Your example here will just ensure that a big number of students won't
want anything to do with those studies, because there are a few lame
nerds there toying with equipment that's a factor of 50k slower (adding
to the factor of 500 the object-oriented slowdown of a factor of 100)
than what they have at home, and it can do nothing useful.

In this specific case you'll just scare away students, and the really
clever ones will lose all interest because you are busy with lame-duck
CPUs. If you'd build a small Mars rover with it, that would be something
else, of course.
From james.p.lux at jpl.nasa.gov Thu Jan 12 10:35:41 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 07:35:41 -0800
Subject: [Beowulf] List traffic

On 1/12/12 6:53 AM, "Ellis H. Wilson III" wrote:
> I recently read a blog which suggested (because similar threads keep
> following these trajectories) that the Wulf list isn't what it used to
> be.

I think that's for a variety of reasons.

The cluster world has changed. Back 15-20 years ago, clusters were new,
novel, and pretty much roll-your-own, so there was a lot of traffic on the
list about how to do that. Remember all the mobo comparisons, and all the
carefully teased-out idiosyncrasies of various switches and network
schemes.

Back then, the idea of using a cluster for "big computing" was kind of
new, as well. People building clusters were doing it either because the
architecture was interesting OR because they had a computing problem to
solve, and a cluster was a cheap way to do it, especially with free labor.

I think clustering has evolved, and the concept of a cluster is totally
mature. You can buy a cluster essentially off the shelf, from a whole
variety of companies (some with people who were participating in this list
back then and still today), and it's interesting how the basic Beowulf
concept has evolved.

Back in the late 90s, it was still largely "commodity computers, commodity
interconnects," where the focus was on using "business class" computers
and networking hardware. Perhaps not consumer, as cheap as possible, but
certainly not fancy, schmancy rack-mounted 1U servers. The switches
people were using were just ordinary network switches, the same as in the
wiring closet down the hall.

Over time, though, there has developed a whole industry of supplying
components specifically aimed at clusters: high-speed interconnects,
computers, etc. Some of this just follows the IT industry in general.
There weren't as many "server farms" back in 1995 as there are now.

Maybe it's because the field has matured?

So, we're back to talking about "roll-your-own" clusters of one sort or
another. I think anyone serious about big cluster computing (>100 nodes)
probably won't be hanging on this list looking for hints on how to route
and label their network cables. There are too many other places to go get
that information, or, better yet, places to hire someone who already
knows.
I know that if I needed massive computational power at work, my first
thought these days isn't "hey, let's build a cluster"; it's "let's call up
the HPC folks and get an account on one of the existing clusters."

But I still see the need to bring people into the cluster world in some
way. I don't know where the cluster vendors find their people, or even
what sorts of skill sets they're looking for. Are they beating the bushes
at CMU, MIT, and other hotbeds of CS looking for prior cluster design
experience? I suspect not, just like most of the people JPL hires don't
have spacecraft experience in school, or anywhere. You look for bright
people who might be interested in what you're doing, and they learn the
details of cluster-wrangling on the job.

For myself, I like probing the edges of what you can do with a cluster.
Big computational problems don't excite me. I like thinking about things
like:

1) What can I use from the body of cluster knowledge to do something
different? A distributed cluster is topologically similar to one all
contained in a single rack, but it's different. How is it different
(latency, error rate)? Can I use analysis (particularly from early
cluster days) to do a better job?

2) I've always been a fan of *personal* computing (probably from many
years of negotiating for a piece of some shared resource). It's tricky
here, because as soon as you have a decent 8- or 16-node cluster that fits
under a desk, and have figured out all the hideous complexity of how to
port some single-user application to run on it, someone comes out with a
single-processor box that's just as fast, and a lot easier to use. Back
in the 80s, I designed, but did not build, an 80286 clone using discrete
ECL logic, the idea being to make a 100 MHz IBM PC-AT that would run
standard spreadsheet software 20 times faster (a big deal when your huge
spreadsheet takes hours to recalculate). However, Moore's law and Intel
made that idea a losing proposition.

But still, the idea of personal control over my computing resources is
appealing. Nobody watching to see "are you effectively using those CPU
cycles." No arguing about annual re-adjustment of chargeback rates, where
you take the total system budget and divide it by CPU-seconds. Oops, not
enough people used it, so your CPU costs just quadrupled.

3) I'm also interested in portable computing (yes, I have a NEC 8201 -
a TRS-80 Model 100 clone - and a TI-59; I did sell the Compaq, but I had
one of those too, etc.). This is another interesting problem space: no
big computer room with infrastructure. Here, the fascinating trade is
between local computer horsepower and cheap long-distance datacomm. At
some point, it's cheaper/easier to send your data via satellite link to a
big computer elsewhere and get the results back. It's the classic 60s
remote computing problem revisited once again.
From diep at xs4all.nl Thu Jan 12 10:56:32 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 16:56:32 +0100
Subject: [Beowulf] Robots
Message-ID: <95F43560-B9AA-4B56-9B32-1A1979461B07@xs4all.nl>

On Jan 12, 2012, at 2:58 PM, Ellis H. Wilson III wrote:
> I think this is likely the reason why many introductory engineering
> classes incorporate use of Lego Mindstorm robots rather than lunar
> rovers (or even overstock lunar rovers :D).

I didn't comment on the other completely wrong examples, but I want to
highlight one. Your example of a Lego robot actually disproves your
statement.

Among the affordable, non-self-built robots, the Lego robot is a genius
robot. It is, so to speak, the i7-3960X among robots, to compare it with
the fastest i7 released to date. It is affordable, it is completely
programmable with a robot OS, and if you want to build something better,
you need to be pretty much a genius.

A custom robot - unless you build a really simple, stupid thing that can
do next to nothing - will be really expensive compared to such a Lego
robot, which goes for only a couple of hundred dollars. I see it for
around $280 online, and adding components is just a few dozen dollars per
component.

The normal way to build 'something better', if better at all, requires
building most components, for example, from aluminium. Each component
then costs roughly $5k and needs to be specially engineered. You need
many of those components. We assume it's not a commercial project;
otherwise royalties will also be involved for every component you build,
though that's a small part of the above price.

Most custom robots, which are hardly bigger than the Lego robot, are
actually pretty expensive. If you want to purchase components for a
somewhat bigger robot - just something with 4 wheels that can hold a
couple of dozen kilos - such components are already $5k-$10k. And those
are mass-produced components. So building something that is actually more
functional, better, is not going to be easy. It's a genius robot, it
really is. In itself, building a bigger robot is not really a lot more
expensive, if you produce it in the quantities at which Lego produces.

The reason the Lego robot is very small really has to do with safety. Big
robots are really dangerous, you know. Cars already use dozens of CPUs -
even 10+ year old cars easily have over 100 CPUs inside - just for
safety, with the intent that components of the car don't harm people.
Robot software is still far too primitive there: no safety concerns
whatsoever. In all of that, the Lego robot is really a genius thing. A
very bad example of what you 'tried' to show with some fake arguments.

> Case in point, I got interested in HPC/Beowulfery back in 2006, read
> RGB's book and a few other texts on it, and finally found a small group
> (4) of unused PIIIs to play on in the attic of one of my college's
> buildings. Did I learn how to setup a reasonable cluster? Yes. Was it
> slow as dirt compared to then-modern Intel and AMD processors? Of
> course. But did the experience get me so completely hooked on
> HPC/Cluster research that I went on to pursue a PhD on the topic?
> Absolutely.
>
> Granted, I'm just one data point, but I think Jim's idea has all the
> right components for a great educational experience.
>
> Best,
>
> ellis

From diep at xs4all.nl Thu Jan 12 11:45:29 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 17:45:29 +0100
Subject: [Beowulf] List traffic
Message-ID: <4AB87920-7A8F-41B4-8129-3C19218FD9AB@xs4all.nl>

Well, I feel small clusters of, say, 2 computers might get more common in
the future. Yet let's start by asking: what is a cluster? That's not such
a simple question. Having a few computers at home connected via a router
with plain default Ethernet is something many have at home. Is that a
cluster? Maybe. Let me focus on the clusters with a decent network.

The decent-network clusters suffer from a number of problems. The biggest
problems for this list:

0) Yesterday I read in the newspaper that another Iranian scientist was
killed by a car bomb. Over the past few years I have really missed
experts posting in here, while some dorks who really have nothing to
contribute to the cluster world and are just here to be here, like
Jonathan Aquilina, come back in return. So experts leave and idiots come
back. This has completely killed this mailing list.

1) The lack of postings by RGB the past few months, especially the ones
where he explains how easy it is to build a nuke, given the right
ingredients, which gives interesting discussions.

Let's look at clusters:

10) The lack of software support for clusters.

This is the real big issue. Sure, you can get expensive commercial
software to run on clusters, but that's all interesting just for
scientists. Which game can effectively use cluster hardware and is dirt
cheap? This really is a big issue. Note I intend to contribute there
myself to change that, but that's just 1 person, of course - not an
entire market moving there.

11) The huge break-even point of using cluster hardware.

I can give examples. I sat here at home with, next to me, Don Dailey, the
programmer of Cilkchess, which used Cilk from Leiserson. We played Diep
on a single CPU against Cilkchess on a single CPU, and Cilkchess got
totally toasted. After having been fried for 4 consecutive games, Don had
enough of it and disconnected the connection to the cluster, from which
he had used 1 CPU for the games, and started to play with a version on
his laptop which did NOT use Cilk - so no parallel framework. It was a
factor of 40 faster.

Now note that at tournaments they showed up with 500 or even 1800 CPUs,
yet you can't have a cluster of 1800 CPUs at home. Usually building a
4-socket box is far easier, though not necessarily cheaper, and in
practice faster than a small cluster. AMD especially has a bunch of cheap
4-socket solutions in the market; if you buy those second-hand, there is
not really any competition from 4-socket clusters in the same price
range.

100) The huge increase lately in the power consumption of machines.
Up to 2002 I used to visit someone, Jan Louwman, who had 36 computers at
home for testing chess programs. So that wasn't a cluster, just a bunch
of machines; back then we connected sets of 2 machines with a special
cable to play machines against each other.

Nearly all of those machines drew 60-100 watts or so. He had divided his
computers over 3 rooms, with the majority in 1 room. There, the 16 amp @
230 volt power plug already had problems supplying that amount of
electricity: around the plug, the wall and the plastic of the plug itself
were completely blackened and burned. As there was only a single P4
machine among the computers, only 1 box really consumed a lot of power.

Try to run 36 computers at home nowadays. Most machines are well over 250
watts, and the fastest 2 machines I've got here eat 410 and 270 watts
respectively. That's excluding the video card in the 410-watt machine
(an AMD HD 6970), which is currently out of it, as the box has been set
up for GPGPU. 36 machines eat way, way too much power. This is a very
simple practical problem that one shouldn't overlook.

It's not realistic that the average Joe sets up a cluster of more than 2
machines for his favorite gaming program. A 2-machine cluster will never
beat a 2-socket machine, except when each node also has 2 sockets. So
clustering simple home computers together isn't really useful unless you
really cluster together half a dozen or more. Half a dozen machines,
using the 250-watt measure plus another 25 watts for each card and 200
watts for the switch, will eat 6 * 275 + 200 = 1850 watts. You really
need diehards for that. They are out there, and more of them than you and
I would guess, but they need SOFTWARE that interests them and that can
use the cluster efficiently, clearly proven to work well and easy to
install - which refers back to point 11.

101) Most people like to buy new stuff. New cluster hardware is very
expensive for more than 2 computers, as it needs a switch. Second-hand
it's a lot cheaper, sometimes even dirt cheap, but that's already not
what most people like to do.

110) Linux had a few setbacks and got less attractive. When we had Red
Hat at the end of the 90s with X Windows, it was slowly improving a lot.
Then x64 arrived with a big bang, and we went back years and years to
X.org. X.org threw Linux back 10 years in time: it eats massive RAM, it's
ugly, bad, slow, difficult to configure, and so on. Basically, there
aren't many good distributions now that are free. As most clusters work
well only under Linux, the difficulty of using Linux should really be
factored in. Have a problem under Linux? Then forget it as a normal user.

Now, for me Linux got MORE attractive, as I get hacked totally silly by
every consultant on this planet who knows how to hack on the internet,
but that's not representative of those with cash who can afford a
cluster. Note I don't fall into the cash group; my total income in 2011
was really little.

111) Usually the big cash to afford a cluster belongs to people with a
good job, or a tad older - usually a different group from the group that
can work with Linux. See the previous points for that.

Despite all that, I believe clusters will get more popular in the future,
for a simple reason: processors don't really clock higher anymore. So all
software that can use additional calculation power is already being
parallelized, or has already been parallelized. It's a matter of time
before some of those applications also work well on cluster hardware.
Yet this is a slow process, and it really requires software that works
very efficiently on a small number of nodes.

As an example of why I feel this will happen, I give you the popularity
among gamers of running 2 graphics cards connected to each other via a
bridge within 1 machine. The important factor there is that the games
really profit from doing that.

On Jan 12, 2012, at 4:35 PM, Lux, Jim (337C) wrote:
> I think that's for a variety of reasons.
>
> The cluster world has changed. Back 15-20 years ago, clusters were new,
> novel, and pretty much roll-your-own, so there was a lot of traffic on
> the list about how to do that. [...]
From deadline at eadline.org Thu Jan 12 11:49:25 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Thu, 12 Jan 2012 11:49:25 -0500
Subject: [Beowulf] A cluster of Arduinos

snip
> For my own work, I'd rather have people who are interested in solving
> problems by ganging up multiple failure-prone processors, rather than
> centralizing it all in one monolithic box (even if the box happens to
> have multiple cores).

This is going to be an exascale issue, i.e., how to compute on a system
whose parts might be in a constant state of breaking. Another interesting
question is: how do you know you are getting the right answer on a
*really* large system?
Of course, I spend much of my time optimizing really small systems.

--
Doug

From diep at xs4all.nl Thu Jan 12 11:58:32 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 17:58:32 +0100
Subject: [Beowulf] Adding 1 point

There is another reason I should add for what really made small clusters
at home less attractive: the rise of cheap multi-socket machines.

A 2-socket machine is not so expensive anymore nowadays. So if you want
to go faster than 1 socket, you buy a 2-socket machine. If you want to go
faster than that, 4 sockets are there. That choice wasn't easily
available before the end of the 90s, and in the 21st century it has
become cheap.

Another delaying factor is the rise of so many cores per node. AMD and
Intel sell CPUs for their 4-socket lines with up to double the number of
cores you can have in a single-socket box, so a 4-socket machine is
nearly equivalent to 8 single-socket nodes, be it low-clocked. For that
reason clusters tend to get more effective only at a dozen nodes or more,
assuming cheap single-socket nodes.

From ellis at runnersroll.com Thu Jan 12 12:26:01 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 12:26:01 -0500
Subject: [Beowulf] A cluster of Arduinos
Message-ID: <4F0F17A9.7010400@runnersroll.com>

On 01/12/2012 10:21 AM, Vincent Diepeveen wrote:
> In this specific case you'll just scare away students, and the really
> clever ones will lose all interest because you are busy with lame-duck
> CPUs.

You have made it abundantly clear you aren't interested in enrolling in
such a course. Thanks for your comments.

On a related note, as I was thinking about 'lame duck' education, I
remembered that I took an undergraduate machine learning course in which
we designed players for Connect Four, which would compete using recently
learned techniques against other students' players in the class.
Despite that particular game being a solved one, we all had a blast and
got quite competitive trying to beat each other using the newly acquired
skills. I would encourage Jim to do something similar once the basics of
cluster administration are done - perhaps a mini SC Cluster Competition
would be a neat application for the Arduinos?

Best,

ellis

From ellis at runnersroll.com Thu Jan 12 12:35:11 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 12:35:11 -0500
Subject: [Beowulf] Robots
Message-ID: <4F0F19CF.2050603@runnersroll.com>

On 01/12/2012 10:56 AM, Vincent Diepeveen wrote:
> I didn't comment on the other completely wrong examples, but I want to
> highlight one. Your example of a Lego robot actually disproves your
> statement.

It was a price comparison, and without diving into the nitty-gritty of
how good or bad both the Arduino and the Mindstorms are in their
respective areas, it was spot on. Jim wants to give each student a
10-node cluster on the cheap (i.e., 20 to 30 bucks per node = 300 bucks);
universities want to give each student (or teams of students, sometimes)
a robot (~280). Both provide an approachable level of difficulty and
potential for education at a reasonable price.

Feel free to continue to disagree for the sake of disagreeing. It was
just an example.

Best,

ellis

From james.p.lux at jpl.nasa.gov Thu Jan 12 12:54:52 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 09:54:52 -0800
Subject: [Beowulf] A cluster of Arduinos

On Thursday, January 12, 2012 8:49 AM, Douglas Eadline wrote:
> This is going to be an exascale issue, i.e., how to compute on a system
> whose parts might be in a constant state of breaking. Another
> interesting question is: how do you know you are getting the right
> answer on a *really* large system?
> Of course, I spend much of my time optimizing really small systems.

Your point about scaling is well taken. So far, the computing world has
largely dealt with things by trying to make the processor perfect and
error-free. Some limited areas of error correction are popular (RAM).

But think in a bigger area: say your arithmetic unit has some infrequent
unknown errors (e.g., the FDIV bug on the Pentium). Could clever
algorithm design and multiple processors (or multiple cores) mitigate
this? E.g., instead of just computing Z = X/Y, you also compute
Z1 = (X*2)/(Y*2) and compare answers. That exact example's not great,
because you've added 2 operations, but I can see that there are other
clever techniques that might be possible.

What is nice is if you can do things like temporal redundancy (do the
calculation twice, and if the results differ, do it a third time), or,
even better, some sort of "check calculation" that takes a small time
compared to the mainline calculation.

This, I think, is somewhere that even the big iron/cluster folks could be
doing some research. What are the optimum communication fabrics to
support this kind of "side calculation," which may have different
communication patterns and data flow than the "mainline"? It has a
parallel in things like CRC checks in communications protocols. A lot of
hardware has a dedicated little CRC checker that is continuously
calculating the CRC as the bits arrive, so that when you get to the end
of the frame, the answer is already there.

And Doug, your small systems have a lot of the same issues, perhaps
because that small Limulus might be operated in environments other than
what the underlying hardware was designed for. I know people who have
been rudely surprised when they found that the design environment for a
laptop is a pretty narrow temperature range (e.g., an office desktop),
and when they put one in a car, subject to 0 C or 40 C temperatures, if
not wider, things don't work quite as well as expected.

Very small systems (a few nodes) have the same issues in some
environments - e.g., a cluster subject to single-event upsets or
functional interrupts in a high-radiation environment with a lot of
high-energy charged particles; it's not so much a total-dose thing as an
SEE thing. For Juno (which will be in polar orbit around Jupiter), we
shielded everything in a vault (a 1-meter cube with 1 cm thick titanium
walls), and still it's an issue. We don't get very long before everything
is cooked.

And I think that with a non-trivially small cluster (more than 4 nodes, I
think) you could do a lot of experimentation on techniques. (Oddly,
simulated fault injection is one of the trickier parts.)
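Since the thread is about what students could experiment with: the "do it
twice, vote on a third" idea above fits in a dozen lines of C. A minimal
sketch (mine, purely illustrative - the function name and the choice of
"second form" are made up, not from any real fault-tolerance library):

#include <stdio.h>

/* Compute x/y twice in algebraically different forms.  Scaling both
 * operands by a power of two is exact in IEEE-754 (ignoring overflow at
 * the extremes), so on fault-free hardware the two runs agree
 * bit-for-bit; any difference flags a transient fault.  A sketch, not a
 * hardened implementation. */
static double checked_div(double x, double y)
{
    double z1 = x / y;                  /* plain evaluation             */
    double z2 = (2.0 * x) / (2.0 * y);  /* algebraically equivalent run */

    if (z1 == z2)
        return z1;                      /* both runs agree: accept      */

    /* Disagreement: a fault hit one of the runs.  Evaluate a third form
     * and side with whichever earlier answer it reproduces - a majority
     * vote of three. */
    double z3 = (0.5 * x) / (0.5 * y);
    return (z3 == z1) ? z1 : z2;
}

int main(void)
{
    printf("%.15g\n", checked_div(355.0, 113.0));
    return 0;
}

The research-worthy part is which cheap "second forms" actually catch the
faults a given piece of hardware produces, and whether the vote runs on
the same core, a neighboring core, or another node - which is exactly
where the communication-fabric question above comes in.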
From ellis at runnersroll.com Thu Jan 12 12:55:41 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 12:55:41 -0500
Subject: [Beowulf] List traffic
Message-ID: <4F0F1E9D.9000800@runnersroll.com>

I really should be following Joe's advice circa 2008 and just not
responding, but I can't help myself.

On 01/12/2012 11:45 AM, Vincent Diepeveen wrote:
> The biggest problems for this list:
> 1) The lack of postings by RGB the past few months, especially the ones
> where he explains how easy it is to build a nuke, given the right
> ingredients, which gives interesting discussions.

The last post from RGB was a long, long discussion about how very wrong
you were about RNGs. You just don't get it. It's okay to be wrong once in
a while, Vincent, and even more so to just agree to disagree. Foolish,
unedited, and inflammatory diatribes with an unnatural dose of newlines
are what is killing this list, and are what the blog I referenced was
specifically disappointed with.

So please, I'm begging you: stop writing huge emails that trail off from
their original point. Try to say things in a non-inflammatory manner. Use
spell check, and try to read your emails once before sending them. And
last of all, remember that there are many people on this list who have
all sorts of different applications - not just chess. Your experience
does not generalize well to all areas.

Speaking of which, for anyone interested in doing serious work with
low-power processors, please see the paper named FAWN for an excellent
example of use cases where low-hertz, low-power processors can do some
great work. It's by David Andersen of CMU. I was lucky enough to be
invited to the CMU PDL retreat a few months back and had a nice
conversation about the project when we went for a run together. There are
some use cases that benefit massively from that kind of architecture.

Best,

ellis

From james.p.lux at jpl.nasa.gov Thu Jan 12 13:10:24 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 10:10:24 -0800
Subject: [Beowulf] A cluster of Arduinos

On Thursday, January 12, 2012 9:26 AM, Ellis H. Wilson III wrote:
> You have made it abundantly clear you aren't interested in enrolling in
> such a course. Thanks for your comments.
> On a related note, as I was thinking about 'lame duck' education, I
> remembered that I took an undergraduate machine learning course in
> which we designed players for Connect Four, which would compete using
> recently learned techniques against other students' players in the
> class. Despite that particular game being a solved one, we all had a
> blast and got quite competitive trying to beat each other using the
> newly acquired skills. I would encourage Jim to do something similar
> once the basics of cluster administration are done - perhaps a mini SC
> Cluster Competition would be a neat application for the Arduinos?

Ooohh.. that sounds *very* cool.

A bunch of slow processors. A simple problem to solve (e.g., 3D
tic-tac-toe) for which there might even be published parallel approaches.
The challenge is effectively using the limited system, warts and all. The
Raspberry Pi might be a better vehicle, if it hits the price/availability
targets: comparable to Arduinos in price, but a bit more sophisticated
and less contrived.

We've been talking about what kinds of software competitions JPL could
run as a recruiting tool at universities, and that's along those lines.
Hmm... I wonder if they'd be willing to spend recruiting funds on that?
(Probably not - we're all poor this fiscal year.)

And, on the undergrad education thing: at UCLA, I had to write stuff in
MIXAL to run on a simulated MIX machine and complained mightily to the
TAs, who just pointed to the sacred texts of Knuth rather than giving an
intelligent response as to why we didn't do something like work in PDP-11
assembly or System/360 BAL. (UCLA at the time had a monster 360, but I
don't know that they had many 11s, and realistically, BAL is not
something I'd inflict on second-quarter first-year students. We were a
PL/I or PL/C shop in the first couple of years' classes for the most
part, although there were people doing Algol.)

OTOH, I suspect I was an atypical incoming student for 1977. I had, the
previous year, done the Pascal courses at UCSD with p-machines running on
LSI-11s, as well as the Pascal system on the big Burroughs B6700, which
uses a form of Algol as its machine language and is a stack machine to
boot (how cool is that? Burroughs always did have cool machines - hey,
they built ILLIAC IV). I had also done some assembly on an 11/20 under
RT-11.

I guess that's characteristic of the differences in philosophy between
different CS departments. (UCSD was heading more in the direction of
software engineering being part of the School of Engineering and Applied
Sciences, while at UCLA it was part of the Math department. Little did I
know, as a cybernetics major, what the difference was: it sure as heck
isn't manifested in the course catalog, at least in a form that an
incoming student could discern. Going back now, I could probably look at
catalogs from the various universities of the era and divine their
philosophies, but that's clearly 20/20 hindsight.)
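Coming back to the competition idea: the first parallel approach students
would likely discover is plain root splitting - give each node a strided
slice of the legal first moves, search the slices independently, and
reduce with a max. A self-contained C sketch (mine, purely illustrative:
ordinary 3x3 tic-tac-toe stands in for the 3D game, and the N "nodes" are
simulated in one process; on real hardware the inner loop runs per node
and the reduce is a gather over whatever link the cluster has):

#include <stdio.h>

static int lines[8][3] = {
    {0,1,2},{3,4,5},{6,7,8},{0,3,6},{1,4,7},{2,5,8},{0,4,8},{2,4,6}
};

/* +1 if 'side' has three in a row, -1 if the opponent does, else 0. */
static int winner(const int b[9], int side)
{
    for (int i = 0; i < 8; i++) {
        int s = b[lines[i][0]] + b[lines[i][1]] + b[lines[i][2]];
        if (s ==  3 * side) return  1;
        if (s == -3 * side) return -1;
    }
    return 0;
}

/* Exhaustive negamax: value of the position for the side to move. */
static int negamax(int b[9], int side)
{
    int w = winner(b, side);
    if (w) return w;
    int best = -2, moved = 0;
    for (int m = 0; m < 9; m++) {
        if (b[m]) continue;
        moved = 1;
        b[m] = side;
        int score = -negamax(b, -side);  /* negamax sign flip */
        b[m] = 0;
        if (score > best) best = score;
    }
    return moved ? best : 0;             /* board full: draw  */
}

int main(void)
{
    const int N = 3;                     /* pretend we have 3 nodes */
    int board[9] = {0};
    int best_move = -1, best_score = -2;

    for (int rank = 0; rank < N; rank++) {      /* each "node"...      */
        for (int m = rank; m < 9; m += N) {     /* ...its move slice   */
            board[m] = 1;
            int score = -negamax(board, -1);
            board[m] = 0;
            if (score > best_score) { best_score = score; best_move = m; }
        }
    }                                           /* max-reduce complete */
    printf("best opening move: %d (score %d)\n", best_move, best_score);
    return 0;
}

Root splitting like this is embarrassingly parallel, so it makes a gentle
first lab. Load imbalance between the slices and sharing search state
between nodes are where it stops being easy - which is arguably the
lesson.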
From james.p.lux at jpl.nasa.gov Thu Jan 12 13:22:26 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 10:22:26 -0800
Subject: [Beowulf] FAWN

Fast Array of Wimpy Nodes:

http://www.cs.cmu.edu/~fawnproj/

Very cool stuff. Their original motivation (reduction of power) is at a
much larger scale than my work usually operates at (they're talking
megawatts in Google-ish clusters; I worry about watts derived from solar
panels and such).

But it's a whole 'nother twist on the idea of clustering low-performance
nodes (by some metric - they've got good nanojoules-per-operation
numbers). And they're doing a very clever thing where they work with the
very asymmetric read/write speeds of flash memory. (And flash memory is
something I spend a lot of time thinking about these days; it's what we
use in space for NVRAM.)

Looks like I've got some reading for the holiday weekend.

From ellis at runnersroll.com Thu Jan 12 13:26:26 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 13:26:26 -0500
Subject: [Beowulf] FAWN
Message-ID: <4F0F25D2.90305@runnersroll.com>

On 01/12/2012 01:22 PM, Lux, Jim (337C) wrote:
> But it's a whole 'nother twist on the idea of clustering low-performance
> nodes (by some metric - they've got good nanojoules-per-operation
> numbers).

Not just good; from a sorting perspective, the /best/:

http://sortbenchmark.org/
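The flash trick Jim mentions is easy to picture: random writes are what
flash is bad at, so a FAWN-style datastore turns every write into an
append to a sequential log and keeps a small in-memory index, so a read
costs one seek. A toy sketch in C (my own illustration of the idea, not
FAWN's actual code; the fixed-size table, linear probing, and missing
compaction pass are simplifications):

#include <stdio.h>
#include <string.h>

/* Toy append-only key-value store: writes always go to the end of the
 * log file (sequential, flash-friendly); an in-memory table maps each
 * key to the offset of its latest record.  Updating a key appends a
 * newer record; the stale one becomes garbage for a later compaction
 * pass (not shown). */

#define TABLE_SIZE 1024
#define KEY_LEN    16
#define VAL_LEN    32

struct slot { char key[KEY_LEN]; long offset; int used; };

static struct slot table[TABLE_SIZE];
static FILE *log_fp;

static unsigned hash(const char *k)
{
    unsigned h = 5381;
    while (*k) h = h * 33 + (unsigned char)*k++;
    return h % TABLE_SIZE;
}

static struct slot *find(const char *key)
{
    unsigned i = hash(key);
    while (table[i].used && strcmp(table[i].key, key) != 0)
        i = (i + 1) % TABLE_SIZE;          /* linear probing */
    return &table[i];
}

static void put(const char *key, const char *val)
{
    char rec[KEY_LEN + VAL_LEN] = {0};
    strncpy(rec, key, KEY_LEN - 1);
    strncpy(rec + KEY_LEN, val, VAL_LEN - 1);

    fseek(log_fp, 0, SEEK_END);            /* append: sequential write */
    struct slot *s = find(key);
    strncpy(s->key, key, KEY_LEN - 1);
    s->offset = ftell(log_fp);             /* remember where it landed */
    s->used = 1;
    fwrite(rec, sizeof rec, 1, log_fp);
}

static int get(const char *key, char *val_out)
{
    struct slot *s = find(key);
    if (!s->used) return 0;
    fseek(log_fp, s->offset + KEY_LEN, SEEK_SET);   /* one random read */
    return fread(val_out, VAL_LEN, 1, log_fp) == 1;
}

int main(void)
{
    char v[VAL_LEN];
    log_fp = fopen("toy.log", "w+b");
    if (!log_fp) return 1;
    put("alpha", "one");
    put("alpha", "two");       /* update = append a newer record */
    if (get("alpha", v)) printf("alpha = %s\n", v);
    fclose(log_fp);
    return 0;
}

Updating a key just appends and repoints the index entry, which is why
these stores get sequential-write speed out of cheap flash.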
email: landman at scalableinformatics.com
web : http://scalableinformatics.com http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From james.p.lux at jpl.nasa.gov Thu Jan 12 14:08:38 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 11:08:38 -0800
Subject: [Beowulf] FAWN
In-Reply-To: <4F0F25D2.90305@runnersroll.com>
References: <4F0F25D2.90305@runnersroll.com>
Message-ID:

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Ellis H. Wilson III
Sent: Thursday, January 12, 2012 10:26 AM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] FAWN

On 01/12/2012 01:22 PM, Lux, Jim (337C) wrote:
> But it's a whole 'nother twist on the idea of clustering of low
> performance nodes (by some metric.. they've got good nanojoule/operation
> metrics).
>

Not just good, from a sorting perspective, /best/: http://sortbenchmark.org/
-------------

I was thinking that their low powered nodes are poor from an absolute performance standpoint (i.e. MIPS), but actually quite good on a computation-work-per-joule basis.

Yes, for sorting, they are kicking rear.

This is interesting, but when you start talking power consumption, you need to be careful about where you draw the boundaries and what's "in the system". Do you count conversion efficiency in the power supply? At one level, you say, no, just worry about DC power consumption, but even there.. is it at the board edge, or at the chip? Something drawing 100 Amps at 0.5V is a very different beast from something drawing 10 Amps at 5V, and you can't locally optimize too far because your choices inside Box A start to affect the design and performance of Box B and Box C.

The contest rules point to a variety of power measurement systems, but based on what I see there, I think there's some scope for "gaming" the system. It sort of seems it's "wall plug power", but then, they do allow DC power systems. For instance, one could tune the power supply for the expected load conditions.. You could run those fans at warp speed before the test run starts to cool down as much as possible, and then slow them down (saving power) during the run, maybe even letting the processor get pretty hot.

Sort of like running a top fuel dragster. Only has to go fast for 3 or 4 seconds, so why bother putting in a water pump.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From ellis at runnersroll.com Thu Jan 12 14:40:15 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 14:40:15 -0500
Subject: [Beowulf] FAWN
In-Reply-To:
References: <4F0F25D2.90305@runnersroll.com>
Message-ID: <4F0F371F.2060704@runnersroll.com>

On 01/12/2012 02:08 PM, Lux, Jim (337C) wrote:
> -----Original Message-----
> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Ellis H.
Wilson III
> Sent: Thursday, January 12, 2012 10:26 AM
> To: beowulf at beowulf.org
> Subject: Re: [Beowulf] FAWN
>
> On 01/12/2012 01:22 PM, Lux, Jim (337C) wrote:
>> But it's a whole 'nother twist on the idea of clustering of low
>> performance nodes (by some metric.. they've got good
>> nanojoule/operation metrics).
>>
>
> Not just good, from a sorting perspective, /best/:
> http://sortbenchmark.org/
> -------------
>
> I was thinking that their low powered nodes are poor from an absolute
> performance standpoint (i.e. MIPS), but actually quite good on a
> computation-work-per-joule basis.
>
> [snip]

All fair points, and I can't contest the suggestion that they likely tune their algorithm and physical units very highly to perform well for this sorting environment. Dave actually keeps a pretty balanced perspective when discussing this, as shown in his reaction to Google talking down wimpy nodes. Wired has a nice article on it, with a link inside to Google's pub that discusses the other half of the coin: http://www.wired.com/wiredenterprise/2012/01/wimpy_nodes/

Some more reading material for the weekend ;).

ellis

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From landman at scalableinformatics.com Thu Jan 12 15:45:16 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Thu, 12 Jan 2012 15:45:16 -0500
Subject: [Beowulf] Partial OT: CPU grouping control for Windows 2008 R2 x64 server for big calcs
Message-ID: <4F0F465C.4010301@scalableinformatics.com>

Ok, this one is fun. For some definitions of fun. Unusual definitions of fun... And there is a question towards the end.

This is for folks who've been administering clusters and HPC systems with big Windows machines (32+ CPUs and large RAM). Imagine you have a machine as part of a very loose computing cluster. The end user wants to run Windows (2008 R2 x64 Enterprise) on it. This machine has 32 processor cores (real ones, no hyperthreading) and 1TB of RAM. Yeah, it's a fun machine to work on.
I won't discuss the OS choice here. You can see some of my playing with it here: http://scalability.org/?p=3541 and http://scalability.org/?p=3515

Windows machines can let up to 64 logical processors be part of a "group". A group is a scheduling artifice, and not necessarily directly related to the NUMA system ... think of it as a layer of abstraction above it. Ok, still with me?

This scheduling artifice, these groups, requires at minimum a recompilation to work with properly. It's actually more than that: they require some additional processor affinity bits to be handled. If you have a code which doesn't handle this correctly, it will probably crash. Or not work well. Or both. Matlab appears to be such a beast.

This isn't necessarily a Matlab issue per se; it appears to be something of a design compromise in Windows. Windows wasn't designed with large processor counts in mind. The changes they'd need to make in order to enable a single large spanning entity across all CPUs at once are quite likely not in the company's best interests, as there are very few customers with such machines. Still with me?

Here's the problem. Matlab seems to crash (according to the user) if run on a unit with more than one group. I've not been able to verify on the machine yet myself, but I have no reason to disbelieve this. The issue as it's been stated to me is that if there is more than one group of processors, Matlab crashes. This is the symptom.

When the unit boots by default, we have two 16-processor groups. So looking at bcdedit examples, I see how to turn off groups. One minor problem. It doesn't work. I can do a

  bcdedit /set groupaware off

and reboot, which should completely disable groups, so that all 32 processors are in one group. Still 2 groups. I can do a

  bcdedit /set groupsize 64

and reboot. Still 2 groups. So far, the only thing that seems to change this is if I install the Hyper-V role. With that, there is now 1 group. Looking at all the boot options with bcdedit /enum, there's only one config for boot, and it's the default.

So ... my questions:

1) Does Windows really ignore its approximate equivalent of boot options on a grub line?
2) Is there any way to compel Windows to do the right thing?

As noted, this is for a computing cluster. Our recommended OS isn't feasible right now for them and their application. Definitely annoying. I'd love there to be a BIOS setting to help Windows past its desire to ignore my requested number of groups. Not sure if adding in the Hyper-V role will impact performance (did some base testing with Scilab to see, and I didn't see anything I'd call significant). Will be bugging Microsoft about this as well (pretty obviously a bug in 2008 R2 x64).

And related to this, I read something about limits in the different Windows editions. Is anyone using Windows HPC cluster on big memory machines with lots of cores? Looking at the Microsoft docs, they indicate some relatively low limits on RAM and processor count. So does this mean that they won't be supporting Interlagos 4-socket machines (16 cores per socket) with 1/2 TB of RAM in compute nodes for Windows HPC? I am just imagining someone buying a few of those nodes and being required to buy Enterprise or Datacenter licenses for those machines (which clearly would not be used for anything more than HPC).
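For anyone poking at the same thing, a minimal sketch (mine, not anything from Matlab or Microsoft support) that reports how the scheduler actually grouped the processors, using only the stock Win32 topology calls available on Windows 7 / Server 2008 R2 and later. Run it before and after fiddling with bcdedit to see whether the setting took:

/* groups.c -- print the processor-group layout Windows built at boot.
   Needs Windows 7 / Server 2008 R2 or later.  Build:  cl groups.c    */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    WORD groups = GetActiveProcessorGroupCount();
    printf("active processor groups: %u\n", groups);
    for (WORD g = 0; g < groups; g++)
        printf("  group %u: %lu logical processors\n",
               g, GetActiveProcessorCount(g));
    printf("total: %lu logical processors\n",
           GetActiveProcessorCount(ALL_PROCESSOR_GROUPS));
    return 0;
}

A group-unaware process gets assigned to a single group, which would explain an application only seeing 16 of the 32 cores here; a group-aware code would call SetThreadGroupAffinity per thread to span groups.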
-- Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From samuel at unimelb.edu.au Fri Jan 13 00:36:50 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Fri, 13 Jan 2012 16:36:50 +1100
Subject: [Beowulf] FAWN
In-Reply-To: <4F0F25D2.90305@runnersroll.com>
References: <4F0F25D2.90305@runnersroll.com>
Message-ID: <4F0FC2F2.5090606@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 13/01/12 05:26, Ellis H. Wilson III wrote:

> Not just good, from a sorting perspective, /best/:
> http://sortbenchmark.org/

But that algorithm isn't running on exactly wimpy hardware..

Intel Core i5-2400S 2.5 GHz, 16GB RAM and a bunch of SSDs

cheers!
Chris

- -- Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8PwvIACgkQO2KABBYQAh84cgCfQZN1ZpKfzxLmazCiZLg93n89
dwYAoIZHAFmUYENP2xwMwo5M3xile4F3
=4lFT
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 13 09:01:59 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 13 Jan 2012 15:01:59 +0100
Subject: [Beowulf] Robots
In-Reply-To: <4F0F19CF.2050603@runnersroll.com>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com> <95F43560-B9AA-4B56-9B32-1A1979461B07@xs4all.nl> <4F0F19CF.2050603@runnersroll.com>
Message-ID: <01D34971-9054-4F19-9776-8F107B118A1D@xs4all.nl>

On Jan 12, 2012, at 6:35 PM, Ellis H. Wilson III wrote:

> On 01/12/2012 10:56 AM, Vincent Diepeveen wrote:
>> On Jan 12, 2012, at 2:58 PM, Ellis H. Wilson III wrote:
>>> I think this is likely the reason why many
>>> introductory engineering classes incorporate use of Lego Mindstorm
>>> robots rather than lunar rovers (or even overstock lunar rovers :D).
>>
>> I didn't comment on the other completely wrong examples, but I want to
>> highlight one. Your example of a Lego robot is actually disproving your
>> statement.
>
> It was a price comparison, and without diving into the nitty-gritty
> of how good or bad both the Arduino and the Mindstorms are in their
> respective areas, it was spot on. Jim wants to give each student a
> 10 node cluster on the cheap (i.e. 20 to 30 bucks per node = 300
> bucks), universities want to give each student (or teams of
> students sometimes) a robot (~280). Both provide an approachable
> level of difficulty and potential for education at a reasonable price.
>
> Feel free to continue to disagree for the sake of such. It was
> just an example.
>
> Best,
>
> ellis

It's not even spot on. You're light-years away with your comparison.

You're comparing one of the best mass-produced robots available with some freak thing for which there are 100 alternatives that work far better: alternatives that are 500x faster, cheaper if you want them to be, and above all better at the original goal of demonstrating SMP programming, since the freak hardware, thanks to its very low-clocked CPU, has negligible latency to the other CPUs.

Where the robot shows you how to work with robots, the educational purpose Jim wrote down is not served very well by the embedded CPUs, as the equipment has none of the typical problems you encounter in a normal SMP system, let alone a cluster environment; meanwhile it has totally different problems, which you will never encounter on real CPUs. Such as: embedded CPUs have severely limited caches and can execute just one instruction at a time. Embedded programming is totally different from CPU programming, and embedded latencies, thanks to the slow processor speed, are not even comparable with SMP programming between the cores of one CPU.

Such a multicore box definitely has a cost below $300. On eBay I see nodes with 8 cores for $200. And those are 500x faster. Myself, I'm looking at some socket 771 Xeon machines, say with an L5420. Though they eat a lot more power than Intel claims, it's still, I guess, 170 watts a machine or so under full load.

Note we still skipped the algorithmic discussion, as from an algorithmic viewpoint, if I look at artificial intelligence, getting something to work on 70MHz machines is going to behave totally differently and needs a totally different approach than today's hardware. It's not even in the same ballpark.

Vincent

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From ntmoore at gmail.com Fri Jan 13 09:33:33 2012
From: ntmoore at gmail.com (Nathan Moore)
Date: Fri, 13 Jan 2012 08:33:33 -0600
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References: <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl>
Message-ID:

Jim,

Have you ever interacted with the "Modeling Instruction" folks over at ASU? http://modeling.asu.edu/

They've done, for HS Physics, more or less what you're talking about in terms of making the subject engaging, compelling, and driven by student, not teacher, interest.

On Thu, Jan 12, 2012 at 9:10 AM, Lux, Jim (337C) wrote:
>
>
> On 1/12/12 6:39 AM, "Vincent Diepeveen" wrote:
>
>> The average guy is not interested in knowing all the details regarding
>> how to play tennis with a wooden racket from the 1980s, just around
>> the time when McEnroe was on the tennis court playing there.
>>
>> Most people are more interested in whether you can win that grand slam
>> with what you produce.
>>
>> The nerds however are interested in how well you can do with a wooden
>> racket from the 1980s; therefore projecting your own interest upon those
>> students will just get them disinterested and you will be judged by them
>> as an irrelevant person in their life, whose name they soon forget.
>>
>
> Having spent some time recently in Human Resources meetings about how to
> better recruit software people for JPL, I'd say that something that
> appeals to nerds and gives them something to do is not all bad. Part of
> the educational process is to find and separate the people who are
> interested and have a passion. I'm not sure that someone who starts
> getting into clusters mostly because they are interested in breaking into
> the Top500 is the target audience in any case.
>
> If you look over the hobby clusters out there, the vast majority are "hey,
> I heard about this interesting idea, I scrounged up N old/small/slow/easy
> to find computers and tried to cluster them and do something. I learned
> something about cluster administration, and it was fun, but I don't use it
> anymore"
>
> This is exactly the population you want to hit. Bring in 100 advanced
> high school (grade 11-12 in US) students. Have them all use cheap
> hardware to do a cluster. Some fraction will think, "this is kind of
> cool, maybe I should major in CS instead of X" Some fraction will think,
> "how lame, why not make the single processor faster", and they can be
> CompEng or EE majors looking at how to reduce feature sizes and get the
> heat out.
>
> It's just like biology or chemistry classes. In high school biology
> (9th/10th grade) most of it is mundane memorization (Krebs cycle, various
> descriptive stuff. Other than the use of cheap cmos cameras, microscopes
> used at this level haven't really changed much in the last 100 years (and
> the microscopes at my kids' school are probably 10-20 years old). They
> also do some more modern molecular biology in a series of labs partly
> funded by Amgen: some recombinant DNA to put fluorescent proteins in a
> bacteria, running some gels, etc. The vast majority of the students will
> NOT go on to a career in biology, but some fraction do, they get
> interested in some aspect, and they wind up majoring in bio, or being a
> pre-med, etc.
>
> Not everyone is looking for the world beater. A lot of kids start with
> Kart racing, even though even the fastest Karts aren't as fast as F1 (or
> even a Smart Car). How many engineers started with dismantling the
> lawnmower engine?
>
> For my own work, I'd rather have people who are interested in solving
> problems by ganging up multiple failure prone processors, rather than
> centralizing it all in one monolithic box (even if the box happens to have
> multiple cores).
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- - - - - - - - - - - - - - - - - - - - - -
Nathan Moore
Associate Professor, Physics
Winona State University
- - - - - - - - - - - - - - - - - - - - -

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From deadline at eadline.org Fri Jan 13 09:38:28 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Fri, 13 Jan 2012 09:38:28 -0500
Subject: [Beowulf] FAWN
In-Reply-To: <4F0FC2F2.5090606@unimelb.edu.au>
References: <4F0F25D2.90305@runnersroll.com> <4F0FC2F2.5090606@unimelb.edu.au>
Message-ID: <4c1a72f2e8097c3585e495e1c95bfa2c.squirrel@mail.eadline.org>

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 13/01/12 05:26, Ellis H. Wilson III wrote:
>
>> Not just good, from a sorting perspective, /best/:
>> http://sortbenchmark.org/
>
> But that algorithm isn't running on exactly wimpy hardware..
>
> Intel Core i5-2400S 2.5 GHz, 16GB RAM and a bunch of SSDs

I can vouch for the i5-2400S processors, one of the best values out there; I got 200 GFLOPS on a Limulus using 4 of these. Some more benchmarks here: http://www.clustermonkey.net//content/view/306/1/

-- Doug

> cheers!
> Chris
> - -- Christopher Samuel - Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.unimelb.edu.au/
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAk8PwvIACgkQO2KABBYQAh84cgCfQZN1ZpKfzxLmazCiZLg93n89
> dwYAoIZHAFmUYENP2xwMwo5M3xile4F3
> =4lFT
> -----END PGP SIGNATURE-----
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
> -- This message has been scanned for viruses and dangerous content by
> MailScanner, and is believed to be clean.

-- Doug

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From deadline at eadline.org Fri Jan 13 10:18:02 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Fri, 13 Jan 2012 10:18:02 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References:
Message-ID: <6ea096464afff4fe7e544e0cd28c5204.squirrel@mail.eadline.org>

> -----Original Message-----
> From: Douglas Eadline [mailto:deadline at eadline.org]
> Sent: Thursday, January 12, 2012 8:49 AM
> To: Lux, Jim (337C)
> Cc: beowulf at beowulf.org
> Subject: Re: [Beowulf] A cluster of Arduinos
>
> snip
>>
>> For my own work, I'd rather have people who are interested in solving
>> problems by ganging up multiple failure prone processors, rather than
>> centralizing it all in one monolithic box (even if the box happens to
>> have multiple cores).
>
> This is going to be an exascale issue, i.e. how to compute on systems
> whose parts might be in a constant state of breaking. Another interesting
> question is how do you know you are getting the right answer on a *really*
> large system?
>
> Of course I spend much of my time optimizing really small systems.
>
> --
>
> Your point about scaling is well taken.. so far, the computing world has
> largely dealt with things by trying to make the processor perfect and
> error free. Some limited areas of error correction are popular (RAM).
> But think in a bigger area... say your arithmetic unit has some infrequent
> unknown errors (e.g. FDIV bug on Pentium)..
> could clever algorithm design and multiple processors (or multi cores)
> mitigate this (e.g. instead of just computing Z = X/Y you also compute
> Z1 = (X*2)/(Y*2).. and compare answers... that exact example's not great
> because you've added 2 operations, but I can see that there are other
> clever techniques that might be possible.. )
>
> What is nice is if you can do things like temporal redundancy (do the
> calculation twice, and if it's different, do it a third time), or even
> better some sort of "check calculation" that takes a small time compared
> to the mainline calculation.
>
> This, I think, is somewhere that even the big iron/cluster folks could be
> doing some research. What are the optimum communication fabrics to support
> this kind of "side calculation", which may have different communication
> patterns and data flow than the "mainline"? It has a parallel in things
> like CRC checks in communications protocols. A lot of hardware has a
> dedicated little CRC checker that is continuously calculating the CRC as
> the bits arrive, so that when you get to the end of the frame, the answer
> is already there.
>
> And Doug, your small systems have a lot of the same issues, perhaps
> because that small Limulus might be operated in environments other than
> what the underlying hardware was designed for. I know people who have
> been rudely surprised when they found that the design environment for a
> laptop is a pretty narrow temperature range (e.g. office desktop) and when
> they put them in a car, subject to 0C or 40C temperatures, if not wider,
> that things don't work quite as well as expected.

I will be curious to see where these things show up since all you really need is a power plug. (a little nervous actually)

> Very small systems (few nodes) have the same issues, in some environments
> (e.g. a cluster subject to single event upsets or functional interrupts in
> a high radiation environment with a lot of high energy charged particles.
> it's not so much a total dose thing, but a SEE thing)
>
> For Juno (which is in polar orbit around Jupiter), we shielded everything
> in a vault (a 1 meter cube with 1cm thick titanium walls) and still it's
> an issue. We don't get very long before everything is cooked.
>
> And I think that with a non-trivially small cluster (e.g. more than 4
> nodes, I think) you could do a lot of experimentation on techniques.

I agree. Four nodes is really small. BTW, the most fun in designing this system is a set of tighter constraints than are found on the typical cluster. Noise, power, space, cabling, low cost packaging, etc. I have been asked about a rack mount version, we'll see.

One thing I find interesting is the core/node efficiency (what I call "effective cores"). In general, *on some codes*, I found that fewer cores (1P micro-ATX 4-cores) is more efficient than many cores (2P server 12-core). Seems obvious, but I like to test things.

> (oddly, simulated fault injection is one of the trickier parts)

I would assume, because in a sense, the black swan* is by definition hard to predict.

(* the book by Nick Taleb, not the movie)
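The "check calculation" idea quoted above is cheap to prototype, by the way. A minimal sketch (untested, with inject() standing in for a hypothetically flaky arithmetic unit, nothing real):

/* tmr.c -- "do it twice, and if they differ do it a third time",
   in miniature.  compute() stands in for the real work; inject()
   models a rare, silent arithmetic fault.                         */
#include <stdio.h>
#include <stdlib.h>

static double compute(double x, double y) { return x / y; }

static double inject(double z)          /* rare random corruption   */
{
    return (rand() % 1000000 == 0) ? z + 1e-6 : z;
}

static double checked_divide(double x, double y)
{
    double a = inject(compute(x, y));
    double b = inject(compute(x, y));
    if (a == b)
        return a;                       /* agreement: accept        */
    double c = inject(compute(x, y));   /* disagreement: tiebreaker */
    return (c == a) ? a : b;            /* two-of-three vote; if all
                                           three differ you have
                                           bigger problems          */
}

int main(void)
{
    double worst = 0;
    for (int i = 1; i <= 10000000; i++) {
        double z = checked_divide(1.0, (double)i);
        double err = z - 1.0 / (double)i;
        if (err < 0) err = -err;
        if (err > worst) worst = err;
    }
    printf("worst residual error: %g\n", worst);
    return 0;
}

Presumably that's the point of the (X*2)/(Y*2) variant: a deterministic, operand-dependent error (like FDIV) gives the same wrong answer twice, so you have to change the operands, not just repeat the calculation on the same unit.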
-- Doug

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Fri Jan 13 11:26:29 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Fri, 13 Jan 2012 08:26:29 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <6ea096464afff4fe7e544e0cd28c5204.squirrel@mail.eadline.org>
Message-ID:

On 1/13/12 7:18 AM, "Douglas Eadline" wrote:

>> And Doug, your small systems have a lot of the same issues, perhaps
>> because that small Limulus might be operated in environments other than
>> what the underlying hardware was designed for. I know people who have
>> been rudely surprised when they found that the design environment for a
>> laptop is a pretty narrow temperature range (e.g. office desktop) and
>> when they put them in a car, subject to 0C or 40C temperatures, if not
>> wider, that things don't work quite as well as expected.
>
> I will be curious to see where these things show up since
> all you really need is a power plug. (a little nervous actually)

Yes.. That *will* be interesting... And wait til someone has a cluster of Limuluses (Not sure of the proper alliterative collective noun, nor the plural form.. A litany of limuli? A school? A murder?)

> I agree. Four nodes is really small. BTW, the most fun in designing
> this system is a set of tighter constraints than are found on the typical
> cluster. Noise, power, space, cabling, low cost packaging, etc. I have
> been asked about a rack mount version, we'll see.
>
> One thing I find interesting is the core/node efficiency
> (what I call "effective cores"). In general, *on some codes*, I found
> that fewer cores (1P micro-ATX 4-cores) is more efficient than many
> cores (2P server 12-core). Seems obvious, but I like to test things.

Yes, because we're using, in general, commodity components/assemblies, we're subject to the results of optimizations and market/business forces in other user spaces. Someone designing a media PC for home use might not care about electrical efficiency (there are no big yellow energy tags on computers, yet), but would care about noise. Someone designing a rack mounted server cares not a whit about noise, but really cares about a 10% change in power consumption.

And, drop on top of that the non-synchronized differences in development/manufacturing/fabrication generations for the underlying parts. Consumer stuff comes out for the winter selling season. Commercial stuff probably is on a different cycle. It's not like everyone uses the same "model year changeover".

>> (oddly, simulated fault injection is one of the trickier parts)
>
> I would assume, because in a sense, the black swan* is
> by definition hard to predict.

Not so much that, as the actual mechanics of fault injection. Think about testing error detection and recovery for Flash memory. The underlying specification error rate is something like 1E-9 or 1E-10/read, and that's a worst case kind of spec, so errors aren't too common (i.e. you can't just run and wait for them to occur). So how do you cause errors to occur (without perturbing the system)?... In the flash case, because we developed our own flash controller logic in an FPGA, we can add "error injection logic" to the design, but that's not always the case. How would you simulate upsets in a CPU core? (short of blasting it with radiation, which is difficult and expensive.. I wish it was as easy as getting a little Co60 gamma source and putting it on top of the chip.. We hike to somewhere that has an accelerator (UC Davis, Brookhaven, etc) and shoot protons and heavy ions at it.)
> (* the book by Nick Taleb, not the movie)

Black swans in this case would be things like the Pentium divide bug. Yes.. That *would* be a challenge, but hey, we've got folks in our JPL Laboratory for Reliable Software (LARS) who sit around thinking of how to do that, among other things. (http://lars-lab.jpl.nasa.gov/) Hmm.. I'll have to go talk to those guys about clusters of Pi or Arduinos... They're big into formal verification, too, and model-based verification. So you could have a modeled system in SysML or UML and compare its behavior with that on your prototype.

> -- Doug
>
> -- This message has been scanned for viruses and dangerous content by
> MailScanner, and is believed to be clean.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From hahn at mcmaster.ca Fri Jan 13 23:18:57 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Fri, 13 Jan 2012 23:18:57 -0500 (EST)
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References:
Message-ID:

> care about electrical efficiency (there are no big yellow energy tags on
> computers, yet), but would care about noise. Someone designing a rack

the "80 Plus" branding is pretty ubiquitous now, and the best part is that commodity ATX parts are starting to show up at gold levels. server vendors have offered gold or platinum for a while now, but it's probably more important in the home, since personal machines spend more time idling, thus running the PSU at low demand. poor-quality PSUs are remarkably bad at low utilization.

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From samuel at unimelb.edu.au Fri Jan 13 23:46:17 2012
From: samuel at unimelb.edu.au (Chris Samuel)
Date: Sat, 14 Jan 2012 15:46:17 +1100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <6ea096464afff4fe7e544e0cd28c5204.squirrel@mail.eadline.org>
References: <6ea096464afff4fe7e544e0cd28c5204.squirrel@mail.eadline.org>
Message-ID: <201201141546.17872.samuel@unimelb.edu.au>

On Sat, 14 Jan 2012 02:18:02 AM Douglas Eadline wrote:

> I would assume, because in a sense, the black swan* is
> by definition hard to predict.

Ahem, not around here, they're all black [1]. Now a white swan, that would be something to see!

[1] http://www.flickr.com/photos/earthinmyeyes/4608041877/

cheers!
Chris

-- Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From deadline at eadline.org Thu Jan 19 09:46:26 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Thu, 19 Jan 2012 09:46:26 -0500
Subject: [Beowulf] Parallel Programming Survey Report
In-Reply-To: <4c1a72f2e8097c3585e495e1c95bfa2c.squirrel@mail.eadline.org>
References: <4F0F25D2.90305@runnersroll.com> <4F0FC2F2.5090606@unimelb.edu.au> <4c1a72f2e8097c3585e495e1c95bfa2c.squirrel@mail.eadline.org>
Message-ID: <6ec5ed08a6fb8c5b390b26bdfc18803a.squirrel@mail.eadline.org>

Last year Dr Dobb's did a survey of parallel programming. Today I received a copy of:

The Parallel Programming Landscape: Multicore has gone mainstream -- but are developers ready?

It is mostly about multi-core, a bit Intel-centric (they sponsored it), and not too much about HPC. Still interesting to see how the programming world is coping with multi-core. If you are interested in a copy you have to sign up here:

https://www.cmpadministration.com/ars/emailnew.do?mode=emailnew&P=P2&MZP=&L=&F=1003933&K=&cid_download

I'll probably read it closer and post a summary on Cluster Monkey at some point.

-- Doug

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org Thu Jan 19 09:57:37 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Thu, 19 Jan 2012 15:57:37 +0100
Subject: [Beowulf] Parallel Programming Survey Report
In-Reply-To: <6ec5ed08a6fb8c5b390b26bdfc18803a.squirrel@mail.eadline.org>
References: <4F0F25D2.90305@runnersroll.com> <4F0FC2F2.5090606@unimelb.edu.au> <4c1a72f2e8097c3585e495e1c95bfa2c.squirrel@mail.eadline.org> <6ec5ed08a6fb8c5b390b26bdfc18803a.squirrel@mail.eadline.org>
Message-ID: <20120119145737.GK21917@leitl.org>

On Thu, Jan 19, 2012 at 09:46:26AM -0500, Douglas Eadline wrote:
> Last year Dr Dobb's did a survey of parallel programming.
> Today I received a copy of:
>
> The Parallel Programming Landscape: Multicore has gone mainstream --
> but are developers ready?
>
> It is mostly about multi-core, a bit Intel-centric (they
> sponsored it), and not too much about HPC. Still interesting
> to see how the programming world is coping with multi-core.
> If you are interested in a copy you have to sign up here:
>
> https://www.cmpadministration.com/ars/emailnew.do?mode=emailnew&P=P2&MZP=&L=&F=1003933&K=&cid_download
>
> I'll probably read it closer and post a summary on Cluster Monkey
> at some point.
While we're speaking about multicore, I recommend this 21-minute video interview (even if you dislike talking heads and smarmy interviewers) with David Ungar: http://channel9.msdn.com/Blogs/Charles/SPLASH-2011-David-Ungar-Self-ManyCore-and-Embracing-Non-Determinism

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From eugen at leitl.org Mon Jan 23 08:45:10 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Mon, 23 Jan 2012 14:45:10 +0100
Subject: [Beowulf] CPU Startup Combines CPU+DRAM -- And A Whole Bunch Of Crazy
Message-ID: <20120123134510.GF7343@leitl.org>

(Old idea, makes sense, will they be able to pull it off?)

http://hothardware.com/News/CPU-Startup-Combines-CPUDRAMAnd-A-Whole-Bunch-Of-Crazy/

CPU Startup Combines CPU+DRAM -- And A Whole Bunch Of Crazy

Sunday, January 22, 2012 - by Joel Hruska

The CPU design firm Venray Technology announced a new product design this week that it claims can deliver enormous performance benefits by combining CPU and DRAM onto a single piece of silicon. We spent some time earlier this fall discussing the new TOMI (Thread Optimized Multiprocessor) with company CTO Russell Fish, but while the idea is interesting, its presentation is marred by crazy conceptualizing and deeply suspect analytics.

The Multicore Problem:

There are three limiting factors, or walls, that limit the scaling of modern microprocessors. First, there's the memory wall, defined as the gap between the CPU and DRAM clock speed. Second, there's the ILP (Instruction Level Parallelism) wall, which refers to the difficulty of decoding enough instructions per clock cycle to keep a core completely busy. Finally, there's the power wall--the faster a CPU is and the more cores it has, the more power it consumes.

Attempting to compensate for one wall often risks running afoul of the other two. Adding more cache to decrease the impact of the CPU/DRAM speed discrepancy adds die complexity and draws more power, as does raising CPU clock speed. Combined, the three walls are a set of fundamental constraints--improving architectural efficiency and moving to a smaller process technology may make the room a bit bigger, but they don't remove the walls themselves.

TOMI attempts to redefine the problem by building a very different type of microprocessor. The TOMI Borealis is built using the same transistor structures as conventional DRAM; the chip trades clock speed and performance for ultra-low leakage. Its design is, by necessity, extremely simple. Not counting the cache, TOMI is a 22,000 transistor design, as compared to 30,000 transistors for the original ARM2. The company's early prototypes, built on legacy DRAM technology, ran at 500MHz on a 110nm process.

Instead of surrounding a CPU core with a substantial amount of L2 and L3 cache, Venray inserted a CPU core directly into a DRAM design. A TOMI Borealis core connects eight TOMI cores to a 1Gbit DRAM with a total of 16 ICs per 2GB DIMM. This works out to a total of 128 processor cores per DIMM. Because they're built using ultra-low-leakage processes and are so small, such cores cost very little to build and consume vanishingly small amounts of power (Venray claims power consumption is as low as 23mW per core at 500MHz).
It's an interesting idea.

The Bad:

When your CPU has fewer transistors than an architecture that debuted in 1986, there's a good chance that you left a few things out--like an FPU, branch prediction, pipelining, or any form of speculative execution. Venray may have created a chip with power consumption an order of magnitude lower than anything ARM builds and more memory bandwidth than Intel's highest-end Xeons, but it's an ultra-specialized, ultra-lightweight core that trades 25 years of flexibility and performance for scads of memory bandwidth.

The last few years have seen a dramatic surge in the number of low-power, many-core architectures being floated as the potential future of computing, but Venray's approach relies on the manufacturing expertise of companies who have no experience in building microprocessors and don't normally serve as foundries. This imposes fundamental restrictions on the CPU's ability to scale; DRAM is manufactured using a three-layer mask rather than the 10-12 layers Intel and AMD use for their CPUs. Venray already acknowledges that these conditions imposed substantial limitations on the original TOMI design.

Of course, there's still a chance that the TOMI uarch could be effective in certain bandwidth-hungry scenarios--but that's where the Venray Crazy Train goes flying off the track.

The Disingenuous and Crazy

Let's start here. In a graph like this, you expect the two bars to represent the same systems being compared across three different characteristics. That's not the case. When we spoke to Russell Fish in late November, he pointed us to this publicly available document and claimed that the results came from a customer with 384 2.1GHz Xeons. There's no such thing as an S5620 Xeon, and even if we grant that he meant the E5620 CPU, that's a 2.4GHz chip.

The "Power consumption" graphs show Oracle's maximum power consumption for a system with 10x Xeon E7-8870s, 168 dedicated SQL processors, 5.3TB (yes, TB) of Flash and 15x 10,000 RPM hard drives. It's not only a worst-case figure, it's a figure utterly unrelated to the workload shown in the Performance comparison. Furthermore, given that each Xeon E7-8870 has a 130W TDP, ten of them only come out to 1.3kW--Oracle's 17.7kW figure means that the overwhelming majority of the cabinet's power consumption is driven by components other than its CPUs.

From here, things rapidly get worse. Fish makes his points about power walls by referring to unverified claims that prototype 90nm Tejas chips drew 150W at 2.8GHz back in 2004. That's like arguing that Ford can't build a decent car because the Edsel sucked.

After reading about the technology, you might think Venray was planning to market a small chip to high-end HPC niche markets... and you'd be wrong. The company expects the following to occur as a result of this revolutionary architecture (organized by least-to-most creepy):

Computer speech will be so common that devices will talk to other devices in the presence of their users.

Your cell phone camera will recognize the face of anyone it sees and scan the computer cloud for background red flags as well as six degrees of separation.

Common commands will be reduced to short verbal cues like clicking your tongue or sucking your lips.

Your personal history will be displayed for one and all to see... women will create search engines to find eligible, prosperous men. Men will create search engines to qualify women.
Criminals will find their jobs much more difficult because their history will be immediately known to anyone who encounters them.

TOMI technology will be built on flash memories, creating the elemental unit of a learning machine... the machines will be able to self-organize, build robust communicating structures, and collaborate to perform tasks.

A disposable diaper company will give away TOMI-enabled teddy bears that teach reading and arithmetic. It will be able to identify specific children... and from time to time remind Mom to buy a product. The bear will also diagnose a raspy throat, a cough, or a runny nose.

Conclusion:

Fish has spent decades in the microprocessor industry--he invented the first CPU to use a clock multiplier in conjunction with Chuck H. Moore--but his vision of the future is crazy enough to scare mad dogs and Englishmen. His idea for a CPU architecture is interesting, even underneath the obfuscation and false representation, but too practically limited to ever take off. Google, an enthusiastic and dedicated proponent of energy-efficient multi-core research, said it best in a paper titled "Brawny cores still beat wimpy cores, most of the time."

"Once a chip's single-core performance lags by more than a factor of two or so behind the higher end of current-generation commodity processors, making a business case for switching to the wimpy system becomes increasingly difficult... So go forth and multiply your cores, but do it in moderation, or the sea of wimpy cores will stick to your programmers' boots like clay."

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From prentice at ias.edu Mon Jan 23 10:38:39 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Mon, 23 Jan 2012 10:38:39 -0500
Subject: [Beowulf] CPU Startup Combines CPU+DRAM -- And A Whole Bunch Of Crazy
In-Reply-To: <20120123134510.GF7343@leitl.org>
References: <20120123134510.GF7343@leitl.org>
Message-ID: <4F1D7EFF.7080206@ias.edu>

If you read this PDF from Venray Technologies, which is linked to in the article, you see where the "Whole Bunch of Crazy" part comes from. After reading it, Venray lost a lot of credibility in my book.

https://www.venraytechnology.com/economics_of_cpu_in_DRAM2.pdf

-- Prentice

On 01/23/2012 08:45 AM, Eugen Leitl wrote:
> (Old idea, makes sense, will they be able to pull it off?)
>
> http://hothardware.com/News/CPU-Startup-Combines-CPUDRAMAnd-A-Whole-Bunch-Of-Crazy/
>
> [snip -- full article quoted above]
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From james.p.lux at jpl.nasa.gov Mon Jan 23 11:35:56 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 23 Jan 2012 08:35:56 -0800
Subject: [Beowulf] CPU Startup Combines CPU+DRAM -- And A Whole Bunch Of Crazy
In-Reply-To: <4F1D7EFF.7080206@ias.edu>
Message-ID:

The CPU reminds me of the old bipolar AMD 2901 CPU chip sets... RISC before it was called RISC.

The white paper sort of harps on the fact that one cannot accurately predict the future (hey, I was a 10th grader at NCC in 1975, and saw the Altair at the MITS display in their trailer and KNEW that I wanted one, but I also wanted lots of other things there, which didn't pan out).
(did Imhotep use some form of project planning tools? You bet he did)

However, true parallelism (MIMD) is harder to conceptualize. Vector and matrix math is one area, but I'd argue that it's just the same as EP tasks, just at a finer grain. Systolic arrays, vector pipelines, FFT boxes from Floating Point Systems, are all basically ways to use the underlying structure of the task, in an easy way (how long til there's a hardware implementation of the new faster-than-FFT algorithm published last week?) And in all those cases, you have to explicitly make use of the special capabilities. That is, in general, the compiler doesn't recognize it (although modern parallelizing compilers ARE really smart.. so they probably do find most of the cases).

I don't know that we have good conceptual tools to take a complex task and break it effectively into multiple disparate component tasks that can effectively run in parallel. It's a hard task even for something straightforward (e.g. designing a big system or building a spacecraft), and I don't know that any of the outputs of current project planning techniques (which are entirely manual) can be said to produce "generalized" optimum outputs. They produce *an* output for dividing the complex task up (or else the project can't be done), but I don't know that the output is provably optimum or even workable (an awful lot of projects over-run, and not just because of bad estimates for time/cost).

So the problem facing would-be users of new computing architectures (be they TOMI, HyperCube, ConnectionMachine, or Beowulf) is like that facing a project planner given a big project, and a brand new crew of workers who speak a different language, with skill sets totally different than the planner is used to.

This is what the computer user is facing: there's no compiler or problem description technique that will automatically generate a "work plan" to use that new architecture. It's all manual, and it's hard, and you're up against a brute force "why not just hook 500 people up to that rock and drag it" approach. The people who figure out the new way will certainly benefit society, but there's going to be a lot of false starts along the way. And, I'm not particularly sanguine about the process being automated (at least in the sense of automatic parallelizing compilers that recognize loops and repetitive stuff). I think that for the next few years (decades?) using new architectures is going to rely on skilled humans to figure out how to use it, on an ad hoc, unique to each application, basis.

[Back in the 80s, I had a loaner "sugarcube" 4 node Intel hypercube sitting on my desk for a while. I wanted to figure out something to do with it that is non-trivial, and not the examples given in the docs (which focused on stuff like LISP and Prolog). I started, as I'm sure many people do, by taking a multithreaded application I had, and distributing the threads to processors. You pretty quickly realize, though, that it's tough to evenly distribute the loads among processors, and you wind up with processor 1 waiting for something that processor 2 is doing, which in turn is waiting for something that processor 3 is doing, and so forth. In a "shared processor" this isn't a big deal, and is transparent: the processor is always working, and aside from deadlocks, there's no particular reason why you need to balance load among threads.

For what it's worth, the task I was doing was comparable to taking execution of a Matlab/simulink model and distributing it across multiple processors.
You had signals flowing among blocks, etc. These things are computationally intensive (especially if you have loops in the design, so you need an iterative solution of some sort) so the idea of putting multiple processors to work is attractive. But the "work" in each block in the diagram isn't known a priori and might vary during the course of the simulation, so it's not like you can come up with some sort of automatic partitioning algorithm.]

On 1/23/12 7:38 AM, "Prentice Bisbal" wrote:

>If you read this PDF from Venray Technologies, which is linked to in the article, you see where the "Whole Bunch of Crazy" part comes from. After reading it, Venray lost a lot of credibility in my book.
>
>https://www.venraytechnology.com/economics_of_cpu_in_DRAM2.pdf
>
>--
>Prentice
>
>On 01/23/2012 08:45 AM, Eugen Leitl wrote:
>> (Old idea, makes sense, will they be able to pull it off?)
>>
>> http://hothardware.com/News/CPU-Startup-Combines-CPUDRAMAnd-A-Whole-Bunch-Of-Crazy/
>>
>> CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
>>
>> Sunday, January 22, 2012 - by Joel Hruska
>>
>> The CPU design firm Venray Technology announced a new product design this week that it claims can deliver enormous performance benefits by combining CPU and DRAM onto a single piece of silicon. We spent some time earlier this fall discussing the new TOMI (Thread Optimized Multiprocessor) with company CTO Russell Fish, but while the idea is interesting, its presentation is marred by crazy conceptualizing and deeply suspect analytics.
>>
>> The Multicore Problem:
>>
>> There are three limiting factors, or walls, that constrain the scaling of modern microprocessors. First, there's the memory wall, defined as the gap between the CPU and DRAM clock speed. Second, there's the ILP (Instruction Level Parallelism) wall, which refers to the difficulty of decoding enough instructions per clock cycle to keep a core completely busy. Finally, there's the power wall--the faster a CPU is and the more cores it has, the more power it consumes.
>>
>> Attempting to compensate for one wall often risks running afoul of the other two. Adding more cache to decrease the impact of the CPU/DRAM speed discrepancy adds die complexity and draws more power, as does raising CPU clock speed. Combined, the three walls are a set of fundamental constraints--improving architectural efficiency and moving to a smaller process technology may make the room a bit bigger, but they don't remove the walls themselves.
>>
>> TOMI attempts to redefine the problem by building a very different type of microprocessor. The TOMI Borealis is built using the same transistor structures as conventional DRAM; the chip trades clock speed and performance for ultra-low leakage. Its design is, by necessity, extremely simple. Not counting the cache, TOMI is a 22,000 transistor design, as compared to 30,000 transistors for the original ARM2. The company's early prototypes, built on legacy DRAM technology, ran at 500MHz on a 110nm process.
>>
>> Instead of surrounding a CPU core with a substantial amount of L2 and L3 cache, Venray inserted a CPU core directly into a DRAM design. A TOMI Borealis core connects eight TOMI cores to a 1Gbit DRAM with a total of 16 ICs per 2GB DIMM. This works out to a total of 128 processor cores per DIMM. Because they're built using ultra-low-leakage processes and are so small, such cores cost very little to build and consume vanishingly small amounts of power (Venray claims power consumption is as low as 23mW per core at 500MHz).
>>
>> It's an interesting idea.
>>
>> The Bad:
>>
>> When your CPU has fewer transistors than an architecture that debuted in 1986, there's a good chance that you left a few things out--like an FPU, branch prediction, pipelining, or any form of speculative execution. Venray may have created a chip with power consumption an order of magnitude lower than anything ARM builds and more memory bandwidth than Intel's highest-end Xeons, but it's an ultra-specialized, ultra-lightweight core that trades 25 years of flexibility and performance for scads of memory bandwidth.
>>
>> The last few years have seen a dramatic surge in the number of low-power, many-core architectures being floated as the potential future of computing, but Venray's approach relies on the manufacturing expertise of companies who have no experience in building microprocessors and don't normally serve as foundries. This imposes fundamental restrictions on the CPU's ability to scale; DRAM is manufactured using a three layer mask rather than the 10-12 layers Intel and AMD use for their CPUs. Venray already acknowledges that these conditions imposed substantial limitations on the original TOMI design.
>>
>> Of course, there's still a chance that the TOMI uarch could be effective in certain bandwidth-hungry scenarios--but that's where the Venray Crazy Train goes flying off the track.
>>
>> [snip]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
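The headline numbers in the quoted article are easy to sanity-check. A back-of-envelope pass in Python (the per-core power, per-IC core count, and TDP figures are the article's claims; the derived totals are mine and only as good as those claims):

    # Figures quoted in the article above (Venray's claims, not measurements).
    ics_per_dimm = 16        # 1Gbit DRAM ICs on a 2GB DIMM
    cores_per_ic = 8         # TOMI cores per IC
    w_per_core   = 0.023     # 23mW per core at 500MHz

    cores_per_dimm = ics_per_dimm * cores_per_ic    # 128, matches the article
    gb_per_dimm    = ics_per_dimm * 1 / 8           # 16 x 1Gbit = 2 GB, consistent
    w_per_dimm     = cores_per_dimm * w_per_core    # ~2.9 W of CPU per DIMM

    # The Oracle comparison the article objects to:
    xeon_tdp_kw = 10 * 130 / 1000                   # ten 130W E7-8870s = 1.3 kW
    cpu_share   = xeon_tdp_kw / 17.7                # ~7% of the 17.7 kW cabinet

    print(cores_per_dimm, round(w_per_dimm, 2), round(cpu_share, 2))

So the DIMM-level arithmetic is internally consistent, and the article's complaint also checks out: even at full TDP, the ten Xeons account for only about 7% of the quoted cabinet power.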
From lindahl at pbm.com Mon Jan 23 14:28:26 2012
From: lindahl at pbm.com (Greg Lindahl)
Date: Mon, 23 Jan 2012 11:28:26 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
Message-ID: <20120123192826.GB17383@bx9.net>

http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Mon Jan 23 14:59:30 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Mon, 23 Jan 2012 20:59:30 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <20120123192826.GB17383@bx9.net>
References: <20120123192826.GB17383@bx9.net>
Message-ID:

Interesting article. Difficult for me to analyse - usually you sell your business when it's a success, or when you want to run away. Not sure which of the 2 it is here.

Maybe some years from now with some support from Intel that Qlogic also can unroll FDR. Right now they're stuck with QDR, which on their homepage they announce as 40 gigabit per second.

http://www.qlogic.com/Products/adapters/Pages/InfiniBandAdapters.aspx

Showing the Qlogic 7300 series.

Mellanox is slamdunking with FDR now, the new generation network which i suppose is double the bandwidth of QDR; it already got unrolled a few months ago and should be shipping by now. Qlogic AFAIK didn't even announce their next generation network yet, let alone display it, and still toys with QDR, which is what i toy at home with. The fact that they announced 'improving' the oldie QDR i would interpret as bad news for innovating to FDR.

Maybe someone from Mellanox wants to comment on FDR and whether it's double the bandwidth of QDR, as i suppose some will be monitoring this list.

On Jan 23, 2012, at 8:28 PM, Greg Lindahl wrote:
> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From hahn at mcmaster.ca Mon Jan 23 15:00:07 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Mon, 23 Jan 2012 15:00:07 -0500 (EST)
Subject: [Beowulf] CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
In-Reply-To: <4F1D7EFF.7080206@ias.edu>
References: <20120123134510.GF7343@leitl.org> <4F1D7EFF.7080206@ias.edu>
Message-ID:

> If you read this PDF from Venray Technologies, which is linked to in the article, you see where the "Whole Bunch of Crazy" part comes from. After reading it, Venray lost a lot of credibility in my book.
>
> https://www.venraytechnology.com/economics_of_cpu_in_DRAM2.pdf

wow, you're not kidding. mostly it makes me wonder whether the economy is such that you can actually get first-round VC with collateral like that!
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From james.p.lux at jpl.nasa.gov Mon Jan 23 15:17:01 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 23 Jan 2012 12:17:01 -0800
Subject: [Beowulf] CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
In-Reply-To:
Message-ID:

I don't know.. Maybe the list of potential applications (some of which are speculative and well out there) is what it takes to justify VC.. Like DARPA.. High risk, high reward. The typical VC doesn't expect every investment to hit, but the ones that do, they want big returns from. If you're just interested in slogging through successive refinement, there are probably other sources of capital that are more appropriate.

While some of those things are downright creepy, none of them appear to violate the laws of physics, and if someone with cash is willing to put some up to run the idea forward and establish a position, why not? (Patent term is 20 years after all.. which is a long ways in the future in the technology world.) In 2030 there may be gripes on the equivalent of SlashDot about how this Venray had patents on all the fundamental things people are using. Think of hyperlinks, mice, etc.

On 1/23/12 12:00 PM, "Mark Hahn" wrote:

>> If you read this PDF from Venray Technologies, which is linked to in the article, you see where the "Whole Bunch of Crazy" part comes from. After reading it, Venray lost a lot of credibility in my book.
>>
>> https://www.venraytechnology.com/economics_of_cpu_in_DRAM2.pdf
>
>wow, you're not kidding. mostly it makes me wonder whether the economy is such that you can actually get first-round VC with collateral like that!

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From raysonlogin at gmail.com Mon Jan 23 15:50:09 2012
From: raysonlogin at gmail.com (Rayson Ho)
Date: Mon, 23 Jan 2012 15:50:09 -0500
Subject: [Beowulf] CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
In-Reply-To:
References: <4F1D7EFF.7080206@ias.edu>
Message-ID:

On Mon, Jan 23, 2012 at 11:35 AM, Lux, Jim (337C) wrote:
> The "processors in a sea of memory" model has been around for a while (and, in fact, there were a lot of designs in the 80s, at the board if not the chip level: transputers, early hypercubes, etc.) So this is revisiting the architecture at a smaller level of integration.

I remember 12-15 years ago I was reading quite a few papers published by the Berkeley Intelligent RAM (IRAM) Project:

http://iram.cs.berkeley.edu/

So 15 years later someone suddenly thinks that it is a good idea to ship IRAM systems to real customers??
:-D

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/

> [snip]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From james.p.lux at jpl.nasa.gov Mon Jan 23 15:58:11 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 23 Jan 2012 12:58:11 -0800
Subject: [Beowulf] CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
In-Reply-To:
Message-ID:

On 1/23/12 12:50 PM, "Rayson Ho" wrote:

>On Mon, Jan 23, 2012 at 11:35 AM, Lux, Jim (337C) wrote:
>> The "processors in a sea of memory" model has been around for a while (and, in fact, there were a lot of designs in the 80s, at the board if not the chip level: transputers, early hypercubes, etc.) So this is revisiting the architecture at a smaller level of integration.
>
>I remember 12-15 years ago I was reading quite a few papers published by the Berkeley Intelligent RAM (IRAM) Project:
>
>http://iram.cs.berkeley.edu/
>
>So 15 years later someone suddenly thinks that it is a good idea to ship IRAM systems to real customers?? :-D
>
>Rayson

Or maybe, all good ideas keep coming up again, and each time, it's refined a bit, or there's another possible source of funding appearing. Look at "solar power transmitted by microwaves from orbit" as an example. That one has a 15-20 year cycle time.

You have an idea which is attractive.. You get some money to run it forward, and then insurmountable problems crop up, discoverable only with significant investment of time/money (>> 1 work month). That puts the idea to sleep for a while until either the reasons are forgotten, or technology has advanced to the point where what might have been unreasonable the previous time is reasonable now.

Certainly in the computing world, where 10-15 years is sufficient for many orders of magnitude change in performance along many axes, it pays to revisit things, since what may have been a good balance or trade back then, isn't now. And that's sort of the thrust of their white paper (justifying that now the time is right), as well as staking their claim to a bunch of general applications, few of which are uniquely enabled by their proposed technology.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
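Jim Lux's hypercube story earlier in the thread, processor 1 waiting on processor 2, which waits on processor 3, can be made concrete in a few lines. A toy model (a sketch in Python; the stage costs are invented for illustration, not measured): once per-stage work is unequal, steady-state throughput is pinned to the slowest stage, no matter how many processors you add.

    # Toy model of the load-imbalance problem from the hypercube anecdote.
    # One pipeline stage per processor; every item passes through each stage.
    # Stage costs are made-up numbers; only the imbalance matters.
    stage_cost = [1.0, 1.0, 3.5, 1.0]   # processor 3 is the bottleneck

    items = 1000
    one_processor   = items * sum(stage_cost)   # everything timeshared on one CPU
    four_processors = items * max(stage_cost)   # steady state: slowest stage gates
    print(round(one_processor / four_processors, 2), "x speedup, not 4x")

This prints a ~1.86x speedup on four processors; balancing the stages, which is exactly the hard part he describes, is what would recover the rest.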
From hahn at mcmaster.ca Mon Jan 23 16:19:34 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Mon, 23 Jan 2012 16:19:34 -0500 (EST)
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References:
Message-ID:

> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html

wonder what Intel's thinking - could do some very interesting stuff, but it would take a bit of charisma. QPI-over-IB anyone?

I'm not crazy about Intel being a vertically-integrated HPC supplier (chips, systems, interconnect, mpi, compilers - I guess they still don't have their own scheduler or sexy cloud branding ;) the world is a better place when each level has internal competition based on useful, open (free), multi-implementation standards.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From landman at scalableinformatics.com Mon Jan 23 16:33:48 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 23 Jan 2012 16:33:48 -0500
Subject: Re: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References: <20120123192826.GB17383@bx9.net>
Message-ID: <4F1DD23C.8080601@scalableinformatics.com>

On 01/23/2012 04:19 PM, Mark Hahn wrote:
> the world is a better place when each level has internal competition based on useful, open (free), multi-implementation standards.

Markets always go through these full-on vertical integration phases (for a while) before the assets are sold off (either voluntarily or via bankruptcy court). It's a natural part of the business cycle. Cisco is building servers now. Oracle, the whole stack. Pretty soon, some whippersnapper of a company is going to come along and eat their lunches, and then they will get competitive pressure to change.

This said, many *many* large university sites like dealing with "a single vendor" (that is until they get eventually screwed over by that one vendor, or realize that the "great deal" they are getting really isn't as great as it sounded ... ). Which is part of the reason it's so hard getting into accounts other vendors have locked up.

Sadly, lots of this works around the spirit (and probably skating very close to the edge of the letter) of the law surrounding most public acquisition processes, but that's life I guess.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
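On Vincent's earlier question about whether FDR is double QDR: the published lane rates say not quite. QDR signals at 10Gb/s per lane with 8b/10b encoding, FDR at 14.0625Gb/s per lane with 64b/66b. For a 4x link (a sketch; the rates are from the InfiniBand roadmap, the arithmetic is mine):

    # 4x InfiniBand link rates: signaling rate vs usable data rate.
    lanes = 4
    qdr_signal = lanes * 10.0         # 40 Gb/s signaling
    qdr_data   = qdr_signal * 8 / 10  # 8b/10b encoding -> 32 Gb/s data
    fdr_signal = lanes * 14.0625      # 56.25 Gb/s signaling
    fdr_data   = fdr_signal * 64 / 66 # 64b/66b encoding -> ~54.5 Gb/s data

    print(round(fdr_data / qdr_data, 2))   # ~1.7x

So the marketing "40 vs 56" comparison overstates it slightly in one direction and understates it in the other: in usable data rate, FDR is roughly 1.7x QDR, thanks largely to the more efficient encoding.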
From prentice at ias.edu Mon Jan 23 16:46:11 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Mon, 23 Jan 2012 16:46:11 -0500
Subject: Re: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References: <20120123192826.GB17383@bx9.net>
Message-ID: <4F1DD523.4020005@ias.edu>

On 01/23/2012 04:19 PM, Mark Hahn wrote:
>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
> wonder what Intel's thinking - could do some very interesting stuff, but it would take a bit of charisma. QPI-over-IB anyone?

That's what I'm thinking!

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Mon Jan 23 16:49:12 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Mon, 23 Jan 2012 22:49:12 +0100
Subject: Re: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References: <20120123192826.GB17383@bx9.net>
Message-ID:

On Jan 23, 2012, at 10:19 PM, Mark Hahn wrote:
>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>
> wonder what Intel's thinking - could do some very interesting stuff, but it would take a bit of charisma. QPI-over-IB anyone?

forget it

> I'm not crazy about Intel being a vertically-integrated HPC supplier (chips, systems, interconnect, mpi, compilers - I guess they still don't have their own scheduler or sexy cloud branding ;)

maybe they just want a new generation ethernet nic dirt cheap for their motherboards; if you produce it in those numbers as they do, probably anything gets dirt cheap. this doesn't hit the high end, yet it might be cheaper to buy qlogic than to pay royalties to any of the infiniband vendors, which would be either mellanox or qlogic.

Also they bought qlogic for 125 million dollar, though in cash, which doesn't seem to me exceptionally much from intels viewpoint, whereas they might intend to sell some of their upcoming line of vector cpu's which badly need a network of course. 125 million is just a few supercomputers.

maybe it was just a cheap buy, as qlogic doesn't have FDR yet, who knows?

What i wonder about is how wallstreet knew in advance about qlogic getting taken over. If we look carefully we see that since say roughly december 19th 2011, the nasdaq rose roughly 10.5% and qlogic rose quite a lot more, several percent. So it was significantly more in demand than the index, which is weird if we realize that qlogic has unrolled nothing those months whereas its competitor Mellanox has unrolled FDR.

It's obvious some traders knew this deal was coming, but real fingerpointing is not my job.

Vincent

> the world is a better place when each level has internal competition based on useful, open (free), multi-implementation standards.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From samuel at unimelb.edu.au Mon Jan 23 18:00:02 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Tue, 24 Jan 2012 10:00:02 +1100
Subject: Re: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References: <20120123192826.GB17383@bx9.net>
Message-ID: <4F1DE672.6000602@unimelb.edu.au>

On 24/01/12 08:19, Mark Hahn wrote:
> wonder what Intel's thinking - could do some very interesting stuff, but it would take a bit of charisma. QPI-over-IB anyone?

I remember way back hearing the IB was going to be the technology to replace all those various buses (PCI, etc) on a motherboard [1], then it all went quiet and then it re-emerged as an interconnect.

So perhaps Intel (who were part of one of the two groups that merged to create IB) have thoughts again on this?

cheers,
Chris

[1] interestingly a similar comment appears on the IB Wikipedia page under history, but sadly without references.. http://en.wikipedia.org/wiki/InfiniBand#History

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From joshua_mora at usa.net Mon Jan 23 18:02:12 2012
From: joshua_mora at usa.net (Joshua mora acosta)
Date: Mon, 23 Jan 2012 17:02:12 -0600
Subject: Re: [Beowulf] Intel buys QLogic InfiniBand business
Message-ID: <708qawXBm8848S02.1327359732@web02.cms.usa.net>

Do you mean IB over QPI?

Either way, High Node Count Coherence will be an issue.

In any case, by acquiring their IP it is a step forward towards SoC (System on Chip). A preliminary step (building block) for the Exascale strategy and for low cost enterprise/cloud solutions.

Joshua

------ Original Message ------
Received: 03:47 PM CST, 01/23/2012
From: Prentice Bisbal
To: beowulf at beowulf.org
Subject: Re: [Beowulf] Intel buys QLogic InfiniBand business

> On 01/23/2012 04:19 PM, Mark Hahn wrote:
> >> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
> > wonder what Intel's thinking - could do some very interesting stuff, but it would take a bit of charisma. QPI-over-IB anyone?
>
> That's what I'm thinking!
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Mon Jan 23 18:24:15 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 00:24:15 +0100
Subject: Re: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <708qawXBm8848S02.1327359732@web02.cms.usa.net>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net>
Message-ID: <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl>

On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote:
> Do you mean IB over QPI?
> Either way, High Node Count Coherence will be an issue.

Just ignore his statement - it's total nonsense. Nanosecond latency of QPI using 2 rings versus something that has a latency up to a factor 1000 slower, with the pci-e as the slowest delaying factor.

Doing cache coherency over that, forget it.

From what i understand a big problem at modern cpu's is the crossbar. On the latest chip displayed, Bulldozer, it takes a significant amount of transistors. If you confront that crossbar suddenly with latencies a factor 4000 slower, that's not gonna let it perform better of course.

> In any case, by acquiring their IP it is a step forward towards SoC (System on Chip). A preliminary step (building block) for the Exascale strategy and for low cost enterprise/cloud solutions.

Not with intel. Intel sells fast equipment yet it always has a huge price, about the opposite of infiniband, which is a dirt cheap technology.

I guess we must see this much simpler. At such a giant as intel, paying a bit over 100 million is peanuts. Probably less than what they would need to pay for royalties to a manufacturer owning a bunch of patents in the ethernet NIC area; the HPC intel gets 'for free'. It allows them to produce maybe a 10 gigabit ethernet NIC dirt cheap without needing to pay royalties to qlogic. Such a 10 gigabit ethernet nic will not be a big performer, yet price matters a lot of course when integrating. Every penny counts then.

What you typically see with intel is that for them the mass market is so important, read that's the 1 gigabit ethernet market right now, that all other products suffer there, as they will give their mass market products always, of course, priority. Itanium is a good example; it always was process generations behind their main products. It never was given a fair chance to compete. So where they win it with sandy bridge, because it's soon a process generation or 2 having the edge on AMD, there intels other products suffer from this, as they don't get that process technology.

meanwhile ethernet is totally crucial to have low latency for the financial world, as they can make dozens of billions a year by being faster than others at exchanges.

Now back to that mass market and integration of a good and especially cheap 10 gigabit nic into intels mainboards, this buy might be pretty interesting to intel. Yet that's a market so big, it has nothing to do with HPC i'd argue.
From HPC viewpoint i wouldn't see this takeover as a threat to anyone in HPC, i guess it basically means intel won't challenge for the crown in HPC, giving Mellanox monopoly for a while at FDR. It's about ethernet i bet. > > Joshua > ------ Original Message ------ > Received: 03:47 PM CST, 01/23/2012 > From: Prentice Bisbal > To: beowulf at beowulf.org > Subject: Re: [Beowulf] Intel buys QLogic InfiniBand business > >> >> On 01/23/2012 04:19 PM, Mark Hahn wrote: >>>> > http://www.hpcwire.com/hpcwire/2012-01-23/ > intel_to_buy_qlogic_s_infiniband_business.html >>> wonder what Intel's thinking - could do some very interesting stuff, >>> but it would take a bit of charisma. QPI-over-IB anyone? >> >> That's what I'm thinking! >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >> Computing >> To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From landman at scalableinformatics.com Mon Jan 23 19:03:14 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Mon, 23 Jan 2012 19:03:14 -0500 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> Message-ID: <4F1DF542.6050504@scalableinformatics.com> On 01/23/2012 06:24 PM, Vincent Diepeveen wrote: > > On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote: [...] > Nanosecond latency of QPI using 2 rings versus something that has a > latency up to factor 1000 slower > with the pci-e as the slowest delaying factor. > > Doing cache coherency over that forget it. Hear that Shai F? Stop work on vSMP now, cause Vincent says it can't work!!! More seriously, with this acquisition, I could see serious contention for ScaleMP. SoC type stuff, using IB between many nodes, in smaller boxen. >> In any case, by acquiring their IP it is a step forward towards SoC >> (System on >> Chip). A preliminary step (building block) for the Exascale >> strategy and for >> low cost enterprise/cloud solutions. Yes. > Not with intel. Intel sells fast equipment yet it has a huge price > always, > about the opposite of infiniband which is a dirt cheap technology. Must use Shakespeare for this takedown: Methinks thou dost protesteth too much ... > > I guess we must see this much simpler. At such a giant as intel, > paying a bit over 100 million is peanuts. > Probably less than what they would need to pay for royalties to a > manufacturer owning a bunch of patents > in the ethernet NIC area; the HPC intel gets 'for free'. So ... exactly what are the existing intel 10GbE NIC's then ... Swiss Cheese? I see a fair number of vendors licensing Intel's IP, or, more to the point, using Intel silicon (hint: this might be a good reason for the acquisition) to build their stuff... 
> It allows them to produce maybe a 10 gigabit ethernet NIC dirt cheap

... which they have been doing for years ...

> without needing to pay royalties to qlogic.

... not sure they were, but it's possible Qlogic has 10GbE IP that Intel licenses, but this transaction was about ... Infiniband ...

[...]

> meanwhile ethernet is totally crucial to have low latency for the financial world, as they can make dozens of billions a year by being faster than others at exchanges.

Errr ... given that this is one of our core markets, don't mind if I note that latency is critical to these players, so proximity to the exchange, and reliable and deterministic latency is absolutely critical. There are switches that are doing 300ns port to port in the Ethernet space now. With the NICs, you are looking in the 2-ish microsecond regime. These are not cheap. Compare this to QDR. 1 microsecond +/- some. Which has lower latency?

There are many reasons why exchanges (mostly) aren't on IB. A few of them are even valid technical reasons. Historical momentum, and conservative approaches to new technology rank pretty high. So does the inability to generally export IB far and wide. And the complexity of the stack. Ethernet is (almost) plug and play. It's just a network. IB is sort of kind of plug, install OFED, and play for a while over IPoIB until you can recode for some of the RDMA bits. And don't try to run file systems and other things with lots of traffic over IPoIB. It leaks and gradually you will catch some cool ... surprises.

Honestly, it's a shame that IPoIB never really got the attention it deserved like the other elements of the IB stack did. Getting a rock solid IP implementation atop a fast/low latency net could have driven many design wins outside of HPC. And would have been a gateway drug^H^H^H^Htechnology for using the other stack elements.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From landman at scalableinformatics.com Mon Jan 23 19:06:43 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 23 Jan 2012 19:06:43 -0500
Subject: Re: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F1DF542.6050504@scalableinformatics.com>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com>
Message-ID: <4F1DF613.1060603@scalableinformatics.com>

On 01/23/2012 07:03 PM, Joe Landman wrote:
> Hear that Shai F? Stop work on vSMP now, cause Vincent says it can't work!!!

There is an implicit /sarc tag here BTW. vSMP does a wonderful job (where Vincent claims that things won't work ... they do work, and very well at that).

> More seriously, with this acquisition, I could see serious contention for ScaleMP. SoC type stuff, using IB between many nodes, in smaller boxen.

Serious contention to buy ScaleMP (as in potential acquirers)

Must be getting too much blood in the coffee stream. Can't communicate ...
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From atp at piskorski.com Mon Jan 23 19:30:30 2012
From: atp at piskorski.com (Andrew Piskorski)
Date: Mon, 23 Jan 2012 19:30:30 -0500
Subject: [Beowulf] CPU Startup Combines CPU+DRAM -- And A Whole Bunch Of Crazy
In-Reply-To: References: Message-ID: <20120124003030.GA80957@piskorski.com>

On Mon, Jan 23, 2012 at 03:50:09PM -0500, Rayson Ho wrote:

> http://iram.cs.berkeley.edu/
>
> So 15 years later someone suddenly thinks that it is a good idea to
> ship IRAM systems to real customers?? :-D

Sure. But from when I last read about the IRAM stuff, I'm pretty sure it was strictly single core. Their VIRAM1 chip had 13 MB of DRAM, 1 cpu core, and 4 "vector lanes", with no mention of SMP or any sort of multi-chip parallelism at all. If Venray has a good design for using hundreds or more IRAM-like chips in a parallel machine, that sounds like a significant step forward. (The intended fab process and attendant design rules might also be quite different, although I'm not at all sure about that.)

--
Andrew Piskorski http://www.piskorski.com/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From diep at xs4all.nl Mon Jan 23 19:40:13 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 01:40:13 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F1DF542.6050504@scalableinformatics.com>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com>
Message-ID:

On Jan 24, 2012, at 1:03 AM, Joe Landman wrote:
> On 01/23/2012 06:24 PM, Vincent Diepeveen wrote:
>>
>> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote:
>
> [...]
>
>> Nanosecond latency of QPI using 2 rings versus something that has a
>> latency up to factor 1000 slower
>> with the pci-e as the slowest delaying factor.
>>
>> Doing cache coherency over that forget it.
>
> Hear that Shai F? Stop work on vSMP now, cause Vincent says it can't
> work!!!
>
> More seriously, with this acquisition, I could see serious contention
> for ScaleMP. SoC type stuff, using IB between many nodes, in
> smaller boxen.
>

That would be some BlueGene type machine you speak about that intel would produce with a low power SoC.

This is where, at this point, the bluegene type machines simply can't compete with the tiny processors that get produced by the dozens of millions.

"The tiny processors have won"
Linus Thorvalds

Intel has themselves a second law of Moore. You can google for it. Every new generation of factory that can produce chips with double the number of transistors, that factory also is 2x more expensive.
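The "second law of Moore" is usually attributed to Arthur Rock: the cost of a fab doubles roughly every four years. As a rough sanity check against the 2020 projection in the next paragraph (the ~$2.5B baseline for a 2008-era fab is an assumed round number, not an Intel figure):

    C(t) = C_0 \cdot 2^{(t - t_0)/4}, \qquad
    C(2020) \approx \$2.5\,\mathrm{B} \cdot 2^{(2020-2008)/4}
            = \$2.5\,\mathrm{B} \cdot 2^{3} = \$20\,\mathrm{B}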
A few years ago intel projected that by 2020 building a single factory would cost 20 billion dollar. Now Obama might contribute to this by overspending 40-50%, more overspending than Greece, Spain, UK and Portugal combined. So that will cause massive inflation, which will hurt the poor most, and it sure will help the 2nd law of Moore become a reality sooner rather than later. Yet if we move away from politics to money and mass production, i hope you realize that a few HPC cpu's won't pay back 20 billion dollar. In short, only cpu's that get mass produced can.

A good example of mass-produced processors are gpu's. If we look at the leading gpu's, which have by now thousands of cores, there is no way to compete with that with SoC's. What's the price of producing 1 gpu versus 200 SoC's with a small core?

Furthermore intel never really could compete in the SoC world so far with the low power cpu's that get produced by the billion a year, so betting on that would be a quite surprising, though not impossible, gamble. Intel always has been good in low latency designs. Yet obviously further integration of logic into the cpu means of course you also need a capable ethernet chip in your cpu. QLogic can provide that. Mass produce half a billion of those and then it's cheaper to buy a company with such technology than to pay royalties.

Another HPC problem with the bluegene type designs: all those soc's basically spread the calculation power over a bigger area than 1 big power eating chip will. Bigger area means bigger distance to transfer massive data, and that's in itself a very expensive thing.

Overall seen bluegene machines never really had a low power usage, despite some stupid professors shouting that. Per gflop it never was the performance king; they just compared with totally hopeless designs, and IBM usually delivered on time, something that is very important in HPC as well. IMHO the only reason bluegene could be competitive is because it was fighting dinosaur type HPC cpu's.

Now SoC's might be mighty interesting in the gamers world and in telecom to build new phones with, which makes it mighty interesting for intel to produce those dirt cheap, and maybe even put a more capable ethernet chip on them, again dirt cheap; as for the HPC world, i don't see it happen that this SoC can compete anyhow with a gpu or even a CPU. Better write some code in CUDA or OpenCL, i'd argue.

The latest AMD gpu, the Radeon HD 7970, is delivering 1 teraflop or so, and with a 2 gpu version soon coming on 1 card, that's gonna deliver close to 2 Tflop a card - double precision, yes. Multiply by 4 for single precision: 8+ Teraflop single precision. For a couple of hundred dollars. Nvidia will undoubtedly follow with their 1 teraflop gpu.

If you take a washing machine and pack it with cheapo SoC's, creating a 2 Tflop machine, do you guess you can SELL that for a couple of hundred dollars? Just the transport costs already will be more expensive than a single gpu card...

Intel cannot compete with that in HPC for the stuff that needs bandwidth and doesn't care about latency, as at a new process technology they first go produce a few FPGA cpu's, and after that they produce the world's fastest CPU. So there is simply no window in time to use the latest process technology for an HPC vector type chip. That's why AMD-ATI and Nvidia will win that contest hands down. And we sure hope intel will keep selling its cpu's very well, which, if it is the case, means that this won't change.
After all they already make cash on majority of supercomputers as each node also usually has 2 Xeon cpu's which go for a multiple of the price of the GPU that's in the box... > >>> In any case, by acquiring their IP it is a step forward towards SoC >>> (System on >>> Chip). A preliminary step (building block) for the Exascale >>> strategy and for >>> low cost enterprise/cloud solutions. > > Yes. > >> Not with intel. Intel sells fast equipment yet it has a huge price >> always, >> about the opposite of infiniband which is a dirt cheap technology. > > Must use Shakespeare for this takedown: Methinks thou dost protesteth > too much ... > >> >> I guess we must see this much simpler. At such a giant as intel, >> paying a bit over 100 million is peanuts. >> Probably less than what they would need to pay for royalties to a >> manufacturer owning a bunch of patents >> in the ethernet NIC area; the HPC intel gets 'for free'. > > So ... exactly what are the existing intel 10GbE NIC's then ... Swiss > Cheese? I see a fair number of vendors licensing Intel's IP, or, more > to the point, using Intel silicon (hint: this might be a good > reason for > the acquisition) to build their stuff... > >> Allows them to produce maybe a 10 gigabit ethernet NIC dirt cheap > > ... which they have been doing for years ... > >> without needing to pay royalties to qlogic. > > ... not sure they were, but its possible Qlogic has 10GbE IP that > Intel > licenses, but this transaction was about ... Infiniband ... > > [...] > >> meanwhile ethernet is total crucial to have low latency for the >> financial world, as they can make dozens of billions a year by being >> faster >> than others at exchanges. > > Errr ... given that this is one of our core markets, don't mind if I > note that latency is critical to these players, so proximity to the > exchange, and reliable and deterministic latency is absolutely > critical. > There are switches that are doing 300ns port to port in the Ethernet > space now. With the NICs, you are looking in the 2-ish microsecond > regime. These are not cheap. > > Compare this to QDR. 1 microsecond +/- some. > > Which has lower latency? > > There are many reasons why exchanges (mostly) aren't on IB. A few of > them are even valid technical reasons. Historical momentum, and > conservative approaches to new technology rank pretty high. So > does the > inability to generally export IB far and wide. And the complexity of > the stack. Ethernet is (almost) plug and play. Its just a network. > > IB is sort of kind of plug, install OFED, and play for a while over > IPoIB until you can recode for some of the RDMA bits. And don't > try to > run file systems and other things with lots of traffic over IPoIB. It > leaks and gradually you will catch some cool ... surprises. > > Honestly, its a shame that IPoIB never really got the attention it > deserved like the other elements of the IB stack did. Getting a rock > solid IP implementation atop a fast/low latency net could have driven > many design wins outside of HPC. And would have been a gateway > drug^H^H^H^Htechnology for using the other stack elements. > > > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics Inc. 
> email: landman at scalableinformatics.com
> web  : http://scalableinformatics.com
>        http://scalableinformatics.com/sicluster
> phone: +1 734 786 8423 x121
> fax  : +1 866 888 3112
> cell : +1 734 612 4615
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From samuel at unimelb.edu.au Mon Jan 23 19:51:59 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Tue, 24 Jan 2012 11:51:59 +1100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com>
Message-ID: <4F1E00AF.4090206@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 24/01/12 11:40, Vincent Diepeveen wrote:

> Overall seen bluegene machines never really had a low power usage,
> despite some stupid professors shouting that.

So that's why the top 5 places on the last Green500 are all BlueGene..

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8eAK8ACgkQO2KABBYQAh+nIwCdH88tISGrx772Sq/57XquLFRb
GtcAni1urHGd2j+MIJA0LXG2sGk+YymR
=tfjM
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From diep at xs4all.nl Mon Jan 23 20:00:43 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 02:00:43 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F1E00AF.4090206@unimelb.edu.au>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E00AF.4090206@unimelb.edu.au>
Message-ID: <7C608C57-D51B-4369-A973-6943E2D2DB7C@xs4all.nl>

On Jan 24, 2012, at 1:51 AM, Christopher Samuel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 24/01/12 11:40, Vincent Diepeveen wrote:
>
>> Overall seen bluegene machines never really had a low power usage,
>> despite some stupid professors shouting that.
>
> So that's why the top 5 places on the last Green500 are all BlueGene..
>

I wondered about that as well. When i see 1 gpu get nearly 1 teraflop while eating probably a tad more power than official - say 250 watt it'll consume; i already use more power now than the specs, in fact - yet even then that's 4 gflop per watt. Last time i calculated bluegene, sure that's probably the previous generation, it was 3 watts per gflop, or a factor 12 more power per gflop than a Radeon HD 7970.
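Written out, taking both inputs above at face value (the 250 W / 1 teraflop GPU and the 3 W per gflop older-generation BlueGene are the poster's figures, not measured values):

    \frac{1000\ \mathrm{gflop/s}}{250\ \mathrm{W}} = 4\ \mathrm{gflop/s\ per\ watt}
        = 0.25\ \mathrm{W\ per\ gflop/s}, \qquad
    \frac{3\ \mathrm{W/gflop}}{0.25\ \mathrm{W/gflop}} = 12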
Please note that in the statements of most HPC centers claiming blue gene to be energy efficient, usually they do not release numbers.

But now the important question: what's the price of bluegene per teraflop? Let's have a look: it's around 500 euro or so for a Radeon HD7970 card.

Vincent

> - --
> Christopher Samuel - Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.unimelb.edu.au/
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAk8eAK8ACgkQO2KABBYQAh+nIwCdH88tISGrx772Sq/57XquLFRb
> GtcAni1urHGd2j+MIJA0LXG2sGk+YymR
> =tfjM
> -----END PGP SIGNATURE-----
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From samuel at unimelb.edu.au Mon Jan 23 20:06:41 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Tue, 24 Jan 2012 12:06:41 +1100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <7C608C57-D51B-4369-A973-6943E2D2DB7C@xs4all.nl>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E00AF.4090206@unimelb.edu.au> <7C608C57-D51B-4369-A973-6943E2D2DB7C@xs4all.nl>
Message-ID: <4F1E0421.80009@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 24/01/12 12:00, Vincent Diepeveen wrote:

> But now the important question: what's the price of bluegene per
> teraflop? Let's have a look: it's around 500 euro or so for a Radeon
> HD7970 card.

What does that matter if you can't power or cool a similar performance GPU system? Let alone have any applications that will actually take advantage of it.

cheers,
Chris

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8eBCEACgkQO2KABBYQAh839wCdFz1MjiPGCKwvbKpANCmJZpnU
V4UAoJYIfKNf6VleNi0SduPcBtSkqxQq
=E7Rh
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.
From deadline at eadline.org Mon Jan 23 20:07:58 2012 From: deadline at eadline.org (Douglas Eadline) Date: Mon, 23 Jan 2012 20:07:58 -0500 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <4F1DD523.4020005@ias.edu> References: <20120123192826.GB17383@bx9.net> <4F1DD523.4020005@ias.edu> Message-ID: <39e1ffabf2e6448ea3b65da4e34b712a.squirrel@mail.eadline.org> > > On 01/23/2012 04:19 PM, Mark Hahn wrote: >>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html >> wonder what Intel's thinking - could do some very interesting stuff, >> but it would take a bit of charisma. QPI-over-IB anyone? > > That's what I'm thinking! Numascale does this already with SCI -- Doug > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From deadline at eadline.org Mon Jan 23 20:15:30 2012 From: deadline at eadline.org (Douglas Eadline) Date: Mon, 23 Jan 2012 20:15:30 -0500 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: References: <20120123192826.GB17383@bx9.net> Message-ID: <2d90512c0be6a3eba887e5f6ab96b3c1.squirrel@mail.eadline.org> >> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html > > wonder what Intel's thinking - could do some very interesting stuff, > but it would take a bit of charisma. QPI-over-IB anyone? There were some exascale goals mentioned. I wonder if there is some plans for a MIC based exascale beast -- Doug > > I'm not crazy about Intel being a vertically-integrated HPC supplier > (chips, systems, interconnect, mpi, compilers - I guess they still > don't have their own scheduler or sexy cloud branding ;) > > the world is a better place when each level has internal competition > based on useful, open (free), multi-implementation standards. > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ellis at cse.psu.edu Mon Jan 23 20:19:08 2012 From: ellis at cse.psu.edu (Ellis H. 
Wilson III) Date: Mon, 23 Jan 2012 20:19:08 -0500 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> Message-ID: <4F1E070C.4040107@cse.psu.edu> On 01/23/2012 07:40 PM, Vincent Diepeveen wrote: >>> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote: >>> Nanosecond latency of QPI using 2 rings versus something that has a >>> latency up to factor 1000 slower >>> with the pci-e as the slowest delaying factor. >>> >>> Doing cache coherency over that forget it. >> >> Hear that Shai F? Stop work on vSMP now, cause Vincent says it can't >> work!!! >> >> More seriously, with this acquisition, I could see serious contention >> for ScaleMP. SoC type stuff, using IB between many nodes, in >> smaller boxen. > > That would be some BlueGene type machine you speak about that intel > would produce with a low power SoC. > > This where at this point the bluegene type machines simply can't > compete with the tiny processors > that get produced by the dozens of millions. For...chess? ;D > "The tiny processors have won" > Linus Thorvalds *Torvalds, and if Linux (or any well-supported kernel/OS for that matter) currently had data structures designed for extremely high parallelism on a single MoBo (i.e. 100s to 10,000s of cores) then I would agree with this statement. As I currently see it, all we can really say is that someday, probably, perhaps even hopefully: "The tiny processors will win." That's after we work out all the nasty nuances involved with designing new data structures for OSes that can handle that number of cores, and probably design new applications that can use these new OS features. And no, GPU support in Linux doesn't count as this already having been done. We just farm out very specific code to run on those things. If somebody has an example of a full-blown, usable OS running on a GPU ALONE, I would stand (very interestingly) corrected. > Intel has themselves a second law of Moore. You can google for it. Thanks, for a moment there, I almost used AskJeeves. > A good example of massproduced processors are gpu's. Was waiting for the hook. Inevitable really. I think if we were discussing the efficacy and quality of resultant bread from various bread machines versus the numerous methods for making bread by hand somehow, someway, a GPU would make better bread. Might be a wholesome cyber-loaf of artisan wheat, but nonetheless, it would be better in every way. Best, ellis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Mon Jan 23 20:44:10 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Tue, 24 Jan 2012 02:44:10 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <4F1E070C.4040107@cse.psu.edu> References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu> Message-ID: In hardware you cannot beat manycore performance CPU's at the same cost structure; cpu's have an exponential cost structure, for example to maintain cache-coherency. 
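The "exponential cost structure" claim, and the 1000 mm^2 yield argument in the next paragraph, can be made concrete with the standard Poisson defect model used for back-of-envelope die costing (the defect density D below is an assumed illustrative value, not a foundry number):

    Y = e^{-AD}, \qquad \text{cost per good die} \;\propto\; \frac{A}{Y} = A\,e^{AD}

With D = 0.5 defects/cm^2: a 100 mm^2 die yields e^{-0.5} ~ 61%, while a 1000 mm^2 die yields e^{-5} ~ 0.7%, so the cost per good die grows exponentially in area. A manycore flattens that curve because a die with a few dead cores can still be sold with those cores fused off.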
This has many implications, for example on size and scale. If you produce a 1000 mm^2 cpu this is extremely expensive with real low yields, whereas a 1000 mm^2 manycore is not a problem at all; cores that do not work you can just turn off. There is no coherency. So if you produce bigger cpu's, the price goes up per square millimeter, whereas with manycores it scales near linear.

If i remember well, in 2007 an NCSA director already had put the implication of this reality in his sheets, assuming that by 2010 NCSA would build supercomputers exclusively using manycores.

Note that manycores are not ideal for chess - they are however possible to use for the majority of system time that gets burned in HPC, as the majority of HPC needs throughput rather than latency. Comparing bluegene machines with gpu's makes perfect sense of course, as the latency on them is also total crap.

I see the bluegene system by IBM as a genius move from IBM, starting an evolution, moving away from huge expensive cpu's where you produce just a handful in a totally outdated process technology, with extremely bad yields, with a million dollar of startup costs, which by now would be, at today's factories, approaching 20 million dollar startup costs just to print a single batch of processors. IBM developing power8 will have a serious problem with newer generation factories. Every batch they print, every mistake it has, DANG, 20 million dollar gone.

This concept of using simple cpu's, yet not that massively produced yet, obviously evolved now into a gpu, which is 1 total mass-produced cheap chip that integrates all those tiny cores into 1 cpu, which is way cheaper.

What's the price of a bluegene system per teraflop? It's 500 euro for a 1 teraflop double precision Radeon HD7970...

On Jan 24, 2012, at 2:19 AM, Ellis H. Wilson III wrote:
> On 01/23/2012 07:40 PM, Vincent Diepeveen wrote:
>>>> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote:
>>>> Nanosecond latency of QPI using 2 rings versus something that has a
>>>> latency up to factor 1000 slower
>>>> with the pci-e as the slowest delaying factor.
>>>>
>>>> Doing cache coherency over that forget it.
>>>
>>> Hear that Shai F? Stop work on vSMP now, cause Vincent says it
>>> can't
>>> work!!!
>>>
>>> More seriously, with this acquisition, I could see serious
>>> contention
>>> for ScaleMP. SoC type stuff, using IB between many nodes, in
>>> smaller boxen.
>>
>> That would be some BlueGene type machine you speak about that intel
>> would produce with a low power SoC.
>>
>> This is where, at this point, the bluegene type machines simply can't
>> compete with the tiny processors
>> that get produced by the dozens of millions.
>
> For...chess? ;D
>
>> "The tiny processors have won"
>> Linus Thorvalds
>
> *Torvalds, and if Linux (or any well-supported kernel/OS for that
> matter) currently had data structures designed for extremely high
> parallelism on a single MoBo (i.e. 100s to 10,000s of cores) then I
> would agree with this statement. As I currently see it, all we can
> really say is that someday, probably, perhaps even hopefully:
>
> "The tiny processors will win."
>
> That's after we work out all the nasty nuances involved with designing
> new data structures for OSes that can handle that number of cores, and
> probably design new applications that can use these new OS features.
> And no, GPU support in Linux doesn't count as this already having been
> done. We just farm out very specific code to run on those things.
> If somebody has an example of a full-blown, usable OS running on a GPU
> ALONE, I would stand (very interestingly) corrected.
>
>> Intel has themselves a second law of Moore. You can google for it.
>
> Thanks, for a moment there, I almost used AskJeeves.
>
>> A good example of mass-produced processors are gpu's.
>
> Was waiting for the hook. Inevitable really. I think if we were
> discussing the efficacy and quality of resultant bread from various
> bread machines versus the numerous methods for making bread by hand
> somehow, someway, a GPU would make better bread. Might be a wholesome
> cyber-loaf of artisan wheat, but nonetheless, it would be better in
> every way.
>
> Best,
>
> ellis
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From diep at xs4all.nl Mon Jan 23 20:55:41 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 02:55:41 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <39e1ffabf2e6448ea3b65da4e34b712a.squirrel@mail.eadline.org>
References: <20120123192826.GB17383@bx9.net> <4F1DD523.4020005@ias.edu> <39e1ffabf2e6448ea3b65da4e34b712a.squirrel@mail.eadline.org>
Message-ID: <534AD42D-DC33-4199-B476-9ADED3E09073@xs4all.nl>

On Jan 24, 2012, at 2:07 AM, Douglas Eadline wrote:
>
>>
>> On 01/23/2012 04:19 PM, Mark Hahn wrote:
>>>> http://www.hpcwire.com/hpcwire/2012-01-23/
>>>> intel_to_buy_qlogic_s_infiniband_business.html
>>> wonder what Intel's thinking - could do some very interesting stuff,
>>> but it would take a bit of charisma. QPI-over-IB anyone?
>>
>> That's what I'm thinking!
>
> Numascale does this already with SCI

They sold 300 systems, is the claim on their homepage. Not exactly what intel aims for.

I bet they instead aim to sell half a billion cpu's with built in ethernet - let's face it, their NICs started to get outdated. For HPC it won't be a slamming success, let alone give you any performance. After all, what's the price of 1000 SoC's with 1000 tiny cpu's on them, that together produce you 1 teraflop, versus 1 manycore that produces 1 teraflop?

This is not what you buy QLogic for. Maybe it was just a cheap buy for the number of patents they possess, and the big need within intel for some engineers that can improve their cpu's with connectivity that the average user will like; as for HPC, with those engineers moving within intel to the areas where intel can make the most cash - that's with cpu's and not with HPC hardware - it seems Mellanox gets a monopoly on HPC network performance.

>
> --
> Doug
>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
>> Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
>> --
>> This message has been scanned for viruses and
>> dangerous content by MailScanner, and is
>> believed to be clean.
> > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From lindahl at pbm.com Mon Jan 23 23:55:41 2012 From: lindahl at pbm.com (Greg Lindahl) Date: Mon, 23 Jan 2012 20:55:41 -0800 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <20120123192826.GB17383@bx9.net> References: <20120123192826.GB17383@bx9.net> Message-ID: <20120124045541.GB10196@bx9.net> On Mon, Jan 23, 2012 at 11:28:26AM -0800, Greg Lindahl wrote: > http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html I figured out the main why: http://seekingalpha.com/news-article/2082171-qlogic-gains-market-share-in-both-fibre-channel-and-10gb-ethernet-adapter-markets > Server-class 10Gb Ethernet Adapter and LOM revenues have recently > surpassed $100 million per quarter, and are on track for about fifty > percent annual growth, according to Crehan Research. That's the whole market, and QLogic says they are #1 in the FCoE adapter segment of this market, and #2 in the overall 10 gig adapter market (see http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-f2q12-results-earnings-call-transcript) Historically, QLogic had a fibre channel adapter business that was a huge cash cow, and they bought their way into various markets and had limited success with them: iscsi, fibre channel switches, and yes, InfiniBand, where QLogic managed to get some large sales (TriLabs 3 PF procurement) yet was at only 15%-20% market share. I'm surprised that QLogic could succeed in 10gige adapters given all the competition, but hey, I never understood why fibre channel was popular, either. Now that QLogic has found what the next best thing after fibre channel adapters is, they might as well concentrate on it. It'll be interesting what Intel plans to do in the exascale market. I've thought for a long time that non-cache-coherent processors like MIC ought to have InfiniPath-like hardware queues for sending and receiving short messages efficiently, even on-chip. Not to mention that whole exascale thing. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From scrusan at ur.rochester.edu Tue Jan 24 00:02:26 2012 From: scrusan at ur.rochester.edu (Steve Crusan) Date: Tue, 24 Jan 2012 00:02:26 -0500 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jan 23, 2012, at 8:44 PM, Vincent Diepeveen wrote: > > > It's 500 euro for a 1 teraflop double precision Radeon HD7970... 
Great, and nothing runs on it.

GPUs are insanely useful for certain tasks, but they aren't going to be able to handle most normal workloads (similar to the BG class of course). Any center that buys BGP (or Q at this point) gear is going to pay for a scientific programmer to adapt their code to take advantage of the BG's strengths: parallelism.

But it's nice that supercomputing centers use GPUs to boost their flops numbers. Any word on that Chinese system's efficiency?

If you look at the architecture of the new K computer in Japan, it's similar to the BlueGene line.

PS: I'm really not an IBMer.

> On Jan 24, 2012, at 2:19 AM, Ellis H. Wilson III wrote:
>
>> On 01/23/2012 07:40 PM, Vincent Diepeveen wrote:
>>>>> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote:
>>>>> Nanosecond latency of QPI using 2 rings versus something that has a
>>>>> latency up to factor 1000 slower
>>>>> with the pci-e as the slowest delaying factor.
>>>>>
>>>>> Doing cache coherency over that forget it.
>>>>
>>>> Hear that Shai F? Stop work on vSMP now, cause Vincent says it
>>>> can't
>>>> work!!!
>>>>
>>>> More seriously, with this acquisition, I could see serious
>>>> contention
>>>> for ScaleMP. SoC type stuff, using IB between many nodes, in
>>>> smaller boxen.
>>>
>>> That would be some BlueGene type machine you speak about that intel
>>> would produce with a low power SoC.
>>>
>>> This is where, at this point, the bluegene type machines simply can't
>>> compete with the tiny processors
>>> that get produced by the dozens of millions.
>>
>> For...chess? ;D
>>
>>> "The tiny processors have won"
>>> Linus Thorvalds
>>
>> *Torvalds, and if Linux (or any well-supported kernel/OS for that
>> matter) currently had data structures designed for extremely high
>> parallelism on a single MoBo (i.e. 100s to 10,000s of cores) then I
>> would agree with this statement. As I currently see it, all we can
>> really say is that someday, probably, perhaps even hopefully:
>>
>> "The tiny processors will win."
>>
>> That's after we work out all the nasty nuances involved with designing
>> new data structures for OSes that can handle that number of cores, and
>> probably design new applications that can use these new OS features.
>> And no, GPU support in Linux doesn't count as this already having been
>> done. We just farm out very specific code to run on those things. If
>> somebody has an example of a full-blown, usable OS running on a GPU
>> ALONE, I would stand (very interestingly) corrected.
>>
>>> Intel has themselves a second law of Moore. You can google for it.
>>
>> Thanks, for a moment there, I almost used AskJeeves.
>>
>>> A good example of mass-produced processors are gpu's.
>>
>> Was waiting for the hook. Inevitable really. I think if we were
>> discussing the efficacy and quality of resultant bread from various
>> bread machines versus the numerous methods for making bread by hand
>> somehow, someway, a GPU would make better bread. Might be a wholesome
>> cyber-loaf of artisan wheat, but nonetheless, it would be better in
>> every way.
>> >> Best, >> >> ellis >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >> Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf ---------------------- Steve Crusan System Administrator Center for Research Computing University of Rochester https://www.crc.rochester.edu/ -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.17 (Darwin) Comment: GPGTools - http://gpgtools.org iQEcBAEBAgAGBQJPHjtzAAoJENS19LGOpgqKUHUH/Rvn6tXy8Kla86JNbNwt3KUJ B+70SwJL/aBDstcDG4ChT5uW0WCcuvS7qRx5e1Zwu68m7qFEZRvIwc0uu0bgHbxt KRynFRZ6suwudEp0o4HMpCBYNaC7uG7xkUeFbUHKfnfCflWDoz4Y9Fq3a/OhoriK a5JrQqjVI6HZij+xDqrFvyn80Ec8eSwfRYd8lxfq4abHtE1tKYm/cF5I5Bn2lD5l wVNvBQiU99ZPeqhcbL5XyvIsceB6ncodJ9zmBxIahrNIogMCq7UJbUhsikSRp6Dd cL7r0AekTyiRmvZaHZZKbuad68DfATT4hy9/HzodBqTWLxxTMlrW8vNH9a7dSOo= =oA7r -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From samuel at unimelb.edu.au Tue Jan 24 00:09:57 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Tue, 24 Jan 2012 16:09:57 +1100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu> Message-ID: <4F1E3D25.7000008@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 24/01/12 16:02, Steve Crusan wrote: > Any center that buys BGP (or Q at this point) gear is > going to pay for a scientific programmer to adapt their > code to take advantage of the BG's strengths; parallelism. The advantage of the BG platform though is that it's just MPI and threads, nothing that unusual at all - certainly no need to learn CUDA, OpenCL, etc.. - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk8ePSUACgkQO2KABBYQAh+hPQCggfFgdr9R9G6H7hW0Dk1/sGK+ Fe8Aniu7M6CEThw0s7F2CtqTCmuNZMRg =mH9r -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
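As an illustration of how plain "just MPI and threads" is in practice, here is a minimal hybrid skeleton - nothing BG-specific about it (assumes an MPI with MPI_THREAD_FUNNELED support and a compiler with OpenMP; the per-thread work is a placeholder):

/* hybrid.c - skeleton of the MPI-plus-threads style described above.
 * Build: mpicc -fopenmp -O2 hybrid.c -o hybrid
 * Run:   OMP_NUM_THREADS=<threads> mpirun -np <ranks> ./hybrid
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks;

    /* FUNNELED: only the thread that called MPI_Init_thread talks to MPI */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double local = 0.0;
    #pragma omp parallel reduction(+:local)
    {
        /* each thread computes its share of the node-local work;
         * placeholder: just sum the thread ids */
        local += omp_get_thread_num() + 1;
    }

    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("%d ranks, global sum = %g\n", nranks, global);

    MPI_Finalize();
    return 0;
}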
From hahn at mcmaster.ca Tue Jan 24 00:32:08 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Tue, 24 Jan 2012 00:32:08 -0500 (EST)
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <39e1ffabf2e6448ea3b65da4e34b712a.squirrel@mail.eadline.org>
References: <20120123192826.GB17383@bx9.net> <4F1DD523.4020005@ias.edu> <39e1ffabf2e6448ea3b65da4e34b712a.squirrel@mail.eadline.org>
Message-ID:

>>> but it would take a bit of charisma. QPI-over-IB anyone?
>>
>> That's what I'm thinking!
>
> Numascale does this already with SCI

it's easy to source and build pretty big IB systems; how much so with SCI?

I actually like the idea of high-fanout-distributed-router systems, but they seem perpetually exotic. where are the hypercubes, FNNs? afaict, commodification of IB has snuffed topology as a design issue, except for cray/BG/k machine-level projects.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From james.p.lux at jpl.nasa.gov Tue Jan 24 00:53:14 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 23 Jan 2012 21:53:14 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: Message-ID:

Inevitably, though, massively parallel interconnects (all boxes connected to all other boxes) won't scale.

On 1/23/12 9:32 PM, "Mark Hahn" wrote:
>>>> but it would take a bit of charisma. QPI-over-IB anyone?
>>>
>>> That's what I'm thinking!
>>
>> Numascale does this already with SCI
>
> it's easy to source and build pretty big IB systems;
> how much so with SCI?
>
> I actually like the idea of high-fanout-distributed-router systems,
> but they seem perpetually exotic. where are the hypercubes, FNNs?
> afaict, commodification of IB has snuffed topology as a design issue,
> except for cray/BG/k machine-level projects.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From eugen at leitl.org Tue Jan 24 06:53:35 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 24 Jan 2012 12:53:35 +0100
Subject: [Beowulf] CPU Startup Combines CPU+DRAM -- And A Whole Bunch Of Crazy
In-Reply-To: <20120124003030.GA80957@piskorski.com>
References: <20120124003030.GA80957@piskorski.com>
Message-ID: <20120124115335.GW7343@leitl.org>

On Mon, Jan 23, 2012 at 07:30:30PM -0500, Andrew Piskorski wrote:
> On Mon, Jan 23, 2012 at 03:50:09PM -0500, Rayson Ho wrote:
>
> > http://iram.cs.berkeley.edu/
> >
> > So 15 years later someone suddenly thinks that it is a good idea to
> > ship IRAM systems to real customers?? :-D
>
> Sure. But from when I last read about the IRAM stuff, I'm pretty sure
> it was strictly single core. Their VIRAM1 chip had 13 MB of DRAM, 1
> cpu core, and 4 "vector lanes", with no mention of SMP or any sort of
> multi-chip parallelism at all.
> If Venray has a good design for using
> hundreds or more IRAM-like chips in a parallel machine, that sounds
> like a significant step forward. (The intended fab process and
> attendant design rules might also be quite different, although I'm not
> at all sure about that.)

In order to make best use of eDRAM it's best to organize the CPU around the layout of the memory cells, treating it as an array. You'll need a refresh register, best as wide as possible, multi-kBit word sizes, add shifts (which helps the network processor), VLIW/SIMD, large integer addition and subtraction, and so on.

If you shrink the dies and use redundant connections to route around dead dies, you can have WSI with utilization rates of >90% of the real estate. Even without FPUs such a sea of nodes on a mesh maps very well to massively parallel physical problems, AI (spiking neurons), and such. Even as a particle swarm/game physics accelerator engine integrated into RAM it really helps with massively boosting game video and physics performance, with obvious applications in GPGPU as well.

This is not at all stupid, if only this wouldn't be pushed by apparent bozos.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From deadline at eadline.org Tue Jan 24 07:48:23 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Tue, 24 Jan 2012 07:48:23 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: References: Message-ID: <986a2d9cf54a1630130a3361fc25a547.squirrel@mail.eadline.org>

> Inevitably, though, massively parallel interconnects (all boxes connected
> to all other boxes) won't scale.
>

Indeed, when thinking about scale I always end up thinking about the masters of scale -- ants

--
Doug

> On 1/23/12 9:32 PM, "Mark Hahn" wrote:
>>>>> but it would take a bit of charisma. QPI-over-IB anyone?
>>>>
>>>> That's what I'm thinking!
>>>
>>> Numascale does this already with SCI
>>
>> it's easy to source and build pretty big IB systems;
>> how much so with SCI?
>>
>> I actually like the idea of high-fanout-distributed-router systems,
>> but they seem perpetually exotic. where are the hypercubes, FNNs?
>> afaict, commodification of IB has snuffed topology as a design issue,
>> except for cray/BG/k machine-level projects.
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>

--
Doug

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Tue Jan 24 07:51:54 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 13:51:54 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu>
Message-ID:

On Jan 24, 2012, at 6:02 AM, Steve Crusan wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Jan 23, 2012, at 8:44 PM, Vincent Diepeveen wrote:
>>
>> It's 500 euro for a 1 teraflop double precision Radeon HD7970...
>
> Great, and nothing runs on it.

You build a system of millions of euro's altogether, NCSA having a huge budget, and you can't even pay for a few programmers who write some crunching code for gpu's????

> GPUs are insanely useful for certain tasks, but they aren't going
> to be able to handle most normal workloads (similar to the BG class
> of course). Any center that buys BGP (or Q at this point) gear is
> going to pay for a scientific programmer to adapt their code to
> take advantage of the BG's strengths: parallelism.
>

bluegene is ibm's equivalent of an HPC gpu, just it's a much more expensive box.

> But it's nice that supercomputing centers use GPUs to boost their
> flops numbers. Any word on that Chinese system's efficiency?

Actually, on this mailing list, if you scroll back in history and look in 2007, some chinese researchers here posted that their codes (we speak of the 512 streamcore ATI's) were already reaching 50% IPC, and it worked cross-platform at AMD and Nvidia. They got 25% efficiency at nvidia. Now if we realize that most codes on this planet can't use multiply-add, then 25% at nvidia and 50% at ATI was really good. If we look at all sorts of applications, we see that when 1 good programmer puts in the effort, suddenly it works great at gpu's.

> If you look at the architecture of the new K computer in Japan,
> it's similar to the BlueGene line.
>
> PS: I'm really not an IBMer.
>

I took a look at the latest BlueGene/Q and basically it's 4 threads per core @ 18 cores @ 1.6GHz or something they are gonna build. That's a much improved chip over the old bluegenes, which are 3 watts per gflop.

Yet to my surprise, or maybe not, it's still not in the league of gpu's. The not yet built bluegene/q supercomputer claims 2 gflops per watt now. GPU's are 4 gflops per watt now, and you can already buy them in a shop. And at least 1 chinese researcher posted here in 2007 to get 2 gflops per watt out of it. What works efficiently on such ibm hardware should also be no problem to port to a GPU.

I see no money amounts quoted on what bluegene/q is gonna cost, yet we can be sure it's gonna cost you more than a gpu in the shops. So a chip not yet sold by ibm, if i may believe wiki, especially designed for its purpose, can't compete with a gpu that's already in the shops, which has been designed for gamers. Realize that the gpu has been designed for single precision calculations and delivers 4x more single precision flops than double, and we are comparing it double precision here. BG/Q is using 45 nm process technology and the AMD 7970 is using 28 nm, to just show my point.

>
>>
>> On Jan 24, 2012, at 2:19 AM, Ellis H.
Wilson III wrote: >> >>> On 01/23/2012 07:40 PM, Vincent Diepeveen wrote: >>>>>> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote: >>>>>> Nanosecond latency of QPI using 2 rings versus something that >>>>>> has a >>>>>> latency up to factor 1000 slower >>>>>> with the pci-e as the slowest delaying factor. >>>>>> >>>>>> Doing cache coherency over that forget it. >>>>> >>>>> Hear that Shai F? Stop work on vSMP now, cause Vincent says it >>>>> can't >>>>> work!!! >>>>> >>>>> More seriously, with this acquisition, I could see serious >>>>> contention >>>>> for ScaleMP. SoC type stuff, using IB between many nodes, in >>>>> smaller boxen. >>>> >>>> That would be some BlueGene type machine you speak about that intel >>>> would produce with a low power SoC. >>>> >>>> This where at this point the bluegene type machines simply can't >>>> compete with the tiny processors >>>> that get produced by the dozens of millions. >>> >>> For...chess? ;D >>> >>>> "The tiny processors have won" >>>> Linus Thorvalds >>> >>> *Torvalds, and if Linux (or any well-supported kernel/OS for that >>> matter) currently had data structures designed for extremely high >>> parallelism on a single MoBo (i.e. 100s to 10,000s of cores) then I >>> would agree with this statement. As I currently see it, all we can >>> really say is that someday, probably, perhaps even hopefully: >>> >>> "The tiny processors will win." >>> >>> That's after we work out all the nasty nuances involved with >>> designing >>> new data structures for OSes that can handle that number of >>> cores, and >>> probably design new applications that can use these new OS features. >>> And no, GPU support in Linux doesn't count as this already having >>> been >>> done. We just farm out very specific code to run on those >>> things. If >>> somebody has an example of a full-blown, usable OS running on a GPU >>> ALONE, I would stand (very interestingly) corrected. >>> >>>> Intel has themselves a second law of Moore. You can google for it. >>> >>> Thanks, for a moment there, I almost used AskJeeves. >>> >>>> A good example of massproduced processors are gpu's. >>> >>> Was waiting for the hook. Inevitable really. I think if we were >>> discussing the efficacy and quality of resultant bread from various >>> bread machines versus the numerous methods for making bread by hand >>> somehow, someway, a GPU would make better bread. Might be a >>> wholesome >>> cyber-loaf of artisan wheat, but nonetheless, it would be better in >>> every way. 
>>> >>> Best, >>> >>> ellis >>> _______________________________________________ >>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >>> Computing >>> To change your subscription (digest mode or unsubscribe) visit >>> http://www.beowulf.org/mailman/listinfo/beowulf >> >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >> Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > > ---------------------- > Steve Crusan > System Administrator > Center for Research Computing > University of Rochester > https://www.crc.rochester.edu/ > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG/MacGPG2 v2.0.17 (Darwin) > Comment: GPGTools - http://gpgtools.org > > iQEcBAEBAgAGBQJPHjtzAAoJENS19LGOpgqKUHUH/Rvn6tXy8Kla86JNbNwt3KUJ > B+70SwJL/aBDstcDG4ChT5uW0WCcuvS7qRx5e1Zwu68m7qFEZRvIwc0uu0bgHbxt > KRynFRZ6suwudEp0o4HMpCBYNaC7uG7xkUeFbUHKfnfCflWDoz4Y9Fq3a/OhoriK > a5JrQqjVI6HZij+xDqrFvyn80Ec8eSwfRYd8lxfq4abHtE1tKYm/cF5I5Bn2lD5l > wVNvBQiU99ZPeqhcbL5XyvIsceB6ncodJ9zmBxIahrNIogMCq7UJbUhsikSRp6Dd > cL7r0AekTyiRmvZaHZZKbuad68DfATT4hy9/HzodBqTWLxxTMlrW8vNH9a7dSOo= > =oA7r > -----END PGP SIGNATURE----- > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Tue Jan 24 07:52:46 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Tue, 24 Jan 2012 13:52:46 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <4F1E3D25.7000008@unimelb.edu.au> References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu> <4F1E3D25.7000008@unimelb.edu.au> Message-ID: <08826288-2842-4C6B-B16A-180E5CCCF9D1@xs4all.nl> On Jan 24, 2012, at 6:09 AM, Christopher Samuel wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 24/01/12 16:02, Steve Crusan wrote: > >> Any center that buys BGP (or Q at this point) gear is >> going to pay for a scientific programmer to adapt their >> code to take advantage of the BG's strengths; parallelism. > > The advantage of the BG platform though is that it's just MPI and > threads, nothing that unusual at all - certainly no need to learn > CUDA, > OpenCL, etc.. > If you don't learn opencl, you're gonna run behind. 
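For anyone wondering what that learning curve looks like, the canonical first OpenCL program is a vector add. A minimal sketch in C (error checking and resource release mostly omitted for brevity; assumes an OpenCL 1.x SDK and driver are installed):

/* vadd.c - "hello world" of OpenCL: C = A + B on the GPU if present.
 * Build: cc vadd.c -lOpenCL
 */
#include <CL/cl.h>
#include <stdio.h>

#define N 1024

static const char *src =
    "__kernel void vadd(__global const float *a,\n"
    "                   __global const float *b,\n"
    "                   __global float *c) {\n"
    "    int i = get_global_id(0);\n"
    "    c[i] = a[i] + b[i];\n"
    "}\n";

int main(void)
{
    float a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0f * i; }

    cl_platform_id plat;
    cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    /* prefer a GPU, fall back to whatever the platform offers */
    if (clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL) != CL_SUCCESS)
        clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "vadd", NULL);

    cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               sizeof a, a, NULL);
    cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               sizeof b, b, NULL);
    cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, sizeof c, NULL, NULL);

    clSetKernelArg(k, 0, sizeof da, &da);
    clSetKernelArg(k, 1, sizeof db, &db);
    clSetKernelArg(k, 2, sizeof dc, &dc);

    size_t global = N;                       /* one work-item per element */
    clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, dc, CL_TRUE, 0, sizeof c, c, 0, NULL, NULL);

    printf("c[42] = %g (expect %g)\n", c[42], 3.0f * 42);
    return 0;
}

The kernel itself is the easy part; the host-side boilerplate above is most of what people mean when they complain about the stack.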
Vincent

> - --
> Christopher Samuel - Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.unimelb.edu.au/
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAk8ePSUACgkQO2KABBYQAh+hPQCggfFgdr9R9G6H7hW0Dk1/sGK+
> Fe8Aniu7M6CEThw0s7F2CtqTCmuNZMRg
> =mH9r
> -----END PGP SIGNATURE-----
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From eugen at leitl.org Tue Jan 24 08:20:40 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 24 Jan 2012 14:20:40 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: References: Message-ID: <20120124132040.GC7343@leitl.org>

On Mon, Jan 23, 2012 at 09:53:14PM -0800, Lux, Jim (337C) wrote:
> Inevitably, though, massively parallel interconnects (all boxes connected
> to all other boxes) won't scale.

You can soup up a local 3d torus with a small network like connectivity. That keeps the node connectivity and number of wires still manageable.

Moreover, the universe does it with local connectivity (even quantum entanglement needs a relativistic channel to tell it from RNG) just fine. A 3d grid/torus would be a good match for anything that can do long-range by iterating short-range interactions.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From eugen at leitl.org Tue Jan 24 08:23:27 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 24 Jan 2012 14:23:27 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <20120124132040.GC7343@leitl.org>
References: <20120124132040.GC7343@leitl.org>
Message-ID: <20120124132327.GE7343@leitl.org>

On Tue, Jan 24, 2012 at 02:20:40PM +0100, Eugen Leitl wrote:
> On Mon, Jan 23, 2012 at 09:53:14PM -0800, Lux, Jim (337C) wrote:
> > Inevitably, though, massively parallel interconnects (all boxes connected
> > to all other boxes) won't scale.
>
> You can soup up a local 3d torus with a small network

s/small network/small world network

> like connectivity. That keeps the node connectivity
> and number of wires still manageable.
>
> Moreover, the universe does it with local connectivity
> (even quantum entanglement needs a relativistic channel
> to tell it from RNG) just fine. A 3d grid/torus would
> be a good match for anything that can do long-range
> by iterating short-range interactions.
--
Eugen* Leitl leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From james.p.lux at jpl.nasa.gov Tue Jan 24 11:21:54 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Tue, 24 Jan 2012 08:21:54 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <986a2d9cf54a1630130a3361fc25a547.squirrel@mail.eadline.org>
Message-ID:

On 1/24/12 4:48 AM, "Douglas Eadline" wrote:

>
>> Inevitably, though, massively parallel interconnects (all boxes
>> connected
>> to all other boxes) won't scale.
>>
> Indeed, when thinking about scale I always end up thinking about
> the masters of scale -- ants
>
> --

Unfortunately, ants only run a small set of specialized codes, and are not the generalized computing resource that we're looking for (and, frankly, we don't yet know how to effectively use it, if it were to exist).

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From eugen at leitl.org Tue Jan 24 11:24:31 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 24 Jan 2012 17:24:31 +0100
Subject: [Beowulf] MIT Genius Stuffs 100 Processors Into Single Chip
Message-ID: <20120124162431.GJ7343@leitl.org>

http://www.wired.com/wiredenterprise/2012/01/mit-genius-stu/

MIT Genius Stuffs 100 Processors Into Single Chip

By Eric Smalley
January 23, 2012 | 6:30 am | Categories: Big Data, Tiny Chips, Data Centers, Hardware, Microprocessors, Servers, Spin-offs

Anant Agarwal is crazy. If you say otherwise, he's not doing his job. Photo: Wired.com/Eric Smalley

WESTBOROUGH, Massachusetts -- Call Anant Agarwal's work crazy, and you've made him a happy man.

Agarwal directs the Massachusetts Institute of Technology's vaunted Computer Science and Artificial Intelligence Laboratory, or CSAIL. The lab is housed in the university's Stata Center, a Dr. Seussian hodgepodge of forms and angles that nicely reflects the unhindered-by-reality visionary research that goes on inside. Agarwal and his colleagues are figuring out how to build the computer chips of the future, looking a decade or two down the road. The aim is to do research that most people think is nuts.

"If people say you're not crazy," Agarwal tells Wired, "that means you're not thinking far out enough."

Agarwal has been at this a while, and periodically, when some of his pie-in-the-sky research becomes merely cutting-edge, he dons his serial entrepreneur hat and launches the technology into the world. His latest commercial venture is Tilera. The company's specialty is squeezing cores onto chips -- lots of cores. A core is a processor, the part of a computer chip that runs software and crunches data. Today's high-end computer chips have as many as 16 cores. But Tilera's top-of-the-line chip has 100.

The idea is to make servers more efficient.
If you pack lots of simple cores onto a single chip, you're not only saving power. You're shortening the distance between cores.

Today, Tilera sells chips with 16, 32, and 64 cores, and it's scheduled to ship that 100-core monster later this year. Tilera provides these chips to Quanta, the huge Taiwanese original design manufacturer (ODM) that supplies servers to Facebook and -- according to reports -- Google. Quanta servers sold to the big web companies don't yet include Tilera chips, as far as anyone is admitting. But the chips are on some of the companies' radar screens.

Agarwal's outfit is part of an ever growing movement to reinvent the server for the internet age. Facebook and Google are now designing their own servers for their sweeping online operations. Startups such as SeaMicro are cramming hundreds of mobile processors into servers in an effort to save power in the web data center. And Tilera is tackling this same task from a different angle, cramming the processors into a single chip.

Tilera grew out of a DARPA- and NSF-funded MIT project called RAW, which produced a prototype 16-core chip in 2002. The key idea was to combine a processor with a communications switch. Agarwal calls this creation a tile, and he's able to build these many tiles into a piece of silicon, creating what's known as a "mesh network."

"Before that you had the concept of a bunch of processors hanging off of a bus, and a bus tends to be a real bottleneck," Agarwal says. "With a mesh, every processor gets a switch and they all talk to each other... You can think of it as a peer-to-peer network."

What's more, Tilera made a critical improvement to the cache memory that's part of each core. Agarwal and company made the cache dynamic, so that every core has a consistent copy of the chip's data. This Dynamic Distributed Cache makes the cores act like a single chip so they can run standard software. The processors run the Linux operating system and programs written in C++, and a large chunk of Tilera's commercialization effort focused on programming tools, including compilers that let programmers recompile existing programs to run on Tilera processors.

The end result is a 64-core chip that handles more transactions and consumes less power than an equivalent batch of x86 chips. A 400-watt Tilera server can replace eight x86 servers that together draw 2,000 watts. Facebook's engineers have given the chip a thorough tire-kicking, and Tilera says it has a growing business selling its chips to networking and videoconferencing equipment makers. Tilera isn't naming names, but claims one of the top two videoconferencing companies and one of the top two firewall companies.

An Army of Wimps

There's a running debate in the server world over what are called wimpy nodes. Startups SeaMicro and Calxeda are carving out a niche for low-power servers based on processors originally built for cellphones and tablets. Carnegie Mellon professor Dave Andersen calls these chips "wimpy." The idea is that building servers with more but lower-power processors yields better performance for each watt of power. But some have downplayed the idea, pointing out that it only works for certain types of applications.

Tilera takes the position that wimpy cores are okay, but wimpy nodes -- aka wimpy chips -- are not. Keeping the individual cores wimpy is a plus because a wimpy core is low power. But if your cores are spread across hundreds of chips, Agarwal says, you run into problems: inter-chip communications are less efficient than on-chip communications.
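To see what the mesh buys over a bus in concrete terms, a back-of-envelope C sketch (the 8x8 grid is simply an assumption matching a 64-core part, and XY routing is a generic mesh scheme, not necessarily Tilera's documented one):

  /* Average and worst-case hop count for XY routing on an 8x8 mesh. */
  #include <stdio.h>
  #include <stdlib.h>

  #define W 8

  int main(void)
  {
      long total = 0;
      int worst = 0, pairs = 0;

      for (int sx = 0; sx < W; sx++)
        for (int sy = 0; sy < W; sy++)
          for (int dx = 0; dx < W; dx++)
            for (int dy = 0; dy < W; dy++) {
                /* XY routing: travel along x first, then along y */
                int hops = abs(sx - dx) + abs(sy - dy);
                total += hops;
                if (hops > worst) worst = hops;
                pairs++;
            }
      printf("8x8 mesh: average %.2f hops, worst case %d\n",
             (double)total / pairs, worst);
      return 0;
  }

This works out to about 5.25 hops on average (14 worst case), and -- the real point -- every link can carry traffic simultaneously, where a shared bus serializes all 64 cores.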
Tilera gets the best of both worlds by using wimpy cores but putting many cores on a chip. But it still has a ways to go.

There's also a limit to how wimpy your cores can be. Google's infrastructure guru, Urs Hölzle, published an influential paper on the subject in 2010. He argued that in most cases brawny cores beat wimpy cores. To be effective, he argued, wimpy cores need to be no less than half the power of higher-end x86 cores.

Tilera is boosting the performance of its cores. The company's most recent generation of data center server chips, released in June, are 64-bit processors that run at 1.2 to 1.5 GHz. The company also doubled DRAM speed and quadrupled the amount of cache per core. "It's clear that cores have to get beefier," Agarwal says.

The whole debate, however, is somewhat academic. "At the end of the day, the customer doesn't care whether you're a wimpy core or a big core," Agarwal says. "They care about performance, and they care about performance per watt, and they care about total cost of ownership, TCO."

Tilera's performance per watt claims were validated by a paper published by Facebook engineers in July. The paper compared Tilera's second generation 64-core processor to Intel's Xeon and AMD's Opteron high end server processors. Facebook put the processors through their paces on Memcached, a high-performance database memory system for web applications.

According to the Facebook engineers, a tuned version of Memcached on the 64-core Tilera TILEPro64 yielded at least 67 percent higher throughput than low-power x86 servers. Taking power and node integration into account as well, a TILEPro64-based S2Q server with 8 processors handled at least three times as many transactions per second per Watt as the x86-based servers.

Despite the glowing words, Facebook hasn't thrown its arms around Tilera. The stumbling block, cited in the paper, is the limited amount of memory the Tilera processors support. Thirty-two-bit cores can only address about 4GB of memory. "A 32-bit architecture is a nonstarter for the cloud space," Agarwal says.

Tilera's 64-bit processors change the picture. These chips support as much as a terabyte of memory. Whether the improvement is enough to seal the deal with Facebook, Agarwal wouldn't say. "We have a good relationship," he says with a smile.

While Intel Lurks

Intel is also working on many-core chips, and it expects to ship a specialized 50-core processor, dubbed Knights Corner, in the next year or so as an accelerator for supercomputers. Unlike the Tilera processors, Knights Corner is optimized for floating point operations, which means it's designed to crunch the large numbers typical of high-performance computing applications.

In 2009, Intel announced an experimental 48-core processor code-named Rock Creek and officially labeled the Single-chip Cloud Computer (SCC). The chip giant has since backed off of some of the loftier claims it was making for many-core processors, and it focused its many-core efforts on high-performance computing. For now, Intel is sticking with the Xeon processor for high-end data center server products.

Dave Hill, who handles server product marketing for Intel, takes exception to the Facebook paper. "Really what they compared was a very optimized set of software running on Tilera versus the standard image that you get from the open source running on the x86 platforms," he says.
The Facebook engineers ran over a hundred different permutations in terms of the number of cores allocated to the Linux stack, the networking stack and the Memcached stack, Hill says. "They really kinda fine tuned it. If you optimize the x86 version, then the paper probably would have been more apples to apples."

Tilera's roadmap calls for its next generation of processors, code-named Stratton, to be released in 2013. The product line will expand the number of processors in both directions, down to as few as four and up to as many as 200 cores. The company is going from a 40-nm to a 28-nm process, meaning they're able to cram more circuits in a given area. The chip will have improvements to interfaces, memory, I/O and instruction set, and will have more cache memory.

But Agarwal isn't stopping there. As Tilera churns out the 100-core chip, he's leading a new MIT effort dubbed the Angstrom project. It's one of four DARPA-funded efforts aimed at building exascale supercomputers. In short, it's aiming for a chip with 1,000 cores.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From james.p.lux at jpl.nasa.gov Tue Jan 24 13:13:17 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Tue, 24 Jan 2012 10:13:17 -0800
Subject: [Beowulf] balance between compute and communicate
Message-ID:

One of the lines in the article Eugen posted:

"There's also a limit to how wimpy your cores can be. Google's infrastructure guru, Urs Hölzle, published an influential paper on the subject in 2010. He argued that in most cases brawny cores beat wimpy cores. To be effective, he argued, wimpy cores need to be no less than half the power of higher-end x86 cores."

Is interesting.. I think the real issue is one of "system engineering".. you want processor speed, memory size/bandwidth, and internode communication speed/bandwidth to be "balanced". Super duper 10GHz cores with 1k of RAM interconnected with 9600bps serial links is clearly an unbalanced system..

The paper is at http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36448.pdf

From the paper:

Typically, CPU power decreases by approximately O(k^2) when CPU frequency decreases by k,

Hmm.. this isn't necessarily true, with modern designs. In the bad old days, when core voltages were high and switching losses dominated, yes, this is the case, but with modern designs, the leakage losses are starting to be comparable to the switching losses (a toy model of this trade-off follows below). But that's ok, because he never comes back to the power issue again, and heads off on Amdahl's law (which we 'wulfers all know) and the inevitable single thread bottleneck that exists at some point.

However, I certainly agree with him when he says:

Cost numbers used by wimpy-core evangelists always exclude software development costs. Unfortunately, wimpy-core systems can require applications to be explicitly parallelized or otherwise optimized for acceptable performance....

But, I don't go for

Software development costs often dominate a company's overall technical expenses

I don't know that software development costs dominate.
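A toy model of the switching-versus-leakage trade-off raised above (all constants invented purely for illustration): with dynamic power alone, performance per watt keeps improving as you downclock, but adding a fixed leakage term produces a sweet spot below which wimpier cores stop paying off. The Amdahl's law mentioned is the other constraint -- speedup(N) = 1 / ((1-p) + p/N) for parallel fraction p, so the serial fraction eventually dominates no matter how many wimpy cores you add.

  /* perf/W vs clock, with and without a leakage floor.
     Toy model; c_dyn and leak are made-up constants. */
  #include <stdio.h>

  int main(void)
  {
      const double c_dyn = 1.0;   /* dynamic (switching) coefficient */
      const double leak  = 0.3;   /* leakage, frequency-independent  */

      for (double f = 0.2; f <= 1.01; f += 0.2) {
          double p_sw  = c_dyn * f * f * f;  /* ~f^3 if V tracks f */
          double p_tot = p_sw + leak;
          printf("f=%.1f  perf/W switching-only=%.2f  with leakage=%.2f\n",
                 f, f / p_sw, f / p_tot);
      }
      return 0;
  }

With these numbers the leakage-inclusive perf/W peaks near f = 0.6 rather than improving monotonically as the clock drops -- which is exactly the "limit to how wimpy" effect.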
If you're building a million computer data center (distributed geographically, perhaps), that's on the order of several billion dollars, and you can buy an awful lot of skilled developer time for a billion dollars. It might cost another billion to manage all of them, but that's still an awful lot of development. But maybe in his space, the development time is more costly than the hardware purchase and operating costs.

He summarizes with

Once a chip's single-core performance lags by more than a factor of two or so behind the higher end of current-generation commodity processors, making...

Which is essentially my system engineering balancing argument, in the context of expectations that the surrounding stuff is current generation.

So the real Computer Engineering question is: Is there some basic rule of thumb that one can use to determine appropriate balance, given things like speeds/bandwidth/power consumption? Could we, for instance, take moderately well understood implications and forecasts of future performance (e.g. Moore's law and its ilk) and predict what size machines with what performance would be reasonable in, say, 20 years? The scaling rules for CPUs, for Memory, and for Communications are fairly well understood.

(or maybe this is something that's covered in every lower division computer engineering class these days?.. I confess I'm woefully ignorant of what they teach at various levels these days)

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Tue Jan 24 13:25:07 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 19:25:07 +0100
Subject: [Beowulf] MIT Genius Stuffs 100 Processors Into Single Chip
In-Reply-To: <20120124162431.GJ7343@leitl.org>
References: <20120124162431.GJ7343@leitl.org>
Message-ID:

I remember the first announcement some years ago from Tilera.

Some people shipped emails to Tilera asking for more details. Some just asked - like me - others also offered money to buy a cpu. They all got a 'no'.

But now that there are more details, the chip sounds less impressive. Let's analyze, based upon the vague information on the homepage.

Lots of statements are there that a marketing department in India would write down: existing slogans reformulated into more political slogans, allowing you to deny later on that it performs very well. We know that trick all too well.

First of all, the homepage reports it's 23 watts, yet doesn't say whether that's idle or under full load. It just says 'active'. Active is a vague way of formulating it. I assume that's a core that isn't idle yet isn't under 100% load, so it eats only a portion of the power. So probably it's a watt or 50 under full load.

Then it says 64 cores in a grid @ 700MHz. 700MHz sounds like a clock frequency you can actually get if you're a professional (if I'd build something, count on it running at 300MHz or so). Doesn't seem like a weird claim.

64 * 0.7 = 44.8GHz aggregate.

Yet at the same time it claims on the homepage 443 billion operations per second. What is an operation? Is that an internal iop? It says it's 32 bits VLIW.
So that would mean it's processing 10 integers each cycle.

Now we know all other manufacturers cheat by a factor of 2, double counting when a single instruction does, for example, a fused multiply-add. So we can probably divide it by 2 and get to 220 Gops.

So then a vector would be 5 integers long, which seems like a weird measure. Maybe they rounded it up a tad and in reality mean 4 integers; that sounds most reasonable.

So then it's 64 cores in a grid, executing vectors of 4 units of 32 bits. Sounds plausible.

If we compare that with some GPUs which are in our notebooks from a few years ago, then suddenly it's not so impressive.

Vincent

On Jan 24, 2012, at 5:24 PM, Eugen Leitl wrote:

>
> http://www.wired.com/wiredenterprise/2012/01/mit-genius-stu/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From samuel at unimelb.edu.au Tue Jan 24 17:36:14 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Wed, 25 Jan 2012 09:36:14 +1100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu>
Message-ID: <4F1F325E.9010109@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 24/01/12 23:51, Vincent Diepeveen wrote:

> You build a system of millions of euro's altogether, NCSA having a
> huge budget and you can't even pay for a few programmers who
> write some crunching code for gpu's????

I was at a meeting at SC'06 where the folks from various large institutions in the US were bemoaning the fact that there was all this money for petaflop hardware available but none for programmers or algorithm development to make apps scale out to the systems.

Just because the scientists say it's a good thing to have doesn't mean the US government funding people will listen to them.

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8fMl4ACgkQO2KABBYQAh95lwCfQodU25X1A0yngWOOwuAqmU2X
thAAoICeeMk8fwx33enCWQ/XGvatdsEc
=OFC+
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
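The back-of-envelope in the Tilera post above is easy to check mechanically; a tiny C sketch using only the numbers quoted there (64 cores, 700MHz, a claimed 443 billion ops/s):

  /* Sanity-check the Tilera ops/cycle estimate from the thread. */
  #include <stdio.h>

  int main(void)
  {
      double cores = 64, ghz = 0.7, claimed_gops = 443;
      double gcycles = cores * ghz;              /* aggregate cycles/s */
      double ops_per_cycle = claimed_gops / gcycles;

      printf("aggregate:            %.1f Gcycles/s\n", gcycles);
      printf("claimed ops/cycle:    %.1f per core\n", ops_per_cycle);
      printf("if FMA counts double: %.1f per core\n", ops_per_cycle / 2);
      return 0;
  }

That prints 44.8, 9.9 and 4.9 -- matching the ~10, ~5, "probably 4-wide" chain of reasoning above.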
From prentice at ias.edu Wed Jan 25 17:01:48 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Wed, 25 Jan 2012 17:01:48 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu>
Message-ID: <4F207BCC.9010701@ias.edu>

On 01/24/2012 12:02 AM, Steve Crusan wrote:
>
>
> On Jan 23, 2012, at 8:44 PM, Vincent Diepeveen wrote:
>
>
>> It's 500 euro for a 1 teraflop double precision Radeon HD7970...
>
>
> Great, and nothing runs on it. GPUs are insanely useful for certain
> tasks, but they aren't going to be able to handle most normal
> workloads (similar to the BG class of course). Any center that buys BGP
> (or Q at this point) gear is going to pay for a scientific programmer
> to adapt their code to take advantage of the BG's strengths; parallelism.
>
> But It's nice that supercomputing centers use GPUs to boost their
> flops numbers. Any word on that Chinese system's efficiency? If you
> look at the architecture of the new K computer in Japan, it's similar
> to the BlueGene line.

I attended a presentation at Princeton U. on Monday about the state of HPC in China. The talk was given by someone who has been to China and spoken with the leaders of their HPC efforts. While the Chinese systems get great scores on LINPACK, even the Chinese concede that on their "real" applications, they are getting well below the theoretical max flops, because their codes aren't getting the most out of their systems. In other words, on real programs, they aren't all that efficient (yet).

--
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Wed Jan 25 19:46:57 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 26 Jan 2012 01:46:57 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F207BCC.9010701@ias.edu>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu> <4F207BCC.9010701@ias.edu>
Message-ID: <76840233-6CA8-4B9E-BF66-4A1A93CD1F1F@xs4all.nl>

The supercomputing codes I saw run on processors were, to put it politely, losing it everywhere.

Also NASA, when porting from the Origin3800 to Itanium2 1.5GHz, publicly reported a speedup of factor 2 in the forums.

However my own chessprogram, not exactly optimized for itanium2, got a boost of factor 4 moving from the 500MHz R14000 (Origin3800) to itanium2 1.3GHz. That was just a single compile, and it's an integer program, whereas the itanium2 is a floating point processor.

The itanium2 1.5GHz has 6 gflops on paper, versus 1 gflop on paper for the R14k 500MHz.

Now a Chinese reporter posted on THIS mailing list, the beowulf mailing list, already for GPU hardware some generations ago, an IPC of 25% at nvidia and 50% at AMD.

At the same gpu's back then, most student projects got around 25% at nvidia; Volkov then went ahead and understood GPU's better and scored 70% efficiency - again at very old gpu's. Since then they really improved.
See: http://www.cs.berkeley.edu/~volkov/

So you want to build a supercomputer now 10x more expensive, and each generation lose more efficiency on newer hardware, whereas some who do the effort to write good new code get very high efficiency?

Just learn how to program and ignore the disinformation - if you have a box that fast, you really can get a lot of speed out of it.

You shouldn't ask for a 1 billion dollar box that can run your oldschool Fortran codes only as well as a 5 million dollar GPU box; look what you can do to write good codes for that manycore hardware. OpenCL works everywhere, CUDA just at nvidia.

Vincent

On Jan 25, 2012, at 11:01 PM, Prentice Bisbal wrote:

> On 01/24/2012 12:02 AM, Steve Crusan wrote:
>>
>>
>> On Jan 23, 2012, at 8:44 PM, Vincent Diepeveen wrote:
>>
>>
>>> It's 500 euro for a 1 teraflop double precision Radeon HD7970...
>>
>>
>> Great, and nothing runs on it. GPUs are insanely useful for certain
>> tasks, but they aren't going to be able to handle most normal
>> workloads (similar to the BG class of course). Any center that buys
>> BGP
>> (or Q at this point) gear is going to pay for a scientific programmer
>> to adapt their code to take advantage of the BG's strengths;
>> parallelism.
>>
>> But It's nice that supercomputing centers use GPUs to boost their
>> flops numbers. Any word on that Chinese system's efficiency? If you
>> look at the architecture of the new K computer in Japan, it's similar
>> to the BlueGene line.
>
> I attended a presentation at Princeton U. on Monday about the state of
> HPC in China. The talk was given by someone who has been to China and
> spoken with the leaders of their HPC efforts. While the Chinese
> systems
> get great scores on LINPACK, even the Chinese concede that on their
> "real" applications, they are getting well below the theoretical max
> flops, because their codes aren't getting the most out of their
> systems.
> In other words, on real programs, they aren't all that efficient
> (yet).
>
> --
> Prentice
>
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From james.p.lux at jpl.nasa.gov Thu Jan 26 00:04:31 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 25 Jan 2012 21:04:31 -0800
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F1F325E.9010109@unimelb.edu.au>
Message-ID:

On 1/24/12 2:36 PM, "Christopher Samuel" wrote:

> institutions in the US were bemoaning the fact that there was all this
> money for petaflop hardware available but none for programmers or
> algorithm development to make apps scale out to the systems.

That's partly because people are an expense, while hardware is an asset that sits on the balance sheet. If I fork out a million bucks for a computer, I now have an asset that is worth a million dollars. If I fork out a million dollars for 3 skilled developers for a year, at the end of the year, it's not clear I'll possess an asset that I can sell for a million dollars.
Obviously, the work product must be worth something, because otherwise we wouldn't have jobs, but the connection is more tenuous.

The other thing (when government funding is considered) is that the million dollar hardware purchase might turn into more jobs than the 3 software weenies, if only because "computer assemblers and deliverers" get paid a lot less, and when it comes to statistics, they don't look at "cumulative wages", they look at "number of people employed"

>

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Thu Jan 26 07:28:41 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 26 Jan 2012 13:28:41 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID:

Mike, you replied to me, not to the mailing list.

Note that itanium2 released too late, and it was $100k a box initially and $7500 a cpu (1.5GHz) if you ordered a 1000. And it had the same IPC for integers as opteron at the time (later on compilers got pgo for opteron as well, and then opteron was faster, at least for Diep, in ipc).

Larrabee indeed resembles itanium to some extent, but not quite. Intel's expertise is producing high-clocked cpu's. itanium was a low clocked cpu and therefore failed. No one pays big bucks for a low clocked cpu - look on ebay, the cheapest cpu's are always the low-clocked ones.

larrabee is something in between a cpu and a gpu, so a totally different ballgame - intel moving to a market where they actually have competition and are not the ones owning the patents.

So that's not gonna be easy for intel some years from now if they show up with a 100% vectorized design and not some dreadnought in between cpu and gpu which is low clocked.

As for your infiniband remark, realize that it took 25 years or so to bugfix ethernet everywhere - forget 'setting a new standard' there for the average Joe. Not gonna work.

Infiniband is meant for HPC and uses MPI protocol to communicate. This is very powerful for clusters and the way to go when scaling at supercomputers, yet it's not gonna conquer average joe's machine, as there is a price to pay which is too high for now.

However, realize some of the sales of the HPC manufacturers go to low latency ethernet - my guess is that intel will use QLogic's know-how there to improve their cheapo cpu's and upgrade them with better ethernet. Seems a plausible goal and a very useful one; the rest, such as rivalling Mellanox at ethernet, that's not gonna happen.

On Jan 26, 2012, at 7:23 AM, MDG wrote:

> Technically the Itanium Chip was a failure, it was not 100% x86 compatible and actually was for servers, but often underperformed the traditional x86 chips; Intel let it quietly vanish as it came nowhere near the first advertised performance. It varied too far from the x86 architecture design, requiring special programming code, much like the GPUs, though they are actually able to run some parallel processes, both under Windows and Linux.
>
> There is a difference: the M series NVIDIA cards are more for servers and the C series, such as the C2070 or C2075, for Workstations. The M series also used the same numbering sequence and I think they are up to the 2090 or 2095 series, but you do need PCIe high speed slots for both sets of cards. As for resale cards, I have talked to a few sellers, and be careful: there are some knockoffs from mainland China, I verified this with NVIDIA.
>
> These GPUs are designed so that they are not seen as cores or cpus. Also most resale cards are pulled from, in one case, a pool of HP Workstations and servers, yet the seller had no idea of the difference between the C2070 and the M2070s, and as I said none of them had the required software; most did not even know it was needed! Otherwise the GPUs do not function. So, as for resales, it is a pretty expensive gamble, as they are untested, with no software to even try them with!
>
> The GPUs can be used if you write your own parallel code, usually in C++ per NVIDIA, but you still need the software to offload the work to them. If you are into heavy number crunching, assuming the task allows parallel processing versus the traditional linear method where A must always come before B and B before C in processes, you will see a lot more results than a typical program; in other things you will see little improvement. My talk with an NVIDIA technician confirmed this: you can get great results for creating, say, graphics, but very little improvement to display an already designed piece. Same for statistics, weather forecasting, geology; technically Intel has even used their network as a massive HPC to help design chips, so add engineering, along with most physics and nuclear explosion simulations, etc.
>
> Also, with fiber optics now coming down in price, the idea of multiple super-workstations and even super-servers, where in a client-server relationship the Server does most of the processing, will most likely grow into stable and usable systems before the average work-station.
>
> It will help some with a statistics driven database, but not that much for a pure relational database; it also works well with MATLAB and SPSS.
> > > > As I said I am watching the GPUs closely as so far they seem the > most likely next beak-through as software is written that can take > advantage of their unique abilities. Also from what I have read > they draw far less power than even the new generation of multi-core > x86 series. I am not an expert with these GPU systems but they do > hold a great promise as in a leap-forward than just adding x86 cores. > > The buying of Infiniband shows hat Intel is looking to move past > the copper Ethernet systems, which surpased Arcnet systems. the > only constamnt is change, while technically not an Intel Chip this > still shows Moore's law is being leveraged to other platforms > including GPUs > > Mike. > > --- On Wed, 1/25/12, Vincent Diepeveen wrote: > > From: Vincent Diepeveen > Subject: Re: [Beowulf] cpu's versus gpu's - was Intel buys QLogic > InfiniBand business > To: "Prentice Bisbal" > Cc: "Beowulf Mailing List" > Date: Wednesday, January 25, 2012, 2:46 PM > > The supercomputing codes i saw run on processors, to say polite, were > losing it everywhere. > > Also NASA when porting from Origin3800 to Itanium2 1.5Ghz, reported > publicly a speedup of factor 2 in the forums. > > However my own chessprogram, not exactly optimized for itanium2, got > a boost of factor 4 moving from 500Mhz R14000 (origin3800) > to itanium2 1.3Ghz. That was just a single compile, and it's an > integer program, whereas the itanium2 is a floating point processor. > > The itanium2 1.5Ghz has 6 gflops on paper versus the R14k 500Mhz has > 1 Gflop on paper. > > Now a Chinese reporter posted on THIS mailing list, the beowulf > mailing list, already at GPU hardware some generations ago > an IPC of 25% at nvidia and 50% at AMD. > > At the same gpu's back then, most studentprojects got around 25% at > nvidia; Volkov then went ahead and understood GPU's better > and scored 70% efficiency - again at very old gpu's. Sincethen they > really improved. > > See: http://www.cs.berkeley.edu/~volkov/ > > So you want to build a supercomptuer now 10x more expensive, and each > generation lose more efficiency on newer hardware, > whereas some who do effort to write new good code, they get very high > efficiency? > > Just learn how to program and ignore the desinformation - if you have > a box that fast you really can get a lot of speed out of it. > > You shouldn't ask for a 1 billion dollar box that can run your > oldschool Fortran codes as good as a 5 million GPU box, > look what you can do to write good codes for that manycore hardware. > OpenGL works at all, CUDA just at nvidia. > > Vincent > > On Jan 25, 2012, at 11:01 PM, Prentice Bisbal wrote: > > > On 01/24/2012 12:02 AM, Steve Crusan wrote: > >> > >> > >> On Jan 23, 2012, at 8:44 PM, Vincent Diepeveen wrote: > >> > >> > >>> It's 500 euro for a 1 teraflop double precision Radeon HD7970... > >> > >> > >> Great, and nothing runs on it. GPUs are insanely useful for certain > >> tasks, but they aren't going to be able to handle most normal > >> workloads(similar to the BG class of course). Any center that buys > >> BGP > >> (or Q at this point) gear is going to pay for a scientific > programmer > >> to adapt their code to take advantage of the BG's strengths; > >> parallelism. > >> > >> But It's nice that supercomputing centers use GPUs to boost their > >> flops numbers. Any word on that Chinese system's efficiency? If you > >> look at the architecture of the new K computer in Japan, it's > similar > >> to the BlueGene line. > > > > I attended a presentation at Princeton U. 
on Monday about the > state of > > HPC in China. The talk was given by someone who has been to > China and > > spoken with the leaders of their HPC efforts. While the Chinese > > systems > > get great scores on LINPACK, even the Chinese concede that on their > > "real" applications, they are getting well below the theoretical max > > flops, because their codes aren't getting the most out of their > > systems. > > In other words, on real programs, they aren't all that efficient > > (yet). > > > > -- > > Prentice > > > > > > > > _______________________________________________ > > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > > Computing > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Thu Jan 26 07:35:40 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Thu, 26 Jan 2012 13:35:40 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> Message-ID: On Jan 26, 2012, at 1:28 PM, Vincent Diepeveen wrote: > Mike you replied to me not to mailing list. > > note that itanium2 released too late and it was $100k a box > initially and $7500 a cpu (1.5Ghz) if you ordered a 1000. > And it had same IPC for integers like opteron at the time (later on > compilers got pgo for opteron as well and then opteron was faster, > at least for diep, in ipc). > > Larrabee indeed resembles itanium to some extend, but not quite. > intels expertise is producing highclocked cpu's. itanium was a low > clocked cpu and therefore failed. > no one pays big bucks for a low clocked cpu. look on ebay - > cheapest cpu's always the lowclocked ones. > > larrabee is something in between a cpu and a gpu so total other > ballgame - intel moving to a market where they actually have > competition > and are not the ones owning the patents. > > So that's not gonna be easy for intel some years from now if they > show up with a 100% vectorized design and not some dreadnought > in between cpu and gpu which is low clocked. > > As for your infiniband remark realize that it took 25 years or so > to bugfix ethernet everywhere - forget 'setting a new standard' > there for the average Joe. > Not gonna work. > > Infiniband is meant for HPC and uses MPI protocol to communicate. > This is very powerful for clusters and the way to go when scaling > at supercomputers, > yet it's not gonna conquer average joe's machine, as there is a > price to pay which is too high for now. > > However realize some of sales of the HPC manufacturers goes to low > latency ethernet - my guess is that intel will use qlogics know how > there to improve > their cheapo cpu's and upgrade them with better ethernet. Seems > plausible goal and a very useful one, the rest, such as rivalling > Mellanox at ethernet, > that's not gonna happen. 
>

Oops small typo during speedy write. "mellanox at ethernet" should of course be 'mellanox at HPC'.

The question is whether typical low latency ethernet products are gonna suffer from intels move. I doubt solarflare will. they already deliver this stuff only to those who really battle for every picosecond, so price is just not the issue there.

Vincent

> [snip -- the remainder re-quoted the preceding messages in full]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
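Since the thread keeps circling interconnect latency, here is the standard way such claims get measured in practice: an MPI ping-pong microbenchmark (a generic sketch -- MPI here is a library riding on whatever fabric sits underneath, InfiniBand verbs, low-latency ethernet, or plain TCP):

  /* Ping-pong: two ranks bounce a small message, report one-way latency.
     Run with exactly two ranks, e.g. mpirun -np 2 ./pingpong */
  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank, iters = 10000;
      char buf[8] = {0};

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Barrier(MPI_COMM_WORLD);

      double t0 = MPI_Wtime();
      for (int i = 0; i < iters; i++) {
          if (rank == 0) {
              MPI_Send(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
              MPI_Recv(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
          } else if (rank == 1) {
              MPI_Recv(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
              MPI_Send(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
          }
      }
      double t1 = MPI_Wtime();

      if (rank == 0)
          printf("one-way latency: %.2f usec\n",
                 (t1 - t0) / iters / 2.0 * 1e6);
      MPI_Finalize();
      return 0;
  }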
From samuel at unimelb.edu.au Thu Jan 26 18:27:21 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Fri, 27 Jan 2012 10:27:21 +1100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID: <4F21E159.7000905@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 26/01/12 23:28, Vincent Diepeveen wrote:

> Mike, you replied to me, not to the mailing list.

That was probably deliberate, and it is inconsiderate to post a reply publicly without checking with the writer that they are OK with that, especially as you quoted what they wrote - they may not have wanted that in the public domain.

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8h4VkACgkQO2KABBYQAh9lJgCfQXwsmDG9l1v4Jt9vUr5YYCr0
fDYAoJdJBbUJBApO5ZOh200gZ5+Lo/vt
=mpU4
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From hahn at mcmaster.ca Thu Jan 26 20:48:55 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Thu, 26 Jan 2012 20:48:55 -0500 (EST)
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID:

> Larrabee indeed resembles itanium to some extent, but not quite.

wow, that has to be your most loosely-tethered-to-reality statement yet! it's true that Larrabee and Itanium are very close in the number of letters in their name.

> Infiniband is meant for HPC and uses MPI protocol to communicate.

no and no.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 01:04:17 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 27 Jan 2012 07:04:17 +0100
Subject: [Beowulf] Larrabee - Mark Hahn's personal attack
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID:

On Jan 27, 2012, at 2:48 AM, Mark Hahn wrote:

>> Larrabee indeed resembles itanium to some extent, but not quite.
>
> wow, that has to be your most loosely-tethered-to-reality statement
> yet!
> it's true that Larrabee and Itanium are very close
> in the number of letters in their name.

Your personal attack seems to indicate you disagree with my qualification that the entire Larrabee line doesn't make any real sense in the long run.

Instead of throwing mud, mind explaining why Larrabee, an architecture far away from the mainstream, stands any chance of competing in HPC with the existing architectural concepts in the long run?
Vincent

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 01:06:07 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 27 Jan 2012 07:06:07 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F21E159.7000905@unimelb.edu.au>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au>
Message-ID: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl>

Why do you write this?

On Jan 27, 2012, at 12:27 AM, Christopher Samuel wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 26/01/12 23:28, Vincent Diepeveen wrote:
>
>> Mike, you replied to me, not to the mailing list.
>
> That was probably deliberate, and it is inconsiderate to post a reply
> publicly without checking with the writer that they are OK with that,
> especially as you quoted what they wrote - they may not have wanted
> that
> in the public domain.
>
> - --
> Christopher Samuel - Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.unimelb.edu.au/
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAk8h4VkACgkQO2KABBYQAh9lJgCfQXwsmDG9l1v4Jt9vUr5YYCr0
> fDYAoJdJBbUJBApO5ZOh200gZ5+Lo/vt
> =mpU4
> -----END PGP SIGNATURE-----
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From hahn at mcmaster.ca Fri Jan 27 10:37:43 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Fri, 27 Jan 2012 10:37:43 -0500 (EST)
Subject: [Beowulf] Larrabee - Mark Hahn's personal attack
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID:

>>> Larrabee indeed resembles itanium to some extent, but not quite.
>>
>> wow, that has to be your most loosely-tethered-to-reality statement
>> yet!
>> it's true that Larrabee and Itanium are very close
>> in the number of letters in their name.
>
> Your personal attack seems to indicate you disagree with my
> qualification that the entire Larrabee line doesn't make any real
> sense in the long run.

not surprisingly, no: I disagree that Larrabee and Itanium resemble each other in any but really silly ways.

Itanium is a custom, VLIW architecture; Larrabee is an on-chip cluster of non-VLIW, commodity x86_64 cores.
none of the distinctive features of Itanium (multi-instruction bundles,
dependency on compile-time scheduling, intended market, implementation,
success limited to predictable, high-bandwidth situations, directory-based
inter-node cache coherency) are anything close to the features of Larrabee
(standard x86_64 ISA, no special compiler needed, on-chip message-passing
network, suitable for complex/dynamic/unpredictable loads, possibly not even
cache-coherent across one chip.)

my guess is that you were thinking about how ia64 chips tended to
run at low clock rates, and thinking about how gpus (probably including
larrabee) also tend to be low-clocked.

> Instead of throwing mud, would you mind explaining why Larrabee,
> an architecture far away from the mainstream, stands any chance of
> competing in HPC with the existing architectural concepts in the
> long run?

as far as I know, larrabee will be a mesh of conventional x86_64 cores
that will run today's x86_64 code. I don't know whether Intel has stated
(or even decided) whether the cores will have full or partial cache
coherency, or whether they'll really be an MPI-like shared-nothing
cluster.

if you want to compare Larrabee to Fermi or AMD GCN, that might be
interesting. or to mainstream multicore - like bulldozer, with 32c
per package vs larrabee with ">=50".

but not ia64. it's best we all just forget about it.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From hahn at mcmaster.ca Fri Jan 27 10:39:06 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Fri, 27 Jan 2012 10:39:06 -0500 (EST)
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl>
Message-ID: 

> Why do you write this?

because he thought you might be interested in improving your etiquette.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From landman at scalableinformatics.com Fri Jan 27 10:42:48 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Fri, 27 Jan 2012 10:42:48 -0500
Subject: [Beowulf] Larrabee - Mark Hahn's personal attack
In-Reply-To: 
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID: <4F22C5F8.6010804@scalableinformatics.com>

On 01/27/2012 10:37 AM, Mark Hahn wrote:
>>>> Larrabee indeed resembles itanium to some extent, but not quite.
>>>
>>> wow, that has to be your most loosely-tethered-to-reality statement
>>> yet!
>>> it's true that Larrabee and Itanium are very close
>>> in the number of letters in their name.
>>
>> Your personal attack seems to indicate that you disagree with my
>> assessment that the entire Larrabee line has no realistic future.
>
> not surprisingly, no: I disagree that Larrabee and Itanium resemble
> each other in any but really silly ways.
>
> Itanium is a custom, VLIW architecture; Larrabee is an on-chip
> cluster of non-VLIW, commodity x86_64 cores.

But ... but .... they are both made of Silicon .... doesn't that mean
they are the same?

/sarc

(Sorry, it's been a fun week ... and this was just ... too ...
irresistible ...)

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From prentice at ias.edu Fri Jan 27 11:06:00 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Fri, 27 Jan 2012 11:06:00 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl>
Message-ID: <4F22CB68.3080605@ias.edu>

Vincent,

He wrote that because he's trying to educate you on proper mailing list
etiquette, which is something you appear to be lacking.

Chris is absolutely right - you should not reply to off-list e-mails
on-list.

--
Prentice

On 01/27/2012 01:06 AM, Vincent Diepeveen wrote:
> Why do you write this?
>
> On Jan 27, 2012, at 12:27 AM, Christopher Samuel wrote:
>
> On 26/01/12 23:28, Vincent Diepeveen wrote:
>
>>> Mike you replied to me not to mailing list.
>
> That was probably deliberate, and it is inconsiderate to post a reply
> publicly without checking with the writer that they are OK with that,
> especially as you quoted what they wrote - they may not have wanted
> that
> in the public domain.
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 11:12:35 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 27 Jan 2012 17:12:35 +0100
Subject: [Beowulf] Larrabee - Mark Hahn's personal attack
In-Reply-To: 
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID: <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl>

On Jan 27, 2012, at 4:37 PM, Mark Hahn wrote:

>>>> Larrabee indeed resembles itanium to some extent, but not quite.
>>>
>>> wow, that has to be your most loosely-tethered-to-reality statement
>>> yet!
>>> it's true that Larrabee and Itanium are very close
>>> in the number of letters in their name.
>>
>> Your personal attack seems to indicate that you disagree with my
>> assessment that the entire Larrabee line has no realistic future.
>
> not surprisingly, no: I disagree that Larrabee and Itanium resemble
> each other in any but really silly ways.
>
> Itanium is a custom, VLIW architecture; Larrabee is an on-chip
> cluster of non-VLIW, commodity x86_64 cores.
>
> none of the distinctive features of Itanium (multi-instruction
> bundles, dependency on compile-time scheduling, intended market,
> implementation, success limited to predictable, high-bandwidth
> situations, directory-based inter-node cache coherency) are anything
> close to the features of Larrabee (standard x86_64 ISA, no special
> compiler needed, on-chip message-passing network, suitable for
> complex/dynamic/unpredictable loads, possibly not even cache-coherent
> across one chip.)
>
> my guess is that you were thinking about how ia64 chips tended to
> run at low clock rates, and thinking about how gpus (probably
> including larrabee) also tend to be low-clocked.
>

And both seem to be failures from the user viewpoint - maybe not from
Intel's income viewpoint, but from Intel's aim to replace and/or create
a new long-lasting architecture that can even *remotely* compete with
other manufacturers, not to mention the far too high price points for
such cpu's.

>> Instead of throwing mud, would you mind explaining why Larrabee,
>> an architecture far away from the mainstream, stands any chance of
>> competing in HPC with the existing architectural concepts in the
>> long run?
>
> as far as I know, larrabee will be a mesh of conventional x86_64 cores
> that will run today's x86_64 code. I don't know whether Intel has
> stated (or even decided) whether the cores will have full or partial
> cache coherency, or whether they'll really be an MPI-like
> shared-nothing cluster.

Assuming you're not completely born stupid, I assume you will realize
that IN ORDER to run most existing x64 codes it needs to have cache
coherency, and that it has always been presented as having exactly
that. Which is one of the reasons why the architecture doesn't scale,
of course.

Well, you can forget about running your x64 Fortran codes on it at any
decent speed.

You need to totally rewrite your code to be able to use vectors of
doubles. In contrast to GPU's - where you can do indirect array lookups
per PE or per 'compute core' (which in the AMD-ATI case is 4 PE's, each
able to execute 1 double a cycle) - such lookups are a disaster on
Larrabee, costing 7 cycles per indirect lookup, so you really need to
use the vectors.

Now I bet the majority of your old x64 code doesn't use such huge
vectors, so to get even remotely decent performance out of it, a total
rewrite of most code is needed - if it can work at all.

We can then also see that GPU's are totally superior to Larrabee in
most areas, most importantly for multiplication-heavy codes. As you
might know, GPU's are the world champion at doing multiplications and
CPU's are not.

Multiplication happens to be of major importance for the majority of
HPC codes - majority really meaning approaching 90% of the public
supercomputers.

Vincent

> if you want to compare Larrabee to Fermi or AMD GCN, that might be
> interesting. or to mainstream multicore - like bulldozer, with 32c
> per package vs larrabee with ">=50".
>
> but not ia64. it's best we all just forget about it.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.
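To make the two performance claims in the message above concrete, here
are two minimal C sketches. They are illustrative asides only - the
function names, sizes and numbers in them are invented for
illustration, and are not taken from the thread or from any Larrabee
documentation.

The first contrasts a unit-stride loop, which a vectorizing compiler
can map straight onto wide SIMD registers, with an indirect (gather)
loop, where every element goes through an index table so each vector
lane needs its own address - the access pattern claimed above to cost
several cycles per lookup on a wide-vector design:

#include <stdio.h>

#define N 1024

/* Unit-stride: maps directly onto vector registers (try gcc -O3). */
static void scale_contiguous(double *a, const double *b, double s, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = s * b[i];
}

/* Indirect (gather): each element is fetched through an index table,
   so the compiler must either emit gather instructions or fall back
   to scalar code. */
static void scale_indirect(double *a, const double *b, const int *idx,
                           double s, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = s * b[idx[i]];
}

int main(void)
{
    static double a[N], b[N];
    static int idx[N];

    for (int i = 0; i < N; i++) {
        b[i] = (double)i;
        idx[i] = (i * 7) % N;   /* a scattered permutation of 0..N-1 */
    }

    scale_contiguous(a, b, 2.0, N);
    scale_indirect(a, b, idx, 2.0, N);
    printf("%f %f\n", a[10], a[N - 1]);
    return 0;
}

The second sketches the cache-coherency cost that the scaling argument
above leans on: two threads incrementing counters that share one cache
line force that line to bounce between cores on every write, while
padding the counters onto separate lines removes the coherency traffic.
It assumes 64-byte cache lines and 8-byte longs, and builds with
gcc -O2 -fopenmp:

#include <omp.h>
#include <stdio.h>

#define ITERS 100000000L

/* Two counters on one cache line: every increment by one thread
   invalidates the line in the other thread's cache. */
static volatile long shared_line[2];

/* The same counters padded onto separate 64-byte lines. */
static struct { volatile long v; char pad[56]; } padded[2];

int main(void)
{
    double t0 = omp_get_wtime();
#pragma omp parallel num_threads(2)
    {
        int me = omp_get_thread_num();
        for (long i = 0; i < ITERS; i++)
            shared_line[me]++;          /* line ping-pongs */
    }
    double t1 = omp_get_wtime();

#pragma omp parallel num_threads(2)
    {
        int me = omp_get_thread_num();
        for (long i = 0; i < ITERS; i++)
            padded[me].v++;             /* no line is shared */
    }
    double t2 = omp_get_wtime();

    printf("shared line: %.2f s  padded: %.2f s\n", t1 - t0, t2 - t1);
    return 0;
}

On most multicore machines the padded version runs several times
faster. Whether that per-line overhead actually dominates at the scale
of a whole Larrabee-class chip is exactly the point in dispute here.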
From diep at xs4all.nl Fri Jan 27 11:15:05 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 27 Jan 2012 17:15:05 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <4F22CB68.3080605@ias.edu> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> Message-ID: And why do you post this? On Jan 27, 2012, at 5:06 PM, Prentice Bisbal wrote: > Vincent, > > He wrote that because he's trying to educate you on proper mailing > list > etiquette, which is something you appear to be lacking. > > Chris is absolutely right - you should not reply to off-list e-mails > on-list. > > -- > Prentice > > On 01/27/2012 01:06 AM, Vincent Diepeveen wrote: >> Why do you write this? >> >> On Jan 27, 2012, at 12:27 AM, Christopher Samuel wrote: >> >> On 26/01/12 23:28, Vincent Diepeveen wrote: >> >>>>> Mike you replied to me not to mailing list. >> >> That was probably deliberate, and it is inconsiderate to post a reply >> publicly without checking with the writer that they are OK with that, >> especially as you quoted what they wrote - they may not have wanted >> that >> in the public domain. >> > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From ellis at cse.psu.edu Fri Jan 27 11:25:15 2012 From: ellis at cse.psu.edu (Ellis H. Wilson III) Date: Fri, 27 Jan 2012 11:25:15 -0500 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> Message-ID: <4F22CFEB.6080404@cse.psu.edu> On 01/27/2012 11:15 AM, Vincent Diepeveen wrote: > And why do you post this? "Assuming you're not completely born stupid, i assume you will realize that IN ORDER to" write an effective email that conveys some idea or argument, it is extremely helpful to utilize some form of etiquette or at the very least, self-restraint in your writing so we all don't stop reading your emails. In fact, while it's not a terribly great book IMHO, it might still help to read "How to Win Friends and Influence People." Seems like you have enough time on your hands to write near-to-incoherent emails on this list and program near-to-impossible applications for GPUs, so perhaps if you can steal a little time from one or the other you can finish it in a day or so. But admittedly, perhaps requesting etiquette from you is truly an unthinkable thing to do. Hence your boggled state of mind. ellis > > On Jan 27, 2012, at 5:06 PM, Prentice Bisbal wrote: > >> Vincent, >> >> He wrote that because he's trying to educate you on proper mailing >> list >> etiquette, which is something you appear to be lacking. >> >> Chris is absolutely right - you should not reply to off-list e-mails >> on-list. 
>> >> -- >> Prentice >> >> On 01/27/2012 01:06 AM, Vincent Diepeveen wrote: >>> Why do you write this? >>> >>> On Jan 27, 2012, at 12:27 AM, Christopher Samuel wrote: >>> >>> On 26/01/12 23:28, Vincent Diepeveen wrote: >>> >>>>>> Mike you replied to me not to mailing list. >>> >>> That was probably deliberate, and it is inconsiderate to post a reply >>> publicly without checking with the writer that they are OK with that, >>> especially as you quoted what they wrote - they may not have wanted >>> that >>> in the public domain. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Fri Jan 27 11:34:41 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Fri, 27 Jan 2012 11:34:41 -0500 Subject: [Beowulf] Larrabee - Mark Hahn's personal attack In-Reply-To: <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> Message-ID: <4F22D221.3020504@ias.edu> On 01/27/2012 11:12 AM, Vincent Diepeveen wrote: > And both are seem failures from user viewpoint, maybe not from intels > income viewpoint, > but from intels aim to replace and/or create a new long lasting > architecture > that can even *remotely* compete with other manufacturers, > not to mention far too high pricepoints for such cpu's. This argument is ridiculous. Just because two completely different technologies (architectures) both fail, doesn't make them similar. That's like saying a Ford Edsel and Pontiac Aztek are similar cars. > Assuming you're not completely born stupid, i assume you will realize > that IN ORDER to run Calling someone "completely born stupid" is unacceptable behavior. > most existing x64 codes, it needs to have cache coherency, and that > it always has been > presented as having exactly that. > Which is one of reasons why the architecture doesn't scale of course. Cache-coherent systems don't scale well? Really? SGI Origins were ccNUMA systems, and they scaled well. > Well you can forget about them running your x64 fortran codes on it > at any fast speed. > > You need to total rewrite your code to be able to use vectors of > doubles, > and in contradiction to GPU's where you can indirectly with arrays > see each PE or each 'compute core' > (which is 4 PE's of in case of AMD-ATI that can execute 1 double a This argument makes no sense in the context of this discussion. You need to do a significant rewrite of your code to take advantage of GPUs, too, so how are GPUs better? > cycle), > > Such lookups are a disaster at larrabee - having a cost of 7 cycles > for indirect lookups, > so you really need to use vectors. > > Now i bet majority of your oldie x64 code doesn't use such huge vectors, > so to even get some remote performance out of it, a total rewrite of > most code is needed, > if it can work at all. > > We can then also see the insight that GPU's are total superior to > larrabee at most terrains and > most importantly at multiplicative codes. > > As you might know GPU's are worldchampion in doing multiplications > and CPU's are not. > > Multiplication happens to be something that is of major importance > for the majority of HPC codes. 
> Majority i really mean - approaching 90% at the public supercomputers. I'm at a loss for words... Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Fri Jan 27 11:38:02 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Fri, 27 Jan 2012 11:38:02 -0500 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> Message-ID: <4F22D2EA.1080309@ias.edu> Vincent, I posted that because you asked a question and I answered it, which is also good mailing list etiquette. Since you posted your question "Why do you write this?" to the mailing list instead of replying just to Chris, anyone on this list is free to reply to it. Again, this is basic mailing list etiquette. -- Prentice On 01/27/2012 11:15 AM, Vincent Diepeveen wrote: > And why do you post this? > > On Jan 27, 2012, at 5:06 PM, Prentice Bisbal wrote: > >> Vincent, >> >> He wrote that because he's trying to educate you on proper mailing >> list >> etiquette, which is something you appear to be lacking. >> >> Chris is absolutely right - you should not reply to off-list e-mails >> on-list. >> >> -- >> Prentice >> >> On 01/27/2012 01:06 AM, Vincent Diepeveen wrote: >>> Why do you write this? >>> >>> On Jan 27, 2012, at 12:27 AM, Christopher Samuel wrote: >>> >>> On 26/01/12 23:28, Vincent Diepeveen wrote: >>> >>>>>> Mike you replied to me not to mailing list. >>> That was probably deliberate, and it is inconsiderate to post a reply >>> publicly without checking with the writer that they are OK with that, >>> especially as you quoted what they wrote - they may not have wanted >>> that >>> in the public domain. >>> >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >> Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Fri Jan 27 11:41:55 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 27 Jan 2012 17:41:55 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <4F22CFEB.6080404@cse.psu.edu> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> Message-ID: <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> On Jan 27, 2012, at 5:25 PM, Ellis H. 
Wilson III wrote:

> On 01/27/2012 11:15 AM, Vincent Diepeveen wrote:
>> And why do you post this?

So you can follow all the etiquette, yet technically your mind is not
capable of following the discussions - so you just resorted to replying
about etiquette.

That says more about you than about me.

What everyone hates about politics is that people talk only about how
things are phrased instead of looking at the intention behind what was
said.

Why don't you go into politics? Maybe you'll do better there.

Vincent
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From ellis at cse.psu.edu Fri Jan 27 11:58:25 2012
From: ellis at cse.psu.edu (Ellis H. Wilson III)
Date: Fri, 27 Jan 2012 11:58:25 -0500
Subject: [Beowulf] The Absurdity of Diep - Was cpu's versus gpu's - Was Intel buys QLogic InfiniBand business
In-Reply-To: <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl>
Message-ID: <4F22D7B1.4020508@cse.psu.edu>

On 01/27/2012 11:41 AM, Vincent Diepeveen wrote:
> On Jan 27, 2012, at 5:25 PM, Ellis H. Wilson III wrote:
>
>> On 01/27/2012 11:15 AM, Vincent Diepeveen wrote:
>>> And why do you post this?
>
> So you can follow all the etiquette, yet technically your mind is not
> capable of following the discussions - so you just resorted to
> replying about etiquette.

No, I've given up writing technically when you're posting because:

a) You go into discussions to prove everyone wrong
b) You rapidly switch the topic if too many people disagree, which is
frustrating and confusing (hence "Intel buys QLogic" became "cpus
versus gpus", which became Itanium vs Larrabee somehow, and now it is
how poorly you communicate)
c) There is nothing to gain from having discussions with you

> That says more about you than about me.

My personal background is storage and communication protocol-heavy.
Not processor-oriented. You are right to suggest I am hesitant to post
on a thread that directly compares two seemingly different processors,
just like you hesitate to deal with the reality that you lack basic
social skills. Everyone caters to their own strengths, and generally
(if they are wise), takes a back-seat and tries to learn something in
areas they are weak.

> What everyone hates about politics is that people talk only about how
> things are phrased instead of looking at the intention behind what
> was said.
>
> Why don't you go into politics? Maybe you'll do better there.

Just because this is a list on Beowulfery and broadly covers everything
remotely attached to HPC does not mean it needs to be bereft of a
baseline of etiquette and respect for one another. I know quite a few
very nice but rather intelligent and technically-capable people. These
two qualities can in fact coexist in a person, believe it or not.
Best,

ellis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 12:03:38 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 27 Jan 2012 18:03:38 +0100
Subject: [Beowulf] Larrabee - Mark Hahn's personal attack
In-Reply-To: <4F22D221.3020504@ias.edu>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> <4F22D221.3020504@ias.edu>
Message-ID: <208B7C7D-3A3E-4134-A352-4D7D78B304D1@xs4all.nl>

On Jan 27, 2012, at 5:34 PM, Prentice Bisbal wrote:

> On 01/27/2012 11:12 AM, Vincent Diepeveen wrote:
>> And both seem to be failures from the user viewpoint - maybe not
>> from Intel's income viewpoint, but from Intel's aim to replace
>> and/or create a new long-lasting architecture that can even
>> *remotely* compete with other manufacturers, not to mention the far
>> too high price points for such cpu's.
>
> This argument is ridiculous. Just because two completely different
> technologies (architectures) both fail, doesn't make them similar.
>
> That's like saying a Ford Edsel and Pontiac Aztek are similar cars.
>
>> Assuming you're not completely born stupid, I assume you will realize
>> that IN ORDER to run
>
> Calling someone "completely born stupid" is unacceptable behavior.

Whereas everyone knows Intel's statements on Larrabee there, and that
without cache coherency you can't multithread and everything also has
to be done blocked - so there is zero compatibility with x64 then, and
any compatibility cannot be guaranteed. You know this really well - yet
you played dumb there, trying to score cheap points.

Without cache coherency it is of course easy to build big cpu's that
scale well, yet then they don't run x64. Of course Intel will be
forced, somewhere in the future, to come up with some kick-butt design
that is not x64 compatible at all and doesn't use things like cache
coherency. Which isn't remotely the idea of Larrabee. That's why you
wrote it down as such.

>> most existing x64 codes, it needs to have cache coherency, and that
>> it has always been presented as having exactly that.
>> Which is one of the reasons why the architecture doesn't scale, of
>> course.
>
> Cache-coherent systems don't scale well? Really? SGI Origins were
> ccNUMA systems, and they scaled well.

Indeed, and they didn't scale anywhere near linearly in price. Each
Origin 3800 @ 64 processors @ 1.5Ghz was exactly 1 million dollars,
whereas a simple normal x64 cpu at the time had a price similar to the
square root of that.

With GPU's everything scales very cheaply, and when you use cache
coherency you start to lose that scaling. Yields go down, of course.
Most manufacturers need a pretty high yield to sell a chip at any
decent price, so the production cost of a Larrabee chip in the same
process technology as a GPU, with the same performance, will be higher
by a huge factor. That will also cause Intel to sell really few of
them. Would you consider buying a Larrabee at 1 million dollars a card?

>> Well, you can forget about running your x64 Fortran codes on it at
>> any decent speed.
>>
>> You need to totally rewrite your code to be able to use vectors of
>> doubles. In contrast to GPU's - where you can do indirect array
>> lookups per PE or per 'compute core' (which in the AMD-ATI case is
>> 4 PE's, each able to execute 1 double a
>
> This argument makes no sense in the context of this discussion. You
> need to do a significant rewrite of your code to take advantage of
> GPUs, too, so how are GPUs better?

If you need to rewrite it anyway, why not get much faster performance
at a fraction of the price? It's the same effort either way.

>> cycle) - such lookups are a disaster on Larrabee, costing 7 cycles
>> per indirect lookup, so you really need to use the vectors.
>>
>> Now I bet the majority of your old x64 code doesn't use such huge
>> vectors, so to get even remotely decent performance out of it, a
>> total rewrite of most code is needed - if it can work at all.
>>
>> We can then also see that GPU's are totally superior to Larrabee in
>> most areas, most importantly for multiplication-heavy codes. As you
>> might know, GPU's are the world champion at doing multiplications
>> and CPU's are not.
>>
>> Multiplication happens to be of major importance for the majority of
>> HPC codes - majority really meaning approaching 90% of the public
>> supercomputers.
>
> I'm at a loss for words...
>

http://www.nwo.nl/nwohome.nsf/pages/NWOP_8DEEKL_Eng
title: "Overview of recent supercomputers 2010"
Author: Aad van der Steen

>
> Prentice
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From prentice at ias.edu Fri Jan 27 13:29:52 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Fri, 27 Jan 2012 13:29:52 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl>
Message-ID: <4F22ED20.7040105@ias.edu>

On 01/27/2012 11:41 AM, Vincent Diepeveen wrote:
> On Jan 27, 2012, at 5:25 PM, Ellis H. Wilson III wrote:
>
>> On 01/27/2012 11:15 AM, Vincent Diepeveen wrote:
>>> And why do you post this?
>
> So you can follow all the etiquette, yet technically your mind is not
> capable of following the discussions - so you just resorted to
> replying about etiquette.
>
> That says more about you than about me.
>

What it says is that we've given up on discussing technology with you,
because your arguments are completely nonsensical. Since you clearly
don't understand technology, we're hoping you can at least understand
the simple concepts of basic etiquette.
-- Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From glykos at mbg.duth.gr Fri Jan 27 13:57:31 2012 From: glykos at mbg.duth.gr (Nicholas M Glykos) Date: Fri, 27 Jan 2012 20:57:31 +0200 (EET) Subject: [Beowulf] Signal to noise. In-Reply-To: <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> Message-ID: Dear List, I have been a (mostly) quiet reader of this list for the last ~5 years and my intention is to continue reading the excellent posts that the members of this community contribute almost daily. Having said that, the recent Vincent-centric 'discussions' have ---as I am sure you all know--- significantly reduced the signal-to-noise ratio. Can we get back to normal, please ? Thanks, Nicholas -- Nicholas M. Glykos, Department of Molecular Biology and Genetics, Democritus University of Thrace, University Campus, Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620, Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From moloney.brendan at gmail.com Fri Jan 27 14:26:12 2012 From: moloney.brendan at gmail.com (Brendan Moloney) Date: Fri, 27 Jan 2012 11:26:12 -0800 Subject: [Beowulf] Signal to noise. In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> Message-ID: I am in a similar position. I posted a question to this list quite some time ago but have remained subscribed to the list ever since. I have always (or at least until recently) enjoyed reading the discussions on here. I hope that one person does not ruin such a great resource. Thanks, Brendan On Fri, Jan 27, 2012 at 10:57 AM, Nicholas M Glykos wrote: > > Dear List, > > I have been a (mostly) quiet reader of this list for the last ~5 years and > my intention is to continue reading the excellent posts that the members > of this community contribute almost daily. Having said that, the recent > Vincent-centric 'discussions' have ---as I am sure you all know--- > significantly reduced the signal-to-noise ratio. Can we get back to > normal, please ? > > Thanks, > Nicholas > > -- > > > Nicholas M. Glykos, Department of Molecular Biology > and Genetics, Democritus University of Thrace, University Campus, > Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620, > Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/ > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From h-bugge at online.no Fri Jan 27 14:29:35 2012
From: h-bugge at online.no (=?iso-8859-1?Q?H=E5kon_Bugge?=)
Date: Fri, 27 Jan 2012 11:29:35 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <20120124045541.GB10196@bx9.net>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net>
Message-ID: <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no>

Greg,

On 23. jan. 2012, at 20.55, Greg Lindahl wrote:

> On Mon, Jan 23, 2012 at 11:28:26AM -0800, Greg Lindahl wrote:
>
>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>
> I figured out the main why:
>
> http://seekingalpha.com/news-article/2082171-qlogic-gains-market-share-in-both-fibre-channel-and-10gb-ethernet-adapter-markets
>
>> Server-class 10Gb Ethernet Adapter and LOM revenues have recently
>> surpassed $100 million per quarter, and are on track for about fifty
>> percent annual growth, according to Crehan Research.
>
> That's the whole market, and QLogic says they are #1 in the FCoE
> adapter segment of this market, and #2 in the overall 10 gig adapter
> market (see
> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-f2q12-results-earnings-call-transcript)

That can explain why QLogic is selling, but not why Intel is buying.

10 years ago, Intel went _out_ of the Infiniband market, see
http://www.networkworld.com/newsletters/servers/2002/01383318.html

So has the IB business evolved so incredibly well compared to what
Intel expected back in 2002? I do not think so.

I would guess that we will see message passing/RDMA over Thunderbolt
or similar.

Håkon
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 15:06:54 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 27 Jan 2012 21:06:54 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no>
Message-ID: <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl>

On Jan 27, 2012, at 8:29 PM, Håkon Bugge wrote:

> Greg,
>
>
> On 23. jan. 2012, at 20.55, Greg Lindahl wrote:
>
>> On Mon, Jan 23, 2012 at 11:28:26AM -0800, Greg Lindahl wrote:
>>
>>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>>
>> I figured out the main why:
>>
>> http://seekingalpha.com/news-article/2082171-qlogic-gains-market-share-in-both-fibre-channel-and-10gb-ethernet-adapter-markets
>>
>>> Server-class 10Gb Ethernet Adapter and LOM revenues have recently
>>> surpassed $100 million per quarter, and are on track for about fifty
>>> percent annual growth, according to Crehan Research.
>> That's the whole market, and QLogic says they are #1 in the FCoE
>> adapter segment of this market, and #2 in the overall 10 gig adapter
>> market (see
>> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-f2q12-results-earnings-call-transcript)
>
> That can explain why QLogic is selling, but not why Intel is buying.
>
> 10 years ago, Intel went _out_ of the Infiniband market, see
> http://www.networkworld.com/newsletters/servers/2002/01383318.html
>
> So has the IB business evolved so incredibly well compared to what
> Intel expected back in 2002? I do not think so.
>
> I would guess that we will see message passing/RDMA over Thunderbolt
> or similar.

Qlogic offers QDR there. Mellanox is a generation newer, with FDR.
Both in latency and in bandwidth that is a huge difference.

> Håkon
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From landman at scalableinformatics.com Fri Jan 27 15:19:31 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Fri, 27 Jan 2012 15:19:31 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl>
Message-ID: <4F2306D3.4080509@scalableinformatics.com>

On 01/27/2012 03:06 PM, Vincent Diepeveen wrote:
>
> On Jan 27, 2012, at 8:29 PM, Håkon Bugge wrote:
>
>> Greg,
>>
>>
>> On 23. jan. 2012, at 20.55, Greg Lindahl wrote:
>>
>>> On Mon, Jan 23, 2012 at 11:28:26AM -0800, Greg Lindahl wrote:
>>>
>>>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>>>
>>> I figured out the main why:
>>>
>>> http://seekingalpha.com/news-article/2082171-qlogic-gains-market-share-in-both-fibre-channel-and-10gb-ethernet-adapter-markets
>>>
>>>> Server-class 10Gb Ethernet Adapter and LOM revenues have recently
>>>> surpassed $100 million per quarter, and are on track for about
>>>> fifty percent annual growth, according to Crehan Research.
>>>
>>> That's the whole market, and QLogic says they are #1 in the FCoE
>>> adapter segment of this market, and #2 in the overall 10 gig
>>> adapter market (see
>>> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-f2q12-results-earnings-call-transcript)
They are in 10GbE silicon and NICs, and being in IB silicon and HCAs gives them not only a hedge (10GbE while growing rapidly, is not the only high performance network market, and Intel is very good at getting economies of scale going with its silicon ... well ... most of its silicon ... ignoring Itanium here ...). Its quite likely that Intel would need IB for its PetaScale plans. Someone here postualted putting the silicon on the CPU. Not sure if this would happen, but I could see it on an IOH, easily. That would make sense (at least in terms of the Westmere designs ... for the Romley et al. I am not sure where it would make most sense). But Intel sees the HPC market growth, and I think they realize that there are interesting opportunities for them there with tighter high performance networking interconnects (Thunderbolt, USB3, IB, 10GbE native on all these systems). > Qlogic offers that QDR. > Mellanox is a generation newer there with FDR. > > Both in latency as well as in bandwidth a huge difference. Haven't looked much at FDR or EDR latency. Was it a huge delta (more than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us for a while, and switches are still ~150-300ns port to port. At some point I think you start hitting a latency floor, bounded in part by "c", but also by an optimal technology path length that you can't shorten without significant investment and new technology. Not sure how close we are to that point (maybe someone from Qlogic/Mellanox could comment on the headroom we have). Bandwidth wise, you need E5 with PCIe 3 to really take advantage of FDR. So again, its a natural fit, especially if its LOM .... Curiously, I think this suggests that ScaleMP could be in play on the software side ... imagine stringing together bunches of the LOM FDR/QDR motherboards with E5's and lots of ram into huge vSMPs (another thread). Shai may tell me I'm full of it (hope he doesn't), but I think this is a real possibility. The Qlogic purchase likely makes this even more interesting for Intel (or Cisco, others as a defensive acq). We sure do live in interesting times! -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From landman at scalableinformatics.com Fri Jan 27 15:27:24 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Fri, 27 Jan 2012 15:27:24 -0500 Subject: [Beowulf] Signal to noise. In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> Message-ID: <4F2308AC.9010704@scalableinformatics.com> On 01/27/2012 01:57 PM, Nicholas M Glykos wrote: > > Dear List, > > I have been a (mostly) quiet reader of this list for the last ~5 years and > my intention is to continue reading the excellent posts that the members > of this community contribute almost daily. Having said that, the recent > Vincent-centric 'discussions' have ---as I am sure you all know--- > significantly reduced the signal-to-noise ratio. 
> Can we get back to normal, please?

Greetings Nicholas and many others:

I've found that filters help. I have some simple procmail filters set
up in my mail directory that redirect some people's email (and in some
cases responses to them) to a file I ... well ... never read. By doing
so, I find the S/N ratio to be vastly improved. Only one person from
Beowulf is in this (not Vincent ... I am still deeply amused by some of
the emails, though that is fading fast with the personal attacks).

Procmail filters look like this:

:0:
* ^From:.*bad at person.com
$HOME/twit.filter

Then I never read the twit.filter. Just empty it out every now and
then. Maybe once every few years.

Doing this has dramatically improved S/N here and elsewhere. If you
don't have this capability directly, your mail client can probably fake
it. I use this as I have (far too) many mail clients and I don't want
to manage the rules on all of them.

If you are afflicted with Microsoft exchange as your mail server, I am
not sure what you can (easily) do.

Joe

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From glykos at mbg.duth.gr Fri Jan 27 15:58:02 2012
From: glykos at mbg.duth.gr (Nicholas M Glykos)
Date: Fri, 27 Jan 2012 22:58:02 +0200 (EET)
Subject: [Beowulf] Signal to noise.
In-Reply-To: 
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl>
Message-ID: 

Hi Joe,

> I've found that filters help.

You are killing my daily digests.

> If you are afflicted with Microsoft ...

What is 'Microsoft'?
:-)

All the best (and apologies to the list for the email traffic),
Nicholas

-- 
Nicholas M. Glykos, Department of Molecular Biology
and Genetics, Democritus University of Thrace, University Campus,
Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620,
Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From landman at scalableinformatics.com Fri Jan 27 16:07:34 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Fri, 27 Jan 2012 16:07:34 -0500
Subject: [Beowulf] Signal to noise.
In-Reply-To: 
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl>
Message-ID: <4F231216.3020703@scalableinformatics.com>

On 01/27/2012 03:58 PM, Nicholas M Glykos wrote:
>
> Hi Joe,
>
>
>> I've found that filters help.
>
> You are killing my daily digests.

D'oh! ... I seem to remember that you can do some more fancy
filtering ... Someone showed me something a few years ago, that would
break apart digests, filter, and reassemble.
Something like this:
http://easierbuntu.blogspot.com/2011/09/managing-your-email-with-fetchmail.html
(they have some interesting procmail recipes, and you can find ones
that do this if you really want to).

>
>> If you are afflicted with Microsoft ...
>
> What is 'Microsoft'?
> :-)

A small, very gentle company in the North West USA.

> All the best (and apologies to the list for the email traffic),
> Nicholas

:)

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by
MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 16:42:24 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 27 Jan 2012 22:42:24 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F2306D3.4080509@scalableinformatics.com>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com>
Message-ID: <69BBD80B-05C9-4683-99F7-B48A0BDA285D@xs4all.nl>

On Jan 27, 2012, at 9:19 PM, Joe Landman wrote:

> On 01/27/2012 03:06 PM, Vincent Diepeveen wrote:
>>
>> On Jan 27, 2012, at 8:29 PM, Håkon Bugge wrote:
>>
>>> Greg,
>>>
>>>
>>> On 23. jan. 2012, at 20.55, Greg Lindahl wrote:
>>>
>>>> On Mon, Jan 23, 2012 at 11:28:26AM -0800, Greg Lindahl wrote:
>>>>
>>>>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>>>>
>>>> I figured out the main why:
>>>>
>>>> http://seekingalpha.com/news-article/2082171-qlogic-gains-market-share-in-both-fibre-channel-and-10gb-ethernet-adapter-markets
>>>>
>>>>> Server-class 10Gb Ethernet Adapter and LOM revenues have recently
>>>>> surpassed $100 million per quarter, and are on track for about
>>>>> fifty percent annual growth, according to Crehan Research.
>>>>
>>>> That's the whole market, and QLogic says they are #1 in the FCoE
>>>> adapter segment of this market, and #2 in the overall 10 gig
>>>> adapter market (see
>>>> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-f2q12-results-earnings-call-transcript)
>
> I found that statement interesting. I've actually not known anything
> about their 10GbE products. My bad.
>
>>> That can explain why QLogic is selling, but not why Intel is buying.
>>>
>>> 10 years ago, Intel went _out_ of the Infiniband market, see
>>> http://www.networkworld.com/newsletters/servers/2002/01383318.html
>>>
>>> So has the IB business evolved so incredibly well compared to what
>>> Intel expected back in 2002? I do not think so.
>>>
>>> I would guess that we will see message passing/RDMA over
>>> Thunderbolt or similar.
>
> Intel buying makes quite a bit of sense IMO. They are in 10GbE
> silicon and NICs, and being in IB silicon and HCAs gives them a hedge
> (10GbE, while growing rapidly, is not the only high performance
> network market, and Intel is very good at getting economies of scale
> going with its silicon ... well ... most of its silicon ...
> ignoring Itanium here ...). It's quite likely that Intel would need
> IB for its PetaScale

Why buy previous generation IB in such a case?

It's about the ethernet of course...

They produce tens of millions of cpu's each quarter and have also
announced a SoC (socket on chip). Of SoC's the market actually produces
billions a year, so it's a lucrative market, yet a highly competitive
one. Having 10 gigabit ethernet on such a SoC, with the total at a low
price, would give Intel a huge lead there, worth dozens of billions a
year.

It's not clear to me where all their SoC plans go, but I bet right now
they are open to any market needing SoC's. Note that many SoC's are
dirt cheap. Even in very low volume we speak about some tens of
dollars, cpu included and other connectivity included. Price is
everything there, yet I guess Intel will be offering the 'top' SoC's,
with faster cpu's and 10 GigE. Then they produce a bunch of mainboards.

Think also of the upcoming generation of consoles, ipad 3's and similar
products - it's not clear yet which company gets the contracts for the
upcoming consoles; it's all wide open for now. Yet they might also sell
100+ million of those. Intel is an attractive company for console
manufacturers to do business with now. IBM's Cell kind of lost momentum
there and seems to have nothing new to offer that really outperforms.
Also the power usage of Cell was kind of disappointing: the initial
version of the PS3 was 220 watts on average, and at 100% usage it could
go up to 380+ watts. Try to put that on your couch.

Don't confuse this with the later number-crunching Cell version, a much
improved chip, used for some supercomputers. Yet if I remember well,
some reports - was it Aad v/d Steen? - already predicted it would not
be interesting for upcoming supercomputers, as it is some kind of
hybrid chip, which has no long-term future. He was right.

> plans. Someone here postulated putting the silicon on the CPU. Not
> sure if this would happen, but I could see it on an IOH, easily. That
> would make sense (at least in terms of the Westmere designs ... for
> the Romley et al. I am not sure where it would make most sense).
>
> But Intel sees the HPC market growth, and I think they realize that
> there are interesting opportunities for them there with tighter high
> performance networking interconnects (Thunderbolt, USB3, IB, 10GbE
> native on all these systems).
>

Undoubtedly they'll try something in the HPC market. If you have
already put lots of cash into developing a product, it's better to put
it on the market. Based upon their name they'll sell some. And some
years from now they should have something bigtime improved.

Yet realize how complicated it is to tape out a GPU on a new process
technology if you aren't sure you're going to sell 100+ million of
them. Such massive projects have to pay back the factories. A product
without the potential of selling for at least a few dozen billions of
dollars is not even interesting to develop. Just the startup costs for
a GPU on a new process technology are some dozens of millions for each
run, and the more complex the chip and the newer the process
technology, the more expensive it gets.

Realize that IBM produces its Power7 and the upcoming BlueGene/Q cpu in
45 nm technology, while GPU's now release in 28 nm. That gives,
theoretically, an advantage of a tad less than (45 / 28) ^ 2 = 2.58.
So a gpu from Intel needs to be a factor 2.58 better in the same
process technology than today's gpu's of AMD (28 nm already released)
and Nvidia (28 nm coming soon, I'd expect).
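For readers who want the arithmetic behind that factor spelled out, a
short worked form. The first-order assumption - that the transistor
budget of a fixed die area scales with the square of the linear
feature-size ratio - is an assumption added here for clarity, not
something stated in the message:

\[ \left( \frac{45\,\mathrm{nm}}{28\,\mathrm{nm}} \right)^{2} \approx 1.607^{2} \approx 2.58 \]

That is, a full shrink from 45 nm to 28 nm buys roughly 2.58x the
transistors on the same die area, which is where the "factor 2.58"
above comes from.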
This is where Intel's big advantage with cpu's always lies: they are
better at getting newer process technologies to work sooner than the
competition. Ivy Bridge will be 22 nm, according to the rumours I
heard.

>> Qlogic offers QDR there. Mellanox is a generation newer, with FDR.
>>
>> Both in latency and in bandwidth that is a huge difference.
>
> Haven't looked much at FDR or EDR latency. Was it a huge delta (more
> than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us
> for a while, and switches are still ~150-300ns port to port. At some

From a posting here some months ago by Gilad Shainer: it's 0.85 us RDMA
latency for FDR versus 1.3 us or so for the other. More important for
clusters, though, is the bandwidth. I guess that pci-e 3.0 simply
allows much higher speeds, whereas the QDR cards are PCI-E 2.0 stuff.
Isn't pci-e 3.0 about 2x higher bandwidth than pci-e 2.0?

Now I might be happy with that last bit, but I guess that for big FFT's
or for matrices you still need massive bandwidth. Even if n is big in
O(k * n log n) - where k in the case of matrices is a tad bigger than
n, and in number theory is usually around the number of bits, so 3.32
times n or so - you still need k passes of n log n each. That's massive
bandwidth.

> point I think you start hitting a latency floor, bounded in part by
> "c", but also by an optimal technology path length that you can't
> shorten without significant investment and new technology. Not sure
> how close we are to that point (maybe someone from Qlogic/Mellanox
> could comment on the headroom we have).

There is a lot of headroom for better latencies from the software
viewpoint, as cpu's keep getting faster, yet the latency of the
networks of years ago was only marginally worse than what's there now.
On the hardware side I am really no expert.

>
> Bandwidth wise, you need E5 with PCIe 3 to really take advantage of
> FDR. So again, it's a natural fit, especially if it's LOM ....
>

All the socket 2011 boards that are in the shops now are PCI-e 3.0, and
a wave of 2-socket mainboards will release a few days before, or on the
same day, that Intel finally releases the Xeon version of Sandy Bridge.
It seems it hasn't released yet because it's not clocked very high, if
I look at this sample cpu :) It's 2Ghz to be precise (an 8-core Xeon).

> Curiously, I think this suggests that ScaleMP could be in play on the
> software side ... imagine stringing together bunches of the LOM
> FDR/QDR motherboards with E5's and lots of ram into huge vSMPs
> (another thread). Shai may tell me I'm full of it (hope he doesn't),
> but I think this is a real possibility. The Qlogic purchase likely
> makes this even more interesting for Intel (or Cisco, others as a
> defensive acq).
>

A technology that has sold just 300 machines is not an interesting
market for Intel. They have very expensive factories that each cost
many billions of dollars. These need to produce nonstop and sell
products, to pay back the factories and to make a profit. Intel used to
be worth over a 100 billion dollars at NASDAQ.

Wasting your most clever engineers - of which each company always has
too few - on products that can't keep your factories busy is a total
waste of time. So it's your huge base of B-class engineers - let me not
quote some mailing list names - that you then move over to Qlogic for
the HPC side. That's enough to keep it afloat for a while, in
combination with 'intel inside'.
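To put rough numbers on the pci-e 2.0 versus 3.0 question raised above,
a short worked check. These are the standard published signalling
rates, not figures taken from the thread:

\[ \text{PCIe 2.0 x8:}\quad 8 \times 5\,\mathrm{GT/s} \times \tfrac{8}{10} = 32\ \mathrm{Gbit/s} = 4\ \mathrm{GB/s}\ \text{per direction} \]

\[ \text{PCIe 3.0 x8:}\quad 8 \times 8\,\mathrm{GT/s} \times \tfrac{128}{130} \approx 63\ \mathrm{Gbit/s} \approx 7.9\ \mathrm{GB/s}\ \text{per direction} \]

So at equal lane count pci-e 3.0 carries roughly 2x the payload of
pci-e 2.0 (a faster clock plus a much lighter line encoding), and a 4x
FDR link - 56 Gbit/s of signalling, about 54 Gbit/s of data after
64b/66b encoding, i.e. roughly 6.8 GB/s - is about what a pci-e 2.0 x8
slot can no longer feed but a pci-e 3.0 x8 slot can.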
Intel's profit is too huge for them to be busy toying with tiny markets with a handful of customers, the majority of whom forgot to take their medicine when you propose rewriting their software for some new hardware platform you are going to roll out. A habit Intel is not exactly excited about either, of course, as they like to sell new technology each generation. Also, each Larrabee Intel would sell means they sell a bunch of Xeons less, of course.

> We sure do live in interesting times!

Not for everyone, I guess - many lost their jobs, and as I predicted some years ago, a guy with a Nobel prize might be carpet-bombing a huge nation this summer. Intel has 3 huge factories in Israel, last time I checked. That sure can give unpredicted results in the future.

> --
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics Inc.
> email: landman at scalableinformatics.com
> web  : http://scalableinformatics.com
>        http://scalableinformatics.com/sicluster
> phone: +1 734 786 8423 x121
> fax  : +1 866 888 3112
> cell : +1 734 612 4615
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From landman at scalableinformatics.com  Fri Jan 27 16:47:21 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Fri, 27 Jan 2012 16:47:21 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <69BBD80B-05C9-4683-99F7-B48A0BDA285D@xs4all.nl>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> <69BBD80B-05C9-4683-99F7-B48A0BDA285D@xs4all.nl>
Message-ID: <4F231B69.1050404@scalableinformatics.com>

On 01/27/2012 04:42 PM, Vincent Diepeveen wrote:
>
> On Jan 27, 2012, at 9:19 PM, Joe Landman wrote:
>
>> On 01/27/2012 03:06 PM, Vincent Diepeveen wrote:

[... merciful trimming ...]

>>>> I would guess that we will see message passing/RDMA over
>>>> Thunderbolt or similar.
>>
>> Intel buying makes quite a bit of sense IMO. They are in 10GbE silicon
>> and NICs, and being in IB silicon and HCAs gives them not only a hedge
>> (10GbE while growing rapidly, is not the only high performance network
>> market, and Intel is very good at getting economies of scale going with
>> its silicon ... well ... most of its silicon ... ignoring Itanium here
>> ...). Its quite likely that Intel would need IB for its PetaScale

> Why buy previous-generation IB in that case?

IP. It's all about IP. It's always about IP. If ever you think it's not about IP, you should remember "Landman's N+1th rule of M&A: It's the IP man ... just da IP!"

> It's about the ethernet, of course...

... no it's not. Intel has its own ethernet. It's had it for a LONG time, and it did not buy QLogic's ethernet ... It's not about the ethernet. Say it with me ... IT'S NOT ABOUT THE ETHERNET ... There, don't you feel better now? I do ...

> They produce tens of millions of CPUs each quarter and have also
> announced a SoC (socket on chip)

SoC is "System On a Chip". Socket on a chip is ...
er ... cart before the horse?

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From lindahl at pbm.com  Fri Jan 27 17:13:12 2012
From: lindahl at pbm.com (Greg Lindahl)
Date: Fri, 27 Jan 2012 14:13:12 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no>
Message-ID: <20120127221312.GA29961@bx9.net>

On Fri, Jan 27, 2012 at 11:29:35AM -0800, Håkon Bugge wrote:

> That can explain why QLogic is selling, but not why Intel is buying.

That's right. This was probably bought, not sold. If you look at the press release Intel put out, it's all about Exascale computing.

http://newsroom.intel.com/community/intel_newsroom/blog/2012/01/23/intel-takes-key-step-in-accelerating-high-performance-computing-with-infiniband-acquisition

If you want to put an IB HCA in a CPU or a {north,south}bridge, TrueScale née InfiniPath is a much smaller implementation than others, and most of the chip is memory, which Intel knows how to shrink drastically compared to the usual way people implement memory.

Also, keep in mind that Intel's benchmarking group in Moscow has a lot of experience with benchmarking real apps for bids using TrueScale head-to-head against other HCAs, and I wouldn't be surprised if it was the case that TrueScale QDR is faster than that other company's FDR on many real codes, for the usual reason that TrueScale's MPI-oriented InfiniBand extension is more suited for MPI than the standard InfiniBand has-more-features-than-MPI-requires protocols.

Finally, I haven't seen it mentioned whether or not QLogic's IB switch was part of the purchase. If it is, then you should note that it's not hard to make that chip speak ethernet, and Intel could probably dramatically improve it with their superior serdes technology.

-- greg

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From Shainer at Mellanox.com  Fri Jan 27 17:25:58 2012
From: Shainer at Mellanox.com (Gilad Shainer)
Date: Fri, 27 Jan 2012 22:25:58 +0000
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <20120127221312.GA29961@bx9.net>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net>
Message-ID:

> If you want to put an IB HCA in a CPU or a {north,south}bridge, TrueScale née
> InfiniPath is a much smaller implementation than others, and most of the chip
> is memory, which Intel knows how to shrink drastically compared to the usual
> way people implement memory.
So I wonder why multiple OEMs decided to use Mellanox for on-board solutions and no one used the QLogic silicon...

> Also, keep in mind that Intel's benchmarking group in Moscow has a lot of
> experience with benchmarking real apps for bids using TrueScale head-to-head
> against other HCAs, and I wouldn't be surprised if it was the case that TrueScale
> QDR is faster than that other company's FDR on many real codes,

Surprise surprise... this is no more than FUD. If you have real numbers to back it up, please send them. If it was so great, how come more people decided to use the Mellanox solutions? If QLogic was doing so great with their solution, I would guess they would not be selling the IB business...

> Finally, I haven't seen it mentioned whether or not QLogic's IB switch was part
> of the purchase. If it is, then you should note that it's not hard to make that chip
> speak ethernet, and Intel could probably dramatically improve it with their
> superior serdes technology.
>
> -- greg

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From lindahl at pbm.com  Fri Jan 27 17:27:23 2012
From: lindahl at pbm.com (Greg Lindahl)
Date: Fri, 27 Jan 2012 14:27:23 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F2306D3.4080509@scalableinformatics.com>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com>
Message-ID: <20120127222723.GB29961@bx9.net>

On Fri, Jan 27, 2012 at 03:19:31PM -0500, Joe Landman wrote:

> >>> That's the whole market, and QLogic says they are #1 in the FCoE
> >>> adapter segment of this market, and #2 in the overall 10 gig adapter
> >>> market (see
> >>> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-
> >>> f2q12-results-earnings-call-transcript)
>
> I found that statement interesting. I've actually not known anything
> about their 10GbE products. My bad.

I'm not surprised, as this 10ge adapter is aimed at the same part of the market that uses fibre channel, which isn't that common in HPC. It doesn't have the kind of TCP offload features which have been (futilely) marketed in HPC; it's all about running the same fibre channel software most enterprises have run for a long time, but having the network be ethernet.

> Haven't looked much at FDR or EDR latency. Was it a huge delta (more
> than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us
> for a while, and switches are still ~150-300ns port to port.

Are you talking about the latency of 1 core on 1 system talking to 1 core on another system, or the kind of latency that real MPI programs see, running on all of the cores on a system and talking to many other systems? I assure you that the latter is not 0.8 for any IB system.

> At some
> point I think you start hitting a latency floor, bounded in part by "c",

Last time I did the computation, we were 10X that floor. And, of course, each increase in bandwidth usually makes latency worse, absent heroic efforts of implementers to make that headline latency look better.
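For a sense of scale, a sketch of that kind of floor computation; the 30 m one-way path and the 2/3-c propagation speed are assumptions, and the 1.6 us figure is a half-round-trip pingpong number quoted elsewhere in this thread:

    /* Speed-of-light latency floor, back of the envelope. Path length and
     * propagation speed are assumed values, not measurements. */
    #include <stdio.h>

    int main(void) {
        double path_m = 30.0;      /* assumed one-way cable path via switch */
        double v = 2.0e8;          /* ~2/3 c in copper/fiber, m/s */
        double floor_us = path_m / v * 1e6;
        double measured_us = 1.6;  /* half-RTT pingpong cited in thread */
        printf("wire-time floor ~ %.2f us, measured %.1f us -> ~%.0fx floor\n",
               floor_us, measured_us, measured_us / floor_us);
        return 0;
    }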
-- greg

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From tom.elken at qlogic.com  Fri Jan 27 18:08:58 2012
From: tom.elken at qlogic.com (Tom Elken)
Date: Fri, 27 Jan 2012 15:08:58 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <20120127221312.GA29961@bx9.net>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net>
Message-ID: <35AAF1E4A771E142979F27B51793A4888885B23AE5@AVEXMB1.qlogic.org>

> Finally, I haven't seen it mentioned whether or not QLogic's IB switch
> was part of the purchase.

From the QLogic press release: "QLogic Corp. ... today announced a definitive agreement to sell the product lines ... associated with its InfiniBand business to Intel Corporation ..."

So "the product lines" means both the switch and HCA product lines.

Last summer Intel acquired an Ethernet switch business: http://newsroom.intel.com/community/intel_newsroom/blog/2011/07/19/intel-to-acquire-fulcrum-microsystems so it is not unprecedented that they are interested in switching as well as host technologies.

-Tom

> If it is, then you should note that it's not
> hard to make that chip speak ethernet, and Intel could probably
> dramatically improve it with their superior serdes technology.
>
> -- greg
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

This message and any attached documents contain information from QLogic Corporation or its wholly-owned subsidiaries that may be confidential. If you are not the intended recipient, you may not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From hahn at mcmaster.ca  Fri Jan 27 16:07:08 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Fri, 27 Jan 2012 16:07:08 -0500 (EST)
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F2306D3.4080509@scalableinformatics.com>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com>
Message-ID:

>>>> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-
>>>> f2q12-results-earnings-call-transcript)
>
> I found that statement interesting. I've actually not known anything
> about their 10GbE products. My bad.

I was a bit surprised that the entire transcript had only one sideways mention of IB. also interesting that they seem quite heavily into the heavily-offloaded adapter market (which is sort of the opposite of the original infinipath stuff.)
>>> I would guess that we will see message passing/RDMA over
>>> Thunderbolt or similar.

has there been any mention of Thunderbolt in a switched context? afaict it's just a weird "let's do faster USB and throw in video" thing.

> Intel buying makes quite a bit of sense IMO. They are in 10GbE silicon
> and NICs, and being in IB silicon and HCAs gives them not only a hedge
> (10GbE while growing rapidly, is not the only high performance network

weird to have redundant/competing parts in many of the same markets though. afaik, intel 10G has a reasonable rep; they presumably won't be junking their own products.

> ...). Its quite likely that Intel would need IB for its PetaScale
> plans.

I can't quite tell whether Qlogic's IB switches use Mellanox chips or not. afaik, Qlogic has their own adapter chips (and perhaps FC/eth).

> than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us
> for a while, and switches are still ~150-300ns port to port. At some

mellanox qdr systems I've tested are about 1.6 us half-rtt pingpong. I don't think the switch latency is a big deal, since with 36x fanout, you don't need a very tall fat-tree.

> Curiously, I think this suggests that ScaleMP could be in play on the
> software side

really? I'd be interested in hearing from real people who've actually used it (not marketing, thanks). I don't really understand how ScaleMP can do the required coherency in units smaller than a page, which means that "non-embarrassing" programs will surely notice...

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From tom.elken at qlogic.com  Fri Jan 27 18:24:21 2012
From: tom.elken at qlogic.com (Tom Elken)
Date: Fri, 27 Jan 2012 15:24:21 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com>
Message-ID: <35AAF1E4A771E142979F27B51793A4888885B23AF3@AVEXMB1.qlogic.org>

> I can't quite tell whether Qlogic's IB switches use Mellanox chips or not.

With the QDR generation, QLogic developed its own IB switch chip, and uses it in the 12000 line of switches.

-Tom

This message and any attached documents contain information from QLogic Corporation or its wholly-owned subsidiaries that may be confidential. If you are not the intended recipient, you may not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
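On Mark's coherency-granularity point above, here is a toy model of why page-granular sharing hurts; it is purely illustrative and not how vSMP is actually implemented:

    /* Toy model of page-granular coherency: two "nodes" take turns writing
     * adjacent 8-byte counters that happen to share one 4 KB page, so the
     * whole page must migrate before every write. All numbers are made up. */
    #include <stdio.h>

    int main(void) {
        const int PAGE_BYTES = 4096, WRITES = 1000;
        int page_owner = 0;          /* which node currently holds the page */
        long transfers = 0;
        for (int i = 0; i < WRITES; i++) {
            int writer = i % 2;      /* alternating writers, same page */
            if (page_owner != writer) {   /* page migrates before the write */
                transfers++;
                page_owner = writer;
            }
        }
        printf("%d 8-byte writes -> %ld transfers of %d bytes each\n",
               WRITES, transfers, PAGE_BYTES);
        return 0;
    }

With word- or cache-line granularity the same pattern would move a few bytes per write; at page granularity it moves 4 KB every time, which is why "non-embarrassing" programs notice.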
From bill at cse.ucdavis.edu  Fri Jan 27 21:10:02 2012
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Fri, 27 Jan 2012 18:10:02 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net>
Message-ID: <4F2358FA.4030009@cse.ucdavis.edu>

On 01/27/2012 02:25 PM, Gilad Shainer wrote:
> So I wonder why multiple OEMs decided to use Mellanox for on-board
> solutions and no one used the QLogic silicon...

That's a strange argument.

What does Intel want? Something to make them more money. In the past that's been integrating functionality into their CPU or support chipsets: SATA, USB, the memory controller, the PCIe controller, and GigE. The cost in transistors and die area seems very relevant to Intel's interests. Anyone have an estimate on how much latency a direct connect to QPI would save vs PCIe?

What do motherboard manufacturers want? Something to make them more money. So that's mostly marketing/reputation, pricing, and whatever they can do to differentiate themselves. If buying a $150 IB chip lets them charge $400 more, then it's a win, assuming they spend less than $250 of R&D to add it to the motherboard. I doubt the difference in transistors or a few watts would be a big deal either way.

>> Also, keep in mind that Intel's benchmarking group in Moscow has a
>> lot of experience with benchmarking real apps for bids using
>> TrueScale head-to-head
>> against other HCAs, and I wouldn't be surprised if it was the case
>> that TrueScale
>> QDR is faster than that other company's FDR on many real codes,
>
> Surprise surprise... this is no more than FUD. If you have real
> numbers to back it up, please send them. If it was so great, how come more
> people decided to use the Mellanox solutions? If QLogic was doing so
> great with their solution, I would guess they would not be selling the
> IB business...

FUD = Fear, Uncertainty, and Doubt. Doesn't sound like FUD to me. More like a cheap attack on Greg; I think we (the mailing list) can do better.

I've personally compared several generations of Myrinet and Infinipath to allegedly faster Mellanox adapters. Mellanox hasn't won yet, but I've not compared QDR or FDR yet. With that said, that is the reason I run the benchmarks: to find the best solution, and it might well be Mellanox next time. It would be irresponsible for a cluster provider to just pick Mellanox FDR over QLogic QDR because of the spec sheet. Of course, recommending QLogic over Mellanox without quantifying real-world performance would be just as irresponsible.

Maybe we could have fewer attacks, less complaining and hand waving, and more useful information? IMO Greg never came across as a commercial (which the beowulf list isn't an appropriate place for), but does regularly contribute useful info. Arguing market share as proof of performance superiority is just silly.

Speaking of which, you said:

  There is some added latency due to the new 64/66 encoding, but overall
  latency is lower than QDR. MPI is below 1us.

I googled for additional information, looked around the Mellanox website, and couldn't find anything. Is the above number relevant to HPC folks running clusters? Does it involve a switch? If it's not realistic, are there any realistic numbers available?
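For reference, vendor latency headlines of this kind usually come from a two-rank ping-pong; a minimal MPI sketch of that measurement (half the round trip of a small message; real results depend on core placement, switch hops, and how busy the other cores are):

    /* Minimal two-rank ping-pong: run with mpirun -np 2, one rank per node.
     * Reports half the average round-trip time of an 8-byte message. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, iters = 10000;
        char buf[8] = {0};
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, 8, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, 8, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, 8, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, 8, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double half_rtt_us = (MPI_Wtime() - t0) / iters / 2.0 * 1e6;
        if (rank == 0)
            printf("half round-trip latency: %.2f us\n", half_rtt_us);
        MPI_Finalize();
        return 0;
    }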
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From landman at scalableinformatics.com  Fri Jan 27 21:24:10 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Fri, 27 Jan 2012 21:24:10 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <20120127222723.GB29961@bx9.net>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> <20120127222723.GB29961@bx9.net>
Message-ID: <4F235C4A.8040409@scalableinformatics.com>

On 01/27/2012 05:27 PM, Greg Lindahl wrote:
> I'm not surprised, as this 10ge adapter is aimed at the same part of
> the market that uses fibre channel, which isn't that common in HPC. It
> doesn't have the kind of TCP offload features which have been
> (futilely) marketed in HPC; it's all about running the same fibre
> channel software most enterprises have run for a long time, but having
> the network be ethernet.

That makes sense.

>> Haven't looked much at FDR or EDR latency. Was it a huge delta (more
>> than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us
>> for a while, and switches are still ~150-300ns port to port.
>
> Are you talking about the latency of 1 core on 1 system talking to 1
> core on another system, or the kind of latency that real MPI programs see,
> running on all of the cores on a system and talking to many other
> systems? I assure you that the latter is not 0.8 for any IB system.

I am looking at these things from a "best of all possible cases" scenario. So when someone comes at me with new "best of all possible cases" numbers, I can compare. Sadly this seems to be the state of many OEMs/integrators/manufacturers.

In storage, we see small-form-factor SSDs marketed generally with statements like 50k IOPs and 500 MB/s. They neglect to mention several specific issues with these, such as that the numbers assume writing all zeros, or that the 75k IOPs are sequential IOPs you get by taking the 600 MB/s interface and dividing by 8 kB operations on a sequential read. Actually do a real random read and write and you get very ... very different results. Especially with non-zero (real) data.

>> At some
>> point I think you start hitting a latency floor, bounded in part by "c",
>
> Last time I did the computation, we were 10X that floor. And, of
> course, each increase in bandwidth usually makes latency worse, absent
> heroic efforts of implementers to make that headline latency look
> better.

I think that's the point, though: moving that performance "knee" down to lower latency involves (potentially) significant cost, for a modest return ... in terms of real performance benefit to a code.

Thanks for the pointer on the computation. If we are 1000x off the floor, we can probably come up with a way to do better. At 10x, it's probably much harder than we think and not necessarily worth the effort.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl  Fri Jan 27 21:38:14 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Sat, 28 Jan 2012 03:38:14 +0100
Subject: [Beowulf] Setting up new benchmark
In-Reply-To: <4F235C4A.8040409@scalableinformatics.com>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> <20120127222723.GB29961@bx9.net> <4F235C4A.8040409@scalableinformatics.com>
Message-ID: <8C9E1983-6805-4951-8DEB-79FA871940F1@xs4all.nl>

No worries - when, by mid-February, all the components from eBay have arrived and I've set up a small cluster here, I hope to write some MPI benchmarks doing all sorts of latency tests, to which I'll attach a GPL header, and which should measure everything from latency to bandwidth, mostly using RDMA reads, with all cores of every node busy. It will be interesting then to compare it all. Maybe several people here will want to run it.

When I first designed the latency benchmark, Paul Hsieh later managed to make the idea's implementation a bit more efficient: I jumped through memory with a random generator; Paul Hsieh optimized it to just jump randomly. Dieter Buerssner then wrote the single-CPU test to check whether it matched the output I got - which appeared to be the case. Setting up the random pattern took very long, though - then I optimized the setup of the random pattern to O(n log n).

The advantage of all this is that one really sees the impact with all cores busy at the same time, whereas most tests use a totally idle cluster and test one micro-tiny thing.

Vincent

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From lindahl at pbm.com  Sat Jan 28 00:29:36 2012
From: lindahl at pbm.com (Greg Lindahl)
Date: Fri, 27 Jan 2012 21:29:36 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F2358FA.4030009@cse.ucdavis.edu>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu>
Message-ID: <20120128052936.GF20008@bx9.net>

On Fri, Jan 27, 2012 at 06:10:02PM -0800, Bill Broadley wrote:

> Anyone have an estimate on how much latency a direct connect to QPI
> would save vs PCIe?

~ 0.2us. Remember that the first 2 generations of InfiniPath were both SDR: one for HyperTransport and one for PCIe. The difference was 0.3us back then; PathScale + QLogic did some heroic things since to shorten the pipeline stages & up the clock rate.

-- greg

(and if anyone needs a reminder, I no longer have any financial involvement with QLogic or Intel.)
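A single-node sketch of the random pointer-chase Vincent describes a few messages up; this uses a plain Fisher-Yates shuffle for setup rather than his O(n log n) construction, and the array size is an assumption:

    /* Build a random cycle through a big array, then time dependent loads.
     * Each load must finish before the next can start, so this measures
     * latency, not bandwidth. rand() is crude but fine for a sketch. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void) {
        size_t n = 1 << 24;                        /* 16M entries, ~128 MB per array */
        size_t *next = malloc(n * sizeof *next);
        size_t *perm = malloc(n * sizeof *perm);
        if (!next || !perm) return 1;
        for (size_t i = 0; i < n; i++) perm[i] = i;
        for (size_t i = n - 1; i > 0; i--) {       /* random permutation */
            size_t j = rand() % (i + 1);
            size_t t = perm[i]; perm[i] = perm[j]; perm[j] = t;
        }
        for (size_t i = 0; i < n; i++)             /* one big cycle */
            next[perm[i]] = perm[(i + 1) % n];
        struct timespec a, b;
        size_t p = perm[0];
        clock_gettime(CLOCK_MONOTONIC, &a);
        for (size_t i = 0; i < n; i++) p = next[p];  /* dependent loads */
        clock_gettime(CLOCK_MONOTONIC, &b);
        double ns = ((b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec)) / n;
        printf("avg load-to-load latency: %.1f ns (p=%zu)\n", ns, p);
        free(next); free(perm);
        return 0;
    }

Running one copy per core while the NIC is being hammered, as Vincent proposes, is what turns this from a memory test into a loaded-cluster test.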
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From lindahl at pbm.com  Sat Jan 28 00:34:17 2012
From: lindahl at pbm.com (Greg Lindahl)
Date: Fri, 27 Jan 2012 21:34:17 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F235C4A.8040409@scalableinformatics.com>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> <20120127222723.GB29961@bx9.net> <4F235C4A.8040409@scalableinformatics.com>
Message-ID: <20120128053417.GG20008@bx9.net>

On Fri, Jan 27, 2012 at 09:24:10PM -0500, Joe Landman wrote:

> > Are you talking about the latency of 1 core on 1 system talking to 1
> > core on another system, or the kind of latency that real MPI programs see,
> > running on all of the cores on a system and talking to many other
> > systems? I assure you that the latter is not 0.8 for any IB system.
>
> I am looking at these things from a "best of all possible cases"
> scenario. So when someone comes at me with new "best of all possible
> cases" numbers, I can compare. Sadly this seems to be the state of many
> OEMs/integrators/manufacturers.

The point I've been trying to make for the past 8 years is that one of the two chip families you're looking at doesn't degrade as much as the other from the "best of all possible cases" to a real cluster running a real code.

> In storage, we see small-form-factor SSDs marketed generally with
> statements like 50k IOPs and 500 MB/s.

And if you knew that one family of SSDs had a wildly different ratio of peak alleged perf to real application performance, would you ignore that? I suspect not.

-- greg

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From eugen at leitl.org  Sat Jan 28 05:17:32 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Sat, 28 Jan 2012 11:17:32 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F22ED20.7040105@ias.edu>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu>
Message-ID: <20120128101732.GG7343@leitl.org>

On Fri, Jan 27, 2012 at 01:29:52PM -0500, Prentice Bisbal wrote:

> What it says is that we've given up on discussing technology with you,
> because your arguments are completely nonsensical. Since you clearly
> don't understand technology, we're hoping you can at least understand
> the simple concepts of basic etiquette.

Who's the list moderator, by the way?
-- 
Eugen* Leitl leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From eugen at leitl.org  Sat Jan 28 08:32:26 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Sat, 28 Jan 2012 14:32:26 +0100
Subject: [Beowulf] photonic buffer bloat
Message-ID: <20120128133226.GU7343@leitl.org>

Relevant for future clusters; see the PPT presentation linked at the URL below.

----- Forwarded message from Masataka Ohta -----

From: Masataka Ohta
Date: Sat, 28 Jan 2012 21:42:13 +0900
To: nanog at nanog.org
Subject: Re: photonic buffer bloat
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:9.0) Gecko/20111222 Thunderbird/9.0.1

Eugen Leitl wrote:

> In future photonic networks (which will do relativistic cut-through
> directly in a photonic crossbar without converting photons to electrons
> and back) the fiber is not just a transport channel but also a photonic
> buffer

Yes.

> (e.g. at 10 GBit/s Ethernet a short reach fiber already buffers
> a standard 1500 MTU).

Wrong. 10Gbps is too slow for optical buffering. At 1Tbps, you can use 100 times shorter fiber than at 10Gbps to buffer packets. A 1Tbps packet can be constructed by simultaneously encoding 100 wavelengths at 10Gbps.

> Of course photonic gates are expensive, individual delays do add up
> so even with slow light buffers

Don't try to make light slower. Slow light buffers have resonators, which means they have very, very, very narrow bandwidth. Instead, make communication speed faster, which shortens the fiber length of fiber delay-line buffers.

> or optical delay loops taken into consideration
> current TCP/IP header layout has not been optimized for leading edge
> containing most significant switching/routing information, or even
> local-knowledge routing (with no global routes). It's too bad IPv6
> was not radical enough, so today's legacy protocols have to be tunneled
> through the networks of the future.

Considering that, in practice, packet headers must be processed electrically, IPv4 at the photonic backbone is just fine, if most routing table entries are aggregated at /24 or better, which is the current practice. You only have to read a 16M-entry SRAM.

A problem of IPv6 with 128-bit addresses is that route lookup cannot be performed within a constant time of a few nanoseconds, which means packets will have overrun the fiber delay lines.

> I presume this future is some 20-30 years away still.

Not so much. Moore's law requires a much more rapid bandwidth increase.

My slides presented at the IEEE photonics society 2009 summer topical

ftp://chacha.hpcl.titech.ac.jp/IEEE-ST.ppt

might be interesting for you.
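The fiber-as-buffer arithmetic above works out roughly as follows, assuming ~2e8 m/s propagation in glass:

    /* How many metres of fiber one 1500-byte frame occupies in flight,
     * at several line rates. Propagation speed is an assumed value. */
    #include <stdio.h>

    int main(void) {
        double bits = 1500 * 8.0;   /* one MTU */
        double v = 2.0e8;           /* m/s in fiber, assumed */
        double rates[] = { 10e9, 100e9, 1e12 };
        for (int i = 0; i < 3; i++) {
            double s = bits / rates[i];
            printf("%5.0f Gbps: %.2f us on the wire = %.1f m of fiber\n",
                   rates[i] / 1e9, s * 1e6, s * v);
        }
        return 0;
    }

The 10 Gbps and 1 Tbps rows differ by exactly the factor of 100 in fiber length claimed above: ~240 m per frame at 10 Gbps versus ~2.4 m at 1 Tbps.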
Masataka Ohta

----- End forwarded message -----
-- 
Eugen* Leitl leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From Shainer at Mellanox.com  Sat Jan 28 13:21:59 2012
From: Shainer at Mellanox.com (Gilad Shainer)
Date: Sat, 28 Jan 2012 18:21:59 +0000
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F2358FA.4030009@cse.ucdavis.edu>
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu>
Message-ID:

> > So I wonder why multiple OEMs decided to use Mellanox for on-board
> > solutions and no one used the QLogic silicon...
>
> That's a strange argument.

It is not an argument, it is stating a fact. If someone claims that a product provides 10x better performance, best fit, etc., and on the other side it gets very little traction, something does not make sense.

> What does Intel want? Something to make them more money.

Intel explained their move in their PR. They see lots of growth in HPC, definitely in the Exascale, and they see InfiniBand as a key to deliver the right solution. They also mention InfiniBand adoption in other markets, so a good validation for InfiniBand as a leading solution for any server and storage connectivity.

> >> Also, keep in mind that Intel's benchmarking group in Moscow has a
> >> lot of experience with benchmarking real apps for bids using
> >> TrueScale head-to-head
> >> against other HCAs, and I wouldn't be surprised if it was the case
> >> that TrueScale
> >> QDR is faster than that other company's FDR on many real codes,
> >
> > Surprise surprise... this is no more than FUD. If you have real
> > numbers to back it up, please send them. If it was so great, how come more
> > people decided to use the Mellanox solutions? If QLogic was doing so
> > great with their solution, I would guess they would not be selling the
> > IB business...
>
> FUD = Fear, Uncertainty, and Doubt. Doesn't sound like FUD to me.
> More like a cheap attack on Greg; I think we (the mailing list) can do better.

I never saw any genuine testing from PathScale and then QLogic comparing their stuff to Mellanox, and you are more than welcome to try and prove me wrong. The argument in this email thread is no more than a re-cap of QLogic's latest marketing campaign and yes, it is no more than FUD. Cheap attacks are not my game, so please....

> I've personally compared several generations of Myrinet and Infinipath to
> allegedly faster Mellanox adapters. Mellanox hasn't won yet, but I've not
> compared QDR or FDR yet. With that said, that is the reason I run the
> benchmarks: to find the best solution, and it might well be Mellanox next
> time. It would be irresponsible for a cluster provider to just pick Mellanox FDR
> over QLogic QDR because of the spec sheet.
> Of course, recommending QLogic over Mellanox without quantifying real-world
> performance would be just as irresponsible.

Going into a bit more of a technical discussion...
QLogic's way of networking is to do everything in the CPU, and Mellanox's way is to implement it all in the hardware (we all know that). The second option is a superset, therefore the worst case can be even performance. I encourage you to contact me directly for any application benchmarking you do, and I will be happy to provide you feedback on what you need in order to get the best out of the Mellanox products. That can be QDR vs QDR as well, no need to go to FDR - I am open to the competition any time...

> Maybe we could have fewer attacks, less complaining and hand waving, and
> more useful information? IMO Greg never came across as a commercial
> (which the beowulf list isn't an appropriate place for), but does regularly contribute
> useful info. Arguing market share as proof of performance superiority is just
> silly.

I am not sure about that... a quick search in past emails can show amazing things... I believe most of us are in agreement here. Less FUD, more facts.

> Speaking of which, you said:
> There is some added latency due to the new 64/66 encoding, but overall
> latency is lower than QDR. MPI is below 1us.
>
> I googled for additional information, looked around the Mellanox website, and
> couldn't find anything. Is the above number relevant to
> HPC folks running clusters? Does it involve a switch? If not

It is with a switch

-Gilad

> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From eugen at leitl.org  Sat Jan 28 13:41:56 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Sat, 28 Jan 2012 19:41:56 +0100
Subject: [Beowulf] What It'll Take to Go Exascale
Message-ID: <20120128184156.GB7343@leitl.org>

http://www.sciencemag.org/content/335/6067/394.full

Science 27 January 2012: Vol. 335 no. 6067 pp. 394-396
DOI: 10.1126/science.335.6067.394

Computer Science

What It'll Take to Go Exascale

Robert F. Service

Scientists hope the next generation of supercomputers will carry out a million trillion operations per second. But first they must change the way the machines are built and run.

On fire. More powerful supercomputers now in the design stage should make modeling turbulent gas flames more accurate and revolutionize engine designs. CREDIT: J. CHEN/CENTER FOR EXASCALE SIMULATION OF COMBUSTION IN TURBULENCE, SANDIA NATIONAL LABORATORIES

Using real climate data, scientists at Lawrence Berkeley National Laboratory (LBNL) in California recently ran a simulation on one of the world's most powerful supercomputers that replicated the number of tropical storms and hurricanes that had occurred over the past 30 years. Its accuracy was a landmark for computer modeling of global climate. But Michael Wehner and his LBNL colleagues have their eyes on a much bigger prize: understanding whether an increase in cloud cover from rising temperatures would retard climate change by reflecting more light back into space, or accelerate it by trapping additional heat close to Earth. To succeed, Wehner must be able to model individual cloud systems on a global scale.
To do that, he will need supercomputers more powerful than any yet designed. These so-called exascale computers would be capable of carrying out 10^18 floating point operations per second, or an exaflop. That's nearly 100 times more powerful than today's biggest supercomputer, Japan's "K Computer," which achieves 11.3 petaflops (10^15 flops) (see graph), and 1000 times faster than the Hopper supercomputer used by Wehner and his colleagues.

The United States now appears poised to reach for the exascale, as do China, Japan, Russia, India, and the European Union. It won't be easy. Advances in supercomputers have come at a steady pace over the past 20 years, enabled by the continual improvement in computer chip manufacturing. But this evolutionary approach won't cut it in getting to the exascale. Instead, computer scientists must first figure out ways to make future machines far more energy efficient and tolerant of errors, and find novel ways to program them. "The step we are about to take to exascale computing will be very, very difficult," says Robert Rosner, a physicist at the University of Chicago in Illinois, who chaired a recent Department of Energy (DOE) committee charged with exploring whether exascale computers would be achievable. Charles Shank, a former director of LBNL who recently headed a separate panel collecting widespread views on what it would take to build an exascale machine, agrees. "Nobody said it would be impossible," Shank says. "But there are significant unknowns."

Gaining support

The next generation of powerful supercomputers will be used to design high-efficiency engines tailored to burn biofuels, reveal the causes of supernova explosions, track the atomic workings of catalysts in real time, and study how persistent radiation damage might affect the metal casing surrounding nuclear weapons. "It's a technology that has become critically important for many scientific disciplines," says Horst Simon, LBNL's deputy director.

That versatility has made supercomputing an easy sell to politicians. The massive 2012 spending bill approved last month by Congress contained $1.06 billion for DOE's program in advanced computing, which includes a down payment to bring online the world's first exascale computer. Congress didn't specify exactly how much money should be spent on the exascale initiative, for which DOE had requested $126 million. But it asked for a detailed plan, due next month, with multiyear budget breakdowns listing who is expected to do what, when. Those familiar with the ways of Washington say that the request reflects an unusual bipartisan consensus on the importance of the initiative. "In today's political atmosphere, this is very unusual," says Jack Dongarra, a computer scientist at the University of Tennessee, Knoxville, who closely follows national and international high-performance computing trends. "It shows how critical it really is and the threat perceived of the U.S. losing its dominance in the field."

The threat is real: Japan and China have built and operate the three most powerful supercomputers in the world. The rest of the world also hopes that their efforts will make them less dependent on U.S. technology. Of today's top 500 supercomputers, the vast majority were built using processors from Intel, Advanced Micro Devices (AMD), and NVIDIA, all U.S.-based companies. But that's beginning to change, at least at the top. Japan's K machine is built using specially designed processors from Fujitsu, a Japanese company.
China, which had no supercomputers in the Top500 List in 2000, now has five petascale machines and is building another with processors made by a Chinese company. And an E.U. research effort plans to use ARM processing chips made by a U.K. company.

Getting over the bumps

Although bigger and faster, supercomputers aren't fundamentally different from our desktops and laptops, all of which rely on the same sorts of specialized components. Computer processors serve as the brains that carry out logical functions, such as adding two numbers together or sending a bit of data to a location where it is needed. Memory chips, by contrast, hold data for safekeeping for later use. A network of wires connects processors and memory and allows data to flow where and when they are needed.

For decades, the primary way of improving computers was creating chips with ever smaller and faster circuitry. This increased the processor's frequency, allowing it to churn through tasks at a faster clip. Through the 1990s, chipmakers steadily boosted the frequency of chips. But the improvements came at a price: The power demanded by a processor is proportional to its frequency cubed. So doubling a processor's frequency requires an eightfold increase in power.

New king. Japan has the fastest machine (bar), although the United States still has the most petascale computers (number in parentheses). CREDIT: ADAPTED FROM JACK DONGARRA/TOP 500 LIST/UNIVERSITY OF TENNESSEE

On the rise. The gap in available supercomputing capacity between the United States and the rest of the world has narrowed, with China gaining the most ground. CREDIT: ADAPTED FROM JACK DONGARRA/TOP 500 LIST/UNIVERSITY OF TENNESSEE

With the rise of mobile computing, chipmakers couldn't raise power demands beyond what batteries could store. So about 10 years ago, chip manufacturers began placing multiple processing "cores" side by side on single chips. This arrangement meant that only twice the power was needed to double a chip's performance. This trend swept through the world of supercomputers. Those with single souped-up processors gave way to today's "parallel" machines that couple vast numbers of off-the-shelf commercial processors together. This move to parallel computing "was a huge, disruptive change," says Robert Lucas, an electrical engineer at the University of Southern California's Information Sciences Institute in Los Angeles. Hardware makers and software designers had to learn how to split problems apart, send individual pieces to different processors, synchronize the results, and synthesize the final ensemble.

Today's top machine - Japan's "K Computer" - has 705,000 cores. If the trend continues, an exascale computer would have between 100 million and 1 billion processors. But simply scaling up today's models won't work. "Business as usual will not get us to the exascale," Simon says. "These computers are becoming so complicated that a number of issues have come up that were not there before," Rosner agrees.

The biggest issue relates to a supercomputer's overall power use. The largest supercomputers today use about 10 megawatts (MW) of power, enough to power 10,000 homes. If the current trend of power use continues, an exascale supercomputer would require 200 MW. "It would take a nuclear power reactor to run it," Shank says. Even if that much power were available, the cost would be prohibitive. At $1 million per megawatt per year, the electricity to run an exascale machine would cost $200 million annually. "That's a non-starter," Shank says.
So the current target is a machine that draws 20 MW at most. Even that goal will require a 300-fold improvement in flops per watt over today's technology. Ideas for getting to these low-power chips are already circulating. One would make use of different types of specialized cores. Today's top-of-the-line supercomputers already combine conventional processor chips, known as CPUs, with an alternative version called graphical processing units (GPUs), which are very fast at certain types of calculations. Chip manufacturers are now looking at going from "multicore" chips with four or eight cores to "many-core" chips, each containing potentially hundreds of CPU and GPU cores, allowing them to assign different calculations to specialized processors. That change is expected to make the overall chips more energy efficient. Intel, AMD, and other chip manufacturers have already announced plans to make hybrid many-core chips.

Another stumbling block is memory. As the number of processors in a supercomputer skyrockets, so, too, does the need to add memory to feed bits of data to the processors. Yet, over the next few years, memory manufacturers are not projected to increase the storage density of their chips fast enough to keep up with the performance gains of processors. Supercomputer makers can get around this by adding additional memory modules. But that's threatening to drive costs too high, Simon says.

Even if researchers could afford to add more memory modules, that still won't solve matters. Moving ever-growing streams of data back and forth to processors is already creating a backup for processors that can dramatically slow a computer's performance. Today's supercomputers use 70% of their power to move bits of data around from one place to another. One potential solution would stack memory chips on top of one another and run communication and power lines vertically through the stack. This more-compact architecture would require fewer steps to route data. Another approach would stack memory chips atop processors to minimize the distance bits need to travel.

A third issue is errors. Modern processors compute with stunning accuracy, but they aren't perfect. The average processor will produce one error per year, as a thermal fluctuation or a random electrical spike flips a bit of data from one value to another. Such errors are relatively easy to ferret out when the number of processors is low. But it gets much harder when 100 million to 1 billion processors are involved. And increasing complexity produces additional software errors as well. One possible solution is to have the supercomputer crunch different problems multiple times and "vote" for the most common solution. But that creates a new problem. "How can I do this without wasting double or triple the resources?" Lucas asks. "Solving this problem will probably require new circuit designs and algorithms."

Finally, there is the challenge of redesigning the software applications themselves, such as a novel climate model or a simulation of a chemical reaction. "Even if we can produce a machine with 1 billion processors, it's not clear that we can write software to use it efficiently," Lucas says. Current parallel computing machines use a strategy, known as message passing interface, that divides computational problems and parses out the pieces to individual processors, then collects the results. But coordinating all this traffic for millions of processors is becoming a programming nightmare.
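In code, the divide-and-collect strategy the article describes is the bread and butter of MPI programs; a toy sketch, with sizes and data chosen purely for illustration:

    /* Divide a vector across ranks, compute local sums, collect the total. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        const int chunk = 1000;
        double *data = NULL, local[1000], partial = 0, total = 0;
        if (rank == 0) {                       /* root owns the full problem */
            data = malloc(chunk * size * sizeof *data);
            for (int i = 0; i < chunk * size; i++) data[i] = 1.0;
        }
        MPI_Scatter(data, chunk, MPI_DOUBLE, local, chunk, MPI_DOUBLE,
                    0, MPI_COMM_WORLD);        /* divide and distribute */
        for (int i = 0; i < chunk; i++) partial += local[i];
        MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0,
                   MPI_COMM_WORLD);            /* collect the results */
        if (rank == 0) { printf("sum = %g\n", total); free(data); }
        MPI_Finalize();
        return 0;
    }

At a handful of ranks this is trivial; the article's point is that coordinating the same pattern across hundreds of millions of processors is anything but.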
"There's a huge concern that the programming paradigm will have to change," Rosner says.

DOE has already begun laying the groundwork to tackle these and other challenges. Last year it began funding three "co-design" centers, multi-institution cooperatives led by researchers at Los Alamos, Argonne, and Sandia national laboratories. The centers bring together scientific users who write the software code and hardware makers to design complex software and computer architectures that work in the fastest and most energy-efficient manner. It poses a potential clash between scientists who favor openness and hardware companies that normally keep their activities secret for proprietary reasons. "But it's a worthy goal," agrees Wilfred Pinfold, Intel's director of extreme-scale programming in Hillsboro, Oregon.

Not so fast. Researchers have some ideas on how to overcome barriers to building exascale machines.

Coming up with the cash

Solving these challenges will take money, and lots of it. Two years ago, Simon says, DOE officials estimated that creating an exascale computer would cost $3 billion to $4 billion over 10 years. That amount would pay for one exascale computer for classified defense work, one for nonclassified work, and two 100-petaflops machines to work out some of the technology along the way.

Those projections assumed that Congress would deliver a promised 10-year doubling of the budget of DOE's Office of Science. But those assumptions are "out of the window," Simon says, replaced by the more likely scenario of budget cuts as Congress tries to reduce overall federal spending. Given that bleak fiscal picture, DOE officials must decide how aggressively they want to pursue an exascale computer. "What's the right balance of being aggressive to maintain a leadership position and having the plan sent back to the drawing board by [the Office of Management and Budget]?" Simon asks. "I'm curious to see." DOE's strategic plan, due out next month, should provide some answers.

The rest of the world faces a similar juggling act. China, Japan, the European Union, Russia, and India all have given indications that they hope to build an exascale computer within the next decade. Although none has released detailed plans, each will need to find the necessary resources despite these tight fiscal times.

The victor will reap more than scientific glory. Companies use 57% of the computing time on the machines on the Top500 List, looking to speed product design and gain other competitive advantages, Dongarra says. So government officials see exascale computing as giving their industries a leg up. That's particularly true for chip companies that plan to use exascale designs to improve future commodity electronics. "It will have dividends all the way down to the laptop," says Peter Beckman, who directs the Exascale Technology and Computing Initiative at Argonne National Laboratory in Illinois.

The race to provide the hardware needed for exascale computing "will be extremely competitive," Beckman predicts, and developing software and networking technology will be equally important, according to Dongarra. Even so, many observers think that the U.S. track record and the current alignment of its political and scientific forces makes it America's race to lose. Whatever happens, U.S. scientists are unlikely to be blindsided.
The task of building the world's first exascale computer is so complex, Simon says, that it will be nearly impossible for a potential winner to hide in the shadows and come out of nowhere to claim the prize. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Sat Jan 28 14:26:48 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat, 28 Jan 2012 14:26:48 -0500 (EST) Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <20120128101732.GG7343@leitl.org> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> Message-ID: >> the simple concepts of basic etiquette. > > Who's the list moderator, by the way? no, please - if there were a moderator who had to plow through all messages, no matter how long, meandering and low-worth, it would become a very unpleasant chore... the list doesn't get a lot of passing weirdos - pretty stable set of characters, fairly predictable in how much you want to read their messages, and how much good you expect to gain from them ;) regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Sat Jan 28 16:28:09 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat, 28 Jan 2012 16:28:09 -0500 (EST) Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu> Message-ID: >>> So I wonder why multiple OEMs decided to use Mellanox for on-board >>> solutions and no one used the QLogic silicon... >> >> That's a strange argument. > > It is not an argument, it is stating a fact. you are mistaken. you ask a pointed question - do not construe it as a statement of fact. if you wanted to state a fact, you might say: "multiple OEMs decided to use Mellanox and none have used Qlogic". by stating this, you are implying that Mellanox is superior in some way, though another perfectly adequate explanation could be that Qlogic didn't offer their chips to OEMs, or did so at a higher price. (in fact, the latter would suggest the possibility that Qlogic chips are actually worth more.) note my use of subjunctive here. in reality, Mellanox is the easy choice - widely known and used, the default. OEMs are fond of making easy choices: more comfortable to a lazy customer, possibly lower customer support costs, etc. this says nothing about whether an easy choice is a superior solution to the customer (that is, in performance, price, etc). 
> If someone claims that a product provides 10x better performance, best fit
> etc., and from the other side it has very little attraction, something does
> not make sense.

I saw no 10x performance claim here. there was some casual mention of a situation where Qlogic QDR performs similar to Mellanox FDR.

> good validation for InfiniBand as a leading solution for any server and
> storage connectivity.

besides Lustre, where do you see IB used for storage?

> Going into a bit more of a technical discussion... QLogic way of networking
> is doing everything in the CPU, and Mellanox way is to implement it all in
> the hardware (we all know that).

this is a dishonest statement: you know that QLogic isn't actually trying to do *everything* in the CPU.

> The second option is a superset, therefore
> worst case can be even performance.

this is also dishonest: making the adapter more intelligent clearly introduces some tradeoffs, so it's _not_ a superset. unless you are claiming that within every Mellanox adapter is _literally_ the same functionality, at the same performance, as is in a Qlogic adapter.

>> Maybe we could have a few less attacks, complaining and hand waving and
>> more useful information? IMO Greg never came across as a commercial
>> (which beowulf list isn't an appropriate place for), but does regularly contribute
>> useful info. Arguing market share as proof of performance superiority is just
>> silly.

> I am not sure about that... quick search in past emails can show amazing things...
> I believe most of us are in agreement here. Less FUD, more facts.

"facts" in this context (as opposed to FUD, arm-waving, etc) must be dispassionate and quantifiable. not hyperbole and suggestive rhetoric.

out of curiosity, has anyone set up a head-to-head comparison (two or more identical machines, both with a Qlogic and a Mellanox card of the same vintage)?

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Sat Jan 28 19:12:59 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Sun, 29 Jan 2012 01:12:59 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: 
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu>
Message-ID: 

On Jan 28, 2012, at 10:28 PM, Mark Hahn wrote:
[snip]
> out of curiosity, has anyone set up a head-to-head comparison
> (two or more identical machines, both with a Qlogic and a Mellanox
> card of
> the same vintage)?
>
> regards, mark hahn.

Mark, i stumbled upon the same problem a few months ago: when i googled for 4x infiniband you can find something, but moving up to QDR it becomes more sporadic. Not to mention that the interesting test is where the cards are bad - latency. If you find anything, usually it's manufacturer-side statements without a clear test setup, and usually doing 0 byte tests. This is exactly why i intend to write a benchmark. What i personally believe is not important - whether FDR on pci-e 3.0 really delivers its considerably higher claimed bandwidth over pci-e 2.0 QDR. What i do believe is that one must measure objectively.
That's why i'm posting for a while now that as soon as the cluster works here i'm gonna write a benchmark to measure latencies, moving up the read length slowly so that it more and more becomes a bandwidth game, and simply present the graph for the interested readers.

We're not interested in theoretic tests where just 1 core is busy measuring the latency to a single core at the other side. A test really requires all cores busy and hammering on the network card.

In the end everything is a measure of bandwidth of course, but even then, objective tests of QDR, no matter *what manufacturer*, are in short supply; of the few that exist, some either tested 1 tiny thing or a theoretic thing, or just lacked all realism when i read the rest of the article.

All in all, after some days of googling, I found 1 tester who toyed with something using the same switch (good idea), but the graphs presenting the results are tough to interpret, and he was basically interested in something else than what's fast now for the network cards.

Running the same oldie tests, whereas all manufacturers have way faster alternatives now, such as RDMA reads, is just not interesting.

To be continued in some months...

> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From Shainer at Mellanox.com Sun Jan 29 00:03:31 2012
From: Shainer at Mellanox.com (Gilad Shainer)
Date: Sun, 29 Jan 2012 05:03:31 +0000
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: 
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu>
Message-ID: 

> >>> So I wonder why multiple OEMs decided to use Mellanox for on-board
> >>> solutions and no one used the QLogic silicon...
> >>
> >> That's a strange argument.
> >
> > It is not an argument, it is stating a fact.
>
> you are mistaken. you ask a pointed question - do not construe it as a
> statement of fact. if you wanted to state a fact, you might say:
> "multiple OEMs decided to use Mellanox and none have used Qlogic".

You probably meant to say "I think differently" and not "you are mistaken".... Making this mailing list a little more polite will benefit us all.

> by stating this, you are implying that Mellanox is superior in some way, though
> another perfectly adequate explanation could be that Qlogic didn't offer their
> chips to OEMs, or did so at a higher price. (in fact, the latter would suggest the
> possibility that Qlogic chips are actually worth more.) note my use of
> subjunctive here.
>
> in reality, Mellanox is the easy choice - widely known and used, the default.
> OEMs are fond of making easy choices: more comfortable to a lazy customer,
> possibly lower customer support costs, etc.
>
> this says nothing about whether an easy choice is a superior solution to the
> customer (that is, in performance, price, etc).
OEMs don't place devices on the motherboard just because they can, nor because it is cheaper. They do so because they believe it will benefit their users, hence they will sell more. I can assure you that silicon was offered from both companies, and it wasn't an issue of price. From this point you can make any conclusion that you wish to.

> >good validation for InfiniBand as a leading solution for any server and
> >storage connectivity.
>
> besides Lustre, where do you see IB used for storage?

Protocols: iSER (iSCSI), NFSoRDMA, SRP, GPFS, SMB and others.
OEMs: DDN, Xyratex, Netapp, EMC, Oracle, SGI, HP, IBM and others.

> > Going into a bit more of a technical discussion... QLogic way of networking
> >is doing everything in the CPU, and Mellanox way is to implement it all in
> >the hardware (we all know that).
>
> this is a dishonest statement: you know that QLogic isn't actually trying
> to do *everything* in the CPU.

You are right, you do need a HW translation from PCIe to IB. But I am sure you know where the majority of the transport, error handling etc is being done....

> > The second option is a superset, therefore
> >worst case can be even performance.
>
> this is also dishonest: making the adapter more intelligent clearly
> introduces some tradeoffs, so it's _not_ a superset. unless you are
> claiming that within every Mellanox adapter is _literally_ the same
> functionality, at the same performance, as is in a Qlogic adapter.

It is not dishonest. In general offloading is a superset. You can choose to implement just offloading or to leave room for CPU control as well. There will always be parts that are better to be in HW, and if you have flexibility for the rest it is a superset.

> >> Maybe we could have a few less attacks, complaining and hand waving and
> >> more useful information? IMO Greg never came across as a commercial
> >> (which beowulf list isn't an appropriate place for), but does regularly
> contribute
> >> useful info. Arguing market share as proof of performance superiority is
> just
> >> silly.
> >
> > I am not sure about that... quick search in past emails can show amazing
> things...
> > I believe most of us are in agreement here. Less FUD, more facts.
>
> "facts" in this context (as opposed to FUD, arm-waving, etc) must be
> dispassionate and quantifiable. not hyperbole and suggestive rhetoric.

Maybe we read different emails.

> out of curiosity, has anyone set up a head-to-head comparison
> (two or more identical machines, both with a Qlogic and a Mellanox card of
> the same vintage)?
>
> regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From hahn at mcmaster.ca Mon Jan 30 10:04:53 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Mon, 30 Jan 2012 10:04:53 -0500 (EST)
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: 
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu>
Message-ID: 

>> out of curiosity, has anyone set up a head-to-head comparison
>> (two or more identical machines, both with a Qlogic and a Mellanox card of
>> the same vintage)?
>>
>> There was a bit of discussion of InfiniBand benchmarking in this thread
> and it seems it would be helpful to the casual readers like myself to have
> a few references to benchmarking toolkits and actual results.
>
> Most often reported results are gathered with either Netpipe from Ames or
> Intel MPI Benchmark (formerly known as Pallas Benchmark) or OSU
> Micro-benchmarks.
>
> Searching the web produced a recent report from Swiss CSCS where a Mellanox
> ConnectX3 QDR HCA with a Mellanox switch is set against a Qlogic 7300 QDR
> HCA connected to a Qlogic switch.
> http://www.cscs.ch/fileadmin/user_upload/customers/cscs/Tech_Reports/Performance_Analysis_IB-QDR_final-2.pdf

as far as I can tell, this paper mainly says "a coalescing stack delivers benchmark results showing a lot higher bandwidth and message rate than a non-coalescing stack." the comment on figure 8:

    To some extent, the environment variables mentioned before
    contribute to this outstanding result

which is remarkably droll. I'm not sure how well coalescing works for real applications.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From prentice at ias.edu Mon Jan 30 11:20:46 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Mon, 30 Jan 2012 11:20:46 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <20120128101732.GG7343@leitl.org>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org>
Message-ID: <4F26C35E.7060702@ias.edu>

On 01/28/2012 05:17 AM, Eugen Leitl wrote:
> On Fri, Jan 27, 2012 at 01:29:52PM -0500, Prentice Bisbal wrote:
>
>> What it says is that we've given up on discussing technology with you,
>> because your arguments are completely nonsensical. Since you clearly
>> don't understand technology, we're hoping you can at least understand
>> the simple concepts of basic etiquette.
> Who's the list moderator, by the way?
>

I don't think there is one, hence all the noise. The mailing list and beowulf.org is maintained by Penguin Computing/Scyld Software. Maybe they'd be interested in appointing a moderator or 3.

---
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
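A footnote to the benchmarking sub-thread above: for anyone who wants to try the head-to-head comparison Mark asks about, or the message-size sweep Vincent describes, the C sketch below is about the smallest thing that produces a latency/bandwidth curve. It is only a sketch, not a replacement for the OSU Micro-benchmarks or the Intel MPI Benchmarks already mentioned; the hostnames in the run line are placeholders, and it assumes a working MPI stack over both adapters under test.

/*
 * pingpong.c - minimal MPI ping-pong message-size sweep (a sketch).
 * Build: mpicc -O2 pingpong.c -o pingpong
 * Run:   mpirun -np 2 -host nodeA,nodeB ./pingpong   (hostnames are placeholders)
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_BYTES (1 << 22)   /* sweep 1 byte .. 4 MiB */
#define REPS      1000

int main(int argc, char **argv)
{
    int rank, bytes, i;
    char *buf;
    double t0, dt;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = malloc(MAX_BYTES);

    for (bytes = 1; bytes <= MAX_BYTES; bytes *= 2) {
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < REPS; i++) {
            if (rank == 0) {          /* ping */
                MPI_Send(buf, bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {   /* pong */
                MPI_Recv(buf, bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
            }
        }
        dt = MPI_Wtime() - t0;
        if (rank == 0)                /* half a round trip is one-way time */
            printf("%8d bytes %10.2f us one-way %10.2f MB/s\n",
                   bytes, dt * 1e6 / (2.0 * REPS),
                   2.0 * REPS * bytes / dt / 1e6);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}

Small messages show pure latency and large ones approach the wire bandwidth, so the sweep makes the latency-dominated regime turn into a bandwidth-dominated one, which is exactly the curve Vincent wants to plot. Running several such pairs per node at once (more ranks, paired off) would approximate the all-cores-hammering load he argues for, at the cost of more bookkeeping.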
From Shainer at Mellanox.com Mon Jan 30 14:22:24 2012
From: Shainer at Mellanox.com (Gilad Shainer)
Date: Mon, 30 Jan 2012 19:22:24 +0000
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: 
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu>
Message-ID: 

> >> out of curiosity, has anyone set up a head-to-head comparison (two or
> >> more identical machines, both with a Qlogic and a Mellanox card of
> >> the same vintage)?
> >>
> >> There was a bit of discussion of InfiniBand benchmarking in this
> >> thread
> > and it seems it would be helpful to the casual readers like myself to
> > have a few references to benchmarking toolkits and actual results.
> >
> > Most often reported results are gathered with either Netpipe from Ames
> > or Intel MPI Benchmark (formerly known as Pallas Benchmark) or OSU
> > Micro-benchmarks.
> >
> > Searching the web produced a recent report from Swiss CSCS where a
> > Mellanox
> > ConnectX3 QDR HCA with a Mellanox switch is set against a Qlogic 7300
> > QDR HCA connected to a Qlogic switch.
> > http://www.cscs.ch/fileadmin/user_upload/customers/cscs/Tech_Reports/P
> > erformance_Analysis_IB-QDR_final-2.pdf
>
> as far as I can tell, this paper mainly says "a coalescing stack delivers
> benchmark results showing a lot higher bandwidth and message rate than a
> non-coalescing stack." the comment on figure 8:
>
>     To some extent, the environment variables mentioned before
>     contribute to this outstanding result
>
> which is remarkably droll. I'm not sure how well coalescing works for real
> applications.

First, I looked at the paper and it includes latency and bandwidth comparisons as well, not only message rate. It is important for others to know that, and not to dismiss it. Second, both companies have options for message coalescing. You can choose to use it or not - I saw apps that got a benefit from it, and saw applications that do not. Without coalescing Mellanox provides around 30M messages per second.

-Gilad.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From peter.st.john at gmail.com Mon Jan 30 18:07:11 2012
From: peter.st.john at gmail.com (Peter St. John)
Date: Mon, 30 Jan 2012 18:07:11 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F26C35E.7060702@ias.edu>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu>
Message-ID: 

Instead of appointing a moderator, we could grow one with recursive Page Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we knew about this type of thing a while ago because of "citation analysis", see the link).

Someone writes an open script and members of the list mail it with the answers to these three questions:
1. do you volunteer to moderate?
2. Who should moderate? (give email addresses)
3.
Who should judge who should moderate? (give email addresses).

Then you iterate over scoring people by "wisdom" and who gets the most "wise" votes, until the scores converge.

The biggest hurdle would probably be getting volunteers, though.
Peter

On Mon, Jan 30, 2012 at 11:20 AM, Prentice Bisbal wrote:

> On 01/28/2012 05:17 AM, Eugen Leitl wrote:
> > On Fri, Jan 27, 2012 at 01:29:52PM -0500, Prentice Bisbal wrote:
> >
> >> What it says is that we've given up on discussing technology with you,
> >> because your arguments are completely nonsensical. Since you clearly
> >> don't understand technology, we're hoping you can at least understand
> >> the simple concepts of basic etiquette.
> > Who's the list moderator, by the way?
> >
>
> I don't think there is one, hence all the noise. The mailing list and
> beowulf.org is maintained by Penguin Computing/Scyld Software. Maybe
> they'd be interested in appointing a moderator or 3.
>
> ---
> Prentice
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From landman at scalableinformatics.com Mon Jan 30 18:09:48 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 30 Jan 2012 18:09:48 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: 
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu>
Message-ID: <4F27233C.8080508@scalableinformatics.com>

On 01/30/2012 06:07 PM, Peter St. John wrote:
> Instead of appointing a moderator, we could grow one with recursive Page
> Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we knew
> about this type of thing a while ago because of "citation analysis", see
> the link).

Please ... no moderator. Lists get boring while waiting for content filtering organisms to fulfill their voluntary tasks ...

If you don't like someone's writing, filter them.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
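An aside for the curious: the recursive ranking Peter sketches above is, in essence, power iteration on a vote matrix, the same computation as PageRank. Below is a toy C illustration with a made-up four-member vote matrix; none of the numbers come from the list, and the loop's stopping test is the "until the scores converge" step Peter mentions.

/*
 * wisdom.c - toy power iteration over a vote matrix (PageRank-style).
 * Build: cc -O2 wisdom.c -o wisdom -lm
 */
#include <stdio.h>
#include <math.h>

#define N    4
#define DAMP 0.85   /* standard PageRank damping factor */

int main(void)
{
    /* votes[i][j] = 1 if member j named member i as a good moderator.
     * This matrix is invented purely for illustration. */
    double votes[N][N] = {
        {0, 1, 1, 0},
        {1, 0, 1, 1},
        {0, 1, 0, 1},
        {1, 0, 0, 0}
    };
    double score[N] = {0.25, 0.25, 0.25, 0.25};
    double next[N];
    int i, j, k, iter;

    for (iter = 0; iter < 100; iter++) {
        double diff = 0.0;
        for (i = 0; i < N; i++) {
            double s = 0.0;
            for (j = 0; j < N; j++) {
                double out = 0.0;          /* how many votes j cast */
                for (k = 0; k < N; k++)
                    out += votes[k][j];
                if (votes[i][j] > 0.0 && out > 0.0)
                    s += score[j] / out;   /* j's endorsement, diluted */
            }
            next[i] = (1.0 - DAMP) / N + DAMP * s;
        }
        for (i = 0; i < N; i++) {
            diff += fabs(next[i] - score[i]);
            score[i] = next[i];
        }
        if (diff < 1e-9)                   /* scores converged */
            break;
    }
    for (i = 0; i < N; i++)
        printf("member %d: wisdom score %.4f\n", i, score[i]);
    return 0;
}

The scores settle in a few dozen iterations. Whether anyone would act on the output is another matter; as the thread shows, the hard part is the volunteering, not the linear algebra.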
From james.p.lux at jpl.nasa.gov Mon Jan 30 18:21:45 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Mon, 30 Jan 2012 15:21:45 -0800 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> Message-ID: The biggest hurdle would probably be getting volunteers, though. Peter You got that right... Moderating takes a deft touch and a thick skin. -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From james.p.lux at jpl.nasa.gov Mon Jan 30 18:25:49 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Mon, 30 Jan 2012 15:25:49 -0800 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <4F27233C.8080508@scalableinformatics.com> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> <4F27233C.8080508@scalableinformatics.com> Message-ID: -----Original Message----- From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Joe Landman Sent: Monday, January 30, 2012 3:10 PM To: beowulf at beowulf.org Subject: Re: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business On 01/30/2012 06:07 PM, Peter St. John wrote: > Instead of appointing a moderator, we could grow one with recursive > Page Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we > knew about this type of thing a while ago because of "citation > analysis", see the link). Please ... no moderator. Lists get boring while waiting for content filtering organisms to fulfill their voluntary tasks ... If you don't like someone's writing, filter them. -- I agree. However, there is also "after the fact moderation".. all posts go through by default, but someone acts as a "list conscience" and gently (or not so gently) applies a corrective force, presumably using some sort of adaptive algorithm (different people have different "plant characteristics" so the optimal controller changes). But that requires an even deft-er touch and thicker skin. All lists with participation by knowledgeable and opinionated people with varied interests and specialization tend to go off on tangents occasionally. You just delete when needed, and wait for the transient to die out. 
My best guess is that about 48 hours is how long the transient lasts (because it takes two cycles, for those who read the list once a day, to realize that it's died out and not keep feeding it) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From deadline at eadline.org Mon Jan 30 18:52:14 2012 From: deadline at eadline.org (Douglas Eadline) Date: Mon, 30 Jan 2012 18:52:14 -0500 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> Message-ID: <294b053bd84fed49f071a631c79be7e8.squirrel@mail.eadline.org> I use my personal Zen type moderation. yea, whatever -- Doug > Instead of appointing a moderator, we could grow one with recursive Page > Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we knew > about this type of thing a while ago because of "citation analysis", see > the link). > > Someone writes an open script and members of the list mail it with the > answers to these three questions: > 1. do you volunteer to moderate? > 2. Who should moderate? (give email addresses) > 3. Who should judge who should moderate? (give email addresses). > > Then you iterate over scoring people by "wisdom" and who gets the most > "wise" votes, until the scores converge. > The biggest hurdle would probably be getting volunteers, though. > Peter > > On Mon, Jan 30, 2012 at 11:20 AM, Prentice Bisbal > wrote: > >> On 01/28/2012 05:17 AM, Eugen Leitl wrote: >> > On Fri, Jan 27, 2012 at 01:29:52PM -0500, Prentice Bisbal wrote: >> > >> >> What it says is that we've given up on discussing technology with >> you, >> >> because your arguments are completely nonsensical. Since you clearly >> >> don't understand technology, we're hoping you can at least understand >> >> the simple concepts of basic etiquette. >> > Who's the list moderator, by the way? >> > >> >> I don't think there is one, hence all the noise. The mailing list and >> beowulf.org is maintained by Penguin Computing/Scyld Software. Maybe >> they'd be interested in appoint a moderator or 3. >> >> --- >> Prentice >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From lindahl at pbm.com Tue Jan 31 02:53:18 2012
From: lindahl at pbm.com (Greg Lindahl)
Date: Mon, 30 Jan 2012 23:53:18 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: 
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu>
Message-ID: <20120131075318.GA2600@bx9.net>

On Mon, Jan 30, 2012 at 10:04:53AM -0500, Mark Hahn wrote:

> > http://www.cscs.ch/fileadmin/user_upload/customers/cscs/Tech_Reports/Performance_Analysis_IB-QDR_final-2.pdf
>
> as far as I can tell, this paper mainly says "a coalescing stack delivers
> benchmark results showing a lot higher bandwidth and message rate than a
> non-coalescing stack." the comment on figure 8:
>
>     To some extent, the environment variables mentioned before
>     contribute to this outstanding result
>
> which is remarkably droll. I'm not sure how well coalescing works for real
> applications.

Note also that many of the benchmarks in this analysis weren't run using MPI -- if I remember correctly, the ib_* commands mentioned use InfiniBand verbs directly, which means they aren't accelerated on InfiniPath.

-- greg

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From eugen at leitl.org Tue Jan 31 04:28:18 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 31 Jan 2012 10:28:18 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F27233C.8080508@scalableinformatics.com>
References: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> <4F27233C.8080508@scalableinformatics.com>
Message-ID: <20120131092818.GW7343@leitl.org>

On Mon, Jan 30, 2012 at 06:09:48PM -0500, Joe Landman wrote:
> On 01/30/2012 06:07 PM, Peter St. John wrote:
> > Instead of appointing a moderator, we could grow one with recursive Page
> > Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we knew
> > about this type of thing a while ago because of "citation analysis", see
> > the link).
>
> Please ... no moderator. Lists get boring while waiting for content
> filtering organisms to fulfill their voluntary tasks ...

On all the lists I run and participate in you only turn moderation on by default for new list members and put known bozos on permanent moderation. The result is zero delay as soon as new list subscribers have produced their first non-spam non-bozo post.

> If you don't like someone's writing, filter them.

I already do, but content producers typically don't bother and vote with their feet. I have seen many communities die in that manner. Never surprising, still always sad.
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From eugen at leitl.org Tue Jan 31 04:31:04 2012 From: eugen at leitl.org (Eugen Leitl) Date: Tue, 31 Jan 2012 10:31:04 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> Message-ID: <20120131093104.GX7343@leitl.org> On Mon, Jan 30, 2012 at 03:21:45PM -0800, Lux, Jim (337C) wrote: > > > The biggest hurdle would probably be getting volunteers, though. > Peter > > You got that right... Moderating takes a deft touch and a thick skin. I would have no issues moderating Beowulf@ since that would require only negligible additional workload. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Glen.Beane at jax.org Tue Jan 31 07:15:51 2012 From: Glen.Beane at jax.org (Glen Beane) Date: Tue, 31 Jan 2012 12:15:51 +0000 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <20120131093104.GX7343@leitl.org> References: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> <20120131093104.GX7343@leitl.org> Message-ID: On Jan 31, 2012, at 4:31 AM, Eugen Leitl wrote: > On Mon, Jan 30, 2012 at 03:21:45PM -0800, Lux, Jim (337C) wrote: >> >> >> The biggest hurdle would probably be getting volunteers, though. >> Peter >> >> You got that right... Moderating takes a deft touch and a thick skin. > > I would have no issues moderating Beowulf@ since that would > require only negligible additional workload. Did this list used to be moderated? I remember when I first joined there would be a significant delay for my email sent to the list, while I was waiting for my replies to show up a whole conversation would be unfolding between "veteran posters" -- Glen L. Beane Senior Software Engineer The Jackson Laboratory (207) 288-6153 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From ellis at cse.psu.edu Tue Jan 31 10:30:48 2012 From: ellis at cse.psu.edu (Ellis H. 
Wilson III) Date: Tue, 31 Jan 2012 10:30:48 -0500 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> <20120131093104.GX7343@leitl.org> Message-ID: <4F280928.7080806@cse.psu.edu> On 01/31/2012 07:15 AM, Glen Beane wrote: > Did this list used to be moderated? I remember when I first joined there would be a significant delay for my email sent to the list, while I was waiting for my replies to show up a whole conversation would be unfolding between "veteran posters" Yea, same used to happen to me back in '06 when I first joined. Sent an email about it and got a response back from Don Becker stating that I was taken off the moderation list. I'm not sure if he's still the moderator anymore, however. While I think that's a great way to deal with newcomers, I'm not sure there is a fair way to determine which of the existing posters are and are not trolls deserving of moderation. Therefore I also vote to continue in a non-moderated fashion. On that note, my sincere apologies to the list if any of my replies served in any way to kindle this discussion. I got a bit colorful due to a building frustration from years of eye-rolling. Best, ellis _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From cbergstrom at pathscale.com Tue Jan 31 10:40:48 2012 From: cbergstrom at pathscale.com (=?ISO-8859-1?Q?=22C=2E_Bergstr=F6m=22?=) Date: Tue, 31 Jan 2012 22:40:48 +0700 Subject: [Beowulf] List moderation In-Reply-To: <4F280928.7080806@cse.psu.edu> References: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> <20120131093104.GX7343@leitl.org> <4F280928.7080806@cse.psu.edu> Message-ID: <4F280B80.6030800@pathscale.com> On 01/31/12 10:30 PM, Ellis H. Wilson III wrote: > On 01/31/2012 07:15 AM, Glen Beane wrote: >> Did this list used to be moderated? I remember when I first joined there would be a significant delay for my email sent to the list, while I was waiting for my replies to show up a whole conversation would be unfolding between "veteran posters" > Yea, same used to happen to me back in '06 when I first joined. Sent an > email about it and got a response back from Don Becker stating that I > was taken off the moderation list. I'm not sure if he's still the > moderator anymore, however. While I think that's a great way to deal > with newcomers, I'm not sure there is a fair way to determine which of > the existing posters are and are not trolls deserving of moderation. > Therefore I also vote to continue in a non-moderated fashion. -1 From a bystander perspective I'm all for moderation and reducing the noise. Even people who have their posts moderated would likely be understanding that it's for the greater good. Lets call it peer review instead of "moderation". 
imho someone with some guts just needs to do it so this doesn't turn into a bikeshed discussion

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From joshua_mora at usa.net Tue Jan 31 14:19:46 2012
From: joshua_mora at usa.net (Joshua mora acosta)
Date: Tue, 31 Jan 2012 13:19:46 -0600
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
Message-ID: <525qaETsU7536S02.1328037586@web02.cms.usa.net>

I agree with Joe. Plus I know that most of us, if not all, truly want to share knowledge, and why not, opinions as well based on personal experiences as long as "we all do the effort to be respectful with both the individual and the technology and being open/receptive to be criticized as well". That is in fact the reason I like this distribution list.

Joshua.

------ Original Message ------
Received: 05:11 PM CST, 01/30/2012
From: Joe Landman
To: beowulf at beowulf.org
Subject: Re: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business

> On 01/30/2012 06:07 PM, Peter St. John wrote:
> > Instead of appointing a moderator, we could grow one with recursive Page
> > Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we knew
> > about this type of thing a while ago because of "citation analysis", see
> > the link).
>
> Please ... no moderator. Lists get boring while waiting for content
> filtering organisms to fulfill their voluntary tasks ...
>
> If you don't like someone's writing, filter them.
>
> --
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics Inc.
> email: landman at scalableinformatics.com
> web  : http://scalableinformatics.com
>        http://scalableinformatics.com/sicluster
> phone: +1 734 786 8423 x121
> fax  : +1 866 888 3112
> cell : +1 734 612 4615
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From mdidomenico4 at gmail.com Tue Jan 31 15:55:55 2012
From: mdidomenico4 at gmail.com (Michael Di Domenico)
Date: Tue, 31 Jan 2012 15:55:55 -0500
Subject: [Beowulf] rear door heat exchangers
Message-ID: 

i'm looking for, but have not found yet, a rear door heat exchanger with fans. the door should be able to support up to 35kw using chilled water. has anyone seen such an animal?

most of the ones i've seen utilize a side car that sits beside the rack. unfortunately, i'm space limited and i need something that will hang on the back of the rack.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
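Before the replies that follow, a quick back-of-the-envelope on what 35kw implies for the plumbing, from the heat balance Q = m-dot * c_p * delta-T. The 10 F (about 5.6 K) coolant temperature rise used below is an assumed figure for illustration, not a number from the thread:

% assumed: 5.6 K water temperature rise; c_p of water = 4.18 kJ/(kg K)
\[
\dot{m} \;=\; \frac{Q}{c_p\,\Delta T}
        \;=\; \frac{35\,\mathrm{kW}}{4.18\,\mathrm{kJ\,kg^{-1}\,K^{-1}} \times 5.6\,\mathrm{K}}
        \;\approx\; 1.5\,\mathrm{kg/s} \;\approx\; 90\,\mathrm{L/min} \;\approx\; 24\,\mathrm{gal/min}
\]

So a 35 kW door means moving chilled water at roughly 25-30 gal/min through a hinged panel. That squares with the ~9 gal/min per 10-11 kW reported for the LBNL doors later in the thread, and the weight of that much plumbing may be part of why (as Jim speculates below) commercial doors top out near 33 kW.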
From lathama at gmail.com Tue Jan 31 16:13:48 2012
From: lathama at gmail.com (Andrew Latham)
Date: Tue, 31 Jan 2012 18:13:48 -0300
Subject: [Beowulf] rear door heat exchangers
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jan 31, 2012 at 5:55 PM, Michael Di Domenico wrote:
> i'm looking for, but have not found yet, a rear door heat exchanger
> with fans. the door should be able to support up to 35kw using
> chilled water. has anyone seen such an animal?
>
> most of the ones i've seen utilize a side car that sits beside the
> rack. unfortunately, i'm space limited and i need something that will
> hang on the back of the rack.
> _____________________________

Maybe: http://www.hoffmanonline.com/product_catalog/section_index.aspx?cat_1=34&cat_2=2383&SelectCatId=2383&CatId=2383

Semi-related question: Has any research been done on cooling the racks/rails/metal infrastructure in the effort to cool the whole rack+systems?

-- 
~ Andrew "lathama" Latham lathama at gmail.com http://lathama.net ~

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From james.p.lux at jpl.nasa.gov Tue Jan 31 18:47:18 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Tue, 31 Jan 2012 15:47:18 -0800
Subject: [Beowulf] rear door heat exchangers
In-Reply-To: 
References: 
Message-ID: 

Maybe there's an issue with the weight and/or flexible tubing on a swinging door?

The Hoffman products in Andrew's email, I think, aren't the kind that hang on a door; they hang more on the side of a large box/cabinet (Type 4, 12, 3R enclosure) or wall. They're also air/air heat exchangers or air conditioners (and vortex coolers.. but you don't want one of those unless you have a LOT of compressed air available)

http://www.42u.com/cooling/liquid-cooling/liquid-cooling.htm shows "in-row liquid cooling" but I think that's sort of in parallel. They do mention, lower down on the page, "Rear Door Liquid Cooling"

But I notice that the Liebert XDF-5, which is basically a rack and chiller deck in one, only pulls out 14kW.

From DoE: http://www1.eere.energy.gov/femp/pdfs/rdhe_cr.pdf
They refer to the ones installed at LBNL as RDHx units, but carefully avoid telling you the brand or any decent data. They do say they cost $6k/door, and suck up 10-11kW/rack with 9 gal/min flow of 72F water.

Googling RDHx turns up "CoolCentric.com"
http://www.coolcentric.com/resources/data_sheets/Coolcentric-Rear-Door-Heat-Exchanger-Data-Sheet.pdf
33kW is as good as they can do. I also note that they have no fans in them.

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Michael Di Domenico
Sent: Tuesday, January 31, 2012 12:56 PM
To: Beowulf Mailing List
Subject: [Beowulf] rear door heat exchangers

i'm looking for, but have not found yet, a rear door heat exchanger with fans. the door should be able to support up to 35kw using chilled water. has anyone seen such an animal?

most of the ones i've seen utilize a side car that sits beside the rack. unfortunately, i'm space limited and i need something that will hang on the back of the rack.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From sdm900 at gmail.com Tue Jan 31 18:54:48 2012
From: sdm900 at gmail.com (Stu Midgley)
Date: Wed, 1 Feb 2012 07:54:48 +0800
Subject: [Beowulf] rear door heat exchangers
In-Reply-To: 
References: 
Message-ID: 

Speak to SGI. We have about a dozen such racks, all from SGI.

On Wed, Feb 1, 2012 at 4:55 AM, Michael Di Domenico wrote:
> i'm looking for, but have not found yet, a rear door heat exchanger
> with fans. the door should be able to support up to 35kw using
> chilled water. has anyone seen such an animal?
>
> most of the ones i've seen utilize a side car that sits beside the
> rack. unfortunately, i'm space limited and i need something that will
> hang on the back of the rack.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Dr Stuart Midgley
sdm900 at gmail.com

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From Herbert.Fruchtl at st-andrews.ac.uk Tue Jan 31 19:18:10 2012
From: Herbert.Fruchtl at st-andrews.ac.uk (Herbert Fruchtl)
Date: Wed, 1 Feb 2012 00:18:10 +0000
Subject: [Beowulf] moderation - was cpu's versus gpu's - was Intel buys QLogic
Message-ID: <97E75B730E3076479EB136E2BC5D410054874491@uos-dun-mbx1>

Folks,

I missed part of this discussion (for obvious reasons I lost interest), but since it seems to be moving in that direction, I'll throw in my two smallest-local-currency-units. I'm a lurker (in old usenet parlance) on this list: reading, but very rarely posting. There are probably many of us, but the others are posting even more rarely...

As long as we don't get real off-topic discussions that attract the weirdos of the Internet (global warming anybody? intelligent design? even C/Fortran tends to peter out quickly nowadays), I am opposed to censorship (aka moderation). The simplistic arguments are:

1) This is my own, selfish, most important argument: it costs time! When, every two years, I have a technical question for the list, I don't want to wait until the USA is out of bed and hope that the moderator isn't at a conference for a week.

2) You need a moderator. It's quite some work, so it will only be done by somebody who gets some satisfaction out of it. This means that the job will attract exactly the kind of people who will not moderate neutrally and dispassionately. Even if they try, there's the fact that power corrupts. You're tempted to censor views that are too far from your own ("ludicrous" is the word you would use), and in the end you have an in-crowd confirming each other's views.
3) You are opening yourself to lawsuits. If something is said on the list that, let's say Intel's corporate lawyers find defamatory, they may go after the moderator.

If you really find somebody's views (and their presentation) objectionable, just killfile them (it's called "filter" in the 21st century). And if certain people think ad hominem attacks help their case, ignore them instead of thinking you can look dignified in taking them on in their own game. You won't.

Back to those dark alleys where we lurkers feel at home...

  Herbert

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From diep at xs4all.nl Wed Jan 11 12:00:43 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 18:00:43 +0100
Subject: [Beowulf] Course: Parallel Programming of High Performance Systems
Message-ID: <7B7DB325-4FFB-4C68-9602-2E1E71B41D12@xs4all.nl>

On Jan 11, 2012, at 5:09 PM, Lux, Jim (337C) wrote:

> [...]
> (and don't get me started on my experiences with the f2c engine)

No need to get started, Jim - NASA can ask the Russians about that as
well.

From prentice at ias.edu Wed Jan 11 11:58:59 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Wed, 11 Jan 2012 11:58:59 -0500
Subject: [Beowulf] A cluster of Arduinos
Message-ID: <4F0DBFD3.3070503@ias.edu>

On 01/11/2012 11:18 AM, Lux, Jim (337C) wrote:

> Has anyone done something where they implement some sort of message
> passing API on a network of Arduinos? [...]

I started tinkering with Arduinos a couple of months ago. Got lots of
related goodies for Christmas, so I've been looking like a mad
scientist building Arduino things lately. I'm still a beginner Arduino
hacker, but I'd be game for giving this a try, if anyone else wants to
give this a go.

The Arduino Due, which is overdue in the marketplace, will have a
Cortex-M3 ARM processor.

-- 
Prentice

From diep at xs4all.nl Wed Jan 11 12:30:30 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 18:30:30 +0100
Subject: [Beowulf] clustering using off the shelf systems in a fish tank full of oil.
In-Reply-To: <4F020244.4040505@ias.edu>
Message-ID: <1820F354-C0D4-4337-A9EB-DDBD9CB50761@xs4all.nl>

On Jan 2, 2012, at 8:15 PM, Prentice Bisbal wrote:

> On 12/29/2011 07:50 PM, Vincent Diepeveen wrote:
>> it's very useful Mark, as we know now he works for the company and
>> also for which nation.
>>
>> Vincent
>
> For someone who's always bashing on US foreign policy, you sure
> sound like a Republican or a member of the Department of Homeland
> Security!

Where is my paycheck?

From ntmoore at gmail.com Wed Jan 11 12:31:30 2012
From: ntmoore at gmail.com (Nathan Moore)
Date: Wed, 11 Jan 2012 11:31:30 -0600
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0DBFD3.3070503@ias.edu>

I think something like the Raspberry Pi might be easier for this sort
of task. They'll also be about $25, but they'll run something like
ARM/Linux. Not out yet though.

http://www.raspberrypi.org/

On Wed, Jan 11, 2012 at 10:58 AM, Prentice Bisbal wrote:

> I started tinkering with Arduinos a couple of months ago. [...] I'm
> still a beginner Arduino hacker, but I'd be game for giving this a
> try, if anyone else wants to give this a go.

-- 
- - - - - - -   - - - - - - -   - - - - - - -
Nathan Moore
Associate Professor, Physics
Winona State University
- - - - - - -   - - - - - - -   - - - - - - -
From diep at xs4all.nl Wed Jan 11 12:43:17 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 18:43:17 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0DBFD3.3070503@ias.edu>
Message-ID: <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl>

On Jan 11, 2012, at 5:58 PM, Prentice Bisbal wrote:

> The Arduino Due, which is overdue in the marketplace, will have a
> Cortex-M3 ARM processor.

Completely superior chip, that Cortex-M3 - though I haven't been able
to program much for it so far; it's difficult to get contract jobs
for. It can do fast 32 x 32 bit multiplication; you can even implement
RSA very fast on that chip. It runs at 70 MHz or so?

Usually writing assembler for such CPUs is more efficient than using a
compiler, by the way. Compilers are not so efficient, to put it
politely, for embedded CPUs. Writing assembler for such CPUs is pretty
straightforward, whereas in HPC things are far more complicated
because of vectorization. AVX is the latest there.

Speaking of AVX, is there already much HPC support for AVX? I see that
after years of wrestling George Woltman released some prime number
code (GWNUM) - of course, as always, in beta for the remainder of this
century - which uses AVX. Claims are that it's a tad faster than the
existing SIMD codes. I saw claims of even above 20% faster, which is
really a lot at that level of engineering; usually you work 6 months
for a 0.5% speedup.

Even if you improve the algorithm, you still lose to this code, as
your C/C++ code will by default be a factor of 10 slower, if not more.
I remember how I found a clever caching trick in 2006 for a Number
Theoretic Transform (that's an FFT over the integers, without the
rounding errors that floating-point FFTs give), yet after some hard
work my C code was still a factor of 8 slower than Woltman's SIMD
assembler.

From diep at xs4all.nl Wed Jan 11 12:44:43 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 18:44:43 +0100
Subject: [Beowulf] A cluster of Arduinos

That's all very expensive considering the CPUs are under $1 each, I'd
guess. I actually might need some of this stuff some months from now
to build some robots.

On Jan 11, 2012, at 6:31 PM, Nathan Moore wrote:

> I think something like the Raspberry Pi might be easier for this
> sort of task. They'll also be about $25, but they'll run something
> like ARM/Linux. Not out yet though.
>
> http://www.raspberrypi.org/
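On Vincent's AVX question: the intrinsics in <immintrin.h> are the
usual way in when the compiler won't auto-vectorize. A toy
illustration of the idea (a generic sketch, nothing to do with GWNUM's
hand-written assembler) - four doubles per register instead of one:

    // axpy_avx.cpp - build: g++ -mavx axpy_avx.cpp
    // Computes y = a*x + y over doubles, 4 lanes at a time.
    #include <immintrin.h>
    #include <cstdio>

    int main()
    {
        alignas(32) double x[8] = {1, 2, 3, 4, 5, 6, 7, 8};
        alignas(32) double y[8] = {0};
        const __m256d a = _mm256_set1_pd(2.0);    // broadcast scalar

        for (int i = 0; i < 8; i += 4) {
            __m256d xv = _mm256_load_pd(x + i);   // 4 doubles at once
            __m256d yv = _mm256_load_pd(y + i);
            yv = _mm256_add_pd(_mm256_mul_pd(a, xv), yv);
            _mm256_store_pd(y + i, yv);
        }
        std::printf("y[0]=%g y[7]=%g\n", y[0], y[7]);  // 2 and 16
        return 0;
    }

The 20%-class gains Vincent mentions come from far more than this, of
course - transform-specific scheduling and cache blocking - but this
is the instruction set those codes are built on.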
From james.p.lux at jpl.nasa.gov Wed Jan 11 12:58:13 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 09:58:13 -0800
Subject: [Beowulf] A cluster of Arduinos
References: <4F0DBFD3.3070503@ias.edu>

Yes.. better the widget that one can whip on down to Radio Shack and
buy on the way home from work than the ghostware that may live for
Christmas future.

Also, does the Raspberry Pi $25 price point include a power supply?
The Arduino runs off the USB 5V power, so it's one less thing to
hassle with.

I don't know that performance is all that important in this
application. It's more to experiment with message passing in a
multiprocessor system. Slow is fine. (I can't think of a computational
application for an ArdWulf - combining Italian and Saxon - that
wouldn't be blown away by almost any single computer, including
something like a smart phone.)

Realistically, you're looking at bit-banging kinds of serial
interfaces. I can see several network implementations: SPI shared bus,
hypercubes, toroidal surfaces, etc.

-----Original Message-----
From: Nathan Moore
Sent: Wednesday, January 11, 2012 9:32 AM
Subject: Re: [Beowulf] A cluster of Arduinos

I think something like the Raspberry Pi might be easier for this sort
of task. They'll also be about $25, but they'll run something like
ARM/Linux. Not out yet though.

http://www.raspberrypi.org/

From james.p.lux at jpl.nasa.gov Wed Jan 11 13:00:36 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 10:00:36 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl>

> Completely superior chip, that Cortex-M3. [...] Usually writing
> assembler for such CPUs is more efficient than using a compiler.
> [...] Writing assembler for such CPUs is pretty straightforward,
> whereas in HPC things are far more complicated because of
> vectorization.

-->> ah, but this is not really an HPC application. It's a cluster
computer architecture demonstration platform. The Java-based Arduino
environment is pretty simple and multiplatform. Yes, it uses a sort of
weird C-like language, but there it is... it's easy to use.

From james.p.lux at jpl.nasa.gov Wed Jan 11 13:19:24 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 10:19:24 -0800
Subject: [Beowulf] A cluster of Arduinos

Yes.. and there's been a bunch of "value clusters" over the years
(StoneSouperComputer, for instance).. But that's still $3k. I could
see putting together 8 nodes for a few hundred dollars. The Arduino
Uno R3 is about $25 each in quantity.

Think in terms of a small class where you want to have, say, 10
mini-clusters, one per student. No sharing, etc.

-----Original Message-----
From: Alex Chekholko
Sent: Wednesday, January 11, 2012 10:12 AM
Subject: Re: [Beowulf] A cluster of Arduinos

The LittleFe cluster is designed specifically for teaching and
demonstration. Current cost is ~$3k. But it's all standard x86 and
runs Linux and even has GPUs.

http://littlefe.net/

I saw them build a bunch of them at SC11.

On Wed, Jan 11, 2012 at 10:00 AM, Lux, Jim (337C) wrote:
> It's a cluster computer architecture demonstration platform.
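The message-passing demo Jim is sketching needs surprisingly little
code on each node. Below is a hypothetical single-node sketch for a
token ring of Arduinos - the pin numbers, the 9600 baud rate, and the
wiring (each node's TX to the next node's RX) are all assumptions for
illustration; SoftwareSerial is the stock Arduino library:

    // ring_node.ino - one node of a toy Arduino token ring.
    #include <SoftwareSerial.h>

    const byte MY_ID = 0;            // set 0..N-1 per node at upload
    const int  LED   = 13;           // on-board LED on an Uno
    SoftwareSerial link(10, 11);     // RX from previous node, TX to next

    void setup() {
      pinMode(LED, OUTPUT);
      link.begin(9600);
      Serial.begin(9600);            // USB back to the head-node PC
      if (MY_ID == 0)
        link.write((byte)0);         // node 0 seeds the first token
    }

    void loop() {
      if (link.available()) {
        byte token = link.read();    // one-byte message arrives
        digitalWrite(LED, HIGH);     // blink to show the token passing
        delay(100);
        digitalWrite(LED, LOW);
        Serial.print("token ");      // report to the debug console
        Serial.println(token);
        link.write((byte)(token + 1));  // forward it, incremented
      }
    }

Unplugging one jumper gives exactly the visible link-failure
experiment described later in this thread: the blinking stops at the
break.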
From james.p.lux at jpl.nasa.gov Wed Jan 11 13:27:31 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 10:27:31 -0800
Subject: [Beowulf] PAPERS interface

Arghh.. my google-fu is failing me..

I'm looking for the papers on the PAPERS cluster interface (based on
using parallel ports.. back in the 90s) and, of course, if you search
for the word "papers", you get nothing useful..

I can't remember who the authors were or where it was done (I'm
thinking in the Southeast US, for some reason, but I'm not sure).

From sabujp at gmail.com Wed Jan 11 13:35:17 2012
From: sabujp at gmail.com (Sabuj Pattanayek)
Date: Wed, 11 Jan 2012 12:35:17 -0600
Subject: [Beowulf] PAPERS interface

https://www.google.com/search?hl=en&q=%22PAPERS%22%20parallel%20port%20interface&btnG=Google+Search

http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1183&context=ecetr

HTH,
Sabuj
Google Proxy Certified Search Partner

On Wed, Jan 11, 2012 at 12:27 PM, Lux, Jim (337C) wrote:
> Arghh.. my google-fu is failing me..
>
> I'm looking for the papers on the PAPERS cluster interface (based on
> using parallel ports.. back in the 90s) [...]

From james.p.lux at jpl.nasa.gov Wed Jan 11 13:37:14 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 10:37:14 -0800
Subject: [Beowulf] PAPERS interface
In-Reply-To: <4F0DD65B.3060808@nasa.gov>

Thanks.. Also props to Juan Gallego, who found it too..

From: Jeff Becker [mailto:Jeffrey.C.Becker at nasa.gov]
Sent: Wednesday, January 11, 2012 10:35 AM
Subject: Re: [Beowulf] PAPERS interface

Hi Jim. The lead author is Hank Dietz. The acronym is PAPERS: Purdue's
Adapter for Parallel Execution and Rapid Synchronization.

Cheers from NASA Ames...

-jeff

From james.p.lux at jpl.nasa.gov Wed Jan 11 13:39:41 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 10:39:41 -0800
Subject: [Beowulf] PAPERS interface

Excellent.. Purdue.. and have we really been beowulfing since 1994?
I'll bet that the earliest clusters can legally buy alcohol now...

So, if I build a cluster with Arduinos using the PAPERS-style
interface, what will it be called... BeoPaperDuino?

From atp at piskorski.com Wed Jan 11 14:38:53 2012
From: atp at piskorski.com (Andrew Piskorski)
Date: Wed, 11 Jan 2012 14:38:53 -0500
Subject: [Beowulf] PAPERS interface
Message-ID: <20120111193853.GA86203@piskorski.com>

On Wed, Jan 11, 2012 at 10:27:31AM -0800, Lux, Jim (337C) wrote:

> I'm looking for the papers on the PAPERS cluster interface (based on
> using parallel ports.. back in the 90s)

It also came up a few times here on the list, e.g.:

http://www.beowulf.org/archive/2004-October/010934.html
From: Tim Mattox
Date: Sat Oct 16 15:15:14 PDT 2004

-- 
Andrew Piskorski
http://www.piskorski.com/

From diep at xs4all.nl Wed Jan 11 17:47:00 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 23:47:00 +0100
Subject: [Beowulf] A cluster of Arduinos
Message-ID: <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl>

Jim, your microcontroller cluster is not such a good idea.

Latency didn't keep up with the CPU speeds...

Today's nodes have a CPU with a core or 12, soon 16, which can execute
- to take a simple integer example, my chess program and its IPC -
about 24 instructions per cycle.

So nothing SIMD, just simple integer instructions mostly; of course
loads which effectively come from L1 play an overwhelming role there.

Typical latencies for a random memory read from a remote node, even
with the latest networks, are between 0.85 and 1.9 microseconds. Let's
take an optimistic 1 microsecond for an RDMA read... So in that
timeframe you can execute 24k+ instructions.

IPC at the cheapo CPUs is effectively far under 1 - around 0.25 for
most codes - so a 70 MHz CPU executes roughly one instruction every
four cycles. Now we are working with rough measures here; call the
latency of such a microcontroller cluster 1/4 millisecond. Even USB
1.1 sticks have latencies far under 1 millisecond.

So, relative to CPU speed, the latency of today's clusters is a factor
of 25k worse than this 'cluster'. In fact your microcontroller cluster
here has relative latencies that you do not even have core to core
within a single CPU today.

There is still too much 1980s and 1990s software out there, written by
the guys who wrote the books about how to parallelize, which simply
doesn't scale at all on modern hardware.

Let me not quote too many names there, as I've done before. They were
just too lazy to throw away their old code and start over, writing a
new parallel concept that works on today's hardware.

If we involve GPUs, there is going to be an even bigger problem, and
that's that the bandwidth of the network can't keep up with what a
single GPU delivers. Who is to blame for that is quite a complicated
discussion, if anyone has to be blamed at all. We just need more
clever algorithms there.

From diep at xs4all.nl Wed Jan 11 17:56:12 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 11 Jan 2012 23:56:12 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl>
Message-ID: <106FFC0A-B488-4A39-8C55-7FD27C3BCFC1@xs4all.nl>

On Jan 11, 2012, at 11:47 PM, Vincent Diepeveen wrote:

> So in that timeframe you can execute 24k+ instructions.

Hah, how easy it is to make a mistake - sorry for that. I didn't even
multiply by the GHz frequency of the CPUs yet. So if it's 3 GHz or so,
it's actually closer to a factor of 75k than 24k.

Furthermore, another problem is that you can't fully load networks, of
course.
So to keep the network behaving well you want to do such hammering
over the network no more than once every 750k instructions.

From james.p.lux at jpl.nasa.gov Wed Jan 11 18:24:55 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 15:24:55 -0800
Subject: [Beowulf] A cluster of Arduinos

-----Original Message-----
From: Vincent Diepeveen
Sent: Wednesday, January 11, 2012 2:47 PM
Subject: Re: [Beowulf] A cluster of Arduinos

> Jim, your microcontroller cluster is not such a good idea.
>
> Latency didn't keep up with the CPU speeds...

--- You're missing the point of the cluster. It's not for performance
(where I can't imagine that the slowest single-CPU PC out there
wouldn't blow the figurative doors off). It's to provide a very
inexpensive way to experiment/play/demonstrate loosely coupled
multiprocessor systems.

--> for example, you could experiment with redundant message routing
across a fabric of nodes. The algorithms are fairly simple, and this
gives you a testbed which is qualitatively different than just
simulating a bunch of nodes on a single PC. There is pedagogical value
in a system where you can force a link error by just disconnecting the
cable, and your blinky lights on each node show what's going on.
> There is still too much 1980s and 1990s software out there, written
> by the guys who wrote the books about how to parallelize, which
> simply doesn't scale at all on modern hardware.

--> I think that a lot of the theory of parallel processes is speed
independent, and while some historical approaches might not be used in
a modern system for good implementation reasons, students and others
still need to learn about them, if only as the canonical approach.
Sure, you could do a simulation on a single PC (and I've seen them, in
Simulink and in other more specialized tools), but there's a lot of
appeal to a hands-on-the-cheap-hardware approach to learning.

--> To take an example, if you set a student a problem of lighting an
LED on each node in a specified node order at specified intervals,
where the node interconnects are not specified in advance, that's a
fairly interesting homework problem. You have to discover the network
connectivity graph, then figure out how to pass the message to the
appropriate node at the appropriate time. This is a classic "hot plug
network discovery" kind of problem, and in the face of intermittent
links, it's of great interest.

--> While that particular problem isn't exactly HPC, it DOES relate to
HPC in a world where you cannot assume perfect processor nodes and
perfect communications links. And that gets right to the whole
"scalability" thing in HPC. It wasn't until the implementation of
error-correcting codes in logic that something like the Q7A computer
was even possible, because it was so large that you couldn't guarantee
that all the tubes would be working all the time. Likewise with many
other aspects of modern computing.

--> And, of course, in the spaceflight world, this kind of thing is
even more important. A concept of growing importance is the
"fractionated spacecraft", where all of the functions that would have
been in one physical vehicle are now spread across many smaller
pieces. And one might reallocate spacecraft fractional pieces between
different virtual spacecraft. Maybe right now you need a lot of
processing power to do image compression and analysis, so you want to
allocate a lot of "processing pieces" to the job, with an ad hoc
network connection among them. Later, you don't need them, so you can
release them to other uses. The pieces might be in the immediate
vicinity, or they might be some distance away, which affects the data
rate in the link and its error rates.

--> You can legitimately ask whether this sort of thing (the
fractionated spacecraft) is a Beowulf (defined as a cluster
supercomputer built of commodity components) and I would say it shares
many of the same properties, especially in the early Beowulf days
before multicores and fancy interconnects were fashionable for
multi-thousand-processor clusters. It's that idea of building a large
complex device out of many basically identical subunits, using open
source/simple software to manage it.

-->> in summary, it's not about performance.. it's about a teaching
tool for networking in the context of cluster computing. You claim we
need to cast off the shackles of old programming styles and get some
new blood and ideas. Well, you need to get people interested in
parallel computing and learning the basics (so at least they don't
reinvent the square wheel). One way might be challenges such as
parallelization of game play; another might be working with a
parallelized database; the way I propose is with experimenting with
message-passing parallelization using dirt cheap hardware.
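Jim's "hot plug network discovery" homework has an equally small
starting point. A hypothetical first step - each node announces its ID
on every link and records what it hears - might look like this (pins,
baud rate, and the two-link node are assumptions; note that only one
SoftwareSerial port can receive at a time, hence the listen() calls):

    // discover.ino - naive neighbour discovery on a two-link node.
    #include <SoftwareSerial.h>

    const byte MY_ID = 3;                      // set per node at upload
    SoftwareSerial linkA(10, 11), linkB(8, 9); // two neighbour links
    byte neighbourA = 0, neighbourB = 0;       // 0 = nothing heard yet

    void setup() {
      linkA.begin(4800);
      linkB.begin(4800);
      Serial.begin(9600);
    }

    void loop() {
      linkA.write(MY_ID);               // announce on both links
      linkB.write(MY_ID);

      linkA.listen();                   // receive on link A only
      delay(50);
      if (linkA.available()) neighbourA = linkA.read();

      linkB.listen();                   // then on link B
      delay(50);
      if (linkB.available()) neighbourB = linkB.read();

      Serial.print("neighbours: ");     // report edges to the head PC
      Serial.print(neighbourA);
      Serial.print(" ");
      Serial.println(neighbourB);
      delay(500);
    }

Collecting each node's report over USB gives the head node the full
connectivity graph; routing the "light LED n at time t" messages over
that graph is then the interesting part of the exercise.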
From deadline at eadline.org Wed Jan 11 19:18:11 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Wed, 11 Jan 2012 19:18:11 -0500
Subject: [Beowulf] PAPERS interface
Message-ID: <2d6fa78f1fc44cea3df118e1c0a27f31.squirrel@mail.eadline.org>

Hank Dietz was at Purdue, now at Kentucky - see aggregate.org.

> Arghh.. my google-fu is failing me..
>
> I'm looking for the papers on the PAPERS cluster interface (based on
> using parallel ports.. back in the 90s) [...]

-- 
Doug

From diep at xs4all.nl Wed Jan 11 19:36:37 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 01:36:37 +0100
Subject: [Beowulf] A cluster of Arduinos

Yes, this was impossible to explain to a bunch of MIT folks as well,
some of whom wrote your book, I bet - yet the slower the processor,
the more of a true SMP system it is. It's obvious that you missed that
point. Writing code for a multicore is tougher, from an SMP
constraints viewpoint, than for a bunch of 70 MHz CPUs that have a
millisecond latency to the other CPUs.

So it's far from demonstrating cluster programming. Light-years away.
Emulation on a simple quad-core is in fact more representative than
this.

If you want to get closer to cluster programming than this, just buy
yourself off eBay some Barcelona-core SMP system with 4 sockets - say,
with energy-efficient 1.8 GHz CPUs. That's with one of the first
incarnations of HyperTransport; of course later on it dramatically
improved. Latency from CPU to CPU is some 300+ ns if you look up
randomly. Even good programmers in game tree search have big problems
working with those latencies.

Clusters have latencies that are far worse than that. Yet as CPU
speeds no longer increase much and the number of cores doesn't double
that quickly, clusters are the way to go if you're CPU hungry. Setting
up small clusters is cheap as well.

If I type 'mellanox' into eBay I see bunches of cheap cards out there,
and also switches. With a single switch you can teach half a dozen
students. You can just connect the machines you already have onto a
few switches and write MPI code like that. The average cost per
student will also be a couple of hundred dollars.

Vincent

On Jan 12, 2012, at 12:24 AM, Lux, Jim (337C) wrote:

> --- You're missing the point of the cluster. It's not for
> performance [...] it's about a teaching tool for networking in the
> context of cluster computing.
From samuel at unimelb.edu.au Wed Jan 11 19:59:18 2012
From: samuel at unimelb.edu.au (Chris Samuel)
Date: Thu, 12 Jan 2012 11:59:18 +1100
Subject: [Beowulf] A cluster of Arduinos
Message-ID: <201201121159.18993.samuel@unimelb.edu.au>

On Thu, 12 Jan 2012 11:36:37 AM Vincent Diepeveen wrote:

> So it's far from demonstrating cluster programming. Light-years
> away.

Whatever happened to hacking on hardware just for the fun of it?

Just because it's not going to be useful doesn't mean you won't learn
from the experience, even if the lesson is only "don't do it again".
:-)

-- 
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
-> but that's an entirely different sort of problem space and instructional area. Clusters are having latencies that are far worse than that. Yet as cpu speeds no longer increase much and number of cores doesn't double that quickly, clusters are the way to go if you're CPU hungry. Setting up small clusters is cheap as well. If i put in the name 'mellanox' in ebay i see bunches of cheap cards out there and also switches. -> Oh, Im sure the surplus market is full of things one could potentially use. But I suspect that by the time you lash together your $40 cards and $20 cables and several hundred $ switch, you're up in the total system price >$1k. And you're using surplus, so there's a support issue. If you're tinkering for yourself in the garage or as a one-off, then surplus is a fine way to go. If you want to be able to give a list of "go buy this" to a teacher, it needs to be off-the-shelf currently being manufactured stuff. -> Say you want to set up 10 demo systems with 8 nodes each, so that each student in a small class has their own to work with. There's a big difference between $30 Arduinos and $200 netbooks. With a single switch you can teach half a dozen students. You can just connect the machines you already got there onto a few switches and write MPI code like that. -> The whole point is to give a student exclusive access to the system, without needing to share. Sure, we've all done the shared "computer lab" resource thing and managed to learn(In the late 1970s, I would have done quite a lot to have on demand access to an 029 keypunch). That's part of what *personal* computers is all about. My program doesn't work right, I just hit the reset button and start over. -> I confess, too, that there is an aspect of the "mass of boards on the desktop with cables strewn around", which is a learning experience in itself. On the other hand, the Arduino experience is a lot less hassle than, say, a mass of PC mobos, network cards, and power supplies and trying to get them to boot off the net or a USB drive. Average cost per student also will be a couple of hundreds of dollars. -> that's the "total cost of several thousand dollars divided by N students who share it" I suspect. We could get into a little BOM battle, and I'd venture that I can keep the off the shelf parts cost under $500, and give each student a dedicated system to play with. The only part that I don't know right off the top of my head is the actual interconnect hardware. I think you'd want to design some sort of board with a bunch of connectors that connects to the Arduinos with ribbon cables. But even there, that could be "here's your PCBExpress file.. order the board and you get 3 for $50" -> over the years I've been involved in several of these "what can we set up for a demonstration", and I've converged to the realization that what you need is a parts list (preferably preloaded at Newark or DigiKey or Mouser or similar) and an explicit set of instructions. A setup that starts out with: 1) Find 8 motherboards on eBay or newegg with these sorts of specs 2) Find 8 power supplies that match the mother boards Is doomed to failure. You need "buy 3 of those and 6 of these, and hook them up this way" This is the beauty of the whole Arduino culture. In fact, it's a bit too much of that.. there's not a lot of good overview tutorial material.. but lots of "here's how to do specific task X"... I got started looking at Arduinos because I want to build a multichannel temperature controller to smoke/cure sausage. 
But I've used just about every small single board computer out there: Rabbit, Basic Stamp, various PIC boards, etc. not to mention various MiniITX and PC schemes. So far, the Arduino is the winner on dirt cheap and simple combined. Spend $30, plug in USB cable, load java environment, done. Now I know why all those projects at the science fair are using them. You get to focus on what you want to do, rather than getting a computer working. Vincent On Jan 12, 2012, at 12:24 AM, Lux, Jim (337C) wrote: > > > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf- > bounces at beowulf.org] On Behalf Of Vincent Diepeveen > Sent: Wednesday, January 11, 2012 2:47 PM > To: Beowulf Mailing List > Subject: Re: [Beowulf] A cluster of Arduinos > > Jim, your microcontroller cluster is not a rather good idea. > > Latency didn't keep up with the CPU speeds... > > --- You're missing the point of the cluster. It's not for performance > (where I can't imagine that the slowest single CPU PC out there > wouldn't blow the figurative doors off). It's to provide a very > inexpensive way to experiment/play/demonstrate loosely coupled > multiprocessor systems. > > --> for example, you could experiment with redundant message > routing across a fabric of nodes. The algorithms are fairly simple, > and this gives you a testbed which is qualitatively > different than just simulating a bunch of nodes on a single PC. > There is pedagogical value in a system where you can force a link > error by just disconnecting the cable, and your blinky lights on each > node show what's going on. > > > There is still too much years 80s and years 90s software out there, > written by the guys who wrote books about how to parallellize, which > simply doesn't scale at all at modern hardware. > > --> I think that a lot of the theory of parallel processes is > speed independent, and while some historical approaches might not be > used in a modern system for good implementation reasons, students and > others still need to learn about them, if only as the > canonical approach. Sure, you could do a simulation on a single > PC (and I've seen them, in Simulink, and in other more specialized > tools), but there's a lot of appeal to a hands-on-the-cheap- hardware > approach to learning. > > --> To take an example, if you set a student a problem of lighting > a LED on each node in a specified node order at specified intervals, > and where the node interconnects are not specified in advance, that's > a fairly interesting homework problem. You have to discover the > network connectivity graph, then figure out how to > pass the message to the appropriate node at the appropriate time. > This is a classic "hot plug network discovery" kind of problem, and in > the face of intermittent links, it's of great interest. > > --> While that particular problem isn't exactly HPC, it DOES relate > to HPC in a world where you cannot assume perfect processor nodes and > perfect communications links. And that gets right to the whole > "scalability" thing in HPC. It wasn't til the implementation of Error > Correcting Codes in logic that something like the Q7A computer was > even possible, because it was so large that you couldn't guarantee > that all the tubes would be working all the time. Likewise with many > other aspects of modern computing. > > --> And, of course, in the spaceflight world, this kind of thing is > even more important. 
> A concept of growing importance is the "fractionated spacecraft", where all of the functions that would have been all in one physical vehicle are now spread across many smaller pieces. And one might reallocate spacecraft fractional pieces between different virtual spacecraft. Maybe right now you need a lot of processing power to do image compression and analysis, so you want to allocate a lot of "processing pieces" to the job, with an ad hoc network connection among them. Later, you don't need them, so you can release them to other uses. The pieces might be in the immediate vicinity, or they might be some distance away, which affects the data rate in the link and its error rates.
>
> --> You can legitimately ask whether this sort of thing (the fractionated spacecraft) is a Beowulf (defined as a cluster supercomputer built of commodity components) and I would say it shares many of the same properties, especially in the early Beowulf days before multicores and fancy interconnects were fashionable for multi-thousand-processor clusters. It's that idea of building a large complex device out of many basically identical subunits, using open source/simple software to manage it.
>
> -->> in summary, it's not about performance.. it's about a teaching tool for networking in the context of cluster computing. You claim we need to cast off the shackles of old programming styles and get some new blood and ideas. Well, you need to get people interested in parallel computing and learning the basics (so at least they don't reinvent the square wheel). One way might be challenges such as parallelization of game play; another might be working with a parallelized database; the way I propose is experimenting with message-passing parallelization using dirt cheap hardware.
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Wed Jan 11 20:22:07 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 11 Jan 2012 17:22:07 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <201201121204.32332.samuel@unimelb.edu.au>
References: <201201121204.32332.samuel@unimelb.edu.au>
Message-ID: 

Interesting... That seems to be a growing trend, then. So, now we just have to wait for them to actually exist. The $35 Model B board has Ethernet, and assuming one could netboot and operate "headless", then a stack o' Raspberry Pis and a cheap Ethernet switch might be an alternate approach.

The "per node" cost is comparable to the Arduino, and it's true that Ethernet is probably more congenial in the long run. Drawing 700mA off the microUSB, though.. That's fairly hefty (although not a big deal in general..
you might need to have some better power supply scheme for a basket o' Pi cluster. (Arduino Uno runs around 40-50 mA.)

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Chris Samuel
Sent: Wednesday, January 11, 2012 5:05 PM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] A cluster of Arduinos

On Thu, 12 Jan 2012 04:58:13 AM Lux, Jim (337C) wrote:

> Also, does the Raspberry PI $25 price point include a power supply?

I thought the plan was for them to be powered from the HDMI connector, but it appears I was wrong; it looks like it can use either microUSB or the GPIO header.

http://elinux.org/RaspberryPiBoard

# The board takes fixed 5V input, (with the 1V2 core voltage generated
# directly from the input using the internal switch-mode supply on the
# BCM2835 die). This permits adoption of the micro USB form factor,
# which, in turn, prevents the user from inadvertently plugging in
# out-of-range power inputs; that would be dangerous, since the 5V
# would go straight to HDMI and output USB ports, even though the
# problem should be mitigated by some protections applied to the input
# power: The board provides a polarity protection diode, a voltage
# clamp, and a self-resetting semiconductor fuse.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Wed Jan 11 21:03:21 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 03:03:21 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: 
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> 
Message-ID: <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl>

The whole purpose of PCs is that they are generic to use. I remember how, in the past, the decision takers bought low-clocked junk for a big price - much against the wishes of the sysadmins, who wanted a PC for every student exclusively. Outdated slow junk is not interesting to students. Now you and I might like that CPU as it's under $1, but to them it's just 70 MHz, a factor of 500 slower than a single core of their home PC. What impresses is if you've got something that can beat their own machine at home.

In the end, in science we basically learn a lot more easily if we can take a look into the future - so being faster than a single PC is a good example of that. So let them do that.

If you take care to launch one process on each machine, then with quad-core machines, not to mention i7s with hyperthreading, you can have 24 computers on one switch that serve 24 students, each using 12 logical cores. And for demonstration purposes you can run successful applications on all 24 computers at the same time. Hey, there are switches with even more ports.

The average price per student is going to beat the crap out of any junk solution you show up with - besides, how many are you going to buy? Those computers are already there, one for each student, I suspect. So they can toy with them exclusively - for the switch it's not a real problem, except if they really mess up. But most important, they learn something - by toying with 70 MHz hardware that's not representative, and only interesting to experts like you and me who are really good at embedded programming, they don't learn much.
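(For what it's worth, the one-process-per-machine layout Vincent describes is easy to express with most MPI launchers. A sketch assuming Open MPI's hostfile syntax; the hostnames are invented for illustration:)

    # hostfile: advertise one MPI slot per physical machine
    lab01 slots=1
    lab02 slots=1
    # ... one line per machine, through lab24

    $ mpirun --hostfile hostfile -np 24 ./demo_app

(Each rank then has a whole machine to itself and can spin up threads for the remaining cores, e.g. via OpenMP and OMP_NUM_THREADS.)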
There is no replacement for the real thing to test on. Besides, if we both go program embedded processors, my good, fast single-CPU code is probably going to kick the hell out of your version of the same program on 8 CPUs. It'll probably be a factor of 10+ faster on a single core than yours on 8.

p.s. Not that it's disturbing, Jim, but your replies are always typed within my original message, so it's sometimes tough to read what you typed into the message I posted here - maybe this Apple MacBook Pro's mail program doesn't know how to handle it. FYI, I want to reformat it to Linux anyway - I'm getting sick of being hacked silly every time by about every other consultant. But this is all off topic - hence the postscriptum.

On Jan 12, 2012, at 2:09 AM, Lux, Jim (337C) wrote:

> [full quote of Jim's previous message snipped]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eagles051387 at gmail.com Thu Jan 12 02:42:26 2012
From: eagles051387 at gmail.com (Jonathan Aquilina)
Date: Thu, 12 Jan 2012 08:42:26 +0100
Subject: [Beowulf] clustering using off the shelf systems in a fish tank full of oil.
In-Reply-To: <1820F354-C0D4-4337-A9EB-DDBD9CB50761@xs4all.nl>
References: <4EFB5AAE.3030900@gmail.com> <715C5657-461B-41E7-9591-5DF89F3CC285@xs4all.nl> <4EFC8D03.4020406@gmail.com> <5AF52A05-28AA-4EE5-A081-EA60BD1E9B32@xs4all.nl> <4EFC9540.5010906@gmail.com> <4F020244.4040505@ias.edu> <1820F354-C0D4-4337-A9EB-DDBD9CB50761@xs4all.nl>
Message-ID: <4F0E8EE2.7040403@gmail.com>

On 11/01/2012 18:30, Vincent Diepeveen wrote:
> On Jan 2, 2012, at 8:15 PM, Prentice Bisbal wrote:
>> On 12/29/2011 07:50 PM, Vincent Diepeveen wrote:
>>> it's very useful Mark, as we know now he works for the company and also for which nation.
>>>
>>> Vincent
>> For someone who's always bashing on US foreign policy, you sure sound like a Republican or a member of the Department of Homeland Security!
> Where is my paycheck?

FYI Vincent, I am now back in Malta.

Regards
Jonathan Aquilina

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org Thu Jan 12 03:49:45 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Thu, 12 Jan 2012 09:49:45 +0100
Subject: [Beowulf] the Barcelona Supercomputing Center
Message-ID: <20120112084945.GD21917@leitl.org>

Just some cluster porn:

http://imgur.com/a/OoNVI

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From john.hearns at mclaren.com Thu Jan 12 05:16:28 2012
From: john.hearns at mclaren.com (Hearns, John)
Date: Thu, 12 Jan 2012 10:16:28 -0000
Subject: [Beowulf] A cluster of Arduinos
References: <201201121204.32332.samuel@unimelb.edu.au>
Message-ID: <207BB2F60743C34496BE41039233A8090A7D728A@MRL-PWEXCHMB02.mil.tagmclarengroup.com>

> Interesting... That seems to be a growing trend, then. So, now we just have to wait for them to actually exist. The $35 Model B board has Ethernet, and assuming one could netboot and operate "headless", then a stack o' Raspberry Pis and a cheap Ethernet switch might be an alternate approach.

Regarding Ethernet switches, I had cause recently to look for a USB-powered switch. Such things exist; they are promoted for gamers.

http://www.scan.co.uk/products/8-port-eten-pw-108-pocket-size-metal-casing-10-100-switch-usb-powered-lan-party!

You could imagine a cluster being powered by those USB adapters which fit into the cigarette lighter socket of a car. How about a cluster which fits in the glovebox or under the seat of a car?

The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From peter.st.john at gmail.com Thu Jan 12 08:49:16 2012
From: peter.st.john at gmail.com (Peter St. John)
Date: Thu, 12 Jan 2012 08:49:16 -0500
Subject: [Beowulf] the Barcelona Supercomputing Center
In-Reply-To: <20120112084945.GD21917@leitl.org>
References: <20120112084945.GD21917@leitl.org>
Message-ID: 

The architectural contrast (the building housing the racks is a chapel) is vivid. Sorta steampunkish. The place is described some at http://www.bsc.es/plantillaA.php?cat_id=1 (many of their pages seem to be in English).

Peter

On Thu, Jan 12, 2012 at 3:49 AM, Eugen Leitl wrote:
> Just some cluster porn:
>
> http://imgur.com/a/OoNVI

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From ellis at runnersroll.com Thu Jan 12 08:58:20 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 08:58:20 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl>
Message-ID: <4F0EE6FC.2050002@runnersroll.com>

On 01/11/2012 09:03 PM, Vincent Diepeveen wrote:
> The whole purpose of PCs is that they are generic to use. I remember how, in the past, the decision takers bought low-clocked junk for a big price - much against the wishes of the sysadmins, who wanted a PC for every student exclusively. Outdated slow junk is not interesting to students. Now you and I might like that CPU as it's under $1, but to them it's just 70 MHz, a factor of 500 slower than a single core of their home PC. What impresses is if you've got something that can beat their own machine at home.
>
> In the end, in science we basically learn a lot more easily if we can take a look into the future - so being faster than a single PC is a good example of that.

Take this advice in any other area - let's say chemical engineering or mechanical engineering - and the students are going to come out of the experience with chemical burns at the least, or having blown up half of the building at the most. In the best case all they do is screw up very, very expensive equipment. So I have to respectfully disagree that learning is only possible, and students only interested, when working on the stuff of the "future." I think this is likely the reason why many introductory engineering classes incorporate Lego Mindstorms robots rather than lunar rovers (or even overstock lunar rovers :D).

Case in point: I got interested in HPC/Beowulfery back in 2006, read RGB's book and a few other texts on it, and finally found a small group (4) of unused PIIIs to play on in the attic of one of my college's buildings. Did I learn how to set up a reasonable cluster? Yes. Was it slow as dirt compared to then-modern Intel and AMD processors? Of course. But did the experience get me so completely hooked on HPC/cluster research that I went on to pursue a PhD on the topic? Absolutely.

Granted, I'm just one data point, but I think Jim's idea has all the right components for a great educational experience.

Best,

ellis

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From prentice at ias.edu Thu Jan 12 09:28:56 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Thu, 12 Jan 2012 09:28:56 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: 
References: <201201121204.32332.samuel@unimelb.edu.au>
Message-ID: <4F0EEE28.6030404@ias.edu>

On 01/11/2012 08:22 PM, Lux, Jim (337C) wrote:
> Interesting... That seems to be a growing trend, then. So, now we just have to wait for them to actually exist. The $35 Model B board has Ethernet, and assuming one could netboot and operate "headless", then a stack o' Raspberry Pis and a cheap Ethernet switch might be an alternate approach.
>
> The "per node" cost is comparable to the Arduino, and it's true that Ethernet is probably more congenial in the long run.
You can get an Ethernet "shield" for the Arduino to add Ethernet capability, but at $35-50 each, your cost savings just went out the window, especially when compared to the Raspberry Pi. You can also buy the Arduino Ethernet, which is an Arduino board with Ethernet built in, but at a cost of ~$60 it is no better a value than buying an Arduino and the Ethernet shield separately.

> Drawing 700mA off the microUSB, though.. That's fairly hefty (although not a big deal in general.. you might need to have some better power supply scheme for a basket o' Pi cluster. (Arduino Uno runs around 40-50 mA.)

The Arduino can be powered by USB or a 9V power supply, so if you plan on using lots of them (as Jim is, theoretically), you don't have to worry about overloading the USB bus.

-- 
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Thu Jan 12 09:35:50 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 06:35:50 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <207BB2F60743C34496BE41039233A8090A7D728A@MRL-PWEXCHMB02.mil.tagmclarengroup.com>
Message-ID: 

On 1/12/12 2:16 AM, "Hearns, John" wrote:

> Regarding Ethernet switches, I had cause recently to look for a USB-powered switch. Such things exist; they are promoted for gamers.
> http://www.scan.co.uk/products/8-port-eten-pw-108-pocket-size-metal-casing-10-100-switch-usb-powered-lan-party!
>
> You could imagine a cluster being powered by those USB adapters which fit into the cigarette lighter socket of a car. How about a cluster which fits in the glovebox or under the seat of a car?

Powering off the cigarette lighter socket (or 12V power socket, as they're now labeled) is probably feasible, but those USB widgets can't source a lot of power. Certainly not amps.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
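(Rough numbers for the power question, using only figures quoted in this thread; the converter efficiency is an assumption, so treat this as back-of-envelope:)

    8 Raspberry Pi nodes:   8 x 0.7 A x 5 V   = 28 W at 5 V
    From a 12 V car socket through a ~85%-efficient
    buck converter:         28 W / 0.85 / 12 V ~ 2.7 A at 12 V
    8 Arduino Unos:         8 x 0.05 A x 5 V  = 2 W (~0.4 A at 5 V)

(So a lighter socket, typically fused at 10-15 A, could carry a small Pi stack comfortably, while a single USB port's nominal 500 mA at 5 V can't even feed one Pi; a basket of Unos, by contrast, fits within one powered hub.)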
From diep at xs4all.nl Thu Jan 12 09:39:23 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 15:39:23 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0EE6FC.2050002@runnersroll.com>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com>
Message-ID: <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl>

The average guy is not interested in knowing all the details of how to play tennis with a wooden racket from the 1980s, from around the time McEnroe was out there on the tennis court. Most people are more interested in whether you can win that grand slam with what you produce.

The nerds, however, are interested in how well you can do with a wooden racket from the 1980s. Therefore, projecting your own interest onto those students will just get them disinterested, and you will be judged by them as an irrelevant person in their life, whose name they soon forget.

Vincent

On Jan 12, 2012, at 2:58 PM, Ellis H. Wilson III wrote:

> [full quote of Ellis's message above snipped]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From prentice at ias.edu Thu Jan 12 09:38:13 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Thu, 12 Jan 2012 09:38:13 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl>
Message-ID: <4F0EF055.3050609@ias.edu>

On 01/11/2012 09:03 PM, Vincent Diepeveen wrote:
> The whole purpose of PCs is that they are generic to use.

That is also the purpose of the Arduino. That's why they open-sourced its hardware design.

> I remember how, in the past, the decision takers bought low-clocked junk for a big price - much against the wishes of the sysadmins, who wanted a PC for every student exclusively. Outdated slow junk is not interesting to students. Now you and I might like that CPU as it's under $1, but to them it's just 70 MHz, a factor of 500 slower than a single core of their home PC. What impresses is if you've got something that can beat their own machine at home.

Wrong. What impresses students is teaching them something they didn't already know, or showing them how to do something new. Using baking soda and vinegar to build a volcano is very low-tech, but it still impresses students of all ages (even in this modern Apple i-everything world), and it's done with ingredients just about everyone already has in their kitchen. Show them sodium acetate crystallizing out of a supersaturated solution, and their heads practically explode. Also very low-tech.

-- 
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From prentice at ias.edu Thu Jan 12 09:50:05 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Thu, 12 Jan 2012 09:50:05 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com> <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl>
Message-ID: <4F0EF31D.8010603@ias.edu>

On 01/12/2012 09:39 AM, Vincent Diepeveen wrote:
> The average guy is not interested in knowing all the details of how to play tennis with a wooden racket from the 1980s, from around the time McEnroe was out there on the tennis court. Most people are more interested in whether you can win that grand slam with what you produce.
> The nerds, however, are interested in how well you can do with a wooden racket from the 1980s. Therefore, projecting your own interest onto those students will just get them disinterested, and you will be judged by them as an irrelevant person in their life, whose name they soon forget.

Vincent, I think the only person projecting here is you. You refer to the 'average guy'. The word 'average' itself implies that statistics have been collected and analyzed. Can you please show us your statistics, and how you collected them, to determine what the average guy is interested in? And what about the average girl - what is she interested in? If you are merely citing the work of other researchers, please include citations.

-- 
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From ellis at runnersroll.com Thu Jan 12 09:53:57 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 09:53:57 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0EF31D.8010603@ias.edu>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com> <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl> <4F0EF31D.8010603@ias.edu>
Message-ID: <4F0EF405.5070600@runnersroll.com>

On 01/12/2012 09:50 AM, Prentice Bisbal wrote:
> [full quote of Prentice's message above snipped]

Guys, let's just let this one die in its traditional form of "Vincent disagrees with the list and there is nothing more that can be done." I recently read a blog that suggested (due to similar threads following these trajectories) that the Wulf list wasn't what it used to be.

Let's save the flames for editors,

ellis

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Thu Jan 12 10:03:49 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 16:03:49 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0EF31D.8010603@ias.edu>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com> <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl> <4F0EF31D.8010603@ias.edu>
Message-ID: 

Very simple. Wooden tennis rackets were dirt cheap in the 90s. No one bought them. Instead, they all bought a light-frame racket with a big blade for the tennis court; in fact, those were pretty expensive in some cases. Why did no one suddenly use those wooden rackets anymore?

How many people will watch the upcoming Australian grand slam? A lot. How many will watch one or two dudes toy with a few embedded processors using a language no one has heard of? Only a handful.

On Jan 12, 2012, at 3:50 PM, Prentice Bisbal wrote:

> [full quote of Prentice's message above snipped]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Thu Jan 12 10:10:40 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 07:10:40 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl>
Message-ID: 

On 1/12/12 6:39 AM, "Vincent Diepeveen" wrote:

> The average guy is not interested in knowing all the details of how to play tennis with a wooden racket from the 1980s, from around the time McEnroe was out there on the tennis court. Most people are more interested in whether you can win that grand slam with what you produce.
> The nerds, however, are interested in how well you can do with a wooden racket from the 1980s. Therefore, projecting your own interest onto those students will just get them disinterested, and you will be judged by them as an irrelevant person in their life, whose name they soon forget.

Having spent some time recently in Human Resources meetings about how to better recruit software people for JPL, I'd say that something that appeals to nerds and gives them something to do is not all bad. Part of the educational process is to find and separate the people who are interested and have a passion. I'm not sure that someone who starts getting into clusters mostly because they are interested in breaking into the Top500 is the target audience in any case.

If you look over the hobby clusters out there, the vast majority are "hey, I heard about this interesting idea, I scrounged up N old/small/slow/easy to find computers and tried to cluster them and do something. I learned something about cluster administration, and it was fun, but I don't use it anymore".

This is exactly the population you want to hit. Bring in 100 advanced high school (grade 11-12 in US) students. Have them all use cheap hardware to do a cluster. Some fraction will think, "this is kind of cool, maybe I should major in CS instead of X". Some fraction will think, "how lame, why not make the single processor faster", and they can be CompEng or EE majors looking at how to reduce feature sizes and get the heat out.

It's just like biology or chemistry classes. In high school biology (9th/10th grade) most of it is mundane memorization (the Krebs cycle, various descriptive stuff). Other than the use of cheap CMOS cameras, microscopes used at this level haven't really changed much in the last 100 years (and the microscopes at my kids' school are probably 10-20 years old). They also do some more modern molecular biology in a series of labs partly funded by Amgen: some recombinant DNA to put fluorescent proteins in bacteria, running some gels, etc. The vast majority of the students will NOT go on to a career in biology, but some fraction do; they get interested in some aspect, and they wind up majoring in bio, or being a pre-med, etc.

Not everyone is looking for the world beater. A lot of kids start with kart racing, even though even the fastest karts aren't as fast as F1 (or even a Smart Car). How many engineers started with dismantling the lawnmower engine?

For my own work, I'd rather have people who are interested in solving problems by ganging up multiple failure-prone processors, rather than centralizing it all in one monolithic box (even if the box happens to have multiple cores).

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
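(For concreteness, the flavor of the LED exercise Jim described earlier in the thread might start from a token-passing sketch like the one below. This is a minimal, hypothetical sketch assuming a chain of Arduinos wired as an I2C bus via the standard Wire library; the addresses are invented, and a real multi-board I2C setup needs care with pull-ups and bus contention, so treat it as an illustration rather than a tested lab handout:)

    #include <Wire.h>

    const int MY_ADDR   = 8;    // this board's I2C address (assumed; set per board)
    const int NEXT_ADDR = 9;    // downstream neighbour's address (assumed)
    const unsigned long HOLD_MS = 500;   // how long to keep the LED lit

    volatile bool haveToken = false;

    // Called by the Wire library when the upstream board sends us the token.
    void onReceive(int n) {
      while (Wire.available()) Wire.read();  // drain the payload; content unused
      haveToken = true;
    }

    void setup() {
      pinMode(LED_BUILTIN, OUTPUT);
      Wire.begin(MY_ADDR);            // join the bus, listening at our address
      Wire.onReceive(onReceive);
      // One designated board would inject the first token here.
    }

    void loop() {
      if (haveToken) {
        haveToken = false;
        digitalWrite(LED_BUILTIN, HIGH);     // show that the token has arrived
        delay(HOLD_MS);
        digitalWrite(LED_BUILTIN, LOW);
        Wire.beginTransmission(NEXT_ADDR);   // pass the token downstream
        Wire.write(1);
        Wire.endTransmission();
      }
    }

(The discovery variant - where NEXT_ADDR is not known in advance and has to be probed - is where this turns into the "hot plug network discovery" homework problem.)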
From diep at xs4all.nl Thu Jan 12 10:13:00 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 16:13:00 +0100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0EF405.5070600@runnersroll.com>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com> <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl> <4F0EF31D.8010603@ias.edu> <4F0EF405.5070600@runnersroll.com>
Message-ID: 

On Jan 12, 2012, at 3:53 PM, Ellis H. Wilson III wrote:
> On 01/12/2012 09:50 AM, Prentice Bisbal wrote:
>> [snipped]
>
> Guys, let's just let this one die in its traditional form of "Vincent disagrees with the list and there is nothing more that can be done."

Ah, no medicine seems to cure you. Let me recall the original posting of Jim:

"it seems you could put together a simple demonstration of parallel processing and various message passing things."

The insights presented here obviously render this platform as no good for that; it is not inspiring, and for sure the clever students will get totally disinterested, and a bunch of them, out of disinterest, will probably not even finish the course. Working with stuff that isn't even within a factor of 500 of the speed of a normal CPU doesn't motivate, doesn't inspire, and basically teaches a person very little.

Embedded CPUs are for professionals; leave it at that. They are too hard for you to program efficiently.

> I recently read a blog that suggested (due to similar threads following these trajectories) that the Wulf list wasn't what it used to be.
>
> Let's save the flames for editors,
>
> ellis

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Thu Jan 12 10:21:54 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Thu, 12 Jan 2012 16:21:54 +0100 Subject: [Beowulf] A cluster of Arduinos In-Reply-To: References: Message-ID: <4B53A30C-FF3E-4996-916F-2D8455C90C5D@xs4all.nl> On Jan 12, 2012, at 4:10 PM, Lux, Jim (337C) wrote: > > > On 1/12/12 6:39 AM, "Vincent Diepeveen" wrote: > >> The average guy is not interested in knowing all details regarding >> how to >> play tennis with a wooden racket from the 1980s, just around >> the time when McEnroe was on the tennisfield playing there. >> >> Most people are more interested in whether you can win that grandslam >> with what you produce. >> >> The nerds however are interested in how well you can do with a wooden >> racket >> from 1980s,therefore projecting your own interest upon those students >> will just >> get them desinterested and you will be judged by them as an >> irrelevant person >> in their life, whose name they soon forget. >> > > Having spent some time recently in Human Resources meetings about > how to > better recruit software people for JPL, I'd say that something that > appeals to nerds and gives them something to do is not all bad. > Part of > the educational process is to find and separate the people who are > interested and have a passion. I'm not sure that someone who starts > getting into clusters mostly because they are interested in > breaking into > the Top500 is the target audience in any case. > > If you look over the hobby clusters out there, the vast majority > are "hey, > I heard about this interesting idea, I scrounged up N old/small/ > slow/easy > to find computers and tried to cluster them and do something. I > learned > something about cluster administration, and it was fun, but I don't > use it > anymore" > > This is exactly the population you want to hit. Bring in 100 advanced > high school (grade 11-12 in US) students. Have them all use cheap > hardware to do a cluster. Some fraction will think, "this is kind of > cool, maybe I should major in CS instead of X" Some fraction will > think, Your example here will just take care a big number of students don't want to have to do anything with those studies, as there is a few lame nerds there who toy with equipment that's factor 50k slower (adding to the factor 500 the object oriented slowdown of factor 100) than what they have at home, and it can do nothing useful. But in this specific case you'll just scare away students and the real clever ones will get total desinterested as you are busy with lame duck speed type cpu's. If you'd build a small marsrover with it that would be something else of course. > "how lame, why not make the single processor faster", and they can be > CompEng or EE majors looking at how to reduce feature sizes and get > the > heat out. > > It's just like biology or chemistry classes. In high school biology > (9th/10th grade) most of it is mundane memorization (Krebs cycle, > various > descriptive stuff. Other than the use of cheap cmos cameras, > microscopes > used at this level haven't really changed much in the last 100 > years (and > the microscopes at my kids' school are probably 10-20 years old). They > also do some more modern molecular biology in a series of labs partly > funded by Amgen: Some recombinant DNA to put fluorescent proteins > in a > bacteria, running some gels, etc. 
> The vast majority of the students will NOT go on to a career in
> biology, but some fraction do, they get interested in some aspect,
> and they wind up majoring in bio, or being a pre-med, etc.
>
> Not everyone is looking for the world beater. A lot of kids start
> with Kart racing, even though even the fastest Karts aren't as fast
> as F1 (or even a Smart Car). How many engineers started with
> dismantling the lawnmower engine?
>
> For my own work, I'd rather have people who are interested in solving
> problems by ganging up multiple failure-prone processors, rather than
> centralizing it all in one monolithic box (even if the box happens to
> have multiple cores).

From james.p.lux at jpl.nasa.gov Thu Jan 12 10:35:41 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 07:35:41 -0800
Subject: [Beowulf] List traffic
In-Reply-To: <4F0EF405.5070600@runnersroll.com>
Message-ID:

On 1/12/12 6:53 AM, "Ellis H. Wilson III" wrote:

> I recently read a blog that suggested (due to similar threads
> following these trajectories) that the Wulf list wasn't what it used
> to be.

I think that's for a variety of reasons..

The cluster world has changed. Back 15-20 years ago, clusters were new, novel, and pretty much roll your own, so there was a lot of traffic on the list about how to do that. Remember all the mobo comparisons, and all the carefully teased out idiosyncrasies of various switches and network schemes.

Back then, the idea of using a cluster for "big computing" was kind of new, as well. People building clusters were doing it either because the architecture was interesting OR because they had a computing problem to solve, and a cluster was a cheap way to do it, especially with free labor.

I think clustering has evolved, and the concept of a cluster is totally mature. You can buy a cluster essentially off the shelf, from a whole variety of companies (some with people who were participating in this list back then and still today), and it's interesting how the basic Beowulf concept has evolved.

Back in the late 90s, it was still largely "commodity computers, commodity interconnects", where the focus was on using "business class" computers and networking hardware. Perhaps not consumer, as cheap as possible, but certainly not fancy, schmancy rack mounted 1U servers. The switches people were using were just ordinary network switches, the same as in the wiring closet down the hall.

Over time, though, there has developed a whole industry supplying components specifically aimed at clusters: high speed interconnects, computers, etc. Some of this just follows the IT industry in general; there weren't as many "server farms" back in 1995 as there are now.

Maybe it's because the field has matured?

So, we're back to talking about "roll-your-own" clusters of one sort or another. I think anyone serious about big cluster computing (>100 nodes) probably won't be hanging on this list looking for hints on how to route and label their network cables. There are too many other places to go get that information, or, better yet, places to hire someone who already knows.
I know that if I needed massive computational power at work, my first thought these days isn't "hey, let's build a cluster", it's "let's call up the HPC folks and get an account on one of the existing clusters".

But I still see the need to bring people into the cluster world in some way. I don't know where the cluster vendors find their people, or even what sorts of skill sets they're looking for. Are they beating the bushes at CMU, MIT, and other hotbeds of CS looking for prior cluster design experience? I suspect not, just like most of the people JPL hires don't have spacecraft experience in school, or anywhere. You look for bright people who might be interested in what you're doing, and they learn the details of cluster-wrangling on the job.

For myself, I like probing the edges of what you can do with a cluster. Big computational problems don't excite me. I like thinking about things like:

1) What can I use from the body of cluster knowledge to do something different? A distributed cluster is topologically similar to one all contained in a single rack, but it's different. How is it different (latency, error rate)? Can I use analysis (particularly from early cluster days) to do a better job?

2) I've always been a fan of *personal* computing (probably from many years of negotiating for a piece of some shared resource). It's tricky here, because as soon as you have a decent 8 or 16 node cluster that fits under a desk, and have figured out all the hideous complexity of how to port some single user application to run on it, someone comes out with a single processor box that's just as fast, and a lot easier to use. Back in the 80s, I designed, but did not build, an 80286 clone using discrete ECL logic, the idea being to make a 100MHz IBM PC-AT that would run standard spreadsheet software 20 times faster (a big deal when your huge spreadsheet takes hours to recalculate). However, Moore's law and Intel made that idea a losing proposition.

But still, the idea of personal control over my computing resources is appealing. Nobody watching to see "are you effectively using those cpu cycles". No arguing about annual re-adjustment of chargeback rates where you take the total system budget and divide it by CPU seconds. Ooops, not enough people used it, so your CPU costs just quadrupled.

3) I'm also interested in portable computing (yes, I have a NEC 8201 (a TRS-80 Model 100 clone) and a TI-59; I did sell the Compaq, but I had one of those too, etc.). This is another interesting problem space: no big computer room with infrastructure. Here, the fascinating trade is between local computer horsepower and cheap long distance datacomm. At some point, it's cheaper/easier to send your data via satellite link to a big computer elsewhere and get the results back. It's the classic 60s remote computing problem revisited once again.
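Jim's third point, the trade between local horsepower and cheap long-distance datacomm, is easy to put rough numbers on. A back-of-envelope sketch follows; every figure in it is an invented placeholder, not a measurement from anyone's system:

/* breakeven.c - when does shipping a job over a slow link to a big
 * remote machine beat running it locally? Toy model: remote time =
 * transfer time + remote compute time. All numbers are made up. */
#include <stdio.h>

int main(void)
{
    double work         = 1e12;     /* FLOPs in the job              */
    double local_flops  = 1e8;      /* small portable node           */
    double remote_flops = 1e11;     /* big machine at the far end    */
    double bytes_moved  = 2e8;      /* inputs out plus results back  */
    double link_Bps     = 1e6 / 8;  /* ~1 Mbit/s satellite link      */

    double t_local  = work / local_flops;
    double t_remote = bytes_moved / link_Bps + work / remote_flops;

    printf("local : %8.0f s\n", t_local);
    printf("remote: %8.0f s (of which %.0f s is datacomm)\n",
           t_remote, bytes_moved / link_Bps);
    printf("=> %s wins\n", t_local < t_remote ? "local" : "remote");
    return 0;
}

With these placeholder numbers the remote machine wins by roughly a factor of six despite the 1600 seconds spent on the link; shrink the job or fatten the data and the answer flips, which is the whole point of the trade.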
From diep at xs4all.nl Thu Jan 12 10:56:32 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 16:56:32 +0100
Subject: [Beowulf] Robots
In-Reply-To: <4F0EE6FC.2050002@runnersroll.com>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com>
Message-ID: <95F43560-B9AA-4B56-9B32-1A1979461B07@xs4all.nl>

On Jan 12, 2012, at 2:58 PM, Ellis H. Wilson III wrote:

> I think this is likely the reason why many introductory engineering
> classes incorporate use of Lego Mindstorm robots rather than lunar
> rovers (or even overstock lunar rovers :D).

I didn't comment on the other completely wrong examples, but I want to
highlight one. Your example of a Lego robot actually disproves your
statement.

Amongst the affordable non-self-built robots, the Lego robot actually
is a genius robot. It is, so to speak, the i7-3960x of robots, to
compare it with the fastest i7 that has been released to date. It is
affordable, it is completely programmable with a robot OS, and if you
want to build something better you need to be pretty much a genius.

A custom robot, unless you build a really simple, stupid thing that
can do next to nothing, will be really expensive compared to such a
Lego robot, which goes for oh, a couple of hundred dollars only. I see
it for around 280 dollars online, and adding components is just a few
dozen dollars per component.

The normal way to build 'something better', if better at all, requires
building most components yourself, for example from aluminium. Each
component then has a price of, say, roughly $5k and needs to be
specially engineered. You need many of those components. We assume then
it's not a commercial project, otherwise royalties will also be
involved for every component you build; of course that's a small part
of the above price.

Most custom robots, which are hardly bigger in size than the Lego
robot, are actually pretty expensive. If you want to purchase
components for a tad bigger robot, just something with 4 wheels which
can hold a couple of dozen kilos, such components already run $5k -
$10k. And those are mass-produced components. So building something
that actually is more functional, better, is not gonna be easy.

It's a genius robot, it really is. In itself it's not really a lot
more expensive to build a bigger robot, if you produce it in the
quantities at which Lego produces it. The reason the Lego robot is
very small really has to do with safety. Big robots are really
dangerous, you know. Cars already use dozens of cpu's; 10+ year old
cars easily have over 100 cpu's inside, just for safety, with the
intent that components of the car don't harm people. Robot software is
far too primitive there yet; it has nothing like those safety
provisions. In all that, the Lego robot is really a genius thing.

It's a very bad example of what you 'tried' to show with some fake
arguments.

> Case in point: I got interested in HPC/Beowulfery back in 2006, read
> RGB's book and a few other texts on it, and finally found a small
> group (4) of unused PIIIs to play on in the attic of one of my
> college's buildings. Did I learn how to setup a reasonable cluster?
> Yes. Was it slow as dirt compared to then-modern Intel and AMD
> processors? Of course. But did the experience get me so completely
> hooked on HPC/Cluster research that I went on to pursue a PhD on the
> topic? Absolutely.
> Granted, I'm just one data point, but I think Jim's idea has all the
> right components for a great educational experience.
>
> Best,
>
> ellis

From diep at xs4all.nl Thu Jan 12 11:45:29 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 17:45:29 +0100
Subject: [Beowulf] List traffic
In-Reply-To:
References:
Message-ID: <4AB87920-7A8F-41B4-8129-3C19218FD9AB@xs4all.nl>
Up to 2002 i used to visit someone, Jan Louwman, who had 36 computres at home, testing chessprograms at home. So that wasn't a cluster, just a bunch of machiens, in sets of 2 machines connected with a special cable we used to play back then machines against each other. Nearly all of those machines was 60-100 watt or so. He had divided his computers over 3 rooms or so, majority in 1 room though. There the 16 ampere @ 230 volt power plug already had problems supplying this amount of electricity. Around the power plug in the wall, the wall and plastic of the powerplug were completely black burned. As there was only a single P4 machine amongst the computers, only 1 box really consumed a lot of power. Try to run 36 computers at home nowadays. Most machines are well over 250 watt, and the fastest 2 machines i've got here eat 410 respectively 270 watt. That's excluding the videocard in the 410 watt machine, as it's out of it currently (AMD HD 6970), the box has been setup for gpgpu. 36 machines eat way way too much power. This is a very simple practical problem that one shouldn't overlook. It's not realistic that the average joe sets up at his popular gaming program a cluster of more than 2 machines or so. A 2 machine cluster will never beat a 2 socket machine, except when each node also has 2 sockets. So clustering simple home computers together isn't really useful except if you really cluster together half a dozen or more. Half a dozen machines, using the 250 watt measure and another 25 watt for each card and 200 watt for the switch, it's gonna eat 6 * 275 + 200 = 1850 watt. You really need diehards for that. They are there and more than you and i guess, but they need SOFTWARE that interests them that can use it in a very efficient manner, clearly proven to them to be working great and easy to install, which refers to point 11. 101) most people like to buy new stuff. new cluster hardware is very expensive for more than 2 computers as it needs a switch. Second hand it's a lot cheaper, sometimes even dirt cheap, yet that's already not what most people like to do 110) Linux had a few setbacks and got less attractive. Say when we had redhat end 90s with x-windows it was slowly improving a lot. Then x64 was there together with a big dang and we went back years and years to x.org. X.org threw back linux 10 years in time. It eats massive RAM, it's ugly bad, it's slow, it's difficult to configure etc. Basically there isn't many good distributions now that are for free. As most clusters work only very well under linux, the difficulty of using linux should really be factored in. Have a problem under linux? Then forget it as a normal user. Now for me linux got MORE attractive as i get hacked total silly by every consultant who on this planet knows how to hack on the internet, yet that's not representative for those with cash who can afford a cluster. Note i don't fall into the cash group. My total income in 2011 was real little. 111) Usually the big cash to afford a cluster is for people with a good job or a tad older, that's usually a different group than the group that can work with linux. See the previous points for that Despite all that i believe clusters will get more popular in future, for a simple reason: processors don't really clock higher. So all software that can use additional calculation power already is getting parallellized or already has been paralelllized. It's a matter of time before some of those applications also will work well at cluster hardware. 
Yet this is a slow process, and it really requires software that works really efficiently at small numbers of nodes.

As an example of why I feel this will happen, I give you the popularity amongst gamers of running 2 graphics cards connected to each other via a bridge within 1 machine. The important factor there is that the games really profit from doing that.

On Jan 12, 2012, at 4:35 PM, Lux, Jim (337C) wrote:

> [snip -- Jim's "List traffic" message, quoted here in full above]

From deadline at eadline.org Thu Jan 12 11:49:25 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Thu, 12 Jan 2012 11:49:25 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References:
Message-ID:

snip

> For my own work, I'd rather have people who are interested in solving
> problems by ganging up multiple failure-prone processors, rather than
> centralizing it all in one monolithic box (even if the box happens to
> have multiple cores).

This is going to be an exascale issue, i.e. how to compute on a system whose parts might be in a constant state of breaking. Another interesting question is: how do you know you are getting the right answer on a *really* large system?
Of course I spend much of my time optimizing really small systems.

--
Doug

From diep at xs4all.nl Thu Jan 12 11:58:32 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 12 Jan 2012 17:58:32 +0100
Subject: [Beowulf] Adding 1 point
In-Reply-To: <4AB87920-7A8F-41B4-8129-3C19218FD9AB@xs4all.nl>
References: <4AB87920-7A8F-41B4-8129-3C19218FD9AB@xs4all.nl>
Message-ID:

There is another reason I should add for what really made small clusters at home less attractive: the rise of cheap multi-socket machines.

A 2 socket machine is not so expensive anymore nowadays. So if you want faster than 1 socket, you buy a 2 socket machine. If you want faster than that, 4 sockets is there. That choice wasn't easily available before the end of the 90s, and in the 21st century it has become cheap.

Another delaying factor is the rise of so many cores per node. AMD and Intel sell cpu's for their 4 socket lines with up to double the number of cores of what you can have in a single socket box. So a 4 socket machine is nearly the equivalent of 8 single socket nodes, be it low clocked.

For that reason clusters tend to get effective only at a dozen nodes or more, assuming cheap single socket nodes.

From ellis at runnersroll.com Thu Jan 12 12:26:01 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 12:26:01 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4B53A30C-FF3E-4996-916F-2D8455C90C5D@xs4all.nl>
References: <4B53A30C-FF3E-4996-916F-2D8455C90C5D@xs4all.nl>
Message-ID: <4F0F17A9.7010400@runnersroll.com>

On 01/12/2012 10:21 AM, Vincent Diepeveen wrote:
> On Jan 12, 2012, at 4:10 PM, Lux, Jim (337C) wrote:
>> This is exactly the population you want to hit. Bring in 100
>> advanced high school (grade 11-12 in US) students. Have them all use
>> cheap hardware to do a cluster. Some fraction will think, "this is
>> kind of cool, maybe I should major in CS instead of X" Some fraction
>> will think,
>
> Your example here will just ensure a big number of students won't
> want to have anything to do with those studies, as there are a few
> lame nerds there toying with equipment that's a factor 50k slower
> (adding to the factor 500 the object-oriented slowdown of factor 100)
> than what they have at home, and it can do nothing useful.
>
> But in this specific case you'll just scare away students, and the
> really clever ones will get totally disinterested as you are busy
> with lame-duck-speed type cpu's.

You have made it abundantly clear you aren't interested in enrolling in such a course. Thanks for your comments.

On a related note, as I was thinking about 'lame duck' education, I remembered that I took an undergraduate machine learning course in which we designed players for connect-four, which would compete using recently learned techniques against other students in the class.
Despite that particular game being a solved one, we all had a blast and got quite competitive trying to beat each other out using the recently acquired skills. I would encourage Jim to do something similar once the basics of cluster administration are done; perhaps a mini SC Cluster Competition would be a neat application for the Arduinos?

Best,

ellis

From ellis at runnersroll.com Thu Jan 12 12:35:11 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 12:35:11 -0500
Subject: [Beowulf] Robots
In-Reply-To: <95F43560-B9AA-4B56-9B32-1A1979461B07@xs4all.nl>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com> <95F43560-B9AA-4B56-9B32-1A1979461B07@xs4all.nl>
Message-ID: <4F0F19CF.2050603@runnersroll.com>

On 01/12/2012 10:56 AM, Vincent Diepeveen wrote:
> On Jan 12, 2012, at 2:58 PM, Ellis H. Wilson III wrote:
>> I think this is likely the reason why many introductory engineering
>> classes incorporate use of Lego Mindstorm robots rather than lunar
>> rovers (or even overstock lunar rovers :D).
>
> I didn't comment on the other completely wrong examples, but I want
> to highlight one. Your example of a Lego robot actually disproves
> your statement.

It was a price comparison, and without diving into the nitty-gritty of how good or bad both the Arduino and the Mindstorms are in their respective areas, it was spot on. Jim wants to give each student a 10 node cluster on the cheap (i.e. 20 to 30 bucks per node = 300 bucks); universities want to give each student (or teams of students sometimes) a robot (~280). Both provide an approachable level of difficulty and potential for education at a reasonable price.

Feel free to continue to disagree for the sake of such. It was just an example.

Best,

ellis

From james.p.lux at jpl.nasa.gov Thu Jan 12 12:54:52 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 09:54:52 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References:
Message-ID:

-----Original Message-----
From: Douglas Eadline [mailto:deadline at eadline.org]
Sent: Thursday, January 12, 2012 8:49 AM
To: Lux, Jim (337C)
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] A cluster of Arduinos

snip

> For my own work, I'd rather have people who are interested in solving
> problems by ganging up multiple failure prone processors, rather than
> centralizing it all in one monolithic box (even if the box happens to
> have multiple cores).

This is going to be an exascale issue, i.e. how to compute on a system whose parts might be in a constant state of breaking. Another interesting question is how do you know you are getting the right answer on a *really* large system?
Of course I spend much of my time optimizing really small systems.

--

Your point about scaling is well taken. So far, the computing world has largely dealt with things by trying to make the processor perfect and error free. Some limited areas of error correction are popular (RAM).

But think in a bigger area... say your arithmetic unit has some infrequent unknown errors (e.g. the FDIV bug on the Pentium). Could clever algorithm design and multiple processors (or multiple cores) mitigate this? (E.g. instead of just computing Z = X/Y you also compute Z1 = (X*2)/(Y*2) and compare answers. That exact example's not great because you've added 2 operations, but I can see that there are other clever techniques that might be possible.)

What is nice is if you can do things like temporal redundancy (do the calculation twice, and if it's different, do it a third time), or even better, some sort of "check calculation" that takes small time compared to the mainline calculation.

This, I think, is somewhere that even the big iron/cluster folks could be doing some research. What are the optimum communication fabrics to support this kind of "side calculation", which may have different communication patterns and data flow than the "mainline"?

It has a parallel in things like CRC checks in communications protocols. A lot of hardware has a dedicated little CRC checker that is continuously calculating the CRC as the bits arrive, so that when you get to the end of the frame, the answer is already there.

And Doug, your small systems have a lot of the same issues, perhaps because that small Limulus might be operated in environments other than what the underlying hardware was designed for. I know people who have been rudely surprised when they found that the design environment for a laptop is a pretty narrow temperature range (e.g. office desktop), and when they put them in a car, subject to 0C or 40C temperatures, if not wider, things don't work quite as well as expected.

Very small systems (few nodes) have the same issues in some environments (e.g. a cluster subject to single event upsets or functional interrupts in a high radiation environment with a lot of high energy charged particles; it's not so much a total dose thing, but an SEE thing). For Juno (which will be in a polar orbit around Jupiter), we shielded everything in a vault (a 1 meter cube with 1cm thick titanium walls) and still it's an issue. We don't get very long before everything is cooked.

And I think that with a non-trivially small cluster (e.g. more than 4 nodes, I think) you could do a lot of experimentation on techniques. (Oddly, simulated fault injection is one of the trickier parts.)
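The two tricks Jim describes, the scaled-operand division check and the run-twice-vote-on-mismatch form of temporal redundancy, fit in a few lines of C. A toy rendering follows, not anything from the original post; note the (2X)/(2Y) check is exact in IEEE 754 only as long as doubling neither overflows nor lands in the subnormals:

/* redundant_div.c - toy fault-detecting division.
 * checked_div: recompute X/Y as (2X)/(2Y); with correct rounding both
 *   quotients are bit-identical, so a mismatch flags broken hardware.
 * vote_div: temporal redundancy; divide twice, and only on a mismatch
 *   pay for a third divide and take the 2-of-3 majority. */
#include <stdio.h>

static double checked_div(double x, double y, int *suspect)
{
    double z  = x / y;
    double z1 = (2.0 * x) / (2.0 * y);  /* same value, different bits
                                           pushed through the divider */
    *suspect = (z != z1);
    return z;
}

/* volatile forces each divide to really happen instead of being
 * folded into one by the compiler */
static double vote_div(volatile double x, volatile double y)
{
    double a = x / y;
    double b = x / y;
    if (a == b)
        return a;            /* fast path: no fault detected  */
    double c = x / y;        /* tie-breaker, only on mismatch */
    return (c == a) ? a : b; /* 2-of-3 majority               */
}

int main(void)
{
    int suspect;
    double z = checked_div(355.0, 113.0, &suspect);
    printf("355/113 = %.15f (%s)\n", z, suspect ? "SUSPECT" : "ok");
    printf("voted   = %.15f\n", vote_div(355.0, 113.0));
    return 0;
}

As the post argues, the interesting research question isn't the check itself but the fabric: the check traffic has a different shape than the mainline computation.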
From ellis at runnersroll.com Thu Jan 12 12:55:41 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 12:55:41 -0500
Subject: [Beowulf] List traffic
In-Reply-To: <4AB87920-7A8F-41B4-8129-3C19218FD9AB@xs4all.nl>
References: <4AB87920-7A8F-41B4-8129-3C19218FD9AB@xs4all.nl>
Message-ID: <4F0F1E9D.9000800@runnersroll.com>

I really should be following Joe's advice circa 2008 and just not responding, but I can't help myself.

On 01/12/2012 11:45 AM, Vincent Diepeveen wrote:
> The biggest problem for this list:
> 1) The lack of postings by RGB the past few months, especially the
> ones where he explains how easy it is to build a nuke, given the
> right ingredients, which gives interesting discussions.

The last post from RGB was a long, long discussion about how very wrong you were about RNGs. You just don't get it. It's okay to be wrong once in a while, Vincent, and even more so to just agree to disagree. Foolish, unedited and inflammatory diatribes with an unnatural dose of newlines are what is killing this list and what that blog I referenced was specifically disappointed with.

So please, I'm begging you. Stop writing huge emails that trail off from their original point. Try to say things in a non-inflammatory manner. Use spell-check, and try to read your emails once before sending them. And last of all, remember that there are many people on this list that have all sorts of different applications -- not just chess. Your experience does not generalize well to all areas.

Speaking of which, for anyone who is interested in doing serious work with low-power processors, please see a paper named FAWN for an excellent example of use cases where low-hertz, low-power processors can do some great work. It's by Dave Andersen of CMU. I was lucky enough to be invited to the CMU PDL retreat a few months back and had a nice conversation about the project when we went for a run together. There are some use cases that benefit massively from that kind of architecture.

Best,

ellis

From james.p.lux at jpl.nasa.gov Thu Jan 12 13:10:24 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 10:10:24 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <4F0F17A9.7010400@runnersroll.com>
References: <4B53A30C-FF3E-4996-916F-2D8455C90C5D@xs4all.nl> <4F0F17A9.7010400@runnersroll.com>
Message-ID:

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Ellis H. Wilson III
Sent: Thursday, January 12, 2012 9:26 AM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] A cluster of Arduinos

snip
On a related note, as I was thinking about 'lame duck' education, I remembered that I took an undergraduate machine learning course in which we designed players for connect-four, which would compete using recently learned techniques against other students in the class. Despite that particular game being a solved one, we all had a blast and got quite competitive trying to beat each other out using the recently acquired skills. I would encourage Jim to do something similar once the basics of cluster administration are done; perhaps a mini SC Cluster Competition would be a neat application for the Arduinos?

----------------------------------------

Ooohh.. that sounds *very* cool. A bunch of slow processors. A simple problem to solve (e.g. 3D tic-tac-toe) for which there might even be published parallel approaches. The challenge is effectively using the limited system, warts and all.

The Raspberry Pi might be a better vehicle, if it hits the price/availability targets: comparable to Arduinos in price, but a bit more sophisticated and less contrived.

We've been talking about what kind of software competitions JPL could run as a recruiting tool at universities, and that's along those lines. Hmm... I wonder if they'd be willing to spend recruiting funds on that? (Probably not; we're all poor this fiscal year.)

And, on the undergrad education thing... At UCLA, I had to write stuff in MIXAL to run on a simulated MIX machine and complained mightily to the TAs, who just pointed to the sacred texts of Knuth, rather than giving an intelligent response as to why we didn't do something like work in PDP-11 ASM or System/360 BAL. (UCLA at the time had a monster 360, but I don't know that they had many 11s, and realistically, BAL is not something I'd inflict on 2nd quarter first year students. We were a PL/I or PL/C shop in the first couple years' classes for the most part, although there were people doing Algol.)

OTOH, I suspect I was an atypical incoming student for 1977. I had, the previous year, done the Pascal courses at UCSD with p-machines running on LSI-11s, as well as the Pascal system on the big Burroughs B6700, which uses a form of ALGOL as the machine language and is a stack machine to boot (how cool is that? Burroughs always did have cool machines.. hey, they built ILLIAC IV). I had also done some ASM stuff on an 11/20 under RT-11.

I guess that's characteristic of the differences in philosophy between different CS departments (UCSD was heading more in the direction of Software Engineering being part of the School of Engineering and Applied Sciences, while at UCLA it was part of the Math department). Little did I know, as a cybernetics major, what the difference was: it sure as heck isn't manifested in the course catalog, at least in a form that an incoming student could discern. (Going back now, I could probably look at catalogs from the various universities of the era and divine their philosophies, but that's clearly 20/20 hindsight.)
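For what it's worth, the contest idea maps naturally onto the simplest parallel search scheme there is: deal the root moves of the game tree out across the nodes, search each subtree locally, and reduce to the best answer. A sketch in MPI C; the evaluate() stub and the 4x4x4 move count are hypothetical stand-ins for a real 3D tic-tac-toe search, not anyone's actual contest code:

/* root_split.c - split the root moves of a game tree across ranks. */
#include <mpi.h>
#include <stdio.h>

#define NMOVES 64   /* a 4x4x4 board has 64 opening moves */

/* placeholder score; a real player would run minimax/negamax here */
static int evaluate(int move) { return (move * 37) % 101; }

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* round-robin deal: each rank takes every size-th root move */
    struct { int score; int move; } local = { -1, -1 }, best;
    for (int m = rank; m < NMOVES; m += size) {
        int s = evaluate(m);
        if (s > local.score) { local.score = s; local.move = m; }
    }

    /* MPI_MAXLOC compares the first int and carries the second along,
     * so rank 0 learns both the best score and the winning move */
    MPI_Reduce(&local, &best, 1, MPI_2INT, MPI_MAXLOC, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("best opening: move %d (score %d)\n", best.move, best.score);

    MPI_Finalize();
    return 0;
}

On hardware as slow as the Arduinos under discussion, the instructive part is that this embarrassingly parallel split scales almost perfectly, while anything needing a shared transposition table immediately does not.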
From james.p.lux at jpl.nasa.gov Thu Jan 12 13:22:26 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 10:22:26 -0800
Subject: [Beowulf] FAWN
Message-ID:

Fast Array of Wimpy Nodes..

http://www.cs.cmu.edu/~fawnproj/

Very cool stuff... Their original motivation (reduction of power) is at a much larger scale than my work usually operates at (they're talking megawatts in googleish clusters; I worry about watts derived from solar panels and such).

But it's a whole 'nother twist on the idea of clustering low performance nodes (by some metric; they've got good nanojoule/operation metrics). And they're doing a very clever thing where they work with the very asymmetric read/write speeds on flash memory. (And FLASH memory is something I spend a lot of time thinking about these days; it's what we use in space for NVRAM.)

Looks like I've got some reading for the holiday weekend.

From ellis at runnersroll.com Thu Jan 12 13:26:26 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 13:26:26 -0500
Subject: [Beowulf] FAWN
In-Reply-To:
References:
Message-ID: <4F0F25D2.90305@runnersroll.com>

On 01/12/2012 01:22 PM, Lux, Jim (337C) wrote:
> But it's a whole 'nother twist on the idea of clustering low
> performance nodes (by some metric; they've got good
> nanojoule/operation metrics).

Not just good, from a sorting perspective, /best/:

http://sortbenchmark.org/

From landman at scalableinformatics.com Thu Jan 12 13:47:21 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Thu, 12 Jan 2012 13:47:21 -0500
Subject: [Beowulf] List traffic
In-Reply-To: <4F0F1E9D.9000800@runnersroll.com>
References: <4AB87920-7A8F-41B4-8129-3C19218FD9AB@xs4all.nl> <4F0F1E9D.9000800@runnersroll.com>
Message-ID: <4F0F2AB9.5060105@scalableinformatics.com>

On 01/12/2012 12:55 PM, Ellis H. Wilson III wrote:
> I really should be following Joe's advice circa 2008 and just not
> responding, but I can't help myself.

huh ...?

> On 01/12/2012 11:45 AM, Vincent Diepeveen wrote:
>> The biggest problem for this list:
>> 1) The lack of postings by RGB the past few months, especially the
>> ones where he explains how easy it is to build a nuke, given the
>> right ingredients, which gives interesting discussions.
>
> The last post from RGB was a long, long discussion about how very
> wrong you were about RNGs. You just don't get it. It's okay to be
> wrong once in a while, Vincent, and even more so to just agree to
> disagree. Foolish, unedited and inflammatory diatribes with an
> unnatural dose of newlines are what is killing this list and what
> that blog I referenced was specifically disappointed with.
>
> So please, I'm begging you. Stop writing huge emails that trail off
> from their original point. Try to say things in a non-inflammatory

... oh ... never mind :)

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

From james.p.lux at jpl.nasa.gov Thu Jan 12 14:08:38 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Thu, 12 Jan 2012 11:08:38 -0800
Subject: [Beowulf] FAWN
In-Reply-To: <4F0F25D2.90305@runnersroll.com>
References: <4F0F25D2.90305@runnersroll.com>
Message-ID:

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Ellis H. Wilson III
Sent: Thursday, January 12, 2012 10:26 AM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] FAWN

On 01/12/2012 01:22 PM, Lux, Jim (337C) wrote:
> But it's a whole 'nother twist on the idea of clustering low
> performance nodes (by some metric; they've got good
> nanojoule/operation metrics).

Not just good, from a sorting perspective, /best/:

http://sortbenchmark.org/

-------------

I was thinking that their low powered nodes are poor from an absolute performance standpoint (i.e. MIPS), but actually quite good on a computation work per joule basis.

Yes, for sorting, they are kicking rear.

This is interesting, but when you start talking power consumption, one needs to be careful about where you draw the boundaries and what's "in the system". Do you count conversion efficiency in the power supply? At one level, you say no, just worry about DC power consumption, but even there: is it at the board edge, or at the chip? Something drawing 100 amps at 0.5V is a very different beast than something drawing 10 amps at 5V, and you can't locally optimize too far, because your choices inside box A start to affect the design and performance of box B and box C.

The contest rules point to a variety of power measurement systems, but based on what I see there, I think there's some scope for "gaming" the system. It sort of seems it's "wall plug power", but then, they do allow DC power systems.

For instance, one could tune the power supply for the expected load conditions. You could run those fans at warp speed before the test run starts, to cool down as much as possible, and then slow them down (saving power) during the run, maybe even letting the processor get pretty hot.

Sort of like running a top fuel dragster. It only has to go fast for 3 or 4 seconds, so why bother putting in a water pump.

From ellis at runnersroll.com Thu Jan 12 14:40:15 2012
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 12 Jan 2012 14:40:15 -0500
Subject: [Beowulf] FAWN
In-Reply-To:
References: <4F0F25D2.90305@runnersroll.com>
Message-ID: <4F0F371F.2060704@runnersroll.com>

On 01/12/2012 02:08 PM, Lux, Jim (337C) wrote:
> [snip -- Jim's message, quoted in full above]

All fair points, and I can't contest the suggestion that they likely tune their algorithm and physical units very highly to perform well for this sorting environment. Dave actually keeps a pretty balanced perspective when discussing this, as shown in his reaction to Google talking down wimpy nodes. Wired has a nice article on it, with a link inside to Google's pub that discusses the other half of the coin:

http://www.wired.com/wiredenterprise/2012/01/wimpy_nodes/

Some more reading material for the weekend ;).

ellis

From landman at scalableinformatics.com Thu Jan 12 15:45:16 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Thu, 12 Jan 2012 15:45:16 -0500
Subject: [Beowulf] Partial OT: CPU grouping control for Windows 2008 R2 x64 server for big calcs
Message-ID: <4F0F465C.4010301@scalableinformatics.com>

Ok, this one is fun. For some definitions of fun. Unusual definitions of fun... And there is a question towards the end.

This is for folks who've been administering clusters and HPC systems with big Windows machines (32+ CPUs and large RAM).

Imagine you have a machine as part of a very loose computing cluster. The end user wants to run Windows (2008R2 x64 Enterprise) on it. This machine has 32 processor cores (real ones, no hyperthreading) and 1TB of RAM. Yeah, it's a fun machine to work on.
I won't discuss the OS choice here. You can see some of my playing with it here: http://scalability.org/?p=3541 and http://scalability.org/?p=3515

Windows machines can let up to 64 logical processors be part of a "group". A group is a scheduling artifice, and not necessarily directly related to the NUMA system... think of it as a layer of abstraction above this.

Ok, still with me?

This scheduling artifice, these groups, requires at minimum a recompilation to work properly with. It's actually more than that: they do require some additional processor affinity bits be handled. If you have code which doesn't handle this correctly, it will probably crash. Or not work well. Or both.

Matlab appears to be such a beast. This isn't necessarily a Matlab issue per se; it appears to be something of a design compromise in Windows. Windows wasn't designed with large processor counts in mind. The changes they'd need to make in order to enable a single large spanning entity across all CPUs at once are quite likely not in the company's best interests, as there are very few customers with such machines.

Still with me? Here's the problem. Matlab seems to crash (according to the user) if run on a unit with more than one group. I've not been able to verify it on the machine yet myself, but I have no reason to disbelieve this. The issue as it's been stated to me is that if there is more than one group of processors, Matlab crashes. This is the symptom. When the unit boots by default, we have two 16-processor groups.

So, looking at bcdedit examples, I see how to turn off groups. One minor problem: it doesn't work.

I can do a

  bcdedit /set groupaware off

and reboot, which should completely disable groups, so that all 32 processors are in one group. Still 2 groups. I can do a

  bcdedit /set groupsize 64

and reboot. Still 2 groups.

So far, the only thing that seems to change this is if I install the Hyper-V role. With that, there is now 1 group.

Looking at all the boot options with bcdedit /enum, there's only one config for boot, and it's the default.

So... my questions:

1) Does Windows really ignore its approximate equivalent of the boot options on a grub line?

2) Is there any way to compel Windows to do the right thing?

As noted, this is for a computing cluster. Our recommended OS isn't feasible right now for them and their application. Definitely annoying. I'd love there to be a BIOS setting to help Windows past its desire to ignore my requested number of groups. Not sure if adding in Hyper-V will impact performance (did some base testing with Scilab to see, and I didn't see anything I'd call significant). Will be bugging Microsoft about this as well (pretty obviously a bug in 2008R2 x64).

And related to this, I read something about limits in the different Windows editions. Is anyone using Windows HPC cluster on big memory machines with lots of cores? Looking at the Microsoft docs, they indicate some relatively low limits on RAM and processor count. So does this mean that they won't be supporting Interlagos 4 socket machines (16 cores per socket and 1/2 TB of RAM) in compute nodes for Windows HPC? I am just imagining someone buying a few of those nodes and being required to buy Enterprise or Datacenter licenses for machines which clearly would not be used for anything more than HPC.
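For readers wondering what those "additional processor affinity bits" look like in source: below is a minimal sketch against the Win32 group APIs (GetActiveProcessorGroupCount, GetActiveProcessorCount, SetThreadGroupAffinity). It is an illustration of the mechanism described above, not code from this posting, and error handling is omitted:

/* groups.c - enumerate processor groups and move the current thread
 * into the second group. A group-unaware thread stays in the group it
 * started in, which is why machines with multiple groups need
 * explicit calls like these. Needs Windows 7 / Server 2008 R2 or
 * later. */
#define _WIN32_WINNT 0x0601
#include <windows.h>
#include <stdio.h>

int main(void)
{
    WORD g, ngroups = GetActiveProcessorGroupCount();
    printf("%u processor group(s)\n", ngroups);

    for (g = 0; g < ngroups; g++)
        printf("  group %u: %lu logical processors\n",
               g, GetActiveProcessorCount(g));

    if (ngroups > 1) {
        GROUP_AFFINITY ga = {0}, old;
        ga.Group = 1;                    /* second group...          */
        ga.Mask  = (KAFFINITY)0xFFFF;    /* ...all 16 LPs inside it  */
        if (SetThreadGroupAffinity(GetCurrentThread(), &ga, &old))
            printf("current thread moved to group 1\n");
    }
    return 0;
}

An application that never makes such calls only ever sees one group's worth of processors, and one that mishandles the group-aware structures can misbehave exactly as described above.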
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

From samuel at unimelb.edu.au Fri Jan 13 00:36:50 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Fri, 13 Jan 2012 16:36:50 +1100
Subject: [Beowulf] FAWN
In-Reply-To: <4F0F25D2.90305@runnersroll.com>
References: <4F0F25D2.90305@runnersroll.com>
Message-ID: <4F0FC2F2.5090606@unimelb.edu.au>

On 13/01/12 05:26, Ellis H. Wilson III wrote:

> Not just good, from a sorting perspective, /best/:
> http://sortbenchmark.org/

But that algorithm isn't running on exactly wimpy hardware:

Intel Core i5-2400S 2.5 GHz, 16GB RAM and a bunch of SSDs

cheers!
Chris

--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

From diep at xs4all.nl Fri Jan 13 09:01:59 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 13 Jan 2012 15:01:59 +0100
Subject: [Beowulf] Robots
In-Reply-To: <4F0F19CF.2050603@runnersroll.com>
References: <4F0DBFD3.3070503@ias.edu> <2734FA76-6286-486F-B762-A48E2EAEF612@xs4all.nl> <0249BC0E-8C7A-470E-933B-A0CBCC272888@xs4all.nl> <7B548142-9871-4DCF-8EF1-90DA5A4BDEF6@xs4all.nl> <4F0EE6FC.2050002@runnersroll.com> <95F43560-B9AA-4B56-9B32-1A1979461B07@xs4all.nl> <4F0F19CF.2050603@runnersroll.com>
Message-ID: <01D34971-9054-4F19-9776-8F107B118A1D@xs4all.nl>

On Jan 12, 2012, at 6:35 PM, Ellis H. Wilson III wrote:

> On 01/12/2012 10:56 AM, Vincent Diepeveen wrote:
>> On Jan 12, 2012, at 2:58 PM, Ellis H. Wilson III wrote:
>>> I think this is likely the reason why many introductory engineering
>>> classes incorporate use of Lego Mindstorm robots rather than lunar
>>> rovers (or even overstock lunar rovers :D).
>>
>> I didn't comment on the other completely wrong examples, but I want
>> to highlight one. Your example of a Lego robot actually disproves
>> your statement.
>
> It was a price comparison, and without diving into the nitty-gritty
> of how good or bad both the Arduino and the Mindstorms are in their
> respective areas, it was spot on. Jim wants to give each student a
> 10 node cluster on the cheap (i.e. 20 to 30 bucks per node = 300
> bucks); universities want to give each student (or teams of students
> sometimes) a robot (~280). Both provide an approachable level of
> difficulty and potential for education at a reasonable price.
> Feel free to continue to disagree for the sake of such. It was
> just an example.
>
> Best,
>
> ellis

It's not even spot on; your comparison is light-years off. You're comparing one of the best mass-produced robots available against a niche board for which there are a hundred alternatives that work far better: alternatives that are 500x faster, cheaper if you want them to be, and, above all, better at the original goal of demonstrating SMP programming, because the freak hardware, thanks to its low-clocked CPU, has negligible latency to the other CPUs.

A robot shows you how to work with robots. The educational purpose Jim wrote down you won't get very well with the embedded CPUs, as that equipment has none of the typical problems you encounter in a normal SMP system, let alone in a cluster environment, while it has entirely different problems you will never encounter on mainstream CPUs, such as severely limited caches and executing just one instruction at a time. Embedded programming is totally different from CPU programming, and embedded latencies, thanks to the slow processor speed, are not even comparable with SMP programming between the cores of one CPU.

Such a multicore box definitely has a cost below $300: on eBay I see nodes with 8 cores for $200, and those are 500x faster. Myself, I'm looking at some Socket 771 Xeon machines, say with an L5420. Though they eat a lot more power than Intel claims, it's still, I guess, about 170 watts a machine under full load.

Note we have still skipped the algorithmic discussion. From an algorithmic viewpoint, if I look at artificial intelligence, getting something to work on 70 MHz machines is going to behave totally differently and needs a totally different approach than today's hardware. It's not even in the same ballpark.

Vincent

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From ntmoore at gmail.com Fri Jan 13 09:33:33 2012
From: ntmoore at gmail.com (Nathan Moore)
Date: Fri, 13 Jan 2012 08:33:33 -0600
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl>
References: <41677598-47C5-4592-BDD6-314CB0EC860E@xs4all.nl>
Message-ID:

Jim,

Have you ever interacted with the "Modeling Instruction" folks over at ASU? http://modeling.asu.edu/ They've done, for HS Physics, more or less what you're talking about in terms of making the subject engaging, compelling, and driven by student, not teacher, interest.

On Thu, Jan 12, 2012 at 9:10 AM, Lux, Jim (337C) wrote:
>
> On 1/12/12 6:39 AM, "Vincent Diepeveen" wrote:
>
>> The average guy is not interested in knowing all the details of how to
>> play tennis with a wooden racket from the 1980s, from around the time
>> when McEnroe was out on the court.
>>
>> Most people are more interested in whether you can win that grand slam
>> with what you produce.
>>
>> The nerds, however, are interested in how well you can do with a wooden
>> racket from the 1980s; projecting your own interest onto those students
>> will just get them disinterested, and you will be judged by them as an
>> irrelevant person in their life, whose name they soon forget.
>>
> Having spent some time recently in Human Resources meetings about how to
> better recruit software people for JPL, I'd say that something that
> appeals to nerds and gives them something to do is not all bad. Part of
> the educational process is to find and separate the people who are
> interested and have a passion. I'm not sure that someone who starts
> getting into clusters mostly because they are interested in breaking into
> the Top500 is the target audience in any case.
>
> If you look over the hobby clusters out there, the vast majority are "hey,
> I heard about this interesting idea, I scrounged up N old/small/slow/easy
> to find computers and tried to cluster them and do something. I learned
> something about cluster administration, and it was fun, but I don't use it
> anymore."
>
> This is exactly the population you want to hit. Bring in 100 advanced
> high school (grade 11-12 in US) students. Have them all use cheap
> hardware to do a cluster. Some fraction will think, "this is kind of
> cool, maybe I should major in CS instead of X." Some fraction will think,
> "how lame, why not make the single processor faster," and they can be
> CompEng or EE majors looking at how to reduce feature sizes and get the
> heat out.
>
> It's just like biology or chemistry classes. In high school biology
> (9th/10th grade) most of it is mundane memorization (the Krebs cycle,
> various descriptive stuff). Other than the use of cheap CMOS cameras,
> microscopes used at this level haven't really changed much in the last
> 100 years (and the microscopes at my kids' school are probably 10-20
> years old). They also do some more modern molecular biology in a series
> of labs partly funded by Amgen: some recombinant DNA to put fluorescent
> proteins in a bacteria, running some gels, etc. The vast majority of the
> students will NOT go on to a career in biology, but some fraction do;
> they get interested in some aspect, and they wind up majoring in bio, or
> being a pre-med, etc.
>
> Not everyone is looking for the world beater. A lot of kids start with
> Kart racing, even though even the fastest Karts aren't as fast as F1 (or
> even a Smart Car). How many engineers started with dismantling the
> lawnmower engine?
>
> For my own work, I'd rather have people who are interested in solving
> problems by ganging up multiple failure-prone processors, rather than
> centralizing it all in one monolithic box (even if the box happens to have
> multiple cores).
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
- - - - - - -   - - - - - - -   - - - - - - -
Nathan Moore
Associate Professor, Physics
Winona State University
- - - - - - -   - - - - - - -   - - - - - - -

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From deadline at eadline.org Fri Jan 13 09:38:28 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Fri, 13 Jan 2012 09:38:28 -0500
Subject: [Beowulf] FAWN
In-Reply-To: <4F0FC2F2.5090606@unimelb.edu.au>
References: <4F0F25D2.90305@runnersroll.com> <4F0FC2F2.5090606@unimelb.edu.au>
Message-ID: <4c1a72f2e8097c3585e495e1c95bfa2c.squirrel@mail.eadline.org>

> On 13/01/12 05:26, Ellis H. Wilson III wrote:
>
>> Not just good, from a sorting perspective, /best/:
>> http://sortbenchmark.org/
>
> But that algorithm isn't running on exactly wimpy hardware:
>
> Intel Core i5-2400S 2.5 GHz, 16GB RAM and a bunch of SSDs

I can vouch for the i5-2400S processors; they are one of the best values out there. I got 200 GFLOPS on a Limulus using 4 of these. Some more benchmarks here:

http://www.clustermonkey.net//content/view/306/1/

--
Doug

> cheers!
> Chris
> --
> Christopher Samuel - Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.unimelb.edu.au/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From deadline at eadline.org Fri Jan 13 10:18:02 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Fri, 13 Jan 2012 10:18:02 -0500
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References:
Message-ID: <6ea096464afff4fe7e544e0cd28c5204.squirrel@mail.eadline.org>

> -----Original Message-----
> From: Douglas Eadline [mailto:deadline at eadline.org]
> Sent: Thursday, January 12, 2012 8:49 AM
> To: Lux, Jim (337C)
> Cc: beowulf at beowulf.org
> Subject: Re: [Beowulf] A cluster of Arduinos
>
> snip
>>
>> For my own work, I'd rather have people who are interested in solving
>> problems by ganging up multiple failure prone processors, rather than
>> centralizing it all in one monolithic box (even if the box happens to
>> have multiple cores).
>
> This is going to be an exascale issue, i.e. how to compute on a system
> whose parts might be in a constant state of breaking. Another
> interesting question is: how do you know you are getting the right
> answer on a *really* large system?
>
> Of course I spend much of my time optimizing really small systems.

> Your point about scaling is well taken. So far, the computing world has
> largely dealt with things by trying to make the processor perfect and
> error free. Some limited areas of error correction are popular (RAM).
> But think in a bigger area... say your arithmetic unit has some infrequent
> unknown errors (e.g. the FDIV bug on the Pentium):
> could clever algorithm design and multiple processors (or multi-cores)
> mitigate this? E.g., instead of just computing Z = X/Y you also compute
> Z1 = (X*2)/(Y*2) and compare answers. That exact example's not great,
> because you've added two operations, but I can see that there are other
> clever techniques that might be possible.
>
> What would be nice is if you could do things like temporal redundancy
> (do the calculation twice, and if the results differ, do it a third
> time), or even better, some sort of "check calculation" that takes
> small time compared to the mainline calculation.
>
> This, I think, is somewhere that even the big iron/cluster folks could
> be doing some research. What are optimum communication fabrics to
> support this kind of "side calculation", which may have different
> communication patterns and data flow than the mainline? It has a
> parallel in things like CRC checks in communications protocols. A lot
> of hardware has a dedicated little CRC checker that is continuously
> calculating the CRC as the bits arrive, so that when you get to the end
> of the frame, the answer is already there.
>
> And Doug, your small systems have a lot of the same issues, perhaps
> because that small Limulus might be operated in environments other than
> what the underlying hardware was designed for. I know people who have
> been rudely surprised when they found that the design environment for a
> laptop is a pretty narrow temperature range (e.g. office desktop), and
> when they put them in a car, subject to 0C or 40C temperatures, if not
> wider, things don't work quite as well as expected.

I will be curious to see where these things show up, since all you really need is a power plug. (A little nervous, actually.)

> Very small systems (few nodes) have the same issues in some
> environments, e.g. a cluster subject to single-event upsets or
> functional interrupts in a high-radiation environment with a lot of
> high-energy charged particles; it's not so much a total-dose thing as
> an SEE thing.
>
> For Juno (which is in polar orbit around Jupiter), we shielded
> everything in a vault (a 1 meter cube with 1cm thick titanium walls)
> and still it's an issue. We don't get very long before everything is
> cooked.
>
> And I think that on a non-trivially small cluster (e.g. more than 4
> nodes, I think) you could do a lot of experimentation on techniques.

I agree. Four nodes is really small. BTW, the most fun in designing this system is a set of tighter constraints than are found on the typical cluster: noise, power, space, cabling, low-cost packaging, etc. I have been asked about a rack-mount version; we'll see.

One thing I find interesting is the core/node efficiency (what I call "effective cores"). In general, *on some codes*, I found that fewer cores (1P micro-ATX, 4 cores) are more efficient than many cores (2P server, 12 cores). Seems obvious, but I like to test things.

> (oddly, simulated fault injection is one of the trickier parts)

I would assume, because in a sense the black swan* is by definition hard to predict.

(* the book by Nick Taleb, not the movie)

--
Doug
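To make the temporal-redundancy idea concrete, here is a minimal sketch in C (my own illustration, not anything from real flight code) of the compute-twice, break-ties-with-a-third-run scheme described above:

    #include <stdio.h>

    /* Stand-in for an operation on untrusted hardware (a flaky FPU, say). */
    static double unreliable_div(double x, double y) { return x / y; }

    /* Temporal redundancy: run twice; only on disagreement pay for a
       third run and take the 2-of-3 majority answer. */
    static double safe_div(double x, double y)
    {
        double a = unreliable_div(x, y);
        double b = unreliable_div(x, y);
        if (a == b)
            return a;                /* common, cheap path */

        double c = unreliable_div(x, y);
        if (c == a) return a;        /* b was the glitch */
        if (c == b) return b;        /* a was the glitch */

        fprintf(stderr, "no majority: %g %g %g\n", a, b, c);
        return c;                    /* all three differ: flag it */
    }

    int main(void)
    {
        printf("%g\n", safe_div(1.0, 3.0));
        return 0;
    }

The exact-equality compare is deliberate: both runs execute the same instructions on the same inputs, so under a transient-fault model any difference is an upset, not roundoff. The "check calculation" variant would replace the second full run with something cheaper, e.g. testing that Z*Y comes out close to X.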
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Fri Jan 13 11:26:29 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Fri, 13 Jan 2012 08:26:29 -0800
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <6ea096464afff4fe7e544e0cd28c5204.squirrel@mail.eadline.org>
Message-ID:

On 1/13/12 7:18 AM, "Douglas Eadline" wrote:

>> And Doug, your small systems have a lot of the same issues, perhaps
>> because that small Limulus might be operated in environments other than
>> what the underlying hardware was designed for. I know people who have
>> been rudely surprised when they found that the design environment for a
>> laptop is a pretty narrow temperature range (e.g. office desktop), and
>> when they put them in a car, subject to 0C or 40C temperatures, if not
>> wider, things don't work quite as well as expected.
>
> I will be curious to see where these things show up since
> all you really need is a power plug. (a little nervous actually)

Yes, that *will* be interesting. And wait till someone has a cluster of Limuluses (not sure of the proper alliterative collective noun, nor the plural form; a litany of limuli? A school? A murder?)

> I agree. Four nodes is really small. BTW, the most fun in designing
> this system is a set of tighter constraints than are found on the
> typical cluster: noise, power, space, cabling, low-cost packaging, etc.
> I have been asked about a rack-mount version; we'll see.
>
> One thing I find interesting is the core/node efficiency
> (what I call "effective cores"). In general, *on some codes*, I found
> that fewer cores (1P micro-ATX, 4 cores) are more efficient than many
> cores (2P server, 12 cores). Seems obvious, but I like to test things.

Yes. Because we're using, in general, commodity components/assemblies, we're subject to the results of optimizations and market/business forces in other user spaces. Someone designing a media PC for home use might not care about electrical efficiency (there are no big yellow energy tags on computers, yet), but would care about noise. Someone designing a rack-mounted server cares not a whit about noise, but really cares about a 10% change in power consumption.

And, drop on top of that the non-synchronized differences in development/manufacturing/fabrication generations for the underlying parts. Consumer stuff comes out for the winter selling season. Commercial stuff probably is on a different cycle. It's not like everyone uses the same "model year changeover".

>> (oddly, simulated fault injection is one of the trickier parts)
>
> I would assume, because in a sense, the black swan* is
> by definition hard to predict.

Not so much that as the actual mechanics of fault injection. Think about testing error detection and recovery for flash memory. The underlying specification error rate is something like 1E-9 or 1E-10 per read, and that's a worst-case kind of spec, so errors aren't too common (i.e., you can't just run and wait for them to occur). So how do you cause errors to occur (without perturbing the system)? In the flash case, because we developed our own flash controller logic in an FPGA, we can add "error injection logic" to the design, but that's not always the case. How would you simulate upsets in a CPU core, short of blasting it with radiation, which is difficult and expensive?
I wish it were as easy as getting a little Co60 gamma source and putting it on top of the chip. We hike to somewhere that has an accelerator (UC Davis, Brookhaven, etc.) and shoot protons and heavy ions at it.

> (* the book by Nick Taleb, not the movie)

Black swans in this case would be things like the Pentium divide bug. Yes, that *would* be a challenge, but hey, we've got folks in our JPL Laboratory for Reliable Software (LARS) who sit around thinking of how to do that, among other things. (http://lars-lab.jpl.nasa.gov/)

Hmm, I'll have to go talk to those guys about clusters of Pi or Arduinos... They're big into formal verification, too, and model-based verification. So you could have a modeled system in SysML or UML and compare its behavior with that of your prototype.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From hahn at mcmaster.ca Fri Jan 13 23:18:57 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Fri, 13 Jan 2012 23:18:57 -0500 (EST)
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To:
References:
Message-ID:

> care about electrical efficiency (there are no big yellow energy tags on
> computers, yet), but would care about noise. Someone designing a rack

The "80 Plus" branding is pretty ubiquitous now, and the best part is that commodity ATX parts are starting to show up at gold levels. Server vendors have offered gold or platinum for a while now, but it's probably more important in the home, since personal machines spend more time idling, thus running the PSU at low demand. Poor-quality PSUs are remarkably bad at low utilization.

regards, mark hahn.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From samuel at unimelb.edu.au Fri Jan 13 23:46:17 2012
From: samuel at unimelb.edu.au (Chris Samuel)
Date: Sat, 14 Jan 2012 15:46:17 +1100
Subject: [Beowulf] A cluster of Arduinos
In-Reply-To: <6ea096464afff4fe7e544e0cd28c5204.squirrel@mail.eadline.org>
References: <6ea096464afff4fe7e544e0cd28c5204.squirrel@mail.eadline.org>
Message-ID: <201201141546.17872.samuel@unimelb.edu.au>

On Sat, 14 Jan 2012 02:18:02 AM Douglas Eadline wrote:

> I would assume, because in a sense, the black swan* is
> by definition hard to predict.

Ahem, not around here; they're all black [1]. Now a white swan, that would be something to see!

[1] http://www.flickr.com/photos/earthinmyeyes/4608041877/

cheers!
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From deadline at eadline.org Thu Jan 19 09:46:26 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Thu, 19 Jan 2012 09:46:26 -0500
Subject: [Beowulf] Parallel Programming Survey Report
In-Reply-To: <4c1a72f2e8097c3585e495e1c95bfa2c.squirrel@mail.eadline.org>
References: <4F0F25D2.90305@runnersroll.com> <4F0FC2F2.5090606@unimelb.edu.au> <4c1a72f2e8097c3585e495e1c95bfa2c.squirrel@mail.eadline.org>
Message-ID: <6ec5ed08a6fb8c5b390b26bdfc18803a.squirrel@mail.eadline.org>

Last year Dr Dobb's did a survey of parallel programming. Today I received a copy of:

The Parallel Programming Landscape: Multicore has gone mainstream -- but are developers ready?

It is mostly about multi-core, a bit Intel-centric (they sponsored it), and not too much about HPC. Still, it is interesting to see how the programming world is coping with multi-core. If you are interested in a copy you have to sign up here:

https://www.cmpadministration.com/ars/emailnew.do?mode=emailnew&P=P2&MZP=&L=&F=1003933&K=&cid_download

I'll probably read it more closely and post a summary on Cluster Monkey at some point.

--
Doug

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org Thu Jan 19 09:57:37 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Thu, 19 Jan 2012 15:57:37 +0100
Subject: [Beowulf] Parallel Programming Survey Report
In-Reply-To: <6ec5ed08a6fb8c5b390b26bdfc18803a.squirrel@mail.eadline.org>
References: <4F0F25D2.90305@runnersroll.com> <4F0FC2F2.5090606@unimelb.edu.au> <4c1a72f2e8097c3585e495e1c95bfa2c.squirrel@mail.eadline.org> <6ec5ed08a6fb8c5b390b26bdfc18803a.squirrel@mail.eadline.org>
Message-ID: <20120119145737.GK21917@leitl.org>

On Thu, Jan 19, 2012 at 09:46:26AM -0500, Douglas Eadline wrote:

> Last year Dr Dobb's did a survey of parallel programming.
> [snip]
While speaking about multicore, I recommend this 21-minute video interview (even if you dislike talking heads and smarmy interviewers) with David Ungar:

http://channel9.msdn.com/Blogs/Charles/SPLASH-2011-David-Ungar-Self-ManyCore-and-Embracing-Non-Determinism

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org Mon Jan 23 08:45:10 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Mon, 23 Jan 2012 14:45:10 +0100
Subject: [Beowulf] CPU Startup Combines CPU+DRAM - And A Whole Bunch Of Crazy
Message-ID: <20120123134510.GF7343@leitl.org>

(Old idea, makes sense, will they be able to pull it off?)

http://hothardware.com/News/CPU-Startup-Combines-CPUDRAMAnd-A-Whole-Bunch-Of-Crazy/

CPU Startup Combines CPU+DRAM - And A Whole Bunch Of Crazy

Sunday, January 22, 2012 - by Joel Hruska

The CPU design firm Venray Technology announced a new product design this week that it claims can deliver enormous performance benefits by combining CPU and DRAM on to a single piece of silicon. We spent some time earlier this fall discussing the new TOMI (Thread Optimized Multiprocessor) with company CTO Russell Fish, but while the idea is interesting, its presentation is marred by crazy conceptualizing and deeply suspect analytics.

The Multicore Problem:

There are three limiting factors, or walls, that limit the scaling of modern microprocessors. First, there's the memory wall, defined as the gap between the CPU and DRAM clock speed. Second, there's the ILP (Instruction Level Parallelism) wall, which refers to the difficulty of decoding enough instructions per clock cycle to keep a core completely busy. Finally, there's the power wall: the faster a CPU is and the more cores it has, the more power it consumes.

Attempting to compensate for one wall often risks running afoul of the other two. Adding more cache to decrease the impact of the CPU/DRAM speed discrepancy adds die complexity and draws more power, as does raising CPU clock speed. Combined, the three walls are a set of fundamental constraints: improving architectural efficiency and moving to a smaller process technology may make the room a bit bigger, but they don't remove the walls themselves.

TOMI attempts to redefine the problem by building a very different type of microprocessor. The TOMI Borealis is built using the same transistor structures as conventional DRAM; the chip trades clock speed and performance for ultra-low leakage. Its design is, by necessity, extremely simple. Not counting the cache, TOMI is a 22,000-transistor design, as compared to 30,000 transistors for the original ARM2. The company's early prototypes, built on legacy DRAM technology, ran at 500MHz on a 110nm process.

Instead of surrounding a CPU core with a substantial amount of L2 and L3 cache, Venray inserted a CPU core directly into a DRAM design. A TOMI Borealis core connects eight TOMI cores to a 1Gbit DRAM with a total of 16 ICs per 2GB DIMM. This works out to a total of 128 processor cores per DIMM. Because they're built using ultra-low-leakage processes and are so small, such cores cost very little to build and consume vanishingly small amounts of power (Venray claims power consumption is as low as 23mW per core at 500MHz).
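The memory wall is easy to see on any commodity box: chase dependent pointers through a buffer much larger than cache and the cost per access jumps from a couple of cycles to full DRAM latency. A minimal sketch (my own toy demo, nothing from Venray; the buffer size and iteration count are arbitrary):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void)
    {
        /* 64 MB of size_t slots: far larger than any 2012-era cache. */
        size_t n = (64u * 1024 * 1024) / sizeof(size_t);
        size_t *next = malloc(n * sizeof *next);
        if (!next) return 1;

        /* Sattolo's algorithm builds a single random cycle, so the
           chase visits every slot and the prefetcher can't guess the
           order.  (Crude rand() indexing; fine for a demo as long as
           RAND_MAX >= n.) */
        for (size_t i = 0; i < n; i++) next[i] = i;
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t t = next[i]; next[i] = next[j]; next[j] = t;
        }

        size_t p = 0, iters = 20u * 1000 * 1000;
        clock_t t0 = clock();
        for (size_t k = 0; k < iters; k++)
            p = next[p];               /* each load depends on the last */
        double s = (double)(clock() - t0) / CLOCKS_PER_SEC;

        printf("%.1f ns per dependent load (ignore: %zu)\n",
               1e9 * s / iters, p);
        return 0;
    }

On typical commodity hardware this prints something in the tens of nanoseconds per load, i.e. a couple of hundred clock cycles, which is the gap Venray proposes to close by putting the core on the DRAM die.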
It's an interesting idea.

The Bad:

When your CPU has fewer transistors than an architecture that debuted in 1986, there's a good chance that you left a few things out, like an FPU, branch prediction, pipelining, or any form of speculative execution. Venray may have created a chip with power consumption an order of magnitude lower than anything ARM builds and more memory bandwidth than Intel's highest-end Xeons, but it's an ultra-specialized, ultra-lightweight core that trades 25 years of flexibility and performance for scads of memory bandwidth.

The last few years have seen a dramatic surge in the number of low-power, many-core architectures being floated as the potential future of computing, but Venray's approach relies on the manufacturing expertise of companies who have no experience in building microprocessors and don't normally serve as foundries. This imposes fundamental restrictions on the CPU's ability to scale; DRAM is manufactured using a three-layer mask rather than the 10-12 layers Intel and AMD use for their CPUs. Venray already acknowledges that these conditions imposed substantial limitations on the original TOMI design.

Of course, there's still a chance that the TOMI uarch could be effective in certain bandwidth-hungry scenarios -- but that's where the Venray Crazy Train goes flying off the track.

The Disingenuous and Crazy

Let's start here. In a graph like this, you expect the two bars to represent the same systems being compared across three different characteristics. That's not the case. When we spoke to Russell Fish in late November, he pointed us to this publicly available document and claimed that the results came from a customer with 384 2.1GHz Xeons. There's no such thing as an S5620 Xeon, and even if we grant that he meant the E5620 CPU, that's a 2.4GHz chip.

The "Power consumption" graphs show Oracle's maximum power consumption for a system with 10x Xeon E7-8870s, 168 dedicated SQL processors, 5.3TB (yes, TB) of Flash and 15x 10,000 RPM hard drives. It's not only a worst-case figure, it's a figure utterly unrelated to the workload shown in the performance comparison. Furthermore, given that each Xeon E7-8870 has a 130W TDP, ten of them only come out to 1.3kW; Oracle's 17.7kW figure means that the overwhelming majority of the cabinet's power consumption is driven by components other than its CPUs.

From here, things rapidly get worse. Fish makes his points about power walls by referring to unverified claims that prototype 90nm Tejas chips drew 150W at 2.8GHz back in 2004. That's like arguing that Ford can't build a decent car because the Edsel sucked.

After reading about the technology, you might think Venray was planning to market a small chip to high-end HPC niche markets... and you'd be wrong. The company expects the following to occur as a result of this revolutionary architecture (organized by least-to-most creepy):

- Computer speech will be so common that devices will talk to other devices in the presence of their users.

- Your cell phone camera will recognize the face of anyone it sees and scan the computer cloud for background red flags as well as six degrees of separation.

- Common commands will be reduced to short verbal cues like clicking your tongue or sucking your lips.

- Your personal history will be displayed for one and all to see... women will create search engines to find eligible, prosperous men. Men will create search engines to qualify women.
- Criminals will find their jobs much more difficult, because their history will be immediately known to anyone who encounters them.

- TOMI Technology will be built on flash memories, creating the elemental unit of a learning machine... the machines will be able to self-organize, build robust communicating structures, and collaborate to perform tasks.

- A disposable diaper company will give away TOMI-enabled teddy bears that teach reading and arithmetic. It will be able to identify specific children... and from time to time remind Mom to buy a product. The bear will also diagnose a raspy throat, a cough, or a runny nose.

Conclusion:

Fish has spent decades in the microprocessor industry -- he invented the first CPU to use a clock multiplier, in conjunction with Chuck H. Moore -- but his vision of the future is crazy enough to scare mad dogs and Englishmen.

His idea for a CPU architecture is interesting, even underneath the obfuscation and false representation, but too practically limited to ever take off. Google, an enthusiastic and dedicated proponent of energy-efficient multi-core research, said it best in a paper titled "Brawny cores still beat wimpy cores, most of the time":

"Once a chip's single-core performance lags by more than a factor of two or so behind the higher end of current-generation commodity processors, making a business case for switching to the wimpy system becomes increasingly difficult... So go forth and multiply your cores, but do it in moderation, or the sea of wimpy cores will stick to your programmers' boots like clay."

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From prentice at ias.edu Mon Jan 23 10:38:39 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Mon, 23 Jan 2012 10:38:39 -0500
Subject: [Beowulf] CPU Startup Combines CPU+DRAM - And A Whole Bunch Of Crazy
In-Reply-To: <20120123134510.GF7343@leitl.org>
References: <20120123134510.GF7343@leitl.org>
Message-ID: <4F1D7EFF.7080206@ias.edu>

If you read this PDF from Venray Technologies, which is linked to in the article, you see where the "Whole Bunch of Crazy" part comes from. After reading it, Venray lost a lot of credibility in my book.

https://www.venraytechnology.com/economics_of_cpu_in_DRAM2.pdf

--
Prentice

On 01/23/2012 08:45 AM, Eugen Leitl wrote:
> (Old idea, makes sense, will they be able to pull it off?)
>
> http://hothardware.com/News/CPU-Startup-Combines-CPUDRAMAnd-A-Whole-Bunch-Of-Crazy/
>
> [article quoted in full -- snip]
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
(did Imhotep use some form of project planning tools? You bet he did) However, true parallelism (MIMD) is harder to conceptualize. Vector and matrix math is one area, but I'd argue that it's just the same as EP tasks, just at a finer grain. Systolic arrays, vector pipelines, FFT boxes from FloatingPointSystems, are all basically ways to use the underlying structure of the task, in an easy way (how long til there's a hardware implementation of the new faster-than-FFT algorithm published last week?) And in all those cases, you have to explicitly make use of the special capabilities. That is, in general, the compiler doesn't recognize it (although, modern parallelizing compilers ARE really smart.. So they probably do find most of the cases) I don't know that we have good conceptual tools to take a complex task and break it effectively into multiple disparate component tasks that can effectively run in parallel. It's a hard task for something straightforward (e.g. Designing a big system or building a spacecraft), and I don't know that any of outputs of current project planning techniques (which are entirely manual) can be said to produce "generalized" optimum outputs. They produce *an* output for dividing the complex task up (or else the project can't be done), but I don't know that the output is provably optimum or even workable (an awful lot of projects over-run, and not just because of bad estimates for time/cost). So the problem facing would-be users of new computing architectures (be they TOMI, HyperCube, ConnectionMachine, or Beowulf) is like that facing a project planner given a big project, and a brand new crew of workers who speak a different language, with skill sets totally different than the planner is used to. This is what the computer user is facing: There's no compiler or problem description technique that will automatically generate a "work plan" to use that new architecture. It's all manual, and it's hard, and you're up against a brute force "why not just hook 500 people up to that rock and drag it" approach. The people who figure out the new way will certainly benefit society, but there's going to be a lot of false starts along the way. And, I'm not particularly sanguine about the process being automated (at least in the sense of automatic parallelizing compilers that recognize loops and repetitve stuff). I think that for the next few years (decades?) using new architectures is going to rely on skilled humans to figure out how to use it, on an ad hoc, unique to each application, basis. [Back in the 80s, I had a loaner "sugarcube" 4 node Intel hypercube sitting on my desk for a while. I wanted to figure out something to do with it that is non-trivial, and not the examples given in the docs (which focused on stuff like LISP and Prolog). I started, as I'm sure many people do, by taking a multithreaded application I had, and distributing the threads to processors. You pretty quickly realize, though, that it's tough to evenly distribute the loads among processors, and you wind up with processor 1 waiting for something that processor 2 is doing, which in turn is waiting for something that processor 3 is doing, and so forth. In a "shared processor" this isn't a big deal, and is transparent: the processor is always working, and aside from deadlocks, there's no particular reason why you need to balance load among threads. For what it's worth, the task I was doing was comparable to taking execution of a Matlab/simulink model and distributing it across multiple processors. 
You had signals flowing among blocks, etc. These things are computationally intensive (especially if you have loops in the design, so you need an iterative solution of some sort) so the idea of putting multiple processors to work is attractive. But the "work" in each block in the diagram isn't known a-priori and might vary during the course of the simulation, so it's not like you can come up with some sort of automatic partitioning algorithm. On 1/23/12 7:38 AM, "Prentice Bisbal" wrote: >If you read this PDF from Venray Technologies, which is linked to in the >article, you see where the 'Whole Bunch of Crazy" part comes from. After >reading it, Venray lost a lot of credibility in my book. > >https://www.venraytechnology.com/economics_of_cpu_in_DRAM2.pdf > >-- >Prentice > > >On 01/23/2012 08:45 AM, Eugen Leitl wrote: >> (Old idea, makes sense, will they be able to pull it off?) >> >> >>http://hothardware.com/News/CPU-Startup-Combines-CPUDRAMAnd-A-Whole-Bunch >>-Of-Crazy/ >> >> CPU Startup Combines CPU+DRAM?And A Whole Bunch Of Crazy >> >> Sunday, January 22, 2012 - by Joel Hruska >> >> The CPU design firm Venray Technology announced a new product design >>this >> week that it claims can deliver enormous performance benefits by >>combining >> CPU and DRAM on to a single piece of silicon. We spent some time >>earlier this >> fall discussing the new TOMI (Thread Optimized Multiprocessor) with >>company >> CTO Russell Fish, but while the idea is interesting; its presentation is >> marred by crazy conceptualizing and deeply suspect analytics. >> >> The Multicore Problem: >> >> There are three limiting factors, or walls, that limit the scaling of >>modern >> microprocessors. First, there's the memory wall, defined as the gap >>between >> the CPU and DRAM clock speed. Second, there's the ILP (Instruction Level >> Parallelism) wall, which refers to the difficulty of decoding enough >> instructions per clock cycle to keep a core completely busy. Finally, >>there's >> the power wall--the faster a CPU is and the more cores it has, the more >>power >> it consumes. >> >> Attempting to compensate for one wall often risks running afoul of the >>other >> two. Adding more cache to decrease the impact of the CPU/DRAM speed >> discrepancy adds die complexity and draws more power, as does raising >>CPU >> clock speed. Combined, the three walls are a set of fundamental >> constraints--improving architectural efficiency and moving to a smaller >> process technology may make the room a bit bigger, but they don't >>remove the >> walls themselves. >> >> TOMI attempts to redefine the problem by building a very different type >>of >> microprocessor. The TOMI Borealis is built using the same transistor >> structures as conventional DRAM; the chip trades clock speed and >>performance >> for ultra-low low leakage. Its design is, by necessity, extremely >>simple. Not >> counting the cache, TOMI is a 22,000 transistor design, as compared to >>30,000 >> transistors for the original ARM2. The company's early prototypes, >>built on >> legacy DRAM technology, ran at 500MHz on a 110nm process. >> >> Instead of surrounding a CPU core with a substantial amount of L2 and L3 >> cache, Venray inserted a CPU core directly into a DRAM design. A TOMI >> Borealis core connects eight TOMI cores to a 1Gbit DRAM with a total of >>16 >> ICs per 2GB DIMM. This works out to a total of 128 processor cores per >>DIMM. 
>> Because they're built using ultra-low-leakage processes and are so >>small, >> such cores cost very little to build and consume vanishingly small >>amounts of >> power (Venray claims power consumption is as low as 23mW per core at >>500MHz). >> >> It's an interesting idea. >> >> The Bad: >> >> When your CPU has fewer transistors than an architecture that debuted in >> 1986, it's a good chance that you left a few things out--like an FPU, >>branch >> prediction, pipelining, or any form of speculative execution. Venray >>may have >> created a chip with power consumption an order of magnitude lower than >> anything ARM builds and more memory bandwidth than Intel's highest-end >>Xeons, >> but it's an ultra-specialized, ultra-lightweight core that trades 25 >>years of >> flexibility and performance for scads of memory bandwidth. >> >> >> The last few years have seen a dramatic surge in the number of >>low-power, >> many-core architectures being floated as the potential future of >>computing, >> but Venray's approach relies on the manufacturing expertise of >>companies who >> have no experience in building microprocessors and don't normally serve >>as >> foundries. This imposes fundamental restrictions on the CPU's ability to >> scale; DRAM is manufactured using a three layer mask rather than the >>10-12 >> layers Intel and AMD use for their CPUs. Venray already acknowledges >>that >> these conditions imposed substantial limitations on the original TOMI >>design. >> >> Of course, there's still a chance that the TOMI uarch could be >>effective in >> certain bandwidth-hungry scenarios--but that's where the Venray Crazy >>Train >> goes flying off the track. >> >> The Disingenuous and Crazy >> >> Let's start here. In a graph like this, you expect the two bars to >>represent >> the same systems being compared across three different characteristics. >> That's not the case. When we spoke to Russell Fish in late November, he >> pointed us to this publicly available document and claimed that the >>results >> came from a customer with 384 2.1GHz Xeons. There's no such thing as an >>S5620 >> Xeon and even if we grant that he meant the E5620 CPU, that's a 2.4GHz >>chip. >> >> The "Power consumption" graphs show Oracle's maximum power consumption >>for a >> system with 10x Xeon E7-8870s, 168 dedicated SQL processors, 5.3TB >>(yes, TB) >> of Flash and 15x 10,000 RPM hard drives. It's not only a worst-case >>figure, >> it's a figure utterly unrelated to the workload shown in the Performance >> comparison. Furthermore, given that each Xeon E7-8870 has a 130W TDP, >>ten of >> them only come out to 1.3kW--Oracle's 17.7kW figure means that the >> overwhelming majority of the cabinet's power consumption is driven by >> components other than its CPUs. >> >> From here, things rapidly get worse. Fish makes his points about power >>walls >> by referring to unverified claims that prototype 90nm Tejas chips drew >>150W >> at 2.8GHz back in 2004. That's like arguing that Ford can't build a >>decent >> car because the Edsel sucked. >> >> After reading about the technology, you might think Venray was planning >>to >> market a small chip to high-end HPC niche markets... and you'd be >>wrong. The >> company expects the following to occur as a result of this revolutionary >> architecture (organized by least-to-most creepy): >> >> Computer speech will be so common that devices will talk to other >>devices >> in the presence of their users. 
>> >> Your cell phone camera will recognize the face of anyone it sees >>and scan >> the computer cloud for backround red flags as well as six degrees of >> separation >> >> Common commands will be reduced to short verbal cues like clicking >>your >> tongue or sucking your lips >> >> Your personal history will be displayed for one and all to >>see...women >> will create search engines to find eligible, prosperous men. Men will >>create >> search engines to qualify women. Criminals will find their jobs much >>more >> difficult because their history will be immediately known to anyone who >> encounters them. >> >> TOMI Technology will be built on flash memories creating the >>elemental >> unit of a learning machine... the machines will be able to self >>organize, >> build robust communicating structures, and collaborate to perform tasks. >> >> A disposable diaper company will give away TOMI enabled teddy bears >>that >> teach reading and arithmetic. It will be able to identify specific >> children... and from time to time remind Mom to buy a product. The bear >>will >> also diagnose a raspy throat, a cough, or runny nose. >> >> Conclusion: >> >> Fish has spent decades in the microprocessor industry--he invented the >>first >> CPU to use a clock multiplier in conjunction with Chuck H. Moore--but >>his >> vision of the future is crazy enough to scare mad dogs and Englishmen. >> >> His idea for a CPU architecture is interesting, even underneath the >> obfuscation and false representation, but too practically limited to >>ever >> take off. Google, an enthusiastic and dedicated proponent of energy >> efficient, multi-core research said it best in a paper titled "Brawny >>cores >> still beat wimpy cores, most of the time." >> >> "Once a chip?s single-core performance lags by more than a factor to >>two or >> so behind the higher end of current-generation commodity processors, >>making a >> business case for switching to the wimpy system becomes increasingly >> difficult... So go forth and multiply your cores, but do it in >>moderation, or >> the sea of wimpy cores will stick to your programmers? boots like clay." >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >> To change your subscription (digest mode or unsubscribe) visit >>http://www.beowulf.org/mailman/listinfo/beowulf >> >_______________________________________________ >Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From lindahl at pbm.com Mon Jan 23 14:28:26 2012
From: lindahl at pbm.com (Greg Lindahl)
Date: Mon, 23 Jan 2012 11:28:26 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
Message-ID: <20120123192826.GB17383@bx9.net>

http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Mon Jan 23 14:59:30 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Mon, 23 Jan 2012 20:59:30 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <20120123192826.GB17383@bx9.net>
References: <20120123192826.GB17383@bx9.net>
Message-ID:

Interesting article. Difficult for me to analyse: usually you sell your business either when it's a success or when you want to run away. Not sure which of the two it is here.

Maybe some years from now, with some support from Intel, QLogic can also roll out FDR. Right now they're stuck with QDR, which on their homepage they announce as 40 gigabit per second.

http://www.qlogic.com/Products/adapters/Pages/InfiniBandAdapters.aspx

Showing the QLogic 7300 series.

Mellanox is slam-dunking with FDR now, the new network generation, which I suppose is double the bandwidth of QDR; it was rolled out a few months ago and should be shipping by now.

QLogic, AFAIK, hasn't even announced its next-generation network yet, let alone shown it, and still toys with QDR, which is what I toy with at home. The fact that they announced 'improving' the old QDR I would interpret as bad news for innovating towards FDR.

Maybe someone from Mellanox wants to comment on FDR and whether it's double the bandwidth of QDR, as I suppose some of them will be monitoring this list.

On Jan 23, 2012, at 8:28 PM, Greg Lindahl wrote:

> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From hahn at mcmaster.ca Mon Jan 23 15:00:07 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Mon, 23 Jan 2012 15:00:07 -0500 (EST)
Subject: [Beowulf] CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
In-Reply-To: <4F1D7EFF.7080206@ias.edu>
References: <20120123134510.GF7343@leitl.org> <4F1D7EFF.7080206@ias.edu>
Message-ID:

> If you read this PDF from Venray Technologies, which is linked to in the
> article, you see where the 'Whole Bunch of Crazy' part comes from. After
> reading it, Venray lost a lot of credibility in my book.
>
> https://www.venraytechnology.com/economics_of_cpu_in_DRAM2.pdf

Wow, you're not kidding. Mostly it makes me wonder whether the economy is such that you can actually get first-round VC with collateral like that!
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Mon Jan 23 15:17:01 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 23 Jan 2012 12:17:01 -0800
Subject: [Beowulf] CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
In-Reply-To: Message-ID:

I don't know... Maybe the list of potential applications (some of which are speculative and well out there) is what it takes to justify VC. Like DARPA: high risk, high reward. The typical VC doesn't expect every investment to hit, but the ones that do, they want big returns from. If you're just interested in slogging through successive refinement, there are probably other sources of capital that are more appropriate.

While some of those things are downright creepy, none of them appears to violate the laws of physics, and someone with cash is willing to put some up to run the idea forward and establish a position (the patent term is 20 years, after all, which is a long way in the future in the technology world). In 2030 there may be gripes on the equivalent of Slashdot about how this Venray had patents on all the fundamental things people are using. Think of hyperlinks, mice, etc.

On 1/23/12 12:00 PM, "Mark Hahn" wrote:

>> If you read this PDF from Venray Technologies, which is linked to in the
>> article, you see where the 'Whole Bunch of Crazy' part comes from. After
>> reading it, Venray lost a lot of credibility in my book.
>>
>> https://www.venraytechnology.com/economics_of_cpu_in_DRAM2.pdf
>
> wow, you're not kidding. mostly it makes me wonder whether the economy
> is such that you can actually get first-round VC with collateral like
> that!
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From raysonlogin at gmail.com Mon Jan 23 15:50:09 2012
From: raysonlogin at gmail.com (Rayson Ho)
Date: Mon, 23 Jan 2012 15:50:09 -0500
Subject: [Beowulf] CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
In-Reply-To: References: <4F1D7EFF.7080206@ias.edu>
Message-ID:

On Mon, Jan 23, 2012 at 11:35 AM, Lux, Jim (337C) wrote:
> The "processors in a sea of memory" model has been around for a while
> (and, in fact, there were a lot of designs in the 80s, at the board if
> not the chip level: transputers, early hypercubes, etc.) So this is
> revisiting the architecture at a smaller level of integration.

I remember 12-15 years ago I was reading quite a few papers published by the Berkeley Intelligent RAM (IRAM) Project:

http://iram.cs.berkeley.edu/

So 15 years later someone suddenly thinks that it is a good idea to ship IRAM systems to real customers??
:-D

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/

> One thing about power consumption: those memory cells consume so little
> power because most of them are not being accessed. They're essentially
> "floating" capacitors, so the power consumption of the same transistor in
> a CPU (where the duty factor is 100%) is going to be higher than the
> power consumption in a memory cell (where the duty factor is 0.001% or
> something).
>
> And, as always, the challenge is in the software to effectively use the
> distributed computing architecture. When you think about it, we've had
> almost a century to figure out how to program single-instruction-stream
> computers of one sort or another, and it was easy, because we are
> single-stream (SISD) ourselves. We can create a simulation of multiple
> threads by timesharing in some sense (in either the human or machine
> models).
>
> And we have lots of experience with EP-type, or even scatter/gather-type
> processes (tilling land, building pyramids, assembly lines), so that
> model of software/hardware architecture can be argued to be a natural
> outgrowth of what humans already do, and have been figuring out how to do
> for millennia. (Did Imhotep use some form of project planning tools? You
> bet he did.)
>
> However, true parallelism (MIMD) is harder to conceptualize. Vector and
> matrix math is one area, but I'd argue that it's just the same as EP
> tasks, only at a finer grain. Systolic arrays, vector pipelines, and FFT
> boxes from Floating Point Systems are all basically ways to use the
> underlying structure of the task in an easy way (how long till there's a
> hardware implementation of the new faster-than-FFT algorithm published
> last week?). And in all those cases, you have to explicitly make use of
> the special capabilities. That is, in general, the compiler doesn't
> recognize it (although modern parallelizing compilers ARE really smart,
> so they probably do find most of the cases).
>
> I don't know that we have good conceptual tools to take a complex task
> and break it effectively into multiple disparate component tasks that
> can effectively run in parallel. It's a hard task for something
> straightforward (e.g. designing a big system or building a spacecraft),
> and I don't know that any of the outputs of current project planning
> techniques (which are entirely manual) can be said to produce
> "generalized" optimum outputs. They produce *an* output for dividing the
> complex task up (or else the project can't be done), but I don't know
> that the output is provably optimum or even workable (an awful lot of
> projects over-run, and not just because of bad estimates for time/cost).
>
> So the problem facing would-be users of new computing architectures (be
> they TOMI, HyperCube, ConnectionMachine, or Beowulf) is like that facing
> a project planner given a big project, and a brand new crew of workers
> who speak a different language, with skill sets totally different than
> the planner is used to.
>
> This is what the computer user is facing: there's no compiler or problem
> description technique that will automatically generate a "work plan" to
> use that new architecture. It's all manual, and it's hard, and you're up
> against a brute force "why not just hook 500 people up to that rock and
> drag it" approach.
> The people who figure out the new way will certainly benefit society,
> but there are going to be a lot of false starts along the way. And I'm
> not particularly sanguine about the process being automated (at least in
> the sense of automatic parallelizing compilers that recognize loops and
> repetitive stuff). I think that for the next few years (decades?) using
> new architectures is going to rely on skilled humans figuring out how to
> use them, on an ad hoc, unique-to-each-application basis.
>
> [Back in the 80s, I had a loaner "sugarcube" 4-node Intel hypercube
> sitting on my desk for a while. I wanted to figure out something to do
> with it that was non-trivial, and not the examples given in the docs
> (which focused on stuff like LISP and Prolog). I started, as I'm sure
> many people do, by taking a multithreaded application I had and
> distributing the threads to processors. You pretty quickly realize,
> though, that it's tough to evenly distribute the loads among processors,
> and you wind up with processor 1 waiting for something that processor 2
> is doing, which in turn is waiting for something that processor 3 is
> doing, and so forth. In a "shared processor" this isn't a big deal, and
> is transparent: the processor is always working, and aside from
> deadlocks, there's no particular reason why you need to balance load
> among threads.
>
> For what it's worth, the task I was doing was comparable to taking the
> execution of a Matlab/Simulink model and distributing it across multiple
> processors. You had signals flowing among blocks, etc. These things are
> computationally intensive (especially if you have loops in the design,
> so you need an iterative solution of some sort), so the idea of putting
> multiple processors to work is attractive. But the "work" in each block
> in the diagram isn't known a priori and might vary during the course of
> the simulation, so it's not like you can come up with some sort of
> automatic partitioning algorithm. (A toy sketch after this message
> illustrates the imbalance.)
>
> On 1/23/12 7:38 AM, "Prentice Bisbal" wrote:
>
>> If you read this PDF from Venray Technologies, which is linked to in
>> the article, you see where the 'Whole Bunch of Crazy' part comes from.
>> After reading it, Venray lost a lot of credibility in my book.
>>
>> https://www.venraytechnology.com/economics_of_cpu_in_DRAM2.pdf
>>
>> --
>> Prentice
>>
>> On 01/23/2012 08:45 AM, Eugen Leitl wrote:
>>> (Old idea, makes sense, will they be able to pull it off?)
>>>
>>> http://hothardware.com/News/CPU-Startup-Combines-CPUDRAMAnd-A-Whole-Bunch-Of-Crazy/
>>>
>>> CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
>>>
>>> Sunday, January 22, 2012 - by Joel Hruska
>>>
>>> [...]
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Mon Jan 23 15:58:11 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 23 Jan 2012 12:58:11 -0800
Subject: [Beowulf] CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
In-Reply-To: Message-ID:

On 1/23/12 12:50 PM, "Rayson Ho" wrote:

> On Mon, Jan 23, 2012 at 11:35 AM, Lux, Jim (337C) wrote:
>> The "processors in a sea of memory" model has been around for a while
>> (and, in fact, there were a lot of designs in the 80s, at the board if
>> not the chip level: transputers, early hypercubes, etc.) So this is
>> revisiting the architecture at a smaller level of integration.
>
> I remember 12-15 years ago I was reading quite a few papers published
> by the Berkeley Intelligent RAM (IRAM) Project:
>
> http://iram.cs.berkeley.edu/
>
> So 15 years later someone suddenly thinks that it is a good idea to
> ship IRAM systems to real customers?? :-D
>
> Rayson

Or maybe all good ideas keep coming up again, and each time the idea is refined a bit, or another possible source of funding appears. Look at "solar power transmitted by microwaves from orbit" as an example. That one has a 15-20 year cycle time. You have an idea which is attractive... you get some money to run it forward, and then insurmountable problems crop up, discoverable only with significant investment of time/money (>> 1 work month). That puts the idea to sleep for a while, until either the reasons are forgotten or technology has advanced to the point where what was unreasonable the previous time is reasonable now.

Certainly in the computing world, where 10-15 years is sufficient for many orders of magnitude of change in performance along many axes, it pays to revisit things, since what may have been a good balance or trade back then isn't now. And that's sort of the thrust of their white paper (justifying that now the time is right), as well as staking their claim to a bunch of general applications, few of which are uniquely enabled by their proposed technology.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
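Jim's load-balancing observation above (threads pinned to processors end up waiting on one another) is easy to make concrete. A minimal sketch with invented task costs, comparing static round-robin pinning against greedy assignment to the least-loaded processor (a stand-in for workers pulling from a shared queue); only the qualitative gap matters:

    # Toy model of the point above: with uneven, unpredictable task costs,
    # static pinning leaves processors idle while one straggler finishes.
    # Task costs are invented for illustration.
    import heapq
    import random

    random.seed(42)
    tasks = [random.expovariate(1.0) for _ in range(64)]  # uneven block costs
    nprocs = 4

    # Static: round-robin pinning; finish time is the most-loaded processor.
    static_makespan = max(sum(tasks[p::nprocs]) for p in range(nprocs))

    # Dynamic: greedy longest-task-first onto the least-loaded processor,
    # a stand-in for workers pulling work from a shared queue.
    loads = [0.0] * nprocs
    heapq.heapify(loads)
    for t in sorted(tasks, reverse=True):
        heapq.heappush(loads, heapq.heappop(loads) + t)
    dynamic_makespan = max(loads)

    print(f"static pinning: {static_makespan:.1f}")
    print(f"shared queue  : {dynamic_makespan:.1f}")  # usually noticeably smaller

The gap grows with the variance of the task costs, which is exactly the situation described for simulation blocks whose work isn't known a priori.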
From hahn at mcmaster.ca Mon Jan 23 16:19:34 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Mon, 23 Jan 2012 16:19:34 -0500 (EST)
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <20120123192826.GB17383@bx9.net>
References: <20120123192826.GB17383@bx9.net>
Message-ID:

> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html

Wonder what Intel's thinking - they could do some very interesting stuff, but it would take a bit of charisma. QPI-over-IB, anyone?

I'm not crazy about Intel being a vertically-integrated HPC supplier (chips, systems, interconnect, MPI, compilers - I guess they still don't have their own scheduler or sexy cloud branding ;)

The world is a better place when each level has internal competition based on useful, open (free), multi-implementation standards.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From landman at scalableinformatics.com Mon Jan 23 16:33:48 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 23 Jan 2012 16:33:48 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: References: <20120123192826.GB17383@bx9.net>
Message-ID: <4F1DD23C.8080601@scalableinformatics.com>

On 01/23/2012 04:19 PM, Mark Hahn wrote:
> the world is a better place when each level has internal competition
> based on useful, open (free), multi-implementation standards.

Markets always go through these full-on vertical-integration phases (for a while) before the assets are sold off (either voluntarily or via bankruptcy court). It's a natural part of the business cycle. Cisco is building servers now. Oracle, the whole stack. Pretty soon some whippersnapper of a company is going to come along and eat their lunches, and then they will get competitive pressure to change.

This said, many *many* large university sites like dealing with "a single vendor" (that is, until they eventually get screwed over by that one vendor, or realize that the "great deal" they are getting really isn't as great as it sounded ...). Which is part of the reason it's so hard getting into accounts other vendors have locked up.

Sadly, lots of this works around the spirit (while probably skating very close to the edge of the letter) of the law surrounding most public acquisition processes, but that's life I guess.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From prentice at ias.edu Mon Jan 23 16:46:11 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Mon, 23 Jan 2012 16:46:11 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: References: <20120123192826.GB17383@bx9.net>
Message-ID: <4F1DD523.4020005@ias.edu>

On 01/23/2012 04:19 PM, Mark Hahn wrote:
>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
> wonder what Intel's thinking - could do some very interesting stuff,
> but it would take a bit of charisma. QPI-over-IB anyone?

That's what I'm thinking!
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Mon Jan 23 16:49:12 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Mon, 23 Jan 2012 22:49:12 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: References: <20120123192826.GB17383@bx9.net>
Message-ID:

On Jan 23, 2012, at 10:19 PM, Mark Hahn wrote:

>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>
> wonder what Intel's thinking - could do some very interesting stuff,
> but it would take a bit of charisma. QPI-over-IB anyone?

Forget it.

> I'm not crazy about Intel being a vertically-integrated HPC supplier
> (chips, systems, interconnect, mpi, compilers - I guess they still
> don't have their own scheduler or sexy cloud branding ;)

Maybe they just want a new-generation Ethernet NIC dirt cheap for their motherboards; if you produce it in the numbers they do, probably anything gets dirt cheap. This doesn't hit the high end, yet it might be cheaper to buy QLogic than to pay royalties to either of the InfiniBand vendors, which would be Mellanox or QLogic.

Also, they bought QLogic's InfiniBand business for 125 million dollars, though in cash, which doesn't seem exceptionally much from Intel's viewpoint, whereas they might intend to sell some of their upcoming line of vector CPUs, which badly need a network, of course. 125 million is just a few supercomputers.

Maybe it was just a cheap buy, as QLogic doesn't have FDR yet; who knows?

What I wonder about is how Wall Street knew in advance about QLogic getting taken over. If we look carefully, we see that since roughly December 19th, 2011, the NASDAQ rose roughly 10.5% and QLogic rose quite a lot more, by several percent. So it was in significantly more demand than the index, which is odd given that QLogic has rolled out nothing in those months, whereas its competitor Mellanox has rolled out FDR.

It's obvious some traders knew this deal was coming, but real finger-pointing is not my job.

Vincent

> the world is a better place when each level has internal competition
> based on useful, open (free), multi-implementation standards.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From samuel at unimelb.edu.au Mon Jan 23 18:00:02 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Tue, 24 Jan 2012 10:00:02 +1100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: References: <20120123192826.GB17383@bx9.net>
Message-ID: <4F1DE672.6000602@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 24/01/12 08:19, Mark Hahn wrote:

> wonder what Intel's thinking - could do some very interesting stuff,
> but it would take a bit of charisma. QPI-over-IB anyone?

I remember hearing way back that IB was going to be the technology to replace all those various buses (PCI, etc) on a motherboard [1]; then it all went quiet, and then it re-emerged as an interconnect.

So perhaps Intel (who were part of one of the two groups that merged to create IB) have thoughts again on this?

cheers,
Chris

[1] Interestingly, a similar comment appears on the IB Wikipedia page under history, but sadly without references...
http://en.wikipedia.org/wiki/InfiniBand#History

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8d5nIACgkQO2KABBYQAh+rcACgjTSmbr9EC4clrh0J2EQUT8lX
Sz0AniUG4pdhBkliNWGq5E1tsXiOa8IV
=0k6Z
-----END PGP SIGNATURE-----
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From joshua_mora at usa.net Mon Jan 23 18:02:12 2012
From: joshua_mora at usa.net (Joshua mora acosta)
Date: Mon, 23 Jan 2012 17:02:12 -0600
Subject: [Beowulf] Intel buys QLogic InfiniBand business
Message-ID: <708qawXBm8848S02.1327359732@web02.cms.usa.net>

Do you mean IB over QPI? Either way, high node count coherence will be an issue.

In any case, by acquiring their IP it is a step forward towards SoC (System on Chip): a preliminary step (building block) for the Exascale strategy and for low-cost enterprise/cloud solutions.

Joshua

------ Original Message ------
Received: 03:47 PM CST, 01/23/2012
From: Prentice Bisbal
To: beowulf at beowulf.org
Subject: Re: [Beowulf] Intel buys QLogic InfiniBand business

> On 01/23/2012 04:19 PM, Mark Hahn wrote:
>>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>> wonder what Intel's thinking - could do some very interesting stuff,
>> but it would take a bit of charisma. QPI-over-IB anyone?
>
> That's what I'm thinking!
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Mon Jan 23 18:24:15 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 00:24:15 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <708qawXBm8848S02.1327359732@web02.cms.usa.net>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net>
Message-ID: <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl>

On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote:

> Do you mean IB over QPI?
> Either way, high node count coherence will be an issue.

Just ignore his statement; it's total nonsense: nanosecond latency on QPI, using two rings, versus something with latency up to a factor of 1000 slower, with PCIe as the slowest delaying factor.

Doing cache coherency over that? Forget it.

From what I understand, a big problem in modern CPUs is the crossbar. In the latest chip shown, Bulldozer, it takes a significant number of transistors. If you suddenly confront that crossbar with latencies a factor of 4000 slower, that's not going to make it perform better, of course.

> In any case, by acquiring their IP it is a step forward towards SoC
> (System on Chip): a preliminary step (building block) for the Exascale
> strategy and for low-cost enterprise/cloud solutions.

Not with Intel. Intel sells fast equipment, yet always at a huge price, about the opposite of InfiniBand, which is a dirt-cheap technology.

I guess we must see this much more simply. For a giant such as Intel, paying a bit over 100 million is peanuts - probably less than what they would need to pay in royalties to a manufacturer owning a bunch of patents in the Ethernet NIC area; the HPC side Intel gets 'for free'. It allows them to produce maybe a 10 gigabit Ethernet NIC dirt cheap without needing to pay royalties to QLogic. Such a 10 gigabit Ethernet NIC will not be a big performer, yet price matters a lot when integrating. Every penny counts then.

What you typically see with Intel is that the mass market is so important to them (read: the 1 gigabit Ethernet market right now) that all their other products suffer, as they will always, of course, give their mass-market products priority. Itanium is a good example; it was always process generations behind their main products. It never was given a fair chance to compete. So where they win with Sandy Bridge, because it's soon a process generation or two ahead of AMD, their other products suffer, as they don't get that process technology.

Meanwhile, low latency in Ethernet is totally crucial for the financial world, as they can make dozens of billions a year by being faster than others at the exchanges.

Now back to that mass market and the integration of a good and especially cheap 10 gigabit NIC onto Intel's mainboards: this buy might be pretty interesting to Intel. Yet that's a market so big it has nothing to do with HPC, I'd argue.
From an HPC viewpoint I wouldn't see this takeover as a threat to anyone in HPC; I guess it basically means Intel won't challenge for the crown in HPC, giving Mellanox a monopoly at FDR for a while. It's about Ethernet, I bet.

> [...]
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From landman at scalableinformatics.com Mon Jan 23 19:03:14 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 23 Jan 2012 19:03:14 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl>
Message-ID: <4F1DF542.6050504@scalableinformatics.com>

On 01/23/2012 06:24 PM, Vincent Diepeveen wrote:
> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote:

[...]

> Nanosecond latency on QPI, using two rings, versus something with
> latency up to a factor of 1000 slower, with PCIe as the slowest
> delaying factor.
>
> Doing cache coherency over that? Forget it.

Hear that, Shai F.? Stop work on vSMP now, 'cause Vincent says it can't work!!!

More seriously, with this acquisition I could see serious contention for ScaleMP. SoC-type stuff, using IB between many nodes, in smaller boxen.
> Allows them to produce maybe a 10 gigabit ethernet NIC dirt cheap ... which they have been doing for years ... > without needing to pay royalties to qlogic. ... not sure they were, but its possible Qlogic has 10GbE IP that Intel licenses, but this transaction was about ... Infiniband ... [...] > meanwhile ethernet is total crucial to have low latency for the > financial world, as they can make dozens of billions a year by being > faster > than others at exchanges. Errr ... given that this is one of our core markets, don't mind if I note that latency is critical to these players, so proximity to the exchange, and reliable and deterministic latency is absolutely critical. There are switches that are doing 300ns port to port in the Ethernet space now. With the NICs, you are looking in the 2-ish microsecond regime. These are not cheap. Compare this to QDR. 1 microsecond +/- some. Which has lower latency? There are many reasons why exchanges (mostly) aren't on IB. A few of them are even valid technical reasons. Historical momentum, and conservative approaches to new technology rank pretty high. So does the inability to generally export IB far and wide. And the complexity of the stack. Ethernet is (almost) plug and play. Its just a network. IB is sort of kind of plug, install OFED, and play for a while over IPoIB until you can recode for some of the RDMA bits. And don't try to run file systems and other things with lots of traffic over IPoIB. It leaks and gradually you will catch some cool ... surprises. Honestly, its a shame that IPoIB never really got the attention it deserved like the other elements of the IB stack did. Getting a rock solid IP implementation atop a fast/low latency net could have driven many design wins outside of HPC. And would have been a gateway drug^H^H^H^Htechnology for using the other stack elements. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From landman at scalableinformatics.com Mon Jan 23 19:06:43 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Mon, 23 Jan 2012 19:06:43 -0500 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <4F1DF542.6050504@scalableinformatics.com> References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> Message-ID: <4F1DF613.1060603@scalableinformatics.com> On 01/23/2012 07:03 PM, Joe Landman wrote: > Hear that Shai F? Stop work on vSMP now, cause Vincent says it can't > work!!! > There is an implicit /sarc tag here BTW. vSMP does a wonderful job (where Vincent claims that things won't work ... they do work, and very well at that). > More seriously, with this acquisition, I could see serious contention > for ScaleMP. SoC type stuff, using IB between many nodes, in smaller boxen. Serious contention to buy ScaleMP (as in potential acquirers) Must be getting too much blood in the coffee stream. Can't communicate ... 
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From atp at piskorski.com Mon Jan 23 19:30:30 2012
From: atp at piskorski.com (Andrew Piskorski)
Date: Mon, 23 Jan 2012 19:30:30 -0500
Subject: [Beowulf] CPU Startup Combines CPU+DRAM--And A Whole Bunch Of Crazy
In-Reply-To: References: Message-ID: <20120124003030.GA80957@piskorski.com>

On Mon, Jan 23, 2012 at 03:50:09PM -0500, Rayson Ho wrote:

> http://iram.cs.berkeley.edu/
>
> So 15 years later someone suddenly thinks that it is a good idea to
> ship IRAM systems to real customers?? :-D

Sure. But from when I last read about the IRAM stuff, I'm pretty sure it was strictly single-core. Their VIRAM1 chip had 13 MB of DRAM, 1 CPU core, and 4 "vector lanes", with no mention of SMP or any sort of multi-chip parallelism at all.

If Venray has a good design for using hundreds or more IRAM-like chips in a parallel machine, that sounds like a significant step forward. (The intended fab process and attendant design rules might also be quite different, although I'm not at all sure about that.)

--
Andrew Piskorski
http://www.piskorski.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Mon Jan 23 19:40:13 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 01:40:13 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F1DF542.6050504@scalableinformatics.com>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com>
Message-ID:

On Jan 24, 2012, at 1:03 AM, Joe Landman wrote:

> On 01/23/2012 06:24 PM, Vincent Diepeveen wrote:
>> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote:
>
> [...]
>
>> Nanosecond latency on QPI, using two rings, versus something with
>> latency up to a factor of 1000 slower, with PCIe as the slowest
>> delaying factor.
>>
>> Doing cache coherency over that? Forget it.
>
> Hear that, Shai F.? Stop work on vSMP now, 'cause Vincent says it
> can't work!!!
>
> More seriously, with this acquisition I could see serious contention
> for ScaleMP. SoC-type stuff, using IB between many nodes, in smaller
> boxen.

That would be some BlueGene-type machine you're speaking of, which Intel would produce with a low-power SoC. And at this point the BlueGene-type machines simply can't compete with the tiny processors that get produced by the dozens of millions.

"The tiny processors have won" - Linus Torvalds

Intel has itself a second law of Moore; you can Google it: every new generation of factory that can produce chips with double the number of transistors is itself 2x more expensive to build.
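That doubling compounds quickly. A two-line sketch; the baseline cost and the two-year cadence are assumptions picked purely for illustration, chosen so the curve lands near the projection Vincent cites next:

    # Fab cost doubling per process generation (Moore's "second law",
    # a.k.a. Rock's law). Baseline cost and 2-year cadence are assumed.
    cost_billion, year = 2.5, 2014
    while year <= 2020:
        print(f"{year}: ~${cost_billion:.1f}B per fab")
        cost_billion *= 2
        year += 2
    # -> 2014: $2.5B, 2016: $5B, 2018: $10B, 2020: $20B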
A few years ago Intel projected that by 2020 building a single factory would cost 20 billion dollars. Now Obama might contribute to this by overspending 40-50%, more than the overspending of Greece, Spain, the UK and Portugal combined. That will cause massive inflation, which will hurt the poor most, and it sure will help the second law of Moore become reality sooner rather than later. Yet moving from politics back to money and mass production: I hope you realize that a few HPC CPUs won't pay back 20 billion dollars. In short, only CPUs that get mass-produced can.

A good example of mass-produced processors is GPUs. If we look at the leading GPUs, which by now have thousands of cores, there is no way to compete with that using SoCs. What's the price of producing one GPU versus 200 SoCs with a small core each?

Furthermore, Intel so far never really could compete in the SoC world with the low-power CPUs that get produced by the billion a year, so betting on that would be a quite surprising, though not impossible, gamble.

Intel has always been good at low-latency designs, yet obviously further integration of logic into the CPU means you also need a capable Ethernet chip in your CPU. QLogic can provide that. Mass-produce half a billion of those, and it's cheaper to buy a company with such technology than to pay royalties.

Another HPC problem with the BlueGene-type designs: all those SoCs basically spread the computing power over a bigger area than one big power-eating chip does. A bigger area means a bigger distance over which to move massive amounts of data, and that is in itself a very expensive thing.

Seen overall, BlueGene machines never really had low power usage, despite some stupid professors shouting that. Per gflop it was never the performance king; they just compared against totally hopeless designs, and IBM usually delivered on time, something that is very important in HPC as well. IMHO the only reason BlueGene could be competitive is that it was fighting dinosaur-type HPC CPUs.

Now SoCs might be mighty interesting in the gamers' world and in telecom for building new phones, which makes it mighty interesting for Intel to produce them dirt cheap, and maybe even put a more capable Ethernet chip on them, again dirt cheap. As for the HPC world, I don't see it happen that this SoC can compete in any way with a GPU or even a CPU. Better to write some code in CUDA or OpenCL, I'd argue.

The latest AMD GPU, the Radeon HD 7970, is delivering 1 teraflop or so, double precision, with a two-GPU version soon coming on one card that's going to deliver close to 2 Tflop a card. Multiply by 4 for single precision: 8+ teraflops single precision, for a couple of hundred dollars. Nvidia will undoubtedly follow with their 1-teraflop GPU.

If you take a washing machine and pack it with cheapo SoCs, creating a 2 Tflop machine, do you guess you can SELL that for a couple of hundred dollars? The transport costs alone will already be more than a single GPU card...

Intel cannot compete with that in HPC for the stuff that needs bandwidth and doesn't care about latency:
After all they already make cash on majority of supercomputers as each node also usually has 2 Xeon cpu's which go for a multiple of the price of the GPU that's in the box... > >>> In any case, by acquiring their IP it is a step forward towards SoC >>> (System on >>> Chip). A preliminary step (building block) for the Exascale >>> strategy and for >>> low cost enterprise/cloud solutions. > > Yes. > >> Not with intel. Intel sells fast equipment yet it has a huge price >> always, >> about the opposite of infiniband which is a dirt cheap technology. > > Must use Shakespeare for this takedown: Methinks thou dost protesteth > too much ... > >> >> I guess we must see this much simpler. At such a giant as intel, >> paying a bit over 100 million is peanuts. >> Probably less than what they would need to pay for royalties to a >> manufacturer owning a bunch of patents >> in the ethernet NIC area; the HPC intel gets 'for free'. > > So ... exactly what are the existing intel 10GbE NIC's then ... Swiss > Cheese? I see a fair number of vendors licensing Intel's IP, or, more > to the point, using Intel silicon (hint: this might be a good > reason for > the acquisition) to build their stuff... > >> Allows them to produce maybe a 10 gigabit ethernet NIC dirt cheap > > ... which they have been doing for years ... > >> without needing to pay royalties to qlogic. > > ... not sure they were, but its possible Qlogic has 10GbE IP that > Intel > licenses, but this transaction was about ... Infiniband ... > > [...] > >> meanwhile ethernet is total crucial to have low latency for the >> financial world, as they can make dozens of billions a year by being >> faster >> than others at exchanges. > > Errr ... given that this is one of our core markets, don't mind if I > note that latency is critical to these players, so proximity to the > exchange, and reliable and deterministic latency is absolutely > critical. > There are switches that are doing 300ns port to port in the Ethernet > space now. With the NICs, you are looking in the 2-ish microsecond > regime. These are not cheap. > > Compare this to QDR. 1 microsecond +/- some. > > Which has lower latency? > > There are many reasons why exchanges (mostly) aren't on IB. A few of > them are even valid technical reasons. Historical momentum, and > conservative approaches to new technology rank pretty high. So > does the > inability to generally export IB far and wide. And the complexity of > the stack. Ethernet is (almost) plug and play. Its just a network. > > IB is sort of kind of plug, install OFED, and play for a while over > IPoIB until you can recode for some of the RDMA bits. And don't > try to > run file systems and other things with lots of traffic over IPoIB. It > leaks and gradually you will catch some cool ... surprises. > > Honestly, its a shame that IPoIB never really got the attention it > deserved like the other elements of the IB stack did. Getting a rock > solid IP implementation atop a fast/low latency net could have driven > many design wins outside of HPC. And would have been a gateway > drug^H^H^H^Htechnology for using the other stack elements. > > > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics Inc. 
> email: landman at scalableinformatics.com > web : http://scalableinformatics.com > http://scalableinformatics.com/sicluster > phone: +1 734 786 8423 x121 > fax : +1 866 888 3112 > cell : +1 734 612 4615 > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From samuel at unimelb.edu.au Mon Jan 23 19:51:59 2012 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Tue, 24 Jan 2012 11:51:59 +1100 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> Message-ID: <4F1E00AF.4090206@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 24/01/12 11:40, Vincent Diepeveen wrote: > Overall seen bluegene machines never really had a low power usage, > despite some stupid professors shouting that. So that's why the top 5 places on the last Green500 are all BlueGene.. - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk8eAK8ACgkQO2KABBYQAh+nIwCdH88tISGrx772Sq/57XquLFRb GtcAni1urHGd2j+MIJA0LXG2sGk+YymR =tfjM -----END PGP SIGNATURE----- _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Mon Jan 23 20:00:43 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Tue, 24 Jan 2012 02:00:43 +0100 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <4F1E00AF.4090206@unimelb.edu.au> References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E00AF.4090206@unimelb.edu.au> Message-ID: <7C608C57-D51B-4369-A973-6943E2D2DB7C@xs4all.nl> On Jan 24, 2012, at 1:51 AM, Christopher Samuel wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 24/01/12 11:40, Vincent Diepeveen wrote: > >> Overall seen bluegene machines never really had a low power usage, >> despite some stupid professors shouting that. > > So that's why the top 5 places on the last Green500 are all BlueGene.. > I wondered about that as well. When i see 1 gpu get nearly 1 teraflop eating probably a tad more power than official, say a 250 watt it'll consume. I already use more power now than the specs in fact. Yet even then that's 4 gflop per watt. Last time i calculated bluegene, sure that's probably the previous generation, it was 3 watts per gflop, or factor 12 more power than a Radon HD 7970. 
Please note that in the statements of most HPC centers claiming BlueGene to be energy efficient, they usually do not release numbers.

But now the important question: what's the price of BlueGene per teraflop? Let's have a look: it's around 500 euro or so for a Radeon HD 7970 card.

Vincent

> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From samuel at unimelb.edu.au Mon Jan 23 20:06:41 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Tue, 24 Jan 2012 12:06:41 +1100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <7C608C57-D51B-4369-A973-6943E2D2DB7C@xs4all.nl>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E00AF.4090206@unimelb.edu.au> <7C608C57-D51B-4369-A973-6943E2D2DB7C@xs4all.nl>
Message-ID: <4F1E0421.80009@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 24/01/12 12:00, Vincent Diepeveen wrote:

> But now the important question: what's the price of BlueGene per teraflop?
>
> Let's have a look: it's around 500 euro or so for a Radeon HD 7970 card.

What does that matter if you can't power or cool a similar performance GPU system? Let alone have any applications that will actually take advantage of it.

cheers,
Chris

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8eBCEACgkQO2KABBYQAh839wCdFz1MjiPGCKwvbKpANCmJZpnU
V4UAoJYIfKNf6VleNi0SduPcBtSkqxQq
=E7Rh
-----END PGP SIGNATURE-----
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
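Both sides of this exchange can be put on the same sheet of paper. A back-of-envelope sketch, where the electricity price, PUE, and lifetime are illustrative assumptions rather than figures from either poster:

    card_eur = 500.0     # claimed price of a ~1 DP TFLOP Radeon HD 7970
    card_watts = 250.0   # assumed draw under load
    years = 3            # assumed service life
    eur_per_kwh = 0.15   # assumed electricity tariff
    pue = 1.8            # assumed datacenter cooling/distribution overhead

    energy_eur = card_watts / 1000 * 24 * 365 * years * eur_per_kwh * pue
    print(f"card: {card_eur:.0f} EUR, 3y power+cooling: ~{energy_eur:.0f} EUR")

Under these assumptions the electricity (~1770 EUR) dwarfs the ~500 EUR card, which is exactly why the power-and-cooling objection matters more than the sticker price.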
From deadline at eadline.org Mon Jan 23 20:07:58 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Mon, 23 Jan 2012 20:07:58 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <4F1DD523.4020005@ias.edu>
References: <20120123192826.GB17383@bx9.net> <4F1DD523.4020005@ias.edu>
Message-ID: <39e1ffabf2e6448ea3b65da4e34b712a.squirrel@mail.eadline.org>

> On 01/23/2012 04:19 PM, Mark Hahn wrote:
>>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>> wonder what Intel's thinking - could do some very interesting stuff, but it would take a bit of charisma. QPI-over-IB anyone?
>
> That's what I'm thinking!

Numascale does this already with SCI

--
Doug

> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From deadline at eadline.org Mon Jan 23 20:15:30 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Mon, 23 Jan 2012 20:15:30 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References: <20120123192826.GB17383@bx9.net>
Message-ID: <2d90512c0be6a3eba887e5f6ab96b3c1.squirrel@mail.eadline.org>

>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>
> wonder what Intel's thinking - could do some very interesting stuff, but it would take a bit of charisma. QPI-over-IB anyone?

There were some exascale goals mentioned. I wonder if there are plans for a MIC-based exascale beast.

--
Doug

> I'm not crazy about Intel being a vertically-integrated HPC supplier (chips, systems, interconnect, MPI, compilers - I guess they still don't have their own scheduler or sexy cloud branding ;)
>
> the world is a better place when each level has internal competition based on useful, open (free), multi-implementation standards.

> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From ellis at cse.psu.edu Mon Jan 23 20:19:08 2012
From: ellis at cse.psu.edu (Ellis H. Wilson III)
Date: Mon, 23 Jan 2012 20:19:08 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com>
Message-ID: <4F1E070C.4040107@cse.psu.edu>

On 01/23/2012 07:40 PM, Vincent Diepeveen wrote:
>>>> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote:
>>>> Nanosecond latency of QPI using 2 rings, versus something that has a latency up to a factor 1000 slower, with the PCIe as the slowest delaying factor.
>>>>
>>>> Doing cache coherency over that, forget it.
>>>
>>> Hear that Shai F? Stop work on vSMP now, cause Vincent says it can't work!!!
>>>
>>> More seriously, with this acquisition, I could see serious contention for ScaleMP. SoC type stuff, using IB between many nodes, in smaller boxen.
>>
>> That would be some BlueGene type machine you speak about that Intel would produce with a low power SoC.
>>
>> This is where, at this point, the bluegene type machines simply can't compete with the tiny processors that get produced by the dozens of millions.

For...chess? ;D

>> "The tiny processors have won"
>> Linus Thorvalds

*Torvalds, and if Linux (or any well-supported kernel/OS for that matter) currently had data structures designed for extremely high parallelism on a single MoBo (i.e. 100s to 10,000s of cores) then I would agree with this statement. As I currently see it, all we can really say is that someday, probably, perhaps even hopefully:

"The tiny processors will win."

That's after we work out all the nasty nuances involved with designing new data structures for OSes that can handle that number of cores, and probably design new applications that can use these new OS features. And no, GPU support in Linux doesn't count as this already having been done. We just farm out very specific code to run on those things. If somebody has an example of a full-blown, usable OS running on a GPU ALONE, I would stand (very interestingly) corrected.

>> Intel has themselves a second law of Moore. You can google for it.

Thanks, for a moment there, I almost used AskJeeves.

>> A good example of mass-produced processors are GPUs.

Was waiting for the hook. Inevitable really. I think if we were discussing the efficacy and quality of resultant bread from various bread machines versus the numerous methods for making bread by hand, somehow, someway, a GPU would make better bread. Might be a wholesome cyber-loaf of artisan wheat, but nonetheless, it would be better in every way.

Best,

ellis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Mon Jan 23 20:44:10 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 02:44:10 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F1E070C.4040107@cse.psu.edu>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu>
Message-ID:

In hardware you cannot beat manycore performance with big CPUs at the same cost; CPUs have an exponential cost structure, for example to maintain cache coherency.

This has many implications, for example on size and scale. If you produce a 1000 mm^2 CPU it is extremely expensive with really low yields, whereas a 1000 mm^2 manycore is not a problem at all; cores that do not work you can just turn off. There is no coherency. So if you produce bigger CPUs, the price per square millimeter goes up, while with manycores it scales near linearly.
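A toy defect-yield model makes that claim concrete. A sketch only: the defect density is an illustrative assumption, not a foundry number. The point is that a monolithic die must be defect-free, while a manycore die just fuses off bad cores:

    import math

    defects_per_mm2 = 0.002   # assumed defect density (illustrative)
    area_mm2 = 1000.0
    cores = 100               # manycore: assume 100 tiles of 10 mm^2

    # Poisson model: P(region has zero defects) = exp(-D * A)
    monolithic = math.exp(-defects_per_mm2 * area_mm2)
    per_core_ok = math.exp(-defects_per_mm2 * area_mm2 / cores)

    print(f"1000 mm^2 monolithic die yield: {monolithic:.0%}")        # ~14%
    print(f"expected good cores per manycore die: {per_core_ok:.0%}") # ~98%

Under these assumptions only ~1 in 7 monolithic dies is sellable, while nearly every manycore die ships with almost all cores working.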
If I remember well, in 2007 an NCSA director already had the implication of this reality in his sheets, assuming that by 2010 NCSA would build supercomputers exclusively using manycores.

Note that manycores are not ideal for chess; they are, however, usable for the majority of the system time that gets burned in HPC, as the majority of HPC needs throughput rather than latency. Comparing bluegene machines with GPUs makes perfect sense of course, as the latency on them is also totally crap.

I see the bluegene system as a genius move by IBM, starting an evolution away from huge expensive CPUs, where you produce just a handful in a totally outdated process technology, with extremely bad yields, and a million dollars of startup costs, which by now would be approaching 20 million dollars at today's factories just to print a single batch of processors. IBM, developing POWER8, will have a serious problem with newer generation factories. Every batch they print, every mistake it has: DANG, 20 million dollars gone.

This concept of using simple CPUs, yet not that massively produced, has obviously evolved now into the GPU, which is one mass-produced cheap chip that integrates all those tiny cores into one CPU, which is way cheaper.

What's the price of a bluegene system per teraflop?

It's 500 euro for a 1 teraflop double precision Radeon HD 7970...

On Jan 24, 2012, at 2:19 AM, Ellis H. Wilson III wrote:
> [...]
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Mon Jan 23 20:55:41 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 02:55:41 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <39e1ffabf2e6448ea3b65da4e34b712a.squirrel@mail.eadline.org>
References: <20120123192826.GB17383@bx9.net> <4F1DD523.4020005@ias.edu> <39e1ffabf2e6448ea3b65da4e34b712a.squirrel@mail.eadline.org>
Message-ID: <534AD42D-DC33-4199-B476-9ADED3E09073@xs4all.nl>

On Jan 24, 2012, at 2:07 AM, Douglas Eadline wrote:

>>> On 01/23/2012 04:19 PM, Mark Hahn wrote:
>>>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>>> wonder what Intel's thinking - could do some very interesting stuff, but it would take a bit of charisma. QPI-over-IB anyone?
>>
>> That's what I'm thinking!
>
> Numascale does this already with SCI

They sold 300 systems, is the claim on their homepage. Not exactly what Intel aims for.

I bet they instead aim to sell half a billion CPUs with built-in Ethernet; let's face it, their NICs started to get outdated. For HPC it won't be a slamming success, let alone give you any performance. After all, what's the price of 1000 SoCs with 1000 tiny CPUs on them, that together produce 1 teraflop, versus 1 manycore that produces 1 teraflop?

This is not what you buy QLogic for. Maybe it was just a cheap buy for the number of patents they possess, and for the big need within Intel for engineers who can improve their CPUs with connectivity that the average user will like. As for HPC: with those engineers moved within Intel to the areas where Intel can make the most cash, which is CPUs and not HPC hardware, it seems Mellanox gets a monopoly on HPC network performance.

> --
> Doug
> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From lindahl at pbm.com Mon Jan 23 23:55:41 2012
From: lindahl at pbm.com (Greg Lindahl)
Date: Mon, 23 Jan 2012 20:55:41 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <20120123192826.GB17383@bx9.net>
References: <20120123192826.GB17383@bx9.net>
Message-ID: <20120124045541.GB10196@bx9.net>

On Mon, Jan 23, 2012 at 11:28:26AM -0800, Greg Lindahl wrote:

> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html

I figured out the main why:

http://seekingalpha.com/news-article/2082171-qlogic-gains-market-share-in-both-fibre-channel-and-10gb-ethernet-adapter-markets

> Server-class 10Gb Ethernet Adapter and LOM revenues have recently surpassed $100 million per quarter, and are on track for about fifty percent annual growth, according to Crehan Research.

That's the whole market, and QLogic says they are #1 in the FCoE adapter segment of this market, and #2 in the overall 10 gig adapter market (see http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-f2q12-results-earnings-call-transcript).

Historically, QLogic had a fibre channel adapter business that was a huge cash cow, and they bought their way into various markets with limited success: iSCSI, fibre channel switches, and yes, InfiniBand, where QLogic managed to get some large sales (TriLabs 3 PF procurement) yet was at only 15%-20% market share.

I'm surprised that QLogic could succeed in 10GigE adapters given all the competition, but hey, I never understood why fibre channel was popular, either. Now that QLogic has found what the next best thing after fibre channel adapters is, they might as well concentrate on it.

It'll be interesting to see what Intel plans to do in the exascale market. I've thought for a long time that non-cache-coherent processors like MIC ought to have InfiniPath-like hardware queues for sending and receiving short messages efficiently, even on-chip. Not to mention that whole exascale thing.

-- greg

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From scrusan at ur.rochester.edu Tue Jan 24 00:02:26 2012
From: scrusan at ur.rochester.edu (Steve Crusan)
Date: Tue, 24 Jan 2012 00:02:26 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jan 23, 2012, at 8:44 PM, Vincent Diepeveen wrote:
>
> It's 500 euro for a 1 teraflop double precision Radeon HD 7970...
Great, and nothing runs on it.

GPUs are insanely useful for certain tasks, but they aren't going to be able to handle most normal workloads (similar to the BG class, of course). Any center that buys BGP (or Q at this point) gear is going to pay for a scientific programmer to adapt their code to take advantage of the BG's strength: parallelism.

But it's nice that supercomputing centers use GPUs to boost their flops numbers. Any word on that Chinese system's efficiency?

If you look at the architecture of the new K computer in Japan, it's similar to the BlueGene line.

PS: I'm really not an IBMer.

> [...]

----------------------
Steve Crusan
System Administrator
Center for Research Computing
University of Rochester
https://www.crc.rochester.edu/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.17 (Darwin)
Comment: GPGTools - http://gpgtools.org

iQEcBAEBAgAGBQJPHjtzAAoJENS19LGOpgqKUHUH/Rvn6tXy8Kla86JNbNwt3KUJ
B+70SwJL/aBDstcDG4ChT5uW0WCcuvS7qRx5e1Zwu68m7qFEZRvIwc0uu0bgHbxt
KRynFRZ6suwudEp0o4HMpCBYNaC7uG7xkUeFbUHKfnfCflWDoz4Y9Fq3a/OhoriK
a5JrQqjVI6HZij+xDqrFvyn80Ec8eSwfRYd8lxfq4abHtE1tKYm/cF5I5Bn2lD5l
wVNvBQiU99ZPeqhcbL5XyvIsceB6ncodJ9zmBxIahrNIogMCq7UJbUhsikSRp6Dd
cL7r0AekTyiRmvZaHZZKbuad68DfATT4hy9/HzodBqTWLxxTMlrW8vNH9a7dSOo=
=oA7r
-----END PGP SIGNATURE-----
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From samuel at unimelb.edu.au Tue Jan 24 00:09:57 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Tue, 24 Jan 2012 16:09:57 +1100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu>
Message-ID: <4F1E3D25.7000008@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 24/01/12 16:02, Steve Crusan wrote:

> Any center that buys BGP (or Q at this point) gear is going to pay for a scientific programmer to adapt their code to take advantage of the BG's strength: parallelism.

The advantage of the BG platform though is that it's just MPI and threads, nothing that unusual at all - certainly no need to learn CUDA, OpenCL, etc.

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8ePSUACgkQO2KABBYQAh+hPQCggfFgdr9R9G6H7hW0Dk1/sGK+
Fe8Aniu7M6CEThw0s7F2CtqTCmuNZMRg
=mH9r
-----END PGP SIGNATURE-----
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
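For a picture of what "just MPI and threads" looks like in practice, here is a minimal sketch; it assumes an MPI stack plus the mpi4py bindings are installed, and no accelerator language is involved:

    from mpi4py import MPI  # assumes MPI + mpi4py are available

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Each rank sums its own slice; the reduction gathers the
    # partial sums on rank 0.
    local = sum(range(rank, 10_000_000, size))
    total = comm.reduce(local, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"sum = {total}")

Launched with something like "mpirun -n 4 python sum.py", the same programming model carries from a desktop cluster to a BG-class partition (with C or Fortran instead of Python on the real machine), which is the point being made above.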
From hahn at mcmaster.ca Tue Jan 24 00:32:08 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Tue, 24 Jan 2012 00:32:08 -0500 (EST)
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <39e1ffabf2e6448ea3b65da4e34b712a.squirrel@mail.eadline.org>
References: <20120123192826.GB17383@bx9.net> <4F1DD523.4020005@ias.edu> <39e1ffabf2e6448ea3b65da4e34b712a.squirrel@mail.eadline.org>
Message-ID:

>>> but it would take a bit of charisma. QPI-over-IB anyone?
>>
>> That's what I'm thinking!
>
> Numascale does this already with SCI

it's easy to source and build pretty big IB systems; how much so with SCI?

I actually like the idea of high-fanout distributed-router systems, but they seem perpetually exotic. Where are the hypercubes, FNNs? Afaict, commodification of IB has snuffed out topology as a design issue, except for Cray/BG/K machine-level projects.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Tue Jan 24 00:53:14 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 23 Jan 2012 21:53:14 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
Message-ID:

Inevitably, though, massively parallel interconnects (all boxes connected to all other boxes) won't scale.

On 1/23/12 9:32 PM, "Mark Hahn" wrote:
> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org Tue Jan 24 06:53:35 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 24 Jan 2012 12:53:35 +0100
Subject: [Beowulf] CPU Startup Combines CPU+DRAM - And A Whole Bunch Of Crazy
In-Reply-To: <20120124003030.GA80957@piskorski.com>
References: <20120124003030.GA80957@piskorski.com>
Message-ID: <20120124115335.GW7343@leitl.org>

On Mon, Jan 23, 2012 at 07:30:30PM -0500, Andrew Piskorski wrote:

> On Mon, Jan 23, 2012 at 03:50:09PM -0500, Rayson Ho wrote:
>
>> http://iram.cs.berkeley.edu/
>>
>> So 15 years later someone suddenly thinks that it is a good idea to ship IRAM systems to real customers?? :-D
>
> Sure. But from when I last read about the IRAM stuff, I'm pretty sure it was strictly single core. Their VIRAM1 chip had 13 MB of DRAM, 1 cpu core, and 4 "vector lanes", with no mention of SMP or any sort of multi-chip parallelism at all.
> If Venray has a good design for using hundreds or more IRAM-like chips in a parallel machine, that sounds like a significant step forward. (The intended fab process and attendant design rules might also be quite different, although I'm not at all sure about that.)

In order to make best use of eDRAM it's best to organize the CPU around the layout of the memory cells, treating it as an array. You'll need a refresh register, as wide as possible; multi-kbit word sizes; adds and shifts (which help the network processor); VLIW/SIMD; large-integer addition and subtraction; and so on. If you shrink the dies and use redundant connections to route around dead dies, you can have WSI with utilization rates of >90% of the real estate.

Even without FPUs, such a sea of nodes on a mesh maps very well to massively parallel physical problems, AI (spiking neurons), and such. Even as a particle swarm/game physics accelerator engine integrated into RAM it would really help with massively boosting game video and physics performance, with obvious applications in GPGPU as well.

This is not at all stupid, if only it weren't being pushed by apparent bozos.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From deadline at eadline.org Tue Jan 24 07:48:23 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Tue, 24 Jan 2012 07:48:23 -0500
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References:
Message-ID: <986a2d9cf54a1630130a3361fc25a547.squirrel@mail.eadline.org>

> Inevitably, though, massively parallel interconnects (all boxes connected to all other boxes) won't scale.

Indeed, when thinking about scale I always end up thinking about the masters of scale -- ants

--
Doug

> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Tue Jan 24 07:51:54 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 13:51:54 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu>
Message-ID:

On Jan 24, 2012, at 6:02 AM, Steve Crusan wrote:

>> It's 500 euro for a 1 teraflop double precision Radeon HD 7970...
>
> Great, and nothing runs on it.

You build a system costing millions of euros altogether, NCSA having a huge budget, and you can't even pay for a few programmers who write some crunching code for GPUs????

> GPUs are insanely useful for certain tasks, but they aren't going to be able to handle most normal workloads (similar to the BG class, of course). Any center that buys BGP (or Q at this point) gear is going to pay for a scientific programmer to adapt their code to take advantage of the BG's strength: parallelism.

bluegene is IBM's equivalent of an HPC GPU, just a lot more expensive a box.

> But it's nice that supercomputing centers use GPUs to boost their flops numbers. Any word on that Chinese system's efficiency?

Actually, on this mailing list, if you scroll back in history and look in 2007, some Chinese researchers posted that their codes (we speak of the 512-streamcore ATIs) were already reaching 50% IPC, and that they worked cross-platform on AMD and Nvidia. They got 25% efficiency at Nvidia.

Now if we realize that most codes on this planet can't use multiply-add, then 25% at Nvidia and 50% at ATI was really good. If we look at all sorts of applications, we see that if one good programmer puts in the effort, suddenly it works great on GPUs.

> If you look at the architecture of the new K computer in Japan, it's similar to the BlueGene line.
>
> PS: I'm really not an IBMer.

I took a look at the latest BlueGene/Q and basically it's 4 threads per core @ 18 cores @ 1.6 GHz or something they are gonna build. That's a much improved chip over the old bluegenes, which are 3 watts per gflop. Yet to my surprise, or maybe not, it's still not in the league of GPUs. The not yet built bluegene/q supercomputer claims 2 gflops per watt now. GPUs are at 4 gflops per watt now, and you can already buy them in a shop. And at least one Chinese researcher posted here in 2007 getting 2 gflops per watt out of one. What works efficiently on such IBM hardware should also be no problem to port to a GPU.

I see no money amounts quoted on what bluegene/q is gonna cost, yet we can be sure it's gonna cost you more than a GPU in the shops. So a chip not yet sold by IBM, if I may believe the wiki, especially designed for its purpose, can't compete with a GPU that's already in the shops and that has been designed for gamers. Realize that the GPU has been designed for single precision calculations and delivers 4x more single precision flops than double, and we are comparing double precision here. BG/Q is using a 45 nm process and the AMD 7970 is using 28 nm process technology, just to show my point.

> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Tue Jan 24 07:52:46 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 13:52:46 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F1E3D25.7000008@unimelb.edu.au>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu> <4F1E3D25.7000008@unimelb.edu.au>
Message-ID: <08826288-2842-4C6B-B16A-180E5CCCF9D1@xs4all.nl>

On Jan 24, 2012, at 6:09 AM, Christopher Samuel wrote:

> The advantage of the BG platform though is that it's just MPI and threads, nothing that unusual at all - certainly no need to learn CUDA, OpenCL, etc.

If you don't learn OpenCL, you're gonna run behind.

Vincent

> [...]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org Tue Jan 24 08:20:40 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 24 Jan 2012 14:20:40 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References:
Message-ID: <20120124132040.GC7343@leitl.org>

On Mon, Jan 23, 2012 at 09:53:14PM -0800, Lux, Jim (337C) wrote:

> Inevitably, though, massively parallel interconnects (all boxes connected to all other boxes) won't scale.

You can soup up a local 3d torus with a small network like connectivity. That keeps the node connectivity and number of wires still manageable.

Moreover, the universe does it with local connectivity (even quantum entanglement needs a relativistic channel to tell it from an RNG) just fine. A 3d grid/torus would be a good match for anything that can do long-range by iterating short-range interactions.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org Tue Jan 24 08:23:27 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 24 Jan 2012 14:23:27 +0100
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <20120124132040.GC7343@leitl.org>
References: <20120124132040.GC7343@leitl.org>
Message-ID: <20120124132327.GE7343@leitl.org>

On Tue, Jan 24, 2012 at 02:20:40PM +0100, Eugen Leitl wrote:

> You can soup up a local 3d torus with a small network

s/small network/small world network

> like connectivity. That keeps the node connectivity and number of wires still manageable.
>
> Moreover, the universe does it with local connectivity (even quantum entanglement needs a relativistic channel to tell it from an RNG) just fine. A 3d grid/torus would be a good match for anything that can do long-range by iterating short-range interactions.

--
Eugen* Leitl leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Tue Jan 24 11:21:54 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Tue, 24 Jan 2012 08:21:54 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To: <986a2d9cf54a1630130a3361fc25a547.squirrel@mail.eadline.org>
Message-ID:

On 1/24/12 4:48 AM, "Douglas Eadline" wrote:

> Indeed, when thinking about scale I always end up thinking about the masters of scale -- ants

Unfortunately, ants only run a small set of specialized codes, and are not the generalized computing resource that we're looking for (and, frankly, don't yet know how to effectively use, if it were to exist).
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eugen at leitl.org Tue Jan 24 11:24:31 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 24 Jan 2012 17:24:31 +0100
Subject: [Beowulf] MIT Genius Stuffs 100 Processors Into Single Chip
Message-ID: <20120124162431.GJ7343@leitl.org>

http://www.wired.com/wiredenterprise/2012/01/mit-genius-stu/

MIT Genius Stuffs 100 Processors Into Single Chip

By Eric Smalley, January 23, 2012, 6:30 am
Categories: Big Data, Tiny Chips, Data Centers, Hardware, Microprocessors, Servers, Spin-offs

Anant Agarwal is crazy. If you say otherwise, he's not doing his job. Photo: Wired.com/Eric Smalley

WESTBOROUGH, Massachusetts - Call Anant Agarwal's work crazy, and you've made him a happy man.

Agarwal directs the Massachusetts Institute of Technology's vaunted Computer Science and Artificial Intelligence Laboratory, or CSAIL. The lab is housed in the university's Stata Center, a Dr. Seussian hodgepodge of forms and angles that nicely reflects the unhindered-by-reality visionary research that goes on inside. Agarwal and his colleagues are figuring out how to build the computer chips of the future, looking a decade or two down the road. The aim is to do research that most people think is nuts.

"If people say you're not crazy," Agarwal tells Wired, "that means you're not thinking far out enough."

Agarwal has been at this a while, and periodically, when some of his pie-in-the-sky research becomes merely cutting-edge, he dons his serial entrepreneur hat and launches the technology into the world. His latest commercial venture is Tilera. The company's specialty is squeezing cores onto chips - lots of cores. A core is a processor, the part of a computer chip that runs software and crunches data. Today's high-end computer chips have as many as 16 cores. But Tilera's top-of-the-line chip has 100.

The idea is to make servers more efficient. If you pack lots of simple cores onto a single chip, you're not only saving power. You're shortening the distance between cores. Today, Tilera sells chips with 16, 32, and 64 cores, and it's scheduled to ship that 100-core monster later this year. Tilera provides these chips to Quanta, the huge Taiwanese original design manufacturer (ODM) that supplies servers to Facebook and, according to reports, Google. Quanta servers sold to the big web companies don't yet include Tilera chips, as far as anyone is admitting. But the chips are on some of the companies' radar screens.

Agarwal's outfit is part of an ever growing movement to reinvent the server for the internet age. Facebook and Google are now designing their own servers for their sweeping online operations. Startups such as SeaMicro are cramming hundreds of mobile processors into servers in an effort to save power in the web data center. And Tilera is tackling this same task from a different angle, cramming the processors into a single chip.

Tilera grew out of a DARPA- and NSF-funded MIT project called RAW, which produced a prototype 16-core chip in 2002. The key idea was to combine a processor with a communications switch. Agarwal calls this creation a tile, and he's able to build many of these tiles into a piece of silicon, creating what's known as a "mesh network."

"Before that you had the concept of a bunch of processors hanging off of a bus, and a bus tends to be a real bottleneck," Agarwal says. "With a mesh, every processor gets a switch and they all talk to each other... You can think of it as a peer-to-peer network."

What's more, Tilera made a critical improvement to the cache memory that's part of each core. Agarwal and company made the cache dynamic, so that every core has a consistent copy of the chip's data. This Dynamic Distributed Cache makes the cores act like a single chip so they can run standard software. The processors run the Linux operating system and programs written in C++, and a large chunk of Tilera's commercialization effort focused on programming tools, including compilers that let programmers recompile existing programs to run on Tilera processors.

The end result is a 64-core chip that handles more transactions and consumes less power than an equivalent batch of x86 chips. A 400-watt Tilera server can replace eight x86 servers that together draw 2,000 watts. Facebook's engineers have given the chip a thorough tire-kicking, and Tilera says it has a growing business selling its chips to networking and videoconferencing equipment makers. Tilera isn't naming names, but claims one of the top two videoconferencing companies and one of the top two firewall companies.

An Army of Wimps

There's a running debate in the server world over what are called wimpy nodes. Startups SeaMicro and Calxeda are carving out a niche for low-power servers based on processors originally built for cellphones and tablets. Carnegie Mellon professor Dave Andersen calls these chips "wimpy." The idea is that building servers with more but lower-power processors yields better performance for each watt of power. But some have downplayed the idea, pointing out that it only works for certain types of applications.

Tilera takes the position that wimpy cores are okay, but wimpy nodes - aka wimpy chips - are not. Keeping the individual cores wimpy is a plus because a wimpy core is low power. But if your cores are spread across hundreds of chips, Agarwal says, you run into problems: inter-chip communications are less efficient than on-chip communications. Tilera gets the best of both worlds by using wimpy cores but putting many cores on a chip. But it still has a ways to go. There's also a limit to how wimpy your cores can be.

Google's infrastructure guru, Urs Hölzle, published an influential paper on the subject in 2010. He argued that in most cases brawny cores beat wimpy cores. To be effective, he argued, wimpy cores need to be no less than half the power of higher-end x86 cores.

Tilera is boosting the performance of its cores. The company's most recent generation of data center server chips, released in June, are 64-bit processors that run at 1.2 to 1.5 GHz. The company also doubled DRAM speed and quadrupled the amount of cache per core. "It's clear that cores have to get beefier," Agarwal says.

The whole debate, however, is somewhat academic. "At the end of the day, the customer doesn't care whether you're a wimpy core or a big core," Agarwal says. "They care about performance, and they care about performance per watt, and they care about total cost of ownership, TCO."

Tilera's performance per watt claims were validated by a paper published by Facebook engineers in July. The paper compared Tilera's second generation 64-core processor to Intel's Xeon and AMD's Opteron high end server processors. Facebook put the processors through their paces on Memcached, a high-performance database memory system for web applications. According to the Facebook engineers, a tuned version of Memcached on the 64-core Tilera TILEPro64 yielded at least 67 percent higher throughput than low-power x86 servers. Taking power and node integration into account as well, a TILEPro64-based S2Q server with 8 processors handled at least three times as many transactions per second per watt as the x86-based servers.

Despite the glowing words, Facebook hasn't thrown its arms around Tilera. The stumbling block, cited in the paper, is the limited amount of memory the Tilera processors support. Thirty-two-bit cores can only address about 4GB of memory. "A 32-bit architecture is a nonstarter for the cloud space," Agarwal says. Tilera's 64-bit processors change the picture. These chips support as much as a terabyte of memory. Whether the improvement is enough to seal the deal with Facebook, Agarwal wouldn't say. "We have a good relationship," he says with a smile.

While Intel Lurks

Intel is also working on many-core chips, and it expects to ship a specialized 50-core processor, dubbed Knights Corner, in the next year or so as an accelerator for supercomputers. Unlike the Tilera processors, Knights Corner is optimized for floating point operations, which means it's designed to crunch the large numbers typical of high-performance computing applications.

In 2009, Intel announced an experimental 48-core processor code-named Rock Creek and officially labeled the Single-chip Cloud Computer (SCC). The chip giant has since backed off of some of the loftier claims it was making for many-core processors, and it focused its many-core efforts on high-performance computing. For now, Intel is sticking with the Xeon processor for high-end data center server products.

Dave Hill, who handles server product marketing for Intel, takes exception to the Facebook paper. "Really what they compared was a very optimized set of software running on Tilera versus the standard image that you get from the open source running on the x86 platforms," he says. The Facebook engineers ran over a hundred different permutations in terms of the number of cores allocated to the Linux stack, the networking stack and the Memcached stack, Hill says. "They really kinda fine tuned it. If you optimize the x86 version, then the paper probably would have been more apples to apples."

Tilera's roadmap calls for its next generation of processors, code-named Stratton, to be released in 2013. The product line will expand the number of processors in both directions, down to as few as four and up to as many as 200 cores. The company is going from a 40-nm to a 28-nm process, meaning they're able to cram more circuits in a given area. The chip will have improvements to interfaces, memory, I/O and instruction set, and will have more cache memory.

But Agarwal isn't stopping there. As Tilera churns out the 100-core chip, he's leading a new MIT effort dubbed the Angstrom project. It's one of four DARPA-funded efforts aimed at building exascale supercomputers. In short, it's aiming for a chip with 1,000 cores.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Tue Jan 24 13:13:17 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Tue, 24 Jan 2012 10:13:17 -0800
Subject: [Beowulf] balance between compute and communicate
Message-ID:

One of the lines in the article Eugen posted:

"There's also a limit to how wimpy your cores can be. Google's infrastructure guru, Urs Hölzle, published an influential paper on the subject in 2010. He argued that in most cases brawny cores beat wimpy cores. To be effective, he argued, wimpy cores need to be no less than half the power of higher-end x86 cores."

is interesting. I think the real issue is one of "system engineering": you want processor speed, memory size/bandwidth, and internode communication speed/bandwidth to be "balanced". Super duper 10 GHz cores with 1 kB of RAM interconnected with 9600 bps serial links are clearly an unbalanced system.

The paper is at http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36448.pdf

From the paper: "Typically, CPU power decreases by approximately O(k^2) when CPU frequency decreases by k."

Hmm.. this isn't necessarily true with modern designs. In the bad old days, when core voltages were high and switching losses dominated, yes, this was the case, but with modern designs the leakage losses are starting to be comparable to the switching losses.
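A toy model shows both the quadratic claim and the leakage caveat; the capacitance constant and the near-linear voltage-frequency relation below are illustrative assumptions, not figures from the paper:

    # Toy CPU power model: P = C * V^2 * f + static leakage.
    def power_watts(freq_ghz, leak_w, c=25.0):
        v = 0.6 + 0.2 * freq_ghz   # assumed near-linear V-f relation
        return c * v**2 * freq_ghz + leak_w

    for leak in (0.0, 20.0):
        p_hi = power_watts(3.0, leak)
        p_lo = power_watts(1.5, leak)
        print(f"leakage {leak:>4.0f} W: 3.0 GHz ~{p_hi:.0f} W, "
              f"1.5 GHz ~{p_lo:.0f} W, ratio {p_hi/p_lo:.1f}x")

With zero leakage, halving the frequency cuts power about 3.6x under these assumptions, in the neighborhood of the O(k^2) = 4x the paper quotes; adding a 20 W leakage floor drops the saving to roughly 2.5x, which is the point about modern designs.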
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From diep at xs4all.nl Tue Jan 24 07:51:54 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Tue, 24 Jan 2012 13:51:54 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu> Message-ID: On Jan 24, 2012, at 6:02 AM, Steve Crusan wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > > On Jan 23, 2012, at 8:44 PM, Vincent Diepeveen wrote: >> >> >> It's 500 euro for a 1 teraflop double precision Radeon HD7970... > > > Great, and nothing runs on it. You build a system of millions of euro's alltogether, NCSA having a huge budget and you can't even pay for a few programmers who write some crunching code for gpu's???? > GPUs are insanely useful for certain tasks, but they aren't going > to be able to handle most normal workloads(similar to the BG class > of course). Any center that buys BGP (or Q at this point) gear is > going to pay for a scientific programmer to adapt their code to > take advantage of the BG's strengths; parallelism. > bluegene is ibm's equivalent of a HPC gpu, just it's a lot more expensive such box. > But It's nice that supercomputing centers use GPUs to boost their > flops numbers. Any word on that Chinese system's efficiency? Actually on this mailing list if you scroll back in history, and look in 2007, some chinese researchers here posted their codes were, we speak of the 512 streamcore ATI's, already reaching 50% IPC, and it worked crossplatform at AMD and Nvidia. They got 25% efficiency at nvidia. Now if we realize that most codes on this planet can't use multiply- add, then 25% at nvidia and 50% at ATI was really good. If we look to all sorts of applications and see that if 1 good programmer is doing effort, suddenly it works great at gpu's. > If you look at the architecture of the new K computer in Japan, > it's similar to the BlueGene line. > > PS: I'm really not an IBMer. > I took a look at latest BlueGene/Q and basically it's 4 threads per core @ 18 core @ 1.6Ghz or something they are gonna build. that's a much improved chip over the old bluegenes which are 3 watt per gflop. Yet to my surprise, or maybe not, it's still not in the league of gpu's. the not yet built bluegene/q supercomputer claims 2 flops per watt now. GPU's are 4 flops per watt now and already you can buy it in a shop. And at least 1 chinese researcher posted here in 2007 to get 2 flops per watt out of it. What works on such ibm hardware efficient should also be no problem to port to a GPU. I see no money amounts quoted on what bluegene/q is gonna cost, yet we can be sure it's gonna cost you more than a gpu in the shops. So a chip not yet sold by ibm, if i may believe wiki, especially designed for its purpose, can't compete with a gpu, that's already in the shops, which has been designed for gamers. Realize that the gpu has been designed for single precision calculations and delivers 4x more single precision flops than double, and we are comparing it double precision here. BG/Q is using 45 nm processors and AMD7970 is using 28 nm proces technology, to just show my point. > > >> >> >> >> On Jan 24, 2012, at 2:19 AM, Ellis H. 
Wilson III wrote: >> >>> On 01/23/2012 07:40 PM, Vincent Diepeveen wrote: >>>>>> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote: >>>>>> Nanosecond latency of QPI using 2 rings versus something that >>>>>> has a >>>>>> latency up to factor 1000 slower >>>>>> with the pci-e as the slowest delaying factor. >>>>>> >>>>>> Doing cache coherency over that forget it. >>>>> >>>>> Hear that Shai F? Stop work on vSMP now, cause Vincent says it >>>>> can't >>>>> work!!! >>>>> >>>>> More seriously, with this acquisition, I could see serious >>>>> contention >>>>> for ScaleMP. SoC type stuff, using IB between many nodes, in >>>>> smaller boxen. >>>> >>>> That would be some BlueGene type machine you speak about that intel >>>> would produce with a low power SoC. >>>> >>>> This where at this point the bluegene type machines simply can't >>>> compete with the tiny processors >>>> that get produced by the dozens of millions. >>> >>> For...chess? ;D >>> >>>> "The tiny processors have won" >>>> Linus Thorvalds >>> >>> *Torvalds, and if Linux (or any well-supported kernel/OS for that >>> matter) currently had data structures designed for extremely high >>> parallelism on a single MoBo (i.e. 100s to 10,000s of cores) then I >>> would agree with this statement. As I currently see it, all we can >>> really say is that someday, probably, perhaps even hopefully: >>> >>> "The tiny processors will win." >>> >>> That's after we work out all the nasty nuances involved with >>> designing >>> new data structures for OSes that can handle that number of >>> cores, and >>> probably design new applications that can use these new OS features. >>> And no, GPU support in Linux doesn't count as this already having >>> been >>> done. We just farm out very specific code to run on those >>> things. If >>> somebody has an example of a full-blown, usable OS running on a GPU >>> ALONE, I would stand (very interestingly) corrected. >>> >>>> Intel has themselves a second law of Moore. You can google for it. >>> >>> Thanks, for a moment there, I almost used AskJeeves. >>> >>>> A good example of massproduced processors are gpu's. >>> >>> Was waiting for the hook. Inevitable really. I think if we were >>> discussing the efficacy and quality of resultant bread from various >>> bread machines versus the numerous methods for making bread by hand >>> somehow, someway, a GPU would make better bread. Might be a >>> wholesome >>> cyber-loaf of artisan wheat, but nonetheless, it would be better in >>> every way. 
>>> >>> Best, >>> >>> ellis >>> _______________________________________________ >>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >>> Computing >>> To change your subscription (digest mode or unsubscribe) visit >>> http://www.beowulf.org/mailman/listinfo/beowulf >> >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >> Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > > ---------------------- > Steve Crusan > System Administrator > Center for Research Computing > University of Rochester > https://www.crc.rochester.edu/ > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG/MacGPG2 v2.0.17 (Darwin) > Comment: GPGTools - http://gpgtools.org > > iQEcBAEBAgAGBQJPHjtzAAoJENS19LGOpgqKUHUH/Rvn6tXy8Kla86JNbNwt3KUJ > B+70SwJL/aBDstcDG4ChT5uW0WCcuvS7qRx5e1Zwu68m7qFEZRvIwc0uu0bgHbxt > KRynFRZ6suwudEp0o4HMpCBYNaC7uG7xkUeFbUHKfnfCflWDoz4Y9Fq3a/OhoriK > a5JrQqjVI6HZij+xDqrFvyn80Ec8eSwfRYd8lxfq4abHtE1tKYm/cF5I5Bn2lD5l > wVNvBQiU99ZPeqhcbL5XyvIsceB6ncodJ9zmBxIahrNIogMCq7UJbUhsikSRp6Dd > cL7r0AekTyiRmvZaHZZKbuad68DfATT4hy9/HzodBqTWLxxTMlrW8vNH9a7dSOo= > =oA7r > -----END PGP SIGNATURE----- > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Tue Jan 24 07:52:46 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Tue, 24 Jan 2012 13:52:46 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <4F1E3D25.7000008@unimelb.edu.au> References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu> <4F1E3D25.7000008@unimelb.edu.au> Message-ID: <08826288-2842-4C6B-B16A-180E5CCCF9D1@xs4all.nl> On Jan 24, 2012, at 6:09 AM, Christopher Samuel wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 24/01/12 16:02, Steve Crusan wrote: > >> Any center that buys BGP (or Q at this point) gear is >> going to pay for a scientific programmer to adapt their >> code to take advantage of the BG's strengths; parallelism. > > The advantage of the BG platform though is that it's just MPI and > threads, nothing that unusual at all - certainly no need to learn > CUDA, > OpenCL, etc.. > If you don't learn opencl, you're gonna run behind. 
Vincent > - -- > Christopher Samuel - Senior Systems Administrator > VLSCI - Victorian Life Sciences Computation Initiative > Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 > http://www.vlsci.unimelb.edu.au/ > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iEYEARECAAYFAk8ePSUACgkQO2KABBYQAh+hPQCggfFgdr9R9G6H7hW0Dk1/sGK+ > Fe8Aniu7M6CEThw0s7F2CtqTCmuNZMRg > =mH9r > -----END PGP SIGNATURE----- > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From eugen at leitl.org Tue Jan 24 08:20:40 2012 From: eugen at leitl.org (Eugen Leitl) Date: Tue, 24 Jan 2012 14:20:40 +0100 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: References: Message-ID: <20120124132040.GC7343@leitl.org> On Mon, Jan 23, 2012 at 09:53:14PM -0800, Lux, Jim (337C) wrote: > Inevitably, though, massively parallel interconnects (all boxes connected > to all other boxes) won't scale. You can soup up a local 3d torus with a small network like connectivity. That keeps the the node connectivity and number of wires still manageable. Moreover, the universe does it with local connectivity (even quantum entanglement needss a relativistic channel to tell it from RNG) just fine. A 3d grid/torus would be a good match for anything that can do long-range by iterating short-range interactions. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From eugen at leitl.org Tue Jan 24 08:23:27 2012 From: eugen at leitl.org (Eugen Leitl) Date: Tue, 24 Jan 2012 14:23:27 +0100 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <20120124132040.GC7343@leitl.org> References: <20120124132040.GC7343@leitl.org> Message-ID: <20120124132327.GE7343@leitl.org> On Tue, Jan 24, 2012 at 02:20:40PM +0100, Eugen Leitl wrote: > On Mon, Jan 23, 2012 at 09:53:14PM -0800, Lux, Jim (337C) wrote: > > Inevitably, though, massively parallel interconnects (all boxes connected > > to all other boxes) won't scale. > > You can soup up a local 3d torus with a small network s/small network/small world network > like connectivity. That keeps the the node connectivity > and number of wires still manageable. > > Moreover, the universe does it with local connectivity > (even quantum entanglement needss a relativistic channel > to tell it from RNG) just fine. A 3d grid/torus would > be a good match for anything that can do long-range > by iterating short-range interactions. 
-- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From james.p.lux at jpl.nasa.gov Tue Jan 24 11:21:54 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Tue, 24 Jan 2012 08:21:54 -0800 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <986a2d9cf54a1630130a3361fc25a547.squirrel@mail.eadline.org> Message-ID: On 1/24/12 4:48 AM, "Douglas Eadline" wrote: > >> Inevitably, though, massively parallel interconnects (all boxes >>connected >> to all other boxes) won't scale. >> >Indeed, when thinking about scale I always end up thinking about >the masters of scale -- ants > >-- Unfortunately, ants only run a small set of specialized codes, and are not the generalized computing resource that we're looking for (and, frankly, don't yet know how to effectively use, if it were to exist) _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From eugen at leitl.org Tue Jan 24 11:24:31 2012 From: eugen at leitl.org (Eugen Leitl) Date: Tue, 24 Jan 2012 17:24:31 +0100 Subject: [Beowulf] MIT Genius Stuffs 100 Processors Into Single Chip Message-ID: <20120124162431.GJ7343@leitl.org> http://www.wired.com/wiredenterprise/2012/01/mit-genius-stu/ MIT Genius Stuffs 100 Processors Into Single Chip By Eric Smalley January 23, 2012 | 6:30 am | Categories: Big Data, Tiny Chips, Data Centers, Hardware, Microprocessors, Servers, Spin-offs Anant Agarwal is crazy. If you say otherwise, he's not doing his job. Photo: Wired.com/Eric Smalley WESTBOROUGH, Massachusetts ? Call Anant Agarwal?s work crazy, and you?ve made him a happy man. Agarwal directs the Massachusetts Institute of Technology?s vaunted Computer Science and Artificial Intelligence Laboratory, or CSAIL. The lab is housed in the university?s Stata Center, a Dr. Seussian hodgepodge of forms and angles that nicely reflects the unhindered-by-reality visionary research that goes on inside. Agarwal and his colleagues are figuring out how to build the computer chips of the future, looking a decade or two down the road. The aim is to do research that most people think is nuts. ?If people say you?re not crazy,? Agarwal tells Wired, ?that means you?re not thinking far out enough.? Agarwal has been at this a while, and periodically, when some of his pie-in-the-sky research becomes merely cutting-edge, he dons his serial entrepreneur hat and launches the technology into the world. His latest commercial venture is Tilera. The company?s specialty is squeezing cores onto chips ? lots of cores. A core is a processor, the part of a computer chip that runs software and crunches data. Today?s high-end computer chips have as many as 16 cores. But Tilera?s top-of-the-line chip has 100. The idea is to make servers more efficient. 
If you pack lots of simple cores onto a single chip, you?re not only saving power. You?re shortening the distance between cores. Today, Tilera sells chips with 16, 32, and 64 cores, and it?s scheduled to ship that 100-core monster later this year. Tilera provides these chips to Quanta, the huge Taiwanese original design manufacturer (ODM) that supplies servers to Facebook and ? according to reports, Google. Quanta servers sold to the big web companies don?t yet include Tilera chips, as far as anyone is admitting. But the chips are on some of the companies? radar screens. Agarwal?s outfit is part of an ever growing movement to reinvent the server for the internet age. Facebook and Google are now designing their own servers for their sweeping online operations. Startups such as SeaMicro are cramming hundreds of mobile processors into servers in an effort to save power in the web data center. And Tilera is tackling this same task from different angle, cramming the processors into a single chip. Tilera grew out of a DARPA- and NSF-funded MIT project called RAW, which produced a prototype 16-core chip in 2002. The key idea was to combine a processor with a communications switch. Agarwal calls this creation a tile, and he?s able to build these many tiles into a piece of silicon, creating what?s known as a ?mesh network.? ?Before that you had the concept of a bunch of processors hanging off of a bus, and a bus tends to be a real bottleneck,? Agarwal says. ?With a mesh, every processor gets a switch and they all talk to each other?. You can think of it as a peer-to-peer network.? What?s more, Tilera made a critical improvement to the cache memory that?s part of each core. Agarwal and company made the cache dynamic, so that every core has a consistent copy of the chip?s data. This Dynamic Distributed Cache makes the cores act like a single chip so they can run standard software. The processors run the Linux operating system and programs written in C++, and a large chunk of Tilera?s commercialization effort focused on programming tools, including compilers that let programmers recompile existing programs to run on Tilera processors. The end result is a 64-core chip that handles more transactions and consumes less power than an equivalent batch of x86 chips. A 400-watt Tilera server can replace eight x86 servers that together draw 2,000 watts. Facebook?s engineers have given the chip a thorough tire-kicking, and Tilera says it has a growing business selling its chips to networking and videoconferencing equipment makers. Tilera isn?t naming names, but claims one of the top two videoconferencing companies and one of the top two firewall companies. An Army of Wimps There?s a running debate in the server world over what are called wimpy nodes. Startups SeaMicro and Calxeda are carving out a niche for low-power servers based on processors originally built for cellphones and tablets. Carnegie Mellon professor Dave Andersen calls these chips ?wimpy.? The idea is that building servers with more but lower-power processors yields better performance for each watt of power. But some have downplayed the idea, pointing out that it only works for certain types of applications. Tilera takes the position that wimpy cores are okay, but wimpy nodes ? aka wimpy chips ? are not. Keeping the individual cores wimpy is a plus because a wimpy core is low power. But if your cores are spread across hundreds of chips, Agarwal says, you run into problems: inter-chip communications are less efficient than on-chip communications. 
Tilera gets the best of both worlds by using wimpy cores but putting many cores on a chip. But it still has a ways to go. There?s also a limit to how wimpy your cores can be. Google?s infrastructure guru, Urs H?lzle, published an influential paper on the subject in 2010. He argued that in most cases brawny cores beat wimpy cores. To be effective, he argued, wimpy cores need to be no less than half the power of higher-end x86 cores. Tilera is boosting the performance of its cores. The company?s most recent generation of data center server chips, released in June, are 64-bit processors that run at 1.2 to 1.5 GHz. The company also doubled DRAM speed and quadrupled the amount of cache per core. ?It?s clear that cores have to get beefier,? Agarwal says. The whole debate, however, is somewhat academic. ?At the end of the day, the customer doesn?t care whether you?re a wimpy core or a big core,? Agarwal says. ?They care about performance, and they care about performance per watt, and they care about total cost of ownership, TCO.? Tilera?s performance per watt claims were validated by a paper published by Facebook engineers in July. The paper compared Tilera?s second generation 64-core processor to Intel?s Xeon and AMD?s Opteron high end server processors. Facebook put the processors through their paces on Memcached, a high-performance database memory system for web applications. According to the Facebook engineers, a tuned version of Memcached on the 64-core Tilera TILEPro64 yielded at least 67 percent higher throughput than low-power x86 servers. Taking power and node integration into account as well, a TILEPro64-based S2Q server with 8 processors handled at least three times as many transactions per second per Watt as the x86-based servers. Despite the glowing words, Facebook hasn?t thrown its arms around Tilera. The stumbling block, cited in the paper, is the limited amount of memory the Tilera processors support. Thirty-two-bit cores can only address about 4GB of memory. ?A 32-bit architecture is a nonstarter for the cloud space,? Agarwal says. Tilera?s 64-bit processors change the picture. These chips support as much as a terabyte of memory. Whether the improvement is enough to seal the deal with Facebook, Agarwal wouldn?t say. ?We have a good relationship,? he says with a smile. While Intel Lurks Intel is also working on many-core chips, and it expects to ship a specialized 50-core processor, dubbed Knights Corner, in the next year or so as an accelerator for supercomputers. Unlike the Tilera processors, Knights Corner is optimized for floating point operations, which means it?s designed to crunch the large numbers typical of high-performance computing applications. In 2009, Intel announced an experimental 48-core processor code-named Rock Creek and officially labeled the Single-chip Cloud Computer (SCC). The chip giant has since backed off of some of the loftier claims it was making for many-core processors, and it focused its many-core efforts on high-performance computing. For now, Intel is sticking with the Xeon processor for high-end data center server products. Dave Hill, who handles server product marketing for Intel, takes exception to the Facebook paper. ?Really what they compared was a very optimized set of software running on Tilera versus the standard image that you get from the open source running on the x86 platforms,? he says. 
The Facebook engineers ran over a hundred different permutations in terms of the number of cores allocated to the Linux stack, the networking stack and the Memcached stack, Hill says. ?They really kinda fine tuned it. If you optimize the x86 version, then the paper probably would have been more apples to apples.? Tilera?s roadmap calls for its next generation of processors, code-named Stratton, to be released in 2013. The product line will expand the number of processors in both directions, down to as few as four and up to as many as 200 cores. The company is going from a 40-nm to a 28-nm process, meaning they?re able to cram more circuits in a given area. The chip will have improvements to interfaces, memory, I/O and instruction set, and will have more cache memory. But Agarwal isn?t stopping there. As Tilera churns out the 100-core chip, he?s leading a new MIT effort dubbed the Angstrom project. It?s one of four DARPA-funded efforts aimed at building exascale supercomputers. In short, it?s aiming for a chip with 1,000 cores. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From james.p.lux at jpl.nasa.gov Tue Jan 24 13:13:17 2012 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Tue, 24 Jan 2012 10:13:17 -0800 Subject: [Beowulf] balance between compute and communicate Message-ID: One of the lines in the article Eugen posted: "There's also a limit to how wimpy your cores can be. Google's infrastructure guru, Urs H?lzle, published an influential paper on the subject in 2010. He argued that in most cases brawny cores beat wimpy cores. To be effective, he argued, wimpy cores need to be no less than half the power of higher-end x86 cores." Is interesting.. I think the real issue is one of "system engineering".. you want processor speed, memory size/bandwidth, and internode communication speed/bandwidth to be "balanced". Super duper 10GHz cores with 1k of RAM interconnected with 9600bps serial links is clearly an unbalanced system.. The paper is at http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36448.pdf >From the paper: Typically, CPU power decreases by approximately O(k2) when CPU frequency decreases by k, Hmm.. this isn't necessarily true, with modern designs. In the bad old days, when core voltages were high and switching losses dominated, yes, this is the case, but with modern designs, the leakage losses are starting to be comparable to the switching losses. But that's ok, because he never comes back to the power issue again, and heads off on Amdahl's law (which we 'wulfers all know) and the inevitable single thread bottleneck that exists at some point. However, I certainly agree with him when he says: Cost numbers used by wimpy-core evangelists always exclude software development costs. Unfortunately, wimpy-core systems can require applications to be explicitly parallelized or otherwise optimized for acceptable performance.... But, I don't go for Software development costs often dominate a company's overall technical expenses I don't know that software development costs dominate. 
But that's OK, because he never comes back to the power issue again, and heads off on Amdahl's law (which we 'wulfers all know) and the inevitable single-thread bottleneck that exists at some point.

However, I certainly agree with him when he says: "Cost numbers used by wimpy-core evangelists always exclude software development costs. Unfortunately, wimpy-core systems can require applications to be explicitly parallelized or otherwise optimized for acceptable performance...."

But I don't go for "Software development costs often dominate a company's overall technical expenses." I don't know that software development costs dominate. If you're building a million-computer data center (distributed geographically, perhaps), that's on the order of several billion dollars, and you can buy an awful lot of skilled developer time for a billion dollars. It might cost another billion to manage all of them, but that's still an awful lot of development. But maybe in his space, development time is more costly than the hardware purchase and operating costs.

He summarizes with "Once a chip's single-core performance lags by more than a factor of two or so behind the higher end of current-generation commodity processors, making....." which is essentially my system-engineering balancing argument, in the context of the expectation that the surrounding stuff is current generation.

So the real computer engineering question is: Is there some basic rule of thumb one can use to determine appropriate balance, given things like speeds, bandwidths, and power consumption? Could we, for instance, take moderately well understood implications and forecasts of future performance (e.g. Moore's law and its ilk) and predict what size machines with what performance would be reasonable in, say, 20 years? The scaling rules for CPUs, for memory, and for communications are fairly well understood. (Or maybe this is something that's covered in every lower-division computer engineering class these days? I confess I'm woefully ignorant of what they teach at various levels these days.)

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From diep at xs4all.nl Tue Jan 24 13:25:07 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 24 Jan 2012 19:25:07 +0100
Subject: [Beowulf] MIT Genius Stuffs 100 Processors Into Single Chip
In-Reply-To: <20120124162431.GJ7343@leitl.org>
References: <20120124162431.GJ7343@leitl.org>
Message-ID:

I remember the first announcement from Tilera some years ago. Several people emailed Tilera asking for more details; some just asked - like me - and others offered money to buy a CPU. They all got a 'no'.

But now that there are more details, the chip sounds less impressive. Let's analyze based upon the vague information on the homepage. There are also plenty of statements of the kind a marketing department would write: existing slogans reformulated into more political ones, leaving room to deny later on that any particular performance was promised. We know that trick all too well.

First of all, the homepage reports 23 watts, yet doesn't say whether that's idle or under full load. It just says 'active', which is a vague way of putting it. I assume that means a core that isn't idle yet isn't under 100% load either, so it eats only a portion of the maximum power. So it's probably 50 watts or so under full load.

Then it says 64 cores in a grid @ 700 MHz. 700 MHz sounds like a frequency you can actually get if you're a professional (if I built something myself, I'd count on it running at 300 MHz or so), so that doesn't seem like a weird claim. 64 cores x 0.7 GHz = 44.8 GHz aggregate. Yet at the same time the homepage claims 443 billion operations per second. What is an operation? Is that an internal iop? It says it's a 32-bit VLIW.
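Spelled out (a quick C sketch; the only inputs are the two homepage numbers, and the halving anticipates the double-counting point below):

    #include <stdio.h>

    int main(void)
    {
        double cores = 64.0, clock_ghz = 0.7;  /* homepage: 64 cores @ 700 MHz */
        double claimed_gops = 443.0;           /* homepage: 443 Gops/s */

        double aggregate = cores * clock_ghz;          /* 44.8 Gcycles/s */
        double ops_cycle = claimed_gops / aggregate;   /* ~9.9 ops/cycle/core */
        double halved    = ops_cycle / 2.0;            /* ~4.9, if FMA-style double counting */

        printf("aggregate clock  : %.1f Gcycles/s\n", aggregate);
        printf("claimed ops/cycle: %.1f per core\n", ops_cycle);
        printf("halved for FMA   : %.1f per core\n", halved);
        return 0;
    }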
So that would mean each core is processing about 10 integer operations per cycle. Now, we know all other manufacturers cheat by a factor of 2, double-counting when a single instruction does, for example, a fused multiply-add. So we can probably divide by 2 and get to about 220 Gops. A vector would then be 5 integers long, which seems like a weird measure; maybe they rounded it up a tad and in reality mean 4 integers, which sounds most reasonable. So then it's 64 cores in a grid executing vectors of 4 x 32-bit units. Sounds plausible. If we compare that with the GPUs that have been in our notebooks for a few years now, it's suddenly not so impressive.

Vincent

On Jan 24, 2012, at 5:24 PM, Eugen Leitl wrote:

>
> http://www.wired.com/wiredenterprise/2012/01/mit-genius-stu/

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From samuel at unimelb.edu.au Tue Jan 24 17:36:14 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Wed, 25 Jan 2012 09:36:14 +1100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu>
Message-ID: <4F1F325E.9010109@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 24/01/12 23:51, Vincent Diepeveen wrote:

> You build a system of millions of euros altogether, NCSA having a
> huge budget, and you can't even pay for a few programmers who
> write some crunching code for gpu's????

I was at a meeting at SC'06 where the folks from various large institutions in the US were bemoaning the fact that there was all this money for petaflop hardware available, but none for programmers or algorithm development to make apps scale out to the systems.

Just because the scientists say it's a good thing to have doesn't mean the US government funding people will listen to them.

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8fMl4ACgkQO2KABBYQAh95lwCfQodU25X1A0yngWOOwuAqmU2X
thAAoICeeMk8fwx33enCWQ/XGvatdsEc
=OFC+
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From prentice at ias.edu Wed Jan 25 17:01:48 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Wed, 25 Jan 2012 17:01:48 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu>
Message-ID: <4F207BCC.9010701@ias.edu>

On 01/24/2012 12:02 AM, Steve Crusan wrote:
>
> On Jan 23, 2012, at 8:44 PM, Vincent Diepeveen wrote:
>
>> It's 500 euro for a 1 teraflop double precision Radeon HD7970...
>
> Great, and nothing runs on it. GPUs are insanely useful for certain
> tasks, but they aren't going to be able to handle most normal
> workloads (similar to the BG class of course). Any center that buys
> BGP (or Q at this point) gear is going to pay for a scientific
> programmer to adapt their code to take advantage of the BG's
> strength: parallelism.
>
> But it's nice that supercomputing centers use GPUs to boost their
> flops numbers. Any word on that Chinese system's efficiency? If you
> look at the architecture of the new K computer in Japan, it's similar
> to the BlueGene line.

I attended a presentation at Princeton U. on Monday about the state of HPC in China. The talk was given by someone who has been to China and spoken with the leaders of their HPC efforts. While the Chinese systems get great scores on LINPACK, even the Chinese concede that on their "real" applications they are getting well below the theoretical max flops, because their codes aren't getting the most out of their systems. In other words, on real programs they aren't all that efficient (yet).

--
Prentice

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Wed Jan 25 19:46:57 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 26 Jan 2012 01:46:57 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F207BCC.9010701@ias.edu>
References: <708qawXBm8848S02.1327359732@web02.cms.usa.net> <3B284FD5-80E7-4384-9DBD-640D2F465E39@xs4all.nl> <4F1DF542.6050504@scalableinformatics.com> <4F1E070C.4040107@cse.psu.edu> <4F207BCC.9010701@ias.edu>
Message-ID: <76840233-6CA8-4B9E-BF66-4A1A93CD1F1F@xs4all.nl>

The supercomputing codes I saw run on processors were, to put it politely, losing it everywhere.

NASA, too, when porting from the Origin3800 to the 1.5 GHz Itanium2, publicly reported a speedup of factor 2 in the forums. However, my own chess program, not exactly optimized for Itanium2, got a boost of factor 4 moving from the 500 MHz R14000 (Origin3800) to a 1.3 GHz Itanium2. That was just a single compile, and it's an integer program, whereas the Itanium2 is a floating-point processor. The 1.5 GHz Itanium2 has 6 Gflops on paper, versus 1 Gflop on paper for the 500 MHz R14k.

Now, a Chinese reporter posted on THIS mailing list, the Beowulf mailing list, already some GPU generations ago, an IPC of 25% on Nvidia and 50% on AMD. On those same old GPUs, most student projects got around 25% on Nvidia; Volkov then went ahead, understood GPUs better, and scored 70% efficiency - again on very old GPUs. Since then they have really improved.
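Those percentages are just achieved-versus-peak ratios. In numbers (a sketch; the 1 Tflop peak is an assumed round number, only the percentages come from the text above):

    #include <stdio.h>

    int main(void)
    {
        double peak_gflops = 1000.0;   /* assumed round peak, for illustration */
        double eff[]        = { 0.25, 0.50, 0.70 };
        const char *label[] = { "typical student code, Nvidia",
                                "typical code, AMD",
                                "Volkov's tuned kernels" };

        for (int i = 0; i < 3; i++)
            printf("%-30s %3.0f%% of peak = %4.0f Gflops\n",
                   label[i], 100.0 * eff[i], peak_gflops * eff[i]);
        return 0;
    }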
See: http://www.cs.berkeley.edu/~volkov/

So you want to build a supercomputer that's now 10x more expensive, and with each generation lose more efficiency on newer hardware, whereas those who make the effort to write good new code get very high efficiency?

Just learn how to program and ignore the disinformation - if you have a box that fast, you really can get a lot of speed out of it. You shouldn't ask for a 1-billion-dollar box that runs your old-school Fortran codes only as well as a 5-million-dollar GPU box; look at what you can do by writing good codes for that manycore hardware. OpenCL works on everything; CUDA just on Nvidia.

Vincent

On Jan 25, 2012, at 11:01 PM, Prentice Bisbal wrote:

> I attended a presentation at Princeton U. on Monday about the state of
> HPC in China. [snip]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From james.p.lux at jpl.nasa.gov Thu Jan 26 00:04:31 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Wed, 25 Jan 2012 21:04:31 -0800
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F1F325E.9010109@unimelb.edu.au>
Message-ID:

On 1/24/12 2:36 PM, "Christopher Samuel" wrote:

> institutions in the US were bemoaning the fact that there was all this
> money for petaflop hardware available but none for programmers or
> algorithm development to make apps scale out to the systems.

That's partly because people are an expense, while hardware is an asset that sits on the balance sheet. If I fork out a million bucks for a computer, I now have an asset that is worth a million dollars. If I fork out a million dollars for 3 skilled developers for a year, at the end of the year it's not clear I'll possess an asset that I can sell for a million dollars.
Obviously, the work product must be worth something, because otherwise we wouldn't have jobs, but the connection is more tenuous.

The other thing (when government funding is considered) is that the million-dollar hardware purchase might turn into more jobs than the 3 software weenies, if only because "computer assemblers and deliverers" get paid a lot less; and when it comes to statistics, they don't look at "cumulative wages", they look at "number of people employed".

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Thu Jan 26 07:28:41 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Thu, 26 Jan 2012 13:28:41 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID:

Mike, you replied to me, not to the mailing list.

Note that the Itanium2 was released too late, and that it was $100k a box initially and $7,500 a CPU (1.5 GHz) if you ordered 1,000. And it had the same integer IPC as the Opteron of the time (later on, compilers got PGO for Opteron as well, and then Opteron was faster, at least for Diep, in IPC).

Larrabee indeed resembles Itanium to some extent, but not quite. Intel's expertise is producing high-clocked CPUs; the Itanium was a low-clocked CPU and therefore failed. No one pays big bucks for a low-clocked CPU - look on eBay, the cheapest CPUs are always the low-clocked ones. Larrabee is something in between a CPU and a GPU, so a totally different ballgame: Intel is moving to a market where it actually has competition and is not the one owning the patents. So that's not gonna be easy for Intel some years from now if they show up with a 100% vectorized design and not some dreadnought in between CPU and GPU which is low-clocked.

As for your Infiniband remark, realize that it took 25 years or so to bugfix Ethernet everywhere - forget 'setting a new standard' there for the average Joe. Not gonna work. Infiniband is meant for HPC and uses the MPI protocol to communicate. This is very powerful for clusters and the way to go when scaling on supercomputers, yet it's not gonna conquer the average Joe's machine, as there is a price to pay which is too high for now.

However, realize that some of the sales of the HPC manufacturers go to low-latency Ethernet - my guess is that Intel will use QLogic's know-how there to improve their cheapo CPUs and upgrade them with better Ethernet. That seems a plausible goal and a very useful one; the rest, such as rivalling Mellanox at Ethernet, is not gonna happen.

On Jan 26, 2012, at 7:23 AM, MDG wrote:

> Technically the Itanium chip was a failure: it was not 100% x86
> compatible and was actually meant for servers, but it often
> underperformed the traditional x86 chips. Intel let it quietly vanish
> as it came nowhere near the first advertised performance. It varied
> too far from the x86 architecture, requiring specially written code -
> much like the GPUs, though those actually are able to run some
> parallel processes, both under Windows and Linux.
> There is a difference: the M-series NVIDIA cards are more for servers,
> and the C series, such as the C2070 or C2075, for workstations. The M
> series also uses the same numbering sequence - I think they are up to
> the 2090 or 2095 by now - but you do need high-speed PCIe slots for
> both sets of cards. As for resale cards, I have talked to a few
> sellers; be careful, there are some knockoffs from mainland China - I
> verified this with NVIDIA.
>
> These GPUs are designed so that they are not seen as cores or CPUs.
> Also, most resales are pulled from, in one case, a pool of HP
> workstations and servers, yet the seller had no idea of the difference
> between the C2070 and the M2070s, and as I said none of them had the
> required software; most did not even know it was needed! Without it
> the GPUs do not function. So resales are a pretty expensive gamble, as
> the cards are untested, with no software to even try them with!
>
> The GPUs can be used if you write your own parallel code, usually in
> C++ per NVIDIA, but you still need the software to offload the work to
> them. If you are into heavy number crunching - assuming the problem
> allows parallel processing, versus the traditional linear method where
> A must always come before B and B before C - you will see far better
> results than with a typical program; in other things you will see
> little improvement. My talk with an NVIDIA technician confirmed this:
> you can get great results for creating, say, graphics, but very little
> improvement for displaying an already designed piece. The same goes
> for statistics, weather forecasting, geology - technically Intel has
> even used its own network as a massive HPC to help design chips, so
> add engineering - and beyond that, physics and nuclear explosion
> simulations, etc.
>
> Also, with fiber optics now coming down in price, the idea of multiple
> super-workstations and even super-servers - a client-server
> relationship where the server does most of the processing - will most
> likely grow into stable and usable systems before the average
> workstation does.
>
> It will help some with a statistics-driven database, but not that much
> for a pure relational database; it also works well with MATLAB and
> SPSS.
>
> Overall I would expect that the GPUs will soon have more code written
> for them as they become more plentiful in real-world applications.
> There is also open source code available and being further developed
> under Linux, which with Wine and WineX can run Windows software to
> some degree - not 100%, and as for Windows 7 I have not a clue whether
> it will run under Wine or WineX, though the Macintoshes now run
> Windows very well as a second operating system. Then again, I would
> like to have 4 twelve-core Xeons in my workstation, but that bill is
> far higher than a few 448-core GPU cards. As with any new technology,
> it starts at the high end and then, as it develops, works its way down
> the price chain - and I was shocked to see a twin six-core Xeon in a
> game machine! So things are moving faster than I anticipated.
>
> I know I am watching the GPU idea and cards carefully, as beyond just
> throwing more cores at the x86 architecture it seems to be moving far
> faster than when Intel started scaling upwards. Maybe you remember the
> hardware flaw in the first Pentiums, where simple math was processed
> incorrectly? Like all things, when you introduce new variables into a
> system, be it hardware or software, there are a lot of things that
> will not always work, or not work to the potential of the system.
> As I said, I am watching the GPUs closely, as so far they seem the
> most likely next breakthrough as software is written that can take
> advantage of their unique abilities. Also, from what I have read, they
> draw far less power than even the new generation of multi-core x86
> series. I am not an expert with these GPU systems, but they do hold
> great promise as a leap forward, rather than just adding x86 cores.
>
> The buying of the InfiniBand business shows that Intel is looking to
> move past the copper Ethernet systems, which surpassed ArcNet systems.
> The only constant is change; while technically not an Intel chip, this
> still shows Moore's law being leveraged on other platforms, including
> GPUs.
>
> Mike.
>
> --- On Wed, 1/25/12, Vincent Diepeveen wrote:
>
> From: Vincent Diepeveen
> Subject: Re: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
> To: "Prentice Bisbal"
> Cc: "Beowulf Mailing List"
> Date: Wednesday, January 25, 2012, 2:46 PM
>
> The supercomputing codes I saw run on processors were, to put it
> politely, losing it everywhere. [snip]
on Monday about the > state of > > HPC in China. The talk was given by someone who has been to > China and > > spoken with the leaders of their HPC efforts. While the Chinese > > systems > > get great scores on LINPACK, even the Chinese concede that on their > > "real" applications, they are getting well below the theoretical max > > flops, because their codes aren't getting the most out of their > > systems. > > In other words, on real programs, they aren't all that efficient > > (yet). > > > > -- > > Prentice > > > > > > > > _______________________________________________ > > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > > Computing > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Thu Jan 26 07:35:40 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Thu, 26 Jan 2012 13:35:40 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> Message-ID: On Jan 26, 2012, at 1:28 PM, Vincent Diepeveen wrote: > Mike you replied to me not to mailing list. > > note that itanium2 released too late and it was $100k a box > initially and $7500 a cpu (1.5Ghz) if you ordered a 1000. > And it had same IPC for integers like opteron at the time (later on > compilers got pgo for opteron as well and then opteron was faster, > at least for diep, in ipc). > > Larrabee indeed resembles itanium to some extend, but not quite. > intels expertise is producing highclocked cpu's. itanium was a low > clocked cpu and therefore failed. > no one pays big bucks for a low clocked cpu. look on ebay - > cheapest cpu's always the lowclocked ones. > > larrabee is something in between a cpu and a gpu so total other > ballgame - intel moving to a market where they actually have > competition > and are not the ones owning the patents. > > So that's not gonna be easy for intel some years from now if they > show up with a 100% vectorized design and not some dreadnought > in between cpu and gpu which is low clocked. > > As for your infiniband remark realize that it took 25 years or so > to bugfix ethernet everywhere - forget 'setting a new standard' > there for the average Joe. > Not gonna work. > > Infiniband is meant for HPC and uses MPI protocol to communicate. > This is very powerful for clusters and the way to go when scaling > at supercomputers, > yet it's not gonna conquer average joe's machine, as there is a > price to pay which is too high for now. > > However realize some of sales of the HPC manufacturers goes to low > latency ethernet - my guess is that intel will use qlogics know how > there to improve > their cheapo cpu's and upgrade them with better ethernet. Seems > plausible goal and a very useful one, the rest, such as rivalling > Mellanox at ethernet, > that's not gonna happen. 
Oops, small typo during a speedy write: "Mellanox at Ethernet" should of course be "Mellanox at HPC". The question is whether typical low-latency Ethernet products are gonna suffer from Intel's move. I doubt Solarflare will; they already deliver this stuff only to those who really battle for every picosecond, so price is just not the issue there.

Vincent

> On Jan 26, 2012, at 7:23 AM, MDG wrote:
>
>> Technically the Itanium chip was a failure [snip]
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From samuel at unimelb.edu.au Thu Jan 26 18:27:21 2012
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Fri, 27 Jan 2012 10:27:21 +1100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID: <4F21E159.7000905@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 26/01/12 23:28, Vincent Diepeveen wrote:

> Mike, you replied to me, not to the mailing list.

That was probably deliberate, and it is inconsiderate to post a reply publicly without checking with the writer that they are OK with that, especially as you quoted what they wrote - they may not have wanted that in the public domain.

- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8h4VkACgkQO2KABBYQAh9lJgCfQXwsmDG9l1v4Jt9vUr5YYCr0
fDYAoJdJBbUJBApO5ZOh200gZ5+Lo/vt
=mpU4
-----END PGP SIGNATURE-----

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From hahn at mcmaster.ca Thu Jan 26 20:48:55 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Thu, 26 Jan 2012 20:48:55 -0500 (EST)
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID:

> Larrabee indeed resembles Itanium to some extent, but not quite.

wow, that has to be your most loosely-tethered-to-reality statement yet!
it's true that Larrabee and Itanium are very close
in the number of letters in their name.

> Infiniband is meant for HPC and uses the MPI protocol to communicate.

no and no.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 01:04:17 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 27 Jan 2012 07:04:17 +0100
Subject: [Beowulf] Larrabee - Mark Hahn's personal attack
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID:

On Jan 27, 2012, at 2:48 AM, Mark Hahn wrote:

>> Larrabee indeed resembles Itanium to some extent, but not quite.
>
> wow, that has to be your most loosely-tethered-to-reality statement
> yet!
> it's true that Larrabee and Itanium are very close
> in the number of letters in their name.

Your personal attack seems to indicate you disagree with my assessment that the entire Larrabee line has any future in the long run.

Instead of throwing mud, would you mind explaining why Larrabee, an architecture far away from the mainstream, stands any chance of competing in HPC with the existing architectural concepts in the long run?
Vincent

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 01:06:07 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 27 Jan 2012 07:06:07 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F21E159.7000905@unimelb.edu.au>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au>
Message-ID: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl>

Why do you write this?

On Jan 27, 2012, at 12:27 AM, Christopher Samuel wrote:

> On 26/01/12 23:28, Vincent Diepeveen wrote:
>
>> Mike, you replied to me, not to the mailing list.
>
> That was probably deliberate, and it is inconsiderate to post a reply
> publicly without checking with the writer that they are OK with that,
> especially as you quoted what they wrote - they may not have wanted
> that in the public domain.
> [snip]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From hahn at mcmaster.ca Fri Jan 27 10:37:43 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Fri, 27 Jan 2012 10:37:43 -0500 (EST)
Subject: [Beowulf] Larrabee - Mark Hahn's personal attack
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID:

>>> Larrabee indeed resembles Itanium to some extent, but not quite.
>>
>> wow, that has to be your most loosely-tethered-to-reality statement
>> yet!
>> it's true that Larrabee and Itanium are very close
>> in the number of letters in their name.
>
> Your personal attack seems to indicate you disagree with my
> assessment that the entire Larrabee line has any future in the long
> run.

not surprisingly, no: I disagree that Larrabee and Itanium resemble
each other in any but really silly ways.

Itanium is a custom, VLIW architecture; Larrabee is an on-chip
cluster of non-VLIW, commodity x86_64 cores.
none of the distinctive features of Itanium (multi-instruction bundles,
dependency on compile-time scheduling, intended market, implementation,
success limited to predictable, high-bandwidth situations, directory-based
inter-node cache coherency) are anything close to the features of Larrabee
(standard x86_64 ISA, no special compiler needed, on-chip message-passing
network, suitable for complex/dynamic/unpredictable loads, possibly not even
cache-coherent across one chip.)

my guess is that you were thinking about how ia64 chips tended to
run at low clock rates, and thinking about how gpus (probably including
larrabee) also tend to be low-clocked.

> Instead of throwing mud, would you mind explaining why Larrabee,
> an architecture far away from the mainstream, stands any chance of
> competing in HPC with the existing architectural concepts in the
> long run?

as far as I know, larrabee will be a mesh of conventional x86_64 cores
that will run today's x86_64 code. I don't know whether Intel has stated
(or even decided) whether the cores will have full or partial cache
coherency, or whether they'll really be an MPI-like shared-nothing cluster.

if you want to compare Larrabee to Fermi or AMD GCN, that might be
interesting. or to mainstream multicore - like bulldozer, with 32c
per package vs larrabee with ">=50".

but not ia64. it's best we all just forget about it.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From hahn at mcmaster.ca Fri Jan 27 10:39:06 2012
From: hahn at mcmaster.ca (Mark Hahn)
Date: Fri, 27 Jan 2012 10:39:06 -0500 (EST)
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl>
Message-ID:

> Why do you write this?

because he thought you might be interested in improving your etiquette.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From landman at scalableinformatics.com Fri Jan 27 10:42:48 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Fri, 27 Jan 2012 10:42:48 -0500
Subject: [Beowulf] Larrabee - Mark Hahn's personal attack
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID: <4F22C5F8.6010804@scalableinformatics.com>

On 01/27/2012 10:37 AM, Mark Hahn wrote:
>>>> Larrabee indeed resembles Itanium to some extent, but not quite.
>>>
>>> wow, that has to be your most loosely-tethered-to-reality statement
>>> yet!
>>> it's true that Larrabee and Itanium are very close
>>> in the number of letters in their name.
>>
>> Your personal attack seems to indicate you disagree with my
>> assessment that the entire Larrabee line has any future in the long
>> run.
>
> not surprisingly, no: I disagree that Larrabee and Itanium resemble
> each other in any but really silly ways.
> Itanium is a custom, VLIW architecture; Larrabee is an on-chip
> cluster of non-VLIW, commodity x86_64 cores.

But ... but .... they are both made of Silicon .... doesn't that mean they are the same?

/sarc

(Sorry, it's been a fun week ... and this was just ... too ... irresistible ...)

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From prentice at ias.edu Fri Jan 27 11:06:00 2012
From: prentice at ias.edu (Prentice Bisbal)
Date: Fri, 27 Jan 2012 11:06:00 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl>
Message-ID: <4F22CB68.3080605@ias.edu>

Vincent,

He wrote that because he's trying to educate you on proper mailing list etiquette, which is something you appear to be lacking.

Chris is absolutely right - you should not reply to off-list e-mails on-list.

--
Prentice

On 01/27/2012 01:06 AM, Vincent Diepeveen wrote:
> Why do you write this?
>
> On Jan 27, 2012, at 12:27 AM, Christopher Samuel wrote:
>
> On 26/01/12 23:28, Vincent Diepeveen wrote:
>
>>> Mike, you replied to me, not to the mailing list.
>
> That was probably deliberate, and it is inconsiderate to post a reply
> publicly without checking with the writer that they are OK with that,
> especially as you quoted what they wrote - they may not have wanted
> that in the public domain.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 11:12:35 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 27 Jan 2012 17:12:35 +0100
Subject: [Beowulf] Larrabee - Mark Hahn's personal attack
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com>
Message-ID: <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl>

On Jan 27, 2012, at 4:37 PM, Mark Hahn wrote:

>>>> Larrabee indeed resembles Itanium to some extent, but not quite.
>>>
>>> wow, that has to be your most loosely-tethered-to-reality statement
>>> yet!
>>> it's true that Larrabee and Itanium are very close
>>> in the number of letters in their name.
>>
>> Your personal attack seems to indicate you disagree with my
>> assessment that the entire Larrabee line has any future in the long
>> run.
>
> not surprisingly, no: I disagree that Larrabee and Itanium resemble
> each other in any but really silly ways.
>
> Itanium is a custom, VLIW architecture; Larrabee is an on-chip
> cluster of non-VLIW, commodity x86_64 cores.
> none of the distinctive features of Itanium (multi-instruction
> bundles, dependency on compile-time scheduling, intended market,
> implementation, success limited to predictable, high-bandwidth
> situations, directory-based inter-node cache coherency) are anything
> close to the features of Larrabee (standard x86_64 ISA, no special
> compiler needed, on-chip message-passing network, suitable for
> complex/dynamic/unpredictable loads, possibly not even cache-coherent
> across one chip.)
>
> my guess is that you were thinking about how ia64 chips tended to
> run at low clock rates, and thinking about how gpus (probably
> including larrabee) also tend to be low-clocked.

And both seem failures from the user's viewpoint - maybe not from Intel's income viewpoint, but from Intel's aim to replace and/or create a new long-lasting architecture that can even *remotely* compete with other manufacturers, not to mention the far too high price points for such CPUs.

>> Instead of throwing mud, would you mind explaining why Larrabee,
>> an architecture far away from the mainstream, stands any chance of
>> competing in HPC with the existing architectural concepts in the
>> long run?
>
> as far as I know, larrabee will be a mesh of conventional x86_64 cores
> that will run today's x86_64 code. I don't know whether Intel has
> stated (or even decided) whether the cores will have full or partial
> cache coherency, or whether they'll really be an MPI-like
> shared-nothing cluster.

Assuming you're not completely born stupid, i assume you will realize that IN ORDER to run most existing x64 codes, it needs to have cache coherency, and that it always has been presented as having exactly that. Which is one of the reasons why the architecture doesn't scale, of course.

Well, you can forget about it running your x64 Fortran codes at any fast speed. You need to totally rewrite your code to use vectors of doubles. And in contrast to GPUs, where you can indirectly address each PE or 'compute core' through arrays (in AMD-ATI's case that's 4 PEs, each able to execute 1 double a cycle), such indirect lookups are a disaster on Larrabee - having a cost of 7 cycles each - so you really need to use vectors. Now I bet the majority of your old x64 code doesn't use such huge vectors, so to get even some remote performance out of it, a total rewrite of most code is needed, if it can work at all.
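Roughly, the difference is between these two loop shapes (a generic C sketch - the point is the memory access pattern, not any particular chip):

    /* Contiguous, unit-stride loop: maps directly onto wide SIMD units. */
    void saxpy(float *restrict y, const float *restrict x, float a, int n)
    {
        for (int i = 0; i < n; i++)
            y[i] += a * x[i];
    }

    /* Indirect gather: every element goes through an index table, so the
     * compiler must emit per-lane gathers or scalarize the loop. */
    void gather_axpy(float *restrict y, const float *restrict x,
                     const int *restrict idx, float a, int n)
    {
        for (int i = 0; i < n; i++)
            y[i] += a * x[idx[i]];
    }

The first loop vectorizes trivially; the second is exactly the kind of indirect lookup that gets expensive on a wide-vector machine.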
We can then also see the insight that GPUs are totally superior to Larrabee on most terrains, most importantly at multiplicative codes. As you might know, GPUs are world champions at multiplication and CPUs are not. Multiplication happens to be of major importance for the majority of HPC codes. By majority I really mean approaching 90% at the public supercomputers.

Vincent

> if you want to compare Larrabee to Fermi or AMD GCN, that might be
> interesting. or to mainstream multicore - like bulldozer, with 32c
> per package vs larrabee with ">=50".
>
> but not ia64. it's best we all just forget about it.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 11:15:05 2012
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 27 Jan 2012 17:15:05 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F22CB68.3080605@ias.edu>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu>
Message-ID:

And why do you post this?

On Jan 27, 2012, at 5:06 PM, Prentice Bisbal wrote:

> Vincent,
>
> He wrote that because he's trying to educate you on proper mailing
> list etiquette, which is something you appear to be lacking.
>
> Chris is absolutely right - you should not reply to off-list e-mails
> on-list.
>
> --
> Prentice
> [snip]

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From ellis at cse.psu.edu Fri Jan 27 11:25:15 2012
From: ellis at cse.psu.edu (Ellis H. Wilson III)
Date: Fri, 27 Jan 2012 11:25:15 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu>
Message-ID: <4F22CFEB.6080404@cse.psu.edu>

On 01/27/2012 11:15 AM, Vincent Diepeveen wrote:
> And why do you post this?

"Assuming you're not completely born stupid, i assume you will realize that IN ORDER to" write an effective email that conveys some idea or argument, it is extremely helpful to utilize some form of etiquette or, at the very least, self-restraint in your writing, so we all don't stop reading your emails. In fact, while it's not a terribly great book IMHO, it might still help to read "How to Win Friends and Influence People." It seems like you have enough time on your hands to write near-to-incoherent emails on this list and program near-to-impossible applications for GPUs, so perhaps if you can steal a little time from one or the other you can finish it in a day or so.

But admittedly, perhaps requesting etiquette from you is truly an unthinkable thing to do. Hence your boggled state of mind.

ellis

> On Jan 27, 2012, at 5:06 PM, Prentice Bisbal wrote:
>
>> Vincent,
>>
>> He wrote that because he's trying to educate you on proper mailing
>> list etiquette, which is something you appear to be lacking.
>>
>> Chris is absolutely right - you should not reply to off-list e-mails
>> on-list.
>> >> -- >> Prentice >> >> On 01/27/2012 01:06 AM, Vincent Diepeveen wrote: >>> Why do you write this? >>> >>> On Jan 27, 2012, at 12:27 AM, Christopher Samuel wrote: >>> >>> On 26/01/12 23:28, Vincent Diepeveen wrote: >>> >>>>>> Mike you replied to me not to mailing list. >>> >>> That was probably deliberate, and it is inconsiderate to post a reply >>> publicly without checking with the writer that they are OK with that, >>> especially as you quoted what they wrote - they may not have wanted >>> that >>> in the public domain. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Fri Jan 27 11:34:41 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Fri, 27 Jan 2012 11:34:41 -0500 Subject: [Beowulf] Larrabee - Mark Hahn's personal attack In-Reply-To: <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> Message-ID: <4F22D221.3020504@ias.edu> On 01/27/2012 11:12 AM, Vincent Diepeveen wrote: > And both are seem failures from user viewpoint, maybe not from intels > income viewpoint, > but from intels aim to replace and/or create a new long lasting > architecture > that can even *remotely* compete with other manufacturers, > not to mention far too high pricepoints for such cpu's. This argument is ridiculous. Just because two completely different technologies (architectures) both fail, doesn't make them similar. That's like saying a Ford Edsel and Pontiac Aztek are similar cars. > Assuming you're not completely born stupid, i assume you will realize > that IN ORDER to run Calling someone "completely born stupid" is unacceptable behavior. > most existing x64 codes, it needs to have cache coherency, and that > it always has been > presented as having exactly that. > Which is one of reasons why the architecture doesn't scale of course. Cache-coherent systems don't scale well? Really? SGI Origins were ccNUMA systems, and they scaled well. > Well you can forget about them running your x64 fortran codes on it > at any fast speed. > > You need to total rewrite your code to be able to use vectors of > doubles, > and in contradiction to GPU's where you can indirectly with arrays > see each PE or each 'compute core' > (which is 4 PE's of in case of AMD-ATI that can execute 1 double a This argument makes no sense in the context of this discussion. You need to do a significant rewrite of your code to take advantage of GPUs, too, so how are GPUs better? > cycle), > > Such lookups are a disaster at larrabee - having a cost of 7 cycles > for indirect lookups, > so you really need to use vectors. > > Now i bet majority of your oldie x64 code doesn't use such huge vectors, > so to even get some remote performance out of it, a total rewrite of > most code is needed, > if it can work at all. > > We can then also see the insight that GPU's are total superior to > larrabee at most terrains and > most importantly at multiplicative codes. > > As you might know GPU's are worldchampion in doing multiplications > and CPU's are not. > > Multiplication happens to be something that is of major importance > for the majority of HPC codes. 
> Majority i really mean - approaching 90% at the public supercomputers. I'm at a loss for words... Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Fri Jan 27 11:38:02 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Fri, 27 Jan 2012 11:38:02 -0500 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> Message-ID: <4F22D2EA.1080309@ias.edu> Vincent, I posted that because you asked a question and I answered it, which is also good mailing list etiquette. Since you posted your question "Why do you write this?" to the mailing list instead of replying just to Chris, anyone on this list is free to reply to it. Again, this is basic mailing list etiquette. -- Prentice On 01/27/2012 11:15 AM, Vincent Diepeveen wrote: > And why do you post this? > > On Jan 27, 2012, at 5:06 PM, Prentice Bisbal wrote: > >> Vincent, >> >> He wrote that because he's trying to educate you on proper mailing >> list >> etiquette, which is something you appear to be lacking. >> >> Chris is absolutely right - you should not reply to off-list e-mails >> on-list. >> >> -- >> Prentice >> >> On 01/27/2012 01:06 AM, Vincent Diepeveen wrote: >>> Why do you write this? >>> >>> On Jan 27, 2012, at 12:27 AM, Christopher Samuel wrote: >>> >>> On 26/01/12 23:28, Vincent Diepeveen wrote: >>> >>>>>> Mike you replied to me not to mailing list. >>> That was probably deliberate, and it is inconsiderate to post a reply >>> publicly without checking with the writer that they are OK with that, >>> especially as you quoted what they wrote - they may not have wanted >>> that >>> in the public domain. >>> >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >> Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Fri Jan 27 11:41:55 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 27 Jan 2012 17:41:55 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <4F22CFEB.6080404@cse.psu.edu> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> Message-ID: <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> On Jan 27, 2012, at 5:25 PM, Ellis H. 
Wilson III wrote: > On 01/27/2012 11:15 AM, Vincent Diepeveen wrote: >> And why do you post this? So you can follow all etiquette, yet only techincal your mind is not capable of following the discussions - so you just felt replying to etiquette. That says more about you, than about me. What everyone hates about politics is that people just speak about how things are phrased instead of looking at the intention of the phrased text. Why don't you go into politics, maybe you'll do better there. Vincent _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From ellis at cse.psu.edu Fri Jan 27 11:58:25 2012 From: ellis at cse.psu.edu (Ellis H. Wilson III) Date: Fri, 27 Jan 2012 11:58:25 -0500 Subject: [Beowulf] The Absurdity of Diep - Was cpu's versus gpu's - Was Intel buys QLogic InfiniBand business In-Reply-To: <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> Message-ID: <4F22D7B1.4020508@cse.psu.edu> On 01/27/2012 11:41 AM, Vincent Diepeveen wrote: > On Jan 27, 2012, at 5:25 PM, Ellis H. Wilson III wrote: > >> On 01/27/2012 11:15 AM, Vincent Diepeveen wrote: >>> And why do you post this? > > So you can follow all etiquette, yet only techincal your mind is not > capable of following the discussions - > so you just felt replying to etiquette. No, I've given up writing technically when you're posting because: a) You go into discussions to prove everyone wrong b) You rapidly switch the topic if too many people disagree, which is frustrating and confusing (hence, was intel buys qlogic, then became cpus versus gpus, which became Itanium vs Larabee somehow, and now it is how poorly you communicate) c) There is nothing to gain from having discussions with you > That says more about you, than about me. My personal background is storage and communication protocol-heavy. Not processor-oriented. You are right to suggest I am hesitant to post on a thread that directly compares two seemingly different processors, just like you hesitate to deal with the reality that you lack basic social skills. Everyone caters to their own strengths, and generally (if they are wise), takes a back-seat and tries to learn something in areas they are weak. > What everyone hates about politics is that people just speak about > how things are phrased instead of looking at the intention of the > phrased text. > > Why don't you go into politics, maybe you'll do better there. Just because this is a list on Beowulfery and broadly covers everything remotely attached to HPC does not mean it needs to be bereft of a baseline of etiquette and respect for one another. I know quite a few very nice, but rather intelligent and technically-capable people. These two qualities can in fact coexist in a person, believe it or not. 
Best,

ellis

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 12:03:38 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 27 Jan 2012 18:03:38 +0100 Subject: [Beowulf] Larrabee - Mark Hahn's personal attack In-Reply-To: <4F22D221.3020504@ias.edu> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> <4F22D221.3020504@ias.edu> Message-ID: <208B7C7D-3A3E-4134-A352-4D7D78B304D1@xs4all.nl>

On Jan 27, 2012, at 5:34 PM, Prentice Bisbal wrote:

> On 01/27/2012 11:12 AM, Vincent Diepeveen wrote:
>> And both are seem failures from user viewpoint, maybe not from intels
>> income viewpoint,
>> but from intels aim to replace and/or create a new long lasting
>> architecture
>> that can even *remotely* compete with other manufacturers,
>> not to mention far too high pricepoints for such cpu's.
>
> This argument is ridiculous. Just because two completely different
> technologies (architectures) both fail, doesn't make them similar.
>
> That's like saying a Ford Edsel and Pontiac Aztek are similar cars.
>
>> Assuming you're not completely born stupid, i assume you will realize
>> that IN ORDER to run
>
> Calling someone "completely born stupid" is unacceptable behavior.

Whereas everyone knows the statements of intel on larrabee there, and that without cache coherency you can't multithread and everything also has to be done blocked - so there is zero compatibility with x64 then, and any compatibility then cannot be guaranteed. You know this really well - yet you played dumb there trying to score a cheap point.

Without cache coherency it is of course easy to build big cpu's that scale well, yet they don't run x64 then. Of course intel will be forced to design some kick butt design somewhere in the future that's not x64 compatible at all and isn't using things like cache coherency. Which isn't remotely the idea of larrabee. That's why you wrote it down as such.

>> most existing x64 codes, it needs to have cache coherency, and that
>> it always has been
>> presented as having exactly that.
>> Which is one of reasons why the architecture doesn't scale of course.
>
> Cache-coherent systems don't scale well? Really? SGI Origins were
> ccNUMA
> systems, and they scaled well.
>

Indeed, yet they didn't scale near linear in price. Each Origin 3800 @ 64 processors @ 1.5Ghz was exactly 1 million dollar, whereas a simple normal x64 cpu at the time had a price similar to the square root of that.

With GPU's it all scales very cheaply, and when using cache coherency you start to lose that scaling. Yields will go down of course. Most manufacturers need a pretty high yield to sell a chip at any decent price, so the production costs of a larrabee chip in the same process technology as a GPU, having the same performance, will be a huge factor higher. That also will cause intel to really sell few of them.

Would you consider buying a larrabee at 1 million dollar a card?

>> Well you can forget about them running your x64 fortran codes on it
>> at any fast speed.
>> >> You need to total rewrite your code to be able to use vectors of >> doubles, >> and in contradiction to GPU's where you can indirectly with arrays >> see each PE or each 'compute core' >> (which is 4 PE's of in case of AMD-ATI that can execute 1 double a > > This argument makes no sense in the context of this discussion. You > need to do a significant rewrite of your code to take advantage of > GPUs, > too, so how are GPUs better? If you need to rewrite it anyway, why not get a much faster performance at part of the price? It's the same effort you have to do. > >> cycle), >> >> Such lookups are a disaster at larrabee - having a cost of 7 cycles >> for indirect lookups, >> so you really need to use vectors. >> >> Now i bet majority of your oldie x64 code doesn't use such huge >> vectors, >> so to even get some remote performance out of it, a total rewrite of >> most code is needed, >> if it can work at all. >> >> We can then also see the insight that GPU's are total superior to >> larrabee at most terrains and >> most importantly at multiplicative codes. >> >> As you might know GPU's are worldchampion in doing multiplications >> and CPU's are not. >> >> Multiplication happens to be something that is of major importance >> for the majority of HPC codes. >> Majority i really mean - approaching 90% at the public >> supercomputers. > > I'm at a loss for words... > http://www.nwo.nl/nwohome.nsf/pages/NWOP_8DEEKL_Eng title: "Overview of recent supercomputers 2010" Author: Aad van der Steen > > Prentice > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Fri Jan 27 13:29:52 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Fri, 27 Jan 2012 13:29:52 -0500 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> Message-ID: <4F22ED20.7040105@ias.edu> On 01/27/2012 11:41 AM, Vincent Diepeveen wrote: > On Jan 27, 2012, at 5:25 PM, Ellis H. Wilson III wrote: > >> On 01/27/2012 11:15 AM, Vincent Diepeveen wrote: >>> And why do you post this? > So you can follow all etiquette, yet only techincal your mind is not > capable of following the discussions - > so you just felt replying to etiquette. > > That says more about you, than about me. > What it says is that we've given up on discussing technology with you, because your arguments are completely nonsensical. Since you clearly don't understand technology, we're hoping you can at least understand the simple concepts of basic etiquette. 
-- Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From glykos at mbg.duth.gr Fri Jan 27 13:57:31 2012 From: glykos at mbg.duth.gr (Nicholas M Glykos) Date: Fri, 27 Jan 2012 20:57:31 +0200 (EET) Subject: [Beowulf] Signal to noise. In-Reply-To: <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> Message-ID: Dear List, I have been a (mostly) quiet reader of this list for the last ~5 years and my intention is to continue reading the excellent posts that the members of this community contribute almost daily. Having said that, the recent Vincent-centric 'discussions' have ---as I am sure you all know--- significantly reduced the signal-to-noise ratio. Can we get back to normal, please ? Thanks, Nicholas -- Nicholas M. Glykos, Department of Molecular Biology and Genetics, Democritus University of Thrace, University Campus, Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620, Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/ _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From moloney.brendan at gmail.com Fri Jan 27 14:26:12 2012 From: moloney.brendan at gmail.com (Brendan Moloney) Date: Fri, 27 Jan 2012 11:26:12 -0800 Subject: [Beowulf] Signal to noise. In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> Message-ID: I am in a similar position. I posted a question to this list quite some time ago but have remained subscribed to the list ever since. I have always (or at least until recently) enjoyed reading the discussions on here. I hope that one person does not ruin such a great resource. Thanks, Brendan On Fri, Jan 27, 2012 at 10:57 AM, Nicholas M Glykos wrote: > > Dear List, > > I have been a (mostly) quiet reader of this list for the last ~5 years and > my intention is to continue reading the excellent posts that the members > of this community contribute almost daily. Having said that, the recent > Vincent-centric 'discussions' have ---as I am sure you all know--- > significantly reduced the signal-to-noise ratio. Can we get back to > normal, please ? > > Thanks, > Nicholas > > -- > > > Nicholas M. Glykos, Department of Molecular Biology > and Genetics, Democritus University of Thrace, University Campus, > Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620, > Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/ > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From h-bugge at online.no Fri Jan 27 14:29:35 2012 From: h-bugge at online.no (Håkon Bugge) Date: Fri, 27 Jan 2012 11:29:35 -0800 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <20120124045541.GB10196@bx9.net> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> Message-ID: <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no>

Greg,

On 23. jan. 2012, at 20.55, Greg Lindahl wrote:

> On Mon, Jan 23, 2012 at 11:28:26AM -0800, Greg Lindahl wrote:
>
>> http://www.hpcwire.com/hpcwire/2012-01-23/intel_to_buy_qlogic_s_infiniband_business.html
>
> I figured out the main why:
>
> http://seekingalpha.com/news-article/2082171-qlogic-gains-market-share-in-both-fibre-channel-and-10gb-ethernet-adapter-markets
>
>> Server-class 10Gb Ethernet Adapter and LOM revenues have recently
>> surpassed $100 million per quarter, and are on track for about fifty
>> percent annual growth, according to Crehan Research.
>
> That's the whole market, and QLogic says they are #1 in the FCoE
> adapter segment of this market, and #2 in the overall 10 gig adapter
> market (see
> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-f2q12-results-earnings-call-transcript)

That can explain why QLogic is selling, but not why Intel is buying.

10 years ago, Intel went _out_ of the Infiniband market, see http://www.networkworld.com/newsletters/servers/2002/01383318.html

So has the IB business evolved so incredibly well compared to what Intel expected back in 2002? I do not think so.

I would guess that we will see message passing/RDMA over Thunderbolt or similar.

Håkon

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 15:06:54 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 27 Jan 2012 21:06:54 +0100 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> Message-ID: <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl>

On Jan 27, 2012, at 8:29 PM, Håkon Bugge wrote:

> Greg,
>
>
> On 23. jan. 2012, at 20.55, Greg Lindahl wrote:
>
>> On Mon, Jan 23, 2012 at 11:28:26AM -0800, Greg Lindahl wrote:
>>
>>> http://www.hpcwire.com/hpcwire/2012-01-23/
>>> intel_to_buy_qlogic_s_infiniband_business.html
>>
>> I figured out the main why:
>>
>> http://seekingalpha.com/news-article/2082171-qlogic-gains-market-
>> share-in-both-fibre-channel-and-10gb-ethernet-adapter-markets
>>
>>> Server-class 10Gb Ethernet Adapter and LOM revenues have recently
>>> surpassed $100 million per quarter, and are on track for about fifty
>>> percent annual growth, according to Crehan Research.
>>
>> That's the whole market, and QLogic says they are #1 in the FCoE
>> adapter segment of this market, and #2 in the overall 10 gig adapter
>> market (see
>> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-
>> f2q12-results-earnings-call-transcript)
>
> That can explain why QLogic is selling, but not why Intel is buying.
>
> 10 years ago, Intel went _out_ of the Infiniband market, see http://
> www.networkworld.com/newsletters/servers/2002/01383318.html
>
> So has the IB business evolved so incredibly well compared to what
> Intel expected back in 2002? I do not think so.
>
> I would guess that we will see message passing/RDMA over
> Thunderbolt or similar.
>
>

Qlogic offers that at QDR. Mellanox is a generation newer there with FDR. Both in latency as well as in bandwidth it's a huge difference.

> Håkon
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From landman at scalableinformatics.com Fri Jan 27 15:19:31 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Fri, 27 Jan 2012 15:19:31 -0500 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> Message-ID: <4F2306D3.4080509@scalableinformatics.com>

On 01/27/2012 03:06 PM, Vincent Diepeveen wrote:
>
> On Jan 27, 2012, at 8:29 PM, Håkon Bugge wrote:
>
>> Greg,
>>
>>
>> On 23. jan. 2012, at 20.55, Greg Lindahl wrote:
>>
>>> On Mon, Jan 23, 2012 at 11:28:26AM -0800, Greg Lindahl wrote:
>>>
>>>> http://www.hpcwire.com/hpcwire/2012-01-23/
>>>> intel_to_buy_qlogic_s_infiniband_business.html
>>>
>>> I figured out the main why:
>>>
>>> http://seekingalpha.com/news-article/2082171-qlogic-gains-market-
>>> share-in-both-fibre-channel-and-10gb-ethernet-adapter-markets
>>>
>>>> Server-class 10Gb Ethernet Adapter and LOM revenues have recently
>>>> surpassed $100 million per quarter, and are on track for about fifty
>>>> percent annual growth, according to Crehan Research.
>>>
>>> That's the whole market, and QLogic says they are #1 in the FCoE
>>> adapter segment of this market, and #2 in the overall 10 gig adapter
>>> market (see
>>> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses-
>>> f2q12-results-earnings-call-transcript)

I found that statement interesting. I've actually not known anything about their 10GbE products. My bad.

>>
>> That can explain why QLogic is selling, but not why Intel is buying.
>>
>> 10 years ago, Intel went _out_ of the Infiniband market, see http://
>> www.networkworld.com/newsletters/servers/2002/01383318.html
>>
>> So has the IB business evolved so incredibly well compared to what
>> Intel expected back in 2002? I do not think so.
>>
>> I would guess that we will see message passing/RDMA over
>> Thunderbolt or similar.

Intel buying makes quite a bit of sense IMO.
They are in 10GbE silicon and NICs, and being in IB silicon and HCAs gives them not only a hedge (10GbE while growing rapidly, is not the only high performance network market, and Intel is very good at getting economies of scale going with its silicon ... well ... most of its silicon ... ignoring Itanium here ...). Its quite likely that Intel would need IB for its PetaScale plans. Someone here postulated putting the silicon on the CPU. Not sure if this would happen, but I could see it on an IOH, easily. That would make sense (at least in terms of the Westmere designs ... for the Romley et al. I am not sure where it would make most sense).

But Intel sees the HPC market growth, and I think they realize that there are interesting opportunities for them there with tighter high performance networking interconnects (Thunderbolt, USB3, IB, 10GbE native on all these systems).

> Qlogic offers that at QDR.
> Mellanox is a generation newer there with FDR.
>
> Both in latency as well as in bandwidth it's a huge difference.

Haven't looked much at FDR or EDR latency. Was it a huge delta (more than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us for a while, and switches are still ~150-300ns port to port. At some point I think you start hitting a latency floor, bounded in part by "c", but also by an optimal technology path length that you can't shorten without significant investment and new technology. Not sure how close we are to that point (maybe someone from Qlogic/Mellanox could comment on the headroom we have).

Bandwidth wise, you need E5 with PCIe 3 to really take advantage of FDR. So again, its a natural fit, especially if its LOM ....

Curiously, I think this suggests that ScaleMP could be in play on the software side ... imagine stringing together bunches of the LOM FDR/QDR motherboards with E5's and lots of ram into huge vSMPs (another thread). Shai may tell me I'm full of it (hope he doesn't), but I think this is a real possibility. The Qlogic purchase likely makes this even more interesting for Intel (or Cisco, others as a defensive acq).

We sure do live in interesting times!

-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From landman at scalableinformatics.com Fri Jan 27 15:27:24 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Fri, 27 Jan 2012 15:27:24 -0500 Subject: [Beowulf] Signal to noise. In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> Message-ID: <4F2308AC.9010704@scalableinformatics.com>

On 01/27/2012 01:57 PM, Nicholas M Glykos wrote:
>
> Dear List,
>
> I have been a (mostly) quiet reader of this list for the last ~5 years and
> my intention is to continue reading the excellent posts that the members
> of this community contribute almost daily. Having said that, the recent
> Vincent-centric 'discussions' have ---as I am sure you all know---
> significantly reduced the signal-to-noise ratio.
> Can we get back to normal, please ?
>

Greetings Nicholas and many others:

I've found that filters help. I have some simple procmail filters set up in my mail directory that redirect some people's email (and in some cases responses to them) to a file I ... well ... never read. By doing so, I find the S/N ratio to be vastly improved. Only one person from Beowulf is in this (not Vincent ... I am still deeply amused by some of the emails, though that is fading fast with the personal attacks).

Procmail filters look like this

    :0:
    * ^From:.*bad at person.com
    $HOME/twit.filter

Then I never read the twit.filter. Just empty it out every now and then. Maybe once every few years. Doing this has dramatically improved S/N here and elsewhere.

If you don't have this capability directly, your mail client can probably fake it. I use this as I have (far too) many mail clients and I don't want to manage the rules on all of them. If you are afflicted with Microsoft exchange as your mail server, I am not sure what you can (easily) do.

Joe

-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From glykos at mbg.duth.gr Fri Jan 27 15:58:02 2012 From: glykos at mbg.duth.gr (Nicholas M Glykos) Date: Fri, 27 Jan 2012 22:58:02 +0200 (EET) Subject: [Beowulf] Signal to noise. In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> Message-ID:

Hi Joe,

> I've found that filters help.

You are killing my daily digests.

> If you are afflicted with Microsoft ...

What is 'Microsoft' ?
:-)

All the best (and apologies to the list for the email traffic),
Nicholas

-- Nicholas M. Glykos, Department of Molecular Biology and Genetics, Democritus University of Thrace, University Campus, Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620, Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From landman at scalableinformatics.com Fri Jan 27 16:07:34 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Fri, 27 Jan 2012 16:07:34 -0500 Subject: [Beowulf] Signal to noise. In-Reply-To: References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <109ADC53-91F0-4699-A3F9-0EFB57EEC25E@xs4all.nl> Message-ID: <4F231216.3020703@scalableinformatics.com>

On 01/27/2012 03:58 PM, Nicholas M Glykos wrote:
>
> Hi Joe,
>
>
>> I've found that filters help.
>
> You are killing my daily digests.

Do'h !

... I seem to remember that you can do some more fancy filtering ... Someone showed me something a few years ago, that would break apart digests, filter, and reassemble.
Something like this:

http://easierbuntu.blogspot.com/2011/09/managing-your-email-with-fetchmail.html

(they have some interesting procmail recipes, but you can find them to do this if you really want to).

>
>
>> If you are afflicted with Microsoft ...
>
> What is 'Microsoft' ?
> :-)

A small, very gentle company in the North West USA.

> All the best (and apologies to the list for the email traffic),
> Nicholas

:)

-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From diep at xs4all.nl Fri Jan 27 16:42:24 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Fri, 27 Jan 2012 22:42:24 +0100 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <4F2306D3.4080509@scalableinformatics.com> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> Message-ID: <69BBD80B-05C9-4683-99F7-B48A0BDA285D@xs4all.nl>

On Jan 27, 2012, at 9:19 PM, Joe Landman wrote:

> On 01/27/2012 03:06 PM, Vincent Diepeveen wrote:
>>
>> On Jan 27, 2012, at 8:29 PM, Håkon Bugge wrote:
>>
>>> Greg,
>>>
>>>
>>> On 23. jan. 2012, at 20.55, Greg Lindahl wrote:
>>>
>>>> On Mon, Jan 23, 2012 at 11:28:26AM -0800, Greg Lindahl wrote:
>>>>
>>>>> http://www.hpcwire.com/hpcwire/2012-01-23/
>>>>> intel_to_buy_qlogic_s_infiniband_business.html
>>>>
>>>> I figured out the main why:
>>>>
>>>> http://seekingalpha.com/news-article/2082171-qlogic-gains-market-
>>>> share-in-both-fibre-channel-and-10gb-ethernet-adapter-markets
>>>>
>>>>> Server-class 10Gb Ethernet Adapter and LOM revenues have recently
>>>>> surpassed $100 million per quarter, and are on track for about
>>>>> fifty
>>>>> percent annual growth, according to Crehan Research.
ignoring Itanium here > ...). Its quite likely that Intel would need IB for its PetaScale Why buy previous generation IB in such case? It's about the ethernet of course... They produce tens of millions of cpu's each quarter and also announced a SoC (socket on chip). From SoC's actually the market produces billions a year. So it's alucrative market, yet highly competative. Having 10 gigabit ethernet on such SoC and the total at a low price would give intel a huge lead there worth dozens of billions a year. It's not clear to me where all their SoC plans go, but i bet right now they are open to any market needing SoC's. Note that many SoC's are dirt cheap. Even in very low volume we speak about some tens of dollars, cpu included and other connectivity included. Price is everything there, yet i guess intel will be offering the 'top' SoC's there with faster cpu's and 10 GigE. Then they produce a bunch of mainboards. Think also of upcoming generation of consoles, ipad 3's and similar products etc - it's not clear yet which company gets the contracts for upcoming consoles, it's all wide open for now. Yet they might sell also a 100+ million of those. Intel is an attractive company to do business with for console manufacturers now. IBM's cell kind of lost momentum there and has nothing new to offer that really outperforms as it seems. Also power usage of cell was kind of disappointing. Initial version PS3 was 220 watts on average and 100% usage it could go up to 380+ watt. Try to put that on your couch. Don't confuse this with the later crunching CELL version, a much improved chip, used for some supercomputers. Yet if i remember well, some reports, was it Aad v/d Steen (?) already predicted it would be not interesting for upcoming supercomputers as it is some kind of hybrid chip - which has no long term future. He was right. > plans. Someone here postualted putting the silicon on the CPU. Not > sure if this would happen, but I could see it on an IOH, easily. That > would make sense (at least in terms of the Westmere designs ... for > the > Romley et al. I am not sure where it would make most sense). > > But Intel sees the HPC market growth, and I think they realize that > there are interesting opportunities for them there with tighter high > performance networking interconnects (Thunderbolt, USB3, IB, 10GbE > native on all these systems). > Undoubtfully they'll try something in the HPC market. If you already have put lots of cash in development of a product it's better to put it on the market. Based upon their name they'll sell some. And some years from now they should have something bigtime improved. Yet realize how complicated it is to tape out a GPU at a new process technology if you aren't sure you gonna sell a 100+ million of them. Such massive projects have to pay back for factories. A product that's having a potential of not even selling for over a few dozens of billions of dollars is not even interesting to develop. Just startup costs for a GPU at a new proces technology is some dozens of millions for each run and the more complex it is and the newer the proces technology the more expensive it is. Realize IBM produces its power7 and bluegene/q upcoming cpu at 45 nm technology. GPU's release now in 28 nm. That's giving theoretically an advantage of a tad less of (45 / 28) ^ 2 = 2.58 So a gpu of intel needs to be factor 2.58 better in the same proces technology than todays gpu's of AMD (already released 28 nm) and Nvidia (coming soon 28 nm i'd expect). 
This is where intels big advantage with cpu's comes in: they are always better at getting newer process technologies to work sooner than the competition. Ivy Bridge will be 22 nm, so i heard rumours.

>> Qlogic offers that at QDR.
>> Mellanox is a generation newer there with FDR.
>>
>> Both in latency as well as in bandwidth it's a huge difference.
>
> Haven't looked much at FDR or EDR latency. Was it a huge delta (more
> than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us
> for a while, and switches are still ~150-300ns port to port. At some

Gilad Shainer posted here some months ago that it's 0.85 us RDMA for FDR versus 1.3 us or so for the other; more importantly for clusters is the bandwidth. I guess that pci-e 3.0 simply allows much higher speeds whereas the QDR is PCI-E 2.0 stuff. Isn't pci-e 3.0 about 2x higher bandwidth than pci-e 2.0?

Now i might be happy with that last, but i guess that for big FFT's, or be it matrices, you still need massive bandwidth. Even if n is big in O( k * n log n ), where k in case of matrices is a tad bigger than n, and in case of Number Theory is usually around the number of bits, so 3.32 times n or so, that means you still need k steps of n log n. That's massive bandwidth.

> point I think you start hitting a latency floor, bounded in part by
> "c",
> but also by an optimal technology path length that you can't shorten
> without significant investment and new technology. Not sure how close
> we are to that point (maybe someone from Qlogic/Mellanox could comment
> on the headroom we have).

There is a lot of headroom for better latencies from the software viewpoint, as cpu's keep getting faster yet the latency of the networks of years ago was just marginally worse than what's there now. In case of hardware i really am no expert there.

>
> Bandwidth wise, you need E5 with PCIe 3 to really take advantage of
> FDR.
> So again, its a natural fit, especially if its LOM ....
>

All the socket 2011 boards that are in the shops now are PCI-e 3.0, and a wave of mainboards with 2 sockets will release a few days before, or on the same day, that intel finally releases the Xeon version of Sandy Bridge. Seems it didn't release yet as it's not too high clocked, if i look at this sample cpu :) It's 2Ghz to be precise (8 cores Xeon).

> Curiously, I think this suggests that ScaleMP could be in play on the
> software side ... imagine stringing together bunches of the LOM FDR/
> QDR
> motherboards with E5's and lots of ram into huge vSMPs (another
> thread).
> Shai may tell me I'm full of it (hope he doesn't), but I think
> this is
> a real possibility. The Qlogic purchase likely makes this even more
> interesting for Intel (or Cisco, others as a defensive acq).
>

A technology that sold just 300 machines - this is not an interesting market for intel. They have very expensive factories that each cost many billions of dollars. These need to produce nonstop and sell products, to pay back the factories and to make a profit. Intel used to be worth over a 100 billion dollar at NASDAQ.

Wasting your most clever engineers, of which each company always has too few, on products that can't keep your factories busy, is a total waste of time. So your huge base of B-class engineers, let me not quote some mailing list names, that's the ones you move to Qlogic then for the HPC. That's enough to keep it afloat for a while in combination with 'intel inside'.
Intels profit is too huge to be busy toying with tiny markets with a handful of customers, from which majority forgot to take their medicine when you propose rewriting the software to some new hardware platform you are gonna unroll. A habit intel is not exactly excited about of course, as they like to sell each time new technology. Also each larrabee intel would sell means they sell a bunch of xeons less of course. > We sure do live in interesting times! > Not for everyone i guess - many lost their job and as i predicted some years ago a guy with a nobel prize might be carpet bombing a huge nation this summer. Intel has 3 huge factories in Israel last time i checked. It sure can give unpredicted results for future. > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics Inc. > email: landman at scalableinformatics.com > web : http://scalableinformatics.com > http://scalableinformatics.com/sicluster > phone: +1 734 786 8423 x121 > fax : +1 866 888 3112 > cell : +1 734 612 4615 > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From landman at scalableinformatics.com Fri Jan 27 16:47:21 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Fri, 27 Jan 2012 16:47:21 -0500 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <69BBD80B-05C9-4683-99F7-B48A0BDA285D@xs4all.nl> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> <69BBD80B-05C9-4683-99F7-B48A0BDA285D@xs4all.nl> Message-ID: <4F231B69.1050404@scalableinformatics.com> On 01/27/2012 04:42 PM, Vincent Diepeveen wrote: > > On Jan 27, 2012, at 9:19 PM, Joe Landman wrote: > >> On 01/27/2012 03:06 PM, Vincent Diepeveen wrote: [... merciful trimming ...] >>>> I would guess that we will see message passing/RDMA over >>>> Thunderbolt or similar. >> >> Intel buying makes quite a bit of sense IMO. They are in 10GbE >> silicon >> and NICs, and being in IB silicon and HCAs gives them not only a hedge >> (10GbE while growing rapidly, is not the only high performance network >> market, and Intel is very good at getting economies of scale going >> with >> its silicon ... well ... most of its silicon ... ignoring Itanium here >> ...). Its quite likely that Intel would need IB for its PetaScale > > Why buy previous generation IB in such case? IP. Its all about IP. Its always about IP. If ever you think its not about IP, you should remember "Landman's N+1th rule of M&A: It's the IP man ... just da IP!" > It's about the ethernet of course... ... no its not. Intel has its own ethernet. Its had it for a LONG time, and it did not buy Qlogic ethernet ... Its not about the ethernet. Say it with me ... ITS NOT ABOUT THE ETHERNET ... There, don't you feel better now? I do ... > They produce tens of millions of cpu's each quarter and also > announced a SoC (socket on chip) SoC is "System On a Chip". Socket on a chip is ... 
er ... cart before the horse?

-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From lindahl at pbm.com Fri Jan 27 17:13:12 2012 From: lindahl at pbm.com (Greg Lindahl) Date: Fri, 27 Jan 2012 14:13:12 -0800 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> Message-ID: <20120127221312.GA29961@bx9.net>

On Fri, Jan 27, 2012 at 11:29:35AM -0800, Håkon Bugge wrote:

> That can explain why QLogic is selling, but not why Intel is buying.

That's right. This was probably bought, not sold. If you look at the press release Intel put out, it's all about Exascale computing.

http://newsroom.intel.com/community/intel_newsroom/blog/2012/01/23/intel-takes-key-step-in-accelerating-high-performance-computing-with-infiniband-acquisition

If you want to put an IB HCA in a CPU or a {north,south}bridge, TrueScale nee InfiniPath is a much smaller implementation than others, and most of the chip is memory, which Intel knows how to shrink drastically compared to the usual way people implement memory.

Also, keep in mind that Intel's benchmarking group in Moscow has a lot of experience with benchmarking real apps for bids using TrueScale head-to-head against other HCAs, and I wouldn't be surprised if it was the case that TrueScale QDR is faster than that other company's FDR on many real codes, for the usual reason that TrueScale's MPI-oriented InfiniBand extension is more suited for MPI than the standard InfiniBand has-more-features-than-MPI-requires protocols.

Finally, I haven't seen it mentioned whether or not QLogic's IB switch was part of the purchase. If it is, then you should note that it's not hard to make that chip speak ethernet, and Intel could probably dramatically improve it with their superior serdes technology.

-- greg

_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From Shainer at Mellanox.com Fri Jan 27 17:25:58 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Fri, 27 Jan 2012 22:25:58 +0000 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <20120127221312.GA29961@bx9.net> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> Message-ID:

> If you want to put an IB HCA in a CPU or a {north,south}bridge, TrueScale nee
> InfiniPath is a much smaller implementation than others, and most of the chip
> is memory, which Intel knows how to shrink drastically compared to the usual
> way people implement memory.
So I wonder why multiple OEMs decided to use Mellanox for on-board solutions and no one used the QLogic silicon... > Also, keep in mind that Intel's benchmarking group in Moscow has a lot of > experience with benchmarking real apps for bids using TrueScale head-to-head > against other HCAs, and I wouldn't be surprised if it was the case that TrueScale > QDR is faster than that other company's FDR on many real codes, Surprise surprise... this is no more than FUD. If you have real numbers to back it up please send. If it was so great, how come more people decided to use the Mellanox solutions? If QLogic was doing so great with their solution, I would guess they would not be selling the IB business... > Finally, I haven't seen it mentioned whether or not QLogic's IB switch was part > of the purchase. If it is, then you should note that it's not hard to make that chip > speak ethernet, and Intel could probably dramatically improve it with their > superior serdes technology. > > -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From lindahl at pbm.com Fri Jan 27 17:27:23 2012 From: lindahl at pbm.com (Greg Lindahl) Date: Fri, 27 Jan 2012 14:27:23 -0800 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <4F2306D3.4080509@scalableinformatics.com> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> Message-ID: <20120127222723.GB29961@bx9.net> On Fri, Jan 27, 2012 at 03:19:31PM -0500, Joe Landman wrote: > >>> That's the whole market, and QLogic says they are #1 in the FCoE > >>> adapter segment of this market, and #2 in the overall 10 gig adapter > >>> market (see > >>> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses- > >>> f2q12-results-earnings-call-transcript) > > I found that statement interesting. I've actually not known anything > about their 10GbE products. My bad. I'm not surprised, as this 10ge adapter is aimed at the same part of the market that uses fibre channel, which isn't that common in HPC. It doesn't have the kind of TCP offload features which have been (futilely) marketed in HPC; it's all about running the same fibre channel software most enterprises have run for a long time, but having the network be ethernet. > Haven't looked much at FDR or EDR latency. Was it a huge delta (more > than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us > for a while, and switches are still ~150-300ns port to port. Are you talking about the latency of 1 core on 1 system talking to 1 core on one system, or the kind of latency that real MPI programs see, running on all of the cores on a system and talking to many other systems? I assure you that the latter is not 0.8 for any IB system. > At some > point I think you start hitting a latency floor, bounded in part by "c", Last time I did the computation, we were 10X that floor. And, of course, each increase in bandwidth usually makes latency worse, absent heroic efforts of implementers to make that headline latency look better. 
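The back-of-the-envelope behind that floor, with assumed numbers chosen purely for illustration (a few metres of cable, signals at roughly two thirds of c; this is not anyone's measured configuration):

    #include <stdio.h>

    int main(void)
    {
        double c = 3.0e8;             /* speed of light in vacuum, m/s */
        double v = 0.66 * c;          /* assumed signal speed in cable/fibre */
        double path_m = 5.0;          /* assumed node-to-node cable path, m */
        double wire_ns = path_m / v * 1e9;
        double headline_ns = 850.0;   /* ~0.85 us headline MPI latency cited
                                         in this thread */

        printf("propagation: %.0f ns, headline latency: %.0f ns, gap: %.0fx\n",
               wire_ns, headline_ns, headline_ns / wire_ns);
        return 0;
    }

How much of the remaining gap one counts as floor (switch silicon, serdes, host adapter, software) versus removable overhead is what determines whether the multiple comes out near 10X.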
-- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From tom.elken at qlogic.com Fri Jan 27 18:08:58 2012 From: tom.elken at qlogic.com (Tom Elken) Date: Fri, 27 Jan 2012 15:08:58 -0800 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <20120127221312.GA29961@bx9.net> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> Message-ID: <35AAF1E4A771E142979F27B51793A4888885B23AE5@AVEXMB1.qlogic.org> > Finally, I haven't seen it mentioned whether or not QLogic's IB switch > was part of the purchase. >From the QLogic press release: " QLogic Corp. ... today announced a definitive agreement to sell the product lines ... associated with its InfiniBand business to Intel Corporation ..." So "the product lines" means both the switch and HCA product lines. Last summer Intel acquired an Ethernet switch business: http://newsroom.intel.com/community/intel_newsroom/blog/2011/07/19/intel-to-acquire-fulcrum-microsystems so it is not unprecedented that they are interested in switching as well as host technologies. -Tom If it is, then you should note that it's not > hard to make that chip speak ethernet, and Intel could probably > dramatically improve it with their superior serdes technology. > > -- greg > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf This message and any attached documents contain information from QLogic Corporation or its wholly-owned subsidiaries that may be confidential. If you are not the intended recipient, you may not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Fri Jan 27 16:07:08 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Fri, 27 Jan 2012 16:07:08 -0500 (EST) Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <4F2306D3.4080509@scalableinformatics.com> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> Message-ID: >>>> http://seekingalpha.com/article/303061-qlogic-s-ceo-discusses- >>>> f2q12-results-earnings-call-transcript) > > I found that statement interesting. I've actually not known anything > about their 10GbE products. My bad. I was a bit surprised that the entire transcript had only one side-ways mention of IB. also interesting that they seem quite heavily into the heavily-offloaded adapter market (which is sort of the opposite of the original infinipath stuff.) 
>>> I would guess that we will see message passing/RDMA over >>> Thunderbolt or similar. has there been any mention of Thunderbolt in a switched context? afaict it's just a weird "let's do faster USB and throw in video" thing. > Intel buying makes quite a bit of sense IMO. They are in 10GbE silicon > and NICs, and being in IB silicon and HCAs gives them not only a hedge > (10GbE while growing rapidly, is not the only high performance network weird to have redundant/competing parts in many of the same markets though. afaik, intel 10G has a reasonable rep; they presumably won't be junking their own products. > ...). It's quite likely that Intel would need IB for its PetaScale > plans. I can't quite tell whether Qlogic's IB switches use Mellanox chips or not. afaik, Qlogic has their own adapter chips (and perhaps FC/eth). > than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us > for a while, and switches are still ~150-300ns port to port. At some mellanox qdr systems I've tested are about 1.6 us half-rtt pingpong. I don't think the switch latency is a big deal, since with 36x fanout, you don't need a very tall fat-tree. > Curiously, I think this suggests that ScaleMP could be in play on the > software side really? I'd be interested in hearing from real people who've actually used it (not marketing, thanks). I don't really understand how ScaleMP can do the required coherency in units smaller than a page, which means that "non-embarrassing" programs will surely notice...
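As a back-of-envelope check on the fat-tree remark above, using the standard folded-Clos formulas (an assumption about topology, not a vendor spec): with radix-k switch silicon, a two-level fat-tree supports k^2/2 hosts and a three-level tree k^3/4. With k = 36:

\[ N_{2\text{-level}} = \frac{36^2}{2} = 648, \qquad N_{3\text{-level}} = \frac{36^3}{4} = 11\,664. \]

So anything up to a few hundred nodes needs at most three switch hops end to end, which is why per-hop switch latency contributes relatively little to the totals discussed in this thread.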
From bill at cse.ucdavis.edu Fri Jan 27 21:10:02 2012 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Fri, 27 Jan 2012 18:10:02 -0800 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> Message-ID: <4F2358FA.4030009@cse.ucdavis.edu> On 01/27/2012 02:25 PM, Gilad Shainer wrote: > So I wonder why multiple OEMs decided to use Mellanox for on-board > solutions and no one used the QLogic silicon... That's a strange argument. What does Intel want? Something to make them more money. In the past that's been integrating functionality into their CPU or support chipsets: SATA, USB, the memory controller, the PCIe controller, and GigE. The cost in transistors and die area seems very relevant to Intel's interests. Anyone have an estimate on how much latency a direct connect to QPI would save vs PCIe? What do motherboard manufacturers want? Something to make them more money. So that's mostly marketing/reputation, pricing, and whatever they can do to differentiate themselves. If buying a $150 IB chip lets them charge $400 more then it's a win, assuming they spend less than $250 of R&D to add it to the motherboard. I doubt the difference in transistors or a few watts would be a big deal either way. >> Also, keep in mind that Intel's benchmarking group in Moscow has a >> lot of experience with benchmarking real apps for bids using >> TrueScale head-to-head >> against other HCAs, and I wouldn't be surprised if it was the case that TrueScale >> QDR is faster than that other company's FDR on many real codes, > > > Surprise surprise... this is no more than FUD. If you have real > numbers to back it up please send. If it was so great, how come more > people decided to use the Mellanox solutions? If QLogic was doing so > great with their solution, I would guess they would not be selling the > IB business... FUD = Fear, Uncertainty, and Doubt. Doesn't sound like FUD to me; more like a cheap attack on Greg. I think we (the mailing list) can do better. I've personally compared several generations of Myrinet and InfiniPath to allegedly faster Mellanox adapters. Mellanox hasn't won yet, but I've not compared QDR or FDR yet. With that said, the reason I run the benchmarks is to find the best solution, and it might well be Mellanox next time. It would be irresponsible for a cluster provider to just pick Mellanox FDR over QLogic QDR because of the spec sheet. Of course, recommending QLogic over Mellanox without quantifying real-world performance would be just as irresponsible. Maybe we could have fewer attacks, less complaining and hand waving, and more useful information? IMO Greg never came across as a commercial (which the beowulf list isn't an appropriate place for), but he does regularly contribute useful info. Arguing market share as proof of performance superiority is just silly. Speaking of which, you said: There is some added latency due to the new 64b/66b encoding, but overall latency is lower than QDR. MPI is below 1us. I googled for additional information, looked around the Mellanox website, and couldn't find anything. Is that number relevant to HPC folks running clusters? Does it involve a switch? If it is not realistic, are there any realistic numbers available?
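The distinction being drawn here (one idle core ping-ponging versus every core loading the HCA at once) is easy to make concrete. Below is a minimal sketch of a loaded latency test, not any vendor's benchmark; the rank placement, message size, and iteration count are illustrative assumptions:

/* loaded_pingpong.c - minimal sketch of a "loaded" latency test:
 * every core on node A ping-pongs with a partner core on node B
 * simultaneously, so the HCA is hammered by all cores at once.
 * Assumes a block rank mapping (ranks 0..n/2-1 on node A), e.g.
 *   mpirun -np 16 -npernode 8 ./loaded_pingpong
 */
#include <mpi.h>
#include <stdio.h>

#define ITERS 10000
#define MSG   8                     /* message size in bytes */

int main(int argc, char **argv)
{
    int rank, size;
    char buf[MSG] = {0};
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size % 2) {
        if (!rank) fprintf(stderr, "need an even rank count\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* first half of the ranks (node A) pairs with the second half (node B) */
    int partner = (rank < size / 2) ? rank + size / 2 : rank - size / 2;

    MPI_Barrier(MPI_COMM_WORLD);    /* start all pairs together */
    double t0 = MPI_Wtime();
    for (int i = 0; i < ITERS; i++) {
        if (rank < size / 2) {
            MPI_Send(buf, MSG, MPI_CHAR, partner, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG, MPI_CHAR, partner, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, MSG, MPI_CHAR, partner, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG, MPI_CHAR, partner, 0, MPI_COMM_WORLD);
        }
    }
    double half_rtt = (MPI_Wtime() - t0) / (2.0 * ITERS) * 1e6; /* us */

    /* report the worst per-rank latency: under load, closer to what apps see */
    double worst;
    MPI_Reduce(&half_rtt, &worst, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (!rank) printf("loaded half-RTT latency: %.2f us (worst rank)\n", worst);
    MPI_Finalize();
    return 0;
}

Running the same binary first with one rank per node and then with all cores per node exposes exactly the gap under discussion; the MPI_MAX reduction reports the slowest rank, which is closer to what a tightly synchronized application experiences than a single-pair headline number.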
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From landman at scalableinformatics.com Fri Jan 27 21:24:10 2012 From: landman at scalableinformatics.com (Joe Landman) Date: Fri, 27 Jan 2012 21:24:10 -0500 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <20120127222723.GB29961@bx9.net> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> <20120127222723.GB29961@bx9.net> Message-ID: <4F235C4A.8040409@scalableinformatics.com> On 01/27/2012 05:27 PM, Greg Lindahl wrote: > I'm not surprised, as this 10ge adapter is aimed at the same part of > the market that uses fibre channel, which isn't that common in HPC. It > doesn't have the kind of TCP offload features which have been > (futilely) marketed in HPC; it's all about running the same fibre > channel software most enterprises have run for a long time, but having > the network be ethernet. That makes sense. >> Haven't looked much at FDR or EDR latency. Was it a huge delta (more >> than 30%) better than QDR? I've been hearing numbers like 0.8-0.9 us >> for a while, and switches are still ~150-300ns port to port. > > Are you talking about the latency of 1 core on 1 system talking to 1 > core on one system, or the kind of latency that real MPI programs see, > running on all of the cores on a system and talking to many other > systems? I assure you that the latter is not 0.8 for any IB system. I am looking at these things from a "best of all possible cases" scenario. So when someone comes at me with new "best of all possible cases" numbers, I can compare. Sadly this seems to be the state of many OEMs/integrators/manufacturers. In storage, we see small disk form factor SSDs marketed generally with statements like 50k IOPs and 500 MB/s. They neglect to mention several specific caveats, such as that the numbers assume writing all zeros, or that the 75k IOPs are sequential IOPs you get by taking the 600 MB/s interface and dividing by 8k-byte operations on a sequential read. Actually do a real random read and write and you get very ... very different results. Especially with non-zero (real) data. >> At some >> point I think you start hitting a latency floor, bounded in part by "c", > > Last time I did the computation, we were 10X that floor. And, of > course, each increase in bandwidth usually makes latency worse, absent > heroic efforts of implementers to make that headline latency look > better. I think that's the point, though: moving that performance "knee" down to lower latency involves (potentially) significant cost, for a modest return ... in terms of real performance benefit to a code. Thanks for the pointer on the computation. If we were 1000x off the floor, we could probably come up with a way to do better. At 10x, it's probably much harder than we think and not necessarily worth the effort. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc.
email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/sicluster phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Fri Jan 27 21:38:14 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sat, 28 Jan 2012 03:38:14 +0100 Subject: [Beowulf] Setting up new benchmark In-Reply-To: <4F235C4A.8040409@scalableinformatics.com> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> <20120127222723.GB29961@bx9.net> <4F235C4A.8040409@scalableinformatics.com> Message-ID: <8C9E1983-6805-4951-8DEB-79FA871940F1@xs4all.nl> No worries - once all the components from eBay have arrived by mid-February and I've set up a small cluster here, I hope to write some MPI benchmarks (with a GPL header attached) that do all sorts of latency tests, measuring everything from latency to bandwidth, mostly using RDMA reads, with all cores of every node busy. It will be interesting to compare it all then. Maybe several people over here want to benchmark. When I first designed the latency benchmark, Paul Hsieh later made the implementation of the idea a bit more efficient: I jumped through memory with a random generator; Paul Hsieh optimized it to pure random jumping. Dieter Buerssner then wrote the single-CPU test to check whether it matched the output I got - which appeared to be the case. Setting up the random pattern took very long, though - I then optimized the setup of the random pattern to O(n log n). The advantage of all this is that one really sees the impact with all cores busy at the same time, whereas most tests use a totally idle cluster and test one micro-tiny thing. Vincent _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From lindahl at pbm.com Sat Jan 28 00:29:36 2012 From: lindahl at pbm.com (Greg Lindahl) Date: Fri, 27 Jan 2012 21:29:36 -0800 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <4F2358FA.4030009@cse.ucdavis.edu> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu> Message-ID: <20120128052936.GF20008@bx9.net> On Fri, Jan 27, 2012 at 06:10:02PM -0800, Bill Broadley wrote: > Anyone have an estimate on how much latency a direct connect to QPI > would save vs pci-e? ~ 0.2us. Remember that the first 2 generations of InfiniPath were both SDR: one for HyperTransport and one for PCIe. The difference was 0.3us back then; PathScale + QLogic did some heroic things since to shorten the pipeline stages & up the clock rate. -- greg (and if anyone needs a reminder, I no longer have any financial involvement with QLogic or Intel.)
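To put numbers on the speed-of-light "floor" mentioned earlier in the thread (the 10 m path length here is an illustrative assumption, not a figure from any post): a signal in copper or fiber propagates at roughly two-thirds of c, so

\[ t_{\text{floor}} = \frac{d}{v} \approx \frac{10\ \text{m}}{2\times 10^{8}\ \text{m/s}} = 50\ \text{ns}. \]

Against measured half-RTT pingpong latencies of 0.8-1.6 us, that is a factor of 16-32, consistent with the "10X that floor" order of magnitude; the remainder is serialization, PCIe or QPI traversal, switch hops, and software.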
_______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From lindahl at pbm.com Sat Jan 28 00:34:17 2012 From: lindahl at pbm.com (Greg Lindahl) Date: Fri, 27 Jan 2012 21:34:17 -0800 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <4F235C4A.8040409@scalableinformatics.com> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <88F64A51-B5B9-494A-B6E8-52F3C5FCCAEE@xs4all.nl> <4F2306D3.4080509@scalableinformatics.com> <20120127222723.GB29961@bx9.net> <4F235C4A.8040409@scalableinformatics.com> Message-ID: <20120128053417.GG20008@bx9.net> On Fri, Jan 27, 2012 at 09:24:10PM -0500, Joe Landman wrote: > > Are you talking about the latency of 1 core on 1 system talking to 1 > > core on one system, or the kind of latency that real MPI programs see, > > running on all of the cores on a system and talking to many other > > systems? I assure you that the latter is not 0.8 for any IB system. > > I am looking at these things from a "best of all possible cases" > scenario. So when someone comes at me with new "best of all possible > cases" numbers, I can compare. Sadly this seems to be the state of many > OEM/integrators/manufacturers. The point I've been trying to make for the past 8 years is that one of the two chip families you're looking at doesn't degrade as much as the other from the "best of all possible cases" to a real cluster running a real code. > In storage, we see small disk form factor SSDs marketed generally, with > statments like 50k IOPs, and 500 MB/s. And if you knew that one family of SSDs had a wildly different ratio of peak alleged perf to real application performance, would you ignore that? I suspect not. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From eugen at leitl.org Sat Jan 28 05:17:32 2012 From: eugen at leitl.org (Eugen Leitl) Date: Sat, 28 Jan 2012 11:17:32 +0100 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <4F22ED20.7040105@ias.edu> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> Message-ID: <20120128101732.GG7343@leitl.org> On Fri, Jan 27, 2012 at 01:29:52PM -0500, Prentice Bisbal wrote: > What it says is that we've given up on discussing technology with you, > because your arguments are completely nonsensical. Since you clearly > don't understand technology, we're hoping you can at least understand > the simple concepts of basic etiquette. Who's the list moderator, by the way? 
-- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From eugen at leitl.org Sat Jan 28 08:32:26 2012 From: eugen at leitl.org (Eugen Leitl) Date: Sat, 28 Jan 2012 14:32:26 +0100 Subject: [Beowulf] photonic buffer bloat Message-ID: <20120128133226.GU7343@leitl.org> Relevant for future clusters; see the PPT presentation linked in the URL below. ----- Forwarded message from Masataka Ohta ----- From: Masataka Ohta Date: Sat, 28 Jan 2012 21:42:13 +0900 To: nanog at nanog.org Subject: Re: photonic buffer bloat User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:9.0) Gecko/20111222 Thunderbird/9.0.1 Eugen Leitl wrote: > In future photonic networks (which will do relativistic cut-through > directly in a photonic crossbar without converting photons to electrons > and back) the fiber is not just a transport channel but also a photonic > buffer Yes. > (e.g. at 10 GBit/s Ethernet a short reach fiber already buffers > a standard 1500 MTU). Wrong. 10Gbps is too slow for optical buffering. At 1Tbps, you need 100 times less fiber length than at 10Gbps to buffer packets. A 1Tbps packet can be constructed by simultaneously encoding 100 wavelengths at 10Gbps. > Of course photonic gates are expensive, individual delays do add up > so even with slow light buffers Don't try to make light slower. Slow light buffers have resonators, which means they have very, very, very narrow bandwidth. Instead, make communication speed faster, which shortens the fiber length of fiber delay line buffers. > or optical delay loops taken into consideration > current TCP/IP header layout has not been optimized for leading edge > containing most significant switching/routing information, or even > local-knowledge routing (with no global routes). It's too bad IPv6 > was not radical enough, so today's legacy protocols have to be tunneled > through the networks of the future. Considering that, in practice, packet headers must be processed electrically, IPv4 at the photonic backbone is just fine, if most routing table entries are aggregated at /24 or better, which is the current practice. You only have to read a 16M entry SRAM. A problem of IPv6 with 128bit addresses is that route lookup cannot be performed within a constant time of a few nanoseconds, which means packets would overrun the fiber delay lines. > I presume this future is some 20-30 years away still. Not so much. Moore's law requires a much more rapid bandwidth increase. My slides presented at IEEE photonics society 2009 summer topical ftp://chacha.hpcl.titech.ac.jp/IEEE-ST.ppt might be interesting for you.
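The disagreement above is easy to quantify. A 1500-byte frame is 12,000 bits, and light in fiber propagates at about c/1.5, roughly 2 x 10^8 m/s, so the fiber length occupied by one frame at line rate R is

\[ L = \frac{12\,000\ \text{bits}}{R}\times 2\times10^{8}\ \tfrac{\text{m}}{\text{s}} = 240\ \text{m at } R = 10\ \text{Gbps}, \qquad 2.4\ \text{m at } R = 1\ \text{Tbps}. \]

240 m is hardly a "short reach" patch cable, which is Ohta's point: practical fiber delay-line buffering wants the roughly 100x shorter loops that terabit line rates allow.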
Masataka Ohta ----- End forwarded message ----- -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Shainer at Mellanox.com Sat Jan 28 13:21:59 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Sat, 28 Jan 2012 18:21:59 +0000 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: <4F2358FA.4030009@cse.ucdavis.edu> References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu> Message-ID: > > So I wonder why multiple OEMs decided to use Mellanox for on-board > > solutions and no one used the QLogic silicon... > > That's a strange argument. It is not an argument, it is stating a fact. If someone claims that a product provides 10x better performance, best fit, etc., and yet on the other side it has very little traction, something does not make sense. > What does Intel want? Something to make them more money. Intel explained their move in their PR. They see lots of growth in HPC, definitely in the Exascale, and they see InfiniBand as a key to delivering the right solution. They also mention InfiniBand adoption in other markets, which is good validation for InfiniBand as a leading solution for any server and storage connectivity. > >> Also, keep in mind that Intel's benchmarking group in Moscow has a > >> lot of experience with benchmarking real apps for bids using > >> TrueScale head-to-head > >> against other HCAs, and I wouldn't be surprised if it was the case that TrueScale > >> QDR is faster than that other company's FDR on many real codes, > > > > > > Surprise surprise... this is no more than FUD. If you have real > > numbers to back it up please send. If it was so great, how come more > > people decided to use the Mellanox solutions? If QLogic was doing so > > great with their solution, I would guess they would not be selling the > > IB business... > > FUD = Fear, Uncertainty, and Doubt. Doesn't sound like FUD to me. > More like a cheap attack on Greg, I think we (the mailing list) can do better. I never saw any genuine testing from PathScale and then QLogic comparing their stuff to Mellanox, and you are more than welcome to try and prove me wrong. The argument in this email thread is no more than a re-cap of QLogic's latest marketing campaign and yes, it is no more than FUD. Cheap attacks are not my game, so please.... > I've personally compared several generations of Myrinet and Infinipath to > allegedly faster Mellanox adapters. Mellanox hasn't won yet, but I've not > compared QDR or FDR yet. With that said, the reason I run the benchmarks is to > find the best solution, and it might well be Mellanox next time. It would be > irresponsible for a cluster provider to just pick Mellanox FDR > over QLogic QDR because of the spec sheet. > Of course recommending QLogic over Mellanox without quantifying real-world > performance would be just as irresponsible. Going into a bit more of a technical discussion...
The QLogic way of networking is to do everything in the CPU, and the Mellanox way is to implement it all in the hardware (we all know that). The second option is a superset; therefore the worst case is even performance. I encourage you to contact me directly for any application benchmarking you do, and I will be happy to provide you feedback on what you need in order to get the best out of the Mellanox products. That can be QDR vs QDR as well, no need to go to FDR - I am open for the competition any time... > Maybe we could have fewer attacks, less complaining and hand waving, and > more useful information? IMO Greg never came across as a commercial > (which the beowulf list isn't an appropriate place for), but he does regularly contribute > useful info. Arguing market share as proof of performance superiority is just > silly. I am not sure about that... quick search in past emails can show amazing things... I believe most of us are in agreement here. Less FUD, more facts. > Speaking of which, you said: > There is some added latency due to the new 64b/66b encoding, but overall > latency is lower than QDR. MPI is below 1us. > > I googled for additional information, looked around the Mellanox website, and > couldn't find anything. Is that number relevant to > HPC folks running clusters? Does it involve a switch? If not It is with a switch -Gilad > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From eugen at leitl.org Sat Jan 28 13:41:56 2012 From: eugen at leitl.org (Eugen Leitl) Date: Sat, 28 Jan 2012 19:41:56 +0100 Subject: [Beowulf] What It'll Take to Go Exascale Message-ID: <20120128184156.GB7343@leitl.org> http://www.sciencemag.org/content/335/6067/394.full Science 27 January 2012: Vol. 335 no. 6067 pp. 394-396 DOI: 10.1126/science.335.6067.394 Computer Science What It'll Take to Go Exascale Robert F. Service Scientists hope the next generation of supercomputers will carry out a million trillion operations per second. But first they must change the way the machines are built and run. On fire. More powerful supercomputers now in the design stage should make modeling turbulent gas flames more accurate and revolutionize engine designs. "CREDIT: J. CHEN/CENTER FOR EXASCALE SIMULATION OF COMBUSTION IN TURBULENCE, SANDIA NATIONAL LABORATORIES" Using real climate data, scientists at Lawrence Berkeley National Laboratory (LBNL) in California recently ran a simulation on one of the world's most powerful supercomputers that replicated the number of tropical storms and hurricanes that had occurred over the past 30 years. Its accuracy was a landmark for computer modeling of global climate. But Michael Wehner and his LBNL colleagues have their eyes on a much bigger prize: understanding whether an increase in cloud cover from rising temperatures would retard climate change by reflecting more light back into space, or accelerate it by trapping additional heat close to Earth. To succeed, Wehner must be able to model individual cloud systems on a global scale.
To do that, he will need supercomputers more powerful than any yet designed. These so-called exascale computers would be capable of carrying out 10^18 floating point operations per second, or an exaflop. That's nearly 100 times more powerful than today's biggest supercomputer, Japan's "K Computer," which achieves 11.3 petaflops (10^15 flops) (see graph), and 1000 times faster than the Hopper supercomputer used by Wehner and his colleagues. The United States now appears poised to reach for the exascale, as do China, Japan, Russia, India, and the European Union. It won't be easy. Advances in supercomputers have come at a steady pace over the past 20 years, enabled by the continual improvement in computer chip manufacturing. But this evolutionary approach won't cut it in getting to the exascale. Instead, computer scientists must first figure out ways to make future machines far more energy efficient and tolerant of errors, and find novel ways to program them. "The step we are about to take to exascale computing will be very, very difficult," says Robert Rosner, a physicist at the University of Chicago in Illinois, who chaired a recent Department of Energy (DOE) committee charged with exploring whether exascale computers would be achievable. Charles Shank, a former director of LBNL who recently headed a separate panel collecting widespread views on what it would take to build an exascale machine, agrees. "Nobody said it would be impossible," Shank says. "But there are significant unknowns." Gaining support The next generation of powerful supercomputers will be used to design high-efficiency engines tailored to burn biofuels, reveal the causes of supernova explosions, track the atomic workings of catalysts in real time, and study how persistent radiation damage might affect the metal casing surrounding nuclear weapons. "It's a technology that has become critically important for many scientific disciplines," says Horst Simon, LBNL's deputy director. That versatility has made supercomputing an easy sell to politicians. The massive 2012 spending bill approved last month by Congress contained $1.06 billion for DOE's program in advanced computing, which includes a down payment to bring online the world's first exascale computer. Congress didn't specify exactly how much money should be spent on the exascale initiative, for which DOE had requested $126 million. But it asked for a detailed plan, due next month, with multiyear budget breakdowns listing who is expected to do what, when. Those familiar with the ways of Washington say that the request reflects an unusual bipartisan consensus on the importance of the initiative. "In today's political atmosphere, this is very unusual," says Jack Dongarra, a computer scientist at the University of Tennessee, Knoxville, who closely follows national and international high-performance computing trends. "It shows how critical it really is and the threat perceived of the U.S. losing its dominance in the field." The threat is real: Japan and China have built and operate the three most powerful supercomputers in the world. The rest of the world also hopes that their efforts will make them less dependent on U.S. technology. Of today's top 500 supercomputers, the vast majority were built using processors from Intel, Advanced Micro Devices (AMD), and NVIDIA, all U.S.-based companies. But that's beginning to change, at least at the top. Japan's K machine is built using specially designed processors from Fujitsu, a Japanese company.
China, which had no supercomputers in the Top500 List in 2000, now has five petascale machines and is building another with processors made by a Chinese company. And an E.U. research effort plans to use ARM processing chips made by a U.K. company. Getting over the bumps Although bigger and faster, supercomputers aren't fundamentally different from our desktops and laptops, all of which rely on the same sorts of specialized components. Computer processors serve as the brains that carry out logical functions, such as adding two numbers together or sending a bit of data to a location where it is needed. Memory chips, by contrast, hold data for safekeeping for later use. A network of wires connects processors and memory and allows data to flow where and when they are needed. For decades, the primary way of improving computers was creating chips with ever smaller and faster circuitry. This increased the processor's frequency, allowing it to churn through tasks at a faster clip. Through the 1990s, chipmakers steadily boosted the frequency of chips. But the improvements came at a price: The power demanded by a processor is proportional to its frequency cubed. So doubling a processor's frequency requires an eightfold increase in power. New king. Japan has the fastest machine (bar), although the United States still has the most petascale computers (number in parentheses). "CREDIT: ADAPTED FROM JACK DONGARRA/TOP 500 LIST/UNIVERSITY OF TENNESSEE" On the rise. The gap in available supercomputing capacity between the United States and the rest of the world has narrowed, with China gaining the most ground. "CREDIT: ADAPTED FROM JACK DONGARRA/TOP 500 LIST/UNIVERSITY OF TENNESSEE" With the rise of mobile computing, chipmakers couldn't raise power demands beyond what batteries could store. So about 10 years ago, chip manufacturers began placing multiple processing "cores" side by side on single chips. This arrangement meant that only twice the power was needed to double a chip's performance. This trend swept through the world of supercomputers. Those with single souped-up processors gave way to today's "parallel" machines that couple vast numbers of off-the-shelf commercial processors together. This move to parallel computing "was a huge, disruptive change," says Robert Lucas, an electrical engineer at the University of Southern California's Information Sciences Institute in Los Angeles. Hardware makers and software designers had to learn how to split problems apart, send individual pieces to different processors, synchronize the results, and synthesize the final ensemble. Today's top machine, Japan's "K Computer," has 705,000 cores. If the trend continues, an exascale computer would have between 100 million and 1 billion processors. But simply scaling up today's models won't work. "Business as usual will not get us to the exascale," Simon says. "These computers are becoming so complicated that a number of issues have come up that were not there before," Rosner agrees. The biggest issue relates to a supercomputer's overall power use. The largest supercomputers today use about 10 megawatts (MW) of power, enough to power 10,000 homes. If the current trend of power use continues, an exascale supercomputer would require 200 MW. "It would take a nuclear power reactor to run it," Shank says. Even if that much power were available, the cost would be prohibitive. At $1 million per megawatt per year, the electricity to run an exascale machine would cost $200 million annually. "That's a non-starter," Shank says.
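The frequency-cubed rule of thumb (commonly derived from P ≈ C V^2 f with supply voltage scaled roughly in proportion to frequency) makes the multicore trade-off explicit:

\[ \frac{P(2f)}{P(f)} = 2^{3} = 8, \]

whereas two cores at the original frequency deliver the same doubling of peak throughput for roughly twice the power, a 4x advantage in performance per watt. That arithmetic, more than any single technology, is what ended frequency scaling.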
So the current target is a machine that draws 20 MW at most. Even that goal will require a 300-fold improvement in flops per watt over today's technology. Ideas for getting to these low-power chips are already circulating. One would make use of different types of specialized cores. Today's top-of-the-line supercomputers already combine conventional processor chips, known as CPUs, with an alternative version called graphical processing units (GPUs), which are very fast at certain types of calculations. Chip manufacturers are now looking at going from "multicore" chips with four or eight cores to "many-core" chips, each containing potentially hundreds of CPU and GPU cores, allowing them to assign different calculations to specialized processors. That change is expected to make the overall chips more energy efficient. Intel, AMD, and other chip manufacturers have already announced plans to make hybrid many-core chips. Another stumbling block is memory. As the number of processors in a supercomputer skyrockets, so, too, does the need to add memory to feed bits of data to the processors. Yet, over the next few years, memory manufacturers are not projected to increase the storage density of their chips fast enough to keep up with the performance gains of processors. Supercomputer makers can get around this by adding additional memory modules. But that's threatening to drive costs too high, Simon says. Even if researchers could afford to add more memory modules, that still won't solve matters. Moving ever-growing streams of data back and forth to processors is already creating a backup for processors that can dramatically slow a computer's performance. Today's supercomputers use 70% of their power to move bits of data around from one place to another. One potential solution would stack memory chips on top of one another and run communication and power lines vertically through the stack. This more-compact architecture would require fewer steps to route data. Another approach would stack memory chips atop processors to minimize the distance bits need to travel. A third issue is errors. Modern processors compute with stunning accuracy, but they aren't perfect. The average processor will produce one error per year, as a thermal fluctuation or a random electrical spike flips a bit of data from one value to another. Such errors are relatively easy to ferret out when the number of processors is low. But it gets much harder when 100 million to 1 billion processors are involved. And increasing complexity produces additional software errors as well. One possible solution is to have the supercomputer crunch different problems multiple times and "vote" for the most common solution. But that creates a new problem. "How can I do this without wasting double or triple the resources?" Lucas asks. "Solving this problem will probably require new circuit designs and algorithms." Finally, there is the challenge of redesigning the software applications themselves, such as a novel climate model or a simulation of a chemical reaction. "Even if we can produce a machine with 1 billion processors, it's not clear that we can write software to use it efficiently," Lucas says. Current parallel computing machines use a strategy, known as message passing interface, that divides computational problems and parses out the pieces to individual processors, then collects the results. But coordinating all this traffic for millions of processors is becoming a programming nightmare.
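The "vote" scheme the article sketches is classic triple modular redundancy. A minimal sketch follows; run_task() is a hypothetical stand-in for a real computation, with an artificial bit flip injected into one replica:

/* tmr_vote.c - minimal sketch of "compute several times and vote"
 * (triple modular redundancy). run_task() is a hypothetical stand-in
 * for real work whose result may be corrupted by a transient bit flip. */
#include <stdio.h>
#include <stdint.h>

static uint64_t run_task(int replica)
{
    uint64_t result = 42;            /* pretend this is expensive work */
    if (replica == 1)
        result ^= 4;                 /* simulate a bit flip in replica 1 */
    return result;
}

/* majority vote over three results; returns -1 if all three disagree */
static int vote(uint64_t a, uint64_t b, uint64_t c, uint64_t *out)
{
    if (a == b || a == c) { *out = a; return 0; }
    if (b == c)           { *out = b; return 0; }
    return -1;                       /* no majority: must rerun */
}

int main(void)
{
    uint64_t r[3], winner;
    for (int i = 0; i < 3; i++)
        r[i] = run_task(i);
    if (vote(r[0], r[1], r[2], &winner) == 0)
        printf("majority result: %llu\n", (unsigned long long)winner);
    else
        printf("no majority, rerun needed\n");
    return 0;
}

The cost is exactly the double-or-triple resource multiple Lucas worries about, which is why production systems lean on checkpoint/restart and replicate only the most error-sensitive work.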
"There's a huge concern that the programming paradigm will have to change," Rosner says. DOE has already begun laying the groundwork to tackle these and other challenges. Last year it began funding three "co-design" centers, multi-institution cooperatives led by researchers at Los Alamos, Argonne, and Sandia national laboratories. The centers bring together scientific users who write the software code and hardware makers to design complex software and computer architectures that work in the fastest and most energy-efficient manner. It poses a potential clash between scientists who favor openness and hardware companies that normally keep their activities secret for proprietary reasons. "But it's a worthy goal," agrees Wilfred Pinfold, Intel's director of extreme-scale programming in Hillsboro, Oregon. Not so fast. Researchers have some ideas on how to overcome barriers to building exascale machines. Coming up with the cash Solving these challenges will take money, and lots of it. Two years ago, Simon says, DOE officials estimated that creating an exascale computer would cost $3 billion to $4 billion over 10 years. That amount would pay for one exascale computer for classified defense work, one for nonclassified work, and two 100-petaflops machines to work out some of the technology along the way. Those projections assumed that Congress would deliver a promised 10-year doubling of the budget of DOE's Office of Science. But those assumptions are "out of the window," Simon says, replaced by the more likely scenario of budget cuts as Congress tries to reduce overall federal spending. Given that bleak fiscal picture, DOE officials must decide how aggressively they want to pursue an exascale computer. "What's the right balance of being aggressive to maintain a leadership position and having the plan sent back to the drawing board by [the Office of Management and Budget]?" Simon asks. "I'm curious to see." DOE's strategic plan, due out next month, should provide some answers. The rest of the world faces a similar juggling act. China, Japan, the European Union, Russia, and India all have given indications that they hope to build an exascale computer within the next decade. Although none has released detailed plans, each will need to find the necessary resources despite these tight fiscal times. The victor will reap more than scientific glory. Companies use 57% of the computing time on the machines on the Top500 List, looking to speed product design and gain other competitive advantages, Dongarra says. So government officials see exascale computing as giving their industries a leg up. That's particularly true for chip companies that plan to use exascale designs to improve future commodity electronics. "It will have dividends all the way down to the laptop," says Peter Beckman, who directs the Exascale Technology and Computing Initiative at Argonne National Laboratory in Illinois. The race to provide the hardware needed for exascale computing "will be extremely competitive," Beckman predicts, and developing software and networking technology will be equally important, according to Dongarra. Even so, many observers think that the U.S. track record and the current alignment of its political and scientific forces makes it America's race to lose. Whatever happens, U.S. scientists are unlikely to be blindsided.
The task of building the world's first exascale computer is so complex, Simon says, that it will be nearly impossible for a potential winner to hide in the shadows and come out of nowhere to claim the prize. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Sat Jan 28 14:26:48 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat, 28 Jan 2012 14:26:48 -0500 (EST) Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <20120128101732.GG7343@leitl.org> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> Message-ID: >> the simple concepts of basic etiquette. > > Who's the list moderator, by the way? no, please - if there were a moderator who had to plow through all messages, no matter how long, meandering and low-worth, it would become a very unpleasant chore... the list doesn't get a lot of passing weirdos - pretty stable set of characters, fairly predictable in how much you want to read their messages, and how much good you expect to gain from them ;) regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Sat Jan 28 16:28:09 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Sat, 28 Jan 2012 16:28:09 -0500 (EST) Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu> Message-ID: >>> So I wonder why multiple OEMs decided to use Mellanox for on-board >>> solutions and no one used the QLogic silicon... >> >> That's a strange argument. > > It is not an argument, it is stating a fact. you are mistaken. you ask a pointed question - do not construe it as a statement of fact. if you wanted to state a fact, you might say: "multiple OEMs decided to use Mellanox and none have used Qlogic". by stating this, you are implying that Mellanox is superior in some way, though another perfectly adequate explanation could be that Qlogic didn't offer their chips to OEMs, or did so at a higher price. (in fact, the latter would suggest the possibility that Qlogic chips are actually worth more.) note my use of subjunctive here. in reality, Mellanox is the easy choice - widely known and used, the default. OEMs are fond of making easy choices: more comfortable to a lazy customer, possibly lower customer support costs, etc. this says nothing about whether an easy choice is a superior solution to the customer (that is, in performance, price, etc). 
> If someone claims that a product provides 10x better performance, best fit > etc., and yet on the other side it has very little traction, something does > not make sense. I saw no 10x performance claim here. there was some casual mention of a situation where Qlogic QDR performs similar to Mellanox FDR. > good validation for InfiniBand as a leading solution for any server and > storage connectivity. besides Lustre, where do you see IB used for storage? > Going into a bit more of a technical discussion... The QLogic way of networking > is to do everything in the CPU, and the Mellanox way is to implement it all in > the hardware (we all know that). this is a dishonest statement: you know that QLogic isn't actually trying to do *everything* in the CPU. > The second option is a superset; therefore > the worst case is even performance. this is also dishonest: making the adapter more intelligent clearly introduces some tradeoffs, so it's _not_ a superset. unless you are claiming that within every Mellanox adapter is _literally_ the same functionality, at the same performance, as is in a Qlogic adapter. >> Maybe we could have fewer attacks, less complaining and hand waving, and >> more useful information? IMO Greg never came across as a commercial >> (which the beowulf list isn't an appropriate place for), but he does regularly contribute >> useful info. Arguing market share as proof of performance superiority is just >> silly. > > I am not sure about that... quick search in past emails can show amazing things... > I believe most of us are in agreement here. Less FUD, more facts. "facts" in this context (as opposed to FUD, arm-waving, etc) must be dispassionate and quantifiable. not hyperbole and suggestive rhetoric. out of curiosity, has anyone set up a head-to-head comparison (two or more identical machines, both with a Qlogic and a Mellanox card of the same vintage)? regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From diep at xs4all.nl Sat Jan 28 19:12:59 2012 From: diep at xs4all.nl (Vincent Diepeveen) Date: Sun, 29 Jan 2012 01:12:59 +0100 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu> Message-ID: On Jan 28, 2012, at 10:28 PM, Mark Hahn wrote: [snip] > out of curiosity, has anyone set up a head-to-head comparison > (two or more identical machines, both with a Qlogic and a Mellanox > card of > the same vintage)? > > regards, mark hahn. Mark, I stumbled upon the same problem a few months ago. When you google for 4x InfiniBand you can find something; moving up to QDR, it becomes more sporadic. Not to mention that the interesting test is where the cards are weak - latency. If you find anything, it's usually manufacturer-side statements without a clear test setup, usually doing 0-byte tests. This is exactly why I intend to write a benchmark. What I personally believe about FDR, PCIe 3.0, and its considerably higher claimed bandwidth versus PCIe 2.0 QDR is not important. What I do believe is that one must measure objectively.
That's why I've been posting for a while now that, as soon as the cluster works here, I'm going to write a benchmark that measures latencies while slowly increasing the read length, so that it gradually becomes a bandwidth game, and simply present the graph for interested readers. We're not interested in theoretical tests where just one core is busy, measuring the latency to one core busy at the other side. A test really requires all cores busy and hammering the network card. In the end everything is always a measure of bandwidth, of course, but even then, scientists who have objectively tested QDR, no matter *what manufacturer*, are in short supply; some of them tested just one tiny thing or a theoretical thing, or just lacked all realism when I read the rest of the article. All in all, after some days of googling, I found one tester who tried something using the same switch (a good idea), but the graphs presenting the results are tough to interpret, and he was basically interested in something other than which network card is fastest now. Running the same old tests, when all manufacturers now have much faster alternatives such as RDMA reads, is just not interesting. To be continued in some months... > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From Shainer at Mellanox.com Sun Jan 29 00:03:31 2012 From: Shainer at Mellanox.com (Gilad Shainer) Date: Sun, 29 Jan 2012 05:03:31 +0000 Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu> Message-ID: > >>> So I wonder why multiple OEMs decided to use Mellanox for on-board > >>> solutions and no one used the QLogic silicon... > >> > >> That's a strange argument. > > > > It is not an argument, it is stating a fact. > > you are mistaken. you ask a pointed question - do not construe it as a > statement of fact. if you wanted to state a fact, you might say: > "multiple OEMs decided to use Mellanox and none have used Qlogic". You probably meant to say "I think differently" and not "you are mistaken".... Making this mailing list a little more polite will benefit us all. > by stating this, you are implying that Mellanox is superior in some way, though > another perfectly adequate explanation could be that Qlogic didn't offer their > chips to OEMs, or did so at a higher price. (in fact, the latter would suggest the > possibility that Qlogic chips are actually worth more.) note my use of > subjunctive here. > > in reality, Mellanox is the easy choice - widely known and used, the default. > OEMs are fond of making easy choices: more comfortable to a lazy customer, > possibly lower customer support costs, etc. > > this says nothing about whether an easy choice is a superior solution to the > customer (that is, in performance, price, etc). OEMs don't place devices on the motherboard just because they can, nor because it is cheaper. They do so because they believe it will benefit their users, hence they will sell more. I can assure you that silicon was offered from both companies, and it wasn't an issue of price. From this point you can make any conclusion that you wish to.
OEMs don't place devices on the motherboard just because they can, not because it is cheaper. They do so because they believe it will benefit their users, hence they will sell more. I can assure you that silicon was offered from both companies, and it wasn't an issue of price. From this point you can make any conclusion that you wish to. > >good validation for InfiniBand as a leading solution for any server and > >storage connectivity. > > besides Lustre, where do you see IB used for storage? Protocols: iSER (iSCSI), NFSoRDMA, SRP, GPFS, SMB and others OEMs: DDN, Xyratex, Netapp, EMC, Oracle, SGI, HP, IBM and others. > > Going into a bit more of a technical discussion... QLogic way of networking > >is doing everything in the CPU, and Mellanox way is to implement if all in > >the hardware (we all know that). > > this is a dishonest statement: you know that QLogic isn't actually trying > to do *everything* in the CPU. You are right, you do need a HW translation from PCIe to IB. But I am sure you know where the majority of the transport, error handling etc is being done.... > > The second option is a superset, therefore > >worse case can be even performance. > > this is also dishonest: making the adapter more intelligent clearly > introduces some tradeoffs, so it's _not_ a superset. unless you are > claiming that within every Mellanox adapter is _literally_ the same > functionality, at the same performance, as is in a Qlogic adapter. It is not dishonest. In general offloading is a superset. You can chose to implement just offloading or to leave room for CPU control as well. There will always be parts that are better to be in HW, and if you have flexibility for the rest it is a superset. > >> Maybe we could have a few less attacks, complaining and hand waving and > >> more useful information? IMO Greg never came across as a commercial > >> (which beowulf list isn't an appropriate place for), but does regularly > contribute > >> useful info. Arguing market share as proof of performance superiority is > just > >> silly. > > > > I am not sure about that... quick search in past emails can show amazing > things... > > I believe most of us are in agreement here. Less FUD, more facts. > > "facts" in this context (as opposed to FUD, armwaiving, etc) must be > dispassionate and quantifiable. not hyperbole and suggestive rhetoric. Maybe we read different emails. > out of curiosity, has anyone set up a head-to-head comparison > (two or more identical machines, both with a Qlogic and a Mellanox card of > the same vintage)? > > regards, mark hahn. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From hahn at mcmaster.ca Mon Jan 30 10:04:53 2012 From: hahn at mcmaster.ca (Mark Hahn) Date: Mon, 30 Jan 2012 10:04:53 -0500 (EST) Subject: [Beowulf] Intel buys QLogic InfiniBand business In-Reply-To: References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu> Message-ID: >> out of curiosity, has anyone set up a head-to-head comparison >> (two or more identical machines, both with a Qlogic and a Mellanox card of >> the same vintage)? 
>> >> There was a bit of discussion of InfiniBand benchmarking in this thread > and it seems it would be helpful to casual readers like myself to have > a few references to benchmarking toolkits and actual results. > > Most often reported results are gathered with either Netpipe from Ames or > Intel MPI Benchmark (formerly known as the Pallas Benchmark) or the OSU > Micro-benchmarks. > > Searching the web produced a recent report from Swiss CSCS where a Mellanox > ConnectX3 QDR HCA with a Mellanox switch is set against a Qlogic 7300 QDR > HCA connected to a Qlogic switch. > http://www.cscs.ch/fileadmin/user_upload/customers/cscs/Tech_Reports/Performance_Analysis_IB-QDR_final-2.pdf as far as I can tell, this paper mainly says "a coalescing stack delivers benchmark results showing a lot higher bandwidth and message rate than a non-coalescing stack." the comment on figure 8: To some extent, the environment variables mentioned before contribute to this outstanding result which is remarkably droll. I'm not sure how well coalescing works for real applications. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From prentice at ias.edu Mon Jan 30 11:20:46 2012 From: prentice at ias.edu (Prentice Bisbal) Date: Mon, 30 Jan 2012 11:20:46 -0500 Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business In-Reply-To: <20120128101732.GG7343@leitl.org> References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> Message-ID: <4F26C35E.7060702@ias.edu> On 01/28/2012 05:17 AM, Eugen Leitl wrote: > On Fri, Jan 27, 2012 at 01:29:52PM -0500, Prentice Bisbal wrote: > >> What it says is that we've given up on discussing technology with you, >> because your arguments are completely nonsensical. Since you clearly >> don't understand technology, we're hoping you can at least understand >> the simple concepts of basic etiquette. > Who's the list moderator, by the way? > I don't think there is one, hence all the noise. The mailing list and beowulf.org are maintained by Penguin Computing/Scyld Software. Maybe they'd be interested in appointing a moderator or 3. --- Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
From Shainer at Mellanox.com Mon Jan 30 14:22:24 2012
From: Shainer at Mellanox.com (Gilad Shainer)
Date: Mon, 30 Jan 2012 19:22:24 +0000
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu>
Message-ID:

> >> out of curiosity, has anyone set up a head-to-head comparison (two or
> >> more identical machines, both with a Qlogic and a Mellanox card of
> >> the same vintage)?
> >>
> >> There was a bit of discussion of InfiniBand benchmarking in this
> >> thread
> > and it seems it would be helpful to casual readers like myself to
> > have a few references to benchmarking toolkits and actual results.
> >
> > Most often reported results are gathered with either NetPIPE from Ames,
> > the Intel MPI Benchmarks (formerly known as the Pallas benchmark), or
> > the OSU Micro-benchmarks.
> >
> > Searching the web produced a recent report from the Swiss CSCS where a
> > Mellanox
> > ConnectX3 QDR HCA with a Mellanox switch is set against a Qlogic 7300
> > QDR HCA connected to a Qlogic switch.
> > http://www.cscs.ch/fileadmin/user_upload/customers/cscs/Tech_Reports/P
> > erformance_Analysis_IB-QDR_final-2.pdf
>
> as far as I can tell, this paper mainly says "a coalescing stack delivers
> benchmark results showing a lot higher bandwidth and message rate than a
> non-coalescing stack." the comment on figure 8:
>
> To some extent, the environment variables mentioned before
> contribute to this outstanding result
>
> which is remarkably droll. I'm not sure how well coalescing works for real
> applications.

First, I looked at the paper and it includes latency and bandwidth
comparisons as well, not only message rate. It is important for others to
know that, and not to dismiss it. Second, both companies have options for
message coalescing. You can choose to use it or not - I saw apps that got a
benefit from it, and saw applications that do not. Without coalescing
Mellanox provides around 30M messages per second.

-Gilad.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From peter.st.john at gmail.com Mon Jan 30 18:07:11 2012
From: peter.st.john at gmail.com (Peter St. John)
Date: Mon, 30 Jan 2012 18:07:11 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F26C35E.7060702@ias.edu>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu>
Message-ID:

Instead of appointing a moderator, we could grow one with recursive Page
Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we knew
about this type of thing a while ago because of "citation analysis", see
the link).

Someone writes an open script and members of the list mail it with the
answers to these three questions:
1. do you volunteer to moderate?
2. Who should moderate? (give email addresses)
3. Who should judge who should moderate? (give email addresses).

Then you iterate over scoring people by "wisdom" and who gets the most
"wise" votes, until the scores converge.
The biggest hurdle would probably be getting volunteers, though.
Peter

On Mon, Jan 30, 2012 at 11:20 AM, Prentice Bisbal wrote:

> On 01/28/2012 05:17 AM, Eugen Leitl wrote:
> > On Fri, Jan 27, 2012 at 01:29:52PM -0500, Prentice Bisbal wrote:
> >
> >> What it says is that we've given up on discussing technology with you,
> >> because your arguments are completely nonsensical. Since you clearly
> >> don't understand technology, we're hoping you can at least understand
> >> the simple concepts of basic etiquette.
> > Who's the list moderator, by the way?
> >
>
> I don't think there is one, hence all the noise. The mailing list and
> beowulf.org is maintained by Penguin Computing/Scyld Software. Maybe
> they'd be interested in appointing a moderator or 3.
>
> ---
> Prentice
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From landman at scalableinformatics.com Mon Jan 30 18:09:48 2012
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 30 Jan 2012 18:09:48 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu>
Message-ID: <4F27233C.8080508@scalableinformatics.com>

On 01/30/2012 06:07 PM, Peter St. John wrote:
> Instead of appointing a moderator, we could grow one with recursive Page
> Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we knew
> about this type of thing a while ago because of "citation analysis", see
> the link).

Please ... no moderator. Lists get boring while waiting for content
filtering organisms to fulfill their voluntary tasks ...

If you don't like someone's writing, filter them.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
      http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.
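Peter's "grow a moderator" scheme is, at bottom, power iteration over an endorsement graph: the answers to questions 2 and 3 form the edges, and a member's "wisdom" score is the damped sum of the scores flowing in from whoever nominated them, recomputed until nothing changes. A toy sketch of that iteration follows; the addresses, the damping factor, and the convergence threshold are illustrative assumptions, not part of Peter's proposal.

# Toy sketch of the recursive "wisdom" ranking (all addresses hypothetical).
# Each member nominates candidates; scores are iterated PageRank-style, so
# nominations from high-scoring members count for more.
nominations = {
    "alice@example.org": ["bob@example.org", "carol@example.org"],
    "bob@example.org":   ["carol@example.org"],
    "carol@example.org": ["bob@example.org"],
}

members = sorted(nominations)
wisdom = {m: 1.0 / len(members) for m in members}   # uniform starting scores
DAMPING = 0.85                                      # usual PageRank damping

for _ in range(100):
    new = {}
    for m in members:
        # score flows in from each nominator, split across their nominations
        inflow = sum(wisdom[v] / len(nominations[v])
                     for v in members if m in nominations[v])
        new[m] = (1 - DAMPING) / len(members) + DAMPING * inflow
    delta = max(abs(new[m] - wisdom[m]) for m in members)
    wisdom = new
    if delta < 1e-9:          # scores have converged
        break

for m in sorted(members, key=wisdom.get, reverse=True):
    print("%-20s %.4f" % (m, wisdom[m]))

As Peter says, the arithmetic converges quickly; finding anyone willing to occupy the top of the ranking is the hard part.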
From james.p.lux at jpl.nasa.gov Mon Jan 30 18:21:45 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 30 Jan 2012 15:21:45 -0800
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu>
Message-ID:

The biggest hurdle would probably be getting volunteers, though.
Peter

You got that right... Moderating takes a deft touch and a thick skin.
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From james.p.lux at jpl.nasa.gov Mon Jan 30 18:25:49 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Mon, 30 Jan 2012 15:25:49 -0800
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F27233C.8080508@scalableinformatics.com>
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> <4F27233C.8080508@scalableinformatics.com>
Message-ID:

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Joe Landman
Sent: Monday, January 30, 2012 3:10 PM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business

On 01/30/2012 06:07 PM, Peter St. John wrote:
> Instead of appointing a moderator, we could grow one with recursive
> Page Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we
> knew about this type of thing a while ago because of "citation
> analysis", see the link).

Please ... no moderator. Lists get boring while waiting for content filtering organisms to fulfill their voluntary tasks ...

If you don't like someone's writing, filter them.

--

I agree. However, there is also "after the fact moderation".. all posts go through by default, but someone acts as a "list conscience" and gently (or not so gently) applies a corrective force, presumably using some sort of adaptive algorithm (different people have different "plant characteristics" so the optimal controller changes). But that requires an even deft-er touch and thicker skin.

All lists with participation by knowledgeable and opinionated people with varied interests and specialization tend to go off on tangents occasionally. You just delete when needed, and wait for the transient to die out.
My best guess is that about 48 hours is how long the transient lasts (because it takes two cycles, for those who read the list once a day, to realize that it's died out and not keep feeding it)
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From deadline at eadline.org Mon Jan 30 18:52:14 2012
From: deadline at eadline.org (Douglas Eadline)
Date: Mon, 30 Jan 2012 18:52:14 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <1327559032.32034.YahooMailClassic@web120504.mail.ne1.yahoo.com> <4F21E159.7000905@unimelb.edu.au> <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu>
Message-ID: <294b053bd84fed49f071a631c79be7e8.squirrel@mail.eadline.org>

I use my personal Zen type moderation.

yea, whatever

--
Doug

> Instead of appointing a moderator, we could grow one with recursive Page
> Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we knew
> about this type of thing a while ago because of "citation analysis", see
> the link).
>
> Someone writes an open script and members of the list mail it with the
> answers to these three questions:
> 1. do you volunteer to moderate?
> 2. Who should moderate? (give email addresses)
> 3. Who should judge who should moderate? (give email addresses).
>
> Then you iterate over scoring people by "wisdom" and who gets the most
> "wise" votes, until the scores converge.
> The biggest hurdle would probably be getting volunteers, though.
> Peter
>
> On Mon, Jan 30, 2012 at 11:20 AM, Prentice Bisbal
> wrote:
>
>> On 01/28/2012 05:17 AM, Eugen Leitl wrote:
>> > On Fri, Jan 27, 2012 at 01:29:52PM -0500, Prentice Bisbal wrote:
>> >
>> >> What it says is that we've given up on discussing technology with
>> you,
>> >> because your arguments are completely nonsensical. Since you clearly
>> >> don't understand technology, we're hoping you can at least understand
>> >> the simple concepts of basic etiquette.
>> > Who's the list moderator, by the way?
>> >
>>
>> I don't think there is one, hence all the noise. The mailing list and
>> beowulf.org is maintained by Penguin Computing/Scyld Software. Maybe
>> they'd be interested in appointing a moderator or 3.
>>
>> ---
>> Prentice
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>

--
Doug

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From lindahl at pbm.com Tue Jan 31 02:53:18 2012
From: lindahl at pbm.com (Greg Lindahl)
Date: Mon, 30 Jan 2012 23:53:18 -0800
Subject: [Beowulf] Intel buys QLogic InfiniBand business
In-Reply-To:
References: <20120123192826.GB17383@bx9.net> <20120124045541.GB10196@bx9.net> <8B4C5C67-6548-4B70-B250-CE6C930E5510@online.no> <20120127221312.GA29961@bx9.net> <4F2358FA.4030009@cse.ucdavis.edu>
Message-ID: <20120131075318.GA2600@bx9.net>

On Mon, Jan 30, 2012 at 10:04:53AM -0500, Mark Hahn wrote:

> > http://www.cscs.ch/fileadmin/user_upload/customers/cscs/Tech_Reports/Performance_Analysis_IB-QDR_final-2.pdf
>
> as far as I can tell, this paper mainly says "a coalescing stack delivers
> benchmark results showing a lot higher bandwidth and message rate than a
> non-coalescing stack." the comment on figure 8:
>
> To some extent, the environment variables mentioned before
> contribute to this outstanding result
>
> which is remarkably droll. I'm not sure how well coalescing works for real
> applications.

Note also that many of the benchmarks in this analysis weren't run
using MPI -- if I remember correctly, the ib_* commands mentioned use
InfiniBand verbs directly, which means they aren't accelerated on
InfiniPath.

-- greg
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From eugen at leitl.org Tue Jan 31 04:28:18 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 31 Jan 2012 10:28:18 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <4F27233C.8080508@scalableinformatics.com>
References: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> <4F27233C.8080508@scalableinformatics.com>
Message-ID: <20120131092818.GW7343@leitl.org>

On Mon, Jan 30, 2012 at 06:09:48PM -0500, Joe Landman wrote:
> On 01/30/2012 06:07 PM, Peter St. John wrote:
> > Instead of appointing a moderator, we could grow one with recursive Page
> > Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we knew
> > about this type of thing a while ago because of "citation analysis", see
> > the link).
>
> Please ... no moderator. Lists get boring while waiting for content
> filtering organisms to fulfill their voluntary tasks ...

On all the lists I run and participate in you only turn moderation on
by default for new list members and put known bozos on permanent
moderation. The result is zero delay as soon as new list subscribers
have produced their first non-spam non-bozo post.

> If you don't like someone's writing, filter them.

I already do, but content producers typically don't bother and vote
with their feet. I have seen many communities die in that manner.
Never surprising, still always sad.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From eugen at leitl.org Tue Jan 31 04:31:04 2012
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 31 Jan 2012 10:31:04 +0100
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu>
Message-ID: <20120131093104.GX7343@leitl.org>

On Mon, Jan 30, 2012 at 03:21:45PM -0800, Lux, Jim (337C) wrote:
>
>
> The biggest hurdle would probably be getting volunteers, though.
> Peter
>
> You got that right... Moderating takes a deft touch and a thick skin.

I would have no issues moderating Beowulf@ since that would
require only negligible additional workload.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From Glen.Beane at jax.org Tue Jan 31 07:15:51 2012
From: Glen.Beane at jax.org (Glen Beane)
Date: Tue, 31 Jan 2012 12:15:51 +0000
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To: <20120131093104.GX7343@leitl.org>
References: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> <20120131093104.GX7343@leitl.org>
Message-ID:

On Jan 31, 2012, at 4:31 AM, Eugen Leitl wrote:

> On Mon, Jan 30, 2012 at 03:21:45PM -0800, Lux, Jim (337C) wrote:
>>
>>
>> The biggest hurdle would probably be getting volunteers, though.
>> Peter
>>
>> You got that right... Moderating takes a deft touch and a thick skin.
>
> I would have no issues moderating Beowulf@ since that would
> require only negligible additional workload.

Did this list use to be moderated? I remember when I first joined there
would be a significant delay for my email sent to the list, and while I
was waiting for my replies to show up a whole conversation would be
unfolding between "veteran posters"

--
Glen L. Beane
Senior Software Engineer
The Jackson Laboratory
(207) 288-6153
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From ellis at cse.psu.edu Tue Jan 31 10:30:48 2012
From: ellis at cse.psu.edu (Ellis H. Wilson III)
Date: Tue, 31 Jan 2012 10:30:48 -0500
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
In-Reply-To:
References: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> <20120131093104.GX7343@leitl.org>
Message-ID: <4F280928.7080806@cse.psu.edu>

On 01/31/2012 07:15 AM, Glen Beane wrote:
> Did this list use to be moderated? I remember when I first joined there would be a significant delay for my email sent to the list, and while I was waiting for my replies to show up a whole conversation would be unfolding between "veteran posters"

Yea, same used to happen to me back in '06 when I first joined. Sent an
email about it and got a response back from Don Becker stating that I
was taken off the moderation list. I'm not sure if he's still the
moderator anymore, however. While I think that's a great way to deal
with newcomers, I'm not sure there is a fair way to determine which of
the existing posters are and are not trolls deserving of moderation.
Therefore I also vote to continue in a non-moderated fashion.

On that note, my sincere apologies to the list if any of my replies
served in any way to kindle this discussion. I got a bit colorful due
to a building frustration from years of eye-rolling.

Best,

ellis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From cbergstrom at pathscale.com Tue Jan 31 10:40:48 2012
From: cbergstrom at pathscale.com ("C. Bergström")
Date: Tue, 31 Jan 2012 22:40:48 +0700
Subject: [Beowulf] List moderation
In-Reply-To: <4F280928.7080806@cse.psu.edu>
References: <95FE2494-EEF9-497E-B446-C44F970FF4DE@xs4all.nl> <4F22CB68.3080605@ias.edu> <4F22CFEB.6080404@cse.psu.edu> <3AA1B93B-6157-4A71-B800-083FEAFBE3C4@xs4all.nl> <4F22ED20.7040105@ias.edu> <20120128101732.GG7343@leitl.org> <4F26C35E.7060702@ias.edu> <20120131093104.GX7343@leitl.org> <4F280928.7080806@cse.psu.edu>
Message-ID: <4F280B80.6030800@pathscale.com>

On 01/31/12 10:30 PM, Ellis H. Wilson III wrote:
> On 01/31/2012 07:15 AM, Glen Beane wrote:
>> Did this list use to be moderated? I remember when I first joined there would be a significant delay for my email sent to the list, and while I was waiting for my replies to show up a whole conversation would be unfolding between "veteran posters"
> Yea, same used to happen to me back in '06 when I first joined. Sent an
> email about it and got a response back from Don Becker stating that I
> was taken off the moderation list. I'm not sure if he's still the
> moderator anymore, however. While I think that's a great way to deal
> with newcomers, I'm not sure there is a fair way to determine which of
> the existing posters are and are not trolls deserving of moderation.
> Therefore I also vote to continue in a non-moderated fashion.
-1

From a bystander perspective I'm all for moderation and reducing the
noise. Even people who have their posts moderated would likely be
understanding that it's for the greater good. Let's call it peer review
instead of "moderation".
imho someone with some guts just needs to do it so this doesn't turn into
a bikeshed discussion
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From joshua_mora at usa.net Tue Jan 31 14:19:46 2012
From: joshua_mora at usa.net (Joshua mora acosta)
Date: Tue, 31 Jan 2012 13:19:46 -0600
Subject: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business
Message-ID: <525qaETsU7536S02.1328037586@web02.cms.usa.net>

I agree with Joe. Plus I know that most of us, if not all, truly want to
share knowledge, and why not, opinions as well based on personal experiences,
as long as "we all make the effort to be respectful with both the individual
and the technology, and are open/receptive to criticism as well".
That is in fact the reason I like this distribution list.

Joshua.

------ Original Message ------
Received: 05:11 PM CST, 01/30/2012
From: Joe Landman
To: beowulf at beowulf.org
Subject: Re: [Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business

> On 01/30/2012 06:07 PM, Peter St. John wrote:
> > Instead of appointing a moderator, we could grow one with recursive Page
> > Ranking (http://en.wikipedia.org/wiki/Google_ranking) (in math we knew
> > about this type of thing a while ago because of "citation analysis", see
> > the link).
>
> Please ... no moderator. Lists get boring while waiting for content
> filtering organisms to fulfill their voluntary tasks ...
>
> If you don't like someone's writing, filter them.
>
> --
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics Inc.
> email: landman at scalableinformatics.com
> web : http://scalableinformatics.com
>       http://scalableinformatics.com/sicluster
> phone: +1 734 786 8423 x121
> fax : +1 866 888 3112
> cell : +1 734 612 4615
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From mdidomenico4 at gmail.com Tue Jan 31 15:55:55 2012
From: mdidomenico4 at gmail.com (Michael Di Domenico)
Date: Tue, 31 Jan 2012 15:55:55 -0500
Subject: [Beowulf] rear door heat exchangers
Message-ID:

i'm looking for, but have not found yet, a rear door heat exchanger
with fans. the door should be able to support up to 35kw using
chilled water. has anyone seen such an animal?

most of the ones i've seen utilize a side car that sits beside the
rack. unfortunately, i'm space limited and i need something that will
hang on the back of the rack.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.
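Before shopping for a door, it is worth a quick sanity check on what 35kw implies for the chilled-water loop, using the heat balance Q = mdot * cp * dT. The sketch below assumes a 10 F water-side temperature rise across the door; that figure is a guessed design point, not a vendor spec.

# Back-of-envelope water flow for a 35 kW rear-door heat exchanger.
# Heat balance: Q = mdot * cp * dT. The 10 F rise is an assumption.
Q_WATTS = 35000.0           # heat load to be removed
CP_WATER = 4186.0           # specific heat of water, J/(kg K)
DT_K = 10.0 * 5.0 / 9.0     # assumed 10 F water-side rise, in kelvin

mdot = Q_WATTS / (CP_WATER * DT_K)   # required flow in kg/s
gpm = mdot / 0.0631                  # 1 US gal/min of water is ~0.0631 kg/s
print("~%.2f kg/s, about %.0f US gal/min at a 10 F rise" % (mdot, gpm))

At that assumed rise the answer comes out near 24 gal/min per rack, so the facility loop, and not just the door, has to be sized for the load.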
From lathama at gmail.com Tue Jan 31 16:13:48 2012
From: lathama at gmail.com (Andrew Latham)
Date: Tue, 31 Jan 2012 18:13:48 -0300
Subject: [Beowulf] rear door heat exchangers
In-Reply-To:
References:
Message-ID:

On Tue, Jan 31, 2012 at 5:55 PM, Michael Di Domenico wrote:
> i'm looking for, but have not found yet, a rear door heat exchanger
> with fans. the door should be able to support up to 35kw using
> chilled water. has anyone seen such an animal?
>
> most of the ones i've seen utilize a side car that sits beside the
> rack. unfortunately, i'm space limited and i need something that will
> hang on the back of the rack.
> _____________________________

Maybe: http://www.hoffmanonline.com/product_catalog/section_index.aspx?cat_1=34&cat_2=2383&SelectCatId=2383&CatId=2383

Semi-related question: Has any research been done on cooling the
racks/rails/metal infrastructure in the effort to cool the whole
rack+systems?

--
~ Andrew "lathama" Latham lathama at gmail.com http://lathama.net ~
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From james.p.lux at jpl.nasa.gov Tue Jan 31 18:47:18 2012
From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C))
Date: Tue, 31 Jan 2012 15:47:18 -0800
Subject: [Beowulf] rear door heat exchangers
In-Reply-To:
References:
Message-ID:

Maybe there's an issue with the weight and/or flexible tubing on a swinging door?

The Hoffman products in Andrew's email, I think, aren't the kind that hang
on a door, more hang on the side of a large box/cabinet (Type 4, 12, 3R
enclosure) or wall. They're also air/air heat exchangers or air conditioners
(and vortex coolers... but you don't want one of those unless you have a LOT
of compressed air available).

http://www.42u.com/cooling/liquid-cooling/liquid-cooling.htm shows "in-row
liquid cooling" but I think that's sort of in parallel. They do mention,
lower down on the page, "Rear Door Liquid Cooling". But I notice that the
Liebert XDF-5, which is basically a rack and chiller deck in one, only pulls
out 14kW.

From DoE: http://www1.eere.energy.gov/femp/pdfs/rdhe_cr.pdf
They refer to the ones installed at LBNL as RDHx units, but carefully avoid
telling you the brand or any decent data. They do say they cost $6k/door,
and suck up 10-11kW/rack with 9 gal/min flow of 72F water.

Googling RDHx turns up "CoolCentric.com"
http://www.coolcentric.com/resources/data_sheets/Coolcentric-Rear-Door-Heat-Exchanger-Data-Sheet.pdf
33kW is as good as they can do. I also note that they have no fans in them.

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Michael Di Domenico
Sent: Tuesday, January 31, 2012 12:56 PM
To: Beowulf Mailing List
Subject: [Beowulf] rear door heat exchangers

i'm looking for, but have not found yet, a rear door heat exchanger with
fans. the door should be able to support up to 35kw using chilled water.
has anyone seen such an animal?

most of the ones i've seen utilize a side car that sits beside the rack.
unfortunately, i'm space limited and i need something that will hang on the
back of the rack.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From sdm900 at gmail.com Tue Jan 31 18:54:48 2012
From: sdm900 at gmail.com (Stu Midgley)
Date: Wed, 1 Feb 2012 07:54:48 +0800
Subject: [Beowulf] rear door heat exchangers
In-Reply-To:
References:
Message-ID:

Speak to SGI. We have about a dozen such racks, all from SGI.

On Wed, Feb 1, 2012 at 4:55 AM, Michael Di Domenico wrote:
> i'm looking for, but have not found yet, a rear door heat exchanger
> with fans. the door should be able to support up to 35kw using
> chilled water. has anyone seen such an animal?
>
> most of the ones i've seen utilize a side car that sits beside the
> rack. unfortunately, i'm space limited and i need something that will
> hang on the back of the rack.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
Dr Stuart Midgley
sdm900 at gmail.com
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From Herbert.Fruchtl at st-andrews.ac.uk Tue Jan 31 19:18:10 2012
From: Herbert.Fruchtl at st-andrews.ac.uk (Herbert Fruchtl)
Date: Wed, 1 Feb 2012 00:18:10 +0000
Subject: [Beowulf] moderation - was cpu's versus gpu's - was Intel buys QLogic
Message-ID: <97E75B730E3076479EB136E2BC5D410054874491@uos-dun-mbx1>

Folks,

I missed part of this discussion (for obvious reasons I lost interest), but
since it seems to be moving in that direction, I'll throw in my two
smallest-local-currency-units. I'm a lurker (in old usenet parlance) on this
list: reading, but very rarely posting. There are probably many of us, but
the others are posting even more rarely...

As long as we don't get real off-topic discussions that attract the weirdos
of the Internet (global warming anybody? intelligent design? even C/Fortran
tends to peter out quickly nowadays), I am opposed to censorship (aka
moderation). The simplistic arguments are:

1) This is my own, selfish, most important argument: it costs time! When,
every two years, I have a technical question for the list, I don't want to
wait until the USA is out of bed and hope that the moderator isn't at a
conference for a week.

2) You need a moderator. It's quite some work, so it will only be done by
somebody who gets some satisfaction out of it. This means that the job will
attract exactly the kind of people who will not moderate neutrally and
dispassionately. Even if they try, there's the fact that power corrupts.
You're tempted to censor views that are too far from your own ("ludicrous"
is the word you would use), and in the end you have an in-crowd confirming
each other's views.
3) You are opening yourself to lawsuits. If something is said on the list
that, let's say, Intel's corporate lawyers find defamatory, they may go
after the moderator.

If you really find somebody's views (and their presentation) objectionable,
just killfile them (it's called "filter" in the 21st century). And if
certain people think ad hominem attacks help their case, ignore them instead
of thinking you can look dignified by taking them on at their own game. You
won't.

Back to those dark alleys where we lurkers feel at home...

  Herbert
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.