Fear and Loathing in Reno
I usually write a very long detailed summary of the SC, but I'm hoping to
make this one a bit shorter. It's not that SC07 wasn't good - it was. It's not that
Reno was bad - it was. It's that I'm utterly burned out and still trying to recover.
I've never been so tired after a Supercomputing conference before. I actually think
that's a good thing (Doug will be expounding on why). If I managed to sleep 4 hours
a night, I was lucky. If I managed to actually eat one meal (or something close to it),
I had a banquet. But there were some cool things at SC07 and I hope I do them
justice.
What was the IEEE thinking?
If you recall, I thought Tampa Bay was not a good location for the conference.
The exhibit floor was very cramped (there were booths out in the hall), the
hotels were spread all over Florida, and there was absolutely nothing to do
around the conference center. Then some person who scheduled a cigar
and cognac party discovered that in Tampa the cigar bars all close on Monday
(nudge, nudge, wink, wink, know what I mean? know what I mean?). So I had
high hopes about Reno. And let's not forget about the leaky roof in the exhibit hall.
The exhibit floor, while fairly large, was actually split across several rooms. This
kind of spread things out a bit, but at least all of the exhibitors were in one
place. But, once again, the hotels were spread out. However, the IEEE
committee did a good job in providing luxury coaches to and from the hotels
and the convention center. On the negative side, if you had a meeting with
a company that had a "whisper suite" or meeting room, you were virtually
guaranteed to walk a bit since they were all well away from the conference
center. All of them were in the hotels which meant that you had to negotiate
through the casino maze to find them. Of course this was the fault of the
hotels and not IEEE, but I can't tell you how hard it was to wander through the
casinos trying to find a specific room. Reno itself had a good number of restaurants
and places to go (I went to the Beowulf Bash at a cool Jazz Bar -- nice place).
However, my overall opinion of Reno, if you haven't guessed by now, was that
it's not the best place to hang out. So it looks like IEEE is 0 for 2 over the last two years.
Next year's SC is in Austin, which I know has lots of good restaurants, places
to go, etc. So let's hope that IEEE can pull this one out of their a** because
their recent track record is not good (with me anyway). But my intention is not to complain about
the location, etc. The conference is about HPC, so let's talk about that.
Themes and Variations
I always try to find a theme in the SC conference but if you read my
blogs you will find out that I actually found 3 themes at
SC07. They are:
- Green Computing
- Heterogeneous Computing
- The Personal SuperComputer
Green Computing (but why aren't the boxes Green?)
The world seems to be captivated by everything Green (that is, after
being captivated by the meltdown of Britney Spears). We're all looking for ways
to improve our use of natural resources and reduce our greenhouse gas
emissions, and HPC (High Performance Computing) is no exception. Everywhere
you walked on the floor you saw companies stating how they improved the
power consumption and cooling of systems. I think I even saw a company that
claimed the color of their 1U box was specifically chosen to reduce the power
consumption and cooling. While I'm exaggerating of course, almost all companies
claimed that their products were redesigned to make them more green. In my
opinion, some of these claims were warranted and some were not.
Green Computing will become a mantra for many customers now and in the future
and with good reason - power and cooling are reaching epic proportions. Many
data centers can't produce enough cooling for the compute density that vendors
can deliver. Sometimes the problem is that there just isn't enough cooling air, but many
times there isn't enough pressure to push the cool air to the top of the rack.
APC has done a tremendous amount of research
and development to understand the power and cooling of high density data centers.
For example, APC
explains that for racks with conventional front-to-back cooling and a power
requirement of 18kW per rack (a fairly dense rack), the rule of thumb for cooling
requirements is about 2,500 cfm (Cubic Feet per Minute) of cooling air through
the rack. They go on to explain that a conventional perforated cooling
tile for under-floor cooling can only provide about 300 cfm of cooling air. This
means you need more than 8 tiles to cool a single rack! On the other hand, grated tiles can
provide up to about 667 cfm of cooling air each, requiring only about 4
tiles to cool a single rack. Either way, the cooling requirements are rather
large for high density systems.
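To make the tile arithmetic concrete, here is a minimal Python sketch using only the rule-of-thumb numbers quoted above. The function name is just for illustration, and real tile counts depend on floor pressure and layout, so treat it as a back-of-the-envelope check rather than a sizing tool. Note that 2,500 / 300 is about 8.3, so rounding up to whole tiles gives 9 perforated tiles rather than 8.

```python
import math

# Rule-of-thumb numbers quoted in the text (not measured values).
RACK_AIRFLOW_CFM = 2500      # cooling air for an 18 kW rack
PERFORATED_TILE_CFM = 300    # conventional perforated tile
GRATED_TILE_CFM = 667        # grated tile

def tiles_needed(rack_cfm, tile_cfm):
    """Whole number of tiles needed to supply rack_cfm (rounded up)."""
    return math.ceil(rack_cfm / tile_cfm)

perforated = tiles_needed(RACK_AIRFLOW_CFM, PERFORATED_TILE_CFM)
grated = tiles_needed(RACK_AIRFLOW_CFM, GRATED_TILE_CFM)

print(f"Perforated tiles per rack: {perforated}")
print(f"Grated tiles per rack:     {grated}")
```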
Don't forget that if you have 2,500 cfm of air going in, you will have 2,500 cfm
of hot air coming out. This air needs some kind of return. If you have a 12 inch
round duct for the return air, 2,500 cfm requires the air to move at about 35 mph! This
also means the air coming out of a cooling tile is going to have a high velocity.
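If you want to check the duct number yourself, the calculation is just airflow divided by the duct's cross-sectional area. The sketch below assumes a perfectly round duct and ignores friction and bends (the function name is mine, for illustration only). It comes out a little over 36 mph, which is in line with the "about 35 mph" figure.

```python
import math

def duct_velocity_mph(airflow_cfm, duct_diameter_in):
    """Average air speed in a round duct, in miles per hour."""
    radius_ft = duct_diameter_in / 12 / 2          # inches -> feet, then radius
    area_sqft = math.pi * radius_ft ** 2           # cross-sectional area
    velocity_ft_per_min = airflow_cfm / area_sqft  # cfm / sq ft = ft per minute
    return velocity_ft_per_min * 60 / 5280         # ft/min -> ft/hr -> mph

speed = duct_velocity_mph(2500, 12)
print(f"Return air speed: {speed:.1f} mph")
```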
So power and cooling are big deals for today's dense systems. We can and need to
design data centers to better cool today's systems. I talked to a few people at
the show, and a number of them were asking about liquid cooling. You can get what
I call liquid cooled assist devices today from APC
and from Liebert. Both devices use liquid
flowing through a device to cool air exiting the rack. Some of the devices are
self-contained. That is, the liquid never leaves the device. Some of the devices rely on chilled
water inside the data center (somehow this all seems like back to the future when
Cray had liquid cooled systems).
So there are many ways to cool a data center - too many to discuss for this article.
In addition to more efficient cooling, vendors have focused on more efficient
processors that produce less heat.
Transmeta was
one of the first to develop products targeted at reducing the power consumption
of systems without unduly hurting performance. This led to companies like
RLX and
Orion Multisystems
using Transmeta processors in HPC clusters. But, unfortunately these companies
didn't survive (I like to think that these companies gave their "lives" so to
speak, to advance technology for the rest of us).
Today we have
SiCortex, which is making systems with a very
large number of low power CPUs (derived from embedded 64-bit MIPS chips). They
installed their first machine at Argonne National Laboratory this year and are
working on delivering other systems. The approach that SiCortex has taken is
to use large numbers of slower, low power CPUs.
One of the other keys to their performance is the use of a very fast network to
help applications scale to a large number of processors. At SC07 they showed their new
SC072 workstation that uses
their low power CPUs and networking to produce a box with a peak performance of
72 GFLOPS using less than 200W. While the performance may not be earth shaking, the goal of the SC072 is to provide software developers with a platform that has a meaningful number of cores and is identical to their full production system.
The picture below in Figure One shows the inside of a SC072 workstation.
 Figure One: View of the inside of SiCortex SC072 Workstation
You can see the 12 heat sinks (heat sinks, not heat sinks plus fans) in the picture.
Under each heat sink are 6 CPUs for a total of 72 CPUs. Each CPU is capable of
1 GFLOPS of theoretical performance, so you get 72 GFLOPS in this small box.
I'll be writing a follow-up article on Personal Supercomputers where I will go
into more depth about the SC072. Needless to say it is one cool box (figuratively
and literally).
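For the curious, the SC072 numbers above work out as follows. This is only a rough sketch from the figures quoted in this article (12 heat sinks, 6 CPUs each, 1 GFLOPS per CPU, less than 200W). Since the box uses less than 200W, the GFLOPS per watt value is a lower bound on peak efficiency, not a measured result.

```python
# Figures quoted in the article; peak (theoretical) numbers only.
HEAT_SINKS = 12
CPUS_PER_HEAT_SINK = 6
GFLOPS_PER_CPU = 1.0
MAX_POWER_W = 200  # "less than 200W", so efficiency below is a lower bound

cpus = HEAT_SINKS * CPUS_PER_HEAT_SINK
peak_gflops = cpus * GFLOPS_PER_CPU
min_gflops_per_watt = peak_gflops / MAX_POWER_W

print(f"CPUs: {cpus}")
print(f"Peak performance: {peak_gflops:.0f} GFLOPS")
print(f"Peak efficiency: at least {min_gflops_per_watt:.2f} GFLOPS per watt")
```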