SiCortex
As I mentioned in my discussion about Penguin Computing, people think
that the HPC market is large enough to justify the introduction of new
hardware. One company has taken an even bigger view of the market and
is introducing a whole new type of HPC system.
SiCortex has developed a new
line of low-power HPC systems with large processor counts. Linux was a design
criterion, further supporting the idea that Linux is the HPC operating system of choice.
Over the last few years, the overall power consumption of systems has
dramatically increased. The CPU vendors have been slowly edging up
the power requirements of their chips. First they did it with the
power for a single core. Now they have multiple cores and other
functions on a single chip. Couple this with the huge appetite
for more compute power, and density goes up as well. The end result is
that data centers are having a difficult time staying
cool. In fact, certain companies, such as APC, are
warning that under-floor cooling will not be adequate
once the power density gets high enough. The simple reason is
that it becomes almost impossible to get enough volume of air through
vented floor tiles (Marilyn Monroe's skirt, along with the rest of her,
would have risen a few feet with some of the air flow requirements
that data centers are seeing). What can be done about it? There are
several options: (1) try alternative cooling methods, (2) put all
data centers above the Arctic circle, or (3) try low power chips
but use lots of them. This third approach is what SiCortex is
trying.
SiCortex has based their hardware on low-power 64-bit MIPS cores
that run at 500 MHz. They then take six of these
cores and put them into one chip. But, wait, that's not all. They
also put a high performance interconnect and a PCI-Express connection
to storage and other networking on the chip as well. Each chip
has 2 DIMM slots associated with it to form a node. The nodes are
packaged in
groups of 27 per module, so each module has 27 nodes and 54 DIMM slots
as well as the needed interconnect, PCI-Express access, and support
networking. A single
node with DDR2 memory consumes only about 15W of power, so a
module draws only a bit over 400W. The Mother of all Boards
used by SiCortex is shown in Figure 11. Each blue and black heat sink sits on a six-core chip.
You can get an idea of the board size by using the DIMMs as a reference.
 Figure 11: SiCortex Motherboard
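The module numbers above are easy to sanity check. The sketch below uses only the figures quoted in the text (six cores and 2 DIMM slots per node, roughly 15 W per node, 27 nodes per module); the names are illustrative and nothing here comes from SiCortex documentation.

```python
# Per-node figures as quoted in the text above.
CORES_PER_NODE = 6       # six MIPS cores per chip, one chip per node
DIMMS_PER_NODE = 2
WATTS_PER_NODE = 15      # approximate, for a node with DDR2 memory
NODES_PER_MODULE = 27


def module_totals(nodes=NODES_PER_MODULE):
    """Scale the per-node figures up to a module holding `nodes` nodes."""
    return {
        "cores": nodes * CORES_PER_NODE,
        "dimm_slots": nodes * DIMMS_PER_NODE,
        "watts": nodes * WATTS_PER_NODE,
    }


print(module_totals())
```

That works out to 162 cores, 54 DIMM slots, and about 405 W per module, which lines up with the "a bit over 400W" figure.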
Since SiCortex is using standard MIPS chips, the standard Linux
kernel and remaining software stack will work with the system.
However, SiCortex has had to write their own network drivers and
monitoring tools, all of which are open source. They are using Pathscale's (now Qlogic's) EKO
compilers for their systems. The developers of the EKO compilers
were originally MIPS developers. So porting the EKO compilers to
the SiCortex architecture wasn't too difficult.
At SC06, SiCortex introduced two models. The SC5832 is the larger
machine, with 5,832 processors, up to 8 TB of memory, and 2.1
TB/s of IO capacity (see Figure 12).
 Figure 12: SiCortex SC5832
It comes in a single cabinet
that is kind of interesting looking. It provides a theoretical
peak performance of 5.8 TFLOPS. It uses only 18 kW of total power
and is about 56" wide x 56" deep x 72" high.
The second model, the SC648, is a smaller
cabinet, with only 648 processors (see Figure 13).
 Figure 13: SiCortex SC648
It offers a theoretical peak of
648 GFLOPS of performance, up to 864 GB of memory, and 240 GB/s
of IO capacity. Because of the low-power chips, the SC648 can plug
into a single 110V outlet and draws 2 kW of power. The unit is
23" wide x 36" deep x 72" high.
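The specs for the two models hang together nicely if you run the numbers. The following is a rough sketch using only figures quoted in this article (processor counts, peak GFLOPS, total power, six cores per node, 27 nodes per module, about 15 W per node). The dictionary layout and function name are made up for illustration. Note that roughly 1 GFLOPS per 500 MHz core would imply two floating-point operations per cycle, which is an inference from the quoted peak numbers, not something SiCortex is quoted as saying here.

```python
CORES_PER_NODE = 6
NODES_PER_MODULE = 27
WATTS_PER_NODE = 15  # approximate per-node draw quoted earlier

# Figures as quoted in the article; power is the total system draw.
systems = {
    "SC5832": {"cores": 5832, "peak_gflops": 5800, "system_watts": 18000},
    "SC648": {"cores": 648, "peak_gflops": 648, "system_watts": 2000},
}


def summarize(spec):
    """Derive node/module counts and efficiency figures from the quoted specs."""
    nodes = spec["cores"] // CORES_PER_NODE
    return {
        "nodes": nodes,
        "modules": nodes // NODES_PER_MODULE,
        "node_watts": nodes * WATTS_PER_NODE,
        "gflops_per_core": spec["peak_gflops"] / spec["cores"],
        "gflops_per_watt": spec["peak_gflops"] / spec["system_watts"],
    }


for name, spec in systems.items():
    print(name, summarize(spec))
```

Under these assumptions the SC5832 comes out to 972 nodes in 36 modules and the SC648 to 108 nodes in 4 modules. The nodes alone would account for about 14.6 kW and 1.6 kW, respectively, with the rest of the quoted totals presumably going to everything else in the cabinet. Both machines land at roughly 0.32 peak GFLOPS per watt.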
One of the challenges that SiCortex has to face is that they are using
a large number of chips to achieve high levels of performance. This means
that to get a large percentage of peak performance they need a very good,
low-latency, high-bandwidth interconnect. So SiCortex has done away with
the traditional switched network and gone to a direct connect network.
But instead of using something like a Torus or a Mesh, they are using
a Kautz network.
The network links the nodes directly to one another, and the traffic is handled by dedicated hardware on each chip, so there is no load on the CPU cores.
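For anyone who hasn't run into a Kautz network before, here is a minimal sketch of how the graph is defined. Each node is labeled with a string of symbols in which no two neighboring symbols are equal, and a node links to every node whose label is its own shifted left by one with a new symbol appended. The degree of 3 and the label lengths of 4 and 6 are assumptions, chosen because they give exactly 108 and 972 nodes, the counts you get from 648 and 5,832 processors at six cores per node. The code only builds the graph and measures its diameter; it says nothing about how SiCortex actually wires or routes its fabric.

```python
from collections import deque
from itertools import product


def kautz_graph(degree, length):
    """Build a Kautz digraph as {node: [successor nodes]}.

    Nodes are tuples of `length` symbols drawn from degree+1 symbols,
    with no two adjacent symbols equal.
    """
    alphabet = range(degree + 1)
    nodes = [w for w in product(alphabet, repeat=length)
             if all(a != b for a, b in zip(w, w[1:]))]
    # Shift left by one and append any symbol that differs from the last one.
    return {v: [v[1:] + (c,) for c in alphabet if c != v[-1]] for v in nodes}


def diameter(graph):
    """Longest shortest-path hop count, found by a BFS from every node."""
    worst = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


for length in (4, 6):
    g = kautz_graph(3, length)
    print(f"label length {length}: {len(g)} nodes, diameter {diameter(g)}")
```

With three outgoing links per node, 972 nodes are never more than six hops apart, which is the appeal compared with a torus or mesh of the same size.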
I think the SiCortex is an interesting system. It is a fresh approach
to HPC and also provides better performance/watt than other approaches.
I think SiCortex will be facing a
number of challenges, however. First, unlike a commodity cluster vendor, they will have to develop and support much of
their own hardware: chips, boards, interconnects, etc. These tasks cost
money, and one assumes they will have to recover the cost through
a higher product price (although no pricing has been announced to date).
Second, they are using lots of chips to
achieve high performance. This means that codes will need to be
able to scale well to take advantage of this cool hardware.
But, I take my hat off to them. I think it's
always a good idea for a new company to enter the market and shake
things up. Plus, I think it's high time we start paying attention
to power consumption of our clusters.
What you didn't see
This should have been the year of 10 GigE for HPC. There were some really
cool 10 GigE companies on the floor: Neterion, Chelsio, Neteffect, Foundry,
NetXen, Extreme, Quadrics, etc. But the per port price of 10 GigE is
still way
too high for clusters. I was hoping that we would see 10 GigE NICs below
$400 and switch costs below $500 for large port counts (128 ports and
above). One interesting development, though, is Myricom's support
of 10 GigE. Let's look at why the costs for 10 GigE are still too high.
Ten Gigabit Ethernet is pushing lots of data across the wire (or fiber).
This means that the signaling frequency is higher, so copper cables
need to be thicker to handle the data transmission requirements. Initially,
10 GigE used fiber connections, which are nice and thin and easy to work
with (small bend radius) but are expensive compared to copper cables.
Plus, you need the laser converters at each end of the fiber, which also
drives up the cost.
Then 10 GigE moved to CX4 cables like the ones used for Infiniband.
Infiniband has been using these cables even for DDR (double data rate)
IB, but they are pushing the limits of what the cable can carry and
have some limitations on cable length. Still, these cables are
reasonably inexpensive and you don't need the laser converters. Also,
the cat6 specification that could allow 10 GigE to travel over wires
similar to what we use for GigE was released this year.
So why haven't the costs for 10 GigE come down?
In talking to people in the industry I think the answer is:
- The cost of the PHYs is still too
high for either copper (CX4 or cat) or fiber.
- In the case of fiber, the cost of the cables is high compared
to copper.
- PHYs for the new cat specification for 10 GigE aren't really out yet.
- Switch cards for the cat specification aren't out yet.
- The price of 10 GigE NICs hasn't really dropped yet (could be a
function of demand).
- The demand for 10 GigE hasn't yet gotten large enough to drive
down the costs.
I was hoping that HPC could ride the coat tails of the enterprise
market that would start yelling for cheaper 10 GigE. However, this
demand hasn't really developed to the point where costs are starting
to go down.
In addition, the price
of the 10 GigE NICs hasn't been dropping by much either. Perhaps
this is a function of the price of the PHYs, but perhaps not.
Regardless, you are still looking at a minimum of $700 for a 10
GigE NIC.
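To put those prices in cluster terms, here is a back-of-the-envelope sketch for a 128-port system. It reads the $500 switch figure as a per-port price, which is an assumption, and it only uses the NIC and switch numbers mentioned in this article. The function name is made up for illustration, and cables and PHYs are left out.

```python
def interconnect_cost(ports, nic_price, switch_port_price):
    """Total NIC plus switch-port cost for a cluster with `ports` nodes."""
    return ports * (nic_price + switch_port_price)


PORTS = 128           # smallest "large port count" mentioned above
HOPED_NIC = 400       # hoped-for NIC price, dollars
HOPED_SWITCH = 500    # hoped-for switch price, read here as dollars per port
ACTUAL_NIC_MIN = 700  # minimum NIC price quoted above

target_total = interconnect_cost(PORTS, HOPED_NIC, HOPED_SWITCH)
nic_gap = PORTS * (ACTUAL_NIC_MIN - HOPED_NIC)

print(f"hoped-for interconnect cost: ${target_total:,}")
print(f"extra spent on NICs alone:   ${nic_gap:,}")
```

Even at the hoped-for prices the interconnect for 128 nodes would run $115,200, and the NICs alone put you at least $38,400 over that today.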
So is 2007 going to be the year for 10 GigE in HPC? Maybe. There are
some companies making new NIC ASICs that could drive down the price
of the NICs. On the other side of the equation, companies such as
Fulcrum Micro are developing
10 GigE switch ASICs that can help drive down the price of switches and
drive up performance. Perhaps the combination of these developments
coupled with increased demand will drive down the cost of 10 GigE.
But there is a potential problem.
In 2007, Infiniband is going to release QDR (Quad Data Rate) IB. This
will almost require a switch to fiber cables to handle the data rates.
So IB is going to start moving away from copper wires. This is both
a good thing and a bad thing. It's bad in that the combination of
the 10 GigE demand and the IB demand helps drive down the prices of
CX4 cables, and now IB is not going to be able to use those cables for
its new product. However, it's good in that this might help drive
down the cost of the fiber optic PHYs, which in turn could help fiber
for 10 GigE.
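A quick look at the per-lane signaling rates shows why QDR is such a stretch for copper. The sketch below assumes the standard published figures, none of which appear in this article: CX4-style cables carry four lanes in each direction, 10 GigE over CX4 runs each lane at 3.125 Gbaud, 4x Infiniband runs at 2.5, 5, and 10 Gbaud per lane for SDR, DDR, and QDR, and all of them use 8b/10b encoding.

```python
LANES = 4  # lanes per direction on a CX4 or 4x Infiniband link

# Per-lane signaling rates in Gbaud (standard figures, not from this article).
lane_rates = {
    "10 GigE over CX4": 3.125,
    "Infiniband SDR 4x": 2.5,
    "Infiniband DDR 4x": 5.0,
    "Infiniband QDR 4x": 10.0,
}


def data_rate_gbps(lane_gbaud, lanes=LANES):
    """Usable data rate after 8b/10b encoding (8 data bits per 10 line bits)."""
    return lane_gbaud * lanes * 8 / 10


for name, rate in lane_rates.items():
    print(f"{name}: {rate} Gbaud per lane, {data_rate_gbps(rate)} Gb/s of data")
```

QDR asks each copper pair to carry twice the signaling rate of DDR and more than three times that of 10 GigE over CX4, which fits with the expectation above that IB will move toward fiber.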
With the appearance of fiber cables for IB and with 100 Gigabit
Ethernet being developed which will require fiber cables from the
start, I think it's safe to say that fiber is the future for
HPC cabling. Let's just hope there is enough demand so that
the cost of the associated hardware (PHYs, etc.) can come down enough
in price that we can afford such things in clusters (never mind putting
them in our house).
Potpourri for a Million, Alex
This is the category where I put lots of other neat stuff or neat companies
that I didn't get a chance to
see or whose literature I didn't even manage to grab. But I think they are cool enough to
warrant a comment and perhaps even some rumors.
- There was a company, Evergrid, that
was showing what they claimed was a true system level checkpoint and
restart capability. I heard their demo was compelling. Something to
check out.
- Liquid Computing was there
but I didn't get to talk with them. They are trying new ideas about
constructing clusters to make them more scalable.
- Qlogic mentioned that they will
have a double data rate version of Infinipath in 2007.
- There was a UPC (Unified Parallel C) booth with George Washington
University. This is a new high-level language that can help with getting
new applications onto clusters.
- Scali was showing the new version
of their cluster management tool as well as some cool performance
improvements in their MPI (BTW - they won the HPCWire 2006 Editor's Choice Award
for the cluster management tool - Scali Manage).
- Dr. Hank Dietz and the gang at Aggregate.org
were there but I didn't get to spend as much time as I wanted with them.
They are always up to something really cool. Check their website.
- I didn't get over to a number of other companies that always have
something interesting to talk about: Clearspeed, Quadrics, Myricom,
Voltaire, Chelsio, Neterion, NetXen, and Neteffect.
There are many other companies that have cool things to talk about at
the show, but I just couldn't get to everyone nor grab literature
in the short time I was there. Be sure to look at the floor plan from
the SC06 website and see who was
there. Also look at the website to see announcements during the show.
You can also hear Doug and me discuss SC06 and interview
the likes of Don Becker and Greg Lindahl on ClusterCast.
Until Next Year
I think I will stop here before I attempt to mention just about everyone who had a
booth at the show. While the show seemed slow and the floor was cramped, and
sometimes I think "SC" stands for "Sleep Deprivation Seminar," there were a
number of cool things at the show and I got to see lots of friends. So it's not all
bad by any stretch. Next year's show is in Reno, Nevada, at the Reno Convention
Center. I've been to Reno the last two years for the AIAA show in January. I'm
not a big fan of the Reno area, but we'll see what the show will hold (God, I hope
I'm not becoming a grumpy old man...).
Jeffrey Layton has been a cluster enthusiast for many years. He firmly believes that
you have a God-given right to build a cluster on your own (but only after
getting approval from your management, household or otherwise). He can also
be found swinging from the trees at the ClusterMonkey Refuge for Wayward Cluster
Engineers (CRWCE - hey I think this spells something in Welsh).