If Nikola could see us now!
Each year NVIDIA provides a "year in review" that I find very interesting. It is a good summary of the year's events (from NVIDIA's perspective, of course), but informative nonetheless. Plus, there are plenty of links to facilitate further exploration. This year's round-up follows.
Tesla - A Year in Review - 2010
The growth of GPU Computing in HPC has continued unabated this year, with many new milestones achieved. It's hard to believe that it's only been three and a half years since Tesla launched. At the end of last year, we talked about how it felt like we had reached a "tipping point" with Tesla, a level at which momentum for change seemed unstoppable. If I had to find two words to summarize this year, I would say that it feels like Tesla has reached escape velocity: the speed needed to break free of a gravitational field or, in the case of Tesla, a stage of momentum where we're seeing a rapid increase in deployments and the question on many of our customers' lips is no longer "if" we deploy GPUs, but "when".
These are our Top 10 takeaways for the year:
- CUDA by the numbers. There are a lot of metrics we use internally to track the
progress of CUDA, but however you cut it, we've seen stellar growth across the board
this year in terms of developer adoption, education and community momentum.

| Metric | 2009 | 2010 | % Increase |
| --- | --- | --- | --- |
| Attendees at GPU Technology Conference (GTC) | 1423 | 2166 | 52% (industry average = ~20%) |
| Universities teaching CUDA | 270 | 350 | 30% |
| CUDA-related videos on YouTube | 800 | 1250 | 56% |
| Submissions to CUDA Zone | 670 | 1235 | 85% |
| Cumulative downloads of CUDA SDK | 293,000 | 668,000 | 127% |
| CUDA-related citations on Google Scholar | 2700 | 7000+ | 160% |
| Submissions to speak at GTC | 67 | 334 | 398% |
- The Computational Laboratory - In January, we launched a new initiative for
the bioinformatics and computational chemistry community, called Tesla Bio
Workbench. The initiative brought together more than 20 prominent computational
research codes, such as AMBER, VMD and LAMMPS, enabling scientists who rely on these
codes to turn their standard PCs into "computational laboratories" capable of doing
science 10-20 times faster through the use of Tesla.
In the case of AMBER, one of the most widely used applications for biochemists, performance increases of up to 100X are being seen and, more importantly, critical research that once required a supercomputer can now be done on a desktop workstation. The Tesla Bio Workbench site saw more than 10,000 visitors in the first two weeks alone and, since then, more than half of the 150,000+ visitors have clicked through to the specific pages belonging to the research codes.
- "Build it and they will come" - when I wrote this recap last year, there was
1 OEM with a Tesla SKU as a part of their line-up. Today, this number is up to 9,
with a total of 19 Tesla-specific SKUs now available, many using the Tesla M2050 GPU
Computing Module.
The list includes all the major players such as Cray, Dell, HP and SGI, but perhaps
most notable is IBM, who in May became the first major OEM to offer a Tesla-based
server solution in its iDataPlex line. For IBM, it was a sign that GPU Computing was
mature enough to warrant their entry into the space. Dave Turek, IBM VP of Deep
Computing, said:
"I think what's changed is that customers have been experimenting for a long time and now they're getting ready to buy. It wasn't the technology that drove us to do this. It was the maturation of the marketplace and the attitude toward using this technology. It's as simple as that."
- To the Nebulae and Beyond - At the International Supercomputing Conference
in June, the world's first Tesla GPU-enabled petaflop supercomputer made its debut.
Equipped with 4640 Tesla "Fermi" GPUs, Nebulae at the National Supercomputer Center
in Shenzhen, China, made its mark on the Top500 by entering at number 2, with
sustained performance of 1.27 petaflops. Another system from the Chinese Academy of
Sciences also entered the chart at number 19.
This marked the beginning of what was to be an impressive year for China. As a relative newcomer to the supercomputing space, China is unrestricted by the need to support legacy software and systems, so it has been fearless in its adoption of GPU computing. The country has shown that it understands the significance of supercomputing, as it seeks to evolve from being a manufacturing powerhouse to become a global leader in science and technology.
- The Beginning of the Race for Better Science - Following the June list of
the Top500, the Undersecretary for Science at the DOE, Steve Koonin, wrote an
op-ed for the San Francisco Chronicle. In this piece he voiced his concern about
Nebulae, stating that "these challenges to U.S. leadership in supercomputing and
chip design threaten our country's economic future." Undersecretary Koonin's concern
is that without the latest technologies, the U.S. will fall behind the rest of the
world in critical areas of industry, such as simulation for product design.
Leadership here enables the U.S. to continue to push the envelope in terms of
technology while encouraging innovation.
The sentiment was echoed by others, such as Senator Mark Warner and NVIDIA's own Andy Keane, whose piece on AllThingsD prompted a lot of lively discussion, such as this comment from insideHPC:
"I agree with Andy on this one; the Senate should get behind Senator Mark Warner (D-VA) and his amendment to the reauthorization of the America Competes Act. If we as an HPC community, or as a country for that matter, aren't agile enough to adapt, we could find ourselves being trounced by our own inventions."
The use of GPUs to further science was a topic covered in a recent pilot of a documentary series that NVIDIA produced, entitled The 3rd Pillar of Science. In this pilot, we spoke to leading medical experts who are using GPUs for ground-breaking medical methods, such as advanced cancer treatment and real-time open heart surgery.
- 2200 Geniuses and a Self-Driving Car - After the success of last year's GPU
Technology Conference, we were pretty excited to host our second event in September
this year. Our attendee numbers grew more than 50%, well above average for a
technical conference, and submissions from eager CUDA developers wanting to present
their work grew nearly 400%. In fact, we had so many that we doubled the number of
sessions at the conference to 280, all of which are online for your viewing and
listening pleasure :)
It was pretty interesting to see the difference in the show since last year. The sheer breadth of topics covered made the show unlike any other - from astrophysics to video processing, from computational fluid dynamics to neuroscience and from energy exploration to designing autonomous cars. Tables were filled with engineers, scientists, developers, students and researchers, all sharing experiences and ideas. We'll be staying in San Jose, California for GTC 2011, and we hope to see you all there.
Here are a few of my favorite quotes from members of the press that attended:
"Absolutely one of the best - and most important conferences in the technology and advanced computing sector" - The Exascale Report
"What we are seeing here is like going from propellers to jet engines." - insideHPC
"...GTC is growing even as it specializes on just one aspect of NVIDIA's business, the CUDA platform for GPU computing. That's just one of many signals that point to an undeniable trend: the use of GPUs for non-graphics computation is on the rise, led largely by NVIDIA's efforts." - Tech Report
"NVIDIA's GTC is a blast. The demos, keynotes, exhibits, technical papers, and emerging companies' presentations are first class, interesting and informative. Well worth the price of admission. There was no heavy product messaging, no call to action to buy something other than the idea that parallel processing is here and it's important-and by our observations it was mission accomplished." - Tech Watch
- Turbocharged Tools - This year we saw GPU-enabled production releases of
some of the most important applications in the technical and scientific computing
space. ACUSIM Software launched a GPU-enabled version of its CFD software AcuSolve,
delivering double the performance for its users.
Tom Lange, director of Modeling and Simulation at P&G, said:
"GPU-accelerated CFD allows for more realism, helping us replace slow and expensive physical learning cycles with virtual ones. This transforms engineering analysis from the study of failure to true virtual trial and error, and design optimization."
ANSYS released performance data on its CUDA implementation of ANSYS Mechanical, revealing that CUDA helps cut turnaround times for complex simulations in half. Wolfram Research released the latest version of Mathematica, which in some cases delivers speed increases of more than 100X from within the familiar confines of the Mathematica programming environment. Check out the video of their demo earlier this year at SIGGRAPH. And finally, NVIDIA and MathWorks collaborated on the latest release of MATLAB, R2010b, which includes support for GPU acceleration for users of Parallel Computing Toolbox and MATLAB Distributed Computing Server.
- Cloudy with a chance of GPUs - this year saw the first GPU deployments to
the Cloud, from Peer1 in July and Amazon Web Services (AWS) in November. Developing
for the CUDA architecture of NVIDIA GPUs already offers the lowest cost of entry for
any HPC architecture, but with these new services, you don't even need to buy the
hardware yourself. Through AWS, for example, you can now get access to 2 Tesla
20-series GPUs and 2 CPUs for just $2.10 an hour. Businesses of all sizes can now
run heavy-duty simulations and more with simple on-demand pricing, and no large
up-front capital investment.
GigaOm Pro had this to say about the announcement:
"Performance (of Amazon's Cluster Compute Instances) was high already, and the addition of GPUs just ups the octane level. According to a benchmark test by HPC cloud-resource middleman Cycle Computing, GPU Instances outperform in-house GPU clusters in certain cases."
Amazon's CTO, Werner Vogels, published an interesting blog post, and one of Amazon's technical specialists, Jeff Barr, gave a great technical overview of the new service.
- Lean, Mean & Green - The year ended with a bang for the Tesla business at
SC'10 in New Orleans. The final Top500 and Green500 lists of the year were announced
and Tesla had its best showing yet. Just prior to SC'10 commencing, the National
Supercomputer Center in Tianjin announced Tianhe-1A which, with a Linpack score of
2.57 petaflops, secured the #1 spot on the list. Two other Tesla GPU-enabled systems
made the Top 5: the aforementioned Nebulae, and Tsubame 2.0 from Tokyo Tech.
Tsubame 2.0 was ranked at #2 in the Green500, but more notably it was the only petaflop system in the entire Top 10. Equipped with 4200 Tesla GPUs, yet consuming just 1.340 megawatts, it is, by far, the most power-efficient petaflop system the world has ever seen, and an incredible achievement from Prof. Satoshi Matsuoka and his team.
NVIDIA and its customers were also recognized in a number of industry awards at the show. GPUs were highlighted in two Gordon Bell awards. The best student paper went the way of Tokyo and Purdue Universities, who collaborated on a new interface to make parallel programming on the GPU even more accessible. And perhaps most exciting, we saw some major organizations receiving honors for their work with GPUs, including Schlumberger, Citadel Investment Group and Weta Digital.
- Some Years in Review for my Year in Review - while writing this, a couple of
other year in review articles caught my eye, and included some quotes that I thought
would make a fitting end to this recap.
HPCwire released their biggest trends of the year podcast last week and pronounced GPU Computing as the #1 Trend of the Year. They commented:
"This year, it (GPU Computing) hit the mainstream, deployed by all the major vendors."
They also added that:
"If NVIDIA hadn't been there, this wouldn't have happened. AMD was only lukewarm about this. NVIDIA put energy and money into it. They changed the trajectory of GPU computing, without a doubt. NVIDIA CUDA made this possible."
Another article recently appeared on O'Reilly Media, who produce a wealth of books, online services, magazines, research, and conferences for the technical computing community. Their summary was that GPUs coupled with CPUs are the architecture of choice for processing computationally heavy data:
"You won't get the processing power you need at a price you want just by enabling traditional multicore CPUs; you need the dedicated computational units that GPUs provide."
We couldn't agree more :)