NVIDIA Tesla: 2010 Year in Review

If Nikola could see us now!

Each year NVIDIA provides a "year in review" that I find very interesting. It is a good summary of the year's events (from NVIDIA's perspective, of course), but nonetheless informative. Plus, there are plenty of links to facilitate further exploration. This year's round-up follows.

Tesla - A Year in Review - 2010

The growth of GPU Computing in HPC has continued unabated this year with many new milestones achieved. Hard to believe that it's only been three and a half years since Tesla launched.

At the end of last year, we talked about how it felt like we had reached a "tipping point" with Tesla, a level at which momentum for change seemed unstoppable. If I had to find two words to summarize this year, they would be "escape velocity": the speed needed to break free of a gravitational field. In the case of Tesla, that means a stage of momentum where we're seeing a rapid increase in deployments, and the question on many of our customers' lips is no longer "if" we deploy GPUs but "when".

These are our Top 10 takeaways for the year:

CUDA by the numbers. There are a lot of metrics we use internally to track the progress of CUDA, but however you cut it, we've seen stellar growth across the board this year in terms of developer adoption, education and community momentum.

Metric	2009	2010	% Increase
Attendees at GPU Technology Conference (GTC)	1423	2166	52% (industry average = ~20%)
Universities Teaching CUDA	270	350	30%
CUDA related videos on YouTube	800	1250	56%
Submissions to CUDA Zone	670	1235	85%
Cumulative downloads of CUDA SDK	293,000	668,000	127%
CUDA-related citations on Google Scholar	2700	7000+	160%
Submissions to speak at GTC	67	334	398%
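
For readers who want to check the "% Increase" column, here is a minimal Python sketch that recomputes it from the 2009 and 2010 counts in the table. The function and variable names are illustrative and are not part of NVIDIA's report, and "7000+" is treated as exactly 7000, which is an assumption. Expect the results to land within about a point of the published figures, which appear to have been rounded.

```python
# Recompute the "% Increase" column from the 2009 and 2010 counts above.
# Names here are illustrative; they do not come from NVIDIA's report.


def pct_increase(old, new):
    """Return the percentage increase going from old to new."""
    return (new - old) / old * 100.0


# (2009 count, 2010 count), copied from the table.
# Assumption: "7000+" Google Scholar citations is treated as exactly 7000.
METRICS = {
    "Attendees at GTC": (1423, 2166),
    "Universities teaching CUDA": (270, 350),
    "CUDA-related videos on YouTube": (800, 1250),
    "Submissions to CUDA Zone": (670, 1235),
    "Cumulative downloads of CUDA SDK": (293_000, 668_000),
    "CUDA-related citations on Google Scholar": (2700, 7000),
    "Submissions to speak at GTC": (67, 334),
}


def increases(metrics):
    """Map each metric name to its year-over-year percentage increase."""
    return {name: pct_increase(old, new) for name, (old, new) in metrics.items()}


if __name__ == "__main__":
    for name, pct in increases(METRICS).items():
        print(f"{name}: {pct:.1f}%")
```

Running the script prints one line per metric, so any row can be compared directly against the table.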

The Computational Laboratory - In January, we launched a new initiative for the bio-informatics and computational chemistry community, called Tesla Bio Workbench. The initiative brought together more than 20 prominent computational research codes, such as AMBER, VMD and LAMMPS, enabling scientists who rely on these codes to turn their standard PCs into "computational laboratories" capable of doing science 10 to 20 times faster through the use of Tesla.
In the case of AMBER, one of the most widely used applications for biochemists, performance increases of up to 100X are being seen and, more importantly, critical research that once required a supercomputer can now be done on a desktop workstation. The Tesla Bio Workbench site saw more than 10,000 visitors in the first two weeks alone and since then, more than half of the 150,000+ visitors have clicked through to the specific pages belonging to the research codes.

"Build it and they will come" - when I wrote this recap last year, there was 1 OEM with a Tesla SKU as a part of their line-up. Today, this number is up to 9, with a total of 19 Tesla-specific SKUs now available, many using the Tesla M2050 GPU Computing Module. The list includes all the major players such as Cray, Dell, HP and SGI, but perhaps most notable is IBM, who in May became the first major OEM to offer a Tesla-based server solution in its iDataPlex line. For IBM, it was a sign that GPU Computing was mature enough to warrant their entry into the space. Dave Turek, IBM VP of Deep Computing, said:
"I think what's changed is that customers have been experimenting for a long time and now they're getting ready to buy. It wasn't the technology that drove us to do this. It was the maturation of the marketplace and the attitude toward using this technology. It's as simple as that."

To the Nebulae and Beyond - At the International Supercomputing Conference in June, the world's first Tesla GPU-enabled petaflop supercomputer made its debut. Equipped with 4640 Tesla "Fermi" GPUs, Nebulae at the National Supercomputer Center in Shenzhen, China, made its mark on the Top500 by entering at number 2, with sustained performance of 1.27 petaflops. Another system from the Chinese Academy of Sciences also entered the chart at number 19.
This marked the beginning of what was to be an impressive year for China. As a relative newcomer to the supercomputing space, China is unrestricted by the need to support legacy software and systems, so it has been fearless in its adoption of GPU computing. The country has shown that it understands the significance of supercomputing, as it seeks to evolve from being a manufacturing powerhouse to become a global leader in science and technology.

The Beginning of the Race for Better Science - Following the June list of the Top500, the Undersecretary for Science at the DOE, Steve Koonin, wrote an OpEd for the San Francisco Chronicle. In this piece he voiced his concern about Nebulae, stating that "these challenges to U.S. leadership in supercomputing and chip design threaten our country's economic future." Undersecretary Koonin's concern is that without the latest technologies, the U.S. will fall behind the rest of the world in critical areas of industry, such as simulation for product design. Leadership here enables the U.S. to continue to push the envelope in terms of technology while encouraging innovation.
The sentiment was echoed by others, such as Senator Mark Warner and NVIDIA's own Andy Keane, whose piece on AllThingsD prompted a lot of lively discussion, including this comment from insideHPC:
"I agree with Andy on this one; the Senate should get behind Senator Mark Warner (D-VA) and his amendment to the reauthorization of the America Competes Act. If we as an HPC community, or as a country for that matter, aren't agile enough to adapt, we could find ourselves being trounced by our own inventions."
The use of GPUs to further science was a topic covered in a recent pilot of a documentary series that NVIDIA produced, entitled The 3rd Pillar of Science. In this pilot, we spoke to leading medical experts who are using GPUs for ground-breaking medical methods, such as advanced cancer treatment and real-time open heart surgery.

2200 Geniuses and a Self-Driving Car - After the success of last year's GPU Technology Conference, we were pretty excited to host our 2nd event in September this year. Our attendee numbers grew more than 50%, well above average for a technical conference, and submissions from eager CUDA developers wanting to present their work grew nearly 400%. In fact, we had so many that we doubled the number of sessions at the conference to 280, all of which are online for your viewing and listening pleasure :)
It was pretty interesting to see the difference in the show since last year. The sheer breadth of topics covered made the show unlike any other - from astrophysics to video processing, from computational fluid dynamics to neuroscience and from energy exploration to designing autonomous cars. Tables were filled with engineers, scientists, developers, students and researchers, all sharing experiences and ideas. We'll be staying in San Jose, California for GTC 2011, and we hope to see you all there.
Here are a few of my favorite quotes from members of the press that attended:
"Absolutely one of the best - and most important conferences in the technology and advanced computing sector" - The Exascale Report
"What we are seeing here is like going from propellers to jet engines." - insideHPC
"...GTC is growing even as it specializes on just one aspect of NVIDIA's business, the CUDA platform for GPU computing. That's just one of many signals that point to an undeniable trend: the use of GPUs for non-graphics computation is on the rise, led largely by NVIDIA's efforts." - Tech Report
"NVIDIA's GTC is a blast. The demos, keynotes, exhibits, technical papers, and emerging companies' presentations are first class, interesting and informative. Well worth the price of admission. There was no heavy product messaging, no call to action to buy something other than the idea that parallel processing is here and it's important - and by our observations it was mission accomplished." - Tech Watch

Turbocharged Tools - This year we saw GPU-enabled, production releases of some of the most important applications in the technical and scientific computing space. ACUSIM Software launched a GPU-enabled version of its CFD software AcuSolve, delivering double the performance for its users.
Tom Lange, director of Modeling and Simulation at P&G said:
"GPU-accelerated CFD allows for more realism, helping us replace slow and expensive physical learning cycles with virtual ones. This transforms engineering analysis from the study of failure to true virtual trial and error, and design optimization."
ANSYS released performance data on its CUDA implementation of ANSYS Mechanical, revealing that CUDA helps cut turnaround times for complex simulations in half. Wolfram Research released the latest version of Mathematica, delivering for its users, in some cases, speed increases of more than 100X from within the familiar confines of the Mathematica programming environment. Check out the video here of their demo earlier this year at Siggraph. And finally, NVIDIA and MathWorks collaborated on the latest release of MATLAB, 2010b, which adds support for GPU acceleration for users of Parallel Computing Toolbox and MATLAB Distributed Computing Server.

Cloudy with a chance of GPUs - this year saw the first GPU deployments to the Cloud, from Peer1 in July and Amazon Web Services (AWS) in November. Developing for the CUDA architecture of NVIDIA GPUs already offers the lowest cost of entry for any HPC architecture, but with these new services, you don't even need to buy the hardware yourself. Through AWS, for example, you can now get access to 2 Tesla 20-series GPUs and 2 CPUs for just $2.10 an hour. Businesses of all sizes can now run heavy-duty simulations and more with simple on-demand pricing, and no large up-front capital investment. GigaOm Pro had this to say about the announcement:
"Performance (of Amazon's Cluster Compute Instances) was high already, and the addition of GPUs just ups the octane level. According to a benchmark test by HPC cloud-resource middleman Cycle Computing, GPU Instances outperform in-house GPU clusters in certain cases."
Amazon's CTO, Werner Vogels, published an interesting blog, and one of Amazon's technical specialists, Jeff Barr, gave a great technical overview of the new service.

Lean, Mean & Green - The year ended with a bang for the Tesla business at SC'10 in New Orleans. The final Top500 and Green500 lists of the year were announced and Tesla had its best showing yet. Just prior to SC'10 commencing, the National Supercomputer Center in Tianjin announced Tianhe-1A which, with a Linpack score of 2.57 petaflops, secured the #1 spot on the list. Two other Tesla GPU-enabled systems made the Top 5: the aforementioned Nebulae, and Tsubame 2.0 from Tokyo Tech.
Tsubame 2.0 was ranked at #2 in the Green500, but more notably it was the only petaflop system in the entire Top 10. Equipped with 4200 Tesla GPUs, yet consuming just 1.340 megawatts, it is, by far, the most power-efficient petaflop system the world has ever seen and an incredible achievement from Prof. Satoshi Matsuoka and his team.
NVIDIA and its customers were also recognized in a number of industry awards at the show. GPUs were highlighted in two Gordon Bell awards. The best student paper went the way of Tokyo and Purdue Universities, who collaborated on a new interface to make parallel programming on the GPU even more accessible. And perhaps most exciting, we saw some major organizations receiving honors for their work with GPUs, including Schlumberger, Citadel Investment Group and Weta Digital.

Some Years in Review for my Year in Review - while writing this, a couple of other year in review articles caught my eye, and included some quotes that I thought would make a fitting end to this recap.
HPCwire released their biggest trends of the year podcast last week and pronounced GPU Computing as the #1 Trend of the Year. They commented:
"This year, it (GPU Computing) hit the mainstream, deployed by all the major vendors."
They also added that:
"If NVIDIA hadn't been there, this wouldn't have happened. AMD was only lukewarm about this. NVIDIA put energy and money into it. They changed the trajectory of GPU computing, without a doubt. NVIDIA CUDA made this possible."
Another article recently appeared on O'Reilly Media, who produce a wealth of books, online services, magazines, research, and conferences for the technical computing community. Their summary was that GPUs coupled with CPUs are the architecture of choice for the processing of computationally heavy data:
"You won't get the processing power you need at a price you want just by enabling traditional multicore CPUs; you need the dedicated computational units that GPUs provide."
We couldn't agree more :)

And so 2011 is upon us. From everyone in the NVIDIA Tesla and CUDA teams, we wish you a happy and successful New Year.