Of course it all depends, but deciding to use the Cloud for HPC is not as simple as it may seem

Computing in the "Cloud" allows computing to be purchased as a service (like electricity) and not as a product (like a generator). Made possible by operating system (OS) virtualization and the Internet, Cloud computing allows almost any server environment to be replicated (and scaled) instantly. Many web service companies find Cloud computing more economical than purchasing (or co-locating) hardware because they can pay for computing services only when needed.

The definition of a "Computing Cloud" can vary depending on the customer and vendor. The definition from Wikipedia is as follows:

Cloud computing is the delivery of computing as a service rather than a product, whereby shared resources, software, and information are provided to computers and other devices as a metered service over a network (typically the Internet).

The ability to rapidly construct and meter needed computing services is what makes the Cloud model successful for both providers and customers. Grid computing made the same promise years ago, but had issues with the rapid delivery of services. Most grid systems offered a low level library compatibility to end users rather than the machine level compatibility of Clouds. Offering full OS virtualization ensured full compatibility for users and eliminated the library mis-match issues that often occurred in Grid systems.

HPC In A Cloud

The advantages of Cloud computing are certainly attractive to HPC users. Indeed, in many cases, users cannot get enough cycles on existing systems and Cloud HPC would be a viable economic alternative to purchasing more hardware. At first glance, Clouds would seem to be a welcome addition to the HPC toolbox. On close inspection, however, traditional Clouds do not (or cannot) offer many important aspects of HPC computing. To illustrate the lack of overlap, consider the following diagram that lists the desirable aspects of a traditional Cloud and those of an HPC System (e.g. a typical cluster). The only shared features are scalability and reliability. The other aspects are orthogonal in nature and represent a serious mismatch between the two approaches.

Taking A Deeper Look

A "Traditional Cloud" offers features that are attractive to web service organizations. Most of these services are single, loosely coupled instances (an instance of an OS running in a virtual environment). There are service level agreements (SLAs) that provide the end user with guaranteed levels of service. The features that are attractive to end users, as shown in the figure above, are as follows:

Contrast these features with those that are attractive to most HPC users:

One shared aspect is resource scalability, that is, the ability of the user to increase the compute resources quickly. Since most Cloud applications are sequential single process jobs, scalability is easily accomplished by adding virtual machines. In the case of HPC, scalability usually refers to an application property that determines how many cores (processors) can be applied to the problem before performance levels off. It can also represent the number of user jobs that can run on an HPC cluster. In essence, in both Clouds and HPC clusters users can scale up to the amount of computing they require. There is a big difference, however, in how the scalability is managed. In a Cloud, more resources are created by adding more virtual machines (OS instances). In a cluster, the resource scheduler provides the physical resources for the user's application. Due to their large shared nature, Clouds often have more raw compute capacity than many large clusters.

Another shared aspect is redundancy through hardware independence. That is, the user, for the most part, does not care about (or control) the exact hardware on which their applications run. Thus, both Clouds and clusters can schedule around broken or failed hardware.

Perhaps the biggest mismatch is in the performance area. HPC applications strive to maximize performance on particular hardware. Clouds only guarantee a "minimal" level of performance in terms of compute and I/O capability. Thus, if your maximum requirements are near the Cloud minimum, then Cloud computing may be a solution. Otherwise, the performance you were expecting may not be possible or may not be delivered on a consistent basis.

The Skinny On Scalability

One important and often misunderstood issue is HPC scalability. There is a general misconception that adding more servers to any HPC problem automatically increases performance. In HPC, scalability is loosely defined by a question, "As I add processors (cores) how much faster will my program run?" A highly scalable program can use many cores, while a less scalable program will show little or no speed-up as more cores are added. Thus, scalability is a function of the program and is well described by Amdahl's Law. There are, however, machine aspects that can contribute to scalability via Amdahl's Law. Simply put, the more "things" have to wait for data, the worse the scalability. (For those who are familiar with Amdahl's Law, this amounts to increasing the sequential portion of the program.)
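Amdahl's Law can be written as speed-up = 1 / ((1 - p) + p / n), where p is the fraction of run-time that can be parallelized and n is the number of cores. The following is a minimal sketch of that formula; the function name and the sample fractions are illustrative and not taken from any particular application.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speed-up predicted by Amdahl's Law.

    parallel_fraction: share of single-core run-time that can be parallelized (0..1).
    cores: number of cores applied to the problem.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative fractions only: a highly scalable, a moderate, and a poorly scaling program.
for p in (0.99, 0.90, 0.50):
    row = ", ".join(f"{n} cores: {amdahl_speedup(p, n):.1f}x" for n in (1, 8, 64, 512))
    print(f"parallel fraction {p:.2f} -> {row}")
```

No matter how many cores are added, the speed-up can never exceed 1 / (1 - p), which is why anything that enlarges the sequential (waiting) portion hurts scalability.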

In HPC the goal is to speed up applications by keeping your resources as busy as possible. If resources are waiting, then you are not getting the best utilization possible and adding more resources may actually make things worse. As stated, scalability is a function of the program. Thus, there are programs that are highly scalable or "embarrassingly parallel" and there are those that are difficult to scale, which we will call "interconnect sensitive."

An HPC cluster can be built using many different types of hardware. In general, the better the connection between cores (the interconnect between server nodes) the better interconnect sensitive programs will run. If a highly scalable program (e.g. image rendering) is run on a cluster with Gigabit Ethernet and then on a cluster with InfiniBand, the scalability and performance would be almost the same (all other things being equal). If however, an interconnect sensitive program (e.g. weather modeling) were run on the same two clusters, the scalability on the Gigabit Ethernet cluster would be much less than that of the InfiniBand cluster and the performance on the InfiniBand cluster would be much better.
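The difference between the two kinds of programs can be illustrated with a toy model. This is a sketch under stated assumptions, not a benchmark: it assumes compute time divides evenly across cores while communication overhead grows linearly with the number of cores. The cost values are hypothetical and are not measurements of Gigabit Ethernet or InfiniBand.

```python
def modeled_speedup(cores: int, comm_cost: float, compute_time: float = 1.0) -> float:
    """Speed-up in a toy model (an assumption, not a measured law):
    compute splits evenly over the cores, and every additional core
    adds a fixed communication cost to the run."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    parallel_time = compute_time / cores + comm_cost * (cores - 1)
    return compute_time / parallel_time


# Hypothetical per-core communication costs, as fractions of single-core run-time.
scenarios = {
    "embarrassingly parallel (no communication)": 0.0,
    "interconnect sensitive, faster interconnect": 1e-5,
    "interconnect sensitive, slower interconnect": 1e-3,
}

for label, cost in scenarios.items():
    row = ", ".join(f"{n}: {modeled_speedup(n, cost):.1f}x" for n in (8, 32, 128))
    print(f"{label} -> {row}")
```

With zero communication cost the interconnect does not matter at all. With a non-zero cost, the slower interconnect reaches its peak speed-up sooner, and beyond that point adding cores makes the run slower.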

Because the underlying hardware is important to scalability for some programs, maximizing certain aspects can dramatically help improve application performance. Indeed, InfiniBand goes to great lengths to keep the application as close to the "wires" as possible. Most traditional Clouds use either Gigabit or 10-Gigabit Ethernet. HPC instances in these Clouds will absolutely work for embarrassingly parallel programs, but may struggle with those that require a better interconnect. That is, scalability and hence performance will suffer. A true HPC Cloud needs to offer a high performance interconnect that will not limit scalability of some applications.

In addition to a high performance interconnect, an HPC Cloud needs to keep the user as close as possible to the hardware. This requirement runs counter to the virtualization layer that is used on all standard Clouds. Virtualization provides great flexibility to the users, but is designed to keep them from touching the real hardware. As implemented this requirement may limit the flexibility of a true HPC Cloud. There is, however, a way to provide high performance and Cloud flexibility to HPC applications.

Enter Dynamic Provisioning

The key to an HPC Cloud is an idea called "dynamic provisioning." The traditional cluster usually has a fixed Operating System (OS) on all the compute servers. A user program must conform to this specification or it may not run. Similar to a standard Cloud, an HPC Cloud should allow the user to pick and choose (even design) the OS environment for the computing servers. This capability is possible through dynamic provisioning, where all compute servers are bare-metal provisioned by the resource scheduler. In essence, the compute nodes are rebuilt each time a program is executed.

While dynamic provisioning may seem time consuming and inefficient, there are a few things to consider. First, most HPC applications run for hours, days, or even weeks. Giving away a small chunk of run-time is a small price to pay for a flexible Cloud-like environment. Second, and perhaps more important, there are provisioning methods that do not require the hard drive on each worker node to be re-imaged, thus reducing the time required to provision the node.
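The first point is easy to quantify. The sketch below computes the share of total wall-clock time spent provisioning; the ten-minute provisioning time is a hypothetical figure chosen for illustration, not a measurement of any provisioning tool.

```python
def provisioning_overhead(provision_minutes: float, run_hours: float) -> float:
    """Fraction of total wall-clock time (provisioning + run) spent provisioning."""
    if provision_minutes < 0 or run_hours <= 0:
        raise ValueError("provision_minutes must be >= 0 and run_hours must be > 0")
    provision_hours = provision_minutes / 60.0
    return provision_hours / (provision_hours + run_hours)


# Hypothetical 10-minute provisioning step before jobs of an hour, a day, and a week.
for run_hours in (1, 24, 168):
    share = provisioning_overhead(10, run_hours)
    print(f"{run_hours:>3} h job: {share:.2%} of wall-clock time spent provisioning")
```

Under these assumptions the overhead is noticeable for a one-hour job but falls below one percent for a job that runs for a day or longer.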

Using options such as RAM based disks, NFS, and other standard *NIX tools, nodes can be easily provisioned with unique OS environments without touching any of the node hard drives (should they even exist). Hard drives on the nodes can still be used for local scratch storage, but all important OS files and directories are loaded by the resource allocator into a RAM disk. One interesting example of this type of tool is the Warewulf Project. The Warewulf toolset is a freely available package that allows easy creation and management of node images that are then loaded as RAM disk images on the compute servers. Booting a node is actually very fast, and the node image can be easily changed to suit user or application preferences. A flexible commercial solution is Bright Cluster Manager, which allows easy installation, monitoring, and management of clusters. Bright also offers Cloud Bursting, where jobs can be directed to external Clouds.

HPC Clouds that combine high performance interconnects and dynamic provisioning can offer the most desirable Cloud features such as flexibility, scalability, and software choice while also maintaining HPC features that deliver expected performance levels.

The Storage Issue

An application that needs heavy I/O, which implies predictable and consistent I/O rates, is usually highly tuned for a given I/O environment (and vice-versa). Contrast this with the standard Cloud storage where bulk storage is generally flexible and robust. There are few if any service level agreements (SLAs) that will guarantee high I/O rates. The storage is there, growable, and reliable, but not guaranteed to work at HPC levels.

HPC storage is often an engineered solution based on the user needs and application data flow. The baseline plug-and-play solution is the Network File System (NFS), which was never designed to support high performance or parallel access. In light I/O cases, NFS is a valid solution; however, it can quickly become a bottleneck for large clusters due to its shared design. A specialized high performance NAS can help in this situation, but even this capability is not normally found in the standard Cloud. Higher performance solutions that employ distributed or parallel file systems are the preferred method in many clusters.
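A rough way to see why a single shared file server becomes a bottleneck is to divide its bandwidth among the nodes that use it at the same time. The sketch below is a best-case estimate only: it ignores protocol overhead, caching, and metadata contention, and the 1000 MB/s server figure is a hypothetical value.

```python
def per_node_bandwidth(server_mb_per_s: float, nodes: int) -> float:
    """Best-case bandwidth share (MB/s) per node when every node
    reads from or writes to one shared file server simultaneously."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return server_mb_per_s / nodes


# Hypothetical file server delivering 1000 MB/s in aggregate.
for nodes in (4, 32, 256):
    share = per_node_bandwidth(1000.0, nodes)
    print(f"{nodes:>3} nodes -> at most {share:.1f} MB/s per node")
```

Parallel and distributed file systems address this by spreading data over many servers, so aggregate bandwidth can grow along with the cluster.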

One solution is the adoption of pNFS (parallel NFS). The pNFS standard allows storage vendors to supply high performance storage through a standard and familiar method. The back-end storage, which will determine the ultimate performance, depends on the vendor's technology. The I/O rate will be a necessary part of the Cloud HPC SLAs. It is unlikely that the traditional Cloud will ever offer this level of service, and there is not much incentive for it to include pNFS or any other high performance storage. A properly engineered HPC Cloud should have a robust I/O solution if it is to serve a larger portion of the HPC market.

But Will It Work For Me?

As presented above, the typical Cloud may not be the best candidate for production HPC work. Recent estimates from IDC put the HPC or technical server market at about $10 Billion per year for last year (2011). The exact amount is not important, but the fact that it is a sizable market seems to have attracted many traditional Cloud vendors to the "HPC space."

Making a business case based on a large unsegmented market is a rookie mistake. Experienced start-up veterans will often use market segmentation to better define the "real market." A simplified analysis is useful in the case of Cloud HPC. Starting with the total market size of $10 Billion, we can do a first segmentation based on the need for a high speed interconnect. This requirement means your Cloud needs either InfiniBand (IB) or high performance Gigabit Ethernet. If your applications can be considered Embarrassingly Parallel (EP), then standard 10 Gigabit Ethernet (or even Gigabit Ethernet) would be adequate. A generous ballpark estimate of 60% for the EP portion of the total HPC market results in a $6 Billion market that can be addressed by traditional Clouds.

Next, consider I/O requirements. There are many applications that require heavy I/O; otherwise, computation will stall waiting to read or write data (e.g. scratch files, restart files, and "big data" applications). In this analysis, heavy I/O means the use of enhanced technologies to boost I/O performance. These capabilities are not normally part of a traditional Cloud and may include optimized NAS, distributed file systems, parallel file systems, and even the use of SSDs.

If we assume 40% of the EP market uses non-heavy or light I/O, then the total market share for HPC computing on traditional Clouds has now shrunk to 24% (40% of 60%), or an estimated $2.4 Billion of the $10 Billion total. A summary of the oversimplified analysis is shown in Figure Two below.
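The two-step segmentation is simple enough to check directly. The sketch below uses only the figures quoted in the text (a $10 Billion total market, 60% embarrassingly parallel, and 40% of that with light I/O); the function and variable names are illustrative.

```python
def addressable_market(total: float, ep_share: float, light_io_share: float) -> tuple:
    """Apply the two segmentation steps in order.

    Returns (EP market, EP market with light I/O), in the same units as `total`.
    """
    ep_market = total * ep_share
    light_io_market = ep_market * light_io_share
    return ep_market, light_io_market


total_market = 10e9   # total HPC market estimate quoted in the text, in dollars
ep_market, cloud_ready = addressable_market(total_market, ep_share=0.60, light_io_share=0.40)

print(f"EP segment:              ${ep_market / 1e9:.1f} Billion")
print(f"EP with light I/O:       ${cloud_ready / 1e9:.1f} Billion")
print(f"Share of total market:   {cloud_ready / total_market:.0%}")
```

Changing either percentage shows how sensitive the final figure is to the ballpark estimates used at each step.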

While the numbers may not be exact, the lesson is clear. Not all HPC applications will work "out of the box" in traditional Clouds. Indeed, there are other factors that may further segment the market, such as data movement to and from the Cloud, security, availability, and backups of big data results, etc.

A True HPC Cloud

The following are some general conclusions that may be helpful in deciding if Cloud Based HPC is right for you or your organization. Keep in mind, there are vendors who specialize in HPC Cloud computing such as R-HPC and Penguin Computing.

Based on these observations, performing HPC in the Cloud is indeed possible, but many applications cannot be shoehorned into any Cloud solution. Clouds designed for HPC are needed and represent a viable solution to many organizations. In addition, there may be other issues that need to be discussed before HPC Cloud can deliver low cost and flexible HPC cycles. Don't decommission that cluster just yet!