Heat Generation

In terms of heat, processors and GPUs are rated using Thermal Design Power (TDP), which is the maximum amount of heat a component is expected to generate under a sustained heavy workload. This number helps determine how to cool the system. Keep in mind that TDP is not a hard upper limit on the amount of heat a device can create. There is some controversy regarding the effectiveness and accuracy of TDP metrics, but for the purposes of NDN designs, it is usable. In addition to TDP-rated components, power supplies can generate a lot of heat. Power supplies built to the current 80-Plus rating system ensure that 20% or less of the electricity drawn by the power supply is lost as heat (e.g., a 1000W 80-Plus power supply delivering its full rated load draws about 1250W from the wall and can dissipate up to 250W as heat).
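To make the efficiency arithmetic explicit, here is a minimal Python sketch of the same calculation (the 92% figure is just an illustrative higher-efficiency unit, not something from the 80-Plus discussion above):

```python
def psu_waste_heat(output_watts, efficiency):
    """Heat (W) dissipated inside a power supply delivering output_watts.

    80-Plus efficiency is output/input, so the supply draws
    output/efficiency from the wall and the difference becomes heat.
    """
    input_watts = output_watts / efficiency
    return input_watts - output_watts

print(psu_waste_heat(1000, 0.80))  # 250.0 W at the 80-Plus baseline
print(psu_waste_heat(1000, 0.92))  # ~87 W for a 92%-efficient unit (example)
```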

In the chart in Figure One, the TDP limit per processor is given as 65-95 Watts. For a typical systems builder, this may seem quite low. Conventional wisdom suggests that for the fastest performance, use the fastest processor available. In the data center, this may be true, but cooling processors and GPUs presents other design issues. We are also going to assume air cooling for the moment; water cooling will be addressed in a later installment.

Faster processors generally mean more heat per processor. For example, if we decide to use an AMD Threadripper 2990WX (32 cores, 250W TDP) or an Intel Core i9-7980XE (18 cores, 165W TDP), a very specific CPU cooler is needed. In the case of the Threadripper, a typical cooler may have a volume of 1.9×10^6 mm^3, which translates to 7722 mm^3 per watt--the cooler volume needed per watt. Note that for a 65W processor (e.g., an Intel i7-8700), the cooler volume is about 2.5×10^5 mm^3 and the per-watt rating is 3836 mm^3 (or about half). Thus, as TDP increases, the size of coolers seems to grow in a nonlinear fashion. The calculations were based on commercially available desktop coolers.
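The per-watt figures are easy to reproduce. The short Python sketch below uses the rounded volumes quoted above, so its results (about 7600 and 3846 mm^3/W) differ slightly from the exact figures in the text:

```python
# Cooler volume per watt of TDP, using the rounded volumes from the text
coolers = [
    ("AMD Threadripper 2990WX", 250, 1.9e6),  # (name, TDP in W, volume in mm^3)
    ("Intel i7-8700",            65, 2.5e5),
]

for name, tdp_watts, volume_mm3 in coolers:
    print(f"{name}: {volume_mm3 / tdp_watts:.0f} mm^3 per watt")
# The hotter chip needs roughly twice the cooler volume *per watt*,
# not just twice the total volume -- the growth is nonlinear.
```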

This nonlinear growth means coolers tend to become quite large and need much more air moved through them. Fortunately, bigger and quieter fans (see below) can be used on big coolers; however, the volume of air that must move through the cooler can still create noise. Some coolers are rated at 40dB (decibels) at top speed. From a design standpoint, systems that need to cool over 100 Watts place airflow, noise, and space constraints on the physical design.

These constraints can be lifted if several cooler processors are used instead of a single hot processor (admittedly, "hot" and "cool" are relative terms here). For instance, there are several 65W coolers that are less than 30mm in height, which is usually just under the height of motherboard memory modules. This arrangement allows stacking of multiple motherboards in a confined volume.

Using multiple processors also has other advantages in terms of heat movement. First, instead of moving heat away from one specific location in the design, the total heat, which may actually equal that of a single large hot processor, is spread across several locations and is more easily dissipated (i.e., smaller and quieter coolers can be used). Second, multiple processors also allow for power control. It is possible to turn off entire motherboards if they are unused, thus reducing both power consumption and the amount of "do nothing" heat generated. Finally, there are some system performance aspects that favor this design. As will be covered later in the series, the choice of system architecture can have a huge impact on system performance. For instance, is one large multi-core processor better than a handful of slower processors (i.e., cluster design vs. SMP) from a performance standpoint? As will be shown, the answer is not as simple as counting up cores or memory bandwidth.

Chip designers have gone to great lengths to power down parts of the processor that are not being used--and then quickly power them back up--as a means to reduce heat. Without such measures, modern x86 processors would run "hot" even when they are not doing anything. This thermal design is also used in GPUs. Even with these measures, high-TDP CPUs and GPUs still create a certain level of "do nothing" heat and can become an under-desk "space heater." This type of heat is not an issue in a data center.

NDN design does not preclude high-temperature CPUs and GPUs from being used. The performance per watt may be attractive, but without data center level service, the heat must often be handled by an office or lab HVAC system. Under full load, the amount of heat generated by NDN systems must be considered. A fast processor and two fast GPUs may require moving 600W of heat away from three specific devices and a power supply in the system. Heat concentration means faster and louder fans.
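A standard heat-removal estimate shows the scale of the problem. The Python sketch below assumes a 10°C exhaust-to-intake air temperature rise; that rise, and the formula constants, are textbook values rather than figures from this article:

```python
def required_airflow_cfm(watts, delta_t_c):
    """Approximate airflow (CFM) needed to carry `watts` of heat out of a
    chassis with an air temperature rise of `delta_t_c` degrees Celsius.

    Q = rho * Vdot * cp * dT, with air density ~1.2 kg/m^3,
    cp ~1005 J/(kg*K), and 1 m^3/s = 2118.9 CFM.
    """
    m3_per_s = watts / (1.2 * 1005 * delta_t_c)
    return m3_per_s * 2118.9

# A 600W processor/GPU load with an assumed 10C air temperature rise
print(f"{required_airflow_cfm(600, 10):.0f} CFM")  # roughly 105 CFM
```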

Before moving to the noise aspect of NDN design, a quick mention of the Arrhenius equation is important. In practical terms, the equation implies that roughly every ten degree Celsius increase in temperature doubles the rate of a chemical reaction. Translated to computer hardware, it simply means that "the hotter electronics operate, the more likely they are to fail." Outside of the controlled data center, local environments can be expected to fluctuate, since ambient environments can get hot for many reasons. Depending on the specific needs, designing with multiple lower-temperature (slower) components may offer better stability than a few high-temperature (faster) components.
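For the curious, the rule of thumb falls out of the equation itself, k = A·exp(-Ea/(kB·T)). The sketch below assumes an activation energy of 0.7 eV, a value often quoted for silicon failure mechanisms; the temperatures are chosen purely as an example:

```python
import math

K_B = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_ratio(t1_c, t2_c, ea_ev=0.7):
    """Relative reaction (failure) rate going from t1_c to t2_c (Celsius).

    Uses k = A * exp(-Ea / (kB * T)); Ea = 0.7 eV is an assumed
    activation energy, commonly cited for silicon failure mechanisms.
    """
    t1, t2 = t1_c + 273.15, t2_c + 273.15
    return math.exp((ea_ev / K_B) * (1.0 / t1 - 1.0 / t2))

print(f"{arrhenius_ratio(50, 60):.2f}x")  # ~2x for a 10C rise
```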

Noise Tolerance

The noise component is pretty simple to understand. Anyone who has been in a data center would not want one or two loud servers sitting next to their desk. The general rule is "small fans, big noise; big fans, small noise." Everything else being equal, small fans must rotate faster to move the same amount of air.
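Classic fan affinity laws put numbers on this rule. The sketch below is a rough estimate under the usual assumptions for geometrically similar fans (airflow scales as speed times diameter cubed; the noise change is approximately 70·log10 of the diameter ratio plus 50·log10 of the speed ratio); the 80mm and 120mm sizes are examples, not from the text:

```python
import math

def small_fan_noise_penalty_db(d_small_mm, d_big_mm):
    """Estimated dB penalty for a smaller fan moving the same airflow.

    Fan affinity laws (geometrically similar fans):
      airflow      Q ∝ N * D^3          (N = speed, D = diameter)
      noise change ≈ 70*log10(D2/D1) + 50*log10(N2/N1) dB
    Equal airflow forces N_small = N_big * (D_big/D_small)^3.
    """
    speed_ratio = (d_big_mm / d_small_mm) ** 3
    return 70 * math.log10(d_small_mm / d_big_mm) + 50 * math.log10(speed_ratio)

# An 80mm fan vs. a 120mm fan moving the same amount of air
print(f"{small_fan_noise_penalty_db(80, 120):.1f} dB louder")  # ~14 dB
```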

Understanding that noise levels vary with distance, a general office, lab, or classroom environment has a sound level of about 40-60 dB. In these conditions, people can have conversations or talk on the phone without being bothered by continuous ambient noise. A good measure of fan sound is the sone scale. Sones measure not sound pressure (decibels) but how loudness is actually perceived, and the sone scale is linear: a normal office environment is usually between 1-4 sones. As an example, bathroom fans are usually rated in sones. A quiet fan is less than 2 sones, while a loud fan is 4 sones.
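The standard relation behind the sone scale doubles perceived loudness every 10 phons, with 40 phons defined as one sone. Treating dB roughly as phons (exact only for a 1 kHz tone), the office figures above map neatly:

```python
def sones(phons):
    """Perceived loudness in sones from loudness level in phons.

    Standard relation (valid above ~40 phons): loudness doubles
    every 10 phons, with 40 phons defined as 1 sone.
    """
    return 2 ** ((phons - 40) / 10)

# Treating dB SPL as phons -- a rough approximation
for db in (40, 50, 60):
    print(f"{db} dB ~= {sones(db):.0f} sone(s)")
# 40 dB ~= 1 sone, 50 dB ~= 2 sones, 60 dB ~= 4 sones
```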

A design target of less than two sones is important for NDN systems. Anything louder could be considered mildly annoying and, at worst, create issues with conversations. A good rule of thumb is that when sitting next to an NDN system, it should not impinge on your workflow, cause interruption, or be annoying. This rule may vary by person, age, etc.

Keeping Computing in the Green Box

Designing within the above constraints results in systems that are not usually "off-the-shelf" or easy to construct on your own. Each situation may require some customization to fit within the NDN green box shown in Figure One. These constraints contribute to system design in ways that data center computing can ignore. Over the course of this series, the following aspects will be covered (in no particular order). Like many design efforts, there will be overlapping issues and trade-offs. The good news is that highly effective NDN edge-based systems can provide data center level computing almost anywhere.

  • Processor choice (single or clustered)
  • Power design and distribution
  • Memory (local or distributed)
  • Networking and switching
  • Storage issues
  • Packaging and hacking geometry
  • Installation/administration
  • Benchmarking
  • Price-to-performance

What is in a name?
The icon and name for this series hearken back to the rock band Yes and their album Close to the Edge. The da Vinci-looking flying machine could often be found tucked into small places on their vinyl album covers, beautifully illustrated by William Roger Dean.

Open Design/Open Software

One very practical aspect of NDN systems is a flexible design process. This requirement implies that the underlying software--or "the plumbing"--should be as open as possible. The obvious and logical choice is GNU/Linux, which drives many data center systems. Open software does not preclude closed-source solutions, but rather ensures a large amount of choice and control over the design. In areas like High Performance Computing (HPC) and data analytics (Hadoop/Spark), it also allows very easy migration of applications between the edge and the data center.

Expect Less, Compute More

Based on the design parameters, NDN edge computing may not be performance competitive with similar hardware housed in a real data center. In exchange for this performance hit, there is a huge degree of environmental freedom for NDN systems. Keep this aspect in mind, because as we move through the design of NDN systems, it will be tempting to "Yeah, but..." the design. For instance, "Yeah, but I could just use a server on a desk with 4 GPUs and get better performance." Such a proposal is certainly possible if you can live outside the green box.

By assuming a clearly defined power, heat, and noise envelope, a large amount of "location freedom" is provided to end users. NDN systems should be as easy to relocate as a deskside or desktop PC, and they should provide a reasonable fraction of data center performance, often at lower cost, to anyone. As we journey closer to the edge, our attention turns to the design, efficiency, and performance of NDN systems. Welcome aboard.


