
From the What's Watts Dept.

Some Clarity around TDP ratings

Managing power usage on multi-core processors has become an important aspect of modern computing systems. At the same time, finding an accurate specification of actual power usage has become more difficult. Knowing power usage is important in many areas, particularly when considering the efficiency of High Performance Computing (HPC) systems. For almost all modern CPUs and GPUs, the only published number that gives a hint about power usage is Thermal Design Power, or TDP.

According to Wikipedia Thermal Design Power is defined as follows:

... is the maximum amount of heat generated by a computer chip or component (often a CPU, GPU or system on a chip) that the cooling system in a computer is designed to dissipate under any workload.

Some sources state that the peak power rating for a microprocessor is usually 1.5 times the TDP rating.

TDP ratings for processors began showing up in the early 2000s. Previously, it was not uncommon to find the power rating in the top-line processor specifications. Admittedly, as multi-core designs, multi-threading, turbo modes, and thermal throttling were incorporated, it became more difficult to pin down the actual electrical power usage of a processor. In addition, the electrical power budget is spread over the whole processor and is highly application and environment dependent (e.g. a poorly cooled system may throttle the clock to reduce the amount of heat generated and thus reduce the amount of electrical power used by the processor).

Pushing the clocks for higher performance requires more current and generates more heat. This heat needs to be moved away from the processor and often requires a heat-sink or cooler (liquid or air heat transport). A larger amount of input electrical current usually means more heat is generated, and indeed there is a direct relationship between input electrical current and the thermal energy created by the processor.

TDP ratings address the amount of heat that needs to be removed and are important for choosing an adequate cooling solution. The metric provides a measure of the heat that must be removed from the processor (Joules per second) to maintain proper operation (i.e. if the processor gets hot, it will self-throttle the clock or possibly shut down). There is also no standard way to compute a TDP rating, and each vendor usually has their own method that approximates the cooling needs of the processor.

The relationship between TDP and electrical power often gets confused because both are reported in Watts. TDP is a measure of heat and processor power is a measure of electrical power, i.e. Power (Watts) = Current (Amperes) x Voltage (Volts). As mentioned, there is a direct relationship between TDP and processor power, but it is not a one-to-one relationship.
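As a small worked example of the electrical side of this distinction, electrical power is the product of voltage and current. The function names and the 12 V / 8.5 A values below are illustrative assumptions, not measurements from this article.

```python
def electrical_power_watts(voltage_v: float, current_a: float) -> float:
    """Electrical power in Watts: P = V * I."""
    return voltage_v * current_a


def current_amps(power_w: float, voltage_v: float) -> float:
    """Current in Amperes drawn by a load of power_w Watts at voltage_v Volts."""
    return power_w / voltage_v


if __name__ == "__main__":
    # Illustrative values only: a 12 V rail carrying 8.5 A.
    rail_voltage_v = 12.0
    rail_current_a = 8.5
    print(electrical_power_watts(rail_voltage_v, rail_current_a), "W")
```

A TDP figure cannot be plugged into this formula; it describes heat to be removed, not the voltage and current delivered to the processor.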

This confusion is now pervasive throughout the industry. Processor specifications often display TDP in an entirely wrong fashion, as shown in Figure One:


Figure One: Example of incorrect power specification.

So What is the Issue?

Oftentimes, TDP is used interchangeably with processor power. For example, it is common to rate processor efficiency in Floating Point Operations per Second per Watt (FLOPS/Watt), where the Watts part is often based on TDP and not actual processor power, presumably because TDP is a commonly published processor specification and finding processor power ratings takes more digging.

The non-standard method of calculating TDP, combined with the fact that it is not the actual processor power requirement, undercuts the intention of such metrics. For example, choosing a power-supply (rated in electrical Watts) for a processor and/or GPU specified in TDP (rated in thermal Watts) can be problematic: if the power-supply is not provisioned above the thermal TDP, the system may not operate or may behave strangely. A rule of thumb is that the required power in Watts is the CPU/GPU TDP times 1.5, with additional headroom for other components. Keep in mind that over-provisioning too much can cause issues as well (e.g. using a 1000W power-supply when 550W is needed can reduce the efficiency of the power-supply; often the power-supply efficiency can drop below 50%).
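The rule of thumb above can be written out as a short calculation. This is only a sketch: the function name, the 200 W GPU TDP, and the 75 W allowance for other components are hypothetical values chosen for illustration, and the 1.5 factor is the rule of thumb quoted in the text, not a vendor specification.

```python
def estimate_supply_watts(tdp_watts, other_components_watts=0.0, peak_factor=1.5):
    """Rough electrical budget: (sum of CPU/GPU TDPs) * peak_factor + headroom.

    tdp_watts              -- iterable of TDP ratings (thermal Watts)
    other_components_watts -- assumed allowance for drives, fans, memory, etc.
    peak_factor            -- rule-of-thumb multiplier from TDP to electrical power
    """
    return sum(tdp_watts) * peak_factor + other_components_watts


if __name__ == "__main__":
    # 65 W CPU TDP plus a hypothetical 200 W GPU TDP and 75 W for everything else.
    print(estimate_supply_watts([65, 200], other_components_watts=75), "W")
```

The result is a starting estimate for the electrical budget, which can then be matched to the nearest power-supply size without over-provisioning by a large margin.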

Is the Difference Enough to Care About?

Understanding the relationship between TDP and processor power is best accomplished by running tests. Unfortunately, most users only have access to wall power meters, which do not show the actual power used by the processor. There are software tools that can probe processor metrics as well. Direct measurements can require expensive test equipment. It is possible, however, to get close to these numbers by using a computing blade powered by a 12 Volt power rail.

Using a 12V powered blade, it is not too difficult to test the processor/motherboard/memory power usage directly. The measurement assumes that the processor uses the lion's share of the power and thus can indicate how much input power is actually used. For these tests a μATX computing blade from a Limulus Computing desk-side cluster was used.

The Limulus Computing blades use a 12V power rail provided by a standard ATX power-supply (normally used for GPU cards). In order to provide the needed voltages for the motherboard, a 12V DC/DC power-supply (converter) from Mini-Box is used. The converter is rated at 160W (electrical) across the 5, 12, and 3.3 Volt rails it presents to the motherboard, and the DC/DC conversion efficiency is about 95% (i.e. very little heat is generated).

The test setup is shown in Figure Two. Note that all Limulus systems employ USB-controlled relays for controlling the 12V power to all blades.


Figure Two: Test setup diagram.

The blade was removed from the Limulus chassis and a power meter was placed between the power relay and the blade. Connectors were added to the power meter so it could easily connect to the chassis and then to the blade. As can be seen in Figure Three, the power meter displays the current, voltage, Watts, and Watt-hours (WattH) for the 12V rail.


Figure Three: In-line power meter.

For basic tests, the NAS Parallel Benchmarks were used. Each kernel benchmark (LU, BT, SP, FT) was run as a single process (not parallel) using size A. To load the system, however, the number of processes was set equal to the number of cores on the processor. For the AMD Ryzen 5 5600X with 6 cores, six instances of each single-process benchmark were run at the same time. This configuration ensured the processor would be adequately loaded with work.
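The loading scheme described above (one serial benchmark instance per core, all started together) might be scripted roughly as follows. This is a sketch and not the script used for these tests: `run_copies` is a hypothetical helper, and the command to run is left to the caller because the location and names of the NAS benchmark binaries depend on how they were built.

```python
import os
import subprocess
import sys


def run_copies(command, copies):
    """Start `copies` instances of `command` at once and wait for all of them.

    Returns the list of exit codes, one per instance.
    """
    procs = [
        subprocess.Popen(command, stdout=subprocess.DEVNULL)
        for _ in range(copies)
    ]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs, so on a processor with SMT enabled
    # it can be twice the physical core count; set `copies` by hand if needed.
    copies = os.cpu_count() or 1
    # Placeholder workload: replace with the benchmark command line to run.
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    exit_codes = run_copies(command, copies)
    print(f"{copies} instances finished, exit codes: {exit_codes}")
```

Starting all instances before waiting on any of them keeps every core busy for the duration of the run, which is the point of the configuration.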

As the benchmarks were run, a script measured the processor load and temperature. The actual power in Watts was recorded from the power meter. Table One shows the results for several of the NAS parallel kernels.
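The monitoring script itself is not shown here, so the following is only a sketch of how load and temperature might be sampled on a Linux system. It assumes the load figure is a percent utilization computed from `/proc/stat`, and it assumes a sensor file at `/sys/class/hwmon/hwmon0/temp1_input` reporting millidegrees Celsius; the sensor path in particular varies from system to system.

```python
import time

STAT_PATH = "/proc/stat"
# Assumption: the hwmon index and sensor file differ between machines.
TEMP_PATH = "/sys/class/hwmon/hwmon0/temp1_input"


def parse_cpu_line(line):
    """Return (total, idle) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = [int(x) for x in line.split()[1:9]]  # user .. steal
    idle = fields[3] + fields[4]                  # idle + iowait
    return sum(fields), idle


def busy_percent(before, after):
    """Percent of CPU time spent busy between two 'cpu' line samples."""
    total0, idle0 = parse_cpu_line(before)
    total1, idle1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (elapsed - (idle1 - idle0)) / elapsed


def millidegrees_to_celsius(text):
    """Convert a sysfs temperature reading (millidegrees C) to degrees C."""
    return int(text.strip()) / 1000.0


def read_first_line(path):
    with open(path) as handle:
        return handle.readline()


def sample(interval_s=1.0):
    """Take one (load percent, temperature C) sample over interval_s seconds."""
    before = read_first_line(STAT_PATH)
    time.sleep(interval_s)
    after = read_first_line(STAT_PATH)
    temperature = millidegrees_to_celsius(read_first_line(TEMP_PATH))
    return busy_percent(before, after), temperature


if __name__ == "__main__":
    try:
        load, temperature = sample()
        print(f"load {load:.1f}%  temperature {temperature:.1f} C")
    except (OSError, ValueError) as err:
        print(f"sensor files not available here: {err}")
```

Electrical power is not sampled in this sketch; in the tests it was read from the in-line power meter.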

NAS Kernel    Load    Temperature    Electrical Power
Idle           0.2    34.6°C          27W
LU            97.5    72.0°C         104W
BT            99.6    63.4°C          85W
SP            99.4    62.4°C          79W
FT            96.2    69.9°C         103W

Table One: Results of processor load test for NAS kernels.

Not the Watts We Were Looking For

Due to the nature of the test, the total measured Watts includes the memory, processor, fan, and motherboard circuitry. The TDP rating for the 5600X is 65 Watts, yet two of the kernels (LU and FT) were running at over 100 Watts of input power. Some of this power may be due to additional memory usage, a faster running fan, and motherboard chipset activity; however, it is unlikely that a large amount of the power used above the idle value (27W) was due to these devices. Indeed, the power use is greater than the TDP value; even with the 27W idle draw subtracted, LU and FT come in at 77W and 76W.

As mentioned, a measure of the power used solely by the processor is not possible with this setup. However, a few things can be concluded from the tests performed on a Ryzen 5600X (TDP rating of 65W):

  1. The TDP Watt rating (as a vendor specified metric for this processor) is not equal to the electrical power usage.
  2. Actual electrical power usage is almost always higher than stated TDP ratings and the (unsourced) Wikipedia "rule-of-thumb" factor (1.5x) appears to be a good gauge of actual power usage for processors and GPUs.
  3. The non-standard manufacturer provided TDP numbers should not be used to rate processor efficiency.
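The comparison behind these conclusions can be reproduced from the Table One values. The sketch below computes two ratios per kernel: total measured power over TDP, and power above idle over TDP. Subtracting idle is a crude way to discount the rest of the blade, and it also discards whatever the processor itself draws at idle, so treat the second ratio as a rough lower bound rather than a processor-only measurement.

```python
TDP_W = 65.0    # vendor TDP rating for the Ryzen 5 5600X
IDLE_W = 27.0   # measured idle power of the whole blade (Table One)

# Measured 12V rail power under load, in Watts (Table One).
MEASURED_W = {"LU": 104.0, "BT": 85.0, "SP": 79.0, "FT": 103.0}


def ratios(measured_w, idle_w=IDLE_W, tdp_w=TDP_W):
    """Return (total / TDP, above-idle / TDP) for one measurement."""
    above_idle_w = measured_w - idle_w
    return measured_w / tdp_w, above_idle_w / tdp_w


for kernel, watts in MEASURED_W.items():
    total_ratio, above_idle_ratio = ratios(watts)
    print(
        f"{kernel}: {watts:.0f} W total = {total_ratio:.2f} x TDP, "
        f"{watts - IDLE_W:.0f} W above idle = {above_idle_ratio:.2f} x TDP"
    )
```

For the four loaded runs, total power lands between roughly 1.2 and 1.6 times TDP, with LU and FT at the top of that range.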

Another aspect to consider is the Package Power Tracking (PPT) setting in the BIOS for recent AMD Ryzen motherboards. The PPT threshold is the socket power consumption permitted across the voltage rails supplying the socket (electrical power). The default for 65W TDP processors is 88W, which correlates more closely with the numbers in Table One. For a 105W AMD Ryzen processor the PPT value is 142W. Both of these values give a ratio of about 1.35 (Processor Power/TDP), which is again close to the stated (but unsourced) Wikipedia value of 1.5. However, the non-standard nature of TDP ratings needs to be considered when making estimates of processor power requirements.
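The PPT-to-TDP ratio can be checked directly from the two default values quoted above. Only those two pairs are used; the helper name is illustrative, and no claim is made here about PPT defaults for other processors.

```python
# Default PPT values quoted in the text, keyed by TDP rating (both in Watts).
PPT_DEFAULTS_W = {65: 88, 105: 142}


def ppt_ratio(tdp_w, ppt_w):
    """Ratio of permitted socket power (PPT) to the TDP rating."""
    return ppt_w / tdp_w


for tdp_w, ppt_w in PPT_DEFAULTS_W.items():
    print(f"TDP {tdp_w} W -> PPT {ppt_w} W, ratio {ppt_ratio(tdp_w, ppt_w):.3f}")
```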

Using the PPT setting to reduce processor power usage in constrained environments (e.g. Edge Computing) will be explored in a subsequent article.