Q & A : Benchmarks : Power Management Efficiency


What is it?

A benchmark specifically designed to measure the efficiency of the power management technologies employed by processors. In effect, it shows how “green” the processors in the system are.

Performance benchmarks do not show whether, or how well, the power management technologies work, as they typically use the full power of the system. Under such high loads, systems switch into the high-performance state, which consumes the most power and tells us nothing about how “green” the system is.

This benchmark does not test the power management efficiency of other devices in the system, e.g. chipset, display, hard drives, etc.

Why do we measure it?

Today, the “Green Issue” is important, with countries, companies and individuals aiming to reduce power consumption, lower carbon (CO2) emissions, etc. Thus it is useful to know how efficient the power management technologies are – or even if they work at all!

Lower power means less generated heat and thus lower fan noise, resulting in a cool and quiet system. Alternatively, smaller thermal solutions can be employed, resulting in smaller systems that perform just as well as older, legacy systems.

All processors on the market today, not just mobile processors, employ power management technologies. All aim to reduce power not just when idle but also when utilisation falls below certain levels, which the benchmark also measures. Traditional processors consumed similar amounts of power whether fully utilised or not!

What do the results mean?

    1. ALU Performance (MIPS) (when Integer/ALU performance is measured)
      FPU Performance (MFLOPS) (when floating-point/FPU performance is measured)

      • The sum of all performance states at the tested workload in MIPS/MFLOPS (millions of integer instructions / floating-point operations per second).
      • The higher the score the better (better performance).
      • Being a sum, the individual scores impact its value.

    2. Power Efficiency Index (no unit):
      • The average of all performance states at the tested workload.
      • The higher the score the better (efficiency is higher).
      • Being an average, the individual scores impact its value.
      • 0.0 means the processor runs at its highest state (greatest power consumption) regardless of workload. Values greater than 1.00 show by how much the performance state is more efficient than the highest performance state.

    3. Average Clock/Voltage Ratios Graph:
      • The ratio of current speed/voltage vs. full speed/voltage at the tested workload.
      • The lower the ratio the better (less speed/voltage = less power used). A lower ratio therefore gives a better score at the tested workload.
      • The power modes available to the processor impact this graph directly. Some processors have just 1 lower power mode; the more power states the better. Also, the lower the available power modes go, the better (how low can they go?).

    4. Workload: Switch between Integer/ALU and Floating-Point/FPU workloads
      • To simulate normal workloads, e.g. music/MP3 or DVD/MPEG playback, use the ALU workload, which is based on the Dhrystone benchmark.
      • To simulate scientific workloads, use the FPU workload, which is based on the Whetstone benchmark.

    5. Frequency: Switch between various rates of interruption
      • To simulate interruptible workloads, e.g. DVD/MPEG playback, use the PAL/NTSC frequency, which breaks the processing into “chunks” at that frequency. Once a chunk of data is processed, the processors remain idle until the next timer tick, similar to processing frames.
      • For continuous workloads, use the continuous mode.

Power Management Efficiency Dialog (c) SiSoftware

 

Typical Results from Processors on the Market

Testing various current processors or just checking out the reference results makes the differences in power management technologies very clear. Let’s see a few examples:

| Processor | ALU Performance (MIPS) | Power Efficiency (no unit) | Commentary |
| --- | --- | --- | --- |
| AMD Athlon X2 / Opteron DC | 11975 MIPS @ 2.6GHz | 2.50 @ 2.6GHz | With 3 low power modes (~40%, ~70%, ~90%) that reduce both frequency and voltage appreciably, AMD has a winner here. Its 2 cores and fast ALU/MIPS performance make it not just “green” but also fast. Do note that lower speeds do not do as well, as they use just 2 or even 1 low power mode. |
| AMD Phenom 9xxx / Opteron QC (Barcelona) | 15710 MIPS @ 2.4GHz | 1.57 @ 2.4GHz | Current Phenoms/Barcelona use only 3 power states out of a maximum of 5, so the power efficiency is not as high as it could be, yet it does well against the competition. Further BIOS/P-state optimisations would improve its “green” performance. |
| Intel Core Duo | 5729 MIPS @ 1.83GHz | 2.08 @ 1.83GHz | With only 1 low power mode (~55%) but a rather shallow ramp-up speed, it does better than expected, though it is no match for processors that have more low power modes. Its 2 cores and fast ALUs allow it to score much better than previous designs. The Solo versions score much lower, though still better than the older Pentium M they replace. |
| Intel Core 2 Duo (Conroe E6700) | 9533 MIPS @ 2.67GHz | 1.65 @ 2.67GHz | With only 2 low power modes (~60%, ~80%), Enhanced SpeedStep does not do as well as competing power management technologies. Its 2 cores and powerful ALUs allow it to handle higher workloads, hence the overall score, which could otherwise be much lower. |
| Intel Core 2 Quad (Penryn X9650) | 18518 MIPS @ 3GHz | 1.57 @ 3GHz | Quad cores allow it to handle higher workloads; multiple power modes allow it a high efficiency that matches the Phenom. Best result on the market so far. |
| Intel Pentium M | 1959 MIPS @ 2.13GHz | 1.75 @ 2.13GHz | With 2 low power modes (~52%, ~72%) but a shallow ramp-up speed, it does well against current offerings in power efficiency; in performance it cannot compete as well against multi-core offerings. |
| Intel Pentium 4-M | 833 MIPS @ 2GHz | 1.25 @ 2GHz | With only 1 low power mode (~60%), no voltage reduction and a steep ramp-up, it is not very power efficient. Its performance is also very low compared to today’s products. |
| Intel Pentium III-M | 161 MIPS @ 750MHz | N/A | With no dynamic power mode switching (except on AC/DC transitions) it has no place here. |
| Transmeta Efficeon TM8000 1GHz | 161 MIPS @ 1GHz | N/A | Due to its low performance, it needs to work at full power for even the lowest workload. Much lower workloads are needed to show its power efficiency. |
| VIA C7-M | 399 MIPS @ 2.2GHz | 1.31 @ 2.2GHz | With 2 power modes (~65%, ~85%) it is quite power efficient; it is let down by its performance and by the lack of multi-core, which would have allowed it to stay in low power modes at higher workloads. |

 

How does it work?

The benchmark uses the established Dhrystone benchmark for ALU workloads and the Whetstone benchmark for FPU workloads. Their operation is exactly as in the Processor Arithmetic benchmark but with a different execution harness: instead of running them at full tilt (to measure performance), a given workload per second is used (e.g. 500MIPS, 500MFLOPS, etc.).

In addition, the workload is not computed in one go, but interrupted based on the given frequency (e.g. 25 times per second = 25Hz, etc.). The purpose here is to simulate the processing of frames, be it audio, video or scientific data, as in “real life” tests.
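The chunking described above can be sketched as follows. This is only an illustration, not the benchmark's actual harness: the function name `chunk_schedule` and its parameters are hypothetical, and the even split of the per-second workload across ticks is an assumption.

```python
def chunk_schedule(workload_per_s: float, frequency_hz: float) -> tuple[float, float]:
    """Split a per-second workload into equal chunks, one per timer tick.

    Returns (work per chunk, seconds between ticks). The even split is an
    assumption; the source only says the workload is interrupted at the
    given frequency.
    """
    if workload_per_s <= 0 or frequency_hz <= 0:
        raise ValueError("workload and frequency must be positive")
    return workload_per_s / frequency_hz, 1.0 / frequency_hz


# Example from the text: a 500 MIPS workload interrupted 25 times per second.
work, period = chunk_schedule(500.0, 25.0)
print(f"{work} million instructions per chunk, one chunk every {period * 1000:.0f} ms")
```

Whatever time is left in each period after the chunk completes is idle time, which is where the processor has the opportunity to drop into a lower power state.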

Why not just use an MP3 / MPEG2 decoder for “real life” testing? Firstly, these algorithms are patented; secondly, the processing time varies depending on the data stream. Using the established benchmarks also gives us scores in meaningful units (MIPS/MFLOPS).

The harness monitors whether the frame decoding at the tested workload is completed before the next “tick” arrives; if more than 5% of frames are dropped, the test fails. This means the workload is too great for the processors to handle in time.
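A minimal sketch of this pass/fail rule is shown below. The function name `workload_passes` and the idea of feeding it a list of measured chunk times are assumptions made for illustration; only the 5% threshold and the "finished before the next tick" criterion come from the text.

```python
def workload_passes(chunk_times_s: list[float], frequency_hz: float,
                    max_dropped: float = 0.05) -> bool:
    """Return True if no more than `max_dropped` of the frames missed their tick.

    A frame counts as dropped when its processing time exceeds the tick
    period (1 / frequency). How the real harness measures this is not
    documented here; this is a hypothetical reading.
    """
    if not chunk_times_s:
        raise ValueError("at least one frame is required")
    period_s = 1.0 / frequency_hz
    dropped = sum(1 for t in chunk_times_s if t > period_s)
    return dropped / len(chunk_times_s) <= max_dropped


# 100 frames at 25Hz (40 ms period): 4 late frames pass, 10 late frames fail.
print(workload_passes([0.01] * 96 + [0.05] * 4, 25.0))
print(workload_passes([0.01] * 90 + [0.05] * 10, 25.0))
```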

The harness also monitors the clock speed and core voltage throughout the tested workload for all processors (cores, SMT units, etc.). At the end of the test, the values are averaged to calculate the efficiency of the power management of the processors. The lower the clock speed and voltage throughout the test, the better the efficiency at that workload.
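The averaging step might look like the sketch below, which produces the kind of clock and voltage ratios plotted in the results graph. The sample layout, the function name `average_ratios` and the choice of a plain arithmetic mean are all assumptions; the text does not give the exact formula that turns these ratios into the efficiency index.

```python
from statistics import mean


def average_ratios(samples: list[tuple[float, float]],
                   full_clock_mhz: float, full_voltage: float) -> tuple[float, float]:
    """Average (clock MHz, core voltage) samples and express them relative
    to the full-speed values. Lower ratios mean less power was used.

    `samples` is a hypothetical flat list covering every core and every
    sampling instant during one tested workload.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    clock_ratio = mean(clock for clock, _ in samples) / full_clock_mhz
    voltage_ratio = mean(volts for _, volts in samples) / full_voltage
    return clock_ratio, voltage_ratio


# A processor that spends half the test at 1000 MHz / 1.0 V
# and half at its full 2000 MHz / 1.2 V.
samples = [(1000, 1.0), (2000, 1.2), (1000, 1.0), (2000, 1.2)]
clock_ratio, voltage_ratio = average_ratios(samples, 2000.0, 1.2)
print(f"clock ratio {clock_ratio:.2f}, voltage ratio {voltage_ratio:.2f}")
```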

The combined power efficiency score is the average of all the efficiencies at each of the workloads tested. The performance value also takes into account the workload value (MIPS/MFLOPS), thus a faster processor that can complete higher workloads will score better than a slower processor even though their power management efficiencies are the same.
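One plausible reading of how the two combined scores are formed is sketched below. It assumes the efficiency index is the arithmetic mean of the per-workload efficiencies and the performance value is the sum of the workloads completed; the function name `combined_scores` and the input layout are hypothetical, and the benchmark's real weighting may differ.

```python
from statistics import mean


def combined_scores(results: list[tuple[float, float]]) -> tuple[float, float]:
    """Combine per-workload results into (performance, efficiency index).

    `results` holds one (workload in MIPS or MFLOPS, efficiency) pair for
    each workload the processor completed without failing the test.
    """
    if not results:
        raise ValueError("no completed workloads")
    performance = sum(workload for workload, _ in results)
    efficiency = mean(eff for _, eff in results)
    return performance, efficiency


# Hypothetical figures: three completed workloads with their efficiencies.
performance, efficiency = combined_scores([(250.0, 2.0), (500.0, 1.5), (1000.0, 1.0)])
print(f"performance {performance}, efficiency index {efficiency}")
```

Under this reading, a processor that can complete an additional, higher workload adds that workload to its performance value, which is why it outscores a slower processor with the same power management behaviour.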

 

Technical Information

  • Algorithm/paradigm: based on the audio/video decoding paradigm
  • Systems supported: multi-core, SMP, SMT, SMP/SMT multi-core systems, NUMA systems
  • Operating systems supported: native 32-bit, 64-bit ports; Windows XP/Server 2003/Vista/Server 2008
  • Threading: as many threads as processor units are used
  • Instruction Sets: SSE2 required for FPU; nothing for ALU
  • Options: The operation is fully automatic; there are no user-configurable settings that affect benchmark operation.
