A Timely Discovery: Examining Our AMD 2nd Gen Ryzen Results
by Ian Cutress & Ryan Smith on April 25, 2018 11:15 AM EST

A Timely Re-Discovery
Most users have no need to worry about the internals of a computer: point, click, run, play games, and spend money if they want something faster. However, one of the more important features of a system is how it measures time. A modern system relies on a series of hardware and software timers, both internal and external, to maintain a linear relationship between requests, commands, execution, and interrupts.
These timers serve different purposes, such as pacing instruction execution, maintaining video coherency, tracking real time, or managing the flow of data. Timers can (but do not always) use external references to ensure their own consistency – damage, unexpected behavior, and even the thermal environment can cause a timer to lose accuracy.
Timers are highly relevant for benchmarking. Most benchmark results are a measure of work performed per unit time, or in a given time. This means that both the numerator and the denominator need to be accurate: the system has to be able to measure how much work was processed and how long it took to do so. Ideally there is no uncertainty in either of those values, giving an accurate result.
With the advent of Windows 8, and between Intel and Microsoft, the way timers were used in the OS was changed. Windows 8 had the mantra that it had to ‘support all devices’, all the way from high-cost systems down to embedded platforms. Most of these platforms use what is called an RTC, a ‘real-time clock’, to maintain real-world time – typically a hardware circuit found in almost all devices that need to keep track of time and the processing of data. However, compared to previous versions of Windows, Microsoft changed the way it uses timers so that the OS remained compatible with systems that did not have a hardware RTC, such as low-cost and embedded devices: the RTC was an extra cost that could be saved if the software was built to work without it.
Ultimately, any benchmark software has to probe the OS for the current time during the run in order to report an accurate result at the end. However, the concept of time, without an external verifying source, is an arbitrarily defined constant – without external synchronization, there is no guarantee that ‘one second’ on the system equals ‘one second’ in the real world. For the most part, all of us rely on the reporting from the OS and the hardware that this equality is true, and there are a lot of hardware/software engineers ensuring that this is the case.
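As a concrete illustration, the sketch below (our own minimal example, not taken from any benchmark's source code) shows how Windows software typically asks the OS for elapsed time via QueryPerformanceCounter; whatever score comes out is only as honest as the tick rate the OS reports.

```cpp
// Minimal sketch (our own example, not from any benchmark's source): measuring
// "work per unit time" on Windows by asking the OS for elapsed ticks via QPC.
#include <windows.h>
#include <cstdio>

int main() {
    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);   // ticks per second of whatever timer backs QPC
    QueryPerformanceCounter(&start);

    // Stand-in workload: the "numerator" of the benchmark score.
    const int iterations = 100000000;
    volatile double work = 0.0;
    for (int i = 0; i < iterations; ++i)
        work = work + i * 0.5;

    QueryPerformanceCounter(&stop);

    // The "denominator": elapsed ticks converted to seconds using the reported rate.
    double seconds = double(stop.QuadPart - start.QuadPart) / double(freq.QuadPart);
    printf("%d iterations in %.3f s -> %.1f Miter/s\n",
           iterations, seconds, iterations / seconds / 1e6);
    return 0;
}
```

If the OS's notion of a second drifts from a real second, the 'seconds' term here drifts with it, and the reported score is silently wrong.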
However, back in 2013, it was discovered that it was fairly easy to 'distort time' on a Windows 8 machine. After loading into the operating system, any adjustment to the base frequency of the processor, which is usually 100 MHz, can cause the ‘system time’ to desynchronise from ‘real time’. This was a serious issue in the extreme overclocking community, where world records require the best system tuning: when comparing two systems at the same frequency but with different base clock adjustments, up to a 7% difference in results was observed when there should have been a sub-1% difference. This came down to how Windows was managing its timers, and was observed on most modern systems.
For home users, most would suspect that this is not an issue, as most users tend not to adjust the base frequencies of their systems manually. For the most part that is true. However, as shown in some of our motherboard testing over the years, the frequency response of default BIOS settings can produce an observable clock drift around a specified value, something which can be exacerbated by the thermal environment. Having a system with observable clock drift, and subsequent timing drift, is not a good thing: it depends on the accuracy and quality of the motherboard components, as well as the state of the firmware. This issue has formally been classified as ‘RTC Bias’.
The extreme overclocking community, after analysing the issue, found a solution: forcing the use of the High Precision Event Timer (HPET) found in the chipset. Some of our readers will have heard of HPET before; however, our analysis is more interesting than it first appears.
Why A PC Has Multiple Timers
Aside from the RTC, a modern system makes use of many timers. All modern x86 processors have a Time Stamp Counter (TSC), for example, which counts the number of cycles on a given core and was seen back in the day as a high-resolution, low-overhead way to get CPU timing information. There is also the Query Performance Counter (QPC), a Windows implementation that relies on the processor's counters to provide a higher-resolution alternative to reading the TSC directly, developed with the advent of multi-core systems where a single core's TSC was not always applicable. There is also a timer function provided by the Advanced Configuration and Power Interface (ACPI), which is typically used for power management (which means turbo-related functionality). Legacy timing methodologies, such as the Programmable Interval Timer (PIT), are also in use on modern systems. Along with the High Precision Event Timer, and depending on the system in play, these timers will run at different frequencies.
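To make the distinction tangible, here is a small sketch (our own example, not anything from the article's test suite) that reads two of these sources side by side: the raw per-core TSC via the RDTSC instruction, and QPC, which Windows maps onto whichever source it considers best.

```cpp
// Illustrative sketch (our own, with the caveats in the text): reading two of
// these sources side by side. __rdtsc() returns the current core's raw cycle
// counter; QPC is the Windows abstraction that sits on top of TSC, HPET, or
// the ACPI PM timer, depending on what the OS selects.
#include <windows.h>
#include <intrin.h>
#include <cstdio>

int main() {
    LARGE_INTEGER qpcFreq, qpcNow;
    QueryPerformanceFrequency(&qpcFreq);  // tick rate of the source backing QPC
    QueryPerformanceCounter(&qpcNow);

    unsigned __int64 tsc = __rdtsc();     // raw, per-core, frequency not reported here

    printf("QPC rate  : %lld Hz\n", qpcFreq.QuadPart);
    printf("QPC value : %lld ticks\n", qpcNow.QuadPart);
    printf("TSC value : %llu cycles\n", tsc);
    return 0;
}
```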
The timers are used for different parts of the system, as described above. Generally, the high-performance timers are the ones used for work that is more time sensitive, such as video streaming and playback. HPET, for example, was previously known by its old name, the Multimedia Timer. HPET is also the preferred timer for a number of monitoring and overclocking tools, which becomes important in a bit.
With the HPET timer running at a minimum of 10 MHz as per the specification (a tick every 100 nanoseconds or faster), any code that relies on it is likely to be more in sync with real-world time (the ‘one second in the machine’ actually equals ‘one second in reality’) than code using any other timer.
In a standard Windows installation, the operating system has access to all of the timers available. The software shown above is a custom tool developed to show whether a system has any of those four timers (though a system can have more). For the most part, depending on the software instructions in play, the operating system determines which timer is used – from a software perspective, it is fundamentally difficult to know in advance which timers will be available, so software is often written to be timer agnostic. There is not much of a way to force an algorithm to use one timer or another without invoking specific hardware or instructions that rely on a given timer, although the timers can be probed in software, as with the tool above.
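As a rough stand-in for that kind of probing (this is our own sketch, and the frequency-to-timer mapping below is an assumption that varies by platform), the rate Windows reports for QPC hints at which hardware source is backing it:

```cpp
// Rough probe (our own sketch; the frequency-to-timer mapping is an assumption
// and varies by platform): the rate Windows reports for QPC hints at which
// hardware source is backing it.
#include <windows.h>
#include <cstdio>

int main() {
    LARGE_INTEGER f;
    QueryPerformanceFrequency(&f);
    long long hz = f.QuadPart;

    printf("QPC frequency: %lld Hz\n", hz);
    if (hz == 10000000LL)
        printf("Looks like the fixed 10 MHz TSC-backed default on Windows 10.\n");
    else if (hz > 14000000LL && hz < 15000000LL)
        printf("Looks like HPET (~14.318 MHz on many chipsets), e.g. with HPET forced in the BCD.\n");
    else if (hz < 4000000LL)
        printf("Looks like the ACPI PM timer (3.579545 MHz).\n");
    else
        printf("Some other platform-specific source.\n");
    return 0;
}
```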
HPET is slightly different, in that it can be forced to be the only timer. This is a two-stage process:
The first stage is that it needs to be enabled in the BIOS. Depending on the motherboard and the chipset, there may or may not be an option for this. The option is usually a simple enable/disable, however it is not quite an on/off switch: when disabled, HPET is truly disabled, but when enabled, HPET is merely added to the pool of potential timers that the OS can use.
The second stage is in the operating system. In order to force HPET as the only timer used by the OS, it has to be explicitly listed in the system Boot Configuration Data (BCD). In standard operation, HPET is not in the BCD, so it remains just one of the pool of timers the OS can use. For software to guarantee that HPET is the only timer running, it will typically make this change and request an accompanying system reboot to ensure the setting takes effect. Ever wondered why some overclocking software requests a reboot *before* starting the overclock? One of the reasons is sometimes to force HPET to be enabled.
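For reference, that BCD change is normally made with bcdedit from an elevated command prompt, using the commonly documented useplatformclock value; a reboot is required for the change to take effect either way:

```
rem Force HPET as the platform clock source (the "OS forced" state); reboot afterwards
bcdedit /set useplatformclock true

rem Revert to the default timer selection (the "OS default" state); reboot afterwards
bcdedit /deletevalue useplatformclock

rem Inspect the current boot entry to see whether useplatformclock is set
bcdedit /enum {current}
```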
This leads to four potential configuration implementations:
- BIOS enabled, OS default: HPET is in list of potential timers
- BIOS enabled, OS forced: HPET is used in all situations
- BIOS disabled, OS default: HPET is not available
- BIOS disabled, OS forced: HPET is not available
Again, for extreme overclockers who need benchmark results on Windows 8/10 to be comparable, HPET has to be forced to ensure benchmark consistency. Without it, the results are invalid.
The Effect of a High Performance Timer
With a high performance timer, the system is able to accurately determine clock speeds for monitoring software, or pace video stream processing to ensure everything hits in the right order for audio and video. It can also come into play when gaming, especially when overclocking, ensuring data and frames are delivered in an orderly fashion, and it has been shown to reduce stutter on overclocked systems. And perhaps most importantly, it avoids any timing issues caused by clock drift.
However, there are issues fundamental to the HPET design which mean that it is not always the best timer to use. HPET is a continually upward-counting timer that relies on reading the counter and writing comparator values, rather than a ‘set at x and count down’ type of timer. The speed of the timer can, at times, cause a comparison to fail: by the time the target value has been written to the comparator register, the counter may already have passed it. Using HPET for very granular timing requires a lot of register reads/writes, adding to the system load and power draw, and in a workload that requires explicit linearity it can actually introduce additional latency. One of the biggest benefits to disabling HPET on some systems, for example, is a reduction in DPC latency.
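A simplified sketch of that comparator race is below; the 'registers' are simulated with ordinary variables, since real HPET access goes through memory-mapped registers owned by the OS, so treat this purely as an illustration of the logic:

```cpp
// Hypothetical illustration of the comparator race (not real HPET access; the
// actual registers are memory-mapped and owned by the OS). The main counter
// only counts up, and an interrupt fires when it equals the comparator value.
#include <atomic>
#include <cstdint>
#include <cstdio>

std::atomic<uint64_t> main_counter{0};  // stand-in for HPET's upward-counting main counter
uint64_t comparator = 0;                // stand-in for a comparator register

void arm_timer(uint64_t delta_ticks) {
    uint64_t target = main_counter.load() + delta_ticks;

    // Simulate the counter ticking on while the comparator write is in flight.
    main_counter.fetch_add(delta_ticks + 1);

    comparator = target;  // too late: the counter is already past 'target'

    // With an equality-only comparator the event would now be missed until the
    // counter wraps, so software has to detect and handle the missed deadline.
    if (main_counter.load() >= target)
        printf("deadline of +%llu ticks already passed -- handling it immediately\n",
               (unsigned long long)delta_ticks);
}

int main() { arm_timer(5); return 0; }
```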
242 Comments
Cooe - Wednesday, April 25, 2018 - link
Chris Hook was a marketing guy through and through and was behind some of AMD's worst marketing campaigns in the history of the company. Him leaving is a total non-issue in my eyes and potentially even a plus, assuming they can replace him with someone who can actually run good marketing. That's always been one of AMD's most glaring weak spots.

HilbertSpace - Wednesday, April 25, 2018 - link
Thanks for the great follow up article. Very informative.

Aichon - Wednesday, April 25, 2018 - link
I laud your decision to reflect default settings going forward, since the purpose of these reviews is to give your reader a sense of how these chips compare to each other in various forms of real-world usage.

As to the closing question of how these settings should be reflected to readers, I think the ideal case (read: way more work than I'm actually expecting you to do) would be for you to extend the Benchmarking Setup page in future reviews to include mention of any non-default settings you use, with details about which setting you chose, why you set it that way, and, optionally, why someone might want to set it differently, as well as how it might impact them. Of course, that's a LOAD of work, and, frankly, a lot of how it might impact other users in unknown workflows would be speculation, so what you end up doing should likely be less than that. But doing it that way would give us that information if we want it, would tell us how our usage might differ from yours, and, for any of us who don't want that information, would make it easy to skip past.
phoenix_rizzen - Wednesday, April 25, 2018 - link
Would be interesting to see a series of comparisons for the Intel CPU:

No Meltdown, No Spectre, HPET default
No Meltdown, No Spectre, HPET forced
Meltdown, No Spectre, HPET default
Meltdown, No Spectre, HPET forced
To compare to the existing Meltdown, Spectre, HPET default/forced results.
Will be interesting to see just what kind of performance impact Meltdown/Spectre fixes really have.
Obviously, going forward, all benchmarks should be done with full Meltdown/Spectre fixes in place. But it would still be interesting to see the full range of their effects on Intel CPUs.
lefty2 - Wednesday, April 25, 2018 - link
Yes, I'd like to second this suggestion ;). No one has done any proper analysis of Meltdown/Spectre performance on Windows since Intel and AMD released the final microcode mitigations (i.e. post April 1st).

FreckledTrout - Wednesday, April 25, 2018 - link
I agree, as the timing makes this very curious. One would think this would have popped up before this review. I get this gut feeling that HPET being forced is causing a much greater penalty with the Meltdown and Spectre patches applied.

Psycho_McCrazy - Wednesday, April 25, 2018 - link
Thanks to Ryan and Ian for such a deep dive into the matter and for finding out what the issue was... Even though this changes the gaming results a bit, it still does not change the fact that the 2700X is a very, very competent 4K gaming CPU.
Zucker2k - Wednesday, April 25, 2018 - link
You mean GPU-bottlenecked gaming? Sure!

Cooe - Wednesday, April 25, 2018 - link
But to be honest, the 8700K's advantage when totally CPU limited isn't all that fantastic either. Sure, there are still a handful of titles that put up notable 10-15% advantages, but most are now well in the realm of 0-10%, with many titles in a near dead heat, which compared to the Ryzen 7 vs Kaby Lake launch situation is absolutely nuts. Hell, even when comparing the 1st Gen chips today vs then, the gaps have all shrunk dramatically with no changes in hardware, and this slow & steady trend shows no signs of petering out (Zen in particular is an arch design extraordinarily ripe for software level optimizations). Whereas there were a good number of build/use scenarios where Intel was the obviously superior option vs 1st Gen Ryzen, with how much the gap has narrowed those have now shrunk into a tiny handful of rather bizarre niches.

These being first & foremost gamers who use a 1080p 144/240Hz monitor with at least a GTX 1080/Vega 64. For most everyone with more realistic setups, like 1080p 60/75Hz with a mid-range card or a high end card paired with 1440p 60/144Hz (or 4K 60Hz), the Intel chip is going to have no gaming performance advantage whatsoever, while being slower to a crap ton slower than Ryzen 2 in any sort of multi-tasking scenario or decently threaded workload. And unlike Ryzen's notable width advantage, Intel's general single-thread perf advantage is most often near impossible to notice without both systems side by side and a stopwatch in hand, while running a notoriously single-thread heavy load like some serious Photoshop. Both are already so fast on a per-core basis that you pretty much have to deliberately seek out situations where there'll be a noticeable difference, whereas AMD's extra cores/threads & superior SMT become readily apparent as soon as you start opening & running more and more things concurrently (all modern OSes are capable of scaling to as many cores/threads as you can throw at them).
Just my 2 cents at least. While the i7-8700K was quite compelling for a good number of use-cases vs Ryzen 1, it just.... well isn't vs Ryzen 2.
Tropicocity - Monday, April 30, 2018 - link
The thing is, any gamer (read: gamer!) looking to get a 2700X or an 8700K is very likely to be pairing it with at least a GTX 1070, and more than likely either a 1080/144, a 1440/60, or a 1440/144 monitor. You don't generally spend $330-$350 / £300+ on a CPU as a gamer unless you have sufficient pixel-pushing hardware to match it.

Those who are still on 1080/60 would be much more inclined to get more 'budget' options, such as a Ryzen 1400-1600, or an 8350K-8400.
There is STILL an advantage at 1440p, which these results do not show. At 4k, yes, the bottleneck becomes almost entirely the GPU, as we're not currently at the stage where that resolution is realistically doable for the majority.
Also, as a gamer, you shouldn't neglect the single-threaded scenario. There are a few games that benefit from extra cores and threads, sure, but if you look at the most played games in the world, you'll see that the only things they appreciate are clock speed and single (occasionally dual) threaded workloads: League of Legends, World of Warcraft, Fortnite, CS:GO, etc.
The games that are played by more people globally than any other, will see a much better time being played on a Coffee Lake CPU compared to a Ryzen.
You do lose the extra productivity, and you won't be able to stream at 10 Mbit (Twitch is capped to 6 so it's fine), but you will certainly see improvements when you're playing the game for yourself.
Don't get me wrong here; I agree that Ryzen 2 vs Coffee Lake is a lot more balanced and much closer in comparison than anything in the past decade in terms of Intel vs AMD, but to say that gamers will see "no performance advantage whatsoever" going with an Intel chip is a little too farfetched.