Cheating in benchmarks is a problem that rears its head in a lot of different contexts in the PC and mobile industries, but up until now, the cheating has typically been limited to OEM implementations of a different company’s SoC. MediaTek, apparently, has decided to simplify the process and has been selling devices that include built-in benchmark cheating functions in firmware.
That’s the word from Anandtech, which has been conducting an ongoing investigation into this problem. The site’s investigation found multiple devices from multiple manufacturers, each with a similar Power_Whitelist_CFG.xml file listing the same applications. Anandtech presents evidence that these files come from MediaTek and are integrated into the SoC BSP (Board Support Package), defined as “the layer of software containing hardware-specific drivers and other routines.”
There are a lot of disturbing angles to this cheating. It’s been going on since 2016, it’s present in multiple devices including those from companies like Sony, it impacts a wide variety of tests including more recent AI tests, and it includes applications we haven’t seen companies cheating in before. Because this type of cheating works by throwing out thermal and power limits and letting the SoC go nuts, there’s also the risk of faster battery degradation and overheating, with no actual performance gain in everyday use. Remember: This isn’t some feature a company built in to boost performance in applications you already use. It’s intended to lie to you by making you think performance gains in applications are larger than they are.
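To make the mechanism concrete, here is a minimal sketch of how a whitelist-triggered power profile could work. It is an illustration only, not MediaTek’s actual firmware logic: the package names, the profile fields, and the select_profile function are all hypothetical, and nothing here is taken from the contents of the real Power_Whitelist_CFG.xml.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerProfile:
    """Hypothetical description of how the SoC is governed for one app."""
    name: str
    thermal_limits_enforced: bool
    power_limits_enforced: bool


# Hypothetical profiles: everyday apps are throttled, whitelisted apps are not.
NORMAL = PowerProfile("normal", thermal_limits_enforced=True, power_limits_enforced=True)
UNRESTRICTED = PowerProfile("unrestricted", thermal_limits_enforced=False, power_limits_enforced=False)

# Placeholder package names (assumptions for illustration, not real whitelist entries).
BENCHMARK_WHITELIST = frozenset({
    "com.example.cpubenchmark",
    "com.example.aibenchmark",
})


def select_profile(package_name: str) -> PowerProfile:
    """Pick a profile from the app's package name alone, not from its workload."""
    if package_name in BENCHMARK_WHITELIST:
        return UNRESTRICTED
    return NORMAL


if __name__ == "__main__":
    for pkg in ("com.example.cpubenchmark", "com.example.game"):
        print(pkg, "->", select_profile(pkg).name)
```

Under this assumed design, the decision depends only on the application’s name. An identical workload shipped under a different package name would fall back to the normal profile, which is why a scheme like this says nothing about the performance of the apps people actually run.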
The Problem With Cheating
MediaTek’s response to Anandtech is nothing but a self-contradictory semi-admission of guilt. The company claims that it “follows accepted industry standards” and that tests are run at maximum clock and power draw, without any of the thermal limits that apply in other circumstances, in order to “show the full capabilities of the chipset.”
There’s truth to this. Modern mobile devices are thermally limited and can generally only operate at boost clocks for short periods of time. But the point of benchmarking a smartphone isn’t just to test the individual performance of a low-level component like the CPU, GPU, or AI co-processor. The point of performing these comparisons is to give end-users an overall sense of what it’s like to use the phone.
This applies to PC components as well. Most of the time, people think about reviews as being written based on benchmark results. Reviewers measure performance in a variety of tests, then write the review based on how different hardware compares in various metrics. This is broadly true, but there’s another side to things: Reviewers also attempt to find benchmarks that capture the experience of using the device.
Imagine, for example, that a GPU was designed to use one set of thermal and power rules when running popular benchmarks and another set of rules that left it much closer in performance to a competitor when running the most popular games. This breaks the entire point and purpose of a benchmark: It’s literally no longer telling you anything valuable about the larger experience of using the hardware.
MediaTek’s response states: “We believe that showcasing the full capabilities of a chipset in benchmarking tests is in line with the practices of other companies and gives consumers an accurate picture of device performance.”
The first claim may be true, but the second definitionally isn’t. The point of this sort of cheating is to give customers an inaccurate picture of device performance.
One of the major problems with this kind of cheating is that once you’ve started, it’s difficult to stop. For an example of why, consider our recent Surface Laptop 3 review. That review compared CPU performance between the 7700HQ (a 2016 quad-core Kaby Lake with a 45W TDP) and the Core i7-1065G7 (a 2020 quad-core Ice Lake with a 15W TDP). The 7700HQ was consistently faster in multi-threaded tests thanks to its higher TDP, but still fell behind the newer chip in single-threaded tests.
Now, imagine that Intel and Microsoft allowed the Surface Laptop 3 to boost up to 45W TDP temporarily while being tested against other chips and only enforced the 15W limit in regular usage. The results of my testing would have been radically different. In reality, a 45W TDP Kaby Lake CPU is often still faster than a 15W Ice Lake CPU in multi-threaded code. In this hypothetical, however, a temporary kick to 45W would have allowed the Ice Lake to make hash of Kaby — and it would have made the Surface Laptop 3 look as if CPU efficiency had improved far more than it has.
But if Intel and Microsoft had begun manipulating benchmark results in this way back in 2016, changing the methodology in 2020 would make Ice Lake look far weaker than it actually was. Once you’ve started lying to customers, you typically have to either keep doing it or slowly unwind the effort over time, in a gradual enough way that people don’t realize what has happened.
Manipulating test results like this always backfires on the company doing it. The only thing MediaTek has proven is that its engineering teams are incapable of matching the results other manufacturers achieve in an honest fashion and must, therefore, cheat to make up the difference. A company that lies to you about its performance will have no qualms lying to you about everything else.