What Kind of Performance Should We Expect From ARM-Based Macs?

Jun 28, 2020

The big question on everyone's mind since Apple unveiled its upcoming shift to ARM is what kind of performance we can expect the new chips to offer. It's not an easy question to answer right now, and there's some misinformation about what the differences between modern x86 and ARM CPUs are in the first place.

It's Not About CISC vs. RISC

Some of the articles online are framing this as a CISC-versus-RISC battle, but that's an outdated comparison.

The "classic" formulation of the x86 versus ARM debate goes back to two different methods for building instruction set architectures (ISAs): CISC and RISC. Decades ago, CISC (Complex Instruction Set Computer) designs like x86 focused on relatively complicated, variable-length instructions that could encode more than one operation. CISC-style CPU designs dominated the industry when memory was extremely expensive, both in terms of absolute cost per bit and in access latencies. Complex instruction sets allowed for denser code and fewer memory accesses.

ARM, in contrast, is a RISC (Reduced Instruction Set Computer) ISA, meaning it uses fixed-length instructions that each perform exactly one operation. RISC-style computing became practical in the 1980s when memory costs became lower. RISC designs won out over CISC designs because CPU designers realized it was better to build simple architectures at higher clock speeds than to take the performance and power hits required by CISC-style computing.
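One practical consequence of the fixed-length versus variable-length distinction is how a decoder finds where each instruction starts. The sketch below is a toy illustration only: both byte formats are invented for this example (the variable-length one simply stores each instruction's length in its first byte) and resemble neither real x86 nor real ARM encodings.

```python
# Toy illustration only: these encodings are invented, not real x86 or ARM.

FIXED_WIDTH = 4  # bytes per instruction in the hypothetical fixed-length ISA


def fixed_length_boundaries(code: bytes) -> list[int]:
    """Every instruction start is known up front: 0, 4, 8, ..."""
    return list(range(0, len(code), FIXED_WIDTH))


def variable_length_boundaries(code: bytes) -> list[int]:
    """In this toy format the first byte of each instruction holds its
    total length, so each start depends on decoding the previous one."""
    starts = []
    offset = 0
    while offset < len(code):
        length = code[offset]
        if length == 0:
            raise ValueError("zero-length instruction in toy stream")
        starts.append(offset)
        offset += length
    return starts


fixed_stream = bytes(12)  # three 4-byte instructions
variable_stream = bytes([2, 0xAA, 5, 1, 2, 3, 4, 1, 3, 0xBB, 0xCC])

print(fixed_length_boundaries(fixed_stream))
print(variable_length_boundaries(variable_stream))
```

In the fixed-length case the boundaries can be computed independently of the instruction contents, while the variable-length walk is inherently sequential. That is one reason variable-length decoders tend to be more involved, though it says nothing by itself about how much power a real decoder uses.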

No modern x86 CPU actually uses x86 instructions internally, however. In 1995, Intel introduced the Pentium Pro, the first x86 microprocessor to translate x86 CISC instructions into an internal RISC format for execution. All but one of the Intel and AMD CPU designs since the late 1990s have executed RISC operations internally. RISC won the CISC-versus-RISC war. It's been over for decades.
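To make the idea of translating a CISC instruction into simpler internal operations concrete, here is a minimal sketch. Everything in it is hypothetical: the `MicroOp` type, the `decode` function, the `tmp0` temporary, and the textual instruction syntax are invented for illustration and do not reflect any real Intel or AMD internal format.

```python
# Hypothetical sketch: the instruction and micro-op names are invented for
# illustration and do not match any real Intel or AMD internal format.
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroOp:
    op: str
    dst: str
    srcs: tuple[str, ...]


def decode(instruction: str) -> list[MicroOp]:
    """Split a toy 'op [addr], reg' instruction into load / op / store
    micro-ops; register-only instructions map to a single micro-op."""
    mnemonic, operands = instruction.split(maxsplit=1)
    dst, src = [part.strip() for part in operands.split(",")]
    if dst.startswith("[") and dst.endswith("]"):
        addr = dst[1:-1]
        return [
            MicroOp("load", "tmp0", (addr,)),          # read memory operand
            MicroOp(mnemonic, "tmp0", ("tmp0", src)),  # do the arithmetic
            MicroOp("store", addr, ("tmp0",)),         # write the result back
        ]
    return [MicroOp(mnemonic, dst, (dst, src))]


for uop in decode("add [rbx], rax"):
    print(uop)
```

A single memory-operand instruction becomes three simple operations, while a register-to-register instruction such as `add rcx, rax` passes through as one. Real decoders handle far more cases, but the shape of the translation is the same.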

20200628.What-Kind-of-Performance-Should-We-Expect-From-ARM-Based-Macs-01.png

The original Pentium Pro decoder, with two simple, fast decoder blocks and one complex, slower block. Designs have evolved since then. Image by Ars Technica.

The reason you'll still see companies referring to this idea, long after it should have been retired, is that it makes for an easy story: ARM is faster or more efficient (if it is) because it's a RISC CPU, while x86 is CISC. But it's not really accurate. The original Atom (Bonnell, Moorestown, Saltwell) is the only Intel or AMD chip in the past 20 years to execute native x86 instructions.

What people are actually arguing, when they argue about CISC versus RISC, is whether the decoder block x86 CPUs use to convert CISC into RISC burns enough power to be considered a categorical disadvantage against x86 chips.

When I've raised this point with AMD and Intel in the past, they've always said it isn't true. Decoder power consumption, I've been told, is in the 3-5 percent range. That's backed up by independent evaluation. A comparison of decoder power consumption in the Haswell era suggested an impact of 3 percent when the L2 and L3 caches are stressed and no more than 10 percent if the decoder itself is the primary bottleneck. The CPU cores' static power consumption was nearly half the total. The authors of the comparison note that the 10 percent figure is artificially inflated by their test characteristics.
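A quick back-of-the-envelope calculation shows why figures in this range don't amount to a categorical disadvantage. The sketch below assumes, unrealistically, that a rival chip could eliminate decode power entirely while matching performance (ARM chips also have decoders, so this is an upper bound). The function name `max_efficiency_gain` is made up for this example.

```python
def max_efficiency_gain(decoder_share: float) -> float:
    """Upper bound on the performance-per-watt gain from removing the
    decoder entirely, given its share of total power (0.0 to <1.0).

    Same performance at (1 - share) of the power means perf/W improves
    by a factor of 1 / (1 - share).
    """
    if not 0.0 <= decoder_share < 1.0:
        raise ValueError("decoder_share must be in [0, 1)")
    return 1.0 / (1.0 - decoder_share) - 1.0


# Shares quoted in the text: 3-5 percent typical, 10 percent worst case.
for share in (0.03, 0.05, 0.10):
    print(f"decoder share {share:.0%}: at most {max_efficiency_gain(share):.1%} better perf/W")
```

Even under that generous assumption, the ceiling is roughly 3 to 11 percent, which is small next to the gains that come from process node and microarchitecture changes.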

A 2014 paper on ISA efficiency also backs up the argument that ISA efficiency is essentially equal above the microcontroller level. In short, whether ARM is faster than x86 has been consistently argued to be based on fundamentals of CPU design, not ISA. No major work on the topic appears to have been conducted since these comparisons were written. One thesis defense I found claimed somewhat different results, but it was based entirely on theoretical modeling rather than real-world hardware evaluation.

CPU power consumption is governed by factors like the efficiency of your execution units, the power consumption of your caches, your interconnect subsystem, your fetch and decode units (when present), and so on. ISA may impact the design parameters of some of those functional blocks, but ISA itself has not been found to play a major role in modern microprocessor performance.

Can Apple Build a Better Chip Than AMD or Intel?

PC Mag's benchmarks paint a mixed picture. In tests like Geekbench 5 and GFXBench 5 Metal, the Apple laptops with Intel chips are outpaced by Apple's iPad Pro (and sometimes by the iPhone 11).

20200628.What-Kind-of-Performance-Should-We-Expect-From-ARM-Based-Macs-02.png

In applications like WebXPRT 3, Intel still leads on the whole. The performance comparisons we can perform between the platforms are limited, and they point in opposite directions.

20200628.What-Kind-of-Performance-Should-We-Expect-From-ARM-Based-Macs-03.png

This implies a few different things are true. First, we need better benchmarks performed under something more like equal conditions, which obviously won't happen until macOS devices with Apple ARM chips are available to be compared against macOS on Intel. Geekbench is not the final word in CPU performance (there have been questions before about how effective it is as a cross-platform CPU test), and we need to see some real-world application comparisons.

Factors working in Apple's favor include the company's excellent year-on-year improvements to its CPU architecture and the fact that it's willing to take this leap in the first place. If Apple didn't believe it could deliver at least competitive performance, there'd be no reason to change. The fact that it believes it can create a permanent advantage for itself in doing so says something about how confident Apple is about its own products.

At the same time, however, Apple isn't shifting to ARM in a year, the way it did with x86 chips. Instead, Apple hopes to be done within two years. One way to read this decision is to see it as a reflection of Apple's long-term focus on mobile. Scaling a 3.9W iPhone chip into a 15-25W laptop form factor is much easier than scaling it into a 250W TDP desktop CPU socket with all the attendant chipset development required to support things like PCIe 4.0 and standard DDR4 / DDR5 (depending on launch window).
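To put those power envelopes in proportion, the following snippet simply divides the wattages quoted above by the 3.9W phone figure. The dictionary labels are mine, and the calculation says nothing about how performance scales with power. It only shows how much larger each target envelope is.

```python
PHONE_WATTS = 3.9  # iPhone chip figure quoted in the text

# Target envelopes quoted in the text; the labels are illustrative.
TARGETS_WATTS = {
    "laptop (low end)": 15.0,
    "laptop (high end)": 25.0,
    "desktop socket": 250.0,
}


def scale_factor(target_watts: float, base_watts: float = PHONE_WATTS) -> float:
    """How many times larger the target power envelope is than the base."""
    return target_watts / base_watts


for label, watts in TARGETS_WATTS.items():
    print(f"{label}: {scale_factor(watts):.1f}x the phone power budget")
```

The laptop targets are a single-digit multiple of the phone budget, while the desktop socket is more than sixty times larger, which is a very different design problem.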

It's possible that Apple will be able to launch a laptop chip superior to Intel's x86 products, but that larger-core desktop CPUs with their higher TDPs will remain an x86 strength for several years yet. I don't think it's an exaggeration to say this will be the most closely watched CPU launch since AMD's Ryzen back in 2017.

Apple's historical pricing and market strategy make it unlikely that the company would attack the mass market. But mainstream PC OEMs aren't going to want to see a rival switch architectures and be decisively rewarded for it while they're stuck with suddenly second-rate AMD and Intel CPUs. Alternatively, of course, it's possible that Apple will demonstrate weaker-than-expected gains, or only be able to show decisive impacts in contrived scenarios. I'm genuinely curious to see how this shapes up.