According to rumors (compiled by our colleagues over at Videocardz), AMD’s Navi 31 GPU will be the company’s first MCM-based (multi-chip module) design and pack a ton of punch. MCM designs have been the holy grail of speculation for quite a few years now, and ever since AMD introduced MCM-based CPUs, an MCM GPU has been the next logical step in the company’s product evolution. In a way, an MCM GPU makes even more sense than an MCM CPU, considering the highly parallel workloads a GPU handles.
AMD quietly working on massive MCM based Navi 31 GPU
An MCM GPU is built from chiplets, combining several of them (usually with a command or IO chip) on a single package. This technique drastically increases the yield of the GPU, because yield falls as die size grows and each chiplet is much smaller than an equivalent monolithic die. At a given defect density, multiple small dies will yield better than one large monolithic die. NVIDIA is also reportedly working on its own MCM-based designs with the Hopper architecture, and Intel has already demoed MCM designs of its Arctic Sound platform with up to 4 tiles. The future is MCM, that much is clear, and the only question is which one of the three will get there first.
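To put rough numbers on the yield argument, here is a minimal sketch using the simple Poisson yield model, where the fraction of defect-free dies is exp(-defect density × die area). The defect density and both die areas below are hypothetical values picked purely for illustration; they are not figures from AMD, from any foundry, or from the rumors.

```python
import math

def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of defect-free dies under the simple Poisson yield model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 = 1 cm^2
    return math.exp(-defects_per_cm2 * area_cm2)

# Hypothetical inputs, for illustration only.
DEFECT_DENSITY = 0.1       # defects per cm^2 (assumed)
MONOLITHIC_AREA_MM2 = 500  # one large die (assumed)
CHIPLET_AREA_MM2 = 250     # each of two chiplets (assumed)

monolithic = poisson_yield(MONOLITHIC_AREA_MM2, DEFECT_DENSITY)
chiplet = poisson_yield(CHIPLET_AREA_MM2, DEFECT_DENSITY)

print(f"monolithic die yield: {monolithic:.3f}")  # 0.607
print(f"single chiplet yield: {chiplet:.3f}")     # 0.779
```

Because good chiplets can be picked from anywhere on the wafer and paired up on the package, the share of wafer area that ends up in sellable GPUs tracks the per-chiplet yield rather than the monolithic one. Real yield models, packaging losses and salvaged partially defective dies would all shift these numbers.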
This is where the rumor part of the article comes in.
According to some relatively unknown leakers retweeted by 3DCenter, AMD’s Navi 31 GPU will be an MCM GPU built from 80-CU chiplets, and the top stock-keeping unit will have 2 of them (for 160 CUs, or 10240 stream processors assuming the CU to SP ratio remains the same at 64). If we assume a conservative clock speed of 1800 MHz, this is a GPU capable of roughly 36.9 TFLOPs of FP32 compute! AMD’s existing RDNA2 architecture was blindsided by NVIDIA’s push to ray tracing, and RDNA3 should fix that weakness as well as increase overall rasterization performance.
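The compute figure follows from simple arithmetic, sketched below. It assumes 64 stream processors per CU (the ratio implied by 160 CUs giving 10240 SPs) and 2 FP32 operations per SP per clock, which is the usual convention of counting a fused multiply-add as two operations. The function name and parameters are ours, for illustration, and the 1800 MHz clock is the article's assumption rather than a leaked spec.

```python
def fp32_tflops(compute_units, sp_per_cu=64, clock_mhz=1800, flops_per_sp_per_clock=2):
    """Peak theoretical FP32 throughput in TFLOPs."""
    stream_processors = compute_units * sp_per_cu
    flops = stream_processors * flops_per_sp_per_clock * clock_mhz * 1e6
    return flops / 1e12

# Rumored top SKU: 2 chiplets x 80 CUs each.
navi31_cus = 2 * 80
navi31_tflops = fp32_tflops(navi31_cus)

print(navi31_cus * 64)          # 10240 stream processors
print(f"{navi31_tflops:.2f}")   # 36.86
```

This is a peak theoretical number; it says nothing about how well two chiplets would scale in real workloads.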
AMD has reportedly been working on its own alternative to NVIDIA’s DLSS technology, which is plausible considering that a large part of a GPU’s functionality in the future will be running DNNs or inference tech. Considering current monolithic GPU dies are already at the very edge of the maximum die size that lithography tools can pattern (the reticle limit), there is definitely some merit to the MCM argument if the companies want to keep aiming for core count doubling every X years. In fact, we have heard of AMD’s MCM ambitions for a very long time, starting with the EHP GPU processor a few years back. Considering the company’s expertise with packaging Zen, a shift to MCM GPUs, while not exactly easy, would not be very difficult either (compared to the transition to Zen, for example).
Underfox on Twitter has earned an almost legendary status for sniffing out obscure AMD patents, and some of his discoveries corroborate the speculation about Navi 31 being an MCM GPU. One patent, covering a command scheduler for a multi-dispatch scheduler, would be a cornerstone of any MCM GPU design. Another, covering a synchronization mechanism for GPUs, is a further piece of the puzzle.
Here is the thing though: we have already started hearing rumors about Navi 41, while Navi 31 is the one that is next in the pipeline. If Navi 31 is going to be an MCM design, we will start to see leaks confirming it before long. AMD has also recently filed a patent for an MCM GPU design, which would more or less confirm that an MCM GPU is in the pipeline:
AMD patents GPU chiplet design for future graphics cards
AMD has filed a patent for something that everyone knew would eventually happen: an MCM GPU chiplet design. Spotted by LaFriteDavid on Twitter and published on Freepatents.com, the document shows how AMD plans to build a GPU chiplet graphics card that is eerily reminiscent of its MCM-based CPU designs. With NVIDIA working on its own MCM design with the Hopper architecture, it’s about time that we left monolithic GPU designs in the past and enabled truly exponential performance growth.
The patent points out that MCM GPUs have not been attempted in the past for several reasons: the high latency between chiplets, the limits of existing programming models, and the difficulty of implementing parallelism across chiplets. AMD’s patent attempts to solve all these problems by using an on-package interconnect it calls the high bandwidth passive crosslink. Each GPU chiplet would communicate with the CPU directly and with the other chiplets via the passive crosslink. Each GPU chiplet would also feature its own cache. This design appears to suggest that each GPU chiplet will be a GPU in its own right and fully addressable by the operating system.
The full patent is given below: