MCD with 3D V-Cache: AMD seems to be preparing graphics chiplets with an additional cache

Splitting CPUs and GPUs into multiple chips, each with its own tasks and resources, has certainly given AMD a technological advantage. The next step in this chiplet strategy came with the stacked SRAM (3D V-Cache) of the Ryzen 7 5800X3D and the Milan-X processors. The next Ryzen and EPYC generation with the additional cache is also already in the starting blocks.

It is normal in this industry for chip manufacturers to study their competitors' solutions; after all, they want to know where the competition stands technologically. Semiconductor engineer Tom Wassick, who works for IBM, bought a Radeon RX 7900 XT, examined it more closely and, as a first step, put it under a 3D X-ray device. With the Radeon RX 7900 series, AMD has switched to a chiplet strategy for consumer GPUs for the first time and has developed a package that consists of a central Graphics Compute Die (GCD) and several Memory Cache Dies (MCD).

Apparently the MCDs have “keep out zones” (KOZ) similar to those on the CCDs of the processors. On the CCDs, these KOZ are used for the internal connections when the SRAM chiplet is placed on top.

The KOZ are similar not only in their dimensions but apparently also in the distance between the contacts (pitch). According to Wassick, this is 17 to 18 µm and thus matches the pitch known from the CCDs of the processors.

It shouldn’t be particularly surprising that AMD is also working on a way to further increase memory or cache capacity for graphics cards. All manufacturers are trying to place as much memory as possible, as fast as possible, as close as possible to the actual processing units. Beyond conventional graphics memory, the simplest solution at present (both in terms of technical implementation and cost) is to use HBM. However, a cache offers much higher bandwidth. AMD wants to use 128 GB of HBM3 for the future Instinct MI300 accelerator. The current Instinct MI250 accelerators with their 128 GB of HBM2e achieve a memory bandwidth of 3,276.8 GB/s.

If you compare this with the Infinity Cache of the RDNA 3 architecture, AMD achieves more than 5 TB/s of bandwidth between the central GCD and the six MCDs. Each MCD offers 16 MB of cache on a chip area of 37.5 mm². With an additional SRAM chip, AMD could increase this to 70 MB per MCD. For the Radeon cards, however, the question of usefulness arises: the hit rate does not increase significantly beyond a certain capacity, which makes such an expensive technology difficult to justify. In the data center, such large and, above all, fast caches may be more in demand, and AMD may already have tested a technology here that will only be used much later.
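The figures quoted above can be put side by side in a few lines of Python. This is only a rough sketch: the constant and function names are made up for illustration, the 5 TB/s is treated as a lower bound ("more than 5 TB/s") and converted as 5,000 GB/s, and the 70 MB per MCD is the speculative stacked capacity mentioned in the article, not a confirmed specification.

```python
# All values are taken from the article; names are illustrative only.
MCD_COUNT = 6                      # MCD positions on the package
CACHE_PER_MCD_MB = 16              # Infinity Cache per MCD today
STACKED_CACHE_PER_MCD_MB = 70      # speculative capacity with extra SRAM
MCD_AREA_MM2 = 37.5                # chip area of one MCD
INFINITY_CACHE_BW_GBS = 5000.0     # "more than 5 TB/s", used as lower bound
MI250_HBM2E_BW_GBS = 3276.8        # Instinct MI250 memory bandwidth


def total_cache_mb(mcd_count: int, per_mcd_mb: int) -> int:
    """Total cache capacity across all populated MCDs."""
    return mcd_count * per_mcd_mb


def bandwidth_ratio(cache_bw_gbs: float, hbm_bw_gbs: float) -> float:
    """How many times faster the cache link is than the HBM interface."""
    return cache_bw_gbs / hbm_bw_gbs


def cache_density_mb_per_mm2(per_mcd_mb: int, area_mm2: float) -> float:
    """Cache capacity per mm² of MCD area (the whole die, not just SRAM)."""
    return per_mcd_mb / area_mm2


if __name__ == "__main__":
    print("Cache today (MB):", total_cache_mb(MCD_COUNT, CACHE_PER_MCD_MB))
    print("Cache stacked (MB):",
          total_cache_mb(MCD_COUNT, STACKED_CACHE_PER_MCD_MB))
    print("Bandwidth ratio (at least):",
          round(bandwidth_ratio(INFINITY_CACHE_BW_GBS, MI250_HBM2E_BW_GBS), 2))
    print("MB per mm²:",
          round(cache_density_mb_per_mm2(CACHE_PER_MCD_MB, MCD_AREA_MM2), 2))
```

With six fully populated MCDs this gives 96 MB today and 420 MB in the hypothetical stacked case, while the cache link comes out at least about 1.5 times faster than the HBM2e interface of the Instinct MI250.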

Middle MCD is just a dummy

It is now also known which of the six MCDs on the Radeon RX 7900 XT is inactive.

AMD Radeon RX 7900 XT reference design PCB

The package for both cards, the Radeon RX 7900 XTX and the Radeon RX 7900 XT, consists of a GCD and six MCDs, but on the Radeon RX 7900 XT one of the MCDs is just a piece of bare silicon, i.e. not a full memory chiplet. On the GPU Wassick examined, the dummy was one of the middle chips. It is installed so that the cooler still has as large and flat a surface as possible to rest on.