[Hardware] AMD's EPYC Milan-X is Official: 3D V-Cache Brings Up To 768MB of L3 Cache, 64 Cores (Updated)


AMD CEO Lisa Su unveiled the first details about the company's EPYC Milan-X processors, which come with a 3D-stacked L3 cache called 3D V-Cache, during its Accelerated Data Center event today. AMD says that its new cache-stacking technology, which it will add to the existing Zen 3-powered EPYC Milan models to create the new Milan-X chips, will bring up to 768MB of total L3 cache per chip. That means there will soon be dual-socket servers with an eye-popping 1.5GB of L3 cache in the system. AMD also shared a few examples of workloads that will benefit, and an impressive benchmark result that shows a 60% performance improvement.

The chips will come to market in Q1 2022, but they are available as a preview instance in Azure now. Microsoft has released its own performance projections, too, and we'll cover those below.

As a quick refresher, AMD introduced its 3D V-Cache technology at Computex 2021, showing a third-gen Ryzen prototype outfitted with an additional chunk of L3 cache. 3D V-Cache uses a novel hybrid bonding technique that fuses an additional 64MB of 7nm SRAM cache, stacked vertically atop the Ryzen compute chiplets, to triple the amount of L3 cache per Ryzen chip. AMD claims that brings up to a 15% performance improvement in some games, meaning those chips will vie for the title of Best CPU for gaming when they come to market early next year. We've since learned many more details about those chips, including deep-dive info on the packaging tech at a Hot Chips presentation earlier this year.

Now AMD is bringing this same tech to its long-rumored Milan-X data center processors, but it hasn't yet shared detailed specifications of the new chips. However, it has confirmed via its briefings and endnotes that the chips will come in at least 16-, 32- and 64-core variants, lining up with an earlier leaked list of the product stack. In fact, we've even seen them listed for sale at a B2B retailer. Here are the purported specs:

As with the consumer variants, AMD stacks a single 6x6mm layer of L3 cache directly over the L3 cache already present on each CCD (compute chiplet).

Each CCD has 32MB of L3 cache before the modification. The vertically-stacked L3 cache slice adds another 64MB, bringing the total to 96MB per CCD. The Milan-X chips will stretch up to 64-core models with eight CCDs, which brings the total to 768MB of L3 cache per chip. AMD has confirmed that its chips support higher stacks of L3, and HardwareLuxx has even found server BIOS settings that enable up to four cache stacks per chip with existing AMD EPYC Milan servers.
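The capacity figures quoted above follow from simple multiplication. Here is a minimal Python sketch of that arithmetic, using only the numbers stated in the article (32MB of base L3 per CCD, one 64MB stacked slice, up to eight CCDs, two sockets). The function names and the `slices` parameter are illustrative, not anything AMD has published.

```python
# Cache-capacity arithmetic for Milan-X, using the figures quoted in the article.
BASE_L3_MB = 32        # L3 already present on each CCD
STACKED_SLICE_MB = 64  # one 3D V-Cache slice stacked on a CCD
MAX_CCDS = 8           # CCD count on the 64-core models


def l3_per_ccd_mb(slices=1):
    """L3 per CCD with `slices` stacked cache slices (hypothetical parameter)."""
    return BASE_L3_MB + slices * STACKED_SLICE_MB


def l3_per_chip_mb(ccds=MAX_CCDS, slices=1):
    """Total L3 across all CCDs on one chip."""
    return ccds * l3_per_ccd_mb(slices)


def l3_per_system_mb(sockets=2, ccds=MAX_CCDS, slices=1):
    """Total L3 across all sockets in a server."""
    return sockets * l3_per_chip_mb(ccds, slices)


if __name__ == "__main__":
    print(l3_per_ccd_mb())            # 96 (MB per CCD)
    print(l3_per_chip_mb())           # 768 (MB per chip)
    print(l3_per_system_mb() / 1024)  # 1.5 (GB in a dual-socket server)
```

The `slices` parameter only exists to show how the total would scale if taller stacks ever ship. The article confirms a single slice per CCD for Milan-X.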

The stacked L3 cache adds roughly 10% to overall latency, which is comparable to the latency impact of simply adding capacity with standard on-die techniques. That's partly because the additional L3 cache slice is somewhat 'dumb': all the control circuitry resides on the existing CCD, which helps reduce the latency overhead. In addition, the larger cache raises L3 hit rates and so reduces trips to main memory. That relieves bandwidth pressure on main memory, which reduces effective memory latency and improves application performance along multiple axes.
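To see why a slightly slower but larger L3 can still come out ahead, here is a rough Python sketch of the standard average-access-time model. Only the 10% L3 latency overhead comes from the article. The miss rates and the nanosecond figures are made-up illustrative assumptions, not AMD measurements, and the model ignores the bandwidth-relief effect entirely.

```python
def avg_access_ns(l3_latency_ns, l3_miss_rate, dram_latency_ns):
    """Average time to serve a request that reaches L3.

    Every request pays the L3 lookup; the fraction that misses
    additionally pays the trip to main memory.
    """
    return l3_latency_ns + l3_miss_rate * dram_latency_ns


# Hypothetical numbers, chosen only to illustrate the trade-off.
BASE_L3_NS = 10.0                   # assumed L3 latency without stacked cache
STACKED_L3_NS = BASE_L3_NS * 1.10   # ~10% overhead, per the article
DRAM_NS = 100.0                     # assumed main-memory latency

baseline = avg_access_ns(BASE_L3_NS, 0.40, DRAM_NS)    # assumed 40% L3 miss rate
stacked = avg_access_ns(STACKED_L3_NS, 0.20, DRAM_NS)  # assumed 20% L3 miss rate

if __name__ == "__main__":
    print(f"baseline: {baseline:.1f} ns, stacked: {stacked:.1f} ns")
```

With these assumed inputs, the extra nanosecond of L3 latency is outweighed by the reduction in main-memory trips. A workload whose working set already fits in 32MB per CCD would see no miss-rate change and would only pay the overhead.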

AMD uses the same Zen 3 cores as normal; the control circuitry for 3D V-Cache was added as a forward-looking design choice during the initial design phases. AMD uses the existing EPYC Milan chips as the building block, so the chips will drop into the SP3 sockets in EPYC servers (a BIOS update is required). That reduces qualification time and speeds time to market.

AMD reiterated many of the benefits of the solder-less hybrid bonding technique that enables 3D V-Cache, like a 200X interconnect density increase over 2D chiplets and a 15X density increase and 3X energy efficiency gain over micro-bump 3D packaging. AMD says hybrid bonding also improves thermals, transistor density, and interconnect pitch over other 3D approaches, making it the most flexible active-on-active silicon stacking tech.

Additionally, AMD says no software modifications are required to leverage the increased cache capacity, though it is working with several partners to create certified software packages. Those packages might see further performance optimizations, too.
