AMD is attacking the last Intel/Nvidia bastion, the $29 billion data center market. Crucial to this is the 7-nanometer process of chip maker TSMC, which is used for both the upcoming Zen 2 CPU cores and the Radeon GPUs.
AMD has significantly beefed up the Zen 2 architecture and now aims for twice the throughput per clock compared with Zen and Zen+ when the vector units are used. To this end, the vector units used for AVX/AVX2 calculations in Zen 2 have 256-bit data paths and correspondingly wide execution units.
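To make the per-clock claim concrete, here is a minimal sketch that counts how many internal operations one 256-bit AVX instruction needs for a given data path width. It assumes that Zen and Zen+ process 256-bit AVX instructions as two 128-bit halves; the function name `internal_ops` is purely illustrative, and the model ignores everything except data path width.

```python
def internal_ops(vector_bits: int, datapath_bits: int) -> int:
    """Internal operations needed for one vector instruction when the
    hardware data path is datapath_bits wide (ceiling division)."""
    return -(-vector_bits // datapath_bits)

AVX_BITS = 256

zen_ops = internal_ops(AVX_BITS, 128)   # assumption: 128-bit data path in Zen and Zen+
zen2_ops = internal_ops(AVX_BITS, 256)  # 256-bit data path in Zen 2, as stated above

# Fewer internal operations per instruction means more instructions per clock.
print(f"Zen/Zen+: {zen_ops} ops, Zen 2: {zen2_ops} op")
print(f"Throughput ratio per clock: {zen_ops / zen2_ops:.1f}x")
```

Under this simplified model the ratio is exactly two, which matches the doubling AMD is aiming for; real workloads will also depend on load/store bandwidth and clock behavior.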
AMD has also adapted parts of the pipeline and promises that the high throughput can be sustained in all operating modes. The instruction cache has been improved as well, and the micro-op cache has been enlarged. And of course the oft-cited improved branch prediction is included too.
7 nm chiplets and I/O chips
For Epyc, AMD relies on an asymmetric design with a dedicated I/O chip and satellite dies called chiplets. This allows the manufacturing process to be optimized and risks to be minimized, because only the chiplets containing the actual Zen 2 compute units are manufactured in 7 nm. They are connected to the central I/O chip via Infinity Fabric.
The I/O chip will continue to be produced on the 14 nm process. It also houses the analog drivers, which scale poorly to smaller process nodes. At the same time, the I/O chip can help fulfill the wafer agreement with Global Foundries, which has dropped out of 7 nm development.
For a 64-core Epyc, eight chiplets with eight cores each must be connected.
It is unclear which I/O capabilities besides Infinity Fabric remain in the chiplets. The chiplets could be optimized best if all I/O circuitry were omitted and they were designed as pure compute modules. In that case, however, even simple and above all price-sensitive desktop processors would have to be produced as multi-chip packages with a (possibly simpler) I/O chip.
[Update 07.11.2018 08:05: In addition, the upcoming Zen 2 architecture will be the first x86 CPU to connect accelerator, graphics and storage cards via PCI Express 4.0, doubling the transfer rate compared with PCI Express 3.0.]
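The doubling can be checked with a short calculation. The sketch below uses the per-lane signaling rates and line coding from the PCI Express specifications (8 GT/s for 3.0, 16 GT/s for 4.0, both with 128b/130b encoding); these figures are not from AMD's presentation, and the helper name `lane_bandwidth_gbytes` is illustrative only.

```python
# 128b/130b line coding, used by both PCIe 3.0 and PCIe 4.0 (specification value)
ENCODING_EFFICIENCY = 128 / 130

def lane_bandwidth_gbytes(gigatransfers_per_s: float) -> float:
    """Usable bandwidth of one lane in one direction, in GB/s."""
    return gigatransfers_per_s * ENCODING_EFFICIENCY / 8  # 8 bits per byte

pcie3 = lane_bandwidth_gbytes(8.0)    # PCIe 3.0: 8 GT/s per lane
pcie4 = lane_bandwidth_gbytes(16.0)   # PCIe 4.0: 16 GT/s per lane

for name, per_lane in (("PCIe 3.0", pcie3), ("PCIe 4.0", pcie4)):
    print(f"{name}: {per_lane:.3f} GB/s per lane, {per_lane * 16:.2f} GB/s for x16")
```

Because the encoding overhead is identical in both generations, the usable bandwidth scales exactly with the signaling rate; protocol overhead above the physical layer is not modeled here.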
Rome: upward- and downward-compatible platform
AMD has also announced that the Epyc processors of the Rome generation are compatible with the existing Naples platforms of the first Epyc generation. The successor Milan (Zen 3) will also run on the same platform. However, BIOS updates may be necessary.
[Update: AMD clarified the details in one of the sessions after the presentation. Epyc CPUs of the Zen 2 generation will in principle run in existing Naples systems, but some of the new features require dedicated support and therefore will not work in legacy systems. This probably concerns PCIe 4.0, among other things.]
[Update: Compatibility with the SP3 socket points to eight DDR4 DRAM channels per CPU and 128 PCIe lanes per system. Presumably, the I/O chip contains both the memory controllers and the PCIe root complex, and possibly also the last-level cache (LLC).]
The key: 7 nm production
In addition to doubled transistor density, TSMC's 7 nm process also allows a quarter more performance (that is, clock speed) at the same power consumption, or half the power consumption at the same performance, compared with the current 12/14 nm FinFET process. Depending on product requirements, these benefits can be combined, but not all three at once: the higher clock speed and the lower power consumption in particular exclude each other. The doubled transistor density, however, is available in either case. It leads to smaller chips, which can then be combined on the package substrate.
At the Next Horizon event, to which analysts were also invited, AMD repeated several times that it was fully on schedule with production and was already supplying the first Zen 2 processors as samples. The announced 7 nm Radeon GPU on the MI60 machine learning card is already finished, and the MI60 is even due to reach the market in the fourth quarter of 2018.
[Update 07.11.2018 08:05]:
Performance in comparison
A few hours after the presentation, a separate session provided some more concrete performance figures. AMD pitted a Zen 2 system with a single 64-core Epyc against two dual-socket systems, one from Intel and one of the first Epyc generation. The task was the rendering benchmark C-Ray, running under Linux.
The system with a single "Rome" Epyc finished first after 27.7 seconds, followed by the two Epyc 7601 at 28.5 seconds. The two Xeon Platinum 8180M, which needed 30.5 seconds with their combined 56 cores, came in last.
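To put the three run times in relation, the following small sketch computes how much longer each system took than the fastest one. It uses only the times quoted above; the dictionary labels are informal descriptions rather than official product names.

```python
# C-Ray run times in seconds, as reported from AMD's session
results_s = {
    "1x Epyc 'Rome' (64 cores)": 27.7,
    "2x Epyc 7601": 28.5,
    "2x Xeon Platinum 8180M (56 cores)": 30.5,
}

fastest = min(results_s.values())

# Lower time is better; the ratio shows each run time relative to the fastest.
for system, seconds in results_s.items():
    print(f"{system}: {seconds:.1f} s, {seconds / fastest:.3f}x the fastest time")
```

By this measure the dual Epyc 7601 system took about 3 percent longer and the dual Xeon system about 10 percent longer than the single Rome processor. These are AMD's own numbers from a single benchmark, so they say little about other workloads.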