
Cloudflare Will Use AMD EPYC Milan Instead Of Intel Ice Lake Xeons

Cloudflare Gen11 servers will use AMD EPYC Milan processors after Intel Ice Lake performance per watt deemed “unattractive”


Cloudflare is now well and truly embedded in the rivers and lakes that flow through today’s internet infrastructure, and the company constantly looks for ways to improve its own network so that it can provide customers with the best service. Cloudflare published an article on its blog entitled “The EPYC journey continues to Milan in Cloudflare’s 11th generation Edge Server”. Chris Howells, Platform Operations Engineer at Cloudflare, wrote the article, which documents the company’s evolution alongside industry-wide changes:

“We aim to introduce a new server platform to our edge network every 12 to 18 months or so, to ensure that we keep up with the latest industry technologies and developments.

“We continually work with our silicon vendors to receive product roadmaps and stay on top of the latest technologies. Since mid-2020, the hardware engineering team at Cloudflare has been working on our generation 11 server.”

 

64-core EPYC 7713 in Gen11 Cloudflare test servers

Image courtesy of Cloudflare

 

Results per Watt

Chris outlined that one of the defining metrics in building the Gen11 servers is Results per Watt, which is critical in discovering how much more efficient the new build is than previous generations. Operational costs are of course a primary concern, and reducing power consumption is a key part of keeping those costs down.

“We evaluated Intel’s latest generation of ‘Ice Lake’ Xeon processors. Although Intel’s chips were able to compete with AMD in terms of raw performance, the power consumption was several hundred watts higher per server - that’s enormous. This meant that Intel’s Performance per Watt was unattractive.”

Yet again, this is not good news for Intel. Cloudflare casts a long shadow over the industry, and anyone competing in the same space would be unwise not to take its findings on board.
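To make the Results per Watt idea concrete, here is a minimal sketch of the calculation. Cloudflare has not published the underlying figures, so the throughput and power numbers below are purely hypothetical placeholders, and `results_per_watt` is an illustrative helper rather than anything from Cloudflare’s tooling.

```python
def results_per_watt(results_per_second: float, watts: float) -> float:
    """Return throughput delivered per watt of server power."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return results_per_second / watts


# Hypothetical figures, NOT Cloudflare's measurements: two servers with
# equal throughput, one drawing 300 W more than the other.
server_a = results_per_watt(results_per_second=10_000, watts=500)
server_b = results_per_watt(results_per_second=10_000, watts=800)

print(f"Server A: {server_a:.1f} results/s per watt")
print(f"Server B: {server_b:.1f} results/s per watt")
print(f"Server A advantage: {server_a / server_b - 1:.0%}")
```

With equal raw performance, every extra watt counts directly against the metric, which is why a gap of several hundred watts per server matters so much across a whole fleet.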

AMD 48, 56 and 64 Core Tests

When Cloudflare announced that its 10th generation servers would use AMD EPYC 7642 processors (code named Rome), the tech industry was quick to champion the 48 core processor built on AMD’s 2nd generation EPYC architecture. This time around, Cloudflare evaluated 48, 56 and 64 core samples based on AMD’s 3rd generation EPYC architecture (code named Milan). The natural first step was to test the 2nd and 3rd generation 48 core samples head-to-head. After that comparison showed a significant performance boost, the 56 and 64 core tests were eagerly anticipated by the team at Cloudflare.

With the AMD silicon making the shortlist, Cloudflare began testing at its hardware validation lab, modifying existing 10th generation servers itself and installing the new silicon in them. As with most companies in the content delivery network and DDoS mitigation space, Cloudflare’s hardware is sourced and assembled by multiple vendors known as ODMs (Original Design Manufacturers). Working with brand new silicon and firmware is not without its trials and tribulations, and Cloudflare needed to work closely with its ODMs to overcome the few hurdles it experienced, one of those being the Linux kernel panicking on boot.

Once these issues were ironed out, Cloudflare ordered test samples from its ODMs and began verifying performance with synthetic benchmarking tools, including cf_benchmark (CF Benchmarking Utilities) and its own internal tool suite, which applies a synthetic load across the software stack.

Positive Production Test Results

Once confidence grew in the test servers, Cloudflare began to roll out small numbers of servers to production to provide the data needed to support a full migration to the new 11th generation servers. One question still hung over the 56 and 64 core silicon, however:

What kind of performance increase could be expected now that the positive results were in on the 48 core processors?

We can see how each processor stacked up from Cloudflare’s table, published on their blog by Chris Howells:

 

| | AMD EPYC 7642 | AMD EPYC 7643 | AMD EPYC 7663 | AMD EPYC 7713 |
|---|---|---|---|---|
| Status | Incumbent | Candidate | Candidate | Candidate |
| Core Count | 48 | 48 | 56 | 64 |
| Thread Count | 96 | 96 | 112 | 128 |
| Base Clock | 2.3GHz | 2.3GHz | 2.0GHz | 2.0GHz |
| Max Boost Clock | 3.3GHz | 3.6GHz | 3.5GHz | 3.675GHz |
| Total L3 Cache | 256MB | 256MB | 256MB | 256MB |
| Default TDP | 225W | 225W | 240W | 225W |
| Configurable TDP | 240W | 240W | 240W | 240W |

 

Chris wrote: “In the above chart, TDP refers to Thermal Design Power, a measure of the heat dissipated. All of the above processors have a configurable TDP - assuming the cooling solution is capable - giving more performance at the expense of increased power consumption. We tested all processors configured at their highest supported TDP.”

The theory was that, as the 64 core processors have 33% more cores than the 48 core processors, a 33% increase in performance could be expected. Benchmarks, however, showed a somewhat more modest increase. Chris explained that the 64 core processor’s specification includes a lower base clock frequency, which allows it to fit within the 225W power envelope.

“…we found that the 64 core EPYC 7713 gave us around a 29% performance boost over the incumbent, […] having similar power consumption and thermal properties.”
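The gap between the theoretical and the observed gain can be worked through with the figures from the table and the quote above. The sketch below is our own back-of-the-envelope arithmetic, not Cloudflare’s methodology; in particular, "cores × base clock" is only a crude proxy that ignores boost behaviour and architectural differences between Rome and Milan.

```python
# Figures taken from the comparison table and the quoted 29% result.
incumbent_cores, incumbent_base_ghz = 48, 2.3   # EPYC 7642 (Rome)
candidate_cores, candidate_base_ghz = 64, 2.0   # EPYC 7713 (Milan)
observed_gain = 0.29                            # "around a 29% performance boost"

# Gain expected if performance scaled perfectly with core count.
core_gain = candidate_cores / incumbent_cores - 1

# Crude proxy (an assumption for illustration): aggregate base clock.
incumbent_total_ghz = incumbent_cores * incumbent_base_ghz
candidate_total_ghz = candidate_cores * candidate_base_ghz
aggregate_clock_gain = candidate_total_ghz / incumbent_total_ghz - 1

# Share of the ideal core-count gain that was actually realised.
scaling_efficiency = observed_gain / core_gain

print(f"Core count gain:        {core_gain:.1%}")
print(f"Aggregate base clock:   {aggregate_clock_gain:.1%}")
print(f"Observed gain:          {observed_gain:.0%}")
print(f"Scaling efficiency:     {scaling_efficiency:.0%}")
```

The observed gain captures most of the ideal core-count gain while sitting well above what the lower base clock alone would predict, which is consistent with the generational uplift seen in the 48 core head-to-head tests.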

Memory Upgrade

The memory was also upgraded, moving to 384GB of DDR4-3200 from the previous 256GB of DDR4-2933. This followed testing of 256GB, 384GB and 512GB memory configurations, with benchmarking tools such as STREAM used to check for significantly poor performance.

The next phase was to test the memory alongside the various processor configurations and benchmark performance using actual production loads. As Cloudflare’s core HTTP servers use memory to cache web assets, insufficient memory means that assets are pulled from disk storage instead, which has a direct impact on performance. To analyze scenarios such as this, Cloudflare monitors request latency and disk IO performance using rich metric reporting tools such as Grafana and Prometheus. Should request latency rise together with disk IO volume and disk latency, the team can identify situations where there is insufficient memory.

Chris calls this a balancing act in his post, saying: “We want enough memory to take advantage of the fact that serving web assets directly from memory is much faster than even the best NVMe disks.”
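The monitoring signal described above, request latency rising together with disk IO volume and disk latency, can be sketched as a simple check. This is a hypothetical illustration only: the `Sample` fields, the `looks_memory_starved` helper and the 1.2x threshold are our assumptions and do not reflect Cloudflare’s actual Prometheus queries or alerting rules.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One set of averaged metrics for a server (hypothetical fields)."""
    request_latency_ms: float
    disk_read_iops: float
    disk_latency_ms: float


def looks_memory_starved(baseline: Sample, current: Sample,
                         threshold: float = 1.2) -> bool:
    """Flag a server when all three metrics rose by more than `threshold`x.

    A rise in request latency alone could have many causes; the idea is
    that a simultaneous rise in disk IO volume and disk latency points to
    assets being served from disk rather than from the memory cache.
    """
    return (
        current.request_latency_ms > baseline.request_latency_ms * threshold
        and current.disk_read_iops > baseline.disk_read_iops * threshold
        and current.disk_latency_ms > baseline.disk_latency_ms * threshold
    )


# Made-up numbers purely for demonstration.
baseline = Sample(request_latency_ms=20.0, disk_read_iops=5_000, disk_latency_ms=0.20)
healthy = Sample(request_latency_ms=21.0, disk_read_iops=5_200, disk_latency_ms=0.21)
starved = Sample(request_latency_ms=30.0, disk_read_iops=9_000, disk_latency_ms=0.35)

print(looks_memory_starved(baseline, healthy))
print(looks_memory_starved(baseline, starved))
```

In practice such a comparison would be made across servers with different memory configurations under the same production load, which is how the 256GB, 384GB and 512GB options could be weighed against each other.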

A highlight of the tests was the direct comparison of DDR4-2933 and DDR4-3200, which showed not only a performance increase but also a cost benefit: at current market prices, DDR4-3200 would be a cost-effective switch.

SSD Testing

For the 11th generation servers, the disks were also upgraded. Cloudflare opted to move from 3x 1TB drives to 2x 2TB drives, and performance tested various drives for latency and durability. The change not only provided extra storage but also saved 6W of power by having one fewer SSD in the server. Cloudflare elected to use Samsung’s PM9A3 SSDs as its Gen11 drives, which performed in line with Samsung’s own claims and data. “…we could see a 1.5x - 2x improvement in read and write bandwidths,” said Chris.
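The storage change can be put into numbers with some simple arithmetic. The drive counts, capacities and the 6W saving come from the figures above; the annual energy figure assumes the server runs continuously and that the 6W saving is constant, which is our simplification rather than a number Cloudflare published.

```python
# Drive configurations described in the article.
old_drives, old_tb_each = 3, 1
new_drives, new_tb_each = 2, 2
watts_saved_per_server = 6

old_capacity_tb = old_drives * old_tb_each
new_capacity_tb = new_drives * new_tb_each
capacity_gain = new_capacity_tb / old_capacity_tb - 1

# Assumption: the server is powered on all year and the saving is constant.
HOURS_PER_YEAR = 24 * 365
kwh_saved_per_server_year = watts_saved_per_server * HOURS_PER_YEAR / 1000

print(f"Capacity: {old_capacity_tb}TB -> {new_capacity_tb}TB ({capacity_gain:.0%} more)")
print(f"Energy saved per server per year: {kwh_saved_per_server_year:.2f} kWh")
```

A few tens of kilowatt-hours per server is small in isolation, but it multiplies across every server in an edge network of Cloudflare’s size.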

No Change To Mellanox 25G Ethernet

Cloudflare did not swap out its Mellanox ConnectX-4 dual-port 25G Ethernet adapters, as they continued to meet the company’s expectations and needs in the Gen11 servers.

Milan-X and 3D V-Cache

The first question on the lips of everyone following the continuing Intel and AMD processor war is whether AMD has any new chips in the pipeline that can compete against Intel’s Sapphire Rapids in terms of Results per Watt. There is no concrete data available to suggest that Intel can compete with Milan, which suggests AMD will keep its nose in front in the processor race, for now at least.

To further complicate Intel’s engineering roadmaps, details of chip stacking technology for Milan-X were recently leaked by ExecutableFix and Patrick Schur. Particular attention was immediately paid to the mention of 3D V-Cache.

According to AMD CEO Lisa Su, who revealed the news in a keynote address to the virtual Computex exhibition, 3D V-Cache bonds 64MB of 7nm SRAM cache directly onto each core complex, effectively tripling the amount of cache feeding AMD’s Zen 3 cores. On gaming performance, she added: “It averages 15 per cent improvement at 1080p just from 3D V-cache.”

That said, Su did not allude to any business cases for 3D V-Cache, and it is quite clear that AMD is hoping to disrupt the gaming-focused processor market, with Reddit awash with debates over what this means for the already unstable GPU marketplace.

 

Reddit comment by idwtlotplanetanymore regarding Milan-X being almost ready to launch

 

As Reddit swarmed, Twitter lit up overhead with this tweet from prominent leaker @momomo_us:

 

Twitter user @momomo_us tweet regarding pricing on AMD Milan-X chips

 

These leaked prices for Milan-X sent a rapid shockwave through the internet, touching both the high-end gaming and investment communities with comments such as this from Reddit user _lostincyberspace_:

 

Reddit comment from user _lostincyberspace_ regarding expectations of AMD margins and market penetration

 

With the rumour mill now in full swing, and AMD’s official statements few and far between, we have much to look forward to and, at the same time, plenty to marvel at thanks to Cloudflare’s continuing ingenuity and transparency.