Cloudflare is now well and truly embedded in the rivers and lakes that flow through today’s internet infrastructure, and it has constantly looked for ways to improve its own network to give customers the best service. The company published an article on its blog entitled The EPYC journey continues to Milan in Cloudflare’s 11th generation Edge Server. Chris Howells, Platform Operations Engineer at Cloudflare, wrote the article, which documents the company’s evolution alongside industry-wide changes:
We aim to introduce a new server platform to our edge network every 12 to 18 months or so, to ensure that we keep up with the latest industry technologies and developments.
We continually work with our silicon vendors to receive product roadmaps and stay on top of the latest technologies. Since mid-2020, the hardware engineering team at Cloudflare has been working on our generation 11 server.
Image courtesy of Cloudflare
Results per Watt
Chris outlined that one of the defining characteristics of the Gen11 build is Results per Watt, which is critical in determining how much more efficient the new servers are than previous generations. Operational costs are of course a primary concern, and reducing power consumption is a key way of keeping those costs down.
We evaluated Intel’s latest generation of ‘Ice Lake’ Xeon processors. Although Intel’s chips were able to compete with AMD in terms of raw performance, the power consumption was several hundred watts higher per server - that’s enormous. This meant that Intel’s Performance per Watt was unattractive.
Yet again, this is not good news for Intel. Cloudflare casts a long shadow over the industry, and it would be unwise to ignore its findings if you are competing in the same space.
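To make the efficiency argument concrete, here is a minimal sketch of a per-watt comparison. The function name and every number in it are hypothetical placeholders chosen for illustration; they are not Cloudflare's measurements. It only shows why two servers with equal raw throughput can look very different once power draw is taken into account.

```python
def results_per_watt(results_per_second: float, watts: float) -> float:
    """Return throughput divided by power draw (higher is better)."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return results_per_second / watts


# Hypothetical figures: identical raw throughput, but server B draws
# 300 W more than server A ("several hundred watts higher per server").
server_a = results_per_watt(results_per_second=10_000, watts=400)
server_b = results_per_watt(results_per_second=10_000, watts=700)

print(f"Server A: {server_a:.1f} results per watt")
print(f"Server B: {server_b:.1f} results per watt")
print(f"B relative to A: {server_b / server_a:.0%}")
```

With these made-up inputs, the server that draws more power delivers a little over half the work per watt, even though raw performance is the same.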
AMD 48, 56 and 64 Core Tests
When Cloudflare announced that its 10th generation servers would use AMD EPYC 7642 processors (code named Rome), the tech industry was quick to champion the 48 core processor built on AMD’s 2nd generation EPYC architecture. This time around, Cloudflare evaluated 48, 56 and 64 core samples based on AMD’s 3rd generation EPYC architecture (code named Milan). The natural first step was to test the 2nd and 3rd generation 48 core samples head-to-head. With that comparison showing a significant performance boost, the 56 and 64 core tests were eagerly anticipated by the team at Cloudflare.
With the AMD silicon making the shortlist, Cloudflare began testing at its hardware validation lab, modifying existing 10th generation servers itself and installing the new silicon in them. As with most companies in the content delivery network and DDoS mitigation space, Cloudflare’s hardware is sourced and assembled by multiple vendors known as ODMs (Original Design Manufacturers). Working with brand new silicon and firmware is not without its trials and tribulations, and Cloudflare needed to work closely with its ODMs to overcome the few hurdles it experienced, one of which was the Linux kernel panicking on boot.
Once these issues were ironed out, Cloudflare ordered test samples from its ODMs and began verifying performance with synthetic benchmarking tools, including cf_benchmark (CF Benchmarking Utilities), and with its own internal tool suite, which applies a synthetic load across the software stack.
Positive Production Test Results
Once confidence grew in the test servers, Cloudflare began to roll out small numbers of servers that would provide the data to support a full migration to the new 11th generation servers. One question still hung over the 56 and 64 core silicon, however:
What kind of performance increase could be expected now that the positive results were in on the 48 core processors?
We can see how each processor stacked up from Cloudflare’s table, published on their blog by Chris Howells:
| | AMD EPYC 7642 | AMD EPYC 7643 | AMD EPYC 7663 | AMD EPYC 7713 |
|---|---|---|---|---|
| Status | Incumbent | Candidate | Candidate | Candidate |
| Core Count | 48 | 48 | 56 | 64 |
| Thread Count | 96 | 96 | 112 | 128 |
| Base Clock | 2.3GHz | 2.3GHz | 2.0GHz | 2.0GHz |
| Max Boost Clock | 3.3GHz | 3.6GHz | 3.5GHz | 3.675GHz |
| Total L3 Cache | 256MB | 256MB | 256MB | 256MB |
| Default TDP | 225W | 225W | 240W | 225W |
| Configurable TDP | 240W | 240W | 240W | 240W |
In the above chart, TDP refers to Thermal Design Power, a measure of the heat dissipated. All of the above processors have a configurable TDP - assuming the cooling solution is capable - giving more performance at the expense of increased power consumption. We tested all processors configured at their highest supported TDP.
The theory was that, because the 64 core processors have 33% more cores than the 48 core processors, a 33% increase in performance could be expected. Benchmarks, however, proved that this was not the case, and only a “modest increase” in performance was seen. Chris explained that the 64 core processor’s specification includes a lower base clock frequency, which allows it to fit within the 225W power envelope.
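The arithmetic behind that expectation can be sketched in a few lines. The core counts and base clocks come from the table above; the "cores × base clock" figure is a deliberately naive model introduced here for illustration, not something Cloudflare published. It ignores boost behaviour, per-core improvements between EPYC generations, and memory bandwidth.

```python
def relative_gain(new: float, old: float) -> float:
    """Return the fractional gain of new over old (0.33 means 33%)."""
    return new / old - 1.0


# Values taken from the table: incumbent EPYC 7642 vs candidate EPYC 7713.
incumbent = {"cores": 48, "base_ghz": 2.3}
candidate = {"cores": 64, "base_ghz": 2.0}

# Gain expected from core count alone.
core_gain = relative_gain(candidate["cores"], incumbent["cores"])

# Naive illustrative model: weight each core by its base clock.
base_clock_gain = relative_gain(
    candidate["cores"] * candidate["base_ghz"],
    incumbent["cores"] * incumbent["base_ghz"],
)

print(f"Core count alone:   {core_gain:.1%}")
print(f"Cores x base clock: {base_clock_gain:.1%}")
```

Core count alone suggests roughly 33%, while weighting by the lower base clock pulls the naive estimate down to about 16%. The gain of around 29% that Cloudflare reports for the EPYC 7713 sits between those two figures.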
…we found that the 64 core EPYC 7713 gave us around a 29% performance boost over the incumbent,