Elegant using Intel vs AMD CPUs

Moderators: cyao, michael_borland

Björklund
Posts: 84
Joined: 19 May 2016, 07:14

Elegant using Intel vs AMD CPUs

Post by Björklund » 29 Jan 2020, 03:36

Hi!

Not directly related to anything on a user level in the code, more something I've been wondering for a while, out of curiosity:

Has anyone benchmarked Elegant (well, actually Pelegant) on any recent Intel and AMD CPUs? I don't know a whole ton about instruction sets, but I do know that different instruction sets perform differently on CPUs from the two vendors. I don't know what kind of instructions Elegant uses, and there aren't many benchmarks of scientific codes out there anyway. The raw power of something like an AMD Threadripper 3970X (32c/64t) for a high-end workstation is pretty appealing to me, especially since it can be paired with very fast RAM and PCIe 4.0 storage. Its performance in highly parallelized ray-tracing applications (for rendering images; most benchmarks I've seen have been for image/video production/processing or gaming) seems to blow Intel's high-end desktop (HEDT) parts out of the water, at a lower price per core.

So, does anyone have any input on this? I suppose that data for the AMD EPYC server CPUs would be useful as well, as they are quite similar to Threadripper. I'm mostly interested in any data on the Zen 2 architecture, i.e. the latest family (3000-series) of CPUs. Intel has had the same architecture (Skylake) and basic process node (14 nm) for years now and has mostly bumped the clock speeds a little each generation, so older data could be extrapolated more easily there. My simulations are small enough that I run them on a single machine (albeit currently with 2 physical CPUs), so data pertaining to clusters would not be so interesting for me right now.

Any input or speculations or so are welcome :)

Best regards
Jonas

nkuklev
Posts: 8
Joined: 13 Aug 2019, 09:08
Location: University of Chicago

Re: Elegant using Intel vs AMD CPUs

Post by nkuklev » 06 Feb 2020, 14:01

I am also very interested in seeing Zen2 scaling in Pelegant, especially on the upcoming 3990X. If Pelegant is memory limited, things could go weird due to latency/asymmetry. There are also NUMA settings, compiler optimization (-march=znver2), and libraries (MKL hacking to enable AVX) to consider.

I am building a personal Ryzen 5 3600 system, and can benchmark single-core performance when it is ready to give you a rough idea, but the better way is probably to pay a few bucks for a bare-metal compute instance. I only found one place offering access to the EPYC Rome platform (link), but their 24-core 7402P should be equivalent to the 3960X (except for frequencies).

- Nikita

Björklund
Posts: 84
Joined: 19 May 2016, 07:14

Re: Elegant using Intel vs AMD CPUs

Post by Björklund » 11 Feb 2020, 02:11

Hi Nikita!

It would be really cool if you could try it on the R5 3600, just check some quick scaling with number of threads! It's a good idea to pay for a compute instance, but I'm not very familiar with that kind of stuff and unfortunately don't really have the time to get into it right now. I will definitely keep that in mind, though!
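For anyone who does run such a quick scaling check, here is a minimal Python sketch of how the timings could be turned into speedup and parallel efficiency numbers. It assumes you have already measured the wall-clock time of the same Pelegant run at several core counts yourself; the function names (scaling_table, format_table) are made up for this illustration and are not part of Elegant or any library.

```python
def scaling_table(timings):
    """Turn {core_count: wall_clock_seconds} into scaling figures.

    Speedup and efficiency are computed relative to the run with the
    smallest core count, so a single-core run is not strictly required.
    Returns a list of (cores, seconds, speedup, efficiency) tuples.
    """
    if not timings:
        raise ValueError("need at least one timing")
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        seconds = timings[cores]
        if seconds <= 0:
            raise ValueError("wall-clock times must be positive")
        speedup = base_time / seconds
        # Ideal speedup would be cores / base_cores; efficiency is the
        # fraction of that ideal which was actually reached.
        efficiency = speedup * base_cores / cores
        rows.append((cores, seconds, speedup, efficiency))
    return rows


def format_table(rows):
    """Render the rows from scaling_table as a plain-text table."""
    lines = ["cores  seconds  speedup  efficiency"]
    for cores, seconds, speedup, efficiency in rows:
        lines.append(f"{cores:5d}  {seconds:7.1f}  {speedup:7.2f}  {efficiency:10.2f}")
    return "\n".join(lines)
```

Calling print(format_table(scaling_table(my_timings))) with your own measured dictionary gives a small table. An efficiency that drops off well before all cores are in use would hint at a memory or communication bottleneck rather than a lack of raw compute.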

Zen2 tends to scale very well with higher memory frequency (up to 3600 MHz, after which the memory and Infinity Fabric clocks can become decoupled) and tighter timings, at least in the application benchmarks that I have seen. Pelegant doesn't use a whole lot of memory (at least not for the stuff I tend to run), but I suspect that it would benefit from fast RAM too. There are very good memory tuning calculators out there, e.g. https://www.techpowerup.com/download/ry ... alculator/. The Threadripper platform is limited to 256 GB of RAM, but I don't think that should be an issue for Pelegant. The rather low maximum memory is one of the criticisms I have heard against the 3990X in reviews, and it only has 4 memory channels vs 8 for EPYC, but it can run the memory at higher clocks as far as I know.
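To put the channel-count vs memory-clock trade-off in rough numbers, here is a small Python sketch of the theoretical peak bandwidth. It assumes standard 64-bit DDR4 channels and that the advertised DDR4 speed is the transfer rate in MT/s; the function name is made up for this illustration, and real sustained bandwidth will be lower than this upper bound.

```python
def peak_bandwidth_gb_s(channels, mega_transfers_per_s, bus_width_bits=64):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    channels             -- number of populated memory channels
    mega_transfers_per_s -- DDR rating, e.g. 3200 for DDR4-3200 (assumed MT/s)
    bus_width_bits       -- data width per channel (64 for standard DDR4)
    """
    if channels <= 0 or mega_transfers_per_s <= 0 or bus_width_bits <= 0:
        raise ValueError("all arguments must be positive")
    bytes_per_transfer = bus_width_bits / 8
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


def channels_vs_clock(fast_channels, fast_rate, wide_channels, wide_rate):
    """Ratio of peak bandwidth: few fast channels vs many slower ones."""
    return (peak_bandwidth_gb_s(fast_channels, fast_rate)
            / peak_bandwidth_gb_s(wide_channels, wide_rate))
```

For example, channels_vs_clock(4, 3600, 8, 3200) compares a 4-channel setup at DDR4-3600 with an 8-channel setup at DDR4-3200 (both speeds chosen only for illustration). On paper, the higher clock does not come close to making up for half the channels, so whether it matters depends on how bandwidth-bound Pelegant actually is.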

It does sound like you know more about the computational side than I do, looking forward to seeing your results with the R5 3600 :)

//Jonas
