Pelegant performance on Win7_x64
Posted: 03 Apr 2019, 10:42
I am a brand-new user of elegant and need it to study the microbunching instability of ultra-bright beams in X-ray FELs. I have .ele and .lte input files to start with.
I installed the March 7th, 2019 version of Elegant_x64.msi and then updated to the April 1st, 2019 version, on Windows 7 with MS-MPI 10, following the Pelegant instructions.
I am disappointed by the performance with 1.25M particles:
Old Pelegant with MS-MPI 7 on an E5-2687 (8 cores) at 3.1 GHz with -n 14 completes the simulation in 6 minutes.
New Pelegant with MS-MPI 10 on a dual-socket E5-2687v4 (2 x 12 cores) at 3.0 GHz with -n 14 completes the simulation in 11 minutes.
Pushing to -n 48 gives only a modest improvement, down to 9 minutes.
Are there known MS-MPI 10 issues that could explain this performance? I have read that multiple network interfaces (NICs) can slow down MS-MPI.
Is there a way to debug or profile Pelegant to see where it spends its time?
Thank you,
Petr
Here is how the run time depends on the -n option:
Elegant simulation on 1 core
Tracking step completed ET: 00:23:29 CP: 1408.46 BIO:0 DIO:0 PF:0 MEM:0
Pelegant simulation on 2 cores
Tracking step completed ET: 00:32:53 CP: 1972.50 BIO:0 DIO:0 PF:0 MEM:0
Pelegant simulation on 4 cores
Tracking step completed ET: 00:17:03 CP: 1023.05 BIO:0 DIO:0 PF:0 MEM:0
Pelegant simulation on 8 cores
Tracking step completed ET: 00:10:40 CP: 639.25 BIO:0 DIO:0 PF:0 MEM:0
Pelegant simulation on 12 cores
Tracking step completed ET: 00:11:04 CP: 663.66 BIO:0 DIO:0 PF:0 MEM:0
Pelegant simulation on 14 cores
Tracking step completed ET: 00:10:54 CP: 654.28 BIO:0 DIO:0 PF:0 MEM:0
Pelegant simulation on 16 cores
Tracking step completed ET: 00:11:56 CP: 716.02 BIO:0 DIO:0 PF:0 MEM:0
Pelegant simulation on 20 cores
Tracking step completed ET: 00:12:17 CP: 736.92 BIO:0 DIO:0 PF:0 MEM:0
Pelegant simulation on 24 cores
Tracking step completed ET: 00:12:25 CP: 745.36 BIO:0 DIO:0 PF:0 MEM:0
Pelegant simulation on 32 cores
Tracking step completed ET: 00:11:05 CP: 665.06 BIO:0 DIO:0 PF:0 MEM:0
Pelegant simulation on 48 cores
Tracking step completed ET: 00:09:09 CP: 548.50 BIO:0 DIO:0 PF:0 MEM:0
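To make the scaling easier to read, here is a small Python sketch that converts the ET values above into speedup and parallel efficiency relative to the 1-core run. The ET strings are copied from the list above; the function names and the table layout are my own, purely for illustration, and are not part of elegant or Pelegant.

```python
# Elapsed times (ET) copied from the "Tracking step completed" lines above,
# keyed by the value passed to -n.
RUNS = {
    1: "00:23:29", 2: "00:32:53", 4: "00:17:03", 8: "00:10:40",
    12: "00:11:04", 14: "00:10:54", 16: "00:11:56", 20: "00:12:17",
    24: "00:12:25", 32: "00:11:05", 48: "00:09:09",
}


def et_to_seconds(et):
    """Convert an HH:MM:SS elapsed-time string to seconds."""
    hours, minutes, seconds = (int(part) for part in et.split(":"))
    return 3600 * hours + 60 * minutes + seconds


def scaling_table(runs):
    """Return (n, seconds, speedup, efficiency) rows relative to the n=1 run."""
    base = et_to_seconds(runs[1])
    rows = []
    for n in sorted(runs):
        t = et_to_seconds(runs[n])
        speedup = base / t
        rows.append((n, t, speedup, speedup / n))
    return rows


if __name__ == "__main__":
    print(f"{'n':>3} {'ET [s]':>7} {'speedup':>8} {'efficiency':>10}")
    for n, t, speedup, eff in scaling_table(RUNS):
        print(f"{n:>3} {t:>7} {speedup:>8.2f} {eff:>10.2f}")
```

Read this way, the 2-process run is slower than the serial run, and the speedup stays below 3 for every -n value tried, including -n 48.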