Terminated by SIGSEGVProgram trace-back, when generating a bunch of electrons

Post by jcytsai » 23 Oct 2020, 08:35

Hi,

I tried to generate a bunch of electrons but failed; I get the message below:

Terminated by SIGSEGVProgram trace-back

Could anyone give a hint on how this might be resolved? I am not sure whether it is caused by a memory setting or some other issue. Please find the input files attached.

Thanks!
Cheng-Ying
PS: I am using elegant 2020.4.0.
Attachments:
test_gen.ele (990 bytes)
test_gen.lte (38 bytes)

Post by soliday » 23 Oct 2020, 09:20

You are running into a memory problem related to the n_particles_per_bunch value you are using. If you reduce it by half, it will work. Some of the code is limited to MAX_INT32 / 4 (2147483647 / 4).
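
For reference, the limit quoted above works out to 536870911 with integer division. The following is a minimal shell sketch of that arithmetic. The helper name exceeds_limit is hypothetical, and the thread does not say which quantity inside elegant is compared against the limit, so the check is illustrative only.

```shell
# Limit quoted in the thread: MAX_INT32 / 4, using integer division.
MAX_INT32=2147483647
LIMIT=$(( MAX_INT32 / 4 ))
echo "$LIMIT"    # prints 536870911

# exceeds_limit N: succeeds (exit status 0) when N is above the limit.
# Assumption: N stands for whatever count the library actually checks.
exceeds_limit() {
    [ "$1" -gt "$LIMIT" ]
}
```

For example, `exceeds_limit 600000000 && echo "too large"` prints "too large", while the same call with 100000000 prints nothing.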

Post by michael_borland » 23 Oct 2020, 09:50

Cheng-Ying,

This is a problem in a library called by elegant's particle generation routine. I will fix it in the next release.

Meanwhile, as Bob suggests, you can reduce the number of particles. You can also use the parallel version.

--Michael

Post by jcytsai » 23 Oct 2020, 20:03

Thank you Michael and Bob!

I was intending to increase the number of particles 8-) Now I understand the situation better. It runs without an error message using Pelegant, but I cannot post-process or read the generated output file (*.bun). The file is large, so I believe the particle information has been written. When I plot it using sddsplot, it shows the following warning/error message:

warning: problem reading page--unrecognized data mode (SDDS_ReadPageSparse)
error: no datasets to plot

The above case happens on CentOS 7.6. For reference, the same input files work fine on macOS.
Thanks!
Cheng-Ying

Post by soliday » 26 Oct 2020, 09:46

Do you have the latest version of SDDS ToolKit? This supports larger file sizes for sddsplot and the other SDDS tools. If you still have this error with the latest version of SDDS ToolKit, then perhaps you can send me the input files you are using so I can try to reproduce the problem.
soliday@anl.gov

Post by jcytsai » 26 Oct 2020, 23:34

Dear Dr. Soliday,

I installed both elegant and the SDDS ToolKit recently (within the last two weeks) using the Build-AOP-RPMs script, so I guess they should be the latest versions. I tried again today using Pelegant (with the input files attached above) and it generates a *.bun file of about 5.2 GB.

It would be great if you could run the above input files and try to reproduce this issue. They simply generate a bunch of electrons. When I run sddsquery on the file, nothing is returned. When I run sddsplot on it, it shows

Error:
Unable to read page--unrecognized data mode (SDDS_ReadPageSparse)
error: no datasets to plot

Please let me know if I could provide further information. Thanks again for the help!
Cheng-Ying
PS: I am using CentOS Linux release 7.6.1810. This issue does not appear with elegant/SDDS ToolKit on macOS.

Post by soliday » 27 Oct 2020, 10:13

I ran your files with Pelegant and got the 5.2 GB output file. However, I am able to run sddscheck, sddsquery and sddsplot without seeing an error message. Can you tell me what you see when you run:

uname -a

and

rpm -qa | fgrep SDDSToolKit

I want to verify that you are running a 64-bit version of Linux. You should see x86_64 in the output. I would also like to verify that the SDDSToolKit that you built and installed is listed as version 5.0.

Post by jcytsai » 27 Oct 2020, 19:20

Dear Dr. Soliday,

Following your instructions, uname -a gives

Linux centos 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

and rpm -qa | fgrep SDDSToolKit gives

SDDSToolKit-5.0-1.x86_64
SDDSToolKit-devel-5.0-1.x86_64

This looks as expected.
Some observations:
0) the file size shows as 4.9 GB in the terminal and 5.2 GB in the graphical panel (is this difference expected?);
1) when I run sddscheck on the *.bun file, it reports badHeader;
2) when I run sddsquery on it, it returns nothing;
3) when I run sddsplot on it, it gives the error "no datasets to plot".
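
A possible explanation for the size difference in item 0, offered as an assumption rather than something confirmed in this thread: terminal tools such as ls -lh count in units of 1024^3 bytes, while many graphical file managers count in units of 10^9 bytes. The sketch below converts between the two; the function name to_decimal_gb is hypothetical.

```shell
# to_decimal_gb GIB: convert a size given in units of 1024^3 bytes
# to decimal gigabytes (10^9 bytes), printed with two decimals.
to_decimal_gb() {
    awk -v gib="$1" 'BEGIN { printf "%.2f\n", gib * 1024 * 1024 * 1024 / 1e9 }'
}

to_decimal_gb 4.9    # prints 5.26
```

Under that assumption, 4.9 in the terminal and roughly 5.2 to 5.3 in the graphical panel would describe the same file, so the discrepancy by itself need not indicate corruption.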

Thanks!
Cheng-Ying

Post by soliday » 27 Oct 2020, 19:34

If you run it the same way but with fewer particles, is the output file OK? If not, then I would suspect an issue with the file system. MPI programs writing to an NFS file system can produce corrupted files if the NFS file system is not mounted with the "noac" mount option.
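
One way to see whether an NFS mount carries the noac option is to inspect the mount table. This is a minimal sketch, assuming a Linux system where /proc/mounts lists NFS file systems with type nfs or nfs4 and shows noac among the options when it is active. The function name check_noac is hypothetical.

```shell
# check_noac: read mount-table lines in /proc/mounts format from stdin
#   device mountpoint fstype options dump pass
# and print, for each NFS mount, its mount point and whether "noac" is set.
check_noac() {
    awk '$3 == "nfs" || $3 == "nfs4" {
        n = split($4, opts, ",")
        found = "no"
        for (i = 1; i <= n; i++) {
            if (opts[i] == "noac") found = "yes"
        }
        print $2, "noac=" found
    }'
}

# Typical use on Linux:
#   check_noac < /proc/mounts
```

If noac is reported as missing, it is normally added to the mount options for that file system (the options field in /etc/fstab, or the -o argument of mount), which requires root access. Whether that resolves the corrupted output in this case is not confirmed in the thread.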

If the smaller output files are fine, then I would suspect the MPI you are using isn't the same version that Pelegant was built with. The RPMs that I released use:
/usr/lib64/mpich/
/usr/lib64/mpich-3.2/
/usr/lib64/openmpi/

If you are using a different MPI version, then you will have to build Pelegant yourself. You can do that with the Build-AOP-RPMs script on the software page.

Post by jcytsai » 28 Oct 2020, 08:26

Dear Dr. Soliday,

Your first concern is indeed my case: I tried with fewer particles and the same error occurs. I use MPICH2-1, installed using the Build-AOP-RPMs script. Assuming the problem is due to the mount options of the NFS file system, is there any workaround? I searched for how to mount with the noac option but do not know how to proceed. After running cat /etc/mtab, I did not see the noac option.

Thanks,
Cheng-Ying
