Errors with Pelegant on OSX

JoelFrederico
Posts: 60
Joined: 05 Aug 2010, 11:32
Location: SLAC National Accelerator Laboratory

Errors with Pelegant on OSX

Post by JoelFrederico » 16 Dec 2010, 23:47

So, I'm getting some very strange behavior from Pelegant. Here's what we've done:

1. Downloaded latest version of Pelegant, v23.1.2.
2. Compiled MPICH2 v1.3.1. Configure options: CFLAGS=-m32 -O2 LDFLAGS=-L/usr/X11R6/lib

Is this the version of MPICH2 that Pelegant was compiled for? This setup was working, but now I'm getting strange errors (which I cannot reproduce in serial, non-parallel elegant):

Pelegant(59880) malloc: *** mmap(size=2877292544) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
error: memory allocation failure--2877292544 bytes requested.
tmalloc() has allocated 235765591 bytes previously

Terminated by SIGABRT
Program trace-back:
APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)

So I tried to pare down the ele and lte files to find what is causing this. I was using a very short beamline, so I tried several things. Any one of these things eliminated the error:

1. Using a new lattice file (test.lte) that only has relevant elements.
2. Changing the value for final in the second vary_element from 1e-01 to 1.000000000000000e-01.
3. Deleting the matrix_output section.
4. Deleting any one of the run_setup entries output, centroid, sigma, final, and losses.
5. At one point (not reproducible at the moment), just changing the file names of the lattice and ele file fixed things.
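
For what it's worth, the two spellings in item 2 denote the same number, so the change in behavior is unlikely to come from the value itself. A minimal shell sketch to confirm that, assuming awk is available (the helper name same_number is made up for illustration):

```shell
# Hypothetical helper: succeed if two numeric strings parse to the same value.
same_number() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !((a + 0) == (b + 0)) }'
}

# Compare the two spellings of the vary_element "final" value.
if same_number 1e-01 1.000000000000000e-01; then
    echo "same value"
else
    echo "different values"
fi
```

If the values compare equal, what changed is only the text of the input file, which would be consistent with a parsing or memory problem, though on its own it proves nothing.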

At one point, while trying to see if there was a specific line in the lattice file that was causing problems, I discovered that removing lots of extraneous elements seemed to fix things. Shouldn't that not matter, if they're not in my beamline? I have no idea what's going on.

I tried to build Pelegant and ran into many problems. A few:

1. The build seems to depend on absolute paths and on the specific versions used. At one point it wants EPICS base R3.14.9.
2. It seems to use the built-in compilers provided by Apple (for instance /usr/bin/cc) instead of the gcc installed by Fink or MacPorts.
3. It seems to compile some things in 32-bit, some in 64-bit, and then has trouble when trying to link (of course).
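
One way to track down the 32-bit/64-bit mix in item 3 is to ask `file` about every object and library the build produced. This is only a sketch: it assumes the usual wording of `file -b` output for Mach-O files (for example "Mach-O 64-bit object x86_64" or "Mach-O object i386"), and the function names are made up for illustration.

```shell
# Classify one `file -b` description as 32, 64, universal, or unknown.
# Assumes the usual Mach-O wording printed by `file` on OSX.
classify_arch() {
    case "$1" in
        *universal*)        echo universal ;;
        *x86_64*|*64-bit*)  echo 64 ;;
        *i386*)             echo 32 ;;
        *)                  echo unknown ;;
    esac
}

# List every object file and dynamic library under a directory,
# prefixed with its word size.
report_arch() {
    find "$1" -type f \( -name '*.o' -o -name '*.dylib' \) |
    while IFS= read -r f; do
        printf '%s\t%s\n' "$(classify_arch "$(file -b "$f")")" "$f"
    done
}
```

Running something like `report_arch ~/oag/apps | sort` after a failed link should show which parts of the tree ended up with a different word size than the rest.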

It would be great to get this working. When it works, it is quite a bit faster than elegant. I'm specifically trying to make sure we are using enough bins and particles for the CSR simulation to be valid, at the moment. A postdoc is trying to get some lattice optimizations going and this would be helpful there too.

Any ideas? I'd love to get to the point of building Pelegant on OSX; I'm guessing that's the weak point in our solution right now.

Files: http://dl.dropbox.com/u/693663/Archive.zip

soliday
Posts: 391
Joined: 28 May 2008, 09:15

Re: Errors with Pelegant on OSX

Post by soliday » 17 Dec 2010, 17:04

The last Pelegant release for OSX was built with mpich2-1.2.1p1, which was the latest at the time. I have heard of other people trying to use newer versions of MPICH2 with our prebuilt binaries and not having success, so I assume they are not 100% compatible. So I would first try going back to this MPICH2 version. If that is not possible then I can look into creating a new release built against the newer version of MPICH2.

JoelFrederico
Posts: 60
Joined: 05 Aug 2010, 11:32
Location: SLAC National Accelerator Laboratory

Re: Errors with Pelegant on OSX

Post by JoelFrederico » 17 Dec 2010, 17:18

Thanks very much! I'll build that version of MPICH2 and let you know if it cures things.

Do you have any updated Pelegant build instructions that would work on OSX, by any chance? In particular, which versions of the EPICS configure/source/SDDS/OAG packages should be used, and where should they be expanded so that the Pelegant make goes smoothly? The instructions in the manual are from v17. It would be terrific to be able to compile from source so we can be sure we don't have any library/dependency versioning issues in the future.

JoelFrederico
Posts: 60
Joined: 05 Aug 2010, 11:32
Location: SLAC National Accelerator Laboratory

Re: Errors with Pelegant on OSX

Post by JoelFrederico » 17 Dec 2010, 17:59

I installed MPICH2-1.2.1p1, and it hasn't improved things:

error: memory allocation failure--3347054592 bytes requested.
tmalloc() has allocated 269319785 bytes previously
Pelegant(91041) malloc: *** mmap(size=3347054592) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug

Terminated by SIGABRT
Program trace-back:
rank 0 in job 1  dhcp-137-226.slac.stanford.edu_62053   caused collective abort of all ranks
  exit status of rank 0: return code 1 
~/test $ which mpiexec
/usr/local/mpich2-1.2.1p1/bin/mpiexec
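
To put the numbers in the two logs in perspective: the failed requests are roughly 2.7 and 3.1 GiB, each more than ten times what tmalloc() reports as already allocated. A 32-bit process can address at most 4 GiB in total, so a single request of that size is likely to fail in a 32-bit build no matter how much RAM is installed. Whether the Pelegant binary here is 32-bit is an assumption; the thread only shows MPICH2 being configured with -m32. A small sketch of the arithmetic (helper names are made up):

```shell
# Convert a byte count to GiB with two decimals.
to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Ratio of a requested size to what tmalloc() had allocated before.
ratio() {
    awk -v req="$1" -v prev="$2" 'BEGIN { printf "%.1f\n", req / prev }'
}

to_gib 2877292544            # request in the first log
to_gib 3347054592            # request in the second log
ratio 2877292544 235765591   # first log: request vs. previous total
ratio 3347054592 269319785   # second log: request vs. previous total
```

A request that large compared with everything allocated so far suggests the size itself may be wrong rather than the machine being short of memory, but that is a guess, not a diagnosis.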

ywang25
Posts: 52
Joined: 10 Jun 2008, 19:48

Re: Errors with Pelegant on OSX

Post by ywang25 » 20 Dec 2010, 10:23

You can check out this thread to find some useful information about elegant installation on OSX:
viewtopic.php?f=11&t=12

For Pelegant compilation on Linux, there is a post about it:
viewtopic.php?f=18&t=107

The only additional step to build Pelegant is under the SDDSlib directory:
make clean
make MPI=1

I could not reproduce the problem on a Linux machine with the input files you provided. I will try to build it on an OSX machine to see if there is a problem.

Yusong

JoelFrederico
Posts: 60
Joined: 05 Aug 2010, 11:32
Location: SLAC National Accelerator Laboratory

Re: Errors with Pelegant on OSX

Post by JoelFrederico » 03 Jan 2011, 19:47

Hi, Yusong,

I suppose we could run Pelegant in a Linux virtual machine since it works in Linux? Although that would really not be preferable. Have you had any luck testing on OSX?

I will try to run through the Pelegant instructions for Linux with OSX (using the appropriate modifications). Elegant binaries are working fine, so the OSX install instructions aren't necessary.

I think that's it.

Joel

ywang25
Posts: 52
Joined: 10 Jun 2008, 19:48

Re: Errors with Pelegant on OSX

Post by ywang25 » 04 Jan 2011, 10:21

To my understanding, Pelegant should work on a Linux virtual machine, although I don't know what speedup you can expect. Comparing the binary elegant for OSX with the Linux elegant running in the virtual machine will show whether there is any performance penalty.
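
A rough way to make that comparison is to time the same run on both systems. The sketch below is only an illustration: time_run is a made-up helper, and it measures wall-clock time in whole seconds with `date`, which is coarse but works on both OSX and Linux.

```shell
# Hypothetical helper: run a command, discard its output, and report
# the elapsed wall-clock seconds together with the exit status.
time_run() {
    tr_start=$(date +%s)
    "$@" >/dev/null 2>&1
    tr_status=$?
    tr_end=$(date +%s)
    echo "$((tr_end - tr_start)) s (exit status $tr_status): $*"
}
```

For example, `time_run elegant run.ele` on OSX and again inside the virtual machine, then `time_run mpiexec -np 4 Pelegant run.ele` in the virtual machine. Here run.ele and the process count of 4 are placeholders, not values from this thread.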

I tested the binary version of Pelegant on a darwin-x86 machine with your input files and experienced a similar problem. We are going to build a new version to address the issue.

When I tried to build Pelegant on OSX myself, I found the following files useful if you want to control the compilers (32- or 64-bit):
~/epics/base/configure/os/CONFIG.darwinCommon.darwinCommon
~/oag/apps/configure/RELEASE.darwin-x86
I still need help with the OSX compilers before I can build a binary for OSX (our lab was closed during the holidays).
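
To see where a word size is being forced in those two files, a simple search for the usual compiler flags may help. This is a sketch under the assumption that the word size is set through -m32, -m64, or Apple's -arch flag; the files may control it in some other way, and find_arch_flags is a made-up name.

```shell
# Hypothetical helper: print numbered lines that set -m32, -m64,
# or "-arch <name>" in the given configuration files.
find_arch_flags() {
    grep -n -E -e '(^|[[:space:]])(-m(32|64)|-arch[[:space:]]+[A-Za-z0-9_]+)' -- "$@"
}
```

For example: `find_arch_flags ~/epics/base/configure/os/CONFIG.darwinCommon.darwinCommon ~/oag/apps/configure/RELEASE.darwin-x86`. No output means none of these flags appear in the files searched.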
