File becomes locked after initial run of Pelegant

Moderators: cyao, michael_borland

smolloy
Posts: 6
Joined: 20 Jul 2009, 08:13
Location: Royal Holloway, University of London

File becomes locked after initial run of Pelegant

Post by smolloy » 28 Jul 2009, 08:54

After a successful initial run of Pelegant over a transfer line lattice, I get the following error message on all subsequent runs.

Error:
unable to open file Results/rtml.cen for writing--file is locked (SDDS_InitializeOutput)
This suggests that the file is not being unlocked properly after the initial run. Can someone suggest a way to get around this, or point out what I am missing?

Many thanks.
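
One way to narrow this down is to check, between runs, whether some other process still holds a lock on the output file. The sketch below is a minimal probe and not part of elegant or SDDS. It assumes the SDDS library uses POSIX advisory locks (lockf/fcntl), which the thread does not confirm, and the function name is hypothetical.

```python
import errno
import fcntl
import os


def is_locked_by_another_process(path):
    """Return True if another process holds a conflicting advisory lock on path.

    Assumption: the writer uses POSIX advisory locks (lockf/fcntl).
    Locks held by the calling process itself are not detected.
    """
    # Open read/write without truncating; an exclusive lock needs write access.
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            # Non-blocking attempt to lock the whole file.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                # The lock is held elsewhere.
                return True
            raise  # some other failure, e.g. locking unsupported on this mount
        # The lock was granted, so nobody else held it; release it again.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)
```

Calling `is_locked_by_another_process("Results/rtml.cen")` after the first run has finished shows whether the lock is still held. POSIX advisory locks on a local disk are released when the holding process exits, so a True result there would point to a process from the earlier run that is still alive. On NFS the result is less reliable, for the reasons given in the MPICH note quoted in the next post.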

ywang25
Posts: 52
Joined: 10 Jun 2008, 19:48

Re: File becomes locked after initial run of Pelegant

Post by ywang25 » 28 Jul 2009, 22:24

The centroid output file should only be written by one process. Can you provide more information about the MPICH setup (version, daemon), Pelegant (pre-built version?), and operating system? It would be very helpful if you could provide a simplified lattice/template to reproduce the problem.

The MPICH distribution includes the following note about NFS file servers, although I don't think it will solve this problem.
1. There are problems with NFS (they are not in ROMIO).
NFS requires special care and use in order to get correct behavior when
multiple hosts access or try to lock the same file. Without this special
care, NFS silently fails (for example, a file lock system call will succeed,
but the actual file lock will not be correctly handled. This is considered
a feature, not a bug, of NFS. Go figure). If you need to use NFS, then you
should do the following:
Make sure that you are using version 3 of NFS
Make sure that attribute caching is turned off
This will have various negative consequences (automounts won't work, some
other operations will be slower). The up side is that file operations will
be correctly implemented. This is an instance of "do you want it fast or
correct; you can't have both". More details on this may be found in the
ROMIO README.


Yusong

smolloy
Posts: 6
Joined: 20 Jul 2009, 08:13
Location: Royal Holloway, University of London

Re: File becomes locked after initial run of Pelegant

Post by smolloy » 29 Jul 2009, 08:38

Apologies for not including that information in my initial post, but I was convinced that the error was in my lack of practice with Pelegant, and not with Pelegant or MPICH.

I am using Pelegant version 22.0.1 on Scientific Linux 5 (kernel = 2.6.18-128.1.10.el5, x86_64). I am unsure of the version number of MPICH; however, I do know that it is the most up-to-date version available from the Scientific Linux repositories. I have attached the .ele and .lte files that I am using.

I tried running the same commands on a local disc (instead of NFS), and I get the same errors.

I am beginning to suspect that the problem is with Pelegant, since it is not writing any of my WATCH files, or my bunch file, correctly. When I run the same .ele and .lte files with elegant (not Pelegant), these files are populated correctly. Perhaps there is a bug in Pelegant?
Attachments
rtml.lte
Lattice file
(56.29 KiB) Downloaded 179 times
rtml.ele
Elegant file
(1.84 KiB) Downloaded 175 times

michael_borland
Posts: 1933
Joined: 19 May 2008, 09:33
Location: Argonne National Laboratory

Re: File becomes locked after initial run of Pelegant

Post by michael_borland » 29 Jul 2009, 08:43

Can you supply the wake_bc1.sdds file and any other required files so I can run this?

Thanks--Michael

smolloy
Posts: 6
Joined: 20 Jul 2009, 08:13
Location: Royal Holloway, University of London

Re: File becomes locked after initial run of Pelegant

Post by smolloy » 29 Jul 2009, 08:50

Please accept my apologies for not including all necessary files in my previous post, and many thanks for helping me with this issue.
Attachments
wake_booster.sdds
(908.54 KiB) Downloaded 183 times
wake_bc2.sdds
(908.54 KiB) Downloaded 159 times
wake_bc1.sdds
(908.54 KiB) Downloaded 158 times

michael_borland
Posts: 1933
Joined: 19 May 2008, 09:33
Location: Argonne National Laboratory

Re: File becomes locked after initial run of Pelegant

Post by michael_borland » 29 Jul 2009, 11:14

Unfortunately, I'm unable to duplicate the problem. I ran Pelegant with your input files using a local drive on my desktop, as well as on a cluster using NFS. In both cases the run completed normally and created all the files.

Perhaps Yusong will have more ideas...

--Michael

smolloy
Posts: 6
Joined: 20 Jul 2009, 08:13
Location: Royal Holloway, University of London

Re: File becomes locked after initial run of Pelegant

Post by smolloy » 29 Jul 2009, 12:29

Thanks for trying Michael.

If it helps, I installed elegant/Pelegant from the elegant x86_64 rpm, and I installed the mvapich package from the Scientific Linux yum repository. Perhaps Pelegant is incompatible with this implementation of MPI?

soliday
Posts: 391
Joined: 28 May 2008, 09:15

Re: File becomes locked after initial run of Pelegant

Post by soliday » 29 Jul 2009, 12:46

The version of Pelegant in the RPM files is built against MPICH2-1.0.8p1 with the default configuration. It may work with newer versions of MPICH2, but I am not sure about that. I do know that it will definitely not work with MVAPICH. MVAPICH is basically MPICH2 implemented over InfiniBand hardware. For this configuration you will need to build Pelegant from source code.

smolloy
Posts: 6
Joined: 20 Jul 2009, 08:13
Location: Royal Holloway, University of London

Re: File becomes locked after initial run of Pelegant

Post by smolloy » 29 Jul 2009, 14:09

Thanks.

I uninstalled MVAPICH, and installed version 1.0.8p1-2.el5 of mpich, mpich-libs, and mpich-devel. All to no avail. I'm still getting the locked file errors, and the bunch file is empty.

I might try installing this all on my Ubuntu laptop to see if it works on that, but I'd really appreciate any other advice/hints/tips you have for researching this problem.

ywang25
Posts: 52
Joined: 10 Jun 2008, 19:48

Re: File becomes locked after initial run of Pelegant

Post by ywang25 » 29 Jul 2009, 21:14

I can also get output files with your input files on my laptop. It could be an MPICH2 setup/usage problem.

Please refer to the following document for Pelegant installation and usage:
http://www.aps.anl.gov/Accelerator_Syst ... gant.shtml

Make sure the mpd ring is set up properly if you use the pre-built version of Pelegant, and that the mpiexec being picked up is the desired one (check with "which mpiexec"). MVAPICH may be OK if the mpd daemon is used. mpich-libs and mpich-devel are not necessary for the pre-built Pelegant, as it is built statically. Newer versions of MPICH2 should not have a problem either.
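
When several MPI packages have been installed and removed, as earlier in this thread, more than one mpiexec can end up on the search path. The following is a small sketch of the "which mpiexec" check that lists every match rather than only the first. The helper name `find_on_path` is hypothetical, and the tool names in the loop are only examples of what one might look for.

```python
import os


def find_on_path(name, path=None):
    """List every executable called name on the search path, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # empty entries are skipped here for simplicity
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits


# The first hit for each tool is the one the shell would run.
for tool in ("mpiexec", "mpd", "mpdtrace", "Pelegant"):
    found = find_on_path(tool)
    print(tool, "->", found if found else "not found on PATH")
```

If mpiexec resolves to a different MPI installation than the one whose mpd daemon is running, that mismatch is worth fixing before looking further.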

If you still have the problem after setting up MPICH2, you can compile (with mpicc) and run an MPI "hello world" C program to test the installation/setup.

Also, it is better to have a large number of particles to use Pelegant efficiently.
