SDDS file locking

Moderators: cyao, michael_borland

jrowland
Posts: 19
Joined: 12 Mar 2009, 04:30

SDDS file locking

Post by jrowland » 10 Sep 2009, 07:02

Hi

We have just upgraded to the Lustre file system. File locking with 'lockf' isn't working at the moment, so the SDDS tools won't produce any output:

[jr76@cs04r-sc-serv-04 jr76]$ sddsconvert -binary m2_c1g5c_1500.sdds
warning: existing file m2_c1g5c_1500.sdds will be replaced (sddsconvert)
Error for sddsconvert:
Unable to access file tmp6835.1--file is locked (SDDS_InitializeOutput)

Any suggestions? My options are

turn off locking in SDDS build
enable local locking in Lustre
enable global locking in Lustre (possibly not supported yet?)

Thanks

James

soliday
Posts: 390
Joined: 28 May 2008, 09:15

Re: SDDS file locking

Post by soliday » 10 Sep 2009, 09:43

I don't have experience with Lustre either as a user or an administrator, but we do have a Lustre file system on order, so that will change in the near future.

If you are building from the source code then I would recommend changing epics/extensions/src/SDDS/SDDSlib/SDDS_util.c. After the last #include statement you need to add:

#undef F_TEST

This will fool the code into thinking that the locking routines do not exist and it will not attempt to lock or check for locks. You will then need to go to epics/extensions/src/SDDS and type "make clean all" to propagate the change to all the programs.


Re: SDDS file locking

Post by soliday » 10 Sep 2009, 09:55

Reading through some forum posts about Lustre and flock, it looks like mounting the filesystem on the clients with -o flock should work.
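For reference, a client-side mount with flock enabled would look something like the following. This is a sketch only: the MGS node name, filesystem name, and mount point are hypothetical placeholders, not values from this thread.

```shell
# Remount a Lustre client with flock support enabled.
# mgs01@tcp0, /scratch, and /mnt/lustre are example values only.
umount /mnt/lustre
mount -t lustre -o flock mgs01@tcp0:/scratch /mnt/lustre
```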


Re: SDDS file locking

Post by jrowland » 10 Sep 2009, 14:50

Thanks, I will try both and report back.


Re: SDDS file locking

Post by jrowland » 11 Sep 2009, 08:53

#undef F_TEST

Works fine for SDDS tools and serial elegant.

I still have a problem with parallel elegant using MPI-IO, as the MPI-IO implementation in openmpi 1.2.7 uses locking.
I suspect the correct fix is to use the Lustre-aware MPI-IO backend (ADIO driver), which is available in openmpi 1.3.3.

Until I get locking enabled, is it possible to build parallel elegant 22 without parallel I/O?

Thanks

James


Re: SDDS file locking

Post by soliday » 11 Sep 2009, 09:55

I forwarded your question on to someone who should know the answer. But it looks to me like you need to edit elegant/Makefile.OAG and change two things:

Comment out "USER_MPI_FLAGS += -DSDDS_MPI_IO=1"

Link to SDDS1 instead of SDDSmpi by changing USR_LIBS
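A sketch of those two Makefile.OAG edits. Only the changed fragments are shown; the exact surrounding variable definitions and library list in your copy of the file may differ:

```makefile
# 1) Comment out the parallel SDDS I/O flag:
#USER_MPI_FLAGS += -DSDDS_MPI_IO=1

# 2) In the USR_LIBS definition, replace SDDSmpi with SDDS1,
#    leaving the other libraries in the list unchanged.
```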


Re: SDDS file locking

Post by jrowland » 11 Sep 2009, 10:41

I tried that and it builds, but it stalls during tracking (no output, though the processor is working). I hadn't gotten as far as testing Pelegant-22 before the file system upgrade, so I'm not sure the problem is related. For now I have a working Pelegant-21 with locking disabled.

ywang25
Posts: 52
Joined: 10 Jun 2008, 19:48

Re: SDDS file locking

Post by ywang25 » 12 Sep 2009, 16:20

The most recent version of Pelegant can ONLY be used with parallel SDDS.
Some extra steps need to be done after compiling SDDS from source code:
1) go to the SDDSlib directory;
2) make clean;
3) make MPI=1;
These will generate the correct version of the SDDSmpi library to build Pelegant.
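The steps above as shell commands, assuming the standard epics/extensions source tree layout used earlier in this thread:

```shell
# After compiling SDDS from source:
cd epics/extensions/src/SDDS/SDDSlib
make clean
make MPI=1   # produces the SDDSmpi library that Pelegant links against
```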

I don't have a system to test with file locking disabled. But if the serial version of elegant works when you set
#undef F_TEST
the parallel version should work, provided you compiled SDDSmpi correctly.

I did have access to the Franklin supercomputer at NERSC, which has the Lustre file system installed. I didn't experience any problems with parallel I/O under MPICH.
Users have also reported successfully building Pelegant with parallel I/O using openmpi 1.2.7.

Please refer to the post:

viewtopic.php?f=18&t=107

Yusong


Re: SDDS file locking

Post by jrowland » 17 Sep 2009, 08:35

Hi

Elegant-22 is working fine now with Lustre and Infiniband.

I've upgraded to openmpi-1.3.3 using the Lustre MPI-IO driver; for reference, the configure command is:

./configure --prefix=/dls_sw/apps/physics/openmpi-1.3.3 --with-sge --with-io-romio-flags="--with-file-system=lustre+nfs+ufs"

Thanks

James
