Parallel elegant with non-parallel SDDS

frank_stulle
Posts: 13
Joined: 04 Jun 2009, 04:58

Parallel elegant with non-parallel SDDS

Post by frank_stulle » 19 Feb 2010, 08:42

I recently upgraded my Pelegant/SDDS installation to the newest versions. The computers run Scientific Linux 4 and OpenMPI 1.2.6. But now I seem to run into problems with the parallel I/O done by SDDS. The usual result is that the *.bun and *.out files, i.e. the input and output particle distributions, come out corrupted. Now I wonder if it would be possible to compile Pelegant and SDDS in a way that SDDS uses the old non-parallelized I/O.

Best regards
Frank

ywang25
Posts: 52
Joined: 10 Jun 2008, 19:48

Re: Parallel elegant with non-parallel SDDS

Post by ywang25 » 19 Feb 2010, 19:17

Compiling Pelegant with non-parallelized SDDS I/O is not supported. First, make sure you compile SDDS/SDDSlib with "make clean; make MPI=1" (the desired version of mpicc should be in your $PATH); then do "make clean; make Pelegant" in the elegant directory.
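For reference, the full build sequence would look roughly like the following; the directory names are only illustrative and depend on how your source tree is laid out:

    which mpicc        # confirm the desired MPI compiler wrapper is found first in $PATH

    cd SDDS            # SDDS/SDDSlib source directory
    make clean
    make MPI=1         # rebuild SDDS with parallel (MPI) I/O enabled

    cd ../elegant      # elegant source directory
    make clean
    make Pelegant      # build the parallel executable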

If the problem still exists, you might need to check the file-server setup to find out whether parallel I/O is supported. A parallel file system is recommended, although NFS should be fine.

You can check with the "mount" command whether attribute caching is turned off (the "noac" option; see the example after the README excerpt below), and report the rest of your file-server setup.

"There are problems with NFS (they are not in ROMIO).
NFS requires special care and use in order to get correct behavior when
multiple hosts access or try to lock the same file. Without this special
care, NFS silently fails (for example, a file lock system call will succeed,
but the actual file lock will not be correctly handled. This is considered
a feature, not a bug, of NFS. Go figure). If you need to use NFS, then you
should do the following:
Make sure that you are using version 3 of NFS
Make sure that attribute caching is turned off
This will have various negative consequences (automounts won't work, some
other operations will be slower). The up side is that file operations will
be correctly implemented. This is an instance of "do you want it fast or
correct; you can't have both". More details on this may be found in the
ROMIO README."

Yusong

frank_stulle
Posts: 13
Joined: 04 Jun 2009, 04:58

Re: Parallel elegant with non-parallel SDDS

Post by frank_stulle » 22 Feb 2010, 03:23

I think I compiled Pelegant and SDDS correctly. The problem could well be with the file server, since the files are located in AFS (not NFS). I have no idea how this behaves with parallel I/O. Unfortunately, I do not have any influence on the file server or file system used. Do you have any experience with AFS and parallel I/O?

Best regards
Frank

ywang25
Posts: 52
Joined: 10 Jun 2008, 19:48

Re: Parallel elegant with non-parallel SDDS

Post by ywang25 » 22 Feb 2010, 21:31

I am afraid AFS is not supported.
"Supported file systems are IBM PIOFS, Intel PFS, HP/Convex HFS, SGI XFS, NEC
SFS, PVFS, NFS, and any Unix file system (UFS)."

The reason is that, on AFS, the last process to close the file overwrites the updates made by all the other processes.

Pelegant is designed for high-performance computing with large numbers of particles. The serial I/O in the previous versions was the major bottleneck because of the communication and memory usage on the master. Pelegant works properly on both Lustre and NFS file systems on our cluster.

If you have a significant workload that requires Pelegant, you can request that one of the file systems from the list above be installed on your cluster. Or you can apply for an account at a supercomputing center, such as NERSC or the Argonne Leadership Computing Facility at Argonne National Laboratory. We have tested Pelegant with parallel I/O on the major supercomputers at both centers.

Yusong

frank_stulle
Posts: 13
Joined: 04 Jun 2009, 04:58

Re: Parallel elegant with non-parallel SDDS

Post by frank_stulle » 23 Feb 2010, 03:11

That's really a pity. It seems for the moment I have to stick to older Pelegant/SDDS versions without parallel I/O. At least it is a good argument for me to make IT improve our "cluster" (actually it is just a bunch of computers). Here at CERN this is always a fight.

Best regards
Frank
