The NeXus Dictionary API

Mark Könnecke
Labor für Neutronenstreuung
Paul Scherrer Institut
CH-5232 Villigen PSI
Switzerland
Mark.Koennecke@psi.ch

Abstract

NeXus is a proposed portable data exchange format for the neutron and X-ray scattering communities, based on the Hierarchical Data Format (HDF) from the National Center for Supercomputing Applications (NCSA), USA. NeXus is accompanied by a set of application programmer's interface (API) functions. This base-level API hides many of the hideous details of the HDF interface from the NeXus programmer. The present document introduces a higher-level application programmer's interface which sits on top of the NeXus API. This API, the NXDICT-API, reads all file structure data from a dictionary data file and creates the structure automatically from that information. The NXDICT user only needs to specify the data to write or read.

Contents

1  Introduction
2  The NeXus Data Definition Language
3  The NXDICT-API
    3.1  Dictionary Maintenance Functions
    3.2  Data Handling Functions
4  NeXus Utility Functions
5  Conclusion
6  References

1.  Introduction

NeXus is a proposed portable data exchange format for the neutron and X-ray scattering communities. It is fully described elsewhere [1]. NeXus has been realised on top of the Hierarchical Data Format (HDF) as defined and specified by the National Center for Supercomputing Applications (NCSA), USA [2]. Most neutron or X-ray scattering data items can be described as multidimensional data sets of a given number type. Such an item is called a scientific data set (SDS) in HDF. An SDS can have auxiliary data (for instance units) associated with it. Such auxiliary data are called attributes. Data in an HDF file can be structured much like the directory hierarchy of a file system. The HDF equivalent of a directory is called a vGroup.

HDF comes with a library of access functions. On top of the HDF library, an application programmer's interface (API) for NeXus was defined which hides many of the low-level details and idiosyncrasies of the HDF interface from the NeXus programmer. However, writing NeXus files stays hideous even with this interface because of the amount of repetitive code required to implement the NeXus structure. Now, repetitive tasks are one area a computer is good at. So why not have the computer create the structure of the NeXus file? In order to do this the following components are needed:

This approach has the additional benefit that a change in the file structure only requires editing the dictionary data file, with no changes to the source code that writes or reads the data.

2.  The NeXus Data Definition Language

The purpose of the NeXus Data Definition Language (NXDDL) is to define the structure and data items in a NeXus file in a form which can be understood by a human programmer and which can be parsed by the computer in order to create the structure. For this a dictionary-based approach is used. The dictionary contains pairs of short aliases for data items and definition strings which hold the structure information. It is initialized from a data file, the NXDDL file. Such a dictionary can be used in the following way: given an appropriate API function, an NXDICT programmer needs to specify only the alias and the data to write, and everything else is taken care of by the API: vGroup creation, opening, SDS definition etc. Another use may involve the creation of definition strings completely or partly at run time, which can then be passed to an API function in order to create the structures they define. Both uses apply to reading as well as writing.

A NXDDL dictionary is preferably initialized from a file. Such a NXDDL file has to follow these general structure guidelines:

The next thing to define is the content of the definition string. A definition string will have the general form:

PATH/TerminalSymbol

This means a definition string will consist of a path specifier which describes the position of a data item in the vGroup hierarchy and a terminal symbol which describes the nature of the data item.

The path through the vGroup hierarchy to a data item is described in a manner analogous to a Unix directory hierarchy. However, NeXus requires two pieces of data in order to fully qualify a vGroup: its name and its class. Consequently, both name and class name are given for each vGroup, separated by a comma. A valid path string then looks like:

 
     /scan1,NXentry/DMC,NXinstrument/big_detector,NXdetector/TerminalSymbol
This translates into: TerminalSymbol in vGroup big_detector, class NXdetector, which resides in vGroup DMC of class NXinstrument, which in turn is situated in the vGroup scan1 of class NXentry.
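
The decomposition of a definition string described above can be sketched in a few lines of C. This is only an illustration and not the NXDICT implementation: the names split_definition, PathLevel, MAX_LEVELS and MAX_NAME are hypothetical, and the sketch assumes that the terminal symbol itself contains no '/' character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEVELS 16   /* hypothetical limit on the path depth */
#define MAX_NAME   64   /* hypothetical limit on name and class length */

/* One vGroup level of a path: its name and its NeXus class. */
typedef struct {
    char name[MAX_NAME];
    char nxclass[MAX_NAME];
} PathLevel;

/*
 * Split a definition string into "name,class" levels and a terminal symbol.
 * Returns the number of levels found, or -1 if a level is malformed.
 * On success *terminal points into def at the start of the terminal symbol.
 */
int split_definition(const char *def, PathLevel levels[], int max_levels,
                     const char **terminal)
{
    const char *p = def;
    int n = 0;

    if (*p == '/')
        p++;                          /* skip the leading separator */

    for (;;) {
        const char *slash = strchr(p, '/');
        const char *comma = strchr(p, ',');
        size_t name_len, class_len;

        /* No further "name,class/" segment: the rest is the terminal symbol. */
        if (slash == NULL || comma == NULL || comma > slash) {
            *terminal = p;
            return n;
        }
        if (n >= max_levels)
            return -1;

        name_len  = (size_t)(comma - p);
        class_len = (size_t)(slash - comma - 1);
        if (name_len == 0 || class_len == 0 ||
            name_len >= MAX_NAME || class_len >= MAX_NAME)
            return -1;

        memcpy(levels[n].name, p, name_len);
        levels[n].name[name_len] = '\0';
        memcpy(levels[n].nxclass, comma + 1, class_len);
        levels[n].nxclass[class_len] = '\0';
        n++;
        p = slash + 1;                /* continue behind this level */
    }
}
```

Applied to the path string above, such a routine would yield the three levels scan1/NXentry, DMC/NXinstrument and big_detector/NXdetector, and leave the terminal symbol for separate treatment.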

The terminal symbol in a definition string is used to define the data item at the end of the definition. NeXus currently supports only three types of data items at the end of the chain: scientific data sets (SDS), vGroups, and links to other data items or vGroups. The terminal symbol for a link is specified by the keyword NXLINK followed by a valid alias of another data item or vGroup. For example, the terminal symbol:

SDS counts

would define a SDS with name counts.

A vGroup is denoted by the keyword VGROUP. At that point the vGroup itself has already been defined by the path string, so this form of alias is only useful for the definition of links to vGroups.

A SDS is more involved. The definition of an SDS starts with the keyword SDS. This keyword must then be followed by the name of the SDS. Following the name there are option value pairs which define the details of the SDS. The following options exist:

If no options are given, a default is used. This will be a single floating point number, as this is the most frequently written data item. As an example, see the definition of a 3D array of 32-bit integers:
   PATHSTRING/SDS counts -rank 3 -dim {64,64,712} -type DFNT_INT32 \
                  -attr {Units,Counts}      
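
How such an SDS terminal symbol might be turned into a C structure is sketched below. The sketch handles only the four options that appear in the example above (-rank, -dim, -type, -attr) and expects the definition on a single line without the continuation backslash. The names SdsDef and parse_sds are hypothetical, and the encoding of the default as rank 1, dimension 1, type DFNT_FLOAT32 is an assumption about what "a single floating point number" means.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RANK 8      /* hypothetical limit for this sketch */

typedef struct {
    char name[64];
    int  rank;
    int  dim[MAX_RANK];
    char type[32];
    char attr_name[32];
    char attr_value[32];
} SdsDef;

/* Parse "SDS name [-option value]...". Returns 1 on success, 0 on error. */
int parse_sds(const char *term, SdsDef *out)
{
    char buf[256];
    char *tok;

    if (strlen(term) >= sizeof buf)
        return 0;
    strcpy(buf, term);

    /* Assumed default: a single floating point number. */
    memset(out, 0, sizeof *out);
    out->rank = 1;
    out->dim[0] = 1;
    strcpy(out->type, "DFNT_FLOAT32");

    tok = strtok(buf, " \t");
    if (tok == NULL || strcmp(tok, "SDS") != 0)
        return 0;
    tok = strtok(NULL, " \t");
    if (tok == NULL || strlen(tok) >= sizeof out->name)
        return 0;
    strcpy(out->name, tok);

    /* Remaining tokens come in option/value pairs. */
    while ((tok = strtok(NULL, " \t")) != NULL) {
        char *val = strtok(NULL, " \t");
        if (val == NULL)
            return 0;
        if (strcmp(tok, "-rank") == 0) {
            out->rank = atoi(val);
            if (out->rank < 1 || out->rank > MAX_RANK)
                return 0;
        } else if (strcmp(tok, "-type") == 0) {
            if (strlen(val) >= sizeof out->type)
                return 0;
            strcpy(out->type, val);
        } else if (strcmp(tok, "-dim") == 0) {
            char *p = val, *end;
            int i = 0;
            if (*p++ != '{')
                return 0;
            while (*p != '\0' && *p != '}') {
                long d = strtol(p, &end, 10);
                if (end == p || i >= MAX_RANK)
                    return 0;          /* not a number, or too many */
                out->dim[i++] = (int)d;
                p = (*end == ',') ? end + 1 : end;
            }
        } else if (strcmp(tok, "-attr") == 0) {
            if (sscanf(val, "{%31[^,],%31[^}]}",
                       out->attr_name, out->attr_value) != 2)
                return 0;
        } else {
            return 0;                  /* option unknown to this sketch */
        }
    }
    return 1;
}
```

The sketch does not check that the number of dimensions matches the rank; a real parser would need to do so before creating the SDS.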

  

3.  The NXDICT-API

In order to interface with the NeXus dictionary API a set of API-functions is needed. All functions and data types belonging to this API start with the letters: NXD. The functions belonging to this API fall into three groups:

One additional data type is needed for this API:


typedef struct __NXdict *NXdict;

NXdict will be used as a handle for the dictionary currently in use.

3.1  Dictionary Maintenance Functions


NXstatus NXDinitfromfile(char *filename, NXdict *pDict);
NXstatus NXDclose(NXdict handle, char *filename);

NXstatus NXDadd(NXdict handle, char *alias, char *DefString);
NXstatus NXDget(NXdict handle, char *alias, char *pBuffer, int iBufLen);
NXstatus NXDupdate(NXdict handle, char *alias, char *pNewVal);
NXDinitfromfile creates a new NeXus dictionary. If filename is NULL, this is all that happens. If filename is not NULL, it will be opened and the dictionary will be initialized from the file specified. The return value is either 0 for failure or non zero for success.

NXDclose writes and deletes a NeXus dictionary. If filename is not NULL, the dictionary specified by handle is first written to the file specified by filename. In any case the dictionary specified by handle will be deleted.

NXDadd adds a new alias - Definition String pair to the dictionary specified by handle.

NXDget retrieves the definition string for the alias specified as the second parameter from the dictionary handle. The definition string is copied to pBuffer. At most iBufLen characters will be copied.

NXDupdate replaces the definition for the alias specified as second parameter with the new value supplied as last parameter.
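
The add, get and update semantics described above can be illustrated with a toy in-memory dictionary. This is not the NXDICT implementation. All names (ToyDict, toy_add, toy_get, toy_update, toy_clear) are hypothetical, and two behaviours are assumptions of the sketch rather than statements about NXDICT: toy_add rejects an alias which already exists, and toy_get always leaves room for the terminating zero within the buffer length. Like the real functions, the toy functions return 0 for failure and non-zero for success.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX 64      /* hypothetical fixed capacity */

typedef struct {
    char *alias[TOY_MAX];
    char *def[TOY_MAX];
    int   count;
} ToyDict;

static char *toy_strdup(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

static int toy_find(const ToyDict *d, const char *alias)
{
    int i;
    for (i = 0; i < d->count; i++)
        if (strcmp(d->alias[i], alias) == 0)
            return i;
    return -1;
}

/* Add a new alias/definition pair (assumption: duplicates are rejected). */
int toy_add(ToyDict *d, const char *alias, const char *def)
{
    char *a, *s;
    if (d->count >= TOY_MAX || toy_find(d, alias) >= 0)
        return 0;
    a = toy_strdup(alias);
    s = toy_strdup(def);
    if (a == NULL || s == NULL) {
        free(a);
        free(s);
        return 0;
    }
    d->alias[d->count] = a;
    d->def[d->count] = s;
    d->count++;
    return 1;
}

/* Copy the definition for alias into buf, truncating to fit buflen. */
int toy_get(const ToyDict *d, const char *alias, char *buf, int buflen)
{
    int i = toy_find(d, alias);
    if (i < 0 || buflen < 1)
        return 0;
    strncpy(buf, d->def[i], (size_t)buflen - 1);
    buf[buflen - 1] = '\0';
    return 1;
}

/* Replace the definition of an existing alias. */
int toy_update(ToyDict *d, const char *alias, const char *newdef)
{
    int i = toy_find(d, alias);
    char *s;
    if (i < 0)
        return 0;
    s = toy_strdup(newdef);
    if (s == NULL)
        return 0;
    free(d->def[i]);
    d->def[i] = s;
    return 1;
}

/* Release all entries, comparable to the deletion done on close. */
void toy_clear(ToyDict *d)
{
    int i;
    for (i = 0; i < d->count; i++) {
        free(d->alias[i]);
        free(d->def[i]);
    }
    d->count = 0;
}
```

A linear search is sufficient here because a dictionary of this kind typically holds only a modest number of aliases; a hash table would be the obvious refinement.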

If a special dictionary vGroup were accepted as an extension to NeXus, two more functions would need to be defined which read the dictionary from and write it to the NeXus file.

3.2  Data Handling Functions


NXstatus NXDputalias(NXhandle file, NXdict dict,
                     char *alias, void *pData);
NXstatus NXDputdef(NXhandle file, NXdict dict,
                   char *pDefString, void *pData);

NXstatus NXDgetalias(NXhandle file, NXdict dict,
                     char *alias, void *pData);
NXstatus NXDgetdef(NXhandle file, NXdict dict,
                   char *pDefString, void *pData);

NXstatus NXDaliaslink(NXhandle file, NXdict dict,
                      char *pAlias1, char *pAlias2);
NXstatus NXDdeflink(NXhandle file, NXdict dict,
                    char *pDef1, char *pDef2);

NXstatus NXDopenalias(NXhandle file, NXdict dict,
                      char *alias);
NXstatus NXDopendef(NXhandle file, NXdict dict,
                    char *pDefString);

The NXDICT data handling functions come in pairs. The version ending in alias expects an NXdict and an alias as input. These routines retrieve a definition string from the dictionary and work out the path from that. The other version, ending in def, acts upon a definition string specified as the third parameter. With this scheme both full dictionary operation and operation with dynamically generated definition strings are possible. All routines return the usual NeXus status returns. All these routines start at the current vGroup level and return to it.

NXDputalias, NXDputdef write the data element specified by the alias or the definition string to the NeXus file specified as first parameter. pData is a pointer to the data to be written. These routines will check for the existence of all vGroups required in the path part of the definition string. If a vGroup is missing it will be created. These routines step back to the same vGroup level from which they were called.

NXDgetalias, NXDgetdef read a data item from file. pData MUST point to a data area large enough to hold the data read. If a vGroup is missing in the path for one of these routines an error is generated because it is assumed that the data is present if a program wants to read it. These routines step back to the same vGroup level from which they were called.

NXDaliaslink, NXDdeflink link the alias or definition given as fourth parameter to the vGroup specified by the third parameter. pAlias1 or pDef1 MUST refer to a vGroup because HDF can only link items into vGroups. The item being linked against MUST exist, otherwise the software will complain. The vGroup into which the link is installed will be created on the fly if it is not present. Please note that both aliases or definition strings must refer to the same starting point in the vGroup hierarchy of the NeXus file. These routines step back to the same vGroup level from which they were called.

NXDopenalias, NXDopendef open the data item specified by the alias or the definition string. The usual NeXus functions can then be used to interact with the data. These routines use the same scheme for creating vGroups on the fly as the put routines above. The position in the vGroup hierarchy after this call depends on the nature of the terminal symbol. If it is an SDS, the vGroup hierarchy will be stepped back to the level from which the call occurred, and the SDS will be left open. If the terminal symbol is a vGroup, this vGroup will be made the current vGroup and no back stepping in the vGroup hierarchy occurs.
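
A typical calling sequence for writing data by alias might look like the sketch below. The prototypes are the ones listed in this document. Everything else is an assumption made for illustration: the typedefs for NXhandle and NXstatus are placeholders for the definitions in the NeXus API header, the aliases lambda and counts and the function write_scan are hypothetical, and the three stub functions at the end are stand-ins which merely record the calls so that the sketch compiles and runs without the NeXus library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder types: the real definitions come from the NeXus API header. */
typedef void *NXhandle;
typedef int NXstatus;
typedef struct __NXdict *NXdict;

/* Prototypes as listed in sections 3.1 and 3.2. */
NXstatus NXDinitfromfile(char *filename, NXdict *pDict);
NXstatus NXDclose(NXdict handle, char *filename);
NXstatus NXDputalias(NXhandle file, NXdict dict, char *alias, void *pData);

/* Write two data items by alias. Returns non-zero on success. */
NXstatus write_scan(NXhandle file, char *dictfile, float *lambda, int *counts)
{
    NXdict dict;
    NXstatus ok;

    if (!NXDinitfromfile(dictfile, &dict))
        return 0;

    /* The dictionary supplies path, rank, dimensions and type. */
    ok = NXDputalias(file, dict, "lambda", lambda) &&
         NXDputalias(file, dict, "counts", counts);

    NXDclose(dict, NULL);   /* NULL file name: delete without saving */
    return ok;
}

/* ---- Stand-in stubs, NOT the NeXus library: they only log the calls ---- */
struct __NXdict { int unused; };
static struct __NXdict stub_dict;
char call_log[128] = "";

NXstatus NXDinitfromfile(char *filename, NXdict *pDict)
{
    (void)filename;
    *pDict = &stub_dict;
    strcat(call_log, "init;");
    return 1;
}

NXstatus NXDclose(NXdict handle, char *filename)
{
    (void)handle;
    (void)filename;
    strcat(call_log, "close;");
    return 1;
}

NXstatus NXDputalias(NXhandle file, NXdict dict, char *alias, void *pData)
{
    (void)file;
    (void)dict;
    (void)pData;
    strcat(call_log, "put:");
    strcat(call_log, alias);
    strcat(call_log, ";");
    return 1;
}
```

The point of the sketch is that write_scan contains no structural information at all: a change to the file layout would be made in the dictionary file, not in this function.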

4.  NeXus Utility Functions

This section lists a couple of functions which either perform common tasks on NeXus files or relate to aspects of error handling and debugging.


NXstatus NXUwriteglobals(NXhandle file,
                         char *filename,
                         char *owner,
                         char *adress,
                         char *phone,
                         char *email,
                         char *fax,
                         char *thing);

NXstatus NXUentergroup(NXhandle hFil, char *name, char *class);
NXstatus NXUenterdata(NXhandle fileid, char *label, int datatype,
                      int rank, int dim[], char *pUnits);

NXstatus NXUallocSDS(NXhandle hFil, void **pData);
NXstatus NXUfreeSDS(void **pData);

NXUwriteglobals writes the global attributes to a newly opened NeXus file. The parameters should be self-explanatory. In addition, the file creation date is written automatically.

NXUentergroup tries to open the group specified by name and class. If it is not present, it will be created and opened.

NXUenterdata tries to open the SDS specified by label. If it is not present, it will be created and opened.

NXUallocSDS allocates enough space for the currently open SDS. The pointer created is returned in pData.

NXUfreeSDS returns memory allocated by NXUallocSDS to the system.
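
The amount of memory needed for an SDS follows from its dimensions and its element size. The sketch below shows that calculation together with a matching allocate and free pair. It is an illustration only: sds_bytes, toy_alloc_sds and toy_free_sds are hypothetical names, rank, dimensions and element size are passed in explicitly instead of being queried from the open SDS, and resetting the pointer to NULL on free is an assumption suggested by the void ** parameter of NXUfreeSDS.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Bytes needed for an array with the given dimensions and element size.
 * Returns 0 on invalid input or if the product would overflow size_t.
 */
size_t sds_bytes(int rank, const int dim[], size_t elem_size)
{
    size_t total = elem_size;
    int i;

    if (rank < 1 || elem_size == 0)
        return 0;
    for (i = 0; i < rank; i++) {
        if (dim[i] <= 0)
            return 0;
        if (total > SIZE_MAX / (size_t)dim[i])
            return 0;                  /* multiplication would overflow */
        total *= (size_t)dim[i];
    }
    return total;
}

/* Allocate zeroed storage for the array; the pointer is returned in pData. */
int toy_alloc_sds(int rank, const int dim[], size_t elem_size, void **pData)
{
    size_t n = sds_bytes(rank, dim, elem_size);

    *pData = (n > 0) ? calloc(1, n) : NULL;
    return *pData != NULL;
}

/* Return the storage to the system and reset the caller's pointer. */
int toy_free_sds(void **pData)
{
    if (pData == NULL)
        return 0;
    free(*pData);
    *pData = NULL;
    return 1;
}
```

For the 64 x 64 x 712 array of 32-bit integers used as an example in section 2, this amounts to 64 * 64 * 712 * 4 = 11665408 bytes.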

5.  Conclusion

As an example, see the code for writing three data items with the NeXus API, the same code written with the NXDICT API, and the accompanying dictionary data file. The goal of simplifying NeXus programming has been achieved. It is hoped that the dictionary API described here will simplify programming for NeXus sufficiently to make NeXus a success. Please note that the convenience of this API comes at a cost: extra CPU time and memory overhead. In most applications this will not be critical, but programmers writing NeXus data for instruments which have a very high data flux should stick to the NeXus base API.

NXDDL may prove useful in its own right as a tool for discussing data structures for instruments or data analysis tools. It may be useful to store a data dictionary, when available, in the NeXus file and to create a tool which extracts the dictionary from the NeXus file. Such a scheme would make the distribution of data dictionaries easier and would solve the problem of different versions of a data dictionary being in circulation.

6.  References

  1. M. Könnecke, P. Klosowski, J. Tischler, R. Osborn: NeXus: A proposal for a common data exchange format for the neutron and X-ray scattering community. To be published.
  2. HDF Reference Manual, NCSA, February 1994, available electronically from ftp.ncsa.uiuc.edu:/Documentation/HDF3.3

