Extract SDDS Page Efficiently
Moderators: cyao, michael_borland
-
- Posts: 60
- Joined: 05 Aug 2010, 11:32
- Location: SLAC National Accelerator Laboratory
Extract SDDS Page Efficiently
Hi all,
I'd like to know if, on a fundamental level, it's possible to extract a page from an SDDS file efficiently. I'm running large simulations again, and all the code I've seen does an SDDS_ReadPage to move through the pages one-by-one until it gets to the desired location. Is there an easy way to specify a page to start at and then read the page?
- Joel
-
- Posts: 2015
- Joined: 19 May 2008, 09:33
- Location: Argonne National Laboratory
Re: Extract SDDS Page Efficiently
Joel,
Unfortunately, there isn't a way to do this. It's something we should add.
You might get some benefit from the following trick: instead of using SDDS_ReadPage(), use SDDS_ReadPageSparse() and set the interval parameter to a large value. This will speed up the process of reading the unwanted pages. Once you reach the page you are interested in, set the interval to 1.
--Michael
-
- Posts: 60
- Joined: 05 Aug 2010, 11:32
- Location: SLAC National Accelerator Laboratory
Re: Extract SDDS Page Efficiently
Thanks Michael,
Is there documentation for how to use these sddsdata functions? SDDS_ReadPageSparse takes fileIndex, sparse_interval, and sparse_offset. What do they mean? I can get fileIndex, in python that's (sddsclassvariable).index. But I don't know what to put for sparse_interval or sparse_offset.
Joel
-
- Posts: 2015
- Joined: 19 May 2008, 09:33
- Location: Argonne National Laboratory
Re: Extract SDDS Page Efficiently
Joel,
Alas, there isn't any documentation of that specific routine.
Code: Select all
int32_t SDDS_ReadPageSparse(SDDS_DATASET *SDDS_dataset, uint32_t mode,
                            int32_t sparse_interval,
                            int32_t sparse_offset)
mode is unused and is present for future expansion.
sparse_interval is an integer greater than 1 giving the interval between rows that are read into memory.
sparse_offset is an integer greater than 0 giving the offset in rows to the first row that will be stored.
Hope this helps.
--Michael
-
- Posts: 60
- Joined: 05 Aug 2010, 11:32
- Location: SLAC National Accelerator Laboratory
Re: Extract SDDS Page Efficiently
Michael,
I don't think I understand. I have a 1000-page file. If I run:
Code: Select all
# (load page and initialize class)
page=sddsdata.ReadPageSparse(a.index,10,3)
if page != 1:
    sddsdata.PrintErrors(a.SDDS_EXIT_PrintErrors)
I get:
Code: Select all
Error:
Unable to read rows--failure reading string (SDDS_ReadBinaryRows)
If I leave off the PrintErrors, it seems to do the exact same thing ReadPage does.
Also, loosely related: it occasionally reads 47914655154177L instead of 1L from the file. Any ideas what's going on with that? It's intermittent too - it's usually just the first page, but sometimes it's every single page. (This is a parameter file I'm testing on that's for use with loading parameters, the occurrence is supposed to be 1 for all of the rows.)
Re: Extract SDDS Page Efficiently
When using this in python the code would look like:
Code: Select all
skipToPage=6
page = 0
# skip the unwanted pages, keeping only the first row of each
while (page < skipToPage - 1):
    page=sddsdata.ReadPageSparse(a.index,99999999,0)
    if page == 0:
        sddsdata.PrintErrors(a.SDDS_EXIT_PrintErrors)
    if page == -1:
        break
if page != -1:
    # read the wanted page in full (interval 1, offset 0)
    page=sddsdata.ReadPageSparse(a.index,1,0)
    while page > 0:
        for i in range(numberOfParameters):
            a.parameterData[i].append(sddsdata.GetParameter(a.index,i))
        for i in range(numberOfColumns):
            a.columnData[i].append(sddsdata.GetColumn(a.index,i))
        page = sddsdata.ReadPage(a.index)
    if page == 0:
        sddsdata.PrintErrors(a.SDDS_EXIT_PrintErrors)
#close SDDS file
if sddsdata.Terminate(a.index) != 1:
    sddsdata.PrintErrors(a.SDDS_EXIT_PrintErrors)
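The skip-then-read pattern above can also be wrapped in a small helper that stops after the one page of interest. This is only a sketch: seek_to_page, SKIP_INTERVAL, and FakeReader are hypothetical names introduced here, and the helper assumes the reader returns the page number on success, 0 on error, and -1 at end of file, as the code above implies. The reader is passed in as an argument so the logic can be exercised without an SDDS file.

```python
SKIP_INTERVAL = 99999999  # large interval so only one row of each skipped page is kept


def seek_to_page(read_page_sparse, file_index, target_page):
    """Advance to target_page (1-based) and read that page in full.

    read_page_sparse is any callable taking (fileIndex, sparse_interval,
    sparse_offset), the argument order used in this thread. Returns the
    page number read, or the reader's 0 / -1 status if it stops early.
    """
    if target_page < 1:
        raise ValueError("target_page must be 1 or greater")
    for _ in range(target_page - 1):
        status = read_page_sparse(file_index, SKIP_INTERVAL, 0)
        if status <= 0:
            return status  # error (0) or end of file (-1) before the target
    return read_page_sparse(file_index, 1, 0)


class FakeReader:
    """Hypothetical stand-in for sddsdata, used only to exercise the helper."""

    def __init__(self, n_pages):
        self.n_pages = n_pages
        self.page = 0
        self.calls = []

    def __call__(self, file_index, sparse_interval, sparse_offset):
        self.calls.append((sparse_interval, sparse_offset))
        if self.page >= self.n_pages:
            return -1
        self.page += 1
        return self.page


reader = FakeReader(n_pages=10)
result = seek_to_page(reader, 0, 6)
past_end = seek_to_page(FakeReader(n_pages=3), 0, 6)
print(result, past_end)
```

With the real module, the call would presumably look like seek_to_page(sddsdata.ReadPageSparse, a.index, 6), followed by the usual GetParameter and GetColumn calls.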
-
- Posts: 60
- Joined: 05 Aug 2010, 11:32
- Location: SLAC National Accelerator Laboratory
Re: Extract SDDS Page Efficiently
Thanks for the code, I'll see if I can make it work.
Question: So (sparse_offset >= 0)? Not (sparse_offset > 0)? Is this the same with sparse_interval?
Re: Extract SDDS Page Efficiently
JoelFrederico wrote: Question: So (sparse_offset >= 0)? Not (sparse_offset > 0)? Is this the same with sparse_interval?
Sparse interval has to be 1 or greater. A value of 1 means there really is no 'sparsing' going on, because it is reading every row.
Sparse offset has to be 0 or greater. A value of 0 means it will read the first row.
So in my example code, it will read the first row of every page but then skip the rest of it. It must read at least one row, otherwise it will bomb, which is why having the sparse offset greater than the number of rows will cause problems.
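The row selection described above can be modelled with a plain Python slice. This is a sketch of the semantics as explained in this thread, not the library's implementation, and sparse_rows is a hypothetical name used only for illustration.

```python
def sparse_rows(rows, sparse_interval, sparse_offset):
    """Model which rows of one page are kept for a given interval and offset."""
    if sparse_interval < 1:
        raise ValueError("sparse_interval must be 1 or greater")
    if sparse_offset < 0:
        raise ValueError("sparse_offset must be 0 or greater")
    # Start at sparse_offset, then keep every sparse_interval-th row.
    return rows[sparse_offset::sparse_interval]


ten_rows = list(range(10))
every_row = sparse_rows(ten_rows, 1, 0)          # interval 1: nothing is skipped
first_only = sparse_rows(ten_rows, 99999999, 0)  # huge interval: first row only
offset_three = sparse_rows(ten_rows, 10, 3)      # starts at the fourth row
nothing_left = sparse_rows([0], 10, 3)           # offset past the end of a one-row page
print(every_row, first_only, offset_three, nothing_left)
```

In this model the last call selects no rows at all. If the real routine needs at least one row per page, as stated above, that may be why ReadPageSparse(a.index,10,3) failed on a file whose pages appear to hold a single row each.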
-
- Posts: 60
- Joined: 05 Aug 2010, 11:32
- Location: SLAC National Accelerator Laboratory
Re: Extract SDDS Page Efficiently
Ohhhhh, awesome, this is exactly the information I was looking for. Thanks! I think this can be called closed now.
-
- Posts: 60
- Joined: 05 Aug 2010, 11:32
- Location: SLAC National Accelerator Laboratory
Re: Extract SDDS Page Efficiently
Okay, too quick to speak. I don't think this will be a problem for me now because I'm not looking at integers, but it's strange. I'm including the code you'll need to replicate the problem.
The first run through, I believe before it compiles .pyc files, it has errors reading the second column in pages 3, 4, and 7. (It should always read as 1.) Subsequent runs have errors on the first page. I don't really know what's going on.
Code: Select all
joelfred@noric05 error$ ./script.py
Attempt to load pages 1-10
Page number 1.
[[['EOFFSET']], [[1L]], [['DE']], [['0.000000e+00\r']]]
Page number 2.
[[['EOFFSET']], [[1L]], [['DE']], [['-1.508067e-03\r']]]
Page number 3.
[[['EOFFSET']], [[1L]], [['DE']], [['1.155152e-03\r']]]
Page number 4.
[[['EOFFSET']], [[47944719925249L]], [['DE']], [['-1.099028e-02\r']]]
Page number 5.
[[['EOFFSET']], [[1L]], [['DE']], [['1.996521e-03\r']]]
Page number 6.
[[['EOFFSET']], [[1L]], [['DE']], [['7.943676e-03\r']]]
Page number 7.
[[['EOFFSET']], [[47944719925249L]], [['DE']], [['4.454010e-03\r']]]
Page number 8.
[[['EOFFSET']], [[1L]], [['DE']], [['3.903766e-03\r']]]
Page number 9.
[[['EOFFSET']], [[1L]], [['DE']], [['-1.027400e-03\r']]]
Page number 10.
[[['EOFFSET']], [[47944719925249L]], [['DE']], [['7.756908e-04\r']]]
joelfred@noric05 error$ ./script.py
Attempt to load pages 1-10
Page number 1.
[[['EOFFSET']], [[47051366727681L]], [['DE']], [['0.000000e+00\r']]]
Page number 2.
[[['EOFFSET']], [[1L]], [['DE']], [['-1.508067e-03\r']]]
Page number 3.
[[['EOFFSET']], [[1L]], [['DE']], [['1.155152e-03\r']]]
Page number 4.
[[['EOFFSET']], [[1L]], [['DE']], [['-1.099028e-02\r']]]
Page number 5.
[[['EOFFSET']], [[1L]], [['DE']], [['1.996521e-03\r']]]
Page number 6.
[[['EOFFSET']], [[1L]], [['DE']], [['7.943676e-03\r']]]
Page number 7.
[[['EOFFSET']], [[1L]], [['DE']], [['4.454010e-03\r']]]
Page number 8.
[[['EOFFSET']], [[1L]], [['DE']], [['3.903766e-03\r']]]
Page number 9.
[[['EOFFSET']], [[1L]], [['DE']], [['-1.027400e-03\r']]]
Page number 10.
[[['EOFFSET']], [[1L]], [['DE']], [['7.756908e-04\r']]]
- Attachments
-
- error.tar.gz
- (3.97 KiB) Downloaded 413 times
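One observation that may help narrow this down: each of the unexpected values reported in this thread has the expected value 1 in its low 32 bits, with a varying number in the high 32 bits. That pattern would be consistent with a 32-bit integer being interpreted as a 64-bit one whose upper bytes are not initialized, but that cause is an assumption and is not confirmed anywhere in the thread. The arithmetic itself can be checked with a short sketch (split_64 is a hypothetical helper name):

```python
# Unexpected values copied from the posts above; the expected value is 1 in each case.
observed = [47944719925249, 47051366727681, 47914655154177]


def split_64(value):
    """Return (high 32 bits, low 32 bits) of a non-negative 64-bit integer."""
    return value >> 32, value & 0xFFFFFFFF


parts = [split_64(v) for v in observed]
for value, (high, low) in zip(observed, parts):
    print(value, "-> high 32 bits:", high, "low 32 bits:", low)
```

The low word is 1 in all three cases, while the high word differs between runs.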