I am having a tough time reading variable-length arrays from the entries of a FITS table using the CFITSIO library (I have to use it because of other software I am developing).

Now, the FITS table I am trying to read looks like this:

[screenshot of the FITS table]

As you can see, the last three columns, instead of having scalar values in their cells, contain variable-length arrays.

The CFITSIO documentation is not very helpful for this particular case: most of the basic routines are designed to fill an array by reading regular columns directly (those with a scalar in each cell, see section 2 of https://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c_user/node46.html). fits_read_col does not seem to work with this data structure.

The documentation recommends the fits_read_descript routine for reading variable-length columns. The problem is that this function returns low-level information, in particular the starting offset in the heap where the array is stored (see section 7 of https://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c_user/node82.html). So even once I have this low-level information about a cell containing a variable-length array, it is not clear how to use it to fetch the numerical values!
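
To make the meaning of the descriptor concrete, here is a minimal sketch of what `repeat` and `offset` refer to: `offset` is a byte offset from the start of the table heap, `repeat` is the number of elements stored there, and FITS stores binary values big-endian. The sketch assumes a column of 32-bit floats (TFORM code `PE`) and assumes the heap bytes are already available in memory; `decodeHeapFloats` is a hypothetical helper for illustration, not a CFITSIO routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of CFITSIO): decode `repeat` 32-bit floats
// that start `offset` bytes into an in-memory copy of the table heap.
std::vector<float> decodeHeapFloats(const std::vector<unsigned char>& heap,
                                    long repeat, long offset) {
    assert(repeat >= 0 && offset >= 0);
    static_assert(sizeof(float) == 4, "expects 32-bit IEEE 754 floats");

    const std::size_t first = static_cast<std::size_t>(offset);
    const std::size_t count = static_cast<std::size_t>(repeat);
    if (first > heap.size() || count > (heap.size() - first) / 4)
        throw std::out_of_range("descriptor points outside the heap");

    std::vector<float> values(count);
    for (std::size_t i = 0; i < count; ++i) {
        const unsigned char* p = heap.data() + first + 4 * i;
        // FITS stores binary data big-endian: most significant byte first.
        const std::uint32_t bits =
            (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
            (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
        // Reinterpret the assembled bit pattern as a float.
        std::memcpy(&values[i], &bits, sizeof(float));
    }
    return values;
}
```

This only illustrates the layout behind the descriptor. It does not show how to obtain the heap bytes from the file, and columns of other element types would need a different element width and conversion.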

CFITSIO iterators are only marginally helpful, and there is no example with such a complicated data structure.

Has anybody done this before? Could anybody produce a snippet that uses CFITSIO to read variable-length arrays? It would be immensely helpful.

The FITS file of which I took the screenshot can be found here.

Here is a tentative snippet that opens the file and examines the columns and rows, applying the suggested fits_read_descript function to the variable-length columns. I do not know how to proceed further, because I don't know how to use the returned parameters to fetch the actual numerical values in the table.

#include "fitsio.h"
#include <cstdio>
#include <iostream>

int main(){

    fitsfile *fp = 0; // pointer to the fitsfile type provided by the CFITSIO library
    int status = 0;   // error status shared by all CFITSIO calls (0 means OK)

    // open the FITS file and move to HDU 1, which contains the table
    // with variable-length arrays
    fits_open_file(&fp, "rmf_obs5029747.fits[1]", READONLY, &status);
    if (status) {
        // the file could not be opened: print the CFITSIO error and stop
        fits_report_error(stderr, status);
        return 1;
    }

    // read HDU type
    int hdutype;
    fits_get_hdu_type(fp, &hdutype, &status);
    std::cout << "found type " << hdutype << " HDU type." << "\n";

    // read number of rows and columns
    long nTableRows;
    int  nTableCols;
    fits_get_num_rows(fp, &nTableRows, &status);
    fits_get_num_cols(fp, &nTableCols, &status);
    std::cout << "the table has " << nTableRows << " rows" << "\n";
    std::cout << "the table has " << nTableCols << " columns" << "\n";

    // loop through the columns and consider only those with a negative typecode,
    // indicating that they contain a variable-length array
    // https://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c_user/node29.html
    int typecode;
    long repeat;
    long width;
    long offset;

    // CFITSIO column and row numbers are 1-based
    for (int colnum = 1; colnum <= nTableCols; ++colnum) {
        fits_get_coltype(fp, colnum, &typecode, &repeat, &width, &status);
        if (typecode < 0) {
            std::cout << "->column " << colnum << " contains a variable-length array" << "\n";
            std::cout << "->examining its rows..." << "\n";
            // loop through the rows, reading the descriptor (repeat, offset) of each cell
            for (long rownum = 1; rownum <= nTableRows; ++rownum)
                fits_read_descript(fp, colnum, rownum, &repeat, &offset, &status);
        }
    }

    // report any error accumulated in status, then close the file
    if (status) fits_report_error(stderr, status);
    fits_close_file(fp, &status);
    return status ? 1 : 0;
}

3 Answers

Just a thought; you have probably already seen or considered this, but just in case you haven't:

fits_get_col_display_width(): to find out the number of characters available.

If fits_read_descript() gives the number of elements in the array and the starting offset, can the total number of bytes be read into a string and tokenized with the delimiter "," to get the numbers?

lorenz
  • Hi @lorenz, thanks for your help. `fits_get_col_display_width()` crashes when used on a variable-length column. But you are right, `fits_read_descript(fitsfile *fptr, int colnum, LONGLONG rownum, long *repeat, long *offset, int *status)` returns the number of elements `repeat` and the starting `offset`. I just do not know how to use them... There is a `fits_read_tblbytes(fitsfile *fptr, LONGLONG firstrow, LONGLONG firstchar, LONGLONG nchars, unsigned char *values, int *status)`, but the documentation is not clear on what `firstrow`, `firstchar` and `nchars` are. – cosimoNigro Mar 04 '20 at 08:43

There are a lot of odd special cases in FITS, and you have found one of them.

I have successfully used the following approach.

  1. Call fits_read_descript{s}{ll}() to determine the repeat and offset for the row in question. You can also use the descripts variant to determine the repeat and offset for many rows at one time.
  2. Call fits_read_col{null}() to read the data, one row at a time. The number of elements is the repeat count you found from step 1, or less if you want a subset. Set the column, row number and first element as you would normally do. Using the null variant works just fine.

The important thing is you can only read variable length data this way one row at a time, but it saves you the trouble of doing all the byte decoding and heap indexing. Even if you use the fits_read_descripts() variant to determine multiple repeat counts in one function call, you still must call fits_read_col() once for each table row you are interested in.

Reading variable-length strings or bit arrays is its own fun diversion, but it looks like you just want to read X-ray response matrix data (floating point), so this should answer your question.

CraigM

Thanks @lorenz and @CraigM for your suggestions.

I actually found the solution and implemented it in the ROOT class that interfaces with the CFITSIO library, so anyone who has the same problem can copy the solution or use ROOT directly. The function I introduced to read a variable-length cell in ROOT is: TFITSHDU::GetTabVarLengthVectorCell()

I did it one year ago and forgot to post the answer here :)

You can find the code with the solution here.

I actually used the scheme you proposed @CraigM, i.e. combining fits_read_descript and fits_read_col.