I have a 60GB FITS file containing a binary table. I would like to read (and process) this table one row/entry/line/block* at a time.
(*I'm unsure of the correct nomenclature)
I am using pyfits and what I would like to do boils down to simply:
import pyfits
hdulist = pyfits.open("file.fits")
# the binary table happens to be in the 2nd extension,
# hence it is in hdulist[1]
n_entries = hdulist[1].header['NAXIS2']
for i in xrange(n_entries):
entry = hdulist[1].data[i] # I am confused what happens at this step
# now do stuff with the values in entry
# .....
The variable entry is of type <class 'pyfits.fitsrec.FITS_record'> and has a length equal to the number of columns in the binary table. However, what appears to happen is that the whole binary table is read into memory at the line entry = hdulist[1].data[i].
I have looked through the pyfits documentation but I can't find any methods that seem to read data from a binary table extension on a table entry by table entry basis (or small sets of entries at a time). I don't want to select certain entries from the table, just simply scan through them in order.
I guess my questions are:
0) What is happening at the hdulist[1].data[i] step? Why is everything being read into memory? (Is there some way around this?)
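To be clear about the behaviour I was hoping for: something like memory-mapped access, where grabbing row i only touches the bytes for that row. Here is a minimal stdlib sketch of that idea on a flat binary file (the record layout and file name are invented for illustration, not my actual data):

```python
import mmap
import struct

record_fmt = "<idd"                       # invented layout: int32 + two float64s
record_size = struct.calcsize(record_fmt)

# Build a small flat binary file of fixed-size records.
with open("records.bin", "wb") as f:
    for i in range(100):
        f.write(struct.pack(record_fmt, i, i + 0.25, i + 0.5))

# Memory-map the file: row i lives at byte offset i * record_size,
# and only the pages actually touched get loaded into memory.
with open("records.bin", "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    i = 42
    row = struct.unpack_from(record_fmt, mm, i * record_size)
    mm.close()

print(row)  # (42, 42.25, 42.5)
```

This is what I (perhaps naively) assumed indexing hdulist[1].data would do under the hood.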
1) Have I missed something and can pyfits actually do what I want?
2) Is there another python library out there that will? (i.e. one that can work with a binary table in a FITS extension)
3) If not, can I re-write the data in a different binary (or other compressed/not ascii) format (that is not FITS) and find some other python library or module to do what I want?
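For question 3, the kind of thing I have in mind is dumping the table to a flat file of fixed-size records and scanning it sequentially, one record at a time. A minimal sketch with only the standard-library struct module (the column layout is made up, standing in for my real columns):

```python
import struct

# Hypothetical record layout: an int32 id plus two float64 measurements.
record_fmt = "<idd"                  # little-endian: int32, float64, float64
record_size = struct.calcsize(record_fmt)

# Write a few sample rows to a flat binary file.
with open("table.bin", "wb") as f:
    for i in range(5):
        f.write(struct.pack(record_fmt, i, i * 0.5, i * 2.0))

# Scan the file in order, one fixed-size record at a time; only
# record_size bytes are held in memory per iteration.
rows = []
with open("table.bin", "rb") as f:
    while True:
        chunk = f.read(record_size)
        if len(chunk) < record_size:
            break
        rows.append(struct.unpack(record_fmt, chunk))

print(rows[0])  # (0, 0.0, 0.0)
```

If pyfits can already do this on the FITS file directly, obviously I'd rather avoid the conversion step.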