
How can I add a single row to a binary table inside a large FITS file using pyfits, astropy.io.fits, or some other Python library?

This file is used as a log, so a single row will be added every second. Eventually the file will reach gigabytes in size, so reading the whole file and writing it back, or keeping a copy of the data in memory and rewriting the file every second, is not practical. With pyfits or astropy.io.fits, so far I could only read everything into memory, add a new row, and then write it all back.

Example: I create a FITS file like this:

import numpy, pyfits
data = numpy.array([1.0])
col = pyfits.Column(name='index', format='E', array=data)
cols = pyfits.ColDefs([col])
tbhdu = pyfits.BinTableHDU.from_columns(cols)
tbhdu.writeto('test.fits')

And I want to add some new value to the column 'index', i.e. add one more row to the binary table.

Vitalii
  • Why don't you just create a new file every minute, add one row (in memory) every second, and then write it to a file? Then start over again with a new file. You can eventually concatenate them (if you have to), but keeping several small files is probably much faster and more memory-efficient, and by only writing out one file every minute you save a lot of IO time. – MSeifert Oct 26 '17 at 17:45
  • This seems a bit of an X/Y problem. Logs are normally stored in plain text (easier to peruse) and rotated once the file becomes too large (with the older files compressed). Alternatively, you may want to think about using a database instead, which sounds like a much better alternative to an Astropy table in this case. It also really depends on what kind of data you want to store. –  Oct 27 '17 at 00:15
  • @MSeifert The idea is to have a single file with one table, the log includes the metadata which will be needed and later will be used along with main data. Creating multiple files and then merging them seems slightly weird and inefficient especially taking into account that this is a kind of log. – Vitalii Oct 27 '17 at 13:41
  • @Evert These are logs of the telescope, astronomers like fits and anyway this is the official requirement which cannot be overcome, later these logs will be used along with main data for data reconstruction, so they cannot be in simple ascii file. Unfortunately databases cannot be used as well. The data will be mostly primitives (int32, int64, float, double) or their 1D arrays, so fits files are just fine for that. – Vitalii Oct 27 '17 at 13:45
  • Astropy supports CSV -> FITS file conversion, so you could store the rows as ASCII and convert them to FITS when the log is complete. That would minimize IO, especially the (very expensive) FITS IO. – MSeifert Oct 27 '17 at 13:49
  • @MSeifert Losing precision is not an option, and a float -> string -> float conversion is just not efficient, especially since I already have an array of binary data in the correct format (it was used to create the file initially) in which the appropriate values were replaced with new ones. I've already written a solution above, please see it. – Vitalii Oct 27 '17 at 14:17
  • Seems like you're storing actual data instead of logging your operations, so your terminology is confusing. Hence the confusion about why and how to store it. –  Oct 27 '17 at 17:06
  • @Evert In some sense it is data, not primary but rather auxiliary, though it is a pretty usual log which stores where the telescope was pointing in some particular moment (and a lot of other stuff). Then these logs are used to make some high level analysis on primary data, calibrations, etc. So I consider my terminology to be just fine and actually my question was really simple: I have fits file and I want to append to existing binary table a new row (of exactly same structure) but with updated values. Please see my examples in my solution above it is pretty straightforward. – Vitalii Oct 27 '17 at 17:27
  • If you have a solution, best is to answer your own question (and accept it), since this is a question - answer site. Updating your question with a solution doesn't actually make it clear your problem is solved: the solution has now become part of the question, and it's unclear what the actual question is. –  Oct 31 '17 at 23:36

1 Answer


Solution: this is a trivial task for the cfitsio library (function fits_insert_rows(...)), so I use a Python module that is based on it: https://github.com/esheldon/fitsio

Here is the solution using fitsio. To create a new FITS file, one can do:

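import numpy
from fitsio import FITS

# 'rw' opens the file read-write and creates it if it does not exist
fits = FITS('test.fits','rw')
# one-row structured array; each field becomes a table column
data = numpy.zeros(1, dtype=[('index','i4')])
data[0]['index'] = 1
# writing a structured array creates a binary table extension
fits.write(data)
fits.close()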

To append a row:

fits = FITS('test.fits','rw')
# you can actually use the same already opened fits file,
# to flush the changes you just need: fits.reopen()
data = numpy.zeros(1, dtype=[('index','i4')])
data[0]['index'] = 2
# extension 1 is the binary table; the new row must match its columns
fits[1].append(data)
fits.close()

Thank you for your help.

Vitalii