First things first, I've already read the following:
- OSError 24 (Too many open files) when reading bunch of FITS with astropy.io
- https://astropy.readthedocs.io/en/latest/io/fits/appendix/faq.html#i-m-opening-many-fits-files-in-a-loop-and-getting-oserror-too-many-open-files
I also followed some further links from the first one, but none of the suggestions worked...
My problem is with opening large (>80 MB each) and numerous (~3000) FITS files in a Jupyter Notebook. The relevant code snippet is the following:
# Imports used by this snippet
import gc
import sys
from datetime import datetime
import astropy.units as u
from astropy import coordinates, wcs
from astropy.io import fits
from astropy.nddata import Cutout2D

# Dictionary to store NxN data matrices of cropped image tiles
CroppedObjects = {}
# Definitions of some other variables used here...
# ...
# Iterate over all images ('j') that contain the current object, indexed by 'i'
for i in range(0, len(finalObjects)):
    for j in range(0, len(containingImages[containedObj[i]])):
        countImages += 1
        # Path to the current image: 'mnt/...'
        current_image_path = ImagePaths[int(containingImages[containedObj[i]][j])]
        # Open .fits images
        with fits.open(current_image_path, memmap=False) as hdul:
            # Collect image data
            image_data = fits.getdata(current_image_path)
            # Collect WCS data from the current .fits file's header
            ImageWCS = wcs.WCS(hdul[1].header)
            # Cropping parameters:
            # 1. Sky coordinates of the object to crop
            # 2. Size of the crop, already defined above
            Coordinates = coordinates.SkyCoord(finalObjects[i][1]*u.deg, finalObjects[i][2]*u.deg, frame='fk5')
            size = (cropSize*u.pixel, cropSize*u.pixel)
            try:
                # Cut out the image tile
                cutout = Cutout2D(image_data, position=Coordinates, size=size, wcs=ImageWCS, mode='strict')
                # Name under which the cutout is stored
                cutout_filename = "Cropped_Images_Sorted/Cropped_" + str(containedObj[i]) + current_image_path[-23:]
                # Save data to dictionary
                CroppedObjects[cutout_filename] = cutout.data
                foundImages += 1
            except:
                pass
            else:
                del image_data
                continue
        # Memory maintenance
        gc.collect()
        # Progress bar
        sys.stdout.write("\rProgress: [{0}{1}] {2:.3f}%\tElapsed: {3}\tRemaining: {4} {5}".format(
            u'\u2588' * int(countImages/allCrops * progressbar_width),
            u'\u2591' * (progressbar_width - int(countImages/allCrops * progressbar_width)),
            countImages/allCrops * 100,
            datetime.now()-starttime,
            (datetime.now()-starttime)/countImages * (allCrops - countImages),
            foundImages))
        sys.stdout.flush()
So it actually does three things:
- Opens a particular FITS file
- Cuts a square out of it (but with mode='strict', so if the arrays only overlap partly, the try statement jumps to the next step in the loop)
- Updates the progress bar
Then it goes to the next file, does the same things, and so iterates over all of my FITS files.
But when I run this code, it stops after approximately 1000 found pictures and raises OSError: [Errno 24] Too many open files on the line:
image_data = fits.getdata(current_image_path)
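Since the failure shows up after roughly 1000 files, it may be worth comparing that number with the per-process limit on open file descriptors, whose soft value is often 1024 by default on Linux. Below is a minimal sketch for checking it, assuming a Unix-like system (the standard-library resource module is not available on Windows). The helper name open_file_limits is hypothetical and not part of the code above.

```python
import resource  # standard library, Unix-only


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft_limit, hard_limit = open_file_limits()
# A value equal to resource.RLIM_INFINITY means "no limit".
print(f"soft limit: {soft_limit}, hard limit: {hard_limit}")
```

If the soft limit is close to the number of files processed before the crash, that would suggest that one descriptor per file is being left open.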
I tried everything that was supposed to solve the problem, but nothing helped: not even setting memory mapping to False, or using fits.getdata and gc.collect(). I also tried many minor changes, like running without the try statement, that is, cutting out all of the image tiles without any limitations. The del inside the else statement is just another miserable attempt of mine.
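To find out which of these attempts actually changes anything, one option is to measure how many file descriptors a single call leaves open. The following is only a sketch under assumptions: it relies on /proc/self/fd, so it is Linux-specific, and the names count_open_fds, fd_growth, leaky and tidy are hypothetical helpers for illustration. Plain open() on a temporary file stands in for the FITS calls.

```python
import os
import tempfile


def count_open_fds():
    """Count this process's open file descriptors (Linux-specific: /proc/self/fd)."""
    return len(os.listdir("/proc/self/fd"))


def fd_growth(func, *args, **kwargs):
    """Call func and return how many more descriptors are open afterwards."""
    before = count_open_fds()
    func(*args, **kwargs)
    return count_open_fds() - before


leaked = []  # keeps references so the handles are not closed by garbage collection


def leaky(path):
    """Open a file and never close it."""
    leaked.append(open(path, "rb"))


def tidy(path):
    """Open a file in a with block, so it is closed on exit."""
    with open(path, "rb") as fh:
        fh.read(1)


with tempfile.NamedTemporaryFile() as tmp:
    print("leaky:", fd_growth(leaky, tmp.name))
    print("tidy:", fd_growth(tidy, tmp.name))

# Clean up the handles that were left open on purpose
for fh in leaked:
    fh.close()
leaked.clear()
```

Moving the body of one loop iteration into a function and passing it to fd_growth should show whether each processed FITS file leaves a handle behind.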
What else can I try to make this finally work?
Also, feel free to ask if something's not clear! I'll do my best to help you understand the problem.