I have a set of 5 files in .npz format. I need to extract the NumPy arrays from these files one at a time and use each one to train a model. After loading the first array into memory and training the model on it, I try to release it by slicing it down to its first row, but the memory consumed does not go down. Because of this, I cannot load the second array and eventually get a MemoryError.
How do I make sure that the memory is freed after training the model?
PS: X_test and y_test are very small, so they can be ignored.
Code:
for person_id in range(1, 5):
    print "Initial memory ", resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    temp1 = np.load("../final_data/speaker_input" + str(person_id))
    X_train = temp1['arr_0']
    y_train = np.load("../final_data/speaker_final_output" + str(person_id) + ".npy")
    X_test, y_test = data.Test(person_id=1)
    print "Input dimension ", X_train.shape
    print "Output dimension", y_train.shape
    print "Before training ", resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    lipreadtrain.train(model=net, X_train=X_train, y_train=y_train, X_test=X_test, y_test=y_test)
    print "After training ", resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    X_train = X_train[:1]
    y_train = y_train[:1]
    X_test = X_test[:1]
    y_test = y_test[:1]
    print len(X_train), len(y_train), len(X_test), len(y_test)
    gc.collect()
    temp1.close()
Output:
Initial memory 861116
Input dimension (8024, 50, 2800)
Output dimension (8024, 53)
Before training 9642152
Training the model, which will take a long long time...
Epoch 1/1
8024/8024 [==============================] - 42s - loss: nan - acc: 0.2316
----- Training Takes 42.3187870979 Seconds -----
Finished!
After training 9868080
1 1 0 0
Initial memory 9868080
Traceback (most recent call last):
  File "test.py", line 21, in <module>
    X_train = temp1['arr_0']
  File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/numpy/lib/npyio.py", line 224, in __getitem__
    pickle_kwargs=self.pickle_kwargs)
  File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/numpy/lib/format.py", line 661, in read_array
    array = numpy.empty(count, dtype=dtype)
MemoryError
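
A note on the slicing step in the code above: basic slicing in NumPy returns a view, so X_train = X_train[:1] still references the full original buffer and cannot free it. Separately, ru_maxrss is the peak resident set size, so the printed value never decreases even when memory is returned. The sketch below illustrates both points under stated assumptions: the array shape is a small hypothetical stand-in for the real data, current_rss_kb and slice_keeps_buffer_alive are illustrative helper names rather than library functions, and reading /proc/self/status is Linux-only. It does not cover any references that the training code itself may hold.

```python
import gc
import os

import numpy as np


def current_rss_kb():
    """Return the current resident set size in kB, or None if unavailable.

    Linux-only assumption: reads VmRSS from /proc/self/status. Unlike
    ru_maxrss (a peak value), this number can go down again.
    """
    status_path = "/proc/self/status"
    if not os.path.exists(status_path):
        return None
    with open(status_path) as status_file:
        for line in status_file:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return None


def slice_keeps_buffer_alive(array):
    """Return True if a basic slice of `array` shares memory with it."""
    view = array[:1]
    return np.shares_memory(view, array)


# Hypothetical stand-in for one loaded training array (much smaller shape).
X_train = np.zeros((200, 50, 280), dtype=np.float32)

# The slice is only a view, so rebinding X_train to it frees nothing.
pinned = slice_keeps_buffer_alive(X_train)

# Option 1: keep a small independent copy that owns its own data.
X_small = X_train[:1].copy()
owns_data = X_small.base is None

# Option 2: drop the name entirely once training has finished.
del X_train
gc.collect()

rss_after = current_rss_kb()
```

In the loop from the question, this would correspond to replacing the four slicing lines with del X_train, y_train (or taking .copy() of the slices if a small sample is still needed) before loading the next file, and printing the current rather than the peak memory figure to check the effect.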