It's my first time deploying a model. I've created a CNN model using TensorFlow, Keras, and Xception, and the saved model is about 80 MB. When I load it inside a function and run a prediction, it takes about 4-5 seconds. Is there a way to reduce this time? Does the model have to be loaded for every prediction?

[Image: screenshot of the prediction function]

1 Answer

The model only needs to be loaded once in your program; every prediction then reuses the already loaded model. The prediction itself may still take some time, but TensorFlow does not reload the model each time you predict. A better approach is to save only the weights after training and, for inference, build the model architecture in code and then load the saved weights into it.

  • I've added an image of my function. model=load_model('model.h5') is inside the function. I have to deploy multiple models on a website, so there is another .py file which imports this function. How do I go about it? – sonam agarwal Mar 01 '21 at 08:42
  • The best practice is to write a singleton class that loads the model in its constructor and exposes a predict method. A singleton is initialized only once, so the model is loaded just once, at the startup of your application. This question is helpful: https://stackoverflow.com/questions/6760685/creating-a-singleton-in-python – Sadegh Ranjbar Mar 01 '21 at 09:18
  • Thanks, I'll try that. – sonam agarwal Mar 01 '21 at 10:06
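
The singleton idea from the comments could look roughly like the sketch below. It is only an illustration under some assumptions: the class name `ModelService`, its `get` method, and the `loader` parameter are hypothetical and not part of Keras. Because the asker mentions several models, the sketch keeps one shared instance per model path rather than a single global one. The only real library call used is `tensorflow.keras.models.load_model`, which is imported lazily inside the default loader.

```python
import threading


class ModelService:
    """Hypothetical holder that loads each model file at most once per process."""

    _instances = {}            # model_path -> ModelService
    _lock = threading.Lock()   # guards _instances against concurrent first calls

    def __init__(self, model_path, loader=None):
        self.model_path = model_path
        load = loader or _default_loader
        # The slow step: runs once per model path, not once per prediction.
        self.model = load(model_path)

    @classmethod
    def get(cls, model_path, loader=None):
        """Return the shared instance for model_path, creating it on first use."""
        # The lock is held while loading, so first-time loads are serialized.
        with cls._lock:
            if model_path not in cls._instances:
                cls._instances[model_path] = cls(model_path, loader)
            return cls._instances[model_path]

    def predict(self, batch):
        # Reuses the model that is already in memory.
        return self.model.predict(batch)


def _default_loader(path):
    # Imported here so that importing this module does not import TensorFlow.
    from tensorflow.keras.models import load_model
    return load_model(path)
```

With something like this in place, the function from the question would call `ModelService.get('model.h5').predict(x)` instead of `load_model('model.h5')`, and any other .py file importing it would share the same loaded model. Calling `ModelService.get(...)` once for each model when the web application starts moves the loading cost out of the first request.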