Saving and loading neupy algorithm with dill library can return different predictions for the same time period?

Question

First of all, thank you for reading this, and thank you in advance if you can help. This is the algorithm that I'm using for supervised learning:

# Define neural network
cgnet = algorithms.LevenbergMarquardt(
    connection=[
        layers.Input(XTrain.shape[1]),
        layers.Relu(6),
        layers.Linear(1)
    ],
    mu_update_factor=2,
    mu=0.1,
    shuffle_data=True,
    verbose=True,
    decay_rate=0.1,
    addons=[algorithms.WeightElimination]
)

Cross validation results are good (k=10):

[0.16767815652364237, 0.13396493112368024, 0.19033966833586402, 0.12023567250054788, 0.11826824035439124, 0.13115856672872392, 0.14250003819441104, 0.12729442202775898, 0.31073760721487326, 0.19299511349686768]
[0.9395976956178138, 0.9727526340820827, 0.9410503161549465, 0.9740922179654977, 0.9764171089773663, 0.9707258917808179, 0.9688830174583372, 0.973160633351555, 0.8551738446276884, 0.936661707991699]
MEA: 0.16 (+/- 0.11)
R2: 0.95 (+/- 0.07)

After training I have saved the algorithm with dill:

with open('network-storage.dill', 'wb') as f:
    dill.dump(cgnet, f)

Then, if I load the network with dill and predict on the X values of the entire training set, I get the same R2 (0.9691), so up to this point everything is OK. These are the results:

If I try to do the same thing but with only the last few years [2018-2022], I get this (prediction of y with X training values from 2018 to 2022):

Instead of this (prediction of y with X training values from 1992 to 2022):

Why do I get different predictions for the same period when I load a different range of X values? With X input from 1992 to 2022, the y prediction for 1992 to 2022 is OK. With X input from 2018 to 2022, the y prediction for 2018 to 2022 is not OK.

This is the code:

import numpy as np
import pandas as pd
import datetime as dt
import matplotlib.pyplot as plt
import dill
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn import metrics
from sklearn.model_selection import KFold
from scipy.interpolate import Rbf
from scipy import stats
from neupy import layers, environment, algorithms
from neupy import plots


# Import data 
data = pd.read_excel('DataAL_Incremento.xlsx', index_col=0, header=1).iloc[:,[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,-1]]
data.columns = ['PPO4L(in)','PPO4(in)','NH4L(in)','NH4(in)','NO3L(in)','NNO3(in)','CBOL(in)', 'CBO(in)','Temp(In)','Temp(alb)','Tair ','Tdew',
                'Wvel','Cl_aL(in)','Cl_a(in)','ODL(in)','OD(in)','Qin(in)','ODalb','PPO4(alb)','NNO3(alb)']


# Add filtered data
tmp0 = data.iloc[:,[9, 6, 14]].rolling(9, center=False, axis=0).mean()
tmp0.columns = ['Temp(alb)_09','CBOL(in)_09','Cl_a(in)_09']
tmp1 = data.iloc[:,[9, 6, 14]].rolling(15, center=False, axis=0).mean()
tmp1.columns = ['Temp(alb)_15', 'CBOL(in)_15','Cl_a(in)_15']
tmp2 = data.iloc[:,[9, 6, 14]].rolling(31, center=False, axis=0).mean()
tmp2.columns = ['Temp(alb)_31', 'CBOL(in)_31','Cl_a(in)_31']
data = pd.concat((data, tmp0, tmp1, tmp2), axis=1)

# Drop empty records
data = data.dropna()

# Define data
X = data.loc[:, ['CBOL(in)', 'CBO(in)','Temp(In)','Temp(alb)','Tair ','Cl_aL(in)','Cl_a(in)','OD(in)','Temp(alb)_31', 'CBOL(in)_31','Cl_a(in)_31']]

y = data.loc[:, ['ODalb']]


years = data.index.year
yearsTrain = range(1992,2022)
yearsTest = 2019,2020,2021

#yearsTrain, yearsTest = train_test_split(np.unique(years), test_size=0.2, train_size=0.8, random_state=None)

XTrain = X.query('@years in @yearsTrain')
yTrain = y.query('@years in @yearsTrain').values.ravel()
XTest = X.query('@years in @yearsTest')
yTest = y.query('@years in @yearsTest').values.ravel()

results = y.query('@years in @yearsTest')


#===============================================================================
# Neural network
#===============================================================================

# Define neural network
cgnet = algorithms.LevenbergMarquardt(
    connection=[
        layers.Input(XTrain.shape[1]),
        layers.Relu(6),
        layers.Linear(1)
    ],
    mu_update_factor=2,
    mu=0.1,
    shuffle_data=True,
    verbose=True,
    decay_rate=0.1,
    addons=[algorithms.WeightElimination]
)

# Scale
XScaler = StandardScaler()
XScaler.fit(XTrain)
XTrainScaled = XScaler.transform(XTrain)
XTestScaled = XScaler.transform(XTest)

yScaler = StandardScaler()
yScaler.fit(yTrain.reshape(-1, 1))
yTrainScaled = yScaler.transform(yTrain.reshape(-1, 1)).ravel()
yTestScaled = yScaler.transform(yTest.reshape(-1, 1)).ravel()

# Train 
cgnet.train(XTrainScaled, yTrainScaled, XTestScaled, yTestScaled, epochs=30)

yEstTrain = yScaler.inverse_transform(cgnet.predict(XTrainScaled).reshape(-1, 1)).ravel()
mae = np.mean(np.abs(yTrain-yEstTrain))
results['ANN'] = yScaler.inverse_transform(cgnet.predict(XTestScaled).reshape(-1, 1)).ravel()

# Metrics
mse  = np.mean((yTrain-yEstTrain)**2)
mseTes = np.mean((yTest-results['ANN'])**2)
maeTes = np.mean(np.abs(yTest-results['ANN']))
meantrain = np.mean(yTrain)
ssTrain = (yTrain-meantrain)**2  # squared deviations of the training targets
r2 = 1 - mse/np.mean(ssTrain)
meantest = np.mean(yTest)
ssTest = (yTest-meantest)**2  # squared deviations of the test targets
r2Tes = 1 - mseTes/np.mean(ssTest)


# Plot results
print("NN MAE: %f (All), %f (Test) " % (mae, maeTes))
print ("NN MSE: %f (All), %f (Test) " % (mse, mseTes))
print ("NN R2: %f (All), %f (Test) " % (r2, r2Tes))

results.plot()
plt.show(block=True)

plots.error_plot(cgnet)
plt.show(block=True)

plt.scatter(yTest,results['ANN'])
plt.xlabel('True Values')
plt.ylabel('Predictions')

plt.show(block=True)


#===============================================================================
# Save algorithms - Neural network
#===============================================================================

with open('network-storage.dill', 'wb') as f:
    dill.dump(cgnet, f)

#===============================================================================
# Load algorithms - Neural network
#===============================================================================

#Prepare data

dataVal = pd.read_excel('DataAL_IncrementoTeste.xlsx', index_col=0, header=1).iloc[:,[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,-1]]

dataVal.columns = ['PPO4L(in)','PPO4(in)','NH4L(in)','NH4(in)','NO3L(in)','NNO3(in)','CBOL(in)', 'CBO(in)','Temp(In)','Temp(alb)','Tair ','Tdew',
                   'Wvel','Cl_aL(in)','Cl_a(in)','ODL(in)','OD(in)','Qin(in)','ODalb','PPO4(alb)','NNO3(alb)']


# Add filtered data
tmp0 = dataVal.iloc[:,[9, 6, 14]].rolling(9, center=False, axis=0).mean()
tmp0.columns = ['Temp(alb)_09','CBOL(in)_09','Cl_a(in)_09']
tmp1 = dataVal.iloc[:,[9, 6, 14]].rolling(15, center=False, axis=0).mean()
tmp1.columns = ['Temp(alb)_15', 'CBOL(in)_15','Cl_a(in)_15']
tmp2 = dataVal.iloc[:,[9, 6, 14]].rolling(31, center=False, axis=0).mean()
tmp2.columns = ['Temp(alb)_31', 'CBOL(in)_31','Cl_a(in)_31']
dataVal = pd.concat((dataVal, tmp0, tmp1, tmp2), axis=1)

# Drop empty records (rows with NaN, e.g. from the rolling windows)
dataVal = dataVal.dropna()

# Define data
Xval = dataVal.loc[:, ['CBOL(in)', 'CBO(in)','Temp(In)','Temp(alb)','Tair ','Cl_aL(in)','Cl_a(in)','OD(in)','Temp(alb)_31', 'CBOL(in)_31','Cl_a(in)_31']]
yval = dataVal.loc[:, ['ODalb']]

years = dataVal.index.year
yearsTrain = range(2018,2022)

XFinalVal = Xval.query('@years in @yearsTrain')
yFinalVal = yval.query('@years in @yearsTrain').values.ravel()
resultsVal = yval.query('@years in @yearsTrain')


# Load algorithms 
with open('network-storage.dill', 'rb') as f:
    cgnet = dill.load(f)
# Scale X
    XScaler = StandardScaler()
    XScaler.fit(XFinalVal)
    XFinalScaled = XScaler.transform(XFinalVal)

# Scale y  
    yScaler = StandardScaler()
    yScaler.fit(yFinalVal.reshape(-1, 1))
    yTrainScaled = yScaler.transform(yFinalVal.reshape(-1, 1)).ravel()
# Predict
    y_predicted = yScaler.inverse_transform(cgnet.predict(XFinalScaled).reshape(-1, 1)).ravel()

    resultsVal['ANN'] = y_predicted
    scoreMean = metrics.mean_absolute_error(yFinalVal, y_predicted)
    scoreR2 = metrics.r2_score(yFinalVal, y_predicted)


print(scoreMean)
print(scoreR2)


plt.scatter(yFinalVal,y_predicted)

plt.xlabel('True Values')
plt.ylabel('Predictions')

plt.show(block=True)

resultsVal.plot()
plt.show(block=True)


#===============================================================================
# Cross validation - Neural network
#===============================================================================
XScaler = StandardScaler()
XScaler.fit(XTrain)
XTrainScaled = XScaler.transform(XTrain)
XTestScaled = XScaler.transform(XTest)

yScaler = StandardScaler()
yScaler.fit(yTrain.reshape(-1, 1))
yTrainScaled = yScaler.transform(yTrain.reshape(-1, 1)).ravel()
yTestScaled = yScaler.transform(yTest.reshape(-1, 1)).ravel()

kfold = KFold(n_splits=10, shuffle=True, random_state=None)
scoresMean = []   
scoresR2 = [] 

for train, test in kfold.split(XTrainScaled):
    x_train, x_test = XTrainScaled[train], XTrainScaled[test]
    y_train, y_test = yTrainScaled[train], yTrainScaled[test]

    cgnet = algorithms.LevenbergMarquardt(
        connection=[
            layers.Input(XTrain.shape[1]),
            layers.Relu(6),
            layers.Linear(1)
        ],
        mu_update_factor=2,
        mu=0.1,
        shuffle_data=True,
        verbose=True,
        decay_rate=0.1,
        addons=[algorithms.WeightElimination]
    )

    cgnet.train(x_train, y_train, epochs=100)
    y_predicted = cgnet.predict(x_test)

    scoreMean = metrics.mean_absolute_error(y_test, y_predicted)
    scoreR2 = metrics.r2_score(y_test, y_predicted)
    scoresMean.append(scoreMean)
    scoresR2.append(scoreR2)

print(scoresMean)
print(scoresR2)
scoresMean = np.array(scoresMean)
scoresR2 = np.array(scoresR2)

print("MEA: %0.2f (+/- %0.2f)" % (scoresMean.mean(), scoresMean.std() * 2))

print("R2: %0.2f (+/- %0.2f)" % (scoresR2.mean(), scoresR2.std() * 2))

Hi I'm the `dill` author. That's a lot of code to parse through, and it's fairly specific to the learning code... so, that alone will limit the responses you get. I don't know if there's a way that you can distill your question down a bit to the most essential parts. The simpler you can make your code, the more likely someone will answer (as it's easier to diagnose minimal code). — Mike McKerns, Nov 28 '18 at 16:05
In the abstract, what I can think of that could potentially cause errors in your results due to a dump/load is that the class that you are pickling (from `neupy`, I gather) may not store all of the important state when it's pickled... but includes defaults that are used when a new object is created. That could be a reason for seeing different results before and after pickling. You can try pickling the most minimal instances of your models to see if any results change, and try to better diagnose what the issue is. This is just a total guess on my part, as I don't have any evidence of it. — Mike McKerns, Nov 28 '18 at 16:10

score 2 · Accepted Answer · answered Nov 29 '18 at 14:01

I think that one of the problems might be with the scaling that you apply before training. In the training stage you fit the scaler using the training data:

XScaler = StandardScaler()
XScaler.fit(XTrain)

But after loading the network with dill, you fit a new scaler on different data (the validation data specifically):

XScaler = StandardScaler()
XScaler.fit(XFinalVal)

In the second case, you use a different scaling for prediction, one that the network hasn't seen during training. The new scaling can produce a different distribution of samples compared to the one that the network expects.
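To illustrate the effect, here is a minimal sketch with synthetic numbers (not the asker's data). The `standardize` helper is a hypothetical stand-in that applies the same per-column formula as `StandardScaler` (subtract the mean, divide by the population standard deviation). The same raw values end up as different network inputs depending on which data the statistics come from:

```python
import numpy as np

# Synthetic stand-in for one input feature over the full training period.
full_period = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0])
# The last rows play the role of the 2018-2022 subset.
recent_period = full_period[-3:]


def standardize(values, reference):
    """Scale `values` with the mean and population std of `reference`."""
    return (values - reference.mean()) / reference.std()


# What the network saw during training: statistics from the full period.
scaled_with_training_stats = standardize(recent_period, full_period)
# What it receives after refitting the scaler on the subset only.
scaled_with_refit_stats = standardize(recent_period, recent_period)

print(scaled_with_training_stats)
print(scaled_with_refit_stats)
```

The two printed arrays differ even though the raw values are identical, which is probably why the predictions for the same period change.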

To make the results of training reproducible, you also need to save XScaler and load it at the same time as you load the network.

Everything that I've described is also true for yScaler.
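One way to do that, as a rough sketch, is to store the network and both fitted scalers in a single file. The names `save_bundle`, `load_bundle` and the file name `network-bundle.dill` are made up for this example, and the plain dicts below are only placeholders for `cgnet`, `XScaler` and `yScaler`. The code falls back to the standard `pickle` module when `dill` is not installed, since both expose the same `dump`/`load` calls:

```python
import os
import tempfile

try:
    import dill as serializer  # same library as in the question
except ImportError:
    import pickle as serializer  # stdlib fallback with the same dump/load calls


def save_bundle(path, network, x_scaler, y_scaler):
    """Write the network and both fitted scalers to one file."""
    bundle = {"network": network, "x_scaler": x_scaler, "y_scaler": y_scaler}
    with open(path, "wb") as f:
        serializer.dump(bundle, f)


def load_bundle(path):
    """Read back the dict written by save_bundle."""
    with open(path, "rb") as f:
        return serializer.load(f)


# Round trip with placeholder objects standing in for cgnet, XScaler, yScaler.
demo_path = os.path.join(tempfile.mkdtemp(), "network-bundle.dill")
save_bundle(demo_path, {"name": "cgnet placeholder"}, {"mean": 15.0}, {"mean": 7.5})
restored = load_bundle(demo_path)
```

With the real objects, the loading side would then call only `restored["x_scaler"].transform(XFinalVal)` and `restored["y_scaler"].inverse_transform(...)`, without calling `fit` again on the new data.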

Solved, like you say it was just a scaling issue. Thank you very much — Manuel Almeida, Nov 29 '18 at 15:08
