I have a function that takes a single value from the Indicator column and applies fillna(0) to the Value column of a stacked dataframe for the matching rows. I would like to change it to accept a list of indicators for which NaN values should be set to zero.
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'ISO3': ['Australia', 'Austria', 'Belgium', 'Canada', 'Australia', 'Austria', 'Belgium', 'Canada'],
    'Year': [1991, 1991, 1991, 1991, 1991, 1991, 1991, 1991],
    'Indicator': ['Disaster Fatalities', 'Disaster Fatalities', 'Disaster Fatalities', 'Disaster Fatalities', 'Oil Reserves', 'Oil Reserves', 'Oil Reserves', 'Oil Reserves'],
    'Value': [np.nan, 5, np.nan, 18, np.nan, np.nan, np.nan, np.nan]
})
df.head(8)
Gives:
        ISO3  Year            Indicator  Value
0  Australia  1991  Disaster Fatalities    NaN
1    Austria  1991  Disaster Fatalities    5.0
2    Belgium  1991  Disaster Fatalities    NaN
3     Canada  1991  Disaster Fatalities   18.0
4  Australia  1991         Oil Reserves    NaN
5    Austria  1991         Oil Reserves    NaN
6    Belgium  1991         Oil Reserves    NaN
7     Canada  1991         Oil Reserves    NaN
Function for setting NaN values to zero for a single indicator:
def zerofillnaindicator (df, Indicators):
    mask = (df['Indicator'] == Indicators)
    df.loc[mask, 'Value'] = df.loc[mask, 'Value'].fillna(0)
    return df
Called with
df2 = zerofillnaindicator (df = df, Indicators = 'Disaster Fatalities')
df2.head(8)
Gives as desired:
        ISO3  Year            Indicator  Value
0  Australia  1991  Disaster Fatalities    0.0
1    Austria  1991  Disaster Fatalities    5.0
2    Belgium  1991  Disaster Fatalities    0.0
3     Canada  1991  Disaster Fatalities   18.0
4  Australia  1991         Oil Reserves    NaN
5    Austria  1991         Oil Reserves    NaN
6    Belgium  1991         Oil Reserves    NaN
7     Canada  1991         Oil Reserves    NaN
But how do I change this to take a list of Indicators like this:
df2 = zerofillnaindicator (df = df, Indicators = ['Disaster Fatalities', 'Oil Reserves'])
df2.head(8)
I tried replacing the mask condition with df.isin(Indicators), but this resulted in a 'Cannot index with multidimensional key' error:
def zerofillnaindicator (df, Indicators):
    mask = df.isin(Indicators)
    df.loc[mask, 'Value'] = df.loc[mask, 'Value'].fillna(0)
    return df
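For reference, the likely cause is that df.isin(Indicators) is applied to the whole dataframe, so it returns a two-dimensional boolean dataframe rather than a one-dimensional row mask, and .loc cannot use that as a row indexer. A minimal sketch of one possible fix, assuming the goal is to match only on the Indicator column, is to call .isin on that column instead. The isinstance check that wraps a single string into a list is an addition for illustration and not part of the original function:

```python
import numpy as np
import pandas as pd


def zerofillnaindicator(df, Indicators):
    # Assumption: also accept a single indicator name by wrapping it in a list.
    if isinstance(Indicators, str):
        Indicators = [Indicators]
    # Series.isin on one column yields a one-dimensional boolean mask,
    # which .loc accepts as a row indexer.
    mask = df['Indicator'].isin(Indicators)
    df.loc[mask, 'Value'] = df.loc[mask, 'Value'].fillna(0)
    return df


df = pd.DataFrame({
    'ISO3': ['Australia', 'Austria', 'Belgium', 'Canada',
             'Australia', 'Austria', 'Belgium', 'Canada'],
    'Year': [1991] * 8,
    'Indicator': ['Disaster Fatalities'] * 4 + ['Oil Reserves'] * 4,
    'Value': [np.nan, 5, np.nan, 18, np.nan, np.nan, np.nan, np.nan],
})

df2 = zerofillnaindicator(df=df, Indicators=['Disaster Fatalities', 'Oil Reserves'])
print(df2.head(8))
```

Note that, like the original function, this sketch modifies the passed dataframe in place, so df and df2 refer to the same object; pass df.copy() if the original should stay unchanged.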