
I have a function that fills NaN with 0 in the Value column of a stacked dataframe for a single value of the Indicator column. I would like to change it so that it works with a list of indicators for which NaN values should be set to zero.

import pandas as pd
import numpy as np

df = pd.DataFrame({'ISO3': ['Australia', 'Austria', 'Belgium', 'Canada', 'Australia', 'Austria', 'Belgium', 'Canada'], 
                   'Year': [1991, 1991, 1991, 1991, 1991, 1991, 1991, 1991],
                   'Indicator' : ['Disaster Fatalities', 'Disaster Fatalities', 'Disaster Fatalities', 'Disaster Fatalities', 'Oil Reserves', 'Oil Reserves', 'Oil Reserves', 'Oil Reserves' ],
                   'Value' : [np.nan, 5, np.nan, 18, np.nan, np.nan, np.nan, np.nan]
                  })
df.head(8)

Gives:

        ISO3  Year            Indicator  Value
0  Australia  1991  Disaster Fatalities    NaN
1    Austria  1991  Disaster Fatalities    5.0
2    Belgium  1991  Disaster Fatalities    NaN
3     Canada  1991  Disaster Fatalities   18.0
4  Australia  1991         Oil Reserves    NaN
5    Austria  1991         Oil Reserves    NaN
6    Belgium  1991         Oil Reserves    NaN
7     Canada  1991         Oil Reserves    NaN

Function for setting NaN values to zero for a single indicator:

def zerofillnaindicator (df, Indicators):
    mask = (df['Indicator'] == Indicators)
    df.loc[mask, 'Value'] = df.loc[mask, 'Value'].fillna(0)
    return df

Called with

df2 = zerofillnaindicator (df = df, Indicators = 'Disaster Fatalities')
df2.head(8)

Gives as desired:


        ISO3  Year            Indicator  Value
0  Australia  1991  Disaster Fatalities    0.0
1    Austria  1991  Disaster Fatalities    5.0
2    Belgium  1991  Disaster Fatalities    0.0
3     Canada  1991  Disaster Fatalities   18.0
4  Australia  1991         Oil Reserves    NaN
5    Austria  1991         Oil Reserves    NaN
6    Belgium  1991         Oil Reserves    NaN
7     Canada  1991         Oil Reserves    NaN

But how do I change this to take a list of Indicators like this:

df2 = zerofillnaindicator (df = df, Indicators = ['Disaster Fatalities', 'Oil Reserves'])
df2.head(8)

I tried replacing the mask condition with df.isin(Indicators), but this resulted in a 'Cannot index with multidimensional key' error:

def zerofillnaindicator (df, Indicators):
    mask = df.isin(Indicators)
    df.loc[mask, 'Value'] = df.loc[mask, 'Value'].fillna(0)
    return df
Laurens

1 Answer


If the same function needs to work with either a scalar or a list, use isinstance to convert a scalar to a one-element list:

def zerofillnaindicator (df, Indicators):
    vals = [Indicators] if isinstance(Indicators, str) else Indicators

    mask = df['Indicator'].isin(vals)
    df.loc[mask, 'Value'] = df.loc[mask, 'Value'].fillna(0)
    return df

df2 = zerofillnaindicator (df = df, Indicators = 'Disaster Fatalities')
print (df2)
        ISO3  Year            Indicator  Value
0  Australia  1991  Disaster Fatalities    0.0
1    Austria  1991  Disaster Fatalities    5.0
2    Belgium  1991  Disaster Fatalities    0.0
3     Canada  1991  Disaster Fatalities   18.0
4  Australia  1991         Oil Reserves    NaN
5    Austria  1991         Oil Reserves    NaN
6    Belgium  1991         Oil Reserves    NaN
7     Canada  1991         Oil Reserves    NaN

df3 = zerofillnaindicator (df = df, Indicators = ['Disaster Fatalities', 'Oil Reserves'])
print (df3)
        ISO3  Year            Indicator  Value
0  Australia  1991  Disaster Fatalities    0.0
1    Austria  1991  Disaster Fatalities    5.0
2    Belgium  1991  Disaster Fatalities    0.0
3     Canada  1991  Disaster Fatalities   18.0
4  Australia  1991         Oil Reserves    0.0
5    Austria  1991         Oil Reserves    0.0
6    Belgium  1991         Oil Reserves    0.0
7     Canada  1991         Oil Reserves    0.0
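
As a side note on why the attempt in the question failed: the sketch below compares the mask produced by calling isin on the whole dataframe with the one produced by calling it on the Indicator column only. The two-row frame and the names frame_mask and row_mask are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the full data from the question).
df = pd.DataFrame({
    'Indicator': ['Disaster Fatalities', 'Oil Reserves'],
    'Value': [np.nan, np.nan],
})
indicators = ['Disaster Fatalities', 'Oil Reserves']

# DataFrame.isin tests every cell, so the result is a 2-D boolean DataFrame.
frame_mask = df.isin(indicators)
print(frame_mask.shape)  # (2, 2)

# Series.isin tests one column, so the result is a 1-D boolean Series
# with one entry per row, which is what df.loc[mask, 'Value'] needs.
row_mask = df['Indicator'].isin(indicators)
print(row_mask.shape)  # (2,)
```

The row selector in df.loc[mask, 'Value'] has to be one-dimensional, which is why the mask is built from df['Indicator'] rather than from df.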
jezrael
  • Do you mean like this: 'mask = np.array(df.isin(Indicators).values)'? I still get ValueError: too many values to unpack (expected 1) – Laurens May 09 '19 at 10:48
  • Wonderful! Now I just have to study the code to understand it:-) – Laurens May 09 '19 at 11:39
  • @Laurens - It uses the [ternary operator](https://stackoverflow.com/questions/394809/does-python-have-a-ternary-conditional-operator) – jezrael May 09 '19 at 11:40
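
For readers studying the one-line conditional in the answer, here is a minimal sketch that spells it out as an ordinary if/else. The helper name as_list is hypothetical and is not part of the answer's code.

```python
def as_list(indicators):
    """Hypothetical helper: return a list whether given one string or several."""
    # Ternary form, as used in the answer:
    #     vals = [indicators] if isinstance(indicators, str) else indicators
    # Equivalent if/else form:
    if isinstance(indicators, str):
        vals = [indicators]   # wrap a single string in a one-element list
    else:
        vals = indicators     # assume it is already a list-like of strings
    return vals


print(as_list('Disaster Fatalities'))
# ['Disaster Fatalities']
print(as_list(['Disaster Fatalities', 'Oil Reserves']))
# ['Disaster Fatalities', 'Oil Reserves']
```

The string check matters because a bare string passed to Series.isin is rejected by pandas, which expects a list-like of values.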