In Python, you can use the pandas module to load the Excel file as a DataFrame
. Post this, it is easy to substitute the NaN
/missing values.
Let's say your excel is named madrid_air.xlsx
import pandas as pd
df = pd.read_excel('madrid_air.xlsx')
Post this, you will have what they call a DataFrame
which consists of the data in the excel file in the same tabular format with column names and index. In the DataFrame the missing values will be loaded as NaN
values. So in order to get the rows which contains NaN
values,
df_nan = df[df.isna()]
df_nan
will have the rows which has NaN
values in them.
Now if you want to fill all those NaN
values with let's say 0.
df_zerofill = df.fillna(0)
df_zerofill
will have the whole DataFrame with all the NaN
s substituted with 0.
In order to specifically fill coulmns use the coumn names.
df[['NO','NO_2']] = df[['NO','NO_2']].fillna(0)
This will fill the NO
and NO_2
columns' missing values with 0.
To read up more about DataFrame
: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html
To read up more about handling missing data in DataFrame
s : https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html