I am trying to create a data frame out of a space-separated dataset. Some values in the 3rd column are missing and are labelled Missing_x. I'm trying to replace these values with np.nan, but it's throwing a ValueError.
from datetime import datetime
import pandas as pd
import numpy as np
data = ["1/3/2012 16:00:00 Missing_1",
"1/4/2012 16:00:00 27.47",
"1/5/2012 16:00:00 27.728",
"1/6/2012 16:00:00 28.19",
"1/9/2012 16:00:00 28.1",
"1/10/2012 16:00:00 28.15",
"12/13/2012 16:00:00 27.52",
"12/14/2012 16:00:00 Missing_19",
"12/17/2012 16:00:00 27.215",
"12/18/2012 16:00:00 27.63",
"12/19/2012 16:00:00 27.73",
"12/20/2012 16:00:00 Missing_20",
"12/21/2012 16:00:00 27.49",
"12/24/2012 13:00:00 27.25",
"12/26/2012 16:00:00 27.2",
"12/27/2012 16:00:00 27.09",
"12/28/2012 16:00:00 26.9",
"12/31/2012 16:00:00 26.77"]
date_list = []
mrc_list = []
for i in data:
    data = i.split('\t')
    days_of_data = datetime.strptime(data[0], '%m/%d/%Y %H:%M:%S')
    date_list.append(days_of_data)
    try:
        mrc_list.append(float(data[1]))
    except:
        mrc_list.append(np.nan)
        pass
mrc_df = pd.Series(mrc_list, index=date_list)
mrc_df.index.name = 'Date'
print(mrc_df)
This is the error:
Traceback (most recent call last):
  File "/home/onur/Documents/code-signal/mercury.py", line 37, in <module>
    days = datetime.strptime(data_list[0], '%m/%d/%Y %H:%M:%S')
  File "/home/onur/anaconda3/lib/python3.7/_strptime.py", line 577, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/home/onur/anaconda3/lib/python3.7/_strptime.py", line 362, in _strptime
    data_string[found.end():])
ValueError: unconverted data remains: Missing_1
I understand the error. I just don't understand why my way of addressing it does not work.
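A possible explanation, assuming the records really are separated by spaces as in the list above: i.split('\t') finds no tab character, so it returns a one-element list and data[0] is still the whole line. strptime then fails on the trailing "Missing_1" before the try/except around float() is ever reached. The sketch below splits on whitespace instead. The helper name parse_line is made up for illustration, and math.nan is used as a stand-in for np.nan (both are the ordinary float NaN).

```python
from datetime import datetime
import math


def parse_line(line):
    """Parse one 'date time value' record (hypothetical helper)."""
    # str.split() with no argument splits on any run of whitespace,
    # so it works for spaces as well as tabs.
    date_part, time_part, raw_value = line.split()
    timestamp = datetime.strptime(f"{date_part} {time_part}", "%m/%d/%Y %H:%M:%S")
    try:
        value = float(raw_value)
    except ValueError:
        # Labels such as 'Missing_1' are not numeric, so float() raises here.
        value = math.nan
    return timestamp, value


sample = ["1/3/2012 16:00:00 Missing_1", "1/4/2012 16:00:00 27.47"]
parsed = [parse_line(line) for line in sample]
```

The resulting timestamps and values could then be collected into two lists and passed to pd.Series(values, index=dates) as in the code above.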