-1

I have read this Pandas: convert type of column and this How to convert datatype:object to float64 in python? I have current output of df:

Day       object
Time      object
Open     float64
Close    float64
High     float64
Low      float64
                 Day          Time   Open  Close   High    Low
0      ['2019-03-25']  ['02:00:00']  882.2  882.6  884.0  882.1
1      ['2019-03-25']  ['02:01:00']  882.9  882.9  883.4  882.9
2      ['2019-03-25']  ['02:02:00']  882.8  882.8  883.0  882.7

So I can not use this:

day_=df.loc[df['Day'] == '2019-06-25']

My final purpose is to extract df by filtering the value of column "Day" by specific condition. I think the reason of df.loc above failed to excecute is that dtype of Day is object so I can not execute df.loc so I try to convert the above df to something like this:

               Day       Time   Open  Close   High    Low
0      2019-03-25  ['02:00:00']  882.2  882.6  884.0  882.1
1      2019-03-25  ['02:01:00']  882.9  882.9  883.4  882.9
2      2019-03-25  ['02:02:00']  882.8  882.8  883.0  882.7

I have tried:

df=pd.read_csv('output.csv')
df = df.convert_objects(convert_numeric=True)

#df['Day'] = df['CTR'].str.replace('[','').astype(np.float64)
df['Day'] = pd.to_numeric(df['Day'].str.replace(r'[,.%]','')) 

But it does not work with error like this:

ValueError: Unable to parse string "['2019-03-25']" at position 0

I am novice at pandas and this may be duplicated! Pls, help me to find solution. Thanks alot.

Huynh
  • 364
  • 4
  • 14
  • 1
    Start with `df = df.replace([r"\['", r"'\]"], ['', ''], regex=True)`, then do as you please. – cs95 Jun 23 '19 at 05:47
  • Thank you so much! It works as my expectation. – Huynh Jun 23 '19 at 06:04
  • You're welcome! I recommend deleting your question as it is quite an esoteric issue (mostly an issue of incorrectly saving or loading your data). Cheers. – cs95 Jun 23 '19 at 06:05

1 Answers1

1

Try this I hope it would work
first remove list brackets by from day then do filter using .loc

df = pd.DataFrame(data={'Day':[['2016-05-12']],
                       'day2':[['2016-01-01']]})

df['Day'] = df['Day'].apply(''.join)
df['Day'] = pd.to_datetime(df['Day']).dt.date.astype(str)

days_df=df.loc[df['Day'] == '2016-05-12']

Second Solution If the list is stored as string

from ast import literal_eval
df2 = pd.DataFrame(data={'Day':["['2016-05-12']"],
                       'day2':["['2016-01-01']"]})
df2['Day'] = df2['Day'].apply(literal_eval)
df2['Day'] = df2['Day'].apply(''.join)
df2['Day'] = pd.to_datetime(df2['Day']).dt.date.astype(str)
days_df=df2.loc[df2['Day'] == '2016-05-12']


tawab_shakeel
  • 3,345
  • 5
  • 23