
I am having trouble removing quote characters that appear around my arrays. When I read in my file like this:

data = pd.read_csv('filepath.csv', sep='|', index_col=0, nrows=5)

the dtype of my problematic column is object but the individual entries are strings:

print(type(data.body_tokens[0]))
data.body_tokens[0]
<class 'str'>
"['he', 'knows', 'what', 'he', 's', 'doing']"

How can I remove the quotation marks around the array?

1 Answer

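import ast

string = "['he', 'knows', 'what', 'he', 's', 'doing']"

# literal_eval parses the string as a Python literal without executing code.
# The result is named tokens so it does not shadow the built-in list.
tokens = ast.literal_eval(string)

type(tokens)    # <class 'list'>

print(tokens)   # ['he', 'knows', 'what', 'he', 's', 'doing']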

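Is this what you want? ast.literal_eval turns the string back into an actual Python list, so the surrounding quotation marks go away.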
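To apply the same idea to the whole column rather than a single string, something like the following may work. It is only a sketch: the inline `csv_text` is a made-up stand-in for 'filepath.csv', and the `id` column name is an assumption, since the real file layout is not shown in the question.

```python
import ast
import io

import pandas as pd

# Hypothetical stand-in for 'filepath.csv': pipe-separated, lists stored as text.
csv_text = (
    "id|body_tokens\n"
    "0|['he', 'knows', 'what', 'he', 's', 'doing']\n"
    "1|['ok']\n"
)

# Option 1: load first, then parse every entry of the column.
data = pd.read_csv(io.StringIO(csv_text), sep='|', index_col=0)
data['body_tokens'] = data['body_tokens'].apply(ast.literal_eval)

# Option 2: parse while loading, via the converters argument of read_csv.
data2 = pd.read_csv(
    io.StringIO(csv_text),
    sep='|',
    index_col=0,
    converters={'body_tokens': ast.literal_eval},
)

print(type(data.body_tokens[0]))
print(data2.body_tokens[1])
```

Both options assume every entry in the column is a valid Python literal. Missing or empty values would make ast.literal_eval raise an error, so those rows would need to be handled separately.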

deceze
JY Won