0

I am trying to make a "results" dataframe with in each cell calculated values and their uncertainties, separated with the ± symbol. I then save the dataframe to a csv. I've made the example below to illustrate the problem: when I open the csv with textedit (or print it to screen) it all works. However, when I open (or import) the csv in excel suddenly the ± symbol is proceeded with another symbol: Â.

I've tried to force the variable to a string using "str()" and played around with the "encoding='utf-8' " but I can't seem to figure out what's going wrong here.. I'm using Python2.7 and Komodo on a macbook air.

import os
import pandas as pd

PmSign = (u'\u00B1').encode("utf-8")

columns = ['A']
data = [1,2,3,4,'5 {} 3' .format(PmSign),6]

df = pd.DataFrame(data, columns=columns)

df.to_csv(os.path.join('/Users/python/','UnicodeDFTEST.csv'), encoding='utf-8')
EdChum
  • 294,303
  • 173
  • 671
  • 486
Algomas
  • 41
  • 4
  • Are you sure Excel knows that the file is UTF-8? There should be a dropdown saying 'File origin' or something like that when you import your file. Make sure that's set to UTF. ([More info](http://stackoverflow.com/a/6488070/182402)) – Wander Nauta Jul 20 '15 at 13:40

2 Answers2

0

As @Wander Nauta mentioned, there is a File origin setting. You need to use the Text Import Wizard. Try the following:

  1. Start Microsoft Excel.
  2. Open a blank workbook.
  3. Select the DATA tab at the top.
  4. Press the From Text button.
  5. Choose your UnicodeDFTEST.csv file.
  6. Under File origin select 65001: Unicode (UTF-8) from the long drop down list.
  7. Press Next >
  8. Make sure the correct delimiter is used.
Martin Evans
  • 37,882
  • 15
  • 62
  • 83
  • Thanks for the suggestions but if I do that, the ± sign disappears completely. Also, I was hoping for a Python-fix so I can dubbelclick the csv file to open (which is much easier/faster). Currently I do that and then replace the extra symbol by a space which works but is a bit silly imo. – Algomas Jul 20 '15 at 14:17
  • You could consider using a different Python library to write your file directly in xlsx format thus avoiding any possible CSV conversion issues. I have not tried that with UTF-8 format yet. – Martin Evans Jul 20 '15 at 14:20
0

Try the following:

import os
import pandas as pd

PmSign = (u'\u00B1')

columns = ['A']
data = [1, 2, 3, 4, u'5 {} 3'.format(PmSign), 6]

df = pd.DataFrame(data, columns=columns)
writer = pd.ExcelWriter(os.path.join('/Users/python/','UnicodeDFTEST.xlsx')
df.to_excel(excel_writer=writer, sheet_name="Sheet 1")
writer.save()
Martin Evans
  • 37,882
  • 15
  • 62
  • 83