
Using Python, I read some XML data into a pandas.DataFrame. Now I want to write it to a file with df.to_csv(...), but it gives me this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 1257: ordinal not in range(128)

How can I make the 'ascii' codec understand characters like u'\u2014', u'\u2019', etc.?

  • You can't. ASCII cannot handle anything beyond `\u007f`. You'd have to pick a different encoding instead. – Martijn Pieters May 19 '14 at 18:22
  • The alternative is to use [Unidecode](https://pypi.python.org/pypi/Unidecode) to replace non-ASCII codepoints with reasonable ASCII equivalents and surrogates. U+2014 is EM DASH, which can reasonably be substituted with a doubled plain `-` dash, for example. Ditto for U+2019, a fancy single quote, which `'` could replace. That library does this. – Martijn Pieters May 19 '14 at 18:27
  • You can use the `encoding` argument of `to_csv()`. From the docs: "encoding : string, optional. A string representing the encoding to use if the contents are non-ascii, for python versions prior to 3". – Vor May 19 '14 at 19:03
  • Thanks a lot indeed. It worked; it just added some ] or [ to the text. – someone May 19 '14 at 19:24
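For anyone landing here later, the `encoding` suggestion from the comments might look roughly like the sketch below. The DataFrame contents, the column name `text`, and the output file name are made up for illustration, and UTF-8 is just one reasonable choice of encoding, not necessarily what the asker ended up using.

```python
import io
import os
import tempfile

import pandas as pd

# Hypothetical sample data containing the two characters from the question.
df = pd.DataFrame({"text": [u"before \u2014 after", u"it\u2019s"]})

# ASCII stops at U+007F, so encoding U+2014 as ASCII raises the same error.
try:
    u"\u2014".encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Passing an explicit encoding lets to_csv() write the characters unchanged.
# The temporary directory and file name are placeholders.
csv_path = os.path.join(tempfile.mkdtemp(), "out.csv")
df.to_csv(csv_path, index=False, encoding="utf-8")

# Read the file back as UTF-8 to confirm the characters survived.
with io.open(csv_path, encoding="utf-8") as handle:
    written = handle.read()
```

Whatever reads the file afterwards has to open it as UTF-8 too. If the output really must be plain ASCII, the Unidecode route from the earlier comment is the other option.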

0 Answers