I have a CSV that is encoded in Unicode but lacks a byte order mark at the start. As a result, Excel (2013) opens it without decoding it correctly (I think it assumes ASCII if no BOM is specified...), meaning that certain characters are displayed incorrectly.

From reading around, I understand that a BOM of "\uFEFF" should be entered at the start of the CSV file. I have tried opening the file in a text editor and adding the characters, e.g.

\uFEFF
r1test 1, r1text2, r1text3     
r2test 1, r2text2, r2text3   

However, this does not solve the problem: the characters "\uFEFF" show up in the first row when I open the file in Excel, rather than being interpreted as a BOM. I am not sure what I am doing wrong, or how the text should be specified so that it is interpreted as a BOM rather than as text in the first row of the data.

I have only very limited experience using CSV, and have only just heard of a BOM, so I could be implementing this completely wrong!

(For reference, I know that I could specify the encoding if I use the import data option within Excel. However, I really want to work out how to get it correctly specified in advance so that I can just open the CSV. I have several thousand of these files that I am creating and exporting; once I know how to do this 'manually' [i.e. by adding some text at the start of the file], I can configure it to happen automatically in Python.)

Thanks in advance

kyrenia
  • Possibly a duplicate of http://stackoverflow.com/questions/934160/write-to-utf-8-file-in-python , but as stated in the comments there, you _really_ should not put a BOM (byte-order mark) in a UTF-8 file, which is a format without a byte order! – ComputerDruid Nov 21 '14 at 18:00
  • Thanks!!! While I think that my question was different, it had enough elements that allowed me to work it out. Basically, I needed to specify the code "\uFEFF" itself in Unicode form, i.e. in Python u"\uFEFF" rather than just "\uFEFF". (Before, I think I had the whole of the CSV in Unicode form, with the exception of the BOM that specifies that it is in Unicode.) – kyrenia Nov 21 '14 at 18:52
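
For anyone landing here, the fix described in the comment above can be sketched roughly as follows. This is only a minimal sketch, assuming Python 3 and that the CSV is meant to be UTF-8; the helper name `write_csv_with_bom` and the sample rows are made up for illustration. The point is that typing "\uFEFF" into a text editor stores six ordinary characters, whereas the BOM is the single character U+FEFF, which the `utf-8-sig` codec writes as the bytes EF BB BF.

```python
import codecs
import csv
import os
import tempfile


def write_csv_with_bom(path, rows):
    """Write rows as UTF-8 CSV with a leading BOM (hypothetical helper)."""
    # The "utf-8-sig" codec emits the BOM bytes EF BB BF before the data.
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)


# Made-up sample data containing non-ASCII characters.
rows = [["r1text1", "r1text2", "café"], ["r2text1", "r2text2", "naïve"]]
path = os.path.join(tempfile.mkdtemp(), "example.csv")
write_csv_with_bom(path, rows)

# Check the first three bytes of the file against the UTF-8 BOM.
with open(path, "rb") as f:
    head = f.read(3)
print(head == codecs.BOM_UTF8)

# The literal text typed into an editor is six plain characters, not a BOM.
typed_text = "\\uFEFF".encode("utf-8")
print(len(typed_text), typed_text == codecs.BOM_UTF8)
```

As the first comment notes, a BOM in UTF-8 is generally discouraged; here it serves only as an encoding hint for Excel.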

1 Answer

For someone else wanting to tell Excel to add a BOM: See if you can "Save as Unicode Text".
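
If the files already exist and there are too many to re-save by hand, a scripted alternative might look like the sketch below. It assumes the files are already valid UTF-8 (it does not re-encode anything), and the function name `add_bom` and the demo file are hypothetical.

```python
import codecs
import os
import tempfile


def add_bom(path):
    """Prepend the UTF-8 BOM to a file unless it already starts with one.

    Returns True if the file was modified. Assumes the content is UTF-8.
    """
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):
        return False  # already has a BOM, leave the file untouched
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + data)
    return True


# Demo on a throwaway file with made-up content.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(demo_path, "w", encoding="utf-8", newline="") as f:
    f.write("r1text1,r1text2,café\r\n")

first_call = add_bom(demo_path)   # adds the BOM
second_call = add_bom(demo_path)  # no-op, BOM already present
print(first_call, second_call)
```

To process a whole folder, the same function could be called in a loop over something like `glob.glob("*.csv")`; trying it on copies first seems prudent, since it rewrites files in place.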

tuxayo