SPSS-Python write to CSV - wrong encoding when opening in Excel

Question

In SPSS, using python, I am writing a list of lists into a csv file:

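begin program.

import spss, spssaux, sys, csv, codecs

def WriteDim():

    # Placeholder from the original post: the array starts as some list of lists
    MyArray = [some list of lists]
    # Note: MyFile is not defined in this snippet
    for MyVar in MyFile.varlist:
        MyArray.append([MyVar.name, MyVar.label])

    DimFile = "DimCSV.csv"

    # Write the rows as UTF-8 (no byte order mark is written with 'utf8')
    with codecs.open(DimFile, "w", encoding='utf8') as output:
        writer = csv.writer(output, lineterminator='\n')
        writer.writerows(MyArray)

end program.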

I have some Spanish text in my practice array, for example "reparación". If I open the output file in a text editor, everything looks fine. However, if I open it in Excel 2016, it looks like this: "reparaciÃ³n". I would need to go to "Data/Import From text" and manually choose UTF-8 encoding, but this is not an option for the future users of my SPSS program.

Is there any way of writing the file so that Excel will open it using UTF-8 encoding? It has to be a CSV file; opening it in Excel is only one use of it.

score 1 · Accepted Answer · answered Jul 11 '17 at 08:36

You explicitly ask for a utf8 encoding at codecs.open(DimFile, "w",encoding='utf8'), and later say you would prefer not to use utf8. Just use the expected encoding directly:

with codecs.open(DimFile, "w",encoding='cp1252') as output:

(cp1252 is the default Windows code page for Western European languages, including Spanish.)
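A minimal sketch of this suggestion, assuming Python 3 (where the built-in open() takes an encoding and the csv module handles text directly). The helper name write_rows_cp1252, the sample rows and the temporary path are made up for illustration. It also shows the limitation of this approach: characters outside the code page cannot be encoded.

```python
import csv
import os
import tempfile


def write_rows_cp1252(path, rows):
    """Write rows as CSV in cp1252 (hypothetical helper, Python 3 assumed)."""
    # newline="" lets the csv module control the line endings itself
    with open(path, "w", encoding="cp1252", newline="") as output:
        writer = csv.writer(output, lineterminator="\n")
        writer.writerows(rows)


# Illustrative data, not taken from the original post
rows = [["var1", "reparación"], ["var2", "año"]]
path = os.path.join(tempfile.mkdtemp(), "DimCSV.csv")
write_rows_cp1252(path, rows)

with open(path, "rb") as f:
    raw = f.read()

# In cp1252, "ó" is stored as the single byte 0xF3
print(raw)

# Characters outside the code page raise an error instead of being written
try:
    "Ω".encode("cp1252")
    outside_cp1252_fails = False
except UnicodeEncodeError:
    outside_cp1252_fails = True
```

This is fine when every label fits in the Western European code page, but a label containing, say, Greek or Cyrillic text would make the write fail.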

Serge Ballesta

I said I prefer not to choose the encoding manually when opening in Excel; I assumed UTF-8 would work, so I saved the file as UTF-8, but it seems that Excel, by default, does not open the file as UTF-8. The problem is that Excel does not use the file's encoding to open it – horace_vr Jul 11 '17 at 08:47
@horace_vr: By default, Excel uses the system encoding, which is commonly the win-1252 code page on Western European language systems. Why do you want to use utf8 encoding on Windows? – Serge Ballesta Jul 11 '17 at 08:59
I am looking into a way to save those texts. I assumed UTF-8 would do the trick, but apparently not. win-1252 is not a valid encoding for the Python program, as far as I can see – horace_vr Jul 11 '17 at 09:04
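The mismatch discussed in these comments can be reproduced in a few lines. This is only a sketch, assuming Python 3: it encodes the word from the question as UTF-8 and then decodes the bytes as cp1252, which is roughly what happens when the file is opened with the system code page. It also checks which spellings of the code page name Python's codec registry accepts.

```python
import codecs

# UTF-8 stores "ó" as two bytes: 0xC3 0xB3
utf8_bytes = "reparación".encode("utf-8")

# Reading those bytes as cp1252 turns each byte into its own character
misread = utf8_bytes.decode("cp1252")
print(misread)

# Python's name for the code page is "cp1252"; "windows-1252" is an alias
canonical = codecs.lookup("cp1252").name
alias = codecs.lookup("windows-1252").name

# "win-1252" is the spelling that is rejected
try:
    codecs.lookup("win-1252")
    win_1252_known = True
except LookupError:
    win_1252_known = False
```

So the text itself is written correctly; only the decoding step on the reading side is wrong.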

horace_vr · Answer 2 · 2017-07-11T12:52:23.153

0

While Serge Ballesta's answer worked perfectly for Spanish, I found that encoding='utf-8-sig' works best for all characters I tested. I felt UTF-8 should be used, as it is more common than the other suggested encodings.

Credit to this topic: Write to UTF-8 file in Python
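For reference, a small sketch of what 'utf-8-sig' changes, assuming Python 3; the sample rows and the temporary path are made up for illustration. The codec writes a three-byte byte order mark (BOM) at the start of the file, which Excel generally uses as a hint that the file is UTF-8. The rest of the file is ordinary UTF-8.

```python
import codecs
import csv
import os
import tempfile

# Illustrative data, not taken from the original post
rows = [["var1", "reparación"], ["var2", "Ωμέγα"]]
path = os.path.join(tempfile.mkdtemp(), "DimCSV.csv")

# 'utf-8-sig' prepends the BOM (EF BB BF) before the first row
with open(path, "w", encoding="utf-8-sig", newline="") as output:
    writer = csv.writer(output, lineterminator="\n")
    writer.writerows(rows)

with open(path, "rb") as f:
    raw = f.read()

has_bom = raw.startswith(codecs.BOM_UTF8)

# Reading with 'utf-8-sig' strips the BOM again, so the data round-trips
with open(path, "r", encoding="utf-8-sig", newline="") as f:
    round_trip = list(csv.reader(f))
```

Other consumers of the CSV may not expect the BOM; reading the file with 'utf-8-sig' rather than plain 'utf-8' avoids a stray character at the start of the first field.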

edited Jul 11 '17 at 12:52

answered Jul 11 '17 at 11:14

horace_vr
