Python 3.4 Unicode character displayed correctly on console but not in text file

Question

My current code writes its results to both the console and an output text file, using the following statement:

     fw.write("Number of files processed within 512\u00B1 1 samples:  "+str(count))

The output in the text file is "Number of files processed within 512xB1 1 samples: 328"

But sometimes I do get the correct output: "Number of files processed within 512± 1 samples: 3"

Console/interpreter output is fine with a print() call, where I always get ±. I have been scouring for answers and experimenting with encoding statements. Any ideas?

What is `fw` exactly? I'm guessing a file, but how does it get opened? There's an optional `encoding` parameter to `open`, which might be useful here, but it's not entirely clear what's going on from your question. — Blckknght, Feb 12 '16 at 22:04
What OS are you using? Windows for example doesn't default to Unicode files natively. — Mark Ransom, Feb 12 '16 at 22:33

score 1 · Accepted Answer · answered Feb 12 '16 at 22:45

Unless you specifically require your source file to contain only ASCII, or your editor doesn't support rendering this specific character, just literally write the character in the source code:

fw.write("Number of files processed within 512± 1 samples:  "+str(count))

Otherwise, explicitly open the file with the utf8 encoding:

with open('file.txt', 'w', encoding='utf8') as fw:
    fw.write("Number of files processed within 512\u00B1 1 samples:  "+str(count))
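
One way to confirm that the explicit encoding took effect is to read the file back as raw bytes and look for the UTF-8 encoding of ±. This is only a minimal sketch: the temporary path and the count value (borrowed from the question's sample output) are assumptions for illustration.

```python
import os
import tempfile

count = 328  # sample value borrowed from the question's output
message = "Number of files processed within 512\u00B1 1 samples:  " + str(count)

# Hypothetical scratch path so the sketch does not overwrite a real file.
path = os.path.join(tempfile.mkdtemp(), "file.txt")

# Write with an explicit encoding instead of relying on the platform default.
with open(path, "w", encoding="utf-8") as fw:
    fw.write(message)

# Read the raw bytes back: in UTF-8, U+00B1 is the two-byte sequence C2 B1.
with open(path, "rb") as fr:
    raw = fr.read()

has_utf8_plus_minus = b"\xc2\xb1" in raw
round_trip = raw.decode("utf-8")
print(has_utf8_plus_minus, round_trip == message)
```

If the first value printed is False, the file was probably written with a different encoding than the one requested.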

score 1 · Answer 2 · answered Feb 13 '16 at 03:49

Use Unicode strings.
Declare your source encoding.
Use the characters directly in the file if you like instead of escape codes.
For printing, just print the Unicode string.
For files, use io.open, declare the encoding (can be different than source and console), and write Unicode strings.
Save the source in the source encoding.

Then, if your console encoding supports the character (even if the console is a different encoding than the source file), it will display correctly. Files will contain the correctly encoded character.

Example (works in Python 2 and 3):

#coding:utf8
from __future__ import unicode_literals,print_function
import io
count = 57
with io.open('out.txt','w',encoding='utf8') as fw:
    fw.write("Number of files processed within 512±1 samples: {}".format(count))
    print("Number of files processed within 512±1 samples: {}".format(count))

Output:

C:\temp>chcp              # Console is a different encoding!
Active code page: 437

C:\temp>py -2 x.py        # Python 2 displays correctly
Number of files processed within 512±1 samples: 57

C:\temp>py -3 x.py        # Python 3 displays correctly
Number of files processed within 512±1 samples: 57

C:\temp>chcp 65001        # Change to output file encoding (UTF-8)
Active code page: 65001

C:\temp>type out.txt      # Content of file is correct.
Number of files processed within 512±1 samples: 57
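
To illustrate why the console and the file can disagree, the sketch below encodes the same ± character with three codecs. The choice of codecs is an assumption for illustration (cp437 matches the console code page shown above, and cp1252 is a common Windows default); the actual defaults depend on the system.

```python
plus_minus = "\u00B1"  # the ± character from the question

# Encode the same character with three different codecs.
encoded = {}
for name in ("cp437", "cp1252", "utf-8"):
    encoded[name] = plus_minus.encode(name)
    print(name, encoded[name])

# A codec that lacks the character cannot write it at all.
try:
    plus_minus.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print("ascii can encode it:", ascii_ok)
```

Each codec produces different bytes for the same character, so a file is only readable if it is decoded with the encoding it was written in.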

score 1 · Answer 3 · answered Feb 13 '16 at 15:22

First, don't use Notepad; Microsoft refused to add UTF-8 support to it, even on Windows 10.

Yes, enforcing the use of UTF-8 is recommended when reading or writing files.
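
To see which encoding is in effect when none is enforced, the following sketch prints the defaults Python reports. It assumes a standard CPython 3 installation; the values vary by platform and locale, so no particular output is expected.

```python
import locale
import sys

# Encoding that open() falls back to in text mode when no encoding is given.
default_file_encoding = locale.getpreferredencoding(False)

# Encoding used by print() for the console; may be None if stdout is unusual.
console_encoding = sys.stdout.encoding

print("open() default:", default_file_encoding)
print("console:", console_encoding)
```

If the two values differ, or the first one is not a Unicode encoding, passing encoding='utf-8' to open() removes the dependence on the platform.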

sorin

score -1 · Answer 4 · answered Feb 12 '16 at 22:14

Try opening the file in write-binary mode by using fw=open('file','wb').

If that doesn't work, try fw=open('file',mode='w',encoding='utf-8') to open in write mode with utf-8 encoding, which should solve your problem.
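
Note that in Python 3 a file opened with 'wb' accepts only bytes, so the string has to be encoded by hand before writing. A minimal sketch, assuming a throwaway temporary path (hypothetical, not from the question):

```python
import os
import tempfile

text = "512\u00B1 1"

# Hypothetical scratch path used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "file.bin")

with open(path, "wb") as fw:
    try:
        fw.write(text)  # passing str to a binary file raises TypeError
        str_write_failed = False
    except TypeError:
        str_write_failed = True
    # Encoding explicitly turns the str into bytes that 'wb' accepts.
    fw.write(text.encode("utf-8"))

with open(path, "rb") as fr:
    raw = fr.read()

print(str_write_failed, raw)
```

Text mode with an explicit encoding argument does the same conversion automatically, which is why the second suggestion is usually the simpler one.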

Valkyrie

The argument encoding='utf-8' worked! Thanks! – Masud Syed Feb 14 '16 at 01:09
If it worked, mind setting as correct answer? – Valkyrie Feb 14 '16 at 02:37
I did 2 days ago. Adding the encoding argument solved it. – Masud Syed Feb 15 '16 at 18:56
