I used to think I had encodings pretty well figured out. Apparently not, because I can't explain what's happening here.
What I was trying to do was use the tabulate module to print a nicely formatted table using
from tabulate import tabulate
s = tabulate([[1,2],[3,4]], ["x","y"], tablefmt="fancy_grid")
print(s)
in IPython 3.5.0's interactive console under Windows 10. I expected the result to be
╒═════╤═════╕
│   x │   y │
╞═════╪═════╡
│   1 │   2 │
├─────┼─────┤
│   3 │   4 │
╘═════╧═════╛
but instead, I got a
UnicodeEncodeError: 'charmap' codec can't encode character '\u2552' in position 0: character maps to <undefined>
Puzzled, I tried to find out where the problem was and looked at the repr of the string:
In [15]: s
Out[15]: '╒═════╤═════╕\n│   x │   y │\n╞═════╪═════╡\n│   1 │   2 │\n├─────┼─────┤\n│   3 │   4 │\n╘═════╧═════╛'
Hmm, all the characters can be displayed by the terminal (even the first one that triggered the error).
Just checking some details:
In [16]: sys.stdout.encoding
Out[16]: 'cp850'
In [17]: s.encode("cp850")
[...]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2552' in position 0: character maps to <undefined>
So which encoding is the terminal using? Python says that it's cp850, and it tells me that cp850 doesn't have a ╒ character (which is true, it's one of the characters from cp437 that had to make room for accented letters), but I can see it in the terminal window!
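To double-check the claim about the two code pages, here is a small sketch that asks Python's own codecs which of the table's box-drawing characters each one can represent. The helper names (can_encode, missing_from, BOX_CHARS) are made up for this illustration; only str.encode and unicodedata.name come from the standard library.

```python
import unicodedata

def can_encode(text, encoding):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

# Distinct box-drawing characters used by the fancy_grid table above,
# written as escapes so this file itself stays pure ASCII.
BOX_CHARS = (
    "\u2552\u2550\u2564\u2555\u2502\u255e\u256a\u2561"
    "\u251c\u2500\u253c\u2524\u2558\u2567\u255b"
)

def missing_from(encoding, chars=BOX_CHARS):
    """List the characters in chars that the codec cannot represent."""
    return [ch for ch in chars if not can_encode(ch, encoding)]

for ch in missing_from("cp850"):
    # Print code point and name only, so the report is ASCII-safe
    # even on a console that cannot show the character itself.
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))
```

On my reading of the codec tables, cp437 should come back with nothing missing, while cp850 should lack exactly the mixed single/double line characters such as ╒, ╤ and ╕.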
To complicate things further, when using the native Python console instead of IPython, the error seems more understandable:
>>> s
'\u2552═══\u2564═══\u2555\n│ 1 │ 2 │\n├───┼───┤\n│ 3 │ 4 │\n\u2558═══\u2567═══\u255b'
>>> sys.stdout.encoding
'cp850'
>>> print(s)
Traceback (most recent call last):
[...]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2552' in position 0: character maps to <undefined>
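The \uXXXX escapes in that echo match how the standard sys.displayhook is documented to behave: if writing the repr fails, it re-encodes it with the backslashreplace error handler. Below is a rough sketch of that fallback, assuming cp850 as the target encoding; console_echo is a hypothetical helper for illustration, not the interpreter's actual code.

```python
def console_echo(value, encoding):
    """Mimic how the interactive prompt echoes a value it cannot fully encode."""
    text = repr(value)
    try:
        text.encode(encoding)
        return text
    except UnicodeEncodeError:
        # Fall back to \uXXXX escapes for the characters the codec lacks.
        return text.encode(encoding, "backslashreplace").decode(encoding)

# The header-less table from the session above, written with escapes.
s = (
    "\u2552\u2550\u2550\u2550\u2564\u2550\u2550\u2550\u2555\n"
    "\u2502 1 \u2502 2 \u2502\n"
    "\u251c\u2500\u2500\u2500\u253c\u2500\u2500\u2500\u2524\n"
    "\u2502 3 \u2502 4 \u2502\n"
    "\u2558\u2550\u2550\u2550\u2567\u2550\u2550\u2550\u255b"
)

echoed = console_echo(s, "cp850")
# The unencodable corner is now literal backslash-u text; the rest survives.
assert echoed.startswith("'\\u2552\u2550\u2550\u2550\\u2564")
```

That would explain why the native console shows escapes only for the characters cp850 lacks, while print(s) has no such fallback and raises.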
So at least Python is consistent, but what's happening with IPython?
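One further check that might help narrow this down is to compare what Python reports with what the console window itself reports. This is only a sketch: it assumes the Win32 GetConsoleOutputCP call reached through ctypes, and the two function names are invented for this example. On anything other than Windows it simply returns None.

```python
import ctypes
import sys

def console_output_code_page():
    """Return the console's active output code page, or None off Windows."""
    if sys.platform != "win32":
        return None
    # GetConsoleOutputCP is a Win32 API exposed by kernel32.
    return ctypes.windll.kernel32.GetConsoleOutputCP()

def report():
    """Build a one-line, ASCII-only comparison of the two settings."""
    stdout_encoding = getattr(sys.stdout, "encoding", None)
    return "sys.stdout.encoding=%s, console code page=%s" % (
        stdout_encoding,
        console_output_code_page(),
    )

print(report())
```

If the two values disagree, or if they agree and IPython still shows the characters, that would at least suggest IPython is not sending its output through the same encoding step as print.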