60

Does anyone have an idea why this Python 3.2 code

try:    
    raise Exception('X')
except Exception as e:
    print("Error {0}".format(str(e)))

works without a problem (apart from Unicode encoding in the Windows shell :/), but this

try:    
    raise Exception('X')
except Exception as e:
    print("Error {0}".format(str(e, encoding = 'utf-8')))

throws TypeError: coercing to str: need bytes, bytearray or buffer-like object, Exception found?

How can I convert an exception to a string with a custom encoding?

Edit

It does not work either if there is a \u2019 in the message:

msg = 'It\u2019s broken'  # hypothetical example message containing \u2019
try:
    raise Exception(msg)
except Exception as e:
    b = bytes(str(e), encoding = 'utf-8')
    print("Error {0}".format(str(b, encoding = 'utf-8')))

But why can't str() convert an exception to bytes internally?

sorin
ts.
  • Did you try `str(e).encode('utf-8')`? – agf Aug 16 '11 at 08:06
  • @agf By itself, that returns bytes instead of a string. I can use it as a replacement for bytes(str(e), encoding = 'utf-8'), but I always have to do a second conversion, bytes => str – ts. Aug 16 '11 at 08:24
  • “why cannot str() convert to bytes” — how would it know to which encoding to convert? Also, your new code is equivalent to .format(str(e)) – hamstergene Aug 16 '11 at 08:27
  • @Eugene is right. You should just encode it after formatting. If you try and use the `encoding` parameter, it requires the source be accessible as bytes. – agf Aug 16 '11 at 08:28
  • @Eugene Try running it in the Windows shell on a French Win7 and you will see that it is not equivalent – ts. Aug 16 '11 at 08:33
  • "But why cannot str() convert an exception internally to bytes?" The reason is that the interpreter tries to format a tuple object, namely e.args. – Sebastiano Merlino Aug 16 '11 at 08:39
  • I find simply using `print(e)` an easier solution. – Andy Hayden Apr 18 '14 at 05:32

5 Answers

60

In Python 3.x, str(e) should be able to convert any Exception to a string, even if it contains Unicode characters.

So str(e, 'utf-8') will not work as expected: the two-argument form of str() decodes a bytes-like object (bytes, bytearray, or anything exposing the buffer interface), and an exception is none of these, hence the TypeError. It never calls the exception's __str__() method, which in Python 3 has to return a str anyway.
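
A minimal sketch of the difference between the one-argument and two-argument forms of str() in Python 3. The helper name `decode_or_explain` is made up for this illustration; only `str()` itself is real API.

```python
def decode_or_explain(obj, encoding="utf-8"):
    """Try the two-argument form of str() and report what happens.

    Hypothetical helper, for illustration only.
    """
    try:
        # str(obj, encoding) decodes a bytes-like object.
        # It does not call obj.__str__().
        return str(obj, encoding)
    except TypeError as err:
        return "TypeError: {0}".format(err)
```

Called with bytes such as `b"caf\xc3\xa9"` this returns the decoded text, while called with `Exception("X")` it returns a string starting with "TypeError". The one-argument `str(e)` is the form that works for exceptions.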

My guess is that your problem isn't str() but the print() (i.e. the step which converts the Python Unicode string into something that gets dumped on your console). See this answer for solutions: Python, Unicode, and the Windows console
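
If the console encoding is the real culprit, one possible workaround (a sketch, not the approach from the linked answer) is to escape whatever the output stream cannot encode before printing. The function name `safe_print` is an assumption introduced here; it relies only on `str.encode`, `bytes.decode`, and `print`.

```python
import sys


def safe_print(text, stream=None):
    """Print text, escaping characters the stream's encoding cannot represent.

    Hypothetical helper: falls back to UTF-8 if the stream reports no encoding.
    """
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # Round-trip through the target encoding so unencodable characters
    # become \uXXXX escapes instead of raising UnicodeEncodeError.
    printable = text.encode(encoding, errors="backslashreplace").decode(encoding)
    print(printable, file=stream)
```

On a console whose code page lacks \u2019, this prints the literal escape `\u2019` in place of the character instead of failing.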

Aaron Digulla
  • Indeed, initially I had a problem with *UnicodeEncodeError: 'charmap' codec can't encode character...* in the shell under the French version of Win 7. It seemed more portable to convert everything explicitly to UTF-8 instead of using some custom, OS-dependent wrappers. – ts. Aug 16 '11 at 08:32
14

Try this, it should work.

try:    
    raise Exception('X')
except Exception as e:
    print("Error {0}".format(str(e.args[0])).encode("utf-8"))

This assumes that the exception's internal args tuple contains only a message.
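
To illustrate why the contents of the args tuple matter, here is a small sketch. The helper name `first_message` is hypothetical; `args` is the standard attribute of Python exceptions.

```python
def first_message(exc):
    """Return the first element of exc.args as text, or '' if args is empty.

    Hypothetical helper, for illustration only.
    """
    if not exc.args:
        # e.args[0] would raise IndexError for an exception raised without arguments.
        return ""
    return str(exc.args[0])
```

With a single argument, `str(e)` and `first_message(e)` agree. With several arguments, `str(e)` formats the whole tuple, so `str(Exception('X', 2))` gives `"('X', 2)"` while `first_message` returns only `'X'`.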

Sebastiano Merlino
5

In Python 3, a string has no encoding attribute; it is always Unicode internally. Encoded text is represented as bytes instead:

s = "Error {0}".format(str(e)) # string
utf8str = s.encode("utf-8") # byte array, representing utf8-encoded text
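
As a sketch of the same idea wrapped in a function (the name `format_error` is invented for this example), the text form and its UTF-8 encoded form can be produced side by side:

```python
def format_error(exc):
    """Return the error message as (str, UTF-8 encoded bytes).

    Hypothetical helper, for illustration only.
    """
    text = "Error {0}".format(exc)   # str: Unicode text, no encoding attached
    data = text.encode("utf-8")      # bytes: the same text encoded as UTF-8
    return text, data
```

Decoding the bytes with `data.decode("utf-8")` gives back the original text, including characters such as \u2019.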
hamstergene
3

In Python 3, you are already in "Unicode space" and don't need an encoding. Depending on what you want to achieve, you should do the conversion immediately before the step that needs it.

E.g. you can convert all of this to bytes, but in this direction:

bytes("Error {0}".format(str(e)), encoding='utf-8')
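
One place where converting "immediately before" makes sense is when writing raw bytes to the output. The following is a sketch under the assumption that `sys.stdout.buffer` is available (it is on a standard CPython console, but may be missing when stdout has been replaced, for example in some IDEs). The function name `write_error_utf8` is hypothetical.

```python
import sys


def write_error_utf8(exc, stream=None):
    """Format the exception as text, then encode to UTF-8 only when writing.

    Hypothetical helper: `stream` must be a binary stream and defaults to
    sys.stdout.buffer, which is assumed to exist.
    """
    if stream is None:
        stream = sys.stdout.buffer
    message = "Error {0}\n".format(exc)   # stay in str as long as possible
    data = message.encode("utf-8")        # convert right before the write
    stream.write(data)
    return data
```

Whether the console then renders those UTF-8 bytes correctly still depends on the console's own code page.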

glglgl
1

There is a version-agnostic conversion here:

# from the `six` library
import sys
PY2 = sys.version_info[0] == 2
if PY2:
    text_type = unicode
    binary_type = str
else:
    text_type = str
    binary_type = bytes

def exc2str(e):
    if e.args and isinstance(e.args[0], binary_type):
        return e.args[0].decode('utf-8')
    return text_type(e)

and tests for it:

def test_exc2str():
    a = u"\u0856"
    try:
        raise ValueError(a)
    except ValueError as e:
        assert exc2str(e) == a
        assert isinstance(exc2str(e), text_type)
    try:
        raise ValueError(a.encode('utf-8'))
    except ValueError as e:
        assert exc2str(e) == a
        assert isinstance(exc2str(e), text_type)
    try:
        raise ValueError()
    except ValueError as e:
        assert exc2str(e) == ''
        assert isinstance(exc2str(e), text_type)
tsionyx