Here's a question that has puzzled me for a long time. I use BBEdit to edit Python code. As I understand it, running the code from within the editor spawns a separate Python process, runs the code, and writes the output to a log file. So BBEdit doesn't know about my shell's environment variables and such. If I try this at the command line in a terminal:

>>> s = 'háček'
>>> print s
háček

all is fine and good. But if I have the following file in BBEdit:

#!/opt/local/bin/python
# -*- coding: utf-8 -*- # 

s = u'háček'
print s

and try to run it from within the editor I get:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)

so I have to do this:

print s.encode('utf-8')
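
The error and the workaround above can be reproduced outside the editor by encoding the string explicitly. This is only a minimal sketch, assuming (as the traceback suggests) that the spawned process falls back to the ASCII codec when it writes to the log file; the variable names `error` and `encoded` are purely illustrative.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

s = 'háček'  # a text (unicode) string, like u'háček' in the question

# Encoding with the ASCII codec fails on the two non-ASCII characters,
# which is what an implicit encode on print does when ASCII is the default.
error = None
try:
    s.encode('ascii')
except UnicodeEncodeError as exc:
    error = exc
    print('ascii failed at positions %d-%d' % (exc.start, exc.end - 1))

# Encoding explicitly to UTF-8 always succeeds and yields bytes.
encoded = s.encode('utf-8')
print('characters: %d, UTF-8 bytes: %d' % (len(s), len(encoded)))
```

The reported positions line up with the "position 1-2" in the traceback: those are the á and the č.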

Can anyone familiar with BBEdit explain what's going on here? Is there a way to tell the editor how to behave in the presence of Unicode characters?

Thanks, Jon

jjon

2 Answers


If BBEdit relies on external files and/or redirection to do this then no, there's no way to fix it. Fixing it would require poking some internal Python structures in order to tell it to use UTF-8 when encoding output.

Ignacio Vazquez-Abrams
  • Thanks, Ignacio. You're quite right: it does require some poking of internal Python structures. Fortunately, it appears that Python itself provides a mechanism to do this. I've posted my own answer outlining the rather weird, hacky solution. – jjon Jun 29 '11 at 16:30

In the unlikely event that others have run into this, here's an odd corner of Python lore that I knew nothing of.

The good folks at BBEdit clarified this for me.

The specific problem, it appears, is caused by a runtime condition in BBEdit that they haven't tracked down, but there is a workaround:

Python knows about a special script named "sitecustomize.py", which the 'site' module tries to import at interpreter startup.

If you put the following text in sitecustomize.py:

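# sitecustomize.py (Python 2 only)
import sys
# setdefaultencoding is only reachable while the site module is starting up.
sys.setdefaultencoding('utf-8')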

and move this file into

/Path/To/Python/Installation/site-packages/

Then, when BBEdit spawns a Python process, Python's 'site' module automatically imports sitecustomize.py, which calls sys.setdefaultencoding(). Having set the default encoding for the session, 'site' then (and this is the weird bit) removes setdefaultencoding from the sys namespace. See:

http://docs.python.org/library/sys.html#sys.setdefaultencoding
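
To check whether the workaround took effect, something like the following can be run from within the editor. It is only a sketch: the names `default_encoding` and `has_setter` are illustrative, and the comments describe what I would expect on Python 2 with the sitecustomize.py above in place.

```python
from __future__ import print_function
import sys

# With the sitecustomize.py above in place, Python 2 should report 'utf-8'
# here; without it, Python 2 reports 'ascii'.
default_encoding = sys.getdefaultencoding()

# The site module deletes sys.setdefaultencoding once startup is finished,
# so in a normally started interpreter the setter is no longer reachable.
has_setter = hasattr(sys, 'setdefaultencoding')

print('default encoding:', default_encoding)
print('setdefaultencoding still present:', has_setter)
```

If the second line prints True, the interpreter was probably started without the 'site' module (for example with `python -S`), in which case sitecustomize.py is never imported either.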

jjon
  • Just a note: `sys.setdefaultencoding` has been removed from Python 3. I ran into a similar problem, where the culprit turned out to be the locale. `locale.getpreferredencoding()` returned US-ASCII, causing some headaches with file I/O. Explicitly setting LANG in my ~/.bash_profile fixed it. – robjwells Sep 13 '13 at 16:37
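
A quick way to see what the spawned process inherits is to print the values the comment above mentions. This is a hedged diagnostic sketch, not a fix; the `report` dictionary and its keys are illustrative, and the values will differ between a terminal session and a process launched from the editor.

```python
from __future__ import print_function
import locale
import os
import sys

# Collect the settings that influence how text is encoded for output and file I/O.
report = {
    'preferred': locale.getpreferredencoding(),       # e.g. 'US-ASCII' or 'UTF-8'
    'stdout': getattr(sys.stdout, 'encoding', None),  # may be None when output is redirected
    'LANG': os.environ.get('LANG'),                   # None if the variable is unset
}

for key in sorted(report):
    print('%s = %r' % (key, report[key]))
```

Running it once in a terminal and once from within the editor should show whether the two environments disagree.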