
Just when I thought I had my head wrapped around converting Unicode to strings, Python 2.7 throws an exception.

The code below loops over a number of accented characters and converts them to their non-accented equivalents. I've put in a special case for the double s (ß).

#!/usr/bin/python
# -*- coding: utf-8 -*-
import unicodedata

def unicodeToString(uni):
  return unicodedata.normalize("NFD", uni).encode("ascii", "ignore")

accentList = [
#(grave accent)
u"à",
u"è",
u"ì",
u"ò",
u"ù",
u"À",
u"È",
u"Ì",
u"Ò",
u"Ù",

#(acute accent)
u"á",
u"é",
u"í",
u"ó",
u"ú",
u"ý",
u"Á",
u"É",
u"Í",
u"Ó",
u"Ú",
u"Ý",

#(circumflex accent)
u"â",
u"ê",
u"î",
u"ô",
u"û",
u"Â",
u"Ê",
u"Î",
u"Ô",
u"Û",

#(tilde)
u"ã",
u"ñ",
u"õ",
u"Ã",
u"Ñ",
u"Õ",

#(diaeresis)
u"ä",
u"ë",
u"ï",
u"ö",
u"ü",
u"ÿ",
u"Ä",
u"Ë",
u"Ï",
u"Ö",
u"Ü",
u"Ÿ",

#ring 
u"å",
u"Å",

#ae ligature
u"æ",
u"Æ", 

#oe ligature
u"œ",
u"Œ",

#c cedilla
u"ç",
u"Ç",

# eth (looks like a D with a stroke)
u"ð",
u"Ð",

# o slash
u"ø",
u"Ø",

u"¿", # Spanish ?
u"¡", # Spanish !
u"ß"  # Double s
]

for i in range(0, len(accentList)):
  try:
    u = accentList[i]
    s = unicodeToString(u)
    if u == u"ß":
      s = "ss"
    print("%s -> %s" % (u, s))
  except:
    pass
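As a side note on what the NFD approach above actually does, here is a minimal sketch. The names `to_ascii` and `lost_characters` are hypothetical helpers, not part of the original code, and the `unicode_literals` import is only there so the same file behaves alike on Python 2.7 and 3. It suggests that only characters with a canonical decomposition keep a base letter; ligatures, ø, ß and the inverted punctuation are dropped entirely rather than converted.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import unicodedata

def to_ascii(text):
    # Hypothetical helper: same NFD + ascii/ignore idea as unicodeToString,
    # decoded back to text so the result is a text string on Python 2 and 3.
    decomposed = unicodedata.normalize("NFD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

def lost_characters(chars):
    # Characters that end up as an empty string, i.e. have no ASCII base letter
    # after decomposition.
    return [ch for ch in chars if to_ascii(ch) == ""]

# "é" decomposes into "e" + U+0301 (combining acute); "æ" does not decompose at all.
sample = ["é", "Ÿ", "ç", "æ", "ø", "ß", "¿"]
lost = lost_characters(sample)
```

If that matches what you see, the ß special case in the loop would need siblings (for example for æ and ø) if those characters are meant to survive.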

Without the try/except I get an error:

File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\xc0' in position 0: character maps to <undefined>
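The traceback points at the cp437 codec, so one way to narrow this down without involving the console at all is to check which characters that codec can represent. This is only a sketch; `encodable` is a hypothetical helper name, and cp437 is taken from the traceback rather than detected.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

def encodable(ch, encoding="cp437"):
    # Hypothetical helper: True if `ch` can be encoded with `encoding`.
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# cp437 contains the lowercase grave vowels but not their uppercase forms.
sample = ["à", "À", "é", "ß"]
missing = [ch for ch in sample if not encodable(ch)]
```

With this sample, `missing` contains only "À" (u'\xc0'), the same character named in the error, which would explain why the first ten or so entries print fine and the loop then fails.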

Is there anything I can do to make the code run without using the try/except? I'm using Sublime Text 2.

Ghoul Fool

1 Answer


try/except does not make Unicode work. It just hides errors.

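To fix the UnicodeEncodeError, drop the try/except and see the question "Python, Unicode, and the Windows console". The exception is raised by print, which has to encode u for the console's code page (cp437 in the traceback), and that code page has no À (u'\xc0').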
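One possible workaround, sketched here under assumptions rather than taken from the linked question: escape whatever the output encoding cannot represent before printing, so that print never has to raise. `console_safe` is a hypothetical helper, and falling back to ASCII when `sys.stdout.encoding` is missing is my own choice.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals
import sys

def console_safe(text, encoding=None):
    # Hypothetical helper: round-trip `text` through the output encoding,
    # replacing unencodable characters with backslash escapes such as \xc0.
    # Assumption: fall back to ASCII if stdout reports no encoding (e.g. piped).
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, "backslashreplace").decode(encoding)

# Prints the character itself where the console supports it, an escape otherwise.
print(console_safe("À") + " -> A")
```

This does not make the console display À; it only keeps the loop running without a bare except, and the escaped output makes it visible which characters the console could not show.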

jfs