
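I have the following unicode string: s = u'\\u5b50', which holds a literal backslash followed by u5b50 (six characters). I want to convert it to m = u'\u5b50', the single character 子. How do I do it?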

s = u'\\u5b50'
m = u'\u5b50'
print len(s) # 6
print len(m) # 1
print s # \u5b50
print m # 子
Transcendental

1 Answer


This works:

print s.decode('unicode-escape') # 子
print len(s.decode('unicode-escape')) # 1
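For reference, here is a minimal sketch of the same conversion under Python 3, where str has no decode method and the text has to be turned into bytes first. It assumes the input is ASCII-only (escape sequences plus plain ASCII text); the helper name `unescape` is illustrative and not part of the original answer.

```python
def unescape(text):
    # Hypothetical helper, Python 3 only.
    # Assumes `text` is pure ASCII; encode("ascii") raises
    # UnicodeEncodeError if it is not.
    return text.encode("ascii").decode("unicode_escape")


s = "\\u5b50"          # six characters: backslash, u, 5, b, 5, 0
m = unescape(s)        # one character: U+5B50

print(len(s))          # 6
print(len(m))          # 1
print(m == "\u5b50")   # True
```

The intermediate encode step matters because the unicode_escape codec decodes bytes, treating any non-escape byte as Latin-1, so non-ASCII input would need different handling.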
Transcendental
  • Yes, until you hit a UTF-16 surrogate pair, like `'\uD83D\uDC33'`. It then depends on whether or not you are using a wide UCS4 build of Python; you *may* get `u'\ud83d\udc33'` or `u'\U0001f433'`. The first is incorrect. – Martijn Pieters Jan 08 '17 at 13:05
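One possible way to handle the surrogate-pair case raised in the comment, at least under Python 3, is to round-trip the decoded text through UTF-16 with the `surrogatepass` error handler so that each pair is merged into a single code point. This is a sketch under stated assumptions: ASCII-only input, well-formed pairs, and a hypothetical helper name `unescape_with_surrogates`.

```python
def unescape_with_surrogates(text):
    # Hypothetical helper, Python 3 only; assumes ASCII-only input.
    # Step 1: interpret the \uXXXX escapes. A pair such as
    # \ud83d\udc33 comes out as two separate surrogate code points.
    decoded = text.encode("ascii").decode("unicode_escape")
    # Step 2: write the surrogates out as raw UTF-16 code units, then
    # decode them again so each valid pair becomes one code point.
    return decoded.encode("utf-16", "surrogatepass").decode("utf-16")


whale = unescape_with_surrogates("\\ud83d\\udc33")

print(len(whale))              # 1
print(whale == "\U0001f433")   # True
```

An unpaired surrogate in the input makes the final decode raise UnicodeDecodeError, which may be preferable to silently keeping invalid text.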