3

I am trying to convert u'\u30c9\u30e9\u30b4\u30f3' to Japanese characters using Python.

Here is my sample code:

s = u'\u30c9\u30e9\u30b4\u30f3'.encode('utf-8')
print str(s)

I got this error: UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)

STF
Min Min
  • Which line created the error? I can't seem to reproduce it. – Neil Nov 15 '17 at 06:19
  • 'Traceback (most recent call last): File "c:/api/test.py", line 2, in <module> print (s.decode('utf-8')) File "C:\Python27\lib\encodings\cp437.py", line 12, in encode return codecs.charmap_encode(input,errors,encoding_map) UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>' – Min Min Nov 15 '17 at 06:24
  • https://stackoverflow.com/questions/5419/python-unicode-and-the-windows-console – Argus Malware Nov 15 '17 at 06:29
  • This is my test code, but I am still getting this error: UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>. Code: import sys, codecs, locale; print sys.stdout.encoding; sys.stdout = codecs.getwriter(locale.getpreferredencoding())(sys.stdout); print sys.stdout; line = u'\u30c9\u30e9\u30b4\u30f3'; print type(line), len(line); sys.stdout.write(line); print line – Min Min Nov 15 '17 at 06:49
  • Please don't post error messages or multi-line code fragments to the comments. That's hard to read. Edit your question and place them there, properly formatted. – Tomalak Nov 15 '17 at 07:19
  • The commented code (which should be edited in the question) is completely different from the question code. The question code also doesn't produce the error claimed. See [ask] and how to create a [mcve]. – Mark Tolonen Nov 15 '17 at 08:27

3 Answers

2

This will depend on your OS and configuration, but normally, you just print the Unicode string. If your OS, default terminal encoding, and font support Japanese, you only need:

>>> s = u'\u30c9\u30e9\u30b4\u30f3'
>>> print s
ドラゴン

On Linux, this requires your terminal to be configured for an encoding that covers Japanese, typically UTF-8.

On Windows, an IDE that supports UTF-8 will display the text. The Windows console, however, raises a UnicodeEncodeError unless you are using a localized version of Windows that supports Japanese or have changed the system locale to Japanese. Another workaround is to use win-unicode-console and install a Japanese console font.

My example above was run in the PythonWin IDE that comes with the pywin32 module; it also works in IDLE, which ships with a standard Python installation.
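As a rough illustration of why the result depends on the output encoding, the sketch below checks whether a stream's encoding can represent the string before printing it, and falls back to backslash escapes when it cannot. The helper names can_encode and safe_print are hypothetical (they are not part of any library), and the code is written to run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys


def can_encode(text, encoding):
    """Return True if `text` can be represented in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except (UnicodeEncodeError, LookupError):
        # LookupError covers encodings this Python build does not know.
        return False


def safe_print(text, stream=None):
    """Print `text`, escaping it if the stream's encoding cannot hold it."""
    if stream is None:
        stream = sys.stdout
    # Streams without a usable encoding are treated as ASCII-only (assumption).
    encoding = getattr(stream, 'encoding', None) or 'ascii'
    if can_encode(text, encoding):
        print(text, file=stream)
    else:
        escaped = text.encode('ascii', 'backslashreplace').decode('ascii')
        print(escaped, file=stream)


s = u'\u30c9\u30e9\u30b4\u30f3'
safe_print(s)
```

On a console whose encoding is cp437 this should print the escaped form \u30c9\u30e9\u30b4\u30f3 instead of raising, while a UTF-8 terminal with a suitable font should show ドラゴン.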

Mark Tolonen
1

I had a UnicodeEncodeError for Japanese characters in the REPL on Windows 10.

I followed Mark Tolonen's suggestion and went to "Change system locale" in the Region settings. There was an option labeled "Beta: Use Unicode UTF-8 for worldwide language support."

I enabled this option while leaving the current system locale as English (i.e., unchanged).
After a reboot, the REPL started to print Japanese characters correctly.
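To see whether a settings change like this has taken effect, it may help to inspect which encodings Python reports before and after. The following is only a diagnostic sketch: report_encodings is a hypothetical helper name, and the values it shows depend entirely on the machine and Python version.

```python
from __future__ import print_function
import locale
import sys


def report_encodings():
    """Collect the encodings Python currently uses for text output."""
    return {
        # Encoding of the console or pipe attached to stdout (may be None).
        'stdout': getattr(sys.stdout, 'encoding', None),
        # Encoding derived from the system locale settings.
        'preferred': locale.getpreferredencoding(False),
        # Encoding used for file names.
        'filesystem': sys.getfilesystemencoding(),
    }


for name, value in sorted(report_encodings().items()):
    print(name, '=', value)
```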

zx485
vjou
0

Your s holds bytes, because encode('utf-8') returns a byte string. To get the Japanese characters back, decode it: print(s.decode('utf-8')).
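A minimal sketch of that round trip, written to run on both Python 2.7 and Python 3 (the variable names are illustrative). Note that decoding only recovers the text. As the traceback in the question's comments shows, printing the decoded string can still fail if the console encoding cannot represent it.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

text = u'\u30c9\u30e9\u30b4\u30f3'   # Unicode text: 4 code points
data = text.encode('utf-8')          # UTF-8 bytes: 3 bytes per character here
restored = data.decode('utf-8')      # back to Unicode text

# The round trip is lossless, but the lengths differ.
print(len(text), len(data), restored == text)
```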

hcheung