Questions tagged [python-unicode]

Python distinguishes between byte strings and unicode strings. *Decoding* transforms byte strings into unicode; *encoding* transforms unicode strings into bytes.

Remember: you decode your input to unicode, work with unicode, then encode unicode objects for output as bytes.
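As a minimal sketch of that decode, work, encode pattern: the sample bytes and the variable names below are made up for illustration, and the input is assumed to be UTF-8. In Python 3 the unicode type is simply `str`.

```python
# Hypothetical input: bytes as they might arrive from a file or socket.
# Assumption: the sender used UTF-8.
raw_input_bytes = b"caf\xc3\xa9 costs \xc2\xa35"

# 1. Decode at the input boundary: bytes -> unicode text.
text = raw_input_bytes.decode("utf-8")

# 2. Do all processing on unicode text, never on raw bytes.
shouted = text.upper()

# 3. Encode at the output boundary: unicode text -> bytes.
raw_output_bytes = shouted.encode("utf-8")
```

Keeping the codec names at the two boundaries only means the rest of the program never has to guess which encoding a given string is in.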

960 questions
1380
votes
31 answers

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)

I'm having problems dealing with unicode characters from text fetched from different web pages (on different sites). I am using BeautifulSoup. The problem is that the error is not always reproducible; it sometimes works with some pages, and…
Homunculus Reticulli
  • 54,445
  • 72
  • 197
  • 297
331
votes
10 answers

UnicodeDecodeError: 'utf8' codec can't decode byte 0x9c

I have a socket server that is supposed to receive UTF-8 valid characters from clients. The problem is some clients (mainly hackers) are sending all the wrong kind of data over it. I can easily distinguish the genuine client, but I am logging to…
transilvlad
  • 12,220
  • 12
  • 41
  • 77
293
votes
6 answers

SyntaxError: Non-ASCII character '\xa3' in file when function returns '£'

Say I have a function: def NewFunction(): return '£' I want to print some stuff with a pound sign in front of it and it prints an error when I try to run this program, this error message is displayed: SyntaxError: Non-ASCII character '\xa3' in…
SNIFFER_dog
  • 2,993
  • 2
  • 12
  • 4
132
votes
9 answers

How to print Unicode character in Python?

I want to make a dictionary where English words point to Russian and French translations. How do I print out unicode characters in Python? Also, how do you store unicode chars in a variable?
NoobDev4iPhone
  • 4,757
  • 10
  • 28
  • 33
124
votes
7 answers

Python - 'ascii' codec can't decode byte

I'm really confused. I tried to encode but the error said can't decode.... >>> "你好".encode("utf8") Traceback (most recent call last): File "<stdin>", line 1, in <module> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0:…
thoslin
  • 5,749
  • 6
  • 25
  • 28
54
votes
3 answers

Python string to unicode

Possible Duplicate: How do I treat an ASCII string as unicode and unescape the escaped characters in it in python? How do convert unicode escape sequences to unicode characters in a python string I have a string that contains unicode characters…
prongs
  • 8,944
  • 19
  • 61
  • 104
47
votes
2 answers

Python string argument without an encoding

I am trying to run this piece of code, and it keeps giving an error saying "String argument without an encoding" ota_packet = ota_packet.encode('utf-8') + bytearray(content[current_pos:(final_pos)]) + '\0'.encode('utf-8') Any help?
lonely
  • 651
  • 1
  • 5
  • 8
41
votes
1 answer

Removing unicode \u2026 like characters in a string in python2.7

I have a string in python2.7 like this, This is some \u03c0 text that has to be cleaned\u2026! it\u0027s annoying! How do I convert it to this, This is some text that has to be cleaned! its annoying!
37
votes
1 answer

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128)

I have this code: printinfo = title + "\t" + old_vendor_id + "\t" + apple_id + '\n' # Write file f.write (printinfo + '\n') But I get this error when running it: f.write(printinfo + '\n') UnicodeEncodeError: 'ascii' codec can't…
speedyrazor
  • 2,645
  • 5
  • 27
  • 47
35
votes
3 answers

Correctly reading text from Windows-1252(cp1252) file in python

so okay, as the title suggests the problem I have is with correctly reading input from a windows-1252 encoded file in python and inserting said input into SQLAlchemy-MySql table. The current system setup: Windows 7 VM with "Roger Access Control…
Krisjanis Zvaigzne
  • 415
  • 1
  • 6
  • 7
31
votes
6 answers

UnicodeDecodeError: ('utf-8' codec) while reading a csv file

what i am trying is reading a csv to make a dataframe---making changes in a column---again updating/reflecting changed value into same csv(to_csv)- again trying to read that csv to make another dataframe...there i am getting an error…
Satya
  • 3,707
  • 16
  • 38
  • 63
30
votes
2 answers

Unicode Encode Error when writing pandas df to csv

I cleaned 400 excel files and read them into python using pandas and appended all the raw data into one big df. Then when I try to export it to a csv: df.to_csv("path",header=True,index=False) I get this error: UnicodeEncodeError: 'ascii' codec…
collarblind
  • 3,779
  • 11
  • 25
  • 43
29
votes
2 answers

Google App Engine: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 48: ordinal not in range(128)

I'm working on a small application using Google App Engine which makes use of the Quora RSS feed. There is a form, and based on the input entered by the user, it will output a list of links related to the input. Now, the application works fine for…
Manas Chaturvedi
  • 4,110
  • 15
  • 43
  • 98
26
votes
4 answers

Python 3: os.walk() file paths UnicodeEncodeError: 'utf-8' codec can't encode: surrogates not allowed

This code: for root, dirs, files in os.walk('.'): print(root) Gives me this error: UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 27: surrogates not allowed How do I walk through a file tree without getting toxic…
Collin Anderson
  • 12,371
  • 6
  • 55
  • 50