Questions tagged [python-unicode]

Python distinguishes between byte strings and unicode strings. *Decoding* transforms byte strings to unicode; *encoding* transforms unicode strings to bytes.

Remember: you decode your input to unicode, work with unicode, then encode unicode objects for output as bytes.
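The decode, work, encode pattern above can be sketched as follows. This is a minimal illustration that assumes Python 3 (where the two types are named `bytes` and `str`) and assumes the input is UTF-8; the sample payload and variable names are made up for the example.

```python
# Hypothetical input: raw bytes as they might arrive from a file or socket.
# Assumption: the bytes are UTF-8 encoded ("café").
raw_input_bytes = b"caf\xc3\xa9"

# 1. Decode at the input boundary: bytes -> text.
text = raw_input_bytes.decode("utf-8")

# 2. Work with text only. Operations see characters, not bytes.
shouted = text.upper()

# 3. Encode at the output boundary: text -> bytes.
raw_output_bytes = shouted.encode("utf-8")

print(len(raw_input_bytes), len(text))  # byte count vs. character count
print(raw_output_bytes)
```

Keeping the conversion at the edges means the middle of the program never mixes the two types, which is where most `UnicodeEncodeError` and `UnicodeDecodeError` surprises come from.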

See the

960 questions

votes

3 answers

shlex.split still not supporting unicode?

According to the documentation, in Python 2.7.3, shlex should support UNICODE. However, when running the code below, I get: UnicodeEncodeError: 'ascii' codec can't encode characters in position 184-189: ordinal not in range(128) Am I doing something…

asked Jan 08 '13 at 15:57

petr

votes

3 answers

python byte string encode and decode

I am trying to convert an incoming byte string that contains non-ASCII characters into a valid UTF-8 string so that I can dump it as JSON. b = '\x80' u8 = b.encode('utf-8') j = json.dumps(u8) I expected j to be '\xc2\x80' but instead I…

python json unicode utf-8 python-unicode

asked Mar 07 '12 at 15:54

kung-foo

votes

4 answers

Treat an emoji as one character in a regex

Here's a small example: reg = ur"((?P<initial>[+\-])(?P<rest>.+?))$" (In both cases the file has -*- coding: utf-8 -*-) In Python 2: re.match(reg, u"👍hello").groupdict() # => {u'initial': u'\ud83d', u'rest': u'\udc4dhello'} # unicode why must you do…

python regex python-2.7 python-unicode unicode-literals

asked Jan 16 '18 at 05:53

naiveai

votes

2 answers

Tensorflow can not restore vocabulary in evaluation process

I am new to TensorFlow and neural networks. I started a project about detecting errors in Persian texts. I used the code at this address and developed the code here. Please check the code, because I cannot put all of it here. What I…

python tensorflow python-unicode

asked Dec 03 '17 at 10:49

Masoud Masoumi Moghadam

votes

5 answers

How to write Russian characters in file?

In the console, when I try to output Russian characters, it gives me ??????????????? Who knows why? I tried writing to a file, and got the same result. For example: f=open('tets.txt','w') f.write('some russian text') f.close inside file is -…

python windows unicode python-2.x python-unicode

asked Jul 07 '10 at 20:59

Pol

votes

1 answer

Display width of unicode strings in Python

How can I determine the display width of a Unicode string in Python 3.x, and is there a way to use that information to align those strings with str.format()? Motivating example: Printing a table of strings to the console. Some of the strings contain…

python string unicode width python-unicode

asked Mar 06 '14 at 13:05

Christian Aichinger

votes

4 answers

Comparing string and unicode in Python 2.7.5

I wonder why, when I define: a = [u'k',u'ę',u'ą'] and then type: 'k' in a I get True, while: 'ę' in a gives me False? It really gives me a headache, and it seems someone did this on purpose to make people mad...

python python-2.7 python-unicode

asked Nov 14 '13 at 00:42

Kulawy Krul

votes

5 answers

base64 encoding unicode strings in python 2.7

I have a unicode string retrieved from a webservice using the requests module, which contains the bytes of a binary document (PCL, as it happens). One of these bytes has the value 248, and attempting to base64 encode it leads to the following…

python character-encoding base64 unicode-string python-unicode

asked Mar 05 '12 at 18:57

Marcin

votes

3 answers

UnicodeEncodeError when fetching url

I have an issue trying to get all the text nodes in an HTML document using lxml: I get a UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 8995: ordinal not in range(128). However, when I try to find out the type of…

python unicode encoding urlfetch python-unicode

asked Jun 16 '12 at 00:22

Robert Smith

votes

1 answer

TypeError: write() argument 1 must be unicode, not str

I'm trying to import a text file and save it on my desktop, but the text is in "utf-8" (the book states this), so when I save without encoding the text has many strange characters, but when I try to save with explicit encoding this…

python python-2.7 urllib2 python-unicode

asked Oct 10 '18 at 18:48

Ana Cecília Vieira

votes

1 answer

Pipreqs: UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 1206: character maps to <undefined>

When I use pipreqs, I have this problem. I use anaconda and Russian Windows. root@DESKTOP-ETLLRI1 C:\Users\root\Desktop\resumes $ pipreqs C:\Users\root\Desktop\resumes Traceback (most recent call last): File…

python character-encoding anaconda python-unicode

asked Jul 24 '18 at 16:52

krax1337

votes

3 answers

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 6: ordinal not in range(128)

I am trying to pull a list of 500 restaurants in Amsterdam from TripAdvisor; however after the 308th restaurant I get the following error: Traceback (most recent call last): File "C:/Users/dtrinh/PycharmProjects/TripAdvisorData/LinkPull-HK.py",…

python python-2.7 web-scraping python-unicode

asked Nov 15 '16 at 21:05

dtrinh

votes

1 answer

Python returns length of 2 for single Unicode character string

In Python 2.7: In [2]: utf8_str = '\xf0\x9f\x91\x8d' In [3]: print(utf8_str) In [4]: unicode_str = utf8_str.decode('utf-8') In [5]: print(unicode_str) In [6]: unicode_str Out[6]: u'\U0001f44d' In [7]: len(unicode_str) Out[7]: 2 Since unicode_str…

python python-2.7 unicode python-unicode

asked Mar 17 '15 at 21:21

Tom

votes

1 answer

UnicodeDecodeError when using Python 2.x unicodecsv

I'm trying to write out a csv file with Unicode characters, so I'm using the unicodecsv package. Unfortunately, I'm still getting UnicodeDecodeErrors: # -*- coding: utf-8 -*- import codecs import unicodecsv raw_contents = 'He observes an…

python unicode python-unicode

asked Jul 31 '14 at 15:26

Scott

votes

1 answer

Python print unicode list

With the following code lst = [u'\u5de5', u'\u5de5'] msg = repr(lst).decode('unicode-escape') print msg I got [u'工', u'工'] How can I remove the leading u so that the content of msg is: ['工', '工']

python string python-2.7 python-unicode

asked Mar 30 '14 at 15:29

gongzhitaao
