Questions tagged [python-unicode]

Python distinguishes between byte strings and Unicode strings. *Decoding* transforms byte strings to Unicode; *encoding* transforms Unicode strings to bytes.

Remember: you decode your input to unicode, work with unicode, then encode unicode objects for output as bytes.
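
The decode, work, encode cycle described above can be sketched roughly as follows. This is a minimal illustration that assumes Python 3 (where `str` is the Unicode type) and assumes the input is UTF-8; the sample bytes and the variable names `raw`, `text`, `shouted`, and `out` are made up for the example.

```python
# Hypothetical input: raw bytes as they might arrive from a file or socket.
# Assumption: the bytes are UTF-8 encoded.
raw = b"caf\xc3\xa9 flafel"

# 1. Decode at the input boundary: bytes -> str (Unicode).
text = raw.decode("utf-8")

# 2. Work with Unicode text inside the program.
shouted = text.upper()

# 3. Encode at the output boundary: str -> bytes.
out = shouted.encode("utf-8")

print(len(raw), len(text))  # byte count vs. character count
print(out)
```

Note that the byte string is one element longer than the decoded text, because the accented character occupies two bytes in UTF-8 but is a single code point once decoded.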

See the

960 questions
-1
votes
2 answers

Python 3 UnicodeEncodeError for characters and smileys in Tweets

I'm making a Twitter API, I get tweets about a specific word (right now it's 'flafel'). Everything is fine except this tweet b'And when I\'m thinking about getting the chili sauce on my flafel and the waitress, a Pinay, tells me not to get it cos…
GLHF
-1
votes
2 answers

UnicodeDecodeError: 'ascii' codec can't decode byte 0x92?

So I am trying to read data off a .txt file and then find the most common 30 words and print them out. However, whenever I'm reading my txt file, I receive the error: "UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 338:…
-1
votes
2 answers

unicode error printing \u2002 using Python 3

I am getting the error that Python can't decode character \u2002 when trying to print a block of text: UnicodeEncodeError: 'charmap' codec can't encode character '\u2002' in position 355: character maps to <undefined> What I don't understand is…
kyrenia
-1
votes
2 answers

Unicode category for commas and quotation marks

I have this helper function that gets rid of control characters in XML text: def remove_control_characters(s): #Remove control characters in XML text t = "" for ch in s: if unicodedata.category(ch)[0] == "C": t += " " …
SANBI samples
-1
votes
1 answer

How to convert u'\xd0' to d0 in hex?

I got this simple but difficult problem in Python. For unknown reason with pypyjs, I got my binary buffer as u'\xd0\xcf\x11\xe0\xa1...'. By the look of it, I knew it would be alright if it is a binary stream of 'd0cf 11e0 a1...'. I wondered how do I…
chfw
-1
votes
2 answers

how to get dictionary value as same using python?

Solved with your help #!/usr/bin/python # -*- coding: utf-8 -*- message = {'message1':'நாம்','message2':'செய்தி'} a={} for i in message.keys(): if "message" in i: a[i]=message[i] status="success" print a got…
the-run
-1
votes
1 answer

Remove Unicode values that have spaces between them

I have a file containing Unicode strings aligned line by line. ജുഗുപ്‌സയോ നീരസമോ പരിഹാസമോ ദ്യോതിപ്പിക്കുന്ന മുഖഭാവം വളവ്‌ വക്രത തിരിവ്‌ കോട്ടം നന്നേ ചെറുപ്രായത്തില്‍ അസാമന്യ ജീവിത വിജയം നേടുന്നയാള്‍ ഇന്റര്‍നെറ്റിലെ പ്രധാനപ്പെട്ട…
user2085779
-1
votes
1 answer

Arabic code point range in python

i have a code below which Liang Sun implemented #Created by Liang Sun in 2013 import re import collections import hashlib class Simhash(object): def __init__(self, value): self.f = 64 self.reg = ur'[\w\ufb50-\ufdff]' …
NZrMd
-1
votes
3 answers

Python doesn't save file with unicode characters

Python doesn't save the file with Hebrew characters. How do I fix this? (Python 2.7) The example image shows a file in the SPE IDE with a first line of heb = ["ד" ,"ג" ,"ב", "א", ...]
-2
votes
1 answer

UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 1: ordinal not in range(128)

I am using Python 3.6 and can't find a solution for this. Can anyone help me resolve this issue?
-2
votes
1 answer

Python: Are there any libraries with all the unicode characters similar to the string library for ascii characters?

In python, the string library has methods like string.ascii_letters. Is there anything similar for Unicode characters or symbols? I haven't been able to find anything myself. I appreciate any help! Fairly new to this type of thing so apologies if…
-2
votes
1 answer

How to get all unicode characters of a language by using its ISO language code in Python?

For example, the ISO language code of German language is de. How do I get all unicode characters of that language in Python? If that's not directly possible, how about the following: Given an ISO language code (say de), How do I find the script…
Gokul NC
-2
votes
2 answers

Insert enter marks before the selected word

I need to insert line breaks (enter marks) between a string before each new word starts. String: test (n) trial, experiment, check run (v) race, rush speed (n) race, sprint, rush, dash, zoom Expected: test (n) trial, experiment, check run (v)…
shantanuo
-2
votes
3 answers

How to handle UnicodeDecodeError

str1="khloé kardashian" Traceback (most recent call last): File "", line 1, in UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 4: ordinal not in range(128) how to encode it in perfect way. I am trying…
Raj
-2
votes
1 answer

Unicode conversion issue while using API gateway

The following URL works as expected and returns "null". https://zga2tn1wgd.execute-api.us-east-1.amazonaws.com/mycall?url=https://mr.wikipedia.org/s/4jp4 But the same page, with unicode string instead of ascii string, throws an…
shantanuo