I am pulling tweets in Python using tweepy.
All of the data comes back as type unicode.
E.g. print type(data) gives me <type 'unicode'>
The text contains non-ASCII Unicode characters.
E.g. hello\u2026 im am fine\u2019s
I want to remove all of these non-ASCII characters. Is there a regular expression I can use?
str.replace isn't a viable option, as the characters can be any value, from smileys to Unicode apostrophes.
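To make the request concrete, here is a minimal sketch of the kind of regex in question. It assumes that "unicode characters" means anything outside the 7-bit ASCII range; the names NON_ASCII and strip_non_ascii are hypothetical, chosen only for illustration, and the sample string is the one from above.

```python
import re

# Assumption: "unicode characters" means any code point outside 7-bit ASCII.
# The character class matches runs of such characters.
NON_ASCII = re.compile(r'[^\x00-\x7F]+')


def strip_non_ascii(text):
    """Return text with every non-ASCII character removed (hypothetical helper)."""
    return NON_ASCII.sub(u'', text)


data = u'hello\u2026 im am fine\u2019s'
cleaned = strip_non_ascii(data)
print(cleaned)  # hello im am fines
```

Note that this drops the characters outright rather than replacing them with ASCII look-alikes, so the ellipsis and the apostrophe simply disappear from the sample.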