If the file looks fine when you open it in a native text editor, the problem is most likely the other program, which isn't detecting the encoding correctly and is turning the text into mojibake. As mentioned in the comments, the culprit is almost certainly a Unicode quote character that looks like an ' but isn't. For example:
my_string = ('The Knights who say '
'\N{LEFT SINGLE QUOTATION MARK}'
'Ni!'
'\N{RIGHT SINGLE QUOTATION MARK}'
)
def print_repr_escaped(x):
    # Show non-ASCII characters as \uXXXX escapes so they become visible.
    print(repr(x.encode('unicode_escape').decode('ascii')))
print_repr_escaped(my_string)
# 'The Knights who say \\u2018Ni!\\u2019'
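To confirm exactly which characters are causing trouble, you could list every non-ASCII character with its position and official Unicode name. This is a minimal sketch using the standard `unicodedata` module; the helper name `find_non_ascii` and the `sample` string are made up for illustration.

```python
import unicodedata


def find_non_ascii(text):
    """Return (index, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, 'UNKNOWN'))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


# Same text as above, written with escapes so the snippet is self-contained.
sample = 'The Knights who say \u2018Ni!\u2019'
for index, char, name in find_non_ascii(sample):
    print(index, repr(char), name)
```

For this sample it should report the two quotation marks around `Ni!` and nothing else.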
If you can't control the encoding of the other program, you have two options:
Drop all non-ASCII characters like so:
stripped = my_string.encode('ascii', 'ignore').decode('ascii')
print_repr_escaped(stripped)
# 'The Knights who say Ni!'
Attempt to convert Unicode characters to their closest ASCII equivalents with something like Unidecode (a third-party package, installed with pip install Unidecode):
import unidecode
converted = unidecode.unidecode(my_string)
print_repr_escaped(converted)
# "The Knights who say 'Ni!'"
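If you would rather not add a dependency and only a handful of characters are involved, a standard-library alternative is to replace them explicitly with `str.translate`. This is only a sketch: the `QUOTE_MAP` table and the `normalize_quotes` helper are hypothetical names, and the mapping covers just the common typographic quotes, so extend it to match whatever actually appears in your data.

```python
# Hypothetical mapping from typographic quotes to their ASCII look-alikes.
QUOTE_MAP = str.maketrans({
    '\u2018': "'",   # LEFT SINGLE QUOTATION MARK
    '\u2019': "'",   # RIGHT SINGLE QUOTATION MARK
    '\u201c': '"',   # LEFT DOUBLE QUOTATION MARK
    '\u201d': '"',   # RIGHT DOUBLE QUOTATION MARK
})


def normalize_quotes(text):
    """Replace the mapped quote characters; leave everything else untouched."""
    return text.translate(QUOTE_MAP)


sample = 'The Knights who say \u2018Ni!\u2019'
print(normalize_quotes(sample))
```

Unlike the `'ignore'` approach, this keeps the quotes (as plain ASCII ones), and unlike Unidecode it leaves any character not in the table exactly as it was.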