3

I am writing a script to process emails, and I have access to the raw string content of the emails.

I am currently looking for the string "Content-Transfer-Encoding:" and scanning the characters that immediately follow it to determine the encoding. Example encodings: base64, 7bit, or quoted-printable.

Is there a better way to automatically determine the email encoding (or at least a more Pythonic way)?

Thank you.

sisanared

2 Answers

1

You can use the email package from the Python standard library.

For example:

import email

raw = """From: John Doe <example@example.com>
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

Hi there!
"""

my_email = email.message_from_string(raw)
# Header lookup is case-insensitive; this prints "quoted-printable".
print(my_email["Content-Transfer-Encoding"])

See the examples in the email package documentation for more.
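Real messages are often multipart, and each part can carry its own Content-Transfer-Encoding. The following is a minimal sketch of reading the header per part and letting the email package decode the payload. The helper name part_encodings and the sample message are made up for illustration, and the "7bit" default for a missing header follows the default stated in RFC 2045.

```python
import email

# Hypothetical two-part message used only for this illustration.
raw = """From: John Doe <example@example.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

Caf=E9
--XYZ
Content-Type: text/plain
Content-Transfer-Encoding: base64

SGkgdGhlcmUh
--XYZ--
"""


def part_encodings(raw_message):
    """Return (transfer_encoding, decoded_bytes) for every leaf part."""
    msg = email.message_from_string(raw_message)
    result = []
    for part in msg.walk():
        # Multipart containers hold other parts and have no payload of their own.
        if part.is_multipart():
            continue
        # Assumption: treat a missing header as "7bit" (the RFC 2045 default).
        cte = part.get("Content-Transfer-Encoding", "7bit").strip().lower()
        # decode=True undoes base64 / quoted-printable and returns bytes.
        result.append((cte, part.get_payload(decode=True)))
    return result


for encoding, payload in part_encodings(raw):
    print(encoding, payload)
```

With get_payload(decode=True) you usually do not need to branch on the encoding yourself; the header value is mostly useful for logging or validation.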

turdus-merula
0

The question "Python: Is there a way to determine the encoding of text file?" has some good answers. Basically, there is no perfectly reliable way to do it. Checking the header, as you are doing now, is the best approach and should be tried first; if the header is missing, there are a few options that sometimes work.
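A rough sketch of the "check the header first, then fall back" idea, using only the standard email package. The function name describe_encoding and its fallback_charset parameter are hypothetical; the fallbacks chosen here ("7bit" and "us-ascii") are the defaults RFC 2045 assumes when the headers are absent, not a guess at the actual content.

```python
import email


def describe_encoding(raw_message, fallback_charset="us-ascii"):
    """Return (transfer_encoding, charset), preferring declared headers."""
    msg = email.message_from_string(raw_message)

    # First choice: the declared Content-Transfer-Encoding header.
    cte = msg.get("Content-Transfer-Encoding")
    # Assumption: fall back to "7bit" when the header is missing.
    cte = cte.strip().lower() if cte else "7bit"

    # The charset parameter of Content-Type, or the fallback if undeclared.
    charset = msg.get_content_charset() or fallback_charset
    return cte, charset


declared = "Content-Type: text/plain; charset=ISO-8859-1\nContent-Transfer-Encoding: Base64\n\nSGk=\n"
undeclared = "Subject: no MIME headers\n\nHi\n"

print(describe_encoding(declared))
print(describe_encoding(undeclared))
```

If the declared values turn out to be wrong or absent in practice, content-based guessing is the next step, with the reliability caveats mentioned above.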