I spent a few hours yesterday trying to decode emails using the right encoding. The solution turns out to be trivial, but here it is anyway:
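    from email.parser import Parser

    def decode_email(msg_str):
        # Parse the raw message text into a Message object
        message = Parser().parsestr(msg_str)
        decoded_message = ''
        # walk() visits every part, including nested multipart sections
        for part in message.walk():
            if part.get_content_type() == 'text/plain':
                # The charset may be missing; fall back to ASCII in that case
                charset = part.get_content_charset() or 'us-ascii'
                # decode=True undoes base64 / quoted-printable and returns bytes
                part_bytes = part.get_payload(decode=True)
                decoded_message += part_bytes.decode(charset)
        return decoded_message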
It's very simple: you just have to remember to get the charset of the part you're trying to decode. This way you can handle UTF-8 as well as ISO-8859-1, or any other encoding the message declares.
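To see the idea in action, here is a minimal sketch that runs the same approach on two test messages, one in ISO-8859-1 and one in UTF-8. The function name `decode_text_parts` and its `fallback` parameter are my own additions for illustration, not part of the original snippet; the fallback covers parts that declare no charset or one Python does not know. The test messages are built with `MIMEText` from the standard library.

```python
from email.mime.text import MIMEText
from email.parser import Parser


def decode_text_parts(msg_str, fallback='us-ascii'):
    """Hypothetical variant of decode_email with a fallback charset."""
    message = Parser().parsestr(msg_str)
    chunks = []
    for part in message.walk():
        if part.get_content_type() != 'text/plain':
            continue
        # Assumption: use `fallback` when the part declares no charset
        charset = part.get_content_charset() or fallback
        payload = part.get_payload(decode=True)  # bytes
        try:
            chunks.append(payload.decode(charset))
        except (LookupError, UnicodeDecodeError):
            # Unknown charset name or undecodable bytes: degrade gracefully
            chunks.append(payload.decode(fallback, errors='replace'))
    return ''.join(chunks)


# Two sample messages with different declared charsets
latin1_msg = MIMEText('café', 'plain', 'iso-8859-1').as_string()
utf8_msg = MIMEText('naïve ☃', 'plain', 'utf-8').as_string()

if __name__ == '__main__':
    print(decode_text_parts(latin1_msg))
    print(decode_text_parts(utf8_msg))
```

Replacing undecodable bytes instead of raising is a design choice that suits display or logging; if you need to detect broken messages, letting the exception propagate is probably the better option.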