Decoding emails in Python

December 18, 2009 Tags: python, email

Note to self regarding charsets/encodings with emails in Python.

I spent a few hours yesterday trying to decode emails using the right encoding. The solution looks trivial, but here it is anyway:

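from email.parser import Parser

def decode_email(msg_str):
    """Return the concatenated text/plain parts of an email as decoded text."""
    message = Parser().parsestr(msg_str)
    decoded_parts = []
    for part in message.walk():
        if part.get_content_type() != 'text/plain':
            continue
        # get_content_charset() returns None when the part declares no
        # charset; fall back to us-ascii, the default for text/plain.
        charset = part.get_content_charset() or 'us-ascii'
        # decode=True undoes base64 / quoted-printable transfer encoding.
        part_bytes = part.get_payload(decode=True)
        decoded_parts.append(part_bytes.decode(charset))
    return ''.join(decoded_parts)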

It's very simple: you just have to remember to look up the charset of each part you're trying to decode. This way you can handle UTF-8 as well as ISO-8859-1, or any other encoding.
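To see the charset lookup at work, here is a small sketch that runs the same text through two different encodings. The two sample messages are made up for illustration, and the function is a variant of the one above with two assumptions of mine added: a `us-ascii` fallback for parts that declare no charset, and `errors='replace'` so that a wrongly labelled part does not raise.

```python
from email.parser import Parser


def decode_email(msg_str):
    """Return the concatenated text/plain parts of an email as decoded text."""
    message = Parser().parsestr(msg_str)
    decoded_parts = []
    for part in message.walk():
        if part.get_content_type() != 'text/plain':
            continue
        # Assumption: fall back to us-ascii when no charset is declared.
        charset = part.get_content_charset() or 'us-ascii'
        # decode=True undoes the quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        # Assumption: replace undecodable bytes instead of raising.
        decoded_parts.append(payload.decode(charset, 'replace'))
    return ''.join(decoded_parts)


# Hypothetical sample messages: the same word in two different charsets.
utf8_msg = (
    'Content-Type: text/plain; charset="utf-8"\n'
    'Content-Transfer-Encoding: quoted-printable\n'
    '\n'
    'caf=C3=A9\n'
)
latin1_msg = (
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    'Content-Transfer-Encoding: quoted-printable\n'
    '\n'
    'caf=E9\n'
)
# No charset declared at all: exercises the us-ascii fallback.
plain_msg = 'Content-Type: text/plain\n\nhello\n'

if __name__ == '__main__':
    for raw in (utf8_msg, latin1_msg, plain_msg):
        print(repr(decode_email(raw)))
```

Both encoded messages should come out as the same text, even though the raw bytes differ, because each part is decoded with the charset it declares.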

