Encoding problems in Python email


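I have a small Python script that pulls emails from a POP mail address and dumps them into files (one file per email).

A PHP script then runs through the files and displays them.

I am having a problem with emails encoded in ISO-8859-1 (Latin-1).

Here is an example of the text I get: =?iso-8859-1?Q?G=EDsli_Karlsson?= and Sj=E1um hva=F0=F3li er kl=E1r J

The way I pull the emails is this: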

import poplib
import random
import rfc822
import StringIO

pop = poplib.POP3(server)

mail_list = pop.list()[1]

for m in mail_list:
    mno, size = m.split()
    lines = pop.retr(mno)[1]

    file = StringIO.StringIO("\r\n".join(lines))
    msg = rfc822.Message(file)

    body = file.readlines()

    f = open(str(random.randint(1,100)) + ".email", "w")
    f.write(msg["From"] + "\n")
    f.write(msg["Subject"] + "\n")
    f.write(msg["Date"] + "\n")

    for b in body:
        f.write(b)

I have probably tried every combination of encoding and decoding in both Python and PHP.

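This is MIME content, and it is how the email actually looks on the wire, not a bug somewhere. You have to use a MIME decoding library (or decode it yourself by hand) on the PHP side, which, if I understand correctly, is the part acting as the email renderer.

In Python, the standard library can do this decoding. In PHP, I am not sure. The Zend Framework seems to have a MIME parser somewhere, and there are probably countless code snippets floating around.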
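As a rough illustration of decoding on the Python side, here is a minimal sketch using `email.header` from the standard library. It is written for Python 3 (the code elsewhere in this thread is Python 2), and the helper name `decode_mime_header` is made up for this example.

```python
from email.header import decode_header, make_header

def decode_mime_header(value):
    """Turn a raw header value with encoded words into readable text."""
    # decode_header splits the value into (chunk, charset) pairs;
    # make_header joins those pairs back into one Unicode string.
    return str(make_header(decode_header(value)))

# The sample header from the question
print(decode_mime_header("=?iso-8859-1?Q?G=EDsli_Karlsson?="))
```

The decoded string can then be written out as UTF-8, so whatever displays the files does not need to understand MIME at all.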

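Until very recently, plain Latin-N or UTF-N was not allowed in headers, which means such text gets encoded using a method whose original specification has since been superseded. Accented characters are encoded in either quoted-printable or Base64, indicated by ?Q? (or ?B? for Base64). You will have to decode them. Oh, and a space is encoded as "_".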
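To make the structure of such an encoded word concrete, here is a hand-rolled sketch in Python 3. The regular expression and the function name `decode_encoded_word` are illustrative assumptions, not part of any library, and the sketch handles a single encoded word only. In practice `email.header.decode_header` is the safer choice.

```python
import base64
import quopri
import re

# Layout of one encoded word: =?charset?encoding?encoded-text?=
# where encoding is Q (quoted-printable variant) or B (Base64).
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([QqBb])\?([^?]*)\?=")

def decode_encoded_word(word):
    """Decode a single encoded word; return other input unchanged."""
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        return word
    charset, encoding, text = match.groups()
    if encoding.upper() == "Q":
        # header=True makes quopri treat "_" as a space, as the Q encoding requires
        raw = quopri.decodestring(text.encode("ascii"), header=True)
    else:
        raw = base64.b64decode(text)
    return raw.decode(charset)

print(decode_encoded_word("=?iso-8859-1?Q?G=EDsli_Karlsson?="))
```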

You can use the Python email library (Python 2.5+) to avoid these problems:

import email
import poplib
import random
from cStringIO import StringIO
from email.generator import Generator

pop = poplib.POP3(server)

mail_count = len(pop.list()[1])

# POP3 message numbers start at 1, not 0
for message_num in xrange(1, mail_count + 1):
    message = "\r\n".join(pop.retr(message_num)[1])
    message = email.message_from_string(message)

    out_file = StringIO()
    message_gen = Generator(out_file, mangle_from_=False, maxheaderlen=60)
    message_gen.flatten(message)
    message_text = out_file.getvalue()

    filename = "%s.email" % random.randint(1,100)
    email_file = open(filename, "w")
    email_file.write(message_text)
    email_file.close()

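This code fetches all the messages from the server, turns them into Python message objects, and then flattens them back into strings to write to files. By using the email package from the Python standard library, the MIME encoding and decoding problems should be taken care of for you.

Disclaimer: I have not tested this code, but it should work.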
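For comparison, here is a sketch of the same parsing step in Python 3, where `policy.default` makes the email package hand back decoded headers and a decoded body directly. The sample message, the address `gisli@example.com`, and the helper name `summarize` are invented for this illustration, and the sketch assumes a single-part text message.

```python
from email import message_from_bytes, policy

# Hypothetical raw message, shaped like what POP3 retr() would return once joined
RAW = (
    b"From: =?iso-8859-1?Q?G=EDsli_Karlsson?= <gisli@example.com>\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Sj=E1um hva=F0=F3li er kl=E1r\r\n"
)

def summarize(raw_bytes):
    """Return (sender, subject, body) as decoded Unicode strings."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    # With policy.default, header lookups yield decoded text and
    # get_content() undoes both the transfer encoding and the charset.
    return str(msg["From"]), str(msg["Subject"]), msg.get_content()

sender, subject, body = summarize(RAW)
print(sender)
print(body)
```

Writing these strings to a file opened with `encoding="utf-8"` would give the display script plain UTF-8 text to work with.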

This is the MIME encoding of headers. Here is how to decode it in Python:

import email.Header
import sys

header_and_encoding = email.Header.decode_header(sys.stdin.readline())
for part in header_and_encoding:
    if part[1] is None:
        print part[0],
    else:
        upart = (part[0]).decode(part[1])
        print upart.encode('latin-1'),
print

A more detailed explanation is available (in French).

There is probably a better way to do this, but this is what I ended up with. Thanks for your help.

import email
import poplib
import quopri
import random
import sys
import StringIO
from email.Generator import Generator

user = "email@example.com"
password = "password"
server = "mail.example.com"

# connects
try:
    pop = poplib.POP3(server)
except:
    print "Error connecting to server"
    sys.exit(-1)

# user auth
try:
    print pop.user(user)
    print pop.pass_(password)
except:
    print "Authentication error"
    sys.exit(-2)

# gets the mail list
mail_list = pop.list()[1]

for m in mail_list:
    mno, size = m.split()
    message = "\r\n".join(pop.retr(mno)[1])
    message = email.message_from_string(message)

    # uses the email flatten
    out_file = StringIO.StringIO()
    message_gen = Generator(out_file, mangle_from_=False, maxheaderlen=60)
    message_gen.flatten(message)
    message_text = out_file.getvalue()

    # fixes mime encoding issues (for display within html)
    clean_text = quopri.decodestring(message_text)

    msg = email.message_from_string(clean_text)

    # finds the last body (when in mime multipart, html is the last one)
    for part in msg.walk():
        if part.get_content_type():
            body = part.get_payload(decode=True)

    filename = "%s.email" % random.randint(1,100)

    email_file = open(filename, "w")

    email_file.write(msg["From"] + "\n")
    email_file.write(msg["Return-Path"] + "\n")
    email_file.write(msg["Subject"] + "\n")
    email_file.write(msg["Date"] + "\n")
    email_file.write(body)

    email_file.close()

pop.quit()
sys.exit()
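The loop above keeps whichever part comes last. A slightly more explicit variant is to pick the part by content type and decode it with its declared charset. The following is a Python 3 sketch under that assumption; `pick_body` and the demo message are hypothetical and exist only to show the idea.

```python
from email.message import EmailMessage

def pick_body(msg, preferred=("text/html", "text/plain")):
    """Return the decoded text of the first part matching a preferred type."""
    for wanted in preferred:
        for part in msg.walk():
            if part.get_content_type() == wanted:
                # decode=True undoes base64 / quoted-printable and returns bytes
                payload = part.get_payload(decode=True)
                charset = part.get_content_charset() or "ascii"
                return payload.decode(charset, errors="replace")
    return ""

# Hypothetical two-part message, built only to exercise pick_body
demo = EmailMessage()
demo["Subject"] = "demo"
demo.set_content("Gísli Karlsson")
demo.add_alternative("<p>Gísli Karlsson</p>", subtype="html")

print(pick_body(demo))
```

Selecting by type avoids relying on the HTML part always being last, which holds for many messages but is not guaranteed.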

What is the actual error you are getting?

Well, plain UTF-8 (not Latin-1) is authorized in RFC 5335, "Internationalized Email Headers". But it has experimental status and is not widely deployed. The current standard is RFC 2047, "MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text".

Thanks, I was sure you would add the references I needed :)

I used a combination of your reply and other things I found. I have added my code as an answer.
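To see the RFC 2047 form from the other direction, here is a small Python 3 sketch that produces an encoded header with `email.header.Header` and then decodes it again. The function name `encode_header_value` is an assumption made for this example.

```python
from email.header import Header, decode_header, make_header

def encode_header_value(text, charset="iso-8859-1"):
    """Encode non-ASCII header text as an RFC 2047 encoded word."""
    return Header(text, charset).encode()

encoded = encode_header_value("Gísli Karlsson")
print(encoded)  # pure ASCII, safe to put in a header

# Decoding the result gives back the original text
print(str(make_header(decode_header(encoded))))
```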