How do I generate binary RFC822-style headers in Python 3.2?


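How can I convince email.generator.Generator to work with binary files in Python 3.2? This looks like exactly the use case for the policy framework introduced in Python 3.3, but I would like my code to run on 3.2.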

from email.parser import Parser
from email.generator import Generator
from io import BytesIO, StringIO

data = "Key: \N{SNOWMAN}\r\n\r\n"
message = Parser().parse(StringIO(data))
with open("/tmp/rfc882test", "w") as out:
    Generator(out, maxheaderlen=0).flatten(message)

It fails with UnicodeEncodeError: 'ascii' codec can't encode character '\u2603' in position 0: ordinal not in range(128)

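Your data is not a valid RFC2822 header, and I suspect that is what is misleading you. It is a Unicode string, but RFC2822 is always ASCII only. To use non-ASCII characters, you need to encode them using a character set and either base64 or quoted-printable encoding (that is, as an RFC 2047 encoded word).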

So valid code looks like this:

from email.parser import Parser
from email.generator import Generator
from io import BytesIO, StringIO

data = "Key: =?utf8?b?4piD?=\r\n\r\n"
message = Parser().parse(StringIO(data))
with open("/tmp/rfc882test", "w") as out:
    Generator(out, maxheaderlen=0).flatten(message)

This, of course, avoids the error entirely.
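
To make the "character set plus base64" step concrete, here is a minimal sketch that builds the encoded word by hand and decodes it again. The helper name encode_word_b64 is hypothetical and exists only for illustration; the sketch uses the charset name utf-8, while the header above spells it utf8.

```python
import base64
from email.header import decode_header


def encode_word_b64(text, charset="utf-8"):
    # Hypothetical helper: encode the text in the given charset, base64 the
    # bytes, and wrap the result as an RFC 2047 "B" encoded word.
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?{0}?b?{1}?=".format(charset, payload)


word = encode_word_b64("\N{SNOWMAN}")
print(word)

# decode_header() splits the encoded word back into raw bytes and a charset.
decoded_bytes, decoded_charset = decode_header(word)[0]
print(decoded_bytes.decode(decoded_charset))
```

The result is pure ASCII, which is why the Generator no longer raises.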

The question is then how to generate headers such as =?utf8?b?4piD?=, and the answer lies in the email.header module.

Here is an example:

>>> from email import header
>>> header.Header('\N{SNOWMAN}', 'utf8').encode()
'=?utf8?b?4piD?='
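
A Header object can also be assigned to a message directly, so the plain text Generator has only ASCII to write. The following is a sketch under the assumption that the message is built in code rather than parsed from a file; it is not the original poster's program.

```python
from email.generator import Generator
from email.header import Header
from email.message import Message
from io import StringIO

message = Message()
# The Header object carries the charset, so the value is emitted as an
# RFC 2047 encoded word rather than as raw Unicode.
message["Key"] = Header("\N{SNOWMAN}", "utf-8")

out = StringIO()
Generator(out, maxheaderlen=0).flatten(message)
text = out.getvalue()
print(text)
```

Because the flattened text is ASCII, it can be written to a file opened in text mode with any ASCII-compatible encoding.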
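
For handling files in a Key: Value format, the email module is the wrong solution. Handling such files without the email module is trivial, and you do not have to work around the restrictions of RFC2822. For example: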

# -*- coding: UTF-8 -*-
import io
import sys
if sys.version_info > (3,):
    def u(s): return s
else:
    def u(s): return s.decode('unicode-escape')

def parse(infile):
    # Read "Key: Value" lines into a dict; duplicate keys are an error.
    res = {}

    for line in infile:
        key, value = line.strip().split(': ',1)
        if key in res:
            raise ValueError(u("Key {0} appears twice").format(key))
        res[key] = value
    return res

def generate(outfile, data):
    for key in data:
        outfile.write(u("{0}: {1}\n").format(key, data[key]))


if __name__ == "__main__":
    # Ensure roundtripping:
    data = {u('Key'): u('Value'), u('Foo'): u('Bar'), u('Frötz'): u('Öpöpöp')}
    with io.open('/tmp/outfile.conf', 'wt', encoding='UTF8') as outfile:
        generate(outfile, data)

    with io.open('/tmp/outfile.conf', 'rt', encoding='UTF8') as infile:
        res = parse(infile)

    assert data == res

This code took 15 minutes to write and works in both Python 2 and Python 3. If you want line continuations and so on, those are easy to add as well.
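
As a rough illustration of the line-continuation remark, here is one possible sketch. It assumes, RFC822-style, that a line starting with a space or tab continues the previous value and that the pieces are joined with a single space. Both choices, and the function name parse_with_continuations, are assumptions and not part of the answer's code.

```python
def parse_with_continuations(infile):
    """Parse 'Key: Value' lines; indented lines continue the previous value."""
    res = {}
    last_key = None
    for raw in infile:
        line = raw.rstrip('\r\n')
        if not line.strip():
            continue  # skip blank lines
        if line[0] in ' \t':
            # Continuation line: append to the most recently seen key.
            if last_key is None:
                raise ValueError("Continuation line before any key")
            res[last_key] += ' ' + line.strip()
            continue
        key, value = line.split(': ', 1)
        if key in res:
            raise ValueError("Key {0} appears twice".format(key))
        res[key] = value
        last_key = key
    return res


sample = ["Key: first part\n", "  second part\n", "Frötz: Öpöpöp\n"]
parsed = parse_with_continuations(sample)
print(parsed)
```

Any iterable of text lines works as input, including a file opened with io.open(..., encoding='UTF8') as in the code above.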


[link] is a more complete solution that supports comments and so on.

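A useful solution came from:


This is for programs that want to read and write binary Key: value files without caring about the encoding. If you only need to consume the headers as decoded text, and do not need to write them back with Generator(), then Parser().parse(open("headers.txt", "r", encoding="utf-8")) is sufficient.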
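
A minimal sketch of that read-only case follows. The temporary directory and the sample content are assumptions made only to keep the example self-contained.

```python
import os
import tempfile
from email.parser import Parser

# Write a small sample file as UTF-8 (the path is for illustration only).
path = os.path.join(tempfile.mkdtemp(), "headers.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("Key: \N{SNOWMAN}\n\n")

# Opening the file in text mode with encoding="utf-8" decodes the bytes
# before the parser sees them, so the header value is an ordinary str.
with open(path, "r", encoding="utf-8") as headers:
    message = Parser().parse(headers)

value = message["Key"]
print(value)
```

The value comes back as decoded text; writing it out again with Generator() runs into the same ASCII restriction as in the question.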

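s/an encoding/rfc 2047 encoding/ to avoid ambiguity with [character] encoding. Call '\N{SNOWMAN}'.decode('unicode-escape') on Python 2.x (the OP wants Python 2 compatibility).

@J.F.Sebastian: Well, it should be header.Header(u'\N{SNOWMAN}', 'utf8').encode() in Python 2.x.

I assume it is for "single-source compatibility of 2.x/3.x code", i.e. u'' fails on 3.0-3.2. For example, u = lambda s: s.decode('unicode-escape') if isinstance(s, bytes) else s, and then u('\N{SNOWMAN}') in the code.

I am asking for rfc822-/style/ headers. My file format allows utf8, and I am reusing the email module to handle it. The example is Python 3 only. The real code reads a file, but the error is the same.
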
from email.parser import Parser
from email.generator import BytesGenerator

# How do I get surrogateescape from a BytesIO/StringIO?
data = "Key: \N{SNOWMAN}\r\n\r\n"
# Write the sample header to headers.txt as UTF-8 bytes.
with open("headers.txt", "wb") as f:
    f.write(data.encode("utf-8"))

# Reading as ASCII with surrogateescape keeps the non-ASCII bytes intact.
with open("headers.txt", "r", encoding="ascii", errors="surrogateescape") as headers:
    message = Parser().parse(headers)
with open("/tmp/rfc882test", "wb") as out:
    BytesGenerator(out, maxheaderlen=0).flatten(message)
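
One possible answer to the question in the code comment above: email.parser.BytesParser, available since Python 3.2, accepts a binary file object such as a BytesIO and applies the ASCII/surrogateescape decoding itself. The sketch below stays in memory; the variable names and the sample bytes are for illustration only.

```python
from email.generator import BytesGenerator
from email.parser import BytesParser
from io import BytesIO

raw = "Key: \N{SNOWMAN}\r\n\r\n".encode("utf-8")

# BytesParser reads binary input and keeps non-ASCII bytes as surrogate
# escapes inside the parsed header values.
message = BytesParser().parse(BytesIO(raw))

# BytesGenerator turns the surrogate escapes back into the original bytes.
out = BytesIO()
BytesGenerator(out, maxheaderlen=0).flatten(message)
result = out.getvalue()
print(result)
```

The same effect can be had for a text-mode parser by wrapping the BytesIO in io.TextIOWrapper(..., encoding="ascii", errors="surrogateescape").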