Gzip in Python 3 vs. Gzip in Python 2

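Question: I have some older code that takes a Py2 'str' and compresses it with gzip. I want to get the same gzip output from the same string in Py3, but I can't get it to work.

Python 2 code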

from StringIO import StringIO
from gzip import GzipFile

# input_buffer is a str (a byte string in Python 2)
string_buffer = StringIO()
gzip_file = GzipFile(fileobj=string_buffer, mode='w', compresslevel=6)
gzip_file.write(input_buffer)
gzip_file.close()
out_buffer = string_buffer.getvalue()

Now I am trying to migrate the same code to Py3, and I expect to get exactly the same result.

Python 3 code

from io import BytesIO
from gzip import GzipFile

# input_buffer is the exact same string that I have on Py2
string_buffer = BytesIO()
gzip_file = GzipFile(fileobj=string_buffer, mode=u'w', compresslevel=6)
gzip_file.write(bytes(input_buffer, 'utf-8'))
gzip_file.close()
out_buffer = string_buffer.getvalue()

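I noticed that as soon as I convert the 'str' to bytes, extra characters are added. They are then compressed and show up in the final result, even after I decode it. Also, because some characters are larger than expected, decoding fails unless I pass the 'ignore' flag.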
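To make the "extra characters" concrete, here is a minimal Python 3 sketch. The sample string is an assumption chosen for illustration (a short piece of the real input plus one non-ASCII character); it only shows how the same str grows under UTF-8 but not under Latin-1, and why strict UTF-8 decoding of Latin-1 bytes fails.

```python
# A str with one non-ASCII character (U+00A9, the copyright sign).
text = "Default_Source\u00a9"

as_utf8 = text.encode("utf-8")      # U+00A9 becomes two bytes: 0xC2 0xA9
as_latin1 = text.encode("latin-1")  # U+00A9 stays a single byte: 0xA9

print(len(text), len(as_utf8), len(as_latin1))

# Bytes produced by Latin-1 are generally not valid UTF-8,
# so decoding them strictly as UTF-8 raises an error.
try:
    as_latin1.decode("utf-8")
    strict_decode_ok = True
except UnicodeDecodeError:
    strict_decode_ok = False

print(strict_decode_ok)
```

The extra byte per non-ASCII character is what ends up inside the compressed stream when the text is encoded as UTF-8 before being written.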

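Is there any solution to my problem?

To summarize: I have a str, and I want the gzip-compressed output to be exactly the same in Py2 and Py3. In practice, at least from what I have tried, it does not work.

Thanks

One issue I see is that even though the two inputs have the same value, they are represented differently. I want the result to look like the Python 2 one.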

Python3
input_buffer='+\n\x01I\x12Default_Source©$c1f33163-ff63-13e6-bd74-d90d67f22ac4Ñ\x06\x80\x9dº\x9fÌVÐ\x07\x02Ë\x08\n\x01)$'
out_buffer =b'\x1f\x8b\x08\x00\x00x\xb0X\x02\xff\xd3\xe6b\xf4\x14rIMK,\xcd)\x89\x0f\xce/-JN=\xb4R%\xd90\xcd\xd8\xd8\xd0\xccX7-\rH\x18\x1a\xa7\x9a\xe9&\xa5\x98\x9b\xe8\xa6X\x1a\xa4\x98\x99\xa7\x19\x19%&\x9b\x1c\x9e\xc8v\xa8\xe1\xd0\xdcC\xbb\x0e\xcd?\xdc\x13vx\x02;\xd3\xe1n\x0e.FM\x15\x00\x03&\xcf\x15S\x00\x00\x00'

Python2
input_buffer='+\n\x01I\x12Default_Source\xa9$c1f33163-ff63-13e6-bd74-d90d67f22ac4\xd1\x06\x80\x9d\xba\x9f\xccV\xd0\x07\x02\xcb\x08\n\x01)$'
out_buffer ='\x1f\x8b\x08\x00\xae|\xb0X\x02\xff\xd3\xe6b\xf4\x14rIMK,\xcd)\x89\x0f\xce/-JN]\xa9\x92l\x98fllhf\xac\x9b\x96\x06$\x0c\x8dS\xcdt\x93R\xccMtS,\rR\xcc\xcc\xd3\x8c\x8c\x12\x93M.\xb25\xcc\xdd5\xffL\xd8\x05v\xa6\xd3\x1c\\\x8c\x9a*\x00\xe9l\xf0\xeaJ\x00\x00\x00'
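As a quick check that the two input_buffer values above really are the same data, the following Python 3 sketch encodes the Py3 str as Latin-1 and compares it with the Py2 byte string written as a bytes literal. The variable names py3_text and py2_bytes are made up for this illustration.

```python
# The Python 3 str exactly as printed above (non-ASCII characters shown literally).
py3_text = '+\n\x01I\x12Default_Source©$c1f33163-ff63-13e6-bd74-d90d67f22ac4Ñ\x06\x80\x9dº\x9fÌVÐ\x07\x02Ë\x08\n\x01)$'

# The Python 2 str from above, written as a Python 3 bytes literal.
py2_bytes = b'+\n\x01I\x12Default_Source\xa9$c1f33163-ff63-13e6-bd74-d90d67f22ac4\xd1\x06\x80\x9d\xba\x9f\xccV\xd0\x07\x02\xcb\x08\n\x01)$'

# Latin-1 maps each code point U+0000..U+00FF to the single byte of the same value.
same_under_latin1 = py3_text.encode('latin-1') == py2_bytes
same_under_utf8 = py3_text.encode('utf-8') == py2_bytes

print(same_under_latin1, same_under_utf8)
```

Only the Latin-1 encoding reproduces the Python 2 bytes; the UTF-8 encoding is longer because each character above U+007F takes two bytes.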

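In Python 2, input_buffer is a byte string, and its character encoding is Latin-1. In Python 3, you have a Unicode string, which you are encoding as UTF-8. To get the same result, you have to encode it as latin1 in Python 3: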

from io import BytesIO
from gzip import GzipFile

input_buffer = '+\n\x01I\x12Default_Source©$c1f33163-ff63-13e6-bd74-d90d67f22ac4Ñ\x06\x80\x9dº\x9fÌVÐ\x07\x02Ë\x08\n\x01)$'
string_buffer = BytesIO()
with GzipFile(fileobj=string_buffer, mode='w', compresslevel=6) as gzip_file:
    # Latin-1 keeps one byte per character, matching the Python 2 str
    gzip_file.write(bytes(input_buffer, 'latin1'))
out_buffer = string_buffer.getvalue()
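One more thing to watch when comparing outputs byte for byte: the two out_buffer values in the question also differ in bytes 4 to 7 (\x00x\xb0X versus \xae|\xb0X). That is the modification-time field of the gzip header, which defaults to the current time. A possible sketch, assuming you can pass the same mtime on both sides (GzipFile accepts mtime in Python 2.7 and 3.x); gzip_latin1 is a hypothetical helper name, and the other header bytes and the deflate stream may still vary between Python or zlib versions:

```python
import gzip
from io import BytesIO


def gzip_latin1(text, mtime=0):
    """Hypothetical helper: Latin-1 encode text, then gzip it with a fixed timestamp."""
    buf = BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", compresslevel=6, mtime=mtime) as f:
        f.write(text.encode("latin-1"))
    return buf.getvalue()


sample = "+\n\x01I\x12Default_Source\xa9$"

first = gzip_latin1(sample)
second = gzip_latin1(sample)

# With a pinned mtime, repeated runs in the same interpreter give identical bytes.
print(first == second)

# Round trip: decompress, then decode with the same encoding used for writing.
restored = gzip.decompress(first).decode("latin-1")
print(restored == sample)
```

With the timestamp pinned, any remaining difference between the Py2 and Py3 outputs comes from the payload or the zlib build rather than from the clock.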

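Can you give an example of an input_buffer that produces the problem?

A gzip file is binary. Decoding the bytes in string_buffer as utf-8 makes no sense.

@onlynone I posted an example. Although they look different, this is how the same string is represented in Py3 and Py2. At least for out_buffer, I just want to see the result as it appears in Py2.

@onlynone I gave a wrong example, I will post a correct one.

In Python 2, the characters in input_buffer are Latin-1 encoded. In Python 3, you have a Unicode string that you are encoding as utf8. To get the same result, you have to encode it as latin1 in Python 3: gzip_file.write(bytes(input_buffer, 'latin1'))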