Python: download and decompress a gzip file in memory?


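I want to download a file using urllib and decompress it in memory before saving it.

This is what I have right now:

response = urllib2.urlopen(baseURL + filename)
compressedFile = StringIO.StringIO()
compressedFile.write(response.read())
decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')
outfile = open(outFilePath, 'w')
outfile.write(decompressedFile.read())

This ends up writing empty files. How can I achieve what I'm after?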

Updated answer:

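#!/usr/bin/env python2
import urllib2
import StringIO
import gzip

baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
# Check the filename: it may change over time as new releases appear.
filename = "man-pages-5.00.tar.gz"
outFilePath = filename[:-3]

response = urllib2.urlopen(baseURL + filename)
# Passing the data to the constructor leaves the position at the start.
compressedFile = StringIO.StringIO(response.read())
decompressedFile = gzip.GzipFile(fileobj=compressedFile)

# Binary mode keeps the decompressed bytes intact on every platform.
with open(outFilePath, 'wb') as outfile:
    outfile.write(decompressedFile.read())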

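After writing to compressedFile, but before passing it to gzip.GzipFile(), you need to seek back to the beginning of compressedFile. Otherwise the gzip module starts reading at the end of the buffer and sees what looks like an empty file. See below: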
#! /usr/bin/env python
import urllib2
import StringIO
import gzip

baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
filename = "man-pages-3.34.tar.gz"
outFilePath = "man-pages-3.34.tar"

response = urllib2.urlopen(baseURL + filename)
compressedFile = StringIO.StringIO()
compressedFile.write(response.read())
#
# Set the file's current position to the beginning
# of the file so that gzip.GzipFile can read
# its contents from the top.
#
compressedFile.seek(0)

decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')

# Binary mode keeps the decompressed bytes intact on every platform.
with open(outFilePath, 'wb') as outfile:
    outfile.write(decompressedFile.read())
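
The role of the stream position can be reproduced without any network access. The following is a minimal Python 3 sketch, not the code from the answer: it uses io.BytesIO in place of StringIO, and the payload and variable names are made up for illustration. It reads the same buffer twice, once with the position left at the end after write() and once after seek(0).

```python
import gzip
import io

# Hypothetical stand-in for the downloaded bytes.
payload = b"hello gzip\n" * 3

buf = io.BytesIO()
buf.write(gzip.compress(payload))  # the position is now at the end of the buffer

# Reading from the end: gzip finds no data and returns an empty result.
at_end = gzip.GzipFile(fileobj=buf, mode="rb").read()

# Rewinding first lets gzip read the stream from its header onward.
buf.seek(0)
from_start = gzip.GzipFile(fileobj=buf, mode="rb").read()

print(len(at_end), len(from_start))
```

Constructing the buffer directly from the data, as in io.BytesIO(data), avoids the problem because the position then starts at 0.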

For those using Python 3, the equivalent answer is:
import urllib.request
import io
import gzip

response = urllib.request.urlopen(FILE_URL)
compressed_file = io.BytesIO(response.read())
decompressed_file = gzip.GzipFile(fileobj=compressed_file)

with open(OUTFILE_PATH, 'wb') as outfile:
    outfile.write(decompressed_file.read())

If you are using Python 3.2 or above, life is much easier:
#!/usr/bin/env python3
import gzip
import urllib.request

baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
filename = "man-pages-4.03.tar.gz"
outFilePath = filename[:-3]

response = urllib.request.urlopen(baseURL + filename)
with open(outFilePath, 'wb') as outfile:
    outfile.write(gzip.decompress(response.read()))
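
Since gzip.decompress() works on plain bytes, the decompress-and-save step can be tried offline. This is only a sketch under stated assumptions: the helper name gunzip_bytes_to_path, the sample data, and the temporary output path are all hypothetical, and gzip.compress() stands in for the bytes a download would return.

```python
import gzip
import os
import tempfile


def gunzip_bytes_to_path(compressed, out_path):
    """Decompress gzip bytes in memory and write the result to out_path."""
    data = gzip.decompress(compressed)
    # Binary mode: the decompressed data is bytes, not text.
    with open(out_path, "wb") as outfile:
        outfile.write(data)
    return len(data)


# Hypothetical stand-in for response.read().
sample = gzip.compress(b"stand-in for a downloaded archive")
target = os.path.join(tempfile.mkdtemp(), "sample.out")
written = gunzip_bytes_to_path(sample, target)
```

Only the decompressed bytes reach the disk here; the compressed data stays in memory, which matches the goal stated in the question.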

For those interested in the history, see [link] and [link].

A one-liner that prints the decompressed file contents:
print gzip.GzipFile(fileobj=StringIO.StringIO(urllib2.urlopen(DOWNLOAD_LINK).read()), mode='rb').read()

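What's wrong with decompressing to disk?

I am decompressing to disk; I'm just never letting the compressed bytes touch the disk.

Does anything ever get put into compressedFile?

Yes, in the updated version.

Unrelated: you could use shutil.copyfileobj(decompressedFile, outfile) to save the file chunk by chunk without loading it all into memory.

Turns out I can make use of StringIO's __init__, see the updated question.

Yes. That works better. :) I will leave my answer unedited, since you've added the updated answer. Thanks.

@OregonTrail: or you could cut out the middle man and [link]. By the way, don't put answers in the question.

This won't work: you're trying to write bytes to a text file; use binary mode instead. Try: copyfileobj(GzipFile(fileobj=response), open(outfile_path, 'wb'))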
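
The chunk-by-chunk approach suggested in the comments can be sketched as follows in Python 3. The helper name stream_gunzip, the chunk size, and the in-memory test data are assumptions made for illustration; in a real download, the source would be the response object and the destination a file opened in 'wb' mode, as the last comment suggests.

```python
import gzip
import io
import shutil


def stream_gunzip(source, destination, chunk_size=64 * 1024):
    """Copy gzip-compressed data from source to destination, decompressing
    one chunk at a time instead of holding the whole result in memory."""
    with gzip.GzipFile(fileobj=source, mode="rb") as decompressed:
        shutil.copyfileobj(decompressed, destination, chunk_size)


# Hypothetical stand-ins for the HTTP response and the output file.
original = b"line of text\n" * 10000
source = io.BytesIO(gzip.compress(original))
destination = io.BytesIO()

stream_gunzip(source, destination)
result = destination.getvalue()
```

Closing the GzipFile does not close the underlying source object, so the caller stays responsible for closing the response and the output file.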