Python: opening a urllib2.urlopen() response on the fly with zipfile.ZipFile
It seems that zipfile.ZipFile requires random access, which the "file-like" object returned by urllib2 does not support.
I tried wrapping it in io.BufferedRandom, but got:
AttributeError: addinfourl instance has no attribute 'seekable'
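The failure can be reproduced without a network connection. The sketch below is an illustration only: `ForwardOnly` is a hypothetical stand-in for the urlopen() response (it offers read() and nothing else), and the archive is built in memory. It is written to run on Python 3 as well, which is an assumption beyond the Python 2.7 setting of the question. The exact exception type may differ between Python versions, so the code only records its name.

```python
import io
import zipfile


class ForwardOnly(object):
    """Hypothetical stand-in for a urlopen() response: read() only, no seek()."""

    def __init__(self, data):
        self._src = io.BytesIO(data)

    def read(self, n=-1):
        return self._src.read(n)


def make_zip_bytes():
    """Build a tiny archive in memory so the example needs no network."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w') as zf:
        zf.writestr('hello.txt', b'hello')
    return buf.getvalue()


def try_open(stream):
    """Return the name of the exception ZipFile raises, or None on success."""
    try:
        zipfile.ZipFile(stream, 'r').close()
    except Exception as exc:  # the exact type depends on the Python version
        return type(exc).__name__
    return None


# A stream without seek()/tell() is rejected ...
error_name = try_open(ForwardOnly(make_zip_bytes()))
# ... while the same bytes in a seekable buffer open fine.
seekable_ok = try_open(io.BytesIO(make_zip_bytes())) is None
```

ZipFile locates the archive's directory by seeking relative to the end of the file, which is why a read-only sequential stream is not enough.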
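In the absence of other answers, I went with the home-made solution below. It may not reduce the memory footprint when reading the zip file, but it may improve latency when the zip headers are read first.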
from io import BytesIO, SEEK_SET, SEEK_END


def _ceil_div(a, b):
    # Integer division, rounded up.
    return (a + b - 1) // b


def _align_up(a, b):
    # Round a up to the next multiple of b.
    return _ceil_div(a, b) * b


class BufferedRandomReader(object):
    """Create random-access, read-only buffered stream adapter from a sequential
    input stream which does not support random access (i.e., ``seek()``)

    Example::

        >>> stream = BufferedRandomReader(BytesIO('abc'))
        >>> print stream.read(2)
        ab
        >>> stream.seek(0)
        0L
        >>> print stream.read()
        abc

    """

    def __init__(self, fin, chunk_size=512):
        self._fin = fin
        self._buf = BytesIO()   # cache of everything read from fin so far
        self._eof = False       # True once fin has been exhausted
        self._chunk_size = chunk_size

    def tell(self):
        return self._buf.tell()

    def read(self, n=-1):
        """Read at most ``n`` bytes from the file (less if the ``read`` hits
        end-of-file before obtaining size bytes).

        If ``n`` argument is negative or omitted, read all data until end of
        file is reached. The bytes are returned as a string object. An empty
        string is returned when end of file is encountered immediately.
        """
        pos = self._buf.tell()
        end = self._buf.seek(0, SEEK_END)   # leaves the cache positioned for appending
        if n < 0:
            if not self._eof:
                self._buf.write(self._fin.read())
                self._eof = True
        else:
            req = pos + n - end
            if req > 0 and not self._eof:  # need to grow
                bcount = _align_up(req, self._chunk_size)
                data = self._fin.read(bcount)
                self._buf.write(data)
                self._eof = len(data) < bcount
        self._buf.seek(pos)
        return self._buf.read(n)

    def seek(self, offset, whence=SEEK_SET):
        if whence == SEEK_END:
            # The end is only known after draining the source.
            if not self._eof:
                self._buf.seek(0, SEEK_END)
                self._buf.write(self._fin.read())
                self._eof = True
            return self._buf.seek(offset, SEEK_END)
        return self._buf.seek(offset, whence)

    def close(self):
        self._fin.close()
        self._buf.close()
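To check the idea without a network connection, here is a compact, self-contained restatement of the same caching approach. It is only a sketch under stated assumptions: `ForwardOnly` and `LazySeekableReader` are hypothetical names, not the class above, and the code targets Python 3 as well (the original is Python 2.7). Recent Python 3 versions of zipfile appear to call seekable() on the underlying file when a member is opened, so the sketch defines that method too. The `bytes_served` counter exists purely to observe how much of the source has been consumed.

```python
import io
import zipfile


class ForwardOnly(object):
    """Hypothetical stand-in for a urlopen() response: sequential read() only."""

    def __init__(self, data):
        self._src = io.BytesIO(data)
        self.bytes_served = 0  # how much the consumer has pulled so far

    def read(self, n=-1):
        chunk = self._src.read(n)
        self.bytes_served += len(chunk)
        return chunk


class LazySeekableReader(object):
    """Cache whatever has been read so that earlier positions can be revisited."""

    def __init__(self, fin):
        self._fin = fin
        self._buf = io.BytesIO()
        self._eof = False

    def _fill(self, upto=None):
        # Pull from the source until the cache holds `upto` bytes (None = all).
        if self._eof:
            return
        pos = self._buf.tell()
        have = self._buf.seek(0, io.SEEK_END)
        while upto is None or have < upto:
            data = self._fin.read(-1 if upto is None else upto - have)
            if not data:
                self._eof = True
                break
            self._buf.write(data)
            have += len(data)
        self._buf.seek(pos)

    def seekable(self):
        return True

    def tell(self):
        return self._buf.tell()

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_END:
            self._fill()  # the end is unknown until the source is drained
        return self._buf.seek(offset, whence)

    def read(self, n=-1):
        if n is None or n < 0:
            self._fill()
        else:
            self._fill(self._buf.tell() + n)
        return self._buf.read(n)


# Build a small archive in memory, then read it back through the adapter.
raw = io.BytesIO()
with zipfile.ZipFile(raw, 'w') as out:
    out.writestr('hello.txt', b'hello')
data = raw.getvalue()

src = ForwardOnly(data)
zf = zipfile.ZipFile(LazySeekableReader(src), 'r')
served_after_open = src.bytes_served
names = zf.namelist()
payload = zf.read('hello.txt')
zf.close()
```

In this sketch, merely opening the archive already drains the source, because ZipFile seeks relative to the end of the file to find the archive directory. That matches the caveat above that the memory footprint may not shrink.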
Could you show more of the code for how you use urlopen and pass it to ZipFile or to the buffer?
resp = urllib2.urlopen(url); ios = io.BufferedRandom(resp); zf = zipfile.ZipFile(ios). Note that I am trying to avoid StringIO(resp.read()) (which works fine).
import urllib2
import zipfile

req = urllib2.urlopen('http://test/file.zip')
zf = zipfile.ZipFile(BufferedRandomReader(req), 'r')
...
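For comparison, the plain approach mentioned in the comments (read the whole response into memory, then open it) can be sketched as follows. `FakeResponse` and `open_zip_from_response` are hypothetical names used only for illustration; in real use the argument would be the object returned by urllib2.urlopen(). The sketch uses io.BytesIO rather than StringIO so that it also runs on Python 3, which is an assumption beyond the original Python 2.7 setting.

```python
import io
import zipfile


class FakeResponse(object):
    """Hypothetical stand-in for a urlopen() response, so no network is needed."""

    def __init__(self, data):
        self._src = io.BytesIO(data)

    def read(self, n=-1):
        return self._src.read(n)


def open_zip_from_response(resp):
    """Read the whole body into memory and hand ZipFile a seekable buffer."""
    return zipfile.ZipFile(io.BytesIO(resp.read()), 'r')


# Build a small archive in memory to play the role of the downloaded file.
raw = io.BytesIO()
with zipfile.ZipFile(raw, 'w') as out:
    out.writestr('a.txt', b'alpha')
    out.writestr('b.txt', b'beta')

zf = open_zip_from_response(FakeResponse(raw.getvalue()))
names = sorted(zf.namelist())
first = zf.read('a.txt')
zf.close()
```

This is the simplest option when the archive fits comfortably in memory; the adapter only pays off if avoiding an up-front full read matters.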