How to download a single file with multiple threads using the Python requests library
I have tried this code and it throws some errors. I changed urllib2 to the requests library because I cannot install the urllib2 module. I ran the code in PyCharm and got the error below. I need to download a file with multiple threads using the requests library. With multithreading, the file can be downloaded in chunks from different threads at the same time.
Error:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:/Users/suresh_ram/PycharmProjects/DownloadManager/multithreaded_downloader.py", line 37, in downloadChunk
    dataDict[idx] = open(req,"wb").write(req.content)
TypeError: expected str, bytes or os.PathLike object, not Response
Exception in thread Thread-3:
Traceback (most recent call last):
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:/Users/suresh_ram/PycharmProjects/DownloadManager/multithreaded_downloader.py", line 37, in downloadChunk
    dataDict[idx] = open(req,"wb").write(req.content)
TypeError: expected str, bytes or os.PathLike object, not Response
Exception in thread Thread-2:
Traceback (most recent call last):
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 870, in run
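
`open(req, "wb").write(req.content)` is what causes the error. Try splitting it over several lines so that you can 1) test how each line executes and 2) make sure the right content is assigned to `dataDict[idx]`. You should also test `downloadChunk` on its own, without running it in a thread, to make sure it does what you want. It looks like it relies on side effects.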
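
The advice above could be sketched roughly as follows. The failing one-liner is split into separate steps, and the chunk function receives its HTTP call as a parameter so it can be run directly, with no threads and no network. The names `download_chunk`, `fetch`, `FakeResponse`, and `fake_fetch` are hypothetical and chosen only for illustration; they are not part of the question's code.

```python
def download_chunk(fetch, url, byte_range, results, idx):
    """Fetch one byte range and store the raw bytes under results[idx].

    `fetch` is any callable taking (url, headers) and returning an object
    with a `.content` attribute (an assumption made for testability).
    """
    # Step 1: the Range header belongs on the request, not on the response.
    headers = {"Range": "bytes={}".format(byte_range)}
    # Step 2: perform the request.
    response = fetch(url, headers)
    # Step 3: take the body as bytes. open() needs a file path, so passing
    # the Response object to it is what raised the TypeError.
    data = response.content
    # Step 4: store the bytes themselves, not the return value of write().
    results[idx] = data
    return data


class FakeResponse:
    """Stand-in for requests.Response, used only for offline testing."""

    def __init__(self, content):
        self.content = content


def fake_fetch(url, headers):
    """Serve byte ranges from an in-memory payload instead of the network."""
    payload = b"0123456789"
    start, end = headers["Range"].split("=")[1].split("-")
    return FakeResponse(payload[int(start):int(end) + 1])


# Call the function directly, with no threads involved.
results = {}
download_chunk(fake_fetch, "http://example.invalid/file", "0-3", results, 0)
download_chunk(fake_fetch, "http://example.invalid/file", "4-9", results, 1)
print(results)  # {0: b'0123', 1: b'456789'}
```

Against a real server, `fetch` could be a thin wrapper such as `lambda url, headers: requests.get(url, headers=headers)`.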
Code from the question:

```python
import os
import threading
import time

import requests

URL = "http://www.nasa.gov/images/content/607800main_kepler1200_1600-1200.jpg"


def buildRange(value, numsplits):
    # Build a list of "start-end" byte-range strings covering `value` bytes.
    lst = []
    for i in range(numsplits):
        if i == 0:
            lst.append('%s-%s' % (i, int(round(1 + i * value/(numsplits*1.0) + value/(numsplits*1.0)-1, 0))))
        else:
            lst.append('%s-%s' % (int(round(1 + i * value/(numsplits*1.0),0)), int(round(1 + i * value/(numsplits*1.0) + value/(numsplits*1.0)-1, 0))))
    return lst


def main(url=None, splitBy=3):
    start_time = time.time()
    if not url:
        print("Please Enter some url to begin download.")
        return

    fileName = url.split('/')[-1]
    sizeInBytes = requests.head(url, headers={'Accept-Encoding': 'identity'}).headers.get('content-length', None)
    print("%s bytes to download." % sizeInBytes)
    if not sizeInBytes:
        print("Size cannot be determined.")
        return

    dataDict = {}

    # split total num bytes into ranges
    ranges = buildRange(int(sizeInBytes), splitBy)

    def downloadChunk(idx, irange):
        req = requests.get(url)
        # Sets the header on the response, after the request was already sent.
        req.headers['Range'] = 'bytes={}'.format(irange)
        # The traceback points here: open() expects a file path, but req is a Response.
        dataDict[idx] = open(req,"wb").write(req.content)

    # create one downloading thread per chunk
    downloaders = [
        threading.Thread(
            target=downloadChunk,
            args=(idx, irange),
        )
        for idx, irange in enumerate(ranges)
    ]

    # start threads, let run in parallel, wait for all to finish
    for th in downloaders:
        th.start()
    for th in downloaders:
        th.join()

    print('done: got {} chunks, total {} bytes'.format(
        len(dataDict), sum(len(chunk) for chunk in dataDict.values())
    ))

    print("--- %s seconds ---" % str(time.time() - start_time))

    if os.path.exists(fileName):
        os.remove(fileName)
    # reassemble file in correct order
    # ('w' is text mode and iteritems() is Python 2 only; both fail on Python 3)
    with open(fileName, 'w') as fh:
        for _idx, chunk in sorted(dataDict.iteritems()):
            fh.write(chunk)

    print("Finished Writing file %s" % fileName)
    print('file size {} bytes'.format(os.path.getsize(fileName)))


if __name__ == '__main__':
    main("https://bugs.python.org/file47781/Tutorial_EDIT.pdf")
```
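
One possible corrected version is sketched below; it is not a definitive fix. It sends the `Range` header with the request, keeps each chunk as bytes, and writes the file in binary mode using `dict.items()`. The names `build_ranges`, `fetch_range`, `download`, and the `fetch` parameter are hypothetical and do not come from the question. The sketch assumes the server honours range requests (answers with status 206) and that `total_size` was read from `Content-Length`, as the question's `requests.head` call does.

```python
import threading


def build_ranges(total_size, num_splits):
    """Split total_size bytes into contiguous, inclusive (start, end) pairs."""
    if num_splits < 1 or total_size < num_splits:
        raise ValueError("need 1 <= num_splits <= total_size")
    chunk = total_size // num_splits
    ranges = []
    for i in range(num_splits):
        start = i * chunk
        # The last range absorbs any remainder.
        end = total_size - 1 if i == num_splits - 1 else start + chunk - 1
        ranges.append((start, end))
    return ranges


def fetch_range(url, start, end):
    """Request bytes start..end (inclusive) and return them as bytes."""
    import requests  # imported lazily so the other helpers load without it

    resp = requests.get(url, headers={"Range": "bytes={}-{}".format(start, end)})
    resp.raise_for_status()
    # 206 means the Range header was honoured; a 200 would be the whole file.
    if resp.status_code != 206:
        raise RuntimeError("server ignored the Range header")
    return resp.content


def download(url, file_name, total_size, num_splits=3, fetch=fetch_range):
    """Download url in num_splits parallel chunks and write them in order."""
    chunks = {}

    def worker(idx, start, end):
        chunks[idx] = fetch(url, start, end)

    threads = [
        threading.Thread(target=worker, args=(idx, start, end))
        for idx, (start, end) in enumerate(build_ranges(total_size, num_splits))
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()

    # An exception inside a thread is not re-raised here, so check explicitly.
    if len(chunks) != num_splits:
        raise RuntimeError("one or more chunks failed to download")

    # Binary mode, because the chunks are bytes.
    with open(file_name, "wb") as fh:
        for _idx, chunk in sorted(chunks.items()):
            fh.write(chunk)
    return sum(len(c) for c in chunks.values())
```

A call might look like `download(url, url.split('/')[-1], int(sizeInBytes))`, with `sizeInBytes` obtained as in the question. Passing a different `fetch` callable makes it possible to exercise the threading and reassembly logic without touching the network.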