How to download a single file with multiple threads using the Python requests library

python python-3.x multithreading parallel-processing

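I tried this code and it throws some errors. I changed urllib2 to the requests library because I cannot install the urllib2 module. I ran the code in PyCharm and got the error below. I need to download a single file with multiple threads using the requests library.

With multiple threads, a file can be downloaded in chunks from different threads at the same time.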
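The chunking idea can be sketched as a small helper that turns a total size into inclusive byte ranges suitable for an HTTP Range header. This is only an illustration: the name split_ranges is hypothetical and does not appear in the original code, and it uses integer arithmetic instead of rounding.

```python
def split_ranges(total_size, parts):
    """Split total_size bytes into contiguous, inclusive (start, end) pairs."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    parts = min(parts, total_size)  # never create empty ranges
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # Spread the remainder over the first `extra` ranges.
        length = base + (1 if i < extra else 0)
        end = start + length - 1  # HTTP byte ranges are inclusive
        ranges.append((start, end))
        start = end + 1
    return ranges


print(split_ranges(10, 3))  # [(0, 3), (4, 6), (7, 9)]
```

Each pair can then be formatted as 'bytes={}-{}'.format(start, end) for one thread's request.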

Error:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:/Users/suresh_ram/PycharmProjects/DownloadManager/multithreaded_downloader.py", line 37, in downloadChunk
    dataDict[idx] = open(req,"wb").write(req.content)
TypeError: expected str, bytes or os.PathLike object, not Response
Exception in thread Thread-3:
Traceback (most recent call last):
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:/Users/suresh_ram/PycharmProjects/DownloadManager/multithreaded_downloader.py", line 37, in downloadChunk
    dataDict[idx] = open(req,"wb").write(req.content)
TypeError: expected str, bytes or os.PathLike object, not Response
Exception in thread Thread-2:
Traceback (most recent call last):
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\suresh_ram\AppData\Local\Programs\Python\Python38\lib\threading.py", line 870, in run

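open(req, "wb").write(req.content) causes the error: open() expects a file path as its first argument, but req is a Response object, which is exactly what the TypeError says. Try splitting that statement over several lines so that you can 1) test what each line does and 2) make sure the right thing is assigned to dataDict[idx]. You should also test downloadChunk on its own, without running it in a thread, to make sure it does what you want. It looks like it relies on side effects.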
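Following that advice, the chunk download could be written as a plain function that does one thing per line and returns the bytes instead of relying on side effects. This is a sketch under assumptions: download_chunk and its session parameter are hypothetical names, and session is assumed to be a requests.Session (or any object with a compatible get method), which makes the function easy to try without threads.

```python
def download_chunk(session, url, byte_range):
    """Fetch one byte range of url and return its content as bytes.

    byte_range is a string such as "0-1023".
    """
    # The Range header has to be sent with the request,
    # not set on the response afterwards.
    headers = {"Range": "bytes={}".format(byte_range)}
    response = session.get(url, headers=headers)
    response.raise_for_status()
    # 206 Partial Content means the server honoured the range;
    # a plain 200 would mean it sent the whole file instead.
    if response.status_code != 206:
        raise RuntimeError("server ignored the Range header")
    data = response.content
    return data
```

Called directly, for example as download_chunk(requests.Session(), url, "0-1023"), it should return at most 1024 bytes if the server supports range requests.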

Code:

import os
import requests
import threading
import time

URL = "http://www.nasa.gov/images/content/607800main_kepler1200_1600-1200.jpg"

def buildRange(value, numsplits):
    lst = []
    for i in range(numsplits):
        if i == 0:
            lst.append('%s-%s' % (i, int(round(1 + i * value/(numsplits*1.0) + value/(numsplits*1.0)-1, 0))))
        else:
            lst.append('%s-%s' % (int(round(1 + i * value/(numsplits*1.0),0)), int(round(1 + i * value/(numsplits*1.0) + value/(numsplits*1.0)-1, 0))))
    return lst

def main(url=None, splitBy=3):
    start_time = time.time()
    if not url:
        print("Please Enter some url to begin download.")
        return

    fileName = url.split('/')[-1]
    sizeInBytes = requests.head(url, headers={'Accept-Encoding': 'identity'}).headers.get('content-length', None)
    print("%s bytes to download." % sizeInBytes)
    if not sizeInBytes:
        print("Size cannot be determined.")
        return

    dataDict = {}

    # split total num bytes into ranges
    ranges = buildRange(int(sizeInBytes), splitBy)

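    def downloadChunk(idx, irange):
        req = requests.get(url)
        # No effect: this sets a header on the response after the whole file has been fetched.
        # A Range header has to be passed to requests.get(url, headers=...) instead.
        req.headers['Range'] = 'bytes={}'.format(irange)
        # This is the line from the traceback: open() expects a file path, not a Response,
        # and write() would return a byte count rather than the chunk data.
        dataDict[idx] = open(req,"wb").write(req.content)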

    # create one downloading thread per chunk
    downloaders = [
        threading.Thread(
            target=downloadChunk,
            args=(idx, irange),
        )
        for idx,irange in enumerate(ranges)
        ]

    # start threads, let run in parallel, wait for all to finish
    for th in downloaders:
        th.start()
    for th in downloaders:
        th.join()

    print ('done: got {} chunks, total {} bytes'.format(
        len(dataDict), sum( (
            len(chunk) for chunk in dataDict.values()
        ) )
    ))

    print( "--- %s seconds ---" % str(time.time() - start_time))

    if os.path.exists(fileName):
        os.remove(fileName)
    # reassemble file in correct order
    with open(fileName, 'wb') as fh:  # chunks are bytes, so open in binary mode
        for _idx,chunk in sorted(dataDict.items()):  # iteritems() does not exist in Python 3
            fh.write(chunk)

    print("Finished Writing file %s" % fileName)
    print('file size {} bytes'.format(os.path.getsize(fileName)))

if __name__ == '__main__':
    main("https://bugs.python.org/file47781/Tutorial_EDIT.pdf")
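
Putting the pieces together, one possible corrected version of the approach is sketched below. It is not the original author's code: download_parallel and fetch_range_with_requests are hypothetical names, the 30-second timeout is an arbitrary choice, and the server is assumed to support HTTP range requests. The fetch function is passed in as a parameter so the threading logic can be tried without a network, and requests is imported inside the fetch function for the same reason.

```python
import threading


def fetch_range_with_requests(url, start, end):
    """Fetch bytes start..end (inclusive) of url using the requests library."""
    import requests  # third-party; imported here so the rest runs without it

    headers = {
        "Range": "bytes={}-{}".format(start, end),
        "Accept-Encoding": "identity",
    }
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()
    if response.status_code != 206:  # 200 would mean the whole file was sent
        raise RuntimeError("server did not honour the Range header")
    return response.content


def download_parallel(url, total_size, parts, fetch_range=fetch_range_with_requests):
    """Download total_size bytes of url in up to `parts` threads, in order."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    chunk = -(-total_size // parts)  # ceiling division
    ranges = [(s, min(s + chunk, total_size) - 1)
              for s in range(0, total_size, chunk)]
    results = [None] * len(ranges)
    errors = []

    def worker(idx, start, end):
        try:
            # Each thread writes only to its own slot.
            results[idx] = fetch_range(url, start, end)
        except Exception as exc:  # collected and re-raised in the main thread
            errors.append(exc)

    threads = [threading.Thread(target=worker, args=(i, s, e))
               for i, (s, e) in enumerate(ranges)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    if errors:
        raise errors[0]
    return b"".join(results)
```

The total size would come from the Content-Length header of a HEAD request, as in main() above, and the returned bytes would be written with open(fileName, 'wb').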