How to request multiple URLs at once with urllib in Python

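I am writing a program that downloads images from the internet, and I would like to speed it up by sending several requests at the same time.

So I wrote some code, which you can see. At the moment I can only request a web page like this: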

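from urllib.request import Request, urlopen

def myrequest(url):
    worked = False
    # Send a browser-like User-Agent header with the request.
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    # Keep retrying until the page has been read successfully.
    while not worked:
        try:
            webpage_read = urlopen(req).read()
            worked = True
        except Exception:  # unlike a bare except, Ctrl-C still stops the loop
            print("failed to connect to \n{}".format(url))
    return webpage_read

url = "http://www.mangahere.co/manga/mysterious_girlfriend_x"
webpage_read = myrequest(url).decode("utf-8")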

The while loop is there because I definitely want to download every single picture, so I keep trying until it works (nothing can go wrong except urllib.error.HTTPError: HTTP Error 504: Gateway Timeout).
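
The retry loop in myrequest never gives up, so a URL that is permanently broken would keep it spinning forever. Here is a minimal sketch of a bounded variant. The helper name fetch_with_retry and the parameters max_attempts, delay and opener are assumptions made for illustration and are not part of the original program; opener exists only so the function can be tried without a network connection.

```python
import time
from urllib.error import URLError
from urllib.request import Request, urlopen


def fetch_with_retry(url, max_attempts=5, delay=1.0, opener=urlopen):
    """Return the body of url as bytes, retrying on network errors.

    fetch_with_retry, max_attempts, delay and opener are hypothetical names
    chosen for this sketch; they do not come from the original program.
    """
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(req) as response:
                return response.read()
        except URLError as error:
            # HTTPError (for example 504 Gateway Timeout) is a subclass
            # of URLError, so it is retried here as well.
            last_error = error
            print("attempt {} failed for {}: {}".format(attempt, url, error))
            if attempt < max_attempts:
                time.sleep(delay * attempt)  # wait a little longer each time
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

Raising after the last attempt lets the caller decide whether to skip the picture or to queue it for a later run, instead of blocking the whole download.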

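My question is: how can I run this several times at once?

My idea was to have a "commander" that runs five (or eighty-five) Python scripts, gives each of them a URL, and collects the web page from each one when it finishes, but that is definitely a silly solution :)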

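EDIT: I tried threads, but they don't seem to speed up the program. That should have been the solution. Am I doing something wrong? That is my new question. You can get to it through the link.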

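requests-futures is built on top of the very popular requests library and makes HTTP calls non-blocking by running them on background threads: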

from requests_futures.sessions import FuturesSession

session = FuturesSession()

# These requests will run at the same time
future_one = session.get('http://httpbin.org/get')
future_two = session.get('http://httpbin.org/get?foo=bar')

# Get the first result
response_one = future_one.result()
print(response_one.status_code)
print(response_one.text)

# Get the second result
response_two = future_two.result()
print(response_two.status_code)
print(response_two.text)

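With "for url in (...):"? Will it send ten requests at a given time, so that it doesn't have to wait for the first request to finish?

Oh, you mean you want non-blocking calls? Take a look at, for example, a different HTTP library instead of urllib.

Use threads, or, if you really want to build a scalable solution, take a look at the gevent library. It is based on coroutines, but it hides them behind a thread-like API, which makes it very easy to issue web requests in a scalable way.

I don't have time right now, so I'll check this in about two weeks (if I looked into it now I would be at the computer until 2 a.m.) :), but the overlapping is more or less solved; the question now is why it isn't any faster. The problem seems to be that I am requesting many pictures from a single server, so I am hitting the maximum number of connections.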
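
The suggestion to use threads can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the method from the answer: it swaps requests-futures for concurrent.futures.ThreadPoolExecutor and keeps urllib. The names fetch, fetch_all and fetch_one, and the max_workers value, are hypothetical. Every request is submitted before any result is awaited, so the loop does not block on the first response, and max_workers caps how many connections are open at once, which matters when all URLs point at the same server.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def fetch(url):
    """Download one URL and return its body as bytes."""
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    with urlopen(req) as response:
        return response.read()


def fetch_all(urls, fetch_one=fetch, max_workers=5):
    """Fetch every URL on a pool of worker threads.

    Returns a dict that maps each URL to its body. fetch_one is a
    hypothetical hook so that a different downloader can be plugged in.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first; each call returns a Future immediately.
        futures = {url: pool.submit(fetch_one, url) for url in urls}
        # result() blocks only until that particular download is done.
        return {url: future.result() for url, future in futures.items()}
```

A call such as fetch_all(list_of_urls, max_workers=5) would then return all bodies at once. If one download raises, its result() call re-raises the exception, so a retrying downloader can be passed as fetch_one when failures are expected.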

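import _thread
import time

# urls_eps, path, downloaded, finished, goal and thrue_thread_download_pics
# are defined elsewhere in the program.
for url_ep in urls_eps:
    # Each entry holds a URL and a count, separated by whitespace.
    url, maxep = url_ep.split()
    maxep = int(maxep)
    # Take the last path segment of the URL without its first two characters.
    chap = url.split("/")[-1][2:]
    if "." in chap:
        chap = chap.replace(".", "")
    else:
        chap = "{}0".format(chap)

    for ep in range(1, maxep + 1):
        ted = time.time()
        # Name: chap followed by ep, zero-padded to at least two digits.
        name = "{}{:02d}".format(chap, ep)
        if name in downloaded:
            continue

        # Start one thread for each picture that is still missing.
        _thread.start_new_thread(thrue_thread_download_pics, (path, url, ep, name))

# Poll the shared counter and report progress whenever it changes.
checker = -1
while finished != goal:
    if finished != checker:
        checker = finished
        print("{} of {} downloaded".format(finished, goal))
    time.sleep(0.1)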
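
The loop above starts one thread per picture with _thread and then polls a shared counter every 0.1 seconds. Below is a minimal alternative sketch, assuming a hypothetical download_all helper whose download argument stands in for thrue_thread_download_pics. It uses threading.Thread objects, guards the counter with a Lock, caps the number of simultaneous downloads with a semaphore, and waits with join instead of polling. None of these names come from the original program.

```python
import threading


def download_all(jobs, download, max_parallel=5):
    """Run download(*job) for every job on its own thread and wait for all.

    download_all, jobs, download and max_parallel are hypothetical names for
    this sketch. Returns the number of jobs that finished without raising.
    """
    finished = 0
    lock = threading.Lock()
    # Cap how many downloads may run at the same moment.
    slots = threading.BoundedSemaphore(max_parallel)

    def worker(job):
        nonlocal finished
        with slots:
            try:
                download(*job)
            except Exception as error:
                print("job {} failed: {}".format(job, error))
                return
        with lock:  # += on a shared int is not atomic across threads
            finished += 1
            print("{} of {} downloaded".format(finished, len(jobs)))

    threads = [threading.Thread(target=worker, args=(job,)) for job in jobs]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # blocks until that thread is done; no sleep loop needed
    return finished
```

Here jobs would be a list of (path, url, ep, name) tuples built by the nested loops. Lowering max_parallel is one way to stay under a server's connection limit, although whether that is the actual bottleneck here is not confirmed by the thread.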
