Python: using a queue with threads


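I was told that threads can easily be combined with queues, but I am running into some problems. This code is supposed to create a program that fetches each website URL, one after another, and prints the first 512 bytes of the page.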

from queue import Queue
from threading import Thread
import urllib.request

hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com","http://ibm.com", "http://apple.com"]

queue = Queue()

class ThreadUrl(Thread):
    def __init__(self, queue):
        Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            host = self.queue.get()
            url = urllib.request.urlopen(host)
            print(url.read(512))
            self.queue.task_done()

def main():
    for i in range(5):
        t = ThreadUrl(queue)
        t.daemon = True  # same effect as setDaemon(True), which is deprecated since Python 3.10
        t.start()

    for host in hosts:
        queue.put(host)

    queue.join()

main()

I get this problem in the last thread:

b'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="sr"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title><script>(function(){window.google={kEI:\'hD3FWZiRJ8G2a8GfqdAF\',kEXPI:\'18168,1352613,1352960,1353383,1353747,1354276,1354401,1354625,1354749,1354875,1355174,1355205,1355217,3700315,3700476,4017608,4029815,4031109,4043492,4045841,4048347,4061945,'
b'\n<!DOCTYPE html>\n<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US" prefix="og: http://ogp.me/ns#" class="no-js">\n\n<head>\n\t\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n<meta charset="utf-8" />\n<link rel="canonical" href="https://www.apple.com/" />\n\n\n\t\n\t<link rel="alternate" href="https://www.apple.com/" hreflang="en-US" /><link rel="alternate" href="https://www.apple.com/ae-ar/" hreflang="ar-AE" /><link rel="alternate" href="https://www.apple.com/ae/" hreflang="en-AE" /><link rel="alternate" href="https://'
b'<!DOCTYPE html>\n<html id="atomic" lang="en-US" class="atomic my3columns  l-out Pos-r https fp fp-v2 rc1 fp-default mini-uh-on viewer-right two-col ntk-wide ltr desktop Desktop bkt201">\n<head>\n    \n    <title>Yahoo</title><meta http-equiv="x-dns-prefetch-control" content="on"><link rel="dns-prefetch" href="//s.yimg.com"><link rel="preconnect" href="//s.yimg.com"><link rel="dns-prefetch" href="//search.yahoo.com"><link rel="preconnect" href="//search.yahoo.com"><link rel="dns-prefetch" href="//y.analytics.yah'
b'<!DOCTYPE html>\n<html lang="en-US">\n<head>\n\t<meta charset="UTF-8">\n\t<meta name="viewport" content="width=device-width, initial-scale=1">\n\t<title>IBM - United States</title>\n\t<link rel="canonical" href="https://www.ibm.com/us-en/"/>\n\t<meta name="robots" content="index,follow">\n\t<meta name="description" content="For more than a century IBM has been dedicated to every client&#x27;s success and to creating innovations that matter for the world">\n\t<meta name="keywords" content="IBM">\n\t<meta name="dcterms.date" c'
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/home/milenko/anaconda3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "f1.py", line 17, in run
    url=urllib.request.urlopen(host)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 564, in error
    result = self._call_chain(*args)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 756, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/home/milenko/anaconda3/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Unavailable

As the error says, you are getting an HTTP error; it has nothing to do with threading. The URL you are calling is returning an error response:
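503 Service Unavailable
The server is currently unable to handle the request due to a temporary overload or scheduled maintenance, which will likely be alleviated after some delay.
The server may send a Retry-After header field to suggest an appropriate amount of time for the client to wait before retrying the request.
Note: the existence of the 503 status code does not imply that a server has to use it when becoming overloaded. Some servers might simply refuse the connection.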
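Most likely, you are hitting the URL too quickly and exceeding their limits. You can confirm this by inspecting the response to see whether it carries a Retry-After header. The message body may also explain what the limits are.
The solution is to slow down the rate of your requests to the service. Read their documentation to find out what their limits are, then update your code to stay within them.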
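
One way to check for a Retry-After header and wait before retrying is sketched below. This is only an illustration under assumptions: the names retry_after_seconds and fetch_head, the default delay of 5 seconds and the limit of 3 attempts are all made up for the example and are not part of the original code. The opener and sleep parameters exist only so the function can be exercised without touching the network.

```python
import time
import urllib.error
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(headers, default=5.0):
    """Return the delay suggested by a Retry-After header, or `default`."""
    value = headers.get("Retry-After") if headers is not None else None
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        # First allowed form: a delay in seconds.
        return float(value)
    try:
        # Second allowed form: an HTTP date.
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def fetch_head(url, nbytes=512, attempts=3,
               opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch the first `nbytes` of `url`, backing off when the server says 503."""
    for attempt in range(attempts):
        try:
            with opener(url) as response:
                return response.read(nbytes)
        except urllib.error.HTTPError as err:
            # Only 503 is retried here; give up after the last attempt.
            if err.code != 503 or attempt == attempts - 1:
                raise
            sleep(retry_after_seconds(err.headers))
```

In the question's run() method, the call to urllib.request.urlopen(host) could be replaced by something like fetch_head(host). Whether a given site actually sends Retry-After is up to that site, so the default delay is a guess rather than a documented limit.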
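Because the server returned code 503, which is raised as an exception that you do not handle inside the thread. This problem has nothing to do with threading.
Does that mean I should use try/except?
OK, thanks, I will try changing the code.
If I remove the last host, I get the exception in thread Thread-1 instead.
You may still be making requests to each host too quickly, and they are throttling you. Make sure your requests go out slowly enough that they do not affect their service.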
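
The try/except suggestion from the comments could look roughly like the sketch below. It is a minimal illustration, not the original poster's final code: the names fetch_first_bytes, worker and run, the results dictionary, the 10-second timeout and the delay parameter are assumptions introduced for the example. The finally clause matters because, in the original code, an exception ends the thread before task_done() is called, so queue.join() would then wait forever for that item.

```python
import time
import urllib.request
from queue import Queue
from threading import Thread


def fetch_first_bytes(url, nbytes=512):
    """Return the first `nbytes` of the page at `url`."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read(nbytes)


def worker(queue, results, fetch, delay):
    while True:
        host = queue.get()
        try:
            results[host] = fetch(host)
        except OSError as err:
            # HTTPError and URLError are OSError subclasses, so a 503
            # lands here instead of killing the thread.
            results[host] = err
        finally:
            # Always mark the item done, otherwise queue.join() never returns.
            queue.task_done()
        time.sleep(delay)  # crude pause between requests made by this thread


def run(hosts, n_threads=5, fetch=fetch_first_bytes, delay=0.0):
    """Fetch every host with a pool of daemon threads; return host -> bytes or error."""
    queue = Queue()
    results = {}
    for _ in range(n_threads):
        Thread(target=worker, args=(queue, results, fetch, delay),
               daemon=True).start()
    for host in hosts:
        queue.put(host)
    queue.join()
    return results
```

Calling run(hosts, delay=1.0) with the hosts list from the question would return a dictionary in which a failed host maps to the exception object rather than to page bytes, so the caller can decide whether to retry it later.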