Python: processing multiple items in a queue with workers


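I'm trying to build a simple multiprocessing application using a queue.

I start 4 processes to process data from several websites. I want each process to handle a different website, but for some reason every process handles each item itself, and the processes never exit.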

from multiprocessing import Process
import Queue
import requests

def readdata(item):
    print item
    r = requests.get(item)
    print 'read data'
    print r.status_code


def worker(queue):
   while True:
       try:
           print 'start process'
           item = queue.get()
           readdata(item)
           q.task_done()
       except:
           print "the end"
           break

if __name__ == "__main__":
     nthreads = 4
     queue = Queue.Queue()
     # put stuff in the queue here 
     moreStuff = ['http://www.google.com','http://www.yahoo.com','http://www.cnn.com']
     for stuff in moreStuff:
         queue.put(stuff)
     procs = [Process(target = worker, args = (queue,)) for i in xrange(nthreads)]
     for p in procs:
       p.start()
     for p in procs:
       p.join()
Output:

    start process
http://www.google.com
start process
http://www.google.com
start process
http://www.google.com
start process
http://www.google.com
read data
200
start process
http://www.yahoo.com
read data
200
start process
http://www.yahoo.com
read data
200
start process
http://www.yahoo.com
read data
200
start process
http://www.yahoo.com
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
read data
200
start process
http://www.cnn.com
read data
200
start process
http://www.cnn.com
read data
200
start process
http://www.cnn.com
read data
200
start process
read data
200
start process
http://www.cnn.com
read data
200
start process
read data
200
start process
read data
200
start process
How can I check whether the queue is empty and exit?

Use [...] for the queue.
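
A likely reason the same URL shows up four times: `Queue.Queue` is a thread queue, so after the processes are forked each one works on its own private copy of it, handles every item, and then blocks in `get()` on an empty queue. Below is a minimal sketch of one way around this, written for Python 3 and assuming the answer means a process-safe queue such as `multiprocessing.JoinableQueue` (the name is missing from the answer text). `handle`, `STOP`, and `run` are hypothetical names, `handle` is a stand-in for `readdata()` so the sketch needs no network, and the `fork` start method is assumed (POSIX only).

```python
import multiprocessing as mp

# Assumption: a POSIX system where the "fork" start method is available,
# matching the Linux paths in the question's output.
ctx = mp.get_context("fork")

STOP = None  # hypothetical sentinel that tells a worker to exit


def handle(item):
    # Stand-in for readdata(); the real code would call requests.get(item).
    return len(item)


def worker(tasks, results):
    while True:
        item = tasks.get()  # blocks until an item or a sentinel arrives
        try:
            if item is STOP:
                break
            results.put((item, handle(item)))
        finally:
            tasks.task_done()  # one task_done() per get(), sentinel included


def run(urls, nprocs=4):
    tasks = ctx.JoinableQueue()  # shared between processes, unlike Queue.Queue
    results = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(tasks, results))
             for _ in range(nprocs)]
    for p in procs:
        p.start()
    for url in urls:
        tasks.put(url)
    for _ in range(nprocs):
        tasks.put(STOP)  # one sentinel per worker
    tasks.join()  # returns once every item has been marked done
    out = [results.get() for _ in urls]  # drain results before joining
    for p in procs:
        p.join()
    return sorted(out)
```

Calling `run(['http://www.google.com', 'http://www.yahoo.com'])` returns one `(url, result)` pair per URL, and each worker exits after it reads its sentinel, so there is no need to poll `empty()`.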

Also, as a suggestion: since nothing is added to your queue after the workers start, I would use:

while not queue.empty():  # Wait for the queue to finish
    pass

print('Queue finished')
instead of:

for p in procs:
    p.join()
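or, better, use:

for p in procs:
    p.start()
queue.join()

Thanks. I'm checking queue.empty() and breaking out of the loop, and it works fine. But I'm not sure why the same item gets processed multiple times.

Since the queue doesn't change, you could also use a pool to simplify the whole thing, couldn't you?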
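
The pool mentioned in the comments could look roughly like this. It is a sketch under assumptions, not code from the thread: it targets Python 3, `fetch_info` and `run_pool` are hypothetical names, `fetch_info` stands in for `readdata()` so no network is needed, and the `fork` start method is assumed (POSIX only).

```python
import multiprocessing as mp


def fetch_info(url):
    # Stand-in for readdata(); the real code would call requests.get(url)
    # and return something like r.status_code. The length is a placeholder.
    return url, len(url)


def run_pool(urls, nprocs=4):
    # Assumption: POSIX system with the "fork" start method available.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=nprocs) as pool:
        # map() hands each URL to exactly one worker and returns the
        # results in input order once all of them have finished.
        return pool.map(fetch_info, urls)
```

With a pool there is no queue to manage and no exit condition to write: `run_pool(urls)` returns when every URL has been handled once, and the worker processes are shut down when the `with` block ends.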