Python: processing multiple items in a queue using workers
I am trying to build a simple multiprocessing application using a queue. I start 4 processes to handle data from several websites. I want each process to handle a different website, but for some reason the processes run multiple times and never exit.
from multiprocessing import Process
import Queue
import requests

def readdata(item):
    print item
    r = requests.get(item)
    print 'read data'
    print r.status_code

def worker(queue):
    while True:
        try:
            print 'start process'
            item = queue.get()
            readdata(item)
            q.task_done()
        except:
            print "the end"
            break

if __name__ == "__main__":
    nthreads = 4
    queue = Queue.Queue()
    # put stuff in the queue here
    moreStuff = ['http://www.google.com','http://www.yahoo.com','http://www.cnn.com']
    for stuff in moreStuff:
        queue.put(stuff)
    procs = [Process(target = worker, args = (queue,)) for i in xrange(nthreads)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
Output:
start process
http://www.google.com
start process
http://www.google.com
start process
http://www.google.com
start process
http://www.google.com
read data
200
start process
http://www.yahoo.com
read data
200
start process
http://www.yahoo.com
read data
200
start process
http://www.yahoo.com
read data
200
start process
http://www.yahoo.com
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
read data
200
start process
http://www.cnn.com
read data
200
start process
http://www.cnn.com
read data
200
start process
http://www.cnn.com
read data
200
start process
read data
200
start process
http://www.cnn.com
read data
200
start process
read data
200
start process
read data
200
start process
How can I check whether the queue is empty and exit?
Use the emptiness check on the queue itself.
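One way to make workers exit once the queue is drained is a non-blocking get that stops on `queue.Empty`. The sketch below is only an illustration under stated assumptions: it targets Python 3 (where the module is named `queue`), it swaps the processes for threads because a plain `queue.Queue` is shared only between threads of one process, and it uses a hypothetical `fetch` stand-in instead of `requests.get`. It also relies on every item being queued before the workers start.

```python
import queue
import threading


def fetch(url):
    # Hypothetical stand-in for requests.get(url).status_code, so the
    # sketch runs without network access.
    return len(url)


def worker(tasks, results):
    while True:
        try:
            # get_nowait() raises queue.Empty instead of blocking forever.
            url = tasks.get_nowait()
        except queue.Empty:
            break  # nothing left to do, so this worker exits
        results.append((url, fetch(url)))
        tasks.task_done()


def run(urls, nworkers=4):
    tasks = queue.Queue()
    for url in urls:
        tasks.put(url)
    results = []  # list.append is safe to call from several threads in CPython
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(nworkers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    sites = ['http://www.google.com', 'http://www.yahoo.com', 'http://www.cnn.com']
    for url, value in run(sites):
        print(url, value)
```

Because all workers pull from the same queue object, each URL is handed to exactly one worker, and every worker returns as soon as the queue reports it is empty.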
Also, as a suggestion, since your queue does not change, I would use:
while not queue.empty():  # Wait for the queue to finish
    pass
print('Queue finished')
instead of:
for p in procs:
    p.join()
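Or, better, use:
for p in procs:
    p.start()
queue.join()
Thanks. I am now checking queue.empty and breaking out of the loop, and it works well. But I am not sure why the same item gets processed multiple times.
Since the queue does not change, you could also use a pool to simplify the whole thing, couldn't you?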
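On the repeated processing raised in the comment above: `Queue.Queue` is a thread-level queue, so when it is passed to `multiprocessing.Process` each child most likely ends up with its own copy of all three URLs rather than sharing one queue. A pool avoids managing the queue by hand. The following is a minimal sketch, assuming Python 3 and a hypothetical stand-in for `readdata` that does no network I/O; it is not the original poster's code.

```python
from multiprocessing import Pool


def readdata(url):
    # Hypothetical stand-in for the original readdata(): the real version
    # would call requests.get(url) and return r.status_code.
    return url, len(url)


def main():
    urls = ['http://www.google.com', 'http://www.yahoo.com', 'http://www.cnn.com']
    # The pool distributes the URLs so that each one is handled exactly once
    # by one of the 4 worker processes; map() returns when all are done.
    with Pool(processes=4) as pool:
        for url, value in pool.map(readdata, urls):
            print(url, value)


if __name__ == "__main__":
    main()
```

If an explicit queue is still preferred, `multiprocessing.JoinableQueue` provides `task_done()` and `join()` across processes.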