Python pymongo MongoClient not working with multiprocessing?
I'm using pymongo 3.2, and I want to use it with multiprocessing:
import concurrent.futures

from pymongo import MongoClient

client = MongoClient(JD_SEARCH_MONGO_URI, connect=False)
db = client.jd_search
with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
    for jd in db['sample_data'].find():
        jdId = jd["jdId"]
        for cv in db["sample_data"].find():
            itemId = cv["itemId"]
            executor.submit(intersect_compute, jdId, itemId)
            # print "done {} => {}".format(jdId, itemId)
But I got an error:
UserWarning: MongoClient opened before fork. Create MongoClient with connect=False, or create client after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#using-pymongo-with-multiprocessing>
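According to the documentation, I have already set connect to False, as you can see.
What you are doing is exactly what the documentation shows (apart from the URL), but it matches the "never do this" part.
P.S. I updated your code sample at the end of this answer.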
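Create the connection to the database inside each process:
Never do this:
What you need to change is to move the database connection initialization into each forked process, so that each of them has its own independent connection.
Your sample, updated: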
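What platform are you on? The documentation says you only need to set connect to False if your platform is Apple, FreeBSD, VMS, OpenBSD or NetBSD.
According to the message, if the parent process has fetched some data from MongoDB, then close the connection. After that, in the child process, the warning is triggered when a new connection is opened to fetch something else.
I develop on Apple, and I have a remote Ubuntu 14 machine, so it will run on Ubuntu.
In any case, it is only a warning, not an error. PyMongo issues the warning when there is a possibility of a deadlock. Could you also show the code of intersect_compute()? My guess is that the function uses the client or db global variables. Try creating and using a new client inside intersect_compute().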
# Each process creates its own instance of MongoClient.
def func():
    db = pymongo.MongoClient().mydb
    # Do something with db.

proc = multiprocessing.Process(target=func)
proc.start()
client = pymongo.MongoClient()

# Each child process attempts to copy a global MongoClient
# created in the parent process. Never do this.
def func():
    db = client.mydb
    # Do something with db.

proc = multiprocessing.Process(target=func)
proc.start()
# Updated sample from the answer: the client is created inside the with block.
with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
    client = MongoClient(JD_SEARCH_MONGO_URI, connect=False)
    db = client.jd_search
    for jd in db['sample_data'].find():
        jdId = jd["jdId"]
        for cv in db["sample_data"].find():
            itemId = cv["itemId"]
            executor.submit(intersect_compute, jdId, itemId)
            # print "done {} => {}".format(jdId, itemId)
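If the warning still shows up after a change like the one above, one way to find out which call touches a client that was opened before the fork is to promote that specific warning to an exception, so that a traceback points at the offending line. This is a diagnostic sketch using only the standard warnings module; the helper name raise_on_fork_warning() is hypothetical, and the prefix is copied from the warning text quoted in the question.

```python
import warnings

# Start of the warning text quoted in the question.
FORK_WARNING_PREFIX = "MongoClient opened before fork"


def raise_on_fork_warning():
    """Turn the fork warning into an exception so its origin shows in a traceback."""
    warnings.filterwarnings(
        "error",                      # raise instead of printing
        message=FORK_WARNING_PREFIX,  # regex matched against the start of the message
        category=UserWarning,
    )
```

Call raise_on_fork_warning() once before the executor is created. Exceptions raised inside a worker are stored on the future returned by executor.submit(), so they only become visible when result() is called on that future.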