Python cassandra driver OperationTimedOut on every query in a Celery task
I have a problem with every insert query (small queries) that is executed asynchronously in Celery tasks. In synchronous mode, when I do all the inserts, everything works fine, but when the same code runs via apply_async() I get this:
OperationTimedOut('errors=errors=errors={}, last_host=***.***.*.***, last_host=None, last_host=None',)
Traceback:
Traceback (most recent call last):
  File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/celery/app/trace.py", line 437, in __protected_call__
    return self.run(*args, **kwargs)
  File "/var/nfs_www/***/www_v1/app/mods/news_feed/tasks.py", line 26, in send_new_comment_reply_notifications
    send_new_comment_reply_notifications_method(comment_id)
  File "/var/nfs_www/***www_v1/app/mods/news_feed/methods.py", line 83, in send_new_comment_reply_notifications
    comment_type='comment_reply'
  File "/var/nfs_www/***/www_v1/app/mods/news_feed/models/storage.py", line 129, in add
    CommentsFeed(**kwargs).save()
  File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cqlengine/models.py", line 531, in save
    consistency=self.__consistency__).save()
  File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cqlengine/query.py", line 907, in save
    self._execute(insert)
  File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cqlengine/query.py", line 786, in _execute
    tmp = execute(q, consistency_level=self._consistency)
  File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cqlengine/connection.py", line 95, in execute
    result = session.execute(query, params)
  File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cassandra/cluster.py", line 1103, in execute
    result = future.result(timeout)
  File "/var/nfs_www/***/env_v0/local/lib/python2.7/site-packages/cassandra/cluster.py", line 2475, in result
    raise OperationTimedOut(errors=self._errors, last_host=self._current_host)
OperationTimedOut: errors={}, last_host=***.***.*.***
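Does anyone have an idea what causes this?
I found a related thread, but my queries are small, and the problem occurs only in Celery tasks.
UPDATE:
I made a test task, and it raises this error too: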
@celery.task()
def test_task_with_cassandra():
    from app import cassandra_session
    cassandra_session.execute('use news_feed')
    return 'Done'
UPDATE 2:
This task:
@celery.task()
def test_task_with_cassandra():
    from cqlengine import connection
    connection.setup(app.config['CASSANDRA_SERVERS'], port=app.config['CASSANDRA_PORT'],
                     default_keyspace='test_keyspace')
    from .models import Feed
    Feed.objects.count()
    return 'Done'
gives this:
NoHostAvailable('Unable to connect to any servers', {'***.***.*.***': OperationTimedOut('errors=errors=Timed out creating connection, last_host=None, last_host=None',)})
From the shell I can connect to it.
UPDATE 3:
From a deleted thread on a GitHub issue (found in my email; this worked for me too):
Here is how I added CQLengine to Celery:
from celery import Celery
from celery.signals import worker_process_init, beat_init
from cqlengine import connection
from cqlengine.connection import (
    cluster as cql_cluster, session as cql_session)


# Celery signal receivers are called with keyword arguments, so accept **kwargs.
def cassandra_init(**kwargs):
    """ Initialize a clean Cassandra connection. """
    if cql_cluster is not None:
        cql_cluster.shutdown()
    if cql_session is not None:
        cql_session.shutdown()
    # Pass your own hosts and default keyspace here.
    connection.setup()

# Initialize worker context for both standard and periodic tasks.
worker_process_init.connect(cassandra_init)
beat_init.connect(cassandra_init)

app = Celery()
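This is crude, but it works. Should we add this snippet to the FAQ?
I had a similar problem. It seems to be related to sharing the Cassandra session between tasks. I solved it by creating one session per thread. Make sure you call get_session() from inside your tasks, and define it like this: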
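import threading

from cassandra.cluster import Cluster

# `settings` is assumed to be the project's settings object (for example django.conf.settings).
thread_local = threading.local()


def get_session():
    # Reuse the session this thread has already opened, if any.
    if hasattr(thread_local, "cassandra_session"):
        return thread_local.cassandra_session
    cluster = Cluster(settings.CASSANDRA_HOSTS)
    session = cluster.connect(settings.CASSANDRA_KEYSPACE)
    thread_local.cassandra_session = session
    return session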
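To illustrate why a thread-local cache gives each thread its own session, here is a minimal sketch that runs without the Cassandra driver. `FakeSession`, `make_session`, and `collect_sessions` are hypothetical names introduced only for this illustration; in real code `make_session` would be the `Cluster(...).connect(...)` call from the answer above.

```python
import threading

thread_local = threading.local()


class FakeSession(object):
    """Hypothetical stand-in for a cassandra-driver Session."""

    def __init__(self, owner):
        self.owner = owner  # name of the thread that created this session


def make_session():
    # Assumption: real code would call Cluster(hosts).connect(keyspace) here.
    return FakeSession(threading.current_thread().name)


def get_session():
    # Create the session on first use in this thread, then reuse it.
    if not hasattr(thread_local, "cassandra_session"):
        thread_local.cassandra_session = make_session()
    return thread_local.cassandra_session


def collect_sessions(n_threads=3):
    """Call get_session() twice in each thread and return the results by thread name."""
    results = {}
    lock = threading.Lock()

    def worker():
        first, second = get_session(), get_session()
        with lock:
            results[threading.current_thread().name] = (first, second)

    threads = [threading.Thread(target=worker, name="t%d" % i)
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Within one thread, repeated calls return the same object; different threads never see each other's session. Note that this isolates threads only. With Celery's default prefork pool the workers are separate processes, so the session still has to be created after the fork, for example inside the task or in a worker_process_init handler.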
Inspired by Ron's answer, I came up with the following code to put in tasks.py:
import threading

from django.conf import settings
from cassandra.cluster import Cluster
from celery.signals import worker_process_init, worker_process_shutdown

thread_local = threading.local()


@worker_process_init.connect
def open_cassandra_session(*args, **kwargs):
    cluster = Cluster([settings.DATABASES["cassandra"]["HOST"],], protocol_version=3)
    session = cluster.connect(settings.DATABASES["cassandra"]["NAME"])
    thread_local.cassandra_session = session


@worker_process_shutdown.connect
def close_cassandra_session(*args, **kwargs):
    session = thread_local.cassandra_session
    session.shutdown()
    thread_local.cassandra_session = None
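This neat solution automatically opens and closes the Cassandra session when the Celery worker process starts and stops.
Side note: protocol_version=3 is used because Cassandra 2.1 only supports protocol version 3 and lower.
The other answers did not work for me, but "Update 3" in the question did. Here is what I ended up with (a small update to what the question suggests):
Using django-cassandra-engine, the following solved the problem for me: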
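from celery.signals import worker_process_init, worker_shutdown
from django.db import connections

db_connection = connections['cassandra']


@worker_process_init.connect
def connect_db(**_):
    # Open a fresh connection in each forked worker process.
    db_connection.reconnect()


@worker_shutdown.connect
def disconnect(**_):
    db_connection.connection.close_all()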
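Does your Celery user have permission to run queries?
Hmm, I have not set up any authentication provider for it. Or what do you mean? Where should I look? I updated my question.
Did you solve it? I have the same problem and have no idea what is going on.
@haifzhan, added some rare info specially for you :)) See Update 3.
Thanks for posting, @EllochkaCannibal. I checked my script and found that if multiple processes share one session, it raises OperationTimedOut. After I created one session per process, the problem was solved.
I upvoted this answer, but I changed my mind: it did not work for me. I posted a different answer.
You saved my day, my friend.
You saved the day for another friend too... Thanks a lot.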
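The comment above reports that sharing one session across processes raised OperationTimedOut and that one session per process fixed it. One way to enforce that rule without Celery signals is to remember which process opened the session and reopen it when the PID changes. The sketch below is only an illustration of that idea: `PerProcessSession` is a hypothetical helper, not part of cassandra-driver or cqlengine, and `factory` is assumed to be whatever callable opens your session.

```python
import os


class PerProcessSession(object):
    """Cache one session per OS process and rebuild it after a fork."""

    def __init__(self, factory):
        self._factory = factory  # callable that opens a new session
        self._session = None
        self._pid = None         # PID of the process that opened the session

    def get(self):
        pid = os.getpid()
        if self._session is None or self._pid != pid:
            # First use, or this is a forked child that inherited the
            # parent's session object: open a fresh one for this process.
            self._session = self._factory()
            self._pid = pid
        return self._session
```

A task would then call something like `holder.get().execute(...)`, where `holder = PerProcessSession(lambda: Cluster(hosts).connect(keyspace))` is created at module level; `hosts` and `keyspace` are placeholders for your own configuration. The sketch deliberately does not shut down the inherited session in the child, since that object belongs to the parent process.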