Is it possible to extend Celery so that it stores results in several MongoDB collections?

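I started a new project, and I would like Celery to save results to several MongoDB collections instead of one. Is there a way to do this through configuration, or do I need to extend Celery and Kombu to achieve it?

Celery is licensed under the BSD license. The source code is open.

"Is it possible to extend Celery"

Of course. That is part of the freedoms the license grants.

"I am wondering whether I can make Celery save results to several MongoDB collections instead of one"

So you need to download the source code and spend the necessary time and effort to study and modify it.

Read up on software development. Consider proposing your code improvement upstream as a pull request.

You don't need to modify Celery; you can extend it. That is exactly what I did for an internal project. In my case, I did not want to touch the standard result backend (Redis), but I also wanted to store the state and results of tasks permanently in MongoDB, while enriching the state/result.

I ended up creating a small library with a class named TaskTracker, which uses Celery's signals machinery to achieve this. The key parts of the implementation look as follows: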

import datetime

from celery import signals, states
from celery.exceptions import ImproperlyConfigured
from pymongo import MongoClient, ReturnDocument

class TaskTracker(object):
    """Track task processing and store the state in MongoDB."""

    def __init__(self, app):
        self.config = app.conf.get('task_tracker')
        if not self.config:
            raise ImproperlyConfigured('Task tracker configuration missing')
        self.tasks = set()
        self._mongo = None

        self._connect_signals()

    @property
    def mongo(self):
        # create client on first use to avoid 'MongoClient opened before fork.'
        # warning
        if not self._mongo:
            self._mongo = self._connect_to_mongodb()
        return self._mongo

    def _connect_to_mongodb(self):
        client = MongoClient(self.config['mongodb']['uri'])
        # check connection / error handling
        # ...
        return client

    def _connect_signals(self):
        signals.task_received.connect(self._on_task_received)
        signals.task_prerun.connect(self._on_task_prerun)
        signals.task_retry.connect(self._on_task_retry)
        signals.task_revoked.connect(self._on_task_revoked)
        signals.task_success.connect(self._on_task_success)
        signals.task_failure.connect(self._on_task_failure)

    def _on_task_received(self, sender, request, **other_kwargs):
        if request.name not in self.tasks:
            return

        collection = self.mongo \
            .get_database(self.config['mongodb']['database']) \
            .get_collection(self.config['mongodb']['collection'])
        collection.find_one_and_update(
            {'_id': request.id},
            {
                '$setOnInsert': {
                    'name': request.name,
                    'args': request.args,
                    'kwargs': request.kwargs,
                    'date_received': datetime.datetime.utcnow(),
                    'job_id': request.message.headers.get('job_id')
                },
                '$set': {
                    'status': states.RECEIVED,
                    'root_id': request.root_id,
                    'parent_id': request.parent_id
                },
                '$push': {
                    'status_history': {
                        'date': datetime.datetime.utcnow(),
                        'status': states.RECEIVED
                    }
                }
            },
            upsert=True,
            return_document=ReturnDocument.AFTER)

    # similarly for other signals...
    def _on_task_prerun(self, sender, task_id, task, args, kwargs,
                        **other_kwargs):
        # ... (body elided in the original answer)
        pass

    def _on_task_retry(self, sender, request, reason, einfo, **other_kwargs):
        # ... (body elided in the original answer)
        pass

    # ...

    def track(self, task):
        """Set up tracking for given task."""
        # accept either task name or task instance (for use as a decorator)
        if isinstance(task, str):
            self.tasks.add(task)
        else:
            self.tasks.add(task.name)
            return task
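
The handlers elided above presumably write an update much like the one in `_on_task_received`. A minimal sketch of a shared helper for that, assuming the same document layout (`status` plus a `status_history` array); the name `build_status_update` and its parameters are hypothetical and not part of the original library:

```python
import datetime


def build_status_update(status, extra_fields=None, now=None):
    """Build a MongoDB update document that records a status change.

    Hypothetical helper: mirrors the '$set' / '$push' layout used in
    _on_task_received so that every signal handler can reuse it.
    """
    # Assumption: timezone-aware UTC timestamps are acceptable here.
    now = now or datetime.datetime.now(datetime.timezone.utc)
    fields = {'status': status}
    fields.update(extra_fields or {})
    return {
        # Overwrite the current status and any extra fields.
        '$set': fields,
        # Append the transition to the history array.
        '$push': {'status_history': {'date': now, 'status': status}},
    }
```

A handler such as `_on_task_prerun` could then pass the returned document to `collection.update_one({'_id': task_id}, ...)` with the matching Celery state, for example `states.STARTED`.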

Then you need to provide the MongoDB configuration. I use a YAML configuration file for Celery, so it looks like this:

# standard Celery settings...
# ...

task_tracker:
    # MongoDB database for storing task state and results
    mongodb:
        uri: "\
            mongodb://myuser:mypassword@\
            mymongo.mydomain.com:27017/?\
            replicaSet=myreplica&tls=true&connectTimeoutMS=5000&\
            w=1&wtimeoutMS=3000&readPreference=primaryPreferred&maxStalenessSeconds=-1&\
            authSource=mydatabase&authMechanism=SCRAM-SHA-1"
        database: 'mydatabase'
        collection: 'tasks'

In your tasks module, you then just create an instance of the class, passing it the Celery app, and decorate your tasks:

import os

from celery import Celery
import yaml

from celery_common.tracking import TaskTracker  # my custom utils library


config_file = os.environ.get('CONFIG_FILE', default='/srv/celery/config.yaml')
with open(config_file) as f:
    config = yaml.safe_load(f) or {}

app = Celery(__name__)
app.conf.update(config)

tracker = TaskTracker(app)

@tracker.track
@app.task(name='mytask')
def mytask(myparam1, myparam2, *args, **kwargs):
    pass

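Now the state and results of your tasks are tracked in MongoDB, separately from the standard result backend. If you need to store them in multiple databases, you can tweak this a little: create multiple TaskTracker instances and apply multiple decorators to your tasks.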
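
As an alternative to one TaskTracker instance per collection, a single tracker could pick the collection per task. The following is only a sketch under that assumption; `CollectionRouter`, the prefix-based naming convention, and the collection names are all hypothetical:

```python
class CollectionRouter:
    """Map task names to MongoDB collection names (hypothetical helper)."""

    def __init__(self, routes, default='tasks'):
        # routes: dict of task-name prefix -> collection name
        self.routes = dict(routes)
        self.default = default

    def collection_for(self, task_name):
        # The longest matching prefix wins, so 'reports.daily.' beats 'reports.'.
        matches = [p for p in self.routes if task_name.startswith(p)]
        if not matches:
            return self.default
        return self.routes[max(matches, key=len)]


router = CollectionRouter({
    'reports.': 'report_tasks',
    'reports.daily.': 'daily_report_tasks',
    'emails.': 'email_tasks',
})
```

Inside a signal handler, `router.collection_for(request.name)` could then replace the fixed `self.config['mongodb']['collection']` value passed to `get_collection`.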

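I have updated the question. I was not asking whether Celery can be extended; I am more interested in how to extend it so that it meets the requirement of saving task results to different MongoDB collections. I have already done the forking part, although it looks like I would need to extend the kombu library as well. Before digging deeper, I decided to ask, in case there is a simpler way that does not require changes to several libraries.

I think every serious Celery user implements their own monitoring at some point, using the monitoring and inspection API... myself included.

I agree. The main reason for my solution is a bit different, though. I wanted to provide generic job management functionality for our users; Celery plus RedBeat as the scheduler is only the execution part of the jobs, the runtime. MongoDB then serves as the permanent store for job run information.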