Python 3.x Celery canvas: how to distribute the elements of a task's result list to a chain, and then chain further tasks afterwards

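I am currently learning Celery and trying to build a DAG-like data processing pipeline. My idea is to create the pipeline with Celery canvas. The pipeline should contain tasks that operate on the whole list of objects, as well as tasks that are applied to a single object and distributed across the objects in the list. I implemented a data class to hold my objects and some dummy tasks, just to try out the pipeline architecture. I run Redis in a Docker container, without any additional configuration. I also wrote a custom JSON encoder/decoder for the dataclass. I know the example tasks don't make sense; they are only there to show an MVP of my problem.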

Data class:

from dataclasses import dataclass


@dataclass
class Car:
    car_id: int
    color: str
    tires: str
    doors: int
JSON encoder/decoder for the data class:

# https://stackoverflow.com/questions/43092113/create-a-class-that-support-json-serialization-for-use-with-celery
import json
from collections.abc import Iterable


def is_iterable(arg):
    # Strings are iterable too, but they are treated as scalars here.
    return isinstance(arg, Iterable) and not isinstance(arg, str)


class GenericJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        try:
            return super().default(obj)
        except TypeError:
            pass
        cls = type(obj)
        result = {
            '__custom__': True,
            '__module__': cls.__module__,
            '__name__': cls.__name__,
            'data': obj.__dict__ if not hasattr(cls, '__json_encode__') else obj.__json_encode__
        }
        return result


class GenericJSONDecoder(json.JSONDecoder):
    def decode(self, s):
        # Parse as regular JSON first, then rebuild any custom objects.
        result = super().decode(s)
        return GenericJSONDecoder.instantiate_object(result)

    @staticmethod
    def instantiate_object(result):
        if not isinstance(result, dict):  # or
            if is_iterable(result):
                return [GenericJSONDecoder.instantiate_object(v) for v in result]
            else:
                return result

        if not result.get('__custom__', False):
            return {k: GenericJSONDecoder.instantiate_object(v) for k, v in result.items()}

        import sys
        module = result['__module__']
        if module not in sys.modules:
            __import__(module)
        cls = getattr(sys.modules[module], result['__name__'])
        if hasattr(cls, '__json_decode__'):
            return cls.__json_decode__(result['data'])
        instance = cls.__new__(cls)
        data = {k: GenericJSONDecoder.instantiate_object(v) for k, v in result['data'].items()}
        instance.__dict__.update(data)
        return instance


def dumps(obj, *args, **kwargs):
    return json.dumps(obj, *args, cls=GenericJSONEncoder, **kwargs)


def loads(obj, *args, **kwargs):
    return json.loads(obj, *args, cls=GenericJSONDecoder, **kwargs)
My tasks:

@app.task
def get_cars_from_db():
    return [Car(car_id=1,color=None,tires=None,doors=2),Car(car_id=2,color=None,tires=None,doors=4),Car(car_id=3,color=None,tires=None,doors=4),Car(car_id=1,color=None,tires=None,doors=4)]

@app.task
def paint_car(car:Car):
    car.color = "blue"
    return car


@app.task
def filter_out_two_door(car:Car):
    if car.doors == 2:
        return None
    return car


@app.task
def filter_none(cars:[Car]):
    return [c for c in cars if c]


@app.task
def change_tires(car:Car):
    car.tires = "winter"
    return car

@app.task
def write_back_whatever(cars:[Car]):
    print(cars)

@app.task
def dmap(args_iter, celery_task):
    """
    Takes an iterator of argument tuples and queues them up for celery to run with the function.
    """
    print(args_iter)
    print(celery_task)
    return group(celery_task(arg) for arg in args_iter)
My Celery configuration:

from celery import Celery,subtask,group
from kombu.serialization import register, registry
from utils.json_encoders import dumps, loads
register("pipelineJSON",dumps,loads,content_type='application/x-pipelineJSON',content_encoding="utf-8")
registry.enable('pipelineJSON')
app = Celery('pipeline', broker='redis://localhost:6379/0',backend='redis://localhost:6379/0')
app.conf["accept_content"]=["application/x-pipelineJSON","pipelineJSON"]
app.conf["result_serializer"]="pipelineJSON"
app.conf["task_serializer"]="pipelineJSON"
Now I try to build and execute the following workflow:

paint_and_filter = paint_car.s() | filter_out_two_door.s()
workflow = (
    get_cars_from_db.s()
    | dmap.s(paint_and_filter)
    | filter_none.s()
    | dmap.s(change_tires.s())
    | write_back_whatever.s()
)
# A signature has to be sent to the workers before a result can be fetched.
workflow.apply_async().get()
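For reference, the data flow this workflow is meant to express can be sketched in plain synchronous Python, without Celery. This is only a reading of the tasks and the workflow above: `run_pipeline` is a hypothetical name, the per-car steps run in a simple loop rather than being distributed, and `Optional` type hints are added because the sample cars start with `None` values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Car:
    car_id: int
    color: Optional[str]
    tires: Optional[str]
    doors: int


def get_cars_from_db() -> List[Car]:
    return [
        Car(car_id=1, color=None, tires=None, doors=2),
        Car(car_id=2, color=None, tires=None, doors=4),
        Car(car_id=3, color=None, tires=None, doors=4),
        Car(car_id=1, color=None, tires=None, doors=4),
    ]


def paint_car(car: Car) -> Car:
    car.color = "blue"
    return car


def filter_out_two_door(car: Car) -> Optional[Car]:
    return None if car.doors == 2 else car


def filter_none(cars: List[Optional[Car]]) -> List[Car]:
    return [c for c in cars if c]


def change_tires(car: Car) -> Car:
    car.tires = "winter"
    return car


def run_pipeline() -> List[Car]:
    # Hypothetical helper: list-level step, per-car chain, list-level step, per-car step.
    cars = get_cars_from_db()
    painted = [filter_out_two_door(paint_car(c)) for c in cars]  # per-car chain
    kept = filter_none(painted)                                  # list-level step
    return [change_tires(c) for c in kept]                       # per-car step


result = run_pipeline()
print(result)
```

The two list comprehensions mark the places where the Celery version would need to fan out over the list and collect the results again before the next list-level task.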
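My problem is that I cannot pass the list result of the get_cars_from_db task on to another chain. I read through Stack Overflow and GitHub and came across dmap, but I did not manage to get it working. With the example code I provided, the worker throws the following exception: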

       return group(celery_task(arg) for arg in args_iter)
       TypeError: 'dict' object is not callable
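A likely explanation for this message: Celery signatures are dict subclasses, so a signature passed as a task argument goes through the serializer like any other dict and reaches the worker as a plain dict. The sketch below shows the mechanism with the standard library only. `FakeSignature` is a hypothetical stand-in, not a Celery class, and plain `json` stands in for the custom serializer.

```python
import json


class FakeSignature(dict):
    """Hypothetical stand-in for a task signature: a dict that is also callable."""

    def __call__(self, arg):
        return f"{self['task']}({arg!r})"


sig = FakeSignature(task="paint_car")
print(sig(1))  # callable on the sending side

# After a JSON round trip only the dict content survives; the subclass is gone.
received = json.loads(json.dumps(sig))
print(type(received).__name__)

error_message = None
try:
    received(1)  # mirrors celery_task(arg) inside dmap
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

A commonly suggested pattern is to rebuild the signature inside the task with `celery.signature(celery_task)` before using it. Whether that alone makes the workflow above run has not been verified here.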
I also tried wrapping celery_task(arg) in a subtask, as follows:

return group(subtask(celery_task)(arg) for arg in args_iter)
This produces the following error on the worker:

  File "/Users/utils/json_encoders.py", line 23, in default
  'data': obj.__dict__ if not hasattr(cls, '__json_encode__') else obj.__json_encode__
   kombu.exceptions.EncodeError: 'mappingproxy' object has no attribute '__dict__'
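One way this message can arise is when the encoder is handed a class object rather than an instance. That is an assumption: the traceback does not show what was actually being serialized. The encoder falls back to `obj.__dict__` for anything the standard JSON encoder rejects. For a class that is a `mappingproxy`, which is then passed to `default` again and has no `__dict__` of its own. The sketch below uses a trimmed copy of the encoder's fallback logic and a reduced `Car` class, both for illustration only.

```python
import json
from dataclasses import dataclass


class GenericJSONEncoder(json.JSONEncoder):
    """Trimmed copy of the fallback logic from the encoder in the question."""

    def default(self, obj):
        try:
            return super().default(obj)
        except TypeError:
            pass
        cls = type(obj)
        return {
            "__custom__": True,
            "__module__": cls.__module__,
            "__name__": cls.__name__,
            "data": obj.__dict__ if not hasattr(cls, "__json_encode__") else obj.__json_encode__,
        }


@dataclass
class Car:  # reduced to two fields for the sketch
    car_id: int
    doors: int


# An instance encodes without problems: its __dict__ is a regular dict.
encoded_instance = json.dumps(Car(car_id=1, doors=2), cls=GenericJSONEncoder)
print(encoded_instance)

# The class itself does not: Car.__dict__ is a mappingproxy, which the
# encoder passes to default() again, where obj.__dict__ no longer exists.
error_message = None
try:
    json.dumps(Car, cls=GenericJSONEncoder)
except AttributeError as exc:
    error_message = str(exc)
print(error_message)
```

If this is what happens on the worker, the fallback branch is being reached by an object that was never meant to be serialized, so the value returned from dmap is the first thing to inspect.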
I tried to draw a picture of what I want to achieve:

I am using Celery 5.02 and Python 3.8.3.

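I would really appreciate it if someone could help me. How can I get this dmap to work? Is there another or better solution for what I am trying to achieve? Thanks in advance.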