Python: group documents by dynamic keys and convert the keys to values

python mongodb

I have a collection in MongoDB whose documents look like this:

{
    "good": {
        "d1": 2,
        "d2": 56,
        "d3": 3
    },
    "school": {
        "d1": 4,
        "d3": 5,
        "d4": 12
    }
},
{
    "good": {
        "d5": 4,
        "d6": 5
    },
    "spark": {
        "d5": 6,
        "d6": 11,
        "d7": 10
    },
    "school": {
        "d5": 8,
        "d8": 7
    }
}

I want to use pymongo mapReduce to produce data like the following:

{
    'word': 'good',
    'info': [
        {
            'tbl_id': 'd1',
            'term_freq': 2
        },
        {
            'tbl_id': 'd2',
            'term_freq': 56
        },
        {
            'tbl_id': 'd3',
            'term_freq': 3
        },
        {
            'tbl_id': 'd5',
            'term_freq': 4
        },
        {
            'tbl_id': 'd6',
            'term_freq': 5
        }
    ]
}
{
    'word': 'school',
    'info': [
        {
            'tbl_id': 'd1',
            'term_freq': 4
        },
        {
            'tbl_id': 'd3',
            'term_freq': 5
        },
        {
            'tbl_id': 'd4',
            'term_freq': 12
        },
        {
            'tbl_id': 'd5',
            'term_freq': 8
        },
        {
            'tbl_id': 'd8',
            'term_freq': 7
        }
    ]
}
{
    'word': 'spark',
    'info': [
        {
            'tbl_id': 'd5',
            'term_freq': 6
        },
        {
            'tbl_id': 'd6',
            'term_freq': 11
        },
        {
            'tbl_id': 'd7',
            'term_freq': 10
        }
    ]
}

How can I do this? Or is there another solution?

You don't need mapReduce here. The aggregation framework handles this problem well.
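To make the target transformation concrete, here is a minimal plain-Python sketch of the regrouping the question asks for, run on the two sample documents. It is only an illustration of the intended result, not code from the original answer; the function name invert_by_word is made up for this example, and sorting the words alphabetically is an assumption made to get a stable output order.

```python
from collections import defaultdict

# The two sample documents from the question (no _id field here).
docs = [
    {"good": {"d1": 2, "d2": 56, "d3": 3},
     "school": {"d1": 4, "d3": 5, "d4": 12}},
    {"good": {"d5": 4, "d6": 5},
     "spark": {"d5": 6, "d6": 11, "d7": 10},
     "school": {"d5": 8, "d8": 7}},
]


def invert_by_word(documents):
    """Group every (tbl_id, term_freq) pair under the word it belongs to.

    Hypothetical helper for illustration only.
    """
    grouped = defaultdict(list)
    for doc in documents:
        for word, freqs in doc.items():
            if word == "_id":  # real MongoDB documents also carry an _id
                continue
            for tbl_id, term_freq in freqs.items():
                grouped[word].append({"tbl_id": tbl_id, "term_freq": term_freq})
    # Assumption: alphabetical order of words, for a deterministic result.
    return [{"word": word, "info": grouped[word]} for word in sorted(grouped)]


result = invert_by_word(docs)
for entry in result:
    print(entry)
```

The dynamic top-level keys become values of the word field, and the dynamic inner keys become values of tbl_id, which is the same reshaping the server-side approach has to perform.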

As for how it works, I suggest you look up each operator in the documentation.
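The pipeline itself did not survive in this copy of the answer, so what follows is a hedged reconstruction rather than the original code. It is one possible pipeline, assuming MongoDB 3.4.4 or later (for $objectToArray) and assuming every top-level field other than _id holds a sub-document of tbl_id/term_freq pairs. The intermediate field name words and the helper group_by_word are made up for this sketch.

```python
# One possible aggregation pipeline (a sketch, not the original answer's code).
pipeline = [
    # Turn each document into an array of {k: <word>, v: <sub-document>} pairs.
    {"$project": {"_id": 0, "words": {"$objectToArray": "$$ROOT"}}},
    # One output document per word.
    {"$unwind": "$words"},
    # $$ROOT still contains _id, so drop that pair.
    {"$match": {"words.k": {"$ne": "_id"}}},
    # Turn each sub-document into an array of {k: <tbl_id>, v: <term_freq>} pairs.
    {"$project": {"word": "$words.k", "info": {"$objectToArray": "$words.v"}}},
    # One output document per (word, tbl_id) pair.
    {"$unwind": "$info"},
    # Collect all pairs that share the same word.
    {"$group": {
        "_id": "$word",
        "info": {"$push": {"tbl_id": "$info.k", "term_freq": "$info.v"}},
    }},
    # Rename _id to word to match the requested output shape.
    {"$project": {"_id": 0, "word": "$_id", "info": 1}},
    # Optional: return the words in alphabetical order.
    {"$sort": {"word": 1}},
]


def group_by_word(collection):
    """Run the pipeline on a pymongo collection (hypothetical helper)."""
    return list(collection.aggregate(pipeline))
```

Here collection is assumed to be a pymongo Collection object. The order of the entries pushed into info is not guaranteed by $group, so add an explicit $sort before the $group stage if the order matters.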

Then run the pipeline with the .aggregate() method:

collection.aggregate(pipeline)