MongoDB: is it possible to $lookup from dicts nested in an array?

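Suppose I have the following document model:

{"emb": [{"emb_a": a1, "emb_b": b1}, {"emb_a": a2, "emb_b": b2}]}
In this structure, a1, b1, a2 and b2 each stand for a different ObjectId.

The goal is to aggregate the query results so that everything is loaded into memory at once.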

from pymongo import MongoClient
from bson import ObjectId
from pprint import pprint


class Config(object):
    DATABASE_URI = "mongodb://localhost:27017/test"
    DATABASE = "test_db"


print(f"Connecting to: [{Config.DATABASE}]...")
client = MongoClient(Config.DATABASE_URI)
db = client[Config.DATABASE]
print(f"Connected: [{Config.DATABASE}]...")


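# insert_one(...).inserted_id gives the new document's ObjectId
# (Collection.insert was removed in PyMongo 4)
a1 = db.a.insert_one({"a": 1}).inserted_id
a2 = db.a.insert_one({"a": 2}).inserted_id

b1 = db.b.insert_one({"b": 1}).inserted_id
b2 = db.b.insert_one({"b": 2}).inserted_id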


def generate_doc():
    return {"emb": [{"emb_a": a1, "emb_b": b1}, {"emb_a": a2, "emb_b": b2}]}


# INSERT A BUNCH OF DOCUMENTS
db.test_collection.insert_many([generate_doc() for i in range(0, 5)])

# AGGREGATION PIPELINE
pprint(
    list(
        db.test_collection.aggregate(
            [
                {
                    "$lookup": {
                        "from": "a",
                        "localField": "emb.emb_a",
                        "foreignField": "_id",
                        "as": "emb.emb_a",
                    }
                },
                {
                    "$lookup": {
                        "from": "b",
                        "localField": "emb.emb_b",
                        "foreignField": "_id",
                        "as": "emb.emb_b",
                    }
                },
            ]
        )
    )
)


client.drop_database(Config.DATABASE)
Here is the result of that script:

{'_id': ObjectId('5cd0af6deb62e064cd99bae4'),
  'emb': {'emb_a': [{'_id': ObjectId('5cd0af6deb62e064cd99badc'), 'a': 1},
                    {'_id': ObjectId('5cd0af6deb62e064cd99badd'), 'a': 2}],
          'emb_b': []}}
But what I would like to get is:

{
    "emb": [
        {"emb_a": {'_id': ObjectId('5cd0af6deb62e064cd99badc'), 'a': 1}, "emb_b": {'_id': ObjectId('5cd0af6deb62e064cd99badd'), 'b': 1}},
        {"emb_a": {'_id': ObjectId('5cd0af6deb62e064cd99bade'), 'a': 2}, "emb_b": {'_id': ObjectId('5cd0af6deb62e064cd99badf'), 'b': 2}}
    ]
}

Is this possible?

Your query does not work because the "as" clause of each $lookup overwrites the emb attribute. Try this:

db.test_collection.aggregate(
[
    {
        "$lookup": {
            "from": "a",
            "localField": "emb.emb_a",
            "foreignField": "_id",
            "as": "emb_a",
        }
    },
    {
        "$lookup": {
            "from": "b",
            "localField": "emb.emb_b",
            "foreignField": "_id",
            "as": "emb_b",
        }
    },
    {
        $project: {
            '_id': 0,
            'emb': 0
        }
    },
    {
        $replaceRoot: {
            newRoot: {
                'emb': {
                    'emb_a': '$emb_a',
                    'emb_b': '$emb_b'
                }
            }
        }
    }
]);
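The question uses PyMongo, while the pipeline above is written in mongo shell syntax. As a rough sketch, the same four stages could be expressed as a Python list; the only real difference is that every operator and key has to be a quoted string. The helper name build_answer_pipeline is made up for this example.

```python
from pprint import pprint


def build_answer_pipeline():
    """Return the four stages of the shell pipeline as plain Python dicts.

    Hypothetical helper: the name is not part of PyMongo or of the answer.
    """
    return [
        # Store the looked-up "a" documents in a new top-level field
        # instead of overwriting emb.
        {
            "$lookup": {
                "from": "a",
                "localField": "emb.emb_a",
                "foreignField": "_id",
                "as": "emb_a",
            }
        },
        # Same for collection "b".
        {
            "$lookup": {
                "from": "b",
                "localField": "emb.emb_b",
                "foreignField": "_id",
                "as": "emb_b",
            }
        },
        # Drop the original fields that are no longer needed.
        {"$project": {"_id": 0, "emb": 0}},
        # Rebuild emb from the two looked-up arrays.
        {
            "$replaceRoot": {
                "newRoot": {"emb": {"emb_a": "$emb_a", "emb_b": "$emb_b"}}
            }
        },
    ]


if __name__ == "__main__":
    pprint(build_answer_pipeline())
```

With the db handle from the question's script, this would be run as list(db.test_collection.aggregate(build_answer_pipeline())).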
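Here you keep emb alongside the nested documents emb_a and emb_b. In the third pipeline stage I remove emb (with a projection) because it is no longer needed for the lookups, and finally I rebuild it from the previously computed emb_a and emb_b.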
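Note that the pipeline above yields emb as a single object holding two arrays, whereas the question asks for an array of {emb_a, emb_b} pairs. One possible way to get that shape, offered as an untested sketch rather than as the answerer's method, is to keep both lookups and then rebuild emb with $map, picking the matching looked-up document for each element via $filter and $arrayElemAt. The names build_pair_pipeline, _pick, a_docs and b_docs are assumptions made for this example.

```python
from pprint import pprint


def _pick(docs_field, ref_expr):
    """Build an expression selecting, from the array field docs_field,
    the first document whose _id equals ref_expr (hypothetical helper)."""
    return {
        "$arrayElemAt": [
            {
                "$filter": {
                    "input": f"${docs_field}",
                    "as": "d",
                    "cond": {"$eq": ["$$d._id", ref_expr]},
                }
            },
            0,
        ]
    }


def build_pair_pipeline():
    """Return a pipeline that replaces each ObjectId inside emb
    with the document it refers to, keeping emb an array of pairs."""
    return [
        # Fetch all referenced "a" documents into a temporary field.
        {
            "$lookup": {
                "from": "a",
                "localField": "emb.emb_a",
                "foreignField": "_id",
                "as": "a_docs",
            }
        },
        # Fetch all referenced "b" documents into a temporary field.
        {
            "$lookup": {
                "from": "b",
                "localField": "emb.emb_b",
                "foreignField": "_id",
                "as": "b_docs",
            }
        },
        # Walk the original emb array and swap each id for its document.
        {
            "$project": {
                "_id": 0,
                "emb": {
                    "$map": {
                        "input": "$emb",
                        "as": "e",
                        "in": {
                            "emb_a": _pick("a_docs", "$$e.emb_a"),
                            "emb_b": _pick("b_docs", "$$e.emb_b"),
                        },
                    }
                },
            }
        },
    ]


if __name__ == "__main__":
    pprint(build_pair_pipeline())
```

The $filter form is used instead of $indexOfArray because $indexOfArray returns -1 when an id has no match, and $arrayElemAt with -1 would then silently return the last element. With the question's script this would be run as list(db.test_collection.aggregate(build_pair_pipeline())).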