Improving a MongoDB aggregation


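I have two MongoDB collections: one holds product data and the other holds category data. Given a category, I want to get all related products by looking through its subcategories. Essentially, I am trying to walk the category tree to collect all related product pages. My category documents look like this: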

_id: "cat-id",
"url": "/cat-1",
childs: [
   {"position": 1, "childs": [
         {"category": "sub-category-1-id", "productPage": ""},
         {"category": "sub-category-2-id", "productPage": ""}
      ]
   },
   {"position": 2, "childs": [
         {"category": "", "productPage": "product-page-1-id"},
         {"category": "", "productPage": "product-page-2-id"}
      ]
   }
],
"links": [
   {"position": 0, "url": "/related-category-1-url"},
   {"position": 1, "url": "/related-category-2-url"}
],
"productPages":[
   {"position": 0, "productPage": "product-page-1-id"}, 
   {"position": 1, "productPage": "product-page-2-id"}
]
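From each category I take the productPages array; if it has values, those are the pages linked directly to that category. Next, I recursively fetch all related categories and subcategories, until a subcategory is a leaf or the productPage field of a subcategory's child is not empty. (The catch: a category can be a leaf and still show other subcategories under its parent. I know, it's odd and probably wrong, but I didn't design this structure.)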
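To make the traversal above concrete, here is a minimal sketch in plain JavaScript. It works on an in-memory Map rather than on MongoDB, and it is only an illustration of the intended walk, not the aggregation itself. The function name collectProductPageIds, the Map keyed by _id, and the sample data are all hypothetical; following the links array is left out to keep the sketch short.

```javascript
// Hypothetical helper: walks the category tree breadth-first and returns
// the deduplicated product page ids reachable from rootId.
// categoriesById is an assumed in-memory Map of _id -> category document.
function collectProductPageIds(rootId, categoriesById) {
  const visited = new Set(); // category ids already expanded (guards against cycles)
  const pageIds = new Set(); // deduplicated product page ids, in discovery order
  const queue = [rootId];

  while (queue.length > 0) {
    const id = queue.shift();
    if (visited.has(id)) continue;
    visited.add(id);

    const category = categoriesById.get(id);
    if (!category) continue; // dangling reference: nothing to expand

    // Pages linked directly to this category.
    for (const entry of category.productPages || []) {
      if (entry.productPage) pageIds.add(entry.productPage);
    }

    // Nested childs: an entry with a productPage ends the walk on that branch,
    // otherwise its category id is queued for expansion.
    for (const group of category.childs || []) {
      for (const child of group.childs || []) {
        if (child.productPage) pageIds.add(child.productPage);
        else if (child.category) queue.push(child.category);
      }
    }
  }
  return [...pageIds];
}

// Made-up sample data shaped like the category document above.
const exampleCategories = new Map([
  ["cat-id", {
    childs: [
      { position: 1, childs: [{ category: "sub-category-1-id", productPage: "" }] },
      { position: 2, childs: [{ category: "", productPage: "product-page-1-id" }] },
    ],
    productPages: [{ position: 0, productPage: "product-page-1-id" }],
  }],
  ["sub-category-1-id", {
    childs: [],
    productPages: [{ position: 0, productPage: "product-page-3-id" }],
  }],
]);

console.log(collectProductPageIds("cat-id", exampleCategories));
// prints [ 'product-page-1-id', 'product-page-3-id' ]
```

Each category is expanded at most once, so the work grows with the number of categories and entries visited, not with the number of paths through the tree.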

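I have built an aggregation in MongoDB between the Categories collection and the ProductPages collection. The aggregation itself works, but for big categories (say a category with 50 subcategories, each with 30 subcategories of its own, and so on) the query takes far too long, sometimes running for minutes before it finally crashes. This is the aggregation I am using now: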

db.getCollection('Categories').aggregate([
    {$match: { "url": "/cat-1"}},
    {$unwind: {path: "$links", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$links.values", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$childs", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$childs.childs", preserveNullAndEmptyArrays: true}},
    {$graphLookup: {
        from: "ProductPages",
        startWith: "$productPages.productPage",
        connectFromField: "productPages.productPage",
        connectToField: "_id",
        as: "rootPages"
    }},
    {$graphLookup: {
        from: "ProductPages",
        startWith: "$childs.childs.productPage",
        connectFromField: "childs.childs.productPage",
        connectToField: "_id",
        as: "childPages"
    }},
    {$graphLookup: {
        from: "Categories",
        startWith: "$links.values.url",
        connectFromField: "links.values.url",
        connectToField: "url",
        as: "linkCategories"
    }},
    {$graphLookup: {
        from: "Categories",
        startWith: "$childs.url",
        connectFromField: "childs.url",
        connectToField: "url",
        as: "childUrlCategories"
    }},
    {$graphLookup: {
        from: "Categories",
        startWith: "$childs.childs.category",
        connectFromField: "childs.childs.category",
        connectToField: "_id",
        as: "childCategories"
    }},
    {$unwind: {path: "$linkCategories", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$childUrlCategories", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$childCategories", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$childCategories.childs", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$childCategories.childs.childs", preserveNullAndEmptyArrays: true}},
    {$graphLookup: {
        from: "ProductPages",
        startWith: "$linkCategories.productPages.productPage",
        connectFromField: "linkCategories.productPages.productPage",
        connectToField: "_id",
        as: "linkPages"
    }},
    {$graphLookup: {
        from: "ProductPages",
        startWith: "$childUrlCategories.productPages.productPage",
        connectFromField: "childUrlCategories.productPages.productPage",
        connectToField: "_id",
        as: "childUrlPages"
    }},
    {$graphLookup: {
        from: "ProductPages",
        startWith: "$childCategories.childs.childs.productPage",
        connectFromField: "childCategories.childs.childs.productPage",
        connectToField: "_id",
        as: "childCategoryPages"
    }},
    {$unwind: {path: "$rootPages", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$linkPages", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$childUrlPages", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$childPages", preserveNullAndEmptyArrays: true}},
    {$unwind: {path: "$childCategoryPages", preserveNullAndEmptyArrays: true}},
    {$group: {_id: "", 
        rootPages: {$addToSet: "$rootPages"}, 
        linkPages: {$addToSet: "$linkPages"}, 
        childUrlPages: {$addToSet: "$childUrlPages"},
        childPages: {$addToSet: "$childPages"},
        childCategoryPages: {$addToSet: "$childCategoryPages"}
    }},
    {$project: {_id: 0, 
        rootPages: {_id: 1, product: 1}, 
        linkPages: {_id: 1, product: 1}, 
        childUrlPages: {_id: 1, product: 1}, 
        childPages: {_id: 1, product: 1},
        childCategoryPages: {_id: 1, product: 1}
    }},
    {$addFields: {
        "childCategoryPages": {
            $map: {
                "input": "$childCategoryPages",
                "as": "el",
                "in": "$$el.product"
            }
        }
    }},
    {$addFields: {
        "childPages": {
            $map: {
                "input": "$childPages",
                "as": "el",
                "in": "$$el.product"
            }
        }
    }},
    {$addFields: {
        "childUrlPages": {
            $map: {
                "input": "$childUrlPages",
                "as": "el",
                "in": "$$el.product"
            }
        }
    }},
    {$addFields: {
        "linkPages": {
            $map: {
                "input": "$linkPages",
                "as": "el",
                "in": "$$el.product"
            }
        }
    }},
    {$addFields: {
        "rootPages": {
            $map: {
                "input": "$rootPages",
                "as": "el",
                "in": "$$el.product"
            }
        }
    }},
    {$project: {products: {$concatArrays: ["$rootPages", "$linkPages", "$childUrlPages", "$childPages", "$childCategoryPages"]}}},
    {$unwind: {path: "$products", preserveNullAndEmptyArrays: true}},
    {$group: {
        _id: "",
        products: {$addToSet: "$products"}
    }},
    {$project: {_id: 0, products: 1}},
]);
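
One likely contributor to the slowdown is the chain of $unwind stages. This is an assumption based on the shape of the pipeline, not a measurement. Each $unwind emits one document per array element, so unwinding several independent arrays in a row multiplies the document count. The sketch below simulates that in plain JavaScript; the unwind helper and the array sizes are hypothetical, and the helper only handles top-level fields.

```javascript
// Simplified stand-in for {$unwind: {path: "$<field>", preserveNullAndEmptyArrays: true}}
// on a top-level field. Not the real MongoDB implementation.
function unwind(docs, field) {
  return docs.flatMap((doc) => {
    const value = doc[field];
    if (!Array.isArray(value) || value.length === 0) {
      return [doc]; // missing or empty array: the document is kept once
    }
    // One output document per array element.
    return value.map((item) => ({ ...doc, [field]: item }));
  });
}

// One input document with three independent arrays (made-up sizes 2, 3 and 4).
const start = [{
  links: ["a", "b"],
  childs: [1, 2, 3],
  childCategories: [10, 20, 30, 40],
}];

// Unwind the three fields one after another, as consecutive pipeline stages would.
const rows = ["links", "childs", "childCategories"].reduce(unwind, start);

console.log(rows.length); // 2 * 3 * 4 = 24
```

With 50 subcategories of 30 entries each, and further arrays unwound after every $graphLookup, the intermediate document count can become very large before the final $group collapses it again.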
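As I said, this works for small categories, but for big ones it is very slow (the productPages, links, url and childs fields are already indexed, in case you are wondering). So how can I improve this query so that it also works for large categories?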

EDIT: here is a sample ProductPage document (the aggregation takes the product field from it)

The result of the aggregation is the array of products taken from the retrieved pages, like this:

["product-1", "product-2", ...]

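Make sure every connectToField: field_name is indexed in its collection (except _id, which is already indexed by default). Also, post a ProductPages sample here so we can take a look, and post the desired result as well. I think $facet is the more elegant solution.

I have added the product page and the result to the question. I am also considering splitting the aggregation into multiple pipelines, but I am afraid it will still be slow when it runs the recursive part (if I exclude the childs-related part from the pipeline, everything is fast even for large categories).

Please share the full data for "url": "/cat-1" (real data with subcategories, etc.) in MongoPlayground so we can see how to improve performance, since links.values does not exist and it is not clear why $graphLookup is needed if the expected result is a plain array.

I will try to add it to MongoPlayground, but in the meantime I came up with this solution, which seems to work for large categories as well.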