How to use nested grouping in MongoDB

mongodb

I need to find the total number of duplicate profiles at each organization level. I have documents like the following:

{
    "OrganizationId" : 10,
    "Profile" : {
        "_id" : "75"
    },
    "_id" : "1"
},
{
    "OrganizationId" : 10,
    "Profile" : {
        "_id" : "75"
    },
    "_id" : "2"
},
{
    "OrganizationId" : 10,
    "Profile" : {
        "_id" : "77"
    },
    "_id" : "3"
},
{
    "OrganizationId" : 10,
    "Profile" : {
        "_id" : "77"
    },
    "_id" : "4"
}

I wrote a query that groups by ProfileId and OrganizationId. The result I get is shown below:

Organization    Total
10               2
10               2

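However, I want the sum at each organization level, which means organization 10 should have a single row with a total of 4.

The query I am using is shown below: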

 db.getSiblingDB("dbName").OrgProfile.aggregate(
 { $project: { _id: 1, P: "$Profile._id",  O: "$OrganizationId" } },
 { $group: {_id: { p: "$P", o: "$O"}, c: { $sum: 1 }} },
 { $match: { c: { $gt: 1 } } });

Any ideas? Please help me.

I think I have a solution. In the last step, I think you need another $group instead of the $match:

    .aggregate([

     { $project: { _id: 1, P: "$Profile._id",  O: "$OrganizationId" } }
     ,{ $group: {_id: { p: "$P", o: "$O"}, c: { $sum: 1 }} }
     ,{ $group: { _id: "$_id.o" , c: {  $sum: "$c" } }}

     ]);

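You can probably read it and work out what the last step does, but just in case, I will explain. The last step groups all documents that have the same organization id and then sums the counts held in the c field from the previous stage. After the first group, there are two documents, each with a count c of 2 but with different profile ids. The next group ignores the profile id: it groups the documents only by organization id and adds up their counts.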

When I run this query, I get the following result, which I think is what you are looking for:

{
    "_id" : 10,
    "c" : 4
}
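The two grouping steps can also be traced without a database. The following is only a minimal sketch in plain JavaScript, not MongoDB code: the helper names countPerOrgAndProfile and sumPerOrg are hypothetical and exist only to mirror the first and second $group stages on the four sample documents from the question.

```javascript
// Sample documents from the question (only the fields the pipeline reads).
const docs = [
  { _id: "1", OrganizationId: 10, Profile: { _id: "75" } },
  { _id: "2", OrganizationId: 10, Profile: { _id: "75" } },
  { _id: "3", OrganizationId: 10, Profile: { _id: "77" } },
  { _id: "4", OrganizationId: 10, Profile: { _id: "77" } },
];

// Hypothetical helper mirroring the first $group stage:
// count documents per (organization, profile) pair.
function countPerOrgAndProfile(documents) {
  const counts = new Map();
  for (const d of documents) {
    const key = JSON.stringify({ o: d.OrganizationId, p: d.Profile._id });
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, c]) => ({ _id: JSON.parse(key), c }));
}

// Hypothetical helper mirroring the second $group stage:
// ignore the profile id and sum the counts per organization.
function sumPerOrg(groups) {
  const totals = new Map();
  for (const g of groups) {
    totals.set(g._id.o, (totals.get(g._id.o) || 0) + g.c);
  }
  return [...totals].map(([o, c]) => ({ _id: o, c }));
}

const firstStage = countPerOrgAndProfile(docs);
const secondStage = sumPerOrg(firstStage);

console.log(firstStage);  // two groups, each with c = 2
console.log(secondStage); // one row for organization 10 with c = 4
```

Because nothing is filtered between the two steps, a profile that occurs only once would also be added to the organization's total, so this version counts all profiles rather than only the duplicated ones.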

Hope this helps. Let me know if you have any questions.

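The following pipeline should give you the desired output. The final $project stage is purely cosmetic: it turns _id into Organization. The basic calculation does not need it, so you can leave it out.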

db.getCollection('yourCollection').aggregate([
    { 
        $group: {  
            _id: { org: "$OrganizationId", profile: "$Profile._id" },
            count: { $sum: 1 }
        }
    },
    {
        $group: {
            _id: "$_id.org",
            Total: { 
                $sum: { 
                    $cond: { 
                        if: { $gte: ["$count", 2] }, 
                        then: "$count", 
                        else: 0
                    }
                }
            }
        } 
     },
     {
         $project: {
             _id: 0,
             Organization: "$_id",
             Total: 1
         }
     }
])

This gives the following output:

{
    "Total" : 4.0,
    "Organization" : 10
}

To filter out organizations that have no duplicates, you can use $match, which also simplifies the second $group stage:

...aggregate([
    { 
        $group: {  
            _id: { org: "$OrganizationId", profile: "$Profile._id" },
            count: { $sum: 1 }
        }
    },
    {
        $match: {
            count: { $gte: 2 } 
        }
    },
    {
        $group: {
            _id: "$_id.org",
            Total: { $sum: "$count" }
        } 
     },
     {
         $project: {
             _id: 0,
             Organization: "$_id",
             Total: 1
         }
     }
])
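To see what the $match stage changes, the same logic can be imitated in plain JavaScript. This is a sketch under assumptions, not MongoDB code: the function name duplicateTotals is hypothetical, and the documents with _id "5" and "6" are invented here only to exercise the filter (a profile that occurs once, and an organization without any duplicates).

```javascript
// The question's four documents plus two hypothetical ones ("5" and "6").
const sampleDocs = [
  { _id: "1", OrganizationId: 10, Profile: { _id: "75" } },
  { _id: "2", OrganizationId: 10, Profile: { _id: "75" } },
  { _id: "3", OrganizationId: 10, Profile: { _id: "77" } },
  { _id: "4", OrganizationId: 10, Profile: { _id: "77" } },
  { _id: "5", OrganizationId: 10, Profile: { _id: "80" } }, // occurs once
  { _id: "6", OrganizationId: 20, Profile: { _id: "75" } }, // org without duplicates
];

// Hypothetical helper imitating $group -> $match -> $group -> $project.
function duplicateTotals(documents) {
  // First $group: count documents per (organization, profile) pair.
  const counts = new Map();
  for (const d of documents) {
    const key = JSON.stringify([d.OrganizationId, d.Profile._id]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }

  // $match (count >= 2) followed by the second $group (sum per organization).
  const totals = new Map();
  for (const [key, count] of counts) {
    if (count < 2) continue; // drop pairs that are not duplicated
    const [org] = JSON.parse(key);
    totals.set(org, (totals.get(org) || 0) + count);
  }

  // $project: rename the grouping key to Organization.
  return [...totals].map(([org, total]) => ({ Organization: org, Total: total }));
}

console.log(duplicateTotals(sampleDocs));
```

With these inputs, organization 10 keeps a total of 4 (the single profile "80" is not counted), and organization 20 does not appear in the result at all because none of its groups passes the filter.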

Your query actually returns the correct result:

{ "_id" : { "p" : "75", "o" : 10 }, "c" : 4 }

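Thanks for the reply. This query returns multiple records for the same organization, and I have to calculate the total again by hand.

@Srinivas Please read through your question again. In your comment you say you want a sum of 2 for organization 10, but in the question you write "which means organization 10 should have a single row with a total of 4." The two statements do not match.

@DAXaholic Thanks for pointing that out. This is the output:

{ "_id" : { "p" : "77", "o" : 10 }, "o" : [ 10, 10 ], "c" : 2 }, { "_id" : { "p" : "75", "o" : 10 }, "o" : [ 10, 10 ], "c" : 2 }

but I want a single row for Org 10 with a total of 4.

Thanks for the reply. I tried to run this query, but it returns the total profile count for each organization, not the number of duplicate profiles.

Thanks @DAXaholic, I got the expected result. I have just one doubt: is it possible to filter out organizations that have 0 duplicates? I modified the $cond because it needs the array format.

I guess I am using a newer version than you, one that allows the if / then / else properties. Regarding the filtering, I updated my answer. Hope that helps.

Ok, got it. It gives the filtered result. Thanks again :)

Glad to hear it helped :)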
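The thread does not show the modified pipeline, so the following is only a guess at what the array-format change might look like. $cond also accepts a three-element array of the form [ condition, value if true, value if false ], and the sketch below rewrites the first pipeline from the answer that way. The variable name pipeline is an assumption made for illustration.

```javascript
// Same stages as the answer's first pipeline, with $cond written in
// array form: [ <condition>, <value if true>, <value if false> ].
const pipeline = [
  {
    $group: {
      _id: { org: "$OrganizationId", profile: "$Profile._id" },
      count: { $sum: 1 },
    },
  },
  {
    $group: {
      _id: "$_id.org",
      // Add the count only when the (org, profile) pair occurs at least twice.
      Total: { $sum: { $cond: [{ $gte: ["$count", 2] }, "$count", 0] } },
    },
  },
  { $project: { _id: 0, Organization: "$_id", Total: 1 } },
];

// In the mongo shell this could be passed to aggregate, for example:
// db.getCollection('yourCollection').aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```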