MongoDB, Spark, Scala: Mongo aggregation that queries multiple fields and groups by 2 fields


mongodb scala apache-spark intellij-idea


I am looking for a Mongo aggregation code example that queries multiple fields in a collection and groups by a couple of fields. My collection:

events:
{
_id
prodId:
location:
status:
user:
date:
}

The collection above is very simple. I would like to get a result like the following:

For status "Completed" (This is a $match condition)

    {Product: abc
         {Location: US
            {user, date}
            {user, date}
            {user, date}
             .......}
         {Location: APAC
            {user, date}
            {user, date}
            {user, date}
             .......}}
    {Product: XYZ
         {Location: US
            {user, date}
            {user, date}
            {user, date}
             .......}
         {Location: APAC
            {user, date}
            {user, date}
            {user, date}
             .......}}
  ........

How can we write this in the aggregation framework using nested $group and $match, or any other aggregation stages?

Any suggestions or help would be much appreciated. Thanks.

Use $group with multiple fields by putting them together under _id, like this:

db.collection.aggregate([{$group: {_id: {attr1: '$attr1', attr2: '$attr2'}}}])
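To make the compound key concrete for the events collection from the question, here is a minimal sketch. It assumes the field names listed in the question (prodId, location, status) and the status value "Completed"; the variable name groupByTwoFields and the count field are only illustrative.

```javascript
// Sketch: filter first, then group on a compound _id made of two fields.
// Field names and the "Completed" value are taken from the question and are assumptions.
const groupByTwoFields = [
  // Keep only completed events before grouping.
  { $match: { status: "Completed" } },
  // One output document per distinct (prodId, location) pair.
  {
    $group: {
      _id: { prodId: "$prodId", location: "$location" },
      count: { $sum: 1 } // number of events in each pair
    }
  }
];

// In the mongo shell this would be run as:
// db.events.aggregate(groupByTwoFields)
```

Placing $match first means the $group stage only sees the documents that pass the filter.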


After a lot of trial and error, I have solved this to some extent. It is not exactly what I wanted, but it is closer. This is what I get:

{
        "_id" : {
                "Product" : "ABC",
                "location" : "ERU"
        },
        "details" : [
                {   // Each of these is a unique combination
                        "user" : "XXXX",
                        "date" : ISODate("2015-08-01T09:08:15Z")
                },
                {
                        "user" : "xxxx",
                        "date" : ISODate("2015-08-01T09:03:08Z")
                },
                {
                        "user" : "xxxx",
                        "date" : ISODate("2015-07-20T19:33:57Z")
                },
                {
                        "user" : "xxxx",
                        "date" : ISODate("2015-07-20T19:28:50Z")
                }
        ],
        "count" : 4
}
{
        "_id" : {
                "Product" : "AAA",
                "location" : "US"
        },
        "details" : [
                {
                        "user" : "XXXX",
                        "date" : ISODate("2015-08-01T09:08:15Z")
                },
                {
                        "user" : "xxxx",
                        "date" : ISODate("2015-08-01T09:03:08Z")
                },
                {
                        "user" : "xxxx",
                        "date" : ISODate("2015-07-20T19:33:57Z")
                },
                {
                        "user" : "xxxx",
                        "date" : ISODate("2015-07-20T19:28:50Z")
                }
        ],
        "count" : 4
}

My aggregation code:

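db.events.aggregate([
    // Keep only the fields the later stage needs.
    {$project: {
        prodId: 1,
        location: 1,
        username: 1,
        status: 1,
        dateTime: 1
    }},
    // Group by the (product, location) pair.
    // $addToSet keeps each distinct {user, date} combination once.
    {$group: {
        _id: {Product: "$prodId", location: "$location"},
        details: {$addToSet: {user: "$username", date: "$dateTime"}},
        count: {$sum: 1}
    }}
], {allowDiskUse: true})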

Hope this helps. Thanks.
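To get closer to the nested Product > Location > {user, date} shape asked for in the question, one option is a second $group on top of the first. The following is only a sketch under assumptions: it uses the field names from the question's schema (prodId, location, status, user, date) and the status value "Completed", and the names nestedPipeline, details, and locations are illustrative.

```javascript
// Sketch: two $group stages to nest {user, date} under location, and location under product.
// Field names follow the question's schema; they are assumptions about the real collection.
const nestedPipeline = [
  // 1. Keep only completed events.
  { $match: { status: "Completed" } },
  // 2. Collect {user, date} pairs for each (prodId, location) pair.
  //    $push keeps duplicates; $addToSet could be used instead to keep distinct pairs only.
  {
    $group: {
      _id: { prodId: "$prodId", location: "$location" },
      details: { $push: { user: "$user", date: "$date" } }
    }
  },
  // 3. Regroup by product, nesting each location with its details.
  {
    $group: {
      _id: "$_id.prodId",
      locations: { $push: { location: "$_id.location", details: "$details" } }
    }
  }
];

// In the mongo shell this would be run as:
// db.events.aggregate(nestedPipeline, { allowDiskUse: true })
```

Each result document then has the product as _id and a locations array, where every entry carries a location and its list of {user, date} pairs.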


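Kamil: Thanks for getting back to me. I have edited my post with more realistic collection fields and with what I need to achieve. I am looking for the aggregation code. This is how far I have got so far. For now I am restricting the location to 'US' to reduce the data set: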

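db.events.aggregate([
    {$match: {$and: [{location: 'US'}, {status: 'end'}, {prodId: {$ne: null}}, {user: {$ne: null}}]}},
    {$group: {_id: '$prodId', users: {$push: '$user'}}},  // the output field name was lost in the original; "users" is assumed
    {$out: 'eventProdUsersAgg'}
], {allowDiskUse: true})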

But this only gives me a set of users for each product. How can I change it to also get the location and date, in the format I mentioned in my post?