MongoDB aggregation (Spark / Scala): query multiple fields and group by two fields
Tags: mongodb, scala, apache-spark, intellij-idea, aggregation-framework
I'm looking for a MongoDB aggregation example that queries multiple fields in a collection and groups by more than one field. My collection:
events:
{
_id
prodId:
location:
status:
user:
date:
}
The collection above is quite simple. I would like to get a result like the following:
For status "Completed" (this is a $match condition):
{Product: abc
  {Location: US
    {user, date}
    {user, date}
    {user, date}
    .......}
  {Location: APAC
    {user, date}
    {user, date}
    {user, date}
    .......}}
{Product: XYZ
  {Location: US
    {user, date}
    {user, date}
    {user, date}
    .......}
  {Location: APAC
    {user, date}
    {user, date}
    {user, date}
    .......}}
........
How can we write this in the aggregation framework using nested $group and $match stages, or any other aggregation stages?
Any suggestions or help would be greatly appreciated. Thanks.

Use $group with multiple fields inside _id, like this:
db.collection.aggregate([{$group: {_id: {attr1: '$attr1', attr2: '$attr2'}}}])
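To get closer to the nested Product / Location layout asked for in the question, one option is to group twice: first by the (product, location) pair, then by product alone. The following is only a sketch under assumptions. The field names prodId, location, user, date and the status value "Completed" come from the question, while the output names details and locations are made up here for illustration.

```javascript
// Field names (prodId, location, user, date) follow the schema in the question;
// the output names "details" and "locations" are hypothetical.
const pipeline = [
  // 1. Keep only completed events.
  { $match: { status: "Completed" } },
  // 2. One group per (product, location) pair, collecting its {user, date} entries.
  { $group: {
      _id: { product: "$prodId", location: "$location" },
      details: { $push: { user: "$user", date: "$date" } }
  } },
  // 3. Regroup by product alone, nesting each location's entries under it.
  { $group: {
      _id: "$_id.product",
      locations: { $push: { location: "$_id.location", details: "$details" } }
  } }
];

// In the mongo shell: db.events.aggregate(pipeline, { allowDiskUse: true })
```

Each resulting document should then carry one product in _id and an array of locations, each with its own list of {user, date} entries. Swapping $push for $addToSet in stage 2 would drop duplicate entries.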
After a lot of trial and error, I have solved this to some extent. It is not exactly what I wanted, but it is better. This is what I get:
{
"_id" : {
"Product" : "ABC",
"location" : "ERU"
},
"details" : [
{ // each entry is a unique (user, date) combination
"user" : "XXXX",
"date" : ISODate("2015-08-01T09:08:15Z")
},
{
"user" : "xxxx",
"date" : ISODate("2015-08-01T09:03:08Z")
},
{
"user" : "xxxx",
"date" : ISODate("2015-07-20T19:33:57Z")
},
{
"user" : "xxxx",
"date" : ISODate("2015-07-20T19:28:50Z")
}
],
"count" : 4
}
{
"_id" : {
"Product" : "AAA",
"location" : "US"
},
"details" : [
{
"user" : "XXXX",
"date" : ISODate("2015-08-01T09:08:15Z")
},
{
"user" : "xxxx",
"date" : ISODate("2015-08-01T09:03:08Z")
},
{
"user" : "xxxx",
"date" : ISODate("2015-07-20T19:33:57Z")
},
{
"user" : "xxxx",
"date" : ISODate("2015-07-20T19:28:50Z")
}
],
"count" : 4
}
My aggregation code:
db.events.aggregate([
{$project:
{
prodId:1, // field names are case-sensitive; must match "$prodId" used in $group
location:1,
username:1,
status:1,
dateTime:1
}
}
, {$group:
{
_id: {Product: "$prodId", location: "$location"},
details: {$addToSet: {user: "$username", date: "$dateTime"}},
count: {$sum: 1}
}}
],{allowDiskUse: true}
)
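If the flat (Product, location) groups above are acceptable as server output, the remaining nesting can also be done on the client. Here is a minimal sketch in plain JavaScript. The function name nestByProduct and the sample data are hypothetical, and dates are shown as plain strings rather than ISODate values.

```javascript
// Reshape flat (Product, location) groups into product -> location -> details.
function nestByProduct(groups) {
  const nested = {};
  for (const g of groups) {
    const product = g._id.Product;
    const location = g._id.location;
    if (!nested[product]) nested[product] = {};
    nested[product][location] = g.details;
  }
  return nested;
}

// Hypothetical sample shaped like the aggregation output above.
const sample = [
  { _id: { Product: "ABC", location: "ERU" },
    details: [{ user: "u1", date: "2015-08-01T09:08:15Z" }], count: 1 },
  { _id: { Product: "ABC", location: "US" },
    details: [{ user: "u2", date: "2015-07-20T19:33:57Z" }], count: 1 }
];

console.log(JSON.stringify(nestByProduct(sample), null, 2));
```

This keeps the pipeline simple at the cost of moving all grouped documents to the client, which may not suit very large result sets.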
Hope this helps. Thanks.
Comment: Kamil, thanks for getting back to me. I have edited my post with a more realistic collection schema and what I need to achieve. I am looking for the aggregation code. This is how far I have got; for now I am restricting location to US to reduce the dataset:
db.events.aggregate([
  {$match: {$and: [{location: 'US'}, {status: 'end'}, {prodId: {$ne: null}}, {user: {$ne: null}}]}},
  {$group: {_id: '$prodId', users: {$push: '$user'}}}, // output field name "users" is assumed; it was missing in the original
  {$out: 'eventProdUsersAgg'}
], {allowDiskUse: true})
But this only gives me a set of users per product. How do I change it to also get the location and date, in the format I described in my post?