How to use distinct with a pipeline in MongoDB using Python
I have data like this:
{ "_id": "1234gbrghr",
"Device" : "samsung",
"UserId" : "12654",
"Month" : "july"
},
{ "_id": "1278gbrghr",
"Device" : "nokia",
"UserId" : "87654",
"Month" : "july"
},
{ "_id": "1234gbrghr",
"Device" : "samsung",
"UserId" : "12654",
"Month" : "july"
}
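I need to get the number of distinct users of each device in July. For example, if a user (UserId) used a Samsung device two or more times in July, that user should be counted only once for Samsung.
For this, I use the following query, which gets the total number of usage records per device in July. But I need the number of distinct users instead.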
pipeline1 = [
    {'$match': {'Month': 'july'}},
    {'$group': {'_id': '$Device', 'count': {'$sum': 1}}}
]
data = db.command('aggregate', 'collection', pipeline=pipeline1)
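You need to group on device and user first, which yields one document per distinct (device, user) pair. You can do that with the following pipeline stage (the keys d and u are quoted so the same literal also works in Python):
{'$group': {'_id': {'d': '$Device', 'u': '$UserId'}}}
Second, you need to count those pairs per device, which gives the number of distinct users for each device (like the $group you already had, but slightly modified):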
{ '$group': { '_id' : '$_id.d', 'count': { '$sum' : 1 } } }
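Since the question is about Python, the two stages above can be put into one pipeline and run through PyMongo. The following is a minimal sketch, assuming PyMongo 3 or later, where `Collection.aggregate()` returns a cursor of result documents. The helper names `build_pipeline` and `distinct_users_per_device` are hypothetical and not part of the original answer.

```python
def build_pipeline(month):
    """Return an aggregation pipeline counting distinct users per device."""
    return [
        # Keep only the documents for the requested month.
        {'$match': {'Month': month}},
        # Stage 1: one output document per distinct (device, user) pair.
        {'$group': {'_id': {'d': '$Device', 'u': '$UserId'}}},
        # Stage 2: count those pairs per device.
        {'$group': {'_id': '$_id.d', 'count': {'$sum': 1}}},
    ]


def distinct_users_per_device(collection, month):
    """Run the pipeline; `collection` is assumed to be a pymongo Collection."""
    cursor = collection.aggregate(build_pipeline(month))
    return {doc['_id']: doc['count'] for doc in cursor}


# Usage (assumes a reachable MongoDB server; 'mydb' is a hypothetical name):
# from pymongo import MongoClient
# db = MongoClient()['mydb']
# print(distinct_users_per_device(db['collection'], 'july'))
```

Keeping the pipeline construction separate from the call makes it possible to inspect or reuse the stages without a database connection.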
With the following dataset:
{ "_id" : "1234gbrghr", "Device" : "samsung", "UserId" : "12654", "Month" : "july" }
{ "_id" : "1278gbrghr", "Device" : "nokia", "UserId" : "87654", "Month" : "july" }
{ "_id" : "1239gbrghr", "Device" : "samsung", "UserId" : "12654", "Month" : "july" }
{ "_id" : "1238gbrghr", "Device" : "samsung", "UserId" : "12653", "Month" : "july" }
and the following aggregation command (mongo shell):
db.so.aggregate( [
    { '$match' : { 'Month' : 'july' } },
    { '$group' : {
        '_id' : { d: '$Device', u: '$UserId' },
        'count' : { '$sum' : 1 }
    } },
    { '$group': {
        '_id' : '$_id.d',
        'count': { '$sum' : 1 }
    } }
] );
This produces:
{
    "result" : [
        {
            "_id" : "nokia",
            "count" : 1
        },
        {
            "_id" : "samsung",
            "count" : 2
        }
    ],
    "ok" : 1
}
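The counts can be cross-checked without a database. The sketch below reproduces the two grouping stages in plain Python over the four sample documents from the answer. It only illustrates the logic and is not how MongoDB executes the pipeline; the variable names are chosen for this example.

```python
from collections import Counter

docs = [
    {"_id": "1234gbrghr", "Device": "samsung", "UserId": "12654", "Month": "july"},
    {"_id": "1278gbrghr", "Device": "nokia", "UserId": "87654", "Month": "july"},
    {"_id": "1239gbrghr", "Device": "samsung", "UserId": "12654", "Month": "july"},
    {"_id": "1238gbrghr", "Device": "samsung", "UserId": "12653", "Month": "july"},
]

# Equivalent of $match plus the first $group:
# the set keeps each (device, user) pair only once.
pairs = {(d["Device"], d["UserId"]) for d in docs if d["Month"] == "july"}

# Equivalent of the second $group: count the distinct pairs per device.
counts = Counter(device for device, _user in pairs)

print(dict(counts))
```

User 12654 appears twice for samsung but contributes a single pair, so samsung ends up with 2 distinct users and nokia with 1, matching the aggregation result.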