Python MongoDB: how to carry a computed field through an aggregation pipeline?
I'm trying to gather some insights from OSM data in JSON format. Here is an example of the documents I'm working with in MongoDB/PyMongo:
{"amenity": "post_office",
"name": "Dominion Road Postshop",
"created": {"uid": "10829",
"changeset": "607706",
"version": "5",
"user": "myfanwy",
"timestamp": "2007-11-24T12:41:04Z"},
"pos": [-36.8801299, 174.7495053],
"created_by": "Potlatch 0.5d",
"type": "node",
"id": "61076379"}
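So every document has a user and an amenity. I want to find the count of each amenity recorded by each user, divided by the total number of amenities recorded by that user.
Here are the snippets I use to find each count:
Query 1. Find the count of each amenity recorded by each user: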
amenity_per_user = coll.aggregate([{"$match":{"amenity":{"$exists":True}}},
{"$group":{"_id":{"user":"$created.user", "amenities":"$amenity"}, "count":{"$sum":1}}},
{"$sort":{"count":-1}}])
Query 2. Find the total number of amenities recorded by each user:
results = coll.aggregate([{"$match":{"amenity":{"$exists":True}}},
{"$group":{"_id":"$created.user", "count":{"$sum":1}}},
{"$sort":{"count":-1}}])
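The results of both queries (limited to 5 each) were:
What I want to do now is divide each user's top amenity count (e.g. Rudy355 entered 1886 records for the "parking" amenity) by their total record count (query 2). So one of the final results would be that Rudy355 recorded about 0.3 of their records under "parking": 1886/6321 ≈ 0.3.
Here is what I tried: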
coll.aggregate([{"$match":{"amenity":{"$exists":True}}},
{"$group":{"_id":"$created.user", "user_count":{"$sum":1}}},
{"$group":{"_id":{"user":"$created.user", "amenities":"$amenity"}, "amenity_count":{"$sum":1},
"ucount":{"$push":"$user_count"}}},
{"$unwind":"$ucount"},
{"$project":{"$divide":{"$ucount", "$amenity_count"}}},
{"$sort":{"count":-1}}])
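Any help would be great.
By the way, I really don't like using $push to hold the value of "user_count". Does anyone know a better way to carry a computed field like this?
You can try the aggregation below. It uses $push to save each amenity together with its count, so that the per-user total can be used afterwards to compute record, the share of the user's records that fall under that amenity.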
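[
# keep only documents that have an amenity
{"$match":{"amenity":{"$exists":True}}},
# count records per (user, amenity) pair
{"$group":{"_id":{"user":"$created.user", "amenity":"$amenity"}, "count":{"$sum":1}}},
# regroup per user: sum the total and push each amenity with its count
{"$group":{"_id":"$_id.user", "total":{"$sum":"$count"}, "amenities":{"$push":{"amenity":"$_id.amenity","count":"$count"}}}},
# back to one document per (user, amenity), now carrying the user's total
{"$unwind":"$amenities"},
# $divide takes a two-element array: [numerator, denominator]
{"$project":{"_id":0,"user":"$_id", "amenity":"$amenities.amenity", "record":{"$divide":["$amenities.count", "$total"]}}},
{"$sort":{"record":-1}}
]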
You should get output like the following (record shown rounded):
{"user":"Rudy355", "amenity":"parking", "record":0.3}
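If you want to sanity-check what the pipeline computes without a running MongoDB instance, the same steps can be mirrored in plain Python. This is only a sketch: the sample documents and the function name amenity_share are made up for illustration (they are not real OSM data or part of PyMongo), and it is meant for checking the logic on a handful of documents, not as a replacement for the aggregation.

```python
from collections import Counter

# Made-up sample documents (not real OSM data), shaped like the example above.
docs = [
    {"amenity": "parking", "created": {"user": "alice"}},
    {"amenity": "parking", "created": {"user": "alice"}},
    {"amenity": "parking", "created": {"user": "alice"}},
    {"amenity": "cafe", "created": {"user": "alice"}},
    {"amenity": "cafe", "created": {"user": "bob"}},
    {"type": "node", "created": {"user": "bob"}},  # no amenity: filtered out
]


def amenity_share(documents):
    """Hypothetical helper: rows of {user, amenity, record}, highest record first."""
    # $match: keep only documents that have an amenity field
    matched = [d for d in documents if "amenity" in d]
    # first $group: count records per (user, amenity) pair
    pair_counts = Counter((d["created"]["user"], d["amenity"]) for d in matched)
    # second $group: per-user total, summed from the pair counts
    totals = Counter()
    for (user, _amenity), count in pair_counts.items():
        totals[user] += count
    # $unwind + $project: one row per pair with count / total
    rows = [
        {"user": user, "amenity": amenity, "record": count / totals[user]}
        for (user, amenity), count in pair_counts.items()
    ]
    # $sort: highest share first
    return sorted(rows, key=lambda r: r["record"], reverse=True)


rows = amenity_share(docs)
for row in rows:
    print(row)
```

With this sample, alice has 4 amenity records of which 3 are parking, so her parking row gets record 0.75, which is the same ratio the $divide stage produces per (user, amenity) pair.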