Python MongoDB: how do I carry a computed field through an aggregation pipeline?

I'm trying to gather some insights from OSM data in JSON format. Here is a sample of the documents I'm working with in MongoDB/PyMongo:

{"amenity": "post_office",
 "name": "Dominion Road Postshop",
 "created": {"uid": "10829",
             "changeset": "607706",
             "version": "5",
             "user": "myfanwy",
             "timestamp": "2007-11-24T12:41:04Z"},
 "pos": [-36.8801299, 174.7495053],
 "created_by": "Potlatch 0.5d",
 "type": "node",
 "id": "61076379"}

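So every document has a user and an amenity. For each user, I want to find the count of each amenity they recorded, divided by the total number of amenities that user recorded.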

Here are the snippets I use to find each count:

Query 1. Find the count of each amenity recorded by each user:

amenity_per_user = coll.aggregate([{"$match":{"amenity":{"$exists":True}}},
                               {"$group":{"_id":{"user":"$created.user", "amenities":"$amenity"}, "count":{"$sum":1}}},
                               {"$sort":{"count":-1}}])

Query 2. Find the total number of amenities recorded by each user:

results = coll.aggregate([{"$match":{"amenity":{"$exists":True}}},
                      {"$group":{"_id":"$created.user", "count":{"$sum":1}}},
                      {"$sort":{"count":-1}}])

The results of both queries (limited to 5 each):

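What I want to do now is divide each user's top amenity count (for example, Rudy355 entered 1886 records for the parking amenity) by their total number of records (query 2). So one of the final results would be that Rudy355 recorded about 0.3 of their records under the "parking" amenity: 1886/6321 ≈ 0.3.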

This is what I have tried:

coll.aggregate([{"$match":{"amenity":{"$exists":True}}},
                    {"$group":{"_id":"$created.user", "user_count":{"$sum":1}}},
                    {"$group":{"_id":{"user":"$created.user", "amenities":"$amenity"}, "amenity_count":{"$sum":1}, 
                               "ucount":{"$push":"$user_count"}}},
                    {"$unwind":"$ucount"},
                    {"$project":{"$divide":{"$ucount", "$amenity_count"}}},
                    {"$sort":{"count":-1}}])
Any help would be great.


By the way, I really don't like using $push to hold the value of "user_count". Does anyone know a better way to carry a computed field along like this?

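You can try the aggregation below. It uses $push to save each amenity and its count so they can be used later, together with the user's total amenity count, to calculate record.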

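[
    {"$match": {"amenity": {"$exists": True}}},
    # Count records per (user, amenity) pair.
    {"$group": {"_id": {"user": "$created.user", "amenity": "$amenity"}, "count": {"$sum": 1}}},
    # Regroup per user: sum the counts into total and keep each amenity with its count.
    {"$group": {"_id": "$_id.user", "total": {"$sum": "$count"}, "amenities": {"$push": {"amenity": "$_id.amenity", "count": "$count"}}}},
    {"$unwind": "$amenities"},
    # $divide takes a two-element array: [dividend, divisor].
    {"$project": {"_id": 0, "user": "$_id", "amenity": "$amenities.amenity", "record": {"$divide": ["$amenities.count", "$total"]}}},
    {"$sort": {"record": -1}}
]
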
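To run a pipeline like the one above from PyMongo, something along these lines should work. This is only a minimal sketch: `top_results` is a hypothetical helper name, `coll` is assumed to be a PyMongo collection (any object with a compatible `aggregate` method will do), and the appended `$limit` stage is there only to mirror the five-results-per-query limit mentioned in the question.

```python
def top_results(coll, pipeline, limit=5):
    """Run `pipeline` on `coll` and return at most `limit` result documents.

    `top_results` is a hypothetical helper, not part of PyMongo.
    """
    # Build a new list so the caller's pipeline is not modified,
    # then cap the number of documents returned.
    stages = list(pipeline) + [{"$limit": limit}]
    # aggregate() returns a cursor; list() drains it into memory.
    return list(coll.aggregate(stages))
```

Because `$limit` is appended after the `$sort` stage, the documents returned are the ones with the highest `record` values.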
You should get output like the following:

{"user":"Rudy355", "amenity":"parking", "record":0.3}
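
If you want to sanity-check the numbers without a MongoDB server, the same computation can be sketched in plain Python. This is an illustrative cross-check under assumptions, not part of the pipeline: `amenity_shares` is a hypothetical name, and `docs` is assumed to be an iterable of dicts shaped like the sample document in the question.

```python
from collections import Counter


def amenity_shares(docs):
    """Return one row per (user, amenity) with its share of the user's total."""
    pair_counts = Counter()  # (user, amenity) -> count, like the first $group
    user_totals = Counter()  # user -> total, like "total" in the second $group
    for doc in docs:
        if "amenity" not in doc:  # mirrors the $match on $exists
            continue
        user = doc.get("created", {}).get("user")
        pair_counts[(user, doc["amenity"])] += 1
        user_totals[user] += 1
    rows = [
        {"user": user, "amenity": amenity, "record": count / user_totals[user]}
        for (user, amenity), count in pair_counts.items()
    ]
    # Largest share first, like {"$sort": {"record": -1}}.
    return sorted(rows, key=lambda row: row["record"], reverse=True)
```

For a user with three parking records and one cafe record, this yields shares of 0.75 and 0.25, which is the kind of ratio the `record` field is meant to hold.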