MongoDB: calculate the average of a field in embedded documents/array


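I want to calculate the rating_average field of this object from the rating fields inside the ratings array. Can you help me understand how to use $avg in an aggregation?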

{
    "title": "The Hobbit",
    "rating_average": "???",
    "ratings": [
        {
            "title": "best book ever",
            "rating": 5
        },
        {
            "title": "good book",
            "rating": 3.5
        }
    ]
}
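The aggregation framework in MongoDB 3.4 and newer provides the $reduce operator, which calculates the total efficiently without the need for extra pipeline stages. Consider using it as an expression to return the total of the ratings, and use $size to get the number of ratings. Together with $addFields, the average can then be calculated with the arithmetic operator $divide, following the formula
average = total ratings / number of ratings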

db.collection.aggregate([
    { 
        "$addFields": { 
            "rating_average": {
                "$divide": [
                    { // expression returns total
                        "$reduce": {
                            "input": "$ratings",
                            "initialValue": 0,
                            "in": { "$add": ["$$value", "$$this.rating"] }
                        }
                    },
                    { // expression returns ratings count
                        "$cond": [
                            { "$ne": [ { "$size": "$ratings" }, 0 ] },
                            { "$size": "$ratings" }, 
                            1
                        ]
                    }
                ]
            }
        }
    }           
])
Sample output

{
    "_id" : ObjectId("58ab48556da32ab5198623f4"),
    "title" : "The Hobbit",
    "ratings" : [ 
        {
            "title" : "best book ever",
            "rating" : 5.0
        }, 
        {
            "title" : "good book",
            "rating" : 3.5
        }
    ],
    "rating_average" : 4.25
}
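
The arithmetic behind that pipeline can be checked outside the database. The following is a minimal plain-JavaScript sketch, not MongoDB code: averageRating is a hypothetical helper that mirrors the $reduce total, the $size count, and the $cond guard against an empty array.

```javascript
// Hypothetical helper mirroring the $reduce / $size / $cond expressions in plain JavaScript.
function averageRating(doc) {
  const ratings = doc.ratings;
  // $reduce: fold the array into the total of all rating values
  const total = ratings.reduce((sum, r) => sum + r.rating, 0);
  // $cond: fall back to a divisor of 1 when the array is empty, avoiding division by zero
  const count = ratings.length !== 0 ? ratings.length : 1;
  // $divide: total ratings / number of ratings
  return total / count;
}

const hobbit = {
  title: "The Hobbit",
  ratings: [
    { title: "best book ever", rating: 5 },
    { title: "good book", rating: 3.5 }
  ]
};

console.log(averageRating(hobbit)); // 4.25
console.log(averageRating({ title: "Unrated", ratings: [] })); // 0
```

Note that with this guard an empty ratings array yields 0 rather than an error, which may or may not be the value you want to store for unrated documents.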

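With older versions, you first need to apply the $unwind operator on the ratings array field as the initial aggregation pipeline step. This deconstructs the ratings array field from the input documents and outputs one document per element. Each output document replaces the array with a single element value.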
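
To make the effect of that first step concrete, here is a rough in-memory sketch in plain JavaScript. unwindRatings is a hypothetical stand-in for the stage, and the _id value is made up for illustration.

```javascript
// Hypothetical in-memory stand-in for { "$unwind": "$ratings" }.
// Each input document yields one output document per array element,
// with the ratings field replaced by that single element.
function unwindRatings(docs) {
  return docs.flatMap(doc =>
    doc.ratings.map(rating => ({ ...doc, ratings: rating }))
  );
}

const books = [
  {
    _id: 1, // assumed identifier, for illustration only
    title: "The Hobbit",
    ratings: [
      { title: "best book ever", rating: 5 },
      { title: "good book", rating: 3.5 }
    ]
  }
];

const unwound = unwindRatings(books);
console.log(unwound.length);     // 2
console.log(unwound[0].ratings); // { title: 'best book ever', rating: 5 }
```

As with the default behaviour of $unwind, a document whose ratings array is empty produces no output documents in this sketch.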

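The second pipeline stage is the $group operator, which groups the input documents by the _id and title key identifier expression and applies the $avg accumulator expression to each group to calculate the average. A second accumulator operator, $push, preserves the original ratings array field by returning an array of all the values that result from applying an expression to each document in the group.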

The final pipeline step is the $project operator, which then reshapes each document in the stream, for example by adding the new field ratings_average.
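
The grouping and reshaping steps can be pictured with a small plain-JavaScript sketch. It assumes the documents have already been unwound; groupByIdAndTitle is a hypothetical helper and the _id value is made up.

```javascript
// Documents as they would look after the $unwind stage (assumed sample data).
const unwoundDocs = [
  { _id: 1, title: "The Hobbit", ratings: { title: "best book ever", rating: 5 } },
  { _id: 1, title: "The Hobbit", ratings: { title: "good book", rating: 3.5 } }
];

// Hypothetical stand-in for the $group and $project stages.
function groupByIdAndTitle(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // Group key: the combination of _id and title
    const key = JSON.stringify([doc._id, doc.title]);
    if (!groups.has(key)) {
      groups.set(key, { _id: { _id: doc._id, title: doc.title }, ratings: [], sum: 0 });
    }
    const group = groups.get(key);
    group.ratings.push(doc.ratings);  // $push: rebuild the original array
    group.sum += doc.ratings.rating;  // running total used for $avg
  }
  // $project: drop the compound _id, lift the title, keep ratings and the average
  return [...groups.values()].map(g => ({
    title: g._id.title,
    ratings: g.ratings,
    ratings_average: g.sum / g.ratings.length
  }));
}

const grouped = groupByIdAndTitle(unwoundDocs);
console.log(grouped[0].ratings_average); // 4.25
```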

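So, for example, if your collection contains the sample document shown above, you can calculate the average of the ratings array and project the value into another field, ratings_average, by applying the following aggregation pipeline: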

db.collection.aggregate([
    {
        "$unwind": "$ratings"
    },
    {
        "$group": {
            "_id": {
                "_id": "$_id",
                "title": "$title"
            },
            "ratings":{
                "$push": "$ratings"
            },
            "ratings_average": {
                "$avg": "$ratings.rating"
            }
        }
    },
    {
        "$project": {
            "_id": 0,
            "title": "$_id.title",
            "ratings_average": 1,
            "ratings": 1
        }
    }
])
Result

/* 1 */
{
    "result" : [ 
        {
            "ratings" : [ 
                {
                    "title" : "best book ever",
                    "rating" : 5
                }, 
                {
                    "title" : "good book",
                    "rating" : 3.5
                }
            ],
            "ratings_average" : 4.25,
            "title" : "The Hobbit"
        }
    ],
    "ok" : 1
}

Since you want to calculate the average of data stored in an array, you first need to unwind it. Do this with $unwind in the aggregation pipeline:

{$unwind: "$ratings"}
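You can then access each element of the array as an embedded document under the key ratings in the aggregated result documents. After that, you only need to $group by title and compute the $avg: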

{$group: {_id: "$title", ratings: {$push: "$ratings"}, average: {$avg: "$ratings.rating"}}}
Then simply restore your title field:

{$project: {_id: 0, title: "$_id", ratings: 1, average: 1}}
Here is the resulting aggregation pipeline:

db.yourCollection.aggregate([
                               {$unwind: "$ratings"}, 
                               {$group: {_id: "$title", 
                                         ratings: {$push: "$ratings"}, 
                                         average: {$avg: "$ratings.rating"}
                                        }
                               },
                               {$project: {_id: 0, title: "$_id", ratings: 1, average: 1}}
                            ])

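This really could be written much shorter, and that was true even at the time of writing. If you want an "average", simply use $avg directly on the array path $ratings.rating, for example inside an $addFields stage.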
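
A minimal sketch of that shorter form follows. It assumes the same ratings array as in the question, uses $addFields (available from MongoDB 3.4) to attach the result, and the collection name in the comment is a placeholder.

```javascript
// Pipeline definition only: a single stage that applies $avg as an expression
// to the rating values found inside the ratings array.
const pipeline = [
  { $addFields: { rating_average: { $avg: "$ratings.rating" } } }
];

// In the mongo shell this would be run as (collection name is an assumption):
//   db.collection.aggregate(pipeline)
console.log(JSON.stringify(pipeline));
```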

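The reason is that, as of MongoDB 3.2, the $avg operator gained "two things":

  • The ability to process an "array" of arguments in "expression" form, rather than only acting as an accumulator in $group

  • The benefit of the MongoDB 3.2 feature that allows a "shorthand" notation for array expressions, either in composition: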

    { "array": [ "$fielda", "$fieldb" ] }
    
    or in notating a single property from the array as an array of the values of that property:

    { "$avg": "$ratings.rating" } // equal to { "$avg": [ 5, 3.5 ] }
    
  • In earlier releases you would have had to use $map in order to access the "rating" property inside each array element. Now you don't.
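
The shorthand can be pictured in plain JavaScript. This is only an analogy for how the path "$ratings.rating" resolves, not MongoDB code, and avg is a hypothetical helper that assumes every value is numeric.

```javascript
const ratings = [
  { title: "best book ever", rating: 5 },
  { title: "good book", rating: 3.5 }
];

// "$ratings.rating" resolves to the array of each element's rating value,
// which is what an explicit $map over the array would have produced.
const values = ratings.map(r => r.rating); // [5, 3.5]

// Hypothetical helper standing in for $avg as an expression over an array.
// It returns null when there is nothing to average, as $avg does.
function avg(numbers) {
  if (numbers.length === 0) return null;
  return numbers.reduce((a, b) => a + b, 0) / numbers.length;
}

console.log(avg(values)); // 4.25
```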


    For the record, even the $reduce usage can be simplified:

    db.collection.aggregate([
      { "$addFields": {
        "rating_average": {
          "$reduce": {
            "input": "$ratings",
            "initialValue": 0,
            "in": {
              "$add": [ 
                "$$value",
                { "$divide": [ 
                  "$$this.rating", 
                  { "$size": { "$ifNull": [ "$ratings", [] ] } }
                ]}
              ]
            }
          }
        }
      }}
    ])
    
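    Yes, as stated, this is really just re-implementing the existing $avg functionality, so now that the operator is available, it is the one that should be used.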
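
As a rough illustration of why that $reduce form gives the same number, here is a plain-JavaScript sketch in which each element contributes its share rating / size to a running sum. averageViaRunningShare is a hypothetical name and the function expects an array.

```javascript
// Hypothetical helper mirroring the simplified $reduce:
// start from 0 and add rating / size for every element.
function averageViaRunningShare(ratings) {
  return ratings.reduce(
    (value, item) => value + item.rating / ratings.length,
    0
  );
}

const sample = [
  { title: "best book ever", rating: 5 },
  { title: "good book", rating: 3.5 }
];

console.log(averageViaRunningShare(sample)); // 4.25
console.log(averageViaRunningShare([]));     // 0
```

One visible difference from $avg is the empty-array case: this approach returns the initial value 0, whereas $avg returns null when there are no numeric values.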