MongoDB aggregate avg()

Hey guys, I'm really new to aggregation, so please help me through this.

Suppose I have multiple documents (collected over time) like the following:

{
  "_id": ObjectId("574d6175da461e77030041b7"),
  "hostname": "VPS",
  "timestamp": NumberLong(1460040691),
  "cpuCores": NumberLong(2),
  "cpuList": [
    {
      "name": "cpu1",
      "load": 3.4
    },
    {
      "name": "cpu2",
      "load": 0.7
    }
  ]
},
{
  "_id": ObjectId("574d6175da461e77030041b7"),
  "hostname": "VPS",
  "timestamp": NumberLong(1460040700),
  "cpuCores": NumberLong(2),
  "cpuList": [
    {
      "name": "cpu1",
      "load": 0.4
    },
    {
      "name": "cpu2",
      "load": 6.7
    }
  ]
},
{
  "_id": ObjectId("574d6175da461e77030041b7"),
  "hostname": "VPS",
  "timestamp": NumberLong(1460041000),
  "cpuCores": NumberLong(2),
  "cpuList": [
    {
      "name": "cpu1",
      "load": 25.4
    },
    {
      "name": "cpu2",
      "load": 1.7
    }
  ]
}
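
I want to get the average CPU load per time window of X seconds, where X equals 300.

So for the example above, we would get the following result set: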

{
    "avgCPULoad": "2.8",
    "timestamp": NumberLong(1460040700)
},
{
    "avgCPULoad": "13.55",
    "timestamp": NumberLong(1460041000)
}
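
avgCPULoad is calculated as follows:

  • Grab all documents that fall within the same 300-second window
  • Calculate the average:
      • ((3.4+0.7)/2 + (0.4+6.7)/2)/2 = 2.8
      • (25.4+1.7)/2 = 13.55
  • Add the last timestamp of the selected documents

I know how to get the documents per X-second window. It is done like this: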

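    db.Pizza.aggregate(
    [
        {
            $group:
            {
                // Bucket key: the timestamp rounded down to a multiple of 300 seconds
                _id:
                {
                    $subtract: [
                        '$timestamp',
                        {
                            $mod: ['$timestamp', 300]
                        }
                    ]
                },
                // Last timestamp seen in each bucket
                'timestamp': {$last:'$timestamp'}
            }
        },
        {
            $project: {_id: 0, timestamp:'$timestamp'}
        }
    ])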
    
But how can I get the average as shown above? I tried $unwind, but it did not give me the result I wanted.

The solution is to use $unwind on the array (cpuList). Here is an example query:

    db.CpuInfo.aggregate([
        {
            $unwind: '$cpuList'
        },
        {
            $group: {
                _id:{
                    $subtract:[
                        '$timestamp', 
                        {$mod: ['$timestamp', 300]}
                    ]
                }, 
                'timestamp':{$last:'$timestamp'},
                'cpuList':{$avg:'$cpuList.load'}
            }
        }
    ])
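
One caveat worth illustrating: after $unwind, $avg averages every individual load in a bucket, whereas the question's formula averages the per-document averages. The two agree whenever every document in a bucket has the same number of cores, as in the sample data. The following is a sketch of a variant that follows the question's formula literally. It assumes MongoDB 3.2 or later (where $avg can be applied to an array inside $project) and numeric load values; the field name docAvg is made up for illustration.

```javascript
// Sketch only: assumes MongoDB 3.2+ and numeric `load` values.
// `docAvg` is a hypothetical intermediate field name.
const pipeline = [
  // Average the loads inside each document first (no $unwind needed).
  { $project: { timestamp: 1, docAvg: { $avg: "$cpuList.load" } } },
  // Then average those per-document values within each 300-second bucket.
  {
    $group: {
      _id: { $subtract: ["$timestamp", { $mod: ["$timestamp", 300] }] },
      avgCPULoad: { $avg: "$docAvg" },
      timestamp: { $max: "$timestamp" },
    },
  },
  // Drop the bucket key and keep only the requested fields.
  { $project: { _id: 0, avgCPULoad: 1, timestamp: 1 } },
];

// In the mongo shell this would be run as: db.CpuInfo.aggregate(pipeline)
```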
    

You need to run the following aggregation operation to get the desired result:

    db.collection.aggregate([ 
        { "$unwind": "$cpuList" },
        {
            "$group": {
                "_id": {                
                    "interval": {
                        "$subtract": [ 
                            "$timestamp",                        
                            { "$mod": [ "$timestamp", 60 * 5 ] }
                        ]
                    }
                },             
                "avgCPULoad": { "$avg": "$cpuList.load" },
                "timestamp": { "$max": "$timestamp" }
            } 
        },
        {
            "$project": { "_id": 0, "avgCPULoad": 1, "timestamp": 1 }
        }
    ])
    
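The above groups the flattened documents by 5-minute intervals (expressed in seconds). The interval key is obtained by taking the timestamp in seconds and subtracting the remainder left after dividing it by the 5-minute interval (300 seconds), which rounds each timestamp down to the start of its interval.

Sample output: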

    /* 1 */
    {
        "avgCPULoad" : 13.55,
        "timestamp" : NumberLong(1460041000)
    }
    
    /* 2 */
    {
        "avgCPULoad" : 2.8,
        "timestamp" : NumberLong(1460040700)
    }
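
To see why the sample documents land in two intervals, the same bucketing and averaging can be reproduced in plain JavaScript without a database. This is only an illustrative sketch: averageLoadPerInterval is a made-up helper that mimics the $unwind, $group and $project stages, not part of any MongoDB API.

```javascript
// Sample documents from the question (only the fields the pipeline uses).
const docs = [
  { timestamp: 1460040691, cpuList: [{ name: "cpu1", load: 3.4 }, { name: "cpu2", load: 0.7 }] },
  { timestamp: 1460040700, cpuList: [{ name: "cpu1", load: 0.4 }, { name: "cpu2", load: 6.7 }] },
  { timestamp: 1460041000, cpuList: [{ name: "cpu1", load: 25.4 }, { name: "cpu2", load: 1.7 }] },
];

const INTERVAL = 300; // seconds

// Hypothetical helper: mimics $unwind + $group + $project in plain JavaScript.
function averageLoadPerInterval(documents, interval) {
  const buckets = new Map();
  for (const doc of documents) {
    // Same arithmetic as { $subtract: ["$timestamp", { $mod: ["$timestamp", interval] }] }
    const key = doc.timestamp - (doc.timestamp % interval);
    const bucket = buckets.get(key) || { sum: 0, count: 0, timestamp: -Infinity };
    for (const cpu of doc.cpuList) {
      // Equivalent of $unwind: every load counts as one value
      bucket.sum += cpu.load;
      bucket.count += 1;
    }
    // Equivalent of { $max: "$timestamp" }
    bucket.timestamp = Math.max(bucket.timestamp, doc.timestamp);
    buckets.set(key, bucket);
  }
  // Equivalent of the final $project: drop the key, keep average and timestamp
  return [...buckets.values()].map((b) => ({
    avgCPULoad: b.sum / b.count,
    timestamp: b.timestamp,
  }));
}

const result = averageLoadPerInterval(docs, INTERVAL);
console.log(result);
```

The first two timestamps both round down to 1460040600, while the third rounds down to 1460040900, so the result contains two entries. Floating-point addition may leave tiny deviations from 2.8 and 13.55 in the printed values.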
    

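Consider changing your schema so that your cpuList field holds embedded key-value documents (e.g. "cpuList": [ { name: "cpu1", load: 3.4 }, { name: "cpu2", load: 0.7 } ]). You also need to convert the load values to numeric values so that aggregation accumulator operators like $avg can work.

The reason I made the load property a string is that whenever I set it to a numeric value, floating-point precision makes a mess of it. For example, in PHP the float is rounded to 2 decimal places, but once I put it into Mongo it ends up with not 2 decimals but 10 or more, or with a very small offset.

@chridam Okay, I have changed my schema as you suggested. How do I write the query to calculate the average?

The timestamp is in seconds, not milliseconds.

@Baklap4 You're right, my bad. I have updated the answer to reflect seconds instead of milliseconds.

Is it possible to round within Mongo?

Yes, but it would be a bit of a hassle, as there is no $round operator in Mongo. You could use a hack, @Baklap4; it would be best if you created a new question for this in case you get stuck.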
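
As a rough illustration of the rounding discussed in the comments: the usual workaround is to multiply, round and divide. The sketch below shows that arithmetic with a made-up helper, roundTo, which could be applied client-side to the aggregation result. Separately, MongoDB 4.2 and later do provide a $round expression, so on such a server a final stage along the lines of roundStage might be appended to the pipeline; treat this as an assumption about your server version, and note that $round and Math.round can differ on exact halves.

```javascript
// Hypothetical helper showing the multiply / round / divide arithmetic.
function roundTo(value, places) {
  const factor = Math.pow(10, places);
  return Math.round(value * factor) / factor;
}

// Assumes MongoDB 4.2 or later, where the $round expression exists.
// Could be appended as the last stage of the aggregation pipeline.
const roundStage = {
  $project: {
    avgCPULoad: { $round: ["$avgCPULoad", 2] },
    timestamp: 1,
  },
};

console.log(roundTo(13.5549, 2));
```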