MongoDB: calculate the 90th percentile across all documents


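I need to calculate the 90th percentile of the duration, where each document's duration is defined as finish_time - start_time.

My plan is:

1. Create a $project stage to calculate each document's duration (in seconds).
2. Calculate the index that corresponds to the 90th percentile (in the sorted list of documents): 90th_percentile_index = 0.9 * number_of_documents.
3. Sort the documents by the newly created duration field.
4. Use the 90th percentile index to $limit the documents.
5. Select the first document from the limited subset of documents.

I am new to MongoDB, so I assume the query can be improved. The query looks like this: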
db.getCollection('scans').aggregate([
{ 
  $project: {
    duration: {
      $divide: [{$subtract: ["$finish_time", "$start_time"]}, 1000] // duration is in seconds
    },
    Percentile90Index: {
      $multiply: [0.9, "$total_number_of_documents"] // I don't know how to get the total number of documents.. 
    }
}
},
{
    $sort : {"$duration": 1},
},
{
    $limit: "$Percentile90Index"
},
{
    $group: {
    _id: "_id",
    percentiles90 : { $max: "$duration" } // selecting the max, i.e, first document after the limit , should give the result.
  }
}
])
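
One note on the attempt above: `$limit` only accepts a literal positive integer, so it cannot read a per-document value such as "$Percentile90Index". A minimal sketch of a two-step alternative follows, assuming a separate count query is acceptable. `buildSkipLimitPipeline` and its parameters are hypothetical names used for illustration, not part of MongoDB.

```javascript
// Hypothetical helper: given the total number of documents, build a
// pipeline that returns only the document at the nearest-rank position.
function buildSkipLimitPipeline(totalCount, p) {
  // 1-based nearest rank; the small epsilon guards against
  // floating-point rounding in p * totalCount.
  const rank = Math.max(1, Math.ceil(p * totalCount - 1e-9));
  return [
    {
      $project: {
        // duration in seconds
        duration: { $divide: [{ $subtract: ["$finish_time", "$start_time"] }, 1000] }
      }
    },
    { $sort: { duration: 1 } }, // field name without the "$" prefix
    { $skip: rank - 1 },        // skip everything below the percentile rank
    { $limit: 1 }               // keep the single document at that rank
  ];
}

// Assumed usage in mongosh, with the collection name from the question:
// const total = db.getCollection('scans').countDocuments({});
// db.getCollection('scans').aggregate(buildSkipLimitPipeline(total, 0.9));
```

The count and the aggregation run as two separate operations here, so the result can be slightly off if documents are inserted in between.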

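My problem is that I do not know how to get the total number of documents, so I cannot calculate the index.
For example:
Suppose I have only 3 documents: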
{
    "_id" : ObjectId("1"),
    "start_time" : ISODate("2019-02-03T12:00:00.000Z"),
    "finish_time" : ISODate("2019-02-03T12:01:00.000Z"),
}

{
    "_id" : ObjectId("2"),
    "start_time" : ISODate("2019-02-03T12:00:00.000Z"),
    "finish_time" : ISODate("2019-02-03T12:03:00.000Z"),

}

{
    "_id" : ObjectId("3"),
    "start_time" : ISODate("2019-02-03T12:00:00.000Z"),
    "finish_time" : ISODate("2019-02-03T12:08:00.000Z"),
}

So I expect the following result:
{
percentiles50 : 3 // in minutes, since percentiles50 = 3 is the minimum value that satisfies the requirement that at least 50% of the documents have duration <= percentiles50
}
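
The definition used in this expected result (the smallest value such that at least p of the documents have a duration less than or equal to it) is the nearest-rank percentile. A small plain JavaScript sketch, checked against the three sample documents, follows. `nearestRankPercentile` is a hypothetical helper name, and the durations are kept in minutes to match the expected output.

```javascript
// Nearest-rank percentile: the smallest value v in the data such that
// at least p (for example 0.5 or 0.9) of the values are <= v.
function nearestRankPercentile(values, p) {
  if (values.length === 0) throw new Error("no values");
  const sorted = [...values].sort((a, b) => a - b);
  // 1-based rank ceil(p * n), converted to a 0-based index.
  // The small epsilon guards against floating-point rounding in p * n.
  const index = Math.max(0, Math.ceil(p * sorted.length - 1e-9) - 1);
  return sorted[index];
}

// The three sample documents from the question (only the time fields).
const sampleDocs = [
  { start_time: new Date("2019-02-03T12:00:00.000Z"), finish_time: new Date("2019-02-03T12:01:00.000Z") },
  { start_time: new Date("2019-02-03T12:00:00.000Z"), finish_time: new Date("2019-02-03T12:03:00.000Z") },
  { start_time: new Date("2019-02-03T12:00:00.000Z"), finish_time: new Date("2019-02-03T12:08:00.000Z") }
];

// Subtracting two Date objects yields milliseconds.
const durationsInMinutes = sampleDocs.map(d => (d.finish_time - d.start_time) / 60000);

const p50 = nearestRankPercentile(durationsInMinutes, 0.5); // 3
const p90 = nearestRankPercentile(durationsInMinutes, 0.9); // 8
console.log({ p50, p90 });
```

With durations of 1, 3 and 8 minutes, the rank for p = 0.5 is ceil(1.5) = 2, which selects 3 and agrees with the expected result above.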

Can you add sample data and the expected result?

@Neodan I have edited my post with sample documents and the desired output. I hope it is clear now.
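
On the missing document count: one possible single-query sketch collects the sorted durations into one array, so that `$size` gives the count and `$arrayElemAt` picks the element at the percentile index. It assumes all durations fit into a single document within the server's size and memory limits. `buildPercentilePipeline` is a hypothetical helper name; the stages and operators themselves are standard aggregation features.

```javascript
// Hypothetical helper: builds a pipeline that returns the nearest-rank
// percentile p (for example 0.9) of finish_time - start_time, in seconds.
function buildPercentilePipeline(p) {
  return [
    {
      $project: {
        duration: { $divide: [{ $subtract: ["$finish_time", "$start_time"] }, 1000] }
      }
    },
    { $sort: { duration: 1 } },
    // Gather every duration, in sorted order, into one array.
    { $group: { _id: null, durations: { $push: "$duration" } } },
    {
      $project: {
        _id: 0,
        percentile: {
          $arrayElemAt: [
            "$durations",
            // 0-based index: ceil(p * count) - 1, never below 0.
            {
              $max: [
                0,
                { $subtract: [{ $ceil: { $multiply: [p, { $size: "$durations" }] } }, 1] }
              ]
            }
          ]
        }
      }
    }
  ];
}

const percentilePipeline = buildPercentilePipeline(0.9);

// Assumed usage in mongosh, with the collection name from the question:
// db.getCollection('scans').aggregate(percentilePipeline)
```

On the three sample documents this should select the 8-minute scan (480 seconds) for p = 0.9. On MongoDB 7.0 or later, the `$percentile` accumulator may be a simpler option, with the caveat that its `approximate` method can differ from the nearest-rank value.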