MongoDB: group Mongo documents by id and get the latest document by timestamp

Suppose we have the following set of documents stored in MongoDB:

{ "fooId" : "1", "status" : "A", "timestamp" : ISODate("2016-01-01T00:00:00.000Z"), "otherInfo" : "BAR", ... }
{ "fooId" : "1", "status" : "B", "timestamp" : ISODate("2016-01-02T00:00:00.000Z"), "otherInfo" : "BAR", ... }
{ "fooId" : "1", "status" : "C", "timestamp" : ISODate("2016-01-03T00:00:00.000Z"), "otherInfo" : "BAR", ... }
{ "fooId" : "2", "status" : "A", "timestamp" : ISODate("2016-01-01T00:00:00.000Z"), "otherInfo" : "BAR", ... }
{ "fooId" : "2", "status" : "B", "timestamp" : ISODate("2016-01-02T00:00:00.000Z"), "otherInfo" : "BAR", ... }
{ "fooId" : "3", "status" : "A", "timestamp" : ISODate("2016-01-01T00:00:00.000Z"), "otherInfo" : "BAR", ... }
{ "fooId" : "3", "status" : "B", "timestamp" : ISODate("2016-01-02T00:00:00.000Z"), "otherInfo" : "BAR", ... }
{ "fooId" : "3", "status" : "C", "timestamp" : ISODate("2016-01-03T00:00:00.000Z"), "otherInfo" : "BAR", ... }
{ "fooId" : "3", "status" : "D", "timestamp" : ISODate("2016-01-04T00:00:00.000Z"), "otherInfo" : "BAR", ... }

I would like to get the latest status for each fooId based on the timestamp. My result would therefore look like this:

{ "fooId" : "1", "status" : "C", "timestamp" : ISODate("2016-01-03T00:00:00.000Z"), "otherInfo" : "BAR", ... }
{ "fooId" : "2", "status" : "B", "timestamp" : ISODate("2016-01-02T00:00:00.000Z"), "otherInfo" : "BAR", ... }
{ "fooId" : "3", "status" : "D", "timestamp" : ISODate("2016-01-04T00:00:00.000Z"), "otherInfo" : "BAR", ... }

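I have been trying to achieve this with aggregation using the $group operator, but what I am wondering is this: is there an easy way to get whole documents back from an aggregation, so that the result looks the same as if I had used a find query? When grouping, it seems you have to specify every field, which does not scale if documents can carry optional fields that I may not know about. My current query looks like this: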

db.collectionName.aggregate(
   [
      { $sort: { timestamp: 1 } },
      {
         $group:
            {
               _id: "$fooId",
               timestamp: { $last: "$timestamp" },
               status: { "$last": "$status" },
               otherInfo: { "$last": "$otherInfo" },
            }
      }
   ]
)

You can use the $$ROOT system variable together with the $last operator to return the last document:

db.collectionName.aggregate([      
    { "$sort": { "timestamp": 1 } },     
    { "$group": { 
        "_id": "$fooId",   
        "last_doc": { "$last": "$$ROOT" } 
    }}
])

Of course, this gives you the last document of each group as the value of a field:

{
        "_id" : "2",
        "last_doc" : {
                "_id" : ObjectId("570e6df92f5bb4fcc8bb177e"),
                "fooId" : "2",
                "status" : "B",
                "timestamp" : ISODate("2016-01-02T00:00:00Z")
        }
}
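
To make the semantics of sorting by timestamp and then taking $last of $$ROOT concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how the server executes the pipeline. The function name latestPerGroup and the sample array are made up for illustration, and the sample keeps only a few of the fields from the question.

```javascript
// In-memory sketch of:
//   { $sort: { timestamp: 1 } },
//   { $group: { _id: "$fooId", last_doc: { $last: "$$ROOT" } } }
// latestPerGroup is a hypothetical helper, not a MongoDB API.
function latestPerGroup(docs, keyField, timeField) {
  // Sort a copy in ascending time order (the $sort stage).
  const sorted = [...docs].sort((a, b) => a[timeField] - b[timeField]);
  const latest = new Map();
  for (const doc of sorted) {
    // A later document overwrites an earlier one with the same key,
    // so the whole newest document survives (the $last: "$$ROOT" part).
    latest.set(doc[keyField], doc);
  }
  return [...latest.entries()].map(([key, doc]) => ({ _id: key, last_doc: doc }));
}

// Sample data modelled on the documents in the question.
const sampleDocs = [
  { fooId: "1", status: "A", timestamp: new Date("2016-01-01T00:00:00Z") },
  { fooId: "1", status: "C", timestamp: new Date("2016-01-03T00:00:00Z") },
  { fooId: "1", status: "B", timestamp: new Date("2016-01-02T00:00:00Z") },
  { fooId: "2", status: "A", timestamp: new Date("2016-01-01T00:00:00Z") },
  { fooId: "2", status: "B", timestamp: new Date("2016-01-02T00:00:00Z") },
];

const grouped = latestPerGroup(sampleDocs, "fooId", "timestamp");
```

Each element of grouped has the same shape as the aggregation output above: the group key in _id and the complete newest document under last_doc.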

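If you are not happy with that output, your best bet is to add another $group stage to the pipeline that uses the $push accumulator operator to return an array of those documents: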

db.collectionName.aggregate([      
    { "$sort": { "timestamp": 1 } },     
    { "$group": { 
        "_id": "$fooId",   
        "last_doc": { "$last": "$$ROOT" } 
    }},
    { "$group": { 
        "_id": null, 
        "result": { "$push": "$last_doc" } 
    }}

])

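If you want to do an aggregation, you have to do something similar to SQL, which means specifying the aggregate operation for each column. The only other option is to use the $$ROOT system variable: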

db.test.aggregate(
   [
    { $sort: { timestamp: 1 } },
     {
       $group:
         {
           _id: "$fooId",
           timestamp: { $last: "$$ROOT" }
         }
     }
   ]
);

But this changes the output slightly:

{ "_id" : "1", "timestamp" : { "_id" : ObjectId("570e6be3e81c8b195818e7fa"), 
  "fooId" : "1", "status" : "A", "timestamp" :ISODate("2016-01-01T00:00:00Z"), 
  "otherInfo" : "BAR" } }

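If you want to get back to the original document format, you will probably need a $project stage after that.

There is no direct way to get the original documents back, and I don't see much value in it, but try the following aggregation query: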

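db.collection.aggregate([
   // Newest document first within each fooId
   {$sort: {fooId:1, timestamp: -1}},
   // Keep the first (newest) whole document of each group
   {$group:{_id:"$fooId", doc:{$first:"$$ROOT"}}},
   // Drop the group _id and wrap the document in a one-element array
   {$project:{_id:0, doc:["$doc"]}}
]).forEach(function(item){
  // Print the document in its original shape
  printjson(item.doc[0]);
});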

This query will emit:

{ 
    "_id" : ObjectId("570e76d5e94e6584078f02c4"), 
    "fooId" : "2", 
    "status" : "B", 
    "timestamp" : ISODate("2016-01-02T00:00:00.000+0000"), 
    "otherInfo" : "BAR"
}
{ 
    "_id" : ObjectId("570e76d5e94e6584078f02c8"), 
    "fooId" : "3", 
    "status" : "D", 
    "timestamp" : ISODate("2016-01-04T00:00:00.000+0000"), 
    "otherInfo" : "BAR"
}
{ 
    "_id" : ObjectId("570e76d5e94e6584078f02c2"), 
    "fooId" : "1", 
    "status" : "C", 
    "timestamp" : ISODate("2016-01-03T00:00:00.000+0000"), 
    "otherInfo" : "BAR"
}
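
The client-side unwrapping in forEach can possibly be avoided on newer servers. The sketch below assumes MongoDB 3.4 or later, where the $replaceRoot stage exists. It reuses the last_doc field name from the earlier answer, and the variable name latestPerFooId is made up for illustration. The pipeline is written as a plain array so that it can be inspected or passed around like any other data structure.

```javascript
// Assumption: the server is MongoDB 3.4+, which provides $replaceRoot.
const latestPerFooId = [
  // Oldest first, so $last sees the newest document of each group
  { $sort: { timestamp: 1 } },
  // Keep the whole newest document of each fooId under last_doc
  { $group: { _id: "$fooId", last_doc: { $last: "$$ROOT" } } },
  // Promote the embedded document to the top level
  { $replaceRoot: { newRoot: "$last_doc" } },
];

// In the mongo shell this would be run as:
//   db.collectionName.aggregate(latestPerFooId)
```

With this stage in place, each result should have the same shape as a document returned by a plain find query, including its original _id.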

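If there is a way to do this without aggregation, I would definitely be interested. Output wrapped by $$ROOT is definitely not what I want, and using $project still keeps the name of the field that holds $$ROOT. I would really like the output to look as if I had run a normal find query.

@Shark Aggregation is your best bet. As I mentioned in my answer, you can add another $group stage to the pipeline.

Right now I have some mongotemplate find queries that map the resulting documents back to collectionName.class objects. I would like to do something similar with the aggregation query, without having to create another intermediate object that I then map to collectionName.class. That is my main reason for being picky about this.

You have already done it the right way. Sure, you can use $$ROOT and put the whole document in one property, but that is not the same structure, is it? If you are that worried about "typing" every field, then just "generate the final $group pipeline statement in code". It is a really simple thing to do; all MongoDB queries and aggregation pipeline statements are, after all, just "data structures".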
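
The suggestion to generate the final $group stage in code could look roughly like the sketch below. The helper name buildLastGroupStage and the hard-coded field list are assumptions for illustration; in practice the list of field names would have to come from somewhere else, for example from the class the documents are mapped to.

```javascript
// Hypothetical helper: builds a $group stage that takes $last of every
// listed field, so the stage does not have to be typed out by hand.
function buildLastGroupStage(groupField, fieldNames) {
  const group = { _id: "$" + groupField };
  for (const name of fieldNames) {
    // e.g. status: { $last: "$status" }
    group[name] = { $last: "$" + name };
  }
  return { $group: group };
}

// The field list is hard-coded here only for the example.
const pipeline = [
  { $sort: { timestamp: 1 } },
  buildLastGroupStage("fooId", ["timestamp", "status", "otherInfo"]),
];

// In the mongo shell this would be run as:
//   db.collectionName.aggregate(pipeline)
```

The generated stage is equivalent to the hand-written $group in the question, so the output keeps the flat structure with one top-level field per name in the list.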