MongoDB: find matching documents based on an aggregated count


I have a MongoDB collection like this:

{"uid": "01370mask4",
 "title": "hidden",
 "post": "hidden",
 "postTime": "01-23, 2016",
 "unixPostTime": "1453538601",
 "upvote": [2, 3]}

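I want to select the post records of users who have more than 5 posts. The structure should stay the same; I just don't need the documents from users who don't have many posts.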

db.collection.aggregate(
   [
     { $group : { _id : "$uid", count: { $sum: 1 } } }
   ]
)

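Now I'm stuck on how to use the count value for the lookup. I searched but found no way to add the count value back to the same collection by uid. MongoDB doesn't seem to support saving aggregation output and joining it back together. Please advise, thanks.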

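Update:

Sorry, I wasn't clear earlier. Thanks for the prompt replies! I want a subset of the original collection, with the post text, post timestamp, and so on. I don't want a subset of the aggregation output.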

Just add a $match stage after the $group stage, with the right query:

db.collection.aggregate(
  [
    { $group : { _id : "$uid", count: { $sum: 1 } } },
    { $match : { count : { $gt : 5 } } }
  ]
)

Try this to select users who have published more than 5 posts. To keep the original fields, if uid is unique, add the fields with $first as follows:

db.collection.aggregate([
     {$group: {
          _id: '$uid',
          title: {$first: '$title'},
          post: {$first: '$post'},
          postTime: {$first: '$postTime'},
          unixPostTime: {$first: '$unixPostTime'},
          upvote: {$first: '$upvote'},
          count: {$sum: 1}
     }},
     // "more than 5 posts" means count > 5, so $gt rather than $gte
     {$match: {count: {$gt: 5}}}
])

If the same uid has multiple documents, use $push in the $group stage instead, so that the values are collected into an array.
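
One way to keep the original structure when a uid has several documents is to push each whole document with $$ROOT and then flatten the array again. The following is only a minimal sketch, assuming MongoDB 3.4 or later (for $replaceRoot) and that all posts of one user fit within the 16 MB document limit; the `pipeline` variable name and the `posts` field are hypothetical, chosen for illustration.

```javascript
// Sketch only: assumes MongoDB 3.4+ and that each user's posts fit in one
// grouped document. `posts` is a temporary field name invented for this example.
const pipeline = [
  // Collect every original document per uid and count them.
  { $group: { _id: "$uid", count: { $sum: 1 }, posts: { $push: "$$ROOT" } } },
  // Keep only users with more than 5 posts.
  { $match: { count: { $gt: 5 } } },
  // Emit one document per collected post again.
  { $unwind: "$posts" },
  // Promote the original post back to the top level, restoring its structure.
  { $replaceRoot: { newRoot: "$posts" } }
];

// In the mongo shell this would be run as: db.collection.aggregate(pipeline)
```

Each output document should then have the same fields as a document in the source collection, with only the posts of users below the threshold missing.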

If you want to save the results to the db, try it this way:

var cur = db.collection.aggregate(
   [
     {$group: {
          _id: '$uid',
          title: {$first: '$title'},
          post: {$first: '$post'},
          postTime: {$first: '$postTime'},
          unixPostTime: {$first: '$unixPostTime'},
          upvote: {$first: '$upvote'},
          count: {$sum: 1}
     }},
     {$match: {count: {$gt: 5}}}
   ]
)
cur.forEach(function(doc) {
   // doc._id holds the uid produced by $group, so match on the uid field
   db.collection.update({uid: doc._id}, {/* the fields that should be updated */});
});

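If you don't have millions of documents, you can try a shortcut that does what you want with one aggregation followed by a separate find query.

Aggregation query: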

var users = db.collection.aggregate(
  [
    {$group:{_id:'$uid', count:{$sum:1}}},
    {$match:{count:{$gt:5}}},
    {$group:{_id:null,users:{$push:'$_id'}}}
  ]
).toArray()[0]['users']

Then a straightforward query to find the posts of those users:

db.collection.find({uid: {$in: users}})
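
To make it easier to see what these two steps return, here is a plain-JavaScript simulation that needs no MongoDB server. It is only an illustrative sketch: the sample data and the helper names `activeUids` and `postsByActiveUsers` are hypothetical and not part of any MongoDB API.

```javascript
// In-memory simulation of the two-step approach; no MongoDB required.
// `posts` is made-up sample data shaped like the documents in the question.
const posts = [];
for (let i = 0; i < 6; i++) {
  posts.push({ uid: "userA", title: "a" + i, upvote: [i] }); // userA: 6 posts
}
for (let i = 0; i < 2; i++) {
  posts.push({ uid: "userB", title: "b" + i, upvote: [] }); // userB: 2 posts
}

// Step 1, like $group + $match: count posts per uid, keep uids with count > minExclusive.
function activeUids(docs, minExclusive) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.uid, (counts.get(doc.uid) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > minExclusive).map(([uid]) => uid);
}

// Step 2, like find({uid: {$in: users}}): return the original documents unchanged.
function postsByActiveUsers(docs, minExclusive) {
  const users = new Set(activeUids(docs, minExclusive));
  return docs.filter((doc) => users.has(doc.uid));
}

const result = postsByActiveUsers(posts, 5);
console.log(result.length); // prints 6
```

The second step returns the source documents themselves, which is why this approach preserves the original structure; the trade-off is that the full list of matching uids has to be held in memory between the two queries.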

I'm not sure of the correct field names in your schema; I just used some sample fields in my answer...

Can you provide a sample input document and the output you want?

I think your original question was very clear, and the later update is very confusing. How would you get the post details with the aggregation approach? Do you want to select the posts of users who have more than 5 posts?

@SarathNair Thanks for the suggestion, I've updated it.

@FrankFang Yes, I want a subset of the collection that contains the post records of users who have more than 5 posts.

Thanks! I tried

db.collection.aggregate([{$group: {_id: '$uid', title: '$title', post: '$post', postTime: '$postTime', unixPostTime: '$unixPostTime', upvote: '$upvote', count: {$sum: 1}}}, {$match: {count: {$gt: 5}}}])

but it keeps failing:

"errmsg" : "exception: the group aggregate field 'title' must be defined as an expression inside an object", "code" : 15951, "ok" : 0

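Thanks for the continued suggestions! I think $first conflicts with count, because $first only works for users who have a single post. The $push approach looks workable, but it creates a nested document, and I still want to keep the original structure. That's why I didn't follow the merge approach and said that "joining them together (the original output and the aggregation output) is not supported by MongoDB".

@leoce Yes, $first is only for documents with a unique uid. If the same uid has multiple documents, $push should be used here.

$push creates one document per unique user, with that user's posts as subdocuments inside it. However, I want to keep the original structure so that, for example, 5 posts stay separate. I'm still searching for a way to update the original collection.

@leoce As far as I know, there is no better solution with the current MongoDB version. If you find another good way, please let me know.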