Is it better to store many documents or fewer documents with large objects in MongoDB?


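I use MongoDB to store analytics for multiple websites. Together these sites get millions of visits per day across thousands of distinct URLs, and I need to count the visits to each URL.

Every day I need to fetch the previous day's data.

Is it better to store each URL in its own document, or to store all URLs under a single object in one document?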

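Inevitably, everyone who uses MongoDB has to choose between multiple collections with id references and embedded documents. Both solutions have their strengths and weaknesses. Learn to use both: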

Use separate collections

db.posts.find();
{_id: 1, title: 'unicorns are awesome', ...}

db.comments.find();
{_id: 1, post_id: 1, title: 'i agree', ...}
{_id: 2, post_id: 1, title: 'they kill vampires too!', ...}

- or -

Use embedded documents

db.posts.find();
{_id: 1, title: 'unicorns are awesome', ..., comments: [
  {title: 'i agree', ...},
  {title: 'they kill vampires too!', ...}
]}

Separate collections offer the greatest querying flexibility

// sort comments however you want
db.comments.find({post_id: 3}).sort({votes: -1}).limit(5)

// pull out one or more specific comment(s)
db.comments.find({post_id: 3, user: 'leto'})

// get all of a user's comments joining the posts to get the title
var comments = db.comments.find({user: 'leto'}, {post_id: true})
var postIds = comments.map(function(c) { return c.post_id; });
db.posts.find({_id: {$in: postIds}}, {title: true});

Selecting from embedded documents is more limited

// you can select a range (useful for paging)
// but can't sort, so you are limited to the insertion order
db.posts.find({_id: 3}, {comments: {$slice: [0, 5]}})

// you can select the post without any comments also
db.posts.find({_id: 54}, {comments: 0})

// since MongoDB 2.2 the positional operator ($) also works in a projection,
// but it returns only the first matching comment
db.posts.find({'comments.user': 'leto'}, {title: 1, 'comments.$': 1})
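
Because the embedded array comes back in insertion order, any other ordering has to happen in application code once the post has been fetched. The following is a minimal sketch in plain JavaScript, not part of the original answer. It assumes each embedded comment carries a numeric `votes` field; the helper name `topComments` and the sample data are made up for illustration.

```javascript
// Hypothetical helper: return the n highest-voted embedded comments of a post.
// Assumes `post.comments` is an array and each comment may have a numeric `votes`.
function topComments(post, n) {
  const comments = Array.isArray(post.comments) ? post.comments : [];
  // Copy before sorting so the fetched document is not mutated.
  return comments
    .slice()
    .sort((a, b) => (b.votes || 0) - (a.votes || 0))
    .slice(0, n);
}

// Sample document shaped like the embedded posts above (values are invented).
const post = {
  _id: 3,
  title: 'unicorns are awesome',
  comments: [
    {title: 'i agree', votes: 2},
    {title: 'they kill vampires too!', votes: 7},
    {title: 'meh', votes: 0}
  ]
};

console.log(topComments(post, 2).map(c => c.title));
```

This works, but the whole comments array still travels over the wire before it is sorted, which is part of why large comment lists tend to fit a separate collection better.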

A document, including all of its embedded documents and arrays, cannot exceed 16MB.
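
To get a feel for what the 16MB limit means for an embedded array, a back-of-the-envelope estimate can help. The sketch below is an illustration only: it uses the JSON encoding as a stand-in for BSON, which is an approximation, and the function names `approxBytes` and `maxEmbeddedItems` are hypothetical.

```javascript
// MongoDB's maximum document size is 16 MiB.
const MAX_DOC_BYTES = 16 * 1024 * 1024;

// Rough size of a document, using its JSON encoding as a stand-in for BSON.
// This is only an approximation: real BSON sizes differ from JSON sizes.
function approxBytes(doc) {
  return new TextEncoder().encode(JSON.stringify(doc)).length;
}

// How many embedded items of about `itemBytes` each fit in one document
// that already holds `baseBytes` of other fields.
function maxEmbeddedItems(itemBytes, baseBytes) {
  return Math.floor((MAX_DOC_BYTES - baseBytes) / itemBytes);
}

// Sample comment (field values are invented for the estimate).
const sampleComment = {title: 'i agree', user: 'leto', votes: 0};
console.log(approxBytes(sampleComment), maxEmbeddedItems(200, 1000));
```

Under these assumptions a few thousand small entries fit comfortably in one document, while long comment bodies or unbounded growth can approach the limit.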

Separate collections require more work

 // finding a post + its comments is two queries and requires extra work
 // in your code to make it all pretty (or your ODM might do it for you)
 db.posts.find({_id: 9001});
 db.comments.find({post_id: 9001})
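
The "extra work" is mostly stitching the two result sets together. A minimal sketch of that step in plain JavaScript, assuming the two queries above have already been run and their results turned into arrays (the helper name `attachComments` is hypothetical, and an ODM may do the equivalent for you):

```javascript
// Hypothetical helper: attach each comment to its post by matching post_id to _id.
function attachComments(posts, comments) {
  const byPost = new Map();
  for (const c of comments) {
    if (!byPost.has(c.post_id)) byPost.set(c.post_id, []);
    byPost.get(c.post_id).push(c);
  }
  // Return new objects; posts without comments get an empty array.
  return posts.map(p => ({...p, comments: byPost.get(p._id) || []}));
}

// Stand-ins for the results of the two queries above.
const posts = [{_id: 9001, title: 'unicorns are awesome'}];
const comments = [
  {_id: 1, post_id: 9001, title: 'i agree'},
  {_id: 2, post_id: 9001, title: 'they kill vampires too!'}
];

console.log(JSON.stringify(attachComments(posts, comments), null, 2));
```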

Embedded documents are easy and fast (single seek)

  // finding a post + its comments
  db.posts.find({_id: 9001});

Inserts and updates are not much different

 // separate collection insert and update
  db.comments.insert({post_id: 43, title: 'i hate unicrons', user: 'dracula'});
  db.comments.update({_id: 4949}, {$set : {title: 'i hate unicorns'}});

  // embedded document insert and update
  db.posts.update({_id: 43}, {$push: {comments: {title: 'lol @ emo vampire', user: 'paul'}}})
  // this specific update requires that we store an _id with each comment
  db.posts.update( {'comments._id': 4949}, {$inc:{'comments.$.votes':1}})

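So, separate collections are a good choice if you need to select individual documents, need more control over querying, or have huge documents. Embedded documents are a good choice when you want the entire document, the document with a $slice of its comments, or the document with no comments at all. As a general rule, if you have a lot of "comments" or if they are large, a separate collection is probably best. Smaller and/or fewer documents tend to be a natural fit for embedding.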
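
Applied to the per-URL visit counts in the question, the separate-documents option could look roughly like the sketch below: one small document per site, URL and day, incremented with `$inc` through an upsert. This is an illustration, not something the original answer prescribes. The field names (`site`, `url`, `day`, `count`), the collection name `visits`, and the helper `visitUpsert` are all assumptions.

```javascript
// Hypothetical helper: build the filter, update and options for counting one visit.
// Field names (site, url, day, count) are assumptions for illustration.
function visitUpsert(site, url, date) {
  // Bucket by UTC calendar day, e.g. '2024-01-31'.
  const day = date.toISOString().slice(0, 10);
  return {
    filter: {site: site, url: url, day: day},
    update: {$inc: {count: 1}},
    options: {upsert: true}
  };
}

const op = visitUpsert('example.com', '/pricing', new Date(Date.UTC(2024, 0, 31, 23, 59)));
console.log(JSON.stringify(op));

// In the mongo shell this could be applied as (collection name `visits` is assumed):
//   db.visits.updateOne(op.filter, op.update, op.options)
// and the previous day's numbers read back with something like:
//   db.visits.find({site: 'example.com', day: '2024-01-30'})
```

Keeping one document per URL and day keeps each document tiny and lets the daily read be a plain query that can be sorted by `count`; the embedded alternative would put all of a day's URLs into one growing document.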

Remember, you can always change your mind. Trying both is the best way to learn.