Is it better to store many documents or fewer documents with large objects in MongoDB?

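I use MongoDB to store analytics for several websites. Together these sites get millions of visits a day across thousands of different URLs, and I need to count the visits to each URL.

Every day I need to fetch the previous day's data.

Is it better to store each URL in its own document, or to store all URLs under one object in a single document?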

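Inevitably, everyone who uses MongoDB has to choose between multiple collections with id references and embedded documents. Both solutions have their strengths and weaknesses. Learn to use both:

Using separate collections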

db.posts.find();
{_id: 1, title: 'unicorns are awesome', ...}

db.comments.find();
{_id: 1, post_id: 1, title: 'i agree', ...}
{_id: 2, post_id: 1, title: 'they kill vampires too!', ...}

- or -

Using embedded documents

db.posts.find();
{_id: 1, title: 'unicorns are awesome', ..., comments: [
  {title: 'i agree', ...},
  {title: 'they kill vampires too!', ...}
]}

Separate collections offer the greatest querying flexibility

// sort comments however you want
db.comments.find({post_id: 3}).sort({votes: -1}).limit(5)

// pull out one or more specific comment(s)
db.comments.find({post_id: 3, user: 'leto'})

// get all of a user's comments joining the posts to get the title
var comments = db.comments.find({user: 'leto'}, {post_id: true})
var postIds = comments.map(function(c) { return c.post_id; });
db.posts.find({_id: {$in: postIds}}, {title: true});

Selecting embedded documents is more limited

// you can select a range (useful for paging)
// but can't sort, so you are limited to the insertion order
db.posts.find({_id: 3}, {comments: {$slice: [0, 5]}})

// you can select the post without any comments also
db.posts.find({_id: 54}, {comments: 0})

// the positional operator ($) in a projection returns only the first matching comment (MongoDB 2.2+)
db.posts.find({'comments.user': 'leto'}, {title: 1, 'comments.$': 1})
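Because `$slice` returns embedded comments in insertion order, any other ordering has to be applied after the document is fetched. A minimal sketch in plain JavaScript, assuming each embedded comment carries a numeric `votes` field (as the `$inc` update elsewhere in this answer implies). `topComments` is a hypothetical helper name, not a MongoDB API:

```javascript
// Hypothetical helper: sort a fetched post's embedded comments by votes, descending.
// Assumes post.comments is an array of objects with an optional numeric `votes` field.
function topComments(post, limit) {
  const comments = Array.isArray(post.comments) ? post.comments : [];
  // Copy before sorting so the fetched document itself is not mutated.
  return [...comments]
    .sort((a, b) => (b.votes || 0) - (a.votes || 0))
    .slice(0, limit);
}

// Sample document shaped like the embedded example above (vote counts are made up).
const post = {
  _id: 3,
  title: 'unicorns are awesome',
  comments: [
    { title: 'i agree', votes: 2 },
    { title: 'they kill vampires too!', votes: 7 },
    { title: 'meh' },
  ],
};

console.log(topComments(post, 2).map((c) => c.title));
```

The whole `comments` array still travels over the wire before it is sorted, which is one reason large comment lists tend to fit a separate collection better.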

A document, including all of its embedded documents and arrays, cannot exceed 16MB.

Separate collections require more work

 // finding a post + its comments is two queries and requires extra work
 // in your code to make it all pretty (or your ODM might do it for you)
 db.posts.find({_id: 9001});
 db.comments.find({post_id: 9001})
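The "extra work in your code" usually means stitching the two result sets together. A minimal sketch, assuming both queries have already been materialized as plain arrays (for example with `toArray()`). `attachComments` is a hypothetical helper, not something the driver provides:

```javascript
// Hypothetical helper: attach each post's comments after running the two queries.
// `posts` and `comments` are plain arrays of documents.
function attachComments(posts, comments) {
  // Group comments by the post they reference.
  const byPost = new Map();
  for (const comment of comments) {
    if (!byPost.has(comment.post_id)) byPost.set(comment.post_id, []);
    byPost.get(comment.post_id).push(comment);
  }
  // Return new post objects; posts without comments get an empty array.
  return posts.map((p) => ({ ...p, comments: byPost.get(p._id) || [] }));
}

// Sample data shaped like the separate-collection example above.
const posts = [
  { _id: 1, title: 'unicorns are awesome' },
  { _id: 2, title: 'a post nobody commented on' },
];
const comments = [
  { _id: 1, post_id: 1, title: 'i agree' },
  { _id: 2, post_id: 1, title: 'they kill vampires too!' },
];

console.log(JSON.stringify(attachComments(posts, comments), null, 2));
```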

Embedded documents are easy and fast (a single seek)

  // finding a post + its comments
  db.posts.find({_id: 9001});

No big differences for inserts and updates

 // separate collection insert and update
  db.comments.insert({post_id: 43, title: 'i hate unicrons', user: 'dracula'});
  db.comments.update({_id: 4949}, {$set : {title: 'i hate unicorns'}});

  // embedded document insert and update
  db.posts.update({_id: 43}, {$push: {comments: {title: 'lol @ emo vampire', user: 'paul'}}})
  // this specific update requires that we store an _id with each comment
  db.posts.update( {'comments._id': 4949}, {$inc:{'comments.$.votes':1}})

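So, separate collections are good if you need to select individual documents, need more control over querying, or have a large number of documents. Embedded documents are good when you want the entire document, the document with a $slice of comments, or the document with no comments at all. As a general rule, if you have a lot of "comments" or if they are large, a separate collection is probably best. Smaller and/or fewer documents tend to be a natural fit for embedding.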
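Applied to the URL-counting question, thousands of URLs per site per day points toward the separate-document side of that rule. A minimal sketch of one document per site, URL, and day, assuming a collection named `visits` and the field names `site`, `url`, `day`, and `visits` (all hypothetical). The helpers only build the arguments, which could then be passed to `db.visits.updateOne(filter, update, options)` and `db.visits.find(filter)`:

```javascript
// Format a Date as a UTC day string such as '2024-03-10'.
function utcDay(date) {
  return date.toISOString().slice(0, 10);
}

// Hypothetical helper: arguments for an upsert that bumps one URL's daily counter.
function visitUpsert(site, url, when) {
  return {
    filter: { site: site, url: url, day: utcDay(when) },
    update: { $inc: { visits: 1 } },
    options: { upsert: true }, // create the document on the first visit of the day
  };
}

// Hypothetical helper: filter that selects every URL document for the previous UTC day.
function yesterdayFilter(site, now) {
  const oneDayMs = 24 * 60 * 60 * 1000;
  return { site: site, day: utcDay(new Date(now.getTime() - oneDayMs)) };
}

const op = visitUpsert('example.com', '/pricing', new Date('2024-03-10T15:00:00Z'));
console.log(JSON.stringify(op));
console.log(JSON.stringify(yesterdayFilter('example.com', new Date('2024-03-11T02:00:00Z'))));
```

An index on `site` and `day` would likely help the daily read. Keeping all URLs in one document per day is also workable for small sites, but URLs contain dots, which update paths such as `urls.<key>` interpret as nesting, so the keys would need escaping.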

Remember, you can always change your mind. Trying both is the best way to learn.