Is it better to store many documents or fewer documents with large objects in MongoDB?
I use MongoDB to store analytics for several websites. These sites get millions of visits per day across thousands of distinct URLs, and I need to count the visits to each URL. Every day I need to fetch the previous day's data.
Is it better to store each URL in its own document, or to store all URLs under one object in a single document?
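Inevitably, everyone who uses MongoDB has to choose between using multiple collections with id references and using embedded documents. Both solutions have their strengths and weaknesses. Learn to use both:
Using separate collections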
db.posts.find();
{_id: 1, title: 'unicorns are awesome', ...}
db.comments.find();
{_id: 1, post_id: 1, title: 'i agree', ...}
{_id: 2, post_id: 1, title: 'they kill vampires too!', ...}
- or - using embedded documents
db.posts.find();
{_id: 1, title: 'unicorns are awesome', ..., comments: [
{title: 'i agree', ...},
{title: 'they kill vampires too!', ...}
]}
Separate collections offer the greatest querying flexibility
// sort comments however you want
db.comments.find({post_id: 3}).sort({votes: -1}).limit(5)
// pull out one or more specific comment(s)
db.comments.find({post_id: 3, user: 'leto'})
// get all of a user's comments joining the posts to get the title
var comments = db.comments.find({user: 'leto'}, {post_id: true})
var postIds = comments.map(function(c) { return c.post_id; });
db.posts.find({_id: {$in: postIds}}, {title: true});
Selecting embedded documents is more limited
// you can select a range (useful for paging)
// but can't sort, so you are limited to the insertion order
db.posts.find({_id: 3}, {comments: {$slice: [0, 5]}})
// you can select the post without any comments also
db.posts.find({_id: 54}, {comments: 0})
// the positional ($) projection (MongoDB 2.2+) returns only the first matching comment
db.posts.find({'comments.user': 'leto'}, {title: 1, 'comments.$': 1})
A document, including all its embedded documents and arrays, cannot exceed 16MB.
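To get a feel for that limit in the analytics case from the question, the following sketch estimates the size of a single per-day document that embeds a counter for every URL. It is plain JavaScript with no database connection. The document shape, the field names (`site`, `day`, `urls`, `hits`) and the URL pattern are assumptions made for illustration, and the JSON length is only a rough stand-in for the real BSON size.

```javascript
// Rough size check for one "all URLs of one day" document.
// Assumption: JSON byte length approximates BSON size well enough for a sanity check.
const MAX_DOC_BYTES = 16 * 1024 * 1024; // the 16MB document limit

// Build a hypothetical daily document with `urlCount` embedded counters.
function buildDailyDoc(urlCount) {
  const urls = [];
  for (let i = 0; i < urlCount; i++) {
    urls.push({ url: '/articles/some-fairly-long-slug-' + i, hits: 1000 + i });
  }
  return { site: 'example.com', day: '2013-01-01', urls: urls };
}

// Approximate the stored size via the UTF-8 length of the JSON form.
function approxBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), 'utf8');
}

const doc = buildDailyDoc(5000);
const bytes = approxBytes(doc);
console.log(bytes + ' bytes, fits under limit: ' + (bytes < MAX_DOC_BYTES));
```

With a few thousand short URLs the estimate stays well under the limit, so size alone would not rule out embedding here. It becomes a concern if the number of URLs per document can grow without bound.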
Separate collections require more work
// finding a post + its comments is two queries and requires extra work
// in your code to make it all pretty (or your ODM might do it for you)
db.posts.find({_id: 9001});
db.comments.find({post_id: 9001})
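The "extra work in your code" mentioned in the comment above is essentially an application-side join. A minimal sketch in plain JavaScript, assuming the two query results have already been loaded into memory as a post object and an array of comments (the helper name `attachComments` is hypothetical):

```javascript
// Combine a post and its separately stored comments into one object.
// `post` and `comments` stand in for the results of the two queries.
function attachComments(post, comments) {
  const own = comments.filter(function (c) { return c.post_id === post._id; });
  // Copy the post so the original query result is left untouched.
  return Object.assign({}, post, { comments: own });
}

const post = { _id: 9001, title: 'unicorns are awesome' };
const comments = [
  { _id: 1, post_id: 9001, title: 'i agree' },
  { _id: 2, post_id: 9001, title: 'they kill vampires too!' },
  { _id: 3, post_id: 43, title: 'i hate unicorns' }
];

const full = attachComments(post, comments);
console.log(full.comments.length);
```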
Embedded documents are easy and fast (single seek)
// finding a post + its comments
db.posts.find({_id: 9001});
No big differences for inserts and updates
// separate collection insert and update
db.comments.insert({post_id: 43, title: 'i hate unicrons', user: 'dracula'});
db.comments.update({_id: 4949}, {$set : {title: 'i hate unicorns'}});
// embedded document insert and update
db.posts.update({_id: 43}, {$push: {comments: {title: 'lol @ emo vampire', user: 'paul'}}})
// this specific update requires that we store an _id with each comment
db.posts.update( {'comments._id': 4949}, {$inc:{'comments.$.votes':1}})
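So, separate collections are good if you need to select individual documents, need more control over querying, or have huge documents. Embedded documents are good when you want the entire document, the document with a $slice of comments, or with no comments at all. As a general rule, if you have a lot of "comments" or if they are large, a separate collection might be best. Smaller and/or fewer documents tend to be a natural fit for embedding.
Remember, you can always change your mind. Trying both is the best way to learn.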
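Applied to the URL analytics case in the question, the two options could look roughly like this. This is a sketch, not a recommendation: the field names (`site`, `url`, `day`, `hits`, `urls`), the helper names and the key escaping scheme are all assumptions. The functions only build the filter and update objects that would be passed to an upsert such as `updateOne(filter, update, {upsert: true})`. Nothing here talks to a database.

```javascript
// Option A: one small document per (site, url, day).
function perUrlUpsert(site, url, day) {
  return {
    filter: { site: site, url: url, day: day },
    update: { $inc: { hits: 1 } }
  };
}

// Option B: one document per (site, day) with a counter per URL embedded in it.
// Dots in an update path mean nesting, so they are escaped here
// (this particular escaping scheme is an assumption).
function escapeKey(url) {
  return url.replace(/%/g, '%25').replace(/\./g, '%2E').replace(/\$/g, '%24');
}

function perDayUpsert(site, url, day) {
  const inc = {};
  inc['urls.' + escapeKey(url)] = 1;
  return {
    filter: { site: site, day: day },
    update: { $inc: inc }
  };
}

const a = perUrlUpsert('example.com', '/index.html', '2013-01-01');
const b = perDayUpsert('example.com', '/index.html', '2013-01-01');
console.log(JSON.stringify(a));
console.log(JSON.stringify(b));
```

With option A, fetching the previous day's data is a query on `day`, and each URL can be sorted or paged independently. With option B it is a single document read, subject to the 16MB limit.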