Question about how MongoDB chooses indexes for $or queries
Background
I have a collection with the following schema:
{ appId, createdAt, lastSeen, lastHeard, deleted, ...otherFields }
and the following indexes:
{ appId: 1, deleted: 1, lastSeen: 1 }
{ appId: 1, deleted: 1, lastHeard: 1 }
{ appId: 1, createdAt: 1, deleted: 1 }
{ appId: 1, deleted: 1, createdAt: 1, lastSeen: 1, lastHeard: 1 }
In my application, I have an aggregation:
db.getCollection('client_users').aggregate([
    {
        $match: {
            deleted: false,
            appId: 'appid',
            $or: [
                { createdAt: { $gt: new Date('2020-10-19T17:00:00.000Z') } },
                { lastSeen: { $gt: new Date('2020-10-19T17:00:00.000Z') } },
                { lastHeard: { $gt: new Date('2020-10-19T17:00:00.000Z') } },
            ]
        }
    },
    {
        $group: {
            _id: '$geoLocation.city',
            count: {
                $sum: 1
            }
        }
    },
    {
        $sort: {
            count: -1
        }
    }
]);
My intention is for the first three indexes above to be used for this aggregation, since as I understand it the $or query is resolved as 3 separate queries. However, according to the explain output, the winning plan uses the fourth index ({ appId: 1, deleted: 1, createdAt: 1, lastSeen: 1, lastHeard: 1 }) for the last two clauses:
"winningPlan" : {
    "stage" : "FETCH",
    "inputStage" : {
        "stage" : "OR",
        "inputStages" : [
            {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "appId" : 1,
                    "createdAt" : 1,
                    "deleted" : 1
                },
                "indexName" : "appId_1_createdAt_1_deleted_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "appId" : [],
                    "createdAt" : [],
                    "deleted" : []
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "appId" : [
                        "[\"appid\", \"appid\"]"
                    ],
                    "createdAt" : [
                        "(new Date(1603126800000), new Date(9223372036854775807)]"
                    ],
                    "deleted" : [
                        "[false, false]"
                    ]
                }
            },
            {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "appId" : 1,
                    "deleted" : 1,
                    "createdAt" : 1,
                    "lastSeen" : 1,
                    "lastHeard" : 1
                },
                "indexName" : "appId_1_deleted_1_createdAt_1_lastSeen_1_lastHeard_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "appId" : [],
                    "deleted" : [],
                    "createdAt" : [],
                    "lastSeen" : [],
                    "lastHeard" : []
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "appId" : [
                        "[\"appid\", \"appid\"]"
                    ],
                    "deleted" : [
                        "[false, false]"
                    ],
                    "createdAt" : [
                        "[MinKey, MaxKey]"
                    ],
                    "lastSeen" : [
                        "[MinKey, MaxKey]"
                    ],
                    "lastHeard" : [
                        "(new Date(1603126800000), new Date(9223372036854775807)]"
                    ]
                }
            },
            {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "appId" : 1,
                    "deleted" : 1,
                    "createdAt" : 1,
                    "lastSeen" : 1,
                    "lastHeard" : 1
                },
                "indexName" : "appId_1_deleted_1_createdAt_1_lastSeen_1_lastHeard_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "appId" : [],
                    "deleted" : [],
                    "createdAt" : [],
                    "lastSeen" : [],
                    "lastHeard" : []
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "appId" : [
                        "[\"appid\", \"appid\"]"
                    ],
                    "deleted" : [
                        "[false, false]"
                    ],
                    "createdAt" : [
                        "[MinKey, MaxKey]"
                    ],
                    "lastSeen" : [
                        "(new Date(1603126800000), new Date(9223372036854775807)]"
                    ],
                    "lastHeard" : [
                        "[MinKey, MaxKey]"
                    ]
                }
            }
        ]
    }
},
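To see at a glance which index each branch of the OR stage picked, the plan tree can be walked programmatically. The following is only a sketch: listIndexScans is a hypothetical helper name, and the winningPlan object is a trimmed-down copy of the explain output above, not the full document that explain() returns.

```javascript
// Walk an explain() plan tree and collect the index behind every IXSCAN stage.
// `plan` is assumed to be a winningPlan-shaped object like the one shown above.
function listIndexScans(plan, found = []) {
  if (!plan || typeof plan !== 'object') return found;
  if (plan.stage === 'IXSCAN') {
    found.push({ indexName: plan.indexName, bounds: plan.indexBounds });
  }
  // A stage has either a single child (inputStage) or several (inputStages).
  if (plan.inputStage) listIndexScans(plan.inputStage, found);
  for (const child of plan.inputStages || []) listIndexScans(child, found);
  return found;
}

// Trimmed-down copy of the winning plan from the explain output.
const winningPlan = {
  stage: 'FETCH',
  inputStage: {
    stage: 'OR',
    inputStages: [
      { stage: 'IXSCAN', indexName: 'appId_1_createdAt_1_deleted_1' },
      { stage: 'IXSCAN', indexName: 'appId_1_deleted_1_createdAt_1_lastSeen_1_lastHeard_1' },
      { stage: 'IXSCAN', indexName: 'appId_1_deleted_1_createdAt_1_lastSeen_1_lastHeard_1' },
    ],
  },
};

const usedIndexes = listIndexScans(winningPlan).map((scan) => scan.indexName);
console.log(usedIndexes);
```

Here two of the three branches report the five-field index, which matches what the explain output shows for the lastHeard and lastSeen clauses.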
This is not what I want. The strange thing is that when I try just one of the clauses, as in this $match stage:
$match: {
    deleted: false,
    appId: 'appid',
    lastSeen: { $gt: new Date('2020-10-19T17:00:00.000Z') },
}
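it uses the correct index ({ appId: 1, deleted: 1, lastSeen: 1 }). I know this both from the explain output and from timing the actual aggregation. Specifically, running it without a hint, or with hint: appId_1_deleted_1_lastSeen_1, takes about a third as long as running it with hint: appId_1_deleted_1_createdAt_1_lastSeen_1. This leaves me very confused about how MongoDB chooses indexes.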
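A timing comparison like the one described can be sketched as follows. This is an illustration only: timeAggregate and lastSeenPipeline are hypothetical names, and the code assumes a shell-style collection handle whose aggregate() accepts a hint option and returns a cursor with a synchronous toArray(). A single run is also sensitive to cache warm-up, so repeated runs would give a fairer picture.

```javascript
// Hypothetical helper: run a pipeline once, optionally with an index hint,
// and report the elapsed wall-clock time in milliseconds.
// Assumes aggregate() returns a cursor with a synchronous toArray(), as in the shell.
function timeAggregate(coll, pipeline, hint) {
  const options = hint ? { hint } : {};
  const start = Date.now();
  const docs = coll.aggregate(pipeline, options).toArray();
  return { hint: hint || null, count: docs.length, ms: Date.now() - start };
}

// The single-clause $match stage from the question.
const lastSeenPipeline = [
  {
    $match: {
      deleted: false,
      appId: 'appid',
      lastSeen: { $gt: new Date('2020-10-19T17:00:00.000Z') },
    },
  },
];

// In the shell (not executed here):
// timeAggregate(db.getCollection('client_users'), lastSeenPipeline);
// timeAggregate(db.getCollection('client_users'), lastSeenPipeline, 'appId_1_deleted_1_lastSeen_1');
```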
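Can someone explain the reason for this behavior? Is there a way to force MongoDB to use the indexes I want in this case? Thanks.

I figured it out. It happens precisely because of the $or query. MongoDB chooses a query plan by racing the candidate plans against each other in a small trial. The less efficient plan won by luck, because the first $or clause handled everything (remember, it is an $or, so matching a single clause is enough). I solved this by dropping the fourth index.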
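The fix from the answer amounts to a single shell call. The following is a minimal sketch, not the exact commands that were run: retireWideIndex and WIDE_INDEX are hypothetical names, and the index name is taken from the explain output above. The reversible branch uses hideIndex(), which requires MongoDB 4.4 or later and is an alternative the answer itself does not mention.

```javascript
// Name of the fourth (five-field) index, as reported in the explain output.
const WIDE_INDEX = 'appId_1_deleted_1_createdAt_1_lastSeen_1_lastHeard_1';

// Hypothetical helper: take the wide index out of the planner's reach.
// `coll` is assumed to be a shell collection handle,
// e.g. db.getCollection('client_users').
// reversible = true  -> hideIndex() (MongoDB 4.4+), undone with unhideIndex()
// reversible = false -> dropIndex(), as described in the answer
function retireWideIndex(coll, reversible = false) {
  return reversible ? coll.hideIndex(WIDE_INDEX) : coll.dropIndex(WIDE_INDEX);
}

// In the shell (not executed here):
// retireWideIndex(db.getCollection('client_users'));
```

After either call, re-running explain on the aggregation should show whether each $or branch now uses one of the three-field indexes.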