mongodb$查找不使用哈希索引进行简单的相等联接_Mongodb_Mongodb Query_Aggregation Framework_Pymongo

mongodb$查找不使用哈希索引进行简单的相等联接

mongodb

mongodb$查找不使用哈希索引进行简单的相等联接,mongodb,mongodb-query,aggregation-framework,pymongo,Mongodb,Mongodb Query,Aggregation Framework,Pymongo,我有一个名为songs_test的集合，它有一个字段“artist_name”。我已经在同一页上创建了一个散列索引 > db.songs_test.getIndexes() [ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "songs_new.songs_test" }, {

我有一个名为songs_test的集合，它有一个字段“artist_name”。我已经在同一页上创建了一个散列索引

> db.songs_test.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "songs_new.songs_test"
    },
    {
        "v" : 2,
        "key" : {
            "artist_name" : "hashed"
        },
        "name" : "artist_name_hashed",
        "ns" : "songs_new.songs_test"
    }
]

然后，我运行以下聚合查询：

use songs_new

db.songs_test.explain().aggregate([
        {$lookup : {
                 "from": "songs_test",
                 "localField": "artist_name",
                 "foreignField": "artist_name",
                 "as": "artist_name_matches"
                 }
        },
        {$unwind : "$artist_name_matches"}
], {allowDiskUse: true}
)

运行此聚合查询大约需要13分钟。Songs_test collection拥有100万条记录。当我进行解释时，我发现查找没有使用索引

解释输出：

switched to db songs_new
{
    "stages" : [
        {
            "$cursor" : {
                "query" : {

                },
                "queryPlanner" : {
                    "plannerVersion" : 1,
                    "namespace" : "songs_new.songs_test",
                    "indexFilterSet" : false,
                    "parsedQuery" : {

                    },
                    "queryHash" : "8B3D4AB8",
                    "planCacheKey" : "8B3D4AB8",
                    "winningPlan" : {
                        "stage" : "COLLSCAN",
                        "direction" : "forward"
                    },
                    "rejectedPlans" : [ ]
                }
            }
        },
        {
            "$lookup" : {
                "from" : "songs_test",
                "as" : "artist_name_matches",
                "localField" : "artist_name",
                "foreignField" : "artist_name",
                "unwinding" : {
                    "preserveNullAndEmptyArrays" : false
                }
            }
        }
    ],
    "ok" : 1
}

我尝试过各种方法（在网上建议），但找不到确切的原因。此外，在8线程64GB RAM机器上运行大约需要13分钟。聚合查询只使用一个线程（这也很奇怪）。

有人能帮忙解释一下为什么不使用索引，为什么聚合运行在一个线程上吗？

有趣的问题。你能回答1吗。为什么您要使用

$lookup

？2.你用

hash

类型索引

artist\u name

的目的是什么？有趣的问题。你能回答1吗。为什么您要使用

$lookup

？2.你用散列的

类型为艺术家名称
编制索引的目的是什么？