Python MongoDB: unreasonably slow queries despite indexes and simple documents


I am using Python with MongoDB. I need to query documents from the database and save some of the information they contain. My current code is:

for trips in trip.find({},{'latlng_start':1, 'latlng_end':1, 'trip_data':1, 'trip_id':1}).batch_size(500):
    orig_coord = trips['latlng_start']['coordinates']
    dest_coord = trips['latlng_end']['coordinates']
    cell_start = citymap.find({"trips_orig": {"$exists": True},"cell_latlng":{"$geoIntersects":{"$geometry":{"type":"Point", "coordinates":orig_coord}}}})
    cell_end = citymap.find({"trips_dest": {"$exists": True},"cell_latlng":{"$geoIntersects":{"$geometry":{"type":"Point", "coordinates":dest_coord}}}})

    if cell_start.count() == 1 and cell_end.count() == 1 and cell_start[0]['big_cell8']['POI'] != {} and cell_end[0]['big_cell8']['POI'] != {}:
        try:
            labels_raw.append(purpose_mapping[trips['trip_data']['purpose']])           
            user_ids_raw.append(int(trips['trip_id'][:10]))         
            venue_feature_start.append([cell_start[0]['big_cell8']['POI'], orig_coord])
            venue_feature_end.append([cell_end[0]['big_cell8']['POI'], dest_coord]) 
        except:
            continue

    else:
        continue

I have created a 2dsphere index on the citymap collection. The indexes on this collection are:

[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "CitySeg2014.grid750"
    },
    {
        "v" : 1,
        "key" : {
            "latlng" : "2dsphere"
        },
        "name" : "latlng_2dsphere",
        "ns" : "CitySeg2014.grid750",
        "2dsphereIndexVersion" : 2
    },
    {
        "v" : 1,
        "key" : {
            "cell_latlng" : "2dsphere"
        },
        "name" : "cell_latlng_2dsphere",
        "ns" : "CitySeg2014.grid750",
        "2dsphereIndexVersion" : 2
    },
    {
        "v" : 1,
        "key" : {
            "_fts" : "text",
            "_ftsx" : 1
        },
        "name" : "trips_dest_text_trips_orig_text",
        "ns" : "CitySeg2014.grid750",
        "weights" : {
            "trips_dest" : 1,
            "trips_orig" : 1
        },
        "default_language" : "english",
        "language_override" : "language",
        "textIndexVersion" : 2
    }
]

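The problem is that although there are only 47,000 trips and citymap contains only 11,600 documents, the whole run takes about 3000 seconds! Yet when I ran the same program this morning, it took about 800 seconds. I don't know why this happens. Any ideas for improving efficiency?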

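As you said, the load differs at different times of day, so first check that nobody else is hitting the MongoDB server (load tests and the like) while your program runs. Beyond that, make sure the collection does not carry too many indexes, and use MongoDB's built-in profiler to determine which query is slow.

The first query reads all the data. The second query starts with $exists, which is the worst way to start a query: that meta information is not part of the index, and the condition has extremely low selectivity. In fact, if you need $exists at all, you may want to review your data model; at the very least, move it to the end of the query. Use the profiler and .explain() to confirm where the time goes.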
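A minimal sketch of the suggestions above, assuming PyMongo and the citymap collection from the question. The helper names (cell_query, find_unique_cell, explain_cell_query, enable_profiler, slowest_operations) are hypothetical and only illustrate the idea. The filter puts the indexed $geoIntersects condition first and $exists last, as the answer recommends; whether the order actually changes the plan is something .explain() should confirm. One further point that is not from the answers: in PyMongo, every cursor.count() and every cursor[0] is sent to the server as a separate request, so the loop in the question may issue up to six requests per trip. The sketch fetches the matching cells once, with limit(2), which is enough to tell "exactly one match" from "none or several".

```python
def cell_query(flag_field, coord):
    """Build the citymap filter: indexed geo condition first, $exists last.

    flag_field is "trips_orig" or "trips_dest"; coord is a GeoJSON
    [longitude, latitude] pair taken from the trip document.
    """
    return {
        "cell_latlng": {
            "$geoIntersects": {
                "$geometry": {"type": "Point", "coordinates": coord}
            }
        },
        flag_field: {"$exists": True},
    }


def find_unique_cell(collection, flag_field, coord):
    """Return the only matching cell, or None if zero or several cells match.

    The cursor is consumed once, so this costs one query instead of a
    count() followed by repeated cursor[0] lookups.
    """
    docs = list(collection.find(cell_query(flag_field, coord)).limit(2))
    return docs[0] if len(docs) == 1 else None


def explain_cell_query(collection, flag_field, coord):
    """Return the query plan so index usage can be inspected."""
    return collection.find(cell_query(flag_field, coord)).explain()


def enable_profiler(db, slow_ms=100):
    """Turn on profiling level 1 (slow operations only) for this database."""
    return db.command("profile", 1, slowms=slow_ms)


def slowest_operations(db, n=5):
    """Return the n slowest operations recorded by the profiler."""
    return list(db["system.profile"].find().sort("millis", -1).limit(n))


# Usage, assuming a reachable server and an installed pymongo package
# (database and collection names are taken from the index listing):
#   from pymongo import MongoClient
#   db = MongoClient()["CitySeg2014"]
#   citymap = db["grid750"]
#   enable_profiler(db)
#   cell_start = find_unique_cell(citymap, "trips_orig", orig_coord)
#   cell_end = find_unique_cell(citymap, "trips_dest", dest_coord)
#   if cell_start and cell_end and cell_start["big_cell8"]["POI"]:
#       ...
```

In the plan returned by explain_cell_query, the thing to look for is whether the cell_latlng_2dsphere index is used rather than a full collection scan.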