Python - Speeding up reverse geocoding

I am currently performing reverse geocoding as follows:

import json
import time

from shapely.geometry import shape, Point

with open('districts.json') as f:
    districts = json.load(f)
# file also kept at https://raw.githubusercontent.com/Thevesh/Display/master/districts.json

def reverse_geocode(lon, lat):
    point = Point(lon, lat) # lon/lat
    # Every call rebuilds each polygon and scans all features in order
    for feature in districts['features']:
        polygon = shape(feature['geometry'])
        if polygon.contains(point):
            return [feature['properties']['ADM1_EN'], feature['properties']['ADM2_EN']]
    return ['','']

start_time = time.time()
for i in range(1000):
    test = reverse_geocode(103, 3)
print('----- Code ran in ' + "{:.3f}".format(time.time() - start_time) + ' seconds -----')
This takes about 13 seconds to reverse geocode 1,000 points, which is fine.

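However, I need to reverse geocode 10 million coordinate pairs for one task. Assuming linear complexity, that would take about 130,000 seconds (roughly 1.5 days), which is not acceptable.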

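The obvious inefficiency in this algorithm is that it iterates over the entire set of polygons every time it classifies a point, which is a huge waste of time.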


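How can I improve this code? To process 10 million pairs in a time that is acceptable for the task, I need to handle 1,000 pairs in about 1 second.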

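I implemented this algorithm using parallelism.

If you can, let me know whether it works for you. Keep in mind that this is an amateur implementation and will need tuning.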

import concurrent.futures
import json
import time

from shapely.geometry import shape, Point

with open('districts.json') as f:
    districts = json.load(f)

def reverse_geocode(lon: float, lat: float) -> list:
    point = Point(lon, lat) # lon/lat
    for feature in districts['features']:
        polygon = shape(feature['geometry'])
        if polygon.contains(point):
            return [feature['properties']['ADM1_EN'], feature['properties']['ADM2_EN']]
    return ['','']

if __name__ == '__main__':
    time_start = time.time()

    with concurrent.futures.ProcessPoolExecutor() as executor:
        # Submit one task per point and keep the futures so the results can be collected
        futures = [executor.submit(reverse_geocode, 103, 3) for _ in range(1000)]
        results = [future.result() for future in futures]

    time_end = time.time()
    print(f'\nDone in {round(time_end - time_start, 2)} seconds')

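The first obvious solution is to run it with multiprocessing. Have you tried that?

That cut the time in half, which is great, but I think the fundamental problem is still that we iterate over the entire list. I'll try an approach that sorts both the list to be searched and the features being searched.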
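
Apart from the full scan, the posted code also calls shape() on every feature for every point, so each geometry is rebuilt on every lookup. The following is only a minimal sketch of one way to reduce that cost, not a tuned solution: it builds each Shapely geometry once, wraps it with shapely.prepared.prep, and rejects most districts with a cheap bounding-box comparison before the exact contains() test. The helper name build_index and the two-square demo dataset are made up for illustration. The real districts.json is assumed to have the same 'features' / 'geometry' / 'properties' layout that the question's code reads.

```python
from shapely.geometry import Point, shape
from shapely.prepared import prep


def build_index(districts):
    """Convert every GeoJSON feature to a Shapely geometry exactly once."""
    index = []
    for feature in districts['features']:
        geom = shape(feature['geometry'])
        props = feature['properties']
        # geom.bounds is (minx, miny, maxx, maxy); prep() speeds up repeated contains() calls
        index.append((geom.bounds, prep(geom), props['ADM1_EN'], props['ADM2_EN']))
    return index


def reverse_geocode(index, lon, lat):
    point = Point(lon, lat)  # lon/lat
    for (minx, miny, maxx, maxy), prepared, adm1, adm2 in index:
        # Cheap bounding-box rejection first, exact point-in-polygon test only if it passes
        if minx <= lon <= maxx and miny <= lat <= maxy and prepared.contains(point):
            return [adm1, adm2]
    return ['', '']


# Hypothetical stand-in for districts.json: two adjacent unit squares
demo = {
    'features': [
        {'geometry': {'type': 'Polygon',
                      'coordinates': [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]},
         'properties': {'ADM1_EN': 'State A', 'ADM2_EN': 'District A'}},
        {'geometry': {'type': 'Polygon',
                      'coordinates': [[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]]},
         'properties': {'ADM1_EN': 'State A', 'ADM2_EN': 'District B'}},
    ]
}

index = build_index(demo)
print(reverse_geocode(index, 1.5, 0.5))  # ['State A', 'District B']
```

With the real file, build_index(districts) would be called once before the loop over the coordinate pairs. The lookup is still linear in the number of districts, but each step is only a few float comparisons for most of them. A spatial index such as an R-tree could cut the candidate list further. I have not measured how much either change helps on this dataset.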