Python: speeding up reverse geocoding
I am currently performing reverse geocoding as follows:
import json
import time

from shapely.geometry import shape, Point

with open('districts.json') as f:
    districts = json.load(f)
# file also kept at https://raw.githubusercontent.com/Thevesh/Display/master/districts.json

def reverse_geocode(lon, lat):
    point = Point(lon, lat)  # lon/lat
    for feature in districts['features']:
        polygon = shape(feature['geometry'])
        if polygon.contains(point):
            return [feature['properties']['ADM1_EN'], feature['properties']['ADM2_EN']]
    return ['', '']

start_time = time.time()
for i in range(1000):
    test = reverse_geocode(103, 3)
print('----- Code ran in ' + "{:.3f}".format(time.time() - start_time) + ' seconds -----')
This takes about 13 seconds to reverse geocode 1,000 points, which is fine.
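However, I need to reverse geocode 10 million coordinate pairs for a task, which means that, assuming linear complexity, this would take 130k seconds (about 1.5 days). Not good.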
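For reference, the estimate works out as follows. This is only a back-of-the-envelope check that assumes the 13-second timing scales linearly; the variable names are made up for illustration.

```python
# Figures from the timing above: 1,000 lookups in about 13 seconds.
seconds_per_1k = 13
total_points = 10_000_000

# Assumption: cost grows linearly with the number of points.
estimated_seconds = total_points / 1_000 * seconds_per_1k
estimated_days = estimated_seconds / 86_400  # 86,400 seconds per day

print(f"{estimated_seconds:,.0f} s = {estimated_days:.1f} days")
```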
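The obvious inefficiency of this algorithm is that it iterates over the entire set of polygons every time it classifies a point, which is a huge waste of time.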
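To make that inefficiency concrete, here is a minimal sketch of one common remedy: build the per-district data once at load time and reject most districts with a cheap bounding-box check before running the exact test. It is a pure-Python illustration, not the code from this post. The toy `DISTRICTS` data, the `INDEX` list and the hand-rolled `point_in_ring` (standing in for `polygon.contains`) are all assumptions made for the example.

```python
def bounding_box(ring):
    """Return (min_lon, min_lat, max_lon, max_lat) for a list of (lon, lat) pairs."""
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return min(lons), min(lats), max(lons), max(lats)


def point_in_ring(lon, lat, ring):
    """Ray-casting test: count edge crossings of a ray going east from the point."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical toy data: two square "districts" given as simple rings.
DISTRICTS = {
    "West": [(100.0, 0.0), (102.0, 0.0), (102.0, 5.0), (100.0, 5.0)],
    "East": [(102.0, 0.0), (104.0, 0.0), (104.0, 5.0), (102.0, 5.0)],
}

# Built once at load time, not once per lookup.
INDEX = [(name, bounding_box(ring), ring) for name, ring in DISTRICTS.items()]


def reverse_geocode(lon, lat):
    for name, (min_lon, min_lat, max_lon, max_lat), ring in INDEX:
        # Cheap rejection: skip the exact test when the point is outside the box.
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            continue
        if point_in_ring(lon, lat, ring):
            return name
    return ""


print(reverse_geocode(103, 3))
```

With shapely the same structure should carry over: call `shape(feature['geometry'])` once per feature when the file is loaded, keep `polygon.bounds` alongside each polygon, and only call `polygon.contains(point)` for districts whose box contains the point.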
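How can I improve this code? To process 10 million pairs in a time acceptable for the task, I need to handle 1k pairs in about 1 second.
I implemented this algorithm using parallelism. If possible, let me know whether it was useful to you. Keep in mind that this is an amateur implementation and will need tuning.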
import concurrent.futures
import json
import time

from shapely.geometry import shape, Point

with open('districts.json') as f:
    districts = json.load(f)

def reverse_geocode(lon: float, lat: float) -> list:
    point = Point(lon, lat)  # lon/lat
    for feature in districts['features']:
        polygon = shape(feature['geometry'])
        if polygon.contains(point):
            return [feature['properties']['ADM1_EN'], feature['properties']['ADM2_EN']]
    return ['', '']

if __name__ == '__main__':
    time_start = time.time()
    # Each call runs in a worker process; leaving the with-block waits for all of them.
    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = [executor.submit(reverse_geocode, 103, 3) for _ in range(1000)]
    # Collect the results, which the submitted calls would otherwise discard.
    results = [future.result() for future in futures]
    time_end = time.time()
    print(f'\nFinished in {round(time_end - time_start, 2)} seconds')
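The first obvious solution is to run it with multiprocessing. Have you tried that?
This cut the time in half, which is great, but I think the fundamental problem remains that we are still iterating over the entire list. I will try an approach that sorts both the list being searched and the features being searched for.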
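One possible reading of the sorting idea, sketched under assumptions: keep the districts' bounding boxes sorted by their western edge, then use binary search to discard every box that starts east of the point. The `BOXES` values and the `candidates` helper are hypothetical and not taken from the post. The exact `polygon.contains` test would then run only on the returned candidates.

```python
from bisect import bisect_right

# Hypothetical precomputed boxes: (min_lon, min_lat, max_lon, max_lat, name),
# sorted by min_lon because tuples compare on their first element first.
BOXES = sorted([
    (102.0, 0.0, 104.0, 5.0, "East"),
    (100.0, 0.0, 102.0, 5.0, "West"),
    (104.0, 0.0, 106.0, 5.0, "Far East"),
])
MIN_LONS = [box[0] for box in BOXES]


def candidates(lon, lat):
    """Names of districts whose bounding box contains the point."""
    # Every box from this position onward starts east of the point.
    stop = bisect_right(MIN_LONS, lon)
    return [
        name
        for _, min_lat, max_lon, max_lat, name in BOXES[:stop]
        if lon <= max_lon and min_lat <= lat <= max_lat
    ]


print(candidates(103, 3))
```

Sorting on one axis only prunes in that direction, so the remaining boxes are still scanned linearly. A two-dimensional spatial index generalizes the idea, and shapely's `STRtree` is a ready-made index of that kind that may be worth trying.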