Python: keeping identifiers when finding the distance between lat and long pairs

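I have two sets of lat and long values that I want to combine with a Cartesian join, and then find the distance between each pair. The number or other_number values can repeat, i.e. each identifier can have two locations/addresses.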

import pandas as pd

d = {'number': ['100', '101'], 'lat': ['40.6892', '41.8902'], 'long': ['74.0445', '12.4922']}
d2 = {'other_number': ['200', '201'], 'lat': ['37.8199', '43.8791'], 'long': ['122.4783', '103.4591']}
data = pd.DataFrame(data=d)
data2 = pd.DataFrame(data=d2)
I am currently converting the lat/long fields into lists of tuples:

tuple_list_1 = list(zip(data.lat.astype(float), data.long.astype(float)))
tuple_list_2 = list(zip(data2.lat.astype(float), data2.long.astype(float)))
...then performing the Cartesian join with a generator:

gen = ([x, y] for x in tuple_list_1 for y in tuple_list_2)
Finally, I find the distances with a simple loop:

from geopy.distance import geodesic

for u, v in gen:
    dist = geodesic(u, v).miles
    print(dist)
Ultimately, I would like to tie each distance back to the original information (i.e. number and other_number). This is my desired result:

d3 = {'number': ['100', '100', '100', '100'],
      'address': ['Statue of Liberty', 'Statue of Liberty', 'Colosseum', 'Colosseum'],
      'other_number': ['200', '200', '201', '201'],
      'other_address': ['Golden Gate Bridge', 'Mount Rushmore', 'Golden Gate Bridge', 'Mount Rushmore'],
      'distance': [2572.262967759492, 1515.3455804766047, 5400.249562015358, 4365.4386483486205]
      }
data3 = pd.DataFrame(data=d3)
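I am wondering how to retrieve the distances efficiently (looping over a generator is probably not that efficient) and how to bind the results to the identifying fields in a final DataFrame.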
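
To make the goal concrete, here is a minimal sketch of a pairing that carries the identifiers along with the coordinates, using only the standard library. The haversine formula and the mean Earth radius below are assumptions of this sketch, swapped in for geopy's geodesic so the example has no dependencies; its distances are a spherical approximation and differ slightly from the geodesic values.

```python
import math
from itertools import product

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


def haversine_miles(p, q):
    """Great-circle distance in miles between two (lat, long) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Same sample values as above, kept as (identifier, lat, long) records.
left = [("100", 40.6892, 74.0445), ("101", 41.8902, 12.4922)]
right = [("200", 37.8199, 122.4783), ("201", 43.8791, 103.4591)]

# product() yields every (left, right) pairing, so the identifiers travel
# with the coordinates instead of being dropped before the join.
rows = [
    {
        "number": n,
        "other_number": m,
        "distance": haversine_miles((lat1, lon1), (lat2, lon2)),
    }
    for (n, lat1, lon1), (m, lat2, lon2) in product(left, right)
]

for row in rows:
    print(row)
```

Each dictionary in rows holds both identifiers next to the distance, so the list can be passed straight to pd.DataFrame if a DataFrame is wanted.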

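import pandas as pd

d = {'number': ['100', '101'], 'lat': ['40.6892', '41.8902'], 'long': ['74.0445', '12.4922']}
d2 = {'other_number': ['200', '201'], 'lat': ['37.8199', '43.8791'], 'long': ['122.4783', '103.4591']}
data = pd.DataFrame(data=d)
data2 = pd.DataFrame(data=d2)
Perform the Cartesian product:

# A constant key column makes every row of data match every row of data2
data['key'] = 0
data2['key'] = 0
df = pd.merge(data, data2, on='key', how='outer')
df = df.drop('key', axis=1)
Compute the distance:

from geopy.distance import geodesic

df['distance'] = df.apply(lambda row: geodesic((row['lat_x'], row['long_x']), (row['lat_y'], row['long_y'])).miles, axis=1)
df will then look like this: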

  number    lat_x   long_x other_number    lat_y    long_y     distance
0    100  40.6892  74.0445          200  37.8199  122.4783  2572.262968
1    100  40.6892  74.0445          201  43.8791  103.4591  1515.345580
2    101  41.8902  12.4922          200  37.8199  122.4783  5400.249562
3    101  41.8902  12.4922          201  43.8791  103.4591  4365.438648
If you would rather not go through a new key column, there are other ways to perform a Cartesian product in pandas; see
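
One such alternative, assuming pandas 1.2 or later is available, is the how="cross" option of merge, which builds the Cartesian product without a helper column. This is a sketch of that option using the sample data from the question, not necessarily the method the truncated reference above pointed to.

```python
import pandas as pd

data = pd.DataFrame({"number": ["100", "101"],
                     "lat": ["40.6892", "41.8902"],
                     "long": ["74.0445", "12.4922"]})
data2 = pd.DataFrame({"other_number": ["200", "201"],
                      "lat": ["37.8199", "43.8791"],
                      "long": ["122.4783", "103.4591"]})

# how="cross" (pandas 1.2+) pairs every row of data with every row of data2;
# the shared lat/long column names receive the default _x/_y suffixes.
df = data.merge(data2, how="cross")
print(df)
```

The resulting columns match the key-column version, so the same df.apply call with geodesic can be used afterwards to add the distance column.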