Python: recursively find the distance between points in a group
I can apply vincenty from geopy to a dataframe in pandas and determine the distance between two consecutive machines. However, I want to find the distance between all machines in a group, without repeats.

For example, if I group by company name and 3 machines are associated with that company, I want the distances between machines 1 and 2, 1 and 3, and 2 and 3, but not between (2 and 1) or (3 and 1), since those are symmetric (same result).

This only finds the distance between consecutive machines within a group. How can I expand on it to find the distances between all machines in a group?

This code returns:
  co_nm  lat  lon  ser_no      dist
0    aa    1   21       1       NaN
1    aa    2   22       2  97.47832
2    aa    3   23       3  97.44923
3    bb    4   24       4       NaN
4    bb    5   25       5  97.34752
5    bb    6   26       6  97.27497
6    bb    7   27       7  97.18804
7    cc    8   28       8       NaN
8    cc    9   29       9  96.97129
9    cc   10   30       0  96.84163
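To make the gap concrete, here is a small sketch (standard library only, using the ser_no values of group bb from the table above) that contrasts the consecutive pairs a shift-style calculation covers with the unordered pairs the question asks for. The variable names are illustrative and not from the original code.

```python
from itertools import combinations

serials = [4, 5, 6, 7]  # ser_no values of group bb in the sample data

# What a consecutive-row approach covers: each machine with its predecessor.
consecutive = list(zip(serials, serials[1:]))

# What the question asks for: every unordered pair, each counted once.
all_pairs = list(combinations(serials, 2))

print(consecutive)  # [(4, 5), (5, 6), (6, 7)]
print(all_pairs)    # [(4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
```

A group of n machines has n - 1 consecutive pairs but n * (n - 1) / 2 unordered pairs, so the difference grows quickly with group size.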
Edit:
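The desired output would find the unique combinations of distances between the machines associated with a company; that is, for co_nm aa we would have the distances between ser_no pairs (1,2), (1,3) and (2,3), and likewise for the machines in co_nm bb and cc, but we would not determine the distance between machines in different co_nm groups.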
Does this make sense?

Update 2, using a function:
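from itertools import combinations

import pandas as pd
from geopy.distance import vincenty  # removed in geopy 2.0; geodesic is its replacement

def calc_dist(df):
    # One row per unordered pair of machines within each co_nm group
    return pd.DataFrame(
        [ [grp,
           df.loc[c[0]].ser_no,
           df.loc[c[1]].ser_no,
           vincenty(df.loc[c[0], ['lat','lon']], df.loc[c[1], ['lat','lon']])
          ]
          # .groups maps each co_nm to the row labels that belong to it
          for grp,lst in df.groupby('co_nm').groups.items()
          for c in combinations(lst, 2)
        ],
        columns=['co_nm','machineA','machineB','distance'])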
In [27]: calc_dist(df)
Out[27]:
   co_nm  machineA  machineB               distance
0     aa         1         2  156.87614939082016 km
1     aa         1         3   313.7054454472326 km
2     aa         2         3    156.829329105069 km
3     cc         8         9  156.06016539095216 km
4     cc         8         0   311.9109981692541 km
5     cc         9         0  155.85149813446617 km
6     bb         4         5  156.66564183673603 km
7     bb         4         6   313.2143330250297 km
8     bb         4         7   469.6225353388079 km
9     bb         5         6  156.54889741438788 km
10    bb         5         7  312.95759746593706 km
11    bb         6         7   156.4089967703544 km
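Since vincenty is no longer available in geopy 2.0 and later (geodesic replaced it), here is a hedged sketch of the same per-group pairing that does not depend on geopy at all. It swaps in a plain haversine great-circle formula, so the numbers will differ slightly from the ellipsoidal vincenty results shown above. The names haversine_km and pairwise_dist, and the mean Earth radius used, are assumptions made for this illustration and are not part of the original answer.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

import pandas as pd


def haversine_km(p, q, radius_km=6371.0088):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * asin(sqrt(h))


def pairwise_dist(df, dist=haversine_km):
    """Return one row per unordered pair of machines within each co_nm group."""
    rows = []
    for grp, sub in df.groupby("co_nm"):
        # Materialise each group once instead of calling df.loc for every pair.
        machines = list(
            sub[["ser_no", "lat", "lon"]].itertuples(index=False, name=None)
        )
        for (sa, lat_a, lon_a), (sb, lat_b, lon_b) in combinations(machines, 2):
            rows.append([grp, sa, sb, dist((lat_a, lon_a), (lat_b, lon_b))])
    return pd.DataFrame(
        rows, columns=["co_nm", "machineA", "machineB", "distance"]
    )


# Sample data matching the question's table.
df = pd.DataFrame({
    "co_nm": ["aa"] * 3 + ["bb"] * 4 + ["cc"] * 3,
    "lat": list(range(1, 11)),
    "lon": list(range(21, 31)),
    "ser_no": [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
})
result = pairwise_dist(df)
print(result)
```

Iterating over the groupby object yields groups in sorted key order (aa, bb, cc), so the rows may be ordered differently from the output above. With geopy installed, passing something like dist=lambda p, q: geodesic(p, q).km should give ellipsoidal distances instead.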
Update:
In [9]: dist = pd.DataFrame(
   ...:     [ [grp,
   ...:        df.loc[c[0]].ser_no,
   ...:        df.loc[c[1]].ser_no,
   ...:        vincenty(df.loc[c[0], ['lat','lon']], df.loc[c[1], ['lat','lon']])
   ...:       ]
   ...:       for grp,lst in df.groupby('co_nm').groups.items()
   ...:       for c in combinations(lst, 2)
   ...:     ],
   ...:     columns=['co_nm','machineA','machineB','distance'])
In [10]: dist
Out[10]:
   co_nm  machineA  machineB               distance
0     aa         1         2  156.87614939082016 km
1     aa         1         3   313.7054454472326 km
2     aa         2         3    156.829329105069 km
3     cc         8         9  156.06016539095216 km
4     cc         8         0   311.9109981692541 km
5     cc         9         0  155.85149813446617 km
6     bb         4         5  156.66564183673603 km
7     bb         4         6   313.2143330250297 km
8     bb         4         7   469.6225353388079 km
9     bb         5         6  156.54889741438788 km
10    bb         5         7  312.95759746593706 km
11    bb         6         7   156.4089967703544 km
Explanation: the combinations part
In [11]: [c
   ....:  for grp,lst in df.groupby('co_nm').groups.items()
   ....:  for c in combinations(lst, 2)]
Out[11]:
[(0, 1),
(0, 2),
(1, 2),
(7, 8),
(7, 9),
(8, 9),
(3, 4),
(3, 5),
(3, 6),
(4, 5),
(4, 6),
(5, 6)]
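The same nested comprehension can be tried without pandas. The sketch below hard-codes a dictionary shaped like the one df.groupby('co_nm').groups returns for the sample data (group name mapped to row labels); that hard-coded dictionary is an assumption made so the snippet runs on its own.

```python
from itertools import combinations

# Row labels per group, mirroring df.groupby('co_nm').groups for the
# sample data (hard-coded here so the snippet runs without pandas).
groups = {"aa": [0, 1, 2], "bb": [3, 4, 5, 6], "cc": [7, 8, 9]}

pairs = [
    (grp, a, b)
    for grp, labels in groups.items()
    for a, b in combinations(labels, 2)
]

for row in pairs:
    print(row)

# A group of n rows yields n * (n - 1) / 2 pairs: 3 + 6 + 3 = 12 here.
print(len(pairs))
```

Because combinations never emits a pair in reversed order and the outer loop never mixes labels from different groups, both requirements from the question (no symmetric duplicates, no cross-company distances) fall out of the iteration itself.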
Old answer:
In [3]: from itertools import combinations
In [4]: import pandas as pd
In [5]: from geopy.distance import vincenty
In [6]: df = pd.DataFrame({'machine': [1,2,3], 'lat': [11, 12, 13], 'lon': [21,22,23]})
In [7]: df
Out[7]:
   lat  lon  machine
0   11   21        1
1   12   22        2
2   13   23        3
In [8]: dist = pd.DataFrame(
   ...:     [ [df.loc[c[0]].machine,
   ...:        df.loc[c[1]].machine,
   ...:        vincenty(df.loc[c[0], ['lat','lon']], df.loc[c[1], ['lat','lon']])
   ...:       ]
   ...:       for c in combinations(df.index, 2)
   ...:     ],
   ...:     columns=['machineA','machineB','distance'])
In [9]: dist
Out[9]:
   machineA  machineB               distance
0         1         2   155.3664523771998 km
1         1         3   310.4557192973811 km
2         2         3  155.09044419651156 km
Can you post the desired output / resulting DF?
OK, I'll take a look.