Python: distance between consecutive rows from latitude and longitude

In a DataFrame in Python 2.7, I have the following:

Ser_Numb        LAT      LONG
       1  74.166061 30.512811
       2  72.249672 33.427724
       3  67.499828 37.937264
       4  84.253715 69.328767
       5  72.104828 33.823462
       6  63.989462 51.918173
       7  80.209112 33.530778
       8  68.954132 35.981256
       9  83.378214 40.619652
      10  68.778571  6.607066
I want to compute the distance between consecutive rows of the DataFrame. The output should look like this:

Ser_Numb          LAT        LONG   Distance
       1    74.166061   30.512811          0
       2    72.249672   33.427724          d_between_Ser_Numb2 and Ser_Numb1
       3    67.499828   37.937264          d_between_Ser_Numb3 and Ser_Numb2
       4    84.253715   69.328767          d_between_Ser_Numb4 and Ser_Numb3
       5    72.104828   33.823462          d_between_Ser_Numb5 and Ser_Numb4
       6    63.989462   51.918173          d_between_Ser_Numb6 and Ser_Numb5
       7    80.209112   33.530778   .
       8    68.954132   35.981256   .
       9    83.378214   40.619652   .
       10   68.778571   6.607066    .
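Attempt

An existing question looks somewhat similar, but it computes the distance between fixed points. I need the distance between consecutive points.

I tried adapting it as follows: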

df['LAT_rad'], df['LON_rad'] = np.radians(df['LAT']), np.radians(df['LONG'])
df['dLON'] = df['LON_rad'] - np.radians(df['LON_rad'].shift(1))
df['dLAT'] = df['LAT_rad'] - np.radians(df['LAT_rad'].shift(1))
df['distance'] = 6367 * 2 * np.arcsin(np.sqrt(np.sin(df['dLAT']/2)**2 + math.cos(df['LAT_rad'].astype(float).shift(-1)) * np.cos(df['LAT_rad']) * np.sin(df['dLON']/2)**2))
However, I get the following error:

Traceback (most recent call last):
  File "C:\Python27\test.py", line 115, in <module>
    df['distance'] = 6367 * 2 * np.arcsin(np.sqrt(np.sin(df['dLAT']/2)**2 + math.cos(df['LAT_rad'].astype(float).shift(-1)) * np.cos(df['LAT_rad']) * np.sin(df['dLON']/2)**2))
  File "C:\Python27\lib\site-packages\pandas\core\series.py", line 78, in wrapper
    "{0}".format(str(converter)))
TypeError: cannot convert the series to <type 'float'>
[Finished in 2.3s with exit code 1]
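According to one reference, Latitude1 = 74.166061, Longitude1 = 30.512811, Latitude2 = 72.249672, Longitude2 = 33.427724 gives 233 km. With the haversine function, called as print haversine(30.512811, 74.166061, 33.427724, 72.249672), I get 232.55 km. The answer should be about 233 km, but my approach gives roughly 8000 km. I think something is wrong with the way I am trying to iterate over consecutive rows.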
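
The figure quoted above for the first two rows can be checked with a scalar haversine. This is a minimal sketch, not the function the asker used: the name haversine_km and its (lat, lon) argument order are assumptions made for illustration, and the radius of 6367 km is taken from the attempt in the question.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, earth_radius=6367.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    # Hypothetical helper: arguments are (lat, lon) pairs, not (lon, lat)
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * math.asin(math.sqrt(a))

# Rows 1 and 2 of the table, as (LAT, LONG)
d = haversine_km(74.166061, 30.512811, 72.249672, 33.427724)
print(round(d, 2))
```

The result should fall between 232 and 233 km. The exact value scales with the radius passed in, so 6367 and 6371 give slightly different numbers.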

Question: Is there a way to do this in pandas? Or do I need to loop over the DataFrame one row at a time?

Additional info:

To create the above DF, select it and copy it to the clipboard. Then:

import pandas as pd
df = pd.read_clipboard()
print df
You can use the following solution (don't forget to upvote the original ;-):

# vectorized haversine function
def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """
    slightly modified version of http://stackoverflow.com/a/29546836/2901002

    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees or in radians)

    All (lat, lon) coordinates must have numeric dtypes and be of equal length.

    """
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return earth_radius * 2 * np.arcsin(np.sqrt(a))


df['dist'] = \
    haversine(df.LAT.shift(), df.LONG.shift(),
                 df.loc[1:, 'LAT'], df.loc[1:, 'LONG'])
Result:

In [566]: df
Out[566]:
   Ser_Numb        LAT       LONG         dist
0         1  74.166061  30.512811          NaN
1         2  72.249672  33.427724   232.549785
2         3  67.499828  37.937264   554.905446
3         4  84.253715  69.328767  1981.896491
4         5  72.104828  33.823462  1513.397997
5         6  63.989462  51.918173  1164.481327
6         7  80.209112  33.530778  1887.256899
7         8  68.954132  35.981256  1252.531365
8         9  83.378214  40.619652  1606.340727
9        10  68.778571   6.607066  1793.921854
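
The layout requested in the question shows 0 for the first row, whereas a shift-based computation leaves NaN there because row 0 has no predecessor. A minimal sketch of one way to reconcile the two, using three values copied from the table above (the names dist and dist_filled are illustrative only):

```python
import numpy as np
import pandas as pd

# First three distances from the result table; the leading NaN comes from
# the first row having no previous row to pair with.
dist = pd.Series([np.nan, 232.549785, 554.905446])

# Replace NaN with 0 to match the layout requested in the question
dist_filled = dist.fillna(0)
print(dist_filled.tolist())  # [0.0, 232.549785, 554.905446]
```

Note that fillna(0) replaces every NaN, so a row with a missing coordinate would also end up as 0. If that matters, assigning only the first element may be the safer choice.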
UPDATE: this will help to understand the logic:

In [573]: pd.concat([df['LAT'].shift(), df.loc[1:, 'LAT']], axis=1, ignore_index=True)
Out[573]:
           0          1
0        NaN        NaN
1  74.166061  72.249672
2  72.249672  67.499828
3  67.499828  84.253715
4  84.253715  72.104828
5  72.104828  63.989462
6  63.989462  80.209112
7  80.209112  68.954132
8  68.954132  83.378214
9  83.378214  68.778571
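
The pairing shown above (previous row in column 0, current row in column 1) can also be fed to a variant of the function that converts each argument separately. This is a sketch under assumptions rather than the answer's exact code: it uses map(np.radians, ...) instead of np.radians([...]), since, as far as I can tell, stacking Series of different lengths into one list may fail on recent NumPy versions. It also passes the full LAT and LONG columns and relies on index alignment to produce NaN in row 0.

```python
import numpy as np
import pandas as pd

def haversine(lat1, lon1, lat2, lon2, earth_radius=6371):
    # Convert each argument on its own, so inputs of different lengths
    # are never stacked into a single array.
    lat1, lon1, lat2, lon2 = map(np.radians, [lat1, lon1, lat2, lon2])
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

# First three rows of the question's data
df = pd.DataFrame({
    'Ser_Numb': [1, 2, 3],
    'LAT': [74.166061, 72.249672, 67.499828],
    'LONG': [30.512811, 33.427724, 37.937264],
})

# Previous row (shifted) against the current row; row 0 has no predecessor
df['dist'] = haversine(df['LAT'].shift(), df['LONG'].shift(),
                       df['LAT'], df['LONG'])
print(df)
```

The same call should also work on a two-row frame, because no ragged array is ever built. Exact distances scale with earth_radius: the question's attempt uses 6367, while the default here is 6371.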

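Try replacing math.cos -> np.cos

I still get the TypeError: cannot convert the series to <type 'float'> [Finished in 2.3s with exit code 1]. Also, I seem to have a problem with the logic. (A) Why is .shift() used without any argument? (B) Is there a reason for using df.ix[1:, 'LONG'], starting from the second row? Why not use df.ix[:, 'LONG'] and try to correct for this with the shift?

@WR, shift() == shift(1); 1 is the default. Check the UPDATE: it shows the pairs of arguments that will be passed to the function...

Thanks. OK, the code works and I no longer get that TypeError. Also, thanks for the update in your answer. It was helpful. The problem I had was understanding how to combine the shifted values with the original values. Thanks for the explanation.

@WR, sure, glad I could help

@MaxU Thanks for your solution! Just one question: when df has only two rows, why does an error occur at the line numpy.radians([lat1, lon1, lat2, lon2])? And why does the alternative map(np.radians, [lat1, lon1, lat2, lon2]) work, and run faster too?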
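
For reference, the attempt from the question can be made to run with three small changes. This is one reading of that code, not a fix the asker confirmed: the shifted columns are already in radians, so they are not passed through np.radians a second time; math.cos is replaced by np.cos, which accepts a Series; and shift(-1) becomes shift(1), so the cosine term uses the same previous row as dLAT and dLON. The three-row frame is a shortened copy of the question's data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'LAT': [74.166061, 72.249672, 67.499828],
    'LONG': [30.512811, 33.427724, 37.937264],
})

df['LAT_rad'], df['LON_rad'] = np.radians(df['LAT']), np.radians(df['LONG'])
# The shifted columns are already in radians, so no second conversion
df['dLON'] = df['LON_rad'] - df['LON_rad'].shift(1)
df['dLAT'] = df['LAT_rad'] - df['LAT_rad'].shift(1)
# np.cos works element-wise on a Series; math.cos expects a single float.
# shift(1) pairs each row with the previous one, matching dLAT and dLON.
a = (np.sin(df['dLAT'] / 2) ** 2
     + np.cos(df['LAT_rad'].shift(1)) * np.cos(df['LAT_rad'])
     * np.sin(df['dLON'] / 2) ** 2)
df['distance'] = 6367 * 2 * np.arcsin(np.sqrt(a))
print(df[['LAT', 'LONG', 'distance']])
```

Converting the shifted values to radians twice makes them nearly zero, so dLAT ends up close to the full latitude in radians. That may account for the inflated distances mentioned in the question.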