How to loop over columns and calculate distances using lat and long in R
I have a data frame that contains the lat and long of various localities in a city. A subset of the data frame:
structure(list(Locality = c("ADYAR", "AMBATTUR", "KOLATHUR",
"AVADI", "AGARAM", "ANNA NAGAR WEST", "CHROMPET", "MADIPAKKAM",
"MOGAPPAIR", "MYLAPORE"), Transactions = c(607, 569, 498, 409,
103, 257, 303, 343, 316, 205), lon = c(80.2564957, 80.1547844,
80.2121332, 80.0969511, 80.2294222, 80.2017906, 80.1461663, 80.1960832,
80.1749627, 80.2676303), lat = c(13.0011774, 13.1143393, 13.1239583,
13.1067448, 13.1116221, 13.0861782, 12.951611, 12.9647462, 13.0837224,
13.0367914), Ambatturlon = c(80.15478, 80.15478, 80.15478, 80.15478,
80.15478, 80.15478, 80.15478, 80.15478, 80.15478, 80.15478),
Ambatturlat = c(13.11434, 13.11434, 13.11434, 13.11434, 13.11434,
13.11434, 13.11434, 13.11434, 13.11434, 13.11434), Guindylon = c(80.22064,
80.22064, 80.22064, 80.22064, 80.22064, 80.22064, 80.22064,
80.22064, 80.22064, 80.22064), Guindylat = c(13.00666, 13.00666,
13.00666, 13.00666, 13.00666, 13.00666, 13.00666, 13.00666,
13.00666, 13.00666), OMRlon = c(80.22915, 80.22915, 80.22915,
80.22915, 80.22915, 80.22915, 80.22915, 80.22915, 80.22915,
80.22915), OMRlat = c(12.91261, 12.91261, 12.91261, 12.91261,
12.91261, 12.91261, 12.91261, 12.91261, 12.91261, 12.91261
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))
>
> df
# A tibble: 10 x 10
Locality Transactions lon lat Ambatturlon Ambatturlat Guindylon Guindylat OMRlon OMRlat
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ADYAR 607 80.3 13.0 80.2 13.1 80.2 13.0 80.2 12.9
2 AMBATTUR 569 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9
3 KOLATHUR 498 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9
4 AVADI 409 80.1 13.1 80.2 13.1 80.2 13.0 80.2 12.9
5 AGARAM 103 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9
6 ANNA NAGAR WEST 257 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9
7 CHROMPET 303 80.1 13.0 80.2 13.1 80.2 13.0 80.2 12.9
8 MADIPAKKAM 343 80.2 13.0 80.2 13.1 80.2 13.0 80.2 12.9
9 MOGAPPAIR 316 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9
10 MYLAPORE 205 80.3 13.0 80.2 13.1 80.2 13.0 80.2 12.9
>
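Columns such as Ambatturlon, Ambatturlat and Guindylon hold the coordinates of places within the same city. I need to calculate the distance from each locality to each of the places given in those columns: Ambatturlon/Ambatturlat, Guindylon/Guindylat and OMRlon/OMRlat.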
I learned that the distHaversine function from the geosphere package can be used for this.
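For reference, the great-circle (haversine) formula behind that function can be sketched in a few lines of base R. This is an illustrative sketch, not geosphere's source: haversine_m is a hypothetical helper name, the inputs are assumed to be in decimal degrees, and the radius of 6378137 m is assumed to match the default used by distHaversine.

```r
# Hypothetical helper: haversine distance in metres between points given in
# decimal degrees. Vectorised over all four coordinate arguments.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * r * asin(pmin(1, sqrt(a)))
}

# ADYAR (lon, lat) to the Ambattur reference point from the data above
haversine_m(80.2564957, 13.0011774, 80.15478, 13.11434)
```

Under these assumptions the result should be close to the value distHaversine reports for the same pair of points.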
I tried it for the first place with the code below:
> df %>%
+ rowwise() %>%
+ mutate(disttoAmbattur = distHaversine(c(lon, lat), c(Ambatturlon, Ambatturlat)))
Source: local data frame [10 x 11]
Groups: <by row>
# A tibble: 10 x 11
Locality Transactions lon lat Ambatturlon Ambatturlat Guindylon Guindylat OMRlon OMRlat disttoAmbattur
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ADYAR 607 80.3 13.0 80.2 13.1 80.2 13.0 80.2 12.9 16744.
2 AMBATTUR 569 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9 0.483
3 KOLATHUR 498 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9 6309.
4 AVADI 409 80.1 13.1 80.2 13.1 80.2 13.0 80.2 12.9 6326.
5 AGARAM 103 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9 8098.
6 ANNA NAGAR WEST 257 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9 5984.
7 CHROMPET 303 80.1 13.0 80.2 13.1 80.2 13.0 80.2 12.9 18139.
8 MADIPAKKAM 343 80.2 13.0 80.2 13.1 80.2 13.0 80.2 12.9 17245.
9 MOGAPPAIR 316 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9 4050.
10 MYLAPORE 205 80.3 13.0 80.2 13.1 80.2 13.0 80.2 12.9 14975.
>
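I could do the same manually for each place, but there are many such place columns. Can someone tell me whether I can loop over the other places and add a new column like disttoAmbattur for every lat/long pair among the place columns?

We can collect all the lat and lon column names into two vectors and pass them in parallel with map2. For each pair, compute distHaversine and add the result as a new column to the original data frame.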
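library(dplyr)
library(purrr)

# Names of the place columns. ".+" requires a prefix and "$" anchors the
# suffix, so the plain lon/lat columns are not matched.
lon_col <- grep('.+lon$', names(df), value = TRUE)
lat_col <- grep('.+lat$', names(df), value = TRUE)

df %>%
  bind_cols(map2_dfc(lon_col, lat_col, ~{
    # e.g. "Ambatturlon" -> "distAmbattur"
    newcol <- paste0('dist', sub('lon$', '', .x))
    df %>%
      rowwise() %>%
      transmute(!!newcol := geosphere::distHaversine(c(lon, lat),
                                  c(.data[[.x]], .data[[.y]])))
  }))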
# A tibble: 10 x 13
# Locality Transactions lon lat Ambatturlon Ambatturlat Guindylon Guindylat OMRlon OMRlat distAmbattur distGuindy distOMR
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 ADYAR 607 80.3 13.0 80.2 13.1 80.2 13.0 80.2 12.9 16744. 3937. 10296.
# 2 AMBATTUR 569 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9 0.483 13953. 23861.
# 3 KOLATHUR 498 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9 6309. 13090. 23599.
# 4 AVADI 409 80.1 13.1 80.2 13.1 80.2 13.0 80.2 12.9 6326. 17437. 25935.
# 5 AGARAM 103 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9 8098. 11723. 22154.
# 6 ANNA NAGAR WEST 257 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9 5984. 9085. 19548.
# 7 CHROMPET 303 80.1 13.0 80.2 13.1 80.2 13.0 80.2 12.9 18139. 10140. 9995.
# 8 MADIPAKKAM 343 80.2 13.0 80.2 13.1 80.2 13.0 80.2 12.9 17245. 5373. 6823.
# 9 MOGAPPAIR 316 80.2 13.1 80.2 13.1 80.2 13.0 80.2 12.9 4050. 9906. 19934.
#10 MYLAPORE 205 80.3 13.0 80.2 13.1 80.2 13.0 80.2 12.9 14975. 6101. 14440.
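distHaversine also accepts two-column matrices (longitude first, latitude second), so the row-by-row step can be avoided. The following is a minimal sketch of that variant, assuming the place columns follow the <name>lon / <name>lat naming used above. The two-row data frame is a cut-down stand-in for df, and the loop variable names are illustrative.

```r
library(geosphere)

# Cut-down stand-in for the question's df: two localities, two reference places
df <- data.frame(
  Locality    = c("ADYAR", "AMBATTUR"),
  lon         = c(80.2564957, 80.1547844),
  lat         = c(13.0011774, 13.1143393),
  Ambatturlon = 80.15478, Ambatturlat = 13.11434,
  Guindylon   = 80.22064, Guindylat   = 13.00666
)

# Place names are the prefixes of the columns ending in "lon"
places <- sub("lon$", "", grep(".+lon$", names(df), value = TRUE))

for (p in places) {
  # One vectorised call per place: row i of the first matrix is paired
  # with row i of the second
  df[[paste0("dist", p)]] <- distHaversine(
    cbind(df$lon, df$lat),
    cbind(df[[paste0(p, "lon")]], df[[paste0(p, "lat")]])
  )
}

df[, c("Locality", paste0("dist", places))]
```

This adds the same dist<name> columns as the map2 version, in metres, with one call per place rather than one per row.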
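Ronak, I read that "!!" is used to ask R to evaluate the expression that follows it. So in transmute, does !!newcol := ask R to use the value of newcol produced by the paste0 line above? And why do we need ":=" in transmute rather than just "="?

Yes. You use ":=" when you want to add a new column whose name is stored in a variable, and with "!!" we evaluate the variable newcol. A plain "=" cannot be used here because R does not allow an expression such as !!newcol on the left-hand side of an argument name.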
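The difference between "=" and "!!" with ":=" can be seen in a small sketch. The tibble and the column name "doubled" are made up for illustration and are not part of the original data.

```r
library(dplyr)

df_small <- tibble(x = 1:3)
newcol <- "doubled"   # column name held in a variable

# "=" takes the name literally: the new column is called "newcol"
literal <- df_small %>% mutate(newcol = x * 2)

# "!!" unquotes the variable and ":=" allows that on the left-hand side:
# the new column is called "doubled"
unquoted <- df_small %>% mutate(!!newcol := x * 2)

names(literal)
names(unquoted)
```

Recent versions of rlang also accept a glue-style name on the left-hand side, such as "{newcol}" := x * 2, which reads a little more naturally when the name is built from a string.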