R: Extract house/street number from an address string
Suppose I have the following data with addresses, i.e. street names. My goal is to separate the street name from the house number.
library(tidyverse) # provides tribble(), mutate(), %>% and str_squish()

mydf <- tribble(
  ~street,
  "Some Way 10",
  "Shiny Street 12b",
  "Dark Street from Netflix Movie 17c - 17d",
  "Seasame Street",
  "Dark Alley 15c",
)
mydf <- mydf %>% mutate(street = str_squish(street)) # trim the ends and collapse repeated whitespace
mydf

Does this work:
mydf %>%
  transmute(street_name_only = str_remove(street, '\\d.*'),
            house_number = str_extract(street, '\\d.*'))
# A tibble: 5 x 2
street_name_only house_number
<chr> <chr>
1 "Some Way " 10
2 "Shiny Street " 12b
3 "Dark Street from Netflix Movie " 17c - 17d
4 "Seasame Street" NA
5 "Dark Alley " 15c
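Note that the pattern above leaves a trailing space in street_name_only. Here is a minimal sketch of one way to avoid that, assuming the house number always starts at the first digit in the string: let the removal pattern also consume any whitespace directly before that digit. The name result is used only for illustration.

```r
library(dplyr)
library(stringr)
library(tibble)

mydf <- tribble(
  ~street,
  "Some Way 10",
  "Shiny Street 12b",
  "Dark Street from Netflix Movie 17c - 17d",
  "Seasame Street",
  "Dark Alley 15c"
)

result <- mydf %>%
  transmute(
    # drop optional whitespace plus everything from the first digit onward
    street_name_only = str_remove(street, "\\s*\\d.*"),
    # keep everything from the first digit onward (NA when there is no digit)
    house_number = str_extract(street, "\\d.*")
  )

result
```

A street name that itself contains a digit (for example "5th Avenue 12") would be split at that first digit, so this is only suitable for data shaped like the sample above.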
Using tidyr::separate:
tidyr::separate(mydf, street, c("street_name_only", "house_number"),
'(?=\\d)', extra = 'merge', fill = 'right')
# street_name_only house_number
# <chr> <chr>
#1 "Some Way " 10
#2 "Shiny Street " 12b
#3 "Dark Street from Netflix Movie " 17c - 17d
#4 "Seasame Street" NA
#5 "Dark Alley " 15c
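The lookahead '(?=\\d)' splits each string at the position just before its first digit. The same idea can be sketched in base R without any packages, which may help to see what is going on. This is only an illustration under the same assumption (the number starts at the first digit); the names street, pos, has_number and split_df are hypothetical.

```r
street <- c(
  "Some Way 10",
  "Shiny Street 12b",
  "Dark Street from Netflix Movie 17c - 17d",
  "Seasame Street",
  "Dark Alley 15c"
)

# position of the first digit in each string, -1 when there is none;
# as.integer() drops the extra attributes that regexpr() attaches
pos <- as.integer(regexpr("[0-9]", street))
has_number <- pos > 0

split_df <- data.frame(
  # text before the first digit, with surrounding whitespace trimmed
  street_name_only = trimws(ifelse(has_number, substr(street, 1, pos - 1), street)),
  # text from the first digit to the end, NA when no digit was found
  house_number = ifelse(has_number, substr(street, pos, nchar(street)), NA_character_),
  stringsAsFactors = FALSE
)

split_df
```

Unlike the separate() call, this version trims the trailing space from the street name via trimws().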
Yes, it does indeed! Thank you very much! I knew the functions you used, but it did not occur to me to combine them. Thanks again.