R: How to merge on multiple variables with one of them fuzzy matched (r, join, dplyr)


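In a previous post, I got help with fuzzy matching.

Thanks to @Ronak Shah, @r2evans and @akrun for their help.

That was very helpful, and I got the fuzzy match I wanted from these two datasets.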

structure(list(ID = 1:8, Address = c("Canal and Broadway", "55 water street room number 73", 
"Mulberry street", "Front street and Fulton", "62nd street ", 
"wythe street", "vanderbilt avenue", "South Beach avenue")), class = "data.frame", row.names = c(NA, 
-8L))

Running

fuzzyjoin::stringdist_left_join(df1, df2, by = 'Address', max_dist = 5)

gives me

structure(list(ID = 1:8, Address.x = c("Canal and Broadway", 
"55 water street room number 73", "Mulberry street", "Front street and Fulton", 
"62nd street ", "wythe street", "vanderbilt avenue", "South Beach avenue"
), ID2 = c(1L, NA, 3L, NA, 8L, 8L, 7L, 5L), Address.y = c("Canal & Broadway", 
NA, "Mulberry street", NA, "62nd street", "62nd street", "vanderbilt ave", 
"south beach avenue")), row.names = c(NA, -8L), class = "data.frame")
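
To see why max_dist = 5 pairs some addresses and leaves others unmatched, the edit distances can be checked by hand. The sketch below is only an approximation of what fuzzyjoin does: it uses base R's adist() (generalized Levenshtein distance), whereas fuzzyjoin delegates to the stringdist package, whose default method is, as far as I know, "osa". For these particular strings the two should agree, but treat that as an assumption. The variable names (left, right, edits) are illustrative, not from the original post.

```r
# Address pairs taken from the data shown in the question.
left  <- c("Canal and Broadway", "62nd street ", "wythe street",
           "vanderbilt avenue", "South Beach avenue",
           "55 water street room number 73")
right <- c("Canal & Broadway", "62nd street", "62nd street",
           "vanderbilt ave", "south beach avenue",
           "Somewhere around 55 water street")

# adist() returns a full distance matrix for vector inputs,
# so compute the distance pair by pair instead.
edits <- vapply(seq_along(left),
                function(i) adist(left[i], right[i])[1, 1],
                numeric(1))

# A pair counts as a match when its distance is at most max_dist (5 here).
data.frame(left, right, edits, matched = edits <= 5)
```

Note that "wythe street" sits exactly at the threshold against "62nd street", which would explain the spurious pairing in the output, while the long "55 water street" variants are too far apart to match.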
The match works well, and I accept it. What I want to do next is match df1_new and df2_new.

df1

and df2

structure(list(ID2 = 1:8, Address = c("Canal & Broadway", "Somewhere around 55 water street", 
"Mulberry street", "Front street and close to Fulton", "south beach avenue", 
"along wythe street on the southwest ", "vanderbilt ave", "62nd street"
), Age = c(32L, 33L, 37L, 39L, 42L, 50L, 60L, 35L), Name = c("John", 
"Adam", "Ryan", "Greg", "Mark", "Anthony", "Mike", "Phil")), class = "data.frame", row.names = c(NA,-8L))
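Normally I would run

df3 <- df1 %>% left_join(df2, by = c("Address", "Age", "Name"))

Note that although 62nd street and Mulberry street match in the fuzzy match, they do not have the same corresponding Age and Name.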

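library(dplyr)  # provides %>%, mutate() and left_join()

# Step 1: fuzzy-match on Address only; step 2: exact join on Age, Name and the matched address.
fuzzyjoin::stringdist_left_join(df1_new, df2_new["Address"], by = "Address", max_dist = 5) %>%
  mutate(Address.z = Address.y) %>%
  left_join(df2_new %>% mutate(Address.z = Address),
            by = c("Age", "Name", "Address.z"))

This gives me the result I want.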

ID  Address.x                       ID2  Address.y           Age  Name
1   Canal and Broadway              1    Canal & Broadway    32   John
2   55 water street room number 73
3   Mulberry street
4   Front street and Fulton
5   62nd street                     8    62nd street
6   wythe street
7   vanderbilt avenue               7    vanderbilt ave      60   Mike
8   South Beach avenue              5    south beach avenue  42   Mark
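
For readers without fuzzyjoin installed, the same idea (exact match on Age and Name, fuzzy match on Address) can be sketched in base R. This is an alternative formulation, not the approach from the post: it joins exactly on Age and Name first and then filters the candidate pairs by edit distance. The two small data frames are hypothetical stand-ins, since the original df1_new is not shown, and adist() is used in place of the stringdist-based distance.

```r
# Hypothetical stand-ins for df1_new and df2_new (not the original data).
df1_new <- data.frame(
  ID = 1:3,
  Address = c("Canal and Broadway", "Mulberry street", "vanderbilt avenue"),
  Age = c(32L, 40L, 60L),
  Name = c("John", "Sam", "Mike"),
  stringsAsFactors = FALSE
)
df2_new <- data.frame(
  ID2 = 1:3,
  Address = c("Canal & Broadway", "Mulberry street", "vanderbilt ave"),
  Age = c(32L, 37L, 60L),
  Name = c("John", "Ryan", "Mike"),
  stringsAsFactors = FALSE
)

# Exact inner join on Age and Name gives the candidate pairs.
cand <- merge(df1_new, df2_new, by = c("Age", "Name"),
              suffixes = c(".x", ".y"))

# Keep only candidates whose addresses are within 5 edits of each other.
edits <- vapply(seq_len(nrow(cand)),
                function(i) adist(cand$Address.x[i], cand$Address.y[i])[1, 1],
                numeric(1))
cand <- cand[edits <= 5, ]

# Left join back so that unmatched rows of df1_new are kept with NA.
result <- merge(df1_new, cand[c("ID", "ID2", "Address.y")],
                by = "ID", all.x = TRUE)
result
```

In this toy data "Mulberry street" has an identical address on both sides but a different Age and Name, so it stays unmatched, which mirrors the behaviour described in the question.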