R: Joining inexact values with inexact values - R - Fatal编程技术网

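I have two tibbles that I want to join on the Batsman column. However, the values in the two columns are not exactly the same, e.g. "V Kohli" versus "Virat Kohli (IND)". How can I join the tibbles based on these inexact matches?

Thanks, everyone!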

library(tibble)

x1 <- tibble(Batsman=c("V Kohli (INDIA)","RG Sharma (INDIA)","Babar Azam (PAK)","GJ Maxwell (AUS)"),
             Runs=c(500,400,300,200),
             Matches=c(67,54,47,23))

x2 <- tibble(Rank=c(1,2,3,4),
             Batsman=c("Virat Kohli", "Rohit Sharma", "Glenn Maxwell","Babar Azam"),
             Rating=c(853,820,640,500))

So you want to join on two sets of text strings:

> x1$Batsman
[1] "V Kohli (INDIA)"   "RG Sharma (INDIA)" "Babar Azam (PAK)"  "GJ Maxwell (AUS)" 
> x2$Batsman
[1] "Virat Kohli"   "Rohit Sharma"  "Glenn Maxwell" "Babar Azam"  
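I guess you have many more names than these four? This is definitely a tricky task, and computers are notoriously bad at this kind of thing. (There are well-known examples of very long functions written just to read phone numbers.) From the strings you provided, I can see that the two versions of each name always share a similar part: the surname.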

I would use a regexp to extract the surnames and match on those.
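As a rough sketch of what a purely regex-based extraction could look like: the helper name `surname_key` is hypothetical (it is not part of the original answer), and it assumes the surname is the last word once any trailing "(COUNTRY)" suffix has been removed. Names with suffixes or multi-word surnames would need extra handling.

```r
library(stringr)

# Hypothetical helper: reduce a player name to a lower-case surname key.
# Assumption: the surname is the last word after dropping a trailing "(...)".
surname_key <- function(x) {
  x <- str_remove(x, "\\s*\\([^)]*\\)\\s*$")  # drop e.g. " (INDIA)"
  str_to_lower(str_extract(x, "\\S+$"))        # keep the last remaining word
}

surname_key(c("V Kohli (INDIA)", "Virat Kohli"))
```

Both spellings reduce to the same key, which is what makes them joinable.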

Full code:

library(tibble)
library(stringr)

x1 <- tibble(Batsman=c("V Kohli (INDIA)","RG Sharma (INDIA)","Babar Azam (PAK)","GJ Maxwell (AUS)"),
             Runs=c(500,400,300,200),
             Matches=c(67,54,47,23) )

x2 <- tibble(Rank=c(1,2,3,4),
            Batsman=c("Virat Kohli", "Rohit Sharma", "Glenn Maxwell","Babar Azam"),
            Rating=c(853,820,640,500))


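# Surnames from x1: drop everything up to the first space, then cut at the next space
AA <- str_sub(x1$Batsman, start = str_locate(x1$Batsman, " ")[, 1] + 1, end = -1)
AA <- str_sub(AA, start = 1, end = str_locate(AA, " ")[, 1] - 1) %>%
  str_to_lower()

# Surnames from x2: everything after the first space
BB <- str_sub(x2$Batsman, start = str_locate(x2$Batsman, " ")[, 1] + 1, end = -1) %>%
  str_to_lower()

# For each row of x1, the position of the matching row in x2
match(AA, BB)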
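The index returned by `match()` can then be used to pull the columns of `x2` into `x1`. The following is only a sketch under the same assumption as above, namely that the second word of each name is the surname and that no surname occurs twice in either table; the names `key1`, `key2`, `idx`, and `combined` are illustrative.

```r
library(tibble)
library(stringr)

x1 <- tibble(Batsman = c("V Kohli (INDIA)", "RG Sharma (INDIA)", "Babar Azam (PAK)", "GJ Maxwell (AUS)"),
             Runs = c(500, 400, 300, 200),
             Matches = c(67, 54, 47, 23))
x2 <- tibble(Rank = c(1, 2, 3, 4),
             Batsman = c("Virat Kohli", "Rohit Sharma", "Glenn Maxwell", "Babar Azam"),
             Rating = c(853, 820, 640, 500))

# Assumed key: the lower-cased second word of the name (taken to be the surname).
key1 <- str_to_lower(str_match(x1$Batsman, "^\\S+\\s+(\\S+)")[, 2])
key2 <- str_to_lower(str_match(x2$Batsman, "^\\S+\\s+(\\S+)")[, 2])

# The shortcut breaks down if a surname is duplicated, so fail early in that case.
stopifnot(!anyDuplicated(key1), !anyDuplicated(key2))

idx <- match(key1, key2)  # row of x2 for each row of x1 (NA when nothing matches)

combined <- x1
combined$Rank <- x2$Rank[idx]
combined$Rating <- x2$Rating[idx]
combined
```

Rows of `x1` without a counterpart in `x2` end up with `NA` in `Rank` and `Rating`, which makes unmatched names easy to spot and fix by hand.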
Haven't you asked this question before?