
R: Set operations on character data (strings of integers)


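Does anyone know a fast way to compute the relative overlap of two columns? I want to know how many of the elements of "a" are in the set "b". Ideally this would produce a column "c" that stores the comparison value for each row. I'm really stuck on this.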

b <- c("20", "1, 8, 19, 20, 22, 23, 28, 34, 41", 
       "3, 8, 10, 11, 18, 20, 26, 37", 
       "1, 3, 6, 18, 21, 35", "NA", "1, 21, 33", "14, 37",
       "4, 14, 18, 23, 33, 37, 40", "14", 
       "4, 14, 20, 23, 33, 37, 40", 
       "2, 3, 5, 7, 8, 10, 14, 16, 18, 23, 25, 34, 40", 
       "6, 8, 10, 14, 19, 29, 33, 35, 36, 39, 41",
       "1, 20", "1, 28, 36", "14", 
       "1, 6, 33, 12, 39", "28", 
       "1, 6, 11, 13, 18, 19, 21, 28, 33, 35, 36, 39", 
       "35, 40", "20", "20, 38", "6, 8, 19, 22, 29, 32, 33, 34, 40",
       "1, 10, 21, 25, 33, 35, 36, 39, 40", "36")

a <- c("14", "10", "8, 39", "26, 39", "14, 20", "33, 36", "14", 
       "NA", "8, 39", "33, 36", "8, 39", "1, 36",  "10", "28, 33",
       "14, 20", "33, 40", "28, 34", "1, 36", 
       "8, 39",  "20", "14, 20", "29, 33", "36", "14")

df <- data.frame(a, b)

df$a <- as.character(df$a)
df$b <- as.character(df$b)

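I don't see why you need to convert with as.numeric. That is what gives you the warning. "NA" is treated as a character value in the data frame, and it is a character value that cannot be converted to a number.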

Note that a warning is not an error, so your code actually works for row 5 as well (unless you expected NA there).
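As a small illustration of the two points above, the sketch below shows that text such as "NA" is an ordinary string that can be compared without any numeric conversion, and that a coercion warning does not stop evaluation. The cell content "14, NA, 20" and the names `parts`, `hits` and `num` are made up for this example.

```r
# Hypothetical cell content: a set that happens to contain the text "NA"
parts <- strsplit("14, NA, 20", ", ", fixed = TRUE)[[1]]

# The text "NA" is an ordinary string, not a missing value
is.na(parts)

# Membership tests work directly on character vectors
hits <- c("14", "20") %in% parts

# Coercing an unsplit cell raises a warning, not an error:
# the call still returns (with NA) and evaluation continues
num <- suppressWarnings(as.numeric("14, 20"))
```

Because `%in%` compares the strings themselves, the numeric conversion can be skipped entirely.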

I would do the following:

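getCounts <- function(x, y){
  # split each comma-separated string into its elements
  x <- strsplit(x, ", ")[[1]]
  y <- strsplit(y, ", ")[[1]]
  # proportion of the elements of y that also occur in x
  mean(y %in% x)
}
# gives
> getCounts(df$a[5],df$b[5])
[1] 0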

# apply the function to every pair of rows in the two columns
out <- mapply(getCounts, df$a, df$b)
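
The question asks for a column "c" holding the share of the elements of "a" that appear in "b". A minimal sketch of that variant follows, assuming that a cell containing only the text "NA" should produce a missing value. The function name `overlap`, that NA rule, and the four sample rows (taken from rows 1, 2, 3 and 5 of the data above) are assumptions for illustration, not part of the original answer.

```r
# Small self-contained sample: rows 1, 2, 3 and 5 of the data above
df <- data.frame(
  a = c("14", "10", "8, 39", "14, 20"),
  b = c("20", "1, 8, 19, 20, 22, 23, 28, 34, 41",
        "3, 8, 10, 11, 18, 20, 26, 37", "NA"),
  stringsAsFactors = FALSE
)

# Share of the elements of a_str that also occur in b_str
overlap <- function(a_str, b_str) {
  a_set <- strsplit(a_str, ", ", fixed = TRUE)[[1]]
  b_set <- strsplit(b_str, ", ", fixed = TRUE)[[1]]
  # Assumption: a cell holding only the text "NA" means "no data"
  if (identical(a_set, "NA") || identical(b_set, "NA")) return(NA_real_)
  mean(a_set %in% b_set)
}

# One value per row, stored without names as a new column
df$c <- mapply(overlap, df$a, df$b, USE.NAMES = FALSE)
df
```

Note that this tests "a" against "b", whereas getCounts as called above tests the elements of its second argument against the first. Swap the arguments if the other direction is wanted.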