dplyr: count matches between column A and several other columns and write the result to a new column


I have a table like this:

ID  |   Word1   | Word2     | Word3     | Word4 | Word5 | Word6 | Word7
1   |   like    | grilled   | cheese    | except| omelet| and   | cheese
1   |   like    | grilled   | cheese    | except| omelet| and   | cheese
1   |   like    | grilled   | cheese    | except| omelet| and   | cheese
1   |   like    | grilled   | cheese    | except| omelet| and   | cheese
2   |   i       | have      | to        | write | it    | six   | times
2   |   i       | have      | to        | write | it    | six   | times

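I want to add a new column that counts how many times the word in `Word7` appears in the other `WordX` columns of the same row. For the rows with ID=1, the new column would be 1 (because cheese appears in column `Word3`). For the rows with ID=2, it would be 0. However, there may also be rows with a value greater than 1, if the word in `Word7` appears more than once in columns 1-6.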

I have tried a few approaches using dplyr's `intersect()` and `select()`, but I am struggling even to conceptualize the approach (I'm a bit slow).

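FYI, rows with exactly the same content in these columns may appear multiple times, but there are other columns with unique values (they are irrelevant to this question, which is why I left them out).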
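To make the goal concrete, here is a minimal sketch of the per-row count written as an explicit loop over rows. The data frame `toy`, its values, and the column name `matches` are made up for illustration (with `Word4` standing in for the target column); it also covers the case where the count is greater than 1.

```r
# Hypothetical toy data, not the asker's real table
toy <- data.frame(
  ID    = c(1, 2, 3),
  Word1 = c("like", "i", "tea"),
  Word2 = c("cheese", "have", "or"),
  Word3 = c("and", "six", "tea"),
  Word4 = c("cheese", "times", "tea"),   # target column in this sketch
  stringsAsFactors = FALSE
)

word_cols <- c("Word1", "Word2", "Word3")

# For each row, count the cells that are exactly equal to that row's target word
toy$matches <- vapply(
  seq_len(nrow(toy)),
  function(i) sum(unlist(toy[i, word_cols]) == toy$Word4[i]),
  integer(1)
)

toy$matches
```

The vectorized answers in this thread compute the same kind of count without the explicit loop.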

Here is one approach using `mapply`:

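# For each Word column, test row by row whether that row's Word7 occurs in the cell
rowSums(sapply(df[, -c(1, 8)], function(col) mapply(grepl, df[[8]], col, USE.NAMES = FALSE)))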
#[1] 1 1 1 1 0 0

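The key is `mapply`, which applies a function of two arguments, x and y, element by element (here, row by row). The function checks whether the `Word7` word occurs in each of the other columns (all except the ID column). The result is a logical matrix, and `rowSums` counts the TRUEs in each row.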
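One thing worth keeping in mind with this approach: `grepl` does pattern (substring) matching, whereas `==` tests whole-cell equality, so the two can give different counts. A minimal sketch with made-up values (the vector `cells` and the word `target` are hypothetical, not taken from the question's data):

```r
# Hypothetical cell values from one row, and the word to look for
cells  <- c("cheese", "cheesecake", "grilled")
target <- "cheese"

# Substring matching: "cheesecake" also counts as a hit
# (fixed = TRUE treats the target as a literal string, not a regular expression)
substring_hits <- grepl(target, cells, fixed = TRUE)

# Exact matching: only the cell that equals the target counts
exact_hits <- cells == target

sum(substring_hits)
sum(exact_hits)
```

Which behaviour is wanted depends on whether partial words should count as matches.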

library(dplyr)
# Columns 2 to 7 hold Word1..Word6; compare each with Word7 and count the matches per row
df %>% mutate(A = rowSums(.[2:7] == Word7))
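If selecting the columns by name is preferable to selecting them by position, the same idea can be sketched with `across()`, assuming dplyr 1.0.0 or later. The data frame `toy` and the column name `matches` below are made up for illustration and use fewer columns than the question's table.

```r
library(dplyr)

# Hypothetical toy data with fewer columns than the question's table
toy <- data.frame(
  ID    = c(1, 2),
  Word1 = c("like", "i"),
  Word2 = c("cheese", "have"),
  Word3 = c("cheese", "to"),
  Word7 = c("cheese", "times"),
  stringsAsFactors = FALSE
)

# across() builds one logical column per selected column;
# rowSums() then counts the TRUEs in each row
result <- toy %>%
  mutate(matches = rowSums(across(Word1:Word3, ~ .x == Word7)))

result$matches
```

Here the first row matches twice and the second row not at all.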

Using base R:

rowSums(df[,-c(1,8)]==df$Word7)
[1] 1 1 1 1 0 0

`df[, -c(1, 8)] == df$Word7` returns a logical matrix of TRUE and FALSE values, which we can then pass to `rowSums`.
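To see the intermediate step, here is a small sketch with made-up data (`toy` and `hits` are hypothetical names). It assumes the columns are character vectors, not factors.

```r
# Hypothetical two-row example
toy <- data.frame(
  Word1 = c("a", "b"),
  Word2 = c("b", "b"),
  Word7 = c("a", "b"),
  stringsAsFactors = FALSE
)

# Each column is compared with Word7 row by row
hits <- toy[, c("Word1", "Word2")] == toy$Word7

is.matrix(hits)   # the comparison yields a logical matrix
rowSums(hits)     # TRUE counts as 1, so this is the number of matches per row
```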

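Here is one way: rowSums(df[, -c(1, 8)] == df$Word7)

Can you elaborate on what it actually does?

Did you try it? Interesting... I get rowSums(df[, -c(1, 8)] == df$Word7) [1] 0. That was also my first attempt, but the wrong output prompted me to use mapply.

Does [2:7] mean it only looks at columns 2 to 7 of the data frame?

@rayne Yes. If the columns are not contiguous, you can try [c(2, 3, 4:7)] == Word7.

Hmm, I get the following message: Error in mutate(.data, dots): Evaluation error: comparison of these types is not implemented. In addition: Warning messages: 1: In rowSums(.[1:14, 16:22] == Word22) "==" 2: In .[1:14, 16:22] == Word22 : longer object length is not a multiple of shorter object length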
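One possible reading of the warning in the last comment, offered as a guess and not a diagnosis: on a data frame, `x[1:14, 16:22]` selects rows 1 to 14 and columns 16 to 22, whereas `x[c(1:14, 16:22)]` selects columns 1 to 14 and 16 to 22. A minimal sketch with a made-up data frame (`toy` is hypothetical) shows the difference:

```r
# Hypothetical data frame with 4 rows and 10 columns
toy <- as.data.frame(matrix(seq_len(40), nrow = 4, ncol = 10))

# A single index vector selects columns only, keeping every row
cols_only <- toy[c(1:3, 5:6)]

# Two comma-separated indices select rows first, then columns
rows_and_cols <- toy[1:3, 5:6]

dim(cols_only)
dim(rows_and_cols)
```

If rows are dropped this way, the subset no longer has the same number of rows as the column it is compared with, which would be consistent with a length-mismatch warning.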
# Sample data used in the answers
df <- read.table(text = "
  ID      Word1     Word2       Word3       Word4   Word5   Word6   Word7
  1       like      grilled     cheese      except  omelet  and     cheese
  1       like      grilled     cheese      except  omelet  and     cheese
  1       like      grilled     cheese      except  omelet  and     cheese
  1       like      grilled     cheese      except  omelet  and     cheese
  2       i         have        to          write   it      six     times
  2       i         have        to          write   it      six     times",
  header = TRUE, stringsAsFactors = FALSE)