Matching words in a data frame against a vector of strings in R
I have a data frame from a recall task, in which participants recalled as many words as they could from a list they had studied earlier. Here is a mock-up of the data. Each row is a subject and each column (w1-w5) is a recalled word:
df <- data.frame(subject = 1:5,
w1 = c("screen", "toad", "toad", "witch", "toad"),
w2 = c("package", "tuna", "tuna", "postage", "dinosaur"),
w3 = c("tuna", "postage", "toast", "athlete", "ranch"),
w4 = c("toad", "witch", "tuna", "package", "NA"),
w5 = c("windwo", "mermaid", "NA", "NA", "NA")
)
I want to match each recalled word (columns w1-w5) against the list of correct words, which is:
words <- c("screen", "package", "tuna", "toad", "window",
"postage", "witch", "mermaid", "toast", "dinosaur")
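Subject 1 would score 4, because they misspelled one word
Subject 2 would score 5
Subject 3 would score 3, because they repeated "tuna" and omitted one word
Subject 4 would score 3, because they had one incorrect word and one omitted word
Subject 5 would score 2, because they had one incorrect word and two omitted words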
data.frame(subject = df$subject
, nCorrect = apply(df[, -1], 1, function(x) sum(unique(x) %in% words)))
# subject nCorrect
# 1 1 4
# 2 2 5
# 3 3 3
# 4 4 3
# 5 5 2
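To see why the `unique(x) %in% words` expression gives the expected scores, here is a minimal sketch that walks through a single subject by hand. It uses subject 3's responses from the mock data; the variable names `responses`, `distinct`, `is_correct` and `n_correct` are illustrative and do not appear in the original answer.

```r
# Correct word list, as given in the question.
words <- c("screen", "package", "tuna", "toad", "window",
           "postage", "witch", "mermaid", "toast", "dinosaur")

# Subject 3's responses, copied from the mock data (w1 to w5).
# Note that the missing response is the string "NA", not a real NA.
responses <- c("toad", "tuna", "toast", "tuna", "NA")

distinct   <- unique(responses)   # drops the repeated "tuna"
is_correct <- distinct %in% words # one TRUE/FALSE per distinct response
n_correct  <- sum(is_correct)     # TRUE counts as 1, FALSE as 0

print(n_correct)
```

Because the string "NA" is not in `words`, omitted responses simply fail the `%in%` test and add nothing to the score. The same is true of misspellings such as "windwo".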
The same result can also be obtained with data.table (see the setDT(df) snippet further down).
Another option is to reshape the data to long format, group by subject, and use dplyr::summarise to count the number of correctly matched answers:
library(tidyverse)
words <- c("screen", "package", "tuna", "toad", "window",
"postage", "witch", "mermaid", "toast", "dinosaur")
df %>% gather(key, value, -subject) %>%
group_by(subject) %>%
summarise(nCorrect = sum(unique(value) %in% words))
# # A tibble: 5 x 2
# subject nCorrect
# <int> <int>
# 1 1 4
# 2 2 5
# 3 3 3
# 4 4 3
# 5 5 2
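In tidyr 1.0.0 and later, `gather()` is superseded by `pivot_longer()`. The following is a sketch of the same long-format approach written with `pivot_longer()`, assuming dplyr and a recent tidyr are installed. The name `result` is illustrative, and `stringsAsFactors = FALSE` is added so that the word columns are plain character vectors on older R versions as well.

```r
library(dplyr)
library(tidyr)

# Mock data and correct word list, as given in the question.
df <- data.frame(subject = 1:5,
                 w1 = c("screen", "toad", "toad", "witch", "toad"),
                 w2 = c("package", "tuna", "tuna", "postage", "dinosaur"),
                 w3 = c("tuna", "postage", "toast", "athlete", "ranch"),
                 w4 = c("toad", "witch", "tuna", "package", "NA"),
                 w5 = c("windwo", "mermaid", "NA", "NA", "NA"),
                 stringsAsFactors = FALSE)

words <- c("screen", "package", "tuna", "toad", "window",
           "postage", "witch", "mermaid", "toast", "dinosaur")

result <- df %>%
  # One row per subject/word pair; every column except subject is stacked.
  pivot_longer(-subject, names_to = "key", values_to = "value") %>%
  group_by(subject) %>%
  # Count distinct responses that appear in the correct word list.
  summarise(nCorrect = sum(unique(value) %in% words))

print(result)
```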
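The data.table version (setDT converts df to a data.table by reference):
library(data.table)
setDT(df)
# For each subject, count the distinct responses that appear in words
df[, .(nCorrect = sum(unique(unlist(.SD)) %in% words)), by = subject]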