R: confusion matrix for multiple raters
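I don't even know where to start. Most of the confusion matrices I have found compare actual outcomes against a model's predicted outcomes. Any suggestions would be greatly appreciated.

The desired output looks something like this:

       red orange blue yellow purple
red     22      6    2      3      2
orange   6     13    1      4      1
blue     2      1   10      3      1
yellow   3      4    3      9      2
purple   2      1    1      2      9

Note: these values are made up; I did not calculate them by hand.

For this solution, I use the dplyr, tidyr, and purrr packages.

I'm not quite sure about the logic used here. Could you use the sample data to show the desired output?

The desired output is similar to the table above. In general, the raters agree strongly on the diagonal, with some disagreement off the diagonal.

Great, this is very close... but I need to compare raters rather than events. For example, when testing create_confusion_matrix, the call needs to be create_confusion_matrix(c('rater1_1','rater2_1')), and that result looks correct. I'm less interested in comparisons between events than in comparisons between raters. I can transform things, but I'm stuck on the combn line: there I need to create combinations of raters, not events.

@b222 glad you found the solution, I have updated my answer.

Sorry for the delay. pivot_wider does the following: names_from=name takes all unique elements of name (the different raters) and creates one column per rater. We only call the function with two raters, so two columns are created. The values_from=value argument says what to put in those columns; here, the actual ratings given by the raters. I will update my answer for others with the same question.

@b222, updated the answer, I hope I answered all your questions.

Thanks. I appreciate your help and the detailed explanation.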
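The pivot_wider step explained in the comments can be tried on a toy data frame. This is only a sketch: toy_long and its ratings are made up for illustration and are not the asker's data.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format ratings: 3 events, each rated by 3 raters
toy_long <- tibble(
  event = rep(c("event1", "event2", "event3"), each = 3),
  name  = rep(c("rater1_1", "rater2_1", "rater3_1"), times = 3),
  value = factor(c("red",  "red",  "blue",
                   "blue", "red",  "blue",
                   "red",  "blue", "blue"),
                 levels = c("red", "blue"))
)

# Keep two raters, then spread them into one column each:
# names_from picks the new column names, values_from fills them
toy_wide <- toy_long %>%
  filter(name %in% c("rater1_1", "rater2_1")) %>%
  pivot_wider(names_from = name, values_from = value)

# One row per event, with columns event, rater1_1 and rater2_1
toy_wide
```

Dropping the event column and calling table() on the two remaining columns then gives the cross-tabulation for that pair of raters.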
library(dplyr)
library(tidyr)  # pivot_longer() and pivot_wider() come from tidyr
library(purrr)
# convert to long format
df_long <- test_data %>% pivot_longer(-event)
# df_long
# # A tibble: 35 x 3
#    event  name     value
#    <fct>  <chr>    <fct>
#  1 event1 rater1_1 red
#  2 event1 rater2_1 red
#  3 event1 rater3_1 red
#  4 event1 rater4_1 orange
#  5 event1 rater5_1 blue
#  6 event2 rater1_1 orange
#  7 event2 rater2_1 orange
#  8 event2 rater3_1 red
#  9 event2 rater4_1 orange
# 10 event2 rater5_1 blue
# # ... with 25 more rows
# create a function to compute the confusion matrix for two given raters
# (value is a factor, so every table has the same set of levels)
create_confusion_matrix <- function(raters){
  df_long %>% filter(name %in% raters) %>%
    pivot_wider(names_from=name,values_from=value) %>%
    select(-event) %>%
    table()
}
# let's try this function with rater1_1 and rater2_1
create_confusion_matrix(c('rater1_1','rater2_1'))
#         rater2_1
#rater1_1 orange purple red yellow blue
#  orange      2      0   0      0    0
#  purple      0      1   0      0    0
#  red         1      0   2      0    0
#  yellow      0      1   0      0    0
#  blue        0      0   0      0    0
# now we need to get all combinations of two raters
raters2 <- combn(unique(df_long$name),2,simplify=FALSE)
# raters2 is a list, each element is a vector containing 2 raters
# loop over the list and apply create_confusion_matrix to each element
result_list <- map(raters2,create_confusion_matrix)
# result_list is a list, each element is a confusion matrix
# we can then sum all these tables
contingency <- Reduce('+',result_list)
#         rater2_1
#rater1_1 orange purple red yellow blue
#  orange     14      1   2      1    5
#  purple      6      4   0      3    0
#  red         5      1   9      1    9
#  yellow      0      4   0      3    1
#  blue        0      1   0      0    0
# getting rid of rater1_1 and rater2_1 in dimnames
dimnames(contingency) <- list(dimnames(contingency)[[1]],dimnames(contingency)[[2]])
#       orange purple red yellow blue
#orange     14      1   2      1    5
#purple      6      4   0      3    0
#red         5      1   9      1    9
#yellow      0      4   0      3    1
#blue        0      1   0      0    0
# sum symmetric cells and make the contingency table lower triangular
# first let's extract the diagonal
# diag is needed twice: first to extract the diagonal from contingency as a vector,
# second to convert this vector to a diagonal matrix
diag_contingency <- diag(diag(contingency))
# sum the lower and upper triangles by adding the transposed matrix
# and subtracting the diagonal (otherwise it would be added twice)
contingency <- contingency + t(contingency) - diag_contingency
# we now have a symmetric matrix
#       orange purple red yellow blue
#orange     14      7   7      1    5
#purple      7      4   1      7    1
#red         7      1   9      1    9
#yellow      1      7   1      3    1
#blue        5      1   9      1    0
# set the upper triangular matrix to 0
contingency[upper.tri(contingency)] <- 0
# we get this matrix in the end
contingency
#       orange purple red yellow blue
#orange     14      0   0      0    0
#purple      7      4   0      0    0
#red         7      1   9      0    0
#yellow      1      7   1      3    0
#blue        5      1   9      1    0
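
The answer never shows test_data itself, so here is a small self-contained sketch of the same steps that can be run end to end. The data frame below is made up (it is not the asker's data), and pairwise_agreement_table is a hypothetical helper name that simply wraps the steps above; it assumes one row per event and one column per rater.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical input: 4 events rated by 3 raters, all sharing one set of levels
colours <- c("red", "orange", "blue")
test_data <- data.frame(
  event    = factor(paste0("event", 1:4)),
  rater1_1 = factor(c("red", "orange", "blue", "red"),    levels = colours),
  rater2_1 = factor(c("red", "orange", "red",  "red"),    levels = colours),
  rater3_1 = factor(c("red", "blue",   "blue", "orange"), levels = colours)
)

# Hypothetical helper: pool the confusion matrices of all rater pairs,
# then fold the result into a lower-triangular table
pairwise_agreement_table <- function(data, id_col = "event") {
  long <- data %>%
    pivot_longer(-all_of(id_col)) %>%
    # a single factor guarantees that every pairwise table has the same shape
    mutate(value = factor(value))

  # every unordered pair of raters
  pairs <- combn(unique(long$name), 2, simplify = FALSE)

  # one cross-tabulation per pair (rows: first rater, columns: second rater)
  tables <- map(pairs, function(pair) {
    wide <- long %>%
      filter(name %in% pair) %>%
      pivot_wider(names_from = name, values_from = value)
    table(wide[[pair[1]]], wide[[pair[2]]])
  })

  # sum all tables and drop the table class and the empty dimension labels
  pooled <- unclass(Reduce(`+`, tables))
  names(dimnames(pooled)) <- NULL

  # add symmetric cells together, counting the diagonal only once
  folded <- pooled + t(pooled) - diag(diag(pooled))
  folded[upper.tri(folded)] <- 0
  folded
}

agreement <- pairwise_agreement_table(test_data)
agreement
#        red orange blue
# red      4      0    0
# orange   2      1    0
# blue     2      2    1
```

The cells sum to 12, which is 3 rater pairs times 4 events, a quick sanity check that no rating pair was lost. If the ratings were character columns rather than factors, the mutate() step is what keeps the per-pair tables the same size so that they can be added.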