Confusion matrix for multiple raters in R (r, dplyr, tidyr)


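I don't even know where to start. Most of the confusion matrices I have found compare actual outcomes with a model's predicted outcomes. Any suggestions would be greatly appreciated. The desired output looks something like this:

        red  orange  blue  yellow  purple
red      22       6     2       3       2
orange    6      13     1       4       1
blue      2       1    10       3       1
yellow    3       4     3       9       2
purple    2       1     1       2       9

Note: these values are made up; I did not calculate them by hand.

For this solution, I use the dplyr, tidyr, and purrr packages.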
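
As a starting point, it can help to see what a single event contributes to such a table. The sketch below is a hand count in base R, assuming the five ratings shown for event1 in the `df_long` preview of the answer's code; the names `ratings`, `pairs`, `pair_labels`, and `pair_counts` are illustrative and not part of the original answer.

```r
# Ratings of one event by five raters (the event1 rows from the df_long preview).
ratings <- c(rater1_1 = "red", rater2_1 = "red", rater3_1 = "red",
             rater4_1 = "orange", rater5_1 = "blue")

# All unordered pairs of raters: choose(5, 2) = 10 pairs.
pairs <- combn(names(ratings), 2, simplify = FALSE)

# For each pair, sort the two labels so that (red, orange) and (orange, red)
# land in the same cell, then tabulate the resulting labels.
pair_labels <- vapply(
  pairs,
  function(p) paste(sort(ratings[p]), collapse = "-"),
  character(1)
)
pair_counts <- table(pair_labels)
```

For this event the ten pairs split into three red-red agreements, three orange-red and three blue-red disagreements, and one blue-orange disagreement. Summing such counts over all events gives the kind of table shown above.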

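I'm not quite sure about the logic used here. Could you use the sample data to show the desired output?

The desired output is similar to the table above. In general, the raters agree more on the diagonal, with some disagreement in the off-diagonal cells.

Awesome, this is very close... I need to compare raters, not events. For example, when testing create_confusion_matrix, I need to call it with rater1_1 and rater2_1, and that result looks right. I'm less interested in comparisons between events than in comparisons between raters. I can transpose things, but I got stuck on the combn line: there I need to create combinations of raters, not events.

@b222 Glad you found the solution. I have updated my answer.

Sorry for the delay. pivot_wider does the following: it takes all unique elements of the column given in names_from = name (the different raters) and creates one column per rater. Since we call the function with only two raters, two columns are created. The values_from = value argument says what to put in those columns; here, the actual ratings given by the raters. I will update my answer to cover the other questions.

@b222 I have updated the answer; I hope I answered all your questions.

Thank you. I appreciate your help and the detailed explanation.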
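
The `pivot_wider` step described in the comments can be tried in isolation. The sketch below uses a small made-up long-format tibble (`toy_long` and its values are assumptions, not the asker's data) to show how `names_from = name` turns each rater into a column while `values_from = value` fills in the ratings.

```r
library(dplyr)
library(tidyr)

# Made-up long-format ratings: three events, two raters (hypothetical data).
lvls <- c("red", "orange", "blue")
toy_long <- tibble(
  event = rep(c("event1", "event2", "event3"), each = 2),
  name  = rep(c("rater1_1", "rater2_1"), times = 3),
  value = factor(c("red", "red", "orange", "red", "blue", "blue"), levels = lvls)
)

# One column per rater, one row per event.
toy_wide <- toy_long %>%
  pivot_wider(names_from = name, values_from = value)

# Cross-tabulate the two rater columns: rows are rater1_1, columns are rater2_1.
toy_table <- toy_wide %>%
  select(-event) %>%
  table()
```

Because both rating columns are factors with the same levels, `toy_table` is always 3 x 3, even for colours that one of the raters never used.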

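library(dplyr)
library(tidyr)  # pivot_longer() and pivot_wider() come from tidyr
library(purrr)
# convert to long format: one row per (event, rater) combination
df_long <- test_data %>% pivot_longer(-event)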

# df_long
# # A tibble: 35 x 3
#   event  name     value 
#   <fct>  <chr>    <fct> 
# 1 event1 rater1_1 red   
# 2 event1 rater2_1 red   
# 3 event1 rater3_1 red   
# 4 event1 rater4_1 orange
# 5 event1 rater5_1 blue  
# 6 event2 rater1_1 orange
# 7 event2 rater2_1 orange
# 8 event2 rater3_1 red   
# 9 event2 rater4_1 orange
#10 event2 rater5_1 blue  
# # ... with 25 more rows

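# create a function to compute the confusion matrix for two given raters
# (uses df_long from above; the ratings must be factors sharing the same
# levels so that every table has the same dimensions)
create_confusion_matrix <- function(raters) {
  df_long %>%
    filter(name %in% raters) %>%
    pivot_wider(names_from = name, values_from = value) %>%
    select(-event) %>%
    table()
}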

# let's try this function with rater1_1 and rater2_1
create_confusion_matrix(c('rater1_1','rater2_1'))
#        rater2_1
#rater1_1 orange purple red yellow blue
#  orange      2      0   0      0    0
#  purple      0      1   0      0    0
#  red         1      0   2      0    0
#  yellow      0      1   0      0    0
#  blue        0      0   0      0    0


# now we need to get all combinations of two raters
raters2 <- combn(unique(df_long$name),2,simplify=FALSE)


# raters2 is a list, each element is a vector containing 2 raters

# loop over the list and apply create_confusion_matrix for each element
result_list <- map(raters2,create_confusion_matrix)
# result_list is a list, each element is a confusion matrix

# we can then sum all these tables cell by cell
contingency <- Reduce('+', result_list)
#        rater2_1
#rater1_1 orange purple red yellow blue
#  orange     14      1   2      1    5
#  purple      6      4   0      3    0
#  red         5      1   9      1    9
#  yellow      0      4   0      3    1
#  blue        0      1   0      0    0

# getting rid of rater1_1 and rater2_1 in dimnames
dimnames(contingency) <- list(dimnames(contingency)[[1]],dimnames(contingency)[[2]])
#       orange purple red yellow blue
#orange     14      1   2      1    5
#purple      6      4   0      3    0
#red         5      1   9      1    9
#yellow      0      4   0      3    1
#blue        0      1   0      0    0

# sum symmetric cells and make the contingency table lower triangular
# first, let's extract the diagonal
# diag is needed twice: first to extract the diagonal from contingency as a vector,
# then to convert this vector to a diagonal matrix
diag_contingency <- diag(diag(contingency))
# sum the lower and upper triangles by adding the transposed matrix
# and subtracting the diagonal (otherwise it would be counted twice)
contingency <- contingency + t(contingency) - diag_contingency
# we now have a symmetric matrix
#       orange purple red yellow blue
#orange     14      7   7      1    5
#purple      7      4   1      7    1
#red         7      1   9      1    9
#yellow      1      7   1      3    1
#blue        5      1   9      1    0

# set the upper triangular matrix to 0
contingency[upper.tri(contingency)] <- 0

# we get this matrix in the end
contingency
#       orange purple red yellow blue
#orange     14      0   0      0    0
#purple      7      4   0      0    0
#red         7      1   9      0    0
#yellow      1      7   1      3    0
#blue        5      1   9      1    0
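
Since `test_data` itself is not shown, the steps above can be checked end to end on made-up input. The sketch below is one possible packaging, not the answer's exact code: `toy_data`, `pairwise_agreement`, and `id_col` are hypothetical names, and it cross-tabulates each pair of rater columns directly with base `table()` instead of going through `pivot_longer` and `pivot_wider`.

```r
library(dplyr)  # for tibble()
library(purrr)  # for map()

# Hypothetical input: one row per event, one factor column per rater.
lvls <- c("red", "orange", "blue")
toy_data <- tibble(
  event  = c("event1", "event2", "event3", "event4"),
  rater1 = factor(c("red", "orange", "blue", "red"),    levels = lvls),
  rater2 = factor(c("red", "orange", "red",  "red"),    levels = lvls),
  rater3 = factor(c("red", "red",    "blue", "orange"), levels = lvls)
)

# Pairwise agreement counts summed over all rater pairs (lower triangular).
pairwise_agreement <- function(data, id_col = "event") {
  raters <- setdiff(names(data), id_col)
  pairs  <- combn(raters, 2, simplify = FALSE)

  # One contingency table per rater pair; shared factor levels keep the
  # dimensions identical, so the tables can be added cell by cell.
  tables <- map(pairs, function(p) unclass(table(data[[p[1]]], data[[p[2]]])))
  total  <- Reduce(`+`, tables)

  # Fold the upper triangle onto the lower one, counting the diagonal once.
  folded <- total + t(total) - diag(diag(total))
  folded[upper.tri(folded)] <- 0
  folded
}

agreement <- pairwise_agreement(toy_data)
```

A quick sanity check is that the entries of `agreement` sum to the number of rater pairs times the number of events, here `choose(3, 2) * 4 = 12`. Passing the data as an argument, rather than reading a global `df_long`, makes the function easier to reuse on other rating tables.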