R: how to visualize a confusion matrix using the caret package (tags: r, confusion-matrix)


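I would like to visualize the data I have put into a confusion matrix. Is there a function that simply takes the confusion matrix and plots it?

For context, Matrix$nnet is just a table containing the classification results.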

You could use the built-in fourfoldplot. For example:

ctable <- as.table(matrix(c(42, 6, 8, 28), nrow = 2, byrow = TRUE))
fourfoldplot(ctable, color = c("#CC6666", "#99CC99"),
             conf.level = 0, margin = 1, main = "Confusion Matrix")

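You could use R's rect function to lay out the confusion matrix. Here we create a function, draw_confusion_matrix, that takes the cm object produced by the caret package and draws the visual.
Let's start by creating an evaluation dataset, as done in the caret demo: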
library(caret)  # provides confusionMatrix()

# construct the evaluation dataset
set.seed(144)
true_class <- factor(sample(paste0("Class", 1:2), size = 1000, prob = c(.2, .8), replace = TRUE))
true_class <- sort(true_class)
class1_probs <- rbeta(sum(true_class == "Class1"), 4, 1)
class2_probs <- rbeta(sum(true_class == "Class2"), 1, 2.5)
test_set <- data.frame(obs = true_class,Class1 = c(class1_probs, class2_probs))
test_set$Class2 <- 1 - test_set$Class1
test_set$pred <- factor(ifelse(test_set$Class1 >= .5, "Class1", "Class2"))

# calculate the confusion matrix
cm <- confusionMatrix(data = test_set$pred, reference = test_set$obs)

draw_confusion_matrix(cm)
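
The call above relies on a draw_confusion_matrix() helper built on rect(). As a rough illustration only, and not the answer's original implementation, such a helper could look like the sketch below. It assumes a two-class problem and that cm is a list with a 2x2 table element (rows are predictions, columns are the reference), which is what caret's confusionMatrix() returns; the colours and layout are assumptions.

```r
# Hypothetical sketch of a rect()-based helper; colours and layout are assumptions.
draw_confusion_matrix <- function(cm, hit = "#3F97D0", miss = "#F7AD50") {
  tab <- cm$table
  stopifnot(length(dim(tab)) == 2, all(dim(tab) == 2))

  # empty canvas covering a 2 x 2 grid of unit squares
  plot(c(0, 2), c(0, 2), type = "n", axes = FALSE,
       xlab = "Reference", ylab = "Prediction", main = "Confusion Matrix")

  for (i in 1:2) {      # i: predicted class (row of the table)
    for (j in 1:2) {    # j: reference class (column of the table)
      x0 <- j - 1
      y0 <- 2 - i       # first row is drawn at the top
      rect(x0, y0, x0 + 1, y0 + 1,
           col = if (i == j) hit else miss, border = "white")
      text(x0 + 0.5, y0 + 0.5, labels = tab[i, j],
           cex = 1.6, font = 2, col = "white")
    }
  }

  # class names along the bottom and left edges
  axis(1, at = c(0.5, 1.5), labels = colnames(tab), tick = FALSE)
  axis(2, at = c(1.5, 0.5), labels = rownames(tab), tick = FALSE, las = 1)
  invisible(tab)
}

# Stand-in for a caret object so the sketch runs without caret installed
toy_cm <- list(table = as.table(matrix(
  c(42, 6, 8, 28), nrow = 2, byrow = TRUE,
  dimnames = list(Prediction = c("Class1", "Class2"),
                  Reference  = c("Class1", "Class2")))))
draw_confusion_matrix(toy_cm)
```

Diagonal cells (correct predictions) get one colour and off-diagonal cells the other, so the two kinds of error stand out at a glance.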


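I really liked the beautiful confusion matrix visualization from @Cybernetic and made two tweaks that I hope improve it further.
1) I replaced Class1 and Class2 with the actual class values.
2) I replaced the orange and blue colors with a function that generates red (misses) and green (hits) shades based on percentiles. The idea is to see quickly where the problems and successes are, and how large they are.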
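
To make the percentile-based colouring concrete, here is a small sketch of one way such a colour function might work. The function name shade_by_percentile, the four-step palettes and the binning rule are all assumptions for illustration, not the code from this answer.

```r
# Hypothetical helper: choose a shade according to where `value`
# falls among all cell counts (`all_values`).
shade_by_percentile <- function(value, all_values, hit = TRUE) {
  greens <- c("#E8F6E8", "#A6D8A6", "#5CB85C", "#2E7D32")  # light to dark
  reds   <- c("#FDECEA", "#F5B7B1", "#E57373", "#C62828")  # light to dark
  palette <- if (hit) greens else reds

  # empirical percentile: share of cells whose count is <= value
  pct <- mean(all_values <= value)

  # map the percentile onto the palette bins (0, .25], (.25, .5], ...
  palette[max(1, ceiling(pct * length(palette)))]
}

counts <- c(42, 6, 8, 28)
hit_colour  <- shade_by_percentile(42, counts, hit = TRUE)   # largest count, darkest green
miss_colour <- shade_by_percentile(6,  counts, hit = FALSE)  # smallest count, lightest red
```

The returned hex strings can be passed straight to the col argument of rect(), so large counts on the diagonal show up as dark green and large off-diagonal counts as dark red.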

Here is a simple ggplot2-based idea that can be adapted as needed. The data I'm using:
# data
confusionMatrix(iris$Species, sample(iris$Species))
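
A ggplot2 heatmap needs the confusion matrix in long format, and a statistics table needs a small data frame of its own. The sketch below shows one way these could be prepared in base R. The names cm_d and cm_st and the columns Prediction, Reference, Freq and Perc are assumptions chosen to match a typical geom_tile() plus tableGrob() layout; with caret, the corresponding pieces would come from cm$table and cm$overall.

```r
# Random "reference" labels, mirroring confusionMatrix(iris$Species, sample(iris$Species))
set.seed(1)
prediction <- iris$Species
reference  <- sample(iris$Species)

# Cell counts; rows are predictions, columns are the reference
tab <- table(Prediction = prediction, Reference = reference)

# Long format: one row per cell, plus each cell's share of all observations
cm_d <- as.data.frame(tab)
cm_d$Perc <- round(100 * cm_d$Freq / sum(cm_d$Freq), 2)

# One-column summary table; only overall accuracy is computed in this sketch
cm_st <- data.frame(value = c(Accuracy = round(sum(diag(tab)) / sum(tab), 2)))
```

as.data.frame() on a two-way table yields the columns Prediction, Reference and Freq directly, which is why no reshaping package is needed here.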
You could use the function conf_mat() from yardstick together with autoplot() to get a pretty nice result in a few lines.
You can also still use basic ggplot syntax to adjust the styling.
library(yardstick)
library(ggplot2)


# The confusion matrix from a single assessment set (i.e. fold)
cm <- conf_mat(truth_predicted, obs, pred)

autoplot(cm, type = "heatmap") +
  scale_fill_gradient(low="#D6EAF8",high = "#2E86C1")

Changing the legend's name is easy too: + labs(fill = "legend_name")


Example data:
set.seed(123)
truth_predicted <- data.frame(
  obs = sample(0:1, 100, replace = TRUE),
  pred = sample(0:1, 100, replace = TRUE)
)
truth_predicted$obs <- as.factor(truth_predicted$obs)
truth_predicted$pred <- as.factor(truth_predicted$pred)

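I know this is quite late, but I was looking for a solution myself.
Building on some of the answers above and using the ggplot2 package and the base table function, I wrote this simple function to plot a nicely colored confusion matrix: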
library(ggplot2)  # plotting
library(dplyr)    # left_join()

conf_matrix <- function(df.true, df.pred, title = "", true.lab ="True Class", pred.lab ="Predicted Class",
                        high.col = 'red', low.col = 'white') {
  #convert input vector to factors, and ensure they have the same levels
  df.true <- as.factor(df.true)
  df.pred <- factor(df.pred, levels = levels(df.true))
  
  #generate confusion matrix, and confusion matrix as a percentage of each true class (to be used for color)
  df.cm <- table(True = df.true, Pred = df.pred)
  df.cm.col <- df.cm / rowSums(df.cm)
  
  #convert confusion matrices to tables, and binding them together
  df.table <- reshape2::melt(df.cm)
  df.table.col <- reshape2::melt(df.cm.col)
  df.table <- left_join(df.table, df.table.col, by =c("True", "Pred"))
  
  #calculate accuracy and class accuracy
  acc.vector <- c(diag(df.cm)) / c(rowSums(df.cm))
  class.acc <- data.frame(Pred = "Class Acc.", True = names(acc.vector), value = acc.vector)
  acc <- sum(diag(df.cm)) / sum(df.cm)
  
  #plot
  ggplot() +
    geom_tile(aes(x=Pred, y=True, fill=value.y),
              data=df.table, size=0.2, color=grey(0.5)) +
    geom_tile(aes(x=Pred, y=True),
              data=df.table[df.table$True==df.table$Pred, ], size=1, color="black", fill = 'transparent') +
    scale_x_discrete(position = "top",  limits = c(levels(df.table$Pred), "Class Acc.")) +
    scale_y_discrete(limits = rev(unique(levels(df.table$Pred)))) +
    labs(x=pred.lab, y=true.lab, fill=NULL,
         title= paste0(title, "\nAccuracy ", round(100*acc, 1), "%")) +
    geom_text(aes(x=Pred, y=True, label=value.x),
              data=df.table, size=4, colour="black") +
    geom_text(data = class.acc, aes(Pred, True, label = paste0(round(100*value), "%"))) +
    scale_fill_gradient(low=low.col, high=high.col, labels = scales::percent,
                        limits = c(0,1), breaks = c(0,0.5,1)) +
    guides(size = "none") +
    theme_bw() +
    theme(panel.border = element_blank(), legend.position = "bottom",
          axis.text = element_text(color='black'), axis.ticks = element_blank(),
          panel.grid = element_blank(), axis.text.x.top = element_text(angle = 30, vjust = 0, hjust = 0)) +
    coord_fixed()

} 

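Comments:

Is there a way to avoid typing the numbers in by hand and just pass a list or something similar? (c(42, 6, 8, 28) -> c(datafromtable))?

So you set conf.level = 0 for a confusion matrix, right?

To elaborate on the answer: assuming you have a confusionMatrix object, it first has to be converted to a table.

It's a bad idea to use a fourfold plot for a confusionMatrix, because this kind of plot is weighted by the marginal totals of the rows and columns. Can you see that you have 42 and 28 in opposite corners, yet they are indistinguishable in size/area? fourfoldplot is generally used for analyzing odds ratios, and the default weighting helps with that regardless of the individual frequencies. Used for a binary confusion matrix, it can be completely misleading: you might miss the fact that your FP or FN rate is horrible. You can fix this by setting std = "all.max".

@static_rtti Since you've placed a bounty, can you add some details or examples of the type of plot you want?

@camille: ideally, something available directly from an R package :)

@static_rtti There are some examples that seem to fit the description. I have a feeling this question would be closed as too broad if it were posted today.

I think camille has a valid point. However, it's never too late to add a detailed specification.

A while ago I also felt that the confusion matrix options in R were not great, so I worked on an implementation in shiny/htmltools. You can "interact" with the matrix: clicking a matrix element shows the data associated with that element. Does that answer your question? Or is RLave's answer already worth your "accept"?

Oh, that's starting to get close to what I want, thanks!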
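
Two points from the comments can be shown together: the table does not have to be typed in by hand, and std = "all.max" makes the quadrant areas reflect the raw counts. The sketch below uses made-up label vectors (truth and pred are hypothetical) that reproduce the counts 42, 6, 8 and 28 from the first answer.

```r
# Hypothetical labels that reproduce the counts 42, 6, 8, 28
truth <- factor(c(rep("pos", 50), rep("neg", 34)), levels = c("pos", "neg"))
pred  <- factor(c(rep("pos", 42), rep("neg", 8),    # predictions for the 50 true "pos"
                  rep("pos", 6),  rep("neg", 28)),  # predictions for the 34 true "neg"
                levels = c("pos", "neg"))

# Build the 2x2 table from the data instead of typing the counts
ctable <- table(Predicted = pred, Actual = truth)

# std = "all.max" standardizes by the largest cell only, so quadrant
# sizes are comparable across cells; conf.level = 0 drops the confidence rings
fourfoldplot(ctable, color = c("#CC6666", "#99CC99"),
             conf.level = 0, std = "all.max", main = "Confusion Matrix")
```

With a caret object, the 2x2 table stored in cm$table could be passed to fourfoldplot() in the same way.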
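# cm_d:  data frame with columns Prediction, Reference, Freq and Perc
# cm_st: data frame of summary statistics to show next to the plot
library(ggplot2)     # heatmap of the matrix
library(gridExtra)   # tableGrob() and grid.arrange() to combine plot and table
library(grid)        # textGrob() and gpar() for the shared title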

# plotting the matrix
cm_d_p <-  ggplot(data = cm_d, aes(x = Prediction , y =  Reference, fill = Freq))+
  geom_tile() +
  geom_text(aes(label = paste("",Freq,",",Perc,"%")), color = 'red', size = 8) +
  theme_light() +
  guides(fill = "none")

# plotting the stats
cm_st_p <-  tableGrob(cm_st)

# all together
grid.arrange(cm_d_p, cm_st_p,nrow = 1, ncol = 2, 
             top=textGrob("Confusion Matrix and Statistics",gp=gpar(fontsize=25,font=1)))

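# added to the autoplot() call from the yardstick example, this places the legend on the right
autoplot(cm, type = "heatmap") +
  theme(legend.position = "right")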
# example call of conf_matrix() on a small three-class data set
mydata <- data.frame(true = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
                     predicted = c("a", "a", "c", "c", "a", "c", "a", "b", "c"))

conf_matrix(mydata$true, mydata$predicted, title = "Conf. Matrix Example")