Warning: file_get_contents(/data/phpspider/zhask/data//catemap/7/elixir/2.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
R如何使用插入符号包可视化混淆矩阵_R_Confusion Matrix - Fatal编程技术网

R如何使用插入符号包可视化混淆矩阵

R如何使用插入符号包可视化混淆矩阵,r,confusion-matrix,R,Confusion Matrix,我想把我放在混乱矩阵中的数据形象化。是否有一个函数我可以简单地把混淆矩阵,它会可视化(绘图) 示例我想做什么(矩阵$nnet只是一个包含分类结果的表): 您可以使用内置的fourfoldplot。比如说, ctable <- as.table(matrix(c(42, 6, 8, 28), nrow = 2, byrow = TRUE)) fourfoldplot(ctable, color = c("#CC6666", "#99CC99"), conf.leve

我想把我放在混乱矩阵中的数据形象化。是否有一个函数我可以简单地把混淆矩阵,它会可视化(绘图)

示例我想做什么(矩阵$nnet只是一个包含分类结果的表):


您可以使用内置的
fourfoldplot
。比如说,

ctable <- as.table(matrix(c(42, 6, 8, 28), nrow = 2, byrow = TRUE))
fourfoldplot(ctable, color = c("#CC6666", "#99CC99"),
             conf.level = 0, margin = 1, main = "Confusion Matrix")

ctable您可以使用r中的rect功能来布局混淆矩阵。在这里,我们将创建一个函数,允许用户传入由插入符号包创建的cm对象,以便生成视觉效果

让我们从创建一个评估数据集开始,就像插入符号演示中所做的那样:

# construct the evaluation dataset
set.seed(144)
true_class <- factor(sample(paste0("Class", 1:2), size = 1000, prob = c(.2, .8), replace = TRUE))
true_class <- sort(true_class)
class1_probs <- rbeta(sum(true_class == "Class1"), 4, 1)
class2_probs <- rbeta(sum(true_class == "Class2"), 1, 2.5)
test_set <- data.frame(obs = true_class,Class1 = c(class1_probs, class2_probs))
test_set$Class2 <- 1 - test_set$Class1
test_set$pred <- factor(ifelse(test_set$Class1 >= .5, "Class1", "Class2"))
# calculate the confusion matrix
cm <- confusionMatrix(data = test_set$pred, reference = test_set$obs)
draw_confusion_matrix(cm)
以下是结果:

# construct the evaluation dataset
set.seed(144)
true_class <- factor(sample(paste0("Class", 1:2), size = 1000, prob = c(.2, .8), replace = TRUE))
true_class <- sort(true_class)
class1_probs <- rbeta(sum(true_class == "Class1"), 4, 1)
class2_probs <- rbeta(sum(true_class == "Class2"), 1, 2.5)
test_set <- data.frame(obs = true_class,Class1 = c(class1_probs, class2_probs))
test_set$Class2 <- 1 - test_set$Class1
test_set$pred <- factor(ifelse(test_set$Class1 >= .5, "Class1", "Class2"))
# calculate the confusion matrix
cm <- confusionMatrix(data = test_set$pred, reference = test_set$obs)
draw_confusion_matrix(cm)

我非常喜欢@controlnetic中漂亮的混乱矩阵可视化,并做了两次调整,希望能进一步改进

1) 我用类的实际值替换了Class1和Class2。 2) 我用一个基于百分位数生成红色(未命中)和绿色(命中)的函数替换橙色和蓝色。这样做的目的是快速发现问题/成功的地方及其规模

屏幕截图和代码:


draw\u composition\u matrix这里有一个简单的基于
ggplot2
的想法,可以根据需要进行更改,我使用的数据来自:

#数据
混淆矩阵(iris$物种,样本(iris$物种))

newPrior您可以使用plus
autoplot()
中的函数
conf_mat()
在几行中获得一个非常好的结果

另外,您仍然可以使用basic
ggplot
sintax来修复样式

library(yardstick)
library(ggplot2)


# The confusion matrix from a single assessment set (i.e. fold)
cm <- conf_mat(truth_predicted, obs, pred)

autoplot(cm, type = "heatmap") +
  scale_fill_gradient(low="#D6EAF8",high = "#2E86C1")
更改图例的名称也很容易:
+labs(fill=“legend\u name”)

数据示例:

set.seed(123)
truth_predicted <- data.frame(
  obs = sample(0:1,100, replace = T),
  pred = sample(0:1,100, replace = T)
)
truth_predicted$obs <- as.factor(truth_predicted$obs)
truth_predicted$pred <- as.factor(truth_predicted$pred)
set.seed(123)

我知道现在已经很晚了,但我自己也在寻找解决办法。 除此之外,我们正在研究上面的一些答案。 使用
ggplot2
package和base
table
函数,我制作了一个简单的函数来绘制一个色彩鲜艳的混淆矩阵:

conf_matrix <- function(df.true, df.pred, title = "", true.lab ="True Class", pred.lab ="Predicted Class",
                        high.col = 'red', low.col = 'white') {
  #convert input vector to factors, and ensure they have the same levels
  df.true <- as.factor(df.true)
  df.pred <- factor(df.pred, levels = levels(df.true))
  
  #generate confusion matrix, and confusion matrix as a pecentage of each true class (to be used for color) 
  df.cm <- table(True = df.true, Pred = df.pred)
  df.cm.col <- df.cm / rowSums(df.cm)
  
  #convert confusion matrices to tables, and binding them together
  df.table <- reshape2::melt(df.cm)
  df.table.col <- reshape2::melt(df.cm.col)
  df.table <- left_join(df.table, df.table.col, by =c("True", "Pred"))
  
  #calculate accuracy and class accuracy
  acc.vector <- c(diag(df.cm)) / c(rowSums(df.cm))
  class.acc <- data.frame(Pred = "Class Acc.", True = names(acc.vector), value = acc.vector)
  acc <- sum(diag(df.cm)) / sum(df.cm)
  
  #plot
  ggplot() +
    geom_tile(aes(x=Pred, y=True, fill=value.y),
              data=df.table, size=0.2, color=grey(0.5)) +
    geom_tile(aes(x=Pred, y=True),
              data=df.table[df.table$True==df.table$Pred, ], size=1, color="black", fill = 'transparent') +
    scale_x_discrete(position = "top",  limits = c(levels(df.table$Pred), "Class Acc.")) +
    scale_y_discrete(limits = rev(unique(levels(df.table$Pred)))) +
    labs(x=pred.lab, y=true.lab, fill=NULL,
         title= paste0(title, "\nAccuracy ", round(100*acc, 1), "%")) +
    geom_text(aes(x=Pred, y=True, label=value.x),
              data=df.table, size=4, colour="black") +
    geom_text(data = class.acc, aes(Pred, True, label = paste0(round(100*value), "%"))) +
    scale_fill_gradient(low=low.col, high=high.col, labels = scales::percent,
                        limits = c(0,1), breaks = c(0,0.5,1)) +
    guides(size=F) +
    theme_bw() +
    theme(panel.border = element_blank(), legend.position = "bottom",
          axis.text = element_text(color='black'), axis.ticks = element_blank(),
          panel.grid = element_blank(), axis.text.x.top = element_text(angle = 30, vjust = 0, hjust = 0)) +
    coord_fixed()

} 

conf_matrix有没有办法让我不用手动输入数字,只需声明一个列表之类的东西?(c(42,6,8,28)->c(datafromtable))?我是这样做的:ctable,所以您将conf.level=0作为混淆矩阵。对吗?为了详细说明答案,假设你有confusionMatrix,要把它转换成一个表,CMTable对confusionMatrix使用四折图是个坏主意,因为这种图是基于行和列的边际总和加权的。你能看到你对面的角落有42个和28个,但在大小/面积上无法区分吗?fourfoldplot通常用于分析优势比,无论独立频率如何,默认权重都有助于分析优势比。如果将其用于二进制混淆矩阵,则可能会完全误导。你可能会忽略一个事实,即你的FP或FN比率很糟糕。你可以通过设置std=“all.max”@static\u rtti来解决这个问题。既然你已经放置了赏金,你能添加一些细节或例子来说明你想要什么类型的情节吗?@camille:类似这样的东西就好了:。理想情况下,直接从R包:)@static\u rtti有一些例子、、和似乎符合描述。我有一种感觉,如果今天发布的话,这个问题会被关闭,因为它太宽泛了。我认为卡米尔的观点是正确的。然而,添加一个详细的规范从来都不算晚,我不久前还觉得在R中的混淆矩阵选项不是很好。因此,我致力于在shiny/htmltools中实现。你有可能与矩阵“互动”。因此,单击某个矩阵元素,就会显示与该矩阵元素相关的数据。这能回答你的问题吗?还是RLave的答案已经值得你“接受”了?哦,这已经开始接近我想要的了,谢谢!
library(ggplot2)     # to plot
library(gridExtra)   # to put more
library(grid)        # plot together

# plotting the matrix
cm_d_p <-  ggplot(data = cm_d, aes(x = Prediction , y =  Reference, fill = Freq))+
  geom_tile() +
  geom_text(aes(label = paste("",Freq,",",Perc,"%")), color = 'red', size = 8) +
  theme_light() +
  guides(fill=FALSE) 

# plotting the stats
cm_st_p <-  tableGrob(cm_st)

# all together
grid.arrange(cm_d_p, cm_st_p,nrow = 1, ncol = 2, 
             top=textGrob("Confusion Matrix and Statistics",gp=gpar(fontsize=25,font=1)))
library(yardstick)
library(ggplot2)


# The confusion matrix from a single assessment set (i.e. fold)
cm <- conf_mat(truth_predicted, obs, pred)

autoplot(cm, type = "heatmap") +
  scale_fill_gradient(low="#D6EAF8",high = "#2E86C1")
+ theme(legend.position = "right")
set.seed(123)
truth_predicted <- data.frame(
  obs = sample(0:1,100, replace = T),
  pred = sample(0:1,100, replace = T)
)
truth_predicted$obs <- as.factor(truth_predicted$obs)
truth_predicted$pred <- as.factor(truth_predicted$pred)
conf_matrix <- function(df.true, df.pred, title = "", true.lab ="True Class", pred.lab ="Predicted Class",
                        high.col = 'red', low.col = 'white') {
  #convert input vector to factors, and ensure they have the same levels
  df.true <- as.factor(df.true)
  df.pred <- factor(df.pred, levels = levels(df.true))
  
  #generate confusion matrix, and confusion matrix as a pecentage of each true class (to be used for color) 
  df.cm <- table(True = df.true, Pred = df.pred)
  df.cm.col <- df.cm / rowSums(df.cm)
  
  #convert confusion matrices to tables, and binding them together
  df.table <- reshape2::melt(df.cm)
  df.table.col <- reshape2::melt(df.cm.col)
  df.table <- left_join(df.table, df.table.col, by =c("True", "Pred"))
  
  #calculate accuracy and class accuracy
  acc.vector <- c(diag(df.cm)) / c(rowSums(df.cm))
  class.acc <- data.frame(Pred = "Class Acc.", True = names(acc.vector), value = acc.vector)
  acc <- sum(diag(df.cm)) / sum(df.cm)
  
  #plot
  ggplot() +
    geom_tile(aes(x=Pred, y=True, fill=value.y),
              data=df.table, size=0.2, color=grey(0.5)) +
    geom_tile(aes(x=Pred, y=True),
              data=df.table[df.table$True==df.table$Pred, ], size=1, color="black", fill = 'transparent') +
    scale_x_discrete(position = "top",  limits = c(levels(df.table$Pred), "Class Acc.")) +
    scale_y_discrete(limits = rev(unique(levels(df.table$Pred)))) +
    labs(x=pred.lab, y=true.lab, fill=NULL,
         title= paste0(title, "\nAccuracy ", round(100*acc, 1), "%")) +
    geom_text(aes(x=Pred, y=True, label=value.x),
              data=df.table, size=4, colour="black") +
    geom_text(data = class.acc, aes(Pred, True, label = paste0(round(100*value), "%"))) +
    scale_fill_gradient(low=low.col, high=high.col, labels = scales::percent,
                        limits = c(0,1), breaks = c(0,0.5,1)) +
    guides(size=F) +
    theme_bw() +
    theme(panel.border = element_blank(), legend.position = "bottom",
          axis.text = element_text(color='black'), axis.ticks = element_blank(),
          panel.grid = element_blank(), axis.text.x.top = element_text(angle = 30, vjust = 0, hjust = 0)) +
    coord_fixed()

} 
mydata <- data.frame(true = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
                     predicted = c("a", "a", "c", "c", "a", "c", "a", "b", "c"))

conf_matrix(mydata$true, mydata$predicted, title = "Conf. Matrix Example")