R: How to visualize a confusion matrix using the caret package
I want to visualize the data I have put in a confusion matrix. Is there a function to which I can simply pass the confusion matrix and have it visualized (plotted)? An example of what I want to do (Matrix$nnet is just a table containing the classification results):

You can use the built-in fourfoldplot. For example:
ctable <- as.table(matrix(c(42, 6, 8, 28), nrow = 2, byrow = TRUE))
fourfoldplot(ctable, color = c("#CC6666", "#99CC99"),
             conf.level = 0, margin = 1, main = "Confusion Matrix")
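The counts in the example above are typed in by hand. A minimal sketch of building the same kind of 2x2 table directly from predictions, assuming two hypothetical factor vectors `obs` and `pred` (neither appears in the original answer):

```r
# Hypothetical observed and predicted labels; stand-ins for real model output.
obs  <- factor(c("yes", "yes", "no", "no", "yes", "no", "no", "yes"),
               levels = c("yes", "no"))
pred <- factor(c("yes", "no", "no", "no", "yes", "yes", "no", "yes"),
               levels = c("yes", "no"))

# table() cross-tabulates the two factors, so no counts are entered manually.
ctable <- table(Prediction = pred, Reference = obs)

# If a caret confusionMatrix object `cm` is already available, its `table`
# element (cm$table) holds the same kind of object and can be used instead.
fourfoldplot(ctable, color = c("#CC6666", "#99CC99"),
             conf.level = 0, margin = 1, main = "Confusion Matrix")
```

Fixing the factor levels up front keeps the table 2x2 even if one class never appears among the predictions.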
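
You can use the rect function in R to lay out the confusion matrix. Here we will create a function that lets the user pass in the cm object created by the caret package in order to produce the visual.
Let's start by creating an evaluation dataset, as is done in the caret demo: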
library(caret)  # provides confusionMatrix()

# construct the evaluation dataset
set.seed(144)
true_class <- factor(sample(paste0("Class", 1:2), size = 1000, prob = c(.2, .8), replace = TRUE))
true_class <- sort(true_class)
class1_probs <- rbeta(sum(true_class == "Class1"), 4, 1)
class2_probs <- rbeta(sum(true_class == "Class2"), 1, 2.5)
test_set <- data.frame(obs = true_class, Class1 = c(class1_probs, class2_probs))
test_set$Class2 <- 1 - test_set$Class1
test_set$pred <- factor(ifelse(test_set$Class1 >= .5, "Class1", "Class2"))

# calculate the confusion matrix
cm <- confusionMatrix(data = test_set$pred, reference = test_set$obs)

# draw the visual; draw_confusion_matrix() is this answer's plotting function,
# whose definition is not reproduced here
draw_confusion_matrix(cm)
Here is the result: [figure: the confusion matrix drawn by draw_confusion_matrix(cm)]
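The answer's own draw_confusion_matrix() is not shown, so the following is only a minimal sketch of the rect-based idea, not the original function. It assumes a hypothetical helper named `draw_confusion_matrix_sketch` that takes a 2x2 table (for example `cm$table`) rather than the whole cm object; the colours and layout are arbitrary choices.

```r
# Hypothetical helper (not the original answer's function): draws a 2x2
# confusion table as four coloured rectangles with the counts inside.
# `tab` is a 2x2 table with predictions in rows and reference in columns.
draw_confusion_matrix_sketch <- function(tab, hit_col = "#3F97D0", miss_col = "#F7AD50") {
  stopifnot(length(dim(tab)) == 2, all(dim(tab) == c(2, 2)))
  classes <- rownames(tab)

  # Empty canvas on a 2 x 2 coordinate grid, without default axes.
  plot(c(0, 2), c(0, 2), type = "n", xlab = "", ylab = "",
       xaxt = "n", yaxt = "n", main = "Confusion Matrix")

  for (i in 1:2) {      # i indexes the predicted class (row of tab)
    for (j in 1:2) {    # j indexes the reference class (column of tab)
      xleft   <- j - 1
      ybottom <- 2 - i  # row 1 is drawn at the top
      rect(xleft, ybottom, xleft + 1, ybottom + 1,
           col = if (i == j) hit_col else miss_col, border = "white")
      text(xleft + 0.5, ybottom + 0.5, tab[i, j], cex = 1.6, font = 2, col = "white")
    }
  }

  # Class labels: reference along the bottom, prediction along the left.
  axis(1, at = c(0.5, 1.5), labels = classes, tick = FALSE)
  axis(2, at = c(1.5, 0.5), labels = classes, tick = FALSE)
  mtext("Reference", side = 1, line = 2.5)
  mtext("Prediction", side = 2, line = 2.5)

  invisible(tab)
}

# Example with a hand-made table; the counts are borrowed from the
# fourfoldplot example and are purely illustrative.
demo_tab <- as.table(matrix(c(42, 6, 8, 28), nrow = 2, byrow = TRUE,
                            dimnames = list(Prediction = c("Class1", "Class2"),
                                            Reference  = c("Class1", "Class2"))))
res <- draw_confusion_matrix_sketch(demo_tab)
```

With the cm object from the evaluation dataset, the equivalent call would be draw_confusion_matrix_sketch(cm$table).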
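
I really like the beautiful confusion matrix visualization from @controlnetic and made two tweaks that I hope improve it further:
1) I replaced Class1 and Class2 with the actual class values.
2) I replaced the orange and blue colors with a function that generates red (misses) and green (hits) based on percentiles. The aim is to make it quick to see where the problems and successes are, and how large they are.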
Screenshot and code: [not preserved]

Here is a simple ggplot2-based idea that can be changed as needed. I am using the following data:
# data
confusionMatrix(iris$Species, sample(iris$Species))

You can use the function conf_mat() from yardstick plus autoplot() to get a pretty nice result in a few lines.
You can also still use basic ggplot syntax to adjust the styling.
library(yardstick)
library(ggplot2)

# The confusion matrix from a single assessment set (i.e. fold);
# truth_predicted is the data frame built in the data example
cm <- conf_mat(truth_predicted, obs, pred)

autoplot(cm, type = "heatmap") +
  scale_fill_gradient(low = "#D6EAF8", high = "#2E86C1")
A legend can be placed by adding + theme(legend.position = "right"), and changing its name is also easy: + labs(fill = "legend_name")
Example data:
set.seed(123)
truth_predicted <- data.frame(
  obs = sample(0:1, 100, replace = TRUE),
  pred = sample(0:1, 100, replace = TRUE)
)
truth_predicted$obs <- as.factor(truth_predicted$obs)
truth_predicted$pred <- as.factor(truth_predicted$pred)
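
I know this is quite late, but I was looking for a solution myself.
Building on some of the answers above, and using the ggplot2 package and the base table function, I made a simple function to plot a nicely colored confusion matrix: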
library(ggplot2)  # reshape2, dplyr and scales are called via :: and must be installed

conf_matrix <- function(df.true, df.pred, title = "", true.lab = "True Class", pred.lab = "Predicted Class",
                        high.col = "red", low.col = "white") {
  # convert input vectors to factors, and ensure they have the same levels
  df.true <- as.factor(df.true)
  df.pred <- factor(df.pred, levels = levels(df.true))

  # generate the confusion matrix, and the confusion matrix as a percentage
  # of each true class (to be used for color)
  df.cm <- table(True = df.true, Pred = df.pred)
  df.cm.col <- df.cm / rowSums(df.cm)

  # convert both matrices to long data frames and bind them together;
  # after the join, value.x holds the counts and value.y the row shares
  df.table <- reshape2::melt(df.cm)
  df.table.col <- reshape2::melt(df.cm.col)
  df.table <- dplyr::left_join(df.table, df.table.col, by = c("True", "Pred"))

  # calculate overall accuracy and per-class accuracy
  acc.vector <- c(diag(df.cm)) / c(rowSums(df.cm))
  class.acc <- data.frame(Pred = "Class Acc.", True = names(acc.vector), value = acc.vector)
  acc <- sum(diag(df.cm)) / sum(df.cm)

  # plot
  ggplot() +
    geom_tile(aes(x = Pred, y = True, fill = value.y),
              data = df.table, size = 0.2, color = grey(0.5)) +
    # black outline around the diagonal (correctly classified) cells
    geom_tile(aes(x = Pred, y = True),
              data = df.table[df.table$True == df.table$Pred, ], size = 1, color = "black", fill = "transparent") +
    scale_x_discrete(position = "top", limits = c(levels(df.table$Pred), "Class Acc.")) +
    scale_y_discrete(limits = rev(unique(levels(df.table$Pred)))) +
    labs(x = pred.lab, y = true.lab, fill = NULL,
         title = paste0(title, "\nAccuracy ", round(100 * acc, 1), "%")) +
    geom_text(aes(x = Pred, y = True, label = value.x),
              data = df.table, size = 4, colour = "black") +
    geom_text(data = class.acc, aes(Pred, True, label = paste0(round(100 * value), "%"))) +
    scale_fill_gradient(low = low.col, high = high.col, labels = scales::percent,
                        limits = c(0, 1), breaks = c(0, 0.5, 1)) +
    guides(size = "none") +
    theme_bw() +
    theme(panel.border = element_blank(), legend.position = "bottom",
          axis.text = element_text(color = "black"), axis.ticks = element_blank(),
          panel.grid = element_blank(), axis.text.x.top = element_text(angle = 30, vjust = 0, hjust = 0)) +
    coord_fixed()
}
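The colouring and the "Class Acc." column in conf_matrix both come from dividing each row of the count table by its row total. A small worked sketch of just that step in base R, using short illustrative vectors (the names `true_lab`, `pred_lab`, `cm_counts` and `cm_share` are hypothetical):

```r
# Small illustrative label vectors.
true_lab <- c("a", "b", "c", "a", "b", "c", "a", "b", "c")
pred_lab <- c("a", "a", "c", "c", "a", "c", "a", "b", "c")

# Same preparation as inside conf_matrix: shared factor levels, then a table.
true_f <- as.factor(true_lab)
pred_f <- factor(pred_lab, levels = levels(true_f))
cm_counts <- table(True = true_f, Pred = pred_f)

# Each row divided by its total: the share of every true class that was
# assigned to each predicted class. Every row of cm_share sums to 1.
cm_share <- cm_counts / rowSums(cm_counts)

# Per-class accuracy is the diagonal count over the row total;
# overall accuracy is the diagonal sum over the grand total.
class_acc <- diag(cm_counts) / rowSums(cm_counts)
overall_acc <- sum(diag(cm_counts)) / sum(cm_counts)

print(round(cm_share, 2))
print(round(class_acc, 2))
print(round(overall_acc, 3))
```

Because the shares are computed per true class, a rare class is coloured on the same 0 to 1 scale as a frequent one.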
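
Is there a way to avoid typing in the numbers by hand and just declare a list or something similar? (c(42, 6, 8, 28) -> c(datafromtable))?
I did it like this: ctable
So you set conf.level = 0 for a confusion matrix. Is that right?
To elaborate on the answer: suppose you have a confusionMatrix; to convert it to a table, CMTable
Using a fourfold plot for a confusionMatrix is a bad idea, because this kind of plot is weighted by the row and column marginal totals. Do you see how the opposite corners hold 42 and 28, yet cannot be told apart by size/area? fourfoldplot is normally used to analyze odds ratios, and the default weighting helps with that regardless of the individual frequencies. If you use it for a binary confusion matrix, it can be completely misleading: you may miss the fact that your FP or FN rate is terrible. You can work around this by setting std = "all.max".
@static_rtti Since you have placed a bounty, could you add some details or examples of the type of plot you want?
@camille: Something like this would be nice: [link not preserved]. Ideally straight from an R package :)
@static_rtti There are some examples ([links not preserved]) that seem to fit the description.
I have a feeling this question would be closed as too broad if it were posted today. I think camille has a point. However, it is never too late to add a detailed specification.
A while ago I also felt that the confusion matrix options in R were not great, so I worked on an implementation in shiny/htmltools. You can "interact" with the matrix: clicking a matrix element shows the data related to that element. Would that answer your question? Or is RLave's answer already worth your "accept"?
Oh, this is starting to get close to what I want, thanks!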
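The std = "all.max" suggestion from the comments can be tried directly on the counts from the fourfoldplot answer. A minimal sketch; the `rel_area` line is an added illustration of what the plot encodes under this setting, assuming each quarter circle's area is proportional to its standardised count:

```r
ctable <- as.table(matrix(c(42, 6, 8, 28), nrow = 2, byrow = TRUE))

# With std = "all.max" the cells are scaled by the largest count instead of
# being standardised to equal margins, so the areas follow the raw counts.
fourfoldplot(ctable, color = c("#CC6666", "#99CC99"),
             conf.level = 0, std = "all.max", main = "Confusion Matrix")

# Relative size of each cell compared with the largest one.
rel_area <- ctable / max(ctable)
print(round(rel_area, 2))
```

Here the 28 cell is drawn at two thirds of the area of the 42 cell, and the two error cells are visibly small.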

# Plotting code for the ggplot2-based answer. It expects cm_d (a data frame with
# columns Prediction, Reference, Freq and Perc) and cm_st (a table of statistics).
library(ggplot2)    # to plot
library(gridExtra)  # to arrange several
library(grid)       # plots together

# plotting the matrix
cm_d_p <- ggplot(data = cm_d, aes(x = Prediction, y = Reference, fill = Freq)) +
  geom_tile() +
  geom_text(aes(label = paste("", Freq, ",", Perc, "%")), color = 'red', size = 8) +
  theme_light() +
  guides(fill = "none")

# plotting the stats
cm_st_p <- tableGrob(cm_st)

# all together
grid.arrange(cm_d_p, cm_st_p, nrow = 1, ncol = 2,
             top = textGrob("Confusion Matrix and Statistics", gp = gpar(fontsize = 25, font = 1)))
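The plotting code above uses two objects whose construction is not shown: cm_d and cm_st. A minimal sketch of how they might be built, assuming a base-R table stands in for the `table` element of a caret confusionMatrix object; the definition of Perc and the choice of statistic are assumptions, not taken from the answer.

```r
set.seed(1)  # arbitrary seed so the shuffled labels are reproducible

# Stand-in for cm$table from caret::confusionMatrix(): predictions vs. reference.
reference  <- iris$Species
prediction <- sample(iris$Species)
cm_table   <- table(Prediction = prediction, Reference = reference)

# Long format: one row per cell with columns Prediction, Reference, Freq.
cm_d <- as.data.frame(cm_table)

# Assumed definition: each cell as a percentage of all observations.
cm_d$Perc <- round(100 * cm_d$Freq / sum(cm_d$Freq), 1)

# A small statistics table; only overall accuracy is included as an example.
cm_st <- data.frame(Value = round(sum(diag(cm_table)) / sum(cm_table), 2),
                    row.names = "Accuracy")

print(head(cm_d))
print(cm_st)
```

With caret loaded, cm_table could be replaced by cm$table so that the plot reflects the actual confusionMatrix object.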

# example usage of conf_matrix()
mydata <- data.frame(true = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
                     predicted = c("a", "a", "c", "c", "a", "c", "a", "b", "c"))
conf_matrix(mydata$true, mydata$predicted, title = "Conf. Matrix Example")