Plotting precision@k and recall@k in ROCR (R)


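I am evaluating a binary classifier in R with the ROCR package. My classifier outputs a score between 0 and 1 for target labels that are 0/1.

I would like to plot precision@k and recall@k, but I cannot find a way to do it. Calling performance() without specifying the x-axis measure plots precision against the score cutoff: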

library(ROCR)
#df <- a two-dimensional dataframe with prediction scores and actual labels of my classifier 
pred <- prediction(df$score, df$label)
pr_curve <- performance(pred, measure="prec")

Load the libraries and define the training and test sets:
library(mlbench)
library(e1071)
library(ROCR)
data(BreastCancer)
df = BreastCancer
idx = sample(1:nrow(df),150)
trn = df[idx,]
test = df[-idx,]

Fit a naive Bayes model:
fit = naiveBayes(Class ~ .,data=trn)

The help page for performance() lists this combination:
Precision/recall graphs: measure="prec", x.measure="rec"
Plot precision against recall:
pred = prediction(predict(fit,test,type="raw")[,2],test$Class)
#plot to see it is working correctly:
plot(performance(pred,measure="prec",x.measure="rec"))


Now, to get this at k for your case, we can also compute precision and recall from scratch:
#combine prob, predicted labels, and actual labels
res = data.frame(prob=predict(fit,test,type="raw")[,2],
predicted_label=predict(fit,test),
label = test$Class)
res = res[order(res$prob,decreasing=TRUE),]
res$rank = 1:nrow(res)
# recall at rank k: fraction of all malignant cases found in the top k rows
res$recall = cumsum(res$label=="malignant")/sum(res$label=="malignant")
# precision at rank k: fraction of the top k rows that are actually malignant
res$precision = cumsum(res$label=="malignant")/res$rank

# check the two plots
par(mfrow=c(1,2))
plot(performance(pred,measure="prec",x.measure="rec"))
plot(res$recall,res$precision,type="l")
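
The same cumulative-sum computation can be wrapped in a small reusable function. This is only a minimal base-R sketch: the name precision_recall_at_k and the toy scores and labels are made up for illustration and are not part of ROCR or of the data above.

```r
# Hypothetical helper (not part of ROCR): precision@k and recall@k
# for every k, given classifier scores and the true labels.
precision_recall_at_k <- function(score, label, positive = 1) {
  ord  <- order(score, decreasing = TRUE)   # rank cases by descending score
  hits <- cumsum(label[ord] == positive)    # true positives within the top k
  k    <- seq_along(hits)
  data.frame(k = k,
             precision = hits / k,
             recall    = hits / sum(label == positive))
}

# Toy data, invented for this example
score <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)
label <- c(1, 0, 1, 1, 0, 0)

pk <- precision_recall_at_k(score, label)
pk[pk$k == 3, ]   # precision and recall when keeping the top 3 cases

# Both curves against k on one plot
plot(pk$k, pk$precision, type = "l", ylim = c(0, 1),
     xlab = "k", ylab = "value")
lines(pk$k, pk$recall, lty = 2)
legend("bottomright", legend = c("precision@k", "recall@k"), lty = 1:2)
```

Note that order() breaks ties between equal scores by their original position, whereas ROCR treats tied scores as a single cutoff, so the two approaches can differ slightly when scores repeat.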


Now that you have that, getting or plotting precision at k is simple:
par(mfrow=c(1,2))
with(res,
plot(rank,precision,main="self-calculated",type="l"))
plot(pred@n.pos.pred[[1]],
pred@tp[[1]]/(pred@fp[[1]]+pred@tp[[1]]),
type="l",main="from ROCR")

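I don't know how to use the .plot.performance function, but you can use the variables stored in the prediction object. pred@tp holds the true positives and pred@fp the false positives, so tp/(tp+fp) gives the precision, and pred@n.pos.pred essentially gives the rank.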
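
As a hedged illustration of the slot-based approach, the sketch below uses made-up toy scores and labels (not the BreastCancer data) to compute precision@k and recall@k from the prediction object. It also shows ROCR's "rpp" measure (rate of positive predictions), which, as far as I can tell, is k divided by the number of cases and so gives a normalised version of the same x-axis through performance() itself.

```r
library(ROCR)

# Toy data, invented for this example
score <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)
label <- c(1, 0, 1, 1, 0, 0)
pred  <- prediction(score, label)

# Each slot is a list with one element per run; [[1]] is the only run here.
# The first entry corresponds to an infinite cutoff, i.e. k = 0.
k  <- pred@n.pos.pred[[1]]          # number of cases predicted positive
tp <- pred@tp[[1]]
fp <- pred@fp[[1]]
prec_at_k <- tp / (tp + fp)         # NaN at k = 0, where tp + fp is 0
rec_at_k  <- tp / pred@n.pos[[1]]   # n.pos = total number of positives

# Precision against the rate of positive predictions (k / n)
perf <- performance(pred, measure = "prec", x.measure = "rpp")

par(mfrow = c(1, 2))
plot(k, prec_at_k, type = "l", main = "from slots")
plot(perf, main = "prec vs rpp")
```

Multiplying the x-values of perf by the number of cases should recover k, so either plot can be read as precision@k.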
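Thanks, this works. In essence, are we saying that the only way to do this is to compute precision, recall, and rank by hand? This solution does not really use ROCR, or any library for that matter. The problem is that this way we also need to manually reimplement all the computations across multiple cross-validation runs (i.e. average precision at k). That could be cumbersome.

@st1led, it is a few lines of code, and you need to know that it is correct, right? I edited my answer to include how you can use the prediction object to plot what you need.

pred@n.pos.pred is what I was looking for!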
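
On the cross-validation concern raised in the comments: prediction() also accepts lists of score and label vectors, one element per run, and the plot method for performance objects has an avg argument for averaging curves. The sketch below is an assumption-laden illustration rather than a tested recipe for this question. It uses ROCR.xval, the mock 10-fold data set shipped with ROCR, and averages precision vertically over the rate of positive predictions ("rpp"), which stands in for k when the folds have similar sizes.

```r
library(ROCR)

# Mock cross-validation data shipped with ROCR:
# lists of 10 score vectors and 10 label vectors
data(ROCR.xval)
pred_cv <- prediction(ROCR.xval$predictions, ROCR.xval$labels)

# One precision curve per run, with rate of positive predictions on the x-axis
perf_cv <- performance(pred_cv, measure = "prec", x.measure = "rpp")

plot(perf_cv, col = "grey", lty = 3)                    # individual runs
plot(perf_cv, avg = "vertical", lwd = 2, add = TRUE)    # average at each x
```

If the folds differ in size, a fixed rpp corresponds to a different k in each fold, so averaging at an absolute k would still need the per-run slots such as pred_cv@n.pos.pred.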
