Computing the Brier score and integrated Brier score with the ranger package
I want to compute the Brier score and the integrated Brier score for my analysis using the "ranger" package. As an example, I use the veteran data from the "survival" package, as follows:
install.packages("ranger")
library(ranger)
install.packages("survival")
library(survival)
#load veteran data
data(veteran)
data <- veteran
# training and test data (as written, 70% of the rows go to the test set)
n <- nrow(data)
testind <- sample(1:n,n*0.7)
trainind <- (1:n)[-testind]
#train ranger
rg <- ranger(Surv(time, status) ~ ., data = data[trainind,])
# use rg to predict test data
pred <- predict(rg,data=data[testind,],num.trees=rg$num.trees)
#cumulative hazard function for each sample
pred$chf
#survival probability for each sample
pred$survival
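The integrated Brier score (IBS) can be computed with the pec function from the pec package, but you need to define a predictSurvProb method that extracts the predicted survival probabilities from a ranger model (see ?pec::predictSurvProb for the list of models supported out of the box).
A possible solution is: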
predictSurvProb.ranger <- function (object, newdata, times, ...) {
  # predicted survival curves: one row per observation, one column per unique death time
  ptemp <- ranger:::predict.ranger(object, data = newdata, importance = "none")$survival
  # for each requested time, position of the last death time at or before it (0 if none)
  pos <- prodlim::sindex(jump.times = object$unique.death.times,
                         eval.times = times)
  # prepend a column of 1s so that position 0 maps to survival probability 1
  p <- cbind(1, ptemp)[, pos + 1, drop = FALSE]
  if (NROW(p) != NROW(newdata) || NCOL(p) != length(times))
    stop(paste("\nPrediction matrix has wrong dimensions:\nRequested newdata x times: ",
               NROW(newdata), " x ", length(times), "\nProvided prediction matrix: ",
               NROW(p), " x ", NCOL(p), "\n\n", sep = ""))
  p
}
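The lookup inside predictSurvProb.ranger can be illustrated on toy numbers. The sketch below uses only base R and assumes that findInterval behaves like prodlim::sindex here (counting the event times that are less than or equal to each evaluation time). The names event_times, surv and eval_times are made up for the illustration and are not part of ranger or pec.

```r
# Toy step-function survival curves: rows = subjects, columns = event times
event_times <- c(5, 10, 20)
surv <- rbind(c(0.9, 0.7, 0.4),
              c(0.8, 0.5, 0.2))

# Times at which survival probabilities are requested
eval_times <- c(3, 12, 30)

# Number of event times <= each evaluation time (0 means "before the first event")
pos <- findInterval(eval_times, event_times)

# Prepend a column of 1s so that position 0 maps to survival probability 1
p <- cbind(1, surv)[, pos + 1, drop = FALSE]
print(p)
```

Each row of p then holds one subject's survival probability at times 3, 12 and 30, which is the newdata x times shape that the dimension check in the function expects.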
library(ranger)
library(survival)
data(veteran)
dts <- veteran
n <- nrow(dts)
set.seed(1)
testind <- sample(1:n,n*0.7)
trainind <- (1:n)[-testind]
rg <- ranger(Surv(time, status) ~ ., data = dts[trainind,])
# Build the formula to pass to pec from the predictors used by the forest
frm <- as.formula(paste("Surv(time, status)~",
                        paste(rg$forest$independent.variable.names, collapse="+")))
library(pec)
# Using pec for IBS estimation
PredError <- pec(object=rg,
                 formula = frm, cens.model="marginal",
                 data=dts[testind,], verbose=FALSE, maxtime=200)
print(PredError, times=seq(10,200,50))
# ...
# Integrated Brier score (crps):
#
#           IBS[0;time=10) IBS[0;time=60) IBS[0;time=110) IBS[0;time=160)
# Reference          0.043          0.183           0.212           0.209
# ranger             0.041          0.144           0.166           0.176
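For reference, the quantities reported above are usually defined as follows. This is the common textbook form of the censoring-weighted Brier score, not something taken from the pec source, so the exact weighting and normalization used by pec may differ in detail. Here \hat S(t | x_i) is the predicted survival probability for subject i, T_i the observed time, \delta_i the event indicator, and \hat G an estimate of the censoring survival function (with cens.model="marginal", presumably a Kaplan-Meier estimate).

```latex
\[
\mathrm{BS}(t) = \frac{1}{n}\sum_{i=1}^{n}\left[
  \frac{\hat S(t \mid x_i)^2 \,\mathbf{1}\{T_i \le t,\ \delta_i = 1\}}{\hat G(T_i)}
  + \frac{\bigl(1 - \hat S(t \mid x_i)\bigr)^2 \,\mathbf{1}\{T_i > t\}}{\hat G(t)}
\right],
\qquad
\mathrm{IBS}(\tau) = \frac{1}{\tau}\int_{0}^{\tau} \mathrm{BS}(t)\,dt .
\]
```

Lower values are better, and the "Reference" row gives the same score for a model that ignores the covariates.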