R: Getting multiple regression coefficient values computed by resampling
I am computing the coefficients of several linear models using resampling. I used the boot function before, but I will need to include new statistics in the analysis in the future, so I think this approach is better. A reproducible example:
iris <- iris[,1:4]
nboots <- 100
ncol = ncol(iris)
boot.r.squared <- numeric(nboots)
boot.p_model <- numeric(nboots)
boot.coef_p <- numeric(nboots)
boot.coef_estimate <- matrix(nrow= nboots,ncol=ncol)
boot.coef_error <- matrix(nrow= nboots,ncol=ncol)
boot.coef_t <- matrix(nrow= nboots,ncol=ncol)
boot.coef_p <- matrix(nrow= nboots,ncol=ncol)
for(i in 1:nboots){
boot.samp <- iris[sample(nrow(iris),size = 100, replace = TRUE,), ]
model <- lm(boot.samp$Sepal.Length ~ .,boot.samp)
model.sum <- summary(model)
boot.r.squared[i] <- model.sum$r.squared
stat <- model.sum$fstatistic
boot.p_model[i] <- pf(stat[1], stat[2], stat[3], lower.tail = FALSE)
boot.coef_estimate[i, 1:length(model$coefficients)] <- model$coefficients[1]
boot.coef_error[i, 1:length(model$coefficients)] <- model$coefficients[2]
boot.coef_t[i, 1:length(model$coefficients)] <- model$coefficients[3]
boot.coef_p[i, 1:length(model$coefficients)] <- model$coefficients[4]
}
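Your code runs without errors for me. You may have tried a different number of replicates but forgot to update the size of the empty objects you define before the loop, since I can reproduce your error when I change nboots. As for your original code, you mixed up model and model.sum. When filling in the coefficients, standard errors, and so on, you need to take the coefficients from summary(model).
See below: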
for (i in 1:nboots) {
  # Draw 100 rows with replacement
  boot.samp <- iris[sample(nrow(iris), size = 100, replace = TRUE), ]
  # Name the response by column so that `.` expands to the remaining columns only
  model <- lm(Sepal.Length ~ ., data = boot.samp)
  model.sum <- summary(model)
  boot.r.squared[i] <- model.sum$r.squared
  stat <- model.sum$fstatistic
  boot.p_model[i] <- pf(stat[1], stat[2], stat[3], lower.tail = FALSE)
  # This is the part where you need to use model.sum:
  # columns 1 to 4 hold the estimate, standard error, t value and p value
  boot.coef_estimate[i, 1:length(model$coefficients)] <- model.sum$coefficients[, 1]
  boot.coef_error[i, 1:length(model$coefficients)] <- model.sum$coefficients[, 2]
  boot.coef_t[i, 1:length(model$coefficients)] <- model.sum$coefficients[, 3]
  boot.coef_p[i, 1:length(model$coefficients)] <- model.sum$coefficients[, 4]
}
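To make the model versus model.sum distinction concrete, here is a minimal sketch fitted once on the full data rather than on a bootstrap sample. The names dat, est_from_vector and se_from_summary are made up for this illustration and are not part of the original code.

```r
# Fit the model once so the two objects can be compared side by side.
dat <- iris[, 1:4]
model <- lm(Sepal.Length ~ ., data = dat)
model.sum <- summary(model)

# model$coefficients is a named numeric vector holding estimates only.
length(model$coefficients)

# model.sum$coefficients is a matrix with one row per term and the columns
# "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
dim(model.sum$coefficients)
colnames(model.sum$coefficients)

# Indexing the vector with [2] returns the Sepal.Width estimate,
# not a standard error; the standard errors are column 2 of the matrix.
est_from_vector <- model$coefficients[2]
se_from_summary <- model.sum$coefficients[, 2]
```

So in the original loop, model$coefficients[1] through model$coefficients[4] each pick a single estimate, which is then recycled across the whole row of the result matrix.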
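You are right, from a statistical point of view bootstrapping these coefficients does not add much by itself. I will compare them with the initial statistics to obtain p-values.
Thank you! Done this way the results are much better, and other statistics can now be computed. I did not know about aperm; I will look into that function, since it makes things neater.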
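The code that built the Aps object inspected below is not preserved on this page. What follows is only a guess at one way to get an object of the same shape, assuming replicate is used to collect one coefficient matrix per bootstrap sample and aperm to put the replicate index first. The helper fit_once and the intermediate array A are hypothetical names, and set.seed is added only to make the sketch repeatable.

```r
set.seed(42)                      # assumption: fixed seed for repeatability
dat <- iris[, 1:4]
nboots <- 100

# Hypothetical helper: one bootstrap fit, returned as a 4 x 6 matrix
# (the four summary columns plus r2 and the model p-value, recycled per row).
fit_once <- function() {
  samp <- dat[sample(nrow(dat), size = 100, replace = TRUE), ]
  s <- summary(lm(Sepal.Length ~ ., data = samp))
  f <- s$fstatistic
  cbind(s$coefficients,
        r2  = s$r.squared,
        p.f = pf(f[1], f[2], f[3], lower.tail = FALSE))
}

A <- replicate(nboots, fit_once())  # array: terms x statistics x replicates
A <- aperm(A, c(3, 1, 2))           # reorder to replicates x terms x statistics

# Split along the statistics dimension into a named list of matrices.
Aps <- lapply(seq_len(dim(A)[3]), function(k) A[, , k])
names(Aps) <- dimnames(A)[[3]]
```

Because r2 and p.f are model-level values, every column of Aps$r2 and Aps$p.f repeats the same number within a row, so taking the first column is enough.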
names(Aps)
# [1] "Estimate" "Std. Error" "t value" "Pr(>|t|)" "r2" "p.f"
estimates <- Aps$Estimate
estimates[1:3, ]
# (Intercept) Sepal.Width Petal.Length Petal.Width
# [1,] 1.353531 0.7655760 0.8322749 -0.7775090
# [2,] 1.777431 0.6653308 0.7353491 -0.6024095
# [3,] 2.029428 0.5825554 0.6941457 -0.4795787
Aps$p.f[1:3, ]
# (Intercept) Sepal.Width Petal.Length Petal.Width
# [1,] 2.759019e-65 2.759019e-65 2.759019e-65 2.759019e-65
# [2,] 5.451912e-66 5.451912e-66 5.451912e-66 5.451912e-66
# [3,] 3.288712e-54 3.288712e-54 3.288712e-54 3.288712e-54
Aps$p.f[1:3, 1]
# [1] 2.759019e-65 5.451912e-66 3.288712e-54
# Unit: seconds
# expr min lq mean median uq max neval cld
# forloop 2.215259 2.235797 2.332234 2.381035 2.401933 2.700622 100 a
# replicate 2.218291 2.240570 2.313526 2.257905 2.400532 2.601958 100 a