R: Getting multiple regression coefficient values computed by resampling (tags: r, regression, resampling, statistics-bootstrap)


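I am computing the coefficients of several linear models using resampling. I used the boot function before, but in the future I will need to include new statistics in the analysis, so I think this approach is better. A reproducible example: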

iris <- iris[,1:4]

nboots <- 100
ncol = ncol(iris)
boot.r.squared <- numeric(nboots)      
boot.p_model <- numeric(nboots)
boot.coef_p <- numeric(nboots)
boot.coef_estimate <- matrix(nrow= nboots,ncol=ncol)
boot.coef_error <- matrix(nrow= nboots,ncol=ncol)
boot.coef_t <- matrix(nrow= nboots,ncol=ncol)
boot.coef_p <- matrix(nrow= nboots,ncol=ncol)

for(i in 1:nboots){
  boot.samp <- iris[sample(nrow(iris),size = 100, replace = TRUE,), ] 
  model <- lm(boot.samp$Sepal.Length ~ .,boot.samp)
  model.sum <- summary(model)

  boot.r.squared[i] <- model.sum$r.squared
  stat <- model.sum$fstatistic
  boot.p_model[i] <- pf(stat[1], stat[2], stat[3], lower.tail = FALSE)

  boot.coef_estimate[i, 1:length(model$coefficients)] <- model$coefficients[1]
  boot.coef_error[i, 1:length(model$coefficients)] <- model$coefficients[2]
  boot.coef_t[i, 1:length(model$coefficients)] <- model$coefficients[3]
  boot.coef_p[i, 1:length(model$coefficients)] <- model$coefficients[4] 
}

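Your code runs without errors for me. You may have tried a different number of replicates but forgotten to update the size of the empty objects you define before the loop; I can reproduce your error when I change nboots. As for your original code, you mixed up model and model.sum. When filling in the estimates, standard errors, and so on, you need to take the coefficients from summary(model), which is a matrix with one column per statistic.
See below: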
for(i in 1:nboots){
  boot.samp <- iris[sample(nrow(iris),size = 100, replace = TRUE,), ] 
  model <- lm(boot.samp$Sepal.Length ~ .,boot.samp)
  model.sum <- summary(model)

  boot.r.squared[i] <- model.sum$r.squared
  stat <- model.sum$fstatistic
  boot.p_model[i] <- pf(stat[1], stat[2], stat[3], lower.tail = FALSE)

  # this is the part where you need to use model.sum
  boot.coef_estimate[i, 1:length(model$coefficients)] <- model.sum$coefficients[,1]
  boot.coef_error[i, 1:length(model$coefficients)] <- model.sum$coefficients[,2]
  boot.coef_t[i, 1:length(model$coefficients)] <- model.sum$coefficients[,3]
  boot.coef_p[i, 1:length(model$coefficients)] <- model.sum$coefficients[,4] 
}
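
As a small illustration of the distinction this answer draws, the sketch below fits the model once on the full data (an assumption made for brevity; there is no resampling here) and compares the two objects. The names dat, est_vec and coef_mat are introduced only for this example. model$coefficients is a named vector of estimates, so model$coefficients[2] is a single estimate rather than a standard error, whereas summary(model)$coefficients is a matrix with one column per statistic.

```r
# Fit once on the first four iris columns (no resampling, for illustration only)
dat <- iris[, 1:4]
model <- lm(Sepal.Length ~ ., data = dat)
model.sum <- summary(model)

# Named numeric vector: one estimate per term
est_vec <- model$coefficients
names(est_vec)

# Matrix: one row per term, one column per statistic
coef_mat <- model.sum$coefficients
dim(coef_mat)
colnames(coef_mat)

# The "Estimate" column of the summary matrix matches the plain coefficient vector
all.equal(unname(coef_mat[, "Estimate"]), unname(est_vec))
```

Indexing the summary matrix by column, as in model.sum$coefficients[, 2], is what yields one value per term for each statistic.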

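You are right: from a statistical point of view, bootstrapping these coefficients does not add much by itself. I will compare them with the initial statistics to obtain p-values.

Thank you! Done this way, the results are much better, and other statistics can now be computed. I did not know about aperm; I will look into that function, since it makes things better.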
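
The output that follows refers to an object named Aps, but the code that built it is not part of this page. The sketch below shows one way an object of that shape might be produced, assuming replicate() is used to collect one matrix per bootstrap sample and aperm() to reorder the resulting array. The names dat, boot_once, A, P and stat_names are hypothetical, the seed is arbitrary, and this is not necessarily the code the output came from.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
dat <- iris[, 1:4]
nboots <- 100

# Hypothetical helper: one bootstrap fit, returned as a 4 x 6 matrix
# (coefficient table plus r2 and the model p-value repeated on every row)
boot_once <- function() {
  boot.samp <- dat[sample(nrow(dat), size = 100, replace = TRUE), ]
  model.sum <- summary(lm(Sepal.Length ~ ., data = boot.samp))
  stat <- model.sum$fstatistic
  cbind(
    model.sum$coefficients,
    r2  = model.sum$r.squared,
    p.f = unname(pf(stat[1], stat[2], stat[3], lower.tail = FALSE))
  )
}

# replicate() stacks the matrices into a coefficient x statistic x replicate array
A <- replicate(nboots, boot_once())

# aperm() reorders the dimensions to replicate x coefficient x statistic
P <- aperm(A, c(3, 1, 2))

# Split along the statistic dimension into a named list of nboots x 4 matrices
stat_names <- dimnames(P)[[3]]
Aps <- lapply(setNames(stat_names, stat_names), function(s) P[, , s])

names(Aps)
dim(Aps$Estimate)
```

Because r2 and p.f are model-level values, cbind() recycles them across all coefficient rows, which would explain why each row of Aps$p.f in the output below repeats one value across the columns.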
names(Aps)
# [1] "Estimate"   "Std. Error" "t value"    "Pr(>|t|)"   "r2"         "p.f"

estimates <- Aps$Estimate

estimates[1:3, ]
#      (Intercept) Sepal.Width Petal.Length Petal.Width
# [1,]    1.353531   0.7655760    0.8322749  -0.7775090
# [2,]    1.777431   0.6653308    0.7353491  -0.6024095
# [3,]    2.029428   0.5825554    0.6941457  -0.4795787

Aps$p.f[1:3, ]
#       (Intercept)  Sepal.Width Petal.Length  Petal.Width
# [1,] 2.759019e-65 2.759019e-65 2.759019e-65 2.759019e-65
# [2,] 5.451912e-66 5.451912e-66 5.451912e-66 5.451912e-66
# [3,] 3.288712e-54 3.288712e-54 3.288712e-54 3.288712e-54

Aps$p.f[1:3, 1]
# [1] 2.759019e-65 5.451912e-66 3.288712e-54

# Unit: seconds
#      expr      min       lq     mean   median       uq      max neval cld
#   forloop 2.215259 2.235797 2.332234 2.381035 2.401933 2.700622   100   a
# replicate 2.218291 2.240570 2.313526 2.257905 2.400532 2.601958   100   a

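
The sizing problem mentioned in the first answer (changing nboots without rebuilding the pre-allocated objects) can be reproduced in isolation. This is a minimal sketch under that assumption; n_coef and msg are names introduced here for illustration, and the exact error in the original session is not known.

```r
nboots <- 100
n_coef <- 4  # intercept plus three predictors
boot.coef_estimate <- matrix(nrow = nboots, ncol = n_coef)

# Raise the replicate count without rebuilding the matrix
nboots <- 200

# Writing to row 101 of a 100-row matrix fails; capture the message instead of stopping
msg <- tryCatch({
  boot.coef_estimate[101, 1:n_coef] <- rep(0, n_coef)
  "no error"
}, error = function(e) conditionMessage(e))
msg

# Re-allocating from the current nboots avoids the problem
boot.coef_estimate <- matrix(NA_real_, nrow = nboots, ncol = n_coef)
boot.coef_estimate[101, 1:n_coef] <- rep(0, n_coef)
dim(boot.coef_estimate)
```

Plain vectors such as boot.r.squared grow silently when indexed past their length, so only the matrices raise an error in this situation.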