Applying lm() and predict() to multiple columns in a data frame

Here is an example dataset:

train<-data.frame(x1 = c(4,5,6,4,3,5), x2 = c(4,2,4,0,5,4), x3 = c(1,1,1,0,0,1),
                  x4 = c(1,0,1,1,0,0), x5 = c(0,0,0,1,1,1))

Next, I want to use predict to apply these models to a test set, and then build a matrix that has each model's predictions as a column:

test <- data.frame(x1 = c(4,3,2,1,5,6), x2 = c(4,2,1,6,8,5))
p1 <- predict(lm1, newdata = test)
p2 <- predict(lm2, newdata = test)
p3 <- predict(lm3, newdata = test)
final <- cbind(p1, p2, p3)
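
The snippet above relies on lm1, lm2 and lm3, whose definitions do not appear here. Below is a minimal sketch of the full workflow. It assumes the three models are separate fits of x3, x4 and x5 on x1 and x2; this is a guess based on the responses used later in the thread, not something the question states.

```r
# Training and test data, copied from the question
train <- data.frame(x1 = c(4,5,6,4,3,5), x2 = c(4,2,4,0,5,4), x3 = c(1,1,1,0,0,1),
                    x4 = c(1,0,1,1,0,0), x5 = c(0,0,0,1,1,1))
test <- data.frame(x1 = c(4,3,2,1,5,6), x2 = c(4,2,1,6,8,5))

# ASSUMPTION: one single-response model per column x3, x4, x5,
# each regressed on x1 and x2 (the original definitions are not shown)
lm1 <- lm(x3 ~ x1 + x2, data = train)
lm2 <- lm(x4 ~ x1 + x2, data = train)
lm3 <- lm(x5 ~ x1 + x2, data = train)

# One column of predictions per model, one row per test observation
final <- cbind(p1 = predict(lm1, newdata = test),
               p2 = predict(lm2, newdata = test),
               p3 = predict(lm3, newdata = test))
print(dim(final))
```

Passing data = train keeps the fits independent of whatever happens to be in the global environment, which matters once predict is called with newdata.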

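I was about to close your question as a duplicate, but unfortunately the prediction issue is not addressed there. Another question does discuss prediction, but it is somewhat removed from your situation, since you work with the formula interface rather than the matrix interface.

I was unable to find a perfect duplicate target, so I think it is a good idea to contribute another answer to this tag. As I said in the linked questions, predict.mlm does not support se.fit, and at the moment this is also a gap in the "mlm" tag. So I will take this opportunity to fill it.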
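
To see the limitation concretely, here is a small sketch that asks predict for standard errors on an "mlm" fit and records what happens. The data and the model formula are taken from this thread; the variable name outcome is mine. On the R versions I have seen, this call stops with an error, but treat that as an assumption about your installation rather than a guarantee.

```r
# Data copied from the question
train <- data.frame(x1 = c(4,5,6,4,3,5), x2 = c(4,2,4,0,5,4), x3 = c(1,1,1,0,0,1),
                    x4 = c(1,0,1,1,0,0), x5 = c(0,0,0,1,1,1))
test <- data.frame(x1 = c(4,3,2,1,5,6), x2 = c(4,2,1,6,8,5))

# A matrix response on the left-hand side produces an object of class "mlm"
fit <- lm(cbind(x3, x4, x5) ~ x1 + x2, data = train)
print(class(fit))

# Request standard errors and capture the error message, if any.
# If no error is raised, `outcome` holds a fixed marker string instead.
outcome <- tryCatch({
  predict(fit, newdata = test, se.fit = TRUE)
  "no error raised"
}, error = function(e) conditionMessage(e))
print(outcome)
```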

Here is a function that computes the standard error of prediction:
f <- function (mlmObject, newdata) {
  ## model formula
  form <- formula(mlmObject)
  ## drop response (LHS)
  form[[2]] <- NULL
  ## prediction matrix
  X <- model.matrix(form, newdata)
  Q <- forwardsolve(t(qr.R(mlmObject$qr)), t(X))
  ## unscaled prediction standard error
  unscaled.se <- sqrt(colSums(Q ^ 2))
  ## residual standard error
  sigma <- sqrt(colSums(residuals(mlmObject) ^ 2) / mlmObject$df.residual)
  ## scaled prediction standard error
  tcrossprod(unscaled.se, sigma)
  }
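
For readers who want the algebra behind the function, the following is my reading of what it computes, not something stated in the answer. It assumes the design matrix X (n rows, p columns) has full column rank and that the QR factorization stored in the fit involved no column pivoting, so that the columns of R line up with the columns of the prediction matrix. Here x_0 is one row of the new design matrix and j indexes the response columns.

```latex
% Standard error of the fitted mean for response j at a new point x_0
\operatorname{se}\!\left(\hat{y}_{0j}\right)
  = \hat{\sigma}_j \,\sqrt{x_0^{\top} (X^{\top}X)^{-1} x_0},
\qquad
\hat{\sigma}_j^{2} = \frac{\sum_{i=1}^{n} e_{ij}^{2}}{n - p}

% With X = QR (Q orthonormal columns, R upper triangular): X^T X = R^T R, hence
x_0^{\top} (X^{\top}X)^{-1} x_0
  = x_0^{\top} R^{-1} R^{-\top} x_0
  = \left\lVert q \right\rVert_2^{2},
\qquad \text{where } R^{\top} q = x_0
```

In the code, forwardsolve(t(qr.R(...)), t(X)) solves the lower-triangular system for every new row at once, colSums(Q ^ 2) gives the squared norms, and tcrossprod forms the outer product of the unscaled errors with the per-response residual standard errors.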

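Thanks, this is helpful. Also, if I use another model, such as glmnet, what can I use to represent the "y" values? I tried the form above, but it is not accepted.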
## fit an `mlm`
fit <- lm(cbind(x3, x4, x5) ~ x1 + x2, data = train)

## prediction (mean only)
pred <- predict(fit, newdata = test)

#            x3          x4         x5
#1  0.555956679  0.38628159 0.60649819
#2  0.003610108  0.47653430 0.95848375
#3 -0.458483755  0.48014440 1.27256318
#4 -0.379061372 -0.03610108 1.35920578
#5  1.288808664  0.12274368 0.17870036
#6  1.389891697  0.46570397 0.01624549

## prediction error
pred.se <- f(fit, newdata = test)

#          [,1]      [,2]      [,3]
#[1,] 0.1974039 0.3321300 0.2976205
#[2,] 0.3254108 0.5475000 0.4906129
#[3,] 0.5071956 0.8533510 0.7646849
#[4,] 0.6583707 1.1077014 0.9926075
#[5,] 0.5049637 0.8495959 0.7613200
#[6,] 0.3552794 0.5977537 0.5356451

## `lm1`, `lm2` and `lm3` are defined in your question
predict(lm1, test, se.fit = TRUE)$se.fit
#        1         2         3         4         5         6 
#0.1974039 0.3254108 0.5071956 0.6583707 0.5049637 0.3552794 

predict(lm2, test, se.fit = TRUE)$se.fit
#        1         2         3         4         5         6 
#0.3321300 0.5475000 0.8533510 1.1077014 0.8495959 0.5977537 

predict(lm3, test, se.fit = TRUE)$se.fit
#        1         2         3         4         5         6 
#0.2976205 0.4906129 0.7646849 0.9926075 0.7613200 0.5356451
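
The comparison above is done by eye. As a sketch, the same check can be done programmatically. The function f is copied from this answer; the three single-response fits are an assumption on my part, since their definitions are not shown in the thread, and the names fits, se_joint, se_separate and agree are mine.

```r
# Data copied from the question
train <- data.frame(x1 = c(4,5,6,4,3,5), x2 = c(4,2,4,0,5,4), x3 = c(1,1,1,0,0,1),
                    x4 = c(1,0,1,1,0,0), x5 = c(0,0,0,1,1,1))
test <- data.frame(x1 = c(4,3,2,1,5,6), x2 = c(4,2,1,6,8,5))

# Prediction standard errors for an "mlm" object (copied from the answer)
f <- function (mlmObject, newdata) {
  form <- formula(mlmObject)
  form[[2]] <- NULL                                  # drop the response (LHS)
  X <- model.matrix(form, newdata)                   # prediction matrix
  Q <- forwardsolve(t(qr.R(mlmObject$qr)), t(X))
  unscaled.se <- sqrt(colSums(Q ^ 2))                # unscaled standard error
  sigma <- sqrt(colSums(residuals(mlmObject) ^ 2) / mlmObject$df.residual)
  tcrossprod(unscaled.se, sigma)                     # rows: new points, cols: responses
}

# Joint fit with a matrix response
fit <- lm(cbind(x3, x4, x5) ~ x1 + x2, data = train)
se_joint <- f(fit, newdata = test)

# ASSUMPTION: lm1, lm2, lm3 are single-response fits on the same predictors
fits <- list(lm(x3 ~ x1 + x2, data = train),
             lm(x4 ~ x1 + x2, data = train),
             lm(x5 ~ x1 + x2, data = train))
se_separate <- sapply(fits, function(m) predict(m, newdata = test, se.fit = TRUE)$se.fit)

# Compare values only; dimnames differ between the two matrices
agree <- isTRUE(all.equal(unname(se_joint), unname(se_separate)))
print(agree)
```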