How to build and test multiple models in R


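Suppose I have a data set such as the following (never mind the distributions):

I then run a Shapiro-Wilk test and a Box-Cox test on each model, for example: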

shapiro.test(residuals(md1))
boxcox(md1, plotit = T)

Is there an easy way to build and test multiple models without typing out each one by hand?

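Here is one way to do it with a simple lapply (the full listing appears further down this page, starting at the comment "# 1. Data set").

Here is an alternative approach using the tidyverse: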

modData <- data.frame("A" = rnorm(20, 15, 3),
                      "B" = rnorm(20, 20, 3),
                      "C" = rnorm(20, 25, 3),
                      "X" = rnorm(20, 5, 1))
library(tidyverse)
library(broom)

# specify predictor and target variables
x = "X"
y = names(modData)[names(modData)!= x]

expand.grid(y,x) %>%                                    # create combinations
  mutate(model_id = row_number(),                       # create model id
         frml = paste0(Var1, "~", Var2)) %>%            # create model formula
  group_by(model_id, Var1, Var2) %>%                    # group by the above
  nest() %>%                                            # nest data
  mutate(m = map(data, ~lm(.$frml, data = modData)),    # create models
         m_table = map(m, ~tidy(.)),                    # tidy model output
         st = map(m, ~shapiro.test(residuals(.)))) -> dt_model_info  # shapiro test

# access model info
dt_model_info
dt_model_info$m
dt_model_info$m_table
dt_model_info$st

# another way to access info
dt_model_info %>% unnest(m_table)

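If you don't want to pull in dozens of dependencies, you can do this with a plain sapply. Note that I have not included a boxcox step because I don't know which package it comes from (car, MASS?).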
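The boxcox() call in the question takes a plotit argument, which matches boxcox() from the MASS package; that is an assumption here, since the question does not name the package. Under that assumption, a minimal sketch of adding the Box-Cox step to an sapply loop could look like this. Fitting with y = TRUE stores the response in the model object, which, as far as I know, lets boxcox() work without refitting a model that was created inside a function. The name best_lambda is introduced only for illustration.

```r
library(MASS)  # assumed source of boxcox(); not named in the question

set.seed(1)
modData <- data.frame(A = rnorm(20, 15, 3),
                      B = rnorm(20, 20, 3),
                      C = rnorm(20, 25, 3),
                      X = rnorm(20, 5, 1))

deps <- c("A", "B", "C")

best_lambda <- sapply(deps, function(dep) {
  # y = TRUE keeps the response in the fit so boxcox() can use it directly
  model <- lm(reformulate("X", response = dep), data = modData, y = TRUE)
  # plotit = FALSE returns the lambda grid (x) and profile log-likelihood (y)
  bc <- boxcox(model, plotit = FALSE)
  # lambda on the grid with the highest log-likelihood
  bc$x[which.max(bc$y)]
})

best_lambda
```

Box-Cox requires a strictly positive response, which holds for these simulated columns but should be checked on real data.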


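Thanks, but what about the model-building step? The pseudocode I have in mind: listOfModels
Thanks! I did the same thing :)
Thank you for providing a reproducible example that clearly states your goal. New visitors can learn a lot from questions like this.
Although this code may solve the problem, it is better to add context explaining why and how it works. That helps future users learn and apply the knowledge to their own code, and users are likely to give you positive feedback in the form of upvotes when the code is explained. You should provide an explanation and rationale along with the answer, otherwise this answer may be deleted.
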
# lapply approach: fit each model, collect the fits in a list, then test each one
library(MASS)  # boxcox() below is assumed to come from MASS

# 1. Data set
df <- data.frame(
  a = rnorm(20, 15, 3),
  b = rnorm(20, 20, 3),
  c = rnorm(20, 25, 3),
  x = rnorm(20, 5, 1))

# 2. Models
fit_lm_a <- lm(a ~ x, df)
fit_lm_b <- lm(b ~ x, df)
fit_lm_c <- lm(c ~ x, df)

# 3. List of models
list_fit_lm <- list(fit_lm_a, fit_lm_b, fit_lm_c)

# 4. Shapiro-Wilk test on the residuals of each model
lapply(list_fit_lm, function(fit) {
  shapiro.test(residuals(fit))
})

# 5. Box-Cox transformations
lapply(list_fit_lm, function(fit) {
  boxcox(fit, plotit = TRUE, data = df)
})

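Steps 2 and 3 above still type out every model by hand, which is what the question wants to avoid. A minimal sketch of building the same list programmatically follows, assuming every column other than x is a response. The name responses is introduced here for illustration.

```r
set.seed(42)
df <- data.frame(
  a = rnorm(20, 15, 3),
  b = rnorm(20, 20, 3),
  c = rnorm(20, 25, 3),
  x = rnorm(20, 5, 1))

# Assumption: every column except the predictor x is a response
responses <- setdiff(names(df), "x")

# reformulate("x", response = "a") builds the formula a ~ x
list_fit_lm <- lapply(responses, function(resp) {
  lm(reformulate("x", response = resp), data = df)
})
names(list_fit_lm) <- responses

# Same Shapiro-Wilk step as above, now run over the generated list
lapply(list_fit_lm, function(fit) shapiro.test(residuals(fit)))
```

Naming the list by response makes it possible to pull out a single fit later, for example list_fit_lm$a.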
modData <- data.frame("A" = rnorm(20, 15, 3),
                      "B" = rnorm(20, 20, 3),
                      "C" = rnorm(20, 25, 3),
                      "X" = rnorm(20, 5, 1))

deps <- c("A", "B", "C")
indeps <- c("X")

result <- sapply(deps, FUN = function(x, indeps, mydata) {
  myformula <- formula(sprintf("%s ~ %s", x, indeps))

  model <- lm(myformula, data = mydata)
  out.shapiro <- shapiro.test(residuals(model))

  return(list(model = model, shapiro = out.shapiro))
}, indeps = indeps, mydata = modData, simplify = FALSE)

# Another option: keep the formulas in a list and fit them in a loop or with lapply
data("mtcars")

formulas <- list(
  mpg ~ disp,
  mpg ~ disp + wt
)

res <- vector("list", length = length(formulas))

# Option 1: fill the preallocated list in a loop
for (i in seq_along(formulas)) {
  res[[i]] <- lm(formulas[[i]], data = mtcars)
}
res

# Option 2: the same result in a single call
lapply(formulas, lm, data = mtcars)
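
Either version returns a list of fitted models, and the tests from the question can then be mapped over that list. Here is a small sketch that collects the Shapiro-Wilk results in one data frame. The column names and the names fits and summary_tbl are illustrative choices, not part of the original answer.

```r
data("mtcars")

formulas <- list(
  mpg ~ disp,
  mpg ~ disp + wt
)

fits <- lapply(formulas, lm, data = mtcars)

# One row per model: formula text plus Shapiro-Wilk W and p-value of the residuals
summary_tbl <- do.call(rbind, lapply(fits, function(fit) {
  st <- shapiro.test(residuals(fit))
  data.frame(
    formula = paste(deparse(formula(fit)), collapse = " "),
    W       = unname(st$statistic),
    p_value = st$p.value
  )
}))

summary_tbl
```

A single table like this is easier to scan than a printed list of test objects when there are many models.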