R: How to gather predictions from a large number of models
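In the modelr package, the function gather_predictions() can be used to add predictions from multiple models to a data frame, but I am not sure how to specify those models in the function call when there are many of them. The help documentation gives the following example: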
df <- tibble::tibble(  # data_frame() is deprecated in current tibble releases
  x = sort(runif(100)),
  y = 5 * x + 0.5 * x ^ 2 + 3 + rnorm(length(x))
)
m1 <- lm(y ~ x, data = df)
grid <- data.frame(x = seq(0, 1, length = 10))
grid %>% add_predictions(m1)
m2 <- lm(y ~ poly(x, 2), data = df)
grid %>% spread_predictions(m1, m2)
grid %>% gather_predictions(m1, m2)
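There are a few workarounds for this problem. My approach is to:
1. Build a list of models with specific names
2. Use a slightly modified version of modelr::gather_predictions() to apply every model in the list to the data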
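This works for this example, but it does mean you have to create the models as you go, without storing them. That is fine for linear models, which are quick to train, but it may not be ideal for more complex models.

@MarijnStevering Oh yes, it only stores the results, but I did it that way just to keep it concise. To store the models you would only need to add one more line inside the loop, which is the same situation you ran into in the question. That said, you cannot pass a list to gather_predictions() this way, so you might end up fitting the models twice, or at least having to convert them out of the list before handing them to gather_predictions().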
# start with the degree-1 fit, then append the predictions for degrees 2 to 10
m <- grid %>% gather_predictions(lm(y ~ poly(x, 1), data = df))
for (N in 2:10) {
  m <- rbind(m, grid %>% gather_predictions(lm(y ~ poly(x, N), data = df)))
}
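One thing worth checking in the loop above: as far as I can tell, gather_predictions() labels an unnamed model with the text of the expression you pass in, and that text is identical for every N, so the model column may not tell the fits apart. Below is a minimal sketch of a variant that records the degree explicitly. It uses modelr::add_predictions() instead of gather_predictions(); the degree column and the fit_degree() helper are names I made up for illustration, and the data is rebuilt with base R so the snippet runs on its own.

```r
library(modelr)

set.seed(1363)
df <- data.frame(x = sort(runif(100)))
df$y <- 5 * df$x + 0.5 * df$x^2 + 3 + rnorm(nrow(df))
grid <- data.frame(x = seq(0, 1, length.out = 10))

# Hypothetical helper: fit one polynomial model and return the grid with
# a pred column plus an explicit degree label.
fit_degree <- function(N) {
  # Build the formula from a string so the degree is baked into it.
  fit <- lm(as.formula(sprintf("y ~ poly(x, %d)", N)), data = df)
  out <- add_predictions(grid, fit)
  out$degree <- N
  out
}

# Stack the per-degree results in one step instead of growing m with rbind().
m <- do.call(rbind, lapply(1:10, fit_degree))
```

Each row of m then carries x, pred and degree, so facet_wrap(~ degree) or a colour aesthetic can separate the fits.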
# prerequisites
library(tidyverse)
set.seed(1363)
# I'll use generic name 'data' throughout the code, so you can easily try other datasets.
# for this example I'll use your data df
data <- df
# data visualization
ggplot(data, aes(x, y)) +
  geom_point(size = 3)
# build a list of models
models <- vector("list", length = 5)
model_names <- vector("character", length = 5)
for (i in 1:5) {
  modelformula <- str_c("y ~ poly(x,", i, ")", sep = "")
  models[[i]] <- lm(as.formula(modelformula), data = data)
  model_names[[i]] <- str_c("model", i) # the models are named sequentially here
}
# apply names to the models list
names(models) <- model_names
# this is a modified version of modelr::gather_predictions() that accepts a list of models
gather.predictions <- function(data, models, .pred = "pred", .model = "model") {
  # add each model's predictions to a copy of data, in a column named by .pred
  df <- map2(models, .pred, modelr::add_predictions, data = data)
  names(df) <- names(models)
  # stack the copies and record each model's name in the .model column
  bind_rows(df, .id = .model)
}
# the rest is the same as modelr's function...
# .pred = "y" writes the predictions into the y column, so geom_line() below can reuse aes(x, y)
grids <- gather.predictions(data = data, models = models, .pred = "y")
ggplot(data, aes(x, y)) +
  geom_point() +
  geom_line(data = grids, colour = "red") +
  facet_wrap(~ model)
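As an alternative to rewriting the function, a named list of models can probably be handed to the unmodified gather_predictions() through base R's do.call(), which turns the list elements into named arguments. This is only a sketch under that assumption: I expect the list names to become the model labels because they arrive as argument names, but I have not checked this against every modelr version. The degree_ names are made up for illustration, and the data is rebuilt with base R so the snippet runs on its own.

```r
library(modelr)

set.seed(1363)
df <- data.frame(x = sort(runif(100)))
df$y <- 5 * df$x + 0.5 * df$x^2 + 3 + rnorm(nrow(df))
grid <- data.frame(x = seq(0, 1, length.out = 10))

# Fit each model once and keep it in a named list, so nothing is refitted later.
degrees <- 1:5
models <- lapply(degrees, function(d) {
  lm(as.formula(sprintf("y ~ poly(x, %d)", d)), data = df)
})
names(models) <- paste0("degree_", degrees)

# Equivalent to gather_predictions(grid, degree_1 = ..., degree_2 = ..., ...).
preds <- do.call(gather_predictions, c(list(data = grid), models))
```

The stored fits stay available in models for summary() or anova(), which addresses the concern raised in the comments about not keeping the models around.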