Unnesting a list column after modeling with group_by in R

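After group_by, I want to fit a linear regression for every group, store the model coefficients in a list column, and then use unnest to expand that list column. I use the mtcars dataset as an example here.

Note: I want to use do here because broom::tidy does not work for all models.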
mtcars %>% group_by(cyl) %>% 
    do(model=lm(mpg~wt+hp, data=.)) %>% 
    mutate(coefs = list(summary(model)$coefficients)) %>% 
    unnest()

I would like something like this:
cyl   term         Estimate Std. Error   t value     Pr(>|t|)
 4     (Intercept) 36.9083305 2.19079864 16.846975 1.620660e-16
 4     wt         -2.2646936 0.57588924 -3.932516 4.803752e-04
 4     hp          -0.0191217 0.01500073 -1.274718 2.125285e-01
 6.......
 6......
........

I get the following error:
Error: All nested columns must have the same number of elements.

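Can anyone help solve this? I have tried many times and could not figure it out.

One option is to extract the coefs column ($coefs), set the names of that list from the cyl column, loop over the list with map, convert each element to a data.frame, create a new column from the row names, and use .id to create the cyl column from the names of the list.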
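library(tidyverse)
mtcars %>% 
   group_by(cyl) %>% 
   do(model = lm(mpg ~ wt + hp, data = .)) %>% 
   # keep only the coefficient matrix of each fitted model
   mutate(coefs = list(summary(model)$coefficients)) %>%
   select(-model) %>% 
   # braces stop the pipe from also passing the data as the first argument
   {set_names(.$coefs, .$cyl)} %>%
   # row names become 'term'; the list names become 'cyl'
   map_df(~ .x %>% 
               as.data.frame %>%
               rownames_to_column('term'), .id = 'cyl')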
# cyl        term    Estimate Std. Error   t value     Pr(>|t|)
#1   4 (Intercept) 45.83607319 4.78693568  9.575243 1.172558e-05
#2   4          wt -5.11506233 1.60247105 -3.191984 1.276524e-02
#3   4          hp -0.09052672 0.04359827 -2.076383 7.151610e-02
#4   6 (Intercept) 32.56630096 5.57482132  5.841676 4.281411e-03
#5   6          wt -3.24294031 1.37365306 -2.360815 7.759393e-02
#6   6          hp -0.02219994 0.02017664 -1.100279 3.329754e-01
#7   8 (Intercept) 26.66393686 3.66217797  7.280896 1.580743e-05
#8   8          wt -2.17626765 0.72094143 -3.018647 1.168393e-02
#9   8          hp -0.01367295 0.01073989 -1.273099 2.292303e-01
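
For a single model, the conversion step applied to each list element can be sketched roughly as below. The names fit, coef_mat and coef_df are made up for this illustration and do not come from the answer; the model is fitted on all of mtcars only to keep the example short.

```r
library(tibble)

# One model on the whole data set, just to inspect the shape of the coefficients
fit <- lm(mpg ~ wt + hp, data = mtcars)

# summary()$coefficients is a numeric matrix: one row per term, 4 columns
coef_mat <- summary(fit)$coefficients

# Convert to a data.frame and move the row names into a regular 'term' column
coef_df <- rownames_to_column(as.data.frame(coef_mat), "term")
```

Because the term names live in the row names of the matrix, they would be lost without the rownames_to_column step.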

If we want to use tidy instead, change the contents of map_df to
       ...                %>%
        map_df(~ .x %>% 
                          broom::tidy(.), .id = 'cyl')


Another option is to nest after grouping, apply broom::tidy to the model object, and then unnest:
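mtcars %>% 
   group_by(cyl) %>%
   nest() %>% 
   # fit one model per nested data frame and keep its tidied coefficients
   mutate(data = map(data, ~ .x %>%
                    summarise(model = list(broom::tidy(lm(mpg ~ wt + hp)))))) %>% 
   # name the list columns explicitly instead of relying on the default
   unnest(data) %>% 
   unnest(model)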

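Can you show what your expected output is? The coefficients in the list have different structures.
As I said, it is unclear without the expected output. Perhaps: mtcars %>% group_by(cyl) %>% do(model = lm(mpg ~ cyl + hp, data = .)) %>% mutate(coefs = list(summary(model)$coefs)) %>% select(-model) %>% mutate(coefs = list(map_df(coefs, ~ .x %>% enframe))) %>% unnest
@akrun I added the expected output.
Try: mtcars %>% group_by(cyl) %>% do(model = lm(mpg ~ cyl + hp, data = .)) %>% mutate(coefs = list(summary(model)$coefficients)) %>% select(-model) %>% {set_names(.$coefs, .$cyl)} %>% map_df(~ .x %>% as_tibble, .id = 'cyl')
@akrun Thanks for your answer. Could you post it as an answer so that everyone can see it more clearly? Also, I changed cyl to wt as the independent variable, because cyl is the grouping variable. Thanks.
Thank you very much. Could you explain this line? {set_names(.$coefs, .$cyl)}
@akrun Why do we need the curly braces?
@zesla I added some explanation. Basically, the .id argument in map creates a column. The curly braces are needed so that the whole block is evaluated as a single expression when we do more than one thing at once, here extracting two columns.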
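
The effect of the braces can be seen in a small sketch with a toy data frame. The names df, with_braces and failed are invented for this illustration, and base setNames stands in for set_names so that only magrittr is needed. Without braces, the pipe still inserts the left-hand side as the first argument when the dot appears only inside nested expressions such as .$value, so the call receives one argument too many.

```r
library(magrittr)

df <- data.frame(key = c("a", "b"), value = c(1, 2), stringsAsFactors = FALSE)

# With braces: the block is evaluated as written, with `.` bound to df,
# and df itself is NOT added as the first argument
with_braces <- df %>% { setNames(.$value, .$key) }

# Without braces: this is evaluated as setNames(df, .$value, .$key),
# which fails because setNames() only takes two arguments
failed <- tryCatch({
  df %>% setNames(.$value, .$key)
  FALSE
}, error = function(e) TRUE)
```

Here with_braces is a numeric vector named by the key column, which is the same shape of result that {set_names(.$coefs, .$cyl)} produces for the list of coefficient matrices.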