R: extract coefficients from regression models, add a string, and create a data frame using lapply and sprintf


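Suppose I have several models (for convenience, two survival models and two logistic models here), and I only want to look at the sex estimate from each one.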

library(survival)
data(colon)
sn <- Surv(colon$time, colon$status)
fit <- coxph(sn ~ sex + perfor + age, data = colon)
fit1 <- coxph(sn ~ sex + perfor + surg + rx , data = colon)
fit2 <- glm(factor(status) ~ sex + age, data=colon, family=binomial(link = "logit")) 
fit3 <- glm(factor(status) ~ sex + age + nodes, data=colon, family=binomial(link = "logit")) 
and I would like to end up with a data frame like this:

> df2
  model_survival                sur_estimate model_logistic           logistic_estimate
1            fit 0.97 (95 % CI 0.85 to 1.10)           fit2 0.97 (95 % CI 0.81 to 1.17)
2           fit1 0.94 (95 % CI 0.83 to 1.07)           fit3 0.98 (95 % CI 0.81 to 1.18)
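My attempt so far: I used lapply, which I think is better than a for loop, and it already solves most of the problem. However, I would like the steps that currently sit outside lapply to happen inside it, so that the whole thing is more automated if I add more models. See below.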

mylist<-list(fit,fit1,fit2,fit3)
results <- list()
results <- lapply(mylist, function(x) {
  sprintf("%.2f (95 %% CI %.2f to %.2f)",     
          exp(coef(x))["sex"], 
          exp(confint(x)[,1])["sex"], 
          exp(confint(x)[,2])["sex"])
})          
results <- do.call(rbind.data.frame, results)
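As a side note on the format string used above, here is a minimal sketch of what sprintf does with it. The three numbers are illustrative values chosen for this example, not output from any of the fitted models. `%.2f` rounds to two decimals and `%%` prints a literal percent sign.

```r
# Illustrative numbers only (not taken from a fitted model)
est <- 0.9666
lo  <- 0.8512
hi  <- 1.0977

# %.2f formats each number with two decimals; %% is an escaped "%"
label <- sprintf("%.2f (95 %% CI %.2f to %.2f)", est, lo, hi)
label
# [1] "0.97 (95 % CI 0.85 to 1.10)"
```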

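To get df2 I could do some long-to-wide reshaping, but can I define the layout and column names up front inside lapply? (I know I would probably need two separate lapply calls, one for df and one for df2.) Thanks.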
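One possible way to fold the naming step into the apply call, sketched in base R. The names `fmt_est` and `models` are hypothetical and introduced only for this example, and only two of the four models are refitted to keep it short. The idea is to store the models in a named list so the labels travel with the results, which may remove the need for renaming columns afterwards.

```r
library(survival)

# Hypothetical helper (not from the question): format exp(coef) and its
# 95% confidence interval for a single term of a fitted model.
fmt_est <- function(model, term = "sex") {
  ci <- confint(model)  # matrix with one row per coefficient
  sprintf("%.2f (95 %% CI %.2f to %.2f)",
          exp(coef(model)[[term]]),
          exp(ci[term, 1]),
          exp(ci[term, 2]))
}

# A named list keeps each model's label attached to its result.
models <- list(
  fit  = coxph(Surv(time, status) ~ sex + perfor + age, data = colon),
  fit2 = glm(factor(status) ~ sex + age, data = colon,
             family = binomial(link = "logit"))
)

# vapply() returns a named character vector, so the model column can be
# built from names(models) instead of being typed by hand.
df <- data.frame(model    = names(models),
                 estimate = vapply(models, fmt_est, character(1)),
                 row.names = NULL,
                 stringsAsFactors = FALSE)
df
```

Adding another model then only means adding another named element to `models`.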

We can use map together with stack:

library(tidyverse)
out <- mget(ls(pattern = "fit\\d*")) %>% 
        map(~sprintf("%.2f (95 %% CI %.2f to %.2f)",     
           exp(coef(.x))["sex"], 
           exp(confint(.x)[,1])["sex"], 
           exp(confint(.x)[,2])["sex"])) %>%
        stack %>%
        select(model = ind, estimate = values)
out
#  model                    estimate
#1   fit 0.97 (95 % CI 0.85 to 1.10)
#2  fit1 0.94 (95 % CI 0.83 to 1.07)
#3  fit2 0.97 (95 % CI 0.81 to 1.17)
#4  fit3 0.98 (95 % CI 0.81 to 1.18)

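Nice, I learned a lot there. If multiple columns are needed, as in df2, is creating another column and reshaping the data frame solution from long to wide the easiest way, or is there some way to build it into the function?
@user63230 Please check the update; I have just added the wide format.
@user63230 If you need stars based on p-values, check stars.pval
@user63230 you are using
I need sleep, sorry!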
The extra steps that currently sit outside lapply in the question:
colnames(results)[1] <- "estimate"
results$model <- c("fit", "fit1", "fit2", "fit3")
For the wide format, out can be reshaped with dcast from data.table:
library(data.table)  # using dcast as it can take multiple value.vars
out %>%
   group_by(group = rep(c("model_survival", "model_logistic"), each = 2)) %>%
   mutate(rn = row_number()) %>%
   as.data.table %>%
   dcast(., rn ~ group, value.var = c('model', 'estimate')) %>% 
   select(-rn)
# model_model_logistic model_model_survival     estimate_model_logistic     estimate_model_survival
#1:                 fit2                  fit 0.97 (95 % CI 0.81 to 1.17) 0.97 (95 % CI 0.85 to 1.10)
#2:                 fit3                 fit1 0.98 (95 % CI 0.81 to 1.18) 0.94 (95 % CI 0.83 to 1.07)
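If data.table is not available, a similar wide layout can be sketched in base R by splitting the long table into one block per model type and binding the blocks side by side. This is only an illustration: `groups` is a hypothetical mapping introduced here, the long table is retyped from the output shown above, and the approach assumes every group contains the same number of models. The column names follow a `model_<group>` / `<group>_estimate` pattern, which is close to but not identical to the names in df2.

```r
# Long table retyped from the output shown above
out <- data.frame(
  model    = c("fit", "fit1", "fit2", "fit3"),
  estimate = c("0.97 (95 % CI 0.85 to 1.10)", "0.94 (95 % CI 0.83 to 1.07)",
               "0.97 (95 % CI 0.81 to 1.17)", "0.98 (95 % CI 0.81 to 1.18)"),
  stringsAsFactors = FALSE
)

# Hypothetical grouping: which models go into which block of columns
groups <- list(survival = c("fit", "fit1"), logistic = c("fit2", "fit3"))

# Build one two-column block per group, with group-specific column names
blocks <- lapply(names(groups), function(g) {
  block <- out[out$model %in% groups[[g]], , drop = FALSE]
  rownames(block) <- NULL
  names(block) <- c(paste0("model_", g), paste0(g, "_estimate"))
  block
})

# Bind the blocks column-wise (requires equal group sizes)
df2 <- do.call(cbind, blocks)
df2
```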