R: How do I add rows and extrapolate data by multiple variables?

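I am trying to add the missing rows for "day" and extrapolate the data for "value". In my data, each subject ("id") has 2 periods (period 1 and period 2) and values for consecutive days.

An example of my data is shown below: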

df <- data.frame(
  id  =    c(1,1,1,1,  1,1,1,1,  2,2,2,2,  2,2,2,2,  3,3,3,3,  3,3,3,3),
  period = c(1,1,1,1,  2,2,2,2,  1,1,1,1,  2,2,2,2,  1,1,1,1,  2,2,2,2),
  day=     c(1,2,4,5,  1,3,4,5,  2,3,4,5,  1,2,3,5,  2,3,4,5,  1,2,3,4),
  value   =c(10,12,15,16, 11,14,15,17, 13,14,15,16, 15,16,18,20,  16,17,19,29, 14,16,18,20))

We can use complete from tidyr to add the missing days, and na.interp from forecast to fill in the missing values:

library(dplyr)
library(tidyr)
library(forecast)
df %>% 
    group_by(id, period) %>% 
    complete(day =1:7)%>% 
    mutate(value = as.numeric(na.interp(value)))
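
To make the interpolation step concrete, here is a minimal sketch for a single group (id 1, period 1) that uses base R's approx as a stand-in for na.interp. That substitution is an assumption for illustration only, not a description of what forecast does internally. With rule = 2, approx interpolates linearly inside the observed range and repeats the nearest observed value outside it, so it does not extend a trend to days 6 and 7.

```r
# One group from the question: id 1, period 1 (day 3 is missing)
day   <- c(1, 2, 4, 5)
value <- c(10, 12, 15, 16)

# Linear interpolation on days 1 to 7; rule = 2 repeats the nearest
# observed value for days outside the observed range (6 and 7 here)
filled <- approx(x = day, y = value, xout = 1:7, rule = 2)$y

data.frame(day = 1:7, value = filled)
# day 3 becomes 13.5; days 6 and 7 stay at 16
```

If the values should keep rising after the last observed day, a model-based fill is probably the better fit.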

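@akrun's answer is good, as long as you don't mind using linear interpolation. However, if you do want to use a linear model, you could try this data.table approach: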

library(data.table)
model <- lm(value ~ day + period + id,data=df)
dt <- as.data.table(df)[,.SD[,.(day = 1:7,value = value[match(1:7,day)])],by=.(id,period)]
dt[is.na(value), value := predict(model,.SD),]
dt
    id period day    value
 1:  1      1   1 10.00000
 2:  1      1   2 12.00000
 3:  1      1   3 12.86714
 4:  1      1   4 15.00000
 5:  1      1   5 16.00000
 6:  1      1   6 18.13725
 7:  1      1   7 19.89396
 8:  1      2   1 11.00000
 9:  1      2   2 12.15545
10:  1      2   3 14.00000
11:  1      2   4 15.00000
12:  1      2   5 17.00000
13:  1      2   6 19.18227
14:  1      2   7 20.93898
15:  2      1   1 11.90102
16:  2      1   2 13.00000
17:  2      1   3 14.00000
18:  2      1   4 15.00000
19:  2      1   5 16.00000
20:  2      1   6 20.68455
21:  2      1   7 22.44125
22:  2      2   1 15.00000
23:  2      2   2 16.00000
24:  2      2   3 18.00000
25:  2      2   4 18.21616
26:  2      2   5 20.00000
27:  2      2   6 21.72957
28:  2      2   7 23.48627
29:  3      1   1 14.44831
30:  3      1   2 16.00000
31:  3      1   3 17.00000
32:  3      1   4 19.00000
33:  3      1   5 29.00000
34:  3      1   6 23.23184
35:  3      1   7 24.98855
36:  3      2   1 14.00000
37:  3      2   2 16.00000
38:  3      2   3 18.00000
39:  3      2   4 20.00000
40:  3      2   5 22.52016
41:  3      2   6 24.27686
42:  3      2   7 26.03357
    id period day    value
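
For readers who do not use data.table, the same idea can be sketched in base R: build the full id/period/day grid, merge the observed values onto it, and let the linear model predict only the rows where value is missing. The names grid, full and missing are my own, and the model formula is simply copied from the answer above, so treat this as an illustration rather than a recommended model.

```r
# Data from the question
df <- data.frame(
  id     = c(1,1,1,1, 1,1,1,1, 2,2,2,2, 2,2,2,2, 3,3,3,3, 3,3,3,3),
  period = c(1,1,1,1, 2,2,2,2, 1,1,1,1, 2,2,2,2, 1,1,1,1, 2,2,2,2),
  day    = c(1,2,4,5, 1,3,4,5, 2,3,4,5, 1,2,3,5, 2,3,4,5, 1,2,3,4),
  value  = c(10,12,15,16, 11,14,15,17, 13,14,15,16, 15,16,18,20,
             16,17,19,29, 14,16,18,20)
)

# Every combination of id, period and day 1 to 7
grid <- expand.grid(day = 1:7, period = 1:2, id = 1:3)

# Left join: observed values are kept, missing days get NA
full <- merge(grid, df, by = c("id", "period", "day"), all.x = TRUE)

# Same linear model as in the answer, fitted on the observed rows only
model <- lm(value ~ day + period + id, data = df)

# Predict only where value is missing, leaving observed values untouched
missing <- is.na(full$value)
full$value[missing] <- predict(model, newdata = full[missing, ])

full
```

Note that id and period enter the model as plain numbers here, exactly as in the answer; wrapping them in factor() would be a different model.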

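This model solves the first part, but it cannot be used to extrapolate: df3

@Murat na.interp is doing linear interpolation.

I didn't notice your edit at first, sorry. Great solution! Thank you very much! Actually, I have one more question: I realized that the other columns, which I do not need to extrapolate (they are not included in the sample data), are not carried over, and I am having trouble completing the expansion. The added rows should keep the same values. I am not sure whether I should ask a separate question or whether I can follow up in a comment on this one?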
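
On the follow-up about other columns: one possible approach, sketched under assumptions, is to call tidyr's fill after complete so that the added rows inherit the group's existing value. The column name site is hypothetical (the real extra columns are not shown in the question), and the sketch assumes those columns are constant within each id/period group.

```r
library(dplyr)
library(tidyr)

# Toy data: `site` is a hypothetical extra column, constant within each group
df2 <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2),
  period = c(1, 1, 1, 1, 1, 1),
  day    = c(1, 2, 4, 2, 3, 5),
  value  = c(10, 12, 15, 13, 14, 16),
  site   = c("A", "A", "A", "B", "B", "B")
)

out <- df2 %>%
  group_by(id, period) %>%
  complete(day = 1:7) %>%                # add the missing days (all other columns NA)
  fill(site, .direction = "downup") %>%  # copy site into the new rows, within each group
  ungroup()

out
```

The value column is still NA in the added rows at this point, so it can be filled afterwards with whichever method you prefer.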