R: Aggregating a data frame by date intervals
I have a large data frame, a minimal working example of which is
df <- structure(list(
from = structure(
c(13858, 13859, 13860, 13861,
13864, 13865, 13866, 13867, 13868, 13871, 13871, 13872, 13873,
13874, 13875, 13878, 13878, 13879, 13880, 13881, 13882, 13885,
13886, 13887), class = "Date"),
to = structure(
c(13859, 13860,
13861, 13864, 13865, 13866, 13867, 13868, 13871, 13872, 13874,
13873, 13874, 13875, 13878, 13879, 13880, 13880, 13881, 13882,
13885, 13886, 13887, 13888), class = "Date"),
X1 = c(6, 5, 5, NA, NA, 4, 5, 4, 3, NA, NA, NA, NA, 6, 0, NA, NA, NA, 3,
5, 4, 5, 6, 10),
X2 = c(11, 5, 3, NA, 6, 10, 7, 3, 8, NA, 3, NA, NA, 7, 7, NA, 5, NA, 7,
4, 3, 2, 8, 8),
X3 = c(9, 3, 3, NA, 5, 7, 7, 6, 9, NA, 1, NA, NA, 6, 6, NA, 8, NA, 9, 2,
9, 4, 5, 9),
X4 = c(8, 5, 5, 4, 8, 8, 6, 5, 2, 4, NA, 10, 4, 4, 4, 5, NA, 4, 3, 3, 7,
3, 2, 1)),
.Names = c("from", "to", "X1", "X2", "X3", "X4"),
row.names = c(NA, -24L), class = "data.frame")
The data frame is jointly indexed by the columns from and to, since
anyDuplicated(df[c('from','to')])==0 # TRUE
However, there is some overlap: the (from, to) intervals do not partition the date range uniquely, because
anyDuplicated(df['from'])>0 # TRUE
anyDuplicated(df['to'])>0 # TRUE
For example, the (from, to) interval from 2007-12-24 to 2007-12-27 also appears as three sub-intervals: (2007-12-24, 2007-12-25), (2007-12-25, 2007-12-26) and (2007-12-26, 2007-12-27).
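To make the overlap concrete, here is a minimal sketch that flags rows whose interval lies inside some other row's interval. It uses a small toy data frame copied from the rows around 2007-12-24 rather than the full df; the names toy and covered are only illustrative.

```r
# Toy subset of df around 2007-12-24 (values copied from the example data)
toy <- data.frame(
  from = as.Date(c("2007-12-21", "2007-12-24", "2007-12-24",
                   "2007-12-25", "2007-12-26", "2007-12-27")),
  to   = as.Date(c("2007-12-24", "2007-12-25", "2007-12-27",
                   "2007-12-26", "2007-12-27", "2007-12-28")),
  X4   = c(2, 4, NA, 10, 4, 4)
)

idx <- seq_len(nrow(toy))

# Row i is "covered" if a different row j has from[j] <= from[i] and to[j] >= to[i].
# Excluding j == i is enough because the (from, to) pairs are unique.
covered <- sapply(idx, function(i) {
  any(toy$from <= toy$from[i] & toy$to >= toy$to[i] & idx != i)
})

# The three sub-intervals of (2007-12-24, 2007-12-27)
toy[covered, c("from", "to")]
```

This pairwise comparison is quadratic in the number of rows, so it is meant for inspecting the data rather than for use on a large data frame.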
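I would like to aggregate this data frame so that the date intervals in (from, to) no longer overlap one another. For each data column X1, ..., X4, I want to sum the values that are "duplicated" in this sense, i.e. that belong to overlapping intervals. The first table below is the data frame as it stands; the second is the result I am after.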
from to X1 X2 X3 X4
1 2007-12-11 2007-12-12 6 11 9 8
2 2007-12-12 2007-12-13 5 5 3 5
3 2007-12-13 2007-12-14 5 3 3 5
4 2007-12-14 2007-12-17 NA NA NA 4
5 2007-12-17 2007-12-18 NA 6 5 8
6 2007-12-18 2007-12-19 4 10 7 8
7 2007-12-19 2007-12-20 5 7 7 6
8 2007-12-20 2007-12-21 4 3 6 5
9 2007-12-21 2007-12-24 3 8 9 2
10 2007-12-24 2007-12-25 NA NA NA 4
11 2007-12-24 2007-12-27 NA 3 1 NA
12 2007-12-25 2007-12-26 NA NA NA 10
13 2007-12-26 2007-12-27 NA NA NA 4
14 2007-12-27 2007-12-28 6 7 6 4
15 2007-12-28 2007-12-31 0 7 6 4
16 2007-12-31 2008-01-01 NA NA NA 5
17 2007-12-31 2008-01-02 NA 5 8 NA
18 2008-01-01 2008-01-02 NA NA NA 4
19 2008-01-02 2008-01-03 3 7 9 3
20 2008-01-03 2008-01-04 5 4 2 3
21 2008-01-04 2008-01-07 4 3 9 7
22 2008-01-07 2008-01-08 5 2 4 3
23 2008-01-08 2008-01-09 6 8 5 2
24 2008-01-09 2008-01-10 10 8 9 1
from to X1 X2 X3 X4
1 2007-12-11 2007-12-12 6 11 9 8
2 2007-12-12 2007-12-13 5 5 3 5
3 2007-12-13 2007-12-14 5 3 3 5
4 2007-12-14 2007-12-17 NA NA NA 4
5 2007-12-17 2007-12-18 NA 6 5 8
6 2007-12-18 2007-12-19 4 10 7 8
7 2007-12-19 2007-12-20 5 7 7 6
8 2007-12-20 2007-12-21 4 3 6 5
9 2007-12-21 2007-12-24 3 8 9 2
10 2007-12-24 2007-12-27 NA 3 1 18
11 2007-12-27 2007-12-28 6 7 6 4
12 2007-12-28 2007-12-31 0 7 6 4
13 2007-12-31 2008-01-02 NA 5 8 9
14 2008-01-02 2008-01-03 3 7 9 3
15 2008-01-03 2008-01-04 5 4 2 3
16 2008-01-04 2008-01-07 4 3 9 7
17 2008-01-07 2008-01-08 5 2 4 3
18 2008-01-08 2008-01-09 6 8 5 2
19 2008-01-09 2008-01-10 10 8 9 1
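I have never come across a problem like this before, and I could not find a similar question on Stack Overflow. It seems to be a more complex kind of aggregation than what I normally do with aggregate(). Any solutions, code, or references would therefore be much appreciated.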
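For reference, here is a rough sketch of one way the aggregation described above might be done in base R. It rests on assumptions read off the desired output rather than stated rules: intervals that merely touch (one row's to equals the next row's from) do not count as overlapping, and a sum ignores NA values but stays NA when every value in the group is NA. The names merge_overlaps, sum_or_na and toy are hypothetical, and the toy data is a subset of df around 2007-12-24.

```r
# Sum that ignores NAs but returns NA when there is nothing to sum
sum_or_na <- function(x) if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)

merge_overlaps <- function(d, value_cols) {
  # Sort by start date so that overlapping rows become adjacent
  d <- d[order(d$from, d$to), ]
  n <- nrow(d)
  # Latest end date seen so far, as day counts
  run_max_to <- cummax(as.numeric(d$to))
  # A row opens a new group when it starts at or after every earlier row has ended
  new_group <- c(TRUE, as.numeric(d$from)[-1] >= run_max_to[-n])
  grp <- cumsum(new_group)
  # Collapse each group to a single row
  pieces <- lapply(split(d, grp), function(p) {
    row <- data.frame(from = min(p$from), to = max(p$to))
    for (col in value_cols) row[[col]] <- sum_or_na(p[[col]])
    row
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Toy subset of df around 2007-12-24 (values copied from the example data)
toy <- data.frame(
  from = as.Date(c("2007-12-21", "2007-12-24", "2007-12-24",
                   "2007-12-25", "2007-12-26", "2007-12-27")),
  to   = as.Date(c("2007-12-24", "2007-12-25", "2007-12-27",
                   "2007-12-26", "2007-12-27", "2007-12-28")),
  X2   = c(8, NA, 3, NA, NA, 7),
  X4   = c(2, 4, NA, 10, 4, 4)
)

res <- merge_overlaps(toy, c("X2", "X4"))
print(res)
```

On this subset the four rows covering 2007-12-24 to 2007-12-27 collapse into one row with X2 = 3 and X4 = 18, which agrees with row 10 of the desired output. The same call on the full data would be merge_overlaps(df, c("X1", "X2", "X3", "X4")).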