R: How can I efficiently compute a sequence of start and end dates in a tibble?
I have the following starting point:
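# dataset (tibble() is re-exported by dplyr; ddays() and as_datetime() come from lubridate)
library(dplyr)
library(lubridate)
schedule <- tibble(start = as.Date(c("2018-07-11", NA, NA)), duration = c(10, 23, 9), flag_StartActual = c(TRUE, FALSE, FALSE))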
Including rowwise() in code like the following does not work:
schedule %>%
  rowwise() %>%
  mutate(
    end = start + ddays(duration),
    start = as_datetime(ifelse(flag_StartActual == TRUE, start, lag(end)))
  )
Either way, I'm a bit stuck and hope someone has a clever idea for how to handle this.

Use a loop:
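# Each start is the previous start plus the previous duration (in days)
for (i in 2:nrow(schedule)) {
  schedule$start[i] <- schedule$start[i - 1] + schedule$duration[i - 1]
}
# With every start filled in, the end column is a single vectorized step
schedule$end <- schedule$start + schedule$duration
schedule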
# A tibble: 3 × 4
  start      duration flag_StartActual end
  <date>        <dbl> <lgl>            <date>
1 2018-07-11       10 TRUE             2018-07-21
2 2018-07-21       23 FALSE            2018-08-13
3 2018-08-13        9 FALSE            2018-08-22
Note: I created the end column after computing all the start values; I think that is a bit easier.
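As for why the rowwise() attempt in the question yields NA, one likely factor is that rowwise() makes every row its own group, so lag() only ever sees a length-one vector and has no previous row to return. The sketch below illustrates just that point on a made-up column `v` (not part of the original data); it does not cover the other issues in that code, such as each `end` depending on a `start` that is still NA.

```r
library(dplyr)

# Toy data; the column name `v` is an assumption for illustration only
x <- tibble(v = c(1, 2, 3))

# Ungrouped: lag() sees the whole column and shifts it down by one
ungrouped <- x %>%
  mutate(prev = lag(v))

# rowwise(): each row is evaluated on its own, so lag() sees one value
# and returns NA for every row
by_row <- x %>%
  rowwise() %>%
  mutate(prev = lag(v)) %>%
  ungroup()
```

This is why the answers here drop rowwise() and work on whole columns, either with an explicit loop or with cumulative sums.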
# Vectorized alternative: offset the first start date by cumulative durations
schedule %>%
  mutate(
    start = schedule$start[1] + ddays(c(0, cumsum(schedule$duration)[-n()])),
    end = schedule$start[1] + ddays(cumsum(schedule$duration))
  )
# A tibble: 3 x 4
  start      duration flag_StartActual end
  <date>        <dbl> <lgl>            <date>
1 2018-07-11       10 TRUE             2018-07-21
2 2018-07-21       23 FALSE            2018-08-13
3 2018-08-13        9 FALSE            2018-08-22
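The arithmetic behind the cumulative-sum approach may be easier to see outside of mutate(). The sketch below works on plain vectors taken from the example data: each start offset is the sum of all earlier durations, which can be written either by dropping the last cumulative sum and prepending 0, or with lag(..., default = 0). Since n() is only available inside dplyr verbs, length() stands in for it here.

```r
library(dplyr)

duration <- c(10, 23, 9)
first_start <- as.Date("2018-07-11")

# Offsets in days: sum of all earlier durations, 0 for the first row
offsets_a <- c(0, cumsum(duration)[-length(duration)])
offsets_b <- lag(cumsum(duration), default = 0)

# Adding a number to a Date adds that many days
start <- first_start + offsets_a
end <- start + duration
```

Both offset vectors should come out the same, so the choice between the two forms is mostly a matter of readability.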
library(dplyr)
schedule %>%
  mutate(start = start[1] + lag(cumsum(duration), default = 0),
         end = start + duration)
# A tibble: 3 x 4
#   start      duration flag_StartActual end
#   <date>        <dbl> <lgl>            <date>
# 1 2018-07-11    10.0  T                2018-07-21
# 2 2018-07-21    23.0  F                2018-08-13
# 3 2018-08-13     9.00 F                2018-08-22
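The example data has only one row with flag_StartActual set to TRUE. If later rows could also carry an actual start date, one possible extension is to restart the cumulative sum at every flagged row. The sketch below assumes that reading of the flag; the data `schedule2`, the helper column `segment`, and the extra rows are made up for illustration and do not come from the question. It also assumes the first row is flagged, since otherwise the leading segment has no known start and stays NA.

```r
library(dplyr)

# Hypothetical data: row 3 carries its own actual start date
schedule2 <- tibble(
  start = as.Date(c("2018-07-11", NA, "2018-09-01", NA)),
  duration = c(10, 23, 9, 5),
  flag_StartActual = c(TRUE, FALSE, TRUE, FALSE)
)

result <- schedule2 %>%
  # Every TRUE flag opens a new segment (segment ids 1, 1, 2, 2 here)
  group_by(segment = cumsum(flag_StartActual)) %>%
  mutate(
    # Within a segment, offset its first (actual) start by earlier durations
    start = first(start) + lag(cumsum(duration), default = 0),
    end = start + duration
  ) %>%
  ungroup() %>%
  select(-segment)
```

With a single flagged first row this reduces to the ungrouped lag(cumsum()) version above, because the whole table then forms one segment.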