How can I use the start date to create a lagged end date in R?
Suppose there is a data.frame or data.table containing observations for millions of IDs, a subset of which looks like this:
id <- c(3,3,3,5,5)
data <- c(24,48,60,84,96)
start <- as.Date(c("2006-01-01","2009-12-09","2010-01-02","2006-04-24", "2009-12-09"))
df <- data.frame(id,data,start) ; head(df)
id data start
1 3 24 2006-01-01
2 3 48 2009-12-09
3 3 60 2010-01-02
4 5 84 2006-04-24
5 5 96 2009-12-09
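The goal is to add an end column holding the day before the next start for the same id, with 9999-12-31 on the last row of each id:
df$end <- as.Date(c("2009-12-08","2010-01-01","9999-12-31","2009-12-08",
"9999-12-31"));head(df)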
id data start end
1 3 24 2006-01-01 2009-12-08
2 3 48 2009-12-09 2010-01-01
3 3 60 2010-01-02 9999-12-31
4 5 84 2006-04-24 2009-12-08
5 5 96 2009-12-09 9999-12-31
Here is my data.table solution:
library(data.table)
id <- c(3,3,3,5,5)
data <- c(24,48,60,84,96)
start <- as.Date(c("2006-01-01","2009-12-09","2010-01-02","2006-04-24", "2009-12-09"))
dt <- data.table(id,data,start=start, end=as.Date("9999-12-31"))
# sort by id, then start, so the row after each observation is the next one for the same id
setkey(dt, id, start)
# end = next start within the id minus one day; the last row of each id gets the open-ended date
dt[, end := c(tail(start, -1) - 1, as.Date("9999-12-31")), by="id"]
dt
id data start end
1: 3 24 2006-01-01 2009-12-08
2: 3 48 2009-12-09 2010-01-01
3: 3 60 2010-01-02 9999-12-31
4: 5 84 2006-04-24 2009-12-08
5: 5 96 2009-12-09 9999-12-31
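As an alternative sketch that is not part of the original answer: data.table's shift() with type = "lead" can express the same "next start minus one day" idea. This assumes a data.table version that provides shift(); the open-ended date 9999-12-31 is taken from the question, and filling it in a second step is just one way to do it.

```r
library(data.table)

dt <- data.table(
  id    = c(3, 3, 3, 5, 5),
  data  = c(24, 48, 60, 84, 96),
  start = as.Date(c("2006-01-01", "2009-12-09", "2010-01-02",
                    "2006-04-24", "2009-12-09"))
)
setkey(dt, id, start)

# Next start date within each id, minus one day.
# The last row of each id has no successor, so it becomes NA here.
dt[, end := shift(start, type = "lead") - 1, by = id]

# Fill the open-ended rows with the sentinel date used in the question.
dt[is.na(end), end := as.Date("9999-12-31")]

print(dt)
```

Filling the sentinel after the subtraction avoids having one day subtracted from it as well.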
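As a comment on my own answer: I picked up this neat little trick of using head and tail here. There is at least one other answer that uses it; I don't know whether it is the only one, or the one I learned it from.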
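To make the head and tail trick concrete, here is a minimal base R sketch on a single vector. The names x, lead_x and lag_x are illustrative only and do not appear in the original answer.

```r
x <- as.Date(c("2006-01-01", "2009-12-09", "2010-01-02"))

# tail(x, -1) drops the first element, so padding at the end
# gives each position the value that follows it (a "lead").
lead_x <- c(tail(x, -1), as.Date(NA))

# head(x, -1) drops the last element, so padding at the front
# gives each position the value that precedes it (a "lag").
lag_x <- c(as.Date(NA), head(x, -1))

print(data.frame(x, lead_x, lag_x))
```

The solution above uses the lead form inside each id group, subtracts one day, and pads with the open-ended date instead of NA.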