R: Get the number of days/months between multiple dates for an ID in a data frame?

Tags: r, date

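For each unique ID, I am trying to determine the number of days (or, preferably, months) between the first occurrence and the most recent occurrence.

For example: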

        date id
1 2011-02-09  A
2 2011-02-09  B
3 2011-02-09  C
4 2011-03-10  A
5 2011-03-10  D
6 2011-01-19  B
7 2011-02-02  C
8 2011-02-02  D
Output:

days id
  29 A
  21 B
   7 C
  36 D
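This is a simplified example of my real data. The data set spans several years, and each ID may have dozens of associated dates. The result for each ID is therefore the difference between that ID's minimum and maximum dates.

The code I used to create the example: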

date <- c("2011-02-09","2011-02-09","2011-02-09","2011-03-10","2011-03-10","2011-01-19","2011-02-02","2011-02-02")
id <- c("A","B","C","A","D","B","C","D")
df <-data.frame(date,id)

You can try:

library(dplyr)
df %>% 
   group_by(id) %>% 
   summarise(days=c(max(date)-min(date)))
#    id days
#1  A   29
#2  B   21
#3  C    7
#4  D   36
Or using base R:

aggregate(date~id, df, function(x) max(x)-min(x))
#   id date
#1  A  29 
#2  B  21 
#3  C   7 
#4  D  36 
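The two snippets above assume the date column has already been converted to class Date.

Or, for completeness, the (obligatory) data.table solution: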

library(data.table)
setDT(df)[, .(days = diff(range(as.Date(date)))), by = id]
#    id    days
# 1:  A 29 days
# 2:  B 21 days
# 3:  C  7 days
# 4:  D 36 days
A base R implementation with by is also possible (although the best option here is tapply).

I expanded my comment to propose another base R solution:

 tapply(df$date,df$id,function(x) diff(range(x)))
As noted in the comments, if df$date is not a Date object, the line above becomes:

 tapply(as.Date(df$date),df$id,function(x) diff(range(x)))
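
For reference, here is a self-contained sketch of the tapply approach that can be pasted into a fresh R session. It rebuilds the example data from the question, converts the date column up front, and uses as.numeric() to return plain day counts rather than difftime objects. The variable name days is only an illustrative choice.

```r
# Rebuild the example data from the question, with the dates converted to Date.
date <- c("2011-02-09", "2011-02-09", "2011-02-09", "2011-03-10",
          "2011-03-10", "2011-01-19", "2011-02-02", "2011-02-02")
id <- c("A", "B", "C", "A", "D", "B", "C", "D")
df <- data.frame(date = as.Date(date), id = id)

# For each id: range() gives (min, max), diff() gives the span as a difftime,
# and as.numeric(..., units = "days") turns it into a plain number of days.
days <- tapply(df$date, df$id,
               function(x) as.numeric(diff(range(x)), units = "days"))
print(days)
#  A  B  C  D
# 29 21  7 36
```

An ID that occurs only once gets a span of 0 days, since its minimum and maximum dates coincide.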

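Comments:

In base R, you can try: tapply(df$date,df$id,function(x) diff(range(x)))

@nicola, I would post this as an answer, since it is the best base R solution and it came very quickly. I'll try it out and report back. Thanks.

You should probably convert to Date class somewhere in here, shouldn't you?

@DavidArenburg I did convert to Date, assuming the OP created the example and forgot about it.

The base R implementation with by:

do.call(rbind, list(by(df, df$id, function(x) diff(range(as.Date(x[, "date"]))))))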
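
The question also asks for months, which none of the snippets above return. Below is a minimal sketch, assuming that "months" means completed calendar months between the first and last date of each ID. The helper months_between is hypothetical (it is not part of base R or of any answer here), and other definitions, such as days divided by an average month length, would give different numbers.

```r
# Example data from the question, already converted to Date.
date <- as.Date(c("2011-02-09", "2011-02-09", "2011-02-09", "2011-03-10",
                  "2011-03-10", "2011-01-19", "2011-02-02", "2011-02-02"))
id <- c("A", "B", "C", "A", "D", "B", "C", "D")

# Hypothetical helper: completed calendar months from `first` to `last`.
# A month only counts once the day-of-month of `first` has been reached.
months_between <- function(first, last) {
  f <- as.POSIXlt(first)
  l <- as.POSIXlt(last)
  (l$year - f$year) * 12 + (l$mon - f$mon) - (l$mday < f$mday)
}

# Split the dates by id and compare each group's earliest and latest date.
n_months <- sapply(split(date, id),
                   function(x) months_between(min(x), max(x)))
print(n_months)
# A B C D
# 1 0 0 1
```

Under this definition a span such as 2011-01-31 to 2011-02-28 counts as 0 months, so month-end dates may need special handling depending on what the analysis requires.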