R: Finding the maximum date in a single column across multiple rows


I have the following data frame:

id       <- c(1,1,2,3,3)
date     <- c("23-01-08","01-11-07","30-11-07","17-12-07","12-12-08")
df       <- data.frame(id,date)
df$date2 <- as.Date(as.character(df$date), format = "%d-%m-%y")


id     date      date2
1   23-01-08 2008-01-23
1   01-11-07 2007-11-01
2   30-11-07 2007-11-30
3   17-12-07 2007-12-17
3   12-12-08 2008-12-12

I would be grateful for any help.

# library(sqldf)  # appeared in the original answer; the code below uses only base R
id<-c(1,1,2,3,3)
date<-c("23-01-08","01-11-07","30-11-07","17-12-07","12-12-08")
df<-data.frame(id,date)
df$date2<-as.Date(as.character(df$date), format = "%d-%m-%y")
# aggregate can be used for this type of thing
d = aggregate(df$date2,by=list(df$id),max)
# And merge the result of aggregate 
# with the original data frame
df2 = merge(df,d,by.x=1,by.y=1)
df2

  id     date      date2          x
1  1 23-01-08 2008-01-23 2008-01-23
2  1 01-11-07 2007-11-01 2008-01-23
3  2 30-11-07 2007-11-30 2007-11-30
4  3 17-12-07 2007-12-17 2008-12-12
5  3 12-12-08 2008-12-12 2008-12-12
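
In the output above the merged column ends up with the default name x. A minimal variation of the same aggregate/merge idea, assuming you would rather have a descriptive column name (max_date is a name chosen here for illustration), might look like this:

```r
# Same example data as in the question
df <- data.frame(
  id   = c(1, 1, 2, 3, 3),
  date = c("23-01-08", "01-11-07", "30-11-07", "17-12-07", "12-12-08")
)
df$date2 <- as.Date(as.character(df$date), format = "%d-%m-%y")

# One row per id with the latest date; naming the grouping list
# gives the key column the name "id" instead of "Group.1"
d <- aggregate(df["date2"], by = list(id = df$id), FUN = max)
names(d)[2] <- "max_date"   # illustrative name, not from the original answer

# Join the per-id maximum back onto every row of the original data
df2 <- merge(df, d, by = "id")
df2
```

Merging by name rather than by column position keeps the join readable if the column order of df changes later.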

You cannot use 0 as a Date value, so you either have to give up keeping the column as a Date or accept NA values:
# Date values:
df$maxdt <- ave(df$date2, df$id, 
                    FUN=function(x) ifelse( x == max(x), as.character(x), NA) ) 
str(ave(df$date2, df$id, FUN=function(x) ifelse( x == max(x), as.character(x), NA) ) )
# Date[1:5], format: "2008-01-23" NA "2007-11-30" NA "2008-12-12"
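
The ifelse() call above goes through character strings and relies on the assignment back into a Date vector to convert them again. A sketch of an alternative that never leaves the Date class, assuming the same df as in the question (group_max is a helper name introduced here), could be:

```r
# Same example data as in the question
df <- data.frame(
  id   = c(1, 1, 2, 3, 3),
  date = c("23-01-08", "01-11-07", "30-11-07", "17-12-07", "12-12-08")
)
df$date2 <- as.Date(as.character(df$date), format = "%d-%m-%y")

# Per-id maximum, repeated on every row of its group (still a Date)
group_max <- ave(df$date2, df$id, FUN = max)

# Keep the date only on rows that hold their group's maximum, NA elsewhere
df$maxdt <- replace(df$date2, df$date2 != group_max, NA)
df
```

Because both ave() and replace() operate on the Date vector directly, no round trip through as.character() is needed.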

Another approach is to use the plyr package:
library(plyr)
ddply(df, "id", summarize, max = max(date2))

#  id        max
#1  1 2008-01-23
#2  2 2007-11-30
#3  3 2008-12-12

Now, this is not the format you wanted, because it shows each id only once. Fear not, we can use transform instead of summarize:
ddply(df, "id", transform, max = max(date2))

#  id     date      date2        max
#1  1 01-11-07 2007-11-01 2008-01-23
#2  1 23-01-08 2008-01-23 2008-01-23
#3  2 30-11-07 2007-11-30 2007-11-30
#4  3 12-12-08 2008-12-12 2008-12-12
#5  3 17-12-07 2007-12-17 2008-12-12

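As in @seandavi's answer, this repeats the max date for every row of each id. If you want to change the repeated values to NA, something like this will do the job: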
within(ddply(df, "id", transform, max = max(date2)), max[max != date2] <- NA)

Adding a dplyr solution in case anyone comes looking:
library(dplyr)

df %>%
  group_by(id) %>%
  mutate(max = if_else(date2 == max(date2), date2, as.Date(NA))) 

Result:
# A tibble: 5 x 4
# Groups:   id [3]
     id     date      date2        max
  <dbl>   <fctr>     <date>     <date>
1     1 23-01-08 2008-01-23 2008-01-23
2     1 01-11-07 2007-11-01         NA
3     2 30-11-07 2007-11-30 2007-11-30
4     3 17-12-07 2007-12-17         NA
5     3 12-12-08 2008-12-12 2008-12-12

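I found this helpful when I wanted to see the minimum or maximum date in a column.
Max: head(df %>% distinct(date) %>% arrange(desc(date)))

Min: head(df %>% distinct(date) %>% arrange(date))

Max sorts the date column in descending order, so the latest dates come first; Min sorts it in ascending order, so the earliest come first. The column has to be of class Date for the ordering to be chronological (in the example data above that is date2, not the text column date).
You need the dplyr package for this. I use it like this: mutate(flag_last = if_else(date == max(date), TRUE, FALSE)) %>% filter(flag_last == TRUE)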
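
The flag-then-filter idea can be written without the intermediate flag column. A minimal sketch, assuming the question's df and that the goal is to keep only the latest row(s) per id (the grouping by id and the name latest are additions here, not part of the comment above):

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(
  id   = c(1, 1, 2, 3, 3),
  date = c("23-01-08", "01-11-07", "30-11-07", "17-12-07", "12-12-08")
)
df$date2 <- as.Date(as.character(df$date), format = "%d-%m-%y")

# Keep only the row(s) holding the latest date within each id
latest <- df %>%
  group_by(id) %>%
  filter(date2 == max(date2)) %>%
  ungroup()
latest
```

If several rows of one id share the same latest date, all of them are kept.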