How to sum events by date in R
I have data recorded by a rain gauge. It records events of 0.2 l/m2 together with the date and time at which each one occurred. After a bit of processing, my data looks like this:
head(df)
V2 V3 V4
1 2018-10-08 11:54:43 1 0.2
2 2018-10-08 12:49:21 2 0.2
3 2018-10-08 15:55:33 3 0.2
4 2018-10-08 16:43:37 4 0.2
5 2018-10-08 16:47:41 5 0.2
6 2018-10-08 16:56:44 6 0.2
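Note that column V2 is the timestamp of each event, column V3 is just a cumulative event count, and I added column V4, which holds the value of each event in l/m2.
I want to sum the values in column V4 over a regular date sequence, say hourly (or daily, or any other period), filling the periods that have no events with zeros.
To get something like: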
date rain
1 2018-10-08 11:00:00 0.2
2 2018-10-08 12:00:00 0.2
3 2018-10-08 13:00:00 0.0
4 2018-10-08 14:00:00 0.0
5 2018-10-08 15:00:00 0.2
6 2018-10-08 16:00:00 0.6
I did manage to solve this, but in a very convoluted way (see the code below). Is there a more direct way?
df$date<-round.POSIXt(df$V2, units = "hour")
library(xts)
df.xts <- xts(df$V4,as.POSIXct(df$date))
hourly<-period.apply(df.xts,endpoints(df$date,"hours"),sum)
hourly<-as.data.frame(hourly)
hourly$date<-as.POSIXct(rownames(hourly))
ref<- data.frame(date=seq.POSIXt(from=min(df$date),to=max(df$date),by="hour"))
all<-merge(hourly,ref,by="date",all.y = TRUE)
all$V1[is.na(all$V1)]<-0
Using tidyverse, you can do the following:
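library(tidyverse)
library(lubridate)  # floor_date() lives in lubridate; attach it explicitly in case your tidyverse version does not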
x <- df %>%
group_by(date = floor_date(as.POSIXct(V2), "1 hour")) %>%
summarize(rain = sum(V4))
Then fill in the missing hours:
x <- as_tibble(seq(min(x$date), max(x$date), by = "hour")) %>%
left_join(., x, by = c("value" = "date")) %>%
replace_na(list(rain = 0))
# value rain
# <dttm> <dbl>
#1 2018-10-08 11:00:00 0.2
#2 2018-10-08 12:00:00 0.2
#3 2018-10-08 13:00:00 0
#4 2018-10-08 14:00:00 0
#5 2018-10-08 15:00:00 0.2
#6 2018-10-08 16:00:00 0.6
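Since the question also asks about daily or other periods, here is a minimal base R sketch of the same idea (floor the timestamps, sum per period, then join against a complete sequence) without any packages. The helper name `sum_by_period` and its `unit` argument are made up for illustration, and the sketch assumes V2 has already been converted to POSIXct (UTC is used here only to keep the example reproducible).

```r
# Example data mirroring the question, with V2 already parsed as POSIXct.
df <- data.frame(
  V2 = as.POSIXct(c("2018-10-08 11:54:43", "2018-10-08 12:49:21",
                    "2018-10-08 15:55:33", "2018-10-08 16:43:37",
                    "2018-10-08 16:47:41", "2018-10-08 16:56:44"),
                  tz = "UTC"),
  V4 = 0.2
)

# Hypothetical helper: sum V4 per period and zero-fill the empty periods.
# `unit` is passed to both trunc() and seq(), e.g. "hours" or "days".
sum_by_period <- function(data, unit = "hours") {
  # Floor each timestamp to the start of its period
  data$date <- as.POSIXct(trunc(data$V2, units = unit))
  # Sum the rain per period that has at least one event
  sums <- aggregate(V4 ~ date, data = data, FUN = sum)
  # Regular sequence covering the first to the last period
  grid <- data.frame(date = seq(min(data$date), max(data$date), by = unit))
  # Keep every period of the grid; periods without events become NA
  out <- merge(grid, sums, by = "date", all.x = TRUE)
  out$V4[is.na(out$V4)] <- 0
  names(out)[names(out) == "V4"] <- "rain"
  out
}

hourly <- sum_by_period(df, "hours")
daily  <- sum_by_period(df, "days")
```

Using trunc() rather than round() matters here: rounding would move the 11:54:43 event into the 12:00 bin instead of the 11:00 bin.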
Data:
df <- structure(list(V2 = structure(1:6, .Label = c(" 2018-10-08 11:54:43",
" 2018-10-08 12:49:21", " 2018-10-08 15:55:33", " 2018-10-08 16:43:37",
" 2018-10-08 16:47:41", " 2018-10-08 16:56:44"), class = "factor"),
V3 = 1:6, V4 = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.2)), class = "data.frame", row.names = c(NA,
-6L))
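In this dump V2 is a factor whose labels start with a space. If you would rather work with real date-times from the start, a minimal sketch of the conversion could look like this; the choice of UTC is an assumption, so use whatever time zone your gauge logs in.

```r
# Rebuild a small piece of the data above: V2 is a factor with leading spaces.
df <- data.frame(
  V2 = factor(c(" 2018-10-08 11:54:43", " 2018-10-08 12:49:21")),
  V4 = c(0.2, 0.2)
)

# Convert factor -> character, strip the padding, then parse as POSIXct.
# tz = "UTC" is an assumption made for this example.
df$V2 <- as.POSIXct(trimws(as.character(df$V2)), tz = "UTC")

str(df$V2)
```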