Linking values that occur within the same time window in R
Problem: I need to add values from one data frame to another, based on the time window in which each row occurs.

I have a data frame with a list of single events, like this:

  Ind       Date     Time Event
1 FAU 15/11/2016 06:40:43     A
2 POR 15/11/2016 12:26:51     V
3 POR 15/11/2016 14:52:53     B
4 MAM 20/11/2016 08:12:19     G
5 SUR 03/12/2016 13:51:18     A
6 SUR 14/12/2016 07:47:06     V

This works at least for your example:
df1 <- structure(list(Ind = c("FAU", "POR", "POR", "MAM", "SUR", "SUR"
), Date = c("15/11/2016", "15/11/2016", "15/11/2016", "20/11/2016",
"03/12/2016", "14/12/2016"), Time = c("06:40:43", "12:26:51",
"14:52:53", "08:12:19", "13:51:18", "07:47:06"), Event = c("A",
"V", "B", "G", "A", "V")), .Names = c("Ind", "Date", "Time",
"Event"), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6"))
df2 <- structure(list(Date = c("15/11/2016", "15/11/2016", "15/11/2016",
"15/11/2016", "15/11/2016", "15/11/2016", "15/11/2016", "15/11/2016",
"15/11/2016", "15/11/2016", "15/11/2016", "15/11/2016", "15/11/2016",
"15/11/2016", "15/11/2016", "15/11/2016", "15/11/2016", "15/11/2016",
"15/11/2016", "15/11/2016", "15/11/2016", "15/11/2016", "15/11/2016",
"15/11/2016"), Time = c("06:56:48", "06:59:40", "07:27:36", "07:29:10",
"07:34:51", "07:35:10", "07:37:19", "07:39:55", "07:51:59", "08:00:13",
"08:08:01", "08:13:21", "08:16:21", "12:14:48", "12:16:58", "12:51:22",
"12:52:09", "13:26:29", "13:26:55", "13:34:14", "13:50:41", "13:53:25",
"14:15:17", "14:54:49"), Event = 1:24), .Names = c("Date", "Time",
"Event"), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18", "19", "20", "21", "22", "23", "24"))
Initialize a new count variable for df1, then for each row count the df2 events on the same date that occurred earlier:

# Assumption: datetime is parsed from the Date and Time columns
df1$datetime <- as.POSIXct(paste(df1$Date, df1$Time), format = "%d/%m/%Y %H:%M:%S")
df2$datetime <- as.POSIXct(paste(df2$Date, df2$Time), format = "%d/%m/%Y %H:%M:%S")
df1$count <- NA
for (i in seq_len(nrow(df1))) {
  # df2 events on the same date that happened before this df1 event
  df1$count[i] <- sum(df2$datetime[df2$Date == df1$Date[i]] < df1$datetime[i])
}

Result:

> df1
  Ind       Date     Time Event            datetime count
1 FAU 15/11/2016 06:40:43     A 2016-11-15 06:40:43     0
2 POR 15/11/2016 12:26:51     V 2016-11-15 12:26:51    15
3 POR 15/11/2016 14:52:53     B 2016-11-15 14:52:53    23
4 MAM 20/11/2016 08:12:19     G 2016-11-20 08:12:19     0
5 SUR 03/12/2016 13:51:18     A 2016-12-03 13:51:18     0
6 SUR 14/12/2016 07:47:06     V 2016-12-14 07:47:06     0

Where do you get Event from?
If it works for you, you can accept it as an answer.
This seems to be a similar question/answer.

I can offer you a data.table solution. The only catch is that I had to move the start time of the first event in the second data frame back to an earlier date, because it falls after the start time of the first event in the first data frame.

You will need the additional packages data.table and lubridate:

library(data.table)
library(lubridate)
dt1 <- data.table(df1)
dt2 <- data.table(df2)
dt1[, Date.Time := as.POSIXct(strptime(paste(Date, Time, sep = " "), "%d/%m/%Y %H:%M:%S"))]
dt2[, Date.Time := as.POSIXct(strptime(paste(Date, Time, sep = " "), "%d/%m/%Y %H:%M:%S"))]
# Create the start and end time columns in the second data.table
dt2[, `:=`(Start.Time = Date.Time
, End.Time = shift(Date.Time, n = 1L, fill = NA, type = "lead"))]
# Change the start date to an earlier one
dt2[Event == 1,`:=`(Start.Time = Start.Time - days(1)) ]
# Merge on multiple conditions and the selection of the relevant columns
dt2[dt1, on=.(Start.Time < Date.Time
, End.Time > Date.Time)
, nomatch = 0L][,.(Ind
, Date
, Time
, Eventx = i.Event
, Eventy = Event)]
# Output of the last merge
Ind Date Time Eventx Eventy
1: FAU 15/11/2016 06:56:48 A 1
2: POR 15/11/2016 12:16:58 V 15
3: POR 15/11/2016 14:15:17 B 23
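
As a possible base R alternative to the non-equi join, `findInterval` can locate, for each row of df1, the last df2 event that started at or before it. This is only a sketch under stated assumptions: the helper `to_datetime`, the columns `datetime` and `last_event`, and the cut-down sample data are made up for illustration, and a df1 time that coincides exactly with a df2 time counts as a match here, unlike the strict inequalities in the join.

```r
# Sample data cut down from the example (a subset, for illustration only)
df1 <- data.frame(
  Ind   = c("FAU", "POR", "POR", "MAM"),
  Date  = c("15/11/2016", "15/11/2016", "15/11/2016", "20/11/2016"),
  Time  = c("06:40:43", "12:26:51", "14:52:53", "08:12:19"),
  Event = c("A", "V", "B", "G"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Date  = "15/11/2016",
  Time  = c("06:56:48", "12:14:48", "12:16:58", "12:51:22", "14:15:17", "14:54:49"),
  Event = c(1L, 14L, 15L, 16L, 23L, 24L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse separate Date and Time strings into POSIXct
to_datetime <- function(d, t) {
  as.POSIXct(paste(d, t), format = "%d/%m/%Y %H:%M:%S", tz = "UTC")
}
df1$datetime <- to_datetime(df1$Date, df1$Time)
df2$datetime <- to_datetime(df2$Date, df2$Time)
df2 <- df2[order(df2$datetime), ]  # findInterval needs sorted breakpoints

# Index of the last df2 event at or before each df1 event (0 means none)
idx <- findInterval(as.numeric(df1$datetime), as.numeric(df2$datetime))

# Keep a match only if one exists and it falls on the same date
df1$last_event <- NA_integer_
ok <- idx > 0
ok[ok] <- df2$Date[idx[ok]] == df1$Date[ok]
df1$last_event[ok] <- df2$Event[idx[ok]]
df1
```

Rows with no earlier df2 event on the same date get NA, so the FAU row is left unmatched here rather than being linked to event 1, which the join only achieves by shifting the first start time back by a day.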