R: Assigning date-time events to date-time intervals

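I have a table of more than 1M individual date-time detection intervals (i.e. periods during which an individual was continuously detected) and a table of date-time intervals representing the times a gate was closed. Individual detections are classified as "in" or "out".

Using %within% I can determine whether a detection falls inside any of the intervals.

However, I want to do something a bit different. For each gate-closure interval, I want to know how many individuals were detected outside and how many were detected inside. I believe the simplest approach is to assign each detection event to a gate period, but I can't work out how to write that function without a very messy nested ifelse statement.

Expected output (not real data):

Gate closure intervals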

new("Interval", .Data = c(-81060, -117060, -59400, -16200, -76680, 
-51000, -81120), start = structure(c(1412238660, 1412362800, 
1412434800, 1412454600, 1412542980, 1412602200, 1412690400), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), tzone = "UTC")

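Your example data has a few issues that make your question hard to answer:

  • Your example detection data contains a lot of extra information that isn't needed to solve or understand your problem.
  • Your example intervals are backwards (they end before they start).
  • Your example data isn't set up in an easy-to-use format (you don't include the packages needed to read it).

That said, let's simplify and correct those issues: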

    library(tidyverse)  # needed because your data is in tibble format
    library(lubridate)  # needed for time interval class used in your example
    
    # I called your example data "Detections" and "Intervals", not importing them here
    
    set.seed(914)
    row.sample <- c(1,5,2, sample(1:100, 10)) # to include the samples not in the given intervals mixed with good values
    use.cols <- c(1,4,6,7,9)  
    Detections.use <- Detections[row.sample, use.cols]
    
    Detections.use
    # A tibble: 13 x 5
       Tag     site  species      StartDateTime_UTC   EndDateTime_UTC    
       <fct>   <fct> <chr>        <dttm>              <dttm>             
     1 5004.24 IC1   Striped Bass 2014-09-29 22:40:40 2014-09-29 22:46:35
     2 5004.24 RGD1  Striped Bass 2014-10-02 00:15:47 2014-10-02 00:30:08
     3 5004.24 IC1   Striped Bass 2014-09-29 22:49:15 2014-09-29 22:50:05
     4 5004.24 RGD1  Striped Bass 2014-10-04 18:38:40 2014-10-04 18:52:30
     5 5004.24 RGD1  Striped Bass 2014-10-05 07:15:43 2014-10-05 08:05:47
     6 5004.24 RGD1  Striped Bass 2014-10-03 02:21:36 2014-10-03 02:24:01
     7 5004.24 RGD1  Striped Bass 2014-10-03 12:13:46 2014-10-03 12:18:02
     8 5004.24 RGD1  Striped Bass 2014-10-03 09:30:15 2014-10-03 10:14:19
     9 5004.24 RGD1  Striped Bass 2014-10-04 14:09:39 2014-10-04 14:09:45
    10 5004.24 RGD1  Striped Bass 2014-10-03 02:17:05 2014-10-03 02:18:05
    11 5004.24 RGD1  Striped Bass 2014-10-05 03:08:26 2014-10-05 03:12:31
    12 5004.24 RGD1  Striped Bass 2014-10-03 01:42:02 2014-10-03 01:58:18
    13 5004.24 RGD1  Striped Bass 2014-10-02 00:33:12 2014-10-02 01:10:21
    
    
    # Convert your intervals to a table, then correct them to make them positive time intervals
    Interval.table <- tibble(Intervals, "ID" = seq(from = 1, to = length(Intervals))) %>%
          mutate(end_time = Intervals$start,
                 start_time = end_time+Intervals$.Data,
                 new_Interval = start_time %--% end_time) %>%
          select(ID, new_Interval)
    Interval.table
    # A tibble: 7 x 2
         ID new_Interval                                    
      <int> <Interval>                                      
    1     1 2014-10-01 10:00:00 UTC--2014-10-02 08:31:00 UTC
    2     2 2014-10-02 10:29:00 UTC--2014-10-03 19:00:00 UTC
    3     3 2014-10-03 22:30:00 UTC--2014-10-04 15:00:00 UTC
    4     4 2014-10-04 16:00:00 UTC--2014-10-04 20:30:00 UTC
    5     5 2014-10-04 23:45:00 UTC--2014-10-05 21:03:00 UTC
    6     6 2014-10-05 23:20:00 UTC--2014-10-06 13:30:00 UTC
    7     7 2014-10-06 15:28:00 UTC--2014-10-07 14:00:00 UTC
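
As a side note on the "backwards intervals" point: lubridate also provides `int_standardize()`, which flips negative intervals so that each one runs forward in time. The snippet below is only a minimal sketch on a single hand-built interval (the object names `backwards` and `fixed` are made up for illustration); it is not part of the answer's code, but it should give the same corrected start and end as the manual `mutate()` step for the first gate closure.

```r
library(lubridate)

# One interval written "backwards": it starts at the later time and ends at the earlier one
backwards <- interval(ymd_hms("2014-10-02 08:31:00", tz = "UTC"),
                      ymd_hms("2014-10-01 10:00:00", tz = "UTC"))
int_length(backwards)   # negative length, in seconds

# int_standardize() flips negative intervals and leaves positive ones untouched
fixed <- int_standardize(backwards)
int_start(fixed)
int_end(fixed)
int_length(fixed)       # now positive
```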
    
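Edited to add an sapply solution:

This can also be done a little more cleanly with sapply, which does not leave stray objects behind:

    # Return the index of the interval in y that contains the time x, or NA if none does
    which.join <- function(x, y) {
          z <- which(x %within% y)
          z <- ifelse(isTRUE(z > 0), z, NA)  # NA unless exactly one interval matches
          z
    }
    
    Detections.use$Interval <- sapply(Detections.use$StartDateTime_UTC, 
                                 function(x) which.join(x, Interval.table$new_Interval))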
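
With more than 1M detections, a per-row call may become slow. One possible vectorised alternative, which is a different technique from the answer's and is offered only as a sketch, is base R's `findInterval()`. It assumes the intervals are sorted by start time and do not overlap, which holds for the seven corrected gate closures shown above. The helper name `assign_interval` and the small test vectors are made up for illustration.

```r
# Assign each time to the index of the interval [starts[i], ends[i]] that contains it.
# Assumption: intervals are sorted by start time and do not overlap.
assign_interval <- function(times, starts, ends) {
  t_num <- as.numeric(times)
  s <- as.numeric(starts)
  e <- as.numeric(ends)
  stopifnot(!is.unsorted(s), all(e >= s), all(head(e, -1) < tail(s, -1)))

  idx <- findInterval(t_num, s)                      # last interval starting at or before each time
  idx[idx == 0L] <- NA_integer_                      # earlier than the first interval
  idx[!is.na(idx) & t_num > e[idx]] <- NA_integer_   # falls in a gap after that interval's end
  idx
}

utc <- function(x) as.POSIXct(x, tz = "UTC")
starts <- utc(c("2014-10-01 10:00:00", "2014-10-02 10:29:00", "2014-10-03 22:30:00",
                "2014-10-04 16:00:00", "2014-10-04 23:45:00", "2014-10-05 23:20:00",
                "2014-10-06 15:28:00"))
ends   <- utc(c("2014-10-02 08:31:00", "2014-10-03 19:00:00", "2014-10-04 15:00:00",
                "2014-10-04 20:30:00", "2014-10-05 21:03:00", "2014-10-06 13:30:00",
                "2014-10-07 14:00:00"))
times  <- utc(c("2014-09-29 22:40:40", "2014-10-02 00:15:47",
                "2014-10-04 18:38:40", "2014-10-03 02:21:36"))

result <- assign_interval(times, starts, ends)
result  # NA 1 4 2
```

The four test times are taken from the detection table above, so the result can be compared with the Interval column the answer's code produces for the same rows.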
    

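Comments:

  • Could you provide an example of the output you are looking for? It can be dummy data, but the format would help. When I run the example data code I also get an error: getClass(Class, where = topenv(parent.frame())): "Period" is not a defined class. Perhaps it could be stripped down to make it more minimal.
  • It sounds like you could use a database-style join to join the observation data to the interval data. This assumes they are in two separate tables.
  • @BrianFisher I have updated the question and fixed the dput for the gate closure periods.
  • I think you should be able to use something like (Detections$StartDateTime_UTC %in% Interval) to get the index, which you can then convert to an ID. I can't test it, though, because the two data sets you provided don't overlap (the intervals are all in March/April 2013, while the detections are all in September/October 2014).
  • @BrianFisher That is why I tried to subset the data. Unfortunately, it only returns integer(0). I will try to get the correct intervals for that individual.
  • This is very close to the answer I came up with:

    for(I in 1:length(dat_overlap$StartDateTime_UTC)){tryCatch({dat_overlap$interval.id[I]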
    
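The loop-based assignment step, which gives each detection the ID of the interval containing its start time (or NA if none does), and its output:

    y <- c()
    for (i in seq_along(Detections.use$StartDateTime_UTC)) {
      # index of the interval that contains this detection's start time
      z <- which(Detections.use$StartDateTime_UTC[i] %within% Interval.table$new_Interval)
      y[i] <- ifelse(isTRUE(z > 0), z, NA)  # NA unless exactly one interval matches
    }
    Detections.use$Interval <- y
    Detections.use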
    # A tibble: 13 x 6
       Tag     site  species      StartDateTime_UTC   EndDateTime_UTC     Interval
       <fct>   <fct> <chr>        <dttm>              <dttm>                 <int>
     1 5004.24 IC1   Striped Bass 2014-09-29 22:40:40 2014-09-29 22:46:35       NA
     2 5004.24 RGD1  Striped Bass 2014-10-02 00:15:47 2014-10-02 00:30:08        1
     3 5004.24 IC1   Striped Bass 2014-09-29 22:49:15 2014-09-29 22:50:05       NA
     4 5004.24 RGD1  Striped Bass 2014-10-04 18:38:40 2014-10-04 18:52:30        4
     5 5004.24 RGD1  Striped Bass 2014-10-05 07:15:43 2014-10-05 08:05:47        5
     6 5004.24 RGD1  Striped Bass 2014-10-03 02:21:36 2014-10-03 02:24:01        2
     7 5004.24 RGD1  Striped Bass 2014-10-03 12:13:46 2014-10-03 12:18:02        2
     8 5004.24 RGD1  Striped Bass 2014-10-03 09:30:15 2014-10-03 10:14:19        2
     9 5004.24 RGD1  Striped Bass 2014-10-04 14:09:39 2014-10-04 14:09:45        3
    10 5004.24 RGD1  Striped Bass 2014-10-03 02:17:05 2014-10-03 02:18:05        2
    11 5004.24 RGD1  Striped Bass 2014-10-05 03:08:26 2014-10-05 03:12:31        5
    12 5004.24 RGD1  Striped Bass 2014-10-03 01:42:02 2014-10-03 01:58:18        2
    13 5004.24 RGD1  Striped Bass 2014-10-02 00:33:12 2014-10-02 01:10:21        1
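
Once each detection carries an interval ID, the per-closure tallies the question asks for reduce to a grouped count. The following is a minimal sketch on made-up data: the `location` column holding "in"/"out" is an assumption (it is not among the columns selected above), and `detections` and `counts` are illustrative names.

```r
library(dplyr)

# Toy detections already assigned to intervals; 'location' is a hypothetical in/out column
detections <- tibble(
  Interval = c(1L, 1L, 2L, 2L, 2L, NA),
  location = c("in", "out", "in", "in", "out", "in")
)

# Drop detections outside every gate closure, then count per interval and location
counts <- detections %>%
  filter(!is.na(Interval)) %>%
  count(Interval, location)

counts
```

If individuals rather than detection events should be counted, one option would be to reduce to distinct Tag/Interval/location combinations with `distinct()` before counting.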
    