R: Counting by 15-minute time intervals


I want to count the number of sessions in each 15-minute interval on weekdays in a large dataset.

My data looks like this:

df <- 

Start_datetime       End_datetime       Duration    Volume
2016-04-01 06:20:55 2016-04-01 14:41:22  08:20:27   8.360
2016-04-01 08:22:27 2016-04-01 08:22:40  00:00:13   0.000
2016-04-01 08:38:53 2016-04-01 09:31:58  00:53:05   12.570
2016-04-01 09:33:57 2016-04-01 12:37:43  03:03:46   7.320
2016-04-01 10:05:03 2016-04-01 16:41:16  06:36:13   9.520
2016-04-01 12:07:57 2016-04-02 22:22:32  34:14:35   7.230
2016-04-01 16:56:55 2016-04-02 10:40:17  17:43:22   5.300
2016-04-01 17:29:18 2016-04-01 19:50:29  02:21:11   7.020
2016-04-01 17:42:39 2016-04-01 19:45:38  02:02:59   2.430
2016-04-01 17:47:57 2016-04-01 20:26:35  02:38:38   8.090
2016-04-01 22:00:15 2016-04-04 08:22:21  58:22:06   4.710
2016-04-02 01:12:38 2016-04-02 09:49:00  08:36:22   3.150
2016-04-02 01:32:00 2016-04-02 12:49:47  11:17:47   5.760
2016-04-02 07:28:48 2016-04-04 06:58:56  47:30:08   0.000
2016-04-02 07:55:18 2016-04-05 07:55:15  71:59:57   0.240
And so on, including data for weekends.

I have tried the cut function:

df$PTU <- table (cut(df$Start_datetime, breaks="15 minutes"))
data.frame(PTU)
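I have also tried some lubridate functions, but I can't seem to get it to work. My final goal is to create a table like the one below, but with 15-minute intervals.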

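When using cut on datetimes, there are two things to keep in mind:

  • Make sure your data is actually of class POSIXt. I'm pretty sure yours isn't, otherwise R would dispatch to cut.POSIXt rather than cut.default.
  • "15 minutes" should be "15 min". See ?cut.POSIXt.

So this works: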

    Start_datetime <- as.POSIXct(
      c("2016-04-01 06:20:55",
        "2016-04-01 06:22:12",
        "2016-04-01 05:30:12")
    )
    
    table(cut(Start_datetime, breaks = "15 min"))
    # 2016-04-01 05:30:00 2016-04-01 05:45:00 2016-04-01 06:00:00 2016-04-01 06:15:00 
    #                   1                   0                   0                   2 
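
The two points above can be checked on a data frame column as well. The following is a minimal sketch in base R, not the asker's actual data: the three timestamps are the ones from the example above, the column is assumed to be stored as text, and tz = "UTC" and the column names interval_start and sessions are assumptions made for illustration. It also keeps the counts in their own data frame, since table() returns one value per interval rather than one per row and so cannot be assigned to df$PTU.

```r
# Hypothetical sample mirroring the question's Start_datetime column, stored as text.
df <- data.frame(
  Start_datetime = c("2016-04-01 06:20:55",
                     "2016-04-01 06:22:12",
                     "2016-04-01 05:30:12"),
  stringsAsFactors = FALSE
)

# A character column is not POSIXt, so cut() would not dispatch to cut.POSIXt.
inherits(df$Start_datetime, "POSIXt")   # FALSE

# Convert once. tz = "UTC" is an assumption; use the zone the data was recorded in.
df$Start_datetime <- as.POSIXct(df$Start_datetime,
                                format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
inherits(df$Start_datetime, "POSIXt")   # TRUE

# One count per 15-minute interval, kept as a separate data frame.
counts <- as.data.frame(table(cut(df$Start_datetime, breaks = "15 min")),
                        stringsAsFactors = FALSE)
names(counts) <- c("interval_start", "sessions")
counts
```

Empty intervals between the first and last timestamp are kept with a count of zero, which is usually what you want for a time series of session volumes.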
    

Here is a complete procedure for going from datetime strings to the desired format. Start with a vector of strings:

    Start_time <- 
    c("2016-04-01 06:20:55", "2016-04-01 08:22:27", "2016-04-01 08:38:53", 
      "2016-04-01 09:33:57", "2016-04-01 10:05:03", "2016-04-01 12:07:57", 
      "2016-04-01 16:56:55", "2016-04-01 17:29:18", "2016-04-01 17:42:39", 
      "2016-04-01 17:47:57", "2016-04-01 22:00:15", "2016-04-02 01:12:38", 
      "2016-04-02 01:32:00", "2016-04-02 07:28:48", "2016-04-02 07:55:18"
    )
    df <- data.frame(Start_time)
    

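Can you explain why the cut method doesn't work? Can you dput a bit of data?

If you are looking for weekdays, check ...

@akrun: I get the error that I added to the question. I want working days (mo, tu, wed, th, fri) rather than the day of the week, but thanks for the tip anyway.

@ima When you want to share data in R, the dput command converts your data frame into a few lines of code that you can copy and paste into the question. Try ?dput to learn more.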
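
As a rough illustration of the dput suggestion, here is a small sketch in base R. The data frame sessions and its two rows are hypothetical stand-ins for the asker's data, and tz = "UTC" is an assumption. The last two lines only demonstrate that the printed code recreates the object.

```r
# Hypothetical two-row sample; replace with your own data frame.
sessions <- data.frame(
  Start_datetime = as.POSIXct(c("2016-04-01 06:20:55", "2016-04-01 08:22:27"),
                              tz = "UTC"),
  Volume = c(8.36, 0)
)

# dput() prints R code that recreates the object; head() keeps the sample small.
dput(head(sessions, 2))

# Round trip: capture the printed code and evaluate it to rebuild the data frame.
code <- paste(capture.output(dput(sessions)), collapse = "\n")
restored <- eval(parse(text = code))
```

Pasting the dput output into a question lets others reproduce the exact column classes, which matters here because the cut problem depends on whether the column is POSIXt or character.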
    Start_time <- 
    c("2016-04-01 06:20:55", "2016-04-01 08:22:27", "2016-04-01 08:38:53", 
      "2016-04-01 09:33:57", "2016-04-01 10:05:03", "2016-04-01 12:07:57", 
      "2016-04-01 16:56:55", "2016-04-01 17:29:18", "2016-04-01 17:42:39", 
      "2016-04-01 17:47:57", "2016-04-01 22:00:15", "2016-04-02 01:12:38", 
      "2016-04-02 01:32:00", "2016-04-02 07:28:48", "2016-04-02 07:55:18"
    )
    df <- data.frame(Start_time)
    
    ## We will use two packages
    library(lubridate)
    library(data.table)
    
    # convert df to data.table, parse the datetime string
    setDT(df)[, Start_time := ymd_hms(Start_time)] 
    # floor time by 15 min to assign the appropriate slot (new variable Start_time_slot)
    df[, Start_time_slot := floor_date(Start_time, "15 min")]
    
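    # aggregate by weekday and time of day; the grouping columns are named
    # explicitly so the result has the columns wday, time and N
    start_time_data_frame <- df[, .N, by = .(wday = wday(Start_time_slot), time = format(Start_time_slot, format = "%H:%M:%S"))]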
    
    # output looks like this 
    start_time_data_frame
    ##     wday     time N
    ##  1:    6 06:15:00 1
    ##  2:    6 08:15:00 1
    ##  3:    6 08:30:00 1
    ##  4:    6 09:30:00 1
    ##  5:    6 10:00:00 1
    ##  6:    6 12:00:00 1
    ##  7:    6 16:45:00 1
    ##  8:    6 17:15:00 1
    ##  9:    6 17:30:00 1
    ## 10:    6 17:45:00 1
    ## 11:    6 22:00:00 1
    ## 12:    7 01:00:00 1
    ## 13:    7 01:30:00 1
    ## 14:    7 07:15:00 1
    ## 15:    7 07:45:00 1
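
The output above still contains Saturday (wday 7), while the question asks for working days only. One possible way to restrict the counts to Monday through Friday is sketched below in base R rather than with data.table. The timestamps are a small hypothetical sample (the Monday row is not in the original data), tz = "UTC" is an assumption, and the weekday number comes from the "%u" format code (1 = Monday, 7 = Sunday), which differs from the Sunday-based numbering of wday used above.

```r
start_time <- as.POSIXct(
  c("2016-04-01 06:20:55",   # Friday
    "2016-04-01 17:42:39",   # Friday
    "2016-04-02 01:12:38",   # Saturday, dropped below
    "2016-04-04 06:25:10"),  # Monday (hypothetical extra row)
  tz = "UTC"
)

# Floor each timestamp to the start of its 15-minute slot (900 seconds).
slot <- as.POSIXct(floor(as.numeric(start_time) / 900) * 900,
                   origin = "1970-01-01", tz = "UTC")

# "%u" gives the ISO weekday number: 1 = Monday ... 7 = Sunday.
iso_wday <- as.integer(format(slot, "%u"))
is_working_day <- iso_wday <= 5

# Count sessions per time-of-day slot, pooled over all working days.
weekday_counts <- as.data.frame(
  table(time = format(slot[is_working_day], "%H:%M")),
  stringsAsFactors = FALSE
)
names(weekday_counts)[2] <- "N"
weekday_counts
```

Pooling by time of day answers "how busy is each quarter hour on a typical working day"; to keep the days separate, add iso_wday as a second grouping variable in table().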