R: how to count entries in rolling batches of 15 minutes

Tags: r, rolling-computation, runner, batching

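I would like to count the number of entries in 15-minute batches on a rolling basis. For example, 12:59 would be a batch of 1 on its own; 14:08 => 14:22 all fall within one 15-minute window, so that batch would return 4; and finally 14:41 would be returned on its own in another 15-minute batch.

I hope this makes sense, and thanks in advance.

Sorry for not including this: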

1 2021-01-01 12:59:38
2 2021-01-01 14:08:59
3 2021-01-01 14:09:08
4 2021-01-01 14:11:30
5 2021-01-01 14:22:19
6 2021-01-01 14:41:07

New edit: thank you for working on this. I made a mistake.

> dput(df)
structure(list(ClickedDate = structure(c(1609460198.707, 1609462979.593, 
1609465088.437, 1609476270.88, 1609478479.177, 1609479667.373, 
1609493081.887, 1609499187.29, 1609507506.37, 1609510989.533, 
1609511522.023, 1609511894.067, 1609512194.773, 1609512377.227, 
1609514474.153), tzone = "UTC", class = c("POSIXct", "POSIXt"
)), batch_no = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
12L, 12L, 12L, 13L), batch_size = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 3L, 3L, 3L, 1L)), row.names = c(NA, -15L), class = c("tbl_df", 
"tbl", "data.frame"))

This seems odd given the class of my variable. I get the following error:

Error in UseMethod("mutate") : 
  no applicable method for 'mutate' applied to an object of class "c('integer', 'numeric')"

Will this work with mutate, or do I need to convert it?

> class(df$ClickedDate)
[1] "POSIXct" "POSIXt"

Thanks in advance.

The `runner` package will help in this case.

Data used:

> dput(df)
structure(list(ClickedDate = structure(c(1609460198.707, 1609462979.593, 
1609465088.437, 1609476270.88, 1609478479.177, 1609479667.373, 
1609493081.887, 1609499187.29, 1609507506.37, 1609510989.533, 
1609511522.023, 1609511894.067, 1609512194.773, 1609512377.227, 
1609514474.153), tzone = "UTC", class = c("POSIXct", "POSIXt"
)), batch_no = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
12L, 12L, 12L, 13L), batch_size = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 3L, 3L, 3L, 1L)), row.names = c(NA, -15L), class = c("tbl_df", 
"tbl", "data.frame"))

Strategy used:

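library(dplyr)  # mutate(), dense_rank(), group_by(); runner and purrr must also be installed

df %>% mutate(
              # count of rows in a 15-minute window running forward from (about) each timestamp
              b_len = runner::runner(x = ClickedDate,
                             idx = ClickedDate,
                             k = "15 mins",
                             lag = "-14 mins",
                             f = length),
              # index of the last row of the batch that each row belongs to
              b_no = purrr::accumulate(seq_len(length(b_len)-1), .init = b_len[1], ~ifelse(.x > .y, .x, .x + b_len[.x +1])),
              # renumber the batches consecutively: 1, 2, 3, ...
              b_no = dense_rank(b_no)) %>%
  group_by(b_no) %>%
  mutate(b_len = n()) %>%  # final size of each batch
  ungroup()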
# A tibble: 15 x 3
   ClickedDate         b_len  b_no
   <dttm>              <int> <int>
 1 2021-01-01 00:16:38     1     1
 2 2021-01-01 01:02:59     1     2
 3 2021-01-01 01:38:08     1     3
 4 2021-01-01 04:44:30     1     4
 5 2021-01-01 05:21:19     1     5
 6 2021-01-01 05:41:07     1     6
 7 2021-01-01 09:24:41     1     7
 8 2021-01-01 11:06:27     1     8
 9 2021-01-01 13:25:06     1     9
10 2021-01-01 14:23:09     2    10
11 2021-01-01 14:32:02     2    10
12 2021-01-01 14:38:14     3    11
13 2021-01-01 14:43:14     3    11
14 2021-01-01 14:46:17     3    11
15 2021-01-01 15:21:14     1    12

> df
# A tibble: 15 x 1
   ClickedDate
1 2021-01-01 00:16:38
2 2021-01-01 01:02:59
3 2021-01-01 01:38:08
4 2021-01-01 04:44:30
5 2021-01-01 05:21:19
6 2021-01-01 05:41:07
7 2021-01-01 09:24:41
8 2021-01-01 11:06:27
9 2021-01-01 13:25:06
10 2021-01-01 14:23:09
11 2021-01-01 14:32:02
12 2021-01-01 14:38:14
13 2021-01-01 14:43:14
14 2021-01-01 14:46:17
15 2021-01-01 15:21:14
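
For readers who want to see the batching rule itself, here is a minimal base R sketch of the same idea without `runner`. It is not the code from the answer: `batch_by_window` is a hypothetical helper, and it assumes the timestamps are sorted and that a click exactly 15 minutes after the start of a batch opens a new batch (the `runner` window above may treat that boundary slightly differently).

```r
# Hypothetical helper (not part of the original answer): greedy 15-minute batching.
# A batch opens at the first click that is not covered by the previous batch and
# collects every later click that falls less than `window_secs` after that opening click.
batch_by_window <- function(times, window_secs = 15 * 60) {
  secs <- as.numeric(times)          # POSIXct -> seconds since the epoch
  stopifnot(!is.unsorted(secs))      # assumption: input is sorted ascending
  batch_no <- integer(length(secs))
  current <- 0L
  start <- -Inf                      # forces the first click to open batch 1
  for (i in seq_along(secs)) {
    if (secs[i] - start >= window_secs) {
      current <- current + 1L        # window elapsed: open a new batch
      start <- secs[i]
    }
    batch_no[i] <- current
  }
  batch_no
}

# The six timestamps from the question
clicks <- as.POSIXct(c("2021-01-01 12:59:38", "2021-01-01 14:08:59",
                       "2021-01-01 14:09:08", "2021-01-01 14:11:30",
                       "2021-01-01 14:22:19", "2021-01-01 14:41:07"),
                     tz = "UTC")

batch_no   <- batch_by_window(clicks)
batch_size <- as.integer(ave(batch_no, batch_no, FUN = length))
data.frame(ClickedDate = clicks, batch_no, batch_size)
```

On the question's sample this gives batch sizes 1, 4, 4, 4, 4, 1, which is the grouping described in the question (12:59 alone, 14:08 to 14:22 together, 14:41 alone).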


Thanks for the reply. I think this is trying to do the right thing, but I am getting some errors. (Sorry, I am new to R and have only just moved over from Excel.) What should be in my dataset? I would like the end result to be a column of 1,4,4,4,1. I hope this makes sense. Thanks in advance.

Sorry for not including it.

Thank you AnilGoyal, I will definitely buy you a coffee after this. Many thanks.

Please see the new edit: I am getting an error about mutate, hopefully a simple fix. Thanks in advance.

Sorry, I did not know; I had never used mutate before. I first added the data frame with attach(df).

That works great, thank you very much.
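
On the `mutate` error quoted in the question: the message says `mutate()` received an integer vector as its first argument rather than a data frame, so the class of `ClickedDate` is not the problem. The exact call that failed is not shown in the thread, so the sketch below is only a guess at a reproduction. It assumes a bare column was passed, which is easy to do after `attach(df)`; the two-row tibble and the `hour` column are made up for illustration.

```r
library(dplyr)

# Small stand-in for the real data (illustrative values only)
df <- tibble(
  ClickedDate = as.POSIXct(c("2021-01-01 00:16:38", "2021-01-01 01:02:59"), tz = "UTC"),
  batch_no    = 1:2
)

# Passing a bare column (an integer vector) instead of the data frame
# triggers the "no applicable method for 'mutate'" error
err <- tryCatch(mutate(df$batch_no, x = 1),
                error = function(e) conditionMessage(e))
print(err)

# Passing the data frame itself works; the POSIXct column needs no conversion
ok <- df %>% mutate(hour = as.integer(format(ClickedDate, "%H")))
print(ok)
```

If this is what happened, dropping `attach(df)` and starting the pipeline from `df %>% ...` should be enough.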