How do I compute the time between timestamp rows using BigQuery analytic functions?

I have a dataset representing analytics events, like:

Row     timestamp   account_id  type     
1   2018-11-14 21:05:40 UTC abc start    
2   2018-11-14 21:05:40 UTC xyz another_type     
3   2018-11-26 22:01:19 UTC xyz start    
4   2018-11-26 22:01:23 UTC abc start    
5   2018-11-26 22:01:29 UTC xyz some_other_type
11  2018-11-26 22:13:58 UTC xyz start
...
with some number of distinct account IDs. I need to find the average time between `start` records for each `account_id`.

I am trying to use analytic functions for this. My end goal is a table like this:

Row     account_id     avg_time_between_events_mins
1     xyz     53
2     abc     47
3     pqr     65
...
My best attempt so far looks like this:

WITH
  events AS (
  SELECT
    COUNTIF(type = 'start' AND account_id='abc') OVER (ORDER BY timestamp) as diff,
    timestamp
  FROM
    `myproject.dataset.events`
  WHERE
    account_id='abc')
SELECT
  min(timestamp) AS start_time,
  max(timestamp) AS next_start_time,
  ABS(timestamp_diff(min(timestamp), max(timestamp), MINUTE)) AS minutes_between
FROM
  events
GROUP BY
  diff
For one specific `account_id`, this computes the time between each `start` event and the last non-`start` event before the next `start` event.


I also tried using `PARTITION BY` and a window frame clause, like so:

WITH
  events AS (
  SELECT
    COUNT(*) OVER (PARTITION BY account_id ORDER BY timestamp ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) as diff,
    timestamp
  FROM
    `myproject.dataset.events`
  WHERE
    type = 'start')
SELECT
  min(timestamp) AS start_time,
  max(timestamp) AS next_start_time,
  ABS(timestamp_diff(min(timestamp), max(timestamp), MINUTE)) AS minutes_between
FROM
  events
GROUP BY
  diff

But I get a table of results that makes no sense. Can someone explain how I would write such a query, and how to reason about it?

This does not need analytic functions:

-- timestamp_diff(a, b, part) returns a - b, so the latest timestamp goes first
select timestamp_diff(max(timestamp), min(timestamp), MINUTE) / nullif(count(*) - 1, 0)
from `myproject.dataset.events`
where type = 'start'
group by account_id;

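This is the most recent timestamp minus the earliest timestamp, divided by one less than the number of starts. That is the average time between two consecutive starts.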
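
The shortcut works because the gaps between consecutive starts telescope: adding them up leaves only the last timestamp minus the first, and n starts have n - 1 gaps. For comparison, here is a sketch of the same average written with an analytic function, which makes each gap explicit. The table and column names come from the question; the CTE name `starts`, the alias `prev_timestamp`, and the choice to measure in seconds before converting to minutes are assumptions made for illustration.

```sql
-- Sketch: average gap between consecutive `start` events per account, using LAG.
WITH starts AS (
  SELECT
    account_id,
    timestamp,
    -- Previous `start` for the same account; NULL for the account's first `start`.
    LAG(timestamp) OVER (PARTITION BY account_id ORDER BY timestamp) AS prev_timestamp
  FROM
    `myproject.dataset.events`
  WHERE
    type = 'start')
SELECT
  account_id,
  -- AVG skips the NULL gap of the first `start`, so it averages n - 1 gaps.
  -- Measuring in seconds avoids truncating each gap to whole minutes.
  AVG(TIMESTAMP_DIFF(timestamp, prev_timestamp, SECOND)) / 60 AS avg_time_between_events_mins
FROM
  starts
GROUP BY
  account_id
```

An account with a single `start` has no gaps, so it should come out as NULL here, just as `nullif(count(*) - 1, 0)` yields NULL in the shorter query. The per-row gaps are mainly useful if you later want something other than the mean, such as a median or a maximum gap.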

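Oh wow! I can't believe how simple and sensible this solution is. Thank you so much.

@SolomonBothwell - agreed, it is simple, but I really doubt it actually answers your question! If you accept this answer, I'll believe you. But in that case you should consider adjusting your question to match the answer :O

Is that because of a column in the proposed result?

No. I think the OP asked for the average time between a start and the last non-start event before the next start, whereas you answered with the average time between two starts. Make sense?

This is clearly a simple task, though my post may have confusingly suggested that it needs window functions. I am asking for the time between `start` events, per account_id.

I think @GordonLinoff's solution needs its first part modified so that it is reported per account_id.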