SQL: Calculating concurrency from a set of ranges

Tags: sql, google-app-engine, google-bigquery


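I have a set of rows, each containing a start timestamp and a duration (in seconds). I want to compute various summaries based on how the rows overlap, i.e. their concurrency.

For example: peak daily concurrency, or peak concurrency grouped by another column.

Sample data:

timestamp,duration
2016-01-01 12:00:00,300
2016-01-01 12:01:00,300
2016-01-01 12:06:00,300

For this data I would like to learn that the peak is 12:01:00-12:05:00, with a concurrency of 2.

Any ideas on how to do this with BigQuery or, less excitingly, with a Map/Reduce job?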

For per-minute resolution, with sessions up to 255 minutes long (BigQuery legacy SQL):

SELECT session_minute, COUNT(*) c
FROM (
  SELECT start, DATE_ADD(start, i, 'MINUTE') session_minute FROM (
    SELECT * FROM (
      SELECT TIMESTAMP("2015-04-30 10:14") start, 7 minutes
    ),(
      SELECT TIMESTAMP("2015-04-30 10:15") start, 12 minutes
    ),(
      SELECT TIMESTAMP("2015-04-30 10:15") start, 12 minutes
    ),(
      SELECT TIMESTAMP("2015-04-30 10:18") start, 12 minutes
    ),(
      SELECT TIMESTAMP("2015-04-30 10:23") start, 3 minutes
    ) 
  ) a
  CROSS JOIN [fh-bigquery:public_dump.numbers_255] b
  WHERE a.minutes>b.i
)
GROUP BY 1
ORDER BY 1
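
For readers who want to check the per-minute approach outside BigQuery, here is a minimal Python sketch of the same idea: expand every session into one entry per active minute, then count entries per minute. It assumes the numbers table starts at i = 0 (so a 7-minute session covers offsets 0 to 6); the function name `sessions_per_minute` is made up for this illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Same illustrative sessions as in the query above: (start, length in minutes).
sessions = [
    (datetime(2015, 4, 30, 10, 14), 7),
    (datetime(2015, 4, 30, 10, 15), 12),
    (datetime(2015, 4, 30, 10, 15), 12),
    (datetime(2015, 4, 30, 10, 18), 12),
    (datetime(2015, 4, 30, 10, 23), 3),
]


def sessions_per_minute(sessions):
    """Count how many sessions are active in each minute bucket."""
    counts = Counter()
    for start, minutes in sessions:
        # Plays the role of the CROSS JOIN with the numbers table,
        # assuming offsets 0 .. minutes-1: one entry per active minute.
        for i in range(minutes):
            counts[start + timedelta(minutes=i)] += 1
    # Sorted by minute, like ORDER BY 1 in the query.
    return dict(sorted(counts.items()))


per_minute = sessions_per_minute(sessions)

# First minute at which the highest count is reached.
peak_minute, peak_count = max(per_minute.items(), key=lambda kv: kv[1])
print(peak_minute, peak_count)
```

The cost grows with the total number of session-minutes, which is why this approach suits coarse resolutions and bounded session lengths.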

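Step 1 - First, find every period (start and finish) together with its number of concurrent entries. The query for this step is the inner SELECT of the step 2 query further down, the one that computes start, finish and concurrent_entries.

Assuming yourTable contains the following input: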

(SELECT TIMESTAMP('2016-01-01 12:00:00') AS ts, 300 AS duration),
(SELECT TIMESTAMP('2016-01-01 12:01:00') AS ts, 300 AS duration),
(SELECT TIMESTAMP('2016-01-01 12:06:00') AS ts, 300 AS duration),
(SELECT TIMESTAMP('2016-01-01 12:07:00') AS ts, 300 AS duration),
(SELECT TIMESTAMP('2016-01-01 12:10:00') AS ts, 300 AS duration),
(SELECT TIMESTAMP('2016-01-01 12:11:00') AS ts, 300 AS duration)

The output of the step 1 query would look something like this:

start                       finish                      concurrent_entries   
2016-01-01 12:00:00 UTC     2016-01-01 12:01:00 UTC     1    
2016-01-01 12:01:00 UTC     2016-01-01 12:05:00 UTC     2    
2016-01-01 12:05:00 UTC     2016-01-01 12:07:00 UTC     1    
2016-01-01 12:07:00 UTC     2016-01-01 12:10:00 UTC     2    
2016-01-01 12:10:00 UTC     2016-01-01 12:12:00 UTC     3    
2016-01-01 12:12:00 UTC     2016-01-01 12:15:00 UTC     2    
2016-01-01 12:15:00 UTC     2016-01-01 12:16:00 UTC     1    
2016-01-01 12:16:00 UTC     null                        0

You may still want to polish that query a little, but it does most of what you need.
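
The logic of step 1 can be sketched in Python as an event sweep: every row contributes +1 at its start and -1 at its end, timestamps whose net change is zero are dropped, and a running sum over the sorted events gives the concurrency of each period. This is only an illustration under the sample input above; the function name `concurrency_periods` is made up here.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# The six sample rows from above: (ts, duration in seconds).
rows = [
    (datetime(2016, 1, 1, 12, 0), 300),
    (datetime(2016, 1, 1, 12, 1), 300),
    (datetime(2016, 1, 1, 12, 6), 300),
    (datetime(2016, 1, 1, 12, 7), 300),
    (datetime(2016, 1, 1, 12, 10), 300),
    (datetime(2016, 1, 1, 12, 11), 300),
]


def concurrency_periods(rows):
    """Return (start, finish, concurrent_entries) tuples, ordered by start."""
    deltas = defaultdict(int)
    for ts, duration in rows:
        deltas[ts] += 1                                 # an entry starts
        deltas[ts + timedelta(seconds=duration)] -= 1   # an entry ends

    # Drop timestamps where starts and ends cancel out (HAVING entry != 0).
    events = sorted((ts, d) for ts, d in deltas.items() if d != 0)

    periods = []
    running = 0
    for idx, (ts, delta) in enumerate(events):
        running += delta  # running sum, like SUM(entry) OVER(ORDER BY ts)
        # The next event's timestamp closes the period, like LEAD(ts).
        finish = events[idx + 1][0] if idx + 1 < len(events) else None
        periods.append((ts, finish, running))
    return periods


for start, finish, concurrent in concurrency_periods(rows):
    print(start, finish, concurrent)
```

For the sample input this yields the same eight periods as the table above, ending with a period that starts at 12:16:00, has no finish, and has 0 concurrent entries.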

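Step 2 - Now you can compute any statistics you like from the result above.

For example, to rank the periods by concurrency over the whole time span (peak = 1 marks the peak period):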

SELECT 
  start, finish, concurrent_entries, RANK() OVER(ORDER BY concurrent_entries DESC) AS peak
FROM (
  SELECT ts AS start, LEAD(ts) OVER(ORDER BY ts) AS finish, 
         SUM(entry) OVER(ORDER BY ts) AS concurrent_entries
  FROM (
    SELECT ts, SUM(entry)AS entry FROM 
      (SELECT ts, 1 AS entry FROM yourTable),
      (SELECT DATE_ADD(ts, duration, 'second') AS ts, -1 AS entry FROM yourTable)
    GROUP BY ts
    HAVING entry != 0
  )
)
ORDER BY peak
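
The question also asks for peak daily concurrency and peak concurrency grouped by another column. A minimal Python sketch of that follows, reusing the +1/-1 event idea per group. The third field (`channel`) is a hypothetical grouping column invented for this example, and grouping by day uses the start date only, so an entry that runs past midnight is counted in the day it started (a simplifying assumption).

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample rows with a hypothetical third column: (ts, duration_seconds, channel).
rows = [
    (datetime(2016, 1, 1, 12, 0), 300, "a"),
    (datetime(2016, 1, 1, 12, 1), 300, "a"),
    (datetime(2016, 1, 1, 12, 6), 300, "b"),
    (datetime(2016, 1, 1, 12, 7), 300, "b"),
    (datetime(2016, 1, 1, 12, 10), 300, "a"),
    (datetime(2016, 1, 1, 12, 11), 300, "b"),
]


def peak_concurrency(pairs):
    """Highest number of simultaneously active (ts, duration_seconds) pairs."""
    deltas = defaultdict(int)
    for ts, duration in pairs:
        deltas[ts] += 1
        deltas[ts + timedelta(seconds=duration)] -= 1
    running = peak = 0
    for ts in sorted(deltas):
        # Starts and ends at the same timestamp are netted before counting,
        # so an entry ending at t does not overlap one starting at t.
        running += deltas[ts]
        peak = max(peak, running)
    return peak


def peak_by_group(rows, key):
    """Peak concurrency per group, where key(row) picks the group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[:2])
    return {k: peak_concurrency(v) for k, v in groups.items()}


overall = peak_concurrency([r[:2] for r in rows])            # 3
per_day = peak_by_group(rows, key=lambda r: r[0].date())     # one entry per start date
per_channel = peak_by_group(rows, key=lambda r: r[2])        # {"a": 2, "b": 2}
print(overall, per_day, per_channel)
```

In BigQuery the equivalent would be to partition the window functions by the grouping column; the sketch above only shows the logic.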