SQL: find the difference between two numbers in different rows
I have a Postgres table that records a per-minute request count as an int value. I have several request types on several servers, and they all live in the same table:
time | key1 | key2 | key3 | value
-----------------------------------------------------------------------
2017-01-16 18:00:53 | server1 | webpage1 | type1 | 30
2017-01-16 18:00:55 | server1 | webpage2 | type1 | 31
2017-01-16 18:00:58 | server1 | webpage3 | type1 | 32
2017-01-16 18:00:59 | server1 | webpage4 | type1 | 33
2017-01-16 18:01:00 | server1 | webpage5 | type1 | 34
2017-01-16 18:01:01 | server1 | webpage6 | type1 | 35
2017-01-16 18:01:02 | server1 | webpage7 | type1 | 36
2017-01-16 18:01:03 | server1 | webpage8 | type1 | 37
2017-01-16 18:01:04 | server1 | webpage1 | type1 | 56
2017-01-16 18:01:06 | server1 | webpage2 | type1 | 35
2017-01-16 18:01:07 | server1 | webpage3 | type1 | 43
2017-01-16 18:01:10 | server1 | webpage4 | type1 | 64
2017-01-16 18:01:13 | server1 | webpage5 | type1 | 44
2017-01-16 18:01:14 | server1 | webpage6 | type1 | 66
2017-01-16 18:01:16 | server1 | webpage7 | type1 | 56
2017-01-16 18:01:18 | server1 | webpage8 | type1 | 22
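Assume key1 and key3 also take different values; I trimmed the data for this example.
The result I need is, for each group of (key1, key2, key3), the latest value minus the value one row before it [I need the rate per minute].
I managed to get the latest row and the row at offset 1 for each key group in a single result: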
SELECT * FROM
(SELECT ROW_NUMBER()
OVER(PARTITION BY key1, key2, key3 ORDER BY time DESC) as rnum,
time, key1, key2, key3, value FROM test ORDER BY time DESC) a
WHERE rnum < 3;
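Now I thought I could take the value column at MIN(time) and at MAX(time) and compute the difference, but I could not merge the rows.
After @HartCO's comment, I was able to do this: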
select time, new_val-last_val, key1, key2, key3 from
(select distinct max(time) over(partition by key1, key2, key3) as time,
max(value) over(partition by key1, key2, key3) as new_val,
min(value) over(partition by key1, key2, key3) as last_val,
key1, key2, key3
from (select row_number() over(partition by key1, key2, key3 order by time desc) as rnum,
time, key1, key2, key3, value from test order by time desc) a
where rnum < 3) b;
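But for webpage8 the desired output should be -15, not 22.

Differences between rows that are offset by some amount are best handled with a window function such as lag(). If your table is not very large, you can fetch the latest value per group with DISTINCT ON. Note that DISTINCT ON is a PostgreSQL extension: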
SELECT DISTINCT ON (key1, key2, key3)
time,
key1,
key2,
key3,
value - lag(value) OVER (PARTITION BY key1, key2, key3 ORDER BY time)
FROM test
ORDER BY key1, key2, key3, time DESC;
This gives us:
time | key1 | key2 | key3 | ?column?
---------------------+------------+-------------+----------+----------
2017-01-16 18:01:04 | server1 | webpage1 | type1 | 26
2017-01-16 18:01:06 | server1 | webpage2 | type1 | 4
2017-01-16 18:01:07 | server1 | webpage3 | type1 | 11
2017-01-16 18:01:10 | server1 | webpage4 | type1 | 31
2017-01-16 18:01:13 | server1 | webpage5 | type1 | 10
2017-01-16 18:01:14 | server1 | webpage6 | type1 | 31
2017-01-16 18:01:16 | server1 | webpage7 | type1 | 20
2017-01-16 18:01:18 | server1 | webpage8 | type1 | -15
(8 rows)
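Because DISTINCT ON is PostgreSQL-specific, a variant that avoids it may be useful. The following is only a sketch under the question's schema (table test with columns time, key1, key2, key3, value); the names diff, rn, ranked and the window name w are illustrative and do not come from the original posts. It computes the same lag() difference, marks the newest row of each key group with row_number(), and keeps only that row.

```sql
-- Sketch: latest row per (key1, key2, key3) with its difference to the
-- previous row, written without DISTINCT ON.
-- Assumed schema from the question: test(time, key1, key2, key3, value).
SELECT time, key1, key2, key3, diff
FROM (
    SELECT time, key1, key2, key3,
           -- difference to the previous row of the same key group
           value - lag(value) OVER w AS diff,
           -- rn = 1 marks the most recent row of each key group
           row_number() OVER (PARTITION BY key1, key2, key3
                              ORDER BY time DESC) AS rn
    FROM test
    WINDOW w AS (PARTITION BY key1, key2, key3 ORDER BY time)
) ranked
WHERE rn = 1
ORDER BY key1, key2, key3;
```

Naming the computed column (diff here) also avoids the ?column? header seen in the output above. A key group with only one row yields a NULL difference, since lag() has no previous row to read.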
Of course, you can use other greatest-n-per-group solutions as well, such as a left join:
WITH diffs AS (
SELECT time,
key1,
key2,
key3,
value - lag(value) OVER (PARTITION BY key1, key2, key3 ORDER BY time)
FROM test)
SELECT d1.*
FROM diffs d1
LEFT JOIN diffs d2
ON (d1.key1, d1.key2, d1.key3) = (d2.key1, d2.key2, d2.key3)
-- This allows us to single out the greatest row
AND d1.time < d2.time
WHERE d2.time IS NULL
-- Ordering is just for show
ORDER BY d1.key1, d1.key2, d1.key3;
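With PostgreSQL 9.5, the planner recognized this pattern and used an anti join in the final query plan. A similar result can be obtained with NOT EXISTS.

Comments:
You seem close. Just as you use ROW_NUMBER() OVER (...), you can use MIN(time) OVER (...), and the same for MAX, to get those values relative to the group of rows defined by the PARTITION BY clause; neither needs an ORDER BY.
@HartCO - can you give an example with MAX and MIN?
@HartCO - please see my edit. But it is not good enough: if the counter decreases I will not catch it, because I am taking max(value) and min(value), and those do not correlate with max(time) and min(time).
Could you take the sample data and show the desired output to help clarify? Please see. That would be easier to work with.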
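The concern raised in the last comments is that max(value) and min(value) ignore which row is newer, so a decreasing counter is reported with the wrong sign. One possible repair of the asker's row_number() approach, sketched here as an assumption rather than something taken from the thread, is to pick each value by its row number with a CASE expression inside the aggregate. The alias diff is illustrative.

```sql
-- Sketch: subtract the second-newest value from the newest one per key group,
-- selecting each value by its row number instead of by max()/min() of value.
-- Assumed schema from the question: test(time, key1, key2, key3, value).
SELECT max(time) AS time,
       max(CASE WHEN rnum = 1 THEN value END)    -- newest value
     - max(CASE WHEN rnum = 2 THEN value END)    -- value one row earlier
       AS diff,
       key1, key2, key3
FROM (
    SELECT row_number() OVER (PARTITION BY key1, key2, key3
                              ORDER BY time DESC) AS rnum,
           time, key1, key2, key3, value
    FROM test
) a
WHERE rnum < 3
GROUP BY key1, key2, key3;
```

With the sample data, webpage8 has 22 as its newest value and 37 before it, so this expression evaluates to 22 - 37 = -15. Key groups with a single row produce NULL, because no row has rnum = 2.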
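The answer mentions NOT EXISTS as an alternative to the left join but does not show it. A minimal sketch of that form, reusing the same diffs CTE, could look like the following; the column alias diff is an addition for readability and is not in the original query.

```sql
-- Sketch: keep only the rows for which no later row exists in the same
-- key group, i.e. the newest row per (key1, key2, key3).
-- Assumed schema from the question: test(time, key1, key2, key3, value).
WITH diffs AS (
    SELECT time, key1, key2, key3,
           value - lag(value) OVER (PARTITION BY key1, key2, key3
                                    ORDER BY time) AS diff
    FROM test
)
SELECT d1.*
FROM diffs d1
WHERE NOT EXISTS (
    SELECT 1
    FROM diffs d2
    WHERE (d2.key1, d2.key2, d2.key3) = (d1.key1, d1.key2, d1.key3)
      AND d2.time > d1.time     -- a later row in the same group
)
-- Ordering is just for show
ORDER BY d1.key1, d1.key2, d1.key3;
```

As with the left join version, rows whose key columns contain NULL never match each other in the row comparison, so every such row is returned.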