PostgreSQL WHERE clause to exclude rows that have a timestamp within 50 milliseconds on either side?


I have part of a table that looks like this:

 timestamp                  | Source
----------------------------+----------
 2017-07-28 14:20:28.757464 | Stream
 2017-07-28 14:20:28.775248 | Poll
 2017-07-28 14:20:29.777678 | Poll
 2017-07-28 14:21:28.582532 | Stream
I would like to get this:

 timestamp                  | Source
----------------------------+----------
 2017-07-28 14:20:28.757464 | Stream
 2017-07-28 14:20:29.777678 | Poll
 2017-07-28 14:21:28.582532 | Stream
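The second row of the original table has been removed because it is within 50 milliseconds of the timestamp before or after it. The important point is that a row should be removed only when Source = 'Poll'.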

I'm not sure how to achieve this with a WHERE clause.


Thanks in advance for your help.

Whatever we do, we can restrict it to the pools (i.e. the Poll rows) and then union those rows with the streams:

with 
streams as (
 select *
 from test 
 where Source = 'Stream'  
),
pools as (
  ...
)

(select * from pools) union (select * from streams) order by timestamp
There are different options for getting the pools:

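Correlated subquery. For each row, we run an extra query to fetch the timestamp of the previous row with the same source, then keep only the rows that have no previous timestamp or whose previous timestamp is more than 50 milliseconds earlier.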

with 
...
pools_with_prev as (
  -- use correlated subquery
  select 
    timestamp, Source, 
    timestamp - interval '00:00:00.05' 
      as timestamp_prev_limit,
    (select max(t2.timestamp) from test as t2 
      where t2.timestamp < test.timestamp and
      t2.Source = test.Source) 
      as timestamp_prev
  from test
),
pools as (
  select timestamp, Source
  from pools_with_prev
  -- then select rows which are >50ms apart
  where timestamp_prev is NULL or
  timestamp_prev < timestamp_prev_limit
)

...
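Sliding window. Modern SQL can do something similar: partition by source, then use a sliding window to attach the previous row's timestamp.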

with 
...
pools_with_prev as (
  -- use sliding window to join prev timestamp
  select *, 
    timestamp - interval '00:00:00.05' 
      as timestamp_prev_limit,
    lag(timestamp) over(
      partition by Source order by timestamp
    ) as timestamp_prev
  from test
),
pools as (
  select timestamp, Source
  from pools_with_prev
  -- then select rows which are >50ms apart
  where timestamp_prev is NULL or
  timestamp_prev < timestamp_prev_limit
)


...

I believe this is the most efficient option.
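
Both CTEs above compare each row with the previous row of the same Source. The question instead describes dropping a Poll row when a neighboring timestamp of any source lies within 50 milliseconds before or after it. A minimal sketch of that reading, using lag and lead over all rows, is shown here. The table name test and the column names come from the snippets above; the temporary table definition and the aliases ts_prev and ts_next are assumptions made only so the example can be run on its own.

```sql
-- Hypothetical setup: the real table definition is not shown in the question.
create temporary table test (
  "timestamp" timestamp not null,
  source      text      not null
);

insert into test ("timestamp", source) values
  ('2017-07-28 14:20:28.757464', 'Stream'),
  ('2017-07-28 14:20:28.775248', 'Poll'),
  ('2017-07-28 14:20:29.777678', 'Poll'),
  ('2017-07-28 14:21:28.582532', 'Stream');

-- Attach the neighboring timestamps (any source) to every row,
-- then drop only the Poll rows that have a neighbor within 50 ms.
select "timestamp", source
from (
  select
    "timestamp",
    source,
    lag("timestamp")  over (order by "timestamp") as ts_prev,  -- assumed alias
    lead("timestamp") over (order by "timestamp") as ts_next   -- assumed alias
  from test
) as t
where source <> 'Poll'
   or (
        (ts_prev is null or "timestamp" - ts_prev > interval '50 milliseconds')
    and (ts_next is null or ts_next - "timestamp" > interval '50 milliseconds')
   )
order by "timestamp";
```

On the four sample rows this should drop only the Poll row at 14:20:28.775248, which is about 18 ms after the preceding Stream row. The sketch looks only at the immediate neighbors, which fits the statement in the comments that pollers never fire within 50 ms of each other; a gap of exactly 50 ms is treated as too close here, which is a choice rather than something the question specifies.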


What happens if we have three Poll rows in a row, and all three are within 50 milliseconds of a timestamp? Three polls, each less than 50 milliseconds from the next, with the third poll 51 milliseconds from the Stream: what happens then?

That will never happen in the data, because the pollers are set to intervals of more than 50 milliseconds. Only Stream data can fall within 50 milliseconds of a Poll.