PostgreSQL window functions over data


I have the following events table:

key time_stamp  geohash
k1  1           thred0y
k2  5           thred0v
k4  7           thre6rd
k3  9           thre6rg
k1  10          thred3t
k1  12          thred3u
k2  14          thred3s
I want to group keys that fall within 500 m of each other within a 10-minute time interval.

I tried cross joining the table with itself:

select a.key, b.key, a.geohash, b.geohash, a.time_stamp, b.time_stamp,
  round(ST_Distance(ST_PointFromGeoHash(a.geohash, 4326), ST_PointFromGeoHash(b.geohash, 4326), true)) distance,
  abs(round(extract(EPOCH from a.time_stamp - b.time_stamp)/60))
from t a, t b
where a.key <> b.key
  and a.time_stamp between b.time_stamp - interval '10 min' and b.time_stamp + interval '10 min'
  and ST_Distance(ST_PointFromGeoHash(a.geohash, 4326), ST_PointFromGeoHash(b.geohash, 4326), true) <= 500
  and least(a.key, b.key) = a.key
order by a.time_stamp desc
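
The time-window and pair-deduplication part of this join can be tried on its own, without PostGIS. The following is only a sketch under assumptions: the sample rows are inlined with VALUES, time_stamp is treated as integer minutes (the real column appears to be a timestamp), and the distance check is left out entirely. Because a.key <> b.key is already required, least(a.key, b.key) = a.key is equivalent to a.key < b.key, which is what the sketch uses.

```sql
-- Assumption: time_stamp holds integer minutes; the real table likely uses timestamps.
with t(key, time_stamp, geohash) as (
    values ('k1',  1, 'thred0y'),
           ('k2',  5, 'thred0v'),
           ('k4',  7, 'thre6rd'),
           ('k3',  9, 'thre6rg'),
           ('k1', 10, 'thred3t'),
           ('k1', 12, 'thred3u'),
           ('k2', 14, 'thred3s')
)
select a.key        as key_a,
       b.key        as key_b,
       a.time_stamp as ts_a,
       b.time_stamp as ts_b,
       abs(a.time_stamp - b.time_stamp) as minutes_apart
from t a
join t b
  on a.key < b.key  -- keep each unordered pair of distinct keys once
 and a.time_stamp between b.time_stamp - 10 and b.time_stamp + 10
order by a.time_stamp desc;
```

Each remaining pair would still need the 500 m distance test from the original query.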
I found a solution by clustering keys within 60 minutes and within a distance of about 1.2 km:

with x as (
    select key, time_stamp, geo, prev_ts, geo_hash6,
           -- running count of "cluster start" rows gives each row its cluster id
           -- prev_ts is in seconds, so 60 here means 60 seconds; a 60-minute gap would be 3600
           count(case when prev_ts is null or prev_ts > 60 then 1 else null end)
               over (order by time_stamp) cluster_id
    from (
        select key, time_stamp, geo,
               -- seconds elapsed since the previous event
               EXTRACT(EPOCH FROM time_stamp - lag(time_stamp) over (order by time_stamp)) prev_ts,
               -- 6-character geohash prefix: a cell of roughly 1.2 km x 0.6 km
               substring(geo, 1, 6) geo_hash6
        from t
    ) a
    order by cluster_id, geo_hash6, geo, time_stamp
)
select x.cluster_id, x.key, x.geo_hash6, min(time_stamp) first_time, max(time_stamp) last_time
from x,
     -- number of distinct keys per (cluster, geohash cell)
     (select cluster_id, geo_hash6, count(distinct key) num_uniques
      from x
      group by cluster_id, geo_hash6) y
where x.cluster_id = y.cluster_id
  and x.geo_hash6 = y.geo_hash6
  and y.num_uniques > 2  -- keep cells seen by more than two distinct keys
group by x.cluster_id, x.geo_hash6, x.key
order by x.cluster_id, x.geo_hash6;
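
The cluster_id step is the usual gaps-and-islands pattern: compute the gap to the previous row with lag(), flag rows where a new cluster starts, and take a running count of those flags. A minimal sketch on the sample rows from the question, assuming time_stamp is integer minutes and using a hypothetical threshold of 3 minutes purely for illustration:

```sql
-- Assumptions: integer-minute time_stamp and a 3-minute gap threshold (both illustrative).
with t(key, time_stamp, geohash) as (
    values ('k1',  1, 'thred0y'),
           ('k2',  5, 'thred0v'),
           ('k4',  7, 'thre6rd'),
           ('k3',  9, 'thre6rg'),
           ('k1', 10, 'thred3t'),
           ('k1', 12, 'thred3u'),
           ('k2', 14, 'thred3s')
),
gaps as (
    select key, time_stamp, geohash,
           -- minutes since the previous event; null for the first row
           time_stamp - lag(time_stamp) over (order by time_stamp) as gap
    from t
)
select key, time_stamp, geohash, gap,
       -- a row starts a new cluster when it is the first row or the gap is too large
       count(*) filter (where gap is null or gap > 3)
           over (order by time_stamp) as cluster_id
from gaps
order by time_stamp;
```

With these inputs the gaps are null, 4, 2, 2, 1, 2, 2, so the first row gets cluster_id 1 and the remaining rows get cluster_id 2.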

Any suggestions for improving this solution are welcome.
