PostgreSQL: window functions on data
I have the following events table:
key time_stamp geohash
k1 1 thred0y
k2 5 thred0v
k4 7 thre6rd
k3 9 thre6rg
k1 10 thred3t
k1 12 thred3u
k2 14 thred3s
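To reproduce the problem, the table above could be set up roughly as follows. This is only a sketch: the question gives neither the DDL nor the real timestamps, so the column types and the base time `2020-01-01 00:00` are assumptions, and the `time_stamp` values shown (1, 5, 7, ...) are treated as minutes after that base time.

```sql
-- Hypothetical setup: column types and base time are assumptions.
CREATE TABLE t (
    key        text,
    time_stamp timestamp,
    geohash    text
);

-- Treat the numbers in the time_stamp column as minute offsets
-- from an arbitrary base time.
INSERT INTO t (key, time_stamp, geohash)
SELECT v.key,
       timestamp '2020-01-01 00:00' + v.minutes * interval '1 minute',
       v.geohash
FROM (VALUES
    ('k1',  1, 'thred0y'),
    ('k2',  5, 'thred0v'),
    ('k4',  7, 'thre6rd'),
    ('k3',  9, 'thre6rg'),
    ('k1', 10, 'thred3t'),
    ('k1', 12, 'thred3u'),
    ('k2', 14, 'thred3s')
) AS v(key, minutes, geohash);
```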
I want to group keys if they fall within 500 meters of each other within a 10-minute time interval.
I tried cross joining them:
select a.key, b.key, a.geohash, b.geohash, a.time_stamp, b.time_stamp,
round(ST_Distance(ST_PointFromGeoHash(a.geohash, 4326), ST_PointFromGeoHash(b.geohash, 4326), true)) distance,
abs(round(extract(EPOCH from a.time_stamp - b.time_stamp)/60))
from t a, t b
where a.key <> b.key
and a.time_stamp between b.time_stamp - interval '10 min' and b.time_stamp + interval '10 min'
and ST_Distance(ST_PointFromGeoHash(a.geohash, 4326), ST_PointFromGeoHash(b.geohash, 4326), true) <= 500
and least(a.key, b.key) = a.key
order by a.time_stamp desc
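The same pairwise condition can be sketched with an explicit join and `ST_DWithin`, which takes the distance limit directly. This assumes PostGIS is installed and that `t` has the columns shown in the table above; the aliases `key_a`, `key_b`, `ts_a` and `ts_b` are made up for illustration. `a.key < b.key` keeps each pair once, which is what the `a.key <> b.key` and `least(...)` conditions do together.

```sql
-- Sketch only: assumes PostGIS and a table t(key, time_stamp, geohash).
SELECT a.key AS key_a, b.key AS key_b,
       a.time_stamp AS ts_a, b.time_stamp AS ts_b
FROM t a
JOIN t b
  ON a.key < b.key                       -- each unordered pair once
 AND a.time_stamp BETWEEN b.time_stamp - interval '10 min'
                      AND b.time_stamp + interval '10 min'
 AND ST_DWithin(                         -- distance in meters on geography
       ST_SetSRID(ST_PointFromGeoHash(a.geohash), 4326)::geography,
       ST_SetSRID(ST_PointFromGeoHash(b.geohash), 4326)::geography,
       500)
ORDER BY a.time_stamp DESC;
```

This still yields pairs rather than groups, so it does not by itself answer the grouping question.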
I found a solution by clustering keys that fall within 60 minutes of each other and within a distance of about 1.2 km:
with x as (
  -- prev_ts: seconds since the previous event; geo_hash6: 6-character geohash
  -- prefix, a cell of roughly 1.2 km x 0.6 km
  select key, time_stamp, geo, prev_ts, geo_hash6,
         -- running count of gaps longer than 60 minutes gives the cluster number
         count(case when prev_ts is null or prev_ts > 3600 then 1 end) over (order by time_stamp) cluster_id
  from (
    select key, time_stamp, geo,
           -- EXTRACT(EPOCH ...) returns seconds, so 60 minutes is 3600
           extract(epoch from time_stamp - lag(time_stamp) over (order by time_stamp)) prev_ts,
           substring(geo, 1, 6) geo_hash6
    from t
  ) a
)
select x.cluster_id, x.key, x.geo_hash6, min(x.time_stamp) first_time, max(x.time_stamp) last_time
from x
join (
  -- distinct keys per time cluster and geohash cell
  select cluster_id, geo_hash6, count(distinct key) num_uniques
  from x
  group by cluster_id, geo_hash6
) y on x.cluster_id = y.cluster_id and x.geo_hash6 = y.geo_hash6
where y.num_uniques > 2  -- keep cells visited by more than two distinct keys
group by x.cluster_id, x.geo_hash6, x.key
order by x.cluster_id, x.geo_hash6;
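The `cluster_id` step in the query above is a "gaps and islands" pattern: a new cluster starts whenever the gap to the previous event exceeds the threshold, and a running count of those starts numbers the clusters. A minimal sketch of just that step, which runs without PostGIS, is below. The sample rows and the names `events`, `gaps` and `gap_min` are made up for illustration, and the gap is converted to minutes before comparing it to 60.

```sql
-- Illustrative data only; not taken from the question.
WITH events(key, time_stamp) AS (
    VALUES ('k1', timestamp '2020-01-01 00:01'),
           ('k2', timestamp '2020-01-01 00:05'),
           ('k3', timestamp '2020-01-01 02:00'),  -- 115 min after the previous row
           ('k1', timestamp '2020-01-01 02:10')
),
gaps AS (
    SELECT key, time_stamp,
           -- minutes since the previous event (NULL for the first row)
           EXTRACT(EPOCH FROM time_stamp
                   - lag(time_stamp) OVER (ORDER BY time_stamp)) / 60 AS gap_min
    FROM events
)
SELECT key, time_stamp, gap_min,
       -- running count of cluster starts up to and including this row
       count(*) FILTER (WHERE gap_min IS NULL OR gap_min > 60)
           OVER (ORDER BY time_stamp) AS cluster_id
FROM gaps
ORDER BY time_stamp;
```

With these rows, the first two events fall into cluster 1 and the last two into cluster 2. Note that this chains events: a cluster can span more than 60 minutes as long as no single gap exceeds the threshold.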
Any suggestions for improving the solution are welcome.