SQL: Clustering points around known cluster centers
I have a set of points (~1000) and a set of cluster centers (~100). I now want to cluster the point set while taking the known cluster centers into account. Every cluster should start at a known cluster center and grow outward, collecting all points that lie less than x meters from the nearest point already in the cluster.
I currently have the following standard PostGIS DBSCAN query:
WITH clusters AS (
    SELECT
        landmark_id, coordinate,
        ST_ClusterDBSCAN(coordinate, eps := (30 / 111111.0), minpoints := 10) OVER() AS cluster_id
    FROM landmarks
    WHERE coordinate IS NOT NULL
)
SELECT
    cluster.id, cluster.landmark_ids,
    ST_Centroid(cluster.geometry) AS coordinate,
    ST_AsGeoJSON(cluster.geometry) AS geometry
FROM (
    SELECT
        cluster_id AS id,
        array_agg(landmark_id) AS landmark_ids,
        ST_ConvexHull(ST_Collect(coordinate)) AS geometry
    FROM clusters
    WHERE cluster_id IS NOT NULL
    GROUP BY cluster_id
) AS cluster;
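Any pointers on how I could adapt the query above, or write a different query, to do what I want? If that means writing a different query, pointers on that would be welcome too.

I am not sure whether you mean only the points picked up by the first clustering pass, or also the points you would pick up recursively.
This solution only compares against the original clusters; it does not attempt to match based on recursive clustering. That would require a recursive query, and I doubt it would produce a better answer.
I am also not sure why you chose to compute the centroid from the convex hull. I assume you want the true centroid, which can be computed from the ST_Collect output.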
WITH cluster1 AS (
    -- first pass: plain DBSCAN over all landmarks that have a coordinate
    SELECT
        landmark_id, coordinate,
        ST_ClusterDBSCAN(coordinate, eps := (30 / 111111.0), minpoints := 10) OVER() AS cluster_id
    FROM landmarks
    WHERE coordinate IS NOT NULL
),
clustered AS ( SELECT * FROM cluster1 WHERE cluster_id IS NOT NULL ),
clusterall AS (
    SELECT
        l.landmark_id, l.coordinate, c.cluster_id
    FROM landmarks AS l
    CROSS JOIN
    -- find closest cluster
    LATERAL (SELECT cluster_id
             FROM clustered AS c
             ORDER BY c.coordinate <-> l.coordinate LIMIT 1 ) AS c
    -- only look for landmarks not matched to a cluster
    WHERE l.landmark_id NOT IN (SELECT c.landmark_id FROM clustered AS c)
      -- skip landmarks without a coordinate; they have no nearest cluster
      AND l.coordinate IS NOT NULL
    UNION ALL
    -- landmarks that DBSCAN already assigned keep their cluster;
    -- reading from clustered (not cluster1) avoids re-adding the noise points
    SELECT c.landmark_id, c.coordinate, c.cluster_id
    FROM clustered AS c
)
SELECT
    cluster.id, cluster.landmark_ids,
    ST_Centroid(cluster.geometry) AS coordinate,
    ST_AsGeoJSON(cluster.geometry) AS geometry
FROM (
    SELECT
        cluster_id AS id,
        array_agg(landmark_id) AS landmark_ids,
        ST_ConvexHull(ST_Collect(coordinate)) AS geometry
    FROM clusterall
    GROUP BY cluster_id
) AS cluster;
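The remark about the centroid can be illustrated with a small comparison. The sketch below reuses the landmarks table and DBSCAN parameters from the question and returns two candidate centroids per cluster: the centroid of the collected points (for a MultiPoint this is the mean of the point coordinates) and the centroid of the convex hull polygon. The column names point_centroid and hull_centroid are made up for illustration.

```sql
-- Sketch: compare the centroid of the collected points with the
-- centroid of their convex hull, per DBSCAN cluster.
SELECT
    cluster_id AS id,
    -- centroid of the MultiPoint built by ST_Collect (mean of the points)
    ST_Centroid(ST_Collect(coordinate)) AS point_centroid,
    -- centroid of the hull; ignores how points are spread inside it
    ST_Centroid(ST_ConvexHull(ST_Collect(coordinate))) AS hull_centroid
FROM (
    SELECT
        coordinate,
        ST_ClusterDBSCAN(coordinate, eps := (30 / 111111.0), minpoints := 10) OVER() AS cluster_id
    FROM landmarks
    WHERE coordinate IS NOT NULL
) AS clusters
WHERE cluster_id IS NOT NULL
GROUP BY cluster_id;
```

The two values differ most when the points are unevenly distributed inside the hull, so which one is "right" depends on what the centroid is used for.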
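I am not sure that what I want can still be called clustering. Basically, it should work like this:
1. Grow outward from some point, collecting all points that are x meters or less from that point.
2. Repeat step 1 from every newly collected point. This continues until there are no points within x meters of the most recently added points. This seems to be where the recursion comes in.
3. Once that is done, create a convex hull from all the collected points.
I still need to look at which centroid I want to use.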
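The three steps above map fairly directly onto a recursive CTE. The following is only a minimal sketch under several assumptions that are not in the original posts: the known centers live in a hypothetical table cluster_centers(center_id, coordinate), both coordinate columns hold longitude/latitude points in SRID 4326 (so the cast to geography makes ST_DWithin measure in meters), and x is 30 meters.

```sql
-- Sketch: grow each cluster outward from a known center.
-- cluster_centers(center_id, coordinate) is a hypothetical table.
WITH RECURSIVE grown AS (
    -- step 1: landmarks within 30 m of a known center
    SELECT c.center_id, l.landmark_id
    FROM cluster_centers AS c
    JOIN landmarks AS l
      ON ST_DWithin(l.coordinate::geography, c.coordinate::geography, 30)
  UNION
    -- step 2: landmarks within 30 m of any landmark collected so far;
    -- UNION drops pairs already seen, so the recursion stops once
    -- an iteration adds nothing new
    SELECT g.center_id, near.landmark_id
    FROM grown AS g
    JOIN landmarks AS cur
      ON cur.landmark_id = g.landmark_id
    JOIN landmarks AS near
      ON ST_DWithin(near.coordinate::geography, cur.coordinate::geography, 30)
)
-- step 3: one convex hull per center from everything collected
SELECT
    g.center_id,
    array_agg(g.landmark_id) AS landmark_ids,
    ST_AsGeoJSON(ST_ConvexHull(ST_Collect(l.coordinate))) AS geometry
FROM grown AS g
JOIN landmarks AS l
  ON l.landmark_id = g.landmark_id
GROUP BY g.center_id;
```

Only the ids are carried through the recursion, so duplicate elimination compares plain (center_id, landmark_id) pairs rather than geometries. Note that a landmark reachable from two centers ends up in both clusters; whether such clusters should be merged or the landmark assigned to the nearer center is a design decision this sketch leaves open. With about 1000 points the self-join is likely cheap, but a spatial index on the coordinate column (or on its geography cast) would probably help on larger tables.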