PostgreSQL: create and concatenate an array from multiple rows based on ID

arrays postgresql

I have two tables.

The table LINKS has:

    LINK_ID --- integer, unique ID
    FROM_NODE_X -- numbers/floats, indicating a geographical position
    FROM_NODE_Y --
    FROM_NODE_Z --
    TO_NODE_X --
    TO_NODE_Y --
    TO_NODE_Z --

The table LINK_COORDS contains:

    LINK_ID --- integer, refers to above UID
    ORDER --- integer, indicating order
    X ---
    Y ---
    Z ---

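Logically, each link consists of multiple waypoints. The final order is:

FROM_NODE , 1 , 2 , 3 , ... , TO_NODE
A link has at least two waypoints (FROM_NODE and TO_NODE), but it can have a variable number of waypoints in between (0 to 100+).
I now need a way to aggregate, sort, and store the waypoints of each link in a single array, which will later be used to draw a line.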
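To make the target shape concrete, here is a minimal, self-contained sketch that needs no tables. The inline VALUES rows, the column names ord and x, and the start/end values 0.0 and 9.0 are all made up for illustration; they stand in for the ORDER and X columns and for FROM_NODE_X and TO_NODE_X.

```sql
-- Toy data: three intermediate waypoints, deliberately listed out of order.
-- ord and x are hypothetical stand-ins for ORDER and X in LINK_COORDS.
SELECT ARRAY[0.0]                    -- hypothetical FROM_NODE_X
       || array_agg(x ORDER BY ord)  -- intermediate waypoints, sorted by ord
       || ARRAY[9.0] AS points_x     -- hypothetical TO_NODE_X
FROM (VALUES (2, 2.0), (1, 1.0), (3, 3.0)) AS t(ord, x);
```

The result is one array holding the start value first, then the waypoints in ord order, then the end value.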
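I am struggling with the fact that the link coordinates are available only as separate rows. Having the start and end positions in the other (LINKS) table does not help either. If I had a way to at least get all the LINK_COORDS joined/updated into the LINKS table, I could probably figure out the rest myself. So if you have an idea how to do this, it would be much appreciated.
Taking performance into account would be nice (the tables have 500k to 1 million entries now and will have several times that later), but it is not required right now.
Edit: Thanks for the suggestion, a_horse_with_no_name. Before this step I chose to create a point geometry (PostGIS) for each XYZ, so in the end I only need to build one array of points from the individual points. The adapted SQL: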

UPDATE "Link"
SET "POINTS" = array_append(
        array_prepend(
            "FROM_POINT",
            (SELECT array_agg(lc."POINT" ORDER BY lc."COUNT")
             FROM "LinkCoordinate" lc
             WHERE lc."LINK_ID" = "Link"."LINK_ID")),
        "TO_POINT")
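However, it runs extremely slowly: running it on 10 links takes about 120 seconds. Running it for all 1.3 million links, with even more link coordinates, would take roughly half a year. Not ideal.
How can I find out where this massive slowness comes from?
If I got the source data in a pre-sorted format (that is, the link coordinates grouped by LINK_ID), would that let me speed up the SQL query significantly?
Edit: It seems the main slowdown comes from the SELECT subquery used inside array_agg(). Everything else (including the ordering) does not really cause any slowdown.

My current guess is that the SELECT subquery iterates over the entire "LinkCoordinate" table for every link. That is far more work than necessary, because all LinkCoordinates belonging to one link are always stored together in a "block" of rows. A single sequential pass over the link coordinates would be enough.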
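One way to check that guess is to look at the execution plan for a small batch of links. The sketch below is not from the original post: it assumes there is no index on "LinkCoordinate"("LINK_ID") yet, the index name is invented, and the 10-row filter is just a way to keep the test run short. EXPLAIN ANALYZE really executes the UPDATE, so it is wrapped in a transaction that is rolled back.

```sql
-- Step 1: inspect the plan for 10 links without keeping the changes.
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE "Link"
SET "POINTS" = array_append(
        array_prepend(
            "FROM_POINT",
            (SELECT array_agg(lc."POINT" ORDER BY lc."COUNT")
             FROM "LinkCoordinate" lc
             WHERE lc."LINK_ID" = "Link"."LINK_ID")),
        "TO_POINT")
WHERE "LINK_ID" IN (SELECT "LINK_ID" FROM "Link" ORDER BY "LINK_ID" LIMIT 10);
ROLLBACK;

-- Step 2 (assumption: the plan shows a sequential scan of "LinkCoordinate"
-- repeated once per link). An index lets each subquery fetch only its rows.
-- The index name is hypothetical.
CREATE INDEX IF NOT EXISTS linkcoordinate_link_id_count_idx
    ON "LinkCoordinate" ("LINK_ID", "COUNT");
ANALYZE "LinkCoordinate";
```

If the plan in step 1 shows a Seq Scan on "LinkCoordinate" inside the SubPlan, with a loop count equal to the number of updated links, the guess is probably right; rerunning step 1 after step 2 shows whether the index is picked up.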
Something like this might do it:

select l.link_id,
       min(l.from_node_x) as from_node_x,
       min(l.from_node_y) as from_node_y,
       min(l.from_node_z) as from_node_z,
       array_agg(lc.x order by lc."ORDER") as points_x,
       array_agg(lc.y order by lc."ORDER") as points_y,
       array_agg(lc.z order by lc."ORDER") as points_z,
       min(l.to_node_x) as to_node_x,
       min(l.to_node_y) as to_node_y,
       min(l.to_node_z) as to_node_z
from links l
  -- note: an inner join drops links that have no rows in link_coords;
  -- use "left join" instead if those links must be kept
  join link_coords lc on lc.link_id = l.link_id
group by l.link_id;

The min() is required because of the group by, but it does not change the result because all values from links are the same within a group.
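As a side note, PostgreSQL 9.1 and later accept ungrouped columns in the select list when the group by covers the table's primary key. The sketch below assumes links.link_id is declared as PRIMARY KEY; the question only calls it a unique ID, and a plain UNIQUE constraint is not enough for this to work.

```sql
-- Assumption: links.link_id is the PRIMARY KEY of links.
-- With that, the other columns of links are functionally dependent on
-- link_id and can be selected without wrapping them in min().
select l.link_id,
       l.from_node_x, l.from_node_y, l.from_node_z,
       array_agg(lc.x order by lc."ORDER") as points_x,
       array_agg(lc.y order by lc."ORDER") as points_y,
       array_agg(lc.z order by lc."ORDER") as points_z,
       l.to_node_x, l.to_node_y, l.to_node_z
from links l
  join link_coords lc on lc.link_id = l.link_id
group by l.link_id;
```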
Another possibility is to use scalar subqueries. I'm not sure which one is faster, but the join/group by is probably more efficient.

select l.link_id,
       l.from_node_x,
       l.from_node_y,
       l.from_node_z,
       (select array_agg(lc.x order by lc."ORDER")
        from link_coords lc
        where lc.link_id = l.link_id) as points_x,
       (select array_agg(lc.y order by lc."ORDER")
        from link_coords lc
        where lc.link_id = l.link_id) as points_y,
       (select array_agg(lc.z order by lc."ORDER")
        from link_coords lc
        where lc.link_id = l.link_id) as points_z,
       l.to_node_x,
       l.to_node_y,
       l.to_node_z
from links l
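The join/group by idea can also be applied to the UPDATE from the question's edit, so that "LinkCoordinate" is aggregated once instead of once per link. This is a sketch only, not something measured on the asker's data. It reuses the quoted names from that edit ("Link", "LinkCoordinate", "POINTS", "FROM_POINT", "TO_POINT", "COUNT"); the aliases l, agg and pts are made up, and the second statement assumes "POINTS" is still NULL for rows the first statement did not touch.

```sql
-- Aggregate every link's waypoints in one pass, then join the result to "Link".
UPDATE "Link" AS l
SET "POINTS" = array_append(array_prepend(l."FROM_POINT", agg.pts), l."TO_POINT")
FROM (
    SELECT lc."LINK_ID",
           array_agg(lc."POINT" ORDER BY lc."COUNT") AS pts
    FROM "LinkCoordinate" lc
    GROUP BY lc."LINK_ID"
) AS agg
WHERE agg."LINK_ID" = l."LINK_ID";

-- Links without intermediate waypoints have no row in agg and are skipped above.
-- Assumption: "POINTS" is still NULL for exactly those rows.
UPDATE "Link"
SET "POINTS" = ARRAY["FROM_POINT", "TO_POINT"]
WHERE "POINTS" IS NULL;
```

Whether this beats the correlated subquery with an index on "LINK_ID" depends on the data and settings such as work_mem, so it is worth comparing both with EXPLAIN ANALYZE on a sample.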