Snowflake SQL: GET_PATH() or FLATTEN() array query to find the latest key:value
I have a column "amp" in the table "example". The column "amp" is an array that looks like this:
[{
"list": [{
"element": {
"x_id": "12356789XXX",
"y_id": "12356789XXX38998"
}
},
{
"element": {
"x_id": "5677888356789XXX",
"y_id": "1xx387688"
}
}]
}]
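How should I query this with get_path() or flatten() to extract the latest x_id and y_id values (or other optional values)?
In this example there are only 2 elements, but there could be anywhere from 1 to 6000 elements containing x_id and y_id.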
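Any help is much appreciated.

Someone may have a more elegant way than this, but you can use a CTE. In the first table expression, get the maximum index of the array. In the second, get the values you need.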
set json = '[{"list": [{"element": {"x_id": "12356789XXX","y_id": "12356789XXX38998"}},{"element": {"x_id": "5677888356789XXX","y_id": "1XXX387688"}}]}]';
create temp table foo(v variant);
insert into foo select parse_json($json);
with
-- Largest array index found anywhere in the flattened document.
-- With a single-element outer array this is the last position in "list".
MAX_INDEX(M) as
(
select max("INDEX") MAX_INDEX
from foo, lateral flatten(v, recursive => true)
),
-- Every value in the document, with its full path and key.
VALS(V, P, K) as
(
select "VALUE", "PATH", "KEY"
from foo, lateral flatten(v, recursive => true)
)
-- Keep only the x_id and y_id rows that sit under the last list element.
select k as "KEY", V::string as VALUE from vals, max_index
where VALS.P = '[0].list[' || max_index.m || '].element.x_id' or
VALS.P = '[0].list[' || max_index.m || '].element.y_id'
;
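If the recursive flatten feels heavier than needed, a narrower variant is to flatten only the inner "list" array and keep the row whose index is the last one. This is a sketch rather than a tested query: it assumes the foo(v variant) temp table created above and that the outer array holds a single element.

```sql
-- Sketch: flatten only v[0]:"list" instead of the whole document.
-- Assumes the foo(v variant) table from above and a single-element outer array.
select f.value:"element":"x_id"::string as x_id,
       f.value:"element":"y_id"::string as y_id
from foo,
     lateral flatten(input => v[0]:"list") f
-- flatten's INDEX column is zero-based, so the last element is at ARRAY_SIZE - 1
where f.index = array_size(v[0]:"list") - 1;
```

This returns one row per source row with x_id and y_id as columns, instead of one row per key.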
Assuming the outer array always contains a single dictionary element, you can use the following:
SELECT amp[0]:"list"[ARRAY_SIZE(amp[0]:"list")-1]:"element":"x_id"::VARCHAR AS x_id
,amp[0]:"list"[ARRAY_SIZE(amp[0]:"list")-1]:"element":"y_id"::VARCHAR AS y_id
FROM T
;
Alternatively, if you prefer modularity/readability, you can use:
WITH CTE1 AS (
SELECT amp[0]:"list" AS _ARRAY
FROM T
)
,CTE2 AS (
SELECT _ARRAY[ARRAY_SIZE(_ARRAY)-1]:"element" AS _DICT
FROM CTE1
)
SELECT _DICT:"x_id"::VARCHAR AS x_id
,_DICT:"y_id"::VARCHAR AS y_id
FROM CTE2
;
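Note: I did not use FLATTEN here because I did not see a good reason to.

Not sure where the time is, to find the latest?

@DavidG.Pickett Sorry, I forgot to mention that the last x_id and y_id are the latest (in this case, the second element).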
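Since the question asks about get_path() specifically, the same "last element" lookup can also be spelled with the GET and GET_PATH functions instead of the : and [] operators. This is only a sketch under the same assumptions as the answers above (a table T with a VARIANT column amp, and a single-element outer array); the CTE name INNER_LIST is made up for illustration.

```sql
-- Sketch: function-call form of the lookup, not a tested query.
-- Assumes table T with VARIANT column amp and a single-element outer array.
WITH INNER_LIST AS (
    -- GET(amp, 0) takes the first outer element; GET_PATH then reads its "list" field
    SELECT GET_PATH(GET(amp, 0), 'list') AS _ARRAY
    FROM T
)
-- GET(_ARRAY, n) picks the element at zero-based position n; the last one is ARRAY_SIZE - 1
SELECT GET_PATH(GET(_ARRAY, ARRAY_SIZE(_ARRAY) - 1), 'element.x_id')::VARCHAR AS x_id
      ,GET_PATH(GET(_ARRAY, ARRAY_SIZE(_ARRAY) - 1), 'element.y_id')::VARCHAR AS y_id
FROM INNER_LIST
;
```

The path strings passed to GET_PATH are constants here; only the array position is computed, which is why GET is used for that step.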