Athena: unnesting an array of JSON strings inside another array of JSON structs
I have the following AWS Athena CREATE TABLE statement:
CREATE EXTERNAL TABLE IF NOT EXISTS s2cs3dataset.s2c_storage (
`MessageHeader` string,
`TimeToProcess` float,
`KeyCreated` string,
`KeyLastTouch` string,
`CreatedDateTime` string,
`TableReference` array<struct<`BusinessObject`: string,
`TransactionType`: string,
`ReferenceKeyId`: float,
`ReferencePrimaryKey`: string,
`IncludedTables`: array<string>>>,
`SAPStoreReference` string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
)
LOCATION 's3://api-dev-dpstorage-s3/S2C_INPUT/storage/'
TBLPROPERTIES ('has_encrypted_data'='false');
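However, when I run my query (reproduced at the end of this post), I get the following error:
SYNTAX_ERROR: line 9:1: Expression "it" is not of type ROW
If I remove the last cross join and the column that references it, the query works fine, so I am doing something wrong when trying to unpack the array of strings nested inside the array of structs. Any suggestions?

Based on the clarifying comments, tr.IncludedTables is of type array(varchar). Therefore, in the query fragment
... CROSS JOIN UNNEST(tr.IncludedTables) AS p(it)
the column it is of type varchar. In the SELECT clause you can refer to that value as it (or give it an alias: it AS IncludedTables), but you cannot refer to it as it.IncludedTables. A varchar value has no fields, so in particular it has no IncludedTables field.

What is the output of SELECT typeof(tr.IncludedTables) FROM s2c_storage CROSS JOIN UNNEST(s2c_storage.tablereference) AS p(tr)?
If tr.IncludedTables is array(varchar), then after UNNEST(tr.IncludedTables) AS p(it), it is varchar. In your query, replace it.IncludedTables with it. Does this help?
That works, Piotr. Can you help me understand why it works?
I added some explanation as an answer.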
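The type check suggested in the comments can be run as a standalone query. This is only a sketch: the column alias included_tables_type is made up for illustration, and it assumes, as the thread does, that UNNEST over the struct array binds each struct to the single column tr.

```sql
-- Diagnostic sketch: report the type of the nested field for one row.
-- included_tables_type is an illustrative alias, not a name from the original post.
SELECT typeof(tr.IncludedTables) AS included_tables_type
FROM s2c_storage
CROSS JOIN UNNEST(s2c_storage.tablereference) AS p(tr)
LIMIT 1;
```

According to the thread, the reported type is array(varchar), which is why each element produced by the second UNNEST is a plain string rather than a struct.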
The query from the question, which raises the error above:
SELECT MessageHeader,
TimeToProcess,
KeyCreated,
KeyLastTouch,
CreatedDateTime,
tr.BusinessObject,
tr.TransactionType,
tr.ReferencePrimaryKey,
it.IncludedTables,
SAPStoreReference
FROM s2c_storage
cross join UNNEST(s2c_storage.tablereference) as p(tr)
cross join UNNEST(tr.IncludedTables) as p(it)
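Following the answer, a corrected version of this query might look like the sketch below. It has not been run against the real table. The only substantive change is selecting it directly, with an alias, instead of it.IncludedTables. The distinct relation aliases t1 and t2 are an illustrative readability choice and not part of the original post.

```sql
-- Corrected sketch of the query from the question.
-- Assumption: as in the question, UNNEST over the struct array yields a single
-- column (tr) holding the whole struct. t1 and t2 are illustrative aliases.
SELECT MessageHeader,
       TimeToProcess,
       KeyCreated,
       KeyLastTouch,
       CreatedDateTime,
       tr.BusinessObject,
       tr.TransactionType,
       tr.ReferencePrimaryKey,
       it AS IncludedTables,  -- it is a varchar element, so there is no field to access
       SAPStoreReference
FROM s2c_storage
CROSS JOIN UNNEST(s2c_storage.tablereference) AS t1(tr)
CROSS JOIN UNNEST(tr.IncludedTables) AS t2(it);
```

Each output row then carries one element of IncludedTables alongside the fields of its parent struct. Note that a CROSS JOIN against UNNEST produces no rows for records whose array is empty or NULL, so such records drop out of the result.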