Hive query to convert an array of structs to an array of strings


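My use case is as follows:

I want to copy data from Table A to Table B and convert field1 from an array of structs to an array of strings, where each string is the val1 attribute of the corresponding struct in Table A; val2 is ignored.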

Table-A:
field1: array<struct<val1: str, val2: int>>
sample data:
[{val1: "abc", val2: 123}, {val1: "def", val2: 456}], [{val1: "xyz", val2: 789}]

Table-B:
field1: array<string>
sample data:
["abc", "def"], ["xyz"]

I would also like to do this strictly in HiveQL rather than through a UDF written in Python.

Thanks.

Explode the original array with lateral view explode, then collect struct.val1 back into an array with collect_set or collect_list:

with mydata as (--This is your data example, use your table instead of this CTE
select stack (2,
array(named_struct("val1", "abc", "val2", 123), named_struct("val1", "def", "val2", 456)), 
array(named_struct("val1", "xyz", "val2", 789))
) as myarray
)

select t.myarray as original_array, collect_set(s.val1) as result_array
  from mydata t
       lateral view explode(myarray) e as s --struct
group by t.myarray 

Result:

original_array                                          result_array
[{"val1":"abc","val2":123},{"val1":"def","val2":456}]   ["abc","def"]
[{"val1":"xyz","val2":789}]                             ["xyz"]
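
One caveat with the query above: collect_set drops duplicate values, and grouping by the array itself merges rows whose arrays are identical. A minimal sketch of a variant follows, assuming each row has some unique key. The id column here is hypothetical and not part of the original tables. It uses collect_list so repeated val1 values are kept; note that the order of the collected elements is not formally guaranteed.

```sql
-- Sketch only: the id column is an assumed unique row key, not from the question.
with mydata as (
select stack (2,
  1, array(named_struct('val1', 'abc', 'val2', 123), named_struct('val1', 'abc', 'val2', 456)),
  2, array(named_struct('val1', 'xyz', 'val2', 789))
) as (id, myarray)
)

select t.id, collect_list(s.val1) as result_array -- collect_list keeps duplicates
  from mydata t
       lateral view explode(t.myarray) e as s     -- one row per struct
 group by t.id;                                   -- group by the key, not by the array
```

With collect_set instead of collect_list, the first row would collapse to a single "abc".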

Your structure could also be declared as a map rather than a struct. In that case, use s['val1'] instead of s.val1 to get the map element.
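
For illustration, a minimal sketch of the map variant, assuming the column is array<map<string,string>> (so val2 is stored as a string here, which is an assumption made only to keep the map values of one type). A single-row CTE is used, so no group by is needed.

```sql
-- Sketch only: the sample data is made up to mirror the struct example.
with mydata as (
select array(map('val1', 'abc', 'val2', '123'),
             map('val1', 'def', 'val2', '456')) as myarray
)

select collect_list(s['val1']) as result_array -- map lookup instead of s.val1
  from mydata t
       lateral view explode(t.myarray) e as s; -- s is a map<string,string>
```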

There is a bit of magic in how Hive handles arrays that lets you do this directly:

select t.myarray as original_array, t.myarray.val1 from mydata t
That is, selecting the struct field val1 from an array of structs returns an array of the val1 values.
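
Applied to the original question, the copy from Table A to Table B could then be sketched as a single insert. The table names table_a and table_b are hypothetical stand-ins, and the sketch assumes table_b already exists with field1 declared as array<string>.

```sql
-- Sketch only: table_a and table_b are assumed names for Table A and Table B.
-- table_a.field1 is array<struct<val1:string,val2:int>>,
-- table_b.field1 is array<string>.
insert into table table_b
select t.field1.val1 as field1 -- field access on the array yields array<string>
  from table_a t;
```

Because this form needs no explode or group by, each source row maps to exactly one target row, and duplicates within an array are kept.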


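Could you post some sample data with the expected output, along with what you have tried?
@VamsiPrabhala, I have added sample data. My query is something like: select collect_list(select col.val1 from explode(field1) as col) from Table A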