Trying to get multiple rows with values from an array in Hive using "lateral view" and "explode" (tags: hadoop, hive, hiveql, impala)


I have a table with two columns, and I am trying to get multiple rows, one for each value in the array.

date                  users
2019-01-01       ["U00001","U00002","U00002"]
date                 user
2019-01-01    "U00001","U00002","U00002"
I am trying to get output like the following:

date               users
2019-01-01       "U00001"
2019-01-01       "U00002"
2019-01-01       "U00003"
I am using the query below:

SELECT date, user FROM  table1
LATERAL VIEW  explode(users)  myTable2 AS user;
I cannot get the expected output shown above. My query result looks like this:

date                  users
2019-01-01       ["U00001","U00002","U00002"]
date                 user
2019-01-01    "U00001","U00002","U00002"
My column data types are:

column         data_type
date            string
user            Array

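date and user are reserved words in Hive, so quote them with backticks. Also (see my example), the lateral view needs an alias (u), and the exploded column is given the alias user: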

with your_data as (--use your table instead of this 
select stack(1, '2019-01-01', array("U00001","U00002","U00002")) as(`date`, users)
)

select t.`date`, u.`user` 
  from your_data t --use your table instead
       lateral view explode(t.users) u as `user` ;
If users is of string type, remove the square brackets and double quotes first, then split and explode:

with your_data as (--use your table instead of this 
select stack(1, '2019-01-01', '["U00001","U00002","U00002"]') as (`date`, users)
)

select t.`date`, u.`user` 
  from your_data t --use your table instead
       lateral view explode(split(regexp_replace(t.users,'\\[|\\]|\\"',''),',')) u as `user` ;
Result:

t.date      u.user
2019-01-01  U00001
2019-01-01  U00002
2019-01-01  U00002
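
To see what each stage of the string variant does, here is a minimal sketch that selects the intermediate values side by side. The aliases cleaned and users_array are made up for illustration, and the literal stands in for a string-typed users column; replace the CTE with your own table.

```sql
-- Sketch only: inspect each step of the string-to-array conversion.
-- The literal below is a stand-in for a string-typed users column.
with your_data as (
select '["U00001","U00002","U00002"]' as users
)

select t.users,
       -- step 1: strip [, ] and " so only the comma-separated ids remain
       regexp_replace(t.users, '\\[|\\]|\\"', '') as cleaned,
       -- step 2: split on commas to get an array<string> that explode() accepts
       split(regexp_replace(t.users, '\\[|\\]|\\"', ''), ',') as users_array
  from your_data t;
```

Once users_array looks right, wrapping the same split(...) expression in lateral view explode(...) gives one row per id, as in the query above.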

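Thanks @LeftJoin, I have tried that, but my users column is of string type and looks like "U00001","U00002","U00003". Since explode only accepts array or map types, I tried to convert the users column to an array: I ran select array('"U00001","U00002","U00003"') from the table and got the output ["\"U00001\",\"U00002\",\"U00003\""]. I don't understand why I am getting the backslashes in the output.

@Rahul What exactly does the users column contain? This string: '["U00001","U00002","U00002"]'? Then why use array(something)?

@Rahul Updated the answer for the string type.

@Rahul To answer your question of why array('"U00001","U00002","U00003"') results in ["\"U00001\",\"U00002\",\"U00003\""]: an array is printed as a JSON string, so each element is wrapped in quotes. Your single-element array contains double quotes, and Hive escapes them with backslashes to produce valid JSON. Per the JSON specification, double quotes inside a JSON string must be escaped because the double quote is a special character. To produce an array from a string, you need to use the split function, not array(). Running split on the same quoted string will also produce the same result.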
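
The difference between array() and split() discussed in these comments can be checked with a small sketch. It assumes an unquoted sample string for readability; the aliases n_array and n_split are made up for illustration.

```sql
-- Sketch only: array() wraps its argument as ONE element,
-- while split() cuts the string into several elements.
select size(array('U00001,U00002,U00003'))      as n_array,  -- 1 element: the whole string
       size(split('U00001,U00002,U00003', ',')) as n_split;  -- 3 elements: one per id

-- explode() over the split() result then yields one row per id.
select explode(split('U00001,U00002,U00003', ',')) as `user`;
```

If the stored string also contains brackets or double quotes, strip them with regexp_replace before calling split, as in the answer above.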