Hadoop: How to update a table in Hive 0.13?

My Hive version is 0.13. I have two tables, table_1 and table_2.

table_1 contains:

customer_id | items | price | updated_date
------------+-------+-------+-------------
10          | watch | 1000  | 20170626
11          | bat   | 400   | 20170625

table_2 contains:

customer_id | items    | price | updated_date
------------+----------+-------+-------------
10          | computer | 20000 | 20170624
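
If a customer_id already exists in table_2, I want to update that record in table_2; if it does not exist, the record should be appended to table_2.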


Since Hive 0.13 does not support UPDATE, I tried using a join, but it failed.

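You can use row_number() or a full join. Here is an example using row_number(). The inner query unions both tables and tags each row with new_flag; row_number() then keeps one row per customer_id, preferring the table_2 row when one exists: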

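-- Hive requires the TABLE keyword after INSERT OVERWRITE
insert overwrite table table_1
select customer_id, items, price, updated_date
from
(
select customer_id, items, price, updated_date,
       -- rn = 1 is the table_2 row if the customer exists there, else the table_1 row
       row_number() over(partition by customer_id order by new_flag desc) rn
from
    (
     -- existing rows, lower priority
     select customer_id, items, price, updated_date, 0 as new_flag
       from table_1
     union all
     -- incoming rows, higher priority
     select customer_id, items, price, updated_date, 1 as new_flag
       from table_2
    ) all_data
)s where rn=1;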

The same update can also be done with a full join; see the related answer on that approach.
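
For the full join route mentioned above, a minimal sketch might look like the following. It is an illustration under stated assumptions, not the code from the linked answer: it assumes customer_id is unique within each table, and it mirrors the row_number() example by overwriting table_1 and letting the table_2 row win on a match. The aliases old_rows and new_rows are hypothetical names.

```sql
-- Sketch only: assumes customer_id is unique in both table_1 and table_2.
-- Aliases old_rows / new_rows are illustrative, not from the original post.
insert overwrite table table_1
select
    coalesce(new_rows.customer_id, old_rows.customer_id) as customer_id,
    -- take every column from table_2 when the customer exists there,
    -- otherwise keep the existing table_1 values
    case when new_rows.customer_id is not null
         then new_rows.items else old_rows.items end               as items,
    case when new_rows.customer_id is not null
         then new_rows.price else old_rows.price end               as price,
    case when new_rows.customer_id is not null
         then new_rows.updated_date else old_rows.updated_date end as updated_date
from table_1 old_rows
full outer join table_2 new_rows
  on old_rows.customer_id = new_rows.customer_id;
```

With the sample data from the question, this should leave customer 10 with the computer row from table_2 and customer 11 with the unchanged bat row. If the merge should go the other way (table_2 as the target), swap the two table names. The case expressions are used instead of a per-column coalesce so that a NULL value in a matching table_2 row is not silently replaced by the old table_1 value.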
