SQL: insert new records and replace the old ones


I have a table that is populated with sqoop, and it is truncated every day.

At the start, tblSqoop has the following values:

+----+-------+--------------+---------------+---------+--------+
| id | names | created_date | modified_date | country | number |
+----+-------+--------------+---------------+---------+--------+
| 33 | nick  | 1/1/2020     | 1/1/2020      | Dubai   | 1234   |
| 45 | ted   | 2/7/2020     | 2/7/2020      | Spain   | 12345  |
+----+-------+--------------+---------------+---------+--------+
These rows are then inserted into tblMaxed.

The next day, tblSqoop has the following data:

+----+-------+--------------+---------------+---------+--------+
| id | names | created_date | modified_date | country | number |
+----+-------+--------------+---------------+---------+--------+
| 33 | nick  | 1/1/2020     | 12/31/2020    | Dubai   | 1234   |
| 45 | ted   | 2/7/2020     | 8/19/2020     | Spain   | 12345  |
| 45 | ted   | 2/7/2020     | 9/12/2020     | Spain   | 12345  |
| 45 | ted   | 2/7/2020     | 10/11/2020    | Spain   | 12346  |
| 45 | ted   | 2/7/2020     | 1/1/2021      | Spain   | 12345  |
+----+-------+--------------+---------------+---------+--------+
What I want is to have only the latest information in tblMaxed, like this:

+----+-------+--------------+---------------+---------+--------+-----------+
| id | names | created_date | modified_date | country | number |status_date|
+----+-------+--------------+---------------+---------+--------+-----------+
| 33 | nick  | 1/1/2020     | 12/31/2020    | Dubai   | 1234   |12/31/2020 |
| 45 | ted   | 2/7/2020     | 10/11/2020    | Spain   | 12346  |10/11/2020 |
| 45 | ted   | 2/7/2020     | 1/1/2021      | Spain   | 12345  |1/1/2021   |
+----+-------+--------------+---------------+---------+--------+-----------+
I am running this:

insert into tblMaxed 
select 
id,
names,
created_date,
modified_date,
country,
number,
MAX(modified_date) as status_date
from tblSqoop
group by id,
names,
created_date,
modified_date,
country,
number

But as a result I get all of the records back again. Would using a PK help?

Can you truncate tblMaxed and reload it using the query below? (The explanation is in the code comments.)

select
id,
names,
created_date,
modified_date,
country,
number,
modified_date as status_date
from
(select t.*, row_number() over (partition by id, number order by id, number, modified_date desc) rn from tblSqoop t) rs
where rs.rn = 1 -- This will pick up data for MAX modified_date from sqoop table

I think the partition needs an ORDER BY. Also, it again returns all the columns that are not maxed.

Can you try the new SQL above?

It worked. Thank you very much.
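
For context, the original GROUP BY query returns every row because modified_date is itself one of the grouping columns, so each distinct row forms its own group and MAX(modified_date) simply equals that row's own value. The following is only a sketch of how the truncate-and-reload step might be written as a single Impala statement. It assumes that tblMaxed has exactly the seven columns shown in the desired output, that tblSqoop holds the full data set each day, and that modified_date is stored in a type that sorts chronologically (for example TIMESTAMP rather than an M/D/YYYY string).

```sql
-- Sketch under the assumptions stated above, not a tested production job.
-- INSERT OVERWRITE replaces the existing contents of tblMaxed, so a
-- separate TRUNCATE TABLE step is not needed.
INSERT OVERWRITE TABLE tblMaxed
SELECT
  id,
  names,
  created_date,
  modified_date,
  country,
  number,
  modified_date AS status_date
FROM (
  SELECT
    t.*,
    -- Number the rows inside each (id, number) pair, newest first.
    -- Assumes modified_date sorts chronologically.
    ROW_NUMBER() OVER (PARTITION BY id, number
                       ORDER BY modified_date DESC) AS rn
  FROM tblSqoop t
) rs
WHERE rs.rn = 1;  -- keep only the newest row per (id, number)
```

With the second day's sample data, and assuming the dates sort chronologically, this should keep one row for (33, 1234) and one each for (45, 12346) and (45, 12345), which matches the three rows in the desired result.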