SQL: partition by one column and some field type

sql hive

My table is much bigger, but a small snippet of it would look like this:

+--------+---+----------+--------+------------+---
|distance|qtt|deliver_by| store  |deliver_time| ...
+--------+---+----------+--------+------------+---
|   11   |  1|  pa      | store_a|  1111      |
|   123  |  2|  pa      | store_a|  1112      |
|   33   |  3|  pb      | store_a|  1113      |
|   33   |  2|  pa      | store_b|  2221      |
|   44   |  2|  pb      | store_b|  2222      |
|   5    |  2|  pc      | store_b|  2223      |
|   5    |  2|  pc      | store_b|  2224      |
|   6    |  5|  pb      | store_c|  3331      |
|   7    |  5|  pb      | store_c|  3332      |
+--------+---+----------+--------+------------+---

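There are several stores, but only 3 possible deliverers (deliver_by: pa, pb and pc), each delivering at a specific time. Consider deliver_time a timestamp. I want to select the whole table and add 6 new columns: the min and max deliver_time per store for each deliver_by.

A store can be served by any of the three deliverers (pa, pb, pc), but none of them is required.

I can get an almost correct result with the query below. The problem is that when a deliver_by pX does not exist for a store, I don't get null; I get the store's min/max delivery time instead.

I really want to use partition, so I wrote this to add the new min/max columns: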

select min(deliver_time) over (partition by store, deliver_by='pa') as min_time_sd_pa
     , max(deliver_time) over (partition by store, deliver_by='pa') as max_time_sd_pa
     , min(deliver_time) over (partition by store, deliver_by='pb') as min_time_sd_pb
     , max(deliver_time) over (partition by store, deliver_by='pb') as max_time_sd_pb
     , min(deliver_time) over (partition by store, deliver_by='pc') as min_time_sd_pc
     , max(deliver_time) over (partition by store, deliver_by='pc') as max_time_sd_pc
     , distance, qtt, ....
from mytable

The correct output should be:

min_time_sd_pa|max_time_sd_pa|min_time_sd_pb|max_time_sd_pb|min_time_sd_pc|max_time_sd_pc|distance|qtt|deliver_by| store  |deliver_time
--------------+--------------+--------------+--------------+--------------+--------------+--------+---+----------+--------+------------
     1111     |     1112     |     1113     |     1113     |     null     |     null     |   11   |  1|  pa      | store_a|  1111
     1111     |     1112     |     1113     |     1113     |     null     |     null     |   123  |  2|  pa      | store_a|  1112
     1111     |     1112     |     1113     |     1113     |     null     |     null     |   33   |  3|  pb      | store_a|  1113
     2221     |     2221     |     2222     |     2222     |     2223     |     2224     |   33   |  2|  pa      | store_b|  2221
     2221     |     2221     |     2222     |     2222     |     2223     |     2224     |   44   |  2|  pb      | store_b|  2222
     2221     |     2221     |     2222     |     2222     |     2223     |     2224     |   5    |  2|  pc      | store_b|  2223
     2221     |     2221     |     2222     |     2222     |     2223     |     2224     |   5    |  2|  pc      | store_b|  2224
     null     |     null     |     3331     |     3332     |     null     |     null     |   6    |  5|  pb      | store_c|  3331
     null     |     null     |     3331     |     3332     |     null     |     null     |   7    |  5|  pb      | store_c|  3332
--------------+--------------+--------------+--------------+--------------+--------------+--------+---+----------+--------+------------

What is missing in my select min(..) over.. statement, or how can I achieve this result in the simplest way? I'm using HiveQL, but I assume this is generic across most SQL DBMSs.

Thanks
You can do this with a case expression inside min and max:

select min(case when deliver_by='pa' then deliver_time end) over (partition by store) as min_time_sd_pa
      ,max(case when deliver_by='pa' then deliver_time end) over (partition by store) as max_time_sd_pa
      ,min(case when deliver_by='pb' then deliver_time end) over (partition by store) as min_time_sd_pb
      ,max(case when deliver_by='pb' then deliver_time end) over (partition by store) as max_time_sd_pb
      ,min(case when deliver_by='pc' then deliver_time end) over (partition by store) as min_time_sd_pc
      ,max(case when deliver_by='pc' then deliver_time end) over (partition by store) as max_time_sd_pc
      ,m.*
from mytable m
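
To see why this works where the original attempt did not: partition by store, deliver_by='pa' splits each store into two groups (rows where the comparison is true and rows where it is false), so a pb or pc row gets the min/max of the non-pa rows of its store, not null. With the case expression, every row of a store stays in one partition, the case yields null for rows of other deliverers, and min/max skip nulls. The following is a minimal sketch for trying this out, not a definitive implementation. It assumes an engine that supports CTEs and a select without a from clause; the CTE name sample_deliveries is hypothetical, and the rows are a subset of the sample in the question.

```sql
-- Hypothetical inline sample: store_a has pa and pb deliveries, store_c has only pb.
with sample_deliveries as (
  select 11 as distance, 1 as qtt, 'pa' as deliver_by, 'store_a' as store, 1111 as deliver_time
  union all select 123, 2, 'pa', 'store_a', 1112
  union all select 33,  3, 'pb', 'store_a', 1113
  union all select 6,   5, 'pb', 'store_c', 3331
  union all select 7,   5, 'pb', 'store_c', 3332
)
select
  -- A case without else returns null for non-matching rows,
  -- and min/max ignore nulls, so only 'pa' rows contribute here.
  min(case when deliver_by = 'pa' then deliver_time end) over (partition by store) as min_time_sd_pa,
  max(case when deliver_by = 'pa' then deliver_time end) over (partition by store) as max_time_sd_pa,
  -- Same pattern for 'pb'; the window is still the whole store.
  min(case when deliver_by = 'pb' then deliver_time end) over (partition by store) as min_time_sd_pb,
  max(case when deliver_by = 'pb' then deliver_time end) over (partition by store) as max_time_sd_pb,
  s.*
from sample_deliveries s
order by store, deliver_time;
```

On this sample, every store_a row should show 1111/1112 for pa and 1113/1113 for pb, while both store_c rows show null/null for pa and 3331/3332 for pb, because store_c has no pa delivery for min/max to pick up.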
