Transposing data in Hive
I have a Hive table that contains data in the following format:
day class start_time count kpi1 kpi2 kpi3 kpi4 ... kpi160
-----------------------------------------------------------------------
20161010 abc 00 12 1 0 null 0 ...
I want to write a Hive query that returns the data in the following format, using aggregations such as max, min, and avg:
day class start_time count kpi_name kpi_max kpi_min kpi_avg
-----------------------------------------------------------------------
20161010 abc 00 12 kpi1 max(kpi1) min(kpi1) avg(kpi1)
20161010 abc 00 12 kpi2 max(kpi2) min(kpi2) avg(kpi2)
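Please suggest a way to get the data in this format.
Thanks.
If you want the minimum, maximum, and average, you must specify the columns to group by. For example, grouping by day: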
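-- my_table is a placeholder for the actual table name
SELECT day,
       class,
       start_time,
       count,
       MAX(kpi1) AS max_kpi1,
       MIN(kpi1) AS min_kpi1,
       AVG(kpi1) AS avg_kpi1
FROM my_table
-- Hive requires every non-aggregated column in the SELECT list to appear in GROUP BY
GROUP BY day, class, start_time, count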
You need to put all the kpis into a map, explode the map so that each kpi becomes its own row, and then aggregate.
For example:
Data:
+---------+------+-----------+-------+-----+-----+-----+-----+-----+-----+
|day_     |class |start_time |count_ |kpi0 |kpi1 |kpi2 |kpi3 |kpi4 |kpi5 |
+---------+------+-----------+-------+-----+-----+-----+-----+-----+-----+
|20161010 |abc   |00         |12     |1    |2    |3    |8    |9    |6    |
+---------+------+-----------+-------+-----+-----+-----+-----+-----+-----+
|20161010 |abc   |00         |12     |4    |5    |null |6    |10   |null |
+---------+------+-----------+-------+-----+-----+-----+-----+-----+-----+
Query:
SELECT day_
     , class
     , start_time
     , count_
     , kpi_type
     , MAX(vals) AS max_vals
     , MIN(vals) AS min_vals
     , AVG(vals) AS avg_vals
FROM (
  -- turn each kpi column into its own (kpi_type, vals) row
  SELECT day_, class, start_time, count_, kpi_type, vals
  FROM db_name.table_name  -- placeholder for the actual database and table
  LATERAL VIEW EXPLODE(MAP('kpi0', kpi0
                         , 'kpi1', kpi1
                         , 'kpi2', kpi2
                         , 'kpi3', kpi3
                         , 'kpi4', kpi4
                         , 'kpi5', kpi5)) et AS kpi_type, vals ) x
GROUP BY day_, class, start_time, count_, kpi_type
Output:
+---------+------+-----------+-------+---------+---------+---------+---------+
|day_     |class |start_time |count_ |kpi_type |max_vals |min_vals |avg_vals |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc   |00         |12     |kpi0     |4        |1        |2.5      |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc   |00         |12     |kpi1     |5        |2        |3.5      |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc   |00         |12     |kpi2     |3        |3        |3.0      |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc   |00         |12     |kpi3     |8        |6        |7.0      |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc   |00         |12     |kpi4     |10       |9        |9.5      |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc   |00         |12     |kpi5     |6        |6        |6.0      |
+---------+------+-----------+-------+---------+---------+---------+---------+
Thanks for your reply! With this approach I would get 12 columns in a single row, but I want each kpi calculated in a separate row:
day class start_time count kpi_name kpi_max kpi_min kpi_avg
-----------------------------------------------------------------------
20161010 abc 00 12 kpi1 max(kpi1) min(kpi1) avg(kpi1)
20161010 abc 00 12 kpi2 max(kpi2) min(kpi2) avg(kpi2)
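As a possible alternative to the MAP/EXPLODE answer, Hive's built-in STACK() table-generating function can also turn the kpi columns into one row per kpi. The following is only a sketch: db_name.table_name is a placeholder, only three kpi columns are listed for brevity, and it assumes all kpi columns share the same numeric type. The first argument to STACK must be a constant equal to the number of (name, value) pairs.

```sql
-- Sketch only: db_name.table_name is a placeholder, and the kpi list is shortened.
SELECT day_
     , class
     , start_time
     , count_
     , kpi_type
     , MAX(vals) AS max_vals
     , MIN(vals) AS min_vals
     , AVG(vals) AS avg_vals
FROM db_name.table_name
-- STACK(n, ...) splits its arguments into n rows; here each row is (kpi name, kpi value)
LATERAL VIEW STACK(3, 'kpi0', kpi0
                    , 'kpi1', kpi1
                    , 'kpi2', kpi2) s AS kpi_type, vals
GROUP BY day_, class, start_time, count_, kpi_type;
```

To cover more columns, add one 'kpiN', kpiN pair per column and raise the first argument to match. As with the MAP version, MAX, MIN, and AVG skip null values.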