Hive: How to separate column values into different columns using Hive

Input:

    name year run
 1. a    2008 4
 2. a    2009 3
 3. a    2008 4
 4. b    2009 8
 5. b    2008 5

Output in Hive:

    name 2008 2009
 1. a    8    3
 2. b    5    8

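For fixed years:

select name,
       max(case when year=2008 then run end) as year_2008,
       max(case when year=2009 then run end) as year_2009
       -- ... and so on, one column per year
  from my_table
 group by name;

Hive cannot generate columns like these dynamically, but you can first select the distinct years and then generate this SQL with a shell script.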
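
The shell approach can be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested production script: `build_pivot_sql` is a hypothetical helper name, the list of years is assumed to arrive on standard input one per line, and the generated text follows the same `max(case ...)` pattern as the fixed-year query.

```shell
#!/bin/sh
# build_pivot_sql is a hypothetical helper (not part of Hive).
# It reads years from standard input, one per line, and prints a pivot
# query with one max(case ...) column per year.
build_pivot_sql() {
    table="$1"
    printf 'select name'
    while read -r year; do
        # Skip blank lines and anything that is not a plain integer.
        case "$year" in
            ''|*[!0-9]*) continue ;;
        esac
        printf ',\n       max(case when year=%s then run end) as year_%s' "$year" "$year"
    done
    printf '\n  from %s\n group by name;\n' "$table"
}

# Example with the two years from the question.
printf '2008\n2009\n' | build_pivot_sql my_table
```

In practice the year list would probably come from Hive itself, for example by piping the output of `hive -S -e "select distinct year from my_table"` into the function, and the generated text could be saved to a file and run with `hive -f`. If the values should be added up per year rather than taking the largest, `max` in the format string can be swapped for `sum`.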

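As I understand it, you need to pivot the sum of runs for each year into year columns.

What you need is the sum function, not max. In the sample data, a has two rows for 2008 (4 and 4), and the expected output is 8.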

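select name,
       -- aliases that start with a digit must be quoted with backticks in Hive
       sum(case when year=2008 then run else 0 end) as `2008_run`,
       sum(case when year=2009 then run else 0 end) as `2009_run`
  from my_table t1
 group by name;

To find the top five run scorers for each year: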

with table1 as
(
-- total runs per player per year
select name, year, sum(run) as RunsPerYear from my_table group by name, year
),
table2 as
(
-- rank players within each year, highest total first
select name, year, RunsPerYear, dense_rank() over (partition by year order by RunsPerYear desc) as rnk from table1
)
select name, year, RunsPerYear from table2 where rnk<=5;

How can I use this syntax to find the top 5 batsmen by runs across all years?

Calculate dense_rank() over (partition by name order by run DESC) as rnk in a subquery from my_table, then filter on rnk in the where clause.