SQL: How to get grouped data and collect the results into a map in Hive?
I have a table like this:
id | job | school |
1 | programmer | school1 |
2 | programmer | school1 |
3 | programmer | school2 |
4 | pm | school3 |
5 | pm | school2 |
6 | pm | school3 |
I want to do the following: group by job, then get the list of schools with their counts, like this:
programmer | [(school1, 2), (school2, 1)]
pm | [(school3, 2), (school2, 1)]
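We cannot have a map inside collect_set in Hive, because collect_set only accepts primitive data types. These two queries will give you what you are looking for (they are identical except that one involves a map and the other does not). Hope this helps :)

Just add the jar and create a collect() function: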
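-- register the Brickhouse UDAF under the name collect
add jar ./brickhouse-0.7.1.jar;
create temporary function collect as 'brickhouse.udf.collect.CollectUDAF';

-- two-argument collect builds a map of school -> count for each job
select job
     , collect(school, c) as school_count_map
from (
    select *
    from (
        -- count rows per (job, school); `table` is a placeholder for the real table name
        select job, school
             , count(*) as c
        from table
        group by job, school
    ) x
    order by job, c desc
) y
group by job;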
But you are not sorting the collected set.
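The point above about collect_set accepting only primitive types suggests a jar-free workaround: aggregate plain strings and parse them into a map afterwards. The following is only a rough sketch under some assumptions, not a tested solution: `jobs` is a hypothetical table name standing in for the real table, school names are assumed to contain neither `,` nor `:`, and `collect_list` needs a Hive version that ships it (0.13 or later). The resulting map has string values, and the sketch makes no attempt to order the entries.

```sql
-- Sketch only: `jobs` is a hypothetical table with columns (id, job, school).
select job
     -- 1. build "school:count" strings (primitive type, so collect_list accepts them)
     -- 2. join them with ',' into one string per job
     -- 3. parse that string into a map<string,string> with str_to_map
     , str_to_map(
           concat_ws(',', collect_list(concat(school, ':', cast(c as string)))),
           ',', ':'
       ) as school_count_map
from (
    -- count rows per (job, school)
    select job, school
         , count(*) as c
    from jobs
    group by job, school
) x
group by job;
```

If the counts are needed as numbers, they would have to be cast back when reading values out of the map, since str_to_map produces string values.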