Hive: creating range bins for a histogram


I have a dataset containing student IDs and ages. I want to label each record with the bin it falls into, using a range (bucket size) of 10.

stud_id    ages
101        11
102        13
103        21
104        25
Likewise, I have many more records. They all need to be arranged into bins of size 10.

The expected output is:

stud_id     ages_bin
101         11-20
102         11-20
103         21-30
104         21-30
I tried a simple CASE statement in Hive:

select stud_id,
case when ages between 0 and 10 then '0-10'
when ages between 11 and 20 then '11-20'
when ages between 21 and 30 then '21-30'
when ages between 31 and 40 then '31-40'
when ages between 41 and 50 then '41-50'
when ages between 51 and 60 then '51-60'
when ages between 61 and 70 then '61-70'
when ages between 71 and 80 then '71-80'
when ages between 81 and 90 then '81-90'
when ages between 91 and 100 then '91-100'
when ages between 101 and 110 then '101-110'
when ages between 111 and 120 then '111-120'
when ages between 121 and 130 then '121-130'
when ages between 131 and 140 then '131-140'
when ages between 141 and 150 then '141-150'
else NULL end as ages_bin
from students
Is there a simpler way to bin the data with a bucket size of 10?

Can someone help me write simple code for this?

There is a simple way to compute the bin ranges for a histogram. The code is as follows:

select stud_id, floor((ages)/10)*10 as bin_start,
floor((ages)/10)*10+9 as bin_end from students
This produces the following bins (the lower and upper bounds are shown joined into a single range):

stud_id     ages_bin
101         10-19
102         10-19
103         20-29
104         20-29
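
Note that these bins start at multiples of 10 (10-19, 20-29), whereas the question asked for 11-20, 21-30. One way to shift the boundaries is to subtract 1 before dividing. The following is a minimal sketch, assuming `ages` is an integer column with values of 1 or more in the `students` table from the question:

```sql
-- Sketch: bins of width 10 aligned to 1-10, 11-20, 21-30, ...
-- Assumes `ages` is an integer column and ages >= 1.
select stud_id,
       concat(cast(floor((ages - 1) / 10) * 10 + 1 as string), '-',
              cast(floor((ages - 1) / 10) * 10 + 10 as string)) as ages_bin
from students;
```

For the sample rows (ages 11, 13, 21, 25) this gives 11-20, 11-20, 21-30, 21-30. An age of 0 would land in a "-9-0" bin, so it needs separate handling if it can occur; the 0-10 bin in the question's CASE statement is 11 values wide and cannot be reproduced with a fixed width of 10.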

Try this. It should give you the bins in range format:

select stud_id, concat(cast(floor((ages)/10)*10 as string),'-',
cast(floor((ages)/10)*10+9 as string)) from students 
To get proper histogram output, it is best to group by the bin and order the result appropriately.
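
The grouping and ordering could look roughly like the sketch below. It assumes the same `students` table; the aliases `bin_start`, `ages_bin`, and `num_students` are illustrative names, not part of the original answers.

```sql
-- Sketch: count students per bin of width 10, ordered by the bin's lower bound.
-- `bin_start` and `num_students` are hypothetical aliases chosen for this example.
select bin_start,
       concat(cast(bin_start as string), '-',
              cast(bin_start + 9 as string)) as ages_bin,
       count(*) as num_students
from (
  -- compute the numeric lower bound of each row's bin once
  select floor(ages / 10) * 10 as bin_start
  from students
) t
group by bin_start
order by bin_start;
```

Ordering by the numeric lower bound rather than by the `ages_bin` string avoids lexicographic surprises such as '100-109' sorting before '20-29'.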