Hive SQL: conditional count


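I have run into a problem with a conditional count and have been stuck on it for a day. I hope someone here can point me to a solution.

I have a grade table with three columns:
1. grand_subject
2. subject (a breakdown of the grand subject)
3. grade

What I want is the A-grade percentage for each grand subject and for each subject, where A-grade percentage = (number of A's) / (number of A's + number of B's) * 100%. Two rules apply:
1) The two grand subjects Discrete Math and Statistics should be combined and calculated together as Maths; the other grand subjects (Physics, History) are calculated separately.
2) In the subject column, each subject is calculated separately.

The expected output looks like this: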

                                         A_Grade_Percentage
-----------------------------------------------------------------
    Maths (Discrete Math + Statistics)         54.50%
    Discrete Math 1                             50%
    Statistics 1                                60%
-----------------------------------------------------------------
    Physics                                   54.50%
    Physics Basic                               75%
    Physics 1                                   25%
    Physics 2                                 66.70%
-----------------------------------------------------------------
    History                                     50%
    History 1                                   50%
    History 2                                   50%
-----------------------------------------------------------------
    Geography                                   25%

So far I have tried the following long query:

with on_grand_subject as(
select
case when grand_subject in ('Discrete Math', 'Statistics') then 'Maths'
     else grand_subject end as grand_subject,
count(case when grade = 'A' then 1 else 0 end) as A_Grade,
100.0*count(case when grade = 'A' then 1 else 0 end) / 
   (count(case when grade = 'A' then 1 else 0 end) + count(case when grade = 'B' then 1 else 0 end)) 
   as A_Grade_percentage,
from
grade_table
where
 ...................
group by grand_subject in ('Discrete Math', 'Statistics') then 'Maths' else grand_subject end
order by grand_subject
limit 1000
),

on_subject as(
select
grand_subject,
subject,
count(case when grade = 'A' then 1 else 0 end) as A_Grade,
100.0*count(case when grade = 'A' then 1 else 0 end) / 
   (count(case when grade = 'A' then 1 else 0 end) + count(case when grade = 'B' then 1 else 0 end)) 
   as A_Grade_percentage,
from
grade_table
where
 ...................
group by grand_subject, subject
order by subject
limit 1000
)

select
g.grand_subject, s.subject, g.A_Grade, g.A_Grade_percentage, s.A_Grade, g.A_Grade_percentage
from on_grand_subject g join on_subject s on g.grand_subject = s.grand_subject

But the output looks strange and is not what it should be. Is there a way to fix this? Thanks a lot.

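This is wrong:
count(case when grade = 'A' then 1 else 0 end)
because count() counts every non-NULL value, so it counts the 0s as well as the 1s. It should be sum() instead of count(). Alternatively, keep count() and either remove
else 0
or change it to
else null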
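
As a hedged sketch, this is how that fix could be applied to the whole query. It assumes the table and column names from the question (grade_table, grand_subject, subject, grade). The `mapped` CTE and the output aliases are hypothetical names introduced here, and the WHERE filter that is elided in the question is left out rather than guessed. Applying the Maths mapping before both aggregations, so that the join keys match, is an assumption about the intended result, not something the answer states.

```sql
-- Sketch only: names follow the question, the elided WHERE filter is omitted
with mapped as (
  select
    -- fold the two maths grand subjects into one label before aggregating
    case when grand_subject in ('Discrete Math', 'Statistics') then 'Maths'
         else grand_subject end as grand_subject,
    subject,
    grade
  from grade_table
),

on_grand_subject as (
  select
    grand_subject,
    -- sum() adds 1 per A and 0 otherwise, count() would count every row
    sum(case when grade = 'A' then 1 else 0 end) as a_grade,
    100.0 * sum(case when grade = 'A' then 1 else 0 end)
          / sum(case when grade in ('A', 'B') then 1 else 0 end) as a_grade_percentage
  from mapped
  group by grand_subject
),

on_subject as (
  select
    grand_subject,
    subject,
    -- equivalent alternative: count(case when grade = 'A' then 1 end)
    sum(case when grade = 'A' then 1 else 0 end) as a_grade,
    100.0 * sum(case when grade = 'A' then 1 else 0 end)
          / sum(case when grade in ('A', 'B') then 1 else 0 end) as a_grade_percentage
  from mapped
  group by grand_subject, subject
)

select
  g.grand_subject,
  g.a_grade_percentage as grand_a_grade_percentage,
  s.subject,
  s.a_grade_percentage as subject_a_grade_percentage
from on_grand_subject g
join on_subject s on g.grand_subject = s.grand_subject
order by g.grand_subject, s.subject
```

Each output row carries the subject's percentage next to the percentage of its (possibly merged) grand subject. The sketch also drops the trailing comma before `from`, the incomplete `group by` expression, and the `order by` / `limit` inside the CTEs that appear in the attempted query.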
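
If you post the data as text in a convenient form rather than as pictures, your chances of getting an answer will be better. At the moment your dataset cannot be reproduced except by typing it in manually.

@leftjoin Ah, right! I forgot to use sum() and used count() instead. I will try sum(). By the way, I have edited the question into a more convenient form, thanks for the suggestion!

Better, but it still needs some work before it is a usable dataset.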