Hive: counting values across multiple columns in a Hive table
Below is a data set of household members (eight members) by type:
h1 h2 h3 h4 h5 h6 h7 h8
U U P U Y null Y U
U H U U Y Y P P
U U U H U nuLL Y null
null null H H U null null null
P U U U Y null Z P
Y P null H Y P U H
U null U null P U Z Y
null null null null null null null null
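In the data set above, I want the total count of each type: H = head of household, P = parent, U = adult, Y = wife, and null = no match. I used the code below. It gives the correct count of household members for each type, but I do not get the correct count for null. Can anyone tell me why this happens, and how to fix it? My code is below: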
select sum( Head_cnt) as H,
sum( parent_cnt) as P,
sum( adult_cnt) as U,
sum(spouce_cnt) as Y,
sum( nomatch_cnt) as Nomatch
from(
select length(regexp_replace(row_concatenated, '[^U]', '')) as adult_cnt,
length(regexp_replace(row_concatenated, '[^H]', '')) as head_cnt,
length(regexp_replace(row_concatenated, '[^P]', '')) as parent_cnt,
length(regexp_replace(row_concatenated, '[^Y]', '')) as spouce_cnt,
length(regexp_replace(row_concatenated, '[null]', '')) as nomatch_cnt
from(select concat_ws(',',h1,h2,h3,h4,h5,h6,h7,h8) as row_concatenated
from table_name)s
)s;
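Please give me a solution for the null values in my code. I get the correct count for every value except null. Remember, this is not a NULL value; here null means no match.

If I understand correctly, you need to count the occurrences of each letter across all the columns, ignoring null. One way is to union all the columns, then group and count: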
with union_table as (
select h1 as h from your_table
union all
select h2 as h from your_table
union all
-- ... (repeat for each column up to h8)
select h8 as h from your_table
)
select h, count(*) as cnt
from union_table
group by h;
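By default, the count function ignores NULL values when it is given a column, as in count(h); count(*) counts every row.

Convert the NULLs to some character, say 'N' for no match: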
select sum( Head_cnt) as H,
sum( parent_cnt) as P,
sum( adult_cnt) as U,
sum(spouce_cnt) as Y,
sum( nomatch_cnt) as Nomatch
from(
select length(regexp_replace(row_concatenated, '[^U]', '')) as adult_cnt,
length(regexp_replace(row_concatenated, '[^H]', '')) as head_cnt,
length(regexp_replace(row_concatenated, '[^P]', '')) as parent_cnt,
length(regexp_replace(row_concatenated, '[^Y]', '')) as spouce_cnt,
length(regexp_replace(row_concatenated, '[^N]', '')) as nomatch_cnt
from
(
select concat_ws(',',nvl(h1,'N'),nvl(h2,'N'),nvl(h3,'N'),nvl(h4,'N'),nvl(h5,'N'),nvl(h6,'N'),nvl(h7,'N'),nvl(h8,'N')) as row_concatenated
from table_name)s
)s;
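As for why the original query miscounts, two things appear to be at play: Hive's concat_ws skips NULL arguments, and '[null]' is a regex character class that matches any single n, u or l rather than the word null. A small sketch on literal values (the strings are made up for illustration and no table is needed) that should show both effects:

```sql
-- concat_ws drops NULL arguments, so a NULL column leaves no trace in the string.
select concat_ws(',', 'U', cast(null as string), 'Y');      -- U,Y

-- '[null]' removes every n, u and l individually (case-sensitive);
-- length() then measures what is left over, not the number of nulls.
select length(regexp_replace('U,null,Y', '[null]', ''));    -- 4, the length of 'U,,Y'
```

This is why wrapping each column in nvl(..., 'N') and counting with '[^N]' gives the expected result: the placeholder survives concat_ws and is a single character that the negated class can isolate.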
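Can you give me a solution? No, I cannot replace null with N, because I need to keep it as specified. Is there any other solution?
@SaikatRoy I did what you asked: the NULL values are counted as nomatch. Please explain why you think this is not what you expected.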
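If substituting a placeholder is not acceptable, one possible alternative is to unpivot the eight columns with explode and count with conditional sums, leaving the stored values untouched. This is only a sketch under assumptions: the table is called table_name as in the question, all eight columns are strings, and a "no match" may be either a SQL NULL or the literal text 'null' in any letter case (the alias names t and h are arbitrary).

```sql
-- Unpivot h1..h8 into one row per cell, then count each type.
-- explode() emits NULL array elements as NULL rows, so they can still be counted.
select sum(case when h = 'H' then 1 else 0 end) as H,
       sum(case when h = 'P' then 1 else 0 end) as P,
       sum(case when h = 'U' then 1 else 0 end) as U,
       sum(case when h = 'Y' then 1 else 0 end) as Y,
       -- assumption: treat both SQL NULL and the text 'null'/'nuLL' as no match
       sum(case when h is null or lower(h) = 'null' then 1 else 0 end) as Nomatch
from table_name
lateral view explode(array(h1, h2, h3, h4, h5, h6, h7, h8)) t as h;
```

Values outside the listed types, such as the Z cells in the sample data, are not counted in any column by this query.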