Hive: parse error in a Hive query using a CASE statement with a GROUP BY clause

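My input contains URLs in two formats that I need to parse.

URLs

abc.com/abcd?id=123

xyz.com/abcd/id123

After parsing, I need to store the following in the database:
id=123
id123

Here is my Hive query for parsing the URLs: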

insert into table table2
select  
CASE
WHEN parse_url(url_domain,'HOST')="abc.com"
THEN 
    parse_url(url_domain,'HOST') as host,
    split(url_domain,'\\?id=')[1] as id,
    count(*) 
    from table1
    GROUP BY parse_url(url_domain, 'HOST'), split(url_domain,'\\?id=')[1]    
WHEN parse_url(url_domain,'HOST')="xyz.com"
THEN
    parse_url(url_domain,'HOST') as host,
    split(url_domain,'\\/id')[1] as id,
    count(*) 
    from table1
    GROUP BY parse_url(url_domain, 'HOST'), split(url_domain,'\\/id=')[1]
END
ORDER BY host, id DESC limit 100;
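
But when I run the query, it fails with the following error:

FAILED: ParseException line 6:33 missing KW_END at 'as' near ']'

I tried removing the id alias and other combinations, but none of them worked.

Info: in Hive I cannot use an alias in the GROUP BY clause; this is a Hive limitation. Given

split(url_domain,'\\?id=')[1] as id

if I use
GROUP BY id
it gives an error, but this works fine:
GROUP BY parse_url(url_domain,'HOST')

For this reason I cannot move the GROUP BY outside the CASE statement.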


Update

insert into table table2
select  
CASE
WHEN parse_url(url_domain,'HOST')="abc.com"
THEN 
    parse_url(url_domain,'HOST') as host, split(url_domain,'\\?id=')[1] as id,
WHEN parse_url(url_domain,'HOST')="xyz.com"
THEN
    parse_url(url_domain,'HOST') as host, split(url_domain,'\\/id')[1] as id,
END
count(*) 
from table1
GROUP BY parse_url(url_domain, 'HOST')
ORDER BY host, id DESC limit 100;

Same error :(

Error log

NoViableAltException(262@[146:1: selectExpression : ( expression | tableAllColumns );])
        at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
        at org.antlr.runtime.DFA.predict(DFA.java:116)
        at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectExpression(HiveParser_SelectClauseParser.java:2882)
        at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2266)
        at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1052)
        at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:789)
        at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:31425)
        at org.apache.hadoop.hive.ql.parse.HiveParser.regular_body(HiveParser.java:29083)
        at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatement(HiveParser.java:28968)
        at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:28762)
        at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1238)
        at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:938)
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:342)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1000)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
        at java.lang.reflect.Method.invoke(Method.java:619)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
FAILED: ParseException line 6:33 missing KW_END at 'as' near ',' in select expression
line 7:0 cannot recognize input near 'WHEN' 'parse_url' '(' in select expression

I have never used Hive, but are you allowed to put what looks like a subquery in the THEN branch? I know that in Transact-SQL it has to be a single value.

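I finally solved this by creating two separate CASE statements, one for parse_url and another for splitting the URL. Because Hive does not allow aliases in the GROUP BY clause, this was the only way I found to do it: the CASE expression in the SELECT list has to be repeated in full in the GROUP BY.

Final Hive SQL statement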

insert into table table2 select parse_url(url_domain,'HOST') as host,
CASE
WHEN parse_url(url_domain,'HOST')="abc.com" 
THEN 
    parse_url(url_domain,'QUERY','id')
WHEN parse_url(url_domain,'HOST')="def.com"
THEN 
    parse_url(url_domain,'QUERY','packageName')
WHEN parse_url(url_domain,'HOST')="xyz.com"
THEN
    split(split(url_domain,'\\/id')[1],'\\?')[0]
ELSE
    "NULL"
END as appid,
count(*)
from table1
group by parse_url(url_domain,'HOST'),
CASE
WHEN parse_url(url_domain,'HOST')="abc.com" 
THEN 
    parse_url(url_domain,'QUERY','id')
WHEN parse_url(url_domain,'HOST')="def.com"
THEN 
    parse_url(url_domain,'QUERY','packageName')
WHEN parse_url(url_domain,'HOST')="xyz.com"
THEN
    split(split(url_domain,'\\/id')[1],'\\?')[0]
ELSE
"NULL"
END
order by host,appid DESC LIMIT 100;
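
To sanity-check what the final query should return, the per-host rules can be mirrored outside Hive. The following is only a rough Python sketch, not part of the original answer: the function names `extract_app_id` and `count_by_host_and_id` are made up for illustration, and it assumes the URLs carry an `http://` scheme so that the host can be parsed.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs


def extract_app_id(url):
    """Return (host, appid) using the same per-host rules as the final query.

    Hypothetical helper; assumes the URL includes a scheme such as http://.
    """
    parts = urlparse(url)
    host = parts.hostname
    query = parse_qs(parts.query)
    if host == "abc.com":
        # Corresponds to parse_url(url_domain, 'QUERY', 'id')
        appid = query.get("id", [None])[0]
    elif host == "def.com":
        # Corresponds to parse_url(url_domain, 'QUERY', 'packageName')
        appid = query.get("packageName", [None])[0]
    elif host == "xyz.com":
        # Corresponds to split(split(url_domain, '\\/id')[1], '\\?')[0]
        pieces = url.split("/id")
        appid = pieces[1].split("?")[0] if len(pieces) > 1 else None
    else:
        # The query's ELSE branch yields the literal string "NULL"
        appid = "NULL"
    return host, appid


def count_by_host_and_id(urls):
    """Count rows per (host, appid), like GROUP BY host, CASE ... END."""
    return Counter(extract_app_id(u) for u in urls)


if __name__ == "__main__":
    sample = [
        "http://abc.com/abcd?id=123",
        "http://abc.com/abcd?id=123",
        "http://xyz.com/abcd/id123",
        "http://def.com/app?packageName=com.example",
    ]
    for (host, appid), n in sorted(count_by_host_and_id(sample).items()):
        print(host, appid, n)
```

Note that for the xyz.com format this logic yields 123 rather than id123, because the split pattern consumes the `/id` prefix, just as it does in the query above.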

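Going by the forums I have read (I may be new to Hive too, so I am not 100% sure): can you pull the "ID" you are selecting out of the CASE statement? It is the same in both branches. Another thing I noticed in the update is that a comma is missing between the end of the CASE statement and the count that follows.

I meant pull the host out, not the id.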