Hive accepts an invalid query and throws a RuntimeException

Today I came across a strange behavior in Hive (MapR distribution, Hive 0.13.0-mapr-1508-21228).

Table definition:

CREATE EXTERNAL TABLE gd_temp_test.rate_merchants_test(
ROW_KEY string,
TRANS_DESC1 string,
TRANS_DESC2 string,
TRANS_DESC3 string,
TRANS_ID string
) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\u0001' LINES TERMINATED BY '\n'
STORED AS textfile
LOCATION '/home/gd/tempdata';
When I run the query below, Hive accepts it and then throws a RuntimeException:

select * from gd_temp_test.rate_merchants_test t1 where t1.TRANS_DESC1 limit 1;
Note that the table in the query is an external table, and TRANS_DESC1 is of type String.

Exception:

 at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row .................

        ... 8 more
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Boolean
        at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:134)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
        ... 9 more
I would expect the query to be rejected at parse time or when the execution plan is built. But when I run an EXPLAIN on it:

explain select * from gd_temp_test.rate_merchants_test t1 where t1.TRANS_DESC1 limit 1;
Hive is able to produce a plan for the query:

Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 344 Data size: 1238497 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: trans_desc1 (type: string)
              Statistics: Num rows: 172 Data size: 619248 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: row_key (type: string), trans_desc1 (type: string), trans_desc2 (type: string), trans_desc3 (type: string), trans_id (type: string)
                outputColumnNames: _col0, _col1, _col2, _col3, _col4
                Statistics: Num rows: 172 Data size: 619248 Basic stats: COMPLETE Column stats: NONE
                Limit
                  Number of rows: 1
                  Statistics: Num rows: 1 Data size: 3600 Basic stats: COMPLETE Column stats: NONE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 1 Data size: 3600 Basic stats: COMPLETE Column stats: NONE
                    table:
                        input format: org.apache.hadoop.mapred.TextInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: 1

Time taken: 0.196 seconds, Fetched: 33 row(s)
Is there an explanation for why this error is deferred until runtime? Is this normal behavior?

Edit 1: Added the sample table definition.

As you can see from the stack trace, it does not allow a String to be converted to a Boolean.


In your query, the clause where t1.TRANS_DESC1 tries to read TRANS_DESC1 as a Boolean, which is why it throws java.lang.ClassCastException.
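
The same kind of failure can be reproduced in plain Java. The sketch below is not Hive's source code; FilterCastDemo, passesFilter, and tryFilter are hypothetical names made up for illustration. It assumes only that a filter receives its condition as an untyped Object and casts it to Boolean, which is what the "java.lang.String cannot be cast to java.lang.Boolean" message in the stack trace suggests. Such a cast compiles without complaint and only fails once a String value actually reaches it.

```java
public class FilterCastDemo {

    // Hypothetical stand-in for a row filter: the condition arrives as an
    // untyped Object and is cast to Boolean before it is tested.
    static boolean passesFilter(Object condition) {
        // This cast compiles, but throws ClassCastException at runtime
        // when the object is a String rather than a Boolean.
        Boolean ret = (Boolean) condition;
        return Boolean.TRUE.equals(ret);
    }

    // Runs the filter and reports either its result or the cast failure.
    static String tryFilter(Object condition) {
        try {
            return String.valueOf(passesFilter(condition));
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryFilter(Boolean.TRUE));       // prints "true"
        System.out.println(tryFilter("some description")); // prints "ClassCastException"
    }
}
```

Because the static type of the condition is Object, the compiler cannot reject the cast; the mismatch only shows up when a row is processed.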

Please post your table structure and make sure the data types match; no data type conversion is applied here.

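But I still don't understand this. Hive knows the table's schema when it evaluates the query and could easily infer that the expression cannot produce a Boolean. So why does it wait until execution to report a runtime error?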