Hive: Kryo exception


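I am executing an HQL query that has a few joins, unions, and an INSERT OVERWRITE operation. It works fine if I run it only once.
If I execute the same job a second time, I run into the problem below. Can someone help me identify under which circumstances this exception occurs?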

Error: java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID: 107
Serialization trace:
rowSchema (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.UnionOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
    at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:364)
    at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:275)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:440)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:433)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID: 107
Serialization trace:
rowSchema (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.UnionOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)

Avoid Hive's parallel execution by setting the following property to false:

hive.exec.parallel
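As a minimal sketch, assuming the query is run from the Hive CLI or Beeline, the property can be turned off for the current session only. As far as I know it defaults to false in stock Hive, so this matters only if your cluster configuration or script enabled it.

```sql
-- Disable parallel execution of independent stages for this session only.
SET hive.exec.parallel=false;

-- Print the current value to confirm the setting took effect.
SET hive.exec.parallel;

-- ... then run the failing query in the same session.
```

A session-level SET leaves the cluster-wide configuration untouched, so other jobs are not affected.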


Let me know if that works for you.

I tried SET hive.exec.parallel=false and the query then ran successfully, although more slowly. My query is:

SELECT
    CASE WHEN a.did IS NOT NULL THEN a.did ELSE b.did END AS device_id,
    CASE WHEN a.did IS NOT NULL THEN a.package ELSE b.package END AS package,
    CASE WHEN a.did IS NOT NULL THEN a.channel ELSE b.channel END AS channel,
    CASE WHEN a.did IS NOT NULL THEN a.time ELSE b.time END AS time
FROM
    (SELECT
      a1.package,
      a1.did,
      MIN(a1.source) AS channel,
      MIN(a1.time) AS time
    FROM
      (SELECT * FROM thetable
        WHERE date_hour = "20160601"
          AND source_type IN ('A', 'B', 'C')
      ) a1
      JOIN
      (SELECT
        package AS package,
        did AS did,
        MIN(time) AS time
      FROM thetable
      WHERE date_hour = "20160601"
        AND source_type IN ('A', 'B', 'C')
      GROUP BY package, did
      ) min
      ON (a1.package = min.package
        AND a1.did = min.did
        AND a1.time = min.time)
    GROUP BY a1.package, a1.did
    ) a
    FULL OUTER JOIN
    (SELECT
      a1.package,
      a1.did,
      MIN(a1.source) AS channel,
      MIN(a1.time) AS time
    FROM
      (SELECT * FROM thetable
        WHERE date_hour = "20160601"
          AND source_type IN ('D')
      ) a1
      JOIN
      (SELECT
        package AS package,
        did AS did,
        MIN(time) AS time
      FROM thetable
      WHERE date_hour = "20160601"
        AND source_type IN ('D')
      GROUP BY package, did
      ) min
      ON (a1.package = min.package
        AND a1.did = min.did
        AND a1.time = min.time)
    GROUP BY a1.package, a1.did
    ) b
    ON (a.package = b.package AND a.did = b.did);

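Comments:

Are these tables using any SerDe?

No, we are not using any external SerDe.

Yes, sure. Hive usually works well with Kryo, but it sometimes causes object deserialization problems during parallel execution. To avoid this, you have already tried other approaches, and I have provided one.

I really appreciate your help. I will try it and let you know.

Thanks a lot. I actually couldn't find out how to format it at the time.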