Hive: AvroTypeException when reading from an internal Hive table with nested structs

I am working on an Azure HDInsight cluster, version 3.6. It uses Hortonworks HDP 2.6, which ships with Hive 2.1.0 (on Tez 0.8.4).

I have some internal Hive tables that contain nested struct fields and are stored in Avro format. Here is an example CREATE statement:

CREATE TABLE my_example_table(
    some_field STRING,
    some_other_field STRING,
    some_struct struct<field1: BIGINT, inner_struct: struct<field2: STRING, field3: STRING>>)
PARTITIONED BY (year INT, month INT)
STORED AS AVRO;
When I query the internal table, I get the following error:
Failed with exception java.io.IOException:org.apache.avro.AvroTypeException: Found core.record_0, expecting union

I used avro-tools to extract the Avro schema from one of the internal tables and noticed that Hive creates union types from the structs I defined:

{
  "type" : "record",
  "name" : "my_example_table",
  "namespace" : "my_namespace",
  "fields" : [ {
    "name" : "some_field",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "some_other_field",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "my_struct",
    "type" : [ "null", {
      "type" : "record",
      "name" : "record_0",
      "namespace" : "",
      "doc" : "struct<field1: BIGINT, struct<field2: STRING, field3: STRING>>",
      "fields" : [ {
        "name" : "field1",
        "type" : [ "null", "long" ],
        "doc" : "bigint",
        "default" : null
      }, {
        "name" : "inner_struct",
        "type" : [ "null", {
          "type" : "record",
          "name" : "record_2",
          "namespace" : "",
          "doc" : "struct<field2: STRING, field3: STRING>",
          "fields" : [ {
            "name" : "field2",
            "type" : [ "null", "string" ],
            "doc" : "bigint",
            "default" : null
          }, {
            "name" : "field2",
            "type" : [ "null", "long" ],
            "doc" : "bigint",
            "default" : null
          }]
        }
      ]}
    ]}
  ]}
]}
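
To see which fields Hive wrapped in a union, a schema like the one above can be walked programmatically. The following is only a minimal sketch: `SCHEMA_JSON` is a hypothetical, trimmed-down copy of the extracted schema, and `nullable_record_fields` is a helper name made up for illustration. It uses only the Python standard library and does not diagnose the exception itself; it just lists every field whose type is a union containing a record.

```python
import json

# Hypothetical, trimmed-down copy of the extracted schema (assumption: fewer
# fields than the real table, kept only to make the example self-contained).
SCHEMA_JSON = """
{
  "type": "record",
  "name": "my_example_table",
  "fields": [
    {"name": "some_field", "type": ["null", "string"], "default": null},
    {"name": "my_struct", "type": ["null", {
      "type": "record",
      "name": "record_0",
      "fields": [
        {"name": "field1", "type": ["null", "long"], "default": null},
        {"name": "inner_struct", "type": ["null", {
          "type": "record",
          "name": "record_2",
          "fields": [
            {"name": "field2", "type": ["null", "string"], "default": null}
          ]
        }], "default": null}
      ]
    }], "default": null}
  ]
}
"""


def nullable_record_fields(record, prefix=""):
    """Return (field path, record name) pairs for fields typed as a union containing a record."""
    found = []
    for field in record.get("fields", []):
        path = prefix + field["name"]
        ftype = field["type"]
        # A union is a JSON list; treat a plain type as a one-element list.
        branches = ftype if isinstance(ftype, list) else [ftype]
        for branch in branches:
            if isinstance(branch, dict) and branch.get("type") == "record":
                if isinstance(ftype, list):
                    found.append((path, branch["name"]))
                # Recurse so nested structs are reported as well.
                found.extend(nullable_record_fields(branch, path + "."))
    return found


result = nullable_record_fields(json.loads(SCHEMA_JSON))
for path, name in result:
    print(f"{path}: union with record {name}")
```

Running the same helper on the full schema printed by avro-tools should show both struct levels as `["null", record]` unions, which matches the "expecting union" part of the error message.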
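What is going wrong here? I am fairly sure this worked a few days ago, so I suspect Microsoft moved HDInsight clusters to a different HDP patch version that ships another Avro or Hive version, but I have found no indication of that.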

I found this: it appears to be a very similar issue (on the same Hive version).

Does anyone know what is going wrong here, and what I can do to fix it or work around it?
