Apache Spark: alias value in AVSC does not match the Avro file
Tags: apache-spark, hive

I have updated the avsc file to rename a column, as follows:

 "fields" : [ {
    "name" : "department_id",
    "type" : [ "null", "int" ],
    "default" : null
  }, {
    "name" : "office_name",
    "type" : [ "null", "string" ],
    "default" : null,
    "aliases" : [ "department_name" ],
    "columnName" : "department_name"
  }
However, in my Avro file the columns look like this:
department_id: 10, department_name: "math"

Now, when I run a query like the one below:

select office_name from t
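it always returns null. It does not return the value stored under department_name in the Avro file. Is there a way to give a column in the avsc more than one name?

The answer quotes the following: "We recommend that you use the original names of the fields in the table rather than the aliases, because Avro aliases are stripped during loading into Spark."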

Schema with aliases:

val schema = new Schema.Parser().parse(new File("../spark-2.4.3-bin-hadoop2.7/examples/src/main/resources/user.avsc"))
schema: org.apache.avro.Schema = {"type":"record","name":"User","namespace":"example.avro","fields":[{"name":"name","type":"string","alias":["customer_name"],"columnName":"customer_name"},{"name":"favorite_color","type":["string","null"],"alias":["color"],"columnName":"color"}]}

Spark strips the aliases:

val usersDF = spark.read.format("avro").option("avroSchema",schema.toString).load("../spark-2.4.3-bin-hadoop2.7/examples/src/main/resources/users.avro")
usersDF: org.apache.spark.sql.DataFrame = [name: string, favorite_color: string]


I think you can use Spark's built-in functionality to rename a column, but if you find any other workaround, please let me know as well.
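
The rename suggested above could look roughly like the following. This is only a sketch under some assumptions: the object name RenameAvroColumn and the helper renameDepartment are hypothetical, and a small in-memory DataFrame stands in for the Avro data so that the example does not depend on a particular file path. With real data, the DataFrame would instead come from spark.read.format("avro").load(...) as in the answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, not part of the original thread.
object RenameAvroColumn {

  // Rename the physical column (the name written in the Avro file)
  // to the name that downstream queries expect.
  def renameDepartment(df: DataFrame): DataFrame =
    df.withColumnRenamed("department_name", "office_name")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rename-avro-column")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the Avro data described in the question.
    val raw = Seq((10, "math")).toDF("department_id", "department_name")

    // Expose the renamed column under the table name used in the question.
    renameDepartment(raw).createOrReplaceTempView("t")
    spark.sql("select office_name from t").show()

    spark.stop()
  }
}
```

The rename happens after loading, so it does not rely on Avro aliases at all. Note that this registers a Spark temporary view rather than altering the Hive table that the question queries.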

Can you share the output of describe t?

@wypul The describe output looks like:
office_name string