Apache Spark: Spark SQL error when reading data from an Avro table


I get the following error when I try to read data from an Avro table using Spark SQL:

Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.supportedCategories(AvroObjectInspectorGenerator.java:142)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:91)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:121)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspectorWorker(AvroObjectInspectorGenerator.java:104)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.createObjectInspector(AvroObjectInspectorGenerator.java:83)
        at org.apache.hadoop.hive.serde2.avro.AvroObjectInspectorGenerator.<init>(AvroObjectInspectorGenerator.java:56)

Do I need to add any dependency? The same code runs fine in Hive, but fails in Spark.
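For context, the failure can be reproduced with a minimal read like the sketch below. This is an assumption about the asker's setup, not code from the post; the table name `events_avro` is a placeholder.

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch (assumed setup): reading a Hive-backed Avro table
// through Spark SQL. Hive support must be enabled so Spark resolves
// the table via the Hive metastore.
object ReadAvroTable {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("read-avro-table")
      .enableHiveSupport()
      .getOrCreate()

    // The NullPointerException in AvroObjectInspectorGenerator is thrown
    // while Hive's SerDe builds the object inspector for this query.
    spark.sql("SELECT * FROM events_avro LIMIT 10").show()

    spark.stop()
  }
}
```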

Comments:

- Please share your code. Try it with `libraryDependencies += "org.apache.spark" %% "spark-avro" % "2.4.0"`. – dassum
- Hello @dassum, I tried that and it looks fine, but now I get this error: `Caused by: MetaException(message: java.lang.ClassNotFoundException Class not found) at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:442) at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:250)`
- You always need a dependency version that matches your Spark version. Change it to 2.4.2, or reuse the variable. – cricket_007
- @cricket_007, I tried `pyspark --driver-memory 10g --jars /tmp/spark-avro_2.11-2.4.2.jar`, but I still get the same error. It happens when I try to join the tables' data with an Avro schema.
- Try to see whether a simple filter triggers it as well.
val sparkVersion = "2.4.2"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"  % sparkVersion,
  // Use the spark-avro module that matches the Spark version instead of
  // mixing in the older, incompatible com.databricks:spark-avro:4.0.0.
  "org.apache.spark" %% "spark-avro" % sparkVersion
)
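With a matching `spark-avro` module on the classpath, Avro data can also be read directly from files instead of through the Hive table. A minimal sketch, assuming a `spark-shell` session where `spark` is already defined; the path is a placeholder, not from the original post:

```scala
// Sketch: reading Avro files directly with the built-in spark-avro
// module (Spark 2.4+). The input path is a placeholder.
val df = spark.read
  .format("avro")
  .load("/tmp/data/events.avro")

df.printSchema()
```

Reading the files directly bypasses Hive's `AvroObjectInspectorGenerator`, which can help narrow down whether the failure comes from the Hive SerDe or from Spark itself.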