Spark 3 fails when I try to execute a simple query

I have this table in Hive:

CREATE TABLE `mydb`.`raw_sales` (
`combustivel` STRING,
`regiao` STRING,
`estado` STRING,
`jan` STRING,
`fev` STRING,
`mar` STRING,
`abr` STRING,
`mai` STRING,
`jun` STRING,
`jul` STRING,
`ago` STRING,
`set` STRING,
`out` STRING,
`nov` STRING,
`dez` STRING,
`total` STRING,
`created_at` TIMESTAMP,
`ano` STRING)
USING orc
LOCATION 'hdfs://localhost:9000/jobs/etl/tables/raw_sales.orc'
TBLPROPERTIES (
  'transient_lastDdlTime' = '1601322056',
  'ORC.COMPRESS' = 'SNAPPY')
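
Every column except created_at is declared STRING. Since the exception below complains about a String/Number mismatch, one hedged diagnostic is to read the ORC files directly, bypassing the metastore, and compare their physical schema against this DDL (a minimal sketch; the path is copied from the LOCATION clause above):

# Hedged diagnostic: read the ORC files directly so the physical schema
# the writer actually produced can be compared with the Hive DDL.
df = spark.read.orc("hdfs://localhost:9000/jobs/etl/tables/raw_sales.orc")
df.printSchema()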
The table has data, but when I try the query below:

spark.sql("SELECT * FROM mydb.raw_sales WHERE ano = '2000' AND combustivel like '%GASOLINA%'").show()
it crashes:

>>> spark.sql("SELECT * FROM mydb.raw_sales WHERE ano = '2000' AND combustivel like '%GASOLINA%'").show()
20/09/28 19:25:30 ERROR executor.Executor: Exception in task 0.0 in stage 61.0 (TID 133)
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.castLiteralValue(OrcFilters.scala:163)
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:235)
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFiltersHelper$1(OrcFilters.scala:134)
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.$anonfun$convertibleFilters$4(OrcFilters.scala:137)
        at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
        at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
        at scala.collection.immutable.List.flatMap(List.scala:355)
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFilters(OrcFilters.scala:136)
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.createFilter(OrcFilters.scala:75)
        at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4(OrcFileFormat.scala:189)
        at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4$adapted(OrcFileFormat.scala:188)
        at scala.Option.map(Option.scala:230)
        at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$1(OrcFileFormat.scala:188)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:116)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:169)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93)
        at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:491)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
        at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:340)
        at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:872)
        at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:872)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:127)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
20/09/28 19:25:30 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 61.0 (TID 133, 639773a482b8, executor driver): java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.castLiteralValue(OrcFilters.scala:163)
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:235)
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFiltersHelper$1(OrcFilters.scala:134)
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.$anonfun$convertibleFilters$4(OrcFilters.scala:137)
        at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
        at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
        at scala.collection.immutable.List.flatMap(List.scala:355)
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFilters(OrcFilters.scala:136)
        at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.createFilter(OrcFilters.scala:75)
        at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4(OrcFileFormat.scala:189)
        at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4$adapted(OrcFileFormat.scala:188)
        at scala.Option.map(Option.scala:230)
        at org.apache.spark.sql
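
The failing frames (OrcFilters$.createFilter down to castLiteralValue) belong to ORC predicate pushdown, which translates the WHERE clause into an ORC search argument while opening each file. A hedged way to see which predicates actually reach the reader is explain(), which prints the PushedFilters list in the FileScan node without running any tasks:

# Show the physical plan; the FileScan node lists "PushedFilters" with the
# predicates Spark hands to the ORC reader. No scan runs, so no crash here.
spark.sql("SELECT * FROM mydb.raw_sales "
          "WHERE ano = '2000' AND combustivel like '%GASOLINA%'").explain(True)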
spark.sql("SELECT * FROM mydb.raw_sales WHERE ano = 2000").show()
spark.sql("SELECT combustivel FROM mydb.raw_sales WHERE combustivel like '%GASOLINA C%'  ").show()