Apache Spark: PySpark REFRESH TABLE for a FileNotFoundException


I'm seeing the following error while writing out a CSV:

Stdoutput Caused by: org.apache.spark.SparkException: Job aborted due to stage failure:
 Task 9560 in stage 21.0 failed 4 times, most recent failure:
 Lost task 9560.3 in stage 21.0 (TID 88857, .., executor 12):
 java.io.FileNotFoundException: File does not exist: <hdfs dependent table location>/000017_0
Stdoutput It is possible the underlying files have been updated.
 You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
Stdoutput   at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:131)

...
Stdoutput Caused by: java.io.FileNotFoundException: File does not exist: 
 <hdfs dependent table location>/000017_0
Stdoutput It is possible the underlying files have been updated.
 You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
The df above is built from data in the dependent table (the <hdfs dependent table location> you see in the error), with some transformations applied before the write statement below.
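Roughly, the job has the shape of the sketch below. This is a hypothetical reconstruction only: the table name, columns, and transformations are placeholders, not the actual pipeline.

from pyspark.sql import functions as F

# Hypothetical sketch: "dep_db.dependent_table" stands in for the table
# whose HDFS location appears in the error; the filter and derived
# column are placeholders for the real transformations.
df = (
    spark.table("dep_db.dependent_table")
    .filter(F.col("event_date") >= "2020-01-01")
    .withColumn("amount_usd", F.col("amount") * F.col("fx_rate"))
)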

I understand that REFRESH TABLE updates a table's metadata, but does it make sense to refresh the metadata of all the dependent tables before executing the final action that writes the CSV?

# Final action; this triggers the reads of the dependent tables that
# surface the FileNotFoundException above.
df.write.format(format).mode(mode).saveAsTable("{}.{}".format(runtime_db, table_name))
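For concreteness, refreshing each dependent table before the final action would look roughly like the sketch below. The table names are placeholders, and spark.catalog.refreshTable is the programmatic counterpart of the REFRESH TABLE SQL command mentioned in the error message.

# Hypothetical table names; refresh each dependent table so Spark
# invalidates its cached file listing before the final write.
for dep_table in ["dep_db.dependent_table_1", "dep_db.dependent_table_2"]:
    spark.catalog.refreshTable(dep_table)
    # equivalent SQL form:
    # spark.sql("REFRESH TABLE {}".format(dep_table))

# ...then run the final write as before.
df.write.format(format).mode(mode).saveAsTable("{}.{}".format(runtime_db, table_name))

Note that whether this actually prevents the FileNotFoundException depends on whether the underlying files change again between the refresh and the read; if they do, the Dataset/DataFrame has to be recreated, as the error message itself suggests.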