Scala error AzureNativeFileSystemStore: DirectoryIsNotEmpty

Tags: scala, azure, apache-spark, hadoop, azure-hdinsight

I am trying to run this code on Azure HDInsight. I have a Spark cluster connected to Data Lake storage.

spark.conf.set(
  "fs.azure.sas.data.spmdevsharedstorage.blob.core.windows.net",
  "xxxxxxxxxxx key xxxxxxxxxxx"
)

import org.apache.spark.sql.functions.col
import spark.implicits._

val shared_data = "wasbs://data@spmdevsharedstorage.blob.core.windows.net/"

// Read the CSV
val dfCsv = spark.read.option("inferSchema", "true").option("header", true).csv(shared_data + "/test/4G-pixel.csv")
val dfCsv_final_withcolumn = dfCsv.select($"latitude", $"longitude")
val dfCsv_final = dfCsv_final_withcolumn.withColumn("new_latitude", col("latitude") * 100)

// Write the result back as a single CSV file
dfCsv_final.coalesce(1).write.format("com.databricks.spark.csv").option("header", "true").mode("overwrite").save(shared_data + "/test/4G-pixel_edit.csv")

The code reads the CSV file without any problem. Then, when writing the new CSV file, I see the following error:

20/04/03 14:58:12 ERROR AzureNativeFileSystemStore: Encountered Storage Exception for delete on Blob: https://spmdevsharedstorage.blob.core.windows.net/data/test/4G-pixel_edit.csv/_temporary/0, Exception Details: This operation is not permitted on a non-empty directory. Error Code: DirectoryIsNotEmpty
org.apache.hadoop.fs.azure.AzureException: com.microsoft.azure.storage.StorageException: This operation is not permitted on a non-empty directory.
  at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.delete(AzureNativeFileSystemStore.java:2627)
  at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.delete(AzureNativeFileSystemStore.java:2637)

The new CSV file does get written to the Data Lake, but the code stops. I need this error not to appear.
How can I fix it?

I ran into a similar issue.

I fixed it with the following configuration. Set this to true:

--conf spark.hadoop.mapreduce.fileoutputcommitter.cleanup.skipped=true
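
For context, this flag tells the FileOutputCommitter to skip deleting the _temporary directory after the job commits, which is the delete that fails in the log above. If you are not passing flags through spark-submit, here is a minimal sketch, under my own assumptions (the app name is a placeholder and this snippet is not part of the original answer), of setting the same property when the SparkSession is built, so it lands in the Hadoop configuration before any job runs:

import org.apache.spark.sql.SparkSession

// Set the property at session build time so it is present in the Hadoop
// configuration before the first write. The app name is only a placeholder.
val spark = SparkSession.builder()
  .appName("csv-write-job")
  .config("spark.hadoop.mapreduce.fileoutputcommitter.cleanup.skipped", "true")
  .getOrCreate()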


Try removing the overwrite write mode and check the job.
I need overwrite.
Check whether the job has permission to overwrite.
Even if I remove the overwrite, the error still appears. Can't anyone help me?
spark.conf.set("spark.hadoop.mapreduce.fileoutputcommitter.cleanup.skipped","true")