
Apache Spark: getting a LeaseExpiredException randomly in Spark Streaming

Tags: apache-spark, hadoop, hdfs, spark-streaming, parquet


I have a Spark Streaming job (Spark 2.1.1 on Cloudera 5.12) with Kafka as input and HDFS (Parquet format) as output. The problem is that I randomly get a LeaseExpiredException (not in every mini-batch):

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/qoe_fixe/data_tv/tmp/cleanData/_temporary/0/_temporary/attempt_20180629132202_0215_m_000000_0/year=2018/month=6/day=29/hour=11/source=LYO2/part-00000-c6f21a40-4088-4d97-ae0c-24fa463550ab.snappy.parquet: File does not exist. Holder DFSClient_attempt_20180629132202_0215_m_000000_0_-1048963677_900 does not have any open files

I am using the Dataset API to write to HDFS:

      if (!InputWithDatePartition.rdd.isEmpty())
        InputWithDatePartition
          .repartition(1)
          .write
          .partitionBy("year", "month", "day", "hour", "source")
          .mode("append")
          .parquet(cleanPath)

My job fails after a few hours because of this error.
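
For context, this is roughly how such a per-batch write sits inside a DStream job. Everything around the write here (the CleanRecord case class, the writeEachBatch wrapper, and the toDF() conversion) is an illustrative assumption, not the original code; only the write itself mirrors the snippet above.

      import org.apache.spark.sql.SparkSession
      import org.apache.spark.streaming.dstream.DStream

      // Hypothetical record shape matching the partition columns used in the question.
      case class CleanRecord(year: Int, month: Int, day: Int, hour: Int, source: String, payload: String)

      // Assumed wrapper: one append per micro-batch of a Kafka-backed DStream.
      def writeEachBatch(stream: DStream[CleanRecord], cleanPath: String): Unit =
        stream.foreachRDD { rdd =>
          val spark = SparkSession.builder().getOrCreate()
          import spark.implicits._

          val InputWithDatePartition = rdd.toDF()

          // Same write as in the question: each non-empty batch is appended as partitioned Parquet.
          if (!InputWithDatePartition.rdd.isEmpty())
            InputWithDatePartition
              .repartition(1)
              .write
              .partitionBy("year", "month", "day", "hour", "source")
              .mode("append")
              .parquet(cleanPath)
        }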

Two jobs write to the same directory and share the same _temporary folder.

So when the first job finishes, it executes the following code from the FileOutputCommitter class (the cleanupJob() method shown after the comments below): it deletes the pendingJobAttemptsPath (_temporary) while the second job is still running.


Comments on the question:

- Are you sure no other job is trying to update or delete the path "cleanPath"?
- I have two jobs writing to this folder in streaming mode, but I added "source" as a partition column, so they write to different partitions (folders). When I change the cleanPath (parent folder) so that the two jobs use different ones, I don't hit this problem.
- The directory path is a temporary location. Can you try giving a concrete path and see whether the issue still occurs?

The cleanupJob() method from the FileOutputCommitter class referenced in the answer above:
  public void cleanupJob(JobContext context) throws IOException {
    if (hasOutputPath()) {
      Path pendingJobAttemptsPath = getPendingJobAttemptsPath();
      FileSystem fs = pendingJobAttemptsPath
          .getFileSystem(context.getConfiguration());
      // if job allow repeatable commit and pendingJobAttemptsPath could be
      // deleted by previous AM, we should tolerate FileNotFoundException in
      // this case.
      try {
        fs.delete(pendingJobAttemptsPath, true);
      } catch (FileNotFoundException e) {
        if (!isCommitJobRepeatable(context)) {
          throw e;
        }
      }
    } else {
      LOG.warn("Output Path is null in cleanupJob()");
    }
  }
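
As the comments note, the problem went away once each streaming job was given its own parent folder, so the two writers no longer share a _temporary directory that one job's cleanupJob() can delete out from under the other. A minimal sketch of that change inside each job, reusing the names from the sketch above; the job-name suffix and exact path layout are hypothetical, not from the original post:

      // Give every streaming job its own base path so they no longer share
      // .../cleanData/_temporary, the folder deleted by FileOutputCommitter.cleanupJob().
      val jobName = "lyo2-ingest"   // hypothetical: one distinct value per job
      val cleanPath = s"hdfs:///user/qoe_fixe/data_tv/tmp/cleanData_$jobName"

      if (!InputWithDatePartition.rdd.isEmpty())
        InputWithDatePartition
          .repartition(1)
          .write
          .partitionBy("year", "month", "day", "hour", "source")
          .mode("append")
          .parquet(cleanPath)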