Apache Spark: Is there a difference between ShuffleMapStage and ResultStage with respect to maxPartitionId?

Tags: apache-spark, sparkcore

I don't understand why the maxPartitionId for a ShuffleMapStage is s.numPartitions - 1, while for a ResultStage it is s.rdd.partitions.length - 1. When I dug into stage.numPartitions, I found that it is equivalent to rdd.partitions.length. Why does the ShuffleMapStage case use stage.numPartitions rather than rdd.partitions.length, as the ResultStage case does?

The relevant code is shown below:

private[spark] class DAGScheduler(){
    //.........
    stage match {
      case s: ShuffleMapStage =>
        outputCommitCoordinator.stageStart(stage = s.id, maxPartitionId = s.numPartitions - 1)
      case s: ResultStage =>
        outputCommitCoordinator.stageStart(
          stage = s.id, maxPartitionId = s.rdd.partitions.length - 1)
    }
    //.........
}
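To make the equivalence I am describing concrete, here is a minimal sketch. FakeRDD, FakeStage, FakeShuffleMapStage, FakeResultStage and MaxPartitionIdSketch are hypothetical stand-ins I made up for illustration, not Spark's real classes. The only thing it assumes is what I found above: numPartitions is derived from rdd.partitions.length.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's real classes.
final case class FakeRDD(partitions: Array[Int])

sealed abstract class FakeStage(val id: Int, val rdd: FakeRDD) {
  // Assumption taken from the question: numPartitions is computed from the stage's RDD.
  val numPartitions: Int = rdd.partitions.length
}

final class FakeShuffleMapStage(stageId: Int, stageRdd: FakeRDD)
  extends FakeStage(stageId, stageRdd)

final class FakeResultStage(stageId: Int, stageRdd: FakeRDD)
  extends FakeStage(stageId, stageRdd)

object MaxPartitionIdSketch {
  // Mirrors the shape of the match in the excerpt above:
  // one branch reads numPartitions, the other reads rdd.partitions.length.
  def maxPartitionId(stage: FakeStage): Int = stage match {
    case s: FakeShuffleMapStage => s.numPartitions - 1
    case s: FakeResultStage     => s.rdd.partitions.length - 1
  }
}
```

Under that assumption both branches return the same value for the same RDD, which is why the two different spellings confuse me.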

Any help would be appreciated!