
Apache Spark: how many stages and tasks are created after a dataframe is modified?


When a dataframe is split and then joined back together on different columns, how many stages are created in the DAG, and how are the tasks within each stage created?
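
For concreteness, here is a minimal sketch of the scenario being asked about. The dataframe, column names, and app name are all hypothetical; the point is only to show a split followed by a join, and how to inspect the resulting plan and stages:

```scala
import org.apache.spark.sql.SparkSession

object SplitJoinExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("split-join-stages")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // A hypothetical source dataframe.
    val df = Seq((1, "a", 10), (2, "b", 20), (3, "a", 30)).toDF("id", "key", "value")

    // "Split" the dataframe into two projections...
    val left  = df.select("id", "key")
    val right = df.select("id", "value")

    // ...and join them back on a column.
    val joined = left.join(right, Seq("id"))

    // explain(true) prints the physical plan; the exchanges (shuffles) it shows
    // correspond to stage boundaries in the DAG.
    joined.explain(true)

    // Calling an action submits the job; the Spark UI (http://localhost:4040)
    // then lists the stages and the tasks per stage (one task per partition).
    joined.count()

    spark.stop()
  }
}
```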

How does the DAG work in Spark?

    The interpreter is the first layer: Spark uses a Scala interpreter to interpret your code with some modifications.
    Spark creates an operator graph when you enter your code in the Spark console.
    When we call an action on a Spark RDD, at a high level Spark submits the operator graph to the DAG Scheduler.
    The DAG Scheduler divides the operators into stages of tasks. A stage contains tasks based on the partitions of the input data. The DAG Scheduler pipelines operators together; for example, consecutive map operators are scheduled in a single stage (see the sketch after this list).
    The stages are passed on to the Task Scheduler, which launches the tasks through the cluster manager. The Task Scheduler is unaware of the dependencies between stages.
    The workers execute the tasks on the slave nodes.
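
The following small RDD sketch illustrates these points: narrow transformations are pipelined into one stage, a shuffle starts a new stage, and the number of tasks per stage follows the number of partitions. All names and numbers here are for illustration only:

```scala
import org.apache.spark.sql.SparkSession

object StageDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dag-stages-demo")
      .master("local[4]")
      .getOrCreate()
    val sc = spark.sparkContext

    // 4 partitions => 4 tasks per stage for this input.
    val nums = sc.parallelize(1 to 100, numSlices = 4)

    // Narrow transformations (map, filter) are pipelined into ONE stage by the DAG Scheduler.
    val mapped = nums.map(_ * 2).filter(_ % 3 == 0)

    // A wide transformation (reduceByKey) requires a shuffle, so it starts a NEW stage.
    val counted = mapped.map(n => (n % 5, 1)).reduceByKey(_ + _)

    // toDebugString shows the lineage; indentation changes mark shuffle (stage) boundaries.
    println(counted.toDebugString)

    // The action triggers job submission: the DAG Scheduler builds the stages,
    // and the Task Scheduler launches the tasks through the cluster manager.
    counted.collect().foreach(println)

    spark.stop()
  }
}
```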
You can find more information at the following link: