Apache Spark: Why does this pyspark.ml RandomForestRegressor fail because the context was stopped?


apache-spark pyspark


I am trying to train a random forest regressor on a DataFrame called train, as follows:

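from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.regression import RandomForestRegressor
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

# Fragment from inside a class method: self.featuresCol and self.labelCol hold column names.
rf = RandomForestRegressor(featuresCol=self.featuresCol, labelCol=self.labelCol)
# Grid of 3 x 3 = 9 hyperparameter combinations.
param_grid = ParamGridBuilder() \
    .addGrid(rf.numTrees, [5, 10, 20]) \
    .addGrid(rf.maxDepth, [5, 10, 15]) \
    .build()
crossval = CrossValidator(estimator=rf,
                          estimatorParamMaps=param_grid,
                          evaluator=RegressionEvaluator(),
                          numFolds=3)
self.model = crossval.fit(train)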
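To get a rough sense of how much work the fit call above triggers, the number of model fits can be counted from the grid and fold values in the snippet. This is only back-of-the-envelope arithmetic in plain Python, not a diagnosis of the failure; the variable names are illustrative. It relies on CrossValidator fitting every parameter combination once per fold and then refitting the best combination on the full dataset.

```python
from itertools import product

# Values taken from the snippet above.
num_trees_grid = [5, 10, 20]
max_depth_grid = [5, 10, 15]
num_folds = 3

# Every (numTrees, maxDepth) pair is one entry in the parameter grid.
param_combos = list(product(num_trees_grid, max_depth_grid))

# Each combination is fitted once per fold during cross-validation.
cv_fits = len(param_combos) * num_folds

# The best combination is then refitted once on the full training set.
total_fits = cv_fits + 1

# Trees grown during cross-validation alone (the final refit adds 5, 10, or 20 more).
trees_in_cv = sum(n for n, _ in param_combos) * num_folds

print(f"{len(param_combos)} combinations, {total_fits} fits, {trees_in_cv} trees during CV")
```

A third of those fits use maxDepth=15, the deepest and most expensive setting in the grid, on feature vectors with 10479 dimensions.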

Here are the DataFrame's row count, number of partitions, a sample row, and schema:

Training on 26398 examples with 8 partitions
{'features': SparseVector(10479, {5: 1.0, 360: 1.0, 361: 0.2444, 362: -0.9697, 363: 1.0, 10476: -0.0685}),
 'label': 989}
root
 |-- features: vector (nullable = true)
 |-- label: long (nullable = true)

The final error message after trying to fit the model:

org.apache.spark.SparkException: Job 44 cancelled because SparkContext was shut down

What is causing this failure?

Master

m4.xlarge
8 vCPU
16 GiB memory

Workers (4 instances)

r4.xlarge
4 vCPU
30.5 GiB memory

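Thanks, this is very helpful. I re-ran my training pipeline and made sure my partitions were managed more carefully, but it still failed for the same reason (stopped context), so I think partitioning is a red herring.

Yes. So, training also failed with the same error when I tried only 477 examples with feature vectors of length 366. Could the data produced by preprocessing somehow be accumulating on the workers and taking up memory by the time I try to train the model? At this point I may have to write the DataFrame to HDFS after preprocessing, then shut down the Spark job and start a new Spark context to actually train the model.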
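The workaround described in the last comment (persist the preprocessed data, end the job, train in a fresh context) could be sketched roughly as below. This is a minimal sketch, not a confirmed fix: the two paths, the function names, and the app name are hypothetical, the column names features and label are taken from the schema shown in the question, and the two functions are assumed to run as separate Spark applications.

```python
# Hypothetical locations; neither path comes from the original post.
PREPROCESSED_PATH = "hdfs:///tmp/train_preprocessed.parquet"
MODEL_PATH = "hdfs:///tmp/rf_best_model"


def save_preprocessed(train, path=PREPROCESSED_PATH):
    """Job 1: write the preprocessed DataFrame to disk, then let the job exit."""
    # Keep only the two columns the regressor needs.
    train.select("features", "label").write.mode("overwrite").parquet(path)


def train_from_disk(data_path=PREPROCESSED_PATH, model_path=MODEL_PATH):
    """Job 2: meant for a separate spark-submit, so it starts with a fresh context."""
    # Imported here so that only the training job loads the ML classes.
    from pyspark.ml.evaluation import RegressionEvaluator
    from pyspark.ml.regression import RandomForestRegressor
    from pyspark.ml.tuning import CrossValidator, ParamGridBuilder
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rf-train").getOrCreate()
    train = spark.read.parquet(data_path)

    rf = RandomForestRegressor(featuresCol="features", labelCol="label")
    grid = (ParamGridBuilder()
            .addGrid(rf.numTrees, [5, 10, 20])
            .addGrid(rf.maxDepth, [5, 10, 15])
            .build())
    crossval = CrossValidator(estimator=rf,
                              estimatorParamMaps=grid,
                              evaluator=RegressionEvaluator(labelCol="label"),
                              numFolds=3)
    cv_model = crossval.fit(train)

    # Save the winning model before stopping the session.
    cv_model.bestModel.write().overwrite().save(model_path)
    spark.stop()
```

Reading the data back from Parquet also cuts the lineage of the preprocessing steps, so the training job does not recompute them for every fold.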