Apache Spark PySpark: TaskMemoryManager: Failed to allocate a page
When running a Spark job in standalone cluster mode on a server, I get errors like:

WARN TaskMemoryManager: Failed to allocate a page (x bytes), try again.

where x is some number of bytes that varies between attempts.

My Spark job's goal is to:
- join a few tables (3 to 4)
- apply some cleaning functions
- save the result (one DataFrame, max size 300 MB) to HDFS
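As a concrete illustration of the "cleaning functions" step, a typical column-cleaning helper (hypothetical; the original post does not show its code) might trim whitespace and map empty or placeholder strings to None before being registered as a Spark UDF. A minimal pure-Python sketch:

```python
def clean_value(value):
    """Trim whitespace and map empty/placeholder strings to None."""
    if value is None:
        return None
    cleaned = value.strip()
    # Treat "", "NA", and "null" (any case) as missing values.
    return cleaned if cleaned and cleaned.lower() not in ("na", "null") else None

# In a Spark job this could be registered as a UDF, e.g.:
# from pyspark.sql.functions import udf
# from pyspark.sql.types import StringType
# clean_udf = udf(clean_value, StringType())
```

Keeping the cleaning logic as a plain function makes it unit-testable without a running cluster.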
My server specs:
- Memory: 31 GB
- CPUs: 8
- Cores per socket: 8
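Given these specs, the TaskMemoryManager warning usually means the executor's unified memory region is too small for the pages a join or sort needs. A rough back-of-envelope sizing sketch, using Spark's documented defaults (300 MB reserved, spark.memory.fraction = 0.6):

```python
# Approximate Spark unified memory arithmetic:
# usable memory = (executor_memory - 300 MB reserved) * spark.memory.fraction
RESERVED_MB = 300
MEMORY_FRACTION = 0.6  # default spark.memory.fraction

def unified_memory_mb(executor_memory_mb):
    """Approximate execution+storage memory available to tasks, in MB."""
    return (executor_memory_mb - RESERVED_MB) * MEMORY_FRACTION

# With spark.executor.memory = 10g (as set later in the question):
print(round(unified_memory_mb(10 * 1024)))  # ≈ 5964 MB, shared by all tasks
```

This memory is shared by all concurrently running tasks on the executor, so with 8 cores each task may get only a fraction of it before allocation retries start.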
- Previously, this same approach worked with the setup above on even larger datasets (~300 GB) without problems.
- I am still new to Spark.
Some suggested solutions I have tried:
- ran stop-all.sh (Hadoop and Spark)
- changed a configuration:
  spark.executor.memory: 10g
- added some configurations:
  spark.sql.autoBroadcastJoinThreshold: -1
  and spark.sql.broadcastTimeout: 3000
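The configuration changes above can also be passed on the command line rather than edited into config files; a sketch (the flags are standard Spark options, the values are the ones from the question, and the master host and script name are placeholders):

```shell
spark-submit \
  --master spark://<master-host>:7077 \
  --conf spark.executor.memory=10g \
  --conf spark.sql.autoBroadcastJoinThreshold=-1 \
  --conf spark.sql.broadcastTimeout=3000 \
  my_job.py
```

Setting spark.sql.autoBroadcastJoinThreshold to -1 disables broadcast joins entirely, pushing Spark toward sort-merge joins, which can spill to disk instead of holding one side of the join in memory.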