Apache Spark PySpark: TaskMemoryManager: Failed to allocate a page
When running a Spark job in standalone cluster mode on a server, I get errors like:

WARN TaskMemoryManager: Failed to allocate a page (x bytes), try again.

where x is some number of bytes that varies between attempts.

My Spark job's goal is to:
- join a few tables (3 to 4)
- apply some cleaning functions
- save the result (one DataFrame, max size 300 MB) to HDFS
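As a concrete illustration of the "cleaning functions" step, a typical column-cleaning helper (hypothetical; the original post does not show its code) might trim whitespace and map empty or placeholder strings to None before being registered as a Spark UDF. A minimal pure-Python sketch:

```python
def clean_value(value):
    """Trim whitespace and map empty/placeholder strings to None."""
    if value is None:
        return None
    cleaned = value.strip()
    # Treat "", "NA", and "null" (any case) as missing values.
    return cleaned if cleaned and cleaned.lower() not in ("na", "null") else None

# In a Spark job this could be registered as a UDF, e.g.:
# from pyspark.sql.functions import udf
# from pyspark.sql.types import StringType
# clean_udf = udf(clean_value, StringType())
```

Keeping the cleaning logic as a plain function makes it unit-testable without a running cluster.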
My server specs:
- Memory: 31 GB
- CPUs: 8
- Cores per socket: 8
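Given these specs, the TaskMemoryManager warning usually means the executor's unified memory region is too small for the pages a join or sort needs. A rough back-of-envelope sizing sketch, using Spark's documented defaults (300 MB reserved, spark.memory.fraction = 0.6):

```python
# Approximate Spark unified memory arithmetic:
# usable memory = (executor_memory - 300 MB reserved) * spark.memory.fraction
RESERVED_MB = 300
MEMORY_FRACTION = 0.6  # default spark.memory.fraction

def unified_memory_mb(executor_memory_mb):
    """Approximate execution+storage memory available to tasks, in MB."""
    return (executor_memory_mb - RESERVED_MB) * MEMORY_FRACTION

# With spark.executor.memory = 10g (as set later in the question):
print(round(unified_memory_mb(10 * 1024)))  # ≈ 5964 MB, shared by all tasks
```

This memory is shared by all concurrently running tasks on the executor, so with 8 cores each task may get only a fraction of it before allocation retries start.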
- Previously, this same approach worked with the setup above on even larger datasets (~300 GB) without problems.
- I am still new to Spark.
Some suggested solutions I have tried:
- ran stop-all.sh (Hadoop and Spark)
- changed a configuration:
  spark.executor.memory: 10g
- added some configurations:
  spark.sql.autoBroadcastJoinThreshold: -1
  and spark.sql.broadcastTimeout: 3000
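The configuration changes above can also be passed on the command line rather than edited into config files; a sketch (the flags are standard Spark options, the values are the ones from the question, and the master host and script name are placeholders):

```shell
spark-submit \
  --master spark://<master-host>:7077 \
  --conf spark.executor.memory=10g \
  --conf spark.sql.autoBroadcastJoinThreshold=-1 \
  --conf spark.sql.broadcastTimeout=3000 \
  my_job.py
```

Setting spark.sql.autoBroadcastJoinThreshold to -1 disables broadcast joins entirely, pushing Spark toward sort-merge joins, which can spill to disk instead of holding one side of the join in memory.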